We present a thorough analysis of the capabilities of the linear combination (LC) model for fusion of information retrieval systems. The LC model combines the results lists of multiple IR systems by scoring each document using a weighted sum of the scores from each of the component systems. We first present both empirical and analytical justification for the hypotheses that such a model should only be used when the systems involved have high performance, a large overlap of relevant documents, and a small overlap of nonrelevant documents. The empirical approach allows us to very accurately predict the performance of a combined system. We also derive a formula for a theoretically optimal weighting scheme for combining 2 systems. We introduce d—the difference between the average score on relevant documents and the average score on nonrelevant documents—as a performance measure which not only allows mathematical reasoning about system performance, but also allows the selection of weights which generalize well to new documents. We describe a number of experiments involving large numbers of different IR systems which support these findings.