Document Type

Technical Report

Publication Date

12-1-2000

Technical Report Number

TR2000-382

Abstract

We introduce a new, probabilistic model for combining the outputs of an arbitrary number of query retrieval systems. By gathering simple statistics on the average performance of a given set of query retrieval systems, we construct a Bayes optimal mechanism for combining the outputs of these systems. Our construction yields a metasearch strategy whose empirical performance nearly always exceeds the performance of any of the constituent systems. Our construction is also robust in the sense that if "good" and "bad" systems are combined, the performance of the composite is still on par with, or exceeds, that of the best constituent system. Finally, our model and theory provide theoretical and empirical avenues for the improvement of this metasearch strategy.
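
The abstract does not state which statistics are gathered or how they are combined, so the following is only a minimal sketch of one way such a scheme could look, not the report's actual construction. It assumes a naive Bayes combination: for each system, smoothed estimates of how often relevant and non-relevant documents land in each rank bucket are turned into log-likelihood ratios, and a document's fused score is the sum of those ratios across systems. The function names (`bucket_of`, `fit`, `fuse`), the bucketing of ranks, the Laplace smoothing, and the independence assumption are all hypothetical choices made for illustration.

```python
import math


def bucket_of(rank, bucket_size, num_buckets):
    """Map a 0-based rank (or None for 'not retrieved') to a bucket index."""
    if rank is None:
        return num_buckets  # extra bucket: this system did not return the document
    return min(rank // bucket_size, num_buckets - 1)


def fit(training, num_systems, bucket_size=10, num_buckets=5):
    """Estimate per-system log-likelihood ratios from judged training queries.

    training: list of (runs, relevant) pairs; runs is a list of ranked doc-id
    lists (one per system) and relevant is the set of relevant doc ids.
    Returns llr[system][bucket] = log(P(bucket | rel) / P(bucket | non-rel)).
    """
    # Laplace-smoothed counts, indexed [system][is_relevant][bucket].
    counts = [[[1.0] * (num_buckets + 1) for _ in range(2)]
              for _ in range(num_systems)]
    for runs, relevant in training:
        pool = set().union(*runs)  # every document returned by any system
        for s, run in enumerate(runs):
            rank = {doc: i for i, doc in enumerate(run)}
            for doc in pool:
                b = bucket_of(rank.get(doc), bucket_size, num_buckets)
                counts[s][int(doc in relevant)][b] += 1
    llr = []
    for s in range(num_systems):
        irr_total, rel_total = sum(counts[s][0]), sum(counts[s][1])
        llr.append([
            math.log((counts[s][1][b] / rel_total) / (counts[s][0][b] / irr_total))
            for b in range(num_buckets + 1)
        ])
    return llr


def fuse(runs, llr, bucket_size=10, num_buckets=5):
    """Rank the pooled documents by summed log-likelihood ratio (naive Bayes)."""
    pool = set().union(*runs)
    scores = dict.fromkeys(pool, 0.0)
    for s, run in enumerate(runs):
        rank = {doc: i for i, doc in enumerate(run)}
        for doc in pool:
            scores[doc] += llr[s][bucket_of(rank.get(doc), bucket_size, num_buckets)]
    # Ties are broken by document id so the output is deterministic.
    return sorted(pool, key=lambda d: (-scores[d], d))
```

Under these assumptions, a system whose rank buckets are equally likely for relevant and non-relevant documents receives ratios near zero and so contributes almost nothing to the fused ranking, which is one way the robustness described in the abstract could arise.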

Comments

Preliminary version appeared in SIGIR 2000.
