Date of Award

5-1-2003

Document Type

Thesis (Undergraduate)

Department or Program

Department of Computer Science

First Advisor

Jay Aslam

Abstract

Evaluating retrieval systems, such as those submitted to the annual TREC competition, usually requires a large number of documents to be read and judged for relevance to query topics. Test collections are far too big to be exhaustively judged, so only a subset of documents is selected to form the judgment "pool." The selection method that TREC uses produces pools that are still quite large. Research has indicated that it is possible to rank the retrieval systems correctly using substantially smaller pools. This paper introduces an active learning algorithm whose goal is to reach the correct rankings using the smallest possible number of relevance judgments. It adds one document to the pool at a time, always trying to select the document with the highest information gain. Several variants of this algorithm are described, each with improvements on the one before. Results from experiments are included for comparison with the traditional TREC pooling method. The best version of the algorithm reliably outperforms the traditional method, although its degree of improvement varies.

Comments

Originally posted in the Dartmouth College Computer Science Technical Report Series, number TR2003-449.

Recommended Citation

Torrey, Lisa A., "An Active Learning Approach to Efficiently Ranking Retrieval Engines" (2003). Dartmouth College Undergraduate Theses. 28.
https://digitalcommons.dartmouth.edu/senior_theses/28

Included in

Computer Sciences Commons

Dartmouth College Undergraduate Theses

An Active Learning Approach to Efficiently Ranking Retrieval Engines
