Date of Award
Department of Computer Science
Javed A. Aslam
The problems of word sense disambiguation and document indexing for information retrieval have been extensively studied. It has been observed that indexing using disambiguated meanings, rather than word stems, should improve information retrieval results. We present a new corpus-based algorithm for performing word sense disambiguation. The algorithm does not need to train on many senses of each word; instead, it uses the probability that certain concepts will occur together. We then use the algorithm to index several corpora of documents. Our indexing algorithm does not generally outperform the traditional stem-based tf.idf model.
Whaley, Jason M., "An Application of Word Sense Disambiguation to Information Retrieval" (1999). Dartmouth College Undergraduate Theses. 198.