Date of Award
6-8-1999
Document Type
Thesis (Undergraduate)
Department or Program
Department of Computer Science
First Advisor
Javed A. Aslam
Abstract
The problems of word sense disambiguation and document indexing for information retrieval have been extensively studied. It has been observed that indexing using disambiguated meanings, rather than word stems, should improve information retrieval results. We present a new corpus-based algorithm for performing word sense disambiguation. The algorithm does not need to train on many senses of each word; instead, it uses the probability that certain concepts will occur together. That algorithm is then used to index several corpora of documents. Our indexing algorithm does not generally outperform the traditional stem-based tf.idf model.
Recommended Citation
Whaley, Jason M., "An Application of Word Sense Disambiguation to Information Retrieval" (1999). Dartmouth College Undergraduate Theses. 198.
https://digitalcommons.dartmouth.edu/senior_theses/198
Comments
Originally posted in the Dartmouth College Computer Science Technical Report Series, number PCS-TR99-352.