Date of Award
5-30-1997
Document Type
Thesis (Undergraduate)
Department or Program
Department of Computer Science
First Advisor
Javed Aslam
Abstract
This thesis presents a system for web-based information retrieval that supports precise and informative post-query organization (automated document clustering by topic) to decrease real search time on the part of the user. Most existing Information Retrieval systems depend on the user to perform intelligent, specific queries with Boolean operators in order to minimize the set of returned documents. The user essentially must guess the appropriate keywords before performing the query. Other systems use a vector space model, which is better suited to the document similarity operations that permit hierarchical clustering of returned documents by topic. This allows "post-query" refinement by the user. The system we propose is a hybrid between these two systems, compatible with the former while providing the enhanced document organization permitted by the latter.
Recommended Citation
Hagen, Eric, "An Information Retrieval System for Performing Hierarchical Document Clustering" (1997). Dartmouth College Undergraduate Theses. 183.
https://digitalcommons.dartmouth.edu/senior_theses/183
Comments
Originally posted in the Dartmouth College Computer Science Technical Report Series, number PCS-TR97-318.