Date of Award
Department of Computer Science
This thesis presents a system for web-based information retrieval that supports precise and informative post-query organization (automated document clustering by topic) to reduce the time the user actually spends searching. Most existing information retrieval systems depend on the user to formulate intelligent, specific queries with Boolean operators in order to minimize the set of returned documents. The user essentially must guess the appropriate keywords before performing the query. Other systems use a vector space model, which is better suited to the document similarity operations that permit hierarchical clustering of returned documents by topic. This allows "post-query" refinement by the user. The system we propose is a hybrid between these two approaches, compatible with the former while providing the enhanced document organization permitted by the latter.
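As a rough illustration of the vector space idea mentioned in the abstract, the following minimal sketch represents each returned document as a term-frequency vector, compares documents by cosine similarity, and merges the most similar clusters bottom-up into a hierarchy. It is not the thesis's implementation: the function names (`tf_vector`, `cosine`, `cluster`), the whitespace tokenization, and the choice to represent a merged cluster by the sum of its members' vectors are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Map a document to a sparse term-frequency vector (term -> count).

    Assumption: naive lowercase whitespace tokenization, no stemming or stop words.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs):
    """Agglomerative clustering of documents into a nested tuple of indices.

    Each cluster is a (tree, vector) pair. The two most similar clusters are
    merged repeatedly; the merged vector is the sum of the members' vectors
    (an illustrative linkage choice, not taken from the thesis).
    """
    if not docs:
        return None
    clusters = [(i, tf_vector(d)) for i, d in enumerate(docs)]
    while len(clusters) > 1:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        # Pick the pair of clusters with the highest cosine similarity.
        i, j = max(pairs, key=lambda p: cosine(clusters[p[0]][1], clusters[p[1]][1]))
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]
```

For example, calling `cluster(["cat dog", "cat dog", "stock market", "stock market"])` groups the two pet documents and the two finance documents into separate subtrees before joining them at the root. A user could then browse the resulting tree by topic rather than refining the original keyword query.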
Hagen, Eric, "An Information Retrieval System for Performing Hierarchical Document Clustering" (1997). Dartmouth College Undergraduate Theses. 183.