Date of Award


Document Type

Thesis (Undergraduate)


Department of Computer Science

First Advisor

Javed Aslam


This thesis presents a system for web-based information retrieval that supports precise and informative post-query organization (automated document clustering by topic) to decrease real search time on the part of the user. Most existing information retrieval systems depend on the user to perform intelligent, specific queries with Boolean operators in order to minimize the set of returned documents. The user essentially must guess the appropriate keywords before performing the query. Other systems use a vector space model, which is better suited to the document similarity operations that permit hierarchical clustering of returned documents by topic. This allows "post-query" refinement by the user. The system we propose is a hybrid between these two approaches: it is compatible with the former while providing the enhanced document organization permitted by the latter.
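
As an illustration only, the hybrid idea described in the abstract might be sketched as follows: a Boolean keyword filter selects the returned documents, and a vector space comparison then finds the most similar pair among them, which is the first merge step of a hierarchical (agglomerative) clustering. This is not the thesis's implementation. The function names, the whitespace tokenizer, the raw term-frequency weighting, and the sample documents are all assumptions made for this sketch.

```python
import math
from collections import Counter


def tf_vector(text):
    """Map a document to a sparse term-frequency vector.

    Assumption: lowercased whitespace tokens, raw counts as weights.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def boolean_and(docs, keywords):
    """Boolean AND retrieval: keep documents containing every keyword."""
    return [d for d in docs if all(k in tf_vector(d) for k in keywords)]


def most_similar_pair(docs):
    """Return (similarity, i, j) for the two most similar documents.

    This is the first merge an agglomerative clustering would perform.
    Returns None when fewer than two documents are given.
    """
    vecs = [tf_vector(d) for d in docs]
    best = None
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            s = cosine(vecs[i], vecs[j])
            if best is None or s > best[0]:
                best = (s, i, j)
    return best


# Hypothetical sample documents, invented for this sketch.
docs = [
    "jaguar car engine speed",
    "jaguar cat jungle habitat",
    "jaguar car dealer price",
    "python snake jungle",
]
hits = boolean_and(docs, ["jaguar"])  # Boolean query stage
pair = most_similar_pair(hits)        # post-query organization stage
```

In this sketch the Boolean query for "jaguar" returns three documents, and the similarity step groups the two car-related ones first, leaving the animal-related document separate. Repeating the merge step on the resulting groups would build the topic hierarchy.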


Originally posted in the Dartmouth College Computer Science Technical Report Series, number PCS-TR97-318.