Date of Award


Document Type

Thesis (Master's)


Department of Computer Science

First Advisor

Afra Zomorodian


The first step in topological data analysis is often the construction of a simplicial complex that approximates the lost topology of a sampled point set. Current techniques typically assume that the input is embedded in a metric space, usually Euclidean, and rely heavily on the underlying geometry for efficient computation. Consequently, these techniques do not extend to non-Euclidean or non-metric spaces. In this thesis, we present an oracle-based framework for constructing simplicial complexes over arbitrary topological spaces. The framework consists of an oracle and an algorithm that builds the simplicial complex by making calls to the oracle. We compare different algorithmic approaches to the construction, as well as alternative ways of representing the simplicial complex during the computation. Finally, we demonstrate the utility of our framework as a tool for approaching diverse problems by presenting three applications: multiword search in Google, the computational analysis of a language, and the study of protein structure.
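
As a rough illustration of the oracle-and-algorithm split described above, the following minimal sketch grows a complex one dimension at a time using only yes/no oracle calls. It is not taken from the thesis: the names `build_complex` and `cooccur_oracle`, the toy `docs` data, and the assumption that the oracle is closed under taking faces (if it accepts a simplex, it accepts every face of it) are all hypothetical choices made for this example.

```python
from typing import Callable, FrozenSet, Iterable, List, Set

Simplex = FrozenSet[int]


def build_complex(
    vertices: Iterable[int],
    oracle: Callable[[Simplex], bool],
    max_dim: int,
) -> List[Set[Simplex]]:
    """Return the simplices of each dimension, found using only oracle calls.

    Assumption (not from the source): the oracle is closed under faces,
    so a candidate is queried only if all of its facets were accepted.
    """
    vertices = sorted(vertices)
    current = {frozenset([v]) for v in vertices if oracle(frozenset([v]))}
    levels = [current]
    for _ in range(max_dim):
        # Extend every accepted simplex by one additional vertex.
        candidates = {s | {v} for s in current for v in vertices if v not in s}
        accepted = set()
        for c in candidates:
            # Skip the oracle call when some facet is already known to be absent.
            if all(c - {v} in current for v in c) and oracle(c):
                accepted.add(c)
        if not accepted:
            break
        levels.append(accepted)
        current = accepted
    return levels


# Hypothetical non-metric oracle: a simplex is accepted when all of its
# vertices (say, word ids) occur together in at least one toy "document".
docs = [{0, 1, 2}, {2, 3}]


def cooccur_oracle(simplex: Simplex) -> bool:
    return any(simplex <= d for d in docs)


levels = build_complex(range(4), cooccur_oracle, max_dim=2)
for dim, simplices in enumerate(levels):
    print(dim, sorted(sorted(s) for s in simplices))
```

The construction never inspects coordinates or distances, so any membership test can stand in for the oracle. Checking facets before querying is one way to reduce oracle calls, under the closure assumption stated above.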


Originally posted in the Dartmouth College Computer Science Technical Report Series, number TR2012-721.