Date of Award

6-1-2012

Document Type

Thesis (Master's)

Department or Program

Department of Computer Science

First Advisor

Afra Zomorodian

Abstract

The first step in topological data analysis is often the construction of a simplicial complex that approximates the lost topology of a sampled point set. Current techniques typically assume that the input is embedded in a metric space, usually Euclidean, and rely heavily on the underlying geometry for efficient computation. Consequently, these techniques do not extend to non-Euclidean or non-metric spaces. In this thesis, we present an oracle-based framework for constructing simplicial complexes over arbitrary topological spaces. The framework consists of an oracle and an algorithm that builds the simplicial complex by making calls to the oracle. We compare different algorithmic approaches to the construction, as well as alternative ways of representing the simplicial complex during the computation. Finally, we demonstrate the utility of our framework as a tool for approaching diverse problems by presenting three applications: multiword search in Google, the computational analysis of a language, and the study of protein structure.

Comments

Originally posted in the Dartmouth College Computer Science Technical Report Series, number TR2012-721.
