Date of Award
Summer 6-6-2021
Document Type
Thesis (Undergraduate)
Department or Program
Department of Computer Science
First Advisor
Amit Chakrabarti
Abstract
In this thesis, we explore the problem of approximating the number of elementary substructures called simplices in large k-uniform hypergraphs. The hypergraphs are assumed to be too large to be stored in memory, so we adopt a data stream model, where the hypergraph is defined by a sequence of hyperedges.
First, we propose an algorithm that (ε, δ)-estimates the number of simplices using O(m^{1+1/k} / T) bits of space. In addition, we prove that no constant-pass streaming algorithm can (ε, δ)-approximate the number of simplices using less than O(m^{1+1/k} / T) bits of space. Thus we resolve the space complexity of the simplex counting problem by providing an algorithm that matches the lower bound.
Second, we examine the triangle counting problem, which is the special case k = 2, where the hypergraph is an ordinary graph. We develop and analyze an almost optimal O(n + m^{3/2} / T) triangle-counting algorithm based on ideas introduced in [KMPT12]. The proposed algorithm is subsequently used to establish a method for uniformly sampling triangles in a graph stream using O(m^{3/2} / T) bits of space, which beats the state-of-the-art O(mn / T) algorithm given by [PTTW13].
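To make the counted quantity concrete, the following is a minimal sketch of an exact baseline, not the thesis's algorithm. It assumes the usual definition of a simplex in a k-uniform hypergraph: a set of k+1 vertices all of whose k-element subsets are hyperedges (for k = 2, a triangle). The function name `count_simplices` and the input format (an iterable of vertex collections) are hypothetical choices made for illustration. The sketch stores every hyperedge, so its space is linear in the stream length, which is exactly the cost the streaming algorithms described above are designed to avoid.

```python
from itertools import combinations


def count_simplices(stream, k):
    """Exactly count simplices in a k-uniform hypergraph given as a stream.

    Illustrative baseline only: it keeps the whole stream in memory.
    """
    edges = set()
    for hyperedge in stream:
        e = frozenset(hyperedge)
        if len(e) != k:
            raise ValueError("every hyperedge must contain exactly k distinct vertices")
        edges.add(e)

    # All vertices that appear in at least one hyperedge.
    vertices = set().union(*edges) if edges else set()

    seen = set()   # candidate (k+1)-sets already examined
    count = 0
    for e in edges:
        for v in vertices - e:
            candidate = e | {v}          # a (k+1)-set containing hyperedge e
            if candidate in seen:
                continue
            seen.add(candidate)
            # A simplex requires every k-subset of the candidate to be a hyperedge.
            if all(frozenset(face) in edges for face in combinations(candidate, k)):
                count += 1
    return count


if __name__ == "__main__":
    # k = 2: the complete graph on 4 vertices contains 4 triangles.
    k4_edges = list(combinations(range(4), 2))
    print(count_simplices(k4_edges, 2))
```

Run as a script, this prints 4 for the complete graph on four vertices. With k = 3, the four 3-subsets of a 4-vertex set form exactly one simplex.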
Recommended Citation
Haris, Themistoklis, "Counting and Sampling Small Structures in Graph and Hypergraph Data Streams" (2021). Dartmouth College Undergraduate Theses. 230.
https://digitalcommons.dartmouth.edu/senior_theses/230
Included in
Databases and Information Systems Commons, Data Science Commons, OS and Networks Commons, Theory and Algorithms Commons