Date of Award

6-11-2022

Document Type

Thesis (Ph.D.)

Department or Program

Computer Science

First Advisor

Amit Chakrabarti

Abstract

Structured data-sets are often easy to represent using graphs. The prevalence of massive data-sets in the modern world gives rise to big graphs such as web graphs, social networks, biological networks, and citation graphs. Most of these graphs keep growing continuously and pose two major challenges in their processing: (a) it is infeasible to store them entirely in the memory of a regular server, and (b) even if stored entirely, it is incredibly inefficient to reread the whole graph every time a new query appears. Thus, a natural approach for efficiently processing and analyzing such graphs is reading them as a stream of edge insertions and deletions and maintaining a summary that can be (a) stored in affordable memory (significantly smaller than the input size) and (b) used to detect properties of the original graph. In this thesis, we explore the strengths and limitations of such graph streaming algorithms under three main paradigms: classical or standard streaming, adversarially robust streaming, and streaming verification.

In the classical streaming model, an algorithm needs to process an adversarially chosen input stream using space sublinear in the input size and return a desired output at the end of the stream. Here, we study a collection of fundamental directed graph problems like reachability, acyclicity testing, and topological sorting. Our investigation reveals that while most problems are provably hard for general digraphs, they admit efficient algorithms for the special and widely-studied subclass of tournament graphs. Further, we exhibit certain problems that become drastically easier when the stream elements arrive in random order rather than adversarial order, as well as problems that do not get much easier even under this relaxation. Furthermore, we study the graph coloring problem in this model and design color-efficient algorithms using novel parameterizations and establish complexity separations between different versions of the problem.

The classical streaming setting assumes that the entire input stream is fixed by an adversary before the algorithm reads it. Many randomized algorithms in this setting, however, fail when the stream is extended by an adaptive adversary based on past outputs received. This is the so-called adversarially robust streaming model. We show that graph coloring is significantly harder in the robust setting than in the classical setting, thus establishing the first such separation for a ``natural'' problem. We also design a class of efficient robust coloring algorithms using novel techniques.

In classical streaming, many important problems turn out to be ``intractable'', i.e., provably impossible to solve in sublinear space. It is then natural to consider an enhanced streaming setting where a space-bounded client outsources the computation to a space-unbounded but untrusted cloud service, who replies with the solution and a supporting ``proof'' that the client needs to verify. This is called streaming verification or the annotated streaming model. It allows algorithms or verification schemes for the otherwise intractable problems using both space and proof length sublinear in the input size. We devise efficient schemes that improve upon the state of the art for a variety of fundamental graph problems including triangle counting, maximum matching, topological sorting, maximal independent set, graph connectivity, and shortest paths, as well as for computing frequency-based functions such as distinct items and maximum frequency, which have broad applications in graph streaming. Some of our schemes were conjectured to be impossible, while some others attain smooth and optimal tradeoffs between space and communication costs.

Share

COinS