Algorithms for bipartite matching problems with connections to sparsification and streaming

Placeholder Show Content

Abstract/Contents

Abstract
The problem of finding maximum matchings in bipartite graphs is a classical problem in combinatorial optimization with a long algorithmic history. Graph sparsification is a more recent paradigm of replacing a graph with a smaller subgraph that preserves some useful properties of the original graph, perhaps approximately. Traditionally, sparsification has been used for obtaining faster algorithms for cut-based optimization problems. The contributions of this thesis are centered around new algorithms for bipartite matching problems, in which, surprisingly, graph sparsification plays a major role, and efficient algorithms for constructing sparsifiers in modern data models. In the first part of the thesis we develop sublinear time algorithms for finding perfect matchings in regular bipartite graphs. These graphs have been studied extensively in the context of expander constructions, and have several applications in combinatorial optimization. The problem of finding perfect matchings in regular bipartite graphs has seen almost 100 years of algorithmic history, with the first algorithm dating back to K\"onig in 1916 and an algorithm with runtime linear in the number of edges in the graph discovered in 2000. In this thesis we show that, even though traditionally the use of sparsification has been restricted to cut-based problems, in fact sparsification yields extremely efficient {\em sublinear time} algorithms for finding perfect matchings in regular bipartite graphs when the graph is given in adjacency array representation. Thus, our algorithms recover a perfect matching (with high probability) without looking the whole input. We present two approaches, one based on independent sampling and another on random walks, obtaining an algorithm that recovers a perfect matching in $O(n\log n)$ time, within $O(\log n)$ of output complexity, essentially closing the problem. In the second part of the thesis we study the streaming complexity of maximum bipartite matching. This problem is relevant to modern data models, where the algorithm is constrained in space and is only allowed few passes over the input. We are interested in determining the best tradeoff between the space usage and the quality of the solution obtained. We first study the problem in the single pass setting. A central object of our study is a new notion of sparsification relevant to matching problems: we define the notion of an $\e$-matching cover of a bipartite graph as a subgraph that approximately preserves sizes of matchings between every two subsets of vertices, which can be viewed as a 'sparsifier' for matching problems. We give an efficient construction of a sparse subgraph that we call a 'matching skeleton', which we show is a linear-size matching cover for a certain range of parameters (in fact, for $\e> 1/2$). We then show that our 'sparsifier' can be applied repeatedly while maintaining a non-trivial approximation ratio in the streaming model with vertex arrivals, obtaining the first $1-1/e$ deterministic one-pass streaming algorithm that uses linear space for this setting. Further, we show that this is in fact best possible: no algorithm can obtain a better than $1-1/e$ approximation in a single pass unless it uses significantly more than quasilinear space. This is a rather striking conclusion since a $1-1/e$ approximation can be obtained even in the more restrictive online model for this setting. Thus, we show that streaming algorithms can get no advantage over online algorithms for this problem unless they use substantially more than quasilinear space. Our impossibility results for approximating matchings in a single pass using small space exploit a surprising connection between the sparsifiers that we define and a family of graphs known as \rs graphs. In particular, we show that bounding the best possible size of $\e$-covers for general $\e$ is essentially equivalent to determining the optimal size of an $\e$-\rs graph. These graphs have received significant attention due to applications in PCP constructions, property testing and additive combinatorics, but determining their optimal size still remains a challenging open problem. Besides giving matching upper and lower bounds for single pass algorithms in the vertex arrival setting, we also consider the problem of approximating matchings in multiple passes. Here we give an algorithm that achieves a factor of $1-e^{-k}k^{k}/k!=1-\frac{1}{\sqrt{2\pi k}}+o(1/k)$ in $k$ passes, improving upon the previously best known approximation. In the third part of the thesis we consider the concept of {\em spectral sparsification} introduced by Spielman and Teng. Here, we uncover a connection between spectral sparsification and spanners, i.e. subgraphs that approximately preserve shortest path distances. This connection allows us to obtain a quasilinear time algorithm for constructing spectral sparsifiers using approximate distance oracles and entirely bypassing linear system solvers, which was previously the only known way of constructing spectral sparsifiers in quasilinear time. Finally, in the last part of the thesis we design an efficient implementation of cut-preserving sparsification in a streaming setting with edge deletions using only one pass over the data.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2012
Issuance monographic
Language English

Creators/Contributors

Associated with Kapralov, Mikhail
Associated with Stanford University, Department of Computational and Mathematical Engineering.
Primary advisor Goel, Ashish
Thesis advisor Goel, Ashish
Thesis advisor Roughgarden, Tim
Thesis advisor Saberi, Amin
Advisor Roughgarden, Tim
Advisor Saberi, Amin

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Mikhail Kapralov.
Note Submitted to the Department of Computational and Mathematical Engineering.
Thesis Ph.D. Stanford University 2012
Location electronic resource

Access conditions

Copyright
© 2012 by Mikhail Kapralov

Also listed in

Loading usage metrics...