Graph algorithms for systems design

Placeholder Show Content

Abstract/Contents

Abstract
Graphs are a powerful tool for data and system representation. Many types of complex and highly structured data can be represented using graphs, such as social networks, computer networks, and molecules. Graphs can also be used to represent computer systems, such as distributed storage networks and peer-to-peer communication networks. In this dissertation, we discuss approaches to processing graph data at scale and using graphs to design better systems. We first discuss two methods of processing graph data at scale. While they are extremely powerful, graph datasets pose unique challenges for their processing and storage. Graph Neural Networks (GNNs) are an effective method of applying deep learning to graph structured data. However, training GNNs can be extremely expensive to compute because of the interconnected and highly structured nature of graphs. We study methods for hierarchical aggregation, an approach to improving the efficiency of training GNNs. Another approach to understanding graph datasets is to examine frequencies of small, repeated patterns, called motifs. We propose the Temporal Activity State Block Model, an efficient analytical model for computing expected motif frequencies in temporal graphs, which have the added complexity of edges arriving over large timespans. We next cover two methods of applying graphs to design better systems. In a distributed storage system, it is often necessary to store data with redundancy in case a server fails, and the design of where and how frequently to create such redundancy can be represented as a graph problem. Fractional repetition (FR) codes are an approach to this designed to maximize storage capacity while ensuring failed nodes can be replaced by sending replacement data from the surviving nodes. We present load-balanced fractional repetition codes which are a strengthening of FR codes with additional guarantees about how efficiently failed nodes can be replaced. We next consider the problem of sending messages efficiently in a peer-to-peer network. The problem can be characterized by a graph representing which peers have data that another peer wants. Index coding is a method for designing efficient communication from a central server to a set of receivers. We adapt this approach to the peer-to-peer model and introduce and study embedded index coding.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2022; ©2022
Publication date 2022; 2022
Issuance monographic
Language English

Creators/Contributors

Author Porter, Alexandra Marie
Degree supervisor Leskovec, Jurij
Degree supervisor Wootters, Mary
Thesis advisor Leskovec, Jurij
Thesis advisor Wootters, Mary
Thesis advisor Charikar, Moses
Degree committee member Charikar, Moses
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Alexandra Porter.
Note Submitted to the Computer Science Department.
Thesis Thesis Ph.D. Stanford University 2022.
Location https://purl.stanford.edu/sz057dp7669

Access conditions

Copyright
© 2022 by Alexandra Marie Porter
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

Also listed in

Loading usage metrics...