Graph algorithms for systems design

Porter, Alexandra Marie

Graph algorithms for systems design

<a href="https://embed.stanford.edu/iframe/?url=https%3A%2F%2Fpurl.stanford.edu%2Fsz057dp7669" class="su-underline">Show Content</a>

Abstract/Contents

Abstract: Graphs are a powerful tool for data and system representation. Many types of complex and highly structured data can be represented using graphs, such as social networks, computer networks, and molecules. Graphs can also be used to represent computer systems, such as distributed storage networks and peer-to-peer communication networks. In this dissertation, we discuss approaches to processing graph data at scale and using graphs to design better systems. We first discuss two methods of processing graph data at scale. While they are extremely powerful, graph datasets pose unique challenges for their processing and storage. Graph Neural Networks (GNNs) are an effective method of applying deep learning to graph structured data. However, training GNNs can be extremely expensive to compute because of the interconnected and highly structured nature of graphs. We study methods for hierarchical aggregation, an approach to improving the efficiency of training GNNs. Another approach to understanding graph datasets is to examine frequencies of small, repeated patterns, called motifs. We propose the Temporal Activity State Block Model, an efficient analytical model for computing expected motif frequencies in temporal graphs, which have the added complexity of edges arriving over large timespans. We next cover two methods of applying graphs to design better systems. In a distributed storage system, it is often necessary to store data with redundancy in case a server fails, and the design of where and how frequently to create such redundancy can be represented as a graph problem. Fractional repetition (FR) codes are an approach to this designed to maximize storage capacity while ensuring failed nodes can be replaced by sending replacement data from the surviving nodes. We present load-balanced fractional repetition codes which are a strengthening of FR codes with additional guarantees about how efficiently failed nodes can be replaced. We next consider the problem of sending messages efficiently in a peer-to-peer network. The problem can be characterized by a graph representing which peers have data that another peer wants. Index coding is a method for designing efficient communication from a central server to a set of receivers. We adapt this approach to the peer-to-peer model and introduce and study embedded index coding.

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2022; ©2022
Publication date	2022; 2022
Issuance	monographic
Language	English

Creators/Contributors

Author	Porter, Alexandra Marie
Degree supervisor	Leskovec, Jurij
Degree supervisor	Wootters, Mary
Thesis advisor	Leskovec, Jurij
Thesis advisor	Wootters, Mary
Thesis advisor	Charikar, Moses
Degree committee member	Charikar, Moses
Associated with	Stanford University, Computer Science Department

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Alexandra Porter.
Note	Submitted to the Computer Science Department.
Thesis	Thesis Ph.D. Stanford University 2022.
Location	https://purl.stanford.edu/sz057dp7669

Access conditions

License: This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

Also listed in

View in SearchWorks

Loading usage metrics...