Graph algorithms for systems design
- Graphs are a powerful tool for data and system representation. Many types of complex and highly structured data can be represented using graphs, such as social networks, computer networks, and molecules. Graphs can also be used to represent computer systems, such as distributed storage networks and peer-to-peer communication networks. In this dissertation, we discuss approaches to processing graph data at scale and using graphs to design better systems. We first discuss two methods of processing graph data at scale. While they are extremely powerful, graph datasets pose unique challenges for their processing and storage. Graph Neural Networks (GNNs) are an effective method of applying deep learning to graph structured data. However, training GNNs can be extremely expensive to compute because of the interconnected and highly structured nature of graphs. We study methods for hierarchical aggregation, an approach to improving the efficiency of training GNNs. Another approach to understanding graph datasets is to examine frequencies of small, repeated patterns, called motifs. We propose the Temporal Activity State Block Model, an efficient analytical model for computing expected motif frequencies in temporal graphs, which have the added complexity of edges arriving over large timespans. We next cover two methods of applying graphs to design better systems. In a distributed storage system, it is often necessary to store data with redundancy in case a server fails, and the design of where and how frequently to create such redundancy can be represented as a graph problem. Fractional repetition (FR) codes are an approach to this designed to maximize storage capacity while ensuring failed nodes can be replaced by sending replacement data from the surviving nodes. We present load-balanced fractional repetition codes which are a strengthening of FR codes with additional guarantees about how efficiently failed nodes can be replaced. We next consider the problem of sending messages efficiently in a peer-to-peer network. The problem can be characterized by a graph representing which peers have data that another peer wants. Index coding is a method for designing efficient communication from a central server to a set of receivers. We adapt this approach to the peer-to-peer model and introduce and study embedded index coding.
|Type of resource
|electronic resource; remote; computer; online resource
|1 online resource.
|Porter, Alexandra Marie
|Degree committee member
|Stanford University, Computer Science Department
|Statement of responsibility
|Submitted to the Computer Science Department.
|Thesis Ph.D. Stanford University 2022.
- © 2022 by Alexandra Marie Porter
Also listed in
Loading usage metrics...