Hardware acceleration for graph neural networks


Abstract/Contents

Abstract
Traditional deep neural networks (DNNs) rely on regularly structured inputs such as vectors, images, or sequences. This reliance on regularity makes them difficult to use in domains where data is naturally irregular, such as connections on social media. Graph neural networks (GNNs) extend DNNs to operate on arbitrarily structured graph-valued data. However, GNNs do not accelerate efficiently on CPUs, GPUs, and DNN accelerators such as the TPU, due to their irregular and input-dependent pattern of computation. As a result, GNNs have much higher inference latency than other types of DNNs in practice. This limits their use to applications where inference can be precomputed offline. This dissertation presents hardware and software techniques to reduce the inference latency of GNNs. To this end, we make the following three contributions. First, we introduce a decomposition of GNN inference into a series of three computational phases: aggregate, combine, and update. This decomposition permits a simple representation for a broad class of GNNs we call GReTA. Second, we introduce a GNN accelerator architecture called GRIP. GRIP alleviates the bottlenecks in each phase by using separate memory subsystems specialized for the different access patterns in each phase of inference. Finally, we introduce optimizations to reduce the working memory and bandwidth requirements for GNN inference, including caching partitions of feature data, inter-phase pipelining, and merging computation. We also introduce a novel optimization called vertex-tiling that substantially improves latency by increasing the reuse of weight values during inference. Taken together, these techniques significantly reduce the latency of GNN inference over existing state-of-the-art implementations. Evaluated on a broad range of models and datasets, our accelerator reduces latency by 7-70x compared to CPU and GPU baselines.
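The three-phase decomposition described in the abstract can be illustrated with a minimal sketch. The code below is not the dissertation's GReTA abstraction or GRIP hardware; it is an assumed, simplified NumPy rendering of one GNN layer split into aggregate, combine, and update phases, to show why the aggregate phase has irregular, edge-dependent memory access while the combine phase is a regular dense matrix multiply.

```python
import numpy as np

def gnn_layer(features, edges, weights):
    """One GNN layer in the three-phase form: aggregate, combine, update.
    Illustrative sketch only; GReTA generalizes these phases with
    user-defined functions."""
    # Aggregate: each vertex sums the features of its in-neighbors.
    # The access pattern follows the edge list, so it is irregular and
    # input-dependent -- the part that maps poorly to DNN accelerators.
    agg = np.zeros_like(features)
    for src, dst in edges:
        agg[dst] += features[src]

    # Combine: a dense matrix multiply with the layer weights, a regular
    # computation that reuses the weight matrix across all vertices
    # (the reuse that vertex-tiling exploits).
    combined = agg @ weights

    # Update: an elementwise nonlinearity (ReLU here).
    return np.maximum(combined, 0.0)

# Tiny example: 3 vertices with 2 features each, edges 0->2 and 1->2.
features = np.array([[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]])
edges = [(0, 2), (1, 2)]
weights = np.eye(2)
out = gnn_layer(features, edges, weights)
```

In this example vertex 2 aggregates vertices 0 and 1, giving output features [4, 6] after an identity combine and ReLU, while vertices with no in-neighbors stay at zero.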

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2021
Publication date 2021
Issuance monographic
Language English

Creators/Contributors

Author Kiningham, Kevin Nicholas
Degree supervisor Levis, Philip
Thesis advisor Levis, Philip
Thesis advisor Horowitz, Mark (Mark Alan)
Thesis advisor Ré, Christopher
Degree committee member Horowitz, Mark (Mark Alan)
Degree committee member Ré, Christopher
Associated with Stanford University, Department of Electrical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Kevin Kiningham.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2021.
Location https://purl.stanford.edu/rd907hk5005

Access conditions

Copyright
© 2021 by Kevin Nicholas Kiningham
License
This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).
