Hardware acceleration for graph neural networks
Abstract/Contents
- Abstract
- Traditional deep neural networks (DNNs) rely on regularly structured inputs such as vectors, images, or sequences. This reliance on regularity makes them difficult to use in domains where data is naturally irregular, such as connections on social media. Graph neural networks (GNNs) extend DNNs to operate on arbitrarily structured graph-valued data. However, GNNs do not accelerate efficiently on CPUs, GPUs, or DNN accelerators such as TPUs, due to their irregular and input-dependent pattern of computation. As a result, GNNs have much higher inference latency than other types of DNNs in practice, which limits their use to applications where inference can be precomputed offline. This dissertation presents hardware and software techniques to reduce the inference latency of GNNs. To this end, we make the following three contributions. First, we introduce a decomposition of GNN inference into a series of three computational phases: aggregate, combine, and update. This decomposition permits a simple representation for a broad class of GNNs, which we call GReTA. Second, we introduce a GNN accelerator architecture called GRIP. GRIP alleviates the bottlenecks in each phase by using separate memory subsystems specialized for the different access patterns of each phase of inference. Finally, we introduce optimizations that reduce the working memory and bandwidth requirements of GNN inference, including caching partitions of feature data, inter-phase pipelining, and merging computation. We also introduce a novel optimization called vertex-tiling that substantially improves latency by increasing the reuse of weight values during inference. Taken together, these techniques significantly reduce the latency of GNN inference over existing state-of-the-art implementations. Evaluated on a broad range of models and datasets, our accelerator reduces latency by 7-70x compared to CPU and GPU baselines.
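The three-phase decomposition described in the abstract can be sketched as follows. This is a minimal illustrative sketch only: the function names, data layout, and sum/ReLU choices are assumptions for exposition, not the dissertation's actual GReTA abstraction or GRIP implementation.

```python
# Illustrative sketch of the aggregate/combine/update decomposition of
# GNN inference. All names and shapes here are assumptions, not the
# dissertation's actual GReTA interface.

def aggregate(edges, features):
    """Phase 1: gather and reduce neighbor features per destination
    vertex (the irregular, input-dependent memory-access phase)."""
    dim = len(next(iter(features.values())))
    agg = {v: [0.0] * dim for v in features}
    for src, dst in edges:
        agg[dst] = [a + f for a, f in zip(agg[dst], features[src])]
    return agg

def combine(agg, weights):
    """Phase 2: dense transform of aggregated features
    (the regular, matrix-multiply-like phase)."""
    return {
        v: [sum(x * w for x, w in zip(vec, col)) for col in zip(*weights)]
        for v, vec in agg.items()
    }

def update(combined):
    """Phase 3: element-wise nonlinearity (ReLU here) producing the
    next layer's vertex features."""
    return {v: [max(0.0, x) for x in vec] for v, vec in combined.items()}

# Tiny example: a 3-vertex directed graph with 2-dimensional features.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
edges = [(0, 1), (2, 1), (1, 0)]       # (src, dst) pairs
weights = [[1.0, -1.0], [0.5, 0.5]]    # 2x2 weight matrix

out = update(combine(aggregate(edges, features), weights))
```

The separation matters for hardware: the aggregate phase is bound by irregular memory access, while combine is a dense computation, so an accelerator can specialize a memory subsystem for each, as GRIP does.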
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2021; ©2021 |
Publication date | 2021 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Kiningham, Kevin Nicholas |
---|---|
Degree supervisor | Levis, Philip |
Thesis advisor | Levis, Philip |
Thesis advisor | Horowitz, Mark (Mark Alan) |
Thesis advisor | Ré, Christopher |
Degree committee member | Horowitz, Mark (Mark Alan) |
Degree committee member | Ré, Christopher |
Associated with | Stanford University, Department of Electrical Engineering |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Kevin Kiningham. |
---|---|
Note | Submitted to the Department of Electrical Engineering. |
Thesis | Thesis (Ph.D.)--Stanford University, 2021. |
Location | https://purl.stanford.edu/rd907hk5005 |
Access conditions
- Copyright
- © 2021 by Kevin Nicholas Kiningham
- License
- This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).