Scalable near-data processing systems for data-intensive applications

Abstract/Contents

Abstract
Emerging big data applications, such as deep learning, graph processing, and data analytics, process massive data sets within rigorous time constraints. For such data-intensive workloads, the frequent and expensive data movement between memory and compute modules dominates both execution time and energy consumption, seriously impeding future performance scaling. Moreover, the end of silicon scaling has made all compute systems energy-constrained. It is therefore increasingly critical to address this energy bottleneck for data-intensive applications. One promising way to alleviate the inefficiencies of data movement is to avoid it altogether by executing computations closer to data locations, an approach commonly referred to as Near-Data Processing (NDP). Recent advances in integration technology allow us to implement NDP systems in a practical way by vertically stacking logic chips and memory modules. Hence, now is the time to develop architectural support for NDP across both hardware and software levels. This involves developing practical system architectures and programming models as an easy-to-use hardware/software interface, designing efficient processing logic hardware to exploit the abundant 3D memory bandwidth, and investigating scalable software dataflow schemes that achieve optimized scheduling on the hardware resources. The focus of this dissertation is to architect practical, efficient, and scalable NDP systems for data-intensive processing. To this end, we present a coherent set of hardware and software solutions to address architectural challenges for both general-purpose and specialized computing platforms. First, we propose a practical and scalable NDP system architecture for big data applications such as deep learning and graph analytics.
The architecture features simple yet efficient support for virtual memory, cache coherence, and data communication, which leads to a 2.5x energy efficiency improvement over prior NDP designs and 16x over conventional systems. Second, we design HRL, an efficient NDP compute logic that uses a reconfigurable array with both fine-grained compute units for efficient arithmetic computations and coarse-grained logic blocks for flexible data and control flows. HRL improves energy efficiency by 2x over conventional fine-grained and coarse-grained reconfigurable circuits. Third, we investigate domain-specific NDP accelerators for deep learning, and develop TETRIS, a neural network accelerator using 3D-stacked DRAM. We develop both the hardware architecture and dataflow scheduling for TETRIS, enabling 4x higher performance and 1.5x better energy efficiency compared to state-of-the-art accelerators. Finally, we present the enabling techniques for using dense commodity DRAM arrays as a fine-grained reconfigurable fabric called DRAF. DRAF is 10x denser and 3x more power-efficient than conventional FPGAs, and also supports multiple design contexts. These features make DRAF appropriate for cost- and power-constrained applications in multi-tenancy environments such as datacenters and mobile devices.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2018; ©2018
Publication date 2018
Issuance monographic
Language English

Creators/Contributors

Author Gao, Mingyu
Degree supervisor Kozyrakis, Christoforos, 1974-
Thesis advisor Kozyrakis, Christoforos, 1974-
Thesis advisor Dally, William J
Thesis advisor Horowitz, Mark (Mark Alan)
Degree committee member Dally, William J
Degree committee member Horowitz, Mark (Mark Alan)
Associated with Stanford University, Department of Electrical Engineering.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Mingyu Gao.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis Ph.D. Stanford University 2018.
Location electronic resource

Access conditions

Copyright
© 2018 by Mingyu Gao
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
