Efficient methods and hardware for deep learning
Abstract/Contents
- Abstract
- The future will be populated with intelligent devices that require inexpensive, low-power hardware platforms. Deep neural networks have evolved to be the state-of-the-art technique for machine learning tasks. However, these algorithms are computationally intensive, which makes it difficult to deploy on embedded devices with limited hardware resources and a tight power budget. Since Moore's law and technology scaling are slowing down, technology alone will not address this issue. To solve this problem, we focus on efficient algorithms and domain-specific architectures specially designed for the algorithm. By performing optimizations across the full stack from application through hardware, we improved the efficiency of deep learning through smaller model size, higher prediction accuracy, faster prediction speed, and lower power consumption. Our approach starts by changing the algorithm, using "Deep Compression" that significantly reduces the number of parameters and computation requirements of deep learning models by pruning, trained quantization, and variable length coding. "Deep Compression" can reduce the model size by 18x to 49x without hurting the prediction accuracy. We also discovered that pruning and the sparsity constraint not only applies to model compression but also applies to regularization, and we proposed dense-sparse-dense training (DSD), which can improve the prediction accuracy for a wide range of deep learning models. To efficiently implement "Deep Compression" in hardware, we developed EIE, the "Efficient Inference Engine", a domain-specific hardware accelerator that performs inference directly on the compressed model which significantly saves memory bandwidth. Taking advantage of the compressed model, and being able to deal with the irregular computation pattern efficiently, EIE improves the speed by 13x and energy efficiency by 3,400x over GPU.
Description
Type of resource | text |
---|---|
Form | electronic; electronic resource; remote |
Extent | 1 online resource. |
Publication date | 2017 |
Issuance | monographic |
Language | English |
Creators/Contributors
Associated with | Han, Song | |
---|---|---|
Associated with | Stanford University, Department of Electrical Engineering. | |
Primary advisor | Dally, William J | |
Primary advisor | Horowitz, Mark | |
Thesis advisor | Dally, William J | |
Thesis advisor | Horowitz, Mark | |
Thesis advisor | Li, Fei Fei, 1976- | |
Advisor | Li, Fei Fei, 1976- |
Subjects
Genre | Theses |
---|
Bibliographic information
Statement of responsibility | Song Han. |
---|---|
Note | Submitted to the Department of Electrical Engineering. |
Thesis | Thesis (Ph.D.)--Stanford University, 2017. |
Location | electronic resource |
Access conditions
- Copyright
- © 2017 by Song Han
- License
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
Also listed in
Loading usage metrics...