Mixed-signal processing for machine learning


Abstract
Recent advancements in machine learning algorithms, hardware, and datasets have led to the successful deployment of deep neural networks (DNNs) in various cloud-based services. Today, new applications are emerging where sensor bandwidth is higher than network bandwidth, and where network latency is not tolerable. By pushing DNN processing closer to the sensor, we can avoid throwing data away and improve the user experience. In the long term, it is foreseeable that such DNN processors will run on harvested energy, eliminating the cost and overhead of wired power connections and battery replacement. A significant challenge arises from the fact that DNNs are both memory and compute intensive, requiring millions of parameters and billions of arithmetic operations to perform a single inference.

This thesis presents circuit and architecture techniques that leverage the noise tolerance and parallel structure of DNNs to bring inference systems closer to the energy-efficiency limits of CMOS technology. In the low signal-to-noise ratio (SNR) regime where DNNs operate, thermally-limited analog signal processing circuits are more energy-efficient than digital. However, the massive scale of DNNs favors circuits compatible with dense digital memory. Mixed-signal processing allows us to integrate analog efficiency with digital scalability, but close attention must be paid to energy consumed at the analog-digital interface and in memory access. Binarized neural networks minimize this overhead, and hence operate closer to the analog energy limit.

This thesis describes a mixed-signal binary convolutional neural network (CNN) processor for embedded inference applications that achieves 3.8 µJ/classification at 86% accuracy on the CIFAR-10 image classification dataset. A weight-stationary, parallel-processing architecture amortizes memory access across many computations, leaving wide vector summation as the remaining energy bottleneck. This design features an energy-efficient switched-capacitor neuron that addresses this challenge, employing a 1024-bit thermometer-coded capacitive DAC section for summing point-wise products of CNN filter weights and activations and a 9-bit binary-weighted section for adding the filter bias. The design occupies 6 mm² in 28 nm CMOS, contains 328 KB of on-chip SRAM, operates at 237 frames/s, and consumes 0.9 mW from 0.6 V/0.8 V supplies. The corresponding energy per classification (3.8 µJ) amounts to a 40× improvement over the previous low-energy benchmark on CIFAR-10, achieved by both architectural specialization and mixed-signal processing. The switched-capacitor neuron array is 12.9× more energy efficient than a synthesized digital implementation, which amounts to a 4× advantage in system-level energy per classification at the same classification accuracy. We provide an apples-to-apples comparison of the mixed-signal, hand-designed digital, and synthesized digital implementations of this architecture.
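For context, the headline efficiency follows directly from the reported power and throughput: 0.9 mW ÷ 237 frames/s ≈ 3.8 µJ per classification. The short Python sketch below is illustrative only, not the chip's implementation; its names and sizes are assumptions drawn from the abstract. It shows the arithmetic each switched-capacitor neuron realizes in the charge domain: a 1024-element point-wise product of binary weights and activations, a wide summation, a 9-bit bias, and a sign activation.

import numpy as np

rng = np.random.default_rng(0)

# 1024 binary weights and activations, represented as +/-1; on the chip
# these are CNN filter weights and feature-map activations.
weights = rng.choice([-1, +1], size=1024)
activations = rng.choice([-1, +1], size=1024)

# 9-bit signed filter bias, mirroring the binary-weighted DAC section.
bias = int(rng.integers(-256, 256))

# Wide vector summation of point-wise products, plus bias. On the chip,
# the 1024 products drive a thermometer-coded capacitive DAC, and the
# sum appears as a voltage on a shared capacitor node.
pre_activation = int(weights @ activations) + bias

# Binarized networks apply a sign activation, so only the comparison
# against zero must be resolved, not the exact analog sum; this is the
# noise tolerance that makes low-SNR analog processing viable.
output = +1 if pre_activation >= 0 else -1
print(pre_activation, output)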

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2019
Publication date 2019
Issuance monographic
Language English

Creators/Contributors

Author Bankman, Daniel Joel
Degree supervisor Murmann, Boris
Thesis advisor Murmann, Boris
Thesis advisor Arbabian, Amin
Thesis advisor Mitra, Subhasish
Degree committee member Arbabian, Amin
Degree committee member Mitra, Subhasish
Associated with Stanford University, Department of Electrical Engineering.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Daniel Bankman.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.), Stanford University, 2019.
Location electronic resource

Access conditions

Copyright
© 2019 by Daniel Joel Bankman
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
