Brain-inspired computing with resistive non-volatile memories
- Training of large-scale neural networks requires thousands of processors, massive amounts of off-chip memory, and kilowatts of power to complete within a reasonable amount of time. The bulk of this power is consumed in moving data between the fast multicore processors and off-chip memory. After being trained on backend infrastructure, model parameters can be deployed on mobile devices for inference tasks, possibly combined with further training for local customization. However, off-chip memory accesses also dominate the power required on mobile devices, so the size and scale of deployable neural network models are limited by the amount of on-chip memory available, which is itself constrained by the large area of SRAM. For both training and inference tasks, resistive switching memories offer a compact, scalable, and low-power alternative that permits co-located on-chip processing and memory in a fine-grained, distributed, parallel architecture. In this dissertation, approaches to utilizing resistive memories in an analog fashion as synaptic weights in neural networks are described and analyzed. First, proof-of-concept demonstrations of training different types of neural networks with noisy, energy-efficient analog weight updates are presented using fabricated resistive memory devices. Effects of device variation on energy efficiency and training accuracy are measured. Then, using an experimentally validated compact model of metal-oxide resistive switching memory, the effects of device non-idealities on training accuracy and energy efficiency are studied, with a single-layer neural network (a Restricted Boltzmann Machine) as a case study. Requirements for resistive memories, as well as programming schemes that mitigate the effects of device non-idealities, are elucidated. An architecture with resistive memories optimized for inference tasks, the Efficient Resistive Neural Inference Engine, is also presented.
This architecture combines algorithm and hardware co-optimizations: pruning, quantization, and noise injection at the algorithm level, and low-power, low-precision analog and digital circuits at the circuit level. The architecture is then benchmarked against a fully digital CMOS system. The resistive technology outperforms digital CMOS by ~8× if a cell can store a weight with ~6-bit precision and if activations can tolerate 1- or 2-bit precision. If a cell cannot store more than 2 bits, or if activations require more than 3-bit precision, the benefits diminish significantly.
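The algorithm-level co-optimizations named above (pruning, quantization, and noise injection) can be illustrated with a minimal sketch. This is an illustrative assumption, not the dissertation's actual implementation: the function names, the magnitude-pruning criterion, the uniform 6-bit quantizer, and the Gaussian write-noise model are all placeholders standing in for whatever the thesis actually uses.

```python
# Hypothetical sketch of algorithm-level co-optimizations for a resistive
# inference engine: magnitude pruning, uniform quantization to a cell's
# bit precision, and Gaussian noise injection modeling conductance
# variation. All names and parameter values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def quantize(w, bits=6):
    """Uniformly quantize weights to 2**bits - 1 levels over [-1, 1],
    mimicking a cell that stores a weight with `bits`-bit precision."""
    levels = 2 ** bits - 1
    w = np.clip(w, -1.0, 1.0)
    return np.round((w + 1.0) / 2.0 * levels) / levels * 2.0 - 1.0

def prune(w, sparsity=0.5):
    """Zero out the smallest-magnitude weights (magnitude pruning)."""
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < thresh, 0.0, w)

def noisy_forward(x, w, noise_std=0.02, bits=6):
    """Forward pass through pruned, quantized weights, with additive
    Gaussian perturbations standing in for device write/read noise."""
    wq = quantize(prune(w), bits=bits)
    wn = wq + rng.normal(0.0, noise_std, size=wq.shape)
    return x @ wn

w = rng.uniform(-1, 1, size=(4, 3))   # toy weight matrix
x = rng.uniform(-1, 1, size=(2, 4))   # toy input batch
y = noisy_forward(x, w)
print(y.shape)  # (2, 3)
```

Injecting this noise during training, rather than only at inference time, is what lets the network tolerate the low-precision analog cells the benchmark above assumes.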
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Eryilmaz, Sukru Burc
|Stanford University, Department of Electrical Engineering.
|Wong, Hon-Sum Philip, 1959-
|Statement of responsibility
|Sukru Burc Eryilmaz.
|Submitted to the Department of Electrical Engineering.
|Thesis (Ph.D.)--Stanford University, 2017.
- © 2017 by Sukru Burc Eryilmaz