Brain-inspired computing with resistive non-volatile memories

Eryilmaz, Sukru Burc; Stanford University, Department of Electrical Engineering.



Abstract/Contents

Abstract: Training large-scale neural networks requires thousands of processors, massive amounts of off-chip memory, and kilowatts of power to complete within a reasonable amount of time. The bulk of this power is consumed in moving data between the fast multicore processors and off-chip memory. After being trained in backend infrastructure, model parameters can be deployed on mobile devices for inference tasks, possibly combined with further training for local customization. However, off-chip memory accesses also dominate the power required in mobile devices, so the size and scale of neural network models deployed on mobile devices are limited by the amount of on-chip memory available, which is itself limited by the large area of SRAM. For both training and inference tasks, resistive switching memories offer a compact, scalable, and low-power alternative that permits on-chip co-located processing and memory in a fine-grained, distributed, parallel architecture. In this dissertation, approaches to using resistive memories in an analog fashion as synaptic weights in neural networks are described and analyzed. First, proof-of-concept demonstrations of training different types of neural networks with noisy, energy-efficient analog weight updates are presented using fabricated resistive memory devices. The effects of device variation on energy efficiency and training accuracy are measured. Then, using an experimentally validated compact model of metal-oxide resistive switching memory, the effects of device non-idealities on training accuracy and energy efficiency are studied, using a single-layer neural network (Restricted Boltzmann Machine) as a case study. Requirements for resistive memories, as well as programming schemes for mitigating the effects of device non-idealities, are elucidated. An architecture with resistive memories optimized for inference tasks, the Efficient Resistive Neural Inference Engine, is also presented.
This architecture combines algorithm and hardware co-optimizations by using pruning, quantization, and noise injection at the algorithm level, and low-power, low-precision analog and digital circuits at the circuit level. The architecture is then benchmarked against a fully digital CMOS system. Resistive technology outperforms digital CMOS by ~8× if a cell can store a weight with ~6-bit precision and if activations can tolerate 1- or 2-bit precision. If a cell cannot store more than 2 bits, or if activations require more than 3-bit precision, the benefits diminish significantly.
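
To make the bit-precision figures in the abstract concrete, the following is a minimal illustrative sketch of what storing a weight "with b-bit precision" and injecting noise could look like. It is not the dissertation's method: the weight range [-1, 1], the evenly spaced levels, the Gaussian noise model, and the names `quantize`, `program_cell`, and `max_quantization_error` are all assumptions made for illustration.

```python
import random


def quantize(w, bits, w_min=-1.0, w_max=1.0):
    """Snap w to the nearest of 2**bits evenly spaced levels in [w_min, w_max].

    Assumption: uniform levels over a fixed range; real devices may differ.
    """
    levels = 2 ** bits
    step = (w_max - w_min) / (levels - 1)
    clipped = min(max(w, w_min), w_max)
    index = round((clipped - w_min) / step)
    return w_min + index * step


def program_cell(w, bits, sigma, rng):
    """Quantize w, then add Gaussian noise as a stand-in for device variation.

    Assumption: additive zero-mean Gaussian noise with standard deviation sigma.
    """
    return quantize(w, bits) + rng.gauss(0.0, sigma)


def max_quantization_error(bits, samples=1000, seed=0):
    """Largest |w - quantize(w)| over random weights drawn from [-1, 1]."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(samples)]
    return max(abs(w - quantize(w, bits)) for w in weights)


if __name__ == "__main__":
    for bits in (2, 6):
        print(bits, "bits: max error", max_quantization_error(bits))
```

Under these assumptions the worst-case rounding error is half the level spacing, so a 2-bit cell can be off by up to 1/3 of the half-range while a 6-bit cell is off by at most 1/63. This is one way to see why the achievable cell precision matters for the comparison reported in the abstract.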

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2017
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Eryilmaz, Sukru Burc
Associated with	Stanford University, Department of Electrical Engineering.
Primary advisor	Wong, Hon-Sum Philip, 1959-
Thesis advisor	Wong, Hon-Sum Philip, 1959-
Thesis advisor	Mitra, Subhasish
Thesis advisor	Wong, S
Advisor	Mitra, Subhasish
Advisor	Wong, S

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	Sukru Burc Eryilmaz.
Note	Submitted to the Department of Electrical Engineering.
Thesis	Thesis (Ph.D.)--Stanford University, 2017.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
