RRAM compute-in-memory hardware for efficient, versatile, and accurate AI inference

Abstract/Contents

Abstract
Performing ever-demanding artificial intelligence (AI) tasks directly on resource-constrained edge devices calls for unprecedented energy efficiency in edge AI hardware. Today's AI hardware consumes most of its energy moving data between separate compute and memory units. Compute-in-memory (CIM) architectures using resistive random-access memory (RRAM) integrated on the CMOS logic platform alleviate this challenge by storing AI model weights in dense, analog RRAM devices and performing AI computation directly within the RRAM array, thus eliminating explicit weight memory access. To date, AI application-level benchmarks on fully integrated RRAM-CIM hardware have been limited in diversity and complexity. Meanwhile, the energy-efficiency benefit of CIM usually comes at the cost of functional flexibility and computational accuracy, hampering its practical use at the edge. Such trade-offs between efficiency, versatility, and accuracy cannot be addressed by isolated improvements to any single layer of the design. To ameliorate this fundamental trade-off, this thesis presents two RRAM-CIM chips whose design, tape-out, and testing I led. The first chip contains a single RRAM-CIM core with 65K RRAM devices; the second chip, named NeuRRAM, integrates 48 RRAM-CIM cores with a total of 3M RRAM devices. The two chips integrate multiple innovations, including a voltage-mode sensing scheme, a transposable neurosynaptic array architecture, and non-ideality-aware model training and fine-tuning techniques.
By co-optimizing across all hierarchies of the design, from algorithms and architecture to circuits and devices, the two chips simultaneously deliver a high degree of versatility in reconfiguring CIM cores for diverse model architectures, higher energy and area efficiency than prior art across various computational bit-precisions, and fully hardware-measured accuracy comparable to software models across diverse AI benchmarks, including MNIST and CIFAR-10 image classification using convolutional neural networks, Google speech command recognition using a long short-term memory network, and MNIST image recovery using a restricted Boltzmann machine.
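The in-memory computing principle summarized above can be sketched numerically. The following is a minimal illustration only, not a model of the chips' actual circuits: weights map to device conductances within an assumed conductance window, input activations map to row read voltages, and each column current sums the voltage-conductance products per Kirchhoff's current law, yielding an analog matrix-vector multiplication. All numeric values (conductance range, voltage range, array size) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4x3 signed weight matrix in [-1, 1].
weights = rng.uniform(-1.0, 1.0, size=(4, 3))

# Map weights linearly into an assumed RRAM conductance window (siemens).
# Real arrays often use differential device pairs for signed weights;
# a simple affine offset is used here purely for illustration.
g_min, g_max = 1e-6, 50e-6
dg = g_max - g_min
conductances = g_min + (weights + 1.0) / 2.0 * dg

# Input activations encoded as read voltages on the rows (volts).
voltages = rng.uniform(0.0, 0.2, size=4)

# Kirchhoff's current law: each column current is the dot product of the
# row voltages with that column's conductances -- the analog MVM.
currents = voltages @ conductances

# Remove the affine offset to decode back to the weight-domain result.
decoded = (currents - voltages.sum() * (g_min + dg / 2.0)) / (dg / 2.0)

# The decoded currents match a conventional digital matrix-vector product.
print(np.allclose(decoded, voltages @ weights))
```

In hardware, the decoded column currents would be digitized by per-column analog-to-digital converters, whose precision and the devices' conductance variability are among the non-idealities that the training and fine-tuning techniques mentioned above are designed to absorb.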

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2022; ©2022
Publication date 2022
Issuance monographic
Language English

Creators/Contributors

Author Wan, Weier
Degree supervisor Wong, Hon-Sum Philip, 1959-
Thesis advisor Wong, Hon-Sum Philip, 1959-
Thesis advisor Cauwenberghs, Gert
Thesis advisor Raina, Priyanka, (Assistant Professor of Electrical Engineering)
Degree committee member Cauwenberghs, Gert
Degree committee member Raina, Priyanka, (Assistant Professor of Electrical Engineering)
Associated with Stanford University, Department of Electrical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Weier Wan.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.), Stanford University, 2022.
Location https://purl.stanford.edu/nx804ww3510

Access conditions

Copyright
© 2022 by Weier Wan
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
