RRAM compute-in-memory hardware for efficient, versatile, and accurate AI inference

Abstract/Contents

Abstract
Performing ever-demanding artificial intelligence (AI) tasks directly on resource-constrained edge devices calls for unprecedented energy efficiency in edge AI hardware. Today's AI hardware consumes most of its energy moving data between separate compute and memory units. Compute-in-memory (CIM) architectures using resistive random-access memory (RRAM) integrated on the CMOS logic platform alleviate this challenge by storing AI model weights in dense, analog RRAM devices and performing AI computation directly within the RRAM array, thus eliminating explicit weight memory access. To date, AI application-level benchmarks on fully integrated RRAM-CIM hardware have been limited in diversity and complexity. Meanwhile, the energy-efficiency benefit of CIM usually comes at the cost of functional flexibility and computational accuracy, hampering its practical use at the edge. Such trade-offs between efficiency, versatility, and accuracy cannot be addressed by isolated improvements to any single layer of the design. To ameliorate this fundamental trade-off, this thesis presents two RRAM-CIM chips whose design, tape-out, and testing I led. The first chip contains a single RRAM-CIM core with 65K RRAM devices; the second chip, named NeuRRAM, integrates 48 RRAM-CIM cores with a total of 3M RRAM devices. The two chips integrate multiple innovations, including a voltage-mode sensing scheme, a transposable neurosynaptic array architecture, and non-ideality-aware model training and fine-tuning techniques.
By co-optimizing across all hierarchies of the design, from algorithms and architecture to circuits and devices, the two chips simultaneously deliver a high degree of versatility in reconfiguring CIM cores for diverse model architectures, higher energy and area efficiency than prior art across various computational bit-precisions, and fully hardware-measured accuracy comparable to software models across diverse AI benchmarks, including MNIST and CIFAR-10 image classification using convolutional neural networks, Google speech command recognition using a long short-term memory network, and MNIST image recovery using a restricted Boltzmann machine.
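The in-memory computing principle summarized above can be sketched numerically. The following is a minimal illustration only, not a model of the chips' actual circuits: weights map to device conductances within an assumed conductance window, input activations map to row read voltages, and each column current sums the voltage-conductance products per Kirchhoff's current law, yielding an analog matrix-vector multiplication. All numeric values (conductance range, voltage range, array size) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4x3 signed weight matrix in [-1, 1].
weights = rng.uniform(-1.0, 1.0, size=(4, 3))

# Map weights linearly into an assumed RRAM conductance window (siemens).
# Real arrays often use differential device pairs for signed weights;
# a simple affine offset is used here purely for illustration.
g_min, g_max = 1e-6, 50e-6
dg = g_max - g_min
conductances = g_min + (weights + 1.0) / 2.0 * dg

# Input activations encoded as read voltages on the rows (volts).
voltages = rng.uniform(0.0, 0.2, size=4)

# Kirchhoff's current law: each column current is the dot product of the
# row voltages with that column's conductances -- the analog MVM.
currents = voltages @ conductances

# Remove the affine offset to decode back to the weight-domain result.
decoded = (currents - voltages.sum() * (g_min + dg / 2.0)) / (dg / 2.0)

# The decoded currents match a conventional digital matrix-vector product.
print(np.allclose(decoded, voltages @ weights))
```

In hardware, the decoded column currents would be digitized by per-column analog-to-digital converters, whose precision and the devices' conductance variability are among the non-idealities that the training and fine-tuning techniques mentioned above are designed to absorb.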

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2022; ©2022
Publication date 2022
Issuance monographic
Language English

Creators/Contributors

Author Wan, Weier
Degree supervisor Wong, Hon-Sum Philip, 1959-
Thesis advisor Wong, Hon-Sum Philip, 1959-
Thesis advisor Cauwenberghs, Gert
Thesis advisor Raina, Priyanka, (Assistant Professor of Electrical Engineering)
Degree committee member Cauwenberghs, Gert
Degree committee member Raina, Priyanka, (Assistant Professor of Electrical Engineering)
Associated with Stanford University, Department of Electrical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Weier Wan.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.), Stanford University, 2022.
Location https://purl.stanford.edu/nx804ww3510

Access conditions

Copyright
© 2022 by Weier Wan
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
