Beyond text : applying deep learning to signal data

Abstract/Contents

Abstract
Sequence modeling primitives have been responsible for breakthroughs across domains like natural language processing and genomics. Despite these advances, existing primitives still struggle to model the large class of signal data acquired from physical sensors. This data has unique characteristics that make it challenging to model: its resolution affects the training and generalization of models; it is sampled at high rates, resulting in dense data with long-range dependencies; and it is highly diverse, with application areas including healthcare, video processing, and industrial sensing. All of these properties raise the bar for universal approaches to modeling this data. This thesis develops a new set of approaches for modeling signal data using state space models. First, we introduce a sequence model called S4 that serves as a general building block for modeling signal data. Second, we generalize this modeling layer to multidimensional signals like images and videos, yielding the first state-of-the-art signal model on large-scale benchmarks such as ImageNet. Third, we incorporate S4 into a multiscale architecture, which makes it possible to model extremely long sequences of audio, including on a previously unsolved task involving unconditional autoregressive generation of raw audio samples. Finally, we demonstrate the widespread applicability of our approach to a variety of signal data, including a real-world application involving impedance sensor data used in the diagnosis of gastroesophageal reflux disease. Taken together, this new set of approaches provides a universal and versatile set of primitives for modeling diverse, multidimensional signals.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2024
Publication date 2024
Issuance monographic
Language English

Creators/Contributors

Author Goel, Karan
Degree supervisor Ré, Christopher
Thesis advisor Ré, Christopher
Thesis advisor Fatahalian, Kayvon
Thesis advisor Liang, Percy
Degree committee member Fatahalian, Kayvon
Degree committee member Liang, Percy
Associated with Stanford University, School of Engineering
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Karan Goel.
Note Submitted to the Computer Science Department.
Thesis Ph.D., Stanford University, 2024.
Location https://purl.stanford.edu/qb603fk1926

Access conditions

Copyright
© 2024 by Karan Goel
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
