Beyond text : applying deep learning to signal data
Abstract/Contents
- Abstract
- Sequence modeling primitives have been responsible for breakthroughs across domains like natural language processing and genomics. Despite these advances, existing primitives still struggle to model the large class of signal data acquired from physical sensors. This data has three characteristics that make it challenging to model: its resolution affects the training and generalization of models; it is sampled at high rates, resulting in dense data with long-range dependencies; and it is highly diverse, with application areas including healthcare, video processing, and industrial sensing. All of these properties raise the bar for universal approaches to modeling this data. This thesis develops a new set of approaches for modeling signal data using state space models. First, we introduce a sequence model called S4 that serves as a general building block for modeling signal data. Second, we generalize this modeling layer to multidimensional signals like images and videos, yielding the first state-of-the-art signal model on large-scale benchmarks such as ImageNet. Third, we incorporate S4 into a multiscale architecture, which makes it possible to model extremely long sequences of audio, including on a previously unsolved task involving unconditional autoregressive generation of raw audio samples. Finally, we demonstrate the widespread applicability of our approach to a variety of signal data, including a real-world application involving impedance sensor data used in the diagnosis of gastroesophageal reflux disease. Taken together, this new set of approaches provides a universal and versatile set of primitives for modeling diverse, multidimensional signals.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2024; ©2024 |
Publication date | 2024 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Goel, Karan |
---|---|
Degree supervisor | Ré, Christopher |
Thesis advisor | Ré, Christopher |
Thesis advisor | Fatahalian, Kayvon |
Thesis advisor | Liang, Percy |
Degree committee member | Fatahalian, Kayvon |
Degree committee member | Liang, Percy |
Associated with | Stanford University, School of Engineering |
Associated with | Stanford University, Computer Science Department |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Karan Goel. |
---|---|
Note | Submitted to the Computer Science Department. |
Thesis | Thesis Ph.D. Stanford University 2024. |
Location | https://purl.stanford.edu/qb603fk1926 |
Access conditions
- Copyright
- © 2024 by Karan Goel
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).