Modeling sequences with structured state spaces

Abstract
Substantial recent progress in machine learning has been driven by advances in sequence models, which form the backbone of deep learning models that have achieved widespread success across scientific applications. However, existing methods require extensive specialization to different tasks, modalities, and capabilities; suffer from computational efficiency bottlenecks; and have difficulty modeling more complex sequential data, such as when long dependencies are involved. As such, it remains of fundamental importance to continue to develop principled and practical methods for modeling general sequences. This thesis develops a new approach to deep sequence modeling using state space models, a flexible method that is theoretically grounded, computationally efficient, and achieves strong results across a wide variety of data modalities and applications. First, we introduce a class of models with numerous representations and properties that generalize the strengths of standard deep sequence models such as recurrent neural networks and convolutional neural networks. However, we show that computing these models can be challenging, and develop new classes of structured state spaces that are very fast on modern hardware, both when scaling to long sequences and in other settings such as autoregressive inference. Finally, we present a novel mathematical framework for incrementally modeling continuous signals, which can be combined with state space models to endow them with principled state representations and improve their ability to model long-range dependencies. Together, this new class of methods provides effective and versatile building blocks for machine learning models, especially towards addressing general sequential data at scale.
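The abstract notes that state space models admit multiple representations generalizing recurrent and convolutional networks. As a minimal illustrative sketch (with arbitrary matrices, not a trained model from the thesis), a discrete linear SSM x_k = A x_{k-1} + B u_k, y_k = C x_k can be computed either recurrently, like an RNN, or as a convolution with the kernel K = (CB, CAB, CA²B, ...):

```python
import numpy as np

# Minimal sketch of a discrete linear state space model (SSM):
#   x_k = A x_{k-1} + B u_k,   y_k = C x_k
# The same model is computed two ways: stepping the recurrence (like an
# RNN) and convolving with the unrolled kernel K_i = C A^i B. The
# matrices below are arbitrary illustrative values.

rng = np.random.default_rng(0)
N, L = 4, 16                       # state size, sequence length
A = 0.9 * np.eye(N) + 0.05 * rng.standard_normal((N, N))
B = rng.standard_normal((N, 1))
C = rng.standard_normal((1, N))
u = rng.standard_normal(L)         # input sequence

# Recurrent view: step through time, carrying the hidden state.
x = np.zeros((N, 1))
y_rec = np.empty(L)
for k in range(L):
    x = A @ x + B * u[k]
    y_rec[k] = (C @ x).item()

# Convolutional view: unroll the recurrence into a kernel, then convolve.
K = np.array([(C @ np.linalg.matrix_power(A, i) @ B).item() for i in range(L)])
y_conv = np.convolve(u, K)[:L]

assert np.allclose(y_rec, y_conv)  # both views give the same output
```

The recurrent view costs O(1) memory per step and suits autoregressive inference, while the convolutional view can be parallelized over the whole sequence during training; exploiting both is one of the trade-offs the thesis develops.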

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2023
Publication date 2023
Issuance monographic
Language English

Creators/Contributors

Author Gu, Albert
Degree supervisor Ré, Christopher
Thesis advisor Ré, Christopher
Thesis advisor Liang, Percy
Thesis advisor Linderman, Scott
Degree committee member Liang, Percy
Degree committee member Linderman, Scott
Associated with Stanford University, School of Engineering
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Albert Gu.
Note Submitted to the Computer Science Department.
Thesis Ph.D., Stanford University, 2023.
Location https://purl.stanford.edu/mb976vf9362

Access conditions

Copyright
© 2023 by Albert Gu
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
