Inference and learning in nonlinear latent variable models

Abstract/Contents

Abstract
A core goal of modeling is to help us understand the world around us, but often the phenomena we wish to model are only observed indirectly. For example, we often detect black holes via the gravitational effects they have on surrounding objects. Unobserved phenomena are commonly modeled using latent variables that are statistically related to observed variables, but never directly observed themselves. These latent variable models are a powerful formalism that can enable parsimonious and interpretable representations of data, but they are difficult to use, especially when the relationships between variables are complex. This thesis develops techniques for fitting latent variable models where the dependencies between variables are parameterized by nonlinear functions such as deep neural networks or nonlinear differential equations. The nonlinear dependencies make analytic methods intractable, and the main thrust of this thesis lies in extending sampling algorithms from the Monte Carlo literature to work with deep generative models. In particular, this thesis focuses on modeling sequential data such as neural voltage traces or speech audio. First, I introduce FIVO, a method for fitting nonlinear sequential latent variable models using filtering sequential Monte Carlo, and use it to improve models of speech audio and piano sheet music. Then, I develop a smoothing-based extension of FIVO called SIXO that successfully fits biophysical models of neural membrane potential. Next, I introduce NAS-X, an extension of SIXO that works with discrete latent variables. Finally, I develop methods for fitting models with embedded sampling algorithms and draw connections to energy-based modeling. These methods establish new standards for inference and learning in nonlinear latent variable models. For example, in the Hodgkin-Huxley model of neural membrane potential, NAS-X and SIXO achieve a 32-fold improvement in inference log-likelihood over previous methods. Improved inference performance yields downstream gains in parameter learning and enables fitting latent variable models based on nonlinear differential equations with hundreds of parameters. Overall, this thesis extends Monte Carlo algorithms to bring powerful models to bear on tough problems in sequence modeling.
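To make the abstract's central idea concrete, the sketch below illustrates a FIVO-style objective: run a particle filter over a sequence and treat the log of its marginal-likelihood estimate as a stochastic lower bound on log p(x_{1:T}) to be maximized over model parameters. This is a minimal illustration assuming a bootstrap proposal and NumPy; the function names, signatures, and toy model are assumptions for exposition, not the thesis's implementation.

```python
# Minimal sketch of a FIVO-style bound: the log of a bootstrap particle
# filter's marginal-likelihood estimate. Illustrative only; names and
# signatures are assumptions, not the thesis's code.
import numpy as np

def fivo_bound(xs, transition_sample, emission_logpdf, init_sample,
               n_particles=32, rng=None):
    """Estimate the bound for one sequence xs of length T.

    transition_sample(z, rng): samples z_t ~ p(z_t | z_{t-1}) for all particles
    emission_logpdf(x, z):     evaluates log p(x_t | z_t) for one particle
    init_sample(n, rng):       samples n initial latents z_1 ~ p(z_1)
    """
    rng = rng or np.random.default_rng(0)
    z = init_sample(n_particles, rng)
    bound = 0.0
    for t in range(len(xs)):
        if t > 0:
            z = transition_sample(z, rng)  # propagate particles forward
        logw = np.array([emission_logpdf(xs[t], zi) for zi in z])
        # log-mean-exp of incremental weights adds to the running bound
        m = logw.max()
        bound += m + np.log(np.mean(np.exp(logw - m)))
        # multinomial resampling: the "filtering" in filtering SMC
        probs = np.exp(logw - m)
        probs /= probs.sum()
        z = z[rng.choice(n_particles, size=n_particles, p=probs)]
    return bound  # stochastic lower bound on log p(x_{1:T})

# Illustrative usage on a toy linear-Gaussian state-space model:
rng = np.random.default_rng(1)
T, sigma = 50, 0.5
true_z = np.cumsum(rng.normal(size=T))
xs = true_z + sigma * rng.normal(size=T)
bound = fivo_bound(
    xs,
    transition_sample=lambda z, r: z + r.normal(size=z.shape),
    emission_logpdf=lambda x, z: (-0.5 * ((x - z) / sigma) ** 2
                                  - np.log(sigma * np.sqrt(2 * np.pi))),
    init_sample=lambda n, r: r.normal(size=n),
)
print(f"FIVO-style bound on log-likelihood: {bound:.2f}")
```

Resampling after every step is what makes this a filtering bound; roughly speaking, the smoothing-based extensions the abstract describes (SIXO, NAS-X) modify the resampling weights to account for future observations as well.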

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2023
Publication date 2023
Issuance monographic
Language English

Creators/Contributors

Author Lawson, John Dieterich
Degree supervisor Liang, Percy
Degree supervisor Linderman, Scott
Thesis advisor Liang, Percy
Thesis advisor Linderman, Scott
Thesis advisor Ermon, Stefano
Thesis advisor Fox, Emily
Degree committee member Ermon, Stefano
Degree committee member Fox, Emily
Associated with Stanford University, School of Engineering
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Dieterich Lawson.
Note Submitted to the Computer Science Department.
Thesis Ph.D., Stanford University, 2023.
Location https://purl.stanford.edu/pk530cy7546

Access conditions

Copyright
© 2023 by John Dieterich Lawson
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
