Inference and learning in nonlinear latent variable models
- A core goal of modeling is to help us understand the world around us, but often the phenomena we wish to model are only observed indirectly. For example, we often detect black holes via the gravitational effects they have on surrounding objects. Unobserved phenomena are commonly modeled using latent variables that are statistically related to observed variables, but never directly observed themselves. These latent variable models are a powerful formalism that can enable parsimonious and interpretable representations of data, but are difficult to use, especially when the relationships between variables are complex. This thesis develops techniques for fitting latent variable models where the dependencies between variables are parameterized by nonlinear functions such as deep neural networks or nonlinear differential equations. The nonlinear dependencies make analytic methods intractable, and the main thrust of this thesis is in extending sampling algorithms from the Monte Carlo literature to work with deep generative models. In particular, this thesis focuses on modeling sequential data such as neural voltage traces or speech audio. First, I introduce FIVO, a method for fitting nonlinear sequential latent variable models using filtering sequential Monte Carlo, and use it to improve models of speech audio and piano sheet music. Then, I develop a smoothing-based extension of FIVO called SIXO that successfully fits biophysical models of neural membrane potential. Next, I introduce NAS-X, an extension of SIXO that works with discrete latent variables. Finally, I develop methods for fitting models with embedded sampling algorithms and draw connections to energy-based modeling. These methods establish new standards for inference and learning in nonlinear latent variable models. For example, in the Hodgkin-Huxley model of neural membrane potential, NAS-X and SIXO achieve a 32-fold improvement in inference log-likelihood over previous methods. Improved inference performance results in downstream gains in parameter learning, and enables fitting latent variable models based on nonlinear differential equations with hundreds of parameters. Overall, this thesis extends Monte Carlo algorithms to bring powerful models to bear on tough problems in sequence modeling.
|Type of resource
|electronic resource; remote; computer; online resource
|1 online resource.
|Lawson, John Dieterich
|Degree committee member
|Degree committee member
|Stanford University, School of Engineering
|Stanford University, Computer Science Department
|Statement of responsibility
|Submitted to the Computer Science Department.
|Thesis Ph.D. Stanford University 2023.
- © 2023 by John Dieterich Lawson
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).