Geometric aspects of deep learning
Abstract/Contents
- Abstract
- Machine learning with deep neural networks (deep learning) has been extremely successful at learning solutions to difficult problems across a wide range of domains, including computer vision, game playing, natural language processing and understanding, and even fundamental science. Despite this success, we still lack a detailed, predictive understanding of how deep neural networks work and what makes them so effective at learning and generalization. In this thesis we study the loss landscapes of deep neural networks through the lens of high-dimensional geometry. We approach the problem experimentally, in the manner of the natural sciences. We first discuss a phenomenological approach to modeling the large-scale structure of deep neural network loss landscapes using high-dimensional geometry. Using this model, we then investigate the diversity of functions that neural networks learn and how it relates to the underlying geometric structure of the solution manifold, focusing on deep ensembles, robustness, and approximate Bayesian techniques. Finally, we switch gears and investigate the role of nonlinearity in deep learning. We study deep neural networks within the Neural Tangent Kernel framework and empirically establish the role of nonlinearity in the training dynamics of finite-size networks. Using the concept of the nonlinear advantage, we empirically demonstrate the importance of nonlinearity in the very early phases of training and its waning role further into optimization.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | ©2021 |
Publication date | 2021 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Fort, Stanislav |
---|---|
Degree supervisor | Ganguli, Surya, 1977- |
Thesis advisor | Ganguli, Surya, 1977- |
Thesis advisor | Shenker, Stephen Hart, 1953- |
Thesis advisor | Yamins, Daniel |
Degree committee member | Shenker, Stephen Hart, 1953- |
Degree committee member | Yamins, Daniel |
Associated with | Stanford University, Department of Physics |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Stanislav Fort. |
---|---|
Note | Submitted to the Department of Physics. |
Thesis | Thesis Ph.D. Stanford University 2021. |
Location | https://purl.stanford.edu/jk243mm2141 |
Access conditions
- Copyright
- © 2021 by Stanislav Fort