Geometric aspects of deep learning

Abstract/Contents

Abstract
Machine learning using deep neural networks -- deep learning -- has been extremely successful at learning solutions to a very broad suite of difficult problems across a wide range of domains spanning computer vision, game play, natural language processing and understanding, and even fundamental science. Despite this success, we still do not have a detailed, predictive understanding of how deep neural networks work, and what makes them so effective at learning and generalization. In this thesis we study the loss landscapes of deep neural networks using the lens of high-dimensional geometry. We approach the problem of understanding deep neural networks experimentally, similarly to the methods used in the natural sciences. We first discuss a phenomenological approach to modeling the large scale structure of deep neural network loss landscapes using high-dimensional geometry. Using this model, we then continue to investigate the diversity of functions neural networks learn and how it relates to the underlying geometric structure of the solution manifold. We focus on deep ensembles, robustness, and on approximate Bayesian techniques. Finally, we switch gears and investigate the role of nonlinearity in deep learning. We study deep neural networks within the Neural Tangent Kernel framework and empirically establish the role of nonlinearity for the training dynamics of finite-size networks. Using the concept of the nonlinear advantage, we empirically demonstrate the importance of nonlinearity in the very early phases of training, and its waning role farther into optimization.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021
Issuance monographic
Language English

Creators/Contributors

Author Fort, Stanislav
Degree supervisor Ganguli, Surya, 1977-
Thesis advisor Ganguli, Surya, 1977-
Thesis advisor Shenker, Stephen Hart, 1953-
Thesis advisor Yamins, Daniel
Degree committee member Shenker, Stephen Hart, 1953-
Degree committee member Yamins, Daniel
Associated with Stanford University, Department of Physics

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Stanislav Fort.
Note Submitted to the Department of Physics.
Thesis Ph.D., Stanford University, 2021.
Location https://purl.stanford.edu/jk243mm2141

Access conditions

Copyright
© 2021 by Stanislav Fort
