Optimization and high-dimensional loss landscapes in deep learning
Abstract/Contents
- Abstract
- Despite deep learning's impressive success, many questions remain concerning how training such high-dimensional models behaves in practice and why it reliably produces useful networks. We employ an empirical approach, performing experiments guided by theoretical predictions, to study the following through the lens of the loss landscape. (1) How do loss landscape properties affect the success or failure of weight pruning methods? Recent work on two fronts -- the lottery ticket hypothesis and training restricted to random subspaces -- has demonstrated that deep neural networks can be successfully optimized using far fewer degrees of freedom than the total number of parameters. In particular, lottery tickets, or sparse subnetworks capable of matching the full model's accuracy, can be identified via iterative pruning and retraining of the weights. We first provide a framework for the success of low-dimensional training in terms of the high-dimensional geometry of the loss landscape. We then leverage this framework both to better understand the success of lottery tickets and to predict how aggressively we can prune the weights at each iteration. (2) What are the algorithmic advantages of recurrent connections in neural networks? One of the brain's most striking anatomical features is the ubiquity of lateral and recurrent connections. Yet while the strong computational abilities of feedforward networks have been extensively studied, understanding the role of recurrent computations that might explain their prevalence remains an important open challenge. We demonstrate that recurrent connections are efficient for performing tasks that can be solved via repeated, local propagation of information and propose that they can be combined with feedforward architectures for efficient computation across timescales.
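The iterative pruning-and-retraining procedure the abstract refers to can be illustrated with a minimal sketch. This is a toy lottery-ticket-style loop in NumPy, not the thesis's actual implementation: the function names, the `prune_frac` value, and the `train_fn` callback are all illustrative assumptions. Each round trains under the current sparsity mask, prunes the lowest-magnitude surviving weights, and rewinds the survivors to their initial values.

```python
import numpy as np

def iterative_magnitude_pruning(init_weights, train_fn, prune_frac=0.2, rounds=3):
    """Toy sketch of iterative magnitude pruning with rewinding.

    init_weights: initial weight vector, saved so survivors can be rewound
    train_fn: callable (weights, mask) -> trained weights (assumed interface)
    prune_frac: fraction of surviving weights pruned per round (assumed value)
    """
    mask = np.ones_like(init_weights)
    weights = init_weights.copy()
    for _ in range(rounds):
        trained = train_fn(weights, mask)               # train under current mask
        surviving = np.abs(trained[mask == 1])
        threshold = np.quantile(surviving, prune_frac)  # cutoff for lowest magnitudes
        mask[np.abs(trained) < threshold] = 0           # extend the sparsity mask
        weights = init_weights * mask                   # rewind survivors to init
    return mask, weights
```

With the default settings above, three rounds at 20% pruning per round leave roughly half of the weights, which is the sense in which the procedure probes how aggressively each iteration can prune.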
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | ©2022 |
Publication date | 2022 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Larsen, Brett William |
---|---|
Degree supervisor | Druckmann, Shaul |
Degree supervisor | Ganguli, Surya, 1977- |
Thesis advisor | Druckmann, Shaul |
Thesis advisor | Ganguli, Surya, 1977- |
Thesis advisor | Goldhaber-Gordon, David, 1972- |
Degree committee member | Goldhaber-Gordon, David, 1972- |
Associated with | Stanford University, Department of Physics |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Brett W. Larsen. |
---|---|
Note | Submitted to the Department of Physics. |
Thesis | Thesis (Ph.D.), Stanford University, 2022. |
Location | https://purl.stanford.edu/yj314kt7539 |
Access conditions
- Copyright
- © 2022 by Brett William Larsen
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).