Optimization and high-dimensional loss landscapes in deep learning

Abstract/Contents

Abstract
Despite deep learning's impressive success, many questions remain concerning how training such high-dimensional models behaves in practice and why it reliably produces useful networks. We employ an empirical approach, performing experiments guided by theoretical predictions, to study the following questions through the lens of the loss landscape.

(1) How do loss landscape properties affect the success or failure of weight-pruning methods? Recent work on two fronts -- the lottery ticket hypothesis and training restricted to random subspaces -- has demonstrated that deep neural networks can be successfully optimized using far fewer degrees of freedom than the total number of parameters. In particular, lottery tickets, or sparse subnetworks capable of matching the full model's accuracy, can be identified via iterative pruning and retraining of the weights. We first provide a framework for the success of low-dimensional training in terms of the high-dimensional geometry of the loss landscape. We then leverage this framework both to better understand the success of lottery tickets and to predict how aggressively the weights can be pruned at each iteration.

(2) What are the algorithmic advantages of recurrent connections in neural networks? One of the brain's most striking anatomical features is the ubiquity of lateral and recurrent connections. Yet while the strong computational abilities of feedforward networks have been extensively studied, understanding the role of recurrent computations that might explain their prevalence remains an important open challenge. We demonstrate that recurrent connections are efficient for performing tasks that can be solved via repeated, local propagation of information, and we propose that they can be combined with feedforward architectures for efficient computation across timescales.
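The abstract names two low-dimensional training schemes studied in the thesis: identifying lottery tickets by iterative magnitude pruning with retraining, and optimizing within a fixed random subspace of the weights. As a rough illustration only -- the thesis works with deep networks, while the toy linear-regression task, dimensions, and schedules below are invented for this sketch -- the following NumPy code shows the core loop of each idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task with a sparse ground truth, standing in for a network.
n, d = 200, 50
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d) * (rng.random(d) < 0.2)
y = X @ w_true + 0.01 * rng.normal(size=n)

def train(w, mask, steps=500, lr=0.05):
    """Gradient descent on squared loss; pruned coordinates are held at zero."""
    for _ in range(steps):
        grad = X.T @ (X @ (w * mask) - y) / n
        w = w - lr * grad * mask
    return w * mask

w_init = 0.1 * rng.normal(size=d)  # the initialization each round rewinds to
mask = np.ones(d)
prune_frac = 0.2                   # fraction of surviving weights pruned per round

# Iterative magnitude pruning with rewinding: train, drop the smallest-magnitude
# surviving weights, rewind the rest to w_init, and retrain.
for rnd in range(5):
    w = train(w_init.copy(), mask)
    loss = np.mean((X @ w - y) ** 2)
    print(f"round {rnd}: {int(mask.sum()):2d}/{d} weights, loss {loss:.4f}")
    survivors = np.flatnonzero(mask)
    k = int(prune_frac * len(survivors))
    drop = survivors[np.argsort(np.abs(w[survivors]))[:k]]
    mask[drop] = 0.0
```

Random-subspace training instead fixes a random projection and optimizes only the low-dimensional coordinates, so the loss landscape is probed along a random d_sub-dimensional slice through the initialization:

```python
# Optimize z in R^{d_sub}; the full weights are w_init + P @ z for a fixed random P.
d_sub = 10
P = rng.normal(size=(d, d_sub)) / np.sqrt(d)
z = np.zeros(d_sub)
for _ in range(500):
    w = w_init + P @ z
    grad_w = X.T @ (X @ w - y) / n
    z -= 0.05 * P.T @ grad_w          # chain rule: dL/dz = P^T dL/dw
print("subspace loss:", np.mean((X @ (w_init + P @ z) - y) ** 2))
```

Whether such a random slice hits a low-loss region, and how sparse the pruning mask can get before accuracy collapses, are the questions the thesis frames in terms of high-dimensional loss landscape geometry.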

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2022
Publication date 2022
Issuance monographic
Language English

Creators/Contributors

Author Larsen, Brett William
Thesis advisor Druckmann, Shaul
Thesis advisor Ganguli, Surya, 1977-
Degree committee member Goldhaber-Gordon, David, 1972-
Associated with Stanford University, Department of Physics

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Brett W. Larsen.
Note Submitted to the Department of Physics.
Thesis Thesis (Ph.D.)--Stanford University, 2022.
Location https://purl.stanford.edu/yj314kt7539

Access conditions

Copyright
© 2022 by Brett William Larsen
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
