Deep learning on a diet: an error landscape perspective on parameter and data efficiency in deep learning

Abstract/Contents

Abstract
Many of the recent remarkable breakthroughs in artificial intelligence have come from scaling up both the number of parameters in artificial neural networks (ANNs) and the size of the datasets used to train them. This scaling, however, is unsustainable, and it is important to develop methods that achieve similar results under resource constraints. But when and how can we decrease the number of parameters and examples while still training ANNs to the same performance? We investigate methods for pruning both network parameters and the dataset from the perspective of the optimization error surface for ANNs. Through an empirically driven investigation, we show how geometric properties of the error landscape, such as curvature and error basins, determine how many parameters can be pruned and which examples are important for generalization. Overall, we take a step towards understanding the scientific principles that underlie data and parameter efficiency in ANNs.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2023; ©2023
Publication date 2023
Issuance monographic
Language English

Creators/Contributors

Author Paul, Mansheej
Degree supervisor Ganguli, Surya
Thesis advisor Ganguli, Surya
Thesis advisor Druckmann, Shaul
Thesis advisor Yamins, Dan
Degree committee member Druckmann, Shaul
Degree committee member Yamins, Dan
Associated with Stanford University, School of Humanities and Sciences
Associated with Stanford University, Department of Applied Physics

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Mansheej Paul.
Note Submitted to the Department of Applied Physics.
Thesis Thesis Ph.D. Stanford University 2023.
Location https://purl.stanford.edu/wh462kf8223

Access conditions

Copyright
© 2023 by Mansheej Paul
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).