Neural network models of visual learning

Abstract/Contents

Abstract
Humans show a remarkable ability not only to recognize the complicated visual environment surrounding them but also to learn efficiently from it. The ventral visual stream underlies this critical ability and is currently best modeled, both quantitatively and qualitatively, by deep neural networks. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring far more labels than are accessible to infants during development. In this dissertation, we first propose strong learning algorithms that learn from entirely unlabelled or only partially labelled data, and then show that these algorithms together have largely closed this gap. We find that neural network models trained with deep unsupervised contrastive embedding methods achieve neural prediction accuracy in multiple ventral visual cortical areas that equals or exceeds that of models derived using today's best supervised methods, even when trained solely on real human child developmental data collected from head-mounted cameras, despite these datasets being noisy and limited. The proposed semi-supervised method also leverages small numbers of labelled examples to produce representations whose error patterns are substantially more consistent with human behavior. Furthermore, we propose two learning benchmarks measuring how well unsupervised models predict human visual learning effects at both real-time and lifelong timescales. Testing multiple high-performing unsupervised learning algorithms at both timescales, we show how specific algorithmic design choices help match human learning results. Taken together, these results illustrate one of the first uses of unsupervised learning to provide a quantitative model of a multi-area cortical brain system, and present a strong candidate for a biologically plausible computational theory of primate sensory learning. In addition, we present models of other functions and species, as preliminary steps toward extending these models of visual learning to those domains.
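For readers unfamiliar with the contrastive embedding methods mentioned in the abstract, below is a minimal sketch of one common objective in this family, an InfoNCE-style loss written in PyTorch. This is an illustrative sketch only, not the dissertation's exact objective; the function name and temperature value are assumptions introduced here.

import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.07):
    """InfoNCE-style contrastive loss over a batch of embedding pairs.

    z1, z2: (N, D) embeddings of two augmented views of the same N
    images; row i of z1 and row i of z2 form a positive pair, and
    every other row serves as a negative.
    """
    z1 = F.normalize(z1, dim=1)          # unit-norm embeddings, so the
    z2 = F.normalize(z2, dim=1)          # dot product is cosine similarity
    logits = z1 @ z2.t() / temperature   # (N, N) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    # Cross-entropy pulls each positive pair together and pushes
    # all other (negative) pairs apart in embedding space.
    return F.cross_entropy(logits, targets)

In practice, z1 and z2 would come from an encoder network applied to two random augmentations (e.g., crops and color jitter) of each image in a batch; because the training signal comes entirely from the images themselves, objectives of this kind can learn from entirely unlabelled data.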

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2022
Publication date 2022
Issuance monographic
Language English

Creators/Contributors

Author Zhuang, Chengxu
Degree supervisor Yamins, Daniel
Thesis advisor Yamins, Daniel
Thesis advisor Finn, Chelsea
Thesis advisor Goodman, Noah (Noah D.)
Thesis advisor Grill-Spector, Kalanit
Degree committee member Finn, Chelsea
Degree committee member Goodman, Noah (Noah D.)
Degree committee member Grill-Spector, Kalanit
Associated with Stanford University, Department of Psychology

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Chengxu Zhuang.
Note Submitted to the Department of Psychology.
Thesis Thesis (Ph.D.)--Stanford University, 2022.
Location https://purl.stanford.edu/vh955gk8166

Access conditions

Copyright
© 2022 by Chengxu Zhuang
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
