Deep linear neural networks : a theory of learning in the brain and mind

Abstract/Contents

Abstract
Humans and other organisms show an incredibly sophisticated ability to learn about their environments during their lifetimes. This learning is thought to alter the strength of connections between neurons in the brain, but we still do not understand the principles linking synaptic changes at the neural level to behavioral changes at the psychological level. Part of the difficulty stems from depth: the brain has a deep, many-layered structure that substantially complicates the learning process. To understand the specific impact of depth, I develop the theory of gradient descent learning in deep linear neural networks. Despite their linearity, the learning problem in these networks remains nonconvex and exhibits rich nonlinear learning dynamics. I give new exact solutions to the dynamics that quantitatively answer fundamental theoretical questions such as how learning speed scales with depth. These solutions revise the basic conceptual picture underlying deep learning systems--both engineered and biological--with ramifications for a variety of phenomena. I highlight three consequences at different levels of detail. First, the theory shows that layerwise unsupervised learning is a domain-general strategy for speeding up subsequent learning, which I link to critical period plasticity in sensory cortices. Second, the theory suggests that depth influences the size and timing of receptive field changes in visual perceptual learning. And third, by considering data drawn from structured probabilistic graphical models, the theory reveals that only deep (and not shallow) networks undergo quasi-stage-like transitions during learning reminiscent of those found in infant semantic development. These applications span levels of analysis from single neurons to cognitive psychology, demonstrating the potential of deep linear networks to connect detailed changes in neuronal networks to changes in high-level behavior and cognition.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2015
Issuance monographic
Language English

Creators/Contributors

Associated with Saxe, Andrew Michael
Associated with Stanford University, Department of Electrical Engineering.
Primary advisor McClelland, James L
Thesis advisor McClelland, James L
Thesis advisor Ng, Andrew Y, 1976-
Thesis advisor Shenoy, Krishna V. (Krishna Vaughn)

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Andrew Michael Saxe.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2015.
Location electronic resource

Access conditions

Copyright
© 2015 by Andrew Michael Saxe
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
