Engineering recurrent neural networks for low-rank and noise-robust computation


Abstract/Contents

Abstract
Making sense of dynamical computation in nonlinear recurrent neural networks is a major goal in neuroscience. The advent of modern machine learning approaches has made it possible, via black-box training methods, to efficiently generate computational models of a network performing a given task; indeed, deep learning has thrived on building large, flexible, and highly non-convex models which can nonetheless be effectively optimized to achieve remarkable out-of-sample generalization performance. However, the resulting trained network models can be so complex that they defy intuitive understanding. What design principles govern how the connectivity and dynamics of recurrent neural networks (RNNs) endow them with their computational capabilities? There remains a large "explainability gap" between the empirical ability of trained recurrent neural networks to capture variance in neural recordings, on one hand, and the theoretical difficulty of writing down constraints on weight space from task-relevant considerations, on the other. This thesis presents new approaches to closing the explainability gap in neural networks, and in RNNs in particular.

First, we present several novel methods for constructing task-performant RNNs directly from a high-level description of the task to be performed. Critically, unlike black-box machine learning methods for training networks, our construction methods rely solely on simple and easily interpreted mathematical operations. In doing so, our approach makes explicit the relationship between network structure and task performance. Harnessing the role of fixed points in recurrent computation, we develop forward-engineering methods that produce exactly solvable nonlinear networks for a variety of context-dependent computations, including those of arbitrary finite state machines.

Second, we examine tools for discovering low-rank structure both in trained recurrent network models and in the learning dynamics of gradient descent in deep networks. We first introduce a novel method for discovering low-rank structure in trained recurrent networks. In many temporal signal processing tasks in biology, including sequence memory, sequence classification, and natural language processing, neural networks operate in a transient regime far from fixed points. We develop a general approach for capturing transient computations in recurrent networks by dramatically reducing the complexity of networks trained to solve transient processing tasks. Our method, called dynamics-reweighted singular value decomposition (DR-SVD), performs a reweighted dimensionality reduction to obtain a much lower-rank connectivity matrix that preserves the dynamics of the original neural network. We then show that the learning dynamics of deep feedforward networks exhibit low-rank tensor structure which is discoverable and interpretable through the lens of tensor decomposition.
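As a point of reference for the low-rank reduction described above, the sketch below shows only the generic, unweighted version of the idea: truncating the singular value decomposition of a recurrent weight matrix and checking that the reduced network approximately reproduces the original dynamics on a toy example. The thesis's DR-SVD additionally reweights the decomposition by the network's own dynamics; that reweighting is not reproduced here, and all names and parameters below are illustrative assumptions rather than the actual procedure.

```python
# Hypothetical illustration (not the thesis's DR-SVD): plain truncated-SVD
# rank reduction of a recurrent weight matrix, followed by a check that the
# reduced network approximately reproduces the original ReLU dynamics.
import numpy as np

rng = np.random.default_rng(0)
N, rank, T = 200, 10, 50

# Toy "trained" recurrent weights: a planted low-rank part plus weak noise,
# rescaled so that the dynamics are contractive.
U = rng.standard_normal((N, rank))
V = rng.standard_normal((N, rank))
W = U @ V.T / N + 0.01 * rng.standard_normal((N, N)) / np.sqrt(N)
W *= 0.9 / np.linalg.norm(W, 2)

# Unweighted truncated SVD: keep only the top singular directions.
u, s, vt = np.linalg.svd(W)
W_low = (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def run(Wmat, x_init, steps):
    """Iterate the discrete-time ReLU dynamics x_{t+1} = relu(W x_t)."""
    x, traj = x_init.copy(), []
    for _ in range(steps):
        traj.append(x)
        x = np.maximum(Wmat @ x, 0.0)
    return np.stack(traj)

x0 = rng.standard_normal(N)
traj_full = run(W, x0, T)
traj_low = run(W_low, x0, T)
rel_err = np.linalg.norm(traj_full - traj_low) / np.linalg.norm(traj_full)
print("rank of reduced connectivity:", np.linalg.matrix_rank(W_low))
print(f"relative trajectory error after truncation: {rel_err:.2e}")
```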
Finally, through a study of a fundamental symmetry present in RNNs with homogeneous activation functions, we derive a novel exploration of weight space that improves the noise robustness of a trained RNN without sacrificing task performance, and indeed without requiring any knowledge of the particular task being performed. Our exploration takes the form of a novel, biologically plausible local learning rule that provably increases the robustness of neural dynamics to noise in nonlinear recurrent neural networks with homogeneous nonlinearities, and that promotes balance between the incoming and outgoing synaptic weights of each neuron in the network. Our rule, which we refer to as synaptic balancing, is consistent with many known aspects of experimentally observed heterosynaptic plasticity, and moreover makes new, experimentally testable predictions relating plasticity at the incoming and outgoing synapses of individual neurons.
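The symmetry underlying this result can be stated concretely for positively homogeneous nonlinearities such as the ReLU: conjugating the recurrent weights by a positive diagonal matrix, W -> L W L^{-1}, while rescaling the readout and initial state accordingly, leaves every network output unchanged but redistributes weight between each neuron's incoming and outgoing synapses. The sketch below illustrates that invariance together with a simple iterative normalization of incoming versus outgoing weight norms; it is an assumption-based illustration, not the thesis's local, online synaptic balancing rule.

```python
# Hypothetical sketch of the rescaling symmetry exploited by synaptic
# balancing: for a ReLU network, W -> L W L^{-1} with a positive diagonal L
# (plus matching rescalings of readout and initial state) leaves the readout
# trace unchanged while re-balancing each neuron's incoming vs. outgoing
# weights. This is an illustration only, not the thesis's plasticity rule.
import numpy as np

rng = np.random.default_rng(1)
N, T = 100, 30

W = rng.standard_normal((N, N)) / np.sqrt(N)
W *= 0.95 / np.linalg.norm(W, 2)     # keep the dynamics contractive
c = rng.standard_normal(N)           # linear readout weights
x0 = np.abs(rng.standard_normal(N))  # initial state

def readout(Wmat, cvec, x_init, steps):
    """Readout trace y_t = c . x_t of the dynamics x_{t+1} = relu(W x_t)."""
    x, ys = x_init.copy(), []
    for _ in range(steps):
        ys.append(cvec @ x)
        x = np.maximum(Wmat @ x, 0.0)
    return np.array(ys)

# Choose per-neuron gains so that each neuron's incoming (row) and outgoing
# (column) weight norms match; the damped multiplicative update is iterated
# because the gains of different neurons interact.
lam = np.ones(N)
for _ in range(200):
    Wb = (lam[:, None] * W) / lam[None, :]
    incoming = np.linalg.norm(Wb, axis=1)
    outgoing = np.linalg.norm(Wb, axis=0)
    lam *= (outgoing / incoming) ** 0.25

W_bal = (lam[:, None] * W) / lam[None, :]
c_bal, x0_bal = c / lam, lam * x0

y_orig = readout(W, c, x0, T)
y_bal = readout(W_bal, c_bal, x0_bal, T)
print("max readout difference (should be near machine precision):",
      np.max(np.abs(y_orig - y_bal)))
print("max incoming-vs-outgoing norm gap after balancing:",
      np.max(np.abs(np.linalg.norm(W_bal, axis=1) - np.linalg.norm(W_bal, axis=0))))
```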

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021
Issuance monographic
Language English

Creators/Contributors

Author Stock, Christopher Hopkins
Degree supervisor Ganguli, Surya, 1977-
Thesis advisor Ganguli, Surya, 1977-
Thesis advisor Baccus, Stephen A
Thesis advisor Druckmann, Shaul
Thesis advisor Newsome, William T
Thesis advisor Sussillo, David
Degree committee member Baccus, Stephen A
Degree committee member Druckmann, Shaul
Degree committee member Newsome, William T
Degree committee member Sussillo, David
Associated with Stanford University, Neurosciences Program

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Christopher Hopkins Stock.
Note Submitted to the Neurosciences Program.
Thesis Thesis (Ph.D.)--Stanford University, 2021.
Location https://purl.stanford.edu/zt425jt1140

Access conditions

Copyright
© 2021 by Christopher Hopkins Stock
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
