Engineering recurrent neural networks for low-rank and noise-robust computation


Abstract/Contents

Abstract
Making sense of dynamical computation in nonlinear recurrent neural networks is a major goal in neuroscience. The advent of modern machine learning approaches has made it possible, via black-box training methods, to efficiently generate computational models of a network performing a given task; indeed, deep learning has thrived on building large, flexible, and highly non-convex models which can nonetheless be effectively optimized to achieve remarkable out-of-sample generalization performance. However, the resulting trained network models can be so complex that they defy intuitive understanding. What design principles govern how the connectivity and dynamics of recurrent neural networks (RNNs) endow them with their computational capabilities? There remains a large "explainability gap" between the empirical ability of trained recurrent neural networks to capture variance in neural recordings, on one hand, and the theoretical difficulty of writing down constraints on weight space from task-relevant considerations, on the other. This thesis presents new approaches to closing the explainability gap in neural networks, and in RNNs in particular.

First, we present several novel methods for constructing task-performant RNNs directly from a high-level description of the task to be performed. Critically, unlike black-box machine learning methods for training networks, our construction methods rely solely on simple and easily interpreted mathematical operations. In doing so, our approach makes explicit the relationship between network structure and task performance. Harnessing the role of fixed points in recurrent computation, we develop forward-engineering methods that produce exactly solvable nonlinear networks for a variety of context-dependent computations, including those of arbitrary finite state machines.

Second, we examine tools for discovering low-rank structure both in trained recurrent network models and in the learning dynamics of gradient descent in deep networks. We first introduce a novel method for discovering low-rank structure in trained recurrent networks. In many temporal signal processing tasks in biology, including sequence memory, sequence classification, and natural language processing, neural networks operate in a transient regime far from fixed points. We develop a general approach for capturing transient computations in recurrent networks by dramatically reducing the complexity of networks trained to solve transient processing tasks. Our method, called dynamics-reweighted singular value decomposition (DR-SVD), performs a reweighted dimensionality reduction to obtain a much lower-rank connectivity matrix that preserves the dynamics of the original neural network. We then show that the learning dynamics of deep feedforward networks exhibit low-rank tensor structure which is discoverable and interpretable through the lens of tensor decomposition.
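As a point of reference for the low-rank reduction described above, the sketch below shows only the generic, unweighted version of the idea: truncating the singular value decomposition of a recurrent weight matrix and checking that the reduced network approximately reproduces the original dynamics on a toy example. The thesis's DR-SVD additionally reweights the decomposition by the network's own dynamics; that reweighting is not reproduced here, and all names and parameters below are illustrative assumptions rather than the actual procedure.

```python
# Hypothetical illustration (not the thesis's DR-SVD): plain truncated-SVD
# rank reduction of a recurrent weight matrix, followed by a check that the
# reduced network approximately reproduces the original ReLU dynamics.
import numpy as np

rng = np.random.default_rng(0)
N, rank, T = 200, 10, 50

# Toy "trained" recurrent weights: a planted low-rank part plus weak noise,
# rescaled so that the dynamics are contractive.
U = rng.standard_normal((N, rank))
V = rng.standard_normal((N, rank))
W = U @ V.T / N + 0.01 * rng.standard_normal((N, N)) / np.sqrt(N)
W *= 0.9 / np.linalg.norm(W, 2)

# Unweighted truncated SVD: keep only the top singular directions.
u, s, vt = np.linalg.svd(W)
W_low = (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def run(Wmat, x_init, steps):
    """Iterate the discrete-time ReLU dynamics x_{t+1} = relu(W x_t)."""
    x, traj = x_init.copy(), []
    for _ in range(steps):
        traj.append(x)
        x = np.maximum(Wmat @ x, 0.0)
    return np.stack(traj)

x0 = rng.standard_normal(N)
traj_full = run(W, x0, T)
traj_low = run(W_low, x0, T)
rel_err = np.linalg.norm(traj_full - traj_low) / np.linalg.norm(traj_full)
print("rank of reduced connectivity:", np.linalg.matrix_rank(W_low))
print(f"relative trajectory error after truncation: {rel_err:.2e}")
```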
Finally, through a study of a fundamental symmetry present in RNNs with homogeneous activation functions, we derive a novel exploration of weight space that improves the noise robustness of a trained RNN without sacrificing task performance, and indeed without requiring any knowledge of the particular task being performed. Our exploration takes the form of a novel, biologically plausible local learning rule that provably increases the robustness of neural dynamics to noise in nonlinear recurrent neural networks with homogeneous nonlinearities, and that promotes balance between the incoming and outgoing synaptic weights of each neuron in the network. Our rule, which we refer to as synaptic balancing, is consistent with many known aspects of experimentally observed heterosynaptic plasticity, and moreover makes new, experimentally testable predictions relating plasticity at the incoming and outgoing synapses of individual neurons.
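The symmetry underlying this result can be stated concretely for positively homogeneous nonlinearities such as the ReLU: conjugating the recurrent weights by a positive diagonal matrix, W -> L W L^{-1}, while rescaling the readout and initial state accordingly, leaves every network output unchanged but redistributes weight between each neuron's incoming and outgoing synapses. The sketch below illustrates that invariance together with a simple iterative normalization of incoming versus outgoing weight norms; it is an assumption-based illustration, not the thesis's local, online synaptic balancing rule.

```python
# Hypothetical sketch of the rescaling symmetry exploited by synaptic
# balancing: for a ReLU network, W -> L W L^{-1} with a positive diagonal L
# (plus matching rescalings of readout and initial state) leaves the readout
# trace unchanged while re-balancing each neuron's incoming vs. outgoing
# weights. This is an illustration only, not the thesis's plasticity rule.
import numpy as np

rng = np.random.default_rng(1)
N, T = 100, 30

W = rng.standard_normal((N, N)) / np.sqrt(N)
W *= 0.95 / np.linalg.norm(W, 2)     # keep the dynamics contractive
c = rng.standard_normal(N)           # linear readout weights
x0 = np.abs(rng.standard_normal(N))  # initial state

def readout(Wmat, cvec, x_init, steps):
    """Readout trace y_t = c . x_t of the dynamics x_{t+1} = relu(W x_t)."""
    x, ys = x_init.copy(), []
    for _ in range(steps):
        ys.append(cvec @ x)
        x = np.maximum(Wmat @ x, 0.0)
    return np.array(ys)

# Choose per-neuron gains so that each neuron's incoming (row) and outgoing
# (column) weight norms match; the damped multiplicative update is iterated
# because the gains of different neurons interact.
lam = np.ones(N)
for _ in range(200):
    Wb = (lam[:, None] * W) / lam[None, :]
    incoming = np.linalg.norm(Wb, axis=1)
    outgoing = np.linalg.norm(Wb, axis=0)
    lam *= (outgoing / incoming) ** 0.25

W_bal = (lam[:, None] * W) / lam[None, :]
c_bal, x0_bal = c / lam, lam * x0

y_orig = readout(W, c, x0, T)
y_bal = readout(W_bal, c_bal, x0_bal, T)
print("max readout difference (should be near machine precision):",
      np.max(np.abs(y_orig - y_bal)))
print("max incoming-vs-outgoing norm gap after balancing:",
      np.max(np.abs(np.linalg.norm(W_bal, axis=1) - np.linalg.norm(W_bal, axis=0))))
```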

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021
Issuance monographic
Language English

Creators/Contributors

Author Stock, Christopher Hopkins
Degree supervisor Ganguli, Surya, 1977-
Thesis advisor Ganguli, Surya, 1977-
Thesis advisor Baccus, Stephen A
Thesis advisor Druckmann, Shaul
Thesis advisor Newsome, William T
Thesis advisor Sussillo, David
Degree committee member Baccus, Stephen A
Degree committee member Druckmann, Shaul
Degree committee member Newsome, William T
Degree committee member Sussillo, David
Associated with Stanford University, Neurosciences Program

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Christopher Hopkins Stock.
Note Submitted to the Neurosciences Program.
Thesis Thesis (Ph.D.)--Stanford University, 2021.
Location https://purl.stanford.edu/zt425jt1140

Access conditions

Copyright
© 2021 by Christopher Hopkins Stock
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
