Deep episodic value iteration : a theory of sample efficient learning for machine learning and cognitive (neuro) science

Abstract/Contents

Abstract
This thesis aims to show that, with some extensions, deep learning is computationally capable of rapid adaptation, can account for behavioral phenomena surrounding this ability, and is consistent with existing theories regarding the functional role of the relevant neuroanatomy. The main contribution of this thesis is an algorithm that can account for planning from limited data, which we call Deep Episodic Value Iteration (DEVI). DEVI uses a deep neural network to learn a similarity metric, which is applied recursively for planning over episodic memories. By training this network end-to-end based on planning performance, we posit that DEVI should be capable of meta-reinforcement learning, even in high-dimensional state spaces. We evaluate DEVI in comparison with traditional deep learning techniques as well as other approaches to meta-learning. We close by showing how DEVI can be seen as a computational account of schema formation and schema-consistent learning.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2018
Issuance monographic
Language English

Creators/Contributors

Associated with Hansen, Steven Stenberg
Associated with Stanford University, Department of Psychology.
Primary advisor McClelland, James L
Thesis advisor McClelland, James L
Thesis advisor Goodman, Noah
Thesis advisor Poldrack, Russell A

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Steven Stenberg Hansen.
Note Submitted to the Department of Psychology.
Thesis Thesis (Ph.D.)--Stanford University, 2018.
Location electronic resource

Access conditions

Copyright
© 2018 by Steven Stenberg Hansen
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
