Efficient reinforcement learning with value function generalization

Abstract/Contents

Abstract
Reinforcement learning (RL) is concerned with how an agent should learn to make decisions over time while interacting with an environment. A growing body of work has produced RL algorithms with sample and computational efficiency guarantees. However, most of this work focuses on "tabula rasa" learning; i.e., algorithms that aim to learn with little or no prior knowledge about the environment. Such algorithms exhibit sample complexities that grow at least linearly in the number of states, and they are of limited practical import since state spaces in most relevant contexts are enormous. There is a need for algorithms that generalize in order to learn how to make effective decisions at states beyond the scope of past experience. This dissertation focuses on the open issue of developing efficient RL algorithms that leverage value function generalization (VFG). It consists of two parts. In the first part, we present sample complexity results for two classes of RL problems: deterministic systems with general forms of VFG, and Markov decision processes (MDPs) with a finite hypothesis class. The results provide upper bounds that are independent of state and action space cardinalities and polynomial in other problem parameters. In the second part, building on insights from our sample complexity analyses, we propose randomized least-squares value iteration (RLSVI), an RL algorithm for MDPs with VFG via linear hypothesis classes. The algorithm is based on a new notion of randomized value function exploration. Through computational studies, we compare the performance of RLSVI against least-squares value iteration (LSVI) with Boltzmann exploration or epsilon-greedy exploration, both of which are widely used in RL with VFG. Results demonstrate that RLSVI is orders of magnitude more efficient.
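
The three exploration schemes named in the abstract can be contrasted with a minimal sketch. This is not the dissertation's RLSVI algorithm: it shows only the action-selection step, and it assumes that per-action value estimates and their standard deviations are already available. The function names, the Gaussian noise model, and the inputs `q_means` and `q_stds` are illustrative assumptions, not taken from the source.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def boltzmann(q_values, temperature, rng):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    q_max = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights, k=1)[0]


def randomized_value_greedy(q_means, q_stds, rng):
    """Draw one random value estimate per action, then act greedily on the draw.

    Assumption: independent Gaussian noise per action stands in for the
    randomized value function; the source does not specify this form.
    """
    sampled = [rng.gauss(m, s) for m, s in zip(q_means, q_stds)]
    return max(range(len(sampled)), key=lambda a: sampled[a])


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.1, 0.9, 0.3]
    print(epsilon_greedy(q, 0.1, rng))
    print(boltzmann(q, 0.5, rng))
    print(randomized_value_greedy(q, [0.5, 0.05, 0.5], rng))
```

In this sketch, epsilon-greedy and Boltzmann selection add randomness to the action choice, whereas the third function adds randomness to the value estimates and then acts greedily on them, so poorly known actions are tried more often than well-known ones.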

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2014
Issuance monographic
Language English

Creators/Contributors

Associated with Wen, Zheng
Associated with Stanford University, Department of Electrical Engineering.
Primary advisor Van Roy, Benjamin
Thesis advisor Van Roy, Benjamin
Thesis advisor Boyd, Stephen P
Thesis advisor Johari, Ramesh, 1976-
Thesis advisor O'Neill, Daniel

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Zheng Wen.
Note Submitted to the Department of Electrical Engineering.
Thesis Ph.D. Stanford University 2014
Location electronic resource

Access conditions

Copyright
© 2014 by Zheng Wen
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
