Learning and control with inaccurate models


Abstract/Contents

Abstract
A key challenge in applying model-based reinforcement learning and optimal control methods to complex dynamical systems, such as those arising in many robotics tasks, is the difficulty of obtaining an accurate model of the system. These algorithms perform very well when they are given, or can learn, an accurate dynamics model, but building such a model is often very difficult: hidden or incomplete state, dynamic or unknown system elements, and other effects can all complicate the modeling task. This work presents methods for dealing with such situations by proposing algorithms that achieve good performance on control tasks even when using only inaccurate models of the system. In particular, we present three algorithmic contributions that exploit inaccurate system models in different ways. First, we present an approximate policy gradient method, based on an approximation we call the Signed Derivative, that can perform well provided only that the signs of certain model derivative terms are known. Second, we present a method that uses a distribution over possible inaccurate models to identify a linear subspace of control policies that perform well in all models, and then learns a member of this subspace on the real system. Finally, we propose an algorithm for integrating previously observed trajectories with inaccurate models in a probabilistic manner, achieving better performance than is possible with either element alone. In addition to these algorithmic contributions, a central focus of this thesis is the application of these methods to challenging robotic domains, extending the state of the art. The methods have enabled a quadruped robot to cross a wide variety of challenging terrain, using a combination of slow static walking, dynamic trotting gaits, and dynamic jumping maneuvers.
We also apply these methods to a full-sized autonomous car, where they enable it to execute a "powerslide" into a narrow parking spot, one of the most challenging maneuvers demonstrated on an autonomous car. Both of these domains are highly challenging robotics tasks in which the dynamical system is difficult to model, and our methods demonstrate that excellent performance can be attained on them even without an accurate model of the system.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2010
Issuance monographic
Language English

Creators/Contributors

Associated with Kolter, Jeremy
Associated with Stanford University, Computer Science Department
Primary advisor Ng, Andrew Y, 1976-
Thesis advisor Ng, Andrew Y, 1976-
Thesis advisor Koller, Daphne
Thesis advisor Thrun, Sebastian, 1967-

Subjects

Genre Theses

Bibliographic information

Statement of responsibility J. Zico Kolter.
Note Submitted to the Department of Computer Science.
Thesis Thesis (Ph.D.)--Stanford University, 2010.
Location electronic resource

Access conditions

Copyright
© 2010 by Jeremy Kolter
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
