Learning and control with inaccurate models
Abstract/Contents
- Abstract
- A key challenge in applying model-based reinforcement learning and optimal control methods to complex dynamical systems, such as those arising in many robotics tasks, is the difficulty of obtaining an accurate model of the system. These algorithms perform very well when they are given or can learn an accurate dynamics model, but it is often very challenging to build an accurate model by any means: effects such as hidden or incomplete state and dynamic or unknown system elements can render the modeling task very difficult. This work presents methods for dealing with such situations, by proposing algorithms that can achieve good performance on control tasks even using only *inaccurate* models of the system. In particular, we present three algorithmic contributions in this work that exploit inaccurate system models in different ways: we present an approximate policy gradient method, based on an approximation we call the Signed Derivative, that can perform well provided only that the signs of certain model derivative terms are known; we present a method for using a distribution over possible inaccurate models to identify a linear subspace of control policies that perform well in all models, then learn a member of this subspace on the real system; finally, we propose an algorithm for integrating previously observed trajectories with inaccurate models in a probabilistic manner, achieving better performance than is possible with either element alone. In addition to these algorithmic contributions, a central focus of this thesis is the application of these methods to challenging robotic domains, extending the state of the art. The methods have enabled a quadruped robot to cross a wide variety of challenging terrain, using a combination of slow static walking, dynamic trotting gaits, and dynamic jumping maneuvers.
We also apply these methods to a full-sized autonomous car, where they enable it to execute a "powerslide" into a narrow parking spot, one of the most challenging maneuvers demonstrated on an autonomous car. Both of these domains represent highly challenging robotics tasks where the dynamical system is difficult to model, and our methods demonstrate that we can attain excellent performance on these tasks even without an accurate model of the system.
Description
Type of resource | text |
---|---|
Form | electronic; electronic resource; remote |
Extent | 1 online resource. |
Publication date | 2010 |
Issuance | monographic |
Language | English |
Creators/Contributors
Associated with | Kolter, Jeremy |
---|---|
Associated with | Stanford University, Computer Science Department |
Primary advisor | Ng, Andrew Y, 1976- |
Thesis advisor | Ng, Andrew Y, 1976- |
Thesis advisor | Koller, Daphne |
Thesis advisor | Thrun, Sebastian, 1967- |
Advisor | Koller, Daphne |
Advisor | Thrun, Sebastian, 1967- |
Subjects
Genre | Theses |
---|---|
Bibliographic information
Statement of responsibility | J. Zico Kolter. |
---|---|
Note | Submitted to the Department of Computer Science. |
Thesis | Thesis (Ph.D.)--Stanford University, 2010. |
Location | electronic resource |
Access conditions
- Copyright
- © 2010 by Jeremy Kolter
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).