Learning and control with inaccurate models

Kolter, Jeremy; Stanford University, Computer Science Department

Abstract/Contents

Abstract: A key challenge in applying model-based reinforcement learning and optimal control methods to complex dynamical systems, such as those arising in many robotics tasks, is the difficulty of obtaining an accurate model of the system. These algorithms perform very well when they are given, or can learn, an accurate dynamics model, but it is often very challenging to build an accurate model by any means: effects such as hidden or incomplete state and dynamic or unknown system elements can render the modeling task very difficult. This work presents methods for dealing with such situations by proposing algorithms that can achieve good performance on control tasks even using only inaccurate models of the system. In particular, we present three algorithmic contributions that exploit inaccurate system models in different ways. First, we present an approximate policy gradient method, based on an approximation we call the Signed Derivative, that can perform well provided only that the signs of certain model derivative terms are known. Second, we present a method for using a distribution over possible inaccurate models to identify a linear subspace of control policies that perform well in all models, and then learning a member of this subspace on the real system. Finally, we propose an algorithm for integrating previously observed trajectories with inaccurate models in a probabilistic manner, achieving better performance than is possible with either element alone. In addition to these algorithmic contributions, a central focus of this thesis is the application of these methods to challenging robotic domains, extending the state of the art. The methods have enabled a quadruped robot to cross a wide variety of challenging terrain, using a combination of slow static walking, dynamic trotting gaits, and dynamic jumping maneuvers. We also apply these methods to a full-sized autonomous car, where they enable it to execute a "powerslide" into a narrow parking spot, one of the most challenging maneuvers demonstrated on an autonomous car. Both of these domains represent highly challenging robotics tasks where the dynamical system is difficult to model, and our methods demonstrate that we can attain excellent performance on these tasks even without an accurate model of the system.

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2010
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Kolter, Jeremy
Associated with	Stanford University, Computer Science Department
Primary advisor	Ng, Andrew Y, 1976-
Thesis advisor	Ng, Andrew Y, 1976-
Thesis advisor	Koller, Daphne
Thesis advisor	Thrun, Sebastian, 1967-

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	J. Zico Kolter.
Note	Submitted to the Department of Computer Science.
Thesis	Thesis (Ph.D.)--Stanford University, 2010.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
