Guaranteeing safe online machine learning via reachability analysis


Abstract/Contents

Abstract
Reinforcement learning has proven to be a powerful technique in robotics, but it has rarely been used to learn in a hardware-in-the-loop environment because spurious training data could cause a robot to take an unsafe (and potentially catastrophic) action. This thesis proposes a method for overcoming this limitation, known as Guaranteed Safe Online Learning via Reachability (GSOLR), in which the control outputs of the reinforcement learning algorithm are wrapped inside another controller, based on reachability analysis, that seeks to guarantee safety against worst-case disturbances. After defining the relevant backwards reachability constructs and explaining how they can be calculated, the thesis formalizes the concept of GSOLR and shows how it can be applied to both a simple simulated inverted pendulum example and a non-simulated target tracking problem, in which an observing quadrotor helicopter must keep a target ground vehicle with unknown (but bounded) dynamics inside its field of view at all times while simultaneously attempting to build a motion model of the target. Extensions to GSOLR are then presented that automatically adjust the system's safety guarantees to be neither too liberal nor too conservative, giving the machine learning algorithm running in parallel the widest possible latitude while still guaranteeing system safety. These extensions are also demonstrated on the inverted pendulum example and on a practical example, namely safely learning an altitude controller for a quadrotor helicopter. Together, these examples demonstrate the GSOLR framework's robustness to errors in machine learning algorithms and indicate its potential for allowing high-performance machine learning systems to be used in safety-critical situations in the future.
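The idea of wrapping a learner's output inside a safety controller can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes a toy double integrator (position x, velocity v) whose safe set has a closed-form expression, in place of a numerically computed reachable set. All names and constants (X_MAX, U_MAX, D_MAX, safety_margin, filtered_control, the threshold value) are hypothetical choices made for illustration.

```python
import random

# Hypothetical toy problem: keep |x| <= X_MAX for the system
#   x' = v,  v' = u + d,  |u| <= U_MAX,  |d| <= D_MAX  (D_MAX < U_MAX).
X_MAX = 1.0
U_MAX = 2.0
D_MAX = 0.5
DT = 0.001                 # explicit Euler step
BRAKE = U_MAX - D_MAX      # deceleration available against the worst-case disturbance


def wall_margins(x, v):
    """Distance left to each wall after braking under the worst-case disturbance."""
    upper = X_MAX - x - max(v, 0.0) ** 2 / (2.0 * BRAKE)
    lower = X_MAX + x - max(-v, 0.0) ** 2 / (2.0 * BRAKE)
    return upper, lower


def safety_margin(x, v):
    """Non-negative when the state can still be kept inside |x| <= X_MAX."""
    return min(wall_margins(x, v))


def filtered_control(x, v, u_learned, threshold=0.05):
    """Pass the learner's control through unless the state is near the
    boundary of the safe set, in which case brake away from the nearer wall."""
    upper, lower = wall_margins(x, v)
    if min(upper, lower) <= threshold:
        return -U_MAX if upper <= lower else U_MAX
    return max(-U_MAX, min(U_MAX, u_learned))  # clip to actuator limits


def simulate(policy, use_filter, steps=5000, seed=0):
    """Roll the system forward and return the largest |x| that was reached."""
    rng = random.Random(seed)
    x, v, worst = 0.0, 0.0, 0.0
    for _ in range(steps):
        u = policy(x, v)
        if use_filter:
            u = filtered_control(x, v, u)
        else:
            u = max(-U_MAX, min(U_MAX, u))
        d = rng.uniform(-D_MAX, D_MAX)  # bounded disturbance, unknown to the controller
        x, v = x + v * DT, v + (u + d) * DT
        worst = max(worst, abs(x))
    return worst


def always_push(x, v):
    """Stand-in for a badly trained learner: always drives toward the wall."""
    return U_MAX


if __name__ == "__main__":
    print("filtered  :", simulate(always_push, use_filter=True))
    print("unfiltered:", simulate(always_push, use_filter=False))
```

In this sketch the learner keeps full authority in the interior of the safe set and is overridden only near its boundary. The threshold is a hand-picked buffer that absorbs the discretization error of the Euler step, not a value taken from the thesis.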

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2013
Issuance monographic
Language English

Creators/Contributors

Associated with Gillula, Jeremy H
Associated with Stanford University, Department of Computer Science.
Primary advisor Tomlin, Claire J, 1969-
Thesis advisor Tomlin, Claire J, 1969-
Thesis advisor Fedkiw, Ronald P, 1968-
Thesis advisor Latombe, Jean-Claude
Advisor Fedkiw, Ronald P, 1968-
Advisor Latombe, Jean-Claude

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Jeremy H. Gillula.
Note Submitted to the Department of Computer Science.
Thesis Ph.D. Stanford University 2013
Location electronic resource

Access conditions

Copyright
© 2013 by Jeremy Hugh Gillula
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license (CC BY-NC-SA 3.0).
