Guaranteeing safe online machine learning via reachability analysis
Abstract/Contents
- Abstract
- Reinforcement learning has proven itself to be a powerful technique in robotics; however, it has rarely been employed to learn in a hardware-in-the-loop environment because spurious training data could cause a robot to take an unsafe (and potentially catastrophic) action. This thesis proposes a method for overcoming this limitation, known as Guaranteed Safe Online Learning via Reachability (GSOLR), in which the control outputs from the reinforcement learning algorithm are wrapped inside another controller based on reachability analysis that seeks to guarantee safety against worst-case disturbances. After defining the relevant backwards reachability constructs and explaining how they can be calculated, the thesis formalizes the concept of GSOLR and shows how it can be used on both a simple simulated inverted pendulum example and a non-simulated target-tracking problem, in which an observing quadrotor helicopter must keep a target ground vehicle with unknown (but bounded) dynamics inside its field of view at all times while simultaneously attempting to build a motion model of the target. Extensions to GSOLR are then presented that allow the system's safety guarantees to adapt automatically so that they are neither too liberal nor too conservative, giving the machine learning algorithm running in parallel the widest possible latitude while still guaranteeing system safety. These extensions are also demonstrated on the inverted pendulum example as well as on a practical example, namely that of safely learning an altitude controller for a quadrotor helicopter. These examples demonstrate the GSOLR framework's robustness to errors in machine learning algorithms and indicate its potential for allowing high-performance machine learning systems to be used in safety-critical situations in the future.
Description
Type of resource | text |
---|---|
Form | electronic; electronic resource; remote |
Extent | 1 online resource. |
Publication date | 2013 |
Issuance | monographic |
Language | English |
Creators/Contributors
Associated with | Gillula, Jeremy H |
---|---|
Associated with | Stanford University, Department of Computer Science. |
Primary advisor | Tomlin, Claire J, 1969- |
Thesis advisor | Tomlin, Claire J, 1969- |
Thesis advisor | Fedkiw, Ronald P, 1968- |
Thesis advisor | Latombe, Jean-Claude |
Subjects
Genre | Theses |
---|---|
Bibliographic information
Statement of responsibility | Jeremy H. Gillula. |
---|---|
Note | Submitted to the Department of Computer Science. |
Thesis | Ph.D. Stanford University 2013 |
Location | electronic resource |
Access conditions
- Copyright
- © 2013 by Jeremy Hugh Gillula
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license (CC BY-NC-SA 3.0).