More sample efficient and robust reinforcement learning with domain knowledge
Abstract/Contents
- Abstract
- Reinforcement learning has achieved great success in environments with good simulators (for example, Atari, StarCraft, Go, and various robotic tasks), where agents have achieved performance on par with or exceeding that of humans. However, the application of reinforcement learning to real-world, human-facing applications has been limited by issues such as high sample complexity. This dissertation proposes methods that work towards addressing these issues by utilizing domain knowledge and structure. Domain knowledge was the main component of the first class of successful AI systems, expert rule-based systems; however, owing to many challenges, including the large amount of expensive expert time required, the research community has shifted towards data-driven methods that learn automatically. This dissertation presents methods that aim to combine the benefits of expert-based systems with the strengths of reinforcement learning, achieving better sample efficiency or more robust performance while placing only a minimal burden on experts. It proposes several novel methods for leveraging different types of domain knowledge across different reinforcement learning settings: incorporating expert domain knowledge and heuristics to speed up online reinforcement learning; exploiting repeated structure in procedure/imitation learning; accounting for anticipated domain distribution shift in batch contextual bandit settings; and using a curriculum graph to create better personalized adaptive progressions in a real-world educational webgame. We empirically evaluate our methods in simulators designed with real-world data, such as recommendation systems and educational activity sequencing, and we additionally test one of our methods in a real-world Korean language learning webgame.
- For all our methods, we demonstrate faster or more robust performance. This shows promise for reinforcement learning methods to be helpful in human-facing applications.
Description
| Type of resource | text |
|---|---|
| Form | electronic resource; remote; computer; online resource |
| Extent | 1 online resource |
| Place | California |
| Place | [Stanford, California] |
| Publisher | [Stanford University] |
| Copyright date | ©2022 |
| Publication date | 2022 |
| Issuance | monographic |
| Language | English |
Creators/Contributors
| Author | Mu, Tong |
|---|---|
| Degree supervisor | Brunskill, Emma |
| Thesis advisor | Brunskill, Emma |
| Thesis advisor | Sadigh, Dorsa |
| Thesis advisor | Van Roy, Benjamin |
| Degree committee member | Sadigh, Dorsa |
| Degree committee member | Van Roy, Benjamin |
| Associated with | Stanford University, Department of Electrical Engineering |
Subjects
| Genre | Theses |
|---|---|
| Genre | Text |
Bibliographic information
| Statement of responsibility | Tong Mu. |
|---|---|
| Note | Submitted to the Department of Electrical Engineering. |
| Thesis | Thesis (Ph.D.), Stanford University, 2022. |
| Location | https://purl.stanford.edu/zt338pg7556 |
Access conditions
- Copyright
- © 2022 by Tong Mu
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).