Data-driven sequential decision making by understanding and adopting rational behavior


Abstract/Contents

Abstract
A remarkable feature of an intelligent agent is the ability to make sequences of smart decisions that are executed in coordination to reach goals. As human behavior shows, a polished sequential decision-making policy yields elegant behaviors such as smooth driving, dexterous locomotion, and prudent investing. Learning optimal policies for sequential decision making is challenging because of the difficulty of long-horizon credit assignment, exploration in exponentially large search spaces, and the design of reward functions that encourage the correct behavior. In this dissertation, we are interested in perhaps one of the most natural forms of learning that humans engage in: learning from observation. We focus on algorithms that enable data-driven learning of sequential decision-making policies by observing optimal behavior demonstrated by other rational agents. This process comprises two main steps: understanding and adoption. In the first part, we discuss how to design algorithms that allow an agent to understand, and thus internalize, rational behavior. We develop an active world model learning algorithm that enables an ego-agent to build models of complex behaviors demonstrated by human-like animate agents by efficiently directing its attention. We further investigate the feasibility of building models of other rational agents with Inverse Reinforcement Learning. In the second part, we develop methods to adopt rational behavior from demonstrations. We develop algorithms for Imitation Learning in the presence of domain mismatch, such as morphological and viewpoint differences. We further propose algorithms for imitation via Inverse Reinforcement Learning that extract underlying rewards from demonstrations of complex behaviors such as robotic locomotion. We hope that these contributions bring us one step closer to solving real-world sequential decision-making problems with machine learning.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2023; ©2023
Publication date 2023
Issuance monographic
Language English

Creators/Contributors

Author Kim, Kuno
Degree supervisor Ermon, Stefano
Thesis advisor Ermon, Stefano
Thesis advisor Haber, Nick
Thesis advisor Sadigh, Dorsa
Degree committee member Haber, Nick
Degree committee member Sadigh, Dorsa
Associated with Stanford University, School of Engineering
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Kun Ho Kim.
Note Submitted to the Computer Science Department.
Thesis Thesis Ph.D. Stanford University 2023.
Location https://purl.stanford.edu/ff540tm4390

Access conditions

Copyright
© 2023 by Kun Ho Kim
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
