Robust learning and evaluation in sequential decision making

Placeholder Show Content

Abstract/Contents

Abstract
Reinforcement learning (RL), as a branch of artificial intelligence, is concerned with making a good sequence of decisions given experience and rewards in a stochastic environment. RL algorithms, propelled by the rise of deep learning and neural networks, have shown an impressive performance in achieving human-level performance in games like Go, Chess, and Atari. However, when applied to high-stakes real-world applications, these impressive performances are not matched. This dissertation tackles some important challenges around robustness that hinder our ability to unleash the potential of RL to real-world applications. We look at the robustness of RL algorithms in both online and offline settings. In an online setting, we develop an algorithm for sample efficient safe policy learning. In an offline setting, we tackle issues of unobserved confounders and heterogeneity in off-policy policy evaluation.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021; 2021
Issuance monographic
Language English

Creators/Contributors

Author Keramati, Ramtin
Degree supervisor Brunskill, Emma
Thesis advisor Brunskill, Emma
Thesis advisor Pavone, Marco, 1980-
Thesis advisor Van Roy, Benjamin
Degree committee member Pavone, Marco, 1980-
Degree committee member Van Roy, Benjamin
Associated with Stanford University, Institute for Computational and Mathematical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Ramtin Keramati.
Note Submitted to the Institute for Computational and Mathematical Engineering.
Thesis Thesis Ph.D. Stanford University 2021.
Location https://purl.stanford.edu/dd732zb2339

Access conditions

Copyright
© 2021 by Ramtin Keramati
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

Also listed in

Loading usage metrics...