Efficient exploration in bandit and reinforcement learning



Sequential decision-making problems lie at the core of many real-world applications. In such problems, an agent aims to achieve a certain goal by taking an optimal sequence of actions based on noisy observations. Bandit and reinforcement learning are fundamental frameworks for modeling decision making under uncertainty. Efficient exploration in these problems significantly increases data efficiency: it speeds up the learning process and reduces the amount of data needed to make decisions. It is therefore important to design exploration schemes tailored to the special characteristics of each practical problem. In this dissertation, we first consider a safe exploration problem in linear bandits and propose an algorithm that satisfies safety constraints while minimizing the regret. We provide theoretical analysis and simulation results to demonstrate the efficiency of the proposed algorithm. Then, we consider the best arm identification problem in generalized linear bandits and provide a gap-based exploration strategy that achieves a desired level of accuracy. We also provide an upper bound on the sample complexity of the proposed algorithm and offer numerical studies to evaluate its performance.


Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2019; ©2019
Publication date 2019; 2019
Issuance monographic
Language English


Author Kazerouni, Abbas
Degree supervisor Wein, Lawrence
Thesis advisor Wein, Lawrence
Thesis advisor Van Roy, Benjamin
Thesis advisor Weissman, Tsachy
Degree committee member Van Roy, Benjamin
Degree committee member Weissman, Tsachy
Associated with Stanford University, Department of Electrical Engineering.


Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Abbas Kazerouni.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis Ph.D. Stanford University 2019.
Location electronic resource

Access conditions

© 2019 by Abbas Kazerouni
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

