Coordinated exploration in concurrent reinforcement learning

Dimakopoulou, Maria

Abstract/Contents

Abstract: Consider a farm of robots operating concurrently, learning to carry out a task, and sharing data with one another in real time. There are clear benefits to scale: a larger number of robots can gather and share more data in the same amount of time, which enables each robot in the group to learn faster than it would on its own. However, the benefits of parallelization can be greatly magnified if the robots explore the environment in a coordinated fashion, diversifying their experience and adapting appropriately as data is gathered. We identify three properties that are necessary for efficient coordinated exploration: adaptivity, commitment, and diversity. We demonstrate that straightforward extensions of statistically efficient optimistic and posterior sampling approaches from single-agent to concurrent reinforcement learning fail to satisfy them. As an alternative, we propose seed sampling, which extends posterior sampling in a manner that meets these requirements. We proceed to design concurrent reinforcement learning algorithms that achieve the three properties of efficient coordinated exploration in real systems with potentially enormous state spaces, to which tabular methods do not scale. This is achieved by coupling the seed sampling concept with learned general value function or policy function representations, which allow the agents to generalize and operate in arbitrarily large environments. Overall, this dissertation presents a concept that enables rapid learning of effective policies in the presence of complex goals, when multiple agents learn to operate in parallel in a common, unknown environment and collaborate in real time. A multitude of applications can benefit from this algorithmic framework, ranging from recommendation systems to robot automation to autonomous vehicles.

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2018; ©2018
Publication date	2018
Issuance	monographic
Language	English

Creators/Contributors

Author	Dimakopoulou, Maria
Degree supervisor	Van Roy, Benjamin
Thesis advisor	Van Roy, Benjamin
Thesis advisor	Athey, Susan
Thesis advisor	Brunskill, Emma
Degree committee member	Athey, Susan
Degree committee member	Brunskill, Emma
Associated with	Stanford University, Department of Management Science and Engineering.

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Maria Dimakopoulou.
Note	Submitted to the Department of Management Science and Engineering.
Thesis	Thesis Ph.D. Stanford University 2018.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
