Towards understanding the interaction between learning algorithms and humans


Abstract/Contents

Abstract
This thesis considers several learning problems where the interaction between the end-user and the algorithm plays an important role. The thesis is organized as follows. The first chapter provides background material on mechanism design and reinforcement learning, and discusses related work. This sets us up for the next three chapters, which cover our contributions.

In the second chapter, we consider recommendation systems. Many recommendation algorithms rely on user data to generate recommendations; however, these recommendations also affect the data obtained from future users. We aim to understand the effects of this dynamic interaction. We propose a simple model in which users with heterogeneous preferences arrive over time. Based on this model, we prove that naive estimators, i.e. those that ignore this feedback loop, are not consistent. We show that consistent estimators are efficient in the presence of myopic agents. Our results are validated using simulations.

In the third chapter, we consider a platform that wants to learn a personalized policy for each user, but faces the risk that a user abandons the platform if they are dissatisfied with its actions. For example, a platform may want to personalize the number of newsletters it sends, but faces the risk that the user unsubscribes forever. We propose a general thresholded learning model for scenarios like this and discuss the structure of optimal policies. We describe salient features of optimal personalization algorithms and how the feedback the platform receives impacts the results. Furthermore, we investigate how the platform can efficiently learn the heterogeneity across users by interacting with a population, and provide performance guarantees.

In the fourth chapter, we propose a new experimentation framework for the setting where there are many hypotheses and observations are costly.
In such a scenario, it is important to internalize the opportunity cost of assigning a sample to an experiment. We fully characterize the optimal policy and give an algorithm to compute it. Furthermore, we provide a simple heuristic that helps explain the optimal policy. Simulations based on baseball batting average data demonstrate superior performance compared to alternative algorithms. We also discuss more general insights gained from this testing paradigm, such as the paradox of power: high-powered tests can lead to inefficient sampling.
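The feedback loop described in the second chapter can be illustrated with a toy simulation. This is a sketch, not the thesis's actual model: the single-item setup, Gaussian private tastes, and an outside option of 0 are all assumptions made here for illustration. Users self-select into consuming based on the platform's current estimate, so a naive average of observed ratings systematically overestimates quality:

```python
import numpy as np

rng = np.random.default_rng(0)

quality = 0.5      # true item quality (hypothetical value for this sketch)
n_users = 50_000
estimate = 0.0     # platform's naive running estimate of quality
total, count = 0.0, 0

for _ in range(n_users):
    pref = rng.normal()            # user's private taste, unobserved by the platform
    # self-selection: the user consumes only if predicted utility
    # (estimate + private taste) beats an outside option of 0
    if estimate + pref > 0:
        total += quality + pref    # observed rating mixes quality and taste
        count += 1
        estimate = total / count   # naive update ignores the selection effect

bias = estimate - quality
print(f"naive estimate: {estimate:.2f}, true quality: {quality}, bias: {bias:+.2f}")
```

In this toy setting the naive estimate settles well above the true quality: only users whose taste exceeds the selection threshold leave ratings, so the conditional mean of the taste term is positive and the estimator is inconsistent, consistent with the abstract's claim about feedback-ignoring estimators.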

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2018
Publication date 2018
Issuance monographic
Language English

Creators/Contributors

Author Schmit, Sven
Degree supervisor Johari, Ramesh, 1976-
Thesis advisor Johari, Ramesh, 1976-
Thesis advisor Goel, Ashish
Thesis advisor Ugander, Johan
Degree committee member Goel, Ashish
Degree committee member Ugander, Johan
Associated with Stanford University, Institute for Computational and Mathematical Engineering.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Sven Schmit.
Note Submitted to the Institute for Computational and Mathematical Engineering.
Thesis Thesis (Ph.D.), Stanford University, 2018.
Location electronic resource

Access conditions

Copyright
© 2018 by Sven Peter Schmit
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
