Towards understanding the interaction between learning algorithms and humans
Abstract/Contents
- Abstract
- This thesis considers several learning problems in which the interaction between the end-user and the algorithm plays an important role. The thesis is organized as follows. The first chapter provides background material on mechanism design and reinforcement learning and discusses related work, setting up the next three chapters, which cover our contributions.
- In the second chapter, we consider recommendation systems. Many recommendation algorithms rely on user data to generate recommendations; however, these recommendations in turn affect the data obtained from future users. We aim to understand the effects of this dynamic interaction. We propose a simple model in which users with heterogeneous preferences arrive over time. Based on this model, we prove that naive estimators, i.e. those that ignore this feedback loop, are not consistent. We show that consistent estimators are efficient in the presence of myopic agents. Our results are validated using simulations.
- In the third chapter, we consider a platform that wants to learn a personalized policy for each user, but faces the risk of a user abandoning the platform if they are dissatisfied with the platform's actions. For example, a platform may want to personalize the number of newsletters it sends, but risks the user unsubscribing forever. We propose a general thresholded learning model for scenarios like this, and discuss the structure of optimal policies. We describe salient features of optimal personalization algorithms and how the feedback the platform receives impacts the results. Furthermore, we investigate how the platform can efficiently learn the heterogeneity across users by interacting with a population, and we provide performance guarantees.
- In the fourth chapter, we propose a new experimentation framework for the setting where there are many hypotheses and observations are costly. In such a scenario, it is important to internalize the opportunity cost of assigning a sample to an experiment. We fully characterize the optimal policy and give an algorithm to compute it. Furthermore, we provide a simple heuristic that helps explain the optimal policy. Simulations based on baseball batting average data demonstrate superior performance compared to alternative algorithms. We also discuss more general insights gained from this testing paradigm, such as the paradox of power: high-powered tests can lead to inefficient sampling.
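- The second chapter's central claim, that naive estimators which ignore the feedback loop are inconsistent, can be illustrated with a minimal, hypothetical simulation. This sketch is not the thesis's model; it assumes a stylized setting where each arriving user has an idiosyncratic taste `u` and only consumes (and hence rates) an item when their taste for it is positive. The naive sample mean of observed ratings then converges to a biased value, not the item's true quality:

```python
import random
import statistics

random.seed(0)

true_quality = 0.0  # the item's true mean rating over the whole population
ratings = []

for _ in range(100_000):
    u = random.gauss(0, 1)  # user's idiosyncratic taste for the item
    # Feedback/selection effect: only users who expect to like the item
    # consume it, so only their ratings are ever observed.
    if u > 0:
        ratings.append(true_quality + u)

# Naive estimator: average the observed ratings, ignoring the selection.
naive = statistics.mean(ratings)
print(f"true quality: {true_quality:.2f}, naive estimate: {naive:.2f}")
```

Under these assumptions the naive estimate concentrates near E[u | u > 0] ≈ 0.80 rather than the true quality of 0, no matter how many users arrive, which is the sense in which such an estimator fails to be consistent.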
Description
Type of resource | text |
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2018; ©2018 |
Publication date | 2018
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Schmit, Sven |
Degree supervisor | Johari, Ramesh, 1976- |
Thesis advisor | Johari, Ramesh, 1976- |
Thesis advisor | Goel, Ashish |
Thesis advisor | Ugander, Johan |
Degree committee member | Goel, Ashish |
Degree committee member | Ugander, Johan |
Associated with | Stanford University, Institute for Computational and Mathematical Engineering. |
Subjects
Genre | Theses |
Genre | Text |
Bibliographic information
Statement of responsibility | Sven Schmit. |
Note | Submitted to the Institute for Computational and Mathematical Engineering. |
Thesis | Thesis (Ph.D.)--Stanford University, 2018.
Location | electronic resource |
Access conditions
- Copyright
- © 2018 by Sven Peter Schmit
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).