The surprising power of little data
Abstract/Contents
- Abstract
- Although our datasets are growing rapidly, the inherent complexity of the problems we are solving is growing as well, perhaps at an even faster rate. This prompts the question of how to extract the most information from the available data. This thesis discusses several examples that reveal a surprising ability to extract accurate information from modest amounts of data. The first setting considers data provided by a large number of heterogeneous individuals, and we show that the empirical distribution of the data can be significantly "de-noised". The second setting considers estimating the covariance spectrum of a high-dimensional distribution in the sublinear sample regime, where the empirical distribution of the data is misleading. The final setting focuses on estimating "learnability": given too little data to learn an accurate prediction model, we can still accurately estimate the value of collecting more data. Specifically, for some natural model classes, we can estimate the performance of the best model in the class, given too little data to find any model in the class that would achieve good prediction error. We extend our techniques for estimating learnability to more general stochastic optimization problems, including the contextual bandit setting. In most of these settings, our algorithms are provably information-theoretically optimal, highly practical, and empirically evaluated on real-world datasets.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2019; ©2019 |
Publication date | 2019 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Kong, Weihao | |
---|---|---|
Degree supervisor | Valiant, Gregory | |
Thesis advisor | Valiant, Gregory | |
Thesis advisor | Charikar, Moses | |
Thesis advisor | Reingold, Omer | |
Degree committee member | Charikar, Moses | |
Degree committee member | Reingold, Omer | |
Associated with | Stanford University, Computer Science Department. |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Weihao Kong. |
---|---|
Note | Submitted to the Computer Science Department. |
Thesis | Thesis Ph.D. Stanford University 2019. |
Location | electronic resource |
Access conditions
- Copyright
- © 2019 by Weihao Kong
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).