Fundamental limits and optimal uses of data in modern learning problems

Placeholder Show Content

Abstract/Contents

Abstract
What is the best we can do with the amount of data at our disposal with a given learning task? Modern learning problems—with a modest amount of data or subject to data processing constraints—frequently raise the needs to understand the fundamental limits and make judicious use of the available small or imperfect data. Given a learning problem, the aim of this thesis is to exploit its key structure, as well as optimally trade between real-world resources, to achieve statistical efficiency. This thesis covers three examples spanning statistics, information theory, theoretical computer science, operations management, and economics. The first example is the estimation of entropy of continuous distributions, an important and classic problem in information theory and statistics. In this problem, we identify and overcome the bottleneck caused by small densities, and provide the first characterization of the minimax risk without the prevalent assumption that the densities are bounded away from zero. The second example is the unified estimation of properties of discrete distributions. On one hand, we show that two approaches, based on either local moment matching or profile maximum likelihood, could both be universally optimal. On the other hand, we determine the fundamental limits of all unified estimators, and identify a tight separation between ad-hoc and unified approaches emerged only in the high-accuracy regime. The third example concerns the optimal bidding in first-price auctions, which have very recently swept the online ad industry. We characterize the optimal performances and propose computationally efficient algorithms, under various assumptions on the characteristics of the other bidders' bids, of the bidder's private valuation, of the feedback structure of the auction, and of the reference policies with which our bidder competes. Experimentation on first-price auction datasets from Verizon Media demonstrates the promise of our schemes relative to existing bidding algorithms.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021; 2021
Issuance monographic
Language English

Creators/Contributors

Author Han, Yanjun
Degree supervisor Weissman, Tsachy
Thesis advisor Weissman, Tsachy
Thesis advisor Montanari, Andrea
Thesis advisor Özgür, Ayfer
Degree committee member Montanari, Andrea
Degree committee member Özgür, Ayfer
Associated with Stanford University, Department of Electrical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Yanjun Han.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis Ph.D. Stanford University 2021.
Location https://purl.stanford.edu/tn249zy5641

Access conditions

Copyright
© 2021 by Yanjun Han
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

Also listed in

Loading usage metrics...