Fundamental limits and optimal uses of data in modern learning problems

Han, Yanjun

Fundamental limits and optimal uses of data in modern learning problems

<a href="https://embed.stanford.edu/iframe/?url=https%3A%2F%2Fpurl.stanford.edu%2Ftn249zy5641" class="su-underline">Show Content</a>

Abstract/Contents

Abstract: What is the best we can do with the amount of data at our disposal with a given learning task? Modern learning problemsâwith a modest amount of data or subject to data processing constraintsâfrequently raise the needs to understand the fundamental limits and make judicious use of the available small or imperfect data. Given a learning problem, the aim of this thesis is to exploit its key structure, as well as optimally trade between real-world resources, to achieve statistical efficiency. This thesis covers three examples spanning statistics, information theory, theoretical computer science, operations management, and economics. The first example is the estimation of entropy of continuous distributions, an important and classic problem in information theory and statistics. In this problem, we identify and overcome the bottleneck caused by small densities, and provide the first characterization of the minimax risk without the prevalent assumption that the densities are bounded away from zero. The second example is the unified estimation of properties of discrete distributions. On one hand, we show that two approaches, based on either local moment matching or profile maximum likelihood, could both be universally optimal. On the other hand, we determine the fundamental limits of all unified estimators, and identify a tight separation between ad-hoc and unified approaches emerged only in the high-accuracy regime. The third example concerns the optimal bidding in first-price auctions, which have very recently swept the online ad industry. We characterize the optimal performances and propose computationally efficient algorithms, under various assumptions on the characteristics of the other bidders' bids, of the bidder's private valuation, of the feedback structure of the auction, and of the reference policies with which our bidder competes. Experimentation on first-price auction datasets from Verizon Media demonstrates the promise of our schemes relative to existing bidding algorithms.

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2021; ©2021
Publication date	2021; 2021
Issuance	monographic
Language	English

Creators/Contributors

Author	Han, Yanjun
Degree supervisor	Weissman, Tsachy
Thesis advisor	Weissman, Tsachy
Thesis advisor	Montanari, Andrea
Thesis advisor	Özgür, Ayfer
Degree committee member	Montanari, Andrea
Degree committee member	Özgür, Ayfer
Associated with	Stanford University, Department of Electrical Engineering

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Yanjun Han.
Note	Submitted to the Department of Electrical Engineering.
Thesis	Thesis Ph.D. Stanford University 2021.
Location	https://purl.stanford.edu/tn249zy5641

Access conditions

License: This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

Also listed in

View in SearchWorks

Loading usage metrics...