Topics in robust mean estimation and applications to importance sampling
Abstract/Contents
- Abstract
- The sample mean is often used to aggregate different unbiased estimates of a real parameter, producing a final estimate that is unbiased but possibly with high variance. This thesis proposes two new robust estimators that can adaptively trade off some bias for variance, resulting in more competitive non-parametric mean estimators. The first estimator winsorizes the sample mean in a data-dependent way. The threshold level at which to winsorize is determined by a concrete version of the Balancing Principle, also known as Lepski's Method. The procedure chooses a threshold level among a pre-defined set by roughly balancing the bias and variance of the estimator when winsorized at different levels. While the assumptions of the Balancing Principle are probabilistic, in the importance sampling setting it is possible to bound such probabilities to obtain a finite-sample theorem yielding a principled way to perform winsorization with optimality guarantees. The second estimator introduces an aggregation rule that roughly interpolates between the sample mean and median, resulting in estimates with much smaller variance at the expense of bias. While the procedure is non-parametric, its squared bias is asymptotically negligible relative to the variance, similar to maximum likelihood estimators. The estimator is consistent, and concentration bounds for its bias and L1 error are derived, as well as a fast, non-randomized approximating algorithm that enjoys similar theoretical properties. The empirical performance of the estimators is examined on real and simulated data, with a focus on importance sampling. They generally match the performance of the sample mean in low-variance settings, while exhibiting far better results in high-variance scenarios.
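To illustrate the basic operation the abstract refers to, the following is a minimal sketch of winsorizing a sample mean at a fixed, user-supplied threshold. It is not the thesis's procedure: the data-dependent threshold selection via the Balancing Principle is not reproduced here, and the function name `winsorized_mean`, the symmetric clipping to `[-threshold, threshold]`, and the sample values are all assumptions made for illustration.

```python
from typing import Sequence


def winsorized_mean(values: Sequence[float], threshold: float) -> float:
    """Average the values after clipping each one to [-threshold, threshold].

    Hypothetical helper: the threshold is fixed by the caller, whereas the
    abstract describes choosing it in a data-dependent way.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    if len(values) == 0:
        raise ValueError("values must be non-empty")
    clipped = [max(-threshold, min(threshold, v)) for v in values]
    return sum(clipped) / len(clipped)


# Made-up sample with one extreme value, as can happen with importance weights.
sample = [0.5, 1.0, 1.5, 2.0, 100.0]

plain_mean = sum(sample) / len(sample)                  # dominated by the outlier
clipped_mean = winsorized_mean(sample, threshold=5.0)   # outlier clipped to 5.0
```

Clipping introduces bias whenever values exceed the threshold but limits the influence of extreme observations, which is the bias-variance trade-off the abstract describes.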
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2019; ©2019 |
Publication date | 2019; 2019 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Najberg Orenstein, Paulo |
---|---|
Degree supervisor | Diaconis, Persi |
Thesis advisor | Diaconis, Persi |
Thesis advisor | Chatterjee, Sourav |
Thesis advisor | Wong, Wing Hung |
Degree committee member | Chatterjee, Sourav |
Degree committee member | Wong, Wing Hung |
Associated with | Stanford University, Department of Statistics. |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Paulo Najberg Orenstein. |
---|---|
Note | Submitted to the Department of Statistics. |
Thesis | Thesis Ph.D. Stanford University 2019. |
Location | electronic resource |
Access conditions
- Copyright
- © 2019 by Paulo Najberg Orenstein
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).