Topics in selective inference
- This thesis addresses problems of inference and estimation after the data have been used to select a reasonable statistical model. In particular, it takes the conditional approach espoused in selective inference. This approach allows us to compute exact level-$\alpha$ tests and $1-\alpha$ confidence intervals by studying the distribution of the data conditional on the selection event. Chapter 2 -- based on Tian and Taylor [2015b] -- introduces the concept of randomized selection. Selectively valid tests after randomized selection are more powerful. Randomization also allows consistent estimation and weak convergence of selective inference procedures, which extends selective inference to nonparametric settings. Finally, we propose a framework for inference after combining multiple randomized selection procedures, which provides a basis for adaptive data analysis. Chapter 3 -- based on Tian et al. [2016a, b] -- proposes a novel approach to computing selective tests through MCMC sampling. The novelty lies in the reparametrization of the conditional distribution, which makes sampling feasible for selection after solving a general penalized regression problem. This, together with the result in Tian and Taylor [2015b], allows selective inference with an arbitrary data-generating distribution and general model selection through penalized regression. Applications include the LASSO, forward stepwise, stagewise algorithms, marginal screening, and the generalized LASSO, where selective tests are otherwise hard to compute. Chapter 4 -- based on Tian -- takes a different perspective on the impact of model selection. Beyond statistical inference, the effect of model selection also manifests in the estimation of prediction error. Although the prediction error of a linear estimator can be computed through the $C_p$ formula, the formula is not valid when the variables used to construct the estimator are themselves selected using the data.
In this work, I propose a method for estimating the prediction error of linear estimators after arbitrary model selection procedures. Direct applications include estimating the degrees of freedom of arbitrary selection procedures and heritability estimation in genetics.
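The conditional approach described above can be illustrated with a toy example that is not taken from the thesis. Assume a single observation $y \sim N(\mu, 1)$ that is reported only when it exceeds a fixed threshold, and test $\mu = 0$ against $\mu > 0$. The function names and the thresholding rule below are illustrative assumptions; the sketch only shows how conditioning on the selection event changes the p-value.

```python
import math


def normal_sf(x):
    """Upper tail P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def naive_p_value(y):
    """One-sided p-value for mu = 0 that ignores how y was selected."""
    return normal_sf(y)


def selective_p_value(y, threshold):
    """One-sided p-value for mu = 0, conditional on the event y > threshold.

    Under mu = 0 the selected y follows a standard normal truncated to
    (threshold, inf), so the tail probability is renormalized by the
    probability of being selected.
    """
    if y <= threshold:
        raise ValueError("y was not selected: it must exceed the threshold")
    return normal_sf(y) / normal_sf(threshold)


if __name__ == "__main__":
    y, threshold = 2.0, 1.0  # hypothetical observation and selection rule
    print("naive p-value:    ", naive_p_value(y))
    print("selective p-value:", selective_p_value(y, threshold))
```

In this toy setting the selective p-value is never smaller than the naive one, which reflects the fact that an observation that had to clear a threshold to be reported carries less evidence against the null than its raw size suggests.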
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Stanford University, Department of Statistics.
|Statement of responsibility
|Submitted to the Department of Statistics.
|Thesis (Ph.D.)--Stanford University, 2017.
- © 2017 by Xiaoying Tian
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).