Topics in selective inference

Abstract/Contents

Abstract
This thesis addresses problems of inference and estimation after the data have been used to select a reasonable statistical model. In particular, it uses the conditional approach espoused in selective inference. This approach allows us to compute exact level-$\alpha$ tests and $1-\alpha$ confidence intervals by studying the distribution of the data conditional on the selection event. Chapter 2 -- based on Tian and Taylor [2015b] -- introduces the concept of randomized selection. Selectively valid tests after randomized selection are more powerful than those after non-randomized selection. Randomization also allows consistent estimation and weak convergence of selective inference procedures, which extends selective inference to nonparametric settings. Finally, we propose a framework for inference after combining multiple randomized selection procedures, which provides a framework for adaptive data analysis. Chapter 3 -- based on Tian et al. [2016a, b] -- proposes a novel approach to computing the selective tests through MCMC sampling. The novelty lies in a reparametrization of the conditional distribution, which makes sampling feasible for selection after solving a general penalized regression problem. This, together with the result in Tian and Taylor [2015b], allows selective inference under an arbitrary data-generating distribution and with general model selection through penalized regression. Applications include LASSO, forward stepwise, stagewise algorithms, marginal screening and generalized LASSO, where selective tests are otherwise hard to compute. Chapter 4 -- based on Tian [2016] -- takes a different perspective on the impact of model selection. In addition to statistical inference, the effect of model selection is also manifested in the estimation of prediction error. Although the prediction error of a linear estimator can be computed through the $C_p$ formula, the formula is not valid when the variables used to construct the estimator are themselves selected using the data. In this work, I propose a method for estimating the prediction error of linear estimators after arbitrary model selection procedures. Direct applications include estimating the degrees of freedom of arbitrary selection procedures and heritability estimation in genetics.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2017
Issuance monographic
Language English

Creators/Contributors

Associated with Tian, Xiaoying
Associated with Stanford University, Department of Statistics.
Primary advisor Taylor, Jonathan
Thesis advisor Taylor, Jonathan
Thesis advisor Hastie, Trevor
Thesis advisor Tibshirani, Robert

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Xiaoying Tian.
Note Submitted to the Department of Statistics.
Thesis Thesis (Ph.D.)--Stanford University, 2017.
Location electronic resource

Access conditions

Copyright
© 2017 by Xiaoying Tian
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
