Applied statistical methods for high-dimensional generalized linear models

Abstract/Contents

Abstract
The generalized linear model (GLM) is a fundamental statistical model for describing the relation between a response variable and a set of covariates. The coefficients of a GLM are usually estimated by maximum likelihood, and confidence intervals for the coefficients are constructed using the classical asymptotic theory of the maximum likelihood estimator (MLE). While the classical theory is valid when the number of variables p is negligible compared to the number of observations n, it is invalid when p is comparable to n. To infer model parameters in the high-dimensional setting, researchers have studied the asymptotic distribution of the MLE when p grows with n at a constant ratio, and have found this asymptotic regime to be informative in practical settings. These works typically focus on the setting where the covariates are i.i.d. or multivariate Gaussian. One open question is how to estimate the distribution of the MLE for general covariates. In this work, we study the distribution of the MLE with the objective of achieving valid inference for a high-dimensional GLM. We take two approaches. First, we derive the theoretical distribution of the MLE in high-dimensional logistic regression when the covariates are multivariate Gaussian, and we demonstrate that our theory is accurate for moderate sample sizes. Second, when the covariates are not Gaussian, we develop a resized bootstrap method to approximate the MLE distribution. We observe in simulated examples that the resized bootstrap provides valid inference for a variety of GLMs and covariate distributions. One application of our method is constructing confidence intervals for GLM coefficients.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021; 2021
Issuance monographic
Language English

Creators/Contributors

Author Zhao, Qian, (Researcher in applied statistical methods)
Degree supervisor Candès, Emmanuel J. (Emmanuel Jean)
Thesis advisor Candès, Emmanuel J. (Emmanuel Jean)
Thesis advisor Montanari, Andrea
Thesis advisor Taylor, Jonathan E
Degree committee member Montanari, Andrea
Degree committee member Taylor, Jonathan E
Associated with Stanford University, Department of Statistics

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Qian Zhao.
Note Submitted to the Department of Statistics.
Thesis Thesis Ph.D. Stanford University 2021.
Location https://purl.stanford.edu/wc409cq6066

Access conditions

Copyright
© 2021 by Qian Zhao
