Regularization in high-dimensional statistics

Abstract/Contents

Abstract
Modern datasets are growing in terms of samples but even more so in terms of variables. We often encounter datasets where the samples consist of time series, images, or even movies, so that each sample has thousands or even millions of variables. Classical statistical approaches are inadequate for working with such high-dimensional data because they rely on theoretical and computational tools developed without such data in mind. The work in this thesis seeks to close the apparent gap between the growing size of emerging datasets and the capabilities of existing approaches to statistical estimation, inference, and computing. This thesis focuses on two problems that arise in learning from high-dimensional data, as opposed to black-box approaches that do not yield insights into the underlying data-generation process. They are: (1) model selection and post-selection inference: discovering the latent low-dimensional structure in high-dimensional data; and (2) scalable statistical computing: designing scalable estimators and algorithms that avoid communication and minimize "passes" over the data. The work relies crucially on results from convex analysis and geometry. Many of the algorithms and proofs are inspired by results from this beautiful but dusty corner of mathematics.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2015
Issuance monographic
Language English

Creators/Contributors

Associated with Sun, Yuekai
Associated with Stanford University, Institute for Computational and Mathematical Engineering.
Primary advisor Saunders, Michael
Primary advisor Taylor, Jonathan
Thesis advisor Saunders, Michael
Thesis advisor Taylor, Jonathan
Thesis advisor Montanari, Andrea

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Yuekai Sun.
Note Submitted to the Institute for Computational and Mathematical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2015.
Location electronic resource

Access conditions

Copyright
© 2015 by Yuekai Sun
License
This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).
