Stein's lemma and subsampling in large-scale optimization


Abstract/Contents

Abstract
Statistics and optimization have been closely linked from the very outset. This connection has become more essential of late, mainly because of recent advances in computational resources, the availability of large amounts of data, and the consequent growing interest in statistical and machine learning algorithms. In this dissertation, we will discuss how one can use tools from statistics such as Stein's lemma, subsampling, and shrinkage to design scalable and efficient optimization algorithms. The focus will be on large-scale problems where iterative minimization of the empirical risk (or maximization of the log-likelihood) is computationally intractable, that is, when the number of observations n is much larger than the dimension of the parameter p. In each chapter, we will discuss an efficient estimator or optimization algorithm designed for training a statistical model when the dataset is large, i.e., in the regime n >> p >> 1. The proposed algorithms have wide applicability to many supervised learning problems such as binary classification with smooth surrogate losses, generalized linear problems in their canonical representation, and M-estimators. The algorithms rely on iterations constructed through Stein's lemma, subsampling, and/or shrinkage; these iterations achieve a quadratic convergence rate and are cheaper than any batch optimization method by at least a factor of O(p). We will discuss theoretical guarantees for the proposed algorithms, along with their convergence behavior in terms of the data dimensions. Finally, we will demonstrate their performance on well-known classification and regression problems through extensive numerical studies on large-scale real datasets, and show that they outperform other widely used and specialized algorithms.
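To give a concrete flavor of the subsampling idea the abstract alludes to, the sketch below shows one generic subsampled Newton-type step for binary classification with the logistic (smooth surrogate) loss. It is only a minimal illustration under stated assumptions, not the dissertation's exact algorithms: the function name subsampled_newton_step, the uniform row-sampling scheme, and the l2 regularizer reg are choices made for this example.

```python
import numpy as np

def subsampled_newton_step(w, X, y, sample_size, reg=1e-4, seed=None):
    """One Newton-type step for l2-regularized logistic regression.

    The gradient uses all n observations, while the Hessian is formed
    from a random subsample of size sample_size << n, so the Hessian
    cost scales with the subsample size rather than with n.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Full gradient of the average logistic loss (labels y in {0, 1}).
    probs = 1.0 / (1.0 + np.exp(-X @ w))
    grad = X.T @ (probs - y) / n + reg * w

    # Hessian estimated on a uniform subsample of the rows.
    idx = rng.choice(n, size=sample_size, replace=False)
    Xs, ps = X[idx], probs[idx]
    curvature = ps * (1.0 - ps)                      # logistic curvature terms
    H = (Xs * curvature[:, None]).T @ Xs / sample_size + reg * np.eye(p)

    # Newton-type update with the subsampled Hessian.
    return w - np.linalg.solve(H, grad)
```

In the regime n >> p, a handful of such steps keeps the dominant per-iteration cost at the O(np) gradient computation, which is the kind of saving over full batch second-order methods that the abstract describes.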

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2017
Issuance monographic
Language English

Creators/Contributors

Associated with Erdogdu, Murat A
Associated with Stanford University, Department of Statistics.
Primary advisor Bayati, Mohsen
Primary advisor Montanari, Andrea
Thesis advisor Candès, Emmanuel J. (Emmanuel Jean)

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Murat A. Erdogdu.
Note Submitted to the Department of Statistics.
Thesis Thesis (Ph.D.)--Stanford University, 2017.
Location electronic resource

Access conditions

Copyright
© 2017 by Murat Anil Erdogdu
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
