Randomized methods in optimization: fast algorithms and fundamental limits


Abstract/Contents

Abstract
High-dimensional machine learning models trained on massive datasets have shown exceptional predictive ability over the last decade. Their adoption has created novel challenges and paradigms for practitioners and researchers. Stochastic first-order methods have become the method of choice for training these models, due to their small per-iteration computational cost. Yet they are known to be sensitive to hyperparameter choices, with weak or even no convergence guarantees. Alternative (e.g., second-order) optimization methods have not received the same amount of attention, nor have they proved as effective in practice. A key bottleneck usually lies in their computational requirements, which are far more intensive than those of first-order methods. In this thesis, we are interested in random projections -- also known as sketching or subsampling methods -- in the context of large-scale optimization problems, which involve either massive datasets or high-dimensional optimization variables. The generic purpose of using random projections here is to reduce the dimensionality of the data matrix and/or the optimization variable, to obtain both faster convergence and smaller memory requirements. Naturally, random projections incur a loss of information, and hence a loss in statistical performance. We consider the fundamental questions that arise when compressing information for faster computational procedures. What is the minimal embedding dimension one can use to project the data while still guaranteeing convergence of an optimization solver? How should random projections of the data be optimally coupled with an optimization solver? And how can one achieve optimal trade-offs between statistical and computational performance in the choice of the random projection and the embedding dimension (a.k.a. sketch size)? To answer these questions, we first focus on unconstrained convex quadratic objectives and consider a broad class of first-order optimization solvers that involve a sketching-based preconditioner. We explore and compare different random projections and classical first-order methods. We then gradually extend our analysis and methodology to regularized convex quadratic optimization, and finally to generic convex optimization problems.
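
To illustrate the generic idea of a sketching-based preconditioner described above, the following minimal Python sketch applies a Gaussian random projection to the data matrix of an unconstrained least-squares problem and uses the sketched Hessian as a preconditioner for gradient descent. This is a hedged, self-contained illustration of the general technique, not the specific algorithms, embeddings, or sketch-size choices analyzed in the thesis; the problem sizes, sketch size m, and iteration count are arbitrary assumptions made for the example.

# Illustrative sketch-and-precondition example (assumptions: Gaussian sketch,
# arbitrary sizes n, d, m, and 50 iterations; not the thesis's algorithm).
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 2000, 50, 200                  # data size, dimension, sketch size (m << n)
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Gaussian random projection of the data matrix (one possible embedding).
S = rng.standard_normal((m, n)) / np.sqrt(m)
SA = S @ A

# Build a preconditioner from the sketched matrix: ((SA)^T (SA))^{-1} approximates
# the inverse Hessian (A^T A)^{-1}; we form it through an SVD of SA for clarity.
_, sigma, Vt = np.linalg.svd(SA, full_matrices=False)
P = Vt.T @ np.diag(1.0 / sigma)          # so that P @ P.T = ((SA)^T (SA))^{-1}

# Preconditioned gradient descent on f(x) = 0.5 * ||A x - b||^2 with unit step,
# which is reasonable once the sketched Hessian approximates A^T A well.
x = np.zeros(d)
for _ in range(50):
    grad = A.T @ (A @ x - b)
    x -= P @ (P.T @ grad)

print(np.linalg.norm(A.T @ (A @ x - b)))  # residual of the normal equations

In this toy example, a larger sketch size m yields a preconditioner closer to the true Hessian and hence faster convergence, at a higher cost per sketch; trading off these two effects is precisely the kind of question the abstract raises.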

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2021
Publication date 2021
Issuance monographic
Language English

Creators/Contributors

Author Lacotte, Jonathan Pierre
Degree supervisor Pilanci, Mert
Thesis advisor Pilanci, Mert
Thesis advisor Boyd, Stephen P
Thesis advisor Wootters, Mary
Degree committee member Boyd, Stephen P
Degree committee member Wootters, Mary
Associated with Stanford University, Department of Electrical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Jonathan Lacotte.
Note Submitted to the Department of Electrical Engineering.
Thesis Ph.D., Stanford University, 2021.
Location https://purl.stanford.edu/nd373pq4764

Access conditions

Copyright
© 2021 by Jonathan Pierre Lacotte
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
