Reliable machine learning via distributional robustness
Abstract/Contents
- Abstract
- As machine learning systems are increasingly applied in high-stakes domains such as autonomous vehicles and medical diagnosis, it is imperative that they maintain good performance when deployed. Modeling assumptions rarely hold due to noisy inputs, shifts in environment, unmeasured confounders, and even adversarial attacks on the system. The standard machine learning paradigm that optimizes average performance is brittle to even small amounts of noise, and exhibits poor performance on underrepresented minority groups. We study distributionally robust learning procedures that explicitly protect against potential shifts in the data-generating distribution. Instead of doing well just on average, distributionally robust methods learn models that can do well on a range of scenarios that are different from the training distribution. In the first part of the thesis, we show that robustness to small perturbations in the data allows better generalization by optimally trading off approximation and estimation error. We show that robust solutions provide asymptotically exact confidence intervals and finite-sample guarantees for stochastic optimization problems. In the second part of the thesis, we focus on notions of distributional robustness that correspond to uniform performance across different subpopulations. We build procedures that balance tail performance alongside classical notions of average performance. To trade off these multiple goals optimally, we show fundamental trade-offs (lower bounds), and develop efficient procedures that achieve these limits (upper bounds). Then, we extend our formulation to study partial covariate shifts, where we are interested in marginal distributional shifts on a subset of the feature vector. We provide convex procedures for these robust formulations, and characterize their non-asymptotic convergence properties.
In the final part of the thesis, we develop and analyze distributionally robust approaches using Wasserstein distances, which allow models to generalize to distributions that have different support than the training distribution. We show that for smooth neural networks, our robust procedure guarantees performance under imperceptible adversarial perturbations. Extending such notions to protect against distributions defined on learned feature spaces, we show these models can also improve performance across unseen domains.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2019; ©2019 |
Publication date | 2019 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Namkoong, Hongseok |
---|---|
Degree supervisor | Duchi, John |
Degree supervisor | Glynn, Peter W |
Thesis advisor | Duchi, John |
Thesis advisor | Glynn, Peter W |
Thesis advisor | Blanchet, Jose H |
Thesis advisor | Van Roy, Benjamin |
Degree committee member | Blanchet, Jose H |
Degree committee member | Van Roy, Benjamin |
Associated with | Stanford University, Department of Management Science and Engineering. |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Hongseok Namkoong. |
---|---|
Note | Submitted to the Department of Management Science and Engineering. |
Thesis | Thesis Ph.D. Stanford University 2019. |
Location | electronic resource |
Access conditions
- Copyright
- © 2019 by Hongseok Namkoong
- License
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).