Reliable machine learning via distributional robustness


Abstract/Contents

Abstract
As machine learning systems are increasingly applied in high-stakes domains such as autonomous vehicles and medical diagnosis, it is imperative that they maintain good performance when deployed. Modeling assumptions rarely hold, due to noisy inputs, shifts in the environment, unmeasured confounders, and even adversarial attacks on the system. The standard machine learning paradigm, which optimizes average performance, is brittle to even small amounts of noise and exhibits poor performance on underrepresented minority groups. We study \emph{distributionally robust} learning procedures that explicitly protect against potential shifts in the data-generating distribution. Instead of doing well just on average, distributionally robust methods learn models that do well on a range of scenarios that differ from the training distribution. In the first part of the thesis, we show that robustness to small perturbations in the data allows better generalization by optimally trading off approximation and estimation error. We show that robust solutions provide asymptotically exact confidence intervals and finite-sample guarantees for stochastic optimization problems. In the second part of the thesis, we focus on notions of distributional robustness that correspond to uniform performance across different subpopulations. We build procedures that balance tail performance alongside classical notions of average performance. To trade off these multiple goals \emph{optimally}, we establish fundamental trade-offs (lower bounds) and develop efficient procedures that achieve these limits (upper bounds). We then extend our formulation to study partial covariate shifts, where we are interested in marginal distributional shifts on a subset of the feature vector. We provide convex procedures for these robust formulations and characterize their non-asymptotic convergence properties. In the final part of the thesis, we develop and analyze distributionally robust approaches based on Wasserstein distances, which allow models to generalize to distributions whose support differs from that of the training distribution. We show that for smooth neural networks, our robust procedure guarantees performance under imperceptible adversarial perturbations. Extending these notions to protect against distributions defined on learned feature spaces, we show that such models can also improve performance across unseen domains.
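
For orientation, the central object the abstract alludes to can be written as a generic distributionally robust optimization problem. The sketch below is illustrative: the loss \ell, empirical distribution \widehat{P}_n, discrepancy measure D, and radius \rho are assumed notation, not the thesis's exact definitions, which instantiate D variously (e.g., f-divergence balls in the earlier parts, partial covariate-shift sets, and Wasserstein balls in the final part).

% Minimal sketch of a distributionally robust objective, assuming a loss
% \ell(\theta; z), empirical distribution \widehat{P}_n, discrepancy D, and
% robustness radius \rho; these symbols are illustrative assumptions.
\begin{equation*}
  \underset{\theta \in \Theta}{\mathrm{minimize}}
  \;\; \sup_{Q \,:\, D(Q, \widehat{P}_n) \le \rho}
  \; \mathbb{E}_{Z \sim Q}\!\left[ \ell(\theta; Z) \right]
\end{equation*}
% By contrast, the standard average-case paradigm minimizes
% \mathbb{E}_{Z \sim \widehat{P}_n}[\ell(\theta; Z)] alone.

Informally, a larger radius \rho protects against a wider range of distribution shifts at the cost of a more conservative solution; the trade-off between the two is a recurring theme of the thesis.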

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2019
Publication date 2019
Issuance monographic
Language English

Creators/Contributors

Author Namkoong, Hongseok
Degree supervisor Duchi, John
Degree supervisor Glynn, Peter W
Thesis advisor Duchi, John
Thesis advisor Glynn, Peter W
Thesis advisor Blanchet, Jose H
Thesis advisor Van Roy, Benjamin
Degree committee member Blanchet, Jose H
Degree committee member Van Roy, Benjamin
Associated with Stanford University, Department of Management Science and Engineering.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Hongseok Namkoong.
Note Submitted to the Department of Management Science and Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2019.
Location electronic resource

Access conditions

Copyright
© 2019 by Hongseok Namkoong
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
