Confident and reliable statistical predictions in changing environments

Abstract/Contents

Abstract
A complete machine learning pipeline typically spans three distinct phases: first data collection, then model selection and training, and finally model validation, evaluation and eventual failure detection. While a large part of the statistical machine learning literature discusses the second of these tasks, the last one arguably becomes more central as the decisions we trust our statistical models with, whether in healthcare, transportation systems, finance or policy, turn increasingly critical. This thesis therefore focuses on that particular aspect. We propose novel algorithms that quantify the uncertainty of predictions and construct reliable end-to-end predictive pipelines, even allowing them to leverage weaker forms of data supervision in the process. We additionally focus on a model's behavior after its release, and aim to guarantee adequate performance even when future test distributions differ from the initially available ones; in particular, we design algorithms that are capable of detecting and localizing potential failure modes of our model, with the end goal of improving it on specific "hard" slices of our data. Each of our methods typically builds on top of any black-box predictive model and comes with tunable and quantifiable guarantees, allowing the practitioner some flexibility when designing their model. Additionally, we present empirical results of our methodology on real-world data sets (including ImageNet, Covid-19, PovertyMap); we design experiments suggesting that in realistic scenarios our methods behave consistently with our initial expectations and hypotheses.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2022; ©2022
Publication date 2022
Issuance monographic
Language English

Creators/Contributors

Author Cauchois, Maxime Rene Marcel
Degree supervisor Duchi, John
Thesis advisor Duchi, John
Thesis advisor Candès, Emmanuel J. (Emmanuel Jean)
Thesis advisor Liang, Percy
Degree committee member Candès, Emmanuel J. (Emmanuel Jean)
Degree committee member Liang, Percy
Associated with Stanford University, Department of Statistics

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Maxime Cauchois.
Note Submitted to the Department of Statistics.
Thesis Thesis Ph.D. Stanford University 2022.
Location https://purl.stanford.edu/gj794wk1767

Access conditions

Copyright
© 2022 by Maxime Rene Marcel Cauchois
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
