Theory and algorithms for efficient deployment of machine learning systems

Abstract/Contents

Abstract
In this work, we explore theory and algorithms that improve the efficiency of several aspects of machine learning systems. First, we investigate algorithmic principles that enable efficient machine unlearning. We propose two unsupervised learning algorithms that achieve an over 100-fold improvement in the efficiency of online data deletion, while producing clusters of comparable statistical quality to a canonical k-means++ baseline. Second, we explore mixed dimension embeddings, an embedding layer architecture in which a particular embedding vector's dimension scales with its query frequency. Through theoretical analysis and systematic experiments, we demonstrate that using mixed dimensions can drastically reduce memory usage while maintaining, and even improving, predictive performance. For click-through rate prediction on the Criteo Kaggle dataset, mixed dimension layers improve accuracy by 0.1% using half as many parameters, or maintain it using 16 times fewer parameters. They also train over 2 times faster on a GPU. Finally, we propose a novel approach, MLDemon, for ML Deployment monitoring. MLDemon integrates both unlabeled data and a small amount of on-demand labels to produce a real-time estimate of a deployed model's current accuracy on a given data stream. Subject to budget constraints, MLDemon decides when to acquire additional, potentially costly, expert-supervised labels to verify the model. MLDemon compares favorably to prior methods on benchmarks. We also provide theoretical analysis showing that MLDemon is minimax rate optimal for a broad class of distribution drifts.
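
As an illustration of the mixed dimension idea described in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes items are indexed by frequency rank, split into blocks, and that each block's narrower vectors are lifted to a shared width by a per-block projection matrix. The block sizes, dimensions, and all function names here are hypothetical choices made for the example.

```python
import numpy as np

def build_mixed_embeddings(block_sizes, block_dims, base_dim, seed=0):
    """Create one (table, projection) pair per frequency block (illustrative only)."""
    rng = np.random.default_rng(seed)
    blocks = []
    for n_items, dim in zip(block_sizes, block_dims):
        # Narrow embedding table: one row of width `dim` per item in the block.
        table = rng.normal(scale=0.1, size=(n_items, dim))
        # Assumed projection that lifts the block's vectors to the shared base_dim.
        proj = rng.normal(scale=0.1, size=(dim, base_dim))
        blocks.append((table, proj))
    return blocks

def lookup(blocks, block_sizes, item_id):
    """Return a base_dim vector for item_id; ids are assumed sorted by frequency rank."""
    offset = 0
    for (table, proj), n_items in zip(blocks, block_sizes):
        if item_id < offset + n_items:
            return table[item_id - offset] @ proj
        offset += n_items
    raise IndexError(f"item_id {item_id} out of range")

def param_count(blocks):
    """Total number of stored parameters across tables and projections."""
    return sum(table.size + proj.size for table, proj in blocks)

# Hypothetical setup: frequent items get wide vectors, rare items get narrow ones.
block_sizes = [100, 1_000, 10_000]
block_dims = [16, 8, 4]
base_dim = 16

blocks = build_mixed_embeddings(block_sizes, block_dims, base_dim)
vec = lookup(blocks, block_sizes, 5_000)

uniform_params = sum(block_sizes) * base_dim  # single table of width base_dim
mixed_params = param_count(blocks)
```

In this toy configuration the mixed layout stores 50,048 parameters against 177,600 for a uniform table of width 16. The figure only illustrates where the memory saving comes from and says nothing about the accuracy results reported in the abstract.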

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2022; ©2022
Publication date 2022
Issuance monographic
Language English

Creators/Contributors

Author Ginart, Antonio Alejandro
Degree supervisor Zou, James
Thesis advisor Zou, James
Thesis advisor Özgür, Ayfer
Thesis advisor Valiant, Gregory
Degree committee member Özgür, Ayfer
Degree committee member Valiant, Gregory
Associated with Stanford University, Department of Electrical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Antonio Alejandro Ginart.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis Ph.D. Stanford University 2022.
Location https://purl.stanford.edu/xy602qz3160

Access conditions

Copyright
© 2022 by Antonio Alejandro Ginart
License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported license (CC BY-SA 3.0).
