Resource and data efficient deep learning


Abstract/Contents

Abstract
Using massive computation, deep learning allows machines to translate large amounts of data into models that accurately predict the real world, enabling powerful applications like virtual assistants and autonomous vehicles. As datasets and computer systems have grown in scale, so has the quality of machine learning models, giving practitioners and researchers an expensive appetite for data and computation. To address this demand, this dissertation discusses ways to measure and improve both the computational and data efficiency of deep learning.

First, we introduce DAWNBench and MLPerf as a systematic way to measure end-to-end machine learning system performance. Researchers have proposed numerous hardware, software, and algorithmic optimizations to improve the computational efficiency of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision) and can even impact the final model's accuracy on unseen data. Because of these trade-offs between accuracy and computational efficiency, it has been difficult to compare and understand the impact of such optimizations. We propose and evaluate a new metric, time-to-accuracy, that can be used to compare different system designs, and we use it to evaluate high-performing systems by organizing two public benchmark competitions, DAWNBench and MLPerf. MLPerf has since grown into an industry-standard benchmark co-organized by over 70 organizations.

Second, we present ways to perform data selection efficiently on large-scale datasets. Data selection methods, such as active learning and core-set selection, improve the data efficiency of machine learning by identifying the most informative data points to label or train on. Across the data selection literature, there are many ways to identify these training examples. However, classical data selection methods are prohibitively expensive to apply in deep learning because of its larger datasets and models. To make these methods tractable, we propose (1) "selection via proxy" (SVP) to avoid expensive training and reduce the computation per example, and (2) "similarity search for efficient active learning and search" (SEALS) to reduce the number of examples processed. Both methods lead to order-of-magnitude performance improvements, making techniques like active learning on billions of unlabeled images practical for the first time.
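To make the metric concrete, here is a minimal Python sketch of time-to-accuracy (an illustration, not code from the dissertation): given a training log of (wall-clock time, validation accuracy) pairs, report the first time a fixed quality threshold is reached, so systems are compared on end-to-end speed to a target accuracy rather than raw throughput. DAWNBench, for instance, used thresholds such as 93% top-5 accuracy on ImageNet.

# Minimal sketch of the time-to-accuracy metric. The function name and
# log format are illustrative assumptions, not the benchmark's own code.
def time_to_accuracy(log, threshold):
    """Return the first wall-clock time (seconds) at which `threshold`
    validation accuracy is reached, or None if it never is.
    `log` is a chronological list of (seconds, accuracy) pairs."""
    for seconds, accuracy in log:
        if accuracy >= threshold:
            return seconds
    return None

# A system with lower raw throughput can still win on time-to-accuracy.
log_a = [(600, 0.80), (1200, 0.91), (1800, 0.94)]
log_b = [(500, 0.70), (1000, 0.85), (1500, 0.92), (2000, 0.95)]
print(time_to_accuracy(log_a, 0.93))  # 1800
print(time_to_accuracy(log_b, 0.93))  # 2000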
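The "selection via proxy" idea can be sketched in a similar spirit. The snippet below is a hypothetical illustration assuming uncertainty sampling: a small, cheap proxy model scores every unlabeled example, and the expensive target model only trains on (or labels) the examples the proxy finds most uncertain. All names are made up for illustration.

# Hypothetical sketch of selection via proxy (SVP) with least-confidence
# uncertainty. `proxy_probs` would come from a small model that is far
# cheaper to train than the target model.
import numpy as np

def select_via_proxy(proxy_probs, budget):
    """Pick the `budget` most uncertain examples under the proxy.
    `proxy_probs` is an (n_examples, n_classes) array of predicted
    class probabilities from the proxy model."""
    confidence = proxy_probs.max(axis=1)    # proxy's top-class confidence
    return np.argsort(confidence)[:budget]  # least confident first

# Made-up 3-class proxy outputs over five unlabeled examples.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.40, 0.35, 0.25],
                  [0.60, 0.30, 0.10],
                  [0.34, 0.33, 0.33],
                  [0.80, 0.10, 0.10]])
print(select_via_proxy(probs, budget=2))  # [3 1]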
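Finally, a rough sketch of the SEALS idea, assuming precomputed, L2-normalized embeddings: rather than scoring the entire unlabeled pool each round, only the k nearest neighbors of the current labeled set are considered as candidates. Brute-force NumPy stands in here for the approximate similarity-search index a billion-scale deployment would actually use.

# Rough sketch of SEALS candidate selection. With normalized rows,
# inner product equals cosine similarity.
import numpy as np

def seals_candidates(labeled_emb, unlabeled_emb, k):
    """Return indices of unlabeled points that fall in the k-NN of any
    labeled point, forming a small candidate pool for active learning."""
    sims = labeled_emb @ unlabeled_emb.T    # (n_labeled, n_unlabeled)
    knn = np.argsort(-sims, axis=1)[:, :k]  # top-k neighbors per labeled point
    return np.unique(knn)                   # merged candidate pool

rng = np.random.default_rng(0)
labeled = rng.normal(size=(10, 64))
unlabeled = rng.normal(size=(100_000, 64))
labeled /= np.linalg.norm(labeled, axis=1, keepdims=True)
unlabeled /= np.linalg.norm(unlabeled, axis=1, keepdims=True)
pool = seals_candidates(labeled, unlabeled, k=100)
print(len(pool), "candidates instead of", len(unlabeled))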

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2021
Publication date 2021
Issuance monographic
Language English

Creators/Contributors

Author Coleman, Cody Austun
Degree supervisor Zaharia, Matei
Thesis advisor Zaharia, Matei
Thesis advisor Bailis, Peter
Thesis advisor Li, Fei-Fei, 1976-
Degree committee member Bailis, Peter
Degree committee member Li, Fei-Fei, 1976-
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Cody Coleman.
Note Submitted to the Computer Science Department.
Thesis Thesis (Ph.D.)--Stanford University, 2021.
Location https://purl.stanford.edu/my863wx9641

Access conditions

Copyright
© 2021 by Cody Austun Coleman
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
