Resource and data efficient deep learning
Abstract/Contents
- Abstract
- Using massive computation, deep learning allows machines to translate large amounts of data into models that accurately predict the real world, enabling powerful applications like virtual assistants and autonomous vehicles. As datasets and computer systems have continued to grow in scale, so has the quality of machine learning models, creating an expensive appetite in practitioners and researchers for data and computation. To address this demand, this dissertation discusses ways to measure and improve both the computational and data efficiency of deep learning. First, we introduce DAWNBench and MLPerf as a systematic way to measure end-to-end machine learning system performance. Researchers have proposed numerous hardware, software, and algorithmic optimizations to improve the computational efficiency of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision) and can even impact the final model's accuracy on unseen data. Because of these trade-offs between accuracy and computational efficiency, it has been difficult to compare and understand the impact of these optimizations. We propose and evaluate a new metric, time-to-accuracy, that can be used to compare different system designs and use it to evaluate high performing systems by organizing two public benchmark competitions, DAWNBench and MLPerf. MLPerf has now grown into an industry standard benchmark co-organized by over 70 organizations. Second, we present ways to perform data selection on large-scale datasets efficiently. Data selection methods, such as active learning and core-set selection, improve the data efficiency of machine learning by identifying the most informative data points to label or train on. Across the data selection literature, there are many ways to identify these training examples. However, classical data selection methods are prohibitively expensive to apply in deep learning because of the larger datasets and models. To make these methods tractable, we propose (1) "selection via proxy" (SVP) to avoid expensive training and reduce the computation per example and (2) "similarity search for efficient active learning and search" (SEALS) to reduce the number of examples processed. Both methods lead to order of magnitude performance improvements, making techniques like active learning on billions of unlabeled images practical for the first time.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2021; ©2021 |
Publication date | 2021; 2021 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Coleman, Cody Austun |
---|---|
Degree supervisor | Zaharia, Matei |
Thesis advisor | Zaharia, Matei |
Thesis advisor | Bailis, Peter |
Thesis advisor | Li, Fei Fei, 1976- |
Degree committee member | Bailis, Peter |
Degree committee member | Li, Fei Fei, 1976- |
Associated with | Stanford University, Computer Science Department |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Cody Coleman. |
---|---|
Note | Submitted to the Computer Science Department. |
Thesis | Thesis Ph.D. Stanford University 2021. |
Location | https://purl.stanford.edu/my863wx9641 |
Access conditions
- Copyright
- © 2021 by Cody Austun Coleman
- License
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
Also listed in
Loading usage metrics...