Automated discovery of machine learning optimizations

Jia, Zhihao

Abstract/Contents

Abstract: The growing complexity of machine learning (ML) models and ML-specific hardware architectures makes it increasingly challenging to build efficient and scalable ML systems. Today's ML systems rely heavily on human effort to optimize the deployment of ML models on modern hardware platforms, which requires a tremendous amount of engineering effort yet yields only suboptimal runtime performance. Moreover, the rapid evolution of ML models and ML-specific hardware makes it infeasible to manually optimize performance for all model and hardware combinations. In this dissertation, we propose a search-based methodology to build performant ML systems by automatically discovering performance optimizations for ML computations. Instead of considering only the limited set of manually designed performance optimizations in current ML systems, our approach introduces a significantly more comprehensive search space of possible strategies to optimize the deployment of an ML model on a hardware platform. In addition, we design efficient search algorithms to explore the search space and discover highly optimized strategies. The search is guided by a cost model for evaluating the performance of different strategies. We also propose a number of techniques to accelerate the search procedure by leveraging the topology of the search space. This dissertation presents three ML systems that apply this methodology to optimize different tasks in ML deployment. Compared to current ML systems relying on manually designed optimizations, our ML systems enable better runtime performance by automatically discovering novel performance optimizations that are missing in current ML systems. Moreover, the performance improvement is achieved with less engineering effort, since discovering these optimizations requires much less code than implementing them manually.

First, we developed TASO, the first ML graph optimizer that automatically generates graph optimizations. TASO formally verifies the correctness of the generated graph optimizations using an automated theorem prover, and uses cost-based backtracking search to discover how to apply the verified optimizations. In addition to improving runtime performance and reducing engineering effort, TASO also provides correctness guarantees using formal methods.

Second, to generalize and go beyond today's manually designed parallelization strategies for distributed ML computations, we introduce the SOAP search space, which contains a comprehensive set of possible strategies to parallelize ML computations by identifying parallelization opportunities across different Samples, Operators, Attributes, and Parameters. We developed FlexFlow, a deep learning engine that automatically searches over strategies in the SOAP search space. FlexFlow includes a novel execution simulator to evaluate the runtime performance of different strategies, and uses a Markov Chain Monte Carlo (MCMC) search algorithm to find performant strategies. FlexFlow discovers strategies that significantly outperform existing strategies, while requiring no manual effort during the search procedure.

Finally, we developed Roc, which automates data placement optimizations and minimizes data transfers in the memory hierarchy for large-scale graph neural network (GNN) computations. Roc formulates the task of optimizing data placement as a cost minimization problem and uses a dynamic programming algorithm to discover a globally optimal data management plan that minimizes data transfers between memories.
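The cost-model-guided search that the abstract describes for FlexFlow can be illustrated with a toy Metropolis-style MCMC loop. This is only a sketch under stated assumptions and is not taken from the dissertation or from FlexFlow's code: the operator names, the candidate parallelism degrees, the `toy_cost` function (standing in for an execution simulator), and the `beta` parameter are all hypothetical.

```python
import math
import random

# Hypothetical toy setup (not from the dissertation): each operator
# is assigned one degree of parallelism.
OPS = ["conv1", "conv2", "fc"]
DEGREES = [1, 2, 4, 8]


def toy_cost(strategy):
    """Made-up cost model standing in for an execution simulator."""
    # Compute time shrinks as the degree of parallelism grows.
    compute = sum(8.0 / strategy[op] for op in OPS)
    # Adjacent operators with different degrees pay a transfer cost.
    comm = sum(0.5 * abs(strategy[a] - strategy[b])
               for a, b in zip(OPS, OPS[1:]))
    # Synchronization overhead grows with the degree of parallelism.
    sync = sum(0.3 * strategy[op] for op in OPS)
    return compute + comm + sync


def mcmc_search(cost, steps=2000, beta=2.0, seed=0):
    """Metropolis search over strategies; returns the best one seen."""
    rng = random.Random(seed)
    current = {op: 1 for op in OPS}  # start with no parallelism
    current_cost = cost(current)
    best, best_cost = dict(current), current_cost
    for _ in range(steps):
        # Symmetric proposal: re-draw the degree of one random operator.
        proposal = dict(current)
        proposal[rng.choice(OPS)] = rng.choice(DEGREES)
        proposal_cost = cost(proposal)
        # Always accept improvements; accept a worse proposal with
        # probability exp(beta * (current_cost - proposal_cost)).
        delta = current_cost - proposal_cost
        if delta >= 0 or rng.random() < math.exp(beta * delta):
            current, current_cost = proposal, proposal_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
    return best, best_cost


if __name__ == "__main__":
    strategy, cost_value = mcmc_search(toy_cost)
    print(strategy, cost_value)
```

Accepting occasional cost increases lets the search leave local minima, which a purely greedy search cannot do; a larger `beta` makes the loop behave more greedily.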

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2020; ©2020
Publication date	2020
Issuance	monographic
Language	English

Creators/Contributors

Author	Jia, Zhihao
Degree supervisor	Aiken, Alexander
Degree supervisor	Zaharia, Matei
Thesis advisor	Aiken, Alexander
Thesis advisor	Zaharia, Matei
Thesis advisor	Olukotun, Oyekunle Ayinde
Degree committee member	Olukotun, Oyekunle Ayinde
Associated with	Stanford University, Computer Science Department

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Zhihao Jia.
Note	Submitted to the Computer Science Department.
Thesis	Thesis Ph.D. Stanford University 2020.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
