A hybrid approach to automatic program parallelization via efficient tasking with composable data partitioning
Abstract/Contents
- Abstract
- Despite the decades of research, distributed programming is still a painful task and programming systems designed to improve productivity fall short in practice. Auto-parallelizing compilers simplify distributed programming by parallelizing sequential programs automatically for distributed execution. However, their applicability is severely limited due to the fundamental undecidability of their static analysis problem. Runtime systems for implicit parallelism can handle a broader class of programs via an expressive programming model, but their runtime overhead often becomes a performance bottleneck. To design a practical system for productive distributed programming, one must combine the strengths of different parallelization paradigms to overcome their weaknesses when used in isolation. This dissertation presents a hybrid approach to automatic program parallelization, which combines an auto-parallelizing compiler with an implicitly parallel tasking system. Our approach parallelizes programs in two steps. First, the auto-parallelizer materializes data parallelism in a program into task parallelism. Next, the tasking system dynamically analyzes dependencies between tasks and executes independent tasks in parallel. This two-stage process gives programmers a second chance when the auto-parallelizer "fails": When a part of a program is not amenable to the compiler auto-parallelization, the programmer can gracefully fall back to the run- time parallelization by writing that part directly with task parallelism. Furthermore, hand-written tasks can be seamlessly integrated with the auto-parallelized part via composable data partitioning enabled by our auto-parallelizer, which allows them to share the partitioning strategy and thereby avoid excessive communication. Key to the success of this hybrid approach is to minimize the overhead of the tasking system. To achieve this goal, we introduce dynamic tracing, a runtime mechanism for efficient tasking. The most expensive component in the tasking system is dynamic dependence analysis. Although this dynamic analysis is necessary when applications exhibit true dynamic behavior, the analysis is redundant for common cases where dependencies are (mostly) unchanging. Dynamic tracing eliminates this redundancy in dynamic dependence analysis by recording the dependence analysis of an execution trace and then replaying the recording for the subsequent occurrences of the same trace. To guarantee that a recording of a trace correctly replaces the trace's original analysis, dynamic tracing also records memory locations that hold valid data when it records a trace and replays the recording only when those locations are still valid. Dynamic tracing significantly improves the efficiency of tasking, and thereby brings the strong scalability of explicit parallelism to implicit task parallelism
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2019; ©2019 |
Publication date | 2019; 2019 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Lee, Woenchan |
---|---|
Degree supervisor | Aiken, Alexander |
Thesis advisor | Aiken, Alexander |
Thesis advisor | Mitchell, John C |
Thesis advisor | Olukotun, Oyekunle Ayinde |
Degree committee member | Mitchell, John C |
Degree committee member | Olukotun, Oyekunle Ayinde |
Associated with | Stanford University, Computer Science Department. |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Woenchan Lee |
---|---|
Note | Submitted to the Computer Science Department |
Thesis | Thesis Ph.D. Stanford University 2019 |
Location | electronic resource |
Access conditions
- Copyright
- © 2019 by Woenchan Lee
- License
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
Also listed in
Loading usage metrics...