Scaling high performance domain-specific language implementation with Delite

Abstract/Contents

Abstract
The multicore era is now in full swing: single-threaded processors with deep, complex pipelines have been replaced with an increasing number of simpler processors. The shift to these multicore designs is motivated by the need for energy efficiency in addition to high performance. There is now mounting evidence that further improvements in energy efficiency and performance will come from heterogeneous hardware. Programming heterogeneous hardware systems is difficult, which limits their utility. Each heterogeneous computing element has its own performance characteristics and pitfalls, and usually comes with its own programming model. This means that applications cannot take advantage of the additional compute power available in these new and emerging systems without a significant parallel programming effort. To reduce the complexity of programming heterogeneous hardware, one viable approach is the use of Domain-Specific Languages (DSLs) to develop algorithms at very high levels of abstraction. A corresponding DSL compiler can then reason about the high-level domain knowledge now explicitly encoded in the application and generate efficient implementations of the algorithm for the different heterogeneous computing elements. This shifts most of the burden of parallelization to DSL authors, requiring them to combine domain, programming language implementation, and parallelization expertise. In this thesis, we start by discussing the benefits of using such DSLs for parallel heterogeneous programming. We motivate the need for an infrastructure to simplify the effort required to author these high-performance DSLs. We then present the Delite Compiler Framework and Runtime, the result of our effort in designing and implementing such an infrastructure.
The framework lifts embedded DSL applications to an intermediate representation (IR), performs generic, parallel, and domain-specific optimizations, and generates an execution graph that targets multiple heterogeneous hardware devices. One key component of this framework is a set of IR nodes, called Delite Ops, which simplify DSL parallelization by providing a set of reusable and extensible parallel execution patterns. We illustrate the usefulness of Delite by showing examples of DSLs that have been implemented using this framework.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2014
Issuance monographic
Language English

Creators/Contributors

Associated with Chafi, Hassan
Associated with Stanford University, Department of Electrical Engineering.
Primary advisor Olukotun, Oyekunle Ayinde
Thesis advisor Olukotun, Oyekunle Ayinde
Thesis advisor Hanrahan, P. M. (Patrick Matthew)
Thesis advisor Kozyrakis, Christoforos, 1974-
Advisor Hanrahan, P. M. (Patrick Matthew)
Advisor Kozyrakis, Christoforos, 1974-

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Hassan Chafi.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2014.
Location electronic resource

Access conditions

Copyright
© 2014 by Hassan Chafi
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
