The design of an efficient vector virtual machine for data analytics


Abstract/Contents

Abstract
The amount of parallelism available in commodity hardware has been steadily increasing. Recent hardware trends suggest that within the next few years, widely available computers will boast substantial parallelism: 8 or more cores, each supporting hardware vectorized operations on 4 or more double-precision floating point values at a time. In parallel with this development, the rise of big data, and data science more generally, has created enormous interest in analytics that can extract value from data. However, the most popular languages for data analysis and statistical computing (R, Matlab, Python, and Excel) are designed for ease of use, not performance, and their current implementations may see little performance gain from the increasing parallelism of commodity hardware. This dissertation focuses on the design of virtual machines that aim to deliver the full performance potential of upcoming commodity-level parallel hardware while still providing a high-level, easy-to-use analytic experience.
First, this dissertation describes Riposte, a JIT compiler for the R language, designed to execute vector code on parallel hardware. To execute code with scalars and short vectors, Riposte uses a tracing-based approach adapted from recent JavaScript VM work to extract hot loops. It then uses a novel partial length specialization pass to eliminate loop overhead and promote vectors into hardware SIMD units. To execute long vector code, Riposte uses a delayed evaluation approach to dynamically discover and extract sequences of vector operations. Once extracted, the operations are fused to eliminate unnecessary memory traffic and then scheduled to run across multiple cores. On a variety of analytic workloads, we demonstrate that Riposte can run R code at near the speed of optimized C, and for data-parallel workloads we demonstrate good scalability out to 32 cores. At a high level, Riposte demonstrates that a vector-based, dynamically typed language can be an effective and efficient parallel programming model for data analysis tasks.
Second, this dissertation describes Phoenix++, a C++ library for MapReduce-style computation on shared memory machines that leverages compile-time specialization to generate high-quality parallel code while greatly reducing the amount of code that users must write. We validate the approach by implementing a variety of common analytic workloads in Phoenix++ and demonstrating substantial speedup and improved scalability compared to previous work.
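The delayed evaluation and fusion idea that the abstract attributes to Riposte can be illustrated with a small Python sketch. This is not Riposte's implementation or API. The `Deferred` class and its `leaf` and `force` methods are hypothetical names chosen for illustration, and the sketch shows only the general technique: arithmetic on vectors records an expression instead of computing it, and the whole expression is later evaluated in a single pass without materializing intermediate vectors. It does not attempt SIMD execution or multi-core scheduling.

```python
import operator


class Deferred:
    """Hypothetical deferred vector: records operations instead of running them."""

    def __init__(self, op, args, length):
        self.op = op          # None for a leaf holding concrete data
        self.args = args      # (data,) for a leaf, (left, right) otherwise
        self.length = length

    @classmethod
    def leaf(cls, values):
        data = list(values)
        return cls(None, (data,), len(data))

    def _combine(self, other, op):
        # Scalars are kept as plain numbers; vectors must agree in length.
        if isinstance(other, Deferred) and other.length != self.length:
            raise ValueError("length mismatch")
        return Deferred(op, (self, other), self.length)

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __sub__(self, other):
        return self._combine(other, operator.sub)

    def __mul__(self, other):
        return self._combine(other, operator.mul)

    def _element(self, i):
        # Evaluate the recorded expression for one index only.
        if self.op is None:
            return self.args[0][i]
        left, right = (
            a._element(i) if isinstance(a, Deferred) else a for a in self.args
        )
        return self.op(left, right)

    def force(self):
        # Fused evaluation: one loop over the indices, no intermediate vectors.
        return [self._element(i) for i in range(self.length)]


x = Deferred.leaf([1.0, 2.0, 3.0])
y = Deferred.leaf([4.0, 5.0, 6.0])
z = (x + y) * 2.0 - x   # nothing is computed yet; z is an expression tree
result = z.force()      # the three operations run together in one pass
```

An eager implementation would allocate a temporary vector for `x + y` and another for the multiplication. Here the only vector allocated is the final result, which is the memory-traffic saving the abstract refers to.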

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2013
Issuance monographic
Language English

Creators/Contributors

Associated with Talbot, Justin
Associated with Stanford University, Department of Computer Science.
Primary advisor Hanrahan, P. M. (Patrick Matthew)
Thesis advisor Hanrahan, P. M. (Patrick Matthew)
Thesis advisor Heer, Jeffrey Michael
Thesis advisor Kozyrakis, Christoforos, 1974-
Advisor Heer, Jeffrey Michael
Advisor Kozyrakis, Christoforos, 1974-

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Justin Talbot.
Note Submitted to the Department of Computer Science.
Thesis Ph.D. Stanford University 2013
Location electronic resource

Access conditions

Copyright
© 2013 by Justin Faux Talbot
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
