Domain specific hardware acceleration

Casper, Jared; Stanford University, Department of Computer Science.

Abstract/Contents

Abstract: The performance of microprocessors has grown by three orders of magnitude since their beginnings in the 1970s; however, this exponential growth in performance has not been achieved without overcoming substantial obstacles. These obstacles were overcome due in large part to the exponential increases in the number of transistors available to architects as transistor technology scaled. Many today call the largest of the hurdles impeding performance gain "walls". Such walls include the Memory Wall, which is memory bandwidth and latency not scaling with processor performance; the Power Wall, which is the processor generating too much heat to be feasibly cooled; and the ILP Wall, which is the diminishing return seen when making processor pipelines deeper due to the lack of available instruction-level parallelism. Today, computer architects continually overcome new walls to extend this exponential growth in performance. Many of these walls have been circumvented by moving from large monolithic architectures to multi-core architectures. Instead of using more transistors on bigger, more complicated single processors, transistors are partitioned into separate processing cores. These multi-core processors require less power and are better able to exploit data-level parallelism, leading to increased performance for a wide range of applications. However, as the number of transistors available continues to increase, the current trend of increasing the number of homogeneous cores will soon run into a "Capability Wall" where increasing the core count will not increase the capability of a processor as much as it has in the past. Amdahl's law limits the scalability of many applications, and power constraints will make it infeasible to power all the available transistors at the same time. Thus, the capability of a single processor chip to compute more in a given time slot will stop improving unless new techniques are developed.
In this work, we study how to build hardware components that provide new capabilities by performing specific tasks more quickly and with less power than general-purpose processors. We explore two broad classes of such domain-specific hardware accelerators: those that require fine-grained communication and tight coupling with the general-purpose computation, and those that require a much looser coupling with the rest of the computation. To drive the study, we examine a representative example in each class. For fine-grained accelerators, we present a transactional memory accelerator. We see that dealing with the latency and lack of ordering in the communication channel between the processor and accelerator presents significant challenges to efficiently accelerating transactional memory. We then present multiple techniques that overcome these problems, resulting in an accelerator that improves the performance of transactional memory applications by an average of 69%. For coarse-grained, loosely coupled accelerators, we turn to accelerating database operations. We discuss how, because these accelerators often deal with large amounts of data, one of the key attributes of a useful database accelerator is the ability to fully saturate the bandwidth available to the system's memory. We provide insight into how to design an accelerator that does so by looking at designs that perform selection, sorting, and joining of database tables and at how they make the most efficient use of memory bandwidth.
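
The abstract's remark that Amdahl's law limits the scalability of many applications can be illustrated with a short calculation. The sketch below is not taken from the thesis: the function name `amdahl_speedup` and the 95% parallel fraction are assumptions chosen only for illustration. It evaluates the standard form of the law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of cores.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a fixed-size workload.

    parallel_fraction: share of the work that can be spread across cores (0..1).
    cores: number of identical cores (hypothetical homogeneous multi-core chip).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is not sped up at all; the parallel part is divided by n.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed example: 95% of the work is parallelizable (illustrative value only).
    for cores in (1, 2, 4, 16, 64, 256, 1024):
        print(f"{cores:5d} cores -> {amdahl_speedup(0.95, cores):6.2f}x")
```

Under this assumed 95% parallel fraction, the speedup can never exceed 1 / (1 - 0.95) = 20x no matter how many cores are added, which is the kind of diminishing return the abstract associates with simply increasing the count of homogeneous cores.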

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2015
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Casper, Jared
Associated with	Stanford University, Department of Computer Science.
Primary advisor	Olukotun, Oyekunle Ayinde
Thesis advisor	Olukotun, Oyekunle Ayinde
Thesis advisor	Horowitz, Mark
Thesis advisor	Kozyrakis, Christoforos, 1974-
Advisor	Horowitz, Mark
Advisor	Kozyrakis, Christoforos, 1974-

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	Jared Casper.
Note	Submitted to the Department of Computer Science.
Thesis	Thesis (Ph.D.)--Stanford University, 2015.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
