Techniques for building predictable stream processing pipelines
Abstract/Contents
- Abstract
- This dissertation presents techniques for easily building real-time parallel stream processing pipelines with predictable performance. These techniques enable automatically finding layouts of pipelines onto parallel-processing hardware that guarantee the required performance and use resources efficiently. The automated workflow replaces the tedious and error-prone process of laying out pipelines by trial and error. The key to the automated workflow is a novel performance modeling approach for stream processing pipelines based on two simple principles. First, pipelines are built out of components with predictable performance that also compose in predictable ways. Second, pipelines are built entirely from compute and data transfer as the basic operations. A large class of stream processing pipelines can be built following those principles. For any pipeline built in such a manner, the techniques presented in this dissertation enable the development of accurate models that can predict the performance of parallelized layouts. In turn, those models enable an automated search for efficient pipeline layouts that can meet target performance requirements. The dissertation also includes the design and implementation of two challenging real-time stream processing pipelines from the context of software-defined wireless networks: a WiFi data-plane and an LTE control-plane. Those pipelines are built following the proposed principles and techniques, thus demonstrating their effectiveness. The WiFi data-plane pipeline meets performance requirements of processing 20 million samples per second with latency bounds of tens of microseconds. This pipeline is built using Atomix, a novel programming framework for predictable signal processing embodying the proposed principles. The design and implementation of Atomix are presented along with those of the WiFi data-plane pipeline.
The LTE control-plane pipeline meets peak performance requirements of processing event streams from 3,000 LTE base stations with sub-second latency, even as processing load varies over two orders of magnitude daily. This pipeline is built using Trevor, a novel auto-scaling system for distributed stream processing that leverages the proposed principles. The design and implementation of Trevor are presented along with those of the LTE control-plane pipeline and similar pipelines realized using Trevor. The principles and techniques contained in this dissertation streamline continuous development, predictable execution, and efficient operation of parallel stream processing pipelines at scale through automated workflows. The two specific pipelines used to illustrate those techniques stretch the limits of real-time stream processing and demonstrate the power of model-based pipeline development. The contributions presented here apply broadly to real-time parallel stream processing in both multi-core and distributed settings.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | ©2018 |
Publication date | 2018 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Bansal, Manu Kumar |
---|---|
Degree supervisor | Katti, Sachin |
Thesis advisor | Katti, Sachin |
Thesis advisor | Kozyrakis, Christoforos, 1974- |
Thesis advisor | Levis, Philip |
Degree committee member | Kozyrakis, Christoforos, 1974- |
Degree committee member | Levis, Philip |
Associated with | Stanford University, Department of Electrical Engineering |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Manu Kumar Bansal. |
---|---|
Note | Submitted to the Department of Electrical Engineering. |
Thesis | Thesis (Ph.D.)--Stanford University, 2018. |
Location | electronic resource |
Access conditions
- Copyright
- © 2018 by Manu Kumar Bansal
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).