Efficient remote memory for parallel and distributed data analytics

Abstract/Contents

Abstract
This thesis describes Clamor, a functional cluster computing framework that adds support for fine-grained, transparent access to global variables for distributed, data-parallel tasks. Clamor targets workloads that perform sparse accesses and updates within the bulk synchronous parallel (BSP) execution model, a setting where the standard technique of broadcasting global variables is highly inefficient. We show that this restriction of workloads is powerful, enabling efficient transparent remote memory access using techniques from distributed shared memory that are known to be inefficient for general workloads. These restrictions further enable novel features in Clamor, including a dynamic distributed serving mechanism that takes advantage of the functional programming model to cache and serve data for the duration of a parallel task, and lineage-based fault recovery at a finer granularity than existing systems that do not account for sparsity in the access pattern. Clamor can integrate with existing Rust and C++ libraries to transparently distribute programs on the cluster. We show that Clamor is competitive with Spark in simple functional workloads and can improve performance significantly compared to custom systems on workloads that sparsely access large global variables: from 5× for sparse logistic regression to over 100× on distributed geospatial queries.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2022
Publication date 2022
Issuance monographic
Language English

Creators/Contributors

Author Thaker, Pratiksha Ranjit
Degree supervisor Zaharia, Matei
Thesis advisor Zaharia, Matei
Thesis advisor Levis, Philip
Thesis advisor Ousterhout, John K
Degree committee member Levis, Philip
Degree committee member Ousterhout, John K
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Pratiksha Thaker.
Note Submitted to the Computer Science Department.
Thesis Thesis Ph.D. Stanford University 2022.
Location https://purl.stanford.edu/tt346kh1748

Access conditions

Copyright
© 2022 by Pratiksha Ranjit Thaker
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
