Data-driven statistical sharding for industry-scale neural recommendation


Abstract/Contents

Abstract
Deep learning-based recommendation models (DLRMs) form the backbone of many internet-scale services such as web search, social media, and video streaming. Primarily composed of massive embedding tables, potentially terabytes in size, these models require immense system resources to train, and training them efficiently requires solving the sharding problem: the task of partitioning and placing the embedding table parameters throughout the memory topology of the target system such that training throughput is maximized. This dissertation: (1) characterizes DLRM training data and derives statistics from it which can be used to accurately predict, at fine granularity, the memory demands of individual embedding table rows; (2) presents RecShard, a mixed-integer linear program based approach which uses these statistics to solve the sharding problem for capacity-constrained single-node systems, where parameters must be placed across high-performance GPU HBM and much slower CPU DRAM, reducing accesses to the latter by orders of magnitude; and (3) presents FlexShard, a precise row-level sharding algorithm which focuses on sharding emerging sequence-based DLRMs across multi-node GPU training clusters, leveraging these statistics to significantly reduce inter-node communication demand, the bottleneck of scale-out DLRM training.
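
The single-node form of the sharding problem described in the abstract can be illustrated with a toy sketch. This is not RecShard's mixed-integer linear program: it is a simple greedy heuristic, assuming every embedding row has the same size and that per-row access counts from a sample of training data predict future accesses. The function name `place_rows`, the `(table, row)` identifiers, and the access trace are all hypothetical and exist only for illustration.

```python
from collections import Counter


def place_rows(access_log, hbm_capacity_rows):
    """Toy placement: keep the most frequently accessed rows in fast memory.

    access_log: iterable of hashable row ids, one entry per access.
    hbm_capacity_rows: number of rows that fit in GPU HBM (assumed equal-sized).
    Returns (set of rows placed in HBM, number of accesses served from DRAM).
    """
    counts = Counter(access_log)
    # Rank rows by descending access count; break ties on row id for determinism.
    ranked = sorted(counts, key=lambda r: (-counts[r], r))
    hbm = set(ranked[:hbm_capacity_rows])
    # Every access to a row outside HBM goes to the slower CPU DRAM.
    dram_accesses = sum(c for r, c in counts.items() if r not in hbm)
    return hbm, dram_accesses


# Hypothetical, heavily skewed access trace over (table, row) ids.
trace = (
    [("t0", 1)] * 50
    + [("t0", 2)] * 30
    + [("t1", 7)] * 15
    + [("t1", 8)] * 4
    + [("t0", 3)]
)
hbm_rows, slow_accesses = place_rows(trace, hbm_capacity_rows=2)
print(sorted(hbm_rows), slow_accesses)
```

With equal-sized rows, ranking by access count minimizes the number of DRAM accesses for a fixed HBM capacity. A real sharder must also account for differing table dimensions, per-device capacity, and load balance, which is where an optimization formulation such as the one the abstract mentions comes in.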

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2023; ©2023
Publication date 2023; 2023
Issuance monographic
Language English

Creators/Contributors

Author Sethi, Geet
Degree supervisor Kozyrakis, Christoforos, 1974-
Thesis advisor Kozyrakis, Christoforos, 1974-
Thesis advisor Trippel, Caroline
Thesis advisor Wu, Carole-Jean
Degree committee member Trippel, Caroline
Degree committee member Wu, Carole-Jean
Associated with Stanford University, School of Engineering
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Geet Sethi.
Note Submitted to the Computer Science Department.
Thesis Thesis Ph.D. Stanford University 2023.
Location https://purl.stanford.edu/zs617qp8476

Access conditions

Copyright
© 2023 by Geet Sethi
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
