Data-driven statistical sharding for industry-scale neural recommendation
Abstract/Contents
- Abstract
- Deep learning based recommendation models (DLRMs) form the backbone of many internet-scale services such as web search, social media, and video streaming. Because they are composed primarily of massive embedding tables, potentially terabytes in size, training these models requires immense system resources and a solution to the sharding problem: the task of partitioning and placing the embedding table parameters throughout the target system's memory topology such that training throughput is maximized. This dissertation: (1) characterizes and derives statistics from DLRM training data which can be used to accurately and granularly predict the memory demands of individual embedding table rows; (2) presents RecShard, a mixed-integer linear program based approach which uses these statistics to solve the sharding problem for capacity-constrained single-node systems, where parameters must be placed across high-performance GPU HBM and much slower CPU DRAM, reducing accesses to the latter by orders of magnitude; and (3) presents FlexShard, a precise row-level sharding algorithm which focuses on sharding emerging sequence-based DLRMs across multi-node GPU training clusters, leveraging these statistics to significantly reduce inter-node communication demand, the bottleneck of scale-out DLRM training.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2023; ©2023 |
Publication date | 2023; 2023 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Sethi, Geet |
---|---|
Degree supervisor | Kozyrakis, Christoforos, 1974- |
Thesis advisor | Kozyrakis, Christoforos, 1974- |
Thesis advisor | Trippel, Caroline |
Thesis advisor | Wu, Carole-Jean |
Degree committee member | Trippel, Caroline |
Degree committee member | Wu, Carole-Jean |
Associated with | Stanford University, School of Engineering |
Associated with | Stanford University, Computer Science Department |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Geet Sethi. |
---|---|
Note | Submitted to the Computer Science Department. |
Thesis | Thesis Ph.D. Stanford University 2023. |
Location | https://purl.stanford.edu/zs617qp8476 |
Access conditions
- Copyright
- © 2023 by Geet Sethi
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).