Pushing transport layer latency down towards its physical limits in data centers with programmable architectures and algorithms

Abstract/Contents

Abstract
Data center applications keep scaling horizontally across many machines to accommodate more users and data, which makes communication performance requirements ever more stringent: higher bandwidth and lower latency. Increasing link capacities address the bandwidth demands, but the latency requirements call for more sophisticated solutions. In this thesis, I observe that the transport layer is the only layer in the networking stack that impacts latency both at the end hosts and in the network: the way it handles packets sets the end hosts' processing delay, and its congestion control determines the queuing delay in the network. Hence, I study transport layer designs that push both latencies down to their physical limits.

First, I argue that end-host latency can be minimized by offloading the transport layer to NIC hardware, but fixed-function chips prohibit custom solutions for diversified environments. As a solution, I introduce nanoTransport, a programmable NIC architecture for message-based Remote Procedure Calls. It is programmed in the P4 language, making it easy to modify (or create) transport protocols while packets are processed orders of magnitude faster than in traditional software stacks. It identifies common events and primitive operations to form a streamlined, modular, and programmable pipeline, including packetization, reassembly, timeouts, and packet generation, all expressed by the programmer.

Next, I argue that network latency can only be minimized with quick and accurate congestion control decisions, which require precise congestion signals and the shortest possible control loop delay. I present Bolt to address these requirements and push congestion control to its theoretical limits. Bolt is based on three core ideas: (i) Sub-RTT Control (SRC) reacts to congestion faster than one RTT; (ii) Proactive Ramp-Up (PRU) anticipates flow completions to promptly occupy released bandwidth; and (iii) Supply Matching (SM) matches bandwidth demand with supply to maximize utilization. I show that these mechanisms reduce 99th-percentile latency by 80% and improve 99th-percentile flow completion time by up to 3x compared to Swift and HPCC, even at 400 Gb/s.
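To make the three ideas above concrete, here is a minimal toy sketch (not the thesis artifact; class name, constants, and the per-ACK feedback fields are all illustrative assumptions) of a sender that reacts to congestion feedback on every acknowledgment instead of once per RTT, claims bandwidth advertised by finishing flows, and grows to absorb reported spare supply:

```python
class BoltLikeSender:
    """Toy per-packet congestion controller sketching Bolt's three ideas.

    Hypothetical simplification: the network is assumed to stamp each ACK
    with the bottleneck queue occupancy, a supply surplus indicator, and a
    flag saying a competing flow on the bottleneck is about to finish.
    """

    def __init__(self, cwnd=10.0, min_cwnd=1.0):
        self.cwnd = cwnd          # congestion window, in packets
        self.min_cwnd = min_cwnd
        self.pru_tokens = 0       # ramp-up credits from finishing flows

    def on_ack(self, queue_occupancy, supply_surplus, flow_finishing):
        # (i) Sub-RTT Control: react to each congested ACK immediately,
        # rather than averaging signals over a full round trip.
        if queue_occupancy > 0:
            self.cwnd = max(self.min_cwnd, self.cwnd - 1.0)
        # (ii) Proactive Ramp-Up: a finishing flow advertises the bandwidth
        # it will release; bank a credit and spend it without waiting to
        # observe idle capacity after the fact.
        if flow_finishing:
            self.pru_tokens += 1
        if self.pru_tokens > 0 and queue_occupancy == 0:
            self.cwnd += 1.0
            self.pru_tokens -= 1
        # (iii) Supply Matching: if the bottleneck reports spare supply
        # (departures exceeding arrivals), grow to match it.
        elif supply_surplus > 0 and queue_occupancy == 0:
            self.cwnd += 1.0
        return self.cwnd
```

Because every branch runs on every ACK, the control loop operates at packet granularity: a congested queue shrinks the window within a fraction of an RTT, and released or unused bandwidth is claimed just as quickly, which is the intuition behind the tail-latency improvements the abstract reports.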

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2024
Publication date 2024
Issuance monographic
Language English

Creators/Contributors

Author Arslan, Serhat
Degree supervisor McKeown, Nick
Thesis advisor McKeown, Nick
Thesis advisor Katti, Sachin
Thesis advisor Prabhakar, Balaji, 1967-
Degree committee member Katti, Sachin
Degree committee member Prabhakar, Balaji, 1967-
Associated with Stanford University, School of Engineering
Associated with Stanford University, Department of Electrical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Serhat Arslan.
Note Submitted to the Department of Electrical Engineering.
Thesis Ph.D., Stanford University, 2024.
Location https://purl.stanford.edu/zj481vg3597

Access conditions

Copyright
© 2024 by Serhat Arslan
License
This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).
