Congestion control and adaptive routing in large-scale interconnection networks


Abstract/Contents

Abstract
Large-scale lossless interconnection networks are widely used in high-throughput, low-latency systems such as supercomputers and data centers. In addition to bandwidth and latency, congestion management is critical to guaranteeing network performance. Congestion in a lossless network is worsened by the effect of tree saturation, where a single point of congestion can spread through the network. This can have a severe and global impact on the performance of the entire system. As the size of networks continues to expand, traditional congestion control algorithms, such as Explicit Congestion Notification (ECN), and traditional adaptive routing algorithms are no longer adequate for large-scale networks. In this thesis, I describe two solutions for eliminating network congestion caused by different types of traffic. For congestion within the network fabric, I introduce Indirect Adaptive Routing (IAR), a new class of adaptive routing algorithms designed to improve channel load balance in a large-scale high-radix network. Each IAR algorithm uses a different mechanism to determine the level of congestion on non-local channels in a router's vicinity. Both local and remote network congestion information are then used to improve the adaptive routing decisions for network packets, load balancing the network under arbitrary admissible traffic patterns. The Speculative Reservation Protocol (SRP) resolves congestion caused by inadmissible or in-cast traffic at the network endpoints. SRP uses end-to-end reservations to ensure that no destination in the network is overloaded. To reduce reservation overhead, SRP allows sources to send packets speculatively while awaiting reservation replies. These speculative data packets can be dropped by the otherwise lossless network if congestion begins to form. By taking a proactive approach, SRP is able to react significantly faster to the onset of network congestion than the reactive protocols used in systems today.
While IAR and SRP each target a specific area of network congestion, they can also be combined to create a single comprehensive congestion solution. I show that these two protocols complement each other and are able to resolve congestion caused by arbitrary traffic configurations in a large-scale network.
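
The reservation idea behind SRP, as summarized in the abstract, can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the protocol as specified in the thesis: the names `Destination`, `reserve`, and `send_message`, the one-packet-per-time-unit rate, and the all-or-nothing `congested` flag are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Destination:
    """Toy endpoint that grants non-overlapping transmission slots (assumed model)."""
    next_free_time: int = 0  # earliest time unit not yet reserved

    def reserve(self, now: int, size: int) -> int:
        """Grant a start time for `size` packets, assuming one packet per time unit."""
        start = max(now, self.next_free_time)
        self.next_free_time = start + size
        return start


def send_message(dest: Destination, now: int, size: int, congested: bool):
    """Send `size` packets speculatively while a reservation is requested.

    Returns (packets delivered speculatively, packets resent in the
    reserved slot, start time of the reserved slot).
    Simplification: if `congested` is True, every speculative packet is dropped.
    """
    start = dest.reserve(now, size)
    delivered_speculatively = 0 if congested else size
    resent_in_slot = size - delivered_speculatively
    return delivered_speculatively, resent_in_slot, start


dest = Destination()
first = send_message(dest, now=0, size=4, congested=False)
second = send_message(dest, now=1, size=3, congested=True)
print(first, second)
```

In this sketch the second message loses its speculative packets and resends them in a slot that starts only after the first reservation ends, so the destination is never asked to accept more than one packet per time unit from reserved traffic. A real implementation would also have to handle partial drops and release of unused reservations, which this toy model ignores.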

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2013
Issuance monographic
Language English

Creators/Contributors

Associated with Jiang, Nan
Associated with Stanford University, Department of Electrical Engineering.
Primary advisor Dally, William
Thesis advisor Dally, William
Thesis advisor Kozyrakis, Christoforos, 1974-
Thesis advisor Rosenblum, Mendel

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Nan Jiang.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2013.
Location electronic resource

Access conditions

Copyright
© 2013 by Nan Jiang
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
