Network Congestion Avoidance Through Speculative Reservation

Nan Jiang, Daniel U. Becker, George Michelogiannakis, and William J. Dally
Department of Electrical Engineering, Stanford University
{njiang37, dub, mihelog, dally}@stanford.edu

Abstract

Congestion caused by hot-spot traffic can significantly degrade the performance of a computer system. In this study, we present the Speculative Reservation Protocol (SRP), a new network congestion control mechanism that relieves the effect of hot-spot traffic in high bandwidth, low latency, lossless computer networks. Compared to existing congestion control approaches like Explicit Congestion Notification (ECN), which react to network congestion through packet marking and rate throttling, SRP takes a proactive approach of congestion avoidance. Using a light-weight endpoint reservation scheme and speculative packet transmission, SRP avoids hot-spot congestion while incurring minimal overhead. Our simulation results show that SRP responds more rapidly to the onset of severe hot-spots than ECN and has a higher network throughput on bursty network traffic. SRP also performs comparably to networks without congestion control on benign traffic patterns by reducing the latency and throughput overhead commonly associated with reservation protocols.

1. Introduction

Congestion management is an important aspect of networking systems. In a shared communication medium, the presence of network congestion has a global impact on system performance. Network congestion is created when the offered load on a channel is greater than its bandwidth. In many traditional networks, the focus is on local communication bandwidth, and the network bisection channels are heavily over-subscribed due to cost constraints [1]. In these systems, the network bottlenecks usually occur on internal network channels due to under-provisioned global bandwidth. More recently, there has been a shift towards building system networks with full bisection bandwidth as new data center and cloud computing technologies increase the demand for global network communication [3, 4, 14–16, 18]. In networks with ample bisection bandwidth, congestion occurs almost entirely at the edge of the network.

Network endpoint hot-spots can occur in a wide range of network operations. Programming models commonly used in large computer systems, such as MapReduce, can have inherent hot-spot behavior [17]. Even if network traffic is uniform and random, multiple senders may temporarily send packets to the same destination and form a transient hot-spot [6]. Traffic that cannot be serviced by the over-subscribed destination is left in the queues, causing network congestion. In lossy network systems like TCP/IP, congestion causes packet drops, but as a result the point of congestion remains somewhat isolated. However, many system area networks, such as Infiniband [2], are designed to be lossless and use tightly-controlled buffer allocation policies such as credit-based flow control. In these systems, the congested traffic remains in the network until it is delivered. As a result, congested packets back up into the rest of the network in a condition called tree saturation [21]. Without proper management and isolation of these congestion effects, traffic flow in the rest of the network will be adversely affected.

Many congestion control mechanisms for networking systems have been proposed [28]. Explicit Congestion Notification (ECN) is a popular mechanism that has been adopted by many networking systems [2, 22]. While the exact implementation of ECN differs from system to system, the underlying operating principle is similar. When the network detects congestion, it signals the sources contributing to the bottleneck to throttle down. The congestion signal is sent via an explicit message or piggybacked on acknowledgment packets from the destination. ECN has been well studied in the context of system area networks, particularly the InfiniBand Architecture (IBA) [11, 13, 23], and has been shown to be effective in combating congestion. However, studies have also pointed out limitations such as reduced system stability, the need for parameter adjustment, and slow congestion response time [10, 20].

In this work we introduce the Speculative Reservation Protocol (SRP), a new congestion management mechanism for system area networks. In contrast to ECN, which only reacts to congestion after it has occurred, SRP avoids congestion by using bandwidth reservation at the destinations. Contrary to the common belief that network reservation protocols incur high overhead and complexity, SRP is designed with simplicity and low overhead in mind. Unlike previous reservation systems [19, 25], SRP uses a very light-weight reservation protocol with minimal complexity. The SRP reservation schedule is a simple mechanism that prevents the over-subscription of any network destination, eliminating hot-spot congestion. Furthermore, SRP avoids the latency overhead associated with reservation protocols by allowing sources to transmit packets speculatively without reservation. Speculative packets are sent with a short time-to-live and are dropped (and retransmitted later with reservation) if network congestion begins to form.

The speculative reservation protocol advances the state of the art in congestion control in the following ways:

• SRP prevents the formation of congestion rather than reacting to congestion after it has already occurred.

• SRP has a very rapid transient response, reacting almost instantaneously to the onset of congestion-prone traffic, compared to the hundreds of micro-seconds it takes packet marking protocols such as ECN to respond.

• SRP has a low overhead and performs on par with networks without congestion control on benign traffic.

• SRP improves fairness between sources competing for a network hot-spot.

The remainder of the paper is organized as follows. In Section 2, we demonstrate the effect of tree saturation on networks without congestion control and describe the current solution using ECN. Section 3 describes in detail the operation of SRP. Section 4 specifies the experimental methodology used in this study. In Section 5, we present a comparison study of SRP, a baseline network, and ECN using several different test cases. In Section 6, we examine in detail the behavior and overhead of SRP. Related congestion control mechanisms are discussed in Section 7. Finally, we conclude the study in Section 8.

Figure 1. Effect of hot-spot congestion in a network without congestion control: (a) simple network with congestion; (b) channel throughput of the congested network.

2. Motivation

Figure 1(a) shows a simple network congestion scenario that demonstrates the effects of tree saturation. Nodes S1 through S5 are contending for the hot-spot destination D1, whereas S0 is attempting to reach the uncongested destination D0. Each source tries to send at the maximum rate, which is equal to the rate of the links. Since S0 shares link L0 with S1 and S2, network congestion at L2 eventually backs up and affects the performance of S0 as shown in Figure 1(b). The hot-spot link L2 is at 100% utilization, with bandwidth divided between S1 through S5. Even though links L0 and L1 have spare bandwidth to support traffic from S0, this bandwidth cannot be utilized due to congested packets from S1 and S2 that are present in the input and output buffers of L0. We also note that the throughput of traffic from S1 and S2 is only half that of traffic from S3 through S5 due to the local fairness policies of the routers that grant equal throughput to each input port rather than to each traffic flow.

A congestion management algorithm used in many networking systems today is ECN. Figure 2 shows an example of the operation of ECN as implemented in Infiniband networks [2]. An ECN enabled router detects congestion by monitoring the occupancy of its input or output buffers. When a buffer's occupancy exceeds a certain threshold, the router marks the ECN field of packets passing through the buffer (in some systems the marking operation only occurs on ports identified as the root of the congestion). When the marked packet arrives at its destination, the ECN field is returned to the packet's source using a congestion notification packet. After the sender receives a notification, it incrementally reduces its transmission rate to that destination, relieving the congestion. In the absence of congestion notifications, the sender will gradually increase its injection rate to fully utilize the bandwidth of an uncongested network. To regulate the sender transmission rate, in the case of Infiniband, an inter-packet delay is added between successive packet transmissions to the same destination.

Figure 2. Operation of ECN.

With proper configuration, ECN has been shown to be effective in combating network congestion for long traffic flows [13]. However, due to the incremental nature of the algorithm, the reliance on buffer thresholds, and the round trip time of congestion information, ECN can have a slow response to the onset of congestion [10]. Furthermore, the set of parameters that regulates the behavior of the ECN algorithm needs to be carefully adjusted to avoid network instability [20].

3. Speculative Reservation Protocol

In contrast to packet marking mechanisms that react to congestion after it has occurred, SRP operates on the principle of congestion avoidance. SRP requires a reservation-grant handshake for communication between any sender and receiver to avoid over-subscribing the destination. To reduce the latency overhead associated with the reservation handshake, the sender can begin transmitting packets speculatively before the reservation grant returns. Speculative packets can be dropped by the network if congestion begins to form.

Figure 3. Time diagrams of the operation of SRP under different network situations: (a) normal network operation; (b) congested network operation.

A time diagram of the normal operation of SRP is shown in Figure 3(a). In this scenario, no hot-spot is present in the network. The sender S initiates communication to the destination D by first issuing a reservation packet R. This reservation packet is small, has a high network priority, and travels on a separate control Virtual Channel (VC) [8] to guarantee rapid delivery. The reservation carries a reservation size, n, indicating the number of packets the source intends to transmit. The size is chosen to amortize the overhead of SRP across multiple packets while providing fairness and responsiveness to multiple flows.

After issuing the reservation, S begins speculatively sending packets P1 and P2 to D. These speculative packets travel on a low-priority VC and have a limited Time-to-Wait (TTW). A speculative packet is dropped by a router if its total accumulated queuing time inside the network is greater than its TTW. We implement TTW tracking by timestamping packets upon arrival at a router's input port and then performing checks on this timestamp when the packet is at the head of the input buffer. Due to the unreliable nature of speculative packets, they require acknowledgments to notify the source whether they were successfully delivered or dropped.

Once the reservation packet arrives at D, the destination returns a small grant packet with a starting time payload, G(ts), based on its current reservation schedule. In addition, D updates its reservation schedule such that the next arriving reservation from any network source will be issued a starting time of no earlier than ts + n(1 + ε)τp. The constant τp is the time it takes to receive a single packet on the destination ejection channel. The parameter ε accounts for the bandwidth overhead of control packets. Aside from the reservation schedule, no other resources are reserved at the receiver beyond the normal operation of the network system.

When S receives the grant packet, it stops speculative transmission to D. After reaching time ts, S resumes transmission to D in a non-speculative mode, starting with packet P3 in the example in Figure 3(a). The non-speculative packets cannot be dropped and do not require acknowledgments. After S has transmitted all n packets successfully, any future transmission between S and D repeats this reservation process.

Figure 3(b) shows the time diagram illustrating SRP in a congested network with a hot-spot at node D.
Initially, S behaves identically to the previous example by sending the reservation and a speculative packet. The reservation packet, having higher network priority, quickly arrives at D. The speculative packet, however, encounters a large queuing delay near D. When the queuing delay exceeds its TTW, the speculative packet is dropped by the router and a negative acknowledgment (NACK) is returned to S. In our implementation, the TTW is a fixed value based on the packet latency distribution of the network under high load uniform random traffic. The effect of using different TTW values is evaluated in Section 6.1. When S receives a NACK packet, it stops speculative transmission to the destination. S resumes packet transmission at ts in non-speculative mode, starting with any packets that were previously dropped.

Due to the dropping protocol, out of order packet arrival is possible within each reservation.
In the scenario of Figure 3(b), if the NACK packet returns after ts, the retransmitted packet will arrive at D out of order. This can be prevented, at the cost of some bandwidth, by modifying the protocol such that after reaching ts, S retransmits all outstanding speculative packets. This ensures in-order packet arrival at the cost of possible duplicate packets arriving at D.

SRP is designed to minimize latency and bandwidth overhead. Sending speculative packets makes the latency overhead of SRP nearly negligible. At low to medium network loads, the majority of speculative packets reach their destination, and SRP's latency is the same as that of the baseline network. Bandwidth overhead is the result of control packets and dropped speculative packets. To minimize control overhead, the reservation/grant/ACK/NACK packets are made much smaller than the data packets and the bandwidth consumed by each reservation is amortized across n data packets. The size of each reservation, n, is set by the message size subject to two reservation size limits. Messages smaller than a minimum reservation size, nmin, may bypass the reservation protocol to reduce overhead. Messages larger than the maximum reservation size, nmax, are sent using multiple successive reservations, one for each nmax packets. While this chunking is not necessary for correct operation, it prevents long messages from monopolizing a network destination, which could create transient hot-spots in the network fabric.

At high network load, speculative packets are dropped more frequently due to increased queuing delays. Dropping speculative packets wastes network bandwidth and is a major source of overhead at high load. However, speculative packets never reduce the bandwidth available for non-speculative packets because they are sent on a separate, lower-priority virtual channel. Speculative drop overhead can be controlled by adjusting the speculative TTW and the reservation granularity. Bandwidth overhead behavior of SRP is explored in Section 6.1.

4. Experimental Setup

We characterize the performance and behavior of SRP using a cycle accurate network simulator based on Booksim [7]. We compare three networks: a baseline network with no congestion control, an SRP network, and a network implementing Infiniband style ECN.

The simulated network, unless otherwise specified, is a 256-node 2-level Fat Tree. 32-port routers are used on the first level, each with 16 down channels and 16 up channels. The second level of the Fat Tree uses 16-port routers with down channels only. The routers' operating frequency is set at 1 GHz and the zero-load latency through each router is 26 ns [24]. The network uses nearest-common-ancestor routing. A packet is first routed up the tree using randomly assigned up channels. When the packet reaches the router that is the lowest common ancestor of the source and the destination, it is routed down the tree deterministically to the destination node. Each network channel, including injection and ejection links, has a capacity of 10 Gb/s and a latency of 32 ns.

The simulated routers use credit-based virtual cut-through flow control. In the baseline network, a single VC is used to transmit all network traffic. In the ECN network, a control VC is added for congestion notification packets. The SRP network has two control VCs: one is used by the reservation packets, and the other is used by the grant, ACK, and NACK packets. An additional low-priority data VC is added to SRP for the speculative packets. In both ECN and SRP networks, the control VCs have higher priority than the data VCs.

Network data packets comprise 2K bits and are partitioned into 32 64-bit flits. Network control packets consist of a single 64-bit flit. In all three networks, the main data VC input buffer implements Virtual Output Queuing (VOQ) to avoid Head-of-Line (HoL) blocking. All other VCs, including the speculative VC in the SRP network, use single FIFO input buffers. The input buffer size per VC is 16 packets. The router crossbar has a 2× speedup over the network channels. Combined with VOQ, this speedup results in nearly 100% router throughput for random traffic. At the crossbar outputs, each VC has a 4-packet output buffer. Crossbar and output arbitration is performed using priority arbiters.

Two types of synthetic traffic patterns are used in the experiments. Hot-spot traffic is used to generate network congestion. We vary the over-subscription factor of the hot-spot by randomly selecting a subset of the network to transmit to the hot-spot. For benign traffic cases, Uniform Random (UR) traffic is used. A combination of these two traffic patterns is also used in some experiments. In the combined traffic pattern, a subset of nodes transmit exclusively to the hot-spot, while all other nodes are running UR traffic.

All network nodes generate traffic in units of messages. Messages range in size from a single packet to hundreds of packets, as specified for each experiment. When transmitting messages, network nodes use a mechanism similar to the Infiniband queue-pairs. The transmitting node creates a separate send queue for each destination. Similarly, receiving nodes create a separate receive queue for each source. The send queues at each node arbitrate for the injection channel in a round-robin fashion. A low-cost injection queue organization scheme has been proposed in [12]. However, the detailed Network Interface Controller (NIC) design is orthogonal to our study, and SRP can be modified to accommodate different node designs.

The congestion control parameters for ECN and SRP are listed in Table 1. Unless otherwise stated, this set of parameter values is used in all experiments. In the ECN network, if the total occupancy of the VOQ queues associated with a given output port exceeds bthres and the downstream input buffer has credits available, the port is identified as the root of congestion and ECN is activated. This applies to all router output ports as well as the node ejection ports. When a node's send queue receives a congestion notification, it increases its inter-packet delay by IPD+. If the send queue does not receive any congestion notification packets in an interval of length t−, its inter-packet delay is decreased by IPD−. The usage of the SRP parameters is described in Section 3.

Table 1. Configurable parameters for congestion control protocols.

  SRP:
    ε        Reservation overhead adjustment      0.05
    TTW      Speculative packet Time-to-Wait      1.3 µs
    nmax     Reservation granularity              16 packets
    nmin     Minimum reservation message size     4 packets

  ECN:
    IPD+     Inter-packet delay increment         800 ns
    IPD−     Inter-packet delay decrement         50 ns
    t−       Inter-packet delay decrement timer   2 µs
    bthres   Buffer threshold                     90% of input buffer capacity
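The ECN throttling rule governed by the Table 1 parameters can be sketched as follows. This is an illustrative model only, not the authors' simulator code; the class and method names are hypothetical.

```python
# Sketch of Infiniband-style ECN rate throttling: each congestion
# notification adds IPD+ to a send queue's inter-packet delay, and
# after a notification-free interval of t- the delay shrinks by IPD-.

IPD_INC = 800e-9   # IPD+ : inter-packet delay increment (s), Table 1
IPD_DEC = 50e-9    # IPD- : inter-packet delay decrement (s), Table 1
T_DEC = 2e-6       # t-   : decrement timer interval (s), Table 1

class EcnSendQueue:
    def __init__(self):
        self.ipd = 0.0                    # current inter-packet delay (s)
        self.last_notified = float("-inf")

    def on_notification(self, now):
        """A congestion notification packet arrived for this queue."""
        self.ipd += IPD_INC
        self.last_notified = now

    def on_timer(self, now):
        """Periodic check: decay the delay after a quiet t- interval."""
        if now - self.last_notified >= T_DEC:
            self.ipd = max(0.0, self.ipd - IPD_DEC)

q = EcnSendQueue()
q.on_notification(0.0)   # throttle: delay jumps to 800 ns
q.on_timer(2e-6)         # a full quiet t- has elapsed: decay to 750 ns
```

The asymmetry between IPD+ and IPD− (800 ns vs. 50 ns) reflects the incremental nature of the algorithm noted in Section 2: throttling is aggressive, while recovery is gradual, which is one source of ECN's slow transient response.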

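Before turning to the results, the destination-side reservation schedule of Section 3 can be sketched in a few lines. This is a minimal illustrative model, not the paper's implementation; the class is hypothetical, and τp = 204.8 ns is derived from the stated configuration (2 Kb packets on a 10 Gb/s ejection channel).

```python
# Sketch of the SRP destination reservation schedule: a reservation for
# n packets is granted a start time ts, and the schedule then advances
# by n * (1 + eps) * tau_p so the next reservation, from any source,
# cannot over-subscribe the ejection channel.

TAU_P = 2048 / 10e9   # time to eject one 2Kb packet at 10 Gb/s (s)
EPS = 0.05            # control-packet overhead allowance (Table 1)

class ReservationSchedule:
    def __init__(self):
        self.next_free = 0.0   # earliest start time for the next grant

    def grant(self, now, n):
        """Return the start time ts for a reservation of n packets."""
        ts = max(now, self.next_free)
        self.next_free = ts + n * (1 + EPS) * TAU_P
        return ts

sched = ReservationSchedule()
ts1 = sched.grant(0.0, 8)   # first 8-packet flow may start immediately
ts2 = sched.grant(0.0, 8)   # second flow is pushed past the first
```

With two 8-packet reservations arriving back to back, the second grant is deferred by 8 × 1.05 × 204.8 ns ≈ 1.72 µs: the serialization time of the first reservation plus the ε allowance for control traffic.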
576 400 4 L2 S1 S2 S3 S4 S5 350 3.5 ) e ) d 300 s 3 o  ( n

Link / y L1 S0 s c p

ID n b

250 e 2.5 t M ( a

t L

u t p h h 200 g 2 i l g f u L0 S0 S1 S2 - o n I r

h e

T 150 1.5 g

r a r e e d v 0 0.2 0.4 0.6 0.8 1 n A e 100 1 Link Utilization S

50 0.5 Figure 4. Speculative reservation resolving con- Zero-load 0 0 gestion on a simple network configuration. Baseline SRP ECN Baseline SRP ECN 5. Results Figure 5. Network statistics for senders of a 40:1 hot-spot traffic pattern. The error bars on the 5.1. Simple Congestion Behavior throughput graph indicate the standard devia- tion. The initial assessment of SRP is performed on the simple network configuration presented in Section 2. The direction of all three networks. The baseline network shows a large vari- traffic flow is shown in Figure 1(a). Figure 4 shows the through- ation in throughput for different hot-spot senders. Nodes that put of data packets with SRP enabled. In contrast with the base- share the same first-level router with the hot-spot receive higher line, the utilization of the L0 and L1 links has increased signif- throughput than those from other parts of the network. With icantly due to a 47% increase in the throughput of S0. Under SRP enabled, the same set of hot-spot senders is able to acquire SRP, the congested hot-spot packets are queued outside of the an equal share of reservations from the destination, resulting in network. As a result, network resources are freed for use by very little variation in sender throughput. The average sender packets from other flows, such as those from S0. Using SRP, throughput for SRP is 5% lower than for other networks due the hot-spot link L2 has a 5% lower data throughput compared to the overhead adjustment factor . Finally, the ECN network to the baseline due to the overhead adjustment factor. shows a higher variation in sender throughput than SRP, but In addition to resolving hot-spot congestion, SRP also pro- still significantly outperforms the baseline. vides improved fairness for traffic competing for the hot-spot In-flight packet latency is another indicator of network con- node. Compared to the baseline network, SRP increases the gestion. 
With a 40:1 over-subscription of the hot-spot, total throughput of traffic originating from S1 and S2 so that each of packet latency (including source queuing delay) is unbounded. the five hot-spot senders now receives an equal share of the L2 In-flight packet latency is the total packet latency less the source bandwidth. Since each traffic source acquires reservations from queuing delay. It correlates with the amount of network re- the destination independently, an equal share of reservations are sources consumed by each packet. Figure 5 shows that the base- returned to each source, ensuring throughput fairness. line network has a very high average in-flight packet latency, several thousand times that of the zero-load. This is symp- 5.2. Hot-spot Traffic Behavior tomatic of tree saturation. Nearly every network queue on a path to the hot-spot is completely filled. Also, due to unfair allo- Figure 5 shows the throughput and latency statistics of a 256- cation of network bandwidth, nodes closer to the hot-spot have node Fat Tree network running a 40:1 hot-spot traffic over the a much lower latency than farther nodes, resulting in a latency course of 50ms simulated time. In this experiment, 40 nodes distribution with high variance. With SRP, the average in-flight continuously transmit 8-packet messages to a single latency is close to the zero-load latency. This is achieved destination. The set of nodes is selected randomly and used in because SRP avoids congestion rather than correcting conges- tion. Hence, most packets traverse the network with little or no Figure 7 shows the network response of SRP and ECN to the queuing delay. The average in-flight packet latency of the ECN step traffic pattern. The baseline result is not shown because network is much higher than that of SRP. This is because the the network becomes saturated by the hot-spot and never re- ECN network requires some congestion to operate. No source covers. 
In the SRP network, the hot-spot onset has nearly no throttling occurs unless queues periodically reach their thresh- observable effect on the total packet latency or the throughput olds. This is the penalty of reacting to congestion rather than of the background uniform traffic. In contrast, the ECN network avoiding it. experiences a large latency spike and throughput reduction af- We further demonstrate the absence of hot-spot congestion ter the hot-spot forms. Over the course of the next millisec- by using the combined traffic pattern of hot-spot and back- ond, the latency and throughput of the ECN network recover ground UR traffic. In these simulations we maintain the hot- to pre-transition levels. Near the end of the recovery period, spot traffic at 40:1 over-subscription and vary the rate of UR the throughput curve spikes to compensate for the initial post- traffic injected by other nodes not participating in the hot-spot. transition throughput deficiency. We also simulate the networks without hot-spot traffic to estab- This transient experiment demonstrates a fundamental limi- lish a reference. Both hot-spot and UR traffic have a message tation of ECN: network congestion must have occurred for ECN size of eight packets. Figure 6 shows the offered vs. accepted to become active. Packets sent during the ECN activation and throughput of UR and combined traffic for each network. In the source throttling period are destined to further increase conges- baseline, the hot-spot leads to severe network congestion, caus- tion in the network. While ECN parameters can be modified ing the background uniform traffic to saturate at less than 10% for faster triggering on congestion, such as reducing the buffer network capacity. With SRP, the background uniform traffic re- threshold, this increases the number of false positives in traffic mains stable even under very high network loads. 
A deeper look patterns that do not have congestion and thus can lower network into the SRP network shows that at each node, the send queue throughput on benign traffic. to the hot-spot destination receives a very high reservation start- ECN’s transient behavior improves as the hot-spot over- ing time that causes long stalls, while the send queues to other subscription factor is reduced. This is because there is a limit to network destinations have a starting time that allows them to the number of notifications that a node can generate in a given transmit immediately. This experiment shows that SRP com- time period (one notification per packet received). With fewer pletely eliminates the effect of the hot-spot on background traf- hot-spot senders, every one of them receives more notifications fic and represents a significant improvement over the baseline in the same period of time, resulting in faster traffic throttling. network, which cannot sustain any substantial background traf- In contrast, SRP works well across the entire range of hot-spot fic when a hot-spot is active. The ECN network is also able to over-subscription factors with no need to adjust protocol param- provide high background traffic throughput in the presence of a eters. hot-spot. At steady state, the hot-spot send queues at each node become heavily throttled by congestion notifications, leaving 5.4. Uniform Traffic Throughput network resources available for the background uniform traffic. In addition to avoiding network congestion, SRP also 5.3. Transient Behavior achieves good latency and throughput under benign traffic pat- terns. Figure 8 compares the latency-throughput curves of SRP While both ECN and SRP are effective at managing long- and the baseline network for uniform random traffic with a mes- term congestion, a key advantage of SRP, in addition to its sage size of four packets. 
The latency curve of the ECN network lower in-flight latency, is its fast response to the onset of conges- is nearly identical to that of the baseline and is not shown here. tion. As soon as the hot-spot traffic is initiated by the senders, At low loads, SRP has a negligible latency overhead compared it becomes regulated by the reservation protocol, and any con- to the baseline network. In this region of operation, most specu- gestion in the network is completely avoided. While conges- lative packets are successfully delivered to the destination, and tion may occur on the speculative VC, this condition is accept- the reservation round-trip delay is masked by the overlapping able because the speculative packets have a low network prior- speculative packet transmission. As network load increases, ity and will quickly time out when they become blocked. We queuing delay on the speculative VC causes packet drops, and test the congestion response time of the networks using a step- the latency effect of reservation becomes more noticeable. The function traffic pattern. In this traffic configuration, the network SRP latency overhead peaks at 85% network load and is 25% is warmed up with UR traffic (40% load). After 1ms simulated higher than the baseline. At even higher network load, the la- time, the traffic pattern is switched to a combination of back- tency overhead of SRP disappears as the baseline network be- ground uniform (40% load) and 40:1 hot-spot traffic for the rest gins to saturate. of the simulation. The message size for both traffic patterns is In terms of network throughput, Figure 9 shows the satura- eight packets. The total packet latency and throughput of the tion throughput of the three networks running uniform random uniform traffic is recorded each cycle before and after the tran- traffic with various message sizes. In addition, we measure the sition in order to monitor the initial impact of the hot-spot. 
The networks’ throughput using a combination of different message results shown are the average of 100 simulations using differ- sizes. In the bimodal mode (Bi), traffic comprises equal frac- ent random seeds. Multiple simulations are required to get an tions of short (4-packet) and long (64-packet) messages. In the adequate sample size for each transient point. mixture mode (Mix), the message size is uniformly distributed Baseline SRP ECN

) 10 10 10 e

d Uniform Only o n / Combined s 8 8 8 p b G (

t 6 6 6 u p h g

u 4 4 4 o r h T

d 2 2 2 e t p e c

c 0 0 0

A 0 5 10 0 5 10 0 5 10 Offered Throughput (Gbps/node) Offered Throughput (Gbps/node) Offered Throughput (Gbps/node)

Figure 6. With SRP or ECN enabled, the offered vs. accepted throughput plot of the background traffic is nearly unaffected by the presence of a hot-spot. Latency Latency 6 6

) 5 5 ) s s   ( (

y 4 y 4 c c n n e e t 3 t 3 a a L L

t t e e

k 2 k 2 c c a a

P 1 P 1

0 0 800 1000 1200 1400 1600 1800 800 1000 1200 1400 1600 1800 Simulation Time (s) Simulation Time (s) Throughput Throughput 10 10 t t u u p p h h g 8 g 8 u u o o ) ) r r e e h h d d T T

o 6 o 6 d d n n / / e e s s t t p p p p b b e 4 e 4 c c G G c c ( (

A A

[Figure 7. Network response to the onset of a long-lived hot-spot: (a) SRP network; (b) ECN network. Average throughput vs. simulation time.]

between 4 and 64 packets. The results show that as the message size increases, the saturation throughput of the baseline network actually decreases. This is due to transient load imbalances caused by bursty traffic. When multiple large messages converge on a destination, a temporary hot-spot is formed. Without additional buffering, the network saturation throughput is reduced. The effect of large messages is mitigated in networks with congestion control because transient load imbalance can be resolved without additional buffers. By throttling the sources of the transient hot-spots, the network remains uncongested, allowing other packets to utilize the network resources. This is evident in the throughput bars for both SRP and ECN: as message sizes increase, the throughput of both networks remains high.

For messages smaller than the minimum reservation message size (nmin = 4 packets), the reservation protocol is not triggered, and the SRP network behaves essentially as the baseline network without any congestion control. In the simulation with a message size of four packets, the reservation protocol is activated and SRP increases saturation throughput by 6% and 3% compared to the baseline and ECN networks, respectively. For larger message sizes, SRP consistently outperforms the baseline network by eliminating all effects of transient load imbalance. For the larger message sizes, the saturation throughput of SRP essentially converges, suggesting that SRP will maintain high throughput for even larger message sizes. For both the bimodal and mixture simulations, SRP is able to maintain high network throughput. This demonstrates that the reservation scheduling of SRP is well behaved when interacting with different message sizes in the same network.

The ECN network has good performance on benign traffic with small message sizes. A properly set up ECN buffer threshold ensures that false congestion notifications are rarely generated under UR traffic even at very high network load. Therefore, for single- and two-packet messages, ECN performs as well as the baseline. For larger message sizes, the saturation throughput of ECN is much higher than that of the baseline, but falls below that of SRP. With more bursty traffic, ECN cannot react as fast as SRP to transient load imbalances, resulting in more network congestion. Furthermore, the ECN algorithm is designed such that it takes some time for a send queue to recover from a congestion notification, and some injection bandwidth is wasted during the recovery period. Similar to SRP, with increasing message size, the saturation throughput of ECN also converges.

[Figure 8. Latency vs. throughput of the baseline and SRP networks under uniform random traffic with a message size of 4 packets.]

[Figure 10. Comparison of channel utilization of the SRP and baseline networks under uniform random traffic with 4-packet messages.]
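The minimum reservation message size nmin discussed above amounts to a simple injection-time decision at the sender. The sketch below is an illustration of that rule only, not the authors' simulator code; the names (`N_MIN`, `injection_mode`) are ours.

```python
# Illustrative sketch of the n_min bypass rule: messages shorter than the
# minimum reservation size skip the reservation protocol entirely and are
# injected as ordinary data packets (hypothetical names, not the paper's code).
N_MIN = 4  # minimum reservation message size, in packets

def injection_mode(message_size_packets: int) -> str:
    """Return how a message of the given size enters the network."""
    if message_size_packets < N_MIN:
        # Small messages behave as in the baseline network: no reservation,
        # no speculative transmission, and hence no added overhead.
        return "direct"
    # Larger messages send a reservation and may transmit speculatively
    # while the reservation handshake is in flight.
    return "reserved"

assert injection_mode(1) == "direct"
assert injection_mode(2) == "direct"
assert injection_mode(4) == "reserved"
assert injection_mode(64) == "reserved"
```

Under this rule, single- and two-packet messages see exactly the baseline network's behavior, which is why SRP matches the baseline for those sizes in Figure 9.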

[Figure 9. Network saturation throughput under uniform random traffic of different message sizes.]

6. Discussion

6.1. Reservation Overhead

The high performance of SRP is in part due to keeping its bandwidth and latency overhead at a negligible level. This is achieved by choosing appropriate values for the parameters in Table 1, as discussed below.

Figure 10 shows the average channel utilization of SRP and a baseline network running UR traffic with a message size of four packets. Network utilization with ECN falls in between SRP and the baseline and is not shown. Channel utilization overhead for SRP peaks after 70% network load and consumes less than 2.5% additional bandwidth due to control and speculative packet drop overhead. The magnitude of the bandwidth overhead is a function of the network message size, highlighting the need for a minimum reservation message size, nmin. If the network were to use SRP for single-packet messages, the maximum bandwidth overhead would increase to 20%. This would cause the SRP network to saturate earlier than the baseline for single-packet messages. To avoid this high overhead, we introduce the minimum reservation message size and allow small messages to ignore the reservation protocol.

The bandwidth overhead of SRP is largely due to dropped speculative packets. Figure 11 shows the speculative injection and drop rate as a function of the average network injection rate. This simulation is for UR traffic with 4-packet messages. As the injection rate increases, the injection rate of speculative packets also increases, peaking at about 50% network load. Because speculative packets have a lower network priority, normal data packets are preferentially admitted into the network when competing with a speculative packet. Above 50% network load, this causes the speculative injection rate to decrease. At higher load, the drop rate increases despite a lower speculative injection rate: the queuing delay at high load causes the fraction of speculative packets experiencing a time-out to increase significantly. Overall, the bandwidth wasted by the speculative packets is less than 1.5% of the total bandwidth. This is a small and justifiable overhead for the latency benefits provided by speculative packet transmission.

The speculative drop rate, and hence the bandwidth overhead, could be reduced by increasing the speculative TTW. Figure 12 shows the drop rate of SRP networks with speculative TTW values ranging from 0.6 to 2.6µs. Every doubling of the TTW reduces the peak drop rate by approximately 50%. A higher TTW will cause the speculative VC to become more congested in the presence of hot-spot traffic. However, it does not affect normal data packets or reservation packets, as these travel on separate, uncongested VCs. The downside of increasing the speculative TTW is that it slows speculative retransmission.
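The time-out behavior that produces these drops can be viewed as a per-packet check: a speculative packet accumulates queuing delay at each hop and is dropped (with a NACK returned to the sender) once that delay exceeds its TTW. The following is a minimal model with invented names (`ttw_us`, `queued_us`), not the authors' router logic.

```python
# Minimal model of the speculative time-to-wait (TTW) drop rule: a speculative
# packet is dropped once its cumulative queuing delay exceeds its TTW budget.
# Names and structure are illustrative, not taken from the paper's simulator.
from dataclasses import dataclass

@dataclass
class SpeculativePacket:
    ttw_us: float           # time-to-wait budget, in microseconds
    queued_us: float = 0.0  # cumulative queuing delay accumulated so far

    def wait(self, delay_us: float) -> bool:
        """Accumulate queuing delay; return True if the packet must be
        dropped (and a NACK generated) because its TTW has expired."""
        self.queued_us += delay_us
        return self.queued_us > self.ttw_us

pkt = SpeculativePacket(ttw_us=0.6)
assert not pkt.wait(0.3)  # 0.3us queued: still within budget
assert pkt.wait(0.4)      # 0.7us queued > 0.6us TTW: drop and NACK
```

This also makes the buffer-sizing observation in Section 6.6 concrete: since no speculative packet waits longer than its TTW, speculative VC buffers only need to cover one TTW's worth of traffic.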

[Figure 11. Rate of packets injected on the speculative VC and the rate of packet drops in the speculative VC.]

[Figure 13. Effect of reservation granularity on the packet drop rate under uniform random traffic with 256-packet messages.]
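The halving behavior in Figure 13 follows directly from how a long message is broken into reservations of at most nmax packets: doubling nmax halves the number of reservation handshakes, and hence the number of speculative bursts that can be dropped. A minimal sketch under our own naming (`n_max`, `reservation_chunks`), not the paper's code:

```python
# Illustrative sketch: a long message is reserved in chunks of at most n_max
# packets, so the number of reservation handshakes is ceil(size / n_max).
# Doubling n_max halves the handshake count for large messages.
import math

def reservation_chunks(message_packets: int, n_max: int) -> list[int]:
    """Split a message into per-reservation chunk sizes."""
    full, rem = divmod(message_packets, n_max)
    return [n_max] * full + ([rem] if rem else [])

def handshakes(message_packets: int, n_max: int) -> int:
    """Number of reservation handshakes needed for the message."""
    return math.ceil(message_packets / n_max)

# A 256-packet message, as in Figure 13:
assert handshakes(256, 8) == 32
assert handshakes(256, 16) == 16  # doubling granularity halves handshakes
assert sum(reservation_chunks(256, 8)) == 256  # no packets lost to chunking
```

The trade-off noted in the text also falls out of this sketch: a very large `n_max` means a single grant ties up the destination for a long stretch, which is what lets long messages starve flows with smaller messages.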

[Figure 12. Effect of speculative TTW (0.6, 1.3, and 2.6µs) on the packet drop rate under uniform random traffic with 4-packet messages.]

In our implementation, a speculative packet is only retransmitted once a NACK is received. With a high TTW, this may not happen until long after the rest of the message has been transmitted non-speculatively. An alternate fast-retransmit protocol resends the outstanding speculative packets once all other packets in the reserved block have been sent. While this protocol eliminates the latency issues of a high TTW, we chose not to use it because it introduces additional overhead from duplicate packet transmission when both the speculative and retransmitted packets reach the destination.

Bandwidth overhead can also be reduced by increasing nmax, the reservation granularity. Speculative packets are intended to cover the round-trip latency of the reservation handshake. With a higher network priority and dedicated control VCs, the reservation round-trip remains low even at very high network traffic load. Thus, the number of speculative packets that exist in the network is bounded by a constant times the number of reservation round-trips. For large message sizes, increasing the reservation granularity, nmax, reduces the number of reservation handshakes and hence the speculative drop rate. Figure 13 shows the effect of reservation granularity on the speculative drop rate for uniform random traffic with a message size of 256 packets. With each doubling of the reservation granularity, the maximum speculative drop rate is reduced by half. This matches the expected drop behavior of speculative packets. The benefits of large reservation granularity also need to be balanced against fairness: large granularity allows long messages to monopolize the destination and starve flows with smaller messages.

While statically configuring the speculative TTW and reservation granularity is simple to implement and works well, the optimal solution is to dynamically adjust these parameters based on current network conditions. If the sender knows the network is at high load but uncongested, it can choose not to transmit any packets speculatively, or to send speculative packets with a very high TTW to increase the probability of delivery.

6.2. Scalability

The SRP parameters listed in Table 1 need to be adjusted to accommodate different network configurations, particularly to account for different zero-load network latencies and expected network queuing delays. This has a direct impact on the overhead of SRP. With higher queuing delay, the speculative TTW needs to be increased proportionally to prevent premature time-outs. A larger network also increases the reservation handshake latency and thus the number of packets sent speculatively. Therefore, the reservation granularity needs to be increased to amortize the cost of speculation over a larger number of packets.

To demonstrate the scalability of SRP, we simulate a three-level Fat Tree with a total of 4096 nodes and increase the network channel latency to 128ns. Using the same set of SRP parameters shown in Table 1, this larger network is able to effectively manage congestion formed by a 200:1 over-subscribed hot-spot traffic pattern. However, when running UR traffic with 4-packet messages, the SRP network has the same saturation throughput as the baseline. Contrast this with the data of the 256-node network shown in Figure 9, where SRP has a 6% higher saturation throughput than the baseline network for the same message size. The reduced saturation throughput is entirely the result of dropped speculative packets. Network throughput recovers by 3% when the speculative TTW is increased to match the higher network queuing delay.

6.3. Small Messages

Allowing small messages to bypass the reservation protocol eliminates SRP's ability to control network congestion caused by these messages. In practice, this is not an issue because most network congestion is caused by large traffic flows (e.g., MapReduce traffic [17]). However, if congestion control for small messages is required, it can be realized by using ECN packet marking to selectively enable SRP for some sources. When a congestion notification is received, SRP is enabled for small messages to that destination. Multiple small messages to the same destination can also be coalesced into a single reservation to amortize overhead.

6.4. Oversubscribed Networks

The focus of SRP is network congestion caused by over-subscribed network destinations. In networks with sufficiently provisioned bandwidth, this is the only cause of network congestion. However, in an under-provisioned network, congestion can also arise from overloaded network channels. The current SRP implementation reserves bandwidth only at the endpoints and cannot handle this case: two fully reserved nodes that share a network channel will cause channel congestion. If the network has path diversity, adaptive routing can resolve the congestion by spreading the traffic across multiple network channels. With path diversity, the problem is actually one of channel load imbalance, not congestion, and the adaptive routing solution is orthogonal to the use of SRP. A true congestion control solution for networks with under-provisioned bandwidth would require reservations for the bottleneck network channels as well as the endpoints. Alternatively, an additional congestion notification mechanism, such as ECN, can be used to alert the sender of channel congestion. Because our focus is on fully provisioned networks, we have not studied the problem of internal channel congestion; it is beyond the scope of this work.

6.5. Fairness

As shown in Figure 5, SRP provides network-level fairness in a network whose routers implement only locally fair allocation policies. This behavior may seem unintuitive because a reservation packet from a nearby node can reach the destination faster than one from the far side of the network. However, the latency of the reservation round-trip only affects the number of speculative packets sent by each node, not the node's share of total bandwidth. As long as the throughput of reservation packets in the network is stable, each node will receive an equal share of data bandwidth. In our SRP implementation, the reservation packets have the highest network priority and comprise only a single flit. Thus, sustained congestion never occurs on the control VC, and long-term reservation fairness is guaranteed.

Because reservation packets are themselves unregulated, temporary congestion of the control VC can occur in rare circumstances. Figure 14 shows the response of SRP to an impulse traffic pattern. The network is warmed up under 40% load UR traffic for 1ms. Then, simultaneously, every node in the network initiates a small (eight-packet) hot-spot message to the same destination. After the impulse, all nodes return to UR traffic. This traffic pattern causes 256 reservation packets to be injected in a very short period, creating a temporary reservation hot-spot. Figure 14 shows a dip in network throughput for several microseconds after the impulse. However, because the reservation packets are small, the temporary congestion of the control VC quickly dissipates, and network throughput returns to normal. While this example demonstrates that a reservation hot-spot is possible, in realistic network settings such simultaneous initiation of network traffic rarely occurs and cannot be sustained.

[Figure 14. Throughput response of a SRP network under an impulse traffic pattern.]
6.6. Implementation

Implementing SRP requires modifying both the NIC and the network router. On the receiving side of the NIC, a single reservation schedule register is maintained that tracks the earliest grant time for the next incoming reservation. When a grant is issued, this register is incremented by the reservation size. On the sender side, the SRP modification depends on how the NIC supports traffic flows. In a system using queue-pairs, such as Infiniband, the NIC is modified to issue a reservation packet and transmit data packets initially in speculative mode. These speculative packets are retained in the NIC buffers until a positive acknowledgment is received.

The network routers are modified to add the ability to drop a packet and send a corresponding NACK when the packet's TTW expires. Packet drop can be easily handled by the input buffer logic when a packet arrives from the channel or when it reaches the head of an input queue. Generating a NACK packet can be handled by transforming the head flit of the dropped packet into a NACK.

In our SRP implementation, two control VCs and a single speculative VC are added to the network. The characteristics of these VCs allow for reduced implementation cost compared to normal data VCs. The control VCs are designed to handle only small packets and typically have a very low utilization factor; as a result, they have very low buffering requirements. The speculative VC will drop a packet when its cumulative queuing delay exceeds its TTW, so the buffers of the speculative VC are sized to match the TTW. While the buffer cost of SRP is small, additional complexity is introduced into router allocators and arbiters to handle the additional virtual channels and packet priorities.

All of the experiments in this study use a network with a single data VC. In networks with multiple data VCs, the control and speculative VCs can be shared by all data VCs without a significant impact on performance. This is because the load on the control VCs is proportional to the amount of data injected by the node, regardless of the number of data VCs. Allowing multiple data VCs to share the same speculative VC is acceptable because speculative packets can always fall back to their dedicated data VC when they expire. In the case of multiple data VCs with different Quality-of-Service (QoS) requirements, reservation packets must be prioritized appropriately. In such scenarios, multiple control VCs may be necessary to avoid priority inversion due to head-of-line blocking.
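The receiver-side state described above is just one register holding the earliest grant time. A minimal sketch of that bookkeeping follows; the naming (`ReservationScheduler`, `packet_time_us`) is ours, and real hardware would track time in flit cycles rather than floating-point microseconds.

```python
# Minimal sketch of the receiver-side reservation schedule register: a single
# register tracks the earliest time the next reservation can be granted, and
# each grant advances it by the drain time of the reserved packets.
class ReservationScheduler:
    def __init__(self, packet_time_us: float):
        self.packet_time_us = packet_time_us  # drain time per packet
        self.next_grant_us = 0.0              # the single schedule register

    def grant(self, now_us: float, reserved_packets: int) -> float:
        """Return the granted start time and advance the register by the
        reservation size, serializing reservations at the destination."""
        start = max(now_us, self.next_grant_us)
        self.next_grant_us = start + reserved_packets * self.packet_time_us
        return start

sched = ReservationScheduler(packet_time_us=0.1)
t0 = sched.grant(now_us=0.0, reserved_packets=8)  # granted immediately
t1 = sched.grant(now_us=0.0, reserved_packets=8)  # queued behind the first
assert t0 == 0.0 and abs(t1 - 0.8) < 1e-9
```

Because grants are handed out in arrival order and each one advances the register by exactly the reserved amount, the destination never over-commits its ejection bandwidth, which is what keeps the hot-spot from congesting the network.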
7. Related Work

In this study, we have compared SRP to an ECN mechanism similar to the one used in IBA networks [2]. Over the years, several studies have evaluated the behavior and performance of IBA ECN. The ECN parameters used in our study were derived from suggestions made in [20]. Pfister et al. characterized the behavior of IBA ECN using a simulator on a wide array of network configurations and showed that ECN provides good congestion management in many simulations. However, the authors also noted network instability when the ECN parameters are not matched to the traffic pattern. The authors also did not address the short-term behavior of ECN during the onset of network hot-spot congestion, focusing instead on long-lived traffic flows.

Recently, Mellanox Technologies has incorporated ECN into their Infiniband routers [26]. Gran et al. studied the performance of IBA ECN in hardware on an InfiniScale IV router [13] and confirmed the effectiveness of ECN in hardware using synthetic traffic patterns and HPC benchmarks. Unfortunately, the hardware study was limited to seven nodes and two routers, in a configuration similar to the simple scenario we presented in Section 2. The transient behavior of ECN was also not addressed.

Improved ECN mechanisms for Infiniband networks have been proposed in [11, 23]. These methods differ from the standard IBA ECN mechanism in when packet marking occurs and how nodes respond when congestion is sensed. A summary and comparison study of three different ECN proposals is presented in [10]. Ferrer et al. performed a transient hot-spot study to demonstrate the response time of each method. While the study showed that the Marking and Validation Congestion Management method [11] yields the fastest response to congestion, all ECN methods still exhibit some negative impact on latency and throughput at the onset of a hot-spot.

ECN is also included in an extension to the TCP/IP standard [22]. It is used in conjunction with packet dropping to regulate the TCP congestion window size in order to reduce network congestion in Internet routers. More recently, a proposed enhancement to switched Ethernet adds a notification-based congestion control mechanism. Quantized Congestion Notification (QCN) is a new standard developed by the Data Center Bridging task group to provide congestion control. QCN allows Ethernet switches to send rate-adjustment packets to the network sources in order to throttle their injection rates. Allowing the switches to directly send notification packets reduces the network response time to the onset of congestion.

Many proposed congestion control methods do not require packet marking. Some methods attempt to detect hot-spot traffic and isolate it in a separate queue, eliminating the congestion caused by HoL blocking of hot-spot traffic [5, 9]. While effective, these methods have a higher implementation cost due to the need for a large number of queues and VCs to handle an arbitrary number of hot-spots. Other methods broadcast congestion information to notify and regulate the injection rate of hot-spot traffic [27]. The calculation and transmission of global congestion information to each network node has limited scalability compared to ECN and SRP.

Other reservation protocols have been proposed for networking systems. Flit reservation [19] aims at increasing throughput and reducing latency of on-chip network routers by reserving network resources ahead of the packet's injection. Theoretically, the flit-reservation protocol can prevent any network congestion because every packet's path is completely reserved. However, such detailed reservation is too complex and carries too much overhead to implement in a large network, and it is not necessary to prevent network congestion. Song and Pinkston [25] use bandwidth reservation to prioritize the movement of congested packets inside the network. However, their mechanism only affects packets that are already in the network and has no control over the injection rate of the nodes responsible for the hot-spot.
8. Conclusion

This paper has introduced a Speculative Reservation Protocol (SRP) for avoiding network hot-spot congestion. In lossless networks that have adequately provisioned internal channel bandwidth, SRP uses a simple reservation mechanism to prevent the formation of congestion due to network hot-spots. In contrast, existing congestion control mechanisms like Explicit Congestion Notification (ECN) only become active once network congestion has already occurred. By keeping the network out of the congested state entirely, SRP provides better fairness and greatly improves transient response compared to ECN. Other design features of SRP, including reservation granularity and speculative packet transmission before reservation, focus aggressively on reducing the latency and bandwidth overhead of an already light-weight reservation protocol.

Experiments show that SRP responds almost instantaneously to the onset of congestion, compared to the hundreds of microseconds it takes for ECN to react. SRP also provides near-perfect bandwidth fairness among sources competing for a network hot-spot. For benign traffic patterns, SRP has at most 25% latency and 3% bandwidth overhead compared to a baseline network without congestion control. For bursty traffic patterns, SRP achieves a consistently higher saturation throughput than ECN and the baseline network by efficiently resolving transient hot-spots caused by bursty traffic. SRP represents an effective and viable alternative to the canonical packet-marking congestion control mechanisms for high-speed computer networks.

Acknowledgements

This research was supported in part by the P. Michael Farmwald, the Professor Michael J. Flynn, and the Robert Bosch Stanford Graduate Fellowship. This research was furthermore supported by the National Science Foundation under Grant CCF-070234, and by the Stanford Pervasive Parallelism Laboratory.

References

[1] Cisco data center infrastructure 2.5 design guide.
[2] InfiniBand Trade Association. InfiniBand architecture specification, volume 1, release 1.2.1. http://www.infinibandta.com.
[3] D. Abts, M. R. Marty, P. M. Wells, P. Klausler, and H. Liu. Energy proportional datacenter networks. In Proceedings of the 37th Annual International Symposium on Computer Architecture, 2010.
[4] M. Al-Fares, A. Loukissas, and A. Vahdat. A scalable, commodity data center network architecture. SIGCOMM Comput. Commun. Rev., 38, August 2008.
[5] A. Banerjee and S. W. Moore. Flow-aware allocation for on-chip networks. In Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip.
[6] T. Benson, A. Anand, A. Akella, and M. Zhang. Understanding data center traffic characteristics. SIGCOMM Comput. Commun. Rev., 40, January 2010.
[7] W. Dally and B. Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2003.
[8] W. J. Dally. Virtual-channel flow control. IEEE Transactions on Parallel and Distributed Systems, 3(2), 1992.
[9] J. Duato, I. Johnson, J. Flich, F. Naven, P. Garcia, and T. Nachiondo. A new scalable and cost-effective congestion management strategy for lossless multistage interconnection networks. In High-Performance Computer Architecture, 11th International Symposium on, 2005.
[10] J.-L. Ferrer, E. Baydal, A. Robles, P. Lopez, and J. Duato. On the influence of the packet marking and injection control schemes in congestion management for MINs. In Proceedings of the 14th International Euro-Par Conference on Parallel Processing.
[11] J.-L. Ferrer, E. Baydal, A. Robles, P. Lopez, and J. Duato. Congestion management in MINs through marked and validated packets. In Proceedings of the 15th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, 2007.
[12] J.-L. Ferrer, E. Baydal, A. Robles, P. Lopez, and J. Duato. A scalable and early congestion management mechanism for MINs. In Parallel, Distributed and Network-Based Processing, 18th Euromicro International Conference on, 2010.
[13] E. Gran, M. Eimot, S.-A. Reinemo, T. Skeie, O. Lysne, L. Huse, and G. Shainer. First experiences with congestion control in InfiniBand hardware. In Parallel Distributed Processing, 2010 IEEE International Symposium on.
[14] A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. VL2: a scalable and flexible data center network. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication.
[15] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu. BCube: a high performance, server-centric network architecture for modular data centers. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication.
[16] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu. DCell: a scalable and fault-tolerant network structure for data centers. SIGCOMM Comput. Commun. Rev., 38, August 2008.
[17] S. Kandula, S. Sengupta, A. Greenberg, P. Patel, and R. Chaiken. The nature of data center traffic: measurements & analysis. In Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement, 2009.
[18] R. Niranjan Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat. PortLand: a scalable fault-tolerant layer 2 data center network fabric. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication.
[19] L.-S. Peh and W. Dally. Flit-reservation flow control. In High-Performance Computer Architecture, Proceedings, Sixth International Symposium on, 2000.
[20] G. Pfister, M. Gusat, W. Denzel, D. Craddock, N. Ni, W. Rooney, T. Engbersen, R. Luijten, R. Krishnamurthy, and J. Duato. Solving hot spot contention using InfiniBand architecture congestion control. In High Performance Interconnects for Distributed Computing, 2005.
[21] G. Pfister and V. A. Norton. "Hot spot" contention and combining in multistage interconnection networks. IEEE Trans. on Computers, C-34, October 1985.
[22] K. Ramakrishnan. The addition of explicit congestion notification (ECN) to IP. IETF RFC 3168.
[23] J. Santos, Y. Turner, and G. Janakiraman. End-to-end congestion control for InfiniBand. In Twenty-Second Annual Joint Conference of the IEEE Computer and Communications Societies, volume 2, 2003.
[24] S. Scott, D. Abts, J. Kim, and W. J. Dally. The BlackWidow high-radix Clos network. In Proceedings of the 33rd Annual International Symposium on Computer Architecture, 2006.
[25] Y. H. Song and T. M. Pinkston. Distributed resolution of network congestion and potential deadlock using reservation-based scheduling. IEEE Trans. Parallel Distrib. Syst., 16, August 2005.
[26] Mellanox Technologies. Mellanox InfiniScale IV switch architecture provides massively scaleable 40Gb/s server and storage connectivity. Press release.
[27] M. Thottethodi, A. Lebeck, and S. Mukherjee. Self-tuned congestion control for multiprocessor networks. In High-Performance Computer Architecture, The Seventh International Symposium on, 2001.
[28] C.-Q. Yang and A. Reddy. A taxonomy for congestion control algorithms in packet switching networks. Network, IEEE, 9(4), 1995.