CS 268: Lecture 6 (TCP Congestion Control)
Ion Stoica
February 6, 2006

Today's Lecture
Basics of Transport
Basics of Congestion Control
Comments on Congestion Control

Duties of Transport
Demultiplexing:
- IP header points to protocol
- Transport header needs to demultiplex further
  • UDP: port
  • TCP: source and destination address/port
- Well-known ports and ephemeral ports
Data reliability (if desired):
- UDP: checksum, but no data recovery
- TCP: checksum and data recovery

TCP Header
Bit offsets: 0, 4, 10, 16, 31
  Source port | Destination port
  Sequence number
  Acknowledgement
  HdrLen | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
Sequence number, acknowledgement, and advertised window: used by sliding-window based flow control
Flags:
- SYN, FIN: establishing/terminating a TCP connection
- ACK: set when Acknowledgement field is valid
- URG: urgent data; Urgent Pointer says where non-urgent data starts
- PUSH: don't wait to fill segment
- RESET: abort connection

TCP Header (Cont)
Checksum: 1's complement, computed over
- TCP header
- TCP data
- Pseudo-header (from IP header)
  • Note: breaks the layering!
Pseudo-header layout:
  Source address
  Destination address
  0 | Protocol (TCP) | TCP Segment length

TCP Connection Establishment
Three-way handshake
- Goal: agree on a set of parameters: the start sequence number for each side
Client (initiator) to Server: SYN, SeqNum = x
Server to Client: SYN and ACK, SeqNum = y and Ack = x + 1
Client to Server: ACK, Ack = y + 1

TCP Issues
Connection confusion:
- ISNs can't always be the same
Source spoofing:
- Need to make sure ISNs are random
SYN floods:
- SYN cookies
State management with many connections
- Server-stateless TCP (NSDI 05)

TCP Flow Control
Make sure receiving end can handle data
Negotiated end-to-end, with no regard to network
Ends must ensure that no more than W packets are in flight
- Receiver ACKs packets
- When sender gets an ACK, it knows packet has arrived

Sliding Window
[Figure: sliding-window diagram; labels not recoverable from the extraction]

Observations
Throughput is ~ (w/RTT)
Sender has to buffer all unacknowledged packets, because they may require retransmission
Receiver may be able to accept out-of-order packets, but only up to its buffer limits

What Should the Receiver ACK?
1. ACK every packet, giving its sequence number
2. Use negative ACKs (NACKs), indicating which packet did not arrive
3. Use cumulative ACK, where an ACK for number n implies ACKs for all k < n
4. Use selective ACKs (SACKs), indicating those that did arrive, even if not in order

Error Recovery
Must retransmit packets that were dropped
To do this efficiently
- Keep transmitting whenever possible
- Detect dropped packets and retransmit quickly
Requires:
- Timeouts (with good timers)
- Other hints that packets were dropped

Timer Algorithm
Use exponential averaging:
[Equations not recoverable from the extraction]
Question: Why not set timeout to average delay?
[Accompanying notes not recoverable from the extraction]

Hints
When should I suspect a packet was dropped?
When I receive several duplicate ACKs
- Receiver sends an ACK whenever a packet arrives
- ACK indicates seq. no. of last consecutively received packet
- Duplicate ACKs indicate a missing packet

TCP Congestion Control
Can the network handle the rate of data?
Determined end-to-end, but TCP is making guesses about the state of the network
Two papers:
- Good science vs great engineering

Dangers of Increasing Load
[Figure: throughput versus load and delay versus load, marking the knee, the cliff, packet loss, and congestion collapse]
Knee: point after which
- Throughput increases very slowly
- Delay increases fast
Cliff: point after which
- Throughput starts to decrease very fast to zero (congestion collapse)
- Delay approaches infinity
In an M/M/1 queue
- Delay = 1/(1 - utilization)

Cong. Control vs. Cong. Avoidance
Congestion control goal
- Stay left of cliff
Congestion avoidance goal
- Stay left of knee
[Figure: throughput versus load, marking the knee, the cliff, and congestion collapse]

Control System Model [CJ89]
[Figure: users 1..n send at rates x1..xn; the network returns a feedback signal y indicating whether Σ x_i > X_goal]
Simple, yet powerful model
Explicit binary signal of congestion

Possible Choices
x_i(t+1) = a_I + b_I x_i(t)   (increase)
x_i(t+1) = a_D + b_D x_i(t)   (decrease)
Multiplicative increase, additive decrease
- a_I = 0, b_I > 1, a_D < 0, b_D = 1
Additive increase, additive decrease
- a_I > 0, b_I = 1, a_D < 0, b_D = 1
Multiplicative increase, multiplicative decrease
- a_I = 0, b_I > 1, a_D = 0, 0 < b_D < 1
Additive increase, multiplicative decrease
- a_I > 0, b_I = 1, a_D = 0, 0 < b_D < 1
Which one?

Multiplicative Increase, Additive Decrease
[Figure: User 2 rate x2 versus User 1 rate x1, with fairness line and efficiency line; trajectory through (x1h, x2h), (x1h + a_D, x2h + a_D), (b_I(x1h + a_D), b_I(x2h + a_D))]
Fixed point at x1h = x2h = b_I a_D / (1 - b_I)
Fixed point is unstable!

Multiplicative Increase, Multiplicative Decrease
[Figure: User 2 rate x2 versus User 1 rate x1, with fairness line and efficiency line; trajectory through (x1h, x2h), (b_D x1h, b_D x2h), (b_I b_D x1h, b_I b_D x2h)]
Converges to stable cycle, but is not fair

Additive Increase, Multiplicative Decrease
[Figure: User 2 rate x2 versus User 1 rate x1, with fairness line and efficiency line; trajectory through (x1h, x2h), (b_D x1h, b_D x2h), (b_D x1h + a_I, b_D x2h + a_I)]
Converges to stable and fair cycle
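The linear increase/decrease model above can be tried out numerically. The following is a minimal sketch, not part of the original slides: it assumes two users, a single shared capacity standing in for X_goal, and synchronous binary feedback (both users see "congested" whenever x1 + x2 exceeds the capacity). The function name `simulate`, the starting rates, and the parameter values are all illustrative assumptions.

```python
def simulate(x1, x2, capacity, a_i, b_i, a_d, b_d, steps):
    """Iterate x(t+1) = a + b * x(t) for two users sharing one link.

    Both users apply the decrease rule when the total load exceeds
    `capacity` (the binary congestion signal), else the increase rule.
    """
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = a_d + b_d * x1, a_d + b_d * x2
        else:
            x1, x2 = a_i + b_i * x1, a_i + b_i * x2
    return x1, x2


# Hypothetical starting point: user 2 begins with far more than user 1.
START = (1.0, 9.0)
CAPACITY = 10.0

# Additive increase, multiplicative decrease: a_I > 0, b_I = 1, a_D = 0, 0 < b_D < 1.
aimd = simulate(*START, CAPACITY, a_i=1.0, b_i=1.0, a_d=0.0, b_d=0.5, steps=200)

# Additive increase, additive decrease: a_I > 0, b_I = 1, a_D < 0, b_D = 1.
aiad = simulate(*START, CAPACITY, a_i=1.0, b_i=1.0, a_d=-1.0, b_d=1.0, steps=200)

print("AIMD gap between users:", abs(aimd[0] - aimd[1]))
print("AIAD gap between users:", abs(aiad[0] - aiad[1]))
```

Under these assumptions every multiplicative decrease shrinks the gap between the two users while additive steps leave it unchanged, which is why the additive-increase, multiplicative-decrease run drifts toward the fairness line and the purely additive run does not.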
Additive Increase, Additive Decrease
[Figure: User 2 rate x2 versus User 1 rate x1, with fairness line and efficiency line; trajectory through (x1h, x2h), (x1h + a_D, x2h + a_D), (x1h + a_D + a_I, x2h + a_D + a_I)]
Reaches stable cycle, but does not converge to fairness

Modeling
Critical to understanding complex systems
- [CJ89] model still relevant after 15 years, a 10^6 increase in bandwidth, and a 1000x increase in number of users
Criteria for good models
- Two conflicting goals: reality and simplicity
- Realistic, complex model → too hard to understand, too limited in applicability
- Unrealistic, simple model → can be misleading

TCP Congestion Control
[CJ89] provides theoretical basis for basic congestion avoidance mechanism
Must turn this into real protocol

TCP Congestion Control
Maintains three variables:
- cwnd: congestion window
- flow_win: flow window; receiver advertised window
- ssthresh: threshold size (used to update cwnd)
For sending, use: win = min(flow_win, cwnd)

Slow Start Example
The congestion window size grows very rapidly
TCP slows down the increase of cwnd when cwnd >= ssthresh
cwnd = 1: send segment 1; ACK 2 arrives
cwnd = 2: send segments 2-3; ACK 4 arrives
cwnd = 4: send segments 4-7; ACK 8 arrives
cwnd = 8

Congestion Avoidance
Slow down "Slow Start"
ssthresh is lower-bound guess about location of knee
If cwnd > ssthresh then each time a segment is acknowledged increment cwnd by 1/cwnd (cwnd += 1/cwnd).
So cwnd is increased by one only after all segments in the current window have been acknowledged.

TCP: Slow Start
Goal: reach knee quickly
Upon starting (or restarting):
- Set cwnd = 1
- Each time a segment is acknowledged increment cwnd by one (cwnd++).

Slow Start/Congestion Avoidance Example
Assume that ssthresh = 8
cwnd = 1
cwnd = 2
cwnd = 4
cwnd = 8
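The slow-start and congestion-avoidance rules above can be traced one round trip at a time. This is a minimal sketch under stated assumptions rather than the lecture's own code: it assumes no losses, one ACK per segment, and a full window sent every round trip. It uses `cwnd < ssthresh` as the slow-start test (matching the "cwnd >= ssthresh" note in the Slow Start Example), and it replaces the fractional `cwnd += 1/cwnd` update with an ACK counter, which implements the statement that cwnd grows by one only after a whole window has been acknowledged. The names `on_ack` and `window_trace` are hypothetical.

```python
def on_ack(cwnd, acked, ssthresh):
    """Apply one ACK; return the new (cwnd, acked-in-this-window) pair."""
    if cwnd < ssthresh:
        # Slow start: every ACK grows the window by one segment.
        return cwnd + 1, 0
    # Congestion avoidance: count ACKs and grow by one segment per full window.
    acked += 1
    if acked >= cwnd:
        return cwnd + 1, 0
    return cwnd, acked


def window_trace(ssthresh, rounds):
    """Return cwnd at the start of each round trip, beginning with cwnd = 1."""
    cwnd, acked = 1, 0
    trace = [cwnd]
    for _ in range(rounds):
        # One ACK comes back for each segment sent at the start of the round.
        for _ in range(cwnd):
            cwnd, acked = on_ack(cwnd, acked, ssthresh)
        trace.append(cwnd)
    return trace


print(window_trace(ssthresh=8, rounds=5))
```

With ssthresh = 8 the trace doubles up to the threshold and then grows by one segment per round trip, the same shape as the example values on the slide.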
TCP: Slow Start (Cont)
Slow Start is not actually slow
- cwnd increases exponentially

Slow Start/Congestion Avoidance Example (Cont)
cwnd = 9
cwnd = 10
[Figure: cwnd (in segments, 0 to 14) versus round-trip times, with ssthresh marked]

Putting Everything Together: TCP Pseudocode
Initially:
    cwnd = 1;
    ssthresh = infinite;
New ack received:
    if (cwnd < ssthresh)
        /* Slow Start */
        cwnd = cwnd + 1;
    else
        /* Congestion Avoidance */
        cwnd = cwnd + 1/cwnd;
Timeout:
    /* Multiplicative decrease */
    ssthresh = cwnd/2;
    cwnd = 1;

while (next < unack + win)
    transmit next packet;
where win = min(cwnd, flow_win);
[Figure: sequence-number line marking unack, next, and the window win]

The big picture
[Figure: cwnd versus time, showing Slow Start, Congestion Avoidance, and Timeout]

Fast Retransmit
Don't wait for window to drain
Resend a segment after 3 duplicate ACKs
cwnd = 1: segment 1; ACK 2
cwnd = 2: segments 2, 3; ACK 3, ACK 4
cwnd = 4: segments 4, 5, 6, 7; ACK 4, ACK 4, ACK 4 (3 duplicate ACKs)

Fast Recovery
After a fast retransmit, halve cwnd instead of resetting it (ssthresh = cwnd/2; cwnd = ssthresh)
- i.e., don't reset cwnd to 1
But when RTO expires still do cwnd = 1
Fast Retransmit and Fast Recovery
- Implemented by TCP Reno
- Most widely used version of TCP today
Lesson: avoid RTOs at all costs!

Fast Retransmit and Fast Recovery
[Figure: cwnd versus time, showing Slow Start followed by Congestion Avoidance]
Retransmit after 3 duplicate ACKs
- Prevent expensive timeouts
No need to slow start again
At steady state, cwnd oscillates around the optimal window size.

Engineering vs Science in CC
Great engineering built useful protocol:
- TCP Reno, etc.
Good science by CJ and others
- Basis for understanding why it works so well

Behavior of TCP
Are packets smoothly paced?
- NO! Ack-compression
Are long-lived flows nicely interleaved?
- NO!
How does throughput depend on drop rate?

TCP: Cooperation and Compatibility
TCP assumes all flows employ TCP-like congestion control
- TCP-friendly or TCP-compatible
Selfish flows: can get all the bandwidth they like
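The pseudocode and the fast retransmit and fast recovery rules above can be combined into one small state holder. This is a hedged sketch, not TCP Reno as actually implemented: it tracks only cwnd and ssthresh in units of segments, ignores the flow window and actual retransmission, and assumes that the window reduction after a fast retransmit means ssthresh = cwnd/2 followed by cwnd = ssthresh. The class name `RenoWindow` and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RenoWindow:
    """Congestion-window bookkeeping only; no packets are sent."""

    cwnd: float = 1.0
    ssthresh: float = float("inf")
    dup_acks: int = 0

    def on_new_ack(self):
        """A new (non-duplicate) ACK arrived."""
        self.dup_acks = 0
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance

    def on_dup_ack(self):
        """A duplicate ACK arrived; return True if it triggers fast retransmit."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            # Assumed fast-recovery rule: halve the window, skip slow start.
            self.ssthresh = self.cwnd / 2.0
            self.cwnd = self.ssthresh
            return True
        return False

    def on_timeout(self):
        """The retransmission timer expired: multiplicative decrease, restart."""
        self.ssthresh = self.cwnd / 2.0
        self.cwnd = 1.0
        self.dup_acks = 0


w = RenoWindow()
for _ in range(7):
    w.on_new_ack()                        # slow start from cwnd = 1
triggered = [w.on_dup_ack() for _ in range(3)]
print(w.cwnd, w.ssthresh, triggered)
```

Comparing `on_dup_ack` with `on_timeout` shows the point of the "avoid RTOs" lesson: both halve ssthresh, but only the timeout sends cwnd back to 1 and forces a new slow start.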
TCP: Cooperation and Compatibility (Cont)
If new congestion control algorithms are developed, they must be TCP-friendly

Behavior of TCP (Cont)
Throughput as a function of drop rate d: Tput ~ 1/sqrt(d)

Extensions to TCP
Selective acknowledgements: TCP SACK
Explicit congestion notification: ECN
Delay-based congestion avoidance: TCP Vegas
Discriminating between congestion losses and other losses: cross-layer signaling and guesses
Randomized drops (RED) and other router mechanisms

Issues with TCP
Fairness:
- Throughput depends on RTT
High speeds:
- To reach 10 Gbps, packet losses can occur at most once every 90 minutes!
Short flows:
- How to set initial cwnd properly
What about flows that want congestion control, but don't want reliable delivery?
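The relation Tput ~ 1/sqrt(d) and the high-speed figure above can be connected with a short calculation. The slides give only the proportionality, so this sketch fills in the rest with assumptions: the widely used steady-state approximation Tput ≈ (MSS/RTT) * sqrt(3/2) / sqrt(d), 1500-byte packets, and a 100 ms round-trip time. The function names are hypothetical.

```python
import math

# Constant from the simple periodic-loss model (an assumption; the slides
# state only the 1/sqrt(d) proportionality).
C = math.sqrt(1.5)


def throughput_bps(mss_bytes, rtt_s, drop_rate):
    """Approximate steady-state throughput in bits per second."""
    return C * mss_bytes * 8 / (rtt_s * math.sqrt(drop_rate))


def drop_rate_for(target_bps, mss_bytes, rtt_s):
    """Invert the model: the drop rate at which the target rate is sustained."""
    return (C * mss_bytes * 8 / (rtt_s * target_bps)) ** 2


def minutes_between_losses(target_bps, mss_bytes, rtt_s):
    """Average time between packet losses at the target rate."""
    d = drop_rate_for(target_bps, mss_bytes, rtt_s)
    packets_per_second = target_bps / (mss_bytes * 8)
    return 1.0 / (d * packets_per_second) / 60.0


# Assumed path: 1500-byte packets, 100 ms RTT, 10 Gbps target.
gap = minutes_between_losses(10e9, mss_bytes=1500, rtt_s=0.1)
print(f"drop rate: {drop_rate_for(10e9, 1500, 0.1):.3g}")
print(f"minutes between losses: {gap:.1f}")
```

Under these assumptions the spacing between losses comes out at a little over 90 minutes, the same order as the figure quoted on the slide; a different packet size or round-trip time would change the result.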