How Linux Community Fixed Bufferbloat?


Stephen Hemminger [email protected] #LinuxPiter

Bufferbloat
● What is the problem
● What causes the problem
● Solutions
  – Demonstration
● Recent Progress

Network Performance
● Fairness
● Throughput (Mbits/sec)
● Latency (Round Trip Time)

How bad is it?

Why is it worse now?
● No longer in FTP universe
● TCP Initial Window Size = 10
● Advanced TCP algorithms
● Network Offload packet trains
● Bad benchmarks
● Router/switch memory is cheap
● Everyone fears dropping packets

Head of Line blocking

Queuing Theory Basics
$$\text{average time in queue} = \frac{\text{utilization} / \text{service rate}}{1 - \text{utilization}}$$
From Fred Baker: Bufferbloat! Graphic courtesy Sprint, Apricot 2004

Netalyzr: Downstream

Netalyzr: Upstream

TCP throughput dynamics
$$\text{mean throughput} = \frac{\text{effective window}}{\text{mean round trip time}}$$
[Figure: Increasing Measurable Throughput (y-axis) against Increasing TCP Window (x-axis); labels: Bottleneck Capacity, “knee”, Queue Depth, “cliff”]
From Fred Baker: Bufferbloat!

Blame Linux
● Windows XP
  – Maximum window 64k
● Windows 7
  – Bandwidth limit to 80 Mbits
● Android
  – Receive window limited

Blame the customer
● Customers call support
● Applications are using more bandwidth
● Block and charge

Why Queueing?
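The two formulas on the "Queuing Theory Basics" and "TCP throughput dynamics" slides can be tried out numerically. The following is a minimal sketch, not something from the talk: the function names, the units (packets per second, bytes, seconds) and the sample values are assumptions chosen for illustration.

```python
def average_time_in_queue(utilization: float, service_rate: float) -> float:
    """Average time in queue = (utilization / service rate) / (1 - utilization).

    utilization  -- fraction of link capacity in use, 0 <= utilization < 1
    service_rate -- packets served per second (assumed unit)
    Returns seconds spent waiting in the queue.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    if service_rate <= 0.0:
        raise ValueError("service_rate must be positive")
    return (utilization / service_rate) / (1.0 - utilization)


def mean_throughput(effective_window: float, mean_round_trip_time: float) -> float:
    """Mean throughput = effective window / mean round trip time.

    effective_window     -- bytes in flight (assumed unit)
    mean_round_trip_time -- seconds
    Returns bytes per second.
    """
    if mean_round_trip_time <= 0.0:
        raise ValueError("mean_round_trip_time must be positive")
    return effective_window / mean_round_trip_time


if __name__ == "__main__":
    # Hypothetical link that serves 1000 packets per second.
    for rho in (0.5, 0.9, 0.99):
        delay_ms = average_time_in_queue(rho, 1000.0) * 1000.0
        print(f"utilization {rho:.2f}: about {delay_ms:.1f} ms in queue")

    # Same 64 KiB window, with the round trip time inflated by queuing.
    for rtt in (0.05, 0.5, 1.0):
        rate = mean_throughput(64 * 1024, rtt)
        print(f"RTT {rtt:.2f} s: about {rate / 1000.0:.0f} kB/s")
```

The first loop shows the delay growing sharply as utilization approaches 1, which is why a deep buffer behind a busy link produces large latency. The second shows that, for a fixed window, every extra bit of queuing delay in the round trip time lowers throughput.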
Priority Queue: Theory

Priority Queue: Reality

Hierarchical Token Bucket

Random Early Detect

Explicit Congestion Notification
[Figure label: PACKETS]

Ideal Active Queue Management
● Fair
  – All flows get some bandwidth
● Simple
  – No tuning
● Easy to deploy
  – No special hardware, no protocol changes
● Reasonable
  – Won't create multi-second latency

Stochastic Fair Queue

Codel

PIE

Cake
[Figure labels: CoDel, Fair Queue, Priority]

Back to Reality

Benchmark issues
● Bad
  – Bytes/sec
  – Packets/sec
  – Latency
● Good
  – Throughput + Latency
  – Multiple connections
  – Real not simulated

Better tools
● DSLreport's
  – http://www.dslreports.com/speedtest/
● ICSI netalyzr
  – http://netalyzr.icsi.berkeley.edu/
● Flent: The Flexible Network Tester
  – https://flent.org/
  – RRUL - Real time Response Under Load

DSLReports speedtest
5 sec delay!!

FIFO – the default
Ping 1sec!
[Figure panels: Upload, Download]
Høiland-Jørgensen T., Battling Bufferbloat

Stochastic Fair Queue
Høiland-Jørgensen T., Battling Bufferbloat

Controlled Delay - codel
Høiland-Jørgensen T., Battling Bufferbloat

Fair Queue Controlled Delay
Høiland-Jørgensen T., Battling Bufferbloat

Recent

Once you start looking, Bufferbloat can be everywhere
● Edge
  – Home router
● Provider
● Network
● Accelerators
  – Caches
  – Load balancers, ...
● Servers

Linux Bufferbloat fixes: 2011-2016
● Linux 3.3: Byte Queue Limits
● Linux 3.4: RED bug fixes & IW10 added & SFQRED
● Linux 3.5: Fair/Flow Queuing packet scheduling (fq_codel, codel)
● Linux 3.7: TCP small queues (TSQ)
● Linux 3.12: TSO/GSO improvements
● Linux 3.13: Host FQ + Pacing (sch_fq)
● Linux 3.15: Change to microseconds from milliseconds throughout networking kernel
● Linux 3.17: Network Batching API
● The Linux stack is now mostly “pull through”, where it used to be “push”, and looks nothing like it did 6 years ago.
● At least a dozen other improvements I forget
● Linux 4.8: TCP BBR
… (and BSD just got fq_codel!)
● Basically everything, except WiFi (and LTE), can be debloated now.
  – And we just made a big dent in WiFi

Bufferbloat in Wi-Fi
● 1+sec latency: Linux 4.4
● Sub 40-ms: Linux 4.9?

How is Bufferbloat Solved on Linux?
● Queuing disciplines
  – Codel, PIE, ...
● Linux internal
  – BQL, FQ, Pacing
● Enable good defaults
  – ecn
  – fq_codel

Questions?

Thank you
Stephen Hemminger @networkplumber [email protected]

Bufferbloat resources
● Bufferbloat.net: http://bufferbloat.net
● Email Lists: http://lists.bufferbloat.net
● CeroWrt: http://www.bufferbloat.net/projects/cerowrt
● Other talks: http://mirrors.bufferbloat.net/Talks
● Jim Gettys Blog: http://gettys.wordpress.com

A big thanks to the bloat mailing list, Jim, Kathie, Van, Dave, Eric, ISC, and all the other CeroWrt/OpenWrt contributors.
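The "Enable good defaults" slide names ecn and fq_codel but does not say how to check them. As a hedged sketch: on Linux the relevant settings are normally exposed as the sysctls net.core.default_qdisc and net.ipv4.tcp_ecn under /proc/sys. The helper names below (read_sysctl, report_defaults) are hypothetical and not taken from the talk; the code only reads the values and does not change them.

```python
from pathlib import Path
from typing import Dict, Optional


def read_sysctl(name: str, root: str = "/proc/sys") -> Optional[str]:
    """Return the value of a dotted sysctl name, or None if it cannot be read.

    Assumes the usual mapping of dots to path separators under /proc/sys,
    which holds for the two keys used here.
    """
    path = Path(root).joinpath(*name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        # Not Linux, key missing, or not permitted to read it.
        return None


def report_defaults(root: str = "/proc/sys") -> Dict[str, Optional[str]]:
    """Collect the two settings named on the slide."""
    return {
        # Queuing discipline attached to new interfaces, e.g. "fq_codel".
        "net.core.default_qdisc": read_sysctl("net.core.default_qdisc", root),
        # 0 = ECN off, 1 = request and accept ECN, 2 = accept only.
        "net.ipv4.tcp_ecn": read_sysctl("net.ipv4.tcp_ecn", root),
    }


if __name__ == "__main__":
    for key, value in report_defaults().items():
        print(f"{key} = {value if value is not None else 'unavailable'}")
```

On a host where these report something other than fq_codel or an ECN-enabled value, the settings can usually be changed with the system's sysctl tooling; how to persist them depends on the distribution.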