Improving Latency with Active Queue Management (AQM) During COVID-19 By Allen Flickinger, Carl Klatsky, Atahualpa Ledesma, Jason Livingood, Sebnem Ozer 30 July 2021 1. Abstract During the COVID-19 pandemic the Comcast network performed well in response to unprecedented changes in Internet usage1 2 3 4. Video communications applications such as Zoom, WebEx, and FaceTime have exploded in popularity – applications that happen to be sensitive to network latency. However, in today’s typical networks – such as a home network – those applications often degrade if they are sharing a network link with other network traffic. This is a problem caused by a network design flaw often described using the term ‘buffer bloat’. Several years ago, Comcast helped to fund research & development in the technical community into new Active Queue Management (AQM) techniques to eliminate this issue and this was later built into Data Over Cable Service Interface Specification (DOCSIS) standards5 6 7. Just prior to the pandemic, Comcast also deployed a large-scale network performance measurement system that included a latency under load test. In addition, Comcast happened to deploy two otherwise identical cable modems; one with upstream AQM enabled, and the other without. This fortuitous confluence of events has enabled Comcast to perform a comparative analysis of the differences between cable modem gateways using AQM with those that lack that enhancement, at a unique level of scale across many months of time and millions of devices. These deployments were completed during a period of unprecedented network bandwidth demand growth due to work and school from home during the pandemic. The data reveals significantly better upstream latency under load performance when the Proportional Integral Controller Enhanced (DOCSIS-PIE) AQM is used. For the device with AQM, the large majority of the latency under load tests resulted in 15-30 milliseconds of latency. In comparison, the device without AQM averaged roughly 250 milliseconds of latency – between 8-16 times higher – a highly significant difference to the end user quality of experience (QoE). As a result, users of this XB6 device with AQM enabled will have a far superior experience when using all Internet applications. Delay is sometimes thought to affect primarily latency-sensitive applications such as video conferencing and online gaming, but consistent low delay provides a better user experience for any Internet application where the user is waiting to see the result of some operation, such as web browsing, stock quotes, weather forecasts, maps, 1 See 2020 network performance report at https://corporate.comcast.com/press/releases/comcast-2020-network- performance-data 2 See report on Pandemic Network Performance from the Broadband Internet Technical Advisory Group (BITAG) at https://www.bitag.org/2020-pandemic-network-performance.php 3 See report from an Internet Architecture Board workshop on COVID at https://www.rfc- editor.org/rfc/rfc9075.html 4 See Comcast-specific reports at https://corporate.comcast.com/covid-19/network/may-20-2020 and https://corporate.comcast.com/stories/covid-19-network-report-smart-network-speed-and-stability 5 https://community.cablelabs.com/wiki/plugins/servlet/cablelabs/alfresco/download?id=3a751ba9-593c-464a-bc91- 3675e1824e30 6 https://tools.ietf.org/id/draft-white-aqm-docsis-pie-00.html, 7 https://www-res.cablelabs.com/wp-content/uploads/2019/02/28094021/DOCSIS-AQM_May2014.pdf Improving Latency with Active Queue Management (AQM) During COVID-19 driving directions, etc. Even video streaming applications benefit from lower network delay (latency) because they need less initial buffering time before a video can begin playing. These large-scale measurement comparisons should provide additional data to justify accelerated deployment of AQM in DOCSIS and other Internet Service Provider (ISP) networks, particularly in the equipment installed in customer homes, including user-purchased home network equipment. 2. Introduction In 2020, the COVID-19 pandemic emerged as an extremely disruptive global event, leading to seismic changes in everyone’s day-to-day lives. In particular, the pandemic prompted a sudden, massive shift to home-based work & school. This led to unprecedented increases in network usage for ISPs across the world, including significant new demands on home networks and local access network segments. The key application to emerge and achieve mainstream adoption was real-time video conferencing, such as Zoom, which on the Comcast network increased in volume by 316% between February and October 20208. As opposed to other types of applications, such as overnight online backups or background downloads of video game updates, the quality of the end user experience is highly delay-sensitive – meaning it is significantly affected by any network latency. Even simple web browsing is improved by lower network delays, though in this case users have been trained over the last 25 years to simply to be more patient when web pages are sometimes slow to load for no apparent reason. While it has always been the case that real-time communications traffic is latency-sensitive, widespread adoption of video conferencing in parallel with significantly increased competing LAN traffic in a home during the COVID-19 pandemic has created a perfect storm of using latency-sensitive applications while other Internet traffic is sharing the home network. This can lead users to experience a degraded video conferencing experience – because of “latency under load” or higher “working latency” – during times when more than one kind of traffic is sharing the network. High latency under load and buffer bloat9, are issues that the networking community has been aware of for many years and Comcast and other ISPs and software developers have been working on solutions to this challenge. In the case of DOCSIS networks, the solution has been to shift from legacy Tail-Drop First-In First-Out (FIFO) queues and configurable buffer control in DOCSIS 3.0, to Active Queue Management (AQM)10 – a feature that is implemented DOCSIS and many other networks and software systems. Enabling AQM on a network interface that suffers from poor responsiveness under normal working conditions improves the performance of all user-facing network applications, and especially real-time, latency-sensitive applications such 8 Based on analysis of aggregate network changes from NetFlow, interconnection traffic exchanGe, and similar data sources. 9 See Buffer Bloat article, “Buffer Bloat: Dark Buffers in the Internet” by Jim Gettys in the November 29, 2011 issue of Communications of the ACM at https://queue.acm.org/detail.cfm?id=2071893 10 RFC 7567 at https://www.rfc-editor.org/rfc/rfc7567.html is a great backGround to why AQM is necessary, so is an excellent starting point for readers learning about the AQM and the context into which it fits in the history of the Internet. Page 2 of 19 Improving Latency with Active Queue Management (AQM) During COVID-19 as video conferencing, without any negative effects on overall throughput or on delay-tolerant applications. This paper explains how and where AQM was deployed in the Comcast network. It also compares latency under load performance measurements for those devices that utilized AQM to those without AQM support. 3. Developments in 2020 In early 2020 Comcast implemented a new cable modem-based Quality of Experience (QoE) measurement system. This QoE measurement system, notably for this report, included a latency under load measurement11. The measurement system was deployed in early 2020 prior to the pandemic 12 13. It consisted of millions of devices able to run performance tests, thus creating a baseline of latency under load measurements prior to the pandemic. In addition, of the range of Comcast cable modem gateways and retail cable modems, the Comcast XB6 device has two distinct model variants – one with upstream AQM enabled during the measurement period in 2020 and one without. This provided a fortuitous opportunity for a uniquely high scale side-by-side performance comparison for a specific model of Comcast home gateway at a time of high demand for latency-sensitive applications. Comcast also in late 2020 deployed and optimized downstream AQM on all Arris E6000 Cable Modem Termination Systems (CMTSes)14. Looking at the events and changes of the year 2020, it is clear that latency-sensitive applications such as video conferencing are here to stay. As well, implementing AQM and other low latency improvements can very positively impact end user QoE. Such improvements can also be deployed independently by ISPs and equipment manufacturers, with no need to coordinate with application developers15. This should provide motivation for network operators to deploy AQM and other low latency improvements and we believe that latency under load performance will emerge as a competitive differentiator – perhaps eventually on equal footing as connection speed in the minds of consumers once they learn how critical latency under load is to their perception of performance. 11 The test uses the open Realtime Response Under Load (RRUL) specification per https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/ 12 See NetForecast’s independent audit of the design of this measurement system at https://www.netforecast.com/netforecast-design-audit-report-of-comcasts-network-performance-measurement- system/ 13 See NetForecast’s independent assessment of performance results during the pandemic at https://www.netforecast.com/netforecasts-report-on-comcasts-network-performance-measurement-system-results-
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages19 Page
-
File Size-