Experimental Evaluation of an Adaptive Flash Crowd Protection System

Experimental Evaluation of an Adaptive Flash Crowd Protection System Xuan Chen and John Heidemann ∗ ISI-TR-2003-573 July 2003 Abstract target server requests responses networks Network early warning system (NEWS) is an adaptive admission control scheme that protects server and net- ... ... ... ... works from overloading during flash crowds, and maintains high performance for accepted requests. Unlike packets dropped other admission control systems, NEWS regulates requests by observing response performance, automatically Figure 1: Several factors affect end-user web performance adapting to changing traffic mixes. We have previously during flash crowds. studied NEWS through simulation; this paper presents an implementation of NEWS on a Linux-based router and evaluates that implementation in testbed experiments NEWS request Regulation with HTTP server log recorded during a flash crowd. NEWS This paper has three contributions. First, we use the ack implementation to evaluate scenarios not considered in TCP simulation. In addition to validating our previous simu- ... ... TCP TCP Congestion lation results in network-limited scenario quantitatively, ... ... client Control server we further consider server memory-limited scenario, con- TCP ... ... firming that NEWS is effective in both cases. Second, we ack evaluate the run-time cost of NEWS traffic monitoring in Measurement response practice, and find that it consumes little CPU time and relatively small memory. Finally, we extend core NEWS (a) algorithms to include hot-spot identification function to (b) protect bystander traffic from flash crowds efficiently. Figure 2: NEWS enforces high-level control among indi- 1 Introduction vidual connections, and between requests and responses. We design Network Early Warning System (NEWS) to adaptively protect server and networks from persistent overloading during flash crowds [1, 2]. tion in response network path [3]. As a result, most Flash crowds usually happen when many end-users end-users perceive unacceptably poor performance. simultaneously send requests to one web site (tar- In the mean while, flash crowds unintentionally deny get server) because of sudden events such as earth- services for other end-users who either share common quakes, breaking news stories, or links from popular links with flash crowd traffic or retrieve unrelated in- web sites (slash-dot effect). As shown in Figure 1, formation from target server. target server may reject some excessive requests, and Flash crowds are caused by too many concurrent process other accepted ones slowly due to either its re- connections, even though each of them is well be- source limitation (CPU, memory or disk), or conges- haved under TCP congestion control. Based on this observation, we argue that a high-level control ∗Xuan Chen and John Heidemann are with University of is essential to mitigate flash crowds. By deploy- Southern California, Information Sciences Institute. This ma- ing NEWS, we impose such a super-visioning control terial is based upon work supported by DARPA via the Space from two aspects: it enforces cooperation among in- and Naval Warfare Systems Center San Diego under Contract No. N66001-00-C-8066 (\SAMAN"), and by the CONSER dividual TCP connections (Figure 2(a)); it further project supported by NSF. coordinates requests and responses (Figure 2(b)) so 1 that neither target server nor network links are over- Second, by implementing NEWS on a Linux-based loaded. router, we examine NEWS overhead on router. We We design NEWS as a router-based adaptive ad- measure CPU and memory usage of NEWS under dif- mission control scheme. It admits incoming requests ferent detection intervals. Our statistics show that based on measurement of response performance. This NEWS only consumes less than 5% of CPU time approach is different from previous admission con- and 3{10M byte memory. Overall, NEWS is a rela- trol algorithms considering explicit service require- tive light-weighted scheme and applicable to real net- ment or measuring performance of incoming traffic works. directly [4, 5, 6, 7]. Moreover, NEWS only measures Third, we extend NEWS to automatically detect aggregate performance for fast connections (details target server by extracting common characteristics in Section 3.1), rather than for all existing flows like among incoming requests. In the case of multiple measurement-based admission control [6, 7, 8]. servers connecting through NEWS router, this hot- Another novel aspect of NEWS is that it detects spot identification algorithm helps NEWS to regulate flash crowds through performance change in response incoming requests intelligently, that is, automatically traffic. This is different from traditional detection identifying one hot server even if many other servers schemes that monitor increase in request rate di- are operational behind the NEWS router. Our re- rectly [9, 10]. As shown in Figure 1, many factors sults show that this function can protect traffic to affect end-user perceived performance, including re- nearby, bystander servers efficiently. Compared to quest rate, server's capacity, and network bandwidth. original NEWS algorithms, it reduces average end- As a result, it is difficult to determine a single optimal to-end latency for bystander web traffic by about 17 threshold in request rate to detect flash crowds. Even times (Section 5.6). if an optimal threshold exists, it changes over times due to variation in response sizes [11]. In another 2 Related Work word, request rate is not necessarily correlated with response performance. Therefore, we believe that we In this section, we briefly describe mechanisms should detect flash crowd by monitoring performance to accommodate flash crowds through resource pro- changes in response traffic. visioning. We also review other commonly used We have developed basic flash crowd detection and schemes to protect servers and networks from over- mitigation algorithms for NEWS in ns [12], and eval- loading, such as admission control, congestion con- uate NEWS performance with simulations [13]. In trol, and server overload control. this paper, we report our experience of implementing Infrastructure vendors such as Akamai [14] deploy NEWS on a Linux-based router, and summarize re- web caches and content delivery networks (CDN) sults from testbed experiments. This work has three to accommodate excessive web requests during flash major contributions. crowds. However, recent studies [1, 15] show that First, with testbed experiments, we validate current web caching schemes are not so efficient to NEWS performance in a network-limited scenario deal with flash crowds as claimed. One important quantitatively in a more realistic experimental model reason is that most web pages are not cached be- than simulations. We also configure a server-limited fore flash crowds (breaking news, for example). As scenario, where request processing exhausts server's a result, requests still reach target server eventually. memory. We investigate system performance of tar- Jung et. al. proposed adaptive web caching [1] for get server under flash crowds in details. improvement. We show that NEWS is an adaptive system; ef- It is necessary to provide enough resources to pre- fectively prevents server and network overloading in vent flash crowds from happening, and accommodate both scenarios. More specifically, NEWS detects excessive traffic afterward. However, due to the na- flash crowds around one detection interval (88 sec- ture of flash crowds, there are circumstances that it onds with 64 seconds' detection interval). It au- is either difficult or impossible to provide or even es- tomatically regulates incoming requests to a proper timate the \enough" resources [2, 16]. Therefore, we rate. As a result, NEWS protects target server and believe that we need to deploy control scheme like networks from overloading by discarding about 49% NEWS to mitigate flash crowds in these cases so that excessive requests. NEWS also maintains high per- servers and networks survive from overloading, and formance for admitted requests: increasing aggregate some end-users still perceive reasonable high perfor- response rate for high-bandwidth connections by two mance. times. This performance is comparable to the best Due to large number of concurrent connections possible static rate limiter in the same scenario. during flash crowds, per-connection (such as TCP) 2 and per-host (such as congestion manager (CM) [17]) 500 based congestion control algorithms are not sufficient 450 to protect network from overloading. On the other 400 hand, we can apply algorithms like aggregate-based 350 congestion control (ACC) [18] to relief congestion caused by response traffic. However, since ACC does 300 not reduce the volume of incoming requests, it can 250 not mitigate flash crowds either. Therefore, we em- 200 phasize that a high-level control between requests and 150 responses is essential to protect servers and networks 100 from flash crowds. 50 Admission control [4, 5, 6, 7] is important to sup- ARR (Kbps) for HBCs under Flash Crowds 0 0 500 1000 1500 2000 port quality-of-service. Based on service requirement and current available resources (network bandwidth or circuits), admission control makes decision to ac- Figure 3: Aggregate response rate of high-bandwidth connections drops dramatically under flash crowds. cept incoming connections or not. Although appro- priate in integrated service networks [19] and public telephone networks [20], it is difficult for some ap- we developed in simulation [13]. Rather than dis- plications to accurately

Load more