BenchLab: An Open Testbed for Realistic Benchmarking of Web Applications

Emmanuel Cecchet, Veena Udayabhanu, Timothy Wood, Prashant Shenoy
University of Massachusetts Amherst
{cecchet,veena,twood,shenoy}@cs.umass.edu

Abstract

Web applications have evolved from serving static content to dynamically generating Web pages. Web 2.0 applications include JavaScript and AJAX technologies that manage increasingly complex interactions between the client and the Web server. Traditional benchmarks rely on browser emulators that mimic the basic network functionality of real Web browsers but cannot emulate the more complex interactions. Moreover, experiments are typically conducted on LANs, which fail to capture the real latencies perceived by users geographically distributed on the Internet. To address these issues, we propose BenchLab, an open testbed that uses real Web browsers to measure the performance of Web applications. We show why using real browsers is important for benchmarking modern Web applications such as Wikibooks and demonstrate geographically distributed load injection for modern Web applications.

1. Introduction

Over the past two decades, Web applications have evolved from serving primarily static content to complex Web 2.0 systems that support rich JavaScript and AJAX interactions on the client side and employ sophisticated architectures involving multiple tiers, geographic replication and geo-caching on the server end. From the server backend perspective, a number of Web application frameworks have emerged in recent years, such as Ruby on Rails, Python Django and PHP Cake, that seek to simplify application development. From a client perspective, Web applications now support rich client interactivity as well as customizations based on browser and device (e.g., laptop versus tablet versus smartphone). The emergence of cloud computing has only hastened these trends; today's cloud platforms (e.g., Platform-as-a-Service clouds such as Google AppEngine) support easy prototyping and deployment, along with advanced features such as autoscaling.

To fully exploit these trends, Web researchers and developers need access to modern tools and benchmarks to design, experiment with, and enhance modern Web systems and applications. Over the years, a number of Web application benchmarks have been proposed for use by the community. For instance, the research community has relied on open-source benchmarks such as TPC-W [16] and RUBiS [13] for a number of years; however, these benchmarks are outdated and do not fully capture the complexities of today's Web 2.0 applications and their workloads. To address this limitation, a number of new benchmarks have been proposed, such as TPC-E, SPECWeb2009 or SPECjEnterprise2010. However, the lack of open-source or freely available implementations of these benchmarks has meant that their use has been limited to commercial vendors. CloudStone [15] is a recently proposed open-source cloud/Web benchmark that addresses some of the above issues; it employs a modern Web 2.0 application architecture with load injectors relying on a Markov model to model user workloads. CloudStone, however, does not capture or emulate client-side JavaScript or AJAX interactions, an important aspect of today's Web 2.0 applications and an aspect that has implications on the server-side load.

In this paper, we present BenchLab, an open testbed for realistic Web benchmarking that addresses the above drawbacks. BenchLab's server component employs modern Web 2.0 applications that represent different domains; currently supported server backends include Wikibooks (a component of Wikipedia) and CloudStone's Olio social calendaring application, with support for additional server applications planned in the near future. BenchLab exploits modern virtualization technology to package its server backends as virtual appliances, thereby simplifying the deployment and configuration of these server applications in laboratory clusters and on public cloud servers. BenchLab supports Web performance benchmarking "at scale" by leveraging modern public clouds: a number of cloud-based client instances, possibly in different geographic regions, perform scalable load injection. Cloud-based load injection is cost-effective, since it does not require a large hardware infrastructure, and it also captures Internet round-trip times. In the design of BenchLab, we make the following contributions:

- We provide empirical results on the need to capture the behavior of real Web browsers during Web load injection. Our results show that traditional trace replay methods are no longer able to faithfully emulate modern workloads and exercise the client- and server-side functionality of modern Web applications. Based on this insight, we design BenchLab to use real Web browsers, in conjunction with automated tools, to inject Web requests to the server application. As noted above, we show that our load injection process can be scaled by leveraging inexpensive client instances on public cloud platforms.
- Similar to CloudStone's Rain [2], BenchLab provides a separation between workload modeling/generation and workload injection during benchmark execution. Like Rain, BenchLab supports the injection of real Web traces as well as synthetic ones generated from models of Web user behavior. Unlike Rain, however, BenchLab uses real browsers to inject the requests in these traces to faithfully capture the behavior of real Web users.
- BenchLab is designed as an open platform for realistic benchmarking of modern Web applications using real Web browsers. It employs a modular architecture that is designed to support different backend server applications. We have made the source code for BenchLab available, while also providing virtual appliance versions of our server applications and client tools for easy, quick deployment.

The rest of this document is structured as follows. Section 2 explains why realistic benchmarking is an important and challenging problem. Section 3 introduces BenchLab, our approach to realistic benchmarking based on real Web browsers. Section 4 describes our current implementation, which is experimentally evaluated in section 5. We discuss related work in section 6 before concluding in section 7.
2. Why Realistic Benchmarking Matters

A realistic Web benchmark should capture, in some reasonable way, the behavior of modern Web applications as well as the behavior of end-users interacting with these applications. While benchmarks such as TPC-W or RUBiS were able to capture the realistic behavior of Web applications from the 1990s, the fast-paced technological evolution towards Web 2.0 has quickly made these benchmarks obsolete. A modern Web benchmark should have realism along three key dimensions: (i) a realistic server-side application, (ii) a realistic Web workload generator that faithfully emulates user behavior, and (iii) a realistic workload injector that emulates the actual "browser experience." In this section, we describe the key issues that must be addressed in each of these three components when constructing a Web benchmark.

2.1. Realistic applications

The server-side component of the benchmark should consist of a Web application that can emulate common features of modern Web applications. These features include:

Multi-tier architecture: Web applications commonly use a multi-tier architecture comprising at least a database backend tier, where persistent state is stored, and a front-end tier, where the application logic is implemented. In modern applications, this multi-tier architecture is often implemented in the form of a Model-View-Controller (MVC) architecture, reflecting a similar partitioning. A number of platforms are available to implement such multi-tier applications. These include traditional technologies such as JavaEE and PHP, as well as a number of newer Web development frameworks such as Ruby on Rails, Python Django and PHP Cake. Although we are less concerned about the idiosyncrasies of a particular platform in this work, we must nevertheless pay attention to issues such as the scaling behavior and server overheads imposed by a particular platform.

Rich interactivity: Regardless of the actual platform used to design them, modern Web applications make extensive use of JavaScript, AJAX and Flash to enable rich interactivity in the application. New HTML5 features confirm this trend. In addition to supporting a rich application interface, such applications may incorporate functionality such as "auto-complete suggestions," where a list of completion choices is presented as a user types text in a dialog or a search box; the list is continuously updated as more text is typed by the user. Such functions require multiple round-trip interactions between the browser and the server and have an implication on the server overheads.

Scaling behavior: To scale to a larger number of users, an application may incorporate techniques such as replication at each tier. Other common optimizations include the use of caches such as memcached to accelerate and scale the serving of Web content. When deployed on platforms such as the cloud, it is even feasible to use functions like auto-scaling that can automatically provision new servers when the load on existing ones crosses a threshold.

Domain: Finally, the "vertical" domain of the application has a key impact on the nature of the server-side workload and application characteristics. For example, "social" Web applications incorporate different features and experience a different type of workload than, say, Web applications in the financial and retail domains. Although it is not feasible for us to capture the idiosyncrasies of every domain, our open testbed is designed to support any application backend in any domain. We presently support two backends: Wikibooks [20] (a component of Wikipedia [21]) and CloudStone's Olio [12] social calendaring application, with support for additional server applications planned in the future.

2.2. Realistic load generation

Realistic load generation is an important part of a benchmark. The generated workload should capture real user behavior and user interactions with the application. There are two techniques to generate the workload for a benchmark. In the first case, we can use real workload data to seed or generate the workload for the benchmark; in the simplest case, the real workload is replayed during benchmark execution. The advantage of this approach is that it is able to capture real user behavior. However, real workload data may not always be available.
Further, the data may represent a particular set of benchmark parameters, and it is not always easy to change these parameters (e.g., number of concurrent users, fraction of read and write requests, etc.) to suit the benchmarking needs. Consequently, many benchmarks rely on synthetic workload generators. The generators model user behavior such as think times as well as page popularities and other workload characteristics. CloudStone, for instance, uses a sophisticated Markov model to capture user behavior [15]. The advantage of synthetic workload generation is that it allows fine-grain control over the parameters that characterize the workload [8].

BenchLab does not include a custom Web workload generation component. Rather, it is designed to work with any existing workload generator. This is done by decoupling the workload generation step from the workload injection. In many benchmarks, workload generation and injection are tightly coupled: a request is injected as soon as it is generated. BenchLab assumes that workload generation is done separately and the output is stored as a trace file. This trace data is then fed to the injection process for replay to the server application. This decoupling, which is also used by Rain [2], allows the flexibility of using real traces for workload injection (as we do for our Wikibooks backend) as well as the use of any sophisticated synthetic workload generator.

2.3. Realistic load injection

Traditionally, Web workload injection has been performed using trace replay tools such as httperf that use one or a small number of machines to inject requests at a high rate to the application. These tools can also compute client-side statistics such as response time and latency of HTTP requests. This type of workload injection is convenient since it allows emulating hundreds of virtual users (sometimes even more) from a single machine, but it has limited use for the many applications that adjust their behavior based on a client's IP address. In some scenarios, such as testing real applications prior to production deployment, this can be problematic since many requests originating from the same IP address can trigger DDoS detection mechanisms, if any. More importantly, this approach does not realistically test IP-based localization services or IP-based load balancing.

An important limitation of trace replay-based techniques is that they fall short of reproducing real Web browser interactions, as they do not execute JavaScript or perform AJAX interactions. As a result, they may even fail to generate requests that would be generated by a real browser. Even the typing speed in a text field can have an impact on the server load, since each keystroke can generate a request to the server, as with Google Instant. Such interactions are hard to capture using trace replay tools.

Modern applications also include browser-specific customizations; they may send out custom style sheets and custom JavaScript depending on the browser type. The same application may also send a vastly different version of a page to a mobile or a tablet browser than to a traditional desktop-class browser. (Typically, Web applications redirect users from mobile devices to a separate mobile version of the application. However, some recent applications have embedded support for mobile browsers within the main application.) Moreover, each browser has different optimizations to fetch the content of Web pages in parallel and to render them quickly. Thus, the browser mix can impact the load seen by the server applications, even for a fixed number of users. Finally, replay tools typically report the response time of individual requests rather than the page load times seen by a browser; a Web page can include tens of components, including style sheets, images, ads and other components, and the response time for a page should include the time to load and render all of these components from a browser standpoint.

To capture these subtleties, we argue for the use of real Web browsers to drive the load injection. This is achieved by using automated tools that interact with a browser UI like a real user would and issue requests from the browser, using the traces generated by the workload generation process. Having a variety of real Web browsers with various configurations and plugins improves the accuracy of benchmarking the real user experience of a Web application.

3. BenchLab

BenchLab is an open testbed for Web application benchmarking. It can be used with any standard benchmark application as well as with real Web applications (section 3.2). Applications can have multiple datasets and workloads (section 3.3), and load injection is performed by real Web browsers (section 3.4).
3.1. Overview

Figure 1 gives an overview of the BenchLab components and how they interact to run an experiment. The BenchLab WebApp is the central piece that controls experiments. It is a Java Web application that can be deployed in any Java Web container such as Apache Tomcat. The BenchLab WebApp provides a Web interface to interact with experimenters that want to manage experiments and with automated Web browsers that are executing experiments.

Figure 1. BenchLab experiment flow overview.

Web traces can be recorded from a live system or generated statically (see section 3.3). Trace files are uploaded by the experimenter through a Web form and stored in the BenchLab database. Virtual machines of the Web application under test can also be archived so that traces, experiments and results can be matched with the correct software used. However, BenchLab does not deploy, configure or monitor any server-side software. There are a number of deployment frameworks available that users can use depending on their preferences (Gush, WADF, JEE, .Net deployment service, etc.). Server-side monitoring is also the choice of the experimenter (Ganglia and fenxi are popular choices). It is the responsibility of the user to deploy the application to be tested. Note that anyone can deploy a BenchLab WebApp and therefore build his or her own benchmark repository.

An experiment defines what trace should be played and how. The user defines how many Web browsers, and eventually which browsers (vendor, platform, version…), should replay the sessions. If the trace is not to be replayed on the server on which it was recorded, it is possible to remap the server name recorded in the URLs contained in the trace to point to another server that will be the target of the experiment. The experiment can start as soon as enough browsers have registered to participate in the experiment, or it can be scheduled to start at a specific time. The BenchLab WebApp deploys neither the application nor the client Web browsers; rather, it waits for browsers to connect, and its scheduler assigns them to experiments.

The BenchLab client runtime (BCR) is a small program that starts and controls a real Web browser on the client machine. The BCR can be started as part of the boot process of the operating system or started manually on demand. The BCR connects the browser to a BenchLab WebApp (step 1 in Figure 1). When the browser connects to the WebApp, it provides details about the exact browser version and platform runtime it currently executes on, as well as its IP address. If an experiment needs this browser, the WebApp redirects the browser to a download page where it automatically gets the trace for the session it needs to play (step 2 in Figure 1). The BCR stores the trace on the local disk and makes the Web browser regularly poll the WebApp to get the experiment start time. There is no communication or clock synchronization between clients; they just get a start time as a countdown in seconds from the BenchLab WebApp that informs them 'experiment starts in x seconds'. The activity of Web browsers is recorded by the WebApp and stored in the database for monitoring purposes.

When the start time has been reached, the BCR plays the trace through the Web browser, monitoring each interaction (step 3 in Figure 1). If Web forms have to be filled, the BCR uses the URL parameters stored in the trace to set the different fields, checkboxes, list selections, files to upload, etc. Text fields are replayed at a controllable rate that emulates human typing speed. The latency and details about the page (number of div sections, number of images, size of the page and title of the page) are recorded locally on the client machine. The results are uploaded to the BenchLab WebApp at the end of the experiment (step 4 in Figure 1).

Clients replay the trace based on the timestamps contained in the trace. If a client happens to be late compared to the original timestamp, it will try to catch up by playing requests as fast as it can. A global timeout can be set to limit the length of the experiment, and an optional heartbeat can also be set. The heartbeat can be used by browsers to report their status to the BenchLab WebApp, or it can be used by the WebApp to notify browsers to abort an experiment.
3.2. Application backends

Our experience in developing and supporting the RUBiS benchmark for more than 10 years has shown that users always struggle to set up the application and the different tools. This is a recurrent problem with benchmarks, where more time is spent on installation and configuration than on experimentation and measurement. To address this issue, we started to release RUBiSVA, a Virtual Appliance of RUBiS [13], i.e., a virtual machine with the software stack already configured and ready to use. The deployment can be automated on any platform that supports virtualization.

Virtualization is the de facto technology for Web hosting and Web application deployment in the cloud. Therefore, we have prepared virtual appliances of standard benchmarks such as RUBiS, TPC-W [16] and CloudStone [15] for BenchLab. This allows reproducing experiments using the exact same execution environment and software configuration, and it will make it easier for researchers to distribute, reproduce and compare results.

As BenchLab aims at providing realistic applications and benchmarks, we have also made virtual appliances of Wikibooks [20]. Wikibooks provides a Web application with a similar structure to Wikipedia [21], but with a more easily managed state size (GBs instead of TBs). More details about our Wikibooks application backend are provided in section 4.4.2.

3.3. Workload definitions

Most benchmarks generate the workload dynamically from a set of parameters defined by the experimenter, such as the number of users, the mix of interactions, and the arrival rate. Statistically, the use of the same parameters should lead to similar results between runs. In practice, the randomness used to emulate users can lead to different requests to the Web application that require very different resources, depending on the complexity of the operations or the size of the dataset that needs to be accessed. Consequently, the variance in the performance observed between experiments using the same parameters can be large. Therefore, it is necessary to decouple the request generation from the request execution so that the exact same set of requests can be replayed at will. This can be done with little effort by instrumenting existing load generators and logging the requests that are made to the server. The resulting trace file can then be replayed by a generic tool like httperf or by a more realistic injector like a real Web browser.

The traces used in BenchLab are based on the standard HTTP archive (HAR) format [9]. This format captures requests with their sub-requests, post parameters, cookies, headers, caching information and timestamps. Parameters include text to type in text fields, files to upload, boxes to check or buttons to click, etc. In the case of real applications, these traces can also be generated from an HTTP server access log to reproduce real workloads. As Web browsers automatically generate sub-requests to download the content of a page (cascading style sheets (.css), JavaScript code (.js), image files, etc.), only main requests from the trace file are replayed.

Defining and generating workloads are beyond the scope of BenchLab. BenchLab focuses on the execution and replay of existing traces. Traces are stored in a database to be easily manipulated and distributed to injectors. Traces cannot be scaled up; they are either replayed completely or partially (a subset of the sessions in the trace). This means that if a trace contains 100 user sessions, it can be replayed by at most 100 clients. If a trace needs to be scaled, the user must use her workload generator to generate a scaled trace.

3.4. Web browser load injection

A central contribution of BenchLab is the ability to replay traces through real Web browsers. Major companies such as Google and Facebook already use open source technologies like Selenium [14] to perform functional testing. These tools automate a browser to follow a script of actions, and they are primarily used for checking that a Web application's interactions generate valid HTML pages. We claim that the same technology can also be used for performance benchmarking.

The BenchLab client runtime can be used with any Web browser supported by Selenium: Firefox, Internet Explorer, Chrome and Safari. Support for mobile phones with Webkit-based Web browsers is also under development. The functionality of BenchLab on the client side is limited to downloading a trace, replaying it, recording response times and uploading the response times at the end of the replay. This small runtime is deployable even on devices with limited resources such as smartphones. Unlike commercial offerings, BenchLab clients can be deployed on public clouds or any other computing resource (e.g., desktop, smartphone).

Unlike traditional load injectors that work at the network level, replaying through a Web browser accurately performs all activities such as typing data in Web forms, scrolling pages and clicking buttons. The typing speed in forms can also be configured to model a real user typing. This is particularly useful when inputs are processed by JavaScript code that can be triggered on each keystroke. Through the browser, BenchLab captures the real user-perceived latency, including network transfer, page processing and rendering time.

4. Implementation

BenchLab is implemented using open source software and is also released as open source software for use by the community. The latest version of the software and documentation can be found on our Web site [3].

4.1. Trace recorder

We have implemented a trace recorder for Apache httpd that collects request information from a live system using the standard httpd logging mechanisms (mod_log_config and mod_log_post). We then process these logs to generate traces in HAR format.
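For illustration, a single, heavily simplified entry of a HAR 1.2 trace might look as follows. The field names come from the public HAR specification [9]; the URL, values and any extra fields BenchLab records beyond the standard are illustrative assumptions.

```json
{
  "log": {
    "version": "1.2",
    "entries": [{
      "startedDateTime": "2011-03-12T15:04:05.000Z",
      "time": 142,
      "request": {
        "method": "POST",
        "url": "http://server.under.test/index.php?title=Special:Search",
        "headers": [{ "name": "User-Agent", "value": "Firefox/3.6.13" }],
        "postData": {
          "mimeType": "application/x-www-form-urlencoded",
          "params": [{ "name": "search", "value": "Web 2.0" }]
        }
      },
      "response": { "status": 200, "statusText": "OK" },
      "timings": { "dns": 3, "send": 1, "wait": 120, "receive": 18 }
    }]
  }
}
```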
We have contributed a new Java library called HarLib to manage HAR traces in files and databases. Additionally, we can record the generated HTML pages using mod_dumpio. This is useful to build tools that check the consistency of Web pages obtained during replay against the originally captured HTML.

4.2. Browser based load injection

We use the Selenium/Webdriver [14] framework that provides support for Firefox, Internet Explorer and Chrome on almost all the platforms (Linux, Windows, MacOS) where they are available. Safari support is experimental, as is support for Webkit-based browsers on Android and iPhone. The BenchLab client runtime (BCR) is a simple Java program interfacing with Selenium. We currently use Selenium 2.0b3, which includes Webdriver. The BCR can start any Firefox, IE or Chrome browser installed on the machine and connect it to a BenchLab WebApp. On Linux machines that do not have an X server environment readily available, we use the X virtual frame buffer (Xvfb) to render the browser in a virtual X server. This is especially useful when running clients in the cloud on machines without a display.

When a browser is assigned to an experiment, the BCR downloads the trace it has to replay through the browser and stores it in a local file. The information about the experiment, trace and session being executed by the browser is encoded by the BenchLab WebApp in cookies stored in the Web browser.

The BCR parses the trace file for the URLs and encoded parameters that are then set in the corresponding forms (text fields, button clicks, file uploads, etc.). When a URL is a simple GET request, the BCR waits according to the timestamp before redirecting the browser to the URL. When a form has to be filled before being submitted, the BCR starts filling the form as soon as the page is ready and just waits before clicking the submit button. As we emulate the user typing speed, it can take multiple minutes to fill some forms, such as edits to a wiki page with Wikipedia.

The BCR relies on the browser's performance profiling tools to record detailed timings in HAR format. This includes network-level performance (DNS resolution, send/wait/receive time…) and browser-level rendering time. The entire HTML page and media files can be recorded for debugging purposes if the client machine has enough storage space. An alternative compressed CSV format is also available to record coarser-grain performance metrics on resource-constrained devices.

We have built Xen Linux virtual machines with the BCR and Firefox to use on private clouds. We also built Amazon EC2 AMIs for both Windows and Linux with Firefox, Chrome and Internet Explorer (Windows only for IE). These AMIs are publicly available.
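The following minimal Java sketch shows this style of replay with the Selenium 2/WebDriver API: launching a real Firefox, waiting out a trace entry's recorded time offset, and then navigating. It is an illustration, not the actual BCR code; the URL and offset value stand in for data that would come from the HAR trace.

```java
// Hedged sketch of replaying one trace entry through a real browser.
// The URL and offset are illustrative placeholders for trace data.
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class ReplayOneEntry {
    public static void main(String[] args) throws InterruptedException {
        WebDriver browser = new FirefoxDriver(); // launches a real Firefox
        long experimentStart = System.currentTimeMillis();
        long entryOffsetMs = 2500;               // timestamp offset from the trace
        long wait = entryOffsetMs - (System.currentTimeMillis() - experimentStart);
        if (wait > 0) Thread.sleep(wait);        // replay no earlier than recorded
        long before = System.currentTimeMillis();
        browser.get("http://server.under.test/wiki/Main_Page"); // blocks until loaded
        System.out.println("page load: " + (System.currentTimeMillis() - before) + " ms");
        browser.quit();
    }
}
```

Because get() returns only once the browser considers the page loaded, the measured time naturally includes sub-request fetching and rendering, which is exactly what transport-level replay tools miss.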
4.3. BenchLab WebApp

The BenchLab WebApp is a Java application implemented with JSPs and Servlets. It uses an embedded Apache Derby database. Each trace and experiment is stored in a separate table for better scalability. Virtual machines of Web applications are not stored in the database; instead, we store a URL to the image file, which can point to the local file system or to a public URL such as an S3 URL if the images are stored in the Amazon Simple Storage Service.

The user interface is intentionally minimalist for efficiency and scalability, allowing a large number of browsers to connect. BenchLab makes minimal use of JavaScript and does not use AJAX, keeping all communications with clients purely asynchronous. Similarly, no clock synchronization is needed or required.

As the BenchLab WebApp is entirely self-contained, it can easily be deployed on any Java Web application server. We currently use Apache Tomcat 6 for all our experiments. We have tested it successfully on Linux and Windows platforms, but it should run on any platform with a Java runtime.

The BenchLab WebApp acts as a repository of traces, benchmark virtual machines, and experiments with their results. This data can be easily downloaded using any Web browser or replicated to any other BenchLab WebApp instance.

4.4. Application backends

We provide Xen virtual machines and Amazon AMIs of the CloudStone benchmark and the Wikibooks application on our Web site [3]. As BenchLab does not impose any deployment or configuration framework, any application packaged in a VM can be used as a benchmark backend.

4.4.1. CloudStone

CloudStone [15] is a multi-platform, multi-language benchmark for Web 2.0 and cloud computing. It is composed of a load injection framework called Faban, and a social online calendar Web application called Olio [12]. A workload driver is provided for Faban to emulate users using a Markov model.

We have chosen the PHP version of Olio and packaged it in a virtual machine that we will refer to as OlioVM. OlioVM contains all the software dependencies to run Olio, including a MySQL database and the Java Webapp implementing a geocoding service.

Faban is packaged in another VM with the load injection driver for Olio. We refer to this VM as FabanVM. Faban relies on the Apache HttpClient v3 (HC3) library [1] for the HTTP transport layer to interact with the Web application. We have instrumented Faban to record the requests sent to HC3 in order to obtain trace files with all the parameters needed for interactions that require user input in POST methods. FabanVM is not used for load injection in our experiments but only to generate traces that can be replayed using our replay tool. The replay tool is a simple Java program that replays HTTP requests using the HC3 library.
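As a rough sketch of such a transport-level replay, the following issues a single timed GET through the HttpClient 3 API. The URL is illustrative; note that, unlike a browser, this replay performs no sub-requests, JavaScript execution or rendering.

```java
// Hedged sketch of a transport-level replay through Apache HttpClient 3,
// the same HC3 library Faban uses. One request, no sub-requests.
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.GetMethod;

public class Hc3Replay {
    public static void main(String[] args) throws Exception {
        HttpClient client = new HttpClient();
        GetMethod get = new GetMethod("http://server.under.test/index.php/Main_Page");
        long start = System.currentTimeMillis();
        int status = client.executeMethod(get);
        byte[] body = get.getResponseBody(); // body must be read to time the transfer
        get.releaseConnection();
        System.out.println(status + ", " + body.length + " bytes, "
                + (System.currentTimeMillis() - start) + " ms");
    }
}
```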
Our Wikibooks application backend includes We have written a Java replay tool similar to httperf all Wikimedia extensions necessary to run the full Web that can replay Faban traces through the Apache site including search engine and multimedia content. HttpClient 3 library. We have validated the tool by re- The Wikibooks virtual appliance is composed of two playing traces generated by Faban and comparing the virtual machines. One virtual machine contains the response time and server load with the ones obtained Wikimedia software and all its extensions and the other originally by Faban. VM runs the database. Database dumps of the Wiki- books content are freely available from the Wikimedia 5.2. Realistic application data sets foundation in compressed XML format. We currently In this experiment we illustrate the importance of hav- use a dump from March 2010 that we restored into a ing benchmark applications with realistic amounts of MySQL database. Real Wikibooks traces are available application state. The CloudStone benchmark populates from the Wikibench Web site [19]. the database and the filestore containing multimedia Due to copyright issues, the multimedia content in content according to the number of users to emulate. Wikibooks cannot be redistributed, and therefore, we The state size grows proportionally to the number of use a multimedia content generator that produces imag- users. Table 1 shows the dataset state size from 3.2GB es with the same specifications as the original content for 25 users to 44GB for 500 users. but with random pixels. Such multimedia content can Table 1. CloudStone Web application server load be either statically pre-generated or produced on- observed for various dataset sizes using a workload demand at runtime. trace of 25 users replayed with Apache HttpClient 3. 4.5. Limitations Dataset State size Database Avg CPU load Our current implementation is limited by the current size (in GB) rows with 25 users functionality of the Selenium/Webdriver tools we are 25 users 3.2 173745 8% using. Support for Firefox on all platforms and Internet 100 users 12 655344 10% Explorer on Windows are overall stable though perfor- 200 users 22 1151590 16% mance may vary on different OS versions. The Chrome 400 users 38 1703262 41% driver does not support file upload yet but it provides 500 users 44 1891242 45% experimental access to Webkit browsers such as Safari and Android based browsers. We generated a load for 25 users using the Faban load Our prototype does not support input in popup windows generator and recorded all the interactions with their but we are able to discard JavaScript alert popups when timestamps. We then replayed the trace using 25 emu- erroneous input is injected into forms. lated browsers and observed the resource usage on the The current BenchLab WebApp prototype does not CloudStone Web application (Olio) when different size implement security features such as browser authentica- data sets were used in the backend. The results in Table tion, data encryption or result certification. 1 show the CPU load observed in the Web Application 5. Experimental Results VM. Note that in this experiment the trace is replayed through the Apache HttpClient 3 library and not using a 5.1. Experimental setup and methodology real Web browser. 
The average CPU load on the server is 8% with the 25-user dataset, but it reaches 45% for the exact same workload with a 500-user dataset. This is mainly due to less effective caching and less efficient database operations with larger tables. Real applications like the Wikipedia wikis have databases of various sizes, the largest being the English Wikipedia database, which is now over 5.5TB. This experiment shows that even for a modest workload accessing the exact same working set of data, the impact on the server load can vary greatly with the dataset size. It is therefore important for realistic benchmarks to provide realistic datasets.

5.3. Real browsers vs emulators

5.3.1. Complexity of Web interactions

Real Web applications have complex interactions with the Web browser, as shown in Table 2. While accessing the home page of older benchmarks such as RUBiS or TPC-W only generates 2 to 6 requests to fetch the page content, their real-life counterparts, eBay.com and amazon.com, require 28 and 141 browser-server interactions, respectively. A more modern benchmark application such as CloudStone's Olio requires 28 requests, which is still far from the 176 requests of the most popular social network Web site, Facebook. When the user enters http://en.wikibooks.org/ in his favorite Web browser, 62 requests are generated on his behalf by the Web browser to fetch the content of the Wikibooks home page. Even if modern HTTP client libraries such as Apache HttpComponents Client [1] provide a good implementation of HTTP transport very similar to the one used in Web browsers, other functionalities such as caching, JavaScript execution, content type detection, request reformatting or redirection may not be accurately emulated.

Table 2. Browser generated requests per type when browsing the home page of benchmarks and Web sites.

Benchmark/site   HTML   CSS   JS   Multimedia   Total
RUBiS            1      0     0    1            2
eBay.com         1      3     3    31           28
TPC-W            1      0     0    5            6
amazon.com       6      13    33   91           141
CloudStone       1      2     4    21           28
facebook.com     6      13    22   135          176
wikibooks.org    1      19    23   35           78
wikipedia.org    1      5     10   20           36

To further understand how real browsers interact with real applications, we investigate how Firefox fetches a page of the Wikipedia Web site and compare it to an HTTP replay. The workflow of operations and the corresponding timings are shown in Figure 2. The times for each block of GET operations correspond to the network time measured by our HTTP replay tool (on the left) and by Firefox (on the right). Times between blocks correspond to processing time in Firefox.

Figure 2. Time breakdown of a Wikibooks page access with Firefox 3.6.13 and HTTP replay. (Four phases of GET requests with intervening rendering/JavaScript steps; the replay tool spends 3.86s of network time in total, while Firefox spends 1.88s on the network plus 2.21s on processing and rendering.)
First, we observe that the complexity of the application forces the browser to proceed in multiple phases. After sending the requested URL to the Web application, the browser receives an HTML page that it analyzes (step 1 in Figure 2) to find links to JavaScript code and additional content needed to render the page (.css files, images…). Firefox opens six connections and performs the content download in parallel. It then starts to render the page and execute the JavaScript onLoad operations (step 2). This requires additional JavaScript files to be downloaded and another round of code execution (step 3). Finally, images are downloaded reusing the same six connections, and a final rendering round (step 4) triggers the download of the 3 final images at the bottom of the page. The total page loading time in Firefox is 4.09s, with 1.88s for networking and 2.21s for processing and rendering. The single-threaded HTTP replay tool is not able to match Firefox's optimized communications and does not emulate processing, thus generating different access patterns on the server and leading to a different resource usage.

Finally, Table 3 shows how typing text in a Web page can result in additional requests to a Web application. When the user enters text in the search field of Wikibooks, a request is sent to the server on each keystroke to provide the user with a list of suggestions. The typing speed and the network latency influence how many requests are going to be made to the server.

Table 3. JavaScript generated requests when typing the word 'Web 2.0' in Wikibooks' search field.

GET /api.php?action=opensearch&search=W
GET /api.php?action=opensearch&search=Web
GET /api.php?action=opensearch&search=Web+
GET /api.php?action=opensearch&search=Web+2
GET /api.php?action=opensearch&search=Web+2.
GET /api.php?action=opensearch&search=Web+2.0

In Table 3's example, the user starts by typing 'W', causing a request to the server. She then quickly types the letter 'e' before the response has come back. When she types the next letter 'b', a second request goes to the server. Each following keystroke is followed by another request. This shows that even replaying in a Web browser needs to take into consideration the speed at which a user performs operations, since this can have an impact on how many requests are issued to the server, directly affecting its load.
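A browser-based injector can reproduce this behavior by sending one keystroke at a time, as in the short WebDriver sketch below. The element locator and the 200ms inter-key delay are illustrative assumptions, not BenchLab's actual replay parameters.

```java
// Hedged sketch of human-speed typing: each sendKeys() call delivers
// one keystroke, letting the page's suggestion JavaScript fire one
// opensearch request per key, as shown in Table 3.
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class HumanSpeedTyping {
    public static void main(String[] args) throws InterruptedException {
        WebDriver browser = new FirefoxDriver();
        browser.get("http://en.wikibooks.org/");
        WebElement searchBox = browser.findElement(By.name("search"));
        for (char c : "Web 2.0".toCharArray()) {
            searchBox.sendKeys(String.valueOf(c)); // one keystroke at a time
            Thread.sleep(200);                     // emulated inter-key delay
        }
        browser.quit();
    }
}
```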

5.3.2. Latencies and server load

In this experiment, we inject the same 25-user CloudStone workload from the Amazon EC2 East coast data center to our server running at UMass Amherst. The emulator runs the 25 users from one virtual machine, whereas 25 server instances, each running one Firefox Web browser, inject the load for the realistic injection. Figure 3 shows the latencies observed by the emulator and by the Web browsers.

Figure 3. Browser vs Emulator measured latencies for the same load of 25 users using CloudStone between EC2 East coast and UMass Amherst.

The emulator has to mimic all the requests that a real Web browser would issue. Therefore, a lot of small queries to fetch style sheets (.css) or JavaScript (.js) have small latencies. Some more complex pages are not fetched as efficiently as over a browser's multiple connections and result in much higher latencies when the latencies of sub-requests are added up. The latencies observed by the Web browsers vary significantly from the ones observed by the emulator, not only because they account for the data transfer time on the network, but also because they include the page rendering time that is part of the user-perceived latency.

Another key observation is that the showEvent interaction, which displays information about an event in Olio's online calendar, makes use of the Yahoo map API. As the emulator does not execute any JavaScript, all interactions with the real Yahoo Web site and its map API are not accounted for in the interaction time. When a real browser is used, it contacts the Yahoo map Web site and displays the map with the location of the event. The page rendering time is then influenced not only by the response time of the Olio Web application but also by the interactions with Yahoo's Web site.

Figure 4. Server side CPU and disk IO usage with an emulator injecting a 25 user load from EC2 East coast to CloudStone at UMass Amherst.

Figure 5. Server side CPU and disk IO usage with 25 Firefox browsers from EC2 East coast to CloudStone at UMass Amherst.

Figure 4 and Figure 5 show the average CPU usage, measured by vmstat every second, on the Web application server for the emulated and realistic browser injection, respectively. While the user CPU time oscillates a lot in the emulated case, it averages 63.2%. The user CPU time is steadier and constantly higher with the browser-injected load, at an average of 77.7%. Moreover, we notice peaks of disk IO during the experiment (for example at time 500 seconds), indicating that the IO subsystem of the server is more stressed when serving the browser-generated requests.

5.3.3. Impact of JavaScript on Browser Replay

When a user fills a form, JavaScript code can check the content of the form and validate all the fields before the form is submitted to the application server. While the server has to spend more resources to send all the JavaScript code to the client, it does not have to treat malformed requests with improper content, which can be caught by the JavaScript validation code running in the Web browser.

In this experiment, we use the addPerson interaction of CloudStone, which registers a new user with her profile information. The JavaScript code in that page generates a query to the server when the user name is typed in, to check if that name is available or already taken. Other fields, such as the telephone number, are checked for format, and other fields are checked for missing information. Entering a zip code generates a query to the Geocoder service, which returns the corresponding city and state names that are automatically filled in the corresponding fields of the form.

The original Olio workload driver generated malformed data that does not pass the JavaScript checks but is accepted by the Olio application, which does not check data sent from the client. We found this bug in the Olio application, which inserts improper data in the database. We use two traces: one with the original malformed data and another one with valid inputs where we fixed the problems found in the original trace.

We emulate 25 clients from a server located in EC2 East coast's data center and run both traces. Figure 6 shows the load observed on the OlioVM when using emulated users. The CPU utilization is steady at around 70% for both traces. As the application does not check data validity and the emulator does not execute JavaScript, there is no change between good and bad inputs.

Figure 6. Server side CPU load (user, system, idle) with emulated load injection of 25 virtual users executing CloudStone's addPerson interaction with valid or bad inputs.

Figure 7 shows the CPU load observed on the server when the valid input trace is injected through Firefox. As forms are filled with data, additional queries are issued by JavaScript, as mentioned earlier. This causes heavyweight write queries to be interlaced with lighter read queries. These additional context switches between the PHP scripts running the application and the Tomcat container running the Geocoder service cause significantly different resource usage in the Web application virtual machine.

Figure 7. Server side CPU load (user, system, idle) using Firefox (25 browsers running from EC2 East coast) executing CloudStone's addPerson interaction with valid inputs.

Figure 8 shows the load seen by the Web application server when the trace with invalid inputs is injected through Firefox. As the JavaScript code checks for erroneous inputs, the requests never reach the application server. The only activity observed by the application is answering the login uniqueness checks.

Figure 8. Server side CPU load (user, system, idle) using Firefox (25 browsers running from EC2 East coast) executing CloudStone's addPerson interaction with erroneous inputs.

5.4. LAN vs WAN

In these experiments, we evaluate the impact of WAN-based load injection vs traditional LAN-based injection.

5.4.1. Local vs remote users
This is further accentuated when JavaScript centers as follows: 7 US East coast (N. Virginia), 6 US code generates additional queries or performs error West coast (California), 6 Europe (Ireland) and 6 Asia checking that prevents erroneous inputs to reach the (Singapore). Such a setup (25 distributed instances) can server. be deployed for as little as $0.59/hour using Linux mi- Finally, we have shown the influence of LAN vs WAN cro instances or $0.84/hour using Windows instances. load injection using real Web browsers deployed in a More powerful small instances cost $2.30/hour for public cloud. Not only the latency and its standard de- Linux and $3.00/hour for Windows. Figure 9 shows the viation increase with the distance but the load on the latency reported by all Web browsers color coded per server significantly differs between a LAN and a WAN region (y axis is log scale). experiment using the exact same workload. Table 4. Average latency and standard deviation observed in different geographic regions 6. Related Work Benchmarks such as RUBiS [13] and TPC-W [16] have US East US West Europe Asia now become obsolete. BenchLab uses CloudStone [15] Avg latency 920ms 1573ms 1720ms 3425ms and Wikibooks [20] as realistic Web 2.0 application Std deviation 526 776 906 1670 backends. The Rain [2] workload generation toolkit As our Web application server is located in the US East separates the workload generation from execution to be coast, the lowest latencies are consistently measured by able to leverage existing tools such as httperf. Bench- browsers physically located in the East coast data cen- Lab uses the same concept to be able to replay real ter. Average latency almost doubles for requests origi- workload traces from real applications such as Wiki- nating from the West coast or Europe. Finally, as ex- books or Wikipedia [17] in Web browsers. pected, the Asian data center experiences the longest Browser automation frameworks have been developed latencies. It is interesting to notice that the latency primarily for functional testing. BenchLab uses real standard deviation also increases with the distance from Web browsers for Web application benchmarking. the Web application server as summarized in Table 4. Commercial technologies like HP TruClient [6] or The server average CPU usage is 74.3% which is slight- Keynote Web performance testing tools [7] offer load ly less than when all browsers were located in the East injection from modified versions of Firefox or Internet coast (77.7%). Explorer. BrowserMob [4] provides a similar service using Firefox and Selenium. However, these proprietary would also like to thank Fabien Mottet and Vivien Quema products can only be used in private clouds or dedicated from INRIA Rhône-Alpes, Guillaume Pierre from Vrije Uni- test environments. BenchLab is fully open and can be versity and Vimal Mathew from UMass, for their contribu- deployed on any device that has a Web browser. By tions to BenchLab. This research was supported in part by grants from Amazon AWS, UMass President’s Science & deploying browsers on home desktops or cell phones, Technology fund, and NSF grants CNS-0916972, CNS- BenchLab can be used to analyze last mile latencies. 083243, CNS-0720616 and CNS-0855128. Server-side monitors and log analysis tools have been used previously to try to model the dependencies be- 9. References tween file accesses and predict the full page load times [1] Apache HttpComponents – http://hc.apache.org/ observed by clients [10][18]. BenchLab’s use of real [2] A. 
However, these proprietary products can only be used in private clouds or dedicated test environments. BenchLab is fully open and can be deployed on any device that has a Web browser. By deploying browsers on home desktops or cell phones, BenchLab can be used to analyze last-mile latencies.

Server-side monitors and log analysis tools have been used previously to try to model the dependencies between file accesses and predict the full page load times observed by clients [10][18]. BenchLab's use of real Web browsers allows it to accurately capture the behavior of loading Web pages composed of multiple files, and it could be used by Web service providers to monitor the performance of their applications. When deployed on a large number of machines (home desktops, cell phones, cloud data centers…), BenchLab can be used to reproduce large-scale flash crowds. When client browsers are geographically dispersed, BenchLab can be used to evaluate Content Delivery Network (CDN) performance or failover mechanisms for highly available Web applications.

BenchLab is designed to be easily deployed across wide geographic areas by utilizing public clouds. The impact of wide-area latency on Web application performance has been studied in a number of scenarios [5]; BenchLab provides a standardized architecture that allows application developers and researchers to measure how their systems perform in real WAN environments. We believe that BenchLab can be used to measure the effectiveness of WAN accelerators (CDNs or proxy caches) as well as to validate distributions modeling WAN load patterns.

7. Conclusion

We have demonstrated the need to capture the behavior of real Web browsers to benchmark real Web 2.0 applications. We have presented BenchLab, an open testbed for realistic benchmarking of modern Web applications using real Web browsers. BenchLab employs a modular architecture that is designed to support different backend server applications. Our evaluation has illustrated the need for 1) updated Web applications with realistically sized datasets to use as benchmarks, 2) real browser-based load injection tools that authentically reproduce user interactions, and 3) wide-area benchmark client deployments to match the network behavior seen by real applications. We believe that BenchLab meets these needs, and we hope that it will help the research community improve the realism and accuracy of Web application benchmarking. We are making all the BenchLab software (runtime, tools, Web applications…) available to the community under an open source license on our Web site [3].

8. Acknowledgement

The authors would like to thank the anonymous reviewers and our shepherd Geoffrey Voelker for their valuable feedback. We would also like to thank Fabien Mottet and Vivien Quema from INRIA Rhône-Alpes, Guillaume Pierre from Vrije University and Vimal Mathew from UMass for their contributions to BenchLab. This research was supported in part by grants from Amazon AWS, the UMass President's Science & Technology fund, and NSF grants CNS-0916972, CNS-083243, CNS-0720616 and CNS-0855128.

9. References

[1] Apache HttpComponents – http://hc.apache.org/
[2] A. Beitch, B. Liu, T. Yung, R. Griffith, A. Fox and D. Patterson – Rain: A Workload Generation Toolkit for Cloud Computing Applications – Technical Report UCB/EECS-2010-14, February 10, 2010.
[3] BenchLab – http://lass.cs.umass.edu/projects/benchlab/
[4] BrowserMob – http://browsermob.com/performance-testing
[5] S. Chen, K.R. Joshi, M.A. Hiltunen, W.H. Sanders and R.D. Schlichting – Link Gradients: Predicting the Impact of Network Latency on Multitier Applications – INFOCOM 2009, pp. 2258-2266, 19-25 April 2009.
[6] HP – TruClient technology: Accelerating the path to testing modern applications – Business white paper, 4AA3-0172ENW, November 2010.
[7] R. Hughes and K. Vodicka – Why Real Browsers Matter – Keynote white paper, http://www.keynote.com/docs/whitepapers/why_real_browers_matter.pdf
[8] D. Krishnamurthy, J.A. Rolia and S. Majumdar – A Synthetic Workload Generation Technique for Stress Testing Session-Based Systems – IEEE Transactions on Software Engineering, 32(11), November 2006.
[9] HTTP Archive specification (HAR) v1.2 – http://www.softwareishard.com/blog/har-12-spec/
[10] Z. Li, M. Zhang, Z. Zhu, Y. Chen, A.G. Greenberg and Y. Wang – WebProphet: Automating Performance Prediction for Web Services – NSDI 2010, pp. 143-158.
[11] E.M. Nahum, M.C. Rosu, S. Seshan and J. Almeida – The effects of wide-area conditions on WWW server performance – SIGMETRICS 2001.
[12] Olio – http://incubator.apache.org/olio/
[13] RUBiS Web site – http://rubis.ow2.org
[14] Selenium – http://seleniumhq.org/
[15] W. Sobel, S. Subramanyam, A. Sucharitakul, J. Nguyen, H. Wong, A. Klepchukov, S. Patil, A. Fox and D. Patterson – Cloudstone: Multi-platform, multi-language benchmark and measurement tools for Web 2.0 – Cloud Computing and its Applications (CCA-08), 2008.
[16] TPC-W Benchmark, ObjectWeb implementation – http://jmob.objectweb.org/tpcw.html
[17] G. Urdaneta, G. Pierre and M. van Steen – Wikipedia Workload Analysis for Decentralized Hosting – Elsevier Computer Networks, vol. 53, July 2009.
[18] J. Wei and C.Z. Xu – Measuring Client-Perceived Pageview Response Time of Internet Services – IEEE Transactions on Parallel and Distributed Systems, 2010.
[19] WikiBench – http://www.wikibench.eu/
[20] Wikibooks – http://www.wikibooks.org
[21] Wikipedia – http://www.wikipedia.org