WebOS: Software Support for Scalable Web Services
Amin Vahdat, Michael Dahlin, Paul Eastham, Chad Yoshikawa, Thomas Anderson, and David Culler
Computer Science Division
University of California
Berkeley, CA 94720

Abstract

The burgeoning popularity of the Web is pushing against the performance limits of the underlying infrastructure, presenting a number of difficult challenges for the Web as a system. We believe that resources such as connectivity, storage, computation, latency, and bandwidth are likely to remain constrained in the future. Thus, we are building a higher level Web operating system to efficiently manage these resources for the benefit of all Web users. Our prototype, WebOS, is designed to support highly available, incrementally scalable, self-tuning, dynamically reconfiguring and geographically aware Web services. WebOS includes mechanisms for naming Web services, coherent data replication, efficient distribution of client requests across the wide area, safe execution of arbitrary executables on remote sites, and authentication for secure access to Web resources.

1 Introduction

The growth rate and scale of the Web present a family of difficult challenges for the Web as a system. While the underlying communications infrastructure has proved surprisingly robust, we believe that the presence of a higher level system design will enable advances in the capabilities of Web applications. In short, we believe the development of an operating system for the Web will provide Web applications convenient access to shared resources. The operating system will efficiently manage these resources for the benefit of all users of the system.

Today, the operating environment of the Web could be characterized as location-unaware browsers making mostly read-only requests to hardware-constrained servers over an almost oblivious network fabric. Our goal is to demonstrate a different vision of the Web operating environment where computation and storage servers in the Web function cooperatively to provide efficient access to resources supporting Web applications. These core servers are incrementally scalable and highly available, browsers are aware of the service characteristics and the state of the network connection to these servers, and servers can cooperate to share load and distribute data throughout the Web to both reduce latency and save bandwidth. This vision requires a software framework for development of advanced Web applications.

To demonstrate that our vision of the Web is achievable, we are constructing WebOS, a framework for building highly available, incrementally scalable, self-tuning, dynamically reconfiguring and geographically aware Web services. WebOS must provide traditional local OS functionality at the global level, including: naming, resource allocation, inter-process communication, scheduling, fault tolerance, and authentication. To this end, we have initial prototypes of: (i) Smart Clients for fault tolerant, load balanced access to Web services, (ii) WebFS, a global cache coherent filesystem, (iii) a virtual machine supporting secure program execution and mechanisms for resource allocation, and (iv) authentication for secure access to global Web resources. To evaluate WebOS, we are developing a global cooperative cache of Web pages, allowing replication and migration of Web services during times of peak demand.

The rest of this paper will describe the WebOS design, implementation status, and supported applications. Section 2 describes our requirements for high-performance scalable Web services. Sections 3 and 4 present the WebOS design and implementation status. Section 5 details an application we are building to evaluate WebOS. Section 6 describes related work and Section 7 concludes.

This work was supported in part by the Defense Advanced Research Projects Agency (N00600-93-C-2481, F30602-95-C-0014), the National Science Foundation (CDA 0401156), California MICRO, the AT&T Foundation, Digital Equipment Corporation, Exabyte, Hewlett Packard, Intel, IBM, Microsoft, Mitsubishi, Siemens Corporation, Sun Microsystems, and Xerox. Anderson was also supported by a National Science Foundation Presidential Faculty Fellowship. Yoshikawa is supported by a National Science Foundation Fellowship. The authors can be contacted at fvahdat, dahlin, eastham, chad, tea, [email protected].

2 Web Service Requirements

In this section, we outline some of the requirements for a software framework supporting high-performance scalable Web services.

End-to-end continuous operation: With millions of potential clients geographically distributed around the world, someone somewhere is always requesting service. Computer systems providing continuous service over a period of years must not only mask failures, but must also demonstrate end-to-end availability based on client perceptions. It does not matter if the service is "working" if the client cannot access it because of router or DNS problems. As a motivating example, one study [44] found that the probability of encountering a major Internet routing pathology was between 1.5% and 3.4%, and that the failure rate may increase with increasing Internet use.

Cheap incremental scalability: To make it easy to create new services, it is crucial to minimize the initial hardware and software investment required to go on-line. It is equally crucial for the service to be able to add hardware resources as the service becomes more popular.

Graceful burst behavior: The aggregate behavior of millions of clients is highly bursty; for example, the USGS web site was effectively unusable for hours after a recent (small) Northern California earthquake [50]. Bursty client behavior requires server software to be able to dynamically recruit resources over the Internet. In addition, burstiness puts a premium on adapting in advance: using idle time to reconfigure the system to better handle any impending spikes in demand. For example, the IRS Web site becomes extremely loaded every year in mid-April.

Geographically dispersed resources: The Internet is slow, costly, and prone to congestion. Latency (often best predicted by the number of network hops between client and server) is often the most important metric for measuring performance delivered to the client. Further, the current (hidden) cost of Internet traffic is a few dollars per terabyte-mile [25]. These considerations suggest that to both reduce latency and consume less bandwidth, service providers should move portions of their service across the Internet. Once a service becomes sufficiently popular, it is almost forced to become geographically distributed; no single point of connection on the Internet is capable of servicing simultaneous requests from millions of users all over the world.

Existing Web services currently address these requirements on an ad-hoc and incomplete basis. WebOS is designed to support the construction of services capable of dynamic geographic migration and replication based on client demands. Any necessary service state is replicated on demand through a global cache coherent file system. Service code such as HTTP servers or CGI-bin programs is executed in a protected environment. Authentication mechanisms are provided to ensure secure access to private service state (e.g. customer database or Web index) and to safeguard against unauthorized execution of service programs. To allow the client to automatically choose the service provider able to deliver the best performance to the user, the client code is enhanced to become aware of service replication, its location relative to the nearest server, and the relative load of each of the servers. Similarly, clients can transmit their geographic location to the service so that intelligent migration/replication decisions can be made before a service becomes overloaded.

3 Scalable Web Service Access

This section and the next describe the design and implementation status of the WebOS prototype. The goal of the prototype is to support services running at multiple, geographically distributed sites across the Internet. One motivation is to optimize for Internet latency and congestion; by moving services closer to clients, we reduce response time and consume less Internet bandwidth. Another motivation is the burstiness of Web requests. We believe the appropriate model for effectively delivering Web services to end users is that of the power grid, where underutilized resources can serve excess demand. For example, a sporadically popular web site, such as the IRS near April 15th, should buy only enough local hardware to handle the typical demand and dynamically recruit extra hardware resources over the Internet to handle peak loads.

3.1 Current Approaches

Today, some popular services such as the Alta Vista search engine [21] or the Netscape home page [43] are geographically distributed but replicated manually by the service provider. Load balancing across the wide area is achieved by instructing users to access a particular "mirror site" based on their location. To distribute load in the local area, techniques such as HTTP redirect [6] or DNS Aliasing [11, 38] can be used to send user requests to individual machines. With HTTP redirect, a front-end machine redirects the client to resend the request to an available worker machine. This approach has the disadvantage of adding a round trip message latency to each request. Further, the front-end machine serving redirects is both a single point of failure and a central bottleneck for very popular services. Finally, any load-balancing mechanisms must be re-implemented on the front-end on a service-by-service basis.

[Figure 1 diagram. Labels: Client-Side Applet (GUI Thread, Director Thread); User Requests; Request; Response/State update; Nearby Mirror; Distant Mirror.]

Figure 1: Two cooperating threads make up the Smart Client architecture. The GUI thread presents the service interface and passes user requests to the Director Thread. The Director is responsible for picking a service provider likely to provide best service to the user.
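
As a rough illustration of the Director's role described in the Figure 1 caption, the following Python sketch picks a mirror using the two signals the text mentions: distance to the server (approximated here by network hops) and the relative load of each server. It is only a sketch under stated assumptions, not the WebOS implementation. The names Mirror, score, and pick_mirror, the load scale, and the weights are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Mirror:
    """One replica of a service as seen by the client (hypothetical fields)."""
    name: str
    hops: int               # network hops between the client and this mirror
    load: float             # reported relative load: 0.0 (idle) to 1.0 (saturated)
    reachable: bool = True  # False if the client's last request to it failed


def score(mirror: Mirror, hop_weight: float = 1.0, load_weight: float = 10.0) -> float:
    """Lower is better. The weights are illustrative assumptions, not from the paper."""
    return hop_weight * mirror.hops + load_weight * mirror.load


def pick_mirror(mirrors: Sequence[Mirror]) -> Optional[Mirror]:
    """Return the reachable mirror with the lowest score, or None if none is reachable."""
    candidates = [m for m in mirrors if m.reachable]
    if not candidates:
        return None
    return min(candidates, key=score)


if __name__ == "__main__":
    mirrors = [
        Mirror("nearby", hops=4, load=0.9),
        Mirror("distant", hops=9, load=0.1),
        Mirror("down", hops=2, load=0.0, reachable=False),
    ]
    chosen = pick_mirror(mirrors)
    print(chosen.name if chosen else "no mirror available")
```

With these example values the heavily loaded nearby mirror scores 13 and the lightly loaded distant mirror scores 10, so the distant mirror is chosen, and the unreachable mirror is never considered. Because the choice is made in the client, no front-end machine adds a redirect round trip or becomes a single point of failure, which are the drawbacks of HTTP redirect noted in Section 3.1.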