Service Differentiation in Multitier Data Centers
Service Differentiation in Multitier Data Centers
Kostas Katsalis∗, Georgios S. Paschos‡†, Yannis Viniotis§, Leandros Tassiulas∗†
∗University of Thessaly, Dept. of ECE, Greece; †CERTH-ITI, Greece; §North Carolina State University, Dept. of ECE, NC, USA; ‡Massachusetts Institute of Technology, LIDS, MA, USA
Email: kkatsalis,[email protected], [email protected], [email protected]

Abstract—In this paper, we study the problem of resource allocation in the setting of multitier data centers. Our main motivation and objective is to provide applications hosted in the data center with different service levels. In such centers, there are several mechanisms the designer can use to achieve such objectives. We restrict our attention to CPU time at the service tier as the resource; the objective we consider is service differentiation, expressed as allocating prespecified percentages of this resource to applications. The mechanisms at the designer's disposal to provide the desired service differentiation include the triplet of load balancing through the switch fabric, enqueueing at a server, and scheduling at a server. We focus on the enqueueing component of this triplet. We provide, through analysis and simulations, "rules of thumb" for situations where simple enqueueing policies can provide service differentiation.

Index Terms—Data Centers, Service Differentiation, Resource Provisioning

I. INTRODUCTION

Although the well-known multitier architecture still remains the basis of datacenter network design, Software Defined Networking (SDN) enforces a new approach to the way we build datacenter networks and use virtualization technologies. A main argument in SDNs is that every single hardware or software component must be a service or a service handler/receiver/producer. It was only recently that large vendors decided to use powerful middleware [1] in order to offer web services optimization. In large-scale deployments for applications (e.g., banking transactions), specialized middleware is now widely used as a service accelerator and optimizer. For example, the VECSS platform [2] was developed jointly by VISA and IBM and is used to process over 8,000 banking transactions per second, with over 500 different transaction types. Specialized middleware can act as the connecting link between SDNs and the so-called Application Delivery Networks (ADN), having a central role in future SDN/ADN-based datacenter designs.

In Figure 1, a (high-level) typical multitier datacenter architecture is presented. The key elements of this architecture are the switch fabric (depicted as a three-tier core-distribution-access architecture), the preprocessing tier and the final processing tier [3]. (Depending on the application, of course, not all requests require preprocessing.) Core routers/switches accept external traffic through the Internet or a VPN. They route this traffic through the fabric to one of several servers in the preprocessing tier. Servers in the preprocessing tier are typically specialized middleware devices that prepare requests for processing in the servers housed at the final processing tier. Typical functions at the preprocessing tier may include XML/XSLT transformations, logging and security/authentication. After preprocessing, requests are forwarded for service at a server in the final processing tier. This is the place where the "main business logic" is executed.

The datacenter hosts several applications/services. Traffic entering the datacenter is naturally classified into classes, called Service Domains (SD) in SOA environments. How CPU time is allocated to the Service Domains clearly affects the performance seen by the application requests. In typical Service Level Agreements (SLAs), metrics regarding this performance are included and expressed in various ways. In this paper, we focus on CPU utilization as one of these metrics. For simplicity and clarity of presentation, we consider CPU utilization of the servers in the preprocessing tier only. As an example, an SLA with CPU utilization metrics can be described as follows: guarantee 20% of total (preprocessing tier) CPU power to SD A, 80% to SD B.

Three elements have an effect on how CPU time is allocated to the requests of a SD and thus can determine whether the SLA can be honored or not. An administrator can control them in order to provide service differentiation to the various SDs. In Figure 1, we label the main control actions available to an administrator. First, an arriving request must be routed through the fabric to one of the servers; we call this the load balancing control. Once it is sent to a server, a request must be enqueued in a buffer; we call the design of the buffering scheme the enqueueing policy. This is the second control at the disposal of the administrator. The third control is the CPU scheduling inside the servers, i.e., deciding from which queue to forward a request for processing.

Note that the controls are distributed: two controls are implemented in the server (but we have many servers in the tier) and one is implemented in the switch fabric (at one or more tiers therein). Clearly, all three controls have an effect on what percentage of the CPU time in the cluster of servers a given SD can have. Design/analysis of a distributed, end-to-end control policy is out of the scope of this study. In this paper, we consider fixed load balancing and scheduling policies and focus on the effect of the enqueueing part of the triplet. Our contribution is twofold: (a) we propose a simple design for the enqueueing policy that is geared towards distributed implementation, with low communication and processing overhead; and (b) we provide "rules of thumb" for when the design can achieve the SLA. We base these rules on both analysis and simulation.

The paper is organized as follows. In Section II, we present the motivation for this study and related work. In Section III, we define the system model and state the SLA and the problem formally. In Section IV, we present the proposed policies; we evaluate them in Section V and provide rules of thumb in Section VI. We conclude the paper in Section VII.

[Figure 1. Multitier Architecture: traffic from the Internet enters the data center switch fabric (load balancing), reaches the preprocessing tier of middleware servers (enqueueing, scheduling) and then the final processing tier of servers and databases; an example SLA contract grants Customer A 20%, Customer B 30% and Customer C 50% of the CPU power.]

II. MOTIVATION AND RELATED WORK

In enterprise (data center) SLAs, performance objectives are described in terms of several and varied metrics, such as cost, response time, availability of resources, uptime, and losses. Meeting such SLAs requires careful management of data center resources and is a fine balancing act of policies for short-, medium- and long-term resource allocation.

When an "overload" occurs, typical SLAs define a different, usually simplified set of performance objectives. For example, under overload conditions it is very difficult to provide response time or loss guarantees. A common example of an SLA in overload conditions is one that focuses on simply stated CPU utilization metrics; we describe a specific one in the next section. Such simplified SLAs have the additional advantage of not requiring knowledge about the arrival and service statistics of the SDs.

In the field of load balancing policies, Bernoulli splitting is a simple mechanism that has been studied under multiple variations (e.g., [8]). Enqueueing mechanisms have been extensively used in the past for providing QoS guarantees [9], while multiclass-multiqueue systems have been studied for token-based systems [10]. Scheduling policies have been mainly used to control response time metrics; work in this space typically assumes arrival or service knowledge [4] or makes use of complex feedback mechanisms [5]. In prior work [6], we proved that static scheduling policies fail to provide arbitrary percentages of CPU utilization for every SD.

III. PROBLEM STATEMENT

A. System model and assumptions

The system model is presented in Figure 2. We begin by omitting the server tier and collapsing the datacenter network fabric into a load balancer function. Modeling the routing through the switch fabric as a single load balancer is sufficient, since we focus on the enqueueing policies in this study; which (core, aggregation or access) switch(es) were responsible for forwarding the traffic to a preprocessing-tier server is irrelevant.

The load balancer is responsible for distributing incoming traffic from a set D = {1, ..., d} of service domains into a cluster of N = {1, ..., n} servers, each with a CPU of capacity μ. Every middleware server is modeled as a multiclass-multiqueue system (MCMQS), with M = {1, ..., m} the set of queues for all the servers. We assume that m < d, i.e., there are not enough queues available to separate traffic belonging to different service domains. This is a reasonable assumption in data centers that offer SaaS, IaaS or PaaS services.

Finally, we assume that the incoming traffic of SD i follows a general arrival and service process with (unknown) arrival rate λ_i and service rate 1/E[S_i], where E[S_i] is the mean service time of SD i. Also, we assume non-preemptive service for the CPU operation. We assume that signaling and feedback across the servers take negligible time; however, we note that the design of our control system is tailored to minimize these effects.

B. Control policy definition

A control policy π is a rule according to which control actions are taken at time t. We identify different control policies for the different control points inside the network.

1) Load balancing policy π_r that defines a forwarding action r(t_r). Say t_r is the time instant that the load balancer must forward an incoming request to the cluster
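As a concrete illustration of a static load balancing policy, the sketch below implements Bernoulli splitting, the mechanism cited in the related work: each arriving request is forwarded to server i with a fixed probability p_i, independently of queue lengths or server state. This is an illustrative sketch under our own assumptions (function name, probabilities, and seed are not from the paper):

```python
import random

def bernoulli_split(probs, rng=random):
    """Pick a server index i with fixed probability probs[i] (Bernoulli splitting).

    The routing decision is static: it ignores queue lengths and server state.
    """
    u, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if u < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off

# Route 10,000 requests to 3 servers with a fixed 0.5 / 0.3 / 0.2 split.
rng = random.Random(7)
counts = [0, 0, 0]
for _ in range(10000):
    counts[bernoulli_split([0.5, 0.3, 0.2], rng)] += 1
print(counts)  # roughly proportional to [5000, 3000, 2000]
```

Because the split probabilities are fixed, such a policy alone cannot react to per-domain load, which is one reason the enqueueing control matters.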
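To make the MCMQS structure concrete, here is a minimal sketch of one middleware server: d service domains share m < d queues, the enqueueing policy is the domain-to-queue map, and a fixed, non-preemptive scheduler decides which queue is served next. The class name, the particular mapping, and the strict lowest-index scheduling rule are illustrative assumptions, not the paper's design:

```python
from collections import deque

class MCMQServer:
    """One middleware server as a multiclass-multiqueue system (sketch).

    `domain_to_queue` is the enqueueing policy: it maps each service
    domain to one of the m queues (m < d, so some domains share a queue).
    """

    def __init__(self, m, domain_to_queue):
        self.queues = [deque() for _ in range(m)]
        self.domain_to_queue = domain_to_queue

    def enqueue(self, domain, request):
        # Enqueueing control: the only decision is which queue gets the request.
        self.queues[self.domain_to_queue[domain]].append(request)

    def next_request(self):
        # Fixed, non-preemptive scheduling control: serve the lowest-indexed
        # non-empty queue (an arbitrary static rule, chosen for illustration).
        for q in self.queues:
            if q:
                return q.popleft()
        return None

# d = 3 domains share m = 2 queues: domains 0 and 1 map to queue 0, domain 2 to queue 1.
server = MCMQServer(2, {0: 0, 1: 0, 2: 1})
for dom, req in [(2, "c1"), (0, "a1"), (1, "b1")]:
    server.enqueue(dom, req)
print([server.next_request() for _ in range(3)])  # ['a1', 'b1', 'c1']
```

Note how the dequeue order is determined jointly by the domain-to-queue map and the scheduling rule; with the scheduler held fixed, the map is the remaining knob.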
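The CPU-utilization SLA of Figure 1 can be stated operationally: over a measurement interval, the fraction of preprocessing-tier CPU busy time consumed by each service domain should match its contracted percentage. A hypothetical checker for that statement (the function names, measured values, and tolerance are our own, not the paper's):

```python
def cpu_shares(busy_time):
    """Per-domain fraction of total CPU busy time over a measurement interval."""
    total = sum(busy_time.values())
    return {sd: t / total for sd, t in busy_time.items()}

def sla_met(busy_time, targets, tolerance=0.02):
    """True if every domain's achieved share is within `tolerance` of its target."""
    shares = cpu_shares(busy_time)
    return all(abs(shares[sd] - targets[sd]) <= tolerance for sd in targets)

# Figure 1 contract: 20% to A, 30% to B, 50% to C; hypothetical measured busy seconds.
measured = {"A": 210.0, "B": 290.0, "C": 500.0}
print(sla_met(measured, {"A": 0.20, "B": 0.30, "C": 0.50}))  # True (21% / 29% / 50%)
```

Stated this way, the SLA needs no knowledge of arrival or service statistics, which is exactly the advantage of utilization-based objectives under overload.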