Sumo: Analysis and Optimization of Amazon EC2 Instances
Total Page:16
File Type:pdf, Size:1020Kb
J Grid Computing DOI 10.1007/s10723-014-9311-x SuMo: Analysis and Optimization of Amazon EC2 Instances P. Kokkinos · T. A. Varvarigou · A. Kretsis · P. Soumplis · E. A. Varvarigos Received: 29 January 2014 / Accepted: 3 September 2014 © Springer Science+Business Media Dordrecht 2014 Abstract The analysis and optimization of public mization (CUO) mechanism, formulated as an Integer clouds gains momentum as an important research Linear Programming (ILP) problem, for optimizing topic, due to their widespread exploitation by individ- the cost and the utilization of a set of running Ama- ual users, researchers and companies for their daily zon EC2 instances. This CUO mechanism receives tasks. We identify primitive algorithmic operations information on the currently used set of instances that should be part of a cloud analysis and opti- (their number, type, utilization) and proposes a new mization tool, such as resource profiling, performance set of instances for serving the same load that mini- spike detection and prediction, resource resizing, and mizes cost and maximizes utilization and performance others, and we investigate ways the collected monitor- efficiency. ing information can be processed towards these pur- poses. The analyzed information is valuable in driving Keywords Public clouds · Analysis · Optimization · important virtual resource management decisions. We Amazon web services · Toolkit also present an open-source tool we developed, called SuMo,which contains the necessary functionalities for collecting monitoring data from Amazon Web Ser- 1 Introduction vices (AWS), analyzing them and providing resource optimization suggestions. SuMo makes easy for any- Cloud computing [1, 2] provides resources as a service one to analyze AWS instances behavior, incorporating over the network, enabling their efficient and flexi- a set of basic modules that provide profiling and spikef ble management. An increasing number of individual detection functionality. It can also be used as a basis users, researchers and companies, established or star- for the development of new such analytic procedures tups, of any size and scope, trust their computing and for AWS. SuMo contains a Cost and Utilization Opti- storage tasks to public and private clouds, replacing fixed Information Technology (IT) costs of owner- ship and operation with variable use-dependent costs. P. Kokkinos () · T. A. Varvarigou Public clouds provide resizable compute capacity as a Department of Electrical and Computer Engineering public service (e.g., Amazon Web Services – AWS [3], National Technical University of Athens, Athens, Greece Rackspace [4]), while private clouds are built based e-mail: [email protected] on the organizations’ own infrastructure, using cloud A. Kretsis · P. Soumplis · E. A. Varvarigos computing toolkits (e.g., OpenStack [5], OpenNebula Department of Computer Engineering and Informatics, [6], Eucalyptus [7]) and providing computing services University of Patras, Patra, Greece to their employees or customers. P. Kokkinos et al. Monitoring, analysis and optimization are a set of recently as an important field, studied in a number of important interrelated operations for cloud resource works [11, 12]. management. The sheer number of cloud resources In our work, we investigate ways in which the makes it difficult for a simple user or administra- monitoring information collected by a public cloud tor to effectively monitor and analyze their behavior provider in general and by AWS in particular, can be or control (optimize) the parameters that determine used in a “smart” way to produce valuable informa- their proper use. Estimates in [8] place the number tion and actionable data for the proper and efficient of servers in commercial data centers, used for pro- use and management of the virtual resources. Towards viding public or private cloud services, to 450.000 this end a number, of basic algorithmic operations servers, while even larger ones (in the order of 106 to are identified, including resource profiling, perfor- 107 machines) are envisioned for the future [9]. Also, mance spike detection, performance prediction and organizations may utilize hundreds of virtual instances resource resizing. We also present a tool, called SuMo, of a public cloud provider for their operations [10]. that we developed to assist in cloud analysis and A good analytic and optimization entity ensures that optimization. SuMo contains the necessary mecha- the monitored resources run uninterrupted and with nisms for collecting monitoring data from AWS and acceptable performance and utilization, keeping the it incorporates an initial set of basic algorithms (for associated (e.g., energy) costs low, either proactively profiling and spike detection) for analyzing them. (e.g., by informing the administration for an ill behav- It also contains a Cost and Utilization Optimization ior) or reactively (e.g., by detecting problems as they (CUO) mechanism for minimizing the usage cost of appear). Amazon EC2 instances and maximizing their uti- There are a number of important differences lization and the performance efficiency, based on between analyzing private and public clouds, both Integer Linear Programming (ILP) formulation. The regarding the way monitoring is performed and proposed mechanism collects information on the type, regarding the parameters of interest and the require- number, utilization and other features of the cur- ments. In private clouds the administrator has full rent set of AWS instances running and proposes a access to the resources and is able to monitor any new set of instances that could be used for serving kind of parameter (performance, energy, etc.), with the same load. Our experimental results show that any kind of software or hardware tools and at any the proposed algorithm can increase the utilization granularity. This is not the case for public clouds (e.g., and efficiency of the infrastructure resources, while Amazon Web Services - AWS [3]) where the users lowering the accumulated user costs. When neces- have access to their virtual resources through web sary, CUO also increases resources’ capacity so as or programmable interfaces that provide only specific to resolve possible performance bottlenecks. Also, information (e.g., state, performance), and at specific a number of conclusions are drawn regarding the granularities (e.g., every 5 minutes). The main param- effects that instances’ characteristics (capacity gran- eters of interest for public clouds are the cost of ularity, region of operation) have on utilization and the resources and their utilization. Utilization directly cost. SuMo makes easy for anyone, a researcher affects the cost, since it determines how effectively the or an administrator, to monitor his instances and resources of the cloud provider are being used. run the proposed analytic mechanisms or implement The cost encountered by a user of public clouds new more intelligent ones. SuMo tool is open-source depends mainly on the pricing policy of the cloud and available through github [47]. Currently, the provider, the number and types of resources used, and development of the core SuMo functionalities has the resources’ utilization. Today, there are signs of been completed and we are investigating whether merchandising the virtual/cloud resources as goods SuMo could be coupled with other tools and in (like gold, stocks, or energy). For example, recently, it particular, cloud management software for federated became possible for a user of AWS [3] who has bought clouds. a number of computing resources, to resell the unused The remainder of the paper is organized as follows. part of his owned resources as the needs change, such In Section 2 we report on previous work. In Section as selling capacity for projects that end before the term III we describe Amazon’s computing and monitor- expires. In general, cloud economics have emerged ing services. In Section 4, we discuss the algorithmic SuMo: Analysis and Optimization of Amazon EC2 Instances challenges posed by the analysis of cloud monitor- the platform via an API. The platform will imple- ing data. In Section 5, we present the SuMo toolkit ment a multi-agent brokering mechanism that looks and its constituent modules. In Section 6 we describe for services matching the applications’ request, and an Integer Linear Program (ILP) formulation used possibly composes the requested service if no direct for resource re-optimization (resizing). In Section 7 hit is found. Other similar projects are Optimis [16, we present performance results obtained using SuMo 40] and Aeolus [41]. Aoleus is an open-source Cloud and the CUO mechanism. Finally, in Section 8 we management software that allows users to choose conclude the paper. between Private, Public or Hybrid Clouds, using δ- Cloud library. The Optimis Toolkit offers a platform for Cloud service provisioning that manages the life- 2 Previous Work cycle of the service and addresses issues like risk and trust management. In [45] authors present a federated IT monitoring and analysis has long been around, cloud management solution that utilizes cloud-brokers targeting the needs of physical servers and other for various IaaS providers. In this work, among oth- devices (printers, switches, ups, etc.), clusters, grids ers, a monitoring service has been designed with the [13] and small or large data centers. Clouds bring capability to simultaneously monitor both private and a completely new environment and introduce new public clouds. In [46]