ElasticMark: An Elasticity Benchmarking Framework for Cloud Platforms
Sadeka Islam
Dissertation submitted in fulfilment of the requirements for the degree of Doctor of Philosophy
School of Computer Science and Engineering, Faculty of Engineering
March 2016
ORIGINALITY STATEMENT
I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.
Signed ...... Sadeka Islam Mar 2016
COPYRIGHT STATEMENT
I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.
Sadeka Islam Mar 2016
AUTHENTICITY STATEMENT
I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.
Sadeka Islam Mar 2016
ACKNOWLEDGEMENTS
When life is sweet, say thank you and celebrate. And when life is bitter, say thank you and grow.
Shauna Niequist
The path to a doctoral degree is full of many ups and downs. Thanks to the Almighty Lord for granting me the ability to adapt well to those fluctuations! More specifically, I am thankful to Him for letting me work on a fabulous topic - elasticity; the more time I spent with it, the more I became enamored with its beauty. This is the area where my mind loves to surf, this is the place where my heart finds its route to contentment!
My passion towards research did not grow suddenly out of thin air; as I look back and connect the dots, I see a group of brilliant scholars as my primary source of inspiration. It is their passionate zeal for research that motivates my young research mind, it is their amazing influence that encourages me to strive hard for quality research. I owe them a huge debt of gratitude. Among them, first comes the name of my supervisor, Dr. Anna Liu; she is an excellent mentor with a good combination of academic skills and business insights. Her inspiring words significantly boosted my confidence. Her thoughtful guidance helped me reach completion within the assigned timeframe. I also feel blessed to have worked with Professor Alan Fekete; his creative thinking and visionary views have had a profound impact on my overall research. He also guided me to the fascinating world of benchmarking. Working with my co-supervisor, Dr. Srikumar Venugopal, enhanced my research rigor. He also guided me through the exciting sphere of control-theoretic elasticity mechanisms. My joint supervisor, Dr. Hiroshi Wada, provided me invaluable advice for my reading course and earlier work, which quickly got me up and running. I am also indebted to Dr. Sherif Sakr for his sound guidance on my literature review work. Professor Len Bass’s insightful feedback proved instrumental in shaping up my dissertation; I remember that with sincere gratitude. I also worked with Dr. Jacky Keung and Dr. Kevin Lee in my earlier research projects; I appreciate their help and advice.
I am profoundly grateful to my annual review panel - Professor Ross Jeffery, Professor Fethi Rabhi, Dr. Helen Paik and Dr. Adnene Guabtni, for their thoughtful comments and improvement suggestions. Ross also helped me define my research scope using GQM; his amazing mentorship is truly appreciated.
Among others, I’d like to thank the UNSW learning center and Ms. Pam Mort for arranging the thesis writing workshop. My sincere gratitude goes to Mr. Colin Taylor for his sound advice regarding administrative matters. I am also grateful to Coursera for offering high quality online courses, which not only quenched my thirst for knowledge but also influenced my thinking pattern.
During this PhD, I submitted my papers to a number of conferences and journals and received insightful suggestions from anonymous reviewers and shepherds. I appreciate their invaluable feedback; undoubtedly, it enhanced the quality of my work. My sincere gratitude also goes to all researchers and practitioners who provided their constructive feedback on my research in those conferences.
I acknowledge the generous research grants from Amazon Web Services (AWS) and RightScale, which helped me carry out a series of comprehensive experiments. I received several travel grants too, which allowed me to present my research at numerous conferences; thank you all for supporting a poor PhD student.
And finally, I would like to express my heartfelt gratitude to my family for their pure, transparent, unconditional love and support. I am really fortunate to have a wonderful dad who, despite not having a background in computer science, eagerly reads my research papers and asks me so many interesting questions. He and my extraordinary father-in-law also sent me motivational messages and blessings for my successful completion. I also appreciate my husband’s incredible love and support during the roller-coaster ride of this PhD. These people are the precious assets of my life, my everlasting source of inspiration. Contentment is nothing but a sparkling of silent appreciation in their eyes whenever I accomplish something. Words lose their expressive power whenever I try to thank these beautiful minds!
To the Almighty God, who has given me the ability to learn and reason, enlightened my mind with knowledge, and guided me towards a successful completion.
ABSTRACT
Elasticity is the unique cost-effective proposition of the cloud. It promises rapid adjustment of resources in response to varying workloads so that the application can meet its QoS objectives with minimal operational expenses. It is a crucial attribute that all commercial cloud providers frequently claim to possess in their offerings. However, existing literature has not yet provided any meaningful and systematic guidance to evaluate the elasticity of the cloud platform from the consumer’s viewpoint. The lack of an elasticity benchmarking framework makes it difficult for the consumer to diagnose and avoid elasticity issues, validate various claims of elasticity and compare the desirability of competing elastic cloud platforms.
As such, this thesis proposes a novel consumer-centric elasticity benchmarking framework ElasticMark that reflects the elasticity of the cloud platform as a single figure of merit. It takes the consumer’s perspective on running the benchmark by incorporating her application and workload profiles and then encapsulating the consumer’s business objectives into the elasticity metric based on observations accessible via the cloud APIs. The core framework is comprised of a penalty based elasticity measurement model, a standard workload suite with time-varying workload patterns and a set of guidelines for instantiating an executable benchmark. The measurement model derives the elasticity metric based on financial penalty rates resulting from over-provisioning (unutilized resources) and under-provisioning (inadequate resources) when the cloud-hosted application is exposed to a suite of fluctuating workloads. ElasticMark also includes a novel workload model in order to assist the consumer in generating representative prototypes of her application-specific fine-scale bursty workloads, thus facilitating custom elasticity benchmarking. Furthermore, this framework recommends a set of rigorous techniques to ensure repeatable and valid benchmarking results in the presence of the performance unpredictability of the cloud environment. The framework has been validated against a widely-used commercial cloud service to make sure that the elasticity metric is a good reflection of the low-level adaptability characteristics and observed phenomena. It has also been proven effective in comparing and contrasting the elasticity of multiple cloud platforms as well as pinpointing anomalies in their adaptive behaviors.
PUBLICATIONS AND RESEARCH GRANTS
Publications that have been included in this dissertation:
• Sadeka Islam, Srikumar Venugopal, and Anna Liu. Evaluating the impact of fine-scale burstiness on cloud elasticity. In Proceedings of the Sixth ACM Symposium on Cloud Computing, pages 250–261. ACM, 2015.

• Sadeka Islam, Kevin Lee, Alan Fekete, and Anna Liu. How a consumer can measure elasticity for cloud platforms. In Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering, pages 85–96. ACM, 2012.
Other publications that have not been included in this dissertation:
• Sadeka Islam, Jacky Keung, Kevin Lee, and Anna Liu. An empirical study into adaptive resource provisioning in the cloud. In IEEE International Conference on Utility and Cloud Computing (UCC 2010), page 8, 2010.

• Sadeka Islam, Jacky Keung, Kevin Lee, and Anna Liu. Empirical prediction models for adaptive resource provisioning in the cloud. Future Generation Computer Systems, 28(1):155–162, 2012.
Research grants:
• AWS Education Research Grant.

• RightScale Premium Education Account.

Contents
List of Abbreviations
List of Figures
List of Tables
1 Introduction
1.1 Cloud computing
1.1.1 Deployment models
1.1.2 Service models
1.1.3 Pricing models
1.1.4 Cloud ecosystem
1.2 The truth about elasticity: expectation vs. reality
1.2.1 Consumers’ expectations about elasticity
1.2.2 Reality: imperfect elasticity
1.3 Why is an elasticity benchmark the need of the hour?
1.4 State of the art
1.5 Research problem, hypothesis and goal
1.6 Contribution and impact
1.7 Scope and assumptions
1.7.1 Scope
1.7.2 Assumptions
1.8 Research method
1.9 Terminologies used
1.10 Thesis overview
2 Background and state of the art
2.1 Background
2.1.1 Elasticity: definition and characteristics
2.1.2 Comparison: Elasticity, Scalability and Efficiency
2.1.3 Foundation
2.2 Related work
2.2.1 Cloud performance analysis and benchmarking
2.2.2 Elasticity benchmarking: initial concepts
2.2.3 Elasticity benchmarking frameworks
2.2.4 Elastic scaling techniques
2.2.5 Elasticity quality assurance frameworks
2.2.6 Elasticity modeling and simulation
2.3 Summary
3 The core framework
3.1 Introduction
3.2 Consumer-centric elasticity measurement
3.3 Elements of the elasticity benchmarking framework
3.3.1 Penalty model
3.3.2 Single figure of merit for elasticity
3.3.3 Workload suite specification
3.4 Executable benchmark instantiation
3.4.1 Choices for an elasticity benchmark
3.4.2 Experimental setup
3.4.3 Configuration and measurement procedure
3.5 Case studies
3.5.1 Exploring workload patterns
3.5.2 Exploring the impact of scaling rules
3.6 Discussion
3.7 Critical reflection
3.8 Conclusion
4 Customized benchmarking? Use fine-scale bursty prototypes
4.1 Introduction
4.2 Fine-scale burstiness: evidence and repercussions
4.3 Prevalent fine-scale burstiness studies
4.4 Multifractal analysis
4.5 Modeling methodology
4.5.1 Step 1: Characterization of pointwise regularity
4.5.2 Step 2: Synthesis of fine-scale burstiness
4.5.3 Step 3: Trend construction and superposition
4.5.4 Working example
4.5.5 Comparison with other methods
4.6 Experimental setup
4.7 Case study
4.7.1 Effect of fine-scale burstiness on elasticity
4.7.2 Trends in the elasticity penalty rate
4.7.3 Summary
4.8 Critical reflection
4.9 Conclusion
5 Validity and repeatability? Tame the unpredictability
5.1 Introduction
5.2 Runtime variability in elasticity behavior
5.3 Causes of runtime variability
5.4 Prevalent evaluation methodologies
5.4.1 General features
5.4.2 Common pitfalls
5.5 Rigorous evaluation techniques
5.5.1 Experiment design
5.5.2 Data analysis
5.6 Experimental setup
5.7 Case study
5.7.1 Rigorous method
5.7.2 Comparison with prevalent methods
5.7.3 Summary
5.8 Critical reflection
5.9 Conclusion
6 Conclusion and future work
6.1 Recap on research problem and objective
6.2 Contributions
6.3 Critical reflection
6.4 Future directions
6.5 Concluding remarks
Bibliography
Appendices
A Elasticity definition
B Elasticity behavior of standard workloads
B.1 Standard workload suite
B.2 Elasticity behaviors
C Elasticity behavior of fine-scale bursty workloads
D A comparison of elasticity evaluation methodologies: Environmental bias perspective
E Variability in elasticity behavior

List of Abbreviations
AWS Amazon Web Services.
B2C Business to Consumer.
CoV Coefficient of Variation.
DaaS Database as a Service.
DNS Domain Name Server.
DoE Design of Experiment.
EC2 Elastic Compute Cloud.
ECU Elastic Compute Unit.
GAE Google App Engine.
GCE Google Compute Engine.
GQM Goal Question Metric.
IaaS Infrastructure as a Service.
NIST National Institute of Standards and Technology.
OLTP Online Transaction Processing.
PaaS Platform as a Service.
QoS Quality of Service.
RDBMS Relational Database Management Systems.
RDS Relational Database Service.
ROI Return On Investment.
RPS Requests Per Second.
S3 Simple Storage Service.
SaaS Software as a Service.
SLA Service Level Agreement.
SLO Service Level Objective.
SME Small and Medium Enterprise.
SOA Service-Oriented Architecture.
SPEC Standard Performance Evaluation Corporation.
SUT System Under Test.
TCO Total Cost of Ownership.

List of Figures
1.1 Cloud service models
1.2 Cloud ecosystem (adapted from [67])
1.3 Gartner’s hype cycle for cloud computing 2011
1.4 Traditional IT vs. automated elasticity (adapted from [233])
1.5 Elasticity behavior of EC2 platform in response to a sinusoidal workload
1.6 Motivation for an elasticity benchmark
1.7 Research method: high-level approach
2.1 Architecture of an elastic system (adapted from [126])
2.2 Taxonomy of factors for analyzing elasticity benchmarking frameworks
2.3 Overview of macro-benchmarking modeling approaches
2.4 Elasticity measurement concept of Dory et al. (adapted from [94])
2.5 Overview of micro-benchmarking modeling approaches
2.6 Evaluation criteria for elasticity benchmarking frameworks
2.7 Classification scheme for elasticity techniques (adapted from [110])
3.1 Elasticity behavior of EC2 in response to a periodic sinusoidal workload
3.2 Elasticity behavior of EC2 with ruleset 1 in response to a sinusoidal workload with 30 minutes period
3.3 Results of the trapping scenario
3.4 Elasticity behavior of EC2 platform with ruleset 1 for exponential workload with growth 24/hour and decay 3/hour
3.5 Rippling effect for sinusoidal workload with 90 minutes period
4.1 Wikipedia workload snippet (Oct 1 2007)
4.2 EC2 platform’s elasticity behavior under non-bursty and bursty workloads
4.3 Partition function, Z(q, a) vs. timescale, a
4.4 Scaling exponent, τ(q) vs. q
4.5 Hölder function, deterministic trend and generated fine-scale bursty prototype
4.6 Workload reproduced using LIMBO toolkit
4.7 EC2 platform’s elasticity behavior for non-bursty and bursty workloads (smooth vs. sigma150)
4.8 Estimated probability of average CPU utilization
4.9 Behavior of the Tomcat application server under non-bursty and bursty workloads (smooth vs. sigma150)
4.10 Response time percentiles
4.11 Trends in elasticity penalty rates
5.1 Random elasticity behavior of the m1.medium instance
5.2 Elasticity affected by random scaling delay
5.3 Diversified workload suite smooths out the random bias to some extent
5.4 An example 3-level experiment design
5.5 Runtime variability in the elasticity score when comparing the elastic improvement of the m1.medium instance with respect to the m1.small instance
5.6 Quantile-quantile (qq) plot of the sample elasticity scores
5.7 Prevalent method evaluation

List of Tables
2.1 Comparison: Elasticity, Scalability and Efficiency
2.2 Characteristics of different elasticity macro-benchmarking frameworks
2.3 Characteristics of elasticity micro-benchmarking frameworks
2.4 Evaluation of elasticity benchmarking frameworks
3.1 Autoscaling engine configuration
3.2 Penalty for Benchmarking Workloads - Ruleset 1
3.3 Penalty for Benchmarking Workloads - Ruleset 2
3.4 Penalty for Benchmarking Workloads - Ruleset 3
4.1 QoS degradation for under-provisioning in non-bursty and bursty workloads
4.2 Elasticity penalty for fine-scale burstiness
5.1 State of the art elasticity evaluation methodologies
5.2 Elasticity metrics for a 3-level hierarchical experimental design (following [143])
5.3 Classification of the outcomes of the prevalent method and the rigorous method
Chapter 1
Introduction
“Mirror, mirror on the wall, which cloud is the most elastic of all?”
Cloud consumer
1.1 Cloud computing
Cloud computing is a popular computing paradigm that delivers IT capabilities on demand over the network [204]. Although the widespread adoption of cloud computing has gained momentum over the last few years, its root can be traced back to the mainframe computing era when computers were beyond the affordability range of individuals and small companies. Large enterprises, though they could afford to buy such expensive computers, were constantly seeking ways to increase their utilization using timesharing technology for improving their Return On Investment (ROI) [72]. These circumstances eventually drove these companies to find a cost-effective alternative to the mainframe system. Therefore, in 1961 when Professor John McCarthy suggested the concept of computing delivered as a public utility, it was believed to solve all the aforementioned problems at a stroke [36]. This idea received immense popularity in the late ’60s; however, it eventually faded away because of several issues, such as application stack lock-ups due to incompatibilities among hosted application environments, inefficient network infrastructure etc. [107, 68]. It took several decades to establish the concept of utility computing as a sustainable reality. Technological innovations, such as cheaper micro-processors and datacenters, fiber optic networks, virtualization, web services - all of these contributed to the realization of computing as a public utility [68]. Cloud computing indeed embraces the idea of utility computing; it delivers IT capabilities (such as infrastructure, software, development environment) as a service on demand and the cost incurred is approximately proportional to the amount of resources consumed. It facilitates economies of scale for both parties - the provider and the consumer; the provider can now enjoy better server utilization by serving a large consumer base, thereby achieving faster ROI as well as lower Total Cost of Ownership (TCO). Likewise, the elimination of upfront capital investment as well as cheaper and readily accessible services on demand significantly reduces the cost per transaction at the consumer’s end; therefore, the consumer’s application can now serve the end-users with lower operational expenses. Undoubtedly, “economies of scale” is the key driving force behind the mainstream adoption of the cloud over the recent years [204].
Substantial efforts were spent to provide a precise definition of “cloud computing” and characterize its key features. Some of these attempts viewed the cloud from a technical perspective [232, 108, 69, 11, 70, 106, 68, 48], while some others focused solely on the delivery model that enables the consumer’s business agility [204, 176, 177, 178]. Despite these differences in viewpoints, all of these works unanimously agreed on the following key characteristics of cloud computing: on-demand self-service so that the consumer can provision and release resources automatically without any human intervention of the cloud service provider; broad network access so that resources are readily accessible from anywhere through different types of client platforms (e.g., mobiles, tablets, laptops, workstations); a set of pooled resources at the service provider to dynamically serve the demand of multiple consumers; rapid elasticity to enable quick allocation and deallocation of resources in response to fluctuating demand; and measured service to leverage pay-as-you-go pricing for resource usage. Among these features, elasticity and metered pricing enable the cloud’s unique vision of offering economies of scale; as resources are charged based on actual usage in the cloud, this implies that elasticity has the power to shape the operational expenses in proportion to the workload intensity faced by the application. This is what most Small and Medium Enterprises (SMEs) and even large enterprises have long yearned for; they neither wanted to throw away their resources on an application which cannot draw enough customers (over-provisioning) nor wanted to starve their application of resources when it suddenly becomes wildly popular overnight (under-provisioning). What they really wanted to have is a resource capacity that matches the workload intensity in real time, and cloud elasticity brings this long-cherished dream into reality.
1.1.1 Deployment models
The cloud deployment model refers to different ways of setting up the cloud environment, typically characterized by ownership, size and accessibility concerns [74]. Several factors need to be considered while choosing a deployment model for hosting a cloud application, such as the application’s requirements, time-to-market need, overall budget, security and so on. At present, cloud computing has four types of deployment models [11, 48, 178]: private cloud, community cloud, public cloud and hybrid cloud.
In a private cloud, the datacenter’s hardware and software are operated and managed by the organization itself or a third party and these resources remain dedicated for the internal use of the organization (i.e., these are not available to the general public). Large enterprises (and also government organizations) may find the private cloud beneficial because of enhanced security, accountability and resilience. However, the upfront capital investment, longer time-to-market (typically 6–36 months) and management overhead may make it less attractive to small startups and medium enterprises.
Community cloud is a more generalized form of the private cloud where the datacenter’s resources are managed and operated by several organizations or a third party to support some common interests (e.g., security and compliance requirements, mission-critical objectives) of the community. Since the cloud resources are shared by more than one organization, the participating organizations need to establish fair charging policies and governance procedures for the efficient operation of the community cloud.
Public cloud is the most generalized form in the spectrum, where the datacenter’s resources are made available to the general public in a pay-as-you-go manner. In the public cloud scenario, the datacenter is owned and maintained by an organization and it sells cloud resources based on its own charging policy. Public cloud eliminates the risks of upfront capital investment and longer time-to-market as well as maintenance overheads; on the downside, the consumers lose some visibility and control over the internal workflow of the underlying virtualization environment.
The last one, i.e., the hybrid cloud deployment model, is composed of several clouds (private, community, or public) which act as standalone entities but are bound together by some standardized technology. Sometimes enterprises may need to use a combination of private and public cloud (i.e., the hybrid cloud) to handle sudden surges of their workloads (e.g., during a product launch event, Black Friday deals).
1.1.2 Service models
Any resource which is delivered over the internet can be considered as a cloud service. Some of these cloud services provide infrastructure resources, some offer specialized development environments and proprietary APIs for developers and testers while some others deliver sophisticated software applications to promote business productivity. These services are classified into three main categories: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS) [11, 232, 48, 178, 161, 169, 63]. These services are interdependent and form different layers in the cloud technology stack; this is depicted in Fig. 1.1.
Figure 1.1: Cloud service models (SaaS: Google Docs, Dropbox, Box.net, Salesforce CRM; PaaS: Elastic Beanstalk, Azure Cloud Service, App Engine; IaaS: EC2, Rackspace, Azure Virtual Machine, Google Compute Engine)
IaaS is the most rudimentary form of cloud computing where the service providers offer a shared pool of infrastructural resources (e.g., CPU, memory, bandwidth, storage etc.) using virtualization technology so that the consumers can customize, deploy and run arbitrary software and applications. Some well-known examples falling into this category are Amazon Web Services (AWS) Elastic Compute Cloud (EC2) [18], Rackspace [28], Azure Virtual Machines [34], Google Compute Engine (GCE) [16] and OrionVM [24].
PaaS provides an additional level of abstraction on top of IaaS; i.e., it provides a software development environment with proprietary APIs and tools. The consumers can host their applications on the platform and tune the deployment configuration. It facilitates quick development, testing and deployment of software applications by eliminating the overheads of software licensing and maintenance tasks. However, on the downside, the developers may have to work with a smaller subset of programming languages and tools supported by the platform; their visibility and control over the infrastructure may also be limited. Some examples of this category include Amazon’s Elastic Beanstalk [35], Azure Cloud Services [14], Google App Engine (GAE) [3] and Facebook’s Developer platform [20].
SaaS offers customized versions of software applications which are typically accessible through web browsers, tablets and cellphones. Some popular examples of SaaS include Google Docs [21], Office 365 [23], Dropbox [17], Box.net [10] and Salesforce CRM [29]. Some of the SaaS applications are free to use, while some others require subscription-based usage.
1.1.3 Pricing models
A well-designed pricing model is a key ingredient for the success of cloud computing [238]. The central theme of cloud resource pricing is the “pay-per-use” model where the consumer is charged based on the amount of resource usage. Cloud providers nowadays employ different pricing models for charging resources, which can be considered as slight variants of the “pay-per-use” theme. The current pricing models for cloud resources can be classified into four categories: pay-as-you-go, pay for resources, subscription-based pricing and dynamic pricing [41, 45, 144, 50].
In the pay-as-you-go model, consumers pay a fixed price per unit of resource usage. Cloud services, such as AWS [15], Microsoft Azure [27] and Google Cloud Platform [26], adopt this model to charge the consumer a fixed price for using a virtual machine for a predefined charging quantum. Variations exist, though, in the specification of the charging quantum by different providers. For instance, the charging quanta of AWS, Microsoft Azure and GCE are 60 minutes, 1 minute and 15 minutes respectively. The duration of the charging quantum has a significant impact on the consumer’s operating expenses; a recent study by Andra [45] described some use cases where Microsoft Azure’s per-minute based pricing model outperforms Amazon’s hourly pricing model. Applications with short spiky or unpredictable load surges may find this pricing model very useful.
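To make the effect of the charging quantum concrete, here is a minimal sketch (ours, not from the thesis; the hourly rate is a made-up figure) that bills the same usage session under the three quanta just mentioned:

```python
import math

def billed_cost(usage_minutes: float, quantum_minutes: int, rate_per_hour: float) -> float:
    """Bill one VM session: usage is rounded up to whole charging quanta."""
    quanta = math.ceil(usage_minutes / quantum_minutes)
    return quanta * quantum_minutes * rate_per_hour / 60.0

# A 95-minute session at a hypothetical $0.10/hour rate under each quantum.
for label, quantum in [("60-minute (AWS)", 60), ("15-minute (GCE)", 15), ("1-minute (Azure)", 1)]:
    print(f"{label:18s} ${billed_cost(95, quantum, 0.10):.4f}")
```

The 95-minute session costs $0.20 under the hourly quantum but only about $0.158 under per-minute billing, which is exactly why short spiky loads favor finer quanta.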
Pay for resources is another commonly observed pricing model where the consumer pays for the amount of resources consumed. Most of the cloud providers (e.g., AWS [15], Microsoft Azure [27] and Google Cloud Platform [26]) offer their network bandwidth and storage resources based on this pricing model.
In subscription-based pricing, cloud providers offer resources at a discounted rate if the consumer subscribes to use the cloud service for a longer period (e.g., 6 months or 1 year). Amazon’s reserved instance pricing model [1] is an example of this type where the consumer gets a discounted rate if she¹ subscribes to use a fixed resource quantity for a long term, such as a year. Applications with predictable or steady resource usage may be potential candidates for this pricing model.

¹ Throughout this dissertation, the female form of the consumer and the provider is used; however, one can also substitute it with the male form.
In contrast to the above three models, dynamic pricing refers to a variable charging rate per unit of cloud resources based on real-time market conditions, such as auctioning, bargaining, supply vs. demand etc. For example, Amazon’s spot instance pricing model [2] allows the consumer to bid a price for spare EC2 instances; if the bid price is higher than the spot value, then the instance is allocated to the consumer. The spot price, in fact, fluctuates based on the supply and demand for instances. The allocated instance may terminate abruptly when the spot price goes beyond the consumer’s bid price. The main benefit of this pricing strategy is that it is very cost-effective, often 50–90% cheaper than the pay-as-you-go model. The potential use cases to benefit from this pricing scheme include applications with flexible completion times, applications facing an urgent need for large compute capacity and so on.
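A minimal sketch of the spot mechanics just described (the bid and the price history are invented, and real spot markets are more involved than this):

```python
def spot_session(bid: float, hourly_spot_prices: list[float]) -> tuple[int, float]:
    """Run until the spot price first exceeds the bid; each completed hour is
    charged at the prevailing spot price, not at the bid."""
    hours, cost = 0, 0.0
    for price in hourly_spot_prices:
        if price > bid:        # out-bid: the instance terminates abruptly
            break
        hours += 1
        cost += price
    return hours, cost

# A $0.05 bid against a hypothetical spot-price history.
hours, cost = spot_session(0.05, [0.030, 0.032, 0.045, 0.060, 0.030])
print(f"ran {hours} hours before termination, total cost ${cost:.3f}")
```

The session runs for three cheap hours and is then cut off when the market price jumps to $0.06, illustrating both the savings and the abrupt-termination risk.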
1.1.4 Cloud ecosystem
With the advent of cloud computing, technological development has entered a brand new era - rapid elasticity and pay-per-use pricing promote quick development and hosting of innovative applications within a limited budget. As a result, many young entrepreneurs came forward with their innovative ideas and exploited cloud resources to make their applications available to the end-users; this is how a complex cloud ecosystem has evolved over the recent years.
In such an ecosystem, there are five major players who communicate with each other
through a standard business process: cloud provider, cloud consumer, cloud broker, cloud auditor and cloud carrier [169, 63]. The cloud provider offers her resources (e.g., infrastructures, development platforms and applications) to the consumers. The cloud consumer is either an end-user who uses the SaaS applications without any modification or a developer/application provider who uses the cloud infrastructural resources or development platforms to make her product/service available to the end-users. The consumers can request the cloud resources directly from the cloud provider or through a cloud broker. A cloud broker typically negotiates the relationship between the consumer and the provider. He may also offer an enhanced quality cloud service by adding more features to it, such as performance reporting, enhanced security etc. A cloud auditor is an independent entity that verifies whether the cloud provider complies with the regulatory and compliance requirements (e.g., security, privacy, performance) based on objective evidence. And finally, a cloud carrier is responsible for providing connectivity (e.g., network, telecommunication, transport agent) between the consumer and the provider.

Figure 1.2: Cloud ecosystem (adapted from [67])
Fig. 1.2 depicts a simple ecosystem that consists of the cloud provider and the cloud consumer. The interested reader will find more example ecosystems in [169].
1.2 The truth about elasticity: expectation vs. reality
As mentioned in the previous section, elasticity is a cost-effective feature of the cloud that has received significant attention over the years. As suggested by Gartner’s hype cycle (Fig. 1.3), it is an emerging technological innovation which is supposed to hit the mainstream in about 2015–2016. According to ComputerWeekly.com [208], this wave of mainstream adoption has already started to widen in 2015. Under these circumstances, the cloud consumer is very often confronted with this question: Is the current state of elasticity good enough to meet the needs of my application? In the following, we explore the answer to this question in the light of the available facts and empirical evidence.
Figure 1.3: Gartner’s hype cycle for cloud computing 2011
1.2.1 Consumers’ expectations about elasticity
When a new technology comes into the market, people usually try to understand its relevance and significance to their lives and businesses. Cloud elasticity was no exception to this trend. Potential consumers of the cloud, therefore, studied commonly available information sources (e.g., cloud providers’ websites, white papers and research papers, practitioners’ blogs etc.) to form their expectations about elasticity. The general perception about cloud elasticity goes along with the following definition:
“In cloud computing, elasticity is a term used to reference the ability of a sys- tem to adapt to changing workload demand by provisioning and deprovisioning pooled resources so that provisioned resources match current demand as well as possible.” (Cloud computing IT Glossary [12])
This definition points out several advantages of cloud elasticity over traditional IT. First, it relieves the consumer from the burden of upfront capital investment by replacing CapEx (Capital Expenditure) with a moving OpEx (Operational Expenses). Second, the on-demand availability of resources in response to changing workload conditions eliminates the risks of over-provisioning (payment for idle resources) and under-provisioning (unserved demand for inadequate resources); for this reason, it is believed to shape the consumer’s revenue in proportion to the workload demand, something that was never so easy to achieve with conventional IT systems. Fig. 1.4 draws the distinction between an inelastic traditional IT system and an elastic cloud platform; apparently, elasticity appears to be a more economical alternative for serving the fluctuating workload of the cloud consumer.
Figure 1.4: Traditional IT vs. automated elasticity (adapted from [233])
Because of this unprecedented economic appeal of elasticity, it has received significant attention from the cloud consumer community. There are many compelling scenarios where elasticity appears as a boon to the consumer’s revenue [48]. A startup facing rapid growth wishes that its costs start small, and grow as and when the income arrives to match. In contrast, traditional data processing requires a large up-front capital expenditure to buy and install IT systems. Traditionally, the cost must cover enough processing for the anticipated and the hoped-for growth; this leaves the company bearing much risk from uncertainty in the rate of growth. If growth is slower than expected, the revenue would not be available to pay for the infrastructure, while if growth is too fast, the systems may reach capacity and then a very expensive upgrade or expansion is needed. Also, it is common in web-based companies for demand to be periodic or bursty (e.g., the Slashdot effect). The workload may grow very rapidly when the idea is “hot”, but fads are fickle and demand can then shrink back to a previous level. Traditional infrastructure must try to provision for the peak, and so it risks wasting resources after the peak has passed. In summary, elasticity can remove risk from a startup or an enterprise, by allowing “pay-as-you-grow” computing infrastructure where the costs adjust smoothly to rising (and perhaps falling) workload. Motivated by these examples, many consumers became enamored with the elasticity concept and decided to adopt it to maximize their revenues and net profits.
1.2.2 Reality: imperfect elasticity
There are two sides of every story: the bright side and the dark side, and the truth lies somewhere in between. The same applies to the views about elasticity too. Soon after its adoption, some consumers experienced practical issues in the delivered elasticity of the public cloud offerings. For instance, resource acquisition in the public cloud is not instantaneous and the lead time is in the order of minutes; therefore, concerns abound whether it would be a good fit for applications with stringent Quality of Service (QoS) requirements (e.g., web and e-commerce applications, real-time and mission-critical applications). To get a good grasp of the current state of elasticity, we carried out an experiment on the AWS EC2 cloud. In particular, we observed the elasticity behavior of an online bookstore application TPC-W hosted on a dynamic web server farm in the EC2 cloud in response to a sinusoidal workload of 30 minutes period. We configured the scaling policy as follows: add an EC2 instance to the server farm when the average CPU utilization goes beyond 70% for
2 consecutive minutes and remove an EC2 instance when the average CPU utilization goes below 20% for 2 consecutive minutes. The elasticity behavior is shown in the following, where each EC2 instance is assumed to provide 100% of CPU supply per minute.
Figure 1.5: Elasticity behavior of EC2 platform in response to a sinusoidal workload
In Fig. 1.5, demand means the amount of resource needed by the application to serve the requests with satisfactory performance, available supply (a/supply) implies the amount of resource allocated by the platform and chargeable supply (c/supply) indicates the amount of resource for which the cloud platform charges the consumer. This figure suggests that the elasticity offered by the EC2 platform is not perfect; sometimes the platform charges the consumer for unutilized resources (between timepoints 0–12), sometimes the platform introduces minutes-long lead time to allocate the resources (between timepoints 15–20) although it starts charging from the very moment the resource request is placed. It also charges the consumer for already released resources due to the hour-long charging quantum (between timepoints 40–50). In fact, these flaws are not specific to the EC2 platform alone but apply to all cloud platforms to some extent. Therefore, cloud consumers may end up with operational overspending if they do not know how to judiciously use the cloud resources [200]. Additionally, their expected revenues may also suffer if the cloud resources are not available on the fly.
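The curves of Fig. 1.5 map directly onto the two penalty sources that the penalty-based measurement model of this thesis (Chapter 3) accounts for. The following is only a toy sketch of that accounting, with placeholder penalty rates rather than the thesis’s calibrated values:

```python
def penalty(demand, chargeable_supply, over_rate=0.01, under_rate=0.05):
    """Accumulate, minute by minute, the two penalty sources visible in
    Fig. 1.5: paying for unutilized capacity (over-provisioning) and falling
    short of the application's needs (under-provisioning). The rates are
    placeholders in $ per CPU%-minute, not the thesis's calibrated values."""
    total = 0.0
    for d, s in zip(demand, chargeable_supply):
        if s >= d:
            total += over_rate * (s - d)     # charged but idle capacity
        else:
            total += under_rate * (d - s)    # unserved demand degrades QoS
    return total

# One sample per minute: supply lags the demand step, then lingers after
# release because of the hour-long charging quantum.
demand = [100, 100, 200, 200, 200, 100, 100]
supply = [100, 100, 100, 200, 200, 200, 200]
print(f"penalty: ${penalty(demand, supply):.2f}")
```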
This reality check may increase the consumers’ skepticism about the elasticity claims of different cloud offerings. On one hand, there are compelling benefits of elasticity, but on the other hand, there are some concerning facts which may partially shadow its benefits. This is an important riddle that the potential consumer wants to solve at the preprocessing stage of elastic cloud adoption.
Figure 1.6: Consumers need an elasticity benchmark to compare the elasticity claims of various cloud platforms. The consumer’s image is taken from Zedge.net (source: http://www.zedge.net/wallpaper/3428710/).
1.3 Why is an elasticity benchmark the need of the hour?
Example 1.1 Running example: Dona is a local pizza shop owner with some yummy recipes which were very popular in the town. She wanted to boost her business by creating an online presence with a website. She expected a fluctuating traffic pattern with peaks during the lunch and dinner hours; therefore, she decided to host her website in the elastic cloud so that the cost of the infrastructure adapts well to the incoming traffic demand. Based on some back-of-the-envelope calculations [48], she felt convinced about the economic benefit of the elastic cloud compared to traditional infrastructure. Since all cloud providers claim elasticity in their offerings, she did not consider it as an important factor for cloud adoption. She carried out benchmarking for the scalability aspect (i.e., the performance/price ratio) based on a standard benchmark and picked the best one. This approach worked well as long as the request arrival rate for the application remained steady over time.
However, a couple of weeks later, she heard customers’ complaints about her website’s sluggish performance and unavailability during rush hours. After a careful investigation, she found that the site could not scale quickly in response to growing traffic because of the minutes-long resource spin-up delay of the cloud platform, thus experiencing customer abandonment because of unacceptable response time and dropped requests. She also figured out that her operational expenses did not go down as soon as the resources were released because of the hour-long charging quantum of the cloud platform. These phenomena increased
her concerns about the consequences of imperfect elasticity on her business’s revenue and net profit. She also perceived the fact that the delivered elasticity of the public cloud services is not as perfect as their claims. Therefore, she felt the need to compare the elasticity claims of different cloud services and pick the one best suited to her application and workload profile. So, she searched the web for an elasticity benchmark that she could ask:
“Mirror, mirror on the wall, Which cloud is the most elastic of all?”
However, to her disappointment, she could not find that magic mirror and therefore, failed to make a well-informed decision about elastic cloud adoption.
The story portrayed in the above example is somewhat familiar to many potential consumers of the elastic cloud. Enthusiastic consumers, who moved to the elastic cloud just driven by the marketing hype, experienced the repercussions of imperfect elasticity on their applications’ revenues. Apparently, elasticity is a double-edged sword; while good elasticity has the potential to maximize the consumer’s revenue, bad elasticity can worsen the revenue because of operational overspending and unacceptable website performance. Users who find the website unavailable during the scale-out period (when resources are saturated) not only generate zero revenue but also never come back, and will not recommend it to anyone, due to poor service. For this reason, pragmatic cloud consumers feel the need to validate the elasticity claims of different cloud offerings for their applications and workload profiles. However, the irony is that elasticity at present is a frequently-claimed yet never-validated attribute. All public cloud providers claim their offerings to be elastic without specifying any measurement metric (even though none are perfect in delivering instantaneous elasticity). As a consequence, consumers face significant difficulty when they try to compare and contrast the quality of elasticity and pick the one best suited to their applications’ requirements and business objectives.
This is the reason why an elasticity benchmark is so desirable to the consumers of the cloud. A well-designed elasticity benchmark can serve as a useful tool to draw insights about the adaptability behavior of the cloud platform in the context of a given application and its workload profile. It can help cloud consumers evaluate competitive cloud platforms and choose the most suitable one for a given application. It can also be applied to tune the deployment configuration (such as scaling policies) in order to achieve better elasticity for an application hosted on a particular cloud platform.
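As a sketch of that last use case, a consumer might sweep candidate rule sets and keep the one with the lowest measured penalty rate; `run_benchmark` below is a hypothetical stand-in (faked here with a fixed table so the sketch runs) for an actual benchmark run against the platform:

```python
def run_benchmark(scale_out: int, scale_in: int) -> float:
    """Hypothetical stand-in for one benchmark run: deploy the rule set on
    the platform, replay the workload suite and report the penalty rate in
    $/hour. Faked with a fixed table here so the sketch runs end to end."""
    fake_results = {(60, 20): 0.61, (70, 20): 0.48, (80, 30): 0.75}
    return fake_results[(scale_out, scale_in)]

def tune_rules(candidates):
    """Keep the (scale-out, scale-in) CPU-threshold pair with the lowest penalty."""
    return min(candidates, key=lambda c: run_benchmark(*c))

# Candidate CPU-utilization thresholds, in percent.
print(tune_rules([(60, 20), (70, 20), (80, 30)]))   # -> (70, 20)
```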
1.4 State of the art
There are undoubtedly several elasticity benchmarks available for evaluating cloud platforms. However, we have not found any elasticity benchmark in the literature that adequately characterizes and evaluates elasticity from the consumer’s perspective. Most of the works made attempts to characterize the elasticity of the cloud platform based on scaling delay and resource granularity; however, this sort of representation is flawed as it lacks some key aspects of elasticity, such as the charging quantum and pricing model of the cloud platform (because of their direct impact on the consumer’s operational expenses). Scarcely any of these works offered any means to encapsulate the consumer’s concerns (e.g., her application-specific business objectives and workload profiles) in the derived elasticity metric. As a consequence, cloud consumers find it very difficult to comprehend the economic worth of elasticity for their application-specific contexts. Moreover, none of the prevalent elasticity benchmarks provided any explicit guidance to express the elasticity score of a workload collection as a single figure of merit, thereby making it difficult to draw a simple conclusion about one platform’s worthiness over another.
To sum up, there is no adequate yardstick available today to evaluate the elasticity behavior of cloud platforms from the perspective of cloud consumers. As a consequence, it is difficult to identify and fix elasticity issues, validate and compare the elasticity claims of competing cloud offerings and adaptive scaling strategies, and choose the most optimal elasticity solution well-suited to the consumer’s business situation.
1.5 Research problem, hypothesis and goal
Our research problem is defined as follows:
Research Problem The computing community is lacking a meaningful and system- atic means to evaluate the elasticity of the cloud platform from the cloud consumer’s perspective.
By “meaningful and systematic means”, we denote a set of standard techniques that can be applied to evaluate the elasticity of competing cloud platforms.
By “cloud platform”, we mean an adaptive cloud system on which the consumer’s application can run. Typically, an adaptive cloud system is comprised of an underlying cloud infrastructure and a set of scaling policies to adjust the resource capacity in response to a fluctuating workload.
By “cloud consumer’s perspective”, we mean doing everything with the consumer’s specific concerns and limitations in mind. In other words, it denotes putting oneself into the consumer’s shoes while evaluating the cloud platforms and supporting her decision-making process with relevant metrics. The following criteria need to be met in order to satisfy the consumer’s perspective: first, the evaluation process should make provision for incorporating the consumer’s application and realistic workload profiles. Second, only those observations which are accessible to the consumer through the platform’s APIs and performance tools need to be recorded. And finally, the reported metrics should readily reflect the degree to which the consumer’s business objectives have been met so that the consumer can easily draw simple conclusions about one platform’s worthiness over another.
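For instance, on AWS the per-minute utilization observations that are legitimately visible to the consumer can be pulled through the CloudWatch API; a minimal boto3 sketch follows (the instance ID is a placeholder, and configured AWS credentials are assumed):

```python
from datetime import datetime, timedelta, timezone

import boto3  # AWS SDK for Python

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
end = datetime.now(timezone.utc)
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    StartTime=end - timedelta(hours=1),
    EndTime=end,
    Period=60,                  # one averaged observation per minute
    Statistics=["Average"],
)
for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"])
```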
To address the research problem, we have formulated the following hypothesis:
Research Hypothesis It is possible to develop an elasticity benchmarking framework to evaluate the elasticity of cloud platforms, so that consumers can make well-informed decisions about elastic cloud adoption.
By “benchmarking framework”, we refer to a conceptual abstraction that includes a set of metric definitions, a representative workload suite and a precisely defined procedure to yield those metrics for fair comparison of alternative platforms with respect to a specific attribute, such as elasticity. The elasticity benchmarking framework serves as a basic template to instantiate an executable elasticity benchmark, given the consumer’s specific context (e.g., application domain, workload characteristics) and business objectives. Designing a generic elasticity benchmarking framework that can be applied consistently across all types of cloud platforms and application domains is far from being trivial in the limited timeframe of a PhD. For this reason, we constrain our focus by considering a limited range of cloud platforms and application domains.
Research Goal Our research objective is defined using the Goal Question Metric (GQM) method [53, 231]:
Design a benchmarking framework for the purpose of Evaluation (i.e., measurement,
analysis and comparison) with respect to the elasticity of cloud platforms from the viewpoint of cloud consumers in the following context: Online Transaction Processing (OLTP) type applications hosted on public IaaS and PaaS VM type offerings and charged based on the pay-per-use pricing scheme.
The research questions to accomplish this goal are specified in the following:
RQ 1 How can we design a core elasticity benchmarking framework for cloud platforms from the consumer’s viewpoint?
RQ 1.1 How can we define a metric for measuring elasticity of the cloud platform from the consumer’s perspective?
RQ 1.2 How can we design a standard workload suite for elasticity evaluation?
RQ 2 What are the concrete steps for instantiating an executable elasticity benchmark?
RQ 3 How can we reproduce custom prototypes of actual workloads for elasticity evaluation?
RQ 4 How can we ensure repeatability and validity in the elasticity benchmarking results in the presence of the performance unpredictability of cloud platforms?
Having defined the goal and research questions, we will now move on to highlight our contributions in the next section.
1.6 Contribution and impact
The primary contribution of this dissertation is ElasticMark - a novel elasticity benchmarking framework that takes a consumer-centric view while assessing the elasticity of cloud platforms. Our contribution advances the state of the art in several ways. First, it helps the cloud consumer make a well-informed decision about elastic cloud adoption by comparing and contrasting the elasticity of competing cloud platforms. Furthermore, it serves as a crucial tool to support innovation in the area of adaptive cloud systems. Performance analysts and adaptive system designers can use this framework to evaluate the effectiveness of alternative adaptive mechanisms and identify possible areas for further improvement. In addition to this, it helps the cloud provider determine the worth of her elastic cloud offerings with respect to other competitive offerings in the market, thereby giving her an opportunity to optimize her products to better address the consumer’s objectives.
In the following, we highlight our individual contributions and their impacts:
• A core framework for evaluating the elasticity of competing cloud platforms (e.g., cloud offerings, adaptive scaling strategies) from the consumer’s perspective.
The first concrete proposal that clearly and objectively incorporates the consumer’s viewpoint while evaluating the elasticity of cloud platforms. Unlike other frameworks in the computing literature that measure only the technical aspects of elasticity (e.g., scaling delay and precision between demanded and allocated resources), our framework quantifies elasticity in terms of the complex interaction between the technical aspects and the consumer’s business situation. This makes our task more difficult than usual, as it involves looking at the evaluation problem through the lens of the consumer and understanding the consumer’s preferences, business context and other constraints (e.g., limited visibility through the cloud APIs). Nevertheless, this undertaking is worthwhile as it gives the consumer a rational basis to make a well-informed decision about elastic cloud adoption. This framework follows a penalty-based approach for elasticity evaluation; it includes metric definitions that help the consumer understand the deviation between the desired and the perceived level of elasticity in terms of monetary units (a toy sketch of this penalty computation appears at the end of this item). Another noteworthy feature of this framework is that it expresses elasticity as a single figure of merit, thus helping the consumer draw a simple conclusion about the worthiness of alternative cloud platforms.
Elasticity implies the cloud platform’s adaptive behavior in response to variation in the resource demand of the workload. The workload suite for elasticity evaluation in this framework, therefore, includes a set of time-varying workload patterns to stress the platform’s adaptability. Each of these workload patterns represents realistic usage scenarios commonly seen in web and e-commerce applications (e.g., periodic variation, flash crowds). This workload suite helps the consumer understand the cloud platform’s elasticity behavior in realistic settings. By plotting the workload intensity and elasticity behavior over time, it is also possible to diagnose adaptability issues and identify potential areas for optimization.
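To make the penalty-based idea concrete, the following is a minimal sketch - not the framework’s actual metric definition, which is developed in Chapter 3 - of how over- and under-provisioning penalties might be accumulated from per-interval observations; the penalty rates, interval length and sample series are illustrative assumptions.

```python
def penalty_based_elasticity(demand, supply, over_rate, under_rate,
                             interval_hours=1.0 / 60):
    """Accumulate a monetary penalty for imperfect elasticity.

    demand, supply: per-interval resource quantities (e.g., VM counts)
    over_rate, under_rate: hypothetical penalty rates in $ per resource-hour
    Returns the total penalty; a perfectly elastic platform scores 0.
    """
    total = 0.0
    for d, s in zip(demand, supply):
        if s > d:    # over-provisioning: paying for unutilized resources
            total += (s - d) * over_rate * interval_hours
        else:        # under-provisioning: inadequate resources violate QoS
            total += (d - s) * under_rate * interval_hours
    return total

# Example: demand spikes faster than supply can follow.
demand = [2, 4, 6, 6, 4, 2]
supply = [2, 2, 4, 6, 6, 4]
print(penalty_based_elasticity(demand, supply, over_rate=0.5, under_rate=2.0))
```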
• Specific guidance for instantiating an executable elasticity benchmark.

We also provide guidelines for instantiating an executable elasticity benchmark from the conceptual framework. It involves making concrete choices about the benchmark application, workload suite specification and the consumer’s non-functional objectives (e.g., QoS thresholds, monetary effects for violating the QoS thresholds when the elasticity is imperfect). Additionally, it elaborates the details of the testbed setup and guides through the specifics of the configuration and measurement procedure that the consumer has to follow to conduct elasticity benchmarking for cloud platforms. All these steps are illustrated with a working example, thus providing the consumer precise guidance on instantiating an elasticity benchmark. The core framework and instantiation procedure were published in:
How a consumer can measure elasticity for cloud platforms. Sadeka Islam, Kevin Lee, Alan Fekete, and Anna Liu. In Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering, pages 85–96. ACM, 2012.
• A novel workload model for generating representative prototypes of fine-scale bursty workloads based on traces.
Although the standard workload suite includes workloads representative of some common real-world usage scenarios, sometimes a need may arise for reproducing realistic workload prototypes based on the consumer’s application-specific traces. Our literature review reveals that web and e-commerce workloads include many noisy oscillations, or burstiness, at the small timescale (e.g., in the order of seconds), which are difficult to model with simple mathematical equations. To resolve this issue, this framework includes a novel workload model for reproducing realistic prototypes of fine-scale bursty workloads, thus assisting the consumer to carry out customized elasticity evaluation in a cost-effective way (a toy illustration follows the citation below). This work was published in:
Evaluating the impact of fine-scale burstiness on cloud elasticity. Sadeka Islam, Srikumar Venugopal, and Anna Liu. In Proceedings of the Sixth ACM Symposium on Cloud Computing, pages 250–261. ACM, 2015.
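The sketch below is emphatically not the workload model of that publication; it merely illustrates the general idea of overlaying fine-scale (per-second) noisy oscillations on a smooth demand envelope, using an arbitrary uniform-noise burst model:

```python
import random

def naive_bursty_prototype(envelope, burst_factor=0.5, seed=7):
    """Overlay per-second noise on a smooth per-second envelope of request rates.

    envelope: iterable of per-second request rates (the coarse trend)
    burst_factor: relative magnitude of the fine-scale oscillation (assumed)
    """
    rng = random.Random(seed)
    return [max(0.0, r * (1.0 + rng.uniform(-burst_factor, burst_factor)))
            for r in envelope]

# Two minutes of nominally flat load, roughened at the one-second timescale:
bursty = naive_bursty_prototype([100.0] * 120)
```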
• A set of rigorous techniques for ensuring valid and repeatable elasticity benchmarking results in the presence of the performance unpredictability of the cloud environment.
Elasticity is a random variable whose precise nature depends on various random factors of the cloud environment; as a result, the derived elasticity metric shows some variation across different runs of the same workload. The elasticity benchmarks to date have not taken into account the influence of performance unpredictability on elasticity, nor have they proposed any means to ensure consistent benchmarking results. To address this gap in the literature, our research investigates the impact of performance unpredictability on elasticity and suggests a set of rigorous techniques for ensuring repeatable and valid elasticity benchmarking results in the presence of such unpredictability. This is a distinctive feature of our framework that has not been considered by previous elasticity benchmarks.
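By way of illustration only (Chapter 5 details the actual techniques), one elementary precaution is to repeat each benchmark run and report an interval estimate rather than a single number; the sample values and the choice of five runs at a 95% confidence level are invented:

```python
import statistics

def metric_interval(runs, t_quantile=2.776):
    """Mean and 95% confidence half-width for repeated elasticity-metric runs.

    t_quantile: Student's t for n-1 = 4 degrees of freedom at 95% (n = 5 runs).
    """
    mean = statistics.mean(runs)
    half_width = t_quantile * statistics.stdev(runs) / len(runs) ** 0.5
    return mean, half_width

mean, hw = metric_interval([0.42, 0.47, 0.44, 0.51, 0.45])
print(f"elasticity penalty: {mean:.3f} +/- {hw:.3f} $/hour")
```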
• Insightful case studies demonstrating the applicability of the framework for the purpose of elasticity evaluation.
Several empirical case studies conducted on Amazon’s EC2 cloud demonstrate the applicability of the concepts and techniques proposed by our framework. These studies draw some interesting insights about the elasticity of the EC2 platform. Additionally, they reveal the difference between technical elasticity and consumer-perceived elasticity. Furthermore, these studies pinpoint some shortcomings of the commonly-used adaptive scaling strategies and suggest several approaches for improving the consumer-perceived elasticity. The extracted insights and optimization prescriptions are themselves valuable contributions to the performance engineering literature.
Some additional benefits of this research are summarized as follows:
• A framework that promotes a proactive decision-making approach to elastic cloud adoption, specifically with respect to the consumer’s application context and business situation.
• Several pieces of empirical evidence that unravel counter-intuitive notions about consumer-perceived elasticity.
• A case study that raises awareness about the impact of fine-scale burstiness on the elasticity of the cloud platform.
• An empirical analysis of the interaction between fine-scale burstiness and cloud elasticity.
• Scientific papers in top-tier peer-reviewed conference proceedings [139, 140]. A conference paper, elaborating the rigorous techniques for ensuring valid and repeatable elasticity benchmarking results in the presence of the performance unpredictability of the cloud, is currently in progress. A survey paper on elasticity is also being drafted.
1.7 Scope and assumptions
This section describes the scope and assumptions of our work.
1.7.1 Scope
This dissertation concentrates on designing an elasticity benchmarking framework and providing guidance for instantiating an executable benchmark so that the consumer can make a well-informed decision about elastic cloud adoption. Elasticity benchmarking, however, is a vast area of research; covering its entire scope in the lifetime of a PhD is far from trivial. For this reason, we have decided to restrict our scope to a limited range of cloud offerings and application types. In the following, we shed light on the inclusions and exclusions of our scope:
• All IaaS and some PaaS based cloud providers nowadays offer resources as a bundle and charge the consumer based on the leased time for those bundles. For example, Amazon’s EC2 instance includes CPU, memory and disk storage, and the pricing for an m1.small EC2 instance is 4.4¢ per hour. Our framework applies only to this type of cloud offering. However, it is not uncommon to see some PaaS offerings (e.g., Heroku [22], AWS Lambda [9]) where there is an abstraction around the underlying resource configuration and pricing occurs based on the number of completed tasks. Our framework at this moment does not cover this category.
• At present, our benchmarking framework considers only the public cloud offering, which has one or more pricing models for renting resources. We have not investigated the applicability of our framework for other cloud deployment models (e.g., private cloud, community cloud) where the pricing scheme is not that obvious.
• Our benchmarking framework at present applies to the VM type which supplies a fixed amount of resources throughout its lifecycle. We plan to extend our framework to VM types with variable resource supply or bursting capability in the future.
• Our benchmarking framework currently considers only the pay-per-use pricing model. Incorporating other pricing models (e.g., dynamic pricing, subscription-based pricing) is one of the priorities of our future research.
• Developing a generic elasticity benchmarking framework to assess the elasticity of all application types (e.g., batch processing applications, real-time and mission-critical applications) would be a large area of research, as different applications have different resource access patterns, workload profiles, and business objectives. We, therefore, restrict ourselves to the scope of OLTP type applications, because this application type is considered to be at the heart of e-business and has to maintain stringent QoS (e.g., response time, availability) even in the face of highly fluctuating workload demands. Since cloud elasticity today is not yet mature (i.e., neither immediate nor fine-grained), it is more likely to have a negative impact on the revenue of these applications.
• We have not provided any guidance on extrapolating the benchmarking results. Consumers with large complex applications may often feel the need to extrapolate the benchmarking results based on data gathered from a small test application deployed into the cloud. However, providing a solution to this problem is not straightforward, as it involves understanding the complex dependency structure of different application components, their deployment configurations, bottleneck switching among different components, and the design of workloads representing the interactions among various components. For this reason, we have considered extrapolation as future work.
• We have decided to restrict our benchmarking case study to different adaptive scaling strategies of the AWS EC2 cloud platform. Setting up the testbed and carrying out benchmarking on other cloud offerings (e.g., Microsoft Azure, GCE) require a great deal of development, management and economic effort. For this reason, we have decided to leave this out of our scope for the time being and explore it in the future.
• It is not our intent to develop queueing-theoretic models based on the instantiation of our benchmarking framework.
• Another area we have not covered is the development of elasticity optimization techniques. It is an interesting research direction that we plan to explore in the future.
1.7.2 Assumptions
Our elasticity benchmarking framework relies on several assumptions. We recognize that some of those may not always hold in a real-world setting. However, we have found these assumptions reasonable enough to evaluate and analyze the elasticity of the cloud platforms in our case studies. As our research continues, we will gradually relax or refine some of our assumptions so that the framework can also cope with the imperfections of the real world. For the time being, we specify the following assumptions:
• There is a running version of the application hosted on the cloud which exhibits adaptive behavior in response to the fluctuating resource demand of the workload.
• The consumer has access to the resource utilization and performance related information through the platform’s APIs and other performance tools.
• There are sufficient infrastructure resources available to evaluate the cloud platforms over the full range of workload demands.
• The consumer’s application has one or more QoS objectives, whose violation has financial implications. It is also possible to express the financial implication as a mathematical function.
• The consumer has the necessary analytical skill to cater for the mapping between low-level observations and high-level business objectives as well as to instantiate appropriate functions from the metric definition templates.
• The consumer has the necessary skill to split the monetary cost for the VM bundle among its contained resources.
• The consumer has the necessary development and deployment skills to configure the testbed and representative adaptive scaling techniques and take measurements for elasticity evaluation.
Figure 1.7: Research method: high-level approach (define elasticity metrics, drawing on the literature and empirical observation; design the workload suite and develop the workload model; outline the steps for instantiating an executable benchmark; develop an executable benchmark; recommend techniques for valid, repeatable benchmarking results)

1.8 Research method
The research method adopted in this dissertation is based on both theoretical concepts and practical views. The research problem is formulated based on a comprehensive literature review of academic publications and experience reports from the practitioners’ blogs. The high-level methodology for the design and development of ElasticMark is depicted in Fig. 1.7.
Initially, an intuitive understanding of elasticity and its core aspects is formed based on the state of the art review and empirical observations on a real cloud platform. The next task is to define the elasticity metric so that it represents the complex interplay between the technical elasticity aspects and the consumer’s concerns. To meet this criterion, a penalty-based approach is adopted that penalizes the cloud platform for imperfect elasticity in terms of monetary cost (e.g., an hourly penalty rate).
Workload characteristics play a crucial role in consumer-centric elasticity evaluation. Since elasticity is a measure of the adaptability behavior of the cloud platform, it needs to be stressed with fluctuating workloads during evaluation. For this reason, we design a time-varying workload suite with representative usage scenarios. In addition, we include a novel workload model that allows the consumer to reproduce representative prototypes of her fine-scale bursty workloads (i.e., workloads with noisy oscillations at the small timescale, such as seconds) from traces and carry out elasticity evaluation for custom scenarios in a cost-effective way.
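Purely as an illustration of such time-varying patterns (the actual workload suite is specified in Chapter 3; all parameters below are invented), simple closed-form envelopes can approximate periodic variation and flash crowds:

```python
import math

def periodic(t, base=100.0, amplitude=80.0, period_s=3600.0):
    """Sinusoidal request rate approximating periodic (e.g., diurnal) variation."""
    return base + amplitude * math.sin(2.0 * math.pi * t / period_s)

def flash_crowd(t, base=100.0, peak=900.0, onset_s=1800.0, width_s=300.0):
    """A sudden surge around onset_s that decays quickly, mimicking a flash crowd."""
    return base + peak * math.exp(-((t - onset_s) ** 2) / (2.0 * width_s ** 2))

# Request rates sampled once per minute over an hour:
rates = [flash_crowd(60 * i) for i in range(60)]
```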
Furthermore, we specify a set of precisely defined steps for instantiating an executable elasticity benchmark from the abstract framework. This process is exemplified through the construction of a prototype elasticity benchmark. This prototype serves two purposes: first, it helps us demonstrate the applicability of our framework in the context of a real cloud offering, such as the AWS EC2 cloud; second, it helps us explore interesting phenomena and pinpoint anomalies of the cloud platform.
The elasticity behavior shows some variation across different runs of the same workload because of the performance unpredictability of the cloud, which eventually affects the repeatability and validity of the elasticity evaluation results. To resolve this issue, we present a set of rigorous techniques to ensure valid and repeatable elasticity evaluation results even in the presence of the performance unpredictability of the cloud environment.
1.9 Terminologies used
There are several terms used throughout this dissertation. The definitions of those terms in the context of this research are presented in the following:
• Metric. According to Fenton and Bieman [101], a measure is a number or symbol that characterizes a specific attribute by mapping its empirical notion to the formal, relational world. A metric is “a quantitative measure of the degree to which a system, component or process possesses a certain attribute” [202]. In this dissertation, we define a single-figure elasticity metric so that the consumer can readily understand “how elastic is a given cloud platform” and compare the relative worth of one platform over another.
• Validation. It is the process of “confirmation, through the provision of objective evidence, that the requirements for a specific intended use or application have been fulfilled” [202]. Appropriate validation of software metrics is a necessary precondition for developing a good benchmarking framework [100]. The software measurement discipline stresses the importance of carrying out both theoretical and empirical validation; the former seeks to determine whether the software metric is a good reflection of the attribute it is purported to measure, whereas the latter intends to provide convincing evidence to demonstrate that the metric is practically useful [66, 147].
• Benchmark. It refers to “a procedure, problem or test used to compare systems or components with each other or to a particular standard” [202]. In this dissertation, we define a benchmark as a standardized tool which evaluates the quality of an attribute of a given system (e.g., elasticity of a cloud platform) against an established standard or point of reference. Typically, a benchmark consists of one or more performance metrics, a workload suite and a specification documenting the rules, requirements and constraints of the measurement environment [71]. A benchmark is used to compare the relative worth of various systems for decision-making purposes and to extract insights about the performance bottlenecks in a system.
• Benchmarking framework. By this term, we denote a conceptual abstraction that includes a set of metric definitions, workload specification and guidelines for instantiating an executable benchmark.
• Cloud platform. By this term, we mean an adaptive cloud system on which the consumer’s application can run. Typically, an adaptive cloud system is comprised of an underlying cloud infrastructure and a set of scaling policies to adjust the resource capacity in response to a fluctuating workload.
• Cloud consumer. In this dissertation, this term refers to an application provider who leases cloud infrastructure resources (IaaS) or development platforms (PaaS) to deliver a specific service to end-users.
• Workload. Generally, it refers to “a mix of tasks running on a given computer system” [202]. According to the CloudScale EU project [64], a workload is the combination of work and load, where work denotes the data that needs to be processed by a service to yield a particular result and load refers to the frequency with which the service is invoked to perform the work. In this dissertation, a workload means a sequence of concurrent user requests to be processed by the application. More specifically, a workload has a specific duration and, within that duration, the number of concurrent user requests at each timepoint is precisely defined. Simply put, a workload is a function that relates each timepoint to a specific number of user requests (a small sketch at the end of this list of terms illustrates this view). In this research, our purpose is to evaluate the adaptability aspect of the cloud platform; for this reason, we concentrate on workloads which are time-varying in nature and stress the application to increase and decrease its resource capacity on demand.
• Fine-scale bursty workload. The timepoints of a workload can be viewed at different resolutions or timescales. For instance, one can view the workload at a coarse resolution where the request rates are defined per hour; alternatively, one can view the workload at a fine resolution where the request rates are defined per second. Web and e-commerce workloads exhibit noisy oscillations or burstiness at the fine timescale. In this dissertation, we use this term to denote workloads which exhibit severe oscillations at the fine timescale (e.g., seconds, milliseconds).
• Workload model. By this term, we mean a systematic representation of a workload that resembles some of its important characteristics. A workload model facilitates benchmarking, performance analysis and capacity planning in a managed and cost-effective way.
• Workload demand. By this term, we refer to the resource demand or processing requirement of a given workload. For instance, a workload intensity of 20 requests/second may have a resource demand of 10% CPU capacity and 2MB memory.
• Resource supply. This term implies the amount of resource capacity supplied by the underlying system in response to a resource demand. It can assume two different meanings when used in the context of this dissertation: available supply and chargeable supply. Available supply indicates the amount of resources allocated by the cloud platform, whereas chargeable supply refers to the amount of resources for which the cloud consumer gets charged by the underlying platform.
• Load generator or load driver. It is an important component of the benchmark that initiates workloads during benchmarking [192, 31].
• System Under Test (SUT). It is a collection of system components that processes the workloads generated by the load drivers during benchmarking [192, 31]. In this dissertation, an SUT is comprised of a scalable infrastructure and a management system for monitoring and scaling that infrastructure on demand. The scalable infrastructure is the principal interest of the benchmark; the management system is considered a functional component.
• Testbed. It refers to the entire test setup, which includes the SUT as well as any external systems required to carry out the benchmarking process [31]. In this dissertation, the testbed consists of the SUT and one or more load drivers.
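The following small sketch exists purely to fix intuitions for the workload and workload demand terms above; all figures in it are invented:

```python
from typing import Callable, Dict

# A workload, as defined above: a function from each timepoint (seconds from
# the start, within a fixed duration) to the number of concurrent user requests.
Workload = Callable[[int], int]

def diurnal_workload(t: int) -> int:
    """Toy example: 200 concurrent users during the day, 50 at night."""
    return 200 if 8 <= (t // 3600) % 24 < 20 else 50

def workload_demand(requests_per_s: float) -> Dict[str, float]:
    """Illustrative demand mapping; the per-request CPU and memory costs are made up."""
    return {"cpu_pct": requests_per_s * 0.5, "mem_mb": requests_per_s * 0.1}
```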
1.10 Thesis overview
This dissertation is composed of six themed chapters.
Chapter 1 has presented the motivation of our work. It has also described the contributions and impacts of this research as well as the research problem and hypothesis, scope and assumptions and an overview of the research method.
Chapter 2 begins by laying out the background concepts to understand this research. The position of our research with respect to the state of the art is also discussed in this chapter.
Chapter 3 introduces a core framework for evaluating the elasticity of cloud platforms from the consumer’s perspective. Later in this chapter, we instantiate an executable elasticity benchmark based on this framework and use it to explore some interesting adaptive behavior of the EC2 platform.
Chapter 4 can be regarded as an extension of the core framework. It presents a novel workload model that allows the consumer to reproduce prototypes of their fine-scale bursty workloads and conduct custom elasticity benchmarking in a cost-effective way.
Chapter 5 presents a set of techniques to ensure valid and repeatable elasticity benchmarking results even in the presence of the performance unpredictability of the cloud environment. This chapter brings together several techniques from traditional computing and other fields of science to deal with the variation in the resultant elasticity metric.
Chapter 6 summarizes and critically evaluates the contributions of this work and explores possible avenues for future research.
General remarks. We have not included a separate validation chapter in this dissertation; instead, we have decided to validate our research ideas on a chapter-by-chapter basis. At the end of each concrete chapter, a critical evaluation section is also included that reflects on the potential risks and benefits of the work presented in that chapter.
Chapter 2
Background and state of the art
“Don’t be satisfied with stories, how things have gone with others. Unfold your own myth.”
Jalaluddin Rumi
Elasticity is a relatively new computing paradigm that has the potential to revolutionize the way resources are procured and utilized in the IT industry. Despite the freshness of this concept, a number of obstacles have been identified with it, which have ultimately given rise to a large area of research. The aim of this literature review is therefore twofold: first, we lay out the necessary theoretical grounding to understand the elasticity concept and differentiate it from similar terms. Next, we present an overview of the state of the art research in elasticity and related areas and justify how our work fits into the existing body of literature.
2.1 Background
The topics covered in this section include the definition and characteristics of elasticity, its differentiation from similar terms (e.g., scalability and efficiency) and the architectural overview of an elastic system.
2.1.1 Elasticity: definition and characteristics
Since its inception, interest in “Elasticity” has witnessed a dramatic rise, as it was considered to be one of the most important and appealing features of the cloud. Initially, the term was ambiguous, and there was no common and clear definition and understanding of it. Several efforts attempted to address this gap with the main focus on the cloud; however, those efforts differed in the angle from which they looked at this feature.
A list of definitions of elasticity is given in Appendix A. Most of these definitions viewed elasticity as an extension of scalability that allows dynamic provisioning and deprovisioning of resource capacity on demand ([79, 172, 184, 207, 213, 173, 77, 211, 49, 185, 85]). Some interesting definitions are as follows:
“A simple, but interesting property in utility models is elasticity, that is, the ability to stretch and contract services directly according to the consumer’s needs.” (David Chiu, Editor, CrossRoad [77])
“Elasticity is basically a ‘rename’ of scalability [. . . ].” And “Elasticity is [. . . ] more like a rubber band. You ‘stretch’ the capacity when you need it and ‘release’ it when you don’t anymore.” (Edwin Schouten, IBM, Thoughts on Cloud [211])
These definitions, however, provide an oversimplified view of elasticity and do not shed any light on the quality of the adaptation process.
A large and growing body of literature attempted to provide a more rigorous definition of elasticity by relating it to two important dimensions: rapidity of resource adaptation and fine-grained resource supply [156, 48, 55, 223, 160, 150, 178, 125]. Armbrust et al. [48] defined elasticity as the cloud’s ability to add and remove computing resources at a fine grain and with a lead time of minutes so that the supplied capacity can closely follow the workload demand. With the term “fine grain”, they pointed to the granularity of one server at a time; however, they did not discuss at what granularity the server capacity should be provided.
The National Institute of Standards and Technology (NIST) [178] provided definitions for a number of cloud-specific terms. For elasticity, NIST pointed to rapid provisioning and deprovisioning capability and virtually infinite resource capacity with unlimited purchasable quantity at any single moment.
Garg et al. [119, 118] followed a similar approach to NIST in their elasticity definition. They defined elasticity in terms of two aspects: the mean adaptation delay to expand or contract the resource capacity and the maximum amount of resources that can be provisioned during the peak.
Among others, the definition given by Herbst et al. [125] is widely accepted in the community. Their definition is as follows:
“Elasticity is the degree to which a system is able to adapt to workload changes by provisioning and de-provisioning resources in an autonomic manner, such that at each point in time the available resources match the current demand as closely as possible.” (Herbst et al. [125])
This definition views elasticity as the quality of the adaptation process in response to a fluctuating workload and associates it with scaling speed and precision between demanded and supplied resource quantity. The scaling speed reflects how quickly the system can make a transition from an under-provisioned or over-provisioned state to an optimal resource configuration state. The precision aspect quantifies how closely the demanded and allocated resource quantities match each other. It is intuitive that if resources are provided at the finest granularity in no time, the demand and supply curves will coincide and the cloud platform will be considered absolutely elastic. The real world, however, is not ideal, so demand and supply rarely coincide.
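In the spirit of this definition (this is our paraphrase, not the exact metric formulation of [125]), the precision aspect can be formalized by integrating the mismatch between a demand curve $d(t)$ and a supply curve $s(t)$ over an observation window of length $T$:

\[
\Theta_U = \frac{1}{T}\int_0^T \max\bigl(0,\, d(t) - s(t)\bigr)\,dt, \qquad
\Theta_O = \frac{1}{T}\int_0^T \max\bigl(0,\, s(t) - d(t)\bigr)\,dt,
\]

where $\Theta_U$ captures the average under-provisioned quantity and $\Theta_O$ the average over-provisioned quantity; a perfectly elastic platform drives both to zero.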
Although those resource-oriented elasticity definitions seem reasonable at first blush, one can easily encounter exceptions when looking through the lens of the cloud consumer. These definitions are subjective (as they use terms such as rapidly and closely) and their precise interpretation depends on the consumer’s context. For instance, a cloud platform with a 2-minute scaling delay may be perceived differently by different applications. A batch processing application may find it elastic, whereas an interactive application with stringent QoS requirements (e.g., a web application, online gaming) may perceive the same platform as not-so-elastic. Moreover, these definitions relate elasticity to the platform’s resource adaptability behavior only; looking at these definitions, it is not possible to interpret the meaning of elasticity in terms of “economies of scale” (the main motivating factor for elasticity). As an example, consider two cloud platforms which are identical in all respects (scaling speed and precision) except their pricing models. Based on the resource elasticity definition, these two platforms have identical elasticity behavior; however, this conclusion is misleading as they yield different operational costs per transaction.
To resolve the above issues, the elasticity definition needs to incorporate the consumer’s context and the notion of cost-effectiveness. A number of authors addressed these concerns in their elasticity definitions [201, 40, 94, 93, 113, 111, 112, 155, 145, 103, 129, 78, 243].
Several authors [40, 94, 93, 155] explicitly tied the rapidity of the resource adaptation process to the consumer’s application performance. They emphasized a seamless resource adaptability behavior so that the hosted application does not suffer from service disruption and performance variation when resources are added or removed by the underlying platform.
Gambi et al. [113, 111, 112] specifically stressed two aspects while defining the elasticity of the cloud platform: quick resource adaptation speed to ensure consistent QoS and cost-effective resource usage for minimal operational expenses. Their definitions are as follows:
“Elastic computing systems can dynamically scale to continuously and cost-effectively provide their required Quality of Service in face of time-varying workloads, and they are usually implemented in the cloud.” (Gambi et al. [111])
“Cloud-based elastic computing systems dynamically change their resources allocation to provide consistent quality of service and minimal usage of resources in the face of workload fluctuations.” (Gambi et al. [112])
Daryl Plummer [200], a Gartner Fellow, viewed elasticity as a critical tool to be handled by consumers because of several factors, such as the variable nature of workloads and the lack of control in upfront budgetary planning. In his opinion, companies that lack acumen in managing fluctuating workloads may sometimes end up with operational expenditures far beyond their anticipated limits.
The blog posts from Timothy Fitz and Ricky Ho [103, 129] pointed to an additional aspect of elastic behavior: the granularity of usage accounting. That is, the deprovisioning aspect of elasticity is not only a function of the speed to decommission a resource, but also depends on whether charging for that resource immediately ceases or not. This aspect is crucial to cover the notion of cost-effectiveness in the elasticity definition. Timothy Fitz also identified several enterprise use cases (e.g., extremely parallel computing needs, web user interaction) where the current degree of spin-down elasticity (typically an hour) is inadequate.
Considering all of these definitions, we identify the following key characteristics of elasticity:
• Resource adaptation speed. Ideally, an elastic cloud platform should seamlessly adapt its resource capacity in response to the time-varying workload so as to satisfy the QoS requirements of the hosted application.
• Maximum scaling factor. It refers to the maximum amount of resources that can be procured at once. To the consumer, the amount of resources in the provider’s resource pool should appear to be virtually infinite, and they should be able to acquire as many servers as they need at any single moment to meet their workload demands.
• Granularity of usage accounting. In an elastic cloud platform, a consumer should be charged based on the amount of resources consumed and no more. It depends on two factors: the granularity of the resource supply and the charging quanta of the cloud platform. If resources are allocated at a fine granularity, the supplied resource quantity can closely follow the demand curve, thereby minimizing the operating expenses for idle resources. The charging quanta, on the other hand, determine whether charging for a resource immediately ceases with its release or not; this also influences the operational cost for unutilized resources (a small sketch after this discussion illustrates the effect).
• Cost-effective resource usage. A motivating factor behind elastic cloud adoption is economies of scale; that is, consumers would prefer a cloud offering that minimizes the operational cost per transaction over a range of scales. This essentially points to the pricing strategy of the cloud platform; between two platforms A and B identical in all respects, A provides better elasticity if and only if the cost per transaction is cheaper in A than in B.
Among these aspects, the maximum scaling factor cannot be measured directly at the consumer’s end. However, this is not a major problem because almost all cloud providers nowadays possess virtually infinite resource capacity, which is most often sufficient to serve the peak workloads of the consumer. The other characteristics, however, can be practically measured at the consumer’s end; therefore, we seek to incorporate these characteristics into our elasticity metric definition.
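For instance, under an hourly charging quantum - assumed here purely for illustration, in the style of classic hourly-billed IaaS offerings - the chargeable supply can exceed the available supply that was actually used:

```python
import math

def chargeable_instance_hours(run_seconds, quantum_s=3600):
    """Round usage up to the charging quantum: any partial quantum is billed in full."""
    return math.ceil(run_seconds / quantum_s)

# A VM released after 61 minutes is billed for 2 full instance-hours:
assert chargeable_instance_hours(61 * 60) == 2
```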
2.1.2 Comparison: Elasticity, Scalability and Efficiency
Elasticity, scalability and efficiency are commonly used (and often misused) in the context of cloud computing. Several academic works and practitioners’ blogs [236, 97, 40, 64, 156, 129, 221] discussed the similarities and dissimilarities among these terms.
2.1.2.1 Scalability
In general, scalability is the ability of a system to meet its quality requirements (e.g., supporting more users, improving QoS or both) in response to a growing workload by incrementally adding a proportional amount of resource capacity [141, 240, 95, 87]. In traditional client-server systems, applications are designed for scalability to make sure that the operational expenses grow cost-effectively with respect to workload demand. The case of removing resources, when the workload demand shrinks back to normal, is usually not considered because the purchased resources are already a sunk cost. However, in the context of the cloud and its pay-as-you-go pricing model, the financial motivation for releasing resources during low demand cannot be ignored, and hence the notion of scalability needs further refinement [129]. In cloud computing, scalability describes the ability of a higher-layer service to meet its quality requirements in response to varying workload demand by adjusting its resource consumption from its lower-layer service [64].

Resource capacity can be scaled in two different ways: vertical scaling (also known as scaling up/down) and horizontal scaling (also known as scaling out/in). Vertical scaling means the addition (or removal) of resources to a computing node. Horizontal scaling, on the other hand, refers to the addition (or removal) of computing nodes to an existing cluster.

Note that scalability is a time-free notion and hence does not capture how quickly the system adapts to a changing workload demand over time. In contrast, elasticity is a time-dependent notion which measures how quickly a system can adapt to a varying workload without causing any disruption in the offered service. This rapid responsiveness notion is absent from the definition of scalability. Moreover, scalability does not consider how frequently and at what granularity the system adjusts its resource capacity as the workload demand varies over time. On the contrary, elasticity is concerned with how precisely the supplied capacity follows the workload demand over time; these aspects are therefore considered crucial ingredients in defining and measuring elasticity. In the context of the cloud, elasticity means real-time optimization of operational expenses as workload demand varies; in other words, good elasticity ensures significant savings through reduced operational expenditure and minimal service interruption.
2.1.2.2 Efficiency
Efficiency describes a system’s ability to process a certain amount of work with the smallest possible effort (e.g., cost, consumed energy or consumed amount of resources) [64, 236]. This term can be applied either to part of a system (e.g., a single resource) or to an entire system. The workload for which efficiency is measured could be either constant or fluctuating over time. In contrast, for constant workloads, elasticity is not an attractive option, as there is no need for resource adaptation; the value of elasticity is best realized in a fluctuating-demand setting.
2.1.2.3 Comparison
Table 2.1 compares elasticity, scalability and efficiency based on several criteria. As we can see, elasticity is a dynamic property of the system, as it allows real-time adaptation of the resource quantity in response to workload fluctuations. On the other hand, scalability and efficiency are static properties of the system; neither is concerned with the real-time adaptation process. Scalability and elasticity, however, are related to the adaptability aspect of the system. Scalability is a measure of the system’s ability to function gracefully at different scales, whereas elasticity specifically determines the quality of the adaptation process - that is, whether the system can make smooth and continuous transitions between different scales in real time to guarantee acceptable QoS in response to the fluctuating workload demand. The quality of the adaptation process is measured in terms of adaptation time and accuracy between the demanded and supplied resource quantity. On the contrary, scalability is not at all concerned with responsiveness delay and resource accuracy.
Table 2.1: Comparison: Elasticity, Scalability and Efficiency
Criteria                 Elasticity   Scalability   Efficiency
Dynamic property             X
Adaptability                 X             X
Real-time adaptation         X
Quality of adaptation        X
Performance (QoS)            X             X             X
Resource                     X             X             X
Cost-effectiveness           X             X             X
Note that all of these terms are more or less related to performance, resources and cost-effectiveness. However, efficiency is a broader concept and may apply to a range of domains in the cloud computing context. For example, power efficiency defines how well a system consumes power with respect to time [76, 75], and computational efficiency reflects the ratio between achieved operations and peak operations per second [98]. In the context of scalability, cost may sometimes be implicit, such as when the quality of interest is performance and energy consumption (e.g., a performance/watt metric). In Section 2.1.1, several resource-oriented elasticity definitions were discussed that completely ignore performance and cost aspects. We argue that elasticity is not an absolute concept but a relative one whose precise value depends on the consumer’s context. Therefore, leaving performance and cost out of its definition and measurement conveys an incomplete view of elasticity.
2.1.3 Foundation
This section provides an architectural overview of the elastic system and its properties [96, 237, 57].
2.1.3.1 Elastic system architecture
An elastic system can dynamically adjust its resource capacity in the face of fluctuating workload conditions. Thus, it maintains a tolerable performance threshold in the delivered service with minimal operational expenditure. In particular, two specific features of the cloud serve as the main motivation behind an elastic system: on-demand availability of resources and usage-based pricing model.
An intuitive description of how an elastic system works follows. Usually, an elastic system starts with some minimum allocation of resources so that it can process requests under normal workload conditions. In the course of time, situations may arise (e.g., seasonality of user demand, flash crowds) when the system resources tend to saturate because of a sudden increase in the arrival of requests. In this condition, an elastic system stretches its capacity by allocating additional resources from the cloud to maintain a tolerable performance limit. When the peak is over and the request arrival rate drops back to normal, a fraction of the resources in the system remains unutilized. In this condition, the system contracts its capacity by de-allocating a portion of its resources to the cloud, thus no longer paying for idle resources. This is how an elastic system meets its performance criteria with minimal operating expenses in response to workload fluctuations.
Figure 2.1: Architecture of an elastic system (adapted from [126])
Figure 2.1 shows the basic architecture of an elastic system. At the core of the elastic system, there are two components: a scalable infrastructure and a management system. The cloud providers offer infrastructures or development platforms to the consumers in the form of virtual machines (VMs) with access to network bandwidth and storage. A hypervisor or Virtual Machine Monitor (VMM) abstracts the virtual machines from the underlying physical hardware. The hypervisor has control over the physical server’s resources and can manage allocation (de-allocation) of resources to the virtual machines in case a scale-up (scale-down) is required. In some cases, the cloud consumer prefers to distribute her application load to a pool of virtual machines with dynamic scale-out (scale-in) capability. This task is performed by a load-balancer which sits in front of the pool of virtual machines and forwards the incoming requests to the attached VMs based on a pre-configured routing algorithm (e.g., round-robin, least outstanding requests).

The scalable infrastructure is administered by a Cloud Management System, which is comprised of a load-balancer, a monitoring system, a reconfiguration management module and, in most cases, an elastic controller. The monitoring system periodically monitors the resource utilization and performance characteristics of the VM instance pool. It can also send notifications to the elastic controller or the application administrator once a triggering condition is configured. A trigger watches a single metric (e.g., CPU utilization, latency) or a combination of metrics over a time interval and sends a notification once the observed value of the metric(s) meets a predefined threshold condition for a specific number of times. The elastic controller, on receiving the trigger notification, makes an adjustment to the resource capacity by invoking the reconfiguration management API. The reconfiguration management service places a request to the scalable infrastructure for an allocation (de-allocation) of resources and updates the load-balancer’s instance pool, if necessary, once the request is fulfilled. This is how an elastic system adapts to the workload demand by stretching and contracting its resource capacity in real time. The quality of the adaptation process depends on several parameters, for example, the provisioning and deprovisioning time of the scalable cloud infrastructure, the supplied resource granularity, the cloud provider’s pricing model, the elastic controller’s adaptation strategy to load variation and so on.
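As a hypothetical illustration of the trigger mechanism just described (the metric, threshold and breach count are invented values, not any provider’s actual API):

```python
def trigger_fires(samples, threshold=70.0, breaches_required=3):
    """Fire once a watched metric breaches its threshold a required number of
    consecutive times, in the manner of the monitoring triggers described above."""
    consecutive = 0
    for value in samples:
        consecutive = consecutive + 1 if value > threshold else 0
        if consecutive >= breaches_required:
            return True
    return False

# Scale out when CPU utilization stays above 70% for three consecutive samples:
print(trigger_fires([65.0, 72.0, 74.0, 78.0]))  # True
```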
2.1.3.2 Properties
The behavior of a cloud-based elastic system is a function of various factors and their complex dependencies, such as the adaptation quality of the underlying cloud platform, the workload pattern and the adaptation logic specified in the elasticity controller. These complex dependencies often give rise to hard-to-determine effects on the elasticity behavior of the cloud-based application. For this reason, application providers feel the need for a formalization and verification technique to check whether the elastic system adheres to the intended elasticity behavior. To address this need, Bersani et al. [57] characterized the properties of an elastic system and proposed a formalization of them using a temporal logic called CLTLt(D), which stands for Timed Constraint LTL. This formalization is later used to construct a verification tool for validating whether certain facets of an elastic system hold during its execution in the cloud. In the following, we describe the properties of a cloud-hosted elastic system. For brevity, we skip the formalized definitions; the details can be found in [57]. The properties of an elastic system can be classified into three categories: elasticity, resource management and quality of service.
• Eagerness. It describes the adaptation speed in response to workload variation over time.
• Sensitivity. It denotes the minimum change in workload that starts an adaptation process. This parameter is defined over a range of the recently observed load. If the load stays within a particular limit, then the system resources are adequate to meet the demand, and therefore no adaptation action is required. Otherwise, the system needs to go through an adaptation process to prevent resource saturation or resource wastage.
• Plasticity. It defines the system’s inability to release resources and go back to the minimal resource configuration from a state with higher resource capacity. Ideally, an elastic system should be able to de-allocate a portion of its resources and revert to the minimal resource configuration within a reasonable amount of time once the load intensity decreases; failure to exhibit this behavior makes the system plastic.
• Precision. It defines the accuracy with which the elastic system can allocate or de-allocate resources so that the difference between the supplied resource quantity and the workload demand is minimal (or zero in an ideal case).
• Oscillation. It refers to repeated allocation and de-allocation of resources even when the workload demand is stable. Poorly designed scaling rulesets may sometimes give rise to oscillatory behavior. Although oscillation appears to be a valid behavior, it has a negative impact on the running cost of the cloud application.
• Resource thrashing. It refers to a situation where elastic systems tend to exhibit opposite adaptations in a very short interval. For instance, a system may acquire additional resources in one interval and then, when the resource allocation phase is completed, start to release these resources. This can be thought of as a temporary, yet quick, oscillation in the adaptation of resources. In a resource thrashing situation, the portion of resources taking part in the oscillation cannot perform any useful work, yet they increase the running cost of the cloud-hosted application.
• Cool-down period. It is the amount of time that must elapse after an adaptation action has been triggered before a new scaling action can be issued. During this interval, the elastic controller inhibits itself from triggering any scaling action in order to let the system stabilize after the previous adaptation. The purpose of the cool-down period is to prevent unexpected oscillatory behavior (a minimal controller sketch after this list illustrates the mechanism).
• Bounded concurrent adaptations. It is the maximum number of scaling actions that are allowed by the elastic controller when an adaptation is in progress. It can be viewed as a relaxed generalization of the cool-down period strategy, where the maximum number of scaling actions during an adaptation is one.
• Bounded resource usage. It is a property that allows the elastic system to use resources beyond a certain threshold, if necessary, for a pre-specified interval. This means that whenever the elastic controller allocates more resources than specified by some temporary threshold, it should de-allocate those excess resources before the end of that interval.
• Bounded QoS degradation. It is a property that restrains the amount of QoS degradation during adaptation. It conveys the fact that the normally-required QoS threshold can only be satisfied when the elastic system is stable, i.e., not going through any adaptation process; otherwise, a more relaxed QoS threshold needs to be enforced during adaptation.
• Bounded actuation delay. It restrains the maximum delay introduced by the elastic controller and reconfiguration service to execute a scaling action. Intuitively, the time required for the application to be ready to serve requests must be limited by this value.
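To make the cool-down property concrete, here is a minimal controller fragment of our own devising (not taken from [57]) that inhibits new scaling actions until the cool-down period elapses:

```python
class CooldownGate:
    """Permit a scaling action only if the cool-down period has elapsed."""

    def __init__(self, cooldown_s=300.0):
        self.cooldown_s = cooldown_s
        self.last_action_at = float("-inf")

    def permit(self, now_s):
        """Return True (and record the action) iff we are not cooling down."""
        if now_s - self.last_action_at < self.cooldown_s:
            return False  # inhibit: let the system stabilize first
        self.last_action_at = now_s
        return True

gate = CooldownGate()
print(gate.permit(0.0), gate.permit(120.0), gate.permit(400.0))  # True False True
```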
Preserving the intended elasticity properties may seem reasonably trivial to achieve at first blush, yet some imperceptible situations may arise in which the system fails to exhibit the desired level of elasticity. In Chapter 3, we will present some real-world examples to elucidate this point further.
2.2 Related work
This section presents an overview of the state of the art research on elasticity evaluation and other related areas.
2.2.1 Cloud performance analysis and benchmarking
As mentioned in the previous section, elasticity is a combination of several internal attributes of the cloud platform, some of which are directly related to low-level performance measures (e.g., scaling latency, resource specification). It also has some commonalities with the concept of scalability. From this perspective, our work has some relevance to the area of cloud performance analysis and benchmarking; some of those works analyzed newly arrived features (e.g., variability) of the cloud platform while others refined the existing metrics and evaluation strategies for the old ones (e.g., scalability). In the sections that follow, we highlight the main differences between these streams of research and our work.
2.2.1.1 Variability
Variability refers to the extent of spread or fluctuation in the observed values of a performance aspect of the cloud offering [166]. This branch of research is specifically concerned with quantifying the variability of a set of primitive performance measures, investigating possible causes for variability and recommending viable solutions for its effective management [196, 193, 210, 88, 51, 162, 241, 136, 135, 99, 235, 73, 246, 153].
Several studies discovered substantial variability in the performance (e.g., resource scaling latency, CPU, IO and network performance of VMs) of public cloud offerings [196, 136, 210, 88, 162]. Some key findings from these studies are: (1) the performance of identical VM types is very heterogeneous and may vary by up to a factor of 4 [88]; (2) most of these performance measures exhibit weekly and yearly patterns [196, 136]; (3) the choice of availability zone, time of the day and day of the week often exert significant influence on these performance measures; for instance, VM startup time showed high mean covariance when aggregated by hour of the day or day of the week (up to 91.4% for day of the week for the EC2 m1.small instance in US data centers) in the study of Schad et al. [210]. Most of these works also conducted a series of case studies to analyze the impact of these micro variances on scientific and interactive application prototypes. These empirical analyses finally led the authors to the following conclusions: (1) variability has significant implications for SLA-aware dynamic provisioning systems in terms of both performance and cost, and (2) the complex dependency of the performance measures on the temporal and spatial aspects of the cloud platform has an influence on the repeatability and reproducibility of wall-clock experiments.
Considerable effort has been spent on addressing this variability problem; some works delved deeper to relate the performance variability effect to internal factors of the cloud environment [99, 51, 241, 162, 235, 73], while others suggested novel techniques to mitigate its impact and deliver reliable QoS guarantees for the consumer’s application [51, 193, 241, 99, 153]. Various environmental factors of the cloud contribute to performance variability; examples include heterogeneity in the underlying commodity hardware, interference effects from co-located VMs, and sudden overload due to resource over-commitment at the provider’s end [99, 51, 241, 162, 235, 246]. To resolve this issue, several solutions have been proposed: for instance, heterogeneity-aware VM placement strategies [99]; Overdriver, which adaptively switches between network-memory-based co-operative swap and VM migration techniques for mitigating transient and sustained memory overloads respectively [241]; Q-Clouds, an application-feedback-driven closed-loop controller that tunes resource allocation to guarantee reliable QoS [193]; and several admission control policies for performance isolation (such as round-robin request handling and blacklisting disruptive tenants) [153].
Although at first glance this research does not seem to have much relevance to our work, it offers us some valuable insights for designing the elasticity benchmarking framework. It reveals significant variability in the micro characteristics of elasticity (e.g., scaling latency, resource demand due to variation in the VM performance), thereby setting our expectation that the overall elasticity behavior will also exhibit some variation across different benchmark runs. To reiterate, variability may affect the validity and repeatability of the elasticity evaluation results if appropriate precautions are not taken. To resolve this issue, we recommend a set of rigorous techniques to ensure valid and repeatable elasticity evaluation results in Chapter 5.
2.2.1.2 Scalability
As pointed out in Section 2.1.2, both scalability and elasticity address the adaptability aspect of the cloud platform, albeit in a complementary way. From this perspective, our work is closely connected to scalability evaluation research.
This area of research revolves around some key themes: the development of scalability benchmarks and metrics [151, 64, 225, 116, 93, 79], scalability evaluation of cloud offerings for a range of workloads [151, 79, 93, 155] and the development of scalability testing frameworks [115, 234]. Many works defined scalability measurement metrics in the context of cloud computing [151, 64, 225, 116, 93, 79]. Some example scalability metrics are: scaleup [79]; WIPS and cost/WIPS, where WIPS is defined as the number of requests meeting the Service Level Agreement (SLA) constraints [151]; performance change with respect to workload change (measured as the PRR ratio for the compared workloads, where PRR stands for ‘performance to resource ratio’) [225]; mean transactions per second [46]; and the ratio between system load increase and system capacity increase [116]. Most of these works characterize scalability in terms of performance and/or cost - two critical factors that influence the consumer’s purchase decision. The consumer-centric view in this domain is a good source of inspiration for our research. Despite this similarity, there is a clear distinction between scalability and elasticity measurement. A scalability metric is concerned with the system’s ability to meet its quality objectives (e.g., performance and cost) over a given range of workloads; for instance, it checks whether the system can function gracefully (e.g., in terms of latency) when resources are added in proportion to the workload. That means scalability is measured based on the discrete steady-state behavior of the system (pre-adaptation state and post-adaptation state). It is not at all concerned with the system’s behavior during adaptation, that is, how quickly the system makes transitions between states and how accurately the system calibrates its resource allocation. This is what elasticity is concerned with; for this reason, it is considered the dynamic property of the system. This argument also justifies the complementary relationship between scalability and elasticity; together they represent the adaptability behavior of the system.
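As a hedged illustration of one such metric family (the exact PRR-based definition in [225] may differ from this reading), a performance-to-resource ratio can be compared across two workload levels:

```python
def prr(throughput, resources):
    """Performance-to-resource ratio: useful work extracted per resource unit."""
    return throughput / resources

def scalability_ratio(t1, r1, t2, r2):
    """Compare PRR at a larger workload (t2, r2) against a baseline (t1, r1);
    a value near 1.0 suggests the system scales gracefully."""
    return prr(t2, r2) / prr(t1, r1)

# 1000 req/s on 2 VMs versus 1900 req/s on 4 VMs (all figures invented):
print(scalability_ratio(1000, 2, 1900, 4))  # 0.95
```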
Another obvious difference between scalability and elasticity evaluation relates to the workload specification. Scalability evaluation considers growing workload patterns, that is, workloads that only grow larger [151, 79, 93, 155]; on the contrary, elasticity evaluation requires fluctuating workloads, that is, workloads that grow as well as shrink, to assess the quality of the adaptation process. The contrast is sketched below.
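A minimal illustration of the two workload shapes, assuming request rates sampled once per minute (the generator parameters are our own illustrative choices):

    import math

    def growing_workload(n_minutes: int, start: float = 100.0,
                         step: float = 10.0) -> list[float]:
        # Monotonically increasing load, as used in scalability evaluation.
        return [start + step * t for t in range(n_minutes)]

    def fluctuating_workload(n_minutes: int, base: float = 400.0,
                             amplitude: float = 300.0,
                             period: float = 60.0) -> list[float]:
        # Load that grows and shrinks, as elasticity evaluation requires.
        return [base + amplitude * math.sin(2 * math.pi * t / period)
                for t in range(n_minutes)]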
2.2.2 Elasticity benchmarking: initial concepts
This section presents the initial ideas on elasticity measurement; some of these works followed the naive definition of elasticity and expressed it simply in terms of the resource scaling delay, while others put forth a more concentrated effort to sketch the basic layout of an elasticity benchmarking framework.
As mentioned in Section 2.1.1, elasticity was initially viewed as the scaling delay of the underlying cloud service in adjusting its resource capacity. Inspired by this definition, several research groups focused on measuring different statistics of the scaling delay, characterizing its complex dependency on various non-deterministic factors of the cloud platform, and determining its suitability for handling various workload scenarios; examples include [97, 164, 163, 127, 136, 173, 65]. Recall from Section 2.1.1 that scaling delay is but one of the many aspects of elasticity; reporting it alone is not sufficient to provide a complete view of the elasticity of the cloud platform (not even from the resource elasticity perspective).
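By way of illustration, scaling delay can be measured with a simple poll-until-ready loop against the cloud API. In the following sketch, request_instance and instance_is_ready are hypothetical stand-ins for provider-specific calls (e.g., launching a VM and then probing its status); they are not a real library API:

    import time

    def measure_scaling_delay(request_instance, instance_is_ready,
                              poll_interval: float = 5.0,
                              timeout: float = 600.0) -> float:
        """Seconds from issuing the scaling request until the new instance
        reports ready. Repeating this across runs yields the delay
        statistics (mean, variance, tail) studied in the cited works."""
        t0 = time.monotonic()
        instance_id = request_instance()        # hypothetical provisioning call
        while time.monotonic() - t0 < timeout:
            if instance_is_ready(instance_id):  # hypothetical readiness probe
                return time.monotonic() - t0
            time.sleep(poll_interval)
        raise TimeoutError("instance not ready within timeout")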
The Standard Performance Evaluation Corporation (SPEC) Open Systems Group [37] characterized elasticity in terms of four metrics: Provisioning interval (same as scaling delay), Agility (measures how closely the supplied resource quantity tracks the workload demand), Scale up/down (measures the system's ability to maintain a consistent unit completion time for an increased problem size by adding a proportional amount of resources) and Elastic Speedup (measures whether the performance improvement is proportional to the increase in resource quantity). Note that the first two metrics capture the resource elasticity view, whereas the last two reflect scalability, not elasticity.
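In the spirit of the Agility metric, tracking quality can be summarized as the mean gap between demanded and supplied capacity; the averaging below is our illustrative choice, not SPEC's exact formula:

    def agility_gap(demand: list[float], supply: list[float]) -> float:
        """Mean absolute gap between demanded and supplied capacity per
        sampling interval; 0 means the supply tracks demand perfectly."""
        assert len(demand) == len(supply), "series must be sampled identically"
        return sum(abs(d - s) for d, s in zip(demand, supply)) / len(demand)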
Among others, Suleiman et al. [218] characterized elasticity in terms of elasticity time (i.e., the scaling delay), the minimum and maximum amount of resources that can be added (i.e., how much to scale), the specification and types of resources, and the amount of available resources. The second and third metrics influence the granularity of usage accounting to some extent; however, they are not specific characteristics of elasticity.
Folkerts et al. [105] suggested an elasticity metric based on the price of a varying workload relative to that of the full workload; a cheaper price for the varying workload serves as an indicator of the cloud platform's elasticity. This metric completely disregards the consumer's detriment from spin-up delay and imprecise resource supply, and thus conveys a flawed view of the consumer-perceived elasticity of the cloud platform.
Li et al. [166] characterized elasticity in terms of three metrics: Resource acquisition time, Resource release time, and Cost and time effectiveness (which relates to the granularity of usage accounting to some extent). These discrete metrics, though comprehensive, cannot reflect the complex interaction between the cloud platform's elasticity property and the consumer's application-specific context; this is the gap we want to address in this dissertation.
Binnig et al. [59] sketched the initial layout for an elasticity benchmark. They defined an elasticity metric based on the ratio of WIPS in RT (Web Interactions Per Second satisfying the given Response Time constraint) to Issued WIPS; a ratio of 1 means perfect elasticity of the cloud platform in response to that workload. In addition to this primary metric, they recommended reporting cost (i.e., $/WIPS) and the standard deviation in cost; a small and stable cost value indicates better adaptability of the cloud platform. Furthermore, they also discussed the general characteristics of an elasticity benchmarking workload, such as a slowly growing workload and fast, sudden spikes that stress the platform's adaptability. This idea, although perfectly valid, did not crystallize into a tangible elasticity benchmark; nevertheless, it still serves as a source of inspiration and fuel for many consumer-centric elasticity benchmarking frameworks, including ours.
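As we read it, the primary metric reduces to a simple ratio; the variable names in this sketch are ours:

    def wips_ratio(wips_in_rt: float, issued_wips: float) -> float:
        """Ratio of RT-compliant to issued web interactions per second;
        1.0 indicates perfect elasticity for the given workload."""
        return wips_in_rt / issued_wips

    print(wips_ratio(940.0, 1000.0))  # 0.94: 6% of interactions missed the RT bound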
2.2.3 Elasticity benchmarking frameworks
This section reports on the state-of-the-art research in elasticity evaluation. This stream of research specifically concentrates on the design and development of elasticity benchmarks as well as revealing the pros and cons of alternative elasticity solutions. The publications in this area can be roughly divided into two categories: micro-benchmarking frameworks and macro-benchmarking frameworks. A micro-benchmarking framework focuses on the discrete characteristics of elasticity to reveal potential bottlenecks of the underlying system, while a macro-benchmarking framework attempts to draw an overall conclusion about the relative worthiness of competing cloud platforms.

[Figure 2.2: Taxonomy of factors for analyzing elasticity benchmarking frameworks. Four criteria with their sub-characteristics: Assumptions and scope (Cloud deployment model, Cloud service, System evaluated, Modeling perspective); Measurement framework (Modeling approach, Metric(s), Workload profile, Figure of merit, Testing method); Validation strategy; Pragmatic issues.]
In the subsections that follow, we present a taxonomy of factors for analyzing this diverse set of elasticity measurement frameworks and then critically review these works based on the proposed factors. At the end, we highlight the research challenges and open issues in this domain and incorporate them into our research roadmap.
2.2.3.1 Taxonomy
We now employ a taxonomy for analyzing the elasticity benchmarking frameworks based on four criteria: (1) Assumptions and scope, (2) Measurement framework, (3) Validation strategy and (4) Pragmatic issues that have been addressed. This taxonomy has been derived from the study and analysis of the surveyed elasticity benchmarking frameworks. Fig. 2.2 illustrates the proposed taxonomy.
The first criterion, Assumptions and scope, defines the context and applicability of the measurement framework. It can be further divided into several characteristics: Cloud deployment model, Cloud service, System evaluated and Modeling perspective. The cloud deployment model describes how resources are provisioned based on the organizational structure and the provisioning location. The cloud service is any service that is provided over the internet on demand. In our surveyed measurement frameworks, we have identified three different deployment models, namely private, public and hybrid cloud, and two cloud services, namely Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). The system evaluated aspect refers to the type of the targeted system under study: application or database. The modeling perspective corresponds to the point of view from which the framework is designed and developed. Our surveyed elasticity frameworks either hold the consumer's and/or the provider's perspective or focus solely on technical characteristics.
The next criterion, Measurement framework, usually refers to a skeletal structure with a set of metrics and a set of rules that govern the test conditions and method, for instance, input workload profiles, testbed specification and testing methodology. This aspect can be sub-classified as follows: Modeling approach, Metric(s), Workload profile, Figure of merit and Testing method. The modeling approach describes how the specific aspects of elasticity are conceptualized and represented. For micro-benchmarking frameworks, we characterize it with respect to the elasticity dimensions they address: the capacity, QoS and cost dimensions [186]. For macro-benchmarking frameworks, in contrast, we characterize it based on the technique adopted to represent the elasticity of the whole system; examples include multi-criteria analysis, financial impact analysis and performance overhead analysis. The metric(s) refers to the specific measures used to gauge the elasticity of the SUT. The workload profile corresponds to representative use-case scenarios for a particular domain, e.g., a transaction processing application or a scientific application. Usually, the workloads used for elasticity benchmarking include fluctuating patterns and a mix of transactions. The figure of merit indicates whether the elasticity metric(s) for a set of workload profiles can be combined into a single value. A unified elasticity metric facilitates informed decision-making over several competing cloud solutions. The testing method denotes how much control of, or internal knowledge about, the SUT is needed to carry out the benchmarking task; examples include black-box, grey-box and white-box testing.
The criterion Validation strategy is used to confirm whether the proposed measurement framework can appropriately reflect the intended aspects of elasticity; examples include theoretical analysis, simulation and experimentation.
The last criterion, Pragmatic issues, corresponds to realistic phenomena and considerations that arise most often in practical situations; examples include SLAs and charging anomalies of cloud environments.
[Figure 2.3: Overview of macro-benchmarking modeling approaches — financial impact analysis (Weinman, Tinnefeld et al., Almeida et al.), performance impact analysis (Dory et al., Shawky et al.) and multi-criteria analysis (Majakorpi).]
2.2.3.2 Macro-benchmarking frameworks
A macro-benchmarking framework quantifies the elasticity of the system as a whole. It measures elasticity with respect to a specific class of applications and yields a few summary measures as a reflection of the system's elasticity behavior. It is particularly useful to stakeholders in that it helps them draw conclusions and make informed comparisons across competing cloud offerings, adaptive policies, design alternatives and deployment configurations. It should be noted, however, that it does not necessarily reveal the reasons behind a system's poor elasticity behavior.
The frameworks in this category can be further classified based on their respective approaches to modeling the elasticity behavior. Fig. 2.3 provides an overview of this classification.
2.2.3.2.1 Financial impact analysis
The financial impact analysis approach quantifies the implications of under-provisioning and over-provisioning in terms of monetary cost. These frameworks usually take on the consumer's perspective and hold the assumption that the consumer's application has to meet a predefined SLA while minimizing operational expenses. Failure to meet the SLA due to insufficient resources, or excess payment for unutilized resources, contributes to a penalty which is then transformed into monetary units. The obvious merit of these frameworks lies in modeling the consumer's detriment in monetary terms. However, this requires a solid understanding of the consumer's business situation and the cloud platform's charging policy; failure to accommodate either of these aspects may prevent any practical use of this type of framework.
Weinman's framework. To the best of our knowledge, Weinman [239] was the first to coin the term "penalty model" in his proposed elasticity measurement framework, which explicitly takes on the consumer's viewpoint. The main idea behind Weinman's model stems from the basics of economics, i.e., demand and supply. In his measurement model, the resource demand is expressed as a function of time, $D(t)$, and the allocated resource supply is expressed as a function of time, $R(t)$. Suppose that, for some time point $t_i$, there is a difference between the observed demand $D(t_i)$ and the provided resource supply $R(t_i)$. If $D(t_i) > R(t_i)$, then the system is under-provisioned, i.e., the current resource capacity is not adequate to meet the demand. On the other hand, if $D(t_i) < R(t_i)$, then the system is over-provisioned, i.e., the consumer pays for resource capacity that remains unutilized.
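A minimal sketch of a penalty computation in the spirit of Weinman's model, assuming demand and supply are sampled at regular intervals; the unit penalty rates below are illustrative, and Weinman's original formulation prices the two cases with its own rate structure:

    def penalty(demand: list[float], supply: list[float],
                under_rate: float, over_rate: float) -> float:
        """Total monetary penalty over the observation window.
        under_rate: cost per unit of unmet demand per interval (SLA violations)
        over_rate:  cost per unit of idle supply per interval (waste)"""
        total = 0.0
        for d, r in zip(demand, supply):
            if d > r:      # under-provisioned interval: D(t_i) > R(t_i)
                total += under_rate * (d - r)
            else:          # over-provisioned (or exactly matched) interval
                total += over_rate * (r - d)
        return total

    # Example: a flat supply of 10 units against a fluctuating demand.
    demand = [6.0, 8.0, 12.0, 15.0, 11.0, 7.0]
    supply = [10.0] * 6
    print(penalty(demand, supply, under_rate=2.0, over_rate=0.5))  # 20.5

This discretization makes explicit how both kinds of mismatch, unmet demand and idle capacity, aggregate into a single monetary figure over the observation window.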