A Self-Adaptive Auto-Scaling System in the Infrastructure-as-a-Service Layer of Cloud Computing

Total pages: 16

File type: PDF, size: 1020 KB

A Self-Adaptive Auto-Scaling System in the Infrastructure-as-a-Service Layer of Cloud Computing

A thesis submitted by Seyedali Yadavar Nikravesh to the Faculty of Graduate and Postdoctoral Affairs in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Engineering, Carleton University, Ottawa, Ontario. © 2016, Seyedali Yadavar Nikravesh

Abstract

The National Institute of Standards and Technology (NIST) defines scalability, resource pooling, and broad network access as the main characteristics of cloud computing, which provides highly available, reliable, and elastic services to cloud clients. In addition, cloud clients can lease compute and storage resources on a pay-as-you-go basis. The elastic nature of cloud resources, together with the cloud's pay-as-you-go pricing model, allows cloud clients to pay only for the resources they actually use. Although the cloud's elasticity and its pricing model are beneficial in terms of cost, the obligation to maintain Service Level Agreements (SLAs) with end users forces cloud clients to deal with a cost/performance trade-off. Auto-scaling systems balance this trade-off by automatically provisioning compute and storage resources for cloud services. Rule-based systems are currently the most popular auto-scaling systems in industrial environments. However, rule-based systems suffer from two main shortcomings: (a) their reactive nature, and (b) the difficulty of configuring them. This thesis proposes an auto-scaling system that overcomes both shortcomings. The proposed system consists of a self-adaptive prediction suite and a cost-driven decision maker.
The prediction suite remedies the first shortcoming (the reactive nature) of rule-based systems by forecasting the near-future workload of the cloud service. In addition, the cost-driven decision maker uses a genetic algorithm to automatically configure rule-based decision makers. The evaluation results show that the proposed system reduces the total auto-scaling cost by up to 25% compared with the Amazon auto-scaling system.

Acknowledgements

I would like to express my sincere appreciation and thanks to my supervisors, Dr. Samuel A. Ajila and Dr. Chung-Horng Lung, for their support, patience, motivation, and immense knowledge. This thesis would not have been completed without their guidance and help. I would also like to thank my family. Words cannot express how grateful I am to my mother, father, brother, and sister for their support and encouragement. Last but not least, my deepest appreciation goes to my beloved wife, Saloumeh, for her unconditional love, patience, and understanding.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Illustrations
List of Acronyms
List of Symbols
Chapter 1 - Introduction
  1.1 Research Motivation
  1.2 Research Scope
  1.3 Problem Statement
  1.4 Formal Problem Definition
  1.5 Research Goal and Objectives
  1.6 Research Approach
  1.7 Thesis Contributions
  1.8 Outline of the Thesis
Chapter 2 - Literature Review
  2.1 Reducing the Virtual Machine Boot-Up Time
  2.2 Background Concepts
    2.2.1 Infrastructure-as-a-Service Providers
      2.2.1.1 Amazon Elastic Compute Cloud (Amazon EC2)
    2.2.2 Cloud Service Benchmarks
      2.2.2.1 TPC-W
    2.2.3 Service Level Agreements
    2.2.4 Workload
  2.3 Existing Auto-Scaling Techniques
    2.3.1 Target Tier
    2.3.2 Prediction Method
    2.3.3 Decision Making Approach
    2.3.4 Performance Indicator
    2.3.5 Workload Pattern
Chapter 3 - Theoretical Foundation of Machine Learning
  3.1 Formal Definition of the Machine Learning Process
  3.2 Empirical Risk Minimization
    3.2.1 VC-dimension
    3.2.2 Uniform Convergence Rates for the ERM
  3.3 Structural Risk Minimization
  3.4 Effect of the Workload Pattern on the Prediction Accuracy of the Empirical and the Structural Risk Minimizations
  3.5 Artificial Neural Networks
    3.5.1 Multi-Layer Perceptron with Empirical Risk Minimization
    3.5.2 Multi-Layer Perceptron with Structural Risk Minimization
  3.6 Support Vector Machine
Chapter 4 - Proposed Predictive Auto-Scaling System
  4.1 Self-Adaptive Workload Prediction Suite
  4.2 Cost Driven Decision Maker
    4.2.1 The Reason to Choose Genetic Algorithm
    4.2.2 Genetic Algorithm
      4.2.2.1 Individual Presentation
      4.2.2.2 Fitness Function
      4.2.2.3 Genetic Operators
        4.2.2.3.1 Selection Operator
        4.2.2.3.2 Crossover and Mutation Operators
    4.2.3 The Proposed Decision Maker Algorithm
  4.3 Evaluation
    4.3.1 Evaluation of the Cost Driven Decision Maker
    4.3.2 Evaluation of the Proposed Predictive Auto-Scaling System
      4.3.2.1 Amazon Auto-Scaling System Details
      4.3.2.2 Evaluation Results
Chapter 5 - Experiments and Results
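The abstract pairs a workload forecaster with a genetic algorithm that configures the thresholds of a rule-based decision maker. The idea can be sketched as follows; the workload trace, VM capacity, and cost model below are all invented for illustration and do not come from the thesis:

```python
import random

random.seed(42)

# Hypothetical per-interval workload trace (requests/s) and VM parameters.
WORKLOAD = [30, 45, 80, 120, 160, 150, 90, 60, 40, 35, 70, 140, 180, 110, 50]
VM_CAPACITY = 50          # requests/s one VM can serve
VM_COST = 1.0             # cost of one VM for one interval
SLA_PENALTY = 5.0         # penalty per interval where demand exceeds supply

def simulate(rule, trace):
    """Run a reactive threshold rule over the trace; return total cost.

    rule = (scale_out_util, scale_in_util): utilization thresholds in (0, 1).
    """
    scale_out, scale_in = rule
    vms, cost = 2, 0.0
    for load in trace:
        util = load / (vms * VM_CAPACITY)
        if util > scale_out:
            vms += 1
        elif util < scale_in and vms > 1:
            vms -= 1
        cost += vms * VM_COST
        if load > vms * VM_CAPACITY:   # SLA violation this interval
            cost += SLA_PENALTY
    return cost

def evolve(trace, pop_size=20, generations=30):
    """Genetic search over threshold pairs, minimizing simulated cost."""
    pop = [(random.uniform(0.5, 0.95), random.uniform(0.05, 0.45))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda r: simulate(r, trace))
        parents = pop[:pop_size // 2]             # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = (a[0], b[1])                  # one-point crossover
            if random.random() < 0.2:             # Gaussian mutation, clamped
                child = (min(0.95, max(0.5, child[0] + random.gauss(0, 0.05))),
                         min(0.45, max(0.05, child[1] + random.gauss(0, 0.05))))
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda r: simulate(r, trace))

best = evolve(WORKLOAD)
print(best, simulate(best, WORKLOAD))
```

The fitness function folds both instance-hours and SLA penalties into one number, which is the sense in which the decision maker is "cost driven": the GA trades over-provisioning against violations automatically instead of requiring hand-tuned rules.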
Recommended publications
  • A Study of the Effectiveness of Cloud Infrastructure Configuration
    A STUDY OF THE EFFECTIVENESS OF CLOUD INFRASTRUCTURE CONFIGURATION. A Thesis Presented to the Faculty of California State Polytechnic University, Pomona, in Partial Fulfillment of the Requirements for the Degree Master of Science in Computer Science. By Bonnie Ngu, 2020. Thesis Committee: Dr. Gilbert S. Young (Chair), Yu Sun, Ph.D., and Professor Dominick A. Atanasio, Department of Computer Science. ACKNOWLEDGEMENTS: First and foremost, I would like to thank my parents for blessing me with the opportunity to choose my own path. I would also like to thank Dr. Young for all the encouragement throughout my years at Cal Poly Pomona. It was through his excitement and passion for teaching that I found my passion in computer science. Thanks also to Dr. Sun and Professor Atanasio for taking the time to understand my thesis and providing valuable input. Lastly, I would like to thank my other half for always seeing the positive side of things and finding the silver lining. It has been an incredible chapter in my life, and I could not have done it without all the love and moral support from everyone. ABSTRACT: As cloud providers continuously strive to strengthen their cloud solutions, companies, big and small, are lured by this appealing prospect. Cloud providers aim to take away the trouble of the brick and mortar of physical equipment. Utilizing the cloud can help companies increase efficiency and improve cash flow.
  • Autoscaling Hadoop Clusters, Master's Thesis (30 EAP)
    UNIVERSITY OF TARTU, Faculty of Mathematics and Computer Science, Institute of Computer Science. Toomas Römer. Autoscaling Hadoop Clusters. Master's thesis (30 EAP). Supervisor: Satish Narayana Srirama. TARTU 2010. Contents: 1 Introduction (1.1 Goals; 1.2 Prerequisites; 1.3 Outline); 2 Background Knowledge (2.1 IaaS Software: 2.1.1 Amazon Services, 2.1.2 Eucalyptus Services, 2.1.3 Access Credentials; 2.2 MapReduce and Apache Hadoop: 2.2.1 MapReduce, 2.2.2 Apache Hadoop, 2.2.3 Apache Hadoop by an Example); 3 Results (3.1 Autoscalable AMIs: 3.1.1 Requirements, 3.1.2 Meeting the Requirements; 3.2 Hadoop Load Monitoring: 3.2.1 Heartbeat, 3.2.2 System Metrics, 3.2.3 Ganglia, 3.2.4 Amazon CloudWatch, 3.2.5 Conclusions)
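The MapReduce model that Hadoop implements (cf. "Apache Hadoop by an Example" in the contents above) fits in a few lines. This word-count sketch is purely illustrative and independent of Hadoop itself; it just shows the map, shuffle, and reduce phases in sequence:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input split.
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group intermediate values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

splits = ["the quick brown fox", "the lazy dog", "the fox"]
intermediate = list(chain.from_iterable(map_phase(s) for s in splits))
counts = reduce_phase(shuffle(intermediate))
print(counts["the"])  # → 3
```

In Hadoop the splits would live in HDFS and the map/reduce functions would run on different cluster nodes, which is exactly why autoscaling the cluster matters: the number of nodes bounds how many splits are processed in parallel.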
  • Cloud Programming Simplified: a Berkeley View on Serverless
    Cloud Programming Simplified: A Berkeley View on Serverless Computing. Eric Jonas, Johann Schleier-Smith, Vikram Sreekanti, Chia-Che Tsai, Anurag Khandelwal, Qifan Pu, Vaishaal Shankar, Joao Carreira, Karl Krauth, Neeraja Yadwadkar, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica, David A. Patterson. UC Berkeley. arXiv:1902.03383v1 [cs.OS], 9 Feb 2019. Abstract: Serverless cloud computing handles virtually all the system administration operations needed to make it easier for programmers to use the cloud. It provides an interface that greatly simplifies cloud programming, and represents an evolution that parallels the transition from assembly language to high-level programming languages. This paper gives a quick history of cloud computing, including an accounting of the predictions of the 2009 Berkeley View of Cloud Computing paper, explains the motivation for serverless computing, describes applications that stretch the current limits of serverless, and then lists obstacles and research opportunities required for serverless computing to fulfill its full potential. Just as the 2009 paper identified challenges for the cloud and predicted they would be addressed and that cloud use would accelerate, we predict these issues are solvable and that serverless computing will grow to dominate the future of cloud computing. Contents: 1 Introduction to Serverless Computing; 2 Emergence of Serverless Computing (2.1 Contextualizing Serverless Computing; 2.2 Attractiveness of Serverless Computing); 3 Limitations of Today's Serverless Computing Platforms (3.1 Inadequate storage for fine-grained operations; 3.2 Lack of fine-grained coordination; 3.3 Poor performance for standard communication patterns; 3.4 Predictable Performance)
  • Engineering Degree Project: Predictive Autoscaling of Systems Using Artificial Neural Networks
    Engineering Degree Project: Predictive Autoscaling of Systems using Artificial Neural Networks. Authors: Christoffer Lundström, Camilla Heiding. Supervisor: Sogand Shirinbab. LNU Supervisor: Jonas Nordqvist. Examiner: Jonas Lundberg. Semester: Spring 2021. Subject: Computer Science. Abstract: Autoscalers handle the scaling of instances in a system automatically based on specified thresholds such as CPU utilization. Reactive autoscalers do not take the delay of initiating a new instance into account, which may lead to overutilization. By applying machine learning methodology to predict future loads and the desired number of instances, it is possible to preemptively initiate scaling such that new instances are available before demand occurs. Leveraging efficient scaling policies keeps costs and energy consumption low while ensuring the availability of the system. In this thesis, the predictive capability of different multilayer perceptron configurations is investigated to elicit a suitable model for a telecom support system. The results indicate that it is possible to accurately predict future load using a multilayer perceptron regressor model. However, the possibility of reproducing the results in a live environment is questioned, as the dataset used is derived from a simulation. Keywords: autoscaling, predictive autoscaling, machine learning, artificial neural networks, multilayer perceptrons, MLP-regressor, time series forecasting. Preface: Both of us would like to express our sincere gratitude to Ericsson, and especially Sogand Shirinbab, whose guidance and support made this thesis project an invaluable experience for us. We are also grateful to Eldin Malkoc for endorsing us and helping us through the onboarding process. We would like to extend our appreciation and thankfulness to our supervisor Jonas Nordqvist for his generous support and willingness to provide valuable insights.
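The predict-then-scale loop this abstract describes can be sketched in a few lines. A least-squares linear extrapolation over a sliding window stands in here for the thesis's multilayer-perceptron regressor, and the per-instance capacity figure is invented:

```python
import math

def predict_next(history, window=4):
    """Least-squares linear fit over the last `window` points,
    extrapolated one step ahead (a stand-in for an MLP regressor)."""
    ys = history[-window:]
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    denom = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / denom
    intercept = mean_y - slope * mean_x
    return slope * n + intercept          # fitted value at the next step

def desired_instances(predicted_load, per_instance_capacity=100):
    # Provision ahead of demand so instances are up before the load arrives,
    # sidestepping the instance start-up delay that hurts reactive autoscalers.
    return max(1, math.ceil(predicted_load / per_instance_capacity))

history = [120, 180, 240, 310]            # rising load (requests/s)
forecast = predict_next(history)
print(forecast, desired_instances(forecast))  # → 370.0 4
```

The key point the abstract makes survives even in this toy: because the forecast is available one step early, the scale-out decision can be issued before demand materializes rather than after a threshold is already breached.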
  • Autoscaling Mechanisms for Google Cloud Dataproc
    POLITECNICO DI TORINO, Department of Control and Computer Engineering. Computer Engineering Master Thesis. Autoscaling mechanisms for Google Cloud Dataproc: Monitoring and scaling the configuration of Spark clusters. Supervisors: prof. Paolo Garza, prof. Raphael Troncy. Luca Lombardo. Delivery Hero Company Supervisor: Nhat Hoang. Academic year 2018-2019. Contents: 1 Introduction (1.1 The problem: Hadoop cluster autoscaling; 1.2 Apache Hadoop: 1.2.1 Hadoop HDFS, 1.2.2 Hadoop YARN; 1.3 Apache Spark: 1.3.1 Scheduling of stages and tasks, 1.3.2 Dynamic Resource Allocation; 1.4 Google Cloud Platform: 1.4.1 Cloud Storage Connector; 1.5 Elastic for YARN); 2 A piece in the puzzle: OBI (2.1 Architecture: 2.1.1 Heartbeat, 2.1.2 Scheduler, 2.1.3 Predictive module, 2.1.4 API; 2.2 Authentication and authorization; 2.3 Fault tolerance: 2.3.1 Deployment example on Kubernetes); 3 State of the art (3.1 Google Cloud Dataflow; 3.2 Shamash; 3.3 Spydra; 3.4 Cloud Dataproc Cluster Autoscaler; each with a "What we learned" section); 4 Design (4.1 The core points; 4.2 The window logic for scaling up; 4.3 Selection of YARN metrics; 4.4 Downscaling); 5 Implementation (5.1 Go in a nutshell)
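One common way to implement a windowed scale-up rule of the kind the contents suggest ("The window logic for scaling up", "Selection of YARN metrics") is to scale out only when a pressure signal persists across a whole observation window, so a single transient spike does not add nodes. The metric name, window size, and threshold below are all hypothetical:

```python
from collections import deque

class WindowedScaleUp:
    """Trigger scale-up only when the pressure signal stays above the
    threshold for every sample in the window. (Illustrative sketch; the
    metric name and thresholds are invented, not taken from the thesis.)"""

    def __init__(self, window=3, pending_threshold=0.25):
        self.samples = deque(maxlen=window)     # sliding window of samples
        self.pending_threshold = pending_threshold

    def observe(self, pending_memory_ratio):
        """pending_memory_ratio: e.g. YARN pending / available memory."""
        self.samples.append(pending_memory_ratio)
        full = len(self.samples) == self.samples.maxlen
        # Scale up only if the window is full AND every sample is high.
        return full and min(self.samples) > self.pending_threshold

scaler = WindowedScaleUp()
decisions = [scaler.observe(r) for r in [0.9, 0.1, 0.4, 0.5, 0.6]]
print(decisions)  # → [False, False, False, False, True]
```

The lone spike (0.9) never triggers a decision; only the sustained run at the end does. Pairing this with a more conservative, separate downscaling rule is what keeps the cluster from oscillating.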
  • A Comprehensive Feature Comparison Study of Open-Source Container Orchestration Frameworks
    applied sciences Article. A Comprehensive Feature Comparison Study of Open-Source Container Orchestration Frameworks. Eddy Truyen *, Dimitri Van Landuyt, Davy Preuveneers, Bert Lagaisse and Wouter Joosen. imec-DistriNet, KU Leuven, 3001 Leuven, Belgium. * Correspondence: Tel.: +32-163-735-85. Received: 23 November 2018; Accepted: 25 February 2019; Published: 5 March 2019. Featured Application: Practitioners and industry adopters can use the descriptive feature comparison as a decision structure for identifying the most suited container orchestration framework for a particular application with respect to different software quality properties such as genericity, maturity, and stability. Researchers and entrepreneurs can use it to check if their ideas for innovative products or future research are not already covered in the overall technological domain. Abstract: (1) Background: Container orchestration frameworks provide support for management of complex distributed applications. Different frameworks have emerged only recently, and they have been in constant evolution as new features are being introduced. This reality makes it difficult for practitioners and researchers to maintain a clear view of the technology space. (2) Methods: we present a descriptive feature comparison study of the three most prominent orchestration frameworks: Docker Swarm, Kubernetes, and Mesos, which can be combined with Marathon, Aurora or DC/OS. This study aims at (i) identifying the common and unique features of all frameworks, (ii) comparing these frameworks qualitatively and quantitatively with respect to genericity in terms of supported features, and (iii) investigating the maturity and stability of the frameworks as well as the pioneering nature of each framework by studying the historical evolution of the frameworks on GitHub.
  • An Auto-Scaling Framework for Analyzing Big Data in the Cloud Environment
    applied sciences Article. An Auto-Scaling Framework for Analyzing Big Data in the Cloud Environment. Rachana Jannapureddy, Quoc-Tuan Vien *, Purav Shah and Ramona Trestian. Faculty of Science and Technology, Middlesex University, The Burroughs, London NW4 4BT, UK. * Correspondence: Tel.: +44-208-411-4016. Received: 28 March 2019; Accepted: 29 March 2019; Published: 4 April 2019. Abstract: Processing big data on traditional computing infrastructure is a challenge, as the volume of data is large and the computational complexity is therefore high. Recently, Apache Hadoop has emerged as a distributed computing infrastructure to deal with big data. Adapting Hadoop to dynamically adjust its computing resources based on real-time workload is itself a demanding task, so conventionally a pre-configuration with adequate resources to compute the peak data load is set up. However, this may cause considerable wastage of computing resources when usage levels are much lower than the preset load. In light of this, this paper investigates an auto-scaling framework on a cloud environment that aims to minimise the cost of resource use by automatically adjusting the virtual nodes depending on the real-time data load. A cost-effective auto-scaling (CEAS) framework is first proposed for an Amazon Web Services (AWS) Cloud environment. The proposed CEAS framework allows us to scale the computing resources of a Hadoop cluster so as to either reduce computing resource use when the workload is low or scale up the computing resources to speed up data processing and analysis within an adequate time.
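A cost-minimising node-count rule in the spirit of the CEAS framework might look like the following sketch: pick the cheapest cluster size that still processes the pending data within a deadline. Throughput, deadline, and bounds here are invented; the paper's actual policy is richer:

```python
import math

def target_nodes(data_load_gb, gb_per_node_per_hour, deadline_hours,
                 min_nodes=2, max_nodes=20):
    """Cheapest node count that still finishes the pending data in time.
    (Illustrative only; all figures are invented.)"""
    needed = math.ceil(data_load_gb / (gb_per_node_per_hour * deadline_hours))
    return min(max_nodes, max(min_nodes, needed))

# Low load shrinks toward the floor; peak load grows toward the ceiling.
print(target_nodes(40, gb_per_node_per_hour=10, deadline_hours=1))    # → 4
print(target_nodes(500, gb_per_node_per_hour=10, deadline_hours=1))   # → 20
print(target_nodes(5, gb_per_node_per_hour=10, deadline_hours=1))     # → 2
```

Contrast this with the static pre-configuration the abstract criticises: a fixed peak-sized cluster would sit at 20 nodes for every one of these loads, wasting 16-18 nodes in the low-load cases.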
  • Autoscaling Tiered Cloud Storage in Anna
    Autoscaling Tiered Cloud Storage in Anna. Chenggang Wu, Vikram Sreekanti, Joseph M. Hellerstein. UC Berkeley. ABSTRACT: In this paper, we describe how we extended a distributed key-value store called Anna into an autoscaling, multi-tier service for the cloud. In its extended form, Anna is designed to overcome the narrow cost-performance limitations typical of current cloud storage systems. We describe three key aspects of Anna's new design: multi-master selective replication of hot keys, a vertical tiering of storage layers with different cost-performance tradeoffs, and horizontal elasticity of each tier to add and remove nodes in response to load dynamics. Anna's policy engine uses these mechanisms to balance service-level objectives around cost, latency and fault tolerance. Experimental results explore the behavior of Anna's mechanisms and policy, exhibiting orders of magnitude efficiency improvements over both commodity cloud KVS services and research systems. … deal with a non-uniform distribution of performance requirements. For example, many applications generate a skewed access distribution, in which some data is "hot" while other data is "cold". This is why traditional storage is assembled hierarchically: hot data is kept in fast, expensive cache while cold data is kept in slow, cheap storage. These access distributions have become more complex in modern settings, because they can change dramatically over time. Realistic workloads spike by orders of magnitude, and hot sets shift and resize. These large-scale variations in workload motivate an autoscaling service design, but most cloud storage services today are unable to respond to these dynamics. The narrow performance goals of cloud storage services result in poor cost-performance tradeoffs for applications.
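The hot/cold tiering that motivates Anna's design can be caricatured as a two-tier store that promotes a key to the fast tier once its access count crosses a threshold. This is a toy sketch, not Anna's actual policy engine, and the threshold is invented:

```python
class TieredStore:
    """Two-tier key-value store: keys whose access count reaches a
    threshold are promoted from the slow tier to the fast tier.
    (Toy illustration of hot/cold tiering; not Anna's real design.)"""

    def __init__(self, promote_after=3):
        self.fast, self.slow, self.hits = {}, {}, {}
        self.promote_after = promote_after

    def put(self, key, value):
        # Write to whichever tier currently owns the key (default: slow).
        tier = self.fast if key in self.fast else self.slow
        tier[key] = value

    def get(self, key):
        self.hits[key] = self.hits.get(key, 0) + 1
        if key in self.fast:
            return self.fast[key]
        value = self.slow[key]
        if self.hits[key] >= self.promote_after:   # promote hot key
            self.fast[key] = self.slow.pop(key)
        return value

store = TieredStore()
store.put("a", 1)
store.put("b", 2)
for _ in range(3):
    store.get("a")
print("a" in store.fast, "b" in store.fast)  # → True False
```

The paper's point is that a static version of this hierarchy breaks down when the hot set shifts or the workload spikes, which is why each tier in Anna can also add and remove nodes horizontally in response to load.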
  • Exam Ref AZ-300 Microsoft Azure Architect Technologies
    EXAM REF AZ-300 MICROSOFT AZURE ARCHITECT TECHNOLOGIES. Mike Pfeiffer. 320 pages. 25 Nov 2019. Pearson Education (US). 9780135802540. English. Boston, United States. This exam measures your ability to accomplish the following technical tasks: deploy and configure infrastructure; implement workloads and security; create and deploy apps; implement authentication and secure data; and develop for the cloud and Azure storage. You will see how to configure the networking and storage components of virtual machines. And perhaps the most exciting thing you will learn is how to use the Azure Resource Manager deployment model to work with resources, resource groups, and ARM templates. After the retirement date, please refer to the related certification for exam requirements. Learn the types of storage and how to work with managed and custom disks. Manage and maintain the infrastructure for the core web apps and services that developers build and deploy. Students will also learn how to use Azure Site Recovery for performing the actual migration of workloads to Azure. Prepare for Microsoft Exam AZ-300 and help demonstrate your real-world mastery of architecting high-value Microsoft Azure solutions for your organization or customers.
  • Cutter IT Journal
    Cutter IT Journal: The Journal of Information Technology Management. Vol. 26, No. 9, September 2013. Profiting in the API Economy. "One of the great things about the API economy is that it is based on existing business assets.... What were assets of fixed and known value suddenly become a potential source of seemingly unlimited business opportunities." — Giancarlo Succi and Tadas Remencius, Guest Editors. Contents: Opening Statement, by Giancarlo Succi and Tadas Remencius; The API Economy: Playing the Devil's Advocate, by Israel Gat, Tadas Remencius, Alberto Sillitti, Giancarlo Succi, and Jelena Vlasenko; Unified API Governance in the New API Economy, by Chandra Krintz and Rich Wolski; How APIs Can Reboot Commerce Companies, by Christian Schultz; Tailoring ITIL for the Management of APIs, by Tadas Remencius and Giancarlo Succi; 7 API Challenges in a Mobile World, by Chuck Hudson. NOT FOR DISTRIBUTION. For authorized use, contact Cutter Consortium: +1 781 648 8700. About Cutter IT Journal: Part of Cutter Consortium's mission is to foster debate and dialogue on the business technology issues challenging enterprises today, helping organizations leverage IT for competitive advantage and business success. Cutter IT Journal subscribers consider the Journal a "consultancy in print" and liken each month's issue to the impassioned debates they participate in at the end of a day at a conference. Cutter Business Technology Council: Rob Austin, Ron Blitstein, Tom DeMarco, Lynne Ellyn, Israel Gat, Vince Kellen, Tim Lister, Lou Mazzucchelli, Ken Orr, and Robert D. Scott. Editor Emeritus: Ed Yourdon.
  • Combining Analytics Framework and Cloud Schedulers in Order to Optimise Resource Utilisation in a Distributed Cloud
    KTH Royal Institute of Technology. Master Thesis. Combining analytics framework and Cloud schedulers in order to optimise resource utilisation in a distributed Cloud. Author: Nikolaos Stanogias. Supervisor: Ignacio Mulas Viela. A thesis submitted in fulfilment of the requirements for the degree of Software Engineering of Distributed Systems. July 2015. TRITA-ICT-EX-2015:154. Declaration of Authorship: I, Nikolaos Stanogias, declare that this thesis titled, 'Combining analytics framework and Cloud schedulers in order to optimise resource utilisation in a distributed Cloud' and the work presented in it are my own. I confirm that: this work was done wholly or mainly while in candidature for a research degree at this University; where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated; where I have consulted the published work of others, this is always clearly attributed; where I have quoted from the work of others, the source is always given; with the exception of such quotations, this thesis is entirely my own work; I have acknowledged all main sources of help; where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself. "Thanks to my solid academic training, today I can write hundreds of words on virtually any topic without possessing a shred of information, which is how I got a good job in journalism." — Dave Barry. Abstract: Analytics frameworks were initially created to run on bare-metal hardware, so they contain scheduling mechanisms to optimise the distribution of the CPU load and data allocation.