Engineering Degree Project

Predictive Autoscaling of Systems Using Artificial Neural Networks

Authors: Christoffer Lundström, Camilla Heiding
Supervisor: Sogand Shirinbab
LNU Supervisor: Jonas Nordqvist
Examiner: Jonas Lundberg
Semester: Spring 2021
Subject: Computer Science

Abstract

Autoscalers handle the scaling of instances in a system automatically based on specified thresholds such as CPU utilization. Reactive autoscalers do not take the delay of initiating a new instance into account, which may lead to overutilization. By applying machine learning methodology to predict future loads and the desired number of instances, it is possible to preemptively initiate scaling so that new instances are available before demand occurs. Efficient scaling policies keep costs and energy consumption low while ensuring the availability of the system. In this thesis, the predictive capability of different multilayer perceptron configurations is investigated to elicit a suitable model for a telecom support system. The results indicate that it is possible to accurately predict future load using a multilayer perceptron regressor model. However, the reproducibility of the results in a live environment is questioned, as the dataset used is derived from a simulation.

Keywords: autoscaling, predictive autoscaling, machine learning, artificial neural networks, multilayer perceptrons, MLP regressor, time series forecasting

Preface

Both of us would like to express our sincere gratitude to Ericsson, and especially Sogand Shirinbab, whose guidance and support made this thesis project an invaluable experience for us. We are also grateful to Eldin Malkoc for endorsing us and helping us through the onboarding process. We would like to extend our appreciation to our supervisor Jonas Nordqvist for his generous support and willingness to provide valuable insights.
Christoffer would like to express his love and gratitude to his family, especially his partner Stina and his mother Linda, who have been tremendously supportive and caring during trying times. Finally, he would like to thank his project partner Camilla and commend her for her outstanding devotion and attention to detail during the course of the project.

Camilla directs a great amount of appreciation towards the family and friends who helped keep her motivation up. Cohabiting in an apartment during a period of working only from home has been particularly challenging, so some extra love is directed to her partner Johan. She would also like to return the appreciation to her project partner Christoffer for his great commitment, knowledge, and optimism during the work on this thesis.

Contents

1 Introduction
  1.1 Background
  1.2 Related work
  1.3 Problem formulation
  1.4 Motivation
  1.5 Milestones
  1.6 Scope/Limitation
  1.7 Target group
  1.8 Outline
2 Theory
  2.1 Docker
  2.2 Container orchestration
  2.3 Autoscaling
  2.4 Reactive and predictive autoscaling
  2.5 Autoscalers in the market
    2.5.1 Kubernetes and the Horizontal Pod Autoscaler (KHPA)
    2.5.2 Amazon EC2 Autoscaling
    2.5.3 Google Compute Engine Autoscaling
  2.6 Machine learning
    2.6.1 Overview and terminology
    2.6.2 Validation
    2.6.3 Model selection
    2.6.4 Feature selection
    2.6.5 Feature scaling
    2.6.6 Model evaluation
    2.6.7 Artificial neural networks
  2.7 Time series data
3 Method
  3.1 Research Project
  3.2 Method
    3.2.1 Literature review
    3.2.2 Controlled experiment
    3.2.3 Data preprocessing
    3.2.4 Feature selection and scaling
    3.2.5 Cross-validation with time series split
    3.2.6 Hyperparameter tuning
    3.2.7 Model evaluation
    3.2.8 Predictive autoscaling evaluation
  3.3 Reliability and Validity
  3.4 Ethical considerations
4 Implementation
  4.1 Data preprocessing
  4.2 Model compilation
  4.3 Grid search with time series cross-validation
5 Experimental Setup and Results
  5.1 Experimental setup
  5.2 Results
6 Analysis
7 Discussion
  7.1 Model validity
  7.2 Predictions as scaling policy
  7.3 Further improvements
  7.4 Connections to related work
8 Conclusion
  8.1 Future work
References

1 Introduction

This chapter introduces the background and motivation for the problem investigated. Related work is presented, and the focus points for this thesis are elicited in the problem formulation. The following chapters are outlined to guide the reader through the report.
1.1 Background

Autoscaling is a cloud computing feature that enables organizations to scale cloud services such as server capacities, virtual machines, or pods up or down automatically, based on pre-defined thresholds for resource utilization. The overall benefit of autoscaling is that it eliminates the need to respond manually in real time to traffic spikes that merit new resources and instances: every server that is added or removed requires configuration, monitoring, and eventual decommissioning, and automating this life cycle is the core of autoscaling. Autoscaling also lowers cost and keeps performance reliable by seamlessly adding and removing instances as demand spikes and drops [1].

Cloud computing providers such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP) offer autoscaling tools. However, most of these tools use threshold-based mechanisms to control the autoscaling process, which are not very accurate for complex applications such as telecom support systems since they do not consider the start-up time of new instances. A study by Casalicchio [2] has shown that threshold-based autoscalers underestimate the number of pods required to keep CPU utilization low enough to satisfy the quality-of-service constraints on response time.

This thesis is done in cooperation with Ericsson and targets parts of their telecom support systems. Ericsson is a multinational telecommunications company providing services such as software and hardware infrastructure worldwide. Predicting future resource demand based on metrics provided by telecom support systems could potentially achieve more efficient scaling and, by extension, lower response time, resource utilization, and cost. The purpose of this thesis is to propose a machine learning model that can learn and predict the future load on an application.
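To make the threshold-based mechanism concrete, the sketch below applies the proportional scaling rule used by the Kubernetes Horizontal Pod Autoscaler (covered in Section 2.5.1): the desired replica count grows in proportion to how far the observed metric is from its target. This is a minimal illustration, not Ericsson's system; the utilization figures are invented for the example.

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float) -> int:
    """Reactive, threshold-based scaling decision modeled on the
    Kubernetes HPA rule:
        desired = ceil(current * currentMetric / targetMetric)
    """
    ratio = current_utilization / target_utilization
    return math.ceil(current_replicas * ratio)

# CPU load is at 100% against a 50% target: double 4 pods to 8.
print(desired_replicas(4, 1.00, 0.50))  # 8
# CPU load is at 25% against a 50% target: halve 4 pods to 2.
print(desired_replicas(4, 0.25, 0.50))  # 2
```

Note that the decision uses only the currently observed utilization: any new instance the rule requests is already late by its start-up time, which is precisely the gap a predictive policy tries to close.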
The results obtained in this thesis will be used as a foundation for a software framework that provides recommendations on when it is a good time to scale and which application should be scaled up or down.

1.2 Related work

The report [3] by Jiang et al. uses linear regression to predict the average number of web requests in the coming hour and to adapt the resource capacity accordingly. The motivation for their work is that launching a virtual machine (VM) suffers a delay of considerable length; by using a predictive approach they can optimize the price-performance ratio and preempt the demand for VMs when the load increases. The authors found seasonality in web requests: for example, in their data, more email services are requested on Monday mornings, and the total number of web requests drops around midnight. This seasonality improves the ability to predict future load. They conclude that their approach yields a better price-performance ratio than competing methods.

The doctoral thesis [4] by Yadavarnikravesh addresses the issue that reactive autoscaling neglects the boot-up time of VMs, which leads to under-provisioning. However, the author raises the issue that existing predictive autoscalers provide limited accuracy, which deters cloud clients from using them. Autoscalers using algorithms such as artificial neural networks (ANNs) and support vector machines (SVMs) are implemented in the thesis to achieve greater accuracy. The author also experiments with time series window sizes to determine whether it is beneficial to use a broad window of previous data to predict the future, or whether trends are better captured with recent data. Both works [3][4] compare their solutions with the autoscaling approaches offered by Amazon, and both indicate improvement over threshold-based autoscalers.

1.3 Problem formulation

The intent of this thesis is to develop a machine learning model that predicts the optimal number
