INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME 9, ISSUE 03, MARCH 2020 ISSN2277-8616

Impact Of Parallelism And Virtualization On Task Scheduling In Cloud Environment

Sanjay Kumar Sharma, Nagresh Kumar

Abstract- Today, cloud computing has established itself in every field of the IT industry. It provides infrastructure, platform, and software as a service to users, readily available via the internet. It serves a large number of users and must handle a large number of task executions, which requires a suitable task scheduling algorithm. Virtualization and parallelism are the core components of cloud computing for optimizing resource utilization and increasing system performance. Virtualization is used to provide IT infrastructure on demand. To take fuller advantage of virtualization, parallel processing plays an important role in improving system performance. It is therefore important to understand the mutual effect of parallel processing and virtualization on system performance. In this paper, we study the effect of parallelization and virtualization on the scheduling of tasks with the aim of optimizing time and cost. We found that virtualization together with parallelization boosts system performance. The experimental study was carried out using the CloudSim simulator.

Keywords: virtualization; parallel programming; distributed processing

Sanjay Kumar Sharma, Department of Computer Science, Banasthali Vidyapith, Rajasthan, India.
Nagresh Kumar, Department of Computer Science, Banasthali Vidyapith, Rajasthan, India.

1. INTRODUCTION
In the present world of information technology, cloud computing has emerged as a new computing technology because of its economic and operational benefits. Cloud computing can process an enormous amount of data using high computing capacity and distributed servers. Clients use this facility on a pay-per-use basis. When the users' needs change, the cloud server's capacity scales up or down to meet the requirements. It is highly flexible, reduces capital expenditure, offers robust disaster recovery, and can be operated from anywhere through the internet. Users obtain these services simply by submitting a request to the environment provided by the service provider.

Parallel processing is a computing technique in which more than one process is executed simultaneously on different processors. Multiple processes solve a given problem efficiently by executing on multiple processors. The divide and conquer technique is used to divide a task into multiple subtasks, and a parallel program written on this basis executes the subtasks on multiple processors. The demand for high computation power cannot be fulfilled by a single CPU, so parallel processing improves the computation power of the system by increasing the number of CPUs. It is a cost-effective technique for enhancing system computation power, and it can also be used for load balancing in cloud computing [1].

Virtualization is a key component of cloud computing. It is a mechanism used to create the interactive environments expected by the user, such as server, storage, and desktop environments. Hardware virtualization is based on a set of software called a hypervisor or virtual machine manager (VMM). This software operates each VM instance in complete isolation. A high-performance server based on several machines needs suitable and customized software on demand. This approach helps to deploy tasks in parallel to the available resources on demand, as in VMware, Amazon EC2, vCloud, RightScale, etc. [1][2][3][4].

Parallelism and virtualization are the key components of cloud computing. The question now is how parallelism and virtualization affect each other: does their combination hinder or boost system performance, and which other parameters affect parallelism and virtualization? This paper is organized as follows. Section 2 reviews previous work. Section 3 contains the details of the methodology for parallelization and virtualization. Section 4 describes the experimental setting and simulation results. Section 5 presents the conclusion and future work.

2. RELATED WORK
A parallel architecture is an organization of resources intended to maximize resource utilization and improve system performance. The organisation of resources is categorised into three types: shared-memory multiprocessor (SMP), cluster parallel computers, and hybrid. In an SMP, each processor accesses a single shared memory, so the subtasks of a task share the same memory and communicate via shared variables. In a cluster parallel computer, more than one computing node, each with its own memory and processor, is connected to the others; two processors therefore need to send messages to each other in order to exchange variable values. A hybrid architecture is simply a set of interconnected SMPs. Communication within an SMP node takes place through variables in shared memory, while the SMP nodes communicate with each other via message passing. The cluster architecture is highly scalable [5].

OpenMP is a standard application program interface (API) that consists of a set of directives and subroutines [6]. OpenMP is a shared-memory programming environment that supports the C/C++ and Fortran programming languages. A programmer can write and execute a parallel program in the OpenMP environment by dividing the task into independent subtasks [7].

Virtualization is a technology used to provide better resource utilization. Different operating systems can be installed on different virtual machines (VMs) to increase system flexibility. A hypervisor, also called a virtual machine manager (VMM), runs on top of the hardware. The hypervisor hides the actual physical hardware and provides virtual resources according to user expectations. It is also responsible for protecting the hardware from malware and unauthorised access, and all external requests are managed by the hypervisor. Virtualization and parallelization are key components for improving system performance in cloud computing, but they also place some overhead on the complete system. The overhead percentage depends on the host configuration.

Perera et al. [8] compare the hypervisors VMware [10] and Xen [9]. The authors conclude that Xen is much better than VMware with respect to paravirtualization, and that VMware is much better than Xen with respect to full virtualization. Tafa et al. [11] study the behavior of parameters such as memory utilization, transfer time, and CPU consumption in five different hypervisors. The authors concluded that KVM imposes more overhead than the other hypervisors. To study the effect of parallelism and virtualization, Xu et al. [12] compare the performance of the OpenMP API, MPI, and a hybrid paradigm on the Xen hypervisor. The authors concluded that an ideal speedup can be achieved in SMP: if the number of virtual CPUs is not more than the number of physical CPUs, a linear speedup is achieved. Sharma S.K. et al. found that the parallelization of a sequential process improves performance. Their results show that the parallel algorithm is approximately twice as fast as the sequential algorithm. That work can also be used to a large extent for real-time implementations and high-performance systems [13].

3. METHODOLOGY
For the experimental study, we use the CloudSim simulator and configure the hypervisor to analyse the parameters of parallelism and virtualization that improve system performance.

VM Allocation Model
To obtain the simulation results on the CloudSim simulator, space-shared and time-shared allocation policies are provisioned for both cloudlets and VMs [14]. The execution of cloudlets and VMs under time-shared and space-shared provisioning is shown in Fig. 1. The figure shows the execution of 8 tasks (t1, t2, ..., t8) on a host with 2 CPU cores and two VMs.

[Figure 1: four panels (a) to (d), each plotting cores (vertical axis) against time (horizontal axis) for tasks t1 to t8 on vm1 and vm2]
Fig.1: Provisioning policies for cloudlets and VMs (a) Cloudlets: Space-shared, VMs: Space-shared (b) Cloudlets: Time-shared, VMs: Space-shared (c) Cloudlets: Space-shared, VMs: Time-shared (d) Cloudlets: Time-shared, VMs: Time-shared

In Fig. 1(a), space-shared provisioning is used for both cloudlets and VMs. Under the space-shared policy only one VM can run at a time. Since each VM has two cores, two tasks can be executed simultaneously, as shown in Fig. 1(a). Under the space-shared policy for VMs and cloudlets, the estimated finish time (eft) of a task p managed by virtual machine i is given, following the CloudSim model [14], by

\[ eft(p) = est(p) + \frac{rl}{capacity \times cores(p)} \]

where est(p) is the task execution start time, rl is the total number of instructions to be executed on a processor, and cores(p) is the number of PEs required by the task. The result eft(p) depends on the position of the task in the queue, because tasks are placed in the ready queue if and only if the required number of free PEs is available in the VM. The total capacity of a host with np processing elements (PEs) is given by:

\[ capacity = \sum_{i=1}^{np} \frac{cap(i)}{np} \]

where cap(i) is the processing strength of the individual elements.

Figure 1(b) shows a space-shared policy for VMs and a time-shared policy for cloudlets. All the tasks are assigned simultaneously during the lifetime of a VM. When one VM completes its lifetime, the next VM starts executing. The estimated finish time of a cloudlet managed by a VM is given by

\[ eft(p) = ct + \frac{rl}{capacity \times cores(p)} \]

where ct is the current simulation time and cores(p) is the number of PEs required by the cloudlet. The total processing capacity of the cloud host is given by

\[ capacity = \frac{\sum_{i=1}^{np} cap(i)}{\max\left(\sum_{j} cores(j),\; np\right)} \]

where cap(i) is the processing strength of the individual elements and j ranges over the cloudlets being served.

In Figure 1(c), a time-shared allocation policy is used for VMs and a space-shared policy for tasks. In this scenario, each VM is assigned a time slice, and a context switch between the available VMs takes place at each time slice. Within a given time slice, only one task can execute. In Fig. 1(d), a time-shared allocation policy is used for both tasks and VMs. The processing power of the VMs is shared between tasks, and there is no queuing delay in this scenario.

Workload Variability [15]
When the mutual effect of virtualization and parallelization is evaluated, a particular set of applications exhibits different results under different performance parameters. Small changes in the host configuration, the cloudlet and VM allocation policies, queuing delay, task scheduling algorithms, the nature of the hypervisor, and other hardware platforms produce completely different results. The dimensions of virtualization and parallelization modeling are shown in the tables.
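The finish-time estimates of the VM Allocation Model can be illustrated with a small worked example. The sketch below is not the authors' simulation code and does not use the CloudSim API; the class name FinishTimeSketch and its method names are hypothetical, and it assumes the space-shared and time-shared capacity expressions of the CloudSim model [14]: the mean PE strength for space-shared provisioning, and the total PE strength divided by max(total cores requested, np) for time-shared provisioning.

```java
/** Minimal sketch of the finish-time estimates; names are illustrative assumptions. */
final class FinishTimeSketch {

    /** Space-shared capacity: average strength cap(i) over the np processing elements. */
    static double spaceSharedCapacity(double[] peMips) {
        double sum = 0.0;
        for (double cap : peMips) {
            sum += cap;
        }
        return sum / peMips.length;
    }

    /** Time-shared capacity: total strength divided by max(sum of cores(j), np). */
    static double timeSharedCapacity(double[] peMips, int[] coresPerCloudlet) {
        double sum = 0.0;
        for (double cap : peMips) {
            sum += cap;
        }
        int requestedCores = 0;
        for (int cores : coresPerCloudlet) {
            requestedCores += cores;
        }
        return sum / Math.max(requestedCores, peMips.length);
    }

    /** eft(p) = start + rl / (capacity * cores(p)); start is est(p) or ct. */
    static double finishTime(double start, double lengthMi, double capacity, int cores) {
        return start + lengthMi / (capacity * cores);
    }

    public static void main(String[] args) {
        double[] peMips = {1000.0, 1000.0};   // two PEs of 1000 MIPS each
        int[] cores = {1, 1, 1, 1};           // four single-core cloudlets
        double spaceCap = spaceSharedCapacity(peMips);
        double timeCap = timeSharedCapacity(peMips, cores);
        // A 1000 MI cloudlet starting at time 0 under each policy.
        System.out.println("space-shared eft = " + finishTime(0.0, 1000.0, spaceCap, 1));
        System.out.println("time-shared eft  = " + finishTime(0.0, 1000.0, timeCap, 1));
    }
}
```

With two 1000 MIPS PEs and four single-core cloudlets of 1000 MI, the space-shared capacity is 1000 MIPS, so a cloudlet finishes one time unit after it starts, while the time-shared capacity drops to 500 MIPS per cloudlet, so every cloudlet finishes after two time units.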

Datacentre/Host/Virtual Machine Configuration

Datacentre Parameters
    String arch = "x86";      // system architecture
    String os = "Linux";      // operating system
    String vmm = "Xen";
    Number of hosts = 2

Host Parameters
    int ram = 65535;          // host memory (MB)
    long storage = 1000000;   // host storage
    int bw = 10000;
    cores = each host is quad core

VM Parameters
    long size = 10000;        // image size (MB)
    int ram = 512;            // VM memory (MB)
    int mips = 1000;
    long bw = 1000;
    int pesNumber = n;        // n = number of CPUs
    String vmm = "Xen";       // VMM name

Cloudlet Parameters
    long length = 1000;
    long fileSize = 300;
    long outputSize = 300;
    int pesNumber = 1;

VM Scheduling Policy: Time shared and space shared.

Cloudlet Scheduling Policy: Time shared and space shared.

Simulation Results
Simulation results are recorded as average turnaround time (in simulator time units) in the four tables given below. Four abbreviations are used:

(i) SS: Space shared for VMs and Space shared for cloudlets.
(ii) ST: Space shared for VMs and Time shared for cloudlets.
(iii) TS: Time shared for VMs and Space shared for cloudlets.
(iv) TT: Time shared for VMs and Time shared for cloudlets.

TABLE 1 Cloudlets=20, PEs/VM=1

    No. of VMs    SS     ST     TS     TT
    2             5.6    5.6    10.1   10.1
    4             3.1    3.1    5.1    5.1
    8             1.9    1.9    2.7    2.7
    16            1.9    1.9    2.7    2.7
    32            1.9    1.9    2.7    2.7
    64            1.9    1.9    2.7    2.7

TABLE 2 Cloudlets=20, PEs/VM=2

    No. of VMs    SS     ST     TS     TT
    2             3.1    3.1    5.1    5.1
    4             1.9    1.9    2.6    2.6
    8             1.9    1.9    2.6    2.6
    16            1.9    1.9    2.6    2.6
    32            1.9    1.9    2.6    2.6
    64            1.9    1.9    2.6    2.6

TABLE 3 Cloudlets=40, PEs/VM=1

    No. of VMs    SS     ST     TS     TT
    2             10.6   10.6   20.1   20.1
    4             5.6    5.6    10.1   10.1
    8             3.1    3.1    5.1    5.1
    16            3.1    3.1    5.1    5.1
    32            3.1    3.1    5.1    5.1

TABLE 4 Cloudlets=40, PEs/VM=2

    No. of VMs    SS     ST     TS     TT
    2             5.6    5.6    10.1   10.1
    4             3.1    3.1    5.1    5.1
    8             3.1    3.1    5.1    5.1
    16            3.1    3.1    5.1    5.1
    32            3.1    3.1    5.1    5.1

Note: We have used 'Prov' as an abbreviation for provisioning policy; the table columns are the provisioning policies and the rows are the numbers of VMs.

[Figure 2: turnaround time versus number of virtual machines; panel title "Cloudlets=20, PE's/VM=1"; series SS=ST and TS=TT]
Fig-2: Corresponding to Table-1

[Figure 3: turnaround time versus number of virtual machines; panel title "Cloudlets=20, PE's/VM=1"; series SS=ST and TS=TT]
Fig-3: Corresponding to Table-2

[Figure 4: turnaround time versus number of virtual machines; panel title "Cloudlets=40, PE's/VM=1"; series SS=ST and TS=TT]
Fig-4: Corresponding to Table-3
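The recorded turnaround times can be cross-checked with simple arithmetic. The sketch below is a back-of-envelope model, not the authors' CloudSim experiment, and the class name TurnaroundSketch and every constant in it are assumptions made for illustration: each cloudlet needs 1000 MI / 1000 MIPS = 1 time unit on a dedicated PE, cloudlets are spread over the VMs as evenly as possible, the two quad-core hosts supply 8 PEs so that at most 8 / (PEs per VM) VMs are usable, and a constant offset of 0.1 time units is added to every average. The "queued" mode runs cloudlets to completion one after another on each PE, and the "time-sliced" mode lets all cloudlets of a VM share its PEs equally.

```java
/** Back-of-envelope turnaround model; all names and constants are illustrative assumptions. */
final class TurnaroundSketch {

    static final int HOST_PES = 8;              // assumption: 2 hosts x 4 cores
    static final double UNIT_TIME = 1.0;        // 1000 MI cloudlet on a 1000 MIPS PE
    static final double OFFSET = 0.1;           // assumption: constant start-up offset

    /** VMs that fit on the hosts when every VM needs pesPerVm dedicated PEs. */
    static int usableVms(int requestedVms, int pesPerVm) {
        return Math.min(requestedVms, HOST_PES / pesPerVm);
    }

    /** Mean turnaround time of equally sized cloudlets spread evenly over the VMs. */
    static double meanTurnaround(int cloudlets, int requestedVms, int pesPerVm, boolean timeSliced) {
        int vms = usableVms(requestedVms, pesPerVm);
        double total = 0.0;
        for (int vm = 0; vm < vms; vm++) {
            // The first (cloudlets % vms) VMs receive one cloudlet more than the rest.
            int k = cloudlets / vms + (vm < cloudlets % vms ? 1 : 0);
            for (int j = 0; j < k; j++) {
                double finish = timeSliced
                        ? UNIT_TIME * Math.max(k, pesPerVm) / pesPerVm  // all share the PEs equally
                        : UNIT_TIME * (j / pesPerVm + 1);               // wait for a free PE, then run
                total += finish;
            }
        }
        return total / cloudlets + OFFSET;
    }

    public static void main(String[] args) {
        int[] vmCounts = {2, 4, 8, 16};
        for (int vms : vmCounts) {
            System.out.printf("cloudlets=20, PEs/VM=1, VMs=%d: queued=%.1f, time-sliced=%.1f%n",
                    vms,
                    meanTurnaround(20, vms, 1, false),
                    meanTurnaround(20, vms, 1, true));
        }
    }
}
```

Under these assumptions the queued mode yields 5.6, 3.1 and 1.9 and the time-sliced mode yields 10.1, 5.1 and 2.7 for 20 single-PE cloudlets on 2, 4 and 8 VMs, and both stay constant beyond 8 VMs because no further PEs are available. This suggests that the plateau in the tables is reached once every physical core is in use.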


[Figure 5: turnaround time versus number of virtual machines; panel title "Cloudlets=40, PE's/VM=2"; series SS=ST and TS=TT]
Fig-5: Corresponding to Table-4

Observations:

1. The space-shared virtual machine policy performs better than the time-shared virtual machine policy in terms of average turnaround time.
2. If the number of PEs per VM increases, the turnaround time decreases in all four scenarios.
3. An exponential increase in the number of virtual machines results in an exponential decrease in turnaround time at first; thereafter the turnaround time remains constant.
4. More switching occurs when virtual machines and cloudlets are provisioned with the time-shared policy.
5. A multicore virtual machine performs better than single-core virtual machines.

4. ANALYSIS
In the above simulation results, only four parameters are varied: the number of cloudlets, the VM allocation policy, the cloudlet allocation policy, and the number of PEs per VM. All other parameters are kept constant. The size of the cloudlets is constant in all scenarios.

1. As we increase the number of cloudlets while keeping the other parameters constant, the average turnaround time increases. Since all cloudlets have the same size, the turnaround time increases in all scenarios.
2. If we increase the number of VMs, the turnaround time decreases up to a certain number of VMs and then remains constant, because the required number of resources is available and there is no change in the number of switches.
3. As we increase the number of cores and PEs per VM, more tasks can execute in a given unit of time under both space-shared and time-shared provisioning. Under space sharing, one virtual machine can execute at a time; if a virtual machine has n PEs, then n cloudlets can execute at a time. Under time sharing, more cloudlets can be executed with fewer switches.
4. If the number of available VMs is equal to or greater than the number of VMs required by the ready tasks, the cloudlet allocation policy and the VM allocation policy have no effect on turnaround time, as shown in the tables.
5. Space-shared provisioning of cloudlets and VMs is more energy-efficient than time-shared provisioning because less switching occurs.
6. The divergence rate of time-shared provisioning of VMs is faster than that of space-shared provisioning, as shown in all four graphs.

It is clear from the graphs, the data recorded in the four tables, and the analysis of the results that virtualization together with parallelization reduces the average turnaround time and increases overall system performance. The availability of multicore VMs results in independent execution of tasks, and hence more speedup can be achieved. More virtualization generates extra overhead that should be minimized. A hypervisor is responsible for scheduling the available physical resources and allocating them to the installed virtual machines, so a hypervisor must consider the points raised above. The selection of the scheduling algorithm plays a key role in the hypervisor for increasing system performance and improving resource utilization. A suitable scheduling algorithm minimizes the waiting time of a VM. Such a policy makes a VM unaware of any competition from other VMs for the available resources. To schedule tasks on the best resources, an optimization algorithm should be implemented in the hypervisor. Because of the heterogeneous nature of a large number of VMs and cloudlets, an intelligent program is also needed for parallel execution.

5. CONCLUSION
Cloud computing is a leading technology in the field of information technology. Virtualization and parallelization are the key components of cloud computing, and in our simulations they improve system performance. The maximum benefit of virtualization is obtained by parallel execution of tasks on virtualized VMs, and parallelization is possible only through feasible virtualization. Parallelization and virtualization also create extra overhead in cloud computing. In this paper, the mutual effect of virtualization and parallelization is analysed so that a hypervisor can obtain the maximum benefit with respect to average turnaround time. The next step is to overcome the extra overhead of virtualization and parallelization.

REFERENCES
[1] Rajkumar Buyya, Christian Vecchiola, S. Thamarai Selvi, "Mastering Cloud Computing: Foundations and Applications Programming," pp. 31-35, ISBN-13: 978-0124114548, Morgan Kaufmann, 1st Edition, 2013.
[2] D. Ruest and N. Ruest, "Virtualization: a beginner's guide," McGraw Hill, 2009.
[3] S. Zaghloul, "Virtualization: key to cloud computing," MASAUM International Conference on Information Technology (MICIT'12), 2012.
[4] X. Chen and X. Huo, "Cloud computing research and development trend," In the IEEE Second International Conference on Future Networks, pp. 93-97, 2010.
[5] A. Kaminsky, "Building parallel programs: SMPs, clusters and Java," Course Technology, Cengage Learning, 2010.
[6] The OpenMP® API specification for parallel programming. "www..org".


[7] P. Pacheco, "Parallel programming with MPI," Morgan Kaufmann Publishers, Inc., San Francisco, CA, 1997.
[8] P. Perera and C. Keppitiyagama, "A Performance Comparison of Hypervisors," The International Conference on Advances in ICT for Emerging Regions (ICTer2011), 2011.
[9] Xen® Hypervisor. "www.xen.org".
[10] VMware® Hypervisor. "www.vmware.com".
[11] I. Tafa, E. Beqiri, H. Paci, E. Kajo and A. Xhuvani, "The evaluation of transfer time, CPU consumption and memory utilization in Xen-PV, Xen-HVM, OpenVZ, KVM-FV and KVM-PV hypervisors using FTP and HTTP approaches," The Third International Conference on Intelligent Networking and Collaborative Systems, 2011.
[12] X. Xu, F. Zhou, J. Wan and Y. Jiang, "Quantifying performance properties of virtual machines," The International Symposium on Information Science and Engineering, 2008.
[13] S. K. Sharma and Kusum Gupta, "Parallel Performance of Numerical Algorithms on Multi-core System Using OpenMP," Proceedings of the Second International Conference on Advances in Computing and Information Technology (ACITY), Springer, Advances in Computing and Information Technology, ISBN 978-3-642-31552-7, Volume 2, 2012.
[14] Rodrigo N. Calheiros, Rajiv Ranjan, Anton Beloglazov, Cesar A. F. De Rose and Rajkumar Buyya, "CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms," Software: Practice and Experience, Volume 41, Issue 1, pp. 23-50, ISSN 0038-0644, January 2011.
[15] Melynda Eden, "A survey of performance modeling and analysis issues in resource management across x86-based hypervisors in enterprise cloud computing deployments," http://www.cse.wustl.edu/~jain/cse567-11/ftp/hypervsr/, 2011.
