View metadata, citation and similar papers at core.ac.uk brought to you by CORE COMPUTING 2011 : The Second International Conference on , GRIDs, and Virtualization provided by SZTE Publicatio Repozitórium - SZTE - Repository of Publications

FCM: an Architecture for Integrating IaaS Cloud Systems

Attila Csaba Marosi, Gabor Kecskemeti, Attila Kertesz, Peter Kacsuk MTA SZTAKI Computer and Automation Research Institute of the Hungarian Academy of Sciences H-1528 Budapest, P.O. 63, Hungary Email: {atisu, kecskemeti, keratt, kacsuk}@sztaki.hu

Abstract—Cloud Computing builds on the latest achieve- computing infrastructures by allowing their users to in- ments of diverse research areas, such as Grid Computing, stantiate virtual appliances on their virtualized resources Service-oriented computing, business processes and virtu- as virtual machines. alization. In this paper, we reveal open research issues by envisaging a federated cloud that aggregates capabilities Nowadays, several public and private IaaS systems of various IaaS cloud providers. We propose a Federated co-exist and to accomplish dynamic service environ- Cloud Management architecture that acts as an entry point ments users frequently envisage a federated cloud that to cloud federations and incorporates the concepts of meta- aggregates capabilities of various IaaS cloud providers. brokering, cloud brokering and on-demand service deploy- These IaaS systems are either offered by public ser- ment. The meta-brokering component provides transparent service execution for the users by allowing the system to vice providers (like Amazon [3] or RackSpace [4]) or interconnect the various cloud broker solutions available by smaller scale privately managed infrastructures. We in the system. Cloud brokers manage the number and the propose an autonomic resource management solution location of the utilized virtual machines for the received that serves as an entry point to this cloud federation service requests. In order to fast track the virtual machine by providing transparent service execution for users. instantiation, our architecture uses the automatic service deployment component that is capable of optimizing service The following challenges are of great importance for delivery by encapsulating services as virtual appliances such a mediator solution: varying load of user requests, in order to allow their decomposition and replication enabling virtualized management of applications, estab- among the various IaaS cloud infrastructures. Our solution lishing interoperability, minimizing Cloud usage costs is able to cope with highly dynamic service executions and enhancing provider selection. by federating heterogeneous cloud infrastructures in a transparent and autonomous manner. This paper proposes a layered architecture that in- Keywords—cloud federation; cloud brokering; IaaS; vir- corporates the concepts of meta-brokering, cloud bro- tual appliance. kers and automated, on-demand service deployment. The meta-brokering component allows the system to interconnect the various cloud brokers available in the I.INTRODUCTION system. The cloud broker component is responsible for Highly dynamic service environments [1] require a managing the virtual machine instances of the particular novel infrastructure that can handle the on demand de- virtual appliances hosted on a specific infrastructure as ployment and decommission of service instances. Cloud a service provider. Our architecture organizes the virtual Computing [2] offers simple and cost effective outsourc- appliance distribution with the automatic service deploy- ing in dynamic service environments and allows the con- ment component that can decompose virtual appliances struction of service-based applications extensible with to smaller parts. With the help of the minimal man- the latest achievements of diverse research areas, such as ageable virtual appliances the Virtual Machine Handler Grid Computing, Service-oriented computing, business rebuilds these decomposed parts in the IaaS system processes and virtualization. Virtual appliances (VA) chosen by the meta-broker. As a result, the cloud broker encapsulate metadata (e.g., network requirements) with component uses the VM Handler to maintain the number a complete software system (e.g., , of virtual machines according to the demand. software libraries and applications) prepared for exe- Related works have identified several shortcomings cution in virtual machines (VM). Infrastructure as a in the current cloud infrastructures [5]: e.g., feder- Service (IaaS) cloud systems provide access to remote ated clouds will face the issue of and self-

Copyright (c) IARIA, 2011. ISBN: 978-1-61208-153-3 7 CLOUD COMPUTING 2011 : The Second International Conference on Cloud Computing, GRIDs, and Virtualization

management similarly to Grid systems, or users of the al. [8] are building complex grid infrastructures on top cloud systems should be in control of their computing of IaaS cloud systems, that allow them to adjust the costs. We propose an architecture that aims at both number of grid resources dynamically. They focus on of these problems by allowing users to utilize meta- the capability of using resources from different cloud brokering between public and private cloud systems as providers and on the capability of providing resources for a result lowering their operation costs. Our architecture different grid middleware, but meta-scheduling between also handles the issue of scalability by offering the cloud the utilized infrastructures and developing a model, that brokers that manage the virtual machines according to considers the different cloud provider characteristics is the actual demands of the user applications. not addressed. This paper is organized as follows: first, we introduce In 2009, launched Amazon the related research results in Section II. Then, we CloudWatch [9], that is a supplementary service for discuss an advanced use case in Section III that involves Amazon EC2 instances that provides monitoring services our proposed architecture and discusses its advantages in for running virtual machine instances. It allows to gather contrast to previous research results. Next, we detail the information about the different characteristics (traffic operational roles of the brokering components in our ar- shape, load, disk utilization, etc.) of resources, and based chitecture in Section III-A and Section III-B. Afterwards, on that users and services are able to dynamically start in Section IV, we discuss an optimization approach to or release instances to match demand as utilization rebuild virtual appliances within the virtual machine goes over or below predefined thresholds. The main that is used to execute them. Finally, we conclude our shortcoming is that this solution is tied to a specific IaaS research in Section V. cloud system and introduces a monetary overhead, since the service charges a fixed hourly rate for each monitored II.RELATED WORK instance. Matthias Schmidt et al. [6] investigate different strate- Mohsen Amini et al. [10] are focusing on so called gies for distributing virtual machine images within a marketing-oriented scheduling policies, that can provi- : unicast, multicast, binary tree distribution sion extra resources when the local cluster resources and peer-to-peer distribution based on . They are not sufficient to meet the user requirements. Former found the multicast method the most efficient, but in scheduling policies used in grids are not working effec- order to be able to distribute images over network tively in cloud environments, mainly because Infrastruc- boundaries ("cross-cloud") they choose BitTorrent. They ture providers are charging users in a pay- also propose to use layered virtual machine images as-you-go manner in an hourly basis for computational for virtual appliances consisting of three layers: user, resources. To find the trade-off between to buy acquired vendor and base. By using the layers and a copy-on- additional resources from IaaS and reuse existing lo- write method they were able to avoid the retransmission cal infrastructure resources he proposes two scheduling of images already present at the destination and thus policies (cost and time optimization scheduling policies) decrease instantiation time and network utilization. The for mixed (commercial and non-commercial) resource authors only investigated distribution methods within the environments. Basically two different approaches were boundaries of a single data center, going beyond that identified on provisioning commercial resources. The remained future work. first approach is offered by the IaaS providers at re- There are several related works focusing on providing source provisioning level (user/application constraints dynamic pool of resources. Paul Marshall et al. [7] are neglected: deadline, budget, etc.), the other approach describe an approach for developing an "elastic site" deploys resources focusing at user level (time and/or cost model where batch schedulers, storage and web services minimization, estimating the workload in advance, etc.). can utilize such resources. They introduce different ba- sic policies for allocating resources, that can be "on- III.FEDERATED CLOUD MANAGEMENT demand" meaning resources are allocated when a service ARCHITECTURE call or task arrives, "steady stream" assumes steady uti- Figure 1 shows the Federated Cloud Manage- lization, thus leaves some elastic resources continuously ment (FCM) architecture and its connections to the running, regardless of the (temporary) shortage of tasks, corresponding components that together represent an or "bursts" for fluctuating load. They concentrate on interoperable solution for establishing a federated cloud dynamically increasing and decreasing the number of environment. The FCM targets the problem area outlined resources, but rely on third party logic for balancing load in the Introduction, and provides solutions for most of among the allocated resources. Constantino Vázquez et the listed open issues. In the following, we exemplify

Copyright (c) IARIA, 2011. ISBN: 978-1-61208-153-3 8 CLOUD COMPUTING 2011 : The Second International Conference on Cloud Computing, GRIDs, and Virtualization

the interaction of the main components of this solution process the CloudBroker also handles the queue of through a low level use case. the incoming service calls. As a result, these calls are In this scenario we restrict our solution to sup- dispatched to the available VMs created in the previously port standard stateless web services described with discussed manner. WSDL [11]. Using this solution, users are able to execute In order to optimize service executions in highly services deployed on cloud infrastructures transparently, dynamic service environments, our architecture orga- in an automated way. Virtual appliances for all services nizes the distribution as a background should be stored in a generic repository called FCM process with the automatic service deployment compo- Repository, from that they are automatically replicated nent that can decompose virtual appliances to smaller to the native repositories of the different Infrastructure parts. With the help of the minimal manageable virtual as a Service cloud providers. appliances (MMVA – further discussed in Section IV) When a user sends a service call to the system, the Virtual Machine Handler is able to rebuild these he/she submits a request to the “Generic Meta-Broker decomposed parts in the IaaS system on demand, that Service” (GMBS) specifying the requested service with results in faster VA deployment and in a reduced storage a WSDL, the operation to be called, and its possible requirement in the native repositories. input parameters. The GMBS checks if the service has In the following, subsections we detail how resource an uploaded VA in the generic repository, then it selects a management is carried out in this architecture. At the suitable CloudBroker for further submission. The match- top-level, a meta-broker is used to select from the making is based on static data gathered from the “FCM available cloud providers based on performance metrics, Repository” (e.g., service operations, WSDL), and on while at the bottom-level, IaaS-specific CloudBrokers are dynamic information of special deployment metrics gath- used to schedule VA instantiation and deliver the service ered by the CloudBrokers. Currently we use the average calls to the clouds. VA deployment time and the average service execution A. Top-level Brokering in FCM time for each VA. VA deployment time assumes that the As we already mentioned in the scenario discussed in native repository already has the requested VA, thus in- the previous section, brokering takes place at two levels cludes only the service provision time on a specific IaaS in the FCM architecture: the service call is first submitted cloud. The role of GMBS is to manage autonomously to the Generic Meta-Broker Service (GMBS – that is a the interconnected cloud infrastructures with the help of revised and extended version of the Grid Meta-Broker the CloudBrokers by forming a federation. Service described in [12]), where a top-level decision Each “CloudBroker” has an own queue for storing is made to that cloud infrastructure the call should be the incoming service calls (called Q1 and Q2 in Fig- forwarded. Then the service call is placed in the queue ure 1), and manages one virtual machine queue for each of the selected CloudBroker, where the bottom-level VA (VAx → VMQx). Virtual machine queues represent brokering is carried out to select the VM that performs the resources that currently can serve a virtual appliance the actual service execution. This bottom-level brokering specific service call. The main goal of the CloudBroker is and the detailed introduction of the architecture of the to manage the virtual machine queues according to their CloudBroker is discussed later in Section III-B. respective service demand. The default virtual machine Now, let us turn our attention to the role of GMBS. scheduling is based on the currently available requests An overview of its architecture is shown in Figure 2. in the queue, their historical execution times, and the This meta-brokering service has five major components. number (n, m, o, p) of running VMs. The secondary task The Meta-Broker Core is responsible for managing the of the CloudBroker involves the dynamic creation and interaction with the other components and handling user destruction of the various VMQs. interactions. Virtual Machine Handler (“VM Handler”) components The MatchMaker component performs the scheduling are assigned to each virtual machine queue. These of the calls by selecting a suitable broker. This decision components process the virtual machine creation and making is based on aggregated static and dynamic data destruction requests placed in the queue. The requests are stored by the Information Collector (IC) component in translated and forwarded to the corresponding IaaS sys- a local . The Information System (IS) Agent tem (Clouda). This component is a cloud infrastructure- is implemented as a listener service of GMBS, and it specific one, that uses the public interface of the man- is responsible for regularly updating static information aged infrastructure. from the FCM Repository on service availability, and ag- Independently from the virtual machine scheduling gregated dynamic information collected from the Cloud-

Copyright (c) IARIA, 2011. ISBN: 978-1-61208-153-3 9 CLOUD COMPUTING 2011 : The Second International Conference on Cloud Computing, GRIDs, and Virtualization

Submit

Generic Meta-Broker Service

lookup submit submit CloudBroker a CloudBroker b FCM deploymentmetrics VA Repository VA VAx y VAx y VAx ... VAy

Clouda Clouda Cloudb Cloudb Q1 Q2 VMQx VMQy VMQx VMQy deployment metrics deployment

Clouda VM Handler Cloudb VM Handler call replicate call 1 1 1 1 VMx VMy VMx VMy Repository

2 2 Native 2 2 VMx VMy VMx VMy

...... Native ...... Repository n m o p VMx VMy VMx VMy Cloud a Cloudb

Fig. 1. The Federated Cloud Management architecture

ment containing basic broker properties (e.g., name), FCM User Repository and the gathered aggregated dynamic properties. The VAx ... VAy scheduling-related attributes are typically stored in the Service call PerformanceMetrics field of BPDL. More information (+requirements) Call result on this document format can be read in [12]. Namely, the following data are stored in the BPDL of each IS Agent CloudBroker: Cloud - Estimated availability time for a specific virtual Broker appliance in a native repository – collected from Meta-Broker Information Core Collector Cloud the FCM Repository; Broker - average VA deployment time and average service BPDL list ... execution time for each VA – queried from the CloudBroker; MatchMaker Invoker Cloud Broker The scheduling process first filters the CloudBro- kers by checking VA availability in the native cloud Fig. 2. The architecture of the Generic Meta-Broker Service repository, then a rank is calculated for each broker based on the collected static and dynamic data. Finally, the CloudBroker with the highest rank is selected for Brokers including average VA deployment and service forwarding the service request. execution times. The Invoker component forwards the service call to the selected CloudBroker and receives the B. CloudBroker service response. The CloudBroker handles and dispatches service calls Each CloudBroker is described by an XML-based to resources and performs resource management within Broker Property Description Language (BPDL) docu- a single IaaS system, it is an extended version of the

Copyright (c) IARIA, 2011. ISBN: 978-1-61208-153-3 10 CLOUD COMPUTING 2011 : The Second International Conference on Cloud Computing, GRIDs, and Virtualization

system described in [13]. This may lead to allocating excess resources in some The architecture of the CloudBroker is shown in cases (e.g., the resource class has twice the memory Figure 1. Its first task is to dynamically create or destroy requested to meet the CPU number requirement). i virtual machines (VMx) and VM queues (VMQx) for The CloudBroker also performs the scheduling of ser- the different used virtual appliances. To do that, first, the vice call requests to VA’s and the life-cycle management VA has to be replicated to the native repository of the of resources. Scheduling decision is made based on the IaaS system from the FCM Repository (an alternative monitoring information gathered from the resources. If method is discussed in Section IV). Alongside the ap- the service request cannot be scheduled to any resource pliance, the FCM Repository also stores additional static the CloudBroker may decide to start a new VM capable requirements about its future instances, like its minimum of serving the request. The decision is based on the resource demands (disk, CPU and memory), that are following: needed by the CloudBroker. This data is not replicated - The number of running VM’s available to handle to the native repository, rather the FCM Repository is the service call; queried. - the number of waiting service calls for the VA in A VM queue stores references to resources capable the service call queue; of handling a specific service call, thus instances of - the average execution time of service calls; a specific VA. New resource requests are new entries - the average deployment time of VA’s; inserted into the queue of the appropriate VA, while - and SLA constraints (e.g., total budget, deadline); resource destruction requests are modification of entries VM decommission is also based on the above, but representing an already running resource. The entries the CloudBroker takes into account the billing period are managed by the VM Handler, that is a cloud fabric of the IaaS system, shutdown is performed only shortly specific component designed to interact with the public before the end of the period with regard to the average interface of a single IaaS system. It simply translates decommission time for the system. and forwards requests to the public interface of the IaaS system (Clouda). Each VA contains a monitoring IV. VIRTUAL APPLIANCE DELIVERY OPTIMIZATION component deployed, that allows the CloudBroker to monitor the basic status (CPU, disk and memory usage) IaaS systems require virtual appliances to be stored in of the running resources along the average deployment their native repositories. Only those virtual appliances, time for each VA and average service execution times. that were previously stored in these repositories, can These data can be queried by the IS Agent of the GMBS. be used to instantiate virtual machines. Our architecture allows users to upload their virtual appliances to the The service call queue (Q and Q ) stores incoming 1 2 FCM Repository that behaves as an active repository and service requests and, for each request, reference to a handles the distribution of the appliances to the native VA in the FCM Repository. There is a single service repositories according to [14]. As an active repository, call queue in each CloudBroker, while there are many the FCM repository identifies the common parts of the VM queues. If the native repository does not contain the appliances and decomposes them into smaller packages requested VA it is replicated first. Dynamic requirements that allow appliance delivery and rebuilding from mul- for the VA may be specified with the service call: tiple repositories. - Additional resources (CPU, memory and disk); Central virtual appliance storage would require the - an UUID, that allows to identify service calls orig- VM Handler to first download the entire appliance from inating from the same entity. the FCM repository to a native one, then instantiate The UUID will allow to meet SLA constraints later, the appliance with the IaaS system. To avoid the first e.g., to enforce a total cost limit on public clouds for transfer, but keep the convenience for the users of our service calls of the entity, or to be in compliance with architecture, we have investigated options to rebuild vir- deadlines. If any dynamic requirements are present the tual appliances in already running virtual machines. We CloudBroker treats the VA as a new VA type, thus have identified two distinct approaches for rebuilding: creating a new VM queue and starts a VM. The service (i) native appliance reuse, (ii) minimal manageable calls may now be dispatched to the appropriate VMs. virtual appliances. The first approach utilizes already Most IaaS systems offer predefined classes of resources available virtual appliances in the native repositories and (CPU, memory and disk capacity) not adjustable by the extends them towards the required virtual appliance. In user, in this case the CloudBroker will select the resource this article, we do not aim at this approach because class that has at least the requested resources available. it requires the investigation of the publicly available

Copyright (c) IARIA, 2011. ISBN: 978-1-61208-153-3 11 CLOUD COMPUTING 2011 : The Second International Conference on Cloud Computing, GRIDs, and Virtualization

appliances in order to find the appliance most suitable the various IaaS cloud infrastructures. Regarding future for extension. works, we plan to investigate various scenarios that arise The second approach proposes the minimal manage- during handling federated cloud infrastructures using the able virtual appliance that we define as basic appliance FCM architecture (e.g., the interactions and interopera- with the following three properties: tion of public and private IaaS systems).

- Offers content management interfaces to add, con- ACKNOWLEDGMENT figure and remove new appliance parts. The research leading to these results has received - Offers monitoring interfaces to analyze the current funding from the European Community’s Seventh state of its instances (e.g., provide access to their Framework Programme FP7/2007-2013 under grant CPU load, free disk space and network usage). agreement 215483 (S-Cube) and 261556 (EDGI). - Optimally sized: only those files present in the appliance that are required to offer their extensibility REFERENCES with the previously mentioned interfaces. [1] E. Di Nitto, C. Ghezzi, A. Metzger, M. Papazoglou, and As a result, our architecture only needs to replicate K. Pohl, “A journey to highly dynamic, self-adaptive service- the MMVAs to every native repository. If the FCM based applications,” Automated Software Engg., vol. 15, pp. 313–341, December 2008. [Online]. Available: http: repository identifies high demands for specific virtual //portal.acm.org/citation.cfm?id=1459074.1459084 appliance parts, then the active repository functionality [2] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, automatically replicates the appliance to those IaaS sys- “Cloud computing and emerging it platforms: Vision, hype, and reality for delivering computing as the 5th utility,” Future tems where most requests were originated from. Generation Computer Systems, vol. 25, no. 6, pp. 599–616, June Our VM Handler is prepared to control virtual ap- 2009. [3] Amazon Web Services LLC, “Amazon elastic compute cloud,” pliance rebuilding using minimal manageable virtual http://aws.amazon.com/ec2/, 2009. appliances. Consequently, the VM Handler applies a new [4] Rackspace Cloud, http://www.rackspace.com/cloud/. strategy when it receives a virtual appliance instantiation [5] L. M. Vaquero, L. Rodero-Merino, J. Caceres, and M. Lindner, “A break in the clouds: towards a cloud definition,” SIGCOMM request for a specific appliance that is not available Comput. Commun. Rev., vol. 39, pp. 50–55, December in the native repository. This strategy starts with the 2008. [Online]. Available: http://doi.acm.org/10.1145/1496091. instantiation of the MMVA. Next, the Handler waits until 1496100 [6] M. Schmidt, N. Fallenbeck, M. Smith, and B. Freisleben, the virtual machine of the MMVA has started up. Then, “Efficient distribution of virtual machines for cloud computing,” it requests the content management interfaces to add the in Proceedings of the 2010 18th Euromicro Conference on parts of the specific appliance that were identified as Parallel, Distributed and Network-based Processing, ser. PDP ’10. Washington, DC, USA: IEEE Computer Society, 2010, unique during the decomposition of the MMVA and the pp. 567–574. [Online]. Available: http://dx.doi.org/10.1109/PDP. specific appliance. As a result, the specific appliance is 2010.39 rebuilt and ready to serve the scheduled service requests [7] P. Marshall, K. Keahey, and T. Freeman, “Elastic site: Using clouds to elastically extend site resources,” in Proceedings of in the virtual machine instantiated for the MMVA. the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, ser. CCGRID ’10. Washington, V. CONCLUSIONAND FUTURE WORKS DC, USA: IEEE Computer Society, 2010, pp. 43–52. [Online]. Available: http://dx.doi.org/10.1109/CCGRID.2010.80 In this paper, we proposed a Federated Cloud Man- [8] C. Vázquez, E. Huedo, R. S. Montero, and I. M. Llorente, agement solution that acts as an entry point to cloud “On the use of clouds for grid resource provisioning,” Future Gener. Comput. Syst., vol. 27, pp. 600–605, May 2011. [Online]. federations. Its architecture incorporates the concepts of Available: http://dx.doi.org/10.1016/j.future.2010.10.003 meta-brokering, cloud brokering and on-demand service [9] Amazon CloudWatch, http://aws.amazon.com/cloudwatch/. deployment – their interaction is exemplified through [10] M. A. Salehi and R. Buyya, “Adapting market-oriented schedul- ing policies for cloud computing.” a low-level use case. The meta-brokering component [11] The World Wide Web Consortium, http://www.w3.org/TR/wsdl. provides transparent service execution for the users by [12] A. Kertész and P. Kacsuk, “Gmbs: A new middleware allowing the system to interconnect the various cloud service for making grids interoperable,” Future Gener. Comput. Syst., vol. 26, pp. 542–553, April 2010. [Online]. Available: broker solutions managed by aggregating capabilities http://dx.doi.org/10.1016/j.future.2009.10.007 of these IaaS cloud providers. We have shown how [13] A. C. Marosi and P. Kacsuk, “Workers in the clouds,” in PDP, CloudBrokers manage the number and the location of the Y. Cotronis, M. Danelutto, and G. A. Papadopoulos, Eds. IEEE Computer Society, 2011, pp. 519–526. utilized virtual machines for the various service requests [14] G. Kecskeméti, G. Terstyánszky, P. Kacsuk, and Z. Németh, they receive. In order to fast track the virtual machine “An approach for virtual appliance distribution for service instantiation, our architecture uses the automatic service deployment,” Future Gener. Comput. Syst., vol. 27, pp. 280–289, March 2011. [Online]. Available: http://dx.doi.org/10.1016/j. deployment component that is capable of optimizing future.2010.09.009 its delivery by decomposing and replicating it among

Copyright (c) IARIA, 2011. ISBN: 978-1-61208-153-3 12