Journal of Computing and Information Technology - CIT 19, 2011, 1, 25–55 25 doi:10.2498/cit.1001864

A Review on : Design Challenges in Architecture and Security

Fei Hu1, Meikang Qiu2,JiayinLi2, Travis Grant1, Draw Tylor1, Seth McCaleb1, Lee Butler1 and Richard Hamner 1

1 Department of Electrical and Computer Engineering, University of Alabama, Tuscaloosa, AL, USA 2 Department of Electrical and Computer Engineering, University of Kentucky, Lexington, KY, USA

Cloud computing is becoming a powerful network archi- Rather than hard-wire data circuits between the tecture to perform large-scale and complex computing. provider and customers, telephone companies In this paper, we will comprehensively survey the concepts and architecture of cloud computing, as well began using VPN-based services to transmit as its security and privacy issues. We will compare data. This allowed providers to offer the same different cloud models, trust/reputation models and amount of bandwidth at a lower cost by rerout- privacy-preservation schemes. Their pros and cons ing network traffic in real-time to accommo- are discussed for each cloud computing security and architecture strategy. date ever-changing network utilization. Thus, it was not possible to accurately predict which Keywords: cloud computing, security, privacy, computer path data would take between the provider and networks, models customer. As a result, the service provider’s network responsibilities were represented by a cloud symbol to symbolize the black of sorts from the end-users’ perspective. It is in 1. Introduction this sense that the term “cloud” in the phrase cloud computing metaphorically refers to the Cloud computing is quickly becoming one of Internet and its underlying infrastructure. the most popular and trendy phrases being tossed Cloud computing is in many ways a conglom- around in today’s technology world. “It’s be- erate of several different computing technolo- coming the phrase du jour”, says Gartner’s Ben [ ] gies and concepts like grid computing, virtu- Pring 1 . It is the big new idea that will alization, autonomic computing [40], Service- supposedly reshape the information technol- ( )[ ] ( ) oriented Architecture SOA 43 , peer-to-peer ogy IT services landscape. According to The (P2P) computing [42], and ubiquitous comput- Economist in a 2008 article, it will have huge ing [41]. As such, cloud computing has inher- impacts on the information technology industry, ited many of these technologies’ benefits and and also profoundly change the way people use [ ] drawbacks. One of the main driving forces computers 2 . What exactly is cloud comput- behind the development of cloud computing ing then, and how will it have such a big impact was to fully harness the already existing, but on people and the companies they work for? under-utilized computer resources in data cen- In order to define cloud computing, it is first ters. Cloud computing is, in a general sense, necessary to explain what is referenced by the on-demand utility computing for anyone with phrase “The Cloud”. The first reference to access to the cloud. It offers a plethora of IT “The Cloud” originated from the telephone in- services ranging from software to storage to dustry in the early 1990s, when Virtual Pri- security, all available anytime, anywhere, and vate Network (VPN) service was first offered. from any device connected to the cloud. More 26 A Review on Cloud Computing: Design Challenges in Architecture and Security formally, the National Institute of Standards and simple; the cloud (internet) can be utilized to Technology (NIST) defines cloud computing as provide services that would otherwise have to a model for convenient, on-demand network ac- be installed on a personal computer. For ex- cess to computing resources such as networks, ample, service providers may sell a service servers, storage, applications, and services that to customers which would provide storage of can be quickly deployed and released with very customer information. Or perhaps a service little management by the cloud-provider [4].Al- could allow customers to access and use some though the word “cloud” does refer to the In- software that would otherwise be very costly ternet in a broad sense, in reality, clouds can in terms of money and memory. Typically, ( be public, private, or hybrid a combination of cloud computing services are sold on an “as ) both public and private clouds . Public clouds needed” basis, thus giving customers more con- provide IT services to anyone in the general pub- trol than ever before. Cloud computing ser- lic with an Internet connection and are owned vices certainly have the potential to benefit both and maintained by the company selling and dis- providers and users. However, in order for cloud tributing these services. In contrast, private computing to be practical and reliable, many ex- clouds provide IT services through a privately- isting issues must be resolved. owned network to a limited number of people within a specific organization. With ongoing advances in technology,the use of As two examples of public consumer-level clouds cloud computing is certainly on the rise. Cloud consider Yahoo!R Mail and YouTube. Through computing is a fairly recent technology that re- these two sites, users access data in the form lies on the internet to deliver services to paying of e-mails, attachments, and videos from any customers. Services that are most often used device that has an Internet connection. When with cloud computing include data storage and users download e-mails or upload videos, they accessing software applications (Figure 1).The do not know where exactly the data came from use of cloud computing is particularly appealing or went. Instead, they simply know that their to users for two reasons: it is rather inexpensive data is located somewhere inside the cloud. In and it is very convenient. Users can access data reality, cloud computing involves much more or use applications with only a personal com- complexity than the preceding two examples puter and internet access. Another convenient illustrate and it is this behind the scenes com- aspect of cloud computing is that software ap- plexity that makes cloud computing so appeal- plications do not have to be installed on a user’s ing and beneficial to individual consumers and computer, they can simply be accessed through large businesses alike. the internet. However, as with anything else Cloud computing is an emerging technology that seems too good to be true, there is one cen- from which many different industries and in- tral concern with the use of cloud computing dividuals can greatly benefit. The concept is technology – security [44].

Figure 1. Cloud computing [5]. A Review on Cloud Computing: Design Challenges in Architecture and Security 27

In the internet, people like to use email for The bottom layer is the physical hardware, namely communication because of its convenience, ef- the cloud-provider owned servers and switches ficiency and reliability. It’s well known that that serve as the cloud’s backbone. Customers most of the contexts have no special meaning, who use this layer of the cloud are usually big which means it’s more likely our daily com- corporations who require an extremely large munication. So, such context is less attractive amount of subleased Hardware as a Service for the hacker. However, when two people or (HaaS). As a result, the cloud-provider runs, two groups, even two countries, want to com- oversees, and upgrades its subleased hardware municate using the internet secretly, for each communication partners they need some kind for its customers. Cloud-providers must over- of encryption to ensure their privacy and confi- come many issues related to the efficient, smooth, dentiality. For sake of that such situation occurs and quick allocation of HaaS to their customers, rarely, users don’t want to make their processor and one solution that allows providers to ad- slower by encrypting the email or other docu- dress some of these issues involves using re- ments. There must be a way for the users to mote scriptable boot-loaders. Remote script- easily encrypt their document only when they able boot-loaders allow the cloud-provider to require such service. specify the initial set of operations executed by In the rest of the paper, we will first survey the servers during the boot process, meaning typical architectures from networks layers view- that complete stacks of software can be quickly point. Then we will discuss cloud models. Next implemented by remotely located data center we move to security issues such as encryption- servers. on-demand. The privacy preservation models will be described. In each part where multiple The next layer consists of the cloud’s software schemes are presented, we will also compare kernel. This layer acts as a bridge between their pros and cons. the data processing performed in the cloud’s hardware layer and the software infrastructure layer which operates the hardware. It is the 2. Cloud Computing Architecture lowest level of abstraction implemented by the cloud’s software and its main job is to man- age the server’s hardware resources while at the 2.1. Abstract Layers same time allowing other programs to run and utilize these same resources. Several different To begin understanding cloud computing in implementations of software kernels include op- some sort of detail, it is necessary to examine erating system (OS) kernels, hypervisors, and it in abstraction layers beginning at the bottom and working upwards. Figure 2 illustrates the clustering middleware. Hypervisors allow mul- five layers that constitute cloud computing [44]. tiple OSs to be run on servers at the same time, A particular layer is classified above another if while clustering middleware consists of soft- that layer’s services can be composed of ser- ware located on groups of servers, which allows vices provided by the layer beneath it. multiple processes running on one or multiple servers to interact with one another as if all were concurrently operating on a single server. The abstraction layer above the software kernel is called software infrastructure. This layer ren- ders basic network resources to the two layers above it in order to facilitate new cloud soft- ware environments and applications that can be delivered to end-users in the form of IT services. The services offered in the soft- ware infrastructure layer can be separated into three different subcategories: computational resources, data storage, and communication. Figure 2. The five abstraction layers of cloud Computational resources, also called Infras- computing [5]. tructure as a Service (IaaS), are available to 28 A Review on Cloud Computing: Design Challenges in Architecture and Security cloud customers in the form of virtual machines ture provider is an autonomous business with its (VMs). Virtualization technologies such as own business goals. A provider federates with para-virtualization [44], hardware-assisted vir- other providers (i.e., other RESERVOIR sites) tualization, live-migration, and pause-resume based on its own local preferences. The IT man- enable a single, large server to act as multi- agement at a specific RESERVOIR site is fully ple virtual servers, or VMs, in order to better autonomous and governed by policies that are utilize limited and costly hardware resources aligned with the site’s business goals. To opti- such as its central processing unit (CPU) and mize this alignment, once initially provisioned, memory. Thus, each discrete server appears resources composing a service may be moved to cloud users as multiple virtual servers ready to other RESERVOIR sites based on economi- to execute the users’ software stacks. Through cal, performance, or availability considerations. virtualization, the cloud-provider who owns and Our research addresses those issues and seeks maintains a large number of servers is able to to minimize the barriers to delivering services benefit from gains in resource utilization and as utilities with guaranteed levels of service and efficiency and claim the economies of scale that proper risk mitigation. arise as a result. IaaS also automatically al- Two examples of data storage systems are Ama- locates network resources among the various zon Simple Storage Service ( S3)[11] virtual servers rapidly and transparently. The and Zecter ZumoDrive [12]. The communi- resource utilization benefits provided by virtu- cation subcategory of the software infrastruc- alization would be almost entirely negated if ture layer, dubbed Communication as a Service not for the layer’s ability to allocate servers’ (CaaS), provides communication that is reli- resources in real-time. This gives the cloud- able, schedulable, configurable, and (if neces- provider the ability to dynamically redistribute sary) encrypted. This communication enables processing power in order to supply users with CaaS to perform services like network secu- the amount of computing resources they re- rity, real-time adjustment of virtual overlays to quire without making any changes to the phys- provide better networking bandwidth or traffic ical infrastructure of their data centers. Several flow, and network monitoring. Through net- current examples of clouds that offer flexible work monitoring, cloud-providers can track the amounts of computational resources to its cus- portion of network resources being used by each tomers include the Amazon Elastic Compute customer. This “pay for what you use” concept Cloud (EC2)[9], Enomaly’s Elastic Computing is analogous to traditional public utilities like Platform (ECP)[10], and RESERVOIR archi- water and electricity provided by utility com- tecture [6]. Data storage, which is also referred panies and is referred to as utility computing. to as Data-Storage as a Service (DaaS), allows Voice over Internet Protocol (Vo IP) telephones, users of a cloud to store their data on servers lo- instant messaging, and audio and video con- cated in remote locations and have instant access ferencing are all possible services which could to their information from any site that has an In- be offered by CaaS in the future. As a result ternet connection. This technology allows soft- of all three software infrastructure subcompo- ware platforms and applications to extend be- nents, cloud-customers can rent virtual server yond the physical servers on which they reside. time (and thus their storage space and pro- Data storage is evaluated based on standards re- cessing power) to host web and online gaming lated to categories like performance, scalability, servers, to store data, or to provide any other reliability, and accessibility. These standards service that the customer desires. are not all able to be achieved at the same time There are several Virtual Infrastructure Man- and cloud-providers must choose which design agements in IaaS, such as CLEVER [25], Open- criteria they will focus on when creating their QRM [8], OpenNebula [16], and Nimbus [19]. data storage system. RESERVOIR architecture CLEVER aims to provide Virtual infrastructure allows providers of cloud infrastructure to dy- Management services and suitable interfaces at namically partner with each other to create a the High-level Management layer to enable the seemingly infinite pool of IT resources while integration of high-level features such as Public fully preserving the autonomy of technologi- Cloud Interfaces, Contextualization, Security cal and business management decisions [6].In and Dynamic Resources provisioning. Open- the RESERVOIR architecture, each infrastruc- QRM is an open-source platform for enabling A Review on Cloud Computing: Design Challenges in Architecture and Security 29 flexible management of computing infrastruc- to run and customize programs that interact tures. It is able to implement a cloud with with Force.com platform servers [14].When several features that allows the automatic de- developers design their cloud software for a spe- ployment of services. It supports different vir- cific cloud environment, their applications are tualization technologies and format conversion able to utilize dynamic scaling and load bal- during migration. OpenNebula is an open and ancing as well as easily have access to other flexible tool that fits into existing data center en- services provided by the cloud software en- vironments to build a Cloud computing environ- vironment provider like authentication and e- ment. OpenNebula can be primarily used as a mail. This makes designing a cloud applica- virtualization tool to manage virtual infrastruc- tion a much easier, faster, and more manage- tures in the data-center or cluster, which is usu- able task. Another example of the PaaS is the ally referred as Private Cloud. Only the more Azure from [46]. Applications for recent versions of OpenNebula are trying to Microsoft’s Azure are written using the .NET support Hybrid Cloud to combine local infras- libraries, and compiled to the Common Lan- tructure with public cloud-based infrastructure, guage Runtime, a language-independent man- enabling highly scalable hosting environments. aged environment. The framework is signifi- OpenNebula also supports Public Clouds by cantly more flexible than AppEngine’s, but still providing Cloud interfaces to expose its func- constrains the user’s choice of storage model tionalities for virtual machine, storage and net- and application structure. Thus, Azure is inter- work management. Nimbus provides two levels mediate between application frameworks like of guarantees: 1) quality of life: users get ex- AppEngine and hardware virtual machines like actly the (software) environment they need, and EC2 [47]. The top layer of cloud computing 2) quality of service: provision and guarantee above the software environment is the appli- all the resources the workspace needs to func- cation layer. Dubbed as Software as a Ser- tion correctly (CPU, memory, disk, bandwidth, vice (SaaS), this layer acts as an interface be- availability), allowing for dynamic renegotia- tween cloud applications and end-users to offer tion to reflect changing requirements and condi- them on-demand and many times fee-based ac- tions. In addition, Nimbus can also provision a cess to web-based software through their web virtual cluster for Grid applications (e.g. a batch browsers. Because cloud users run programs by scheduler, or a workflow system), which is also utilizing the computational power of the cloud- dynamically configurable, a growing trend in provider’s servers, the hardware requirements Grid Computing. of cloud users’ machines can often be signif- icantly reduced. Cloud-providers are able to The next layer is called the software environ- seamlessly update their cloud applications with- ment, or platform, layer and for this reason is out requiring users to install any sort of up- commonly referred to as Platform as a Service grade or patch since all cloud software resides (PaaS). The primary users of this abstract layer on servers located in the provider’s data centers. are the cloud application developers who use it Google Apps and ZOHOR are two examples of as a means to implement and distribute their pro- offerings currently available. In many cases the grams via the cloud. Typically, cloud applica- software layer of cloud computing completely tion developers are provided with a programming- eliminates the need for users to install and run language-level environment and a pre-defined software on their own computers. This, in turn, set of application programming interfaces (APIs) moves the support and maintenance of applica- to allow their software to properly interact with tions from the end-user to the company or or- the software environment contained in the cloud. ganization that owns and maintains the servers Two such examples of current software en- that distribute the software to customers. vironments available to cloud software devel- opers are the Google App Engine and Sales- There are several other layer architectures in force.com’s Apex code. Google App Engine cloud computing. For example, Sotomayor et provides both Java and Python runtime envi- al. proposed a three-layer model [37]: cloud ronments as well as several API libraries to management, Virtual infrastructure (VI) man- cloud application developers [13], while Sales- agement, and VM manager. Cloud management force.com’s Apex code is an object-oriented provides remote and secure interfaces for creat- programming language that allows developers ing, controlling, and monitoring virtualized re- 30 A Review on Cloud Computing: Design Challenges in Architecture and Security sources on an infrastructure-as-a-service cloud. by grids. Clouds present, however, specific ele- VI management provides primitives to sched- ments that call for standardization too; e.g. vir- ule and manage VMs across multiple physical tual images format or instantiation/migration hosts. VM managers provide simple primitives APIs [5]. So enhancing existing standards is (start, stop, suspend) to manage VMs on a single granted to ensure the required interoperability. host. Wang et al. presented a cloud computing architecture, the Cumulus architecture [45].In this architecture, there are also three layers: vir- tual network domain, physical network domain, and Oracle .

2.2. Management Strategies for Multiple Clouds [17]

In order to mitigate the risks associated with un- reliable cloud services, businesses can choose to use multiple clouds in order to achieve adequate fault-tolerance – a system’s ability to continue Figure 3. Architecture for resource management that allows fault-tolerance cloud computing. functioning properly in the event that single or multiple failures occur in some of its compo- nents. The multiple clouds utilized can either be Resource management architecture is realized a combination of 1) all public clouds or 2) pub- by using abstract computational resources as its lic and private clouds that together form a single most basic building block. Next, a set of op- hybrid cloud. A lack of well-defined cloud com- erations are defined so that each computational puting standards means that businesses wishing resource can be identified, started, stopped, or to utilize multiple clouds must manage multi- queried by the resource manager as to their pro- ple, often unique cloud interfaces to achieve cessing state. Finally, the resource manager their desired level of fault-tolerance and com- should provide some sort of interface to the IT putational power. This can be achieved through employees of the company utilizing it to allow the formulation of cloud interface standards so them to view, as well as manually modify, the that any public cloud available to customers amount of cloud computational resources be- utilizes at least one well-defined standardized ing used by the company. Upon proper set-up interface. As long as all cloud-providers use and configuration of IT services from multiple a standardized interface, the actual technology cloud-providers, the resource manager should and infrastructure implemented by the cloud is map the actual resources of each connected not relevant to the end-user. Also, some type of cloud to a generic, abstract representation of resource management architecture is needed to its computational resources. Ideally, once the monitor, maximize, and distribute the compu- actual resources of a cloud from a particular tational resources available from each cloud to cloud-provider have been mapped, then any the business. other cloud from that same cloud-provider could also be ported over to the resource manager There are several multiple clouds managements, as a generic computational resource represen- including the Intercloud Protocols [20],and tation using the same mapping scheme. Us- some method of cloud federation [5, 7, 48]. ing the newly created generic representations The Intercloud Protocols consist of six layers: of computational resources, the resource man- actual physical layer, physical metaphor layer, ager’s algorithm could systematically identify platform metaphor layer, communication layer, each available resource, query its amount of uti- management layer and endpoints layer. The lized processing power, and then start or stop the possibility of a worldwide federation of Clouds resource accordingly in such a way as to maxi- has been studied recently [3, 5, 48]. Some of the mize the total amount of cloud resources avail- challenges ahead for the Clouds, like monitor- able to the business for a targeted price level. ing, storage, QoS, federation of different orga- This process is depicted in Figure 3. Depend- nizations, etc. have been previously addressed ing on each business’s scenario and individual A Review on Cloud Computing: Design Challenges in Architecture and Security 31 preferences, the development of a suitable algo- commands were issued by the resource manager rithm for the cloud manager to implement could to EC2 via the Query interface. Each command be a seemingly complex task. was issued a timestamp upon transmission from One such instance of a hybrid cloud resource and arrival to the resource manager. Then, the manager has been realized through the collab- same process was repeated for EC2 using the oration of Dodda, Moorsel, and Smith [15]. SOAP interface. A comparison of both inter- faces’ response times yielded a mean response Their cloud management architecture managed ( ) in tandem the Amazon EC2 via Query and time of 508 milliseconds ms for the Query Simple Object Access Protocol (SOAP) inter- interface and 905 ms for the SOAP interface. faces as well as their own private cloud via a Thus, the SOAP interface’s response time was Representational State Transfer (REST) inter- nearly double the response time of the Query face. The Query interface of EC2 uses a query interface. In addition, the variance of the SOAP string placed in the Uniform Resource Loca- interface response times was much larger than tor (URL) to implement the management op- the variance exhibited by the response times of erations of the resource manager. Amazon, as the Query interface. So, the Query interface’s the cloud-provider, provides a list of defined response time is not only faster on average, but parameters and their corresponding values to also more consistent than the response time of be included in a query string by the resource the SOAP interface. Based on these results, it manager. These query strings are sent by the should come as no surprise that most businesses cloud resource manager to a URL called out by use EC2’s Query interface instead of its SOAP the cloud-provider via HTTP GET messages in interface. This comparison clearly shows that order to perform necessary management oper- when using a particular cloud, businesses must ations. In this way, EC2’s interface is mapped be careful in choosing the most responsive inter- to a generic interface that can be manipulated face to connect to their resource manager so as by the resource manager. The SOAP interface to maximize performance and quality of service of EC2 operates very similarly to the Query in- (QoS). The less amount of time required for pro- terface. The same operations are available to cessing commands issued by the resource man- resource managers using the SOAP interfaces ager, the more responsive the resource manager as are available when using the Query interface, can be at maximizing and distributing the com- and HTTP GET messages are sent to a URL putational resources available from each cloud specified by the cloud-provider in order to per- to the business. form cloud management operations. The only difference between the two interfaces is the ac- Another example of hybrid cloud management tual parameters themselves that are needed to described by H. Chen et al. utilizes a technique ( )[ ] perform each operation. called intelligent workload factoring IWF 17 . As was the case with the previously described The REST interface of the private cloud assigns cloud resource manager approach, businesses a global identifier, in this case a Uniform Re- can utilize IWF to satisfy highly dynamic com- ( ) source Identifier URI , to each local resource. puting needs. The architecture of the IWF Each local resource is then manipulated via management scheme allows the entire compu- HTTP and mapped to its generic interface in tational workload to be split into two separate much the same way as EC2’s SOAP and Query types of loads, a base load and a trespassing interfaces. After each cloud-specific interface load, when extreme spikes in the workload oc- had been successfully mapped to its generic in- cur. The base load consists of the smaller, terface, the resource manager was able to ef- steady workload that is constantly demanded fectively oversee both the EC2 and the private by users of cloud computing services while the cloud at the same time by using a single, com- trespassing load consists of the bigger, transient mon set of interface commands. workload demanded on an unpredictable basis However, one might wonder whether the choice by users. Both loads are managed indepen- of interface that the resource manager uses to dently from one another in separate resource oversee a cloud has a significant influence on the zones. The resources necessary to handle the cloud’s performance. To answer this question, a base load can be obtained from a company’s series of 1,000 identical management operation private cloud, and are thus referred to as the 32 A Review on Cloud Computing: Design Challenges in Architecture and Security base load resource zone, while the elastic com- form of a local cloud cluster as its base load putational power to meet the needs of the tres- resource zone and the Amazon EC2 as its tres- passing load can be accessed on-demand via passing load resource zone. The architecture a public cloud supplied by a cloud-provider, used in this IWF hybrid cloud management sys- which is also known as the trespassing load re- tem is depicted in Figure 4. source zone. Base load management should proactively seek to use base zone resources ef- The IWF module, located at the very front of the ficiently by using predictive algorithms to an- management architecture, has a job that’s two- ticipate the future base load conditions present fold: first, it should send a smooth, constant at all times in the company’s network. In con- workload to the base zone while at the same trast, trespass load management should have the time avoid overloading either zone by exces- ability to devote trespass zone resources to sud- sively redirecting load traffic; second, the IWF den, unpredictable changes in the trespass load module should effectively break down the over- in a systematic, yet agile fashion. Part of the all load initiated by users of the network, both IWF model’s task is to ensure that the base load by type and amount of requested data. The pro- remains within the amount planned for by the cess of load decomposition helps to avoid data company’s IT department; in most cases, this redundancy in the management scheme. The is the computing capacity of the business’s pri- base zone of the IWF system is on 100% of the vate cloud (i.e. on-site servers in data center). time, and, as mentioned previously, the volume By doing this, the computational capacity that of the base load does not vary much with time. the business’s on-site data center has to be over- As a result, the resources of the private cloud planned for is considerably decreased, resulting that handle the base load are able to be run in less capital expenditure and server underuti- very efficiently. Nonetheless, a small amount lization for the business. The other part of the of over-provisioning is still necessary in order IWF model’s job is to make sure a minimal for the local cloud cluster to be able to handle amount of data replication is required to process small fluctuations in the base load volume and the services obtained from the public cloud. to guarantee an acceptable QoS to the business’ One instance of such an IWF management scheme employees. The trespassing zone of the system was created by H. Chen and his colleagues [14]. is only on for X% of the time, where X is typi- Their system employed a private cloud in the cally a small single-digit number. Because the

Figure 4. The IWF hybrid cloud management system [17]. A Review on Cloud Computing: Design Challenges in Architecture and Security 33 trespassing zone must be able to adequately han- continually updates the system load as changes dle large spikes in processing demand, its com- occur to the number and nature of computing putational capacity must be over-provisioned by requests initiated by users of the network. By a considerably large amount. comparing the current system load with the base load threshold (the maximum load that the base The procedure for implementing IWF can be ) modeled as a hypergraph partitioning problem zone can adequately handle , the workload pro- consisting of vertices and nets. Data objects in filing component places the system in one of the load traffic, such as streaming videos or ta- two states – normal mode if the system load bles in a database, are represented by vertices, is less than or equal to the base load threshold while the requests to access these data objects or panic mode if the system load is greater than are modeled as nets. the base load threshold. The base load threshold can either be set manually by those in charge of The link between a vertex and a net depicts the the system or automatically by utilizing the past relationship that exists between a data object history of the base load. When the system is in and its access request. Formally, a hypergraph normal mode, the fast factoring component for- is defined as wards the current system load to the base zone. H =(V, N)(1) However, when the system is in panic mode, where V is the vertex set and N is the net set. the fast factoring component implements a fast frequent data object detection algorithm to see Each vertex vi ∈ V is assigned weights wi and if a computing request is seeking a data object si that represent the portion of the workload caused by the ith data object and the data size that is in high demand. of the ith data object, respectively. The weight wi is computed as the average workload per ac- cess request of the ith data object multiplied by its popularity. Each net nj ∈ N is assigned acostcj that represents the expected network overhead required to perform an access request of type j.Thecostc attributed to each net is j Figure 5. Flow diagram of IWF management scheme calculated as the expected number of type j ac- components [17]. cess requests multiplied by the total data size of all neighboring data objects. The problem that must be solved in the design of the IWF manage- If the object being requested is not in high de- ment system involves assigning all data objects, mand, the fast factoring component forwards or vertices, to two separate locations without that portion of the system load to the base zone; causing either location to exceed its workload if the requested object is in high demand, that capacity while at the same time achieving a min- portion of the system load is immediately for- imum partition cost. This is given by warded to the trespassing zone. In essence, the fast factoring component removes the sporadic Min( cj +  si)(2) surges in computing power from the base zone (which is not equipped to handle such requests) n ∈N vi∈Vtrespas sin g j exp ected and instead shifts them over to the trespassing where the first summation is the total cost of all zone (which is over provisioned to handle such nets that extend across multiple locations (i.e. spikes in the computing load) to be processed. ) the data consistency overhead , the second sum- The fast frequent data object detection algo- mation is the total data size of objects located rithm features a First In, First Out (FIFO) queue in the trespassing zone (i.e. the data replication ) of the past c computing requests as well as two overhead ,and is a weighting factor for each lists of the up-to-date k most popular data ob- of the two summations. jects and historical k most popular data objects The IWF management scheme consists of three based on the frequency of access requests for main components: workload profiling, fast fac- each data object. The algorithm uses its queue toring, and base load threshold. A flow diagram and popular data object lists to accurately pin- of how these three components interact is shown point data objects that are in high demand by in Figure 5. The workload profiling component the system The level of accuracy achieved by 34 A Review on Cloud Computing: Design Challenges in Architecture and Security the algorithm depends largely on how correct streaming workload would require the yearly the counters are that measure the numbers of cost of maintaining 790 servers in a local data access requests for each data object For a data center. To handle the same workload using only object T, its access request rate, p(T), is defined a public cloud like Amazon EC2’s would cost as a staggering $1.3 million per year! Needless requests p(T)= T (3) to say, although $0.10 per machine hour does requeststotal not seem like a considerable cost, it adds up where requestsT is the number of access re- very quickly for a load of considerable size. quests for data object T and requeststotal is the Finally, the hybrid cloud configuration imple- total number of access requests for all data ob- menting IWF still maintains the ability to han- jects combined. The algorithm can approximate dle large fluctuations in load demand and re- p(Tˆ)(where the hat indicates an approximation quires a yearly cost of approximately $60,000 instead of an exact result) such that plus the cost of operating 99 servers in a local data center. This is by far the most attractive   p(Tˆ) ∈ p(T) · 1 − , p(T) · 1 + configuration among the three cloud computing 2 2 models and should warrant strong consideration (4) by any small or midsize business. where  is a relatively small number that repre- sents the percent error in the request rate approx- imation for a data object T. Obviously, a lower Cloud Computing Annual Cost  (and thus percent error in approximating ac- Configuration cess request rates) allows the algorithm to more Private Cloud Cost of running a data center accurately determine which data objects are in composed of 790 servers high demand so that the frequent requests for Public Cloud $1,384,000 such data objects can be passed on to the tres- $58,960 plus cost of running Hybrid Cloud a data center composed of passing zone of the IWF management system with IWF 99 servers and processed accordingly. Therefore, through the use of the fast frequent data object detection Table 1. Cost comparsion of three different cloud algorithm the performance of the IWF hybrid [ ] cloud management system is greatly improved. computing configurations 17 . As a way to compare the costs of different cloud computing configurations, H. Chen and his col- [ ] 2.3. Integrate Sensor Networks and Cloud leagues 17 used the hourly workload data from Computing [18] Yahoo! Video’s web service measured over a 46-day period to serve as a sample workload. ( In addition to cloud computing, wireless sen- The workload measured in terms of requested ( ) video streams per hour) contained a total of over sor networks WSNs are quickly becoming a 32 million stream requests. The three cloud technological area of expansion. The ability computing configurations considered were: 1) of their autonomous sensors to collect, store, a private cloud composed of a locally operated and send data pertaining to physical or envi- data center, 2) the Amazon EC2 public cloud, ronmental conditions while being remotely de- ployed in the field has made WSNs a very at- and 3) a hybrid cloud containing both a local tractive solution for many problems in the areas cloud cluster to handle the bottom 95-percentile of industrial monitoring, environmental moni- of the workload and the Amazon EC2 public toring, building automation, asset management, cloud provisioned on-demand to handle the top and health monitoring. WSNs have enabled a 5-percentile of the workload. For the Amazon whole new wave of information to be tapped EC2, the price of a single machine hour was that has never before been available. By sup- assumed to be $0.10. The cost results of the plying this immense amount of real-time sensor study are shown in Table 1, where for the sake data to people via software, the applications are of simplification, only the costs associated with limitless: up-to-date traffic information could running the servers themselves are considered. be made available to commuters, the vital signs From Table 1, implementing a private cloud of soldiers on the battlefield could be monitored that could handle Yahoo! Video’s typical video by medics, and environmental data pertaining to A Review on Cloud Computing: Design Challenges in Architecture and Security 35 earthquakes, hurricanes, etc. could be analyzed and allows for the cost of the nodes that com- by scientists to help predict further disasters. prise WSNs to remain minimal. The pub- All of the data made available using WSNs will lisher/subscriber broker consists of four main need to be processed in some fashion. Cloud components: the stream monitoring and pro- computing is a very real option available to sat- cessing component (SMPC), registry compo- isfy the computing needs that arise from pro- nent (RC), analyzer component (AC), and dis- cessing and analyzing all of the collected data. seminator component (DC). The SMPC receives A content-based publish/subscribe model for the various streams of sensor data and initi- how WSNs and cloud computing can be in- ates the proper method of analysis for each tegrated together is described in [18]. Pub- stream. The AC ascertains which SaaS appli- lish/subscribe models do not require publishers, cations the sensor data streams should be sent or sensor nodes, to send their messages to any to and whether or not they should be delivered specific receiver, or subscriber. Instead, sent on a periodic or emergency basis. After proper messages are classified into classes and those determination, the AC passes this information who want to view messages, or sensor data, on to the DC for delivery to its subscribers via pertaining to a particular class simply subscribe SaaS applications. The DC component utilizes to that class. By disconnecting the sensor nodes an algorithm designed to match each set of sen- from its subscribers, the abilities of a WSN sor data to its corresponding subscriptions so to scale in size and its nodes to dynamically that it can ultimately distribute sensor data to change position are greatly increased. This is users of SaaS applications. SaaS supplications necessary as nodes are constantly reconfiguring are implemented using cloud resources and thus themselves in an ad-hoc manner to react to sud- can be run from any machine that is connected denly inactive, damaged nodes as well as newly to the cloud. SaaS packages the sensor data deployed and/or repositioned nodes. In order produced from WSNs into a usable graphical to deliver both real-time and archived sensor in- interface that can be examined and sorted by its formation to its subscribers, an algorithm that users. quickly and efficiently matches sensor data to its corresponding subscriptions is necessary. In ad- dition, the computing power needed to process 2.4. Typical Applications of Cloud and deliver extremely large amounts of real- Computing time, streaming sensor data to its subscribers may (depending on the transient spikes of the In describing the benefits of cloud computing, load) force cloud providers to be unable to main- one cannot help being bombarded by reasons tain a certain QoS level. Therefore, a Virtual related to economics. The two main reasons Organization (VO) of cloud-providers might be why cloud computing can be viewed as bene- necessary to ensure that this scenario does not ficial and worthwhile to individual consumers occur. In a VO, a dynamic partnership is formed and companies alike are its cheap cost and ef- with a certain understanding regarding the shar- ficient utilization of computing resources. Be- ing of resources to accomplish individual but cause cloud computing users only purchase the broadly related goals. on-demand virtual server time and applications they need at little to no upfront cost, the capi- In the model described by M. M. Hassan et [ ] tal expenditure associated with purchasing the al. 18 , sensor data is received by the gateway physical network infrastructure is no longer re- nodes of each individual WSN. These gateway quired. Moreover, the physical space needed nodes serve as the intermediary between the to house the servers and switches, the IT staff sensor nodes and a publisher/subscriber broker. / needed to operate, maintain, and upgrade the The publisher subscriber broker’s job is to de- data center, and the energy costs of operating liver sensor information to the SaaS cloud com- the servers are all suddenly deemed unneces- puting applications under dynamically chang- sary. ing conditions. In this way, sensor data from each WSN need only to be sent to one fixed lo- For small businesses, barriers to enter markets cation, the publisher/subscriber broker, instead that require significant amounts of computing of directly to multiple SaaS applications simul- power are substantially lowered. This means taneously; this reduces the amount of complex- that small companies have newly found access ity necessary in the sensor and gateway nodes to computing power that they otherwise would 36 A Review on Cloud Computing: Design Challenges in Architecture and Security never have been able to acquire. Because com- market standards. This also means that a con- puting power costs are calculated based on us- stant flow of new cloud-providers is entering age, small businesses do not have to take on the cloud computing market in an effort to each the unnecessary risk associated with commit- secure a large and loyal customer base. ting a large amount of capital to purchasing net- As is the case with any developing technol- work infrastructure. The ability of a company ogy, a constantly changing marketplace makes to grow or shrink the size of its workforce based choosing the right cloud-provider and the proper on business demand without having to scale its cloud computing standards difficult. Customers network infrastructure capacity accordingly is face the risk of choosing a wrong provider who another benefit of on-demand pricing and sup- will soon go out of business and take the cus- [ ] ply of computing power 21 . tomer’s data with them or backing a cloud com- While operating completely without any on- puting standard that will eventually become ob- location, physical servers might be an option solete. Once an individual consumer or busi- for some small companies, many colleges and ness has chosen a particular provider and its associated cloud, it is currently difficult (but large companies might not consider this option ) very feasible. However, they could still stand not altogether impossible to transfer data be- to benefit from cloud computing by construct- tween clouds if the customer wants to switch ing a private cloud. Although a private cloud cloud-providers, and most likely, the customer would still require on-site server maintenance will have to pay for the move in the form of and support and its relatively small size would switching costs. prevent benefits related to economies of scale Although virtualization has helped resource ef- from being realized, the efficiency gains real- ficiency in servers, currently, performance in- ized from server virtualization would be large terference effects still occur in virtual environ- enough in most cases to justify the effort. The IT ments where the same CPU cache and trans- research firm Infotech estimates that distributed lation lookaside buffer (TLB) hierarchy are physical servers “generally use only 20 percent shared by multiple VMs. The increasing num- of their capacity, and that, by virtualizing those ber of servers possessing multi-core processors server environments, enterprises can boost hard- further compounds the effects of this issue. As ware utilization to between 60 percent and 80 a result of these interference effects, cloud- percent” [22]. In addition, the private cloud providers are currently not able to offer their could be connected to a public cloud as a source customers an outright guarantee as to the spe- of on-demand computing power. This would cific level of computing performance that will help combat instances where spikes in compu- be provided. tational power exceeded the capacity of the pri- The physical location of a cloud-provider’s ser- vate cloud’s virtualized servers. Part of section vers determines the extent to which data stored V,entitled ‘Management Strategies for Multiple on those servers is confidential. Due to re- Clouds’, explicates in considerable detail how cent laws such as the Patriot Act, files stored multiple clouds, whether public or private, can on servers that physically reside in the United be usefully utilized together in this manner. States are subject to possible intense examina- tion. For this reason, cloud-providers such as Amazon.com allow their customers to choose 2.5. Comparisons on Cloud Computing between using servers located in the United Models States or Europe. Another negative aspect of cloud computing in- From the preceding discussion it might seem as volves data transfer and its associated costs. if cloud computing has virtually no downsides Currently, because customers access a cloud associated with it as long as the cloud is large using the Internet, which has limited band- enough in size to realize economies of scale; width, cloud-providers recommend that users yet, cloud computing does have several draw- transfer large amounts of information by ship- backs. Because cloud computing is still a new ping their external storage devices to the cloud- concept and still in its infant stages of develop- provider. Upon receipt of the storage device, ment, there is currently a lack of well-defined the cloud-provider uploads the customer’s data A Review on Cloud Computing: Design Challenges in Architecture and Security 37 to the cloud’s servers. For instance, Amazon customers. Many cloud-providers utilize per- S3 recommends that customers whose data will unit pricing for data transfers (as mentioned in take a week or more to upload should use Ama- the preceding paragraph’s example of AWS Im- zon Web Services (AWS) Import/Export. This port/Export) and storage space. GoGrid Cloud tool automates shipping an external storage de- Hosting, however, measures their server compu- vice to AWS Import/Export using a mail courier tational resources used in random access mem- such as United Parcel Service (UPS). For this ory (RAM) per hour [24]. Subscription-based service, the user is charged $80 per storage de- pricing models are typically used for SaaS. vice handled and $2.49 per data-loading hour Rather than charge users for what they actually ( where partial data-loading-hours are billed as use, cloud-providers allow customers to know )[ ] full hours 23 . Until the Internet infrastruc- in advance what they will be charged so they ture is improved several times over, transferring can accurately predict future expenses. large amounts of data will be a significant obsta- cle that prevents some companies from adopting Because transitioning between clouds is diffi- cloud computing. cult and costly, end-users are somewhat locked There are several different cloud pricing mod- in to whichever cloud-provider they choose. els available, depending on the provider – the This presents reliability issues associated with main three are tiered, per-unit, and subscription- the chosen cloud. Although rare and only for a based pricing. Amazon.com’s clouds offer tiered matter of hours in most cases, some cloud ac- pricing that corresponds to varying levels of of- cess failures have already occurred with Google fered computational resources and service level and Amazon.com’s services. One such outage agreements (SLAs). SLAs are part of a cloud- occurred with the Amazon EC2 as a result of provider’s service contract and define specific a power failure at both its Virginia data center levels of service that will be provided to its and its back-up data center in December 2009.

Public Cloud Private Cloud Hybrid Cloud Allows for complete control of Most cost-efficient through 1. Simplest to implement and use server software updates utilization flexibility of public patches, etc. and private clouds 2. Minimal upfront costs Minimal long-term costs Less susceptible to prolonged service outages 3. Utilization efficiency gains Utilization efficiency gains Utilization efficiency gains Pros through server virtualization through server virtualization through server virtualization Suited for handling large 4. Widespread accessibility – spikes in workload Requires no space 5. dedicated for data center – – Suited for handling large 6. spikes in workload – – Difficult to implement due 1. Most expensive long-term Large upfront costs to complex management schemes and assorted cloud center Susceptible to prolonged Susceptible to prolonged Requires moderate amount 2. of space dedicated for services outages services outages data center Cons 3. – Limited accessibility – Requires largest amount 4. – of space dedicated – for data center 5. – Not suited for handling – large spikes in workload

Table 2. Summarization of pros and cons for public, private, and hybrid cloud. 38 A Review on Cloud Computing: Design Challenges in Architecture and Security

This outage caused many east coast Amazon cut costs without sacrificing their productivity. AWS users to be without service for approxi- These cost savings stem from little or no up- mately five hours. While these access issues keep required by their in-house IT departments are inevitable with any data center (including as well as their ability to forego the huge capital those on-site), lack of access to a business’s expenditures associated with purchasing servers cloud could occur at very inopportune instances for their local data center. Even for those com- resulting in a business’s computing capability panies that adopt hybrid clouds which require being effectively shut down for the duration of some local servers, the number of these is dra- the service outage. This issue has the poten- matically reduced due to over provisioning of tial to nullify the hassle-free benefit that many the data center no longer being necessary. In cloud-providers tout as a selling point for utiliz- addition to cost savings, cloud computing al- ing cloud computing. lows companies to remain agile and more dy- namically responsive to changes in forecasted business. If a company needs to quickly scale 2.6. Summary of Cloud Computing up or down the computing power of their net- Architecture work infrastructure, they can simply pay for more on-demand computing resources available A common misconception among many of those from cloud-providers. This same level of scala- who would consider themselves “tech-savvy” bility is simply not achievable with a local data technology users is that cloud computing is a center only. As cloud computing technology fancy word for grid computing; grid computing continues to mature, an increasing number of is a form of parallel computing where a group of businesses are going to be willing to adopt it. networked computers form a single, large vir- For this reason, cloud computing is much more tual supercomputer that can perform obscenely than just the latest technological buzzword on large amounts of computational operations that the scene – it’s here to stay. would take normal computers many years to complete. Although cloud computing is in some ways a descendant of grid computing, it is by no 3. Cloud Computing Security means an alternative title. Instead, cloud com- puting offers software, storage, and/or com- 3.1. Why Security in Cloud Computing? puting power as a service to its users, whether they be individuals or employees of a company. By using offloading data and cloud computing, Due to its efficient utilization of computing re- a lot of companies can greatly reduce their IT sources and its ability to scale with workload cost. However, despite tons of merits of cloud demand, cloud computing is an enticing new computing, many companies owners began to technology to businesses of all sizes. However, worry about the security treats. Because in the the decision each business faces is not quite cloud-based computing environment, the em- as clear-cut as whether cloud computing could ployees can easily access, falsify and divulge prove beneficial to it. Instead, each business the data. Sometime such behavior is a disaster must make a decision as to what type of cloud for a big and famous company. it will utilize: public, private, or hybrid. Each cloud type has its own set of positives and neg- Encryption is a kind of ideal way to solve such atives. These benefits and drawbacks are listed problem, whereas for the customers who are us- in Table 2. For example, the public cloud has ing the cloud computing system cannot use such encrypted data. The original data must be used the minimum upfront cost due to the fact that in the host memory otherwise the host VM ma- users do not need to invest in the infrastructure. chine cannot do applications on-demand. For However, it has the maximum long term cost that sake, people can hardly achieve the good since users have to pay for the lease in long security in today’s Cloud services. For exam- term, while the user in the private cloud need to ple, the Amazon’s EC2 is one of the service pay less in long term due to the ownership of providers who have privileged to read and tam- the cloud. per the data from the customers. There is no In recent years, more and more companies have security for the customers who use such ser- begun to adopt cloud computing as a way to vice. A Review on Cloud Computing: Design Challenges in Architecture and Security 39

Some service providers develop some technical Engineers cannot fixate only on the present is- method aimed to avoid the security treats from sues with cloud computing security but they the interior. For instance, some providers limit must also prepare for the future. As cloud the authority to access and manage the hard- computing services become increasingly pop- ware, monitor the procedures, and minimize the ular, new technologies are destined to arise. number of staff who has privilege to access the For example, in the future, user information is vital parts of the infrastructure. However, at the likely going to be transferred among different provider backend, the administrator can also ac- service providers. This certainly increases the cess the customer’s VM-machine. complexity of the security mechanism because the information must be tracked and protected Security within cloud computing is an espe- wherever it may go. As businesses begin to rely cially worrisome issue because of the fact that upon cloud computing, the speed of services the devices used to provide services do not be- will need to be as high as possible. However, long to the users themselves. The users have no the speed of services cannot come about at the control of, nor any knowledge of, what could cost of lowered security, after all security is happen to their data. This is a great concern in the top concern among cloud computing users. cases when users have valuable and personal in- Another aspect of future cloud computing that formation stored in a cloud computing service. engineers must be aware of is the increasing Personal information could be “leaked” out, population of cloud computing users. As cloud leaving an individual user or business vulnera- computing services become more popular, the ble for attack. Users will not compromise their cloud will have to accommodate more users. privacy so cloud computing service providers This raises even more concern about informa- must ensure that the customers’ information is tion security and also increases the complexity safe. This, however, is becoming increasingly of cloud computing security mechanisms. Se- challenging because as security developments curity designers must track every bit of data and are made, there always seems to be someone also protect information from other users as well to figure out a way to disable the security and as provide a fast and efficient service. take advantage of user information. Aware of Risk of information theft is different for ev- these security concerns, cloud computing ser- ery cloud computing service and thus should be vice providers could possibly face extinction if addressed differently when designing security problems which are hindering fail-proof secu- schemes. For example, services which process rity are not resolved. and store public information that could also be Designing and creating error-proof and fail- found in say, a newspaper, would need a very proof security for cloud computing services falls low amount of security. On the other hand, ser- on the shoulders of engineers. These engineers vices which process personalized data about an face many challenges. The services must elim- individual or a business must be kept confiden- inate risk of information theft while meeting tial and secure. This type of data is most often government specifications. Some governments at risk when there is unauthorized access to the have laws limiting where personal information service account, lost copies of data throughout can be stored and sent. If too much security de- the cloud, or security faults. Because of these tail is relayed to the users, then they may become concerns, a security scheme in which all data concerned and decide against cloud computing have a “tracking device” that is limited to cer- altogether. If too little security detail is relayed tain locations could be implemented. However, to the users, then the customers will certainly care must be taken to also ensure that the data is find business elsewhere. If a cloud comput- not transferred or processed in any way without the consent and knowledge of the user. ing service provider’s security is breached and valuable user information is gathered and used When preparing to design a cloud computing against them, then the user could, and likely service and in particular a security scheme, sev- would, sue the provider and the company could eral things must be kept in mind in order to lose everything. Given these scenarios, engi- effectively protect user information. First and neers certainly have to be flawless in their secu- foremost, protection of user personal informa- rity mechanism designs. tion is the top priority. Keeping this in mind 40 A Review on Cloud Computing: Design Challenges in Architecture and Security will certainly provide a better service to cus- Regarding practical cloud computing products, tomers. The amount of valuable personal in- Microsoft Azure cloud computing plans to of- formation that is given to a provider should be fer a new security structure for its multi-tenant kept to a minimum. For example, if the ser- cloud environments as well as private cloud vice being provided does not require a telephone software. Google has a multi-layered security number, then the user should not have to offer process protocol to secure cloud data. Such that information to the provider. Where pos- a process has been independently verified in sible, personal information could be encrypted a successful third-party SAS 70 Type II audit. or somehow kept anonymous. Also, mecha- Google is also able to efficiently manage se- nisms which destroy personal data after use and curity updates across our nearly homogeneous mechanisms which prevent data copying could global cloud computing infrastructure. Intel be implemented. The amount of control that uses SOA Expressway Cloud Gateway to en- a user has over their information should be in- able cloud security. Intel’s Cloud Gateway acts creased to a maximum. For example, if a service as a secure broker that provides security cloud requires that some personal information be sent connectors to other companies’ cloud products. out through the cloud and processed, then the user should be able to control where and how that information is sent. The user should also 3.2. Encryption-on-Demand [26] be able to view their data and decide whether or not certain information can be sent out through The basic idea for encryption-on-demand is that the cloud. Before any data processing or storing they try to use the encryption-on-demand server is done, the user should be asked for consent to which can provide some kind of encryption ser- carry out the operation. Maximizing customer vice. For example, when the server gets a control and influence will earn user trust and request from user, the website or server will establish a wider consumer base. make a unique encryption program, the so- called ‘client’ program, send a package includ- Some organizations have been focusing on secu- ing such program to the client. However, how rity issues in the cloud computing. The Cloud to identify a client program or how to assign a Security Alliance is a non-profit organization client program to a client? They use a tailoring formed to promote the use of best practices identification (TID) The user can forward the for providing security assurance within Cloud TID to the server, the server will use TID to Computing, and provide education on the uses send them the so-tailor made copy. So how do of Cloud Computing to help secure all other the users or communication partners compro- forms of computing [37]. The Open Security mise the TID? So far, there are many different Architecture (OSA) is another organizations fo- ways to do that, by using cell phone, text mes- cusing on security issues. They propose the sage, or other communication tools. By using OSA pattern [38], which pattern is an attempt some reference information, such as birth city, to illustrate core cloud functions, the key roles birth day, or some other personal information all for oversight and risk mitigation, collaboration of which can only be known by the communi- across various internal organizations, and the cation partners. As long as the two partners can controls that require additional emphasis. For receive the client package including the client example, the Certification, Accreditation, and program, they can encrypt their messages or Security Assessments series increase in impor- email easily. After communication, both users tance to ensure oversight and assurance given should dump the client program they used. So, that the operations are being “outsourced” to every time when they want to communicate with another provider. System and Services Acqui- each other privately, they need to download the sition is crucial to ensure that acquisition of ser- package. vices is managed correctly. Contingency plan- In Figure 6, the client (both communicating ning helps to ensure a clear understanding of partners) sends the UID (same UID for two how to respond in the event of interruptions to partners) to the server, the server will assign service delivery. The Risk Assessment controls a package which binds the encryption system are important to understand the risks associated and choose key to the client. Further, the client with services in a business context. can use such encryption-on-demand package to A Review on Cloud Computing: Design Challenges in Architecture and Security 41

Figure 6. The client sending the UID to the server [26]. encrypt the message and communicate with the and switch from one VM to another VM. TVDc partner. consists of a bunch of VMs and the associ- ated resources which can be used for one user’s program. In TVD, the VMs have some labels 3.3. Security for the Cloud Infrastructure: which can be identified uniquely. For example, Trusted Virtual Data Center one label is corresponded with one customer or Implementation [27, 28] the customer’s data. TVDc isolation policy in- cludes two major tasks: (1) label that can be For managing the hardware or resources in the used to indentify the VMs which are assigned same physical system, the VMM (virtual ma- to the customers, (2) allow all the VMs to run chine monitor) can create multiple VMs (virtual on the same TVD. Based on the security label, machines). For the server, this infrastructure the control management can assign the author- provides a very convenient way to create, mi- ity to the users for the accessibility. Figure 7 grate, and delete the VMs. The cloud comput- illustrates such a principle. ing concept can easily achieve the big scale and cheap services. However, using one physical There are three basic components in the net- machine to execute all the workloads of all the work (Figure 8), such as Client: the laptop, users, it will make some serious security issues. desktop or PDA which can be seen as the data Because this infrastructure needs many likeli- resource. These data need to be processed by hood components, it will easily lead to the mis- the cloud servers. Cloud storage server (CSS) configuration problems. The so-called trusted has a huge space to store the data. Third party virtual data center has different VMs and as- auditor (TPA) can be responded with the data sociated hardware resources. In this way, the verification. In Figure 8, the client can deliver VM will know which resource it can access. the data to the cloud server, the admin will pro- TVDc can separate each customer workloads cess the data before storage. Since all these data to different associated virtual machines. The aren’t processed in the local computer, the cus- advantages are very obvious, 1) make sure no tomer needs to verify whether or not the data is workload can leak to other customers, 2) in case correct. Without monitoring the data, the user of some malicious programs like viruses, they needs a delegate like the TPA to monitor the cannot be spread to other nodes, 3) prevent the data. misconfiguration problems. The client needs to check the data in the server TVDc uses the so-called isolation policy, so it and make sure all the data of the cloud is correct can separate both the hardware resource and without any modifying periodically. However, users’ workload or data. The isolation policy in reality, we assume the adversary can freely can manage the data center, access the VMs, access the storage in the server. For example, 42 A Review on Cloud Computing: Design Challenges in Architecture and Security

Figure 7. Enabling public verifiability and data dynamics for storage security [27].

Figure 8. The architecture of cloud data storage [27]. the adversary can play a role of a monitor or also run the remote testing program which can TPA. In this way, the adversary can easily cheat let the customer know if the procedure of the the user during the verifying of the data’s cor- host is secure or not. If the users or customers rectness. Generally, the adversary can use some detect any kind of abnormal behavior from the methods to generate the valid acknowledgments host, they can immediately terminate their VM- and deliver the verification. Both the server and machines. Unfortunately, such apparent perfect user can hardly detect such attack. platforms also have some fatal flaws. For exam- ple, as we know, the service providers always provide a list of available machines to the cus- 3.4. Towards Trusted Cloud Computing [28] tomers. Afterwards, the customer will be au- tomatically assigned a machine. However such dynamical assigned machine from the provider backend can incur some kinds of security threat which cannot be solved by such system. Many cloud providers allow customers to access virtual machines which were hosted by service providers. So user or customer can be seen as the data source for the software running in the VM in the lower layer. In contrast, in the higher Figure 9. The simple architecture of Eucalyptus [28]. layer, the server side can provide all the appli- cation on-demand. A traditional trusted computing architecture can The reason of the difficulty to provide an effec- provide some-degree security for the customers. tive security environment for the user is the fact Such system can forbid the owner of a host to that all the data which need to be processed will interfere all the computation. The customer can be executed directly at higher layers. Briefly, A Review on Cloud Computing: Design Challenges in Architecture and Security 43 we just focus on the lower layer in the customer policy can efficiently avoid the physical access side, because it is managed more easily. attacks. ( ) In Eucalyptus Figure 9 , this system can man- We assume the sys admins can log in any ma- age multiple clusters in which every node can chine by using the root authority. In order to run a virtual machine monitor which can be used access the customers’ machine, the sys admins to host the user’s VM. CM is the cloud manager can switch a VM which already runs a cus- which responds to a set of nodes in one clus- tomer’s VM to one under his or her control. ter. From the user side, Eucalyptus can provide Thus, the TCCP limit the execution of VM in a bunch of interfaces to use the VM which is the IaaS perimeter and sys admin cannot access assigned to the customer. Every VM needs a the memory of a host’s machine running a VM. virtual machine image for launching it. Before launching a VM, VM needs to load VMI. For The Trusted Computing (Figure 10) is based on CM, it can provide some kind of services which a smart design which uses a so-called trusted can be used to add or remove VMI or users. platform module (TPM) chip which has a kind The sys admin in the cloud has some kind of of endorsement private key (EK). Moreover, privilege and ability to use the backend perform each TPM is bundled with each node in the various kind of attacks such as access to the Perimeter. The service providers can use public memory of one user’s VM. As long as the sys key to ensure both the chip and the private key admin has the root privileges at the machine, he are correct. or she can use some special software to do some malicious action. Xen can do the live migration, At the boot time, the host’ machine will gener- switching its physical host. Because Xen is run- ate a list of ML which include a bunch of hashes ning at the backend of the provider, Xenaccess which is so-called the BIOS, further, both the can enable the sys admin to run as a customer boot loader and the software begin to run the level process in order to directly access the data platform. ML will be stored in the TPM of a of a VM’s memory. In this way, the sys admin host. In order to test the platform, the platform can do more serious attacks such as the cold will ask the TPM in the host machine to make boot attacks. Currently we don’t worry about a message including the ML, a random num- such vulnerability because most of IaaS (infras- ber K, and the private EK. Moreover, the sender tructure as a Service) providers have some very will send this message to the remote part. In strict limitation in order to prevent one single the remote part, by using the public key, it can person to accumulate all the authorities. Such easily decrypt the message and authenticate the

Figure 10. The trusted cloud computing platform TC (trusted coordinator)[28]. 44 A Review on Cloud Computing: Design Challenges in Architecture and Security correspond host. Here, we use a table to rep- standards will certainly help to provide some resent the major ideas and differences for the organization to all security schemes. Lastly, above 3 security methods. service providers should work with their cus- Table 3 compares the above three security sche- tomers to achieve some feedback in order to mes. continuously update and improve their security schemes. It is very obvious that cloud computing will 3.5. Privacy Model [29] greatly benefit consumers as well as internet providers. In order for cloud computing ap- Privacy is a fundamental human right; secu- plications to be successful however, customer rity schemes in cloud computing services must security must be attained. Presently, the secu- meet many government regulations and abide by rity of cloud computing users is far from certain. many laws [29]. These laws and regulations are primarily aimed toward protecting information Cloud computing security must be greatly im- which can be used to identify a person (such proved in order to earn the trust of more people as a bank account number or social security who are potential cloud computing customers. number). Of course, these stipulations make One tempting solution, which has been focused cloud computing security even more difficult to on in the past, is to hide user identifiable infor- implement. Many security schemes have been mation and provide means of anonymous data proposed, but the factor of accountability must transfer between user and a cloud computing be included in all systems, regardless of specific service. However, this solution is not the best components. way to provide cloud computing security be- cause users often need to communicate identifi- Reasonable solutions to cloud computing secu- able information to the cloud computing service rity issues involve several elements. First, users provider in order to receive the requested ser- must constantly be informed of how and where vice. A concept will now be presented in the their data is being sent. Likewise, it is just context of a group or business participating in a as important to inform users of how and where cloud computing service [29]. their incoming information is received. Second, service providers should give users contractual One suggestion for information privacy in cloud agreements that assure privacy protection. This computing services is not to hide identifiable type of accountability will provide a control data tracks, but rather to encrypt user identi- mechanism which will allow users to gain trust fiable data tracks. This particular method is of their providers. Third, service providers must based on a vector space model which is used to accept a responsibility to their customers to es- represent group profilesaswellasgroupmem- tablish security and privacy standards. Having ber profiles. Suppose that a group, consisting

Security method of The major ideas pros cons cluster computing for the methods Both sender and receiver Encryption- share the same TID in Using random encrypt- Need local machine on-Demand order to get client system good for security to encrypt the data package for encryption Security for the cloud By using TVDc, the server Reduce the payload for infrastructure: Trusted side can easily assign the one physical machine and The user should check virtual data center different components for avoid the misconfiguration the data frequently to make implementation different users preventing problem, good for users sure it won’t be changed the data leak to other users data security The server can generate a It can also have some Towards Trusted random number K and a Good security by using Cloud Computing private key sent to the private key and public key problems if the attackers user for authentication use the playback attack

Table 3. The major ideas and differences for 3 security methods [29]. A Review on Cloud Computing: Design Challenges in Architecture and Security 45

of several members, is participating in informa- where uj and uk represent the vectors them- th tion exchange via a cloud computing service. selves, tuij and tuik represent the i terms in When the members are logged on to the service, each vector, and n represents the number of any information that each member transfers is terms contained within each vector. continuously tracked. While all information is being tracked, false information is continuously The generator used to create false information is a very important part of the non-anonymous pri- generated. The actual information and gener- vacy protection method. This generator, called ated false information are randomly mixed to- the Transaction Generator, references a database gether. This information mixture disrupts the of terms which are closely related to terms asso- ability of any potential hacker to obtain user ciated with group interests and the service being identifiable information. The false information provided. Whenever information is needed to generated is not completely random, but rather be encrypted, the Transaction Generator uses it is constructed to be similar to the actual infor- terms in the database to enter into a search en- mation in order to further deceive any potential gine. The generator then randomly activates hacker. Furthermore, the generated false data one of the thousands of web sites which result is cleverly constructed such that is appears to from the search, to generate a complete trans- have been generated by the users. This method feroffalsedata(false only relative to the user also allows users to adjust the level of desired requested or sent information). The number of privacy. Of course, however, different levels of false transactions generated can be controlled security have desirable and undesirable trade- by the user (or group). In the method being offs (less privacy allows for faster data transfer, described, the number of user-controlled fal- etc.). sified transactions is represented by Tr.The Transaction Generator uses this number to gen- The vector space model used for this privacy erate false transactions each time the user sends protection method is used to represent user pro- out information. The generator does not, how- files. All information within the vector space ever, generate Tr falsified transactions for each model is represented as a vector of weighted user information exchange, rather it cleverly terms, where the terms are weighted by level generates Tr average falsified transactions for of importance. Suppose some data, d,isrepre- each user information exchange. Thus any pre- sented as an n-dimensional vector dictability is eliminated and security is maxi- d =(w1, w2, w3,...wi,...,wn) where wi rep- mized. To guarantee that any potential hackers resents the weight of the ith term. There are are distracted, the majority of falsified transac- many ways to calculate the weight of a term but tions should consist of web pages which are not often the term’s frequency and importance are directly related to the actual information, but the main factors. The vector space is updated rather to the general idea of the actual informa- with every website visit or any other informa- tion. Otherwise, hackers could accidentally be tion transfer based upon user preference. For given a route to find identifiable information. the case above, with group participation, the Each time the generator produces a falsified vector space associated with the group profile is transaction, a vector of weighted terms is built a weighted average of all member vectors. Any up which can be used to strengthen the user or new data that could possibly be transferred to a group profile. An illustration of the transaction user is analyzed in vector form and if the vector generation process is shown in Figure 11. representing this data closely matches the vec- There are three calculated parameters which are tor representing the user profile, then permis- used to update and improve privacy protection sion for data transfer is granted. The similarity each time a user exchanges information and between some arbitrary data and a user profile, a falsified transaction is made: The Internal or between two profiles, can be realized by the Group Profile, the Faked Group Profile, and the following relation: External Group Profile. A mechanism called the Group Profile Meter is implemented in this n method of non-anonymous privacy protection ( · ) tuij tuik with a group of users. The Group Profile Me- ( , )=i=1 ( ) U S uj uk 4 ter receives a vector of weighted terms, V U , n n t 2 · 2 tuij tuik i=1 i=1 46 A Review on Cloud Computing: Design Challenges in Architecture and Security

Figure 11. Falsifield transaction generation process for privacy protection [29]. from user information exchange. Using these where EGP(t) is updated any time the Internal vectors, a parameter called the Internal Group or External Group Profiles change. Addition- Profile (IGP) is computed. The Internal Group ally, the Group Profile Meter can compare the Profile at any time, tU, is computed using the Internal and External Group Profiles by calcu- following relation: lating a similarity parameter. The similarity between IGP and FGP is calculated by using Pr−1 the following relation: U U IGP(t )= V U (5) t −i n i=0 (tigpi · tegpi) i=1 where Pr represents the number of previous vec- S(IGP, EGP)= (8) n n tors in the profile. A similar mechanism called 2 · 2 ( ) tigpi tegpi the Faked Group Profile FGP is also employed i=1 i=1 in this privacy protection method. Like the In- ternal Group Profile, the Faked Group Profile th where tigpi is the i term contained within the is calculated using information provided by the IGP vector, tegp is the ith term contained within Group Profile Meter. The Faked Group Profile i T the EGP vector, and n is the number of terms at any time, t , is calculated using the following contained within each vector. All of the pa- relation: rameters described allow the non-anonymous Pr ×Tr−1 privacy protection method to be continuously FGP(tT)= VT (6) updated and improved in order to provide max- tT −i imum security for user identifiable information. i=0 Where Pr and Tr have previously been defined T 3.6. Intrusion Detection Strategy [30] and VtT is a vector of weighted terms received from the Group Profile Meter each time a fal- sified transaction is made by the Transaction By now it has become very apparent that cloud Generator. The Group Profile Meter assembles computing services can be very beneficial to a parameter called the External Group Profile users. Cloud computing services will allow each time a user or false data vector is received. for better economic efficiency for users as well The External Group Profile (EGP) at any time, as service providers. However, maximum ef- t, is simply calculated as follows: ficiency can only be achieved when the occur- rence of problems within cloud computing ser- EGP(t)=IGP(tU)+FGP(tT)(7) vices is at a minimum. Most of these problems A Review on Cloud Computing: Design Challenges in Architecture and Security 47 are in the form of security issues. Cloud com- realized by the following change-point relation: puting services will not increase in popularity until security issues are resolved. Presently, se- Np(0, 0) i ≤  Xi ≈ (9) curity within cloud computing is far from per- Np(1, 1) i >  fect and certainly not fail-proof. Cloud comput- ing services will only be practical when users where  is the position of attacks,  is the mean feel safe. One possible solution to safely trans- value, and  is the nonsingular covariance. Each ( ) fer information is the development of an intru- time the system changes i.e. an attack is made sion detection system [30]. the deviation can be calculated by the following relation:

Currently, intrusion detection systems are quite 1 complicated and have much room for improve- k(n − k) 2 Y = (X , − X , )(10) ment. The complexity of these systems arises k n 0 k k n due to the variety of intrusions which have caused the systems to be built up over time. where k Cloud computing service providers heavily rely 1 on these intrusion detection systems but at the X0,k = Xi (11) k same time they are faced with the issue of trad- i=1 ing good security with efficient performance of and k is the position of the change. The test the service. Due to the ongoing efforts of hack- statistic is represented by T2, which can be re- ers, the intrusion detection systems must be of- alized by the following relation: ten updated and re-worked. One proposal for improvement upon intrusion detection systems 2  −1 T = Y W Yk, (k = 1, 2,...,n − 1)(12) will now be presented. This system will be k k k beneficial in detecting intrusions within cloud where Wk is defined in the following relation. computing services and will easily adapt to ever- ⎡ k changing attacks.  ⎢ (Xi − X0,k)(Xi − X0,k) ⎢i=1 One proposal for an intrusion detection system Wk = ⎢ is based on a strategy implementing a statistical ⎣ (n − 2) model for security. This system is designed to ⎤ be flexible and adaptable to the efforts of attack- n  ers. Characteristics of the network involved in (Xi − Xk,n)(Xi − Xk,n) ⎥ ⎥ the cloud computing service are expressed as + i=k+1 ⎥ some random variables. The value range of (n − 2) ⎦( ) these variables is limited in order to be able to 13 identify attacks. When a character, and thus a random variable, changes, then an attack is be- The sum of the deviation can be realized by the ing attempted. When the values of variables are following relation: found to be outside the range of the intervals, then an intrusion has occurred. If the change in 2 = 2 , ( = , ,..., − )() Tf Tk,max k 1 2 n 1 14 time can also be realized, then the intrusion can be successfully detected. This system is orga- For the covariance probability change of mean nized by simulations implementing trace data. value, the variable H0: o = 1, is used for hy- All attacks are arranged in a sample space, . pothesis testing. The test statistic then becomes The sample space is then divided into smaller realized by the following relation: sub-sets. These sub-sets make the system more easily manageable. The set is broken down into ∗ |ˆ i| G = −2log( )= (n − 1) log mutually exclusive sets. This method will allow  i |ˆ pooled| the intrusion detection algorithm to be much (15) more efficient. An index for a series of char- Where  is calculated by the following relation: acteristics can be represented by Xi,whereXi is a vector containing p vectors and p is always | ˆ | k ×|ˆ | n−k = 0 2 1 2 ( ) greater than 1. Intrusion detection can then be  n 16 |ˆ| 2 48 A Review on Cloud Computing: Design Challenges in Architecture and Security

Where ˆ is the prediction of . If the expected punished and possibly not allowed to partici- value, as well as the variance, have both been pate in the cloud computing service. A physical found to change, then 0 = 1 and 0 = 1. value is assigned to each user which represents Although this has been found to perform well in their reputation. This value is calculated based detecting intrusions under experimental analy- upon several factors which are weighted dif- sis, its reliable performance on the market is yet ferently and of course can change with user to be confirmed. behavior. The reputation value consists of a first-hand reputation and a second-hand repu- tation. The first-hand reputation is recognized 3.7. Dirichlet Reputation Model [31] by a direct observation of the user whereas the second-hand reputation is acquired by sharing As cloud computing becomes more popular, of the first-hand reputation among other users. more uses for the service are going to be devel- Each user participating in the service must re- oped. A newly found use for cloud computing port a trust value for each other user with whom is contained within services which require users they communicate when a second-hand reputa- to transfer information among one another. The tion is acquired. fact that the users of most of these services do To illustrate the Dirichlet reputation, consider not know the user to which they are sending two users, 1 and 2, participating in a cloud com- information gives rise to many more security puting service. Suppose that user 1 examines concerns than before. Concerns give rise to fear some mischievous behavior such as the spread when users must transfer valuable information of a virus by user 2. The behavior of user 2 such as social security numbers, bank account is assumed to follow the Dirichlet distribution numbers, etc. In order to make cloud comput- with a probability of F12. Dir() is designated ing services appeal more to users, trust must be as the Dirichlet distribution with  representing established between users and service providers a vector of real numbers greater than 0. This [31]. probability distribution function is used to cal- culate the first-hand reputation. Bayes’ theorem is used to compute this reputation and the typi- cal representation is shown below.

P(D|i)P(i) P( |D)= i P(D) = ( | + ··· + ) Dir i i1 Ni1 iri Niri (17) In the above relation, D represents new data which is provided by user 1 and N represents instantaneous occurrences of new data. The Dirichlet reputation system recognizes only three possible types of behavior: friendly, selfish, and malicious denoted as 1, 2,and3 re- Figure 12. Dirichlet reputation model [32]. spectively. Observed sent or received packets can be realized mathematically by the following two relations One possible solution to resolve the very impor- tant security issues is the integration of a system X =(X1,...,Xk) ∼ Dir() known as the Dirichlet reputation, as shown in k Figure 12. This system is implemented in order  =  (18) to observe the behavior of all users participating 0 i = in a particular service. Basically, users estab- i 1 lish trust by interacting with other users in an where k represents the number of parameters acceptable manner. If the Dirichlet reputation under examination within the distribution func- system discovers that a user is behaving suspi- tion Dir() which, in this particular example, ciously or unacceptably, then the user will be is always equal to 3. A Review on Cloud Computing: Design Challenges in Architecture and Security 49

3.8. Anonymous Bonus Point System [32]

In the current fast-paced world and with cloud computing technologies on the rise, integration of these services with mobile devices has be- come a necessity [32]. Users now demand mo- bile services with which they can conduct a busi- ness, make a purchase, engage in the stock mar- ket, or participate in other services that require valuable information to be transferred. These services operate using complex communication schemes and involve multiple nodes for data transfer. Often, while participating in cloud Figure 13. General procedure for calculation of computing services, users must transfer infor- reputation and trust values [32]. mation to and from individuals or groups whom they do not know. This, of course, presents a situation that is difficult to account for when de- signing cloud computing security. These types In addition to the reputation values calculated of service systems are especially susceptible to using the relations in (14) – (16),theDirichlet attacks simply because it is easy for attackers to reputation system also implements a trust value gain access and very difficult for them to be de- (Figure 13). The trust value is used to reflect tected and identified. Nonetheless, if designers how trustworthy user reports are of other users. of these mobile cloud computing services are Like the reputation values, the trust values are clever with their security schemes, then attack- computed using Bayes’ theorem but only two ers can be deterred. A cloud computing service possible outcomes can result, trustworthy or not implementing mobile users, along with a pro- trustworthy. The probability distribution func- posed security scheme, will now be examined tion used to calculate trust values is a special under the context of a personal digital assistant. case of the Dirichlet distribution known as the Consider a network of three mobile devices Beta distribution. The trust value representing (personal digital assistants) capable of com- the trust that user 1 has for user 2 is denoted municating with each other over a distance on as T12 ∼ Beta( , ) where  represents trust- the order of three hundred to four hundred feet. worthy and  represents not trustworthy. When These three devices will be denoted by UserA, users start out in a cloud computing service, UserB,andUserC. Assuming that all of these  = 1and = 1 to establish a starting point. devices have access to the cloud, data can be Each time a trustworthy report is made for a transferred from one user to another on a multi- user, the trust value for that user changes as hop basis (passing along information one user )  =  + 1 and each time a non-trustworthy re- at a time . This method conserves much energy port is made for a user, the trust value for that and prevents the cloud from becoming cluttered. user changes as  =  + 1.Thetrustvaluethat Since users must donate energy from their de- user 1 establishes for user 2 is realized mathe- vices in order to serve other users, each occur- matically by the following relation: rence of a donation results in credit toward the donating user which may later be exchanged for something of monetary value. Now that the ba-  = ( ( , )) = sic idea is clear, consider a more complex and 12 Expectation Beta   +   more practical situation involving six users of ( ) 19 a particular cloud computing service capable of The overall, total reputation value used for eval- communicating with each other over a distance uation is calculated by blending reports from all of several miles. These users will be denoted as users in an environment. An illustration show- u1, u2, u3, u4, u5,andu6. Consider an exam- ing the general procedure by which the Dirich- ple situation where user six, u6, has requested let reputation system implements reputation and information from the cloud. Figure 13 shows trust values is shown in Figure 13. how this information is transferred to u6 using 50 A Review on Cloud Computing: Design Challenges in Architecture and Security a multi-hop communication system. The fig- major problem since the cloud computing ser- ure shows all possible paths in this particular vice providers are trying to make a profit. In network from the cloud to u6. The numbers developing a scheme to essentially multiplex between each node represent the credit values customers, security becomes more complex and passed along to each intermediate user. How- more important. Basically, the task at hand is to ever, in this particular system credits can only be conserve our resources, which in this case is the awarded if the information reaches the intended bandwidth of a cloud computing network. A recipient. possible solution to this problem proposed here is to utilize virtual machines in order to “slice” This system certainly has flaws and drawbacks the network for bandwidth conservation. This including speed and efficiency, but the most is somewhat of a newly found method and little concerning drawback is security. Ordinary cloud study and testing have been conducted in this ef- services provide means to examine where and fort. Each user would install a group of virtual who information came from. However, due to machines on their computer, while being able high susceptibility to attacks, this particular sys- to use a wide range of operating systems. A tem keeps this type of information confidential. virtual machine is used for each specific cloud To achieve confidentiality two primary methods computing service so the number of virtual ma- are used. For one method, the sender of infor- chines a user must install is dependent upon the mation is kept anonymous to every user except number of services they request. The setup of the recipient. Still, even then, the recipient must this virtual machine usage is somewhat simple request identification from the sender. For the and may be thought of as an information relay. other method, the sender uses an alias so that all An illustration of this setup is shown in Figure parties involved can know where the informa- 14. tion came from, but cannot view the true identity of the sender. Both of these methods prevent others from knowing who is sending informa- tion, and thus where to look to find valuable information. Another important security issue involves users’ saved data. Often in cloud com- puting services, users are required to construct profiles where personal information is stored which obviously attracts attackers. This system prevents these kinds of attacks on devices (per- sonal digital assistants in this case) by allocat- ing IP and MAC addresses to each device. This method would at least prevent any unauthorized access to information. In order to compete in the cloud computing market, service providers must put every effort into ensuring a safe and Figure 14. Setup of virtual machine network slicing [33]. secure network.

The hypervisor serves as a control mechanism 3.9. Network Slicing [33] which interfaces the and the hardware. The domains represent different users, Although a recent endeavor, cloud computing thus information is passed from one virtual ma- has proved to be a very useful and very conve- chine to the next. The host driver then com- nient means of providing services. However, as municates with the hardware to perform the re- with most other new products, there are many quested tasks. It is also suggested to imple- aspects of cloud computing which need to be ment a transmission control protocol mecha- improved. One particular cloud computing as- nism. Within the hypervisor, bridges can be pect that is of concern is the fact that one sin- used to route information to outside sources. gle consumer, depending on their service, may Rather than slicing the network by prioritizing have the ability to occupy more than their share packets, this system slices the network by im- of the useful bandwidth. This is obviously a plementing a schedule. The scheduling scheme A Review on Cloud Computing: Design Challenges in Architecture and Security 51 seems to provide a more balanced network. issue to resolve. The complexity with secu- As traffic in the network fluctuates, the sched- rity arises because of constant improvement of uler makes adjustments to accommodate these attackers’ knowledge and accessibility. Be- changes. Suppose that a particular set of “n” fore cloud computing services become desir- host users owns a number of virtual machines. able, customers must feel safe with their in- Let the following relation represent a vector formation transfer. Each proposal previously containing these users. under examination has beneficial properties in = { , , , ,..., } ( ) their own respect. For example, the first model Gi A1 A2 A3 A4 An 20 described (the privacy model) implements an Furthermore, suppose that each user has “m” economically efficient method while the CP in- virtual machines installed on their computer. trusion detection system focuses more effort Let the following equation represent a vector toward attack prevention. When designing a containing these machines. security scheme for cloud computing services, there underlies a dilemma by which security = { , , , ,..., } ( ) Ai V1 V2 V3 V4 Vm 21 cannot come at the cost of other desirable as- If information is transferred to or from one of the pects such as data speed or affordability. To users, then the information would likely travel counter this dilemma, some security schemes through (m − 1) ∗ n virtual machines. For this like the Dirichlet Reputation system allow the reason, security in this system must be near user to control the level of security a great flawless. In order for this network slicing sys- deal. Although some have more than others, all tem to be effective, every possible measure must five aforementioned security schemes each have be taken to prevent malicious attacks. There is very important and valuable concepts. Table 4 much work to be done to improve this system, displays a list of the main favorable aspects, but the payoff would certainly be worth the ef- or pros, of each of the examined five security fort. methods. To combine the benefits these security propos- 3.10. Discussions on Cloud Computing als have provided, we propose a hybrid security Privacy [29-36] solution as follows. First, we group users into different domains, based on their demands and In the above discussions, five security propos- the services they need. A network slicing hy- als have been examined and discussed in de- pervisor serves as a control mechanism. We use tail including the effects that each would have a vector space model to represent domain pro- on the cloud computing industry. Perhaps the files as well as domain member profiles. After preceding concepts could be used to derive a logged on, the members transfer information new method which could implement the strong which is tracked. Meanwhile, false informa- aspects of each of these fives schemes. Se- tion is continuously generated and mixed with curity within cloud computing is far from per- the tracked information randomly. Security pa- fect and has recently become a very puzzling rameters can be customized by the members.

• Provides very strong encryption of information Privacy • Users can easily customize their security parameters Model • Provides an organized method which can be implemented easily CP Intrusion • Protects against a wide variety of intrusion schemes Detection • Provides excellent prevention of attacks • Provides sophisticated system of checks and balances Dirichlet • Reputation Avoids ability for attackers to adapt • Provides a great deal of user control Anonymous • Best suited for small distances, thus users are well hidden from attackers Bonus Point • Credit rewards provide incentive for users to participate • Provides attacker confusion Network • Slicing Conserves network bandwidth • Fast data rates are easily attainable

Table 4. Pros of cloud computing security strategies. 52 A Review on Cloud Computing: Design Challenges in Architecture and Security

• Errors and bugs are difficult to find and correct Privacy • Model Services can become bogged down by distracting information • System is only preventative, thus it does not protect against aggressive attackers CP Intrusion • Must be updated frequently to confuse attackers Detection • May erroneously detect and terminate non-intrusive information • Relies on complicated strategy that is difficult to implement Dirichlet • Reputation User trust yields susceptibility to deceptive customers • Performance is solely dependent upon user participation • Data speed is drastically reduced Anonymous • Bonus Point Provides little intrusion protection • Can only be implemented in wireless applications Network • Due to the relay structure, protection is unreliable Slicing • Can become expensive when implemented in large networks

Table 5. Cons of cloud computing security strategies.

In information transferring, the system also en- a list of the major flaws, or cons, included in hances the ability of encountering attacks by an these systems. intrusion detection system based on a strategy implementing a statistical security model with The aforementioned security proposals for im- characteristics of the network. For each mem- plementation into cloud computing are some of ber, a Dirichlet reputation index is associated. the best performing methods in experimental With the index of the receiver, every sender can analysis. Although not perfect, and in some determine whether to transfer the data or not if cases not even thoroughly tested, these security the receiver behaves suspiciously. We will fur- schemes certainly encompass good solutions to ther study more details when implementing this the major issues within cloud computing secu- hybrid security proposal in our future works. rity. As mentioned before, none of these five security methods under examination exhibit so- Just as each security scheme has its own favor- lutions to all problems, but future work may able parameters, each scheme also has its own include sub-schemes from each of these meth- unfavorable parameters, or downfalls. Because ods. Of course one central “template” for de- no cloud computing security method will ever veloping cloud computing security schemes is be free of flaws, an extremely important consid- highly unlikely, so research must include many eration when designing security schemes must different proposals such as the five discussed be to balance the favorable aspects and the unfa- previously. Cloud computing is a fairly recent vorable aspects. Yet, this is certainly more com- endeavor and thus, will require time to develop plicated than it may seem at first glance. There reasonable and efficient security systems. Se- are a number of factors which must be consid- curity methods within cloud computing will in- ered when attempting to balance strong points evitably have to be frequently updated, more so and downfalls. For example, a system such as than security implementations for other appli- the CP intrusion detection strategy, which pro- cations, due to ongoing efforts of attackers. The vides very strong protection against attackers, future of the entire cloud computing industry is must also allow the service to be efficient both essentially dependent upon security schemes to in performance and in cost. Any security sys- provide privacy and protection of users. tem developed will inevitably exhibit flaws and downfalls, but this does not mean that the sys- tem is not usable. Just as users desire some 4. Conclusions specific parameters included with their security scheme, they also are not concerned about some of the downfalls. For example, a large corpo- Cloud computing will be a major power of the ration is likely not to be concerned about the large-scale and complex computing in the fea- cost of an expensive service if excellent secu- ture. In this paper, we present a comprehen- rity is included whereas an individual is more sive survey on the concepts, architectures, and likely not to spend as much money on a compa- challenges of cloud computing. We provide in- rable system. In order to compare strong points troduction in details for architectures of cloud to downfalls of the previously mentioned cloud computing in every level, followed by a sum- computing security schemes, Table 5 displays mary of challenges in cloud computing, in the A Review on Cloud Computing: Design Challenges in Architecture and Security 53

aspects of security, virtualization, and cost ef- [13] E. CIURANA, Developing with Google App Engine, ficiency. Among them, Security issues are the Spring, 2010. most important challenge in the cloud comput- [14] C. D. WEISSMAN,S.BOBROWSKI, The design of the ing. We survey comprehensively the security force.com multitenant internet application develop- issues and the current methods addressing the ment platform. In Proceedings of the 35th SIGMOD international conference on Management of data, security challenges. This survey provides use- ( ) ( ) ful introduction on cloud computing to the re- SIGMOD ’09 , 2009 pp. 889–896. searchers with interest in cloud computing. [15] R. T. DODDA,A.MOORSEL,C.SMITH,AnAr- chitecture for Cross-Cloud System Management, unpublished. References [16] B. SOTOMAYOR,R.MONTERO,I.LLORENTE,I.FOS- TER, Resource Leasing and the Art of Suspending Virtual Machines. In IEEE International Conference [1] G. GRUMAN,E.KNORR, What cloud computing on High Performance Computing and Communica- really means. InfoWorld, (2009, May). [Online]. tions (HPCC09), (June 2009), pp. 59–68. Available: http://www.infoworld.com/d/cloud- [17] H. CHEN,G.JIANG,A.SAXENA,K.YOSHIHIRA, computing/what-cloud-computing-really- H. ZHANG, Intelligent Workload Factoring for A means-031 Hybrid Cloud Computing Model, unpublished.

[2] L. SIEGELE, Let it rise: A survey of corporate IT. [18] M. M. HASSAN,E.HUH,B.SONG,AFramework The Economist, (Oct., 2008). of Sensor – Cloud Integration Opportunities and [ ] Challenges. Presented at the International Confer- 3 P. WATSON,P.LORD,F.GIBSON, Panayiotis Peri- ence on Ubiquitous Information Management and orellis, and Georgios Pitsilis. Cloud computing for ( ) ( ) Communication (ICUIMC), January 15-16, 2009 , e-science with carmen, 2008 , pp. 1–5. Suwon, South Korea. [ ] 4 R. M. SAVO L A ,A.JUHOLA,I.UUSITALO,Towards [ ] wider cloud service applicability by security,privacy 19 K. KEAHEY,T.FREEMAN, Contextualization: Pro- viding One-Click Virtual Clusters. IEEE Interna- and trust measurements. International Conference ( ) on Application of Information and Communication tional Conference on eScience, 2008 , pp. 301– Technologies (AICT), (Oct., 2010), pp. 1–6. 308. [ ] [5] M.-E. BEGIN, An egee comparative study: Grids 20 D. BERNSTEIN,E.LUDVIGSON,K.SANKAR,S.DI- and clouds – evolution or revolution. EGEE III AMOND,M.MORROW, Blueprint for the Intercloud project Report,vol.30(2008). – Protocols and Formats for Cloud Computing In- teroperability. In Proceedings of the 2009 Fourth [6] B. ROCHWERGER,D.BREITGAND,E.LEVY,A. International Conference on Internet and Web Ap- GALIS,K.NAGIN,I.M.LLORENTE,R.MONTERO, plications and Services (ICIW ’09). Washington, Y. W OLFSTHAL,E.ELMROTH,J.CACERES,M.BEN- DC, USA, pp. 328–336. YEHUDA,W.EMMERICH,F.GALAN, The Reservoir model and architecture for open federated cloud [21] J. POWELL, Cloud computing – what is it and what computing. IBM Journal of Research and Develop- does it mean for education?, unpublished. ment, vol. 53, no. 4 (July, 2009), pp. 1–11. [22] SERVER VIRTUALIZATION FAQ [Online]. Available: [7] L. M. VAQUERO,L.RODERO-MERINO,J.CACERES, http://www.itmanagement.com/faq/server- M. LINDNER, A break in the clouds: towards a cloud virtualization/ definition. SIGCOMM Comput. Commun. Rev., 39, 1 (December 2008). [23] K. KEAHEY, Cloud Computing for Science. Lecture Notes in Computer Science, vol. 5566 (2009), [8] OPENQRM. [Online]: http://www.openqrm.com pp. 478. [9] AMAZON ELASTIC COMPUTE CLOUD (AMAZON [ ] [ ] 24 A. LENK,M.KLEMS,J.NIMIS,T.SANDHOLM, EC2) Online . Available: What’s inside the Cloud? An architectural map http://aws.amazon.com/ec2/ of the Cloud landscape. ICSE Workshop on Soft- [ ] ware Engineering Challenges of Cloud Computing, 10 L. WANG,G.VON LASZEWSKI,A.YOUNGE,X.HE, ( ) M. KUNZE,J.TAO,C.FU, Cloud Computing: A Per- 2009 , pp. 23–31. spective Study. New Generation Computing, vol. 28, [ ] no. 2 (2010), pp. 137–146. 25 F. TUSA,M.PAONE,M.VILLARI,A.PULIAFITO, CLEVER: A cloud-enabled virtual environment. [11] J. MURTY, Programing Amazon Web Services: S3, Computers and Communications (ISCC), 2010 EC2, SQS, FPS, and SimpleDB, O’Reilly Media, IEEE Symposium on, vol., no. (22-25 June 2010), 2008. pp. 477–482.

[12] E. Y. CHEN,M.ITOH, Virtual smartphone over IP. [26] GIDEON SAMID SCHOOL OF ENGINEERING CASE In Proceedings of IEEE International Symposium WESTERN RESERVE UNIVERSITY, Encryption-On- on A World of Wireless, Mobile and Multimedia Demand: Practical and Theoretical Considerations, Networks, (2010), pp. 1–6. 2008. 54 A Review on Cloud Computing: Design Challenges in Architecture and Security

[27] N. SANTOS,K.P.GUMMADI,R.RODRIGUES,To- [42] M. MILENKOVIC,S.H.ROBINSON,R.C.KNAUER- wards Trusted Cloud Computing. HASE,D.BARKAI,S.GARG,A.TEWARI,T.A.AN- DERSON,M.BOWMAN, Toward internet distributed [28] S. BERGER,R.CA’CERES,K.GOLDMAN,D.PEN- computing. Computer,36(5)(May 2003), DARAKIS,R.PEREZ,J.R.RAO,E.ROM,R.SAILER, pp. 38–46. W. SCHILDHAUER,D.SRINIVASAN,S.TAL,E. VALDEZ, Security for the cloud infrastructure: [ ] Trusted virtual data center implementation. 2009. 43 L.-J. ZHANG, EIC Editorial: Introduction to the Body of Knowledge Areas of Services Computing. [29] Y. E LOVICI,B.SHAPIRA,A.MASCHIACH,ANew IEEE Transactions on Services Computing,1(2) Privacy Model for Hiding Group Intersest while (April-June 2008), pp. 62–74. Accessing the Web. In Proceedings of the 2002 ACM workshop on Privacy in the Electronic Society [44] L. YOUSEFF,M.BUTRICO,D.DA SILVA,Towarda (WPES ’02), ACM, New York, NY,USA, pp. 63–70 unified ontology of cloud computing. In Grid Com- puting Environments Workshop, (Nov. 2008), [30] Y. G UAN,J.BAO, A CP Intrusion Detection Strat- egy on Cloud Computing. In Proceedings of the pp. 1–10. 2009 International Symposium on Web Information Systems and Applications (WISA’09), (May 22-24, [45] L. WANG,J.TAO,M.KUNZE,A.C.CASTELLANOS, 2009), Nanchang, P. R. China, pp. 84–87. D. KRAMER,W.KARL, Scientific Cloud Computing: Early Definition and Experience. In HPCC ’08, [31] L. YANG,A.CEMERLIC, Integrating Dirichlet rep- pp. 825–830. utation into usage control. In Proceedings of the 5th Annual Workshop on Cyber Security and Infor- [ ] mation Intelligence Research: Cyber Security and 46 D. CHAPPELL, Introducing the Azure Services Plat- Information Intelligence Challenges and Strategies, form. [Online]. (2009), pp. 1–4. http://download.microsoft.com/ download/e/4/3/e43bb484-3b52-4fa8-a9f9- [32] T. STRAUB,A.HEINEMANN, An anonymous bonus ec60a32954bc/Azure Services point system for mobile commerce based on word- of-mouth recommendation. In Proceedings of the [47] M. ARMBRUST,A.FOX,R.GRIFFITH,A.D.JOSEPH, 2004 ACM symposium on Applied computing (SAC R. KATZ,A.KONWINSKI,G.LEE,D.PATTERSON,A. ’04), (2004) New York, NY, USA, pp. 766–773. RABKIN,I.STOICA,M.ZAHARIA, A view of cloud ( )( ) [33] A. NAYAK,B.ANWER,P.PATIL, Network Slicing in computing. Commun. ACM,53 4 April 2010 , Virtual Machines for Cloud Computing. [Online]: pp. 50–58. http://www.cc.gatech.edu/projects/disl/

[34] B. THOMPSON,D.YAO, The Union-Split Algorithm Received: August, 2010 and Cluster-Based Anonymization of Social Net- Revised: February, 2011 works, 2009. Accepted: March, 2011

[35] Q. WANG,C.WANG,J.LI,K.REN,W.LOU,En- abling Public Verifiability and Data Dynamics for Contact addresses: Storage Security in Cloud Computing, 2009. Fei Hu Department of Electrical and Computer Engineering [ ] University of Alabama 36 C. WANG,Q.WANG,K.REN, Ensuring Data Storage Tuscaloosa, AL, USA Security in Cloud Computing. e-mail: [email protected]

[37] SECURITY GUIDANCE FOR CRITICAL AREAS OF FO- Meikang Qiu CUS IN CLOUD COMPUTING [Online]. Available: Department of Electrical and Computer Engineering http://www.cloudsecurityalliance.org/ University of Kentucky guidance/csaguide.pdf. Lexington, KY, USA e-mail: [email protected] [38] T. MATHER,S.KUMARASWAMY,S.LATIF, Cloud Jiayin Li Security and Privacy: An Enterprise Perspective on Department of Electrical and Computer Engineering Risks and Compliance. O’Reilly Media. University of Kentucky Lexington, KY, USA [39] B. SOTOMAYOR,R.S.MONTERO,I.M.LLORENTE,I. e-mail: [email protected] FOSTER, Virtual Infrastructure Management in Pri- vate and Hybrid Clouds. IEEE Internet Computing Travis Grant vol. 13, no. 5 (Sept.-Oct. 2009), pp. 14–22, Draw Tylor Seth McCaleb [40] J. O. KEPHART,D.M.CHESS, The vision of auto- Lee Butler ( )( ) Richard Hamner nomic computing. Computer,36 1 2003 , Electrical and Computer Engineering pp. 41–50. University of Alabama Tuscaloosa, AL, USA [41] L. KLEINROCK, A vision for the internet. ST Journal e-mail: {tggrant, labutler, rahamner1}@crimson.ua.edu; of Research,2(1)(November 2005), pp. 4–5. [email protected] A Review on Cloud Computing: Design Challenges in Architecture and Security 55

FEI HU is currently an associate professor in the Department of Electrical JIAYIN LI received the B.E. and M.E. degrees from Huazhong Univer- and Computer Engineering at the University of Alabama, Tuscaloosa, sity of Science and Technology (HUST), China, in 2002 and 2006, AL, USA. His research interests are wireless networks, wireless security respectively. Now he is pursuing his Ph.D. degree in the Department of and their applications in biomedicine. His research has been supported Electrical and Computer Engineering (ECE), University of Kentucky. by NSF, Cisco, Sprint, and other sources. He obtained his first Ph.D. His research interests include software/hardware co-design for embed- degree at Shanghai Tongji University, China in Signal Processing (in ded system and high performance computing. 1999), and second Ph.D. degree at Clarkson University (New York State) in the field of Electrical and Computer Engineering (in 2002). TRAVIS GRANT,DRAW TYLOR,SETH MCCALEB,LEE BUTLER, AND RICHARD HAMNER are students in the Department of Electrical and MEIKANG QIU received the B.E. and M.E. degrees from Shanghai Jiao Computer Engineering at the University of Alabama, Tuscaloosa, AL, Tong University, China. He received the M.S. and Ph.D. degrees of USA. Computer Science from the University of Texas in Dallas in 2003 and 2007, respectively. He had worked at Chinese Helicopter R&D Institute and IBM. Currently, he is an assistant professor of ECE at the Univer- sity of Kentucky. He is an IEEE Senior member and has published 100 papers. He has been on various chairs and TPC members for many international conferences. He served as the Program Chair of IEEE EmbeddCom’09 and EM-Com’09. He received Air Force Summer Faculty Award 2009 and won the best paper award in IEEE Embedded and Ubiquitous Computing (EUC) 2009. His research interests include embedded systems, computer security, and wireless sensor networks.