Cloud Computing and Computer Clouds
Total Page:16
File Type:pdf, Size:1020Kb
Cloud Computing and Computer Clouds Dan C. Marinescu Computer Science Division Department of Electrical Engineering & Computer Science University of Central Florida, Orlando, FL 32816, USA Email: [email protected] February 3, 2012 1 Contents 1 Preface 5 2 Introduction 6 2.1 Network-centric computing and network-centric content . 7 2.2 Cloud computing - an old idea whose time has come . 11 2.3 Cloud computing paradigms and services . 13 2.4 Ethical issues in cloud computing . 16 2.5 Cloud storage diversity and vendor lock-in . 17 2.6 Further readings . 19 2.7 History notes . 20 3 Basic concepts 21 3.1 Parallel computing . 21 3.2 Parallel computer architecture . 23 3.3 Distributed systems . 25 3.4 Communication in a distributed system . 26 3.5 Process coordination . 29 3.6 Logical clocks . 31 3.7 Message delivery rules; causal delivery . 32 3.8 Runs and cuts; causal history . 35 3.9 Enforced modularity; the client-server paradigm . 39 3.10 Consensus protocols . 43 3.11 Modelling concurrency with Petri Nets . 47 3.12 Analysis of communicating processes . 52 3.13 Further readings . 53 3.14 History notes . 53 4 Cloud Infrastructure 55 4.1 Cloud computing at Amazon . 55 4.2 Cloud computing, the Google perspective . 59 4.3 Windows Azure and Online Services . 61 4.4 Open-source software platforms for private clouds . 63 4.5 Service level agreements and compliance level agreements . 65 4.6 Software licensing . 66 4.7 Energy use and ecological impact of large-scale data centers . 67 4.8 User experience . 70 4.9 History notes . 72 4.10 Further readings . 72 5 Cloud Computing 74 5.1 Challenges for cloud computing . 75 5.2 Existing cloud applications and new application opportunities . 76 5.3 Workflows - coordination of multiple activities . 77 5.4 Coordination based on a state machine model - the ZooKeeper . 85 5.5 The MapReduce programming model . 88 2 5.6 A case study: the GrepTheWeb application . 89 5.7 Clouds for science and engineering . 92 5.8 Benchmarks for high performance computing on a cloud . 93 5.9 Cloud computing and biology research . 96 5.10 Social computing, digital content, and cloud computing . 98 5.11 Further readings . 101 6 Virtualization 102 6.1 Layering and virtualization . 103 6.2 Virtual machines . 105 6.3 Virtual machine monitors (VMMs) . 107 6.4 Performance and security isolation . 109 6.5 Architectural support for virtualization; full and paravirtualization . 110 6.6 Case study: Xen-a VMM based on paravirtualization . 112 6.7 Optimization of network virtualization in Xen 2.0 . 116 6.8 vBlades - paravirtualization targeting a x86-64 Itanium processor . 118 6.9 A performance comparison of virtual machines . 121 6.10 Virtual machine security . 124 6.11 Software fault isolation . 126 6.12 History notes . 126 6.13 Further readings . 128 7 Resource Management 129 7.1 A utility-based model for cloud-based Web services . 132 7.2 Applications of control theory to task scheduling on a cloud . 135 7.3 Stability of a two-level resource allocation architecture . 139 7.4 Coordination of multiple autonomic performance managers . 140 7.5 Cloud scheduling subject to deadlines . 142 7.6 Resource bundling; combinatorial auctions for cloud resources . 148 7.7 Further readings . 152 8 Networking support for cloud computing 153 8.1 Packet-switched networks and the Internet . 153 8.2 The transformation of the Internet . 157 8.3 Network resource management . 161 8.4 Web access and the TCP congestion control window . 164 8.5 Interconnects for warehouse-scale computers . 167 8.6 Storage area networks . 169 8.7 Content delivery networks . 172 8.8 Overlay networks for cloud computing . 174 8.9 Further readings . 182 9 Storage Systems 183 9.1 GPFS, IBM's General Parallel File System . 183 9.2 Calypso, a distributed ¯le system . 183 9.3 Lustre, a massively parallel distributed ¯le system . 183 9.4 GFS, the Google File System . 183 3 9.5 A storage abstraction layer for portable cloud applications . 183 9.6 Megastore . 183 9.7 Performance of large-scale distributed storage systems . 183 10 Cloud Security 184 10.1 A taxonomy for attacks on cloud services . 184 10.2 Security checklist for cloud models . 184 10.3 Privacy impact assessment for cloud computing . 184 10.4 Risk controls in cloud computing . 184 10.5 Security services life cycle management . 184 10.6 Secure collaboration in cloud computing . 184 10.7 Secure VM execution under an un-trusted OS . 184 10.8 Anonymous access control and accountability for cloud computing . 184 10.9 Inter-cloud security . 184 10.10Social impact of privacy in cloud computing . 184 10.11Further readings . 184 11 Complex Systems and Self-Organization 185 11.1 Quantifying complexity. 185 11.2 Emergence and self-organization . 186 11.3 Complexity of computing and communication systems . 187 12 Cloud Application Development 189 12.1 Connecting clients with cloud instances through ¯rewalls . 192 12.2 Security rules for application- and transport-layer protocols in EC2 . 195 12.3 How to launch an EC2 Linux instance and connect to it . 198 12.4 How to use S3 in Java . 200 12.5 How to manage SQS services in C# . 202 12.6 How to install the Simple Noti¯cation Service on Ubuntu 10.04 . 204 12.7 How to create an EC2 Placement Group and use MPI . 205 12.8 How to install hadoop on eclipse on a Windows system . 207 12.9 Cloud applications for cognitive radio networks . 212 12.9.1 Motivation and basic concepts . 212 12.9.2 A distributed algorithm to compute the trust in a CR . 214 12.9.3 Simulation of a distributed trust computing algorithm . 215 12.9.4 A history-based cloud service for cognitive radio networks . 217 12.10Adaptive data streaming to and from a cloud . 220 12.10.1 Digital music streaming . 221 13 Glossary 242 13.1 General . 242 13.2 Virtualization . 243 13.3 Cloud Services . 244 13.4 Layers and functions . 244 13.5 Desirable characteristics/attributes . 245 13.6 Cloud security . 246 4 1 Preface The contents of this book represent a series of lectures given in the Fall 2011 to a graduate level class on cloud computing. Cloud computing is a movement started sometime during the middle of the ¯rst decade of the new millennium; the movement is motivated by the idea that information processing can be done more e±ciently on large farms of computing and storage systems accessible via the Internet. The appeal of cloud computing is that it o®ers scalable and elastic computing and storage services; the resources used for these services can be metered and the users can be charged only for the resources they used. Cloud computing is cost e®ective because of the multiplexing of resources. Application data is stored closer to the site where it is used in a manner that is device and location independent; potentially, this data storage strategy increases reliability as well as security. The maintenance and security are ensured by service providers; the service providers can operate more e±ciency due to specialization and centralization. The cloud movement is not without skeptics and critics. The skeptics question what is actually a cloud, what is new, how does it di®er from other types of large-scale distributed system, why cloud computing could be successful when grid computing had only limited success. For example, Larry Ellison, the CEO of Oracle, was quoted in the Wall Street Journal on September 26, 2008: \the interesting thing about Cloud Computing is that we've rede¯ned Cloud Computing to include everything that we already do...;" Andy Iserwood, the Vice-President of HP's European Software Sales, was quoted in the ZDnet News on December 11, 2008 as saying: \a lot of people.