Computer Science Lecture 22, pages 2-21

Data Centers and Cloud Computing
• Intro. to Data centers
• Virtualization Basics
• Intro. to Cloud Computing

Data Centers
• Large server and storage farms
  – 1000s of servers
  – Many TBs or PBs of data
• Used by
  – Enterprises for server applications
  – Internet companies
• Some of the biggest DCs are owned by Google, Facebook, etc.
• Used for
  – Data processing
  – Web sites
  – Business apps

Inside a Data Center
• Giant warehouse filled with:
  – Racks of servers
  – Storage arrays
  – Cooling infrastructure
  – Power converters
  – Backup generators

MGHPCC Data Center
• Data center in Holyoke

Modular Data Center
• ...or use shipping containers
• Each container filled with thousands of servers
• Can easily add new containers
  – “Plug and play”
  – Just add electricity
• Allows data center to be easily expanded
• Pre-assembled, cheaper

Virtualization
• Virtualization: extend or replace an existing interface to mimic the behavior of another system.
  – Introduced in 1970s: run legacy software on newer mainframe hardware
• Handle platform diversity by running apps in VMs
  – Portability and flexibility

Types of Interfaces
• Different types of interfaces
  – Assembly instructions
  – System calls
  – APIs
• Depending on what is replaced/mimicked, we obtain different forms of virtualization
• Emulation (Bochs), OS level, application level (Java, Rosetta, Wine)

Types of OS-level Virtualization
• Type 1: hypervisor runs on “bare metal”
• Type 2: hypervisor runs on a host OS
  – Guest OS runs inside hypervisor
• Both VM types act like real hardware

Server Virtualization
• Allows a server to be “sliced” into Virtual Machines
• VM has own OS/applications
• Rapidly adjust resource allocation
• VM migration within a LAN
[Figure: VM 1 (Windows) and VM 2 (Linux) on top of a virtualization layer]

Virtualization in Data Centers
• Virtual Servers
  – Consolidate servers
  – Faster deployment
  – Easier maintenance
• Virtual Desktops
  – Host employee desktops in VMs
  – Remote access with thin clients
  – Desktop is available anywhere
  – Easier to manage and maintain
[Figure: Windows and Linux desktops reached from work and home]

Data Center Challenges
• Resource management
  – How to efficiently use server and storage resources?
  – Many apps have variable, unpredictable workloads
  – Want high performance and low cost
  – Automated resource management
  – Performance profiling and prediction
• Energy Efficiency
  – Servers consume huge amounts of energy
  – Want to be “green”
  – Want to save money

Data Center Costs
• Running a data center is expensive
http://perspectives.mvdirona.com/2008/11/28/CostOfPowerInLargeScaleDataCenters.aspx

Economy of Scale
• Larger data centers can be cheaper to buy and run than smaller ones
  – Lower prices for buying equipment in bulk
  – Cheaper energy rates
• Automation allows small number of sys admins to manage thousands of servers
• General trend is towards larger mega data centers
  – 100,000s of servers
• Has helped grow the popularity of cloud computing

What is the cloud?
• Remotely available
• Pay-as-you-go
• High scalability
• Shared infrastructure
[Figure label: “Azure”]

The Cloud Stack
• Software as a Service
  – Hosted applications
  – Managed by provider
  – Office apps, CRM
• Platform as a Service
  – Platform to let you run your own apps
  – Provider handles scalability
  – Software platforms
• Infrastructure as a Service
  – Raw infrastructure
  – Can do whatever you want with it
  – Servers & storage
[Figure label: “Azure” (truncated in source)]

PaaS: Google App Engine
• Provides highly scalable execution platform
  – Must write application to meet App Engine API
  – App Engine will autoscale your application
  – Strict requirements on application state
    • “Stateless” applications much easier to scale
• Not based on virtualization
  – Multiple users’ threads running in same OS
  – Allows Google to quickly increase number of “worker threads” running each client’s application
• Simple scalability, but limited control

IaaS: Amazon EC2
• Rents servers and storage to customers
  – Uses virtualization to share each server for multiple customers
  – Economy of scale lowers prices
  – Can create VM with push of a button

             Smallest    Medium      Largest
  VCPUs      1           5           33.5
  RAM        613MB       1.7GB       68.4GB
  Price      $0.02/hr    $0.17/hr    $2.10/hr
  Storage    $0.10/GB per month
  Bandwidth  $0.10 per GB

Public or Private
• Not all enterprises are comfortable with using public cloud services
  – Don’t want to share CPU cycles or disks with competitors
  – Privacy and regulatory concerns
• Private Cloud
  – Use cloud computing concepts in a private data center
    • Automate VM management and deployment
    • Provides same convenience as public cloud
    • May have higher cost
• Hybrid Model

Programming Models
• Client/Server
  – Web servers, databases, CDNs, etc.
• Batch processing
  – Business processing apps, payroll, etc.
• Map Reduce
  – Data intensive computing
  – Scalability concepts built into programming model

Cloud Challenges
• Privacy / Security
  – How to guarantee isolation between client resources?
• Extreme Scalability
  – How to efficiently manage 1,000,000 servers?
• Programming models
  – How to effectively use 1,000,000 servers?
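
The Map Reduce entry under Programming Models says that scalability concepts are built into the programming model. A minimal single-process sketch in Python may make that concrete. It is an illustration only, not how any production framework is implemented: the word-count task and the names map_phase, shuffle, reduce_phase, and map_reduce are assumptions chosen for this example. The programmer supplies only the map and reduce functions; since each map call and each per-key reduce call is independent of the others, a framework could in principle run them on many servers at once.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key (here, by word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run map, shuffle, and reduce in sequence inside one process.

    In a real deployment each map call and each reduce call could run
    on a different server; here they simply run one after another.
    """
    pairs = (pair for doc in documents for pair in map_phase(doc))
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Hypothetical input splits, one string per split.
    splits = ["the cloud scales", "the data center scales"]
    print(map_reduce(splits))
```

Only the shuffle step needs to see data from every input split, which is why frameworks built on this model can add servers without changes to the user's map and reduce code.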