<<

Data Centers and Computing Data Centers

• Large and storage farms – 1000s of servers • Intro. to Data centers – Many TBs or PBs of data

Basics • Used by – Enterprises for server applications – companies • Intro. to • Some of the biggest DCs are owned by , , etc • Used for – Data processing – Web sites – Business apps

Computer Science Lecture 22, page 2 Science Lecture 22, page 3

Inside a Data Center MGHPCC Data Center

• Giant warehouse filled with: • Racks of servers • Storage arrays

• Cooling infrastructure • Power converters • generators

• Data center in Holyoke

Computer Science Lecture 22, page 4 Computer Science Lecture 22, page 5 Modular Data Center Virtualization • ...or use shipping containers

• Each container filled with thousands of servers

• Can easily add new containers • Virtualization: extend or replace an existing interface to – “Plug and play” mimic the behavior of another system. – Just add electricity – Introduced in 1970s: run legacy software on newer mainframe hardware • Allows data center to be easily • Handle platform diversity by running apps in VMs expanded – Portability and flexibility • Pre-assembled, cheaper

Computer Science Lecture 22, page 6 Computer Science Lecture 22, page 7

Types of Interfaces Types of OS-level Virtualization

• Different types of interfaces – Assembly instructions – System calls • Type 1: hypervisor runs on “bare metal” – APIs • Type 2: hypervisor runs on a host OS • Depending on what is replaced /mimiced, we obtain – Guest OS runs inside hypervisor different forms of virtualization • Both VM types act like real hardware • Emulation (Bochs), OS level, application level (Java, Rosetta, Wine) Computer Science Lecture 22, page 8 Computer Science Lecture 22, page 9 Server Virtualization Virtualization in Data Centers • Virtual Servers • Allows a server to be “sliced” into Virtual Machines – Consolidate servers • VM has own OS/applications – Faster deployment VM 1 VM 2 – Easier maintenance • Rapidly adjust resource allocation Windows Virtualization Layer • VM migration within a LAN • Virtual Desktops – Host employee desktops in VMs – Remote access with thin clients – Desktop is available anywhere Work Windows Linux – Easier to manage and maintain

Home Computer Science Lecture 22, page 10 Computer Science Lecture 22, page 11

Data Center Challenges Data Center Costs

• Resource management • Running a data center is expensive – How to efficiently use server and storage resources? – Many apps have variable, unpredictable workloads – Want high performance and low cost – Automated resource management – Performance profiling and prediction

• Energy Efficiency – Servers consume huge amounts of energy

– Want to be “green” http://perspectives.mvdirona.com/2008/11/28/ – Want to save money CostOfPowerInLargeScaleDataCenters.aspx

Computer Science Lecture 22, page 12 Computer Science Lecture 22, page 13 Economy of Scale What is the cloud?

• Larger data centers can be cheaper to buy and run than smaller ones – Lower prices for buying equipment in bulk – Cheaper energy rates Remotely available • allows small number of sys admins to manage Pay-as-you-go thousands of servers High scalability Shared infrastructure • General trend is towards larger mega data centers – 100,000s of servers Azure • Has helped grow the popularity of cloud computing

Computer Science Lecture 22, page 14 Computer Science Lecture 22, page 15

The Cloud Stack PaaS: Software Hosted applications • Provides highly scalable execution platform Managed by provider – Must write application to meet App Engine API Office apps, CRM – App Engine will autoscale your application – Strict requirements on application state Platform to let you run • “Stateless” applications much easier to scale Azur your own apps Software platforms Provider handles • Not based on virtualization scalability Infrastructure as a Service – Multiple users’ threads running in same OS – Allows google to quickly increase number of “worker threads” Raw infrastructure running each client’s application Can do whatever you want with it Servers & storage • Simple scalability, but limited control Computer Science Lecture 22, page 16 Computer Science Lecture 22, page 17 IaaS: Amazon EC2 Public or Private • Rents servers and storage to customers – Uses virtualization to share each server for multiple customers • Not all enterprises are comfortable with using public cloud – Economy of scale lowers prices services – Can create VM with push of a button – Don’t want to share CPU cycles or disks with competitors – Privacy and regulatory concerns Smallest Medium Largest VCPUs 1 5 33.5 • Private Cloud RAM 613MB 1.7GB 68.4GB – Use cloud computing concepts in a private data center Price $0.02/hr $0.17/hr $2.10/hr • Automate VM management and deployment Storage $0.10/GB per month • Provides same convenience as public cloud Bandwidt $0.10 per GB h • May have higher cost

• Hybrid Model Computer Science Lecture 22, page 18 Computer Science Lecture 22, page 19

Programming Models Cloud Challenges

• Client/Server • Privacy / Security – Web servers, databases, CDNs, etc – How to guarantee isolation between client resources?

• Batch processing • Extreme Scalability – Business processing apps, payroll, etc – How to efficiently manage 1,000,000 servers?

• Map Reduce • Programming models – Data intensive computing – How to effectively use 1,000,000 servers? – Scalability concepts built into programming model

Computer Science Lecture 22, page 20 Computer Science Lecture 22, page 21