Large Scale Distributed Computing - Summary
Author: @triggetry, @lanoxx
Source available on github.com/triggetry/lsdc-summary
December 10, 2013

Contents

1 Preamble
2 CHAPTER 1
  2.1 Most common forms of Large Scale Distributed Computing
  2.2 High Performance Computing
    2.2.1 Parallel Programming
  2.3 Grid Computing
    2.3.1 Definitions
    2.3.2 Virtual Organizations
  2.4 Cloud Computing
    2.4.1 Definitions
    2.4.2 5 Cloud Characteristics
    2.4.3 Delivery Models
    2.4.4 Cloud Deployment Types
    2.4.5 Cloud Technologies
      2.4.5.1 Virtualization
      2.4.5.2 The history of Virtualization
  2.5 Web Application Frameworks
  2.6 Web Services
  2.7 Multi-tenancy
  2.8 BIGDATA
3 CHAPTER 2
  3.1 OS / Virtualization
    3.1.1 Batchsystems
      3.1.1.1 Common Batch processing usage
      3.1.1.2 Portable Batch System (PBS)
    3.1.2 VGE - Vienna Grid Environment
  3.2 VMs, VMMs
    3.2.1 Why Virtualization?
    3.2.2 Types of virtualization
    3.2.3 Hypervisor vs. hosted Virtualization
      3.2.3.1 Type 1 and Type 2 Virtualization
    3.2.4 Basic Virtualization Techniques
  3.3 Xen
    3.3.1 Architecture
    3.3.2 Dynamic Memory Control (DMC)
    3.3.3 Balloon Drivers
    3.3.4 Paravirtualization
    3.3.5 Domains in Xen
    3.3.6 Hypercalls in Xen
  3.4 VMWare
    3.4.1 Hosted vs. Hypervisor Architecture
  3.5 Cloud Management
  3.6 OpenNebula
  3.7 Eucalyptus
    3.7.1 Components
  3.8 Virtualization - Glossary
4 CHAPTER 3
  4.1 Self adaptable Clouds: Cloud Monitoring and Knowledge Management
    4.1.1 Traditional MAPE Loop
    4.1.2 SLA
    4.1.3 LoM2His Framework
    4.1.4 Cloud Characteristics
    4.1.5 How to make Clouds energy-efficient
    4.1.6 How to avoid SLA violations
    4.1.7 How to structure actions
  4.2 Policy Modes
  4.3 Cloud Market
  4.4 Cloud Characteristics
  4.5 Cloud Enabling Technologies
  4.6 Problems when providing virtual goods
  4.7 Resource markets in Research
  4.8 Commercial Resource Providers
  4.9 Liquidity Problems in Markets
  4.10 The importance of SLAs in markets
  4.11 Managing SLAs
  4.12 The SLA Template Lifecycle
  4.13 SLA Mapping in Double Auctions
  4.14 Consequences of few resource types
  4.15 Mapping the SLA Landscape for High Performance Clouds
  4.16 SLA Mapping Approach
5 CHAPTER 4
  5.1 Map-Reduce Overview
  5.2 Map Reduce Sequence of Actions
  5.3 Master Data Structures
  5.4 Fault Tolerance
    5.4.1 Worker Failure
    5.4.2 Master Failure
  5.5 Data Flow
  5.6 Partitioning function
  5.7 Combiner function
  5.8 Input and Output Types
  5.9 Hadoop
  5.10 Hive
    5.10.1 HiveQL
  5.11 HadoopDB