Improving Innovation and Entrepreneurship Competences of Iranian Higher Education Graduates through Data Analytics

Module 5: Distributed Systems
Dr. Atakan Aral, Vienna University of Technology, February 8, 2018

Evolution of Distributed Systems
- 1960s: Mainframe systems
- 1970s: Client/Server
- 1980s: Clusters
- 1990s: Grids
- 2000s: Clouds
- 2010s: Microservices, Edge/Fog

Why Distributed Systems?
- Performance
- Redundancy / fault tolerance / availability
- Scalability / flexibility
- Resource sharing / utilization
- Economics
- Accessibility
- Mobility

Issues with Distributed Systems
- Security
- Overloading / load balancing
- Synchronization
- Management
- More components that can fail (e.g. the network)
Module Outline
Cloud Computing (Yesterday 09:00-12:15)
- Use cases
- Example problems and solutions

Distributed Systems (Today 09:00-17:00)
- Technologies from industry
- Hands-on session

Module Outline
- Server virtualization, including hypervisors
- Network virtualization
- Cloud OSs
- SLAs and markets, managing market liquidity
- SLA management and negotiation
- Practical session: distributed system modeling with Runway
High Performance Computing
- Aerodynamic simulation
- Stock exchange simulation
- Discovery of new galaxies
- Modeling and simulation of meso-scale weather phenomena
- Preoperative surgery planning

Parallel Processing – does it really work?
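The scaling arithmetic below assumes the job is perfectly parallelizable. In practice, any serial fraction caps the achievable speedup, which Amdahl's law quantifies. A minimal sketch in Python (the 5% serial fraction is an illustrative assumption, not a figure from the slides):

```python
def amdahl_speedup(workers: int, serial_fraction: float) -> float:
    """Amdahl's law: overall speedup with `workers` processors when a
    `serial_fraction` of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)

# Even with 1000 workers, a 5% serial fraction limits the speedup:
print(amdahl_speedup(1000, 0.05))  # roughly 19.6x, nowhere near 1000x
print(1 / 0.05)                    # 20.0 is the hard upper bound
```

So "1000 workers = 1 day" only holds for embarrassingly parallel workloads with a negligible serial part.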
1 worker = 1000 days
2 workers = 500 days
...
1000 workers = 1 day?!

HPC – Infrastructures
Supercomputer
- Custom processor
- Tightly coupled

Cluster
- COTS components
- Loosely coupled
- Beowulf clusters

Grids
- Interconnection of computational resources across different administrative domains
- Virtual organizations (VO)

How sustainable is LSDC?
Cluster Computing
- Tightly coupled
- Homogeneous
- Single system image

HPC and Distributed Computing
- Loosely coupled
- Heterogeneous
- Single administration

Grid Computing
- Large scale
- Cross-organizational
- Geographical distribution
- Distributed management

Big Data
- NoSQL DBs
- Real-time processing
- Distributed queries

After 2013: Cloud Computing
- Provisioned on demand
- Service guarantee
- VMs and Web 2.0-based

Sky, Galaxy, ... Computing, ultra-scale data?? – well, we are ready!

Sequential Processing
Memory
- Stores program and data

CPU
- Fetch: get instructions and data from memory
- Decode the instruction
- Execute it sequentially

Supercomputer – Architectures
Flynn's taxonomy (1966):
- SISD (Single Instruction, Single Data): scalar processor, a single operation on single data, e.g. a sequential computer
- SIMD (Single Instruction, Multiple Data): vector processor, a single operation on multiple data, e.g. vector computing
- MISD (Multiple Instruction, Single Data): doesn't really exist
- MIMD (Multiple Instruction, Multiple Data): e.g. clusters and, nowadays, multicores

Memory Architecture (I)
Shared memory
- Multiple CPUs can operate independently
- Changes in memory are visible to the other CPUs
- Uniform memory access: Symmetric Multiprocessor (SMP)
- Non-uniform memory access (NUMA)

Memory Architecture (II)
Distributed memory
- Processors have their own memory
- A communication network is required
- No global address space

Memory Architecture (III)
Hybrid distributed-shared memory
- Cache-coherent SMP nodes
- Distributed memory across multiple SMP nodes
- Used in most parallel computers today

Parallel Programming
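The message-passing model described below, with tasks exchanging data via send/receive over a network, can be sketched using Python's multiprocessing module standing in for MPI. This is only an analogy; the real MPI API (C bindings or mpi4py) differs:

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # Task 1: receive a message, compute on it, send the result back.
    data = conn.recv()                   # receive(data)
    conn.send(sum(x * x for x in data))  # send(result)
    conn.close()

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    task1 = Process(target=worker, args=(child_end,))
    task1.start()
    parent_end.send([1, 2, 3, 4])        # Task 0: send(data)
    print(parent_end.recv())             # prints 30 (1 + 4 + 9 + 16)
    task1.join()
```

In a real MPI program the same binary runs as many ranks (e.g. `mpirun -np 4`), and communication uses MPI_Send/MPI_Recv rather than pipes; the structure of the exchange is the same.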
Shared memory ("parallel computing")
- Threads share data on one machine
- e.g. OpenMP

Message passing ("distributed computing")
- Tasks on different machines exchange data over the network: send(data) / receive(data)
- e.g. Message Passing Interface (MPI)

Hybrid approaches
- e.g. OpenMP and MPI, or MPI and POSIX threads

TOP500 Ranking
10^18 floating point operations per second

HPC - Who needs it?
Multi-core architectures

Scientists? Or everyone?
Grid Computing
- Hardware and software infrastructure that clusters and integrates high-end computers, networks, databases, and scientific instruments
- A virtual supercomputer
- Virtual organizations

WWGrid
Source: "CSA"

Grid Computing Vision
[Figure: Grid computing vision. Mobile devices, workstations, visualising clients, supercomputers, PC clusters, data storage, sensors, and experiments, all connected through Grid middleware over the Internet and networks. Source: "EGEE"]

The Grid Reality – that didn't work well ...

[Figure: EGEE job submission workflow. A user submits a job description (JDL) with an input sandbox from the UI to the Resource Broker, which authenticates and authorizes the user, queries the Information Service and the Replica Catalogue, expands the JDL, and passes the job (as Globus RSL) to the Job Submission Service; the job runs on a Compute Element with datasets from a Storage Element, the Logging & Book-keeping service tracks job status, and the output sandbox is returned to the user.]

Grid Computing
Internet → sharing, distribution, and pervasive access to information
Grid Computing → sharing, distribution, and pervasive access to computing power
…“computational Grid is hardware and software infrastructure that provides dependable, consistent, and pervasive access to high-end computational capabilities”… Foster, Kesselman (1998)
…“grid computing is concerned with coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations”… Foster, Kesselman (2000)
..."Grid ... uses standard, open, general-purpose protocols and interfaces; coordinates resources that are NOT subject to centralized control; delivers non-trivial qualities of service"... Foster, Kesselman (2002)

What is the Grid?
Grid Computing
- Origin in academia
- Moving to industry, commerce, ... highly popular
Compute Grids, Data Grids, Science Grids, Access Grids, Knowledge Grids, Bio Grids, Sensor Grids, Cluster Grids, Campus Grids, Tera Grids, Commodity Grids,…
Grid Checklist: … coordinates resources that are not subject to centralized control …
… using standard, open, general-purpose protocols and interfaces…
… to deliver nontrivial qualities of service …

Success Stories
EGEE Infrastructure
Scale:
- > 170 sites in 39 countries
- > 17,000 CPUs
- > 5 PB storage
- > 10,000 concurrent jobs per day
- > 60 Virtual Organizations

Source: Erwin Laure, CERN
Virtualization

Virtualization Middleware
- Virtual machines / OS: VMware (vSphere), Xen
- Middleware management: OpenNebula, Eucalyptus, Aneka Clouds, FoSII (monitoring, knowledge management, SLA management, energy efficiency)
- Programming models: MapReduce
- Access management: VieSLAF, compliance management, security issues

VMs
The VMM decouples the software from the hardware by forming a level of indirection between the software running in the virtual machine (the layer above the VMM) and the hardware.

The VMM
- provides total mediation of all interactions between the virtual machine and the underlying hardware
- allows strong isolation between virtual machines and supports multiplexing many virtual machines on a single hardware platform.
The central design goals for VMMs:
Compatibility
Performance
Simplicity

Why virtualization?
Consolidate workloads to reduce hardware, power, and space requirements.
Run multiple operating systems simultaneously, as an enterprise upgrade path or to leverage the advantages of specific operating systems.
Run legacy software on newer, more reliable, and more power-efficient hardware.
Dynamically migrate workloads to provide fault tolerance.
Provide redundancy to support disaster recovery.
Elasticity by means of vertical and horizontal scaling

Types of virtualization
Software, or full, virtualization
- The hypervisor "traps" machine operations that read or modify the system's status or perform input/output (I/O), emulates them, and returns status codes consistent with what the OS expects

Partial virtualization, or para-virtualization
- Eliminates trapping and emulating; the guest OS is aware of the hypervisor

Hardware-assisted virtualization
- Hardware extensions to the x86 architecture eliminate much of the hypervisor overhead associated with trapping/emulating I/O operations
- Rapid Virtualization Indexing: hardware-assisted memory management

Hosted vs. Hypervisor
A hosted architecture
- installs and runs the virtualization layer as an application on top of an operating system
- supports the broadest range of hardware configurations
- e.g. VMware Player, ACE, Workstation, and Server

A hypervisor (bare-metal) architecture
- installs the virtualization layer directly on a clean x86-based system
- has direct access to the hardware, so it is more efficient than a hosted architecture and delivers greater scalability, robustness, and performance
- e.g. Xen
CPU virtualization
- The basic VMM technique of direct execution: executing the virtual machine on the real machine while letting the VMM retain ultimate control of the CPU (x86 architecture)

Memory virtualization
- The VMM maintains a shadow of the virtual machine's memory-management data structures

I/O virtualization
- Using a channel processor, the VMM safely exports I/O device access directly to the virtual machine, e.g. vNICs

CPU Virtualization
Basic direct execution: the virtual machine's privileged (operating-system kernel) and unprivileged code run in the CPU's unprivileged mode, while the VMM runs in privileged mode.

When the virtual machine attempts to perform a privileged operation, the CPU traps into the VMM, which emulates the privileged operation on the virtual machine state that the VMM manages.
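This trap-and-emulate scheme, including the interrupts example below, can be modeled as a toy in Python. This is entirely illustrative; real VMMs operate at the instruction level:

```python
class ToyVMM:
    """Toy model of trap-and-emulate: privileged operations from the
    guest trap into the VMM, which updates per-VM virtual state instead
    of touching the real CPU."""

    def __init__(self):
        self.vm_state = {"interrupts_enabled": True}

    def trap(self, op):
        # The CPU traps the privileged op; the VMM emulates it here.
        if op == "disable_interrupts":
            # Record the change for this VM only; the real CPU keeps
            # interrupts on, so the VMM can always regain control.
            self.vm_state["interrupts_enabled"] = False
        elif op == "enable_interrupts":
            self.vm_state["interrupts_enabled"] = True
        else:
            raise ValueError(f"unhandled privileged op: {op}")

vmm = ToyVMM()
vmm.trap("disable_interrupts")
print(vmm.vm_state["interrupts_enabled"])  # False: only the VM's virtual state changed
```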
Example: interrupts. Letting a guest operating system disable interrupts would not be safe, since the VMM could not regain control of the CPU. Instead, the VMM traps the operation and records that interrupts are disabled for that virtual machine.

Challenges in Virtualizing CPUs
Most modern CPU architectures were not designed to be virtualizable, including the popular x86. x86 operating systems use the POPF instruction (pop CPU flags from stack) to set and clear the interrupt-disable flag. When it runs in unprivileged mode, POPF does not trap; it simply ignores changes to the interrupt flag, so direct-execution techniques will not work for privileged-mode code that uses this instruction.

In the x86 architecture, unprivileged instructions can also access privileged state. Software running in the virtual machine can read the code segment register to determine the processor's current privilege level. A virtualizable processor would trap this instruction, and the VMM could then adjust the value the software sees to reflect the virtual machine's privilege level. The x86, however, doesn't trap the instruction, so with direct execution the software would see the wrong privilege level in the code segment register.

Techniques for Virtualizing CPUs
Paravirtualization
- The VMM builder defines the virtual machine interface by replacing nonvirtualizable portions of the original instruction set with easily virtualized, more efficient equivalents. Although operating systems must be ported to run in a virtual machine, most normal applications run unmodified.
- Drawback: incompatibility. Any operating system run on a paravirtualized VMM must be ported to that interface. Operating system vendors must cooperate, legacy operating systems cannot run, and existing machines cannot easily migrate into virtual machines.

Direct execution combined with fast binary translation
- In most modern operating systems, the processor modes that run normal application programs are virtualizable and can run using direct execution. A binary translator runs the nonvirtualizable privileged-mode code, patching the nonvirtualizable x86 instructions. The result is a high-performance virtual machine that matches the hardware and thus maintains total software compatibility.

Techniques for Virtualizing CPUs
Hardware-assisted trapping
- Sensitive instructions are automatically trapped by the hardware
- No need for binary translation or OS modification
- A new privilege level: the hypervisor can now run at "Ring -1"

Memory Virtualization
The shadow page table lets the VMM precisely control which pages of the machine’s memory are available to a virtual machine. When the operating system running in a virtual machine establishes a mapping in its page table, the VMM detects the changes and establishes a mapping in the corresponding shadow page table entry that points to the actual page location in the hardware memory.
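The shadow page table can be pictured as the composition of two mappings: guest-virtual to guest-physical (maintained by the guest OS) and guest-physical to machine (maintained by the VMM). A toy sketch in Python, with all page numbers invented for illustration:

```python
# Guest OS page table: guest-virtual page -> guest-physical page
guest_page_table = {0: 2, 1: 0}

# The VMM's private mapping: guest-physical page -> real machine page
machine_map = {0: 7, 2: 4}

# Shadow page table used by the hardware: guest-virtual -> machine page.
# The VMM rebuilds entries whenever the guest updates its page table.
shadow = {gv: machine_map[gp] for gv, gp in guest_page_table.items()}

print(shadow)  # {0: 4, 1: 7}
```

Because the hardware only ever sees the shadow table, the VMM decides which machine pages each VM can touch, regardless of what the guest writes into its own page table.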
When the virtual machine is executing, the hardware uses the shadow page table for memory translation, so the VMM can always control what memory each virtual machine is using.

Challenges in Memory Virtualization
The VMM's virtual memory subsystem constantly controls how much memory goes to each virtual machine and periodically reclaims some of that memory by paging a portion of the virtual machine out to disk.

The operating system running in the virtual machine (the guest OS), however, is likely to have much better information than the VMM's virtual memory system about which pages are good candidates for paging out. For example, the guest OS might note that the process that created a page has exited, which means nothing will access the page again. The VMM, operating at the hardware level, does not see this and might wastefully page out that page. Solution: balloon processes that inflate the guest's memory usage (see the Xen part).

Another challenge is the memory footprint of modern operating systems and applications when many virtual machines share a host. Solution: content-based page sharing (used in VMware's server products).

I/O Virtualization
Rather than communicating with the device through traps into the VMM, the software in the virtual machine can directly read and write the device.

Network Virtualization
- External: combine and divide LANs into virtual networks
- Internal: emulate a physical network with software in a virtualized server
- Separates logical network behaviour from the underlying physical network resources

Virtualized Network Interface Card (vNIC)
- Has its own MAC address
- Creates virtual networks between virtual machines without the network traffic consuming bandwidth on the physical network
- NIC teaming allows multiple physical NICs to appear as one and to fail over transparently for virtual machines
- Virtual machines can be seamlessly relocated to different systems while keeping their existing MAC addresses
- As with memory virtualization, the total virtual bandwidth can exceed what is physically available

Network Virtualization
A distributed (or elastic) virtual switch
- Functions as a single virtual switch across all associated hosts
- Lets virtual machines maintain a consistent network configuration as they migrate across multiple hosts
- Can route traffic internally between virtual machines or link to an external network by connecting to physical Ethernet adapters

Main benefits of network virtualization
- Better utilization of the available resources by consolidating various applications on fewer servers
- Cost-effective, because hardware NICs can be virtualized
- Reduces the time to provision networks and deploy virtual machines and applications
Xen

Xen and the Art of Virtualization
Drawbacks of full virtualization
- Certain supervisor instructions must be handled by the VMM for correct virtualization, but executing them with insufficient privilege fails silently rather than causing a convenient trap
- Efficiently virtualizing the x86 MMU is also difficult

→ paravirtualization

Source: "Ian Pratt, Keir Fraser, Steven Hand, Christian Limpach, and Andrew Warfield. Xen 3.0 and the art of virtualization. In Proceedings of the Linux Symposium, volume 2, Ottawa, Ontario, Canada, July 2005."

Paravirtualization
Support for unmodified application binaries is essential, or users will not transition to Xen.
Supporting full multi-application operating systems is important, as this allows complex server configurations to be virtualized within a single guest OS instance.
Paravirtualization is necessary to obtain high performance and strong resource isolation on uncooperative machine architectures such as x86. Even on cooperative machine architectures, completely hiding the effects of resource virtualization from guest OSes risks both correctness and performance.

Paravirtualization
Source: "Ian Pratt, Keir Fraser, Steven Hand, Christian Limpach, and Andrew Warfield. Xen 3.0 and the art of virtualization. In Proceedings of the Linux Symposium, volume 2, Ottawa, Ontario, Canada, July 2005."

Domains in Xen
Dom0
- Privileged access to hardware
- Major goal: device multiplexing

DomUs
- Guest OSes
- No direct access to hardware; special drivers are used instead (with some exceptions)

Hardware Virtual Machine
- Since virtualization extensions are built into IA-32/AMD64, Xen can also run unmodified OSes

Xen – CPU Virtualization
- IA-32 has 4 rings
- AMD64 has only 2 rings, so the guest OS shares a security zone with the applications
- The kernel is separated from the applications using the hypervisor

Xen – Hypercalls
Xen – Memory Management
- Shadow page tables
- Balloon driver

Xen – I/O
VMware

Before and After Virtualization
Before virtualization:
- Single OS image per machine
- Software and hardware tightly coupled
- Running multiple applications on the same machine often creates conflicts
- Underutilized resources
- Inflexible and costly infrastructure

After virtualization:
- Hardware independence of operating system and applications
- Virtual machines can be provisioned to any system
- OS and application can be managed as a single unit by encapsulating them into a virtual machine
Source: VMware

Hosted vs. Hypervisor Architecture
VMware Family
Cloud Markets

Introduction
- An important characteristic of Cloud markets is the liquidity of the traded good
- For the market to function efficiently, a sufficient number of market participants is needed
- Creating such a market with a large number of providers and consumers is far from trivial
- Resource consumers will only join if they are able to find what they need quickly
- Resource providers will only join if they can be fairly certain that their resources will be sold
- Not meeting either of these conditions will deter providers and consumers from using the market

Cloud Characteristics
Deployment types
- Private cloud: enterprise owned or leased, e.g. data centers, HPC centers, ...
- Community cloud: shared infrastructure for a specific community
- Public cloud: sold to the public, mega-scale infrastructure, e.g. EC2, S3, ...
- Hybrid cloud: composition of two or more clouds

Delivery models
- Cloud Software as a Service (SaaS): use the provider's applications over a network, e.g. Salesforce.com, ...
- Cloud Platform as a Service (PaaS): deploy customer-created applications to a cloud, e.g. Google App Engine, Microsoft Azure, ...
- Cloud Infrastructure as a Service (IaaS): rent processing, storage, network capacity, and other fundamental computing resources, e.g. EC2 (Elastic Compute Cloud), S3 (Simple Storage Service), SimpleDB, ...

Source: "Effectively and Securely Using the Cloud Computing Paradigm", Peter Mell, Tim Grance, NIST, Information Technology Laboratory

Cloud-enabling Technologies
Primary technologies
- Virtualization
- Grid technology
- SOA
- Distributed computing
- Broadband networks
- Browser as a platform
- Free and open source software

Other technologies
- Autonomic systems
- Web 2.0
- Web application frameworks
- SLAs

Source: "Effectively and Securely Using the Cloud Computing Paradigm", Peter Mell, Tim Grance, NIST, Information Technology Laboratory

Problems when providing virtual goods
- The use of virtualization enables providers to create a wide range of resource types and allows consumers to specify their needs precisely
- If the resource variability on both sides is large, consumers and providers will not meet, since their offers may differ slightly
- A simulation demonstrates the problems caused by a large number of resource definitions
- We then introduce an approach, based on SLA mappings, which ensures sufficient liquidity in the market

State-of-the-Art: Resource Markets in Research
Research into resource markets can be divided into two groups according to how they describe the tradable good.

The first group consists largely of Grid market designs that did not define goods clearly:
- GRACE developed a market architecture for Grid markets and outlined a market mechanism
- The SORMA project focused more on fairness and efficient resource allocation; it also identified several requirements for open Grid markets (allocative efficiency, computational tractability, individual rationality, etc.)

State-of-the-Art: Resource Markets in Research
The second group simplified the computing resource good by focusing on only one aspect of it:
- In MACE, the importance of developing a definition for the tradable good was recognized and an abstraction was developed; the liquidity of goods, and the likelihood that consumers and providers with common offers can meet, was not addressed
- The Popcorn market only traded Java operations, which simplified the matching between consumers and providers
- The Spawn market was envisioned to work with CPU time slices, which makes the matching of demand and supply trivial but forces consumers to determine the number of required CPU cycles

Neither group discusses the liquidity of goods in Cloud computing markets.

State-of-the-Art: Commercial Resource Providers
In recent years, a large number of commercial Cloud providers have entered the utility computing market, offering different types of services:
- Resource providers who only provide computing resources (e.g. Amazon, Tsunamic Technologies)
- SaaS providers who sell their own resources together with their own software services (e.g. Google Apps, Salesforce.com)
- Companies that run a mixed approach, i.e. they allow users to create their own services but at the same time offer their own services (Sun N1 Grid, Microsoft Azure)

In the current market, providers only sell a single type of resource (with the exception of Amazon). This limited number of resource types enables market creation, since all demand is channeled towards very few resource types.

Liquidity Problems in Markets
If an open Cloud market is created in which resource specifications are left to the traders, would such a market be able to match providers and consumers?

We simulated this in a double-auction market environment with varying numbers of resource types and traders. The matching probability was used as a measure of how attractive the market would be to providers and consumers.

Liquidity Problems in Markets
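The qualitative behavior behind the numbers below can be reproduced with a small Monte Carlo sketch. This is heavily simplified: each trader picks one resource type uniformly at random, and a consumer "matches" if any provider picked the same type; the simulation behind the slides is more elaborate:

```python
import random

def matching_probability(n_types, n_providers, n_consumers, trials=200):
    """Estimate the probability that a random consumer finds at least
    one provider offering the same resource type."""
    matched = total = 0
    for _ in range(trials):
        offered = {random.randrange(n_types) for _ in range(n_providers)}
        demands = [random.randrange(n_types) for _ in range(n_consumers)]
        matched += sum(d in offered for d in demands)
        total += n_consumers
    return matched / total

random.seed(0)
# Few resource types: matching is easy. Many types, few traders: poor matching.
print(matching_probability(10, 100, 100))    # close to 1.0
print(matching_probability(5000, 100, 100))  # only a few percent
```

The sketch shows the core effect: as the number of distinct resource types grows, ever more traders are needed to keep the matching probability high.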
With 5000 resource types, about 40,000 traders would be needed to achieve a matching probability of 75%
With 10,000 resource types, about 46,000 traders would be needed to achieve a matching probability of 75%
With 276,480 resource types, about 33 million traders would be needed to achieve a matching probability of 75%

The Challenge
The analysis of the liquidity problems in markets poses an interesting research challenge:
- On the one hand, to fully exploit the potential of open markets, a large number of providers and consumers is necessary
- On the other hand, the large number of potential traders might inflate the variety of resources, spreading supply and demand across a wide range of resources

To give traders few restrictions, an approach is needed that allows traders to define their resources (or requirements) freely while facilitating SLA matching.

Importance of SLAs in markets
- Current adaptive SLA matching mechanisms are based on OWL, DAML-S, and other semantic technologies
- However, none of these approaches addresses the issues of the open market
- In most existing approaches, provider and consumer have to agree on specific ontologies or have to belong to a specific portal
- None of the approaches deals with semi-automatic definition of SLA mappings enabling negotiations between inconsistent SLA templates

Heterogeneity of Clouds
[Figure: Heterogeneity of Clouds. A consumer (client and middleware) with SLA template A and negotiation strategy Y negotiates with a Cloud service provider (apps, web services, databases) using SLA template B and negotiation strategy X. Two questions arise: how to map between different SLA templates (solution: service mediation) and how to map between different negotiation strategies (solution: negotiation bootstrapping).]

Managing SLAs
The figure depicts the registry responsible for the management of SLA mappings. The registry comprises different SLA templates, each of which represents a specific application domain.

The registry works as follows:
- Providers assign a particular public SLA template to their service
- Next, they may assign SLA mappings
- Consumers search for services using metadata and search terms
- After finding appropriate services, each consumer may define mappings
- Public SLA templates should be updated frequently to reflect the actual SLAs used

The SLA Template Lifecycle
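Concretely, an SLA mapping can be pictured as a translation between a trader's local SLA template and the registry's public template. A minimal sketch in Python, with all attribute names invented for illustration:

```python
# Public SLA template attribute names (registry side)
public_template = {"availability", "response_time", "price"}

# A consumer's local template uses different attribute names;
# the SLA mapping translates local names into public ones.
sla_mapping = {
    "uptime": "availability",
    "latency": "response_time",
    "cost": "price",
}

def to_public(local_offer):
    """Rewrite a local offer into the public template's vocabulary."""
    return {sla_mapping[key]: value for key, value in local_offer.items()}

offer = to_public({"uptime": "99.9%", "latency": "200ms", "cost": 10})
print(offer)  # keys now match the public template
```

With such mappings in place, offers written against many different local templates can all be matched against a handful of public templates, which is what keeps the market liquid.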
1. We assume that for specific domains, specific SLA templates are generated
2. These generated SLA templates are published in the public registry, and learning functions for the adaptation of these public SLA templates are defined
3. SLA mappings are defined manually by users
4. SLAs are mapped automatically
5. Based on the learning function and the submitted SLA mappings, a new version of the SLA template can be defined and published in the registry

Autonomic Process
[Figure: QoS example. An autonomic manager implements the MAPE-K loop (monitoring, analysis, planning, and execution over shared knowledge) with sensors and actuators on the managed service, applied to QoS metric and protocol compositions, mapping strategies, and negotiation/evaluation using the VieSLAF framework.]

Autonomic Process for MN and SM

[Figure: Autonomic process for meta-negotiation (MN) and service mediation (SM). Prerequisite: definition and publication of the meta-negotiation document. Monitoring (sensor): execution of meta-negotiation and detection of SLA inconsistencies. Analysis: evaluation of existing bootstrapping strategies and of existing SLA mappings. Planning (knowledge): application of existing and definition of new bootstrapping strategies and SLA mappings. Execution (actuator): execution of bootstrapping (negotiation bootstrapping) and application of SLA mappings to fulfill successful SLA contracting (service mediation).]

Lifecycle of a self-manageable Cloud Service
[Figure: Lifecycle of a self-manageable Cloud service. Meta-negotiation → negotiation → execution → post-processing, coordinated by self-management.]

Architecture for Self-manageable Cloud Services
[Figure: Architecture for self-manageable Cloud services. An autonomic manager (analysis, planning, execution, and monitoring over shared knowledge, with sensors and actuators) manages services 1…n, each exposing a self-management interface, a negotiation interface, and a job management interface.]

Auction Overview
In an auction, a seller offers an item for sale, but does not establish a price
Stakeholders:
- Bidders (i.e., potential buyers)
- Sellers
- Intermediaries
Shill bidders place bids on behalf of the seller to artificially inflate the price of an item

Auction Models
Auction types
- English auction
- Dutch auction
- Reverse English auction
- Alternate offers
- Automated auctions based on agent models

[Figure: Auction model with SLA mappings. Both parties (1) define mappings from their local templates A and B to the public template X, (2) bidders send bids, and (n) offers are matched. Definition and application of SLA mappings.]

Dutch Auctions
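The descending-price mechanism described below can be sketched in a few lines of Python (the prices, decrement, and bidders' private limits are invented for illustration):

```python
def dutch_auction(start_price, decrement, reserve, bidder_limits):
    """Price drops from start_price; the first bidder whose private
    limit is reached accepts and wins at the current price."""
    price = start_price
    while price >= reserve:
        for bidder, limit in bidder_limits.items():
            if price <= limit:       # first acceptance ends the auction
                return bidder, price
        price -= decrement
    return None, None                # no sale above the reserve price

winner, price = dutch_auction(100, 5, 40, {"A": 70, "B": 82})
print(winner, price)  # B accepts first, at 80
```

Because the clock stops at the first acceptance, the winner pays close to their own limit, which is why the format tends to favor the seller.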
- Dutch auctions (i.e., descending-price auctions) are a form of open auction in which bidding starts at a high price and drops until a bidder accepts the price
- Dutch auctions are often better for the seller
- They are an effective means of moving large numbers of commodity items quickly

Posted Price Mechanism
[Figure: Posted price mechanism. Both parties (1) define mappings from their local templates A and B to the public template X; the provider (2) posts a price, and (3) the parties are matched. Definition and application of SLA mappings.]

Negotiated Price
[Figure: Negotiated price mechanism. Both parties (1) define mappings from their local templates A and B to the public template X; the consumer (2) sends metadata, the provider (3) makes an offer, and (4) the consumer acknowledges. Definition and application of SLA mappings.]

Practical Session (Runway)
Contact:
Atakan Aral, PhD
Institute of Information Systems Engineering
Vienna University of Technology
[email protected]
http://rucon.ec.tuwien.ac.at
These slides are mostly adapted from course materials by Univ. Prof. Dr. Ivona Brandic for 184.271 Large-scale Distributed Computing course at TU Wien.