Programming Models and Frameworks
Advanced Cloud Computing
15-719/18-847b
Garth Gibson, Greg Ganger, Majd Sakr
Jan 30, 2017, 15-719/18-847b Adv. Cloud Computing

Advanced Cloud Computing Programming Models
• Ref 1: MapReduce: simplified data processing on large clusters. Jeffrey Dean and Sanjay Ghemawat. OSDI’04. http://static.usenix.org/event/osdi04/tech/full_papers/dean/dean.pdf
• Ref 2: Spark: cluster computing with working sets. Matei Zaharia, Mosharaf Chowdhury, Michael Franklin, Scott Shenker, Ion Stoica. USENIX Hot Topics in Cloud Computing (HotCloud’10). http://www.cs.berkeley.edu/~matei/papers/2010/hotcloud_spark.pdf
• Optional
• Ref 3: DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language. Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Ulfar Erlingsson, Pradeep Kumar Gunda, Jon Currey. OSDI’08. http://research.microsoft.com/en-us/projects/dryadlinq/dryadlinq.pdf
• Ref 4: GraphLab: A New Parallel Framework for Machine Learning. Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein. Conf. on Uncertainty in Artificial Intelligence (UAI), 2010. http://www.select.cs.cmu.edu/publications/scripts/papers.cgi
• Optional
• Ref 5: TensorFlow: A system for large-scale machine learning. Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeff Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard. OSDI’16. https://www.usenix.org/system/files/conference/osdi16/osdi16-abadi.pdf
Recall the SaaS, PaaS, IaaS Taxonomy
• Service, Platform or Infrastructure as a Service
o SaaS: the service is a complete application (client-server computing)
• Not usually a programming abstraction
o PaaS: high-level (language) programming model for the cloud computer
• E.g., rapid prototyping languages
• Turing complete, but resource management hidden
o IaaS: low-level (language) computing model for the cloud computer
• E.g., assembler as a language
• Basic hardware model with all (virtual) resources exposed
• For PaaS and IaaS, cloud programming is needed
o How is this different from CS 101? Scale, fault tolerance, elasticity, ….
Embarrassingly parallel “killer app”: Web servers
• Online retail stores (like amazon.com for example)
o Most of the computational demand is for browsing product marketing, forming and rendering web pages, and managing customer session state
• Actual order taking and billing is less demanding, handled by separate specialized services (Amazon bookseller backend)
o One customer session needs a small fraction of one server
o No interaction between customers (unless inventory is near exhaustion)
• Parallelism is just more cores running identical copies of the web server
• Load balancing, maybe in the name service, is the parallel programming
o Elasticity needs template service, load monitoring, cluster allocation
o These need not require user programming, just configuration
E.g., Obama for America Elastic Load Balancer
What about larger apps?
• Parallel programming is hard – how can cloud frameworks help?
• Collection-oriented languages (Sipelstein & Blelloch, Proc. IEEE v79, n4, 1991)
o Also known as Data-parallel
o Specify a computation on an element; apply it to each element in a collection
• Analogy to SIMD: single instruction on multiple data
o Specify an operation on the collection as a whole
• Union/intersection, permute/sort, filter/select/map
• Reduce-reorderable (A) / reduce-ordered (B)
– (A) E.g., ADD(1,7,2) = (1+7)+2 = (2+1)+7 = 10
– (B) E.g., CONCAT(“the “, “lazy “, “fox “) = “the lazy fox “
• Note the link to MapReduce … it’s no accident
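The ADD/CONCAT distinction above can be sketched in a few lines of Python (a minimal illustration, not any framework's API): both reductions fold a collection with a binary operator, but only the reorderable one may be combined in any order, which is what lets a system parallelize it.

```python
from functools import reduce

def reduce_reorderable(values):
    # ADD: associative and commutative, so a system may combine
    # partial results in any order (and hence in parallel).
    return reduce(lambda a, b: a + b, values)

def reduce_ordered(parts):
    # CONCAT: associative but not commutative, so the combine
    # order of operands must be preserved.
    return reduce(lambda a, b: a + b, parts)

print(reduce_reorderable([1, 7, 2]))               # 10
print(reduce_ordered(["the ", "lazy ", "fox "]))   # "the lazy fox "
```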
High Performance Computing Approach
• HPC was almost the only home for parallel computing in the 90s
• Physical simulation was the killer app – weather, vehicle design, explosions/collisions, etc. – replace “wet labs” with “dry labs”
o Physics is the same everywhere, so define a mesh on a set of particles, code the physics you want to simulate at one mesh point as a property of the influence of nearby mesh points, and iterate
o Bulk Synchronous Processing (BSP): run all updates of mesh points in parallel based on values at the last time point, form the new set of values & repeat
• Defined “weak scaling” for bigger machines – rather than make a fixed problem go faster (strong scaling), make a bigger problem go the same speed
o Most demanding users set problem size to match total available memory
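The BSP update cycle can be sketched as follows (a toy 1-D mesh with a simple neighbor-averaging rule standing in for real physics; the function name and update rule are illustrative assumptions): every point is updated from its neighbors' values at the previous time step, and all workers implicitly barrier before the next superstep.

```python
def bsp_step(mesh):
    # One BSP superstep: compute all new values from the *old* mesh
    # (double buffering), never from partially updated neighbors.
    n = len(mesh)
    return [
        (mesh[max(i - 1, 0)] + mesh[i] + mesh[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]

mesh = [0.0, 0.0, 9.0, 0.0, 0.0]
for _ in range(3):        # three supersteps; heat diffuses outward
    mesh = bsp_step(mesh)
```

In a real code each worker owns a partition of the mesh and exchanges only boundary values with neighbors before the barrier.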
High Performance Computing Frameworks
• Machines cost O($10-100) million, so
o emphasis was on maximizing utilization of machines (congress checks)
o low-level speed and hardware specific optimizations (esp. network)
o preference for expert programmers following established best practices
• Developed the MPI (Message Passing Interface) framework (e.g., MPICH)
o Launch N threads with library routines for everything you need: • Naming, addressing, membership, messaging, synchronization (barriers) • Transforms, physics modules, math libraries, etc
o Resource allocators and schedulers space share jobs on physical cluster
o Fault tolerance by checkpoint/restart requiring programmer save/restore
o Proto-elasticity: kill an N-node job & reschedule a past checkpoint on M nodes
• Very manual, deep learning curve, few commercial runaway successes
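The programmer-driven checkpoint/restart pattern above can be sketched like this (a hypothetical iterative solver; the file name, checkpoint interval, and `run` helper are all illustrative assumptions, not any MPI API): the programmer explicitly saves state, and a restarted job resumes from the last checkpoint instead of iteration 0.

```python
import os
import pickle

CKPT = "solver.ckpt"   # hypothetical checkpoint file

def run(total_iters):
    # Restart path: resume from the last saved (iteration, state) pair.
    if os.path.exists(CKPT):
        with open(CKPT, "rb") as f:
            i, state = pickle.load(f)
    else:
        i, state = 0, 0.0              # cold start
    while i < total_iters:
        state += 1.0                   # stand-in for one simulation timestep
        i += 1
        if i % 10 == 0:                # periodic programmer-written checkpoint
            with open(CKPT, "wb") as f:
                pickle.dump((i, state), f)
    return state
```

Killing the job and re-running `run()` on a different node count loses at most the work since the last checkpoint, which is the "proto-elasticity" the slide describes.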
Broadening HPC: Grid Computing
• Grid Computing started with commodity servers (predates Cloud)
o 1989 concept of “killer micros” that would kill off supercomputers
• Frameworks were less specialized, easier to use (& less efficient)
o Beowulf, PVM (parallel virtual machine), Condor, Rocks, Sun Grid Engine
• For funding reasons, grid emphasized multi-institution sharing
o So authentication, authorization, single-signon, parallel-ftp
o Heterogeneous workflow (run job A on mach. B, then job C on mach. D)
• Basic model: jobs selected from a batch queue take over the cluster
• Simplified “pile of work”: when a core comes free, take a task from the run queue and run it to completion
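The "pile of work" model can be sketched with a shared queue and a pool of workers (a minimal thread-based illustration; the function names are assumptions, and a real grid scheduler dispatches whole processes to machines, not closures to threads): idle workers pull the next task and run it to completion, with no inter-task communication.

```python
from queue import Queue, Empty
from threading import Thread

def pile_of_work(tasks, n_workers=4):
    # Fill the shared run queue with independent tasks.
    q = Queue()
    for t in tasks:
        q.put(t)
    results = []
    def worker():
        while True:
            try:
                t = q.get_nowait()   # a free "core" grabs the next task
            except Empty:
                return               # queue drained: worker exits
            results.append(t())      # run the task to completion
    threads = [Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results

squares = pile_of_work([lambda i=i: i * i for i in range(10)])
```

Note there is no ordering guarantee across workers, which is exactly why this model only fits independent tasks.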
Cloud Programming, back to the future
• HPC demanded too much expertise, too many details and tuning
• Cloud frameworks are all about making parallel programming easier
o Willing to sacrifice efficiency (too willing perhaps)
o Willing to specialize to the application (rather than the machine)
• The canonical BigData user has data & processing needs that require lots of computer, but doesn’t have CS or HPC training & experience
o Wants to learn least amount of computer science to get results this week
o Might later want to learn more if same jobs become a personal bottleneck
2005 NIST Arabic-English Competition
[Chart: BLEU scores of the 2005 entrants, Google highest, then ISI, IBM+CMU, UMD, JHU+CU, Edinburgh, down to Systran, Mitre, and FSC; roughly 0.7 marks an expert human translator, 0.5-0.6 usable or human-editable translation, below 0.2 useless]
• 2005: Google wins!
o Qualitatively better than the next entry
o Not the most sophisticated approach – no one on the team knew Arabic
o Brute force statistics, but more data & compute!
– 200M words from UN translations, 1 billion words of Arabic docs
– 1000-processor cluster
o Can’t compete without big data

Cloud Programming Case Studies
• MapReduce
o Package two Sipelstein91 operators, filter/map and reduce, as the base of a data-parallel programming model built around Java libraries
• DryadLINQ
o Compile workflows of different data processing programs into schedulable processes
• Spark
o Work to keep partial results in memory, and declarative programming
• TensorFlow
o Specialize to iterative machine learning
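The MapReduce packaging of filter/map and reduce can be sketched in plain Python (an illustrative single-process version, not the distributed Java implementation from Ref 1): a map function emits (key, value) pairs, the framework groups them by key (the shuffle), and a reduce function folds each key's values.

```python
from collections import defaultdict

def map_fn(line):
    # Word count mapper: emit (word, 1) for each word in a line.
    return [(w, 1) for w in line.split()]

def reduce_fn(key, values):
    # Word count reducer: ADD is reorderable, so partial counts
    # could be combined in any order on any worker.
    return (key, sum(values))

def map_reduce(lines, map_fn, reduce_fn):
    groups = defaultdict(list)
    for line in lines:                 # map phase
        for k, v in map_fn(line):
            groups[k].append(v)        # shuffle: group values by key
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

counts = map_reduce(["the lazy fox", "the fox"], map_fn, reduce_fn)
# counts == {"the": 2, "lazy": 1, "fox": 2}
```

In the real framework the map and reduce phases run on different machines and the shuffle moves data between them; the programmer still writes only the two functions.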
MapReduce (Majd)
DryadLINQ
• Simplify efficient data parallel code
o Compiler support for imperative and declarative (e.g., database) operations
o Extends MapReduce to workflows that can be collectively optimized
• Data flows on edges between processes at vertices (workflows)
• Coding is processes at vertices and expressions representing the workflow
• The interesting part of the compiler operates on the expressions
o Inspired by traditional database query optimizations – rewrite the execution plan with equivalent plan that is expected to execute faster
• Data flowing through a graph abstraction
o Vertices are programs (possibly different with each vertex)
o Edges are data channels (pipe-like)
o Requires programs to have no side-effects (no changes to shared state)
o Apply function similar to MapReduce’s reduce – open-ended user code
• Compiler operates on expressions, rewriting execution sequences
o Exploits prior work on compiler for workflows on sets (LINQ)
o Extends traditional database query planning with less type-restrictive code
• Unlike traditional plans, virtualizes resources (so might spill to storage)
o Knows how to partition sets (hash, range and round robin) over nodes
o Doesn’t always know what processes do, so it has a less powerful optimizer than a database – where it can’t infer what is happening, it takes hints from users
o Can auto-pipeline, remove redundant partitioning, reorder partitionings, etc
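The plan-rewriting idea above can be illustrated with a toy optimizer (purely illustrative, not the DryadLINQ implementation; the plan representation and `optimize` function are assumptions): adjacent map stages are pipelined into one, and repartitioning on a key the data is already partitioned by is redundant and removed.

```python
def optimize(plan):
    # plan: list of ("map", fn) and ("partition", key) stages.
    out = []
    for op in plan:
        if op[0] == "map" and out and out[-1][0] == "map":
            # Auto-pipeline: fuse two map stages into one pass.
            f, g = out[-1][1], op[1]
            out[-1] = ("map", lambda x, f=f, g=g: g(f(x)))
        elif op[0] == "partition" and out and out[-1] == op:
            # Remove redundant partitioning on the same key.
            continue
        else:
            out.append(op)
    return out

plan = [("map", lambda x: x + 1),
        ("map", lambda x: x * 2),
        ("partition", "hash(key)"),
        ("partition", "hash(key)")]
opt = optimize(plan)   # one fused map stage, one partition stage
```

The rewritten plan is semantically equivalent but touches the data fewer times, which is the whole point of operating on expressions rather than opaque code.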
Spark: Optimize/generalize MR for iterative apps
• MapReduce uses disks for input, tmp, & output – data flows through files (disk)
• Want to use memory mostly
• Machine learning apps iterate over the same data to “solve” something
o Way too much use of disk when the data is not giant
• Spark is an MR rewrite: more general (Dryad-like graphs of work), more interactive (Scala interpreter) & more efficient (in-memory)
Spark Resilient Distributed Datasets (RDD)
• Spark programs are functional, deterministic => same input means same result
o This is the basis of selective re-execution and automated fault tolerance
• Spark is a set/collection (called an RDD) oriented system
o Splits a set into partitions, and assigns partitions to workers to parallelize the operation
• Store invocation (code & args) with inputs as a closure
o Treat this as a “future” – compute now or later at the system’s choice (lazy)
o If code & inputs are already at node X, “args” is faster to send than results
• Futures can be used as compression on the wire & in replica nodes
Spark Resilient Distributed Datasets (RDD), cont.
• Many operators are builtins (well-known properties, like Dryad)
o Spark automates transforms when pipelining multiple builtin operations
• Spark is lazy – only specific operators force computation
o E.g., materialize in the file system
o Build programs interactively, computing only when and what user needs
o Lineage is the chain of invocations: future on future … delayed compute
• Replication/FT: ship & cache RDDs on other nodes
o Can recompute everything there if needed, but mostly don’t
o Save space in memory on replicas and network bandwidth
o Need entire lineage to be replicated in non-overlapping fault domains
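The laziness-plus-lineage idea can be sketched in a few lines (a minimal illustration, not Spark's actual API; the `RDD` class here is an assumption): each transformation records a closure over its parent, nothing runs until an action forces the chain, and a lost partition can be recomputed by replaying its lineage.

```python
class RDD:
    def __init__(self, compute, parent=None):
        self.compute = compute   # closure: how to (re)build this dataset
        self.parent = parent     # lineage pointer for recomputation

    def map(self, f):
        # Transformation: returns a new lazy RDD, computes nothing yet.
        return RDD(lambda: [f(x) for x in self.compute()], parent=self)

    def filter(self, p):
        return RDD(lambda: [x for x in self.compute() if p(x)], parent=self)

    def collect(self):
        # Action: forces the whole lazy chain to materialize.
        return self.compute()

base = RDD(lambda: list(range(6)))
rdd = base.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
# Nothing has been computed yet; collect() walks the lineage.
print(rdd.collect())   # [0, 4, 16]
```

Shipping the small closures (`compute` plus the lineage chain) to a replica node is much cheaper than shipping materialized results, which is the "futures as compression" point above.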
Spark “combining Python functions” example
• rdd_x.map(foo).map(bar)
• Function foo() takes in a record x and outputs a record y
• Function bar() takes in a record y and outputs a record z
• Spark automatically creates a function foo_bar() that takes in a record x and outputs a record z
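The fused function can be sketched as plain function composition (the bodies of `foo` and `bar` are stand-ins; only the composition pattern is the point): rather than materializing the intermediate records y, the two functions are composed and applied in a single pass over the partition.

```python
def foo(x):
    return x + 1        # stand-in: record x -> record y

def bar(y):
    return y * 10       # stand-in: record y -> record z

def compose(f, g):
    # What the framework builds as "foo_bar": x -> g(f(x)).
    return lambda x: g(f(x))

foo_bar = compose(foo, bar)
print([foo_bar(x) for x in [1, 2, 3]])   # [20, 30, 40]
```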
Next day plan
• Encapsulation and virtual machines
o Guest lecturer, Michael Kozuch, Intel Labs
Programming Models: MapReduce
15-719/18-847b Advanced Cloud Computing, Spring 2017
Majd Sakr, Garth Gibson, Greg Ganger
January 30, 2017
Motivation
• How do you perform batch processing of large data sets using low-cost clusters of thousands of commodity machines, which frequently experience partial failures or slowdowns?

Batch Processing of Large Datasets
• Challenges
– Parallel programming
– Job orchestration
– Scheduling
– Load balancing
– Communication
– Fault tolerance
– Performance
– …

Google MapReduce
• Data-parallel framework for processing Big Data on large commodity hardware
• Transparently tolerates
– Data faults
– Computation faults
• Achieves
– Scalability and fault tolerance

Commodity Clusters