Harnessing Grid Resources with Data-Centric Task Farms 3 2) Tackle Increasingly Complex Problems

HarnessingHarnessing GridGrid ResourcesResources withwith DataData--CentricCentric TaskTask FarmsFarms Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago Collaborators: Ian Foster (UC/CI/ANL), Yong Zhao (MS), Mike Wilde (CI/ANL), Zhao Zhang (CI), Rick Stevens (UC/CI/ANL), Alex Szalay (JHU), Jerry Yan (NASA/AMES), Catalin Dumitrescu (FANL) Hyde Park Global Investments LLC Interview Talk April 18th, 2008 What will we do with 1+ Exaflops and 1M+ cores? 1) Tackle Bigger and Bigger Problems Computational Scientist as Hero 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 3 2) Tackle Increasingly Complex Problems Computational Scientist as Logistics Officer 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 4 “More Complex Problems” • Use ensemble runs to quantify climate model uncertainty • Identify potential drug targets by screening a database of ligand structures against target proteins • Study economic model sensitivity to key parameters • Analyze turbulence dataset from multiple perspectives • Perform numerical optimization to determine optimal resource assignment in energy problems 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 5 Programming Model Issues • Multicore processors • Massive task parallelism • Massive data parallelism • Integrating black box applications • Complex task dependencies (task graphs) • Failure, and other execution management issues • Data management: input, intermediate, output • Dynamic task graphs • Dynamic data access involving large amounts of data • Documenting provenance of data products 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 6 Problem Types Input Hi data Data Big data and size analysis, mining many tasks Med Heroic MPI Many loosely coupled tasks Lo tasks 1 1K 1M Number of tasks 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 7 Motivating Example: AstroPortal Stacking Service + + • Purpose + – On-demand “stacks” of + + random locations within + ~10TB dataset + • Challenge = – Rapid access to 10-10K “random” files S4 Sloan Data – Time-varying load Web page or Web • Solution Service – Dynamic acquisition of compute, storage 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 8 Challenge #1: Long Queue Times • Wait queue times are typically longer than the job duration times SDSC DataStar 1024 Processor Cluster 2004 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms9 Challenge #2: Slow Job Dispatch Rates • Production LRMs Æ ~1 job/sec dispatch rates Medium Size Grid Site (1K processors) 100% • What job durations are 90% 80% needed for 90% efficiency: 70% 60% 50% – Production LRMs: 900 sec Efficiency 40% 30% 20% – Development LRMs: 100 sec 10% 0% 0.001 0.01 0.1 1 10 100 1000 10000 100000 – Experimental LRMs: 50 sec Task Length (sec) 1 task/sec (i.e. PBS, Condor 6.8) 10 tasks/sec (i.e. Condor 6.9.2) 100 tasks/sec 500 tasks/sec (i.e. Falkon) 1K tasks/sec 10K tasks/sec – 1~10 sec should be possible 100K tasks/sec 1M tasks/sec Throughput System Comments (tasks/sec) Condor (v6.7.2) - Production Dual Xeon 2.4GHz, 4GB 0.49 PBS (v2.1.8) - Production Dual Xeon 2.4GHz, 4GB 0.45 Condor (v6.7.2) - Production Quad Xeon 3 GHz, 4GB 2 Condor (v6.8.2) - Production 0.42 4/18/2008Condor ( Harnessingv6.9.3) - Development Grid Resources with Data-Centric Task Farms11 10 Condor-J2 - Experimental Quad Xeon 3 GHz, 4GB 22 Challenge #3: Poor Scalability of Shared File Systems • GPFS vs. LOCAL 1000000 GPFS R – Read Throughput LOCAL R GPFS R+W LOCAL R+W • 1 node: 0.48Gb/s vs. 1.03Gb/s Î 2.15x 100000 • 160 nodes: 3.4Gb/s vs. 165Gb/s Î 48x – Read+Write Throughput: 10000 • 1 node: 0.2Gb/s vs. 0.39Gb/s Î 1.95x 1000 • 160 nodes: 1.1Gb/s vs. 62Gb/s Î 55x Throughput (Mb/s) – Metadata (mkdir / rm -rf) 100 • 1 node: 151/sec vs. 199/sec Î 1.3x 1 10 100 1000 Number of Nodes • 160 nodes: 21/sec vs. 31840/sec Î 1516x 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms11 Hypothesis “Significant performance improvements can be obtained in the analysis of large dataset by leveraging information about data analysis workloads rather than individual data analysis tasks.” • Important concepts related to the hypothesis – Workload: a complex query (or set of queries) decomposable into simpler tasks to answer broader analysis questions – Data locality is crucial to the efficient use of large scale distributed systems for scientific and data-intensive applications – Allocate computational and caching storage resources, co-scheduled to optimize workload performance 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 12 Proposed Solution: Part 1 Abstract Model and Validation • AMDASK: – An Abstract Model for DAta-centric taSK farms • Task Farm: A common parallel pattern that drives independent computational tasks – Models the efficiency of data analysis workloads for the split/merge class of applications – Captures the following data diffusion properties • Resources are acquired in response to demand • Data and applications diffuse from archival storage to new resources • Resource “caching” allows faster responses to subsequent requests • Resources are released when demand drops • Considers both data and computations to optimize performance • Model Validation – Implement the abstract model in a discrete event simulation – Validate model with statistical methods (R2 Statistic, Residual Analysis) 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 13 Proposed Solution: Part 2 Practical Realization • Falkon: a Fast and Light-weight tasK executiON framework – Light-weight task dispatch mechanism – Dynamic resource provisioning to acquire and release resources – Data management capabilities including data-aware scheduling – Integration into Swift to leverage many Swift-based applications • Applications cover many domains: astronomy, astro-physics, medicine, chemistry, and economics 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 14 AMDASK: Performance Efficiency Model • B: Average Task Execution Time: – K: Stream of tasks 1 Β = ∑ µ(κ) – µ(k): Task k execution time | Κ | k∈Κ • Y: Average Task Execution Time with Overheads: ⎧ 1 – ο(k): Dispatch overhead [µ(κ) + o(κ)], δ ∈φ(τ ),δ ∈Ω ⎪ | Κ | ∑ – ς(δ,τ): Time to get data Y = κ∈Κ ⎨ 1 ⎪ [µ(κ) + o(κ) + ζ (δ ,τ )], δ ∉φ(τ ),δ ∈Ω ⎪| Κ | ∑ • V: Workload Execution Time: ⎩ κ∈Κ – A: Arrival rate of tasks ⎛ B 1 ⎞ V = max⎜ , ⎟*| Κ | – T: Transient Resources ⎝ | Τ | Α ⎠ • W: Workload Execution Time with Overheads ⎛ Υ 1 ⎞ W = max⎜ , ⎟*| Κ | 4/18/2008 Harnessing Grid Resources with Data-Centric⎝ | Τ |TaskΑ ⎠Farms 17 AMDASK: Performance Efficiency Model • Efficiency ⎧ Y 1 1, ≤ V ⎪ |T | A Ε = E = ⎨ ⎛ B | Τ | ⎞ Y 1 W ⎪max⎜ , ⎟, > ⎪ ⎝ Y Α*Y ⎠ |T | A • Speedup ⎩ S = E*| T | • Optimizing Efficiency – Easy to maximize either efficiency or speedup independently – Harder to maximize both at the same time • Find the smallest number of transient resources |T| while maximizing 4/18/2008speedup*efficiency Harnessing Grid Resources with Data-Centric Task Farms 18 Performance Efficiency Model Example: 1K CPU Cluster • Application: Angle - distributed data mining • Testbed Characteristics: – Computational Resources: 1024 – Transient Resource Bandwidth: 10MB/sec – Persistent Store Bandwidth: 426MB/sec • Workload: – Number of Tasks: 128K – Arrival rate: 1000/sec – Average task execution time: 60 sec – Data Object Size: 40MB 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 19 Performance Efficiency Model Example: 1K CPU Cluster Falkon on ANL/UC TG Site: PBS on ANL/UC TG Site: Peak Dispatch Throughput: 500/sec Peak Dispatch Throughput: 1/sec Scalability: 50~500 CPUs Scalability: <50 CPUs Peak speedup: 623x Peak speedup: 54x 100% 1000 100% 1000 90% 90% 80% 80% 70% 70% 100 100 60% 60% 50% 50% Speedup Speedup Efficiency 40% Efficiency 40% 10 10 30% 30% 20% Efficiency 20% Efficiency 10% Speedup 10% Speedup Speedup*Efficiency Speedup*Efficiency 0% 1 0% 1 12481632641282565121024 1 2 4 8 16 32 64 128 256 512 1024 Number of Processors Number of Processors 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 20 Model Validation: Simulations • Implement the abstract model in a discrete event simulation • Simulation parameters – number of storage and computational resources – communication costs – management overhead – workloads (inter-arrival rates, query complexity, data set properties, and data locality) • Model Validation –R2 Statistic – Residual analysis 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 23 Falkon: a Fast and Light-weight tasK executiON framework • Goal: enable the rapid and efficient execution of many independent jobs on large compute clusters • Combines three components: –a streamlined task dispatcher able to achieve order-of- magnitude higher task dispatch rates than conventional schedulers Î Challenge #1 – resource provisioning through multi-level scheduling techniques Î Challenge #2 – data diffusion and data-aware scheduling to leverage the co-located computational and storage resources Î Challenge #3 4/18/2008 Harnessing Grid Resources with Data-Centric Task Farms 24 Falkon: The Streamlined Task Dispatcher Clients • Tier 1: Dispatcher – GT4 Web Service accepting WS task submissions from clients and sending them to available Provisioner Compute executors Resources Executor 1 • Tier 2: Executor Dispatcher – Run tasks on local resources WS • Provisioner Executor n – Static and dynamic resource Compute Resource 1 provisioning Compute Resource m 4/18/2008 Harnessing Grid Resources

Harnessing Grid Resources with Data-Centric Task Farms 3 2) Tackle Increasingly Complex Problems

Astrophysics with Terabytes

Things You Need to Know About Big Data

Gerard Lemson Alex Szalay, Mike Rippin DIBBS/Sciserver Collaborative Data-Driven Science

Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets Year 1 Progress Report & Year 2 Proposal

Warmth Elevating the Depths: Shallower Voids with Warm Dark Matter

Large Scale Structures of the Universe Proceedings of the 130Th Symposium of the International Astronomical Union, Dedicated to the Memory of Marc A

2016 – Aug – 09 CURRICULUM VITAE Mike Specian Contact

Migrating a Multiterabyte Archive from Object to Relational Databases

Extreme Data-Intensive Scientific Computing on Gpus

The World-Wide Telescope, an Archetype for Online Science

Mark C. Neyrinck University of the Basque Country Curriculum Vitæ Mobile Ph: +34 663 490 872 (Spain) 48080 Bilbao, Spain +1 808 232 7263 (US)

The Importance of Data Locality in Distributed Computing Applications Alex Szalay, Julian Bunn, Jim Gray, Ian Foster, Ioan Raicu