Programming Models and Frameworks: Advanced Cloud Computing

Programming Models and Frameworks
Advanced Cloud Computing 15-719/18-847b
Garth Gibson, Greg Ganger, Majd Sakr
Jan 30, 2017

Advanced Cloud Computing Programming Models
• Ref 1: MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. OSDI'04, 2004. http://static.usenix.org/event/osdi04/tech/full_papers/dean/dean.pdf
• Ref 2: Spark: Cluster Computing with Working Sets. Matei Zaharia, Mosharaf Chowdhury, Michael Franklin, Scott Shenker, Ion Stoica. USENIX Hot Topics in Cloud Computing (HotCloud'10). http://www.cs.berkeley.edu/~matei/papers/2010/hotcloud_spark.pdf

Advanced Cloud Computing Programming Models (optional)
• Ref 3: DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language. Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Ulfar Erlingsson, Pradeep Kumar Gunda, Jon Currey. OSDI'08. http://research.microsoft.com/en-us/projects/dryadlinq/dryadlinq.pdf
• Ref 4: GraphLab: A New Parallel Framework for Machine Learning. Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein. Conf. on Uncertainty in Artificial Intelligence (UAI), 2010. http://www.select.cs.cmu.edu/publications/scripts/papers.cgi
• Ref 5: TensorFlow: A System for Large-Scale Machine Learning. Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeff Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard. OSDI'16. https://www.usenix.org/system/files/conference/osdi16/osdi16-abadi.pdf

Recall the SaaS, PaaS, IaaS Taxonomy
• Service, Platform or Infrastructure as a Service
  o SaaS: the service is a complete application (client-server computing)
    • Not usually a programming abstraction
  o PaaS: high-level (language) programming model for the cloud computer
    • E.g., rapid-prototyping languages
    • Turing complete, but resource management is hidden
  o IaaS: low-level (language) computing model for the cloud computer
    • E.g., assembler as a language
    • Basic hardware model with all (virtual) resources exposed
• For PaaS and IaaS, cloud programming is needed
  o How is this different from CS 101? Scale, fault tolerance, elasticity, ...

Embarrassingly Parallel "Killer App": Web Servers
• Online retail stores (like amazon.com, for example)
  o Most of the computational demand is for browsing product marketing, forming and rendering web pages, and managing customer session state
    • Actual order taking and billing is not as demanding, and has separate specialized services (the Amazon bookseller backend)
  o One customer session needs a small fraction of one server
  o No interaction between customers (unless inventory is near exhaustion)
• Parallelism is more cores running identical copies of the web server
• Load balancing, perhaps in the name service, is the parallel programming (a round-robin sketch appears below)
  o Elasticity needs a template service, load monitoring, and cluster allocation
  o These need not require user programming, just configuration

E.g., Obama for America Elastic Load Balancer

What About Larger Apps?
• Parallel programming is hard; how can cloud frameworks help?
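To make the web-server slide's "load balancing is the parallel programming" point concrete, here is a minimal Python sketch, ours rather than the course's, of round-robin session routing over identical server replicas; the replica addresses and the route_session helper are hypothetical.

    import itertools

    # Hypothetical pool of identical web-server replicas; in a real
    # deployment this list would come from the name service or a
    # cluster manager rather than being hard-coded.
    REPLICAS = ["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"]

    # Round-robin cycle: each new customer session is pinned to the next
    # replica. Because sessions do not interact, no further coordination
    # between servers is needed.
    _next_replica = itertools.cycle(REPLICAS)

    def route_session(session_id: str) -> str:
        """Assign a customer session to a server replica (round robin)."""
        server = next(_next_replica)
        print(f"session {session_id} -> {server}")
        return server

    if __name__ == "__main__":
        for sid in ("alice", "bob", "carol", "dave"):
            route_session(sid)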
• Collection-oriented languages (Sipelstein & Blelloch, Proc. IEEE v79, n4, 1991)
  o Also known as data-parallel
  o Specify a computation on an element; apply it to each element in a collection
    • Analogy to SIMD: single instruction on multiple data
  o Specify an operation on the collection as a whole
    • Union/intersection, permute/sort, filter/select/map
    • Reduce-reorderable (A) / reduce-ordered (B)
      (A) E.g., ADD(1,7,2) = (1+7)+2 = (2+1)+7 = 10
      (B) E.g., CONCAT("the ", "lazy ", "fox ") = "the lazy fox "
  o Note the link to MapReduce ... it's no accident (see the data-parallel sketch after this slide group)

High Performance Computing Approach
• HPC was almost the only home for parallel computing in the 90s
• Physical simulation was the killer app: weather, vehicle design, explosions/collisions, etc., replacing "wet labs" with "dry labs"
  o Physics is the same everywhere, so define a mesh on a set of particles, code the physics you want to simulate at one mesh point as a property of the influence of nearby mesh points, and iterate
  o Bulk Synchronous Processing (BSP): run all updates of mesh points in parallel based on the values at the last time step, form the new set of values, and repeat (see the BSP sketch below)
• Defined "weak scaling" for bigger machines: rather than making a fixed-size problem go faster (strong scaling), make a bigger problem go the same speed
  o The most demanding users set the problem size to match the total available memory

High Performance Computing Frameworks
• Machines cost O($10-100) million, so
  o emphasis was on maximizing utilization of the machines (Congress checks)
  o low-level speed and hardware-specific optimizations (esp. the network)
  o preference for expert programmers following established best practices
• Developed the MPI (Message Passing Interface) framework (e.g., MPICH)
  o Launch N threads with library routines for everything you need:
    • Naming, addressing, membership, messaging, synchronization (barriers)
    • Transforms, physics modules, math libraries, etc.
  o Resource allocators and schedulers space-share jobs on the physical cluster
  o Fault tolerance by checkpoint/restart, requiring programmer save/restore
  o Proto-elasticity: kill an N-node job and reschedule a past checkpoint on M nodes
• Very manual, deep learning curve, few commercial runaway successes (a minimal MPI sketch follows below)

Broadening HPC: Grid Computing
• Grid computing started with commodity servers (it predates cloud)
  o 1989 concept of "killer micros" that would kill off supercomputers
• Frameworks were less specialized, easier to use (and less efficient)
  o Beowulf, PVM (Parallel Virtual Machine), Condor, Rocks, Sun Grid Engine
• For funding reasons, grid emphasized multi-institution sharing
  o So authentication, authorization, single sign-on, parallel FTP
  o Heterogeneous workflow (run job A on machine B, then job C on machine D)
• Basic model: jobs selected from a batch queue take over the cluster
• Simplified "pile of work": when a core comes free, take a task from the run queue and run it to completion (see the work-queue sketch below)
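A minimal Python sketch, not from the slides, of the collection-oriented operators above: an elementwise map, a whole-collection filter, and the two reduce flavors. ADD is reorderable because addition is commutative and associative; CONCAT must respect element order.

    from functools import reduce

    data = [1, 7, 2]

    # Elementwise operation applied to each member of the collection
    # (the SIMD analogy: one operation, many data).
    squares = list(map(lambda x: x * x, data))          # [1, 49, 4]

    # Whole-collection operation: filter/select.
    evens = list(filter(lambda x: x % 2 == 0, data))    # [2]

    # (A) Reduce-reorderable: ADD may combine elements in any order.
    total = reduce(lambda a, b: a + b, data)            # (1+7)+2 == (2+1)+7 == 10

    # (B) Reduce-ordered: CONCAT must preserve element order.
    sentence = reduce(lambda a, b: a + b, ["the ", "lazy ", "fox "])

    print(squares, evens, total, repr(sentence))        # ... 10 'the lazy fox '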
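The BSP mesh-point update pattern can be sketched in a few lines. This is our illustration, not course code: a 1-D heat-diffusion stencil where every point's new value depends only on the previous superstep's neighbor values, so all updates within a superstep could run in parallel; the mesh size, coefficients, and step count are arbitrary choices.

    # 1-D mesh with fixed boundary values; each superstep computes all
    # new values from the previous time step only (bulk synchronous).
    mesh = [0.0] * 10
    mesh[0], mesh[-1] = 100.0, 100.0   # boundary conditions

    for step in range(50):             # each iteration is one BSP superstep
        new_mesh = mesh[:]             # updates read only old values
        for i in range(1, len(mesh) - 1):
            # "physics at one mesh point as the influence of nearby points"
            new_mesh[i] = 0.5 * mesh[i] + 0.25 * (mesh[i - 1] + mesh[i + 1])
        mesh = new_mesh                # barrier: all new values installed at once

    print([round(v, 1) for v in mesh])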
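MPI itself is a C/Fortran API; as a minimal illustration of the "launch N ranks with library routines for naming, messaging, and barriers" model, here is a sketch assuming the mpi4py Python binding, which the slides do not mention.

    # Run with, e.g.: mpiexec -n 4 python mpi_sketch.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()      # naming/addressing: which of the N ranks am I?
    size = comm.Get_size()      # membership: how many ranks were launched?

    # Messaging: rank 0 scatters one work item to every rank.
    work = list(range(size)) if rank == 0 else None
    item = comm.scatter(work, root=0)

    partial = item * item       # each rank computes on its own piece

    # Synchronization: barrier before the collective reduction.
    comm.Barrier()
    total = comm.reduce(partial, op=MPI.SUM, root=0)

    if rank == 0:
        print(f"sum of squares over {size} ranks = {total}")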
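The grid "pile of work" model reduces to a shared task queue. A standard-library sketch of our own, with an arbitrary worker count and task list, where threads stand in for cores:

    import queue
    import threading

    tasks = queue.Queue()
    for t in range(20):
        tasks.put(t)                     # the "pile of work"

    def worker(core_id: int) -> None:
        # When a core comes free, take a task from the run queue
        # and run it to completion.
        while True:
            try:
                task = tasks.get_nowait()
            except queue.Empty:
                return                   # pile exhausted; core goes idle
            result = task * task         # stand-in for a real computation
            print(f"core {core_id} finished task {task} -> {result}")

    threads = [threading.Thread(target=worker, args=(c,)) for c in range(4)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()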
Cloud Programming, Back to the Future
• HPC demanded too much expertise, too many details and too much tuning
• Cloud frameworks are all about making parallel programming easier
  o Willing to sacrifice efficiency (too willing, perhaps)
  o Willing to specialize to the application (rather than to the machine)
• The canonical BigData user has data and processing needs that require lots of computer, but doesn't have CS or HPC training and experience
  o Wants to learn the least amount of computer science to get results this week
  o Might later want to learn more if the same jobs become a personal bottleneck

2005 NIST Arabic-English Competition
• Task: translate 100 articles
• 2005: Google wins!
  o Qualitatively better first entry
  o Not the most sophisticated approach; no one on the team knew Arabic
  o Brute-force statistics, but more data and compute!!
    • 200M words from UN translations, 1 billion words of Arabic docs
    • 1000-processor cluster
  o Can't compete without big data
[Figure: BLEU scores (0.0-0.7) of the competition entrants (Google, ISI, IBM+CMU, UMD, JHU+CU, Edinburgh, Systran, Mitre, FSC), annotated with quality thresholds from "useless" (0.2) through "topic identification" (0.4), "human-editable translation" (0.5), and "usable translation" (0.6), up to "expert human translator" (0.7)]

Cloud Programming Case Studies
• MapReduce
  o Packages two Sipelstein91 operators, filter/map and reduce, as the base of a data-parallel programming model built around Java libraries (see the word-count sketch below)
• DryadLINQ
  o Compiles workflows of different data-processing programs into schedulable processes
• Spark
  o Works to keep partial results in memory, and adds declarative programming
• TensorFlow
  o Specializes to iterative machine learning

MapReduce (Majd)

DryadLINQ
• Simplify efficient data-parallel code
  o Compiler support for imperative and declarative (e.g., database) operations
  o Extends MapReduce to workflows that can be collectively optimized
• Data flows on edges between processes at vertices (workflows)
• Coding is processes at vertices plus expressions representing the workflow
• The interesting part of the compiler operates on the expressions
  o Inspired by traditional database query optimization: rewrite the execution plan into an equivalent plan that is expected to execute faster

DryadLINQ
• Data flowing through a graph abstraction
  o Vertices are programs (possibly different at each vertex)
  o Edges are data channels (pipe-like)
  o Requires programs to have no side effects (no changes to shared state)
  o Apply function is similar to MapReduce's reduce: open-ended user code
• The compiler operates on expressions, rewriting execution sequences
  o Exploits prior work on compiling workflows on sets (LINQ); a lazy-pipeline sketch follows below
  o Extends traditional database query planning with less type-restrictive code
    • Unlike traditional plans, it virtualizes resources (so it might spill to storage)
  o Knows how to partition sets (hash, range and round-robin) over nodes
  o Doesn't always know what processes do, so its optimizer is less powerful than a database's; where it can't infer what is happening, it takes hints from users
  o Can auto-pipeline, remove ...
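To ground the MapReduce case study, a single-process sketch of our own (not Dean and Ghemawat's code) of the map and reduce phases for word counting; a real run would partition the map calls and the per-key reduce calls across the cluster.

    from collections import defaultdict
    from itertools import chain

    def map_phase(doc: str):
        # map: emit an intermediate (key, value) pair per word
        for word in doc.split():
            yield (word, 1)

    def reduce_phase(pairs):
        # shuffle: group intermediate values by key
        groups = defaultdict(list)
        for key, value in pairs:
            groups[key].append(value)
        # reduce: combine each key's values (reorderable, like ADD)
        return {key: sum(values) for key, values in groups.items()}

    docs = ["the lazy fox", "the quick fox"]
    counts = reduce_phase(chain.from_iterable(map_phase(d) for d in docs))
    print(counts)   # {'the': 2, 'lazy': 1, 'fox': 2, 'quick': 1}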
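DryadLINQ itself compiles C# LINQ expressions; as a loose, language-neutral analogy of our own, Python generators give the same lazy, side-effect-free pipeline shape that an optimizer could rewrite before anything executes.

    from itertools import islice

    # Build the query as a lazy expression; nothing executes yet, which is
    # what lets a DryadLINQ-style optimizer rewrite the plan before running it.
    records = range(1_000_000)
    filtered = (r for r in records if r % 3 == 0)   # filter/select vertex
    mapped = (r * r for r in filtered)              # map vertex (no side effects)

    # Only now does data flow through the pipe-like edges of the graph,
    # and only as much of it as the consumer demands.
    print(sum(islice(mapped, 5)))                   # 0+9+36+81+144 = 270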