The OSCAR Solution Stack for Cluster Computing the OSCAR Solution
Total Page:16
File Type:pdf, Size:1020Kb
The OSCAR Solution Stack for Cluster Computing The OSCAR Solution Stack for Cluster Computing Tim Mattson Intel Corp Computational Software labs U.S.A. OSCAR [email protected] Skip my intro Agenda l Cluster Computing Software Stacks – What they are and why we desperately need a common base software stack. l OSCAR and the Open Cluster Group l OSCAR Today l OSCAR Tomorrow l Cluster Computing Standards The OSCAR Solution Stack for Cluster Computing1 The OSCAR Solution Stack for Cluster Computing Intel wants to get rich with clusters The role of Independent Software Vendors l Intel likes cluster since they so many high-end processors per system. l … but people don’t buy computers, they buy problem solutions (I.e. applications+computers to run them). l Hence if we want a healthy HPC industry, we need a rich set of cluster-enabled applications. TheThe academic/nationalacademic/national lablab sharesshares thisthis needneed withwith industry.industry. IndustrialIndustrial HPCHPC useuse translatestranslates intointo betterbetter productsproducts andand moremore fundingfunding So where is the cluster software? l Except for a few CFD# codes, crash codes, a handful of bioinformatics and chemistry codes, and a smattering of other codes, there aren’t many ISV* supported cluster enabled codes. l Most ISV’s have ignored parallel computing. l Why is this the case? *ISV = Independent Software Vendor #CFD = Computational Fluid Dynamics The OSCAR Solution Stack for Cluster Computing2 The OSCAR Solution Stack for Cluster Computing In the late 80’s, we thought portable programming environments were the solution. So computer scientists created a few… ABCPL CORRELATE GLU Mentat Parafrase2 ACE CPS GUARD Legion Paralation pC++ ACT++ CRL HAsL. Meta Chaos Parallel-C++ SCHEDULE SciTL Active messages CSP Haskell Midway Parallaxis POET And the Adl Cthreads HPC++ Millipede ParC SDDA. Adsmith CUMULVS JAVAR. CparPar ParLib++ ADDAP DAGGER HORUS Mirage ParLin SHMEM SIMPLE AFAPI DAPPLE HPC MpC Parmacs ISV’s ALWAN Data Parallel C IMPACT MOSIX Parti Sina AM DC++ ISIS. Modula-P pC SISAL. distributed smalltalk AMDC DCE++ JAVAR Modula-2* pC++ laughed AppLeS DDD JADE Multipol PCN SMI. SONiC Amoeba DICE. Java RMI MPI PCP: ARTS DIPC javaPG MPC++ PH Split-C. Athapascan -0b DOLIB JavaSpace Munin PEACE SR at us. Sthreads Aurora DOME JIDL Nano-Threads PCU Automap DOSMOS. Joyce NESL PET Strand. SUIF. bb_threads DRL Khoros NetClasses++ PETSc Blaze DSM-Threads Karma Nexus PENNY Synergy Telegrphos BSP Ease . KOAN/Fortran-S Nimrod Phosphorus “Too BlockComm ECO LAM NOW POET. SuperPascal C*. Eiffel Lilac Objective Linda Polaris TCGMSG. Threads.h++. "C* in C Eilean Linda Occam POOMA C** Emerald JADA Omega POOL-T TreadMarks many” TRAPPER CarlOS EPL WWWinda OpenMP PRESTO Cashmere Excalibur ISETL-Linda Orca P-RIO uC++ C4 Express ParLin OOF90 Prospero UNITY options UC CC++ Falcon Eilean P++ Proteus Chu Filaments P4-Linda P3L QPC++ V ViC * Charlotte FM Glenda p4-Linda PVM Charm FLASH POSYBL Pablo PSI Visifold V-NUS don’t VPE Charm++ The FORCE Objective-Linda PADE PSDM Cid Fork LiPS PADRE Quake Win32 threads Cilk Fortran-M Locust Panda Quark WinPar help. WWWinda CM-Fortran FX Lparx Papers Quick Threads Converse GA Lucid AFAPI. Sage++ XENOOPS XPC Code GAMMA Maisie Para++ SCANDAL COOL Glenda Manifold Paradigm SAM Zounds ZPL In the mid-90’s, standard API’s were the solution. So we finally agreed on a small set of API’s… l Thread Libraries – Win32 API – POSIX threads. l Compiler Directives – OpenMP - portable shared memory parallelism. l Message Passing Libraries – MPI - message passing WeWe pickedpicked upup aa fewfew moremore ISV ISV’’ss,, butbut mostmost ignoredignored us.us. The OSCAR Solution Stack for Cluster Computing3 The OSCAR Solution Stack for Cluster Computing What’s missing? A standard API isn’t enough … ISV’s need a standard platform to build their business upon. … and the platform must be easy for the general user to use. Cluster Computing Platforms l A Platform is the API’s, tools, interfaces, and everything else required to install, maintain and use a computing system. It includes: •Sys admin tools •Batch queue •Installers •Scheduler •“Parallel unix tools” •Performance monitoring •Libraries •Debugging The OSCAR Solution Stack for Cluster Computing4 The OSCAR Solution Stack for Cluster Computing So we all ran off and created cluster platforms Beowulf “how to” Linux Networx Extreme Linux Alinka NPACI-Rocks Score SE … and countless proprietary and home-brewed platforms. Are we at risk of scaring away the ISV’s (just as we did with API-glut in the early 90’s)? *Brands and names are property of their respective owners. My hope is … l To see a small number (on the order of two) common base platforms emerge. uLook very similar to end-users. u“Vendors” distinguish themselves with “value added instantiations”, not new interfaces. l To try and make this happen, a group of us came together as “The Open Cluster Group”. The OSCAR Solution Stack for Cluster Computing5 The OSCAR Solution Stack for Cluster Computing Agenda l Cluster Computing Software Stacks – What they are and why we desperately need a common base software stack. l OSCAR and the Open Cluster Group l OSCAR Today l OSCAR Tomorrow l Cluster Computing Standards Cluster Software stacks The Open Cluster Group l We are an open group dedicated to open source solutions for cluster computing. uDell, IBM, Intel, LLNL, MSC.software, NCSA, ORNL, SGI, Indiana University and others. l Our first project is OSCAR uOpen Source Cluster Application Resources FollowFollow ourour progressprogress at:at: http://www.http://www.openclustergroupopenclustergroup.org.org *Brands and names are property of their respective owners. The OSCAR Solution Stack for Cluster Computing6 The OSCAR Solution Stack for Cluster Computing Cluster Software stacks OSCAR top Level Strategy • OSCAR is a snap-shot of best-known- methods for building, programming and using clusters. • OSCAR is NOT a standard! • It will bring uniformity to clusters, foster commercial versions of OSCAR, and make clusters more broadly acceptable. Commercially supported Open Source value added OSCAR on OSCAR with instantiations of other OS’s? Linux OSCAR *Brands and names are property of their respective owners. Introduction to OSCAR OSCAR Milestones OSCAROSCAR 1.21.2 DevelopersDevelopers releaserelease (Feb(Feb’02)’02) GeneralGeneral PublicPublic release release OSCAR 1.1 AnnouncementAnnouncement OSCAR 1.0 OSCAR 1.1 OSCAR 1.0 RedHat 7.1 andand demo demo release release (Mar’01) RedHat 7.1 (Mar’01) support OSCAROSCAR 0.80.8 atat support (August’01) SCSC’00’00 (Nov(Nov’’00)00) (August’01) DevelopersDevelopers Kick-off Kick-off releaserelease OSCAROSCAR 22 meeting at OSCAR 2 meeting at OSCAROSCAR 1.01.0 architecturearchitecture OSCAR 2 Oak Ridge planned Oak Ridge atat LinuxLinux meetingmeeting planned National release National WorldWorld (April(April’01)’01) release Lab (Apr’00) (Q3’02) Lab (Apr’00) (Jan(Jan’’01)01) (Q3’02) The OSCAR Solution Stack for Cluster Computing7 The OSCAR Solution Stack for Cluster Computing OSCAR: How are we doing? l OSCAR is doing great: – “We’re number 1”. – Over 16,000 downloads in 2001. – 35% in a recent poll on www.cluster.top500.org – Lively on-line user group – starting to answer questions before we can get to them! – OSCAR has brought many new people into clustering l OSCAR is not doing very well: – Maintenance mode still not enabled. – Poor support for adding/deleting nodes. – Too hard to adjust to new releases from RedHat. – OSCAR is still way to complicated. NoneNone of of us us are are satisfied. satisfied. OSCAR OSCAR isn isn’t’t ready ready for for the the production production orientedoriented “ “commercialcommercial worldworld”” (I(I send send such such people people to to Scyld Scyld oror MSC.software).MSC.software). Agenda l Cluster Computing Software Stacks – What they are and why we desperately need a common base software stack. l OSCAR and the Open Cluster Group l OSCAR Today l OSCAR Tomorrow The OSCAR Solution Stack for Cluster Computing8 The OSCAR Solution Stack for Cluster Computing OSCAR Basics l Version 1.0, 1.1 u LUI = Linux Utility for cluster Install – Network boots nodes via PXE or floppy – Nodes install themselves from rpms over NFS from the server – Post installation configuration of nodes and server executes l Version 1.2+ u SIS = System Installation Suite – System Imager + LUI = SIS – Creates image of node filesystem locally on server – Network boots nodes via PXE or floppy – Nodes synchronize themselves with server via rsync – Post installation configuration of nodes and server executes Introduction to OSCAR OSCAR Contents/Status OS Layer OSCAR 1.0: RedHat 6.2 OSCAR 1.2: RedHat 7.2 Installation & System OSCAR 1.0, 1.1: LUI Configuration OSCAR 1.2 and on: SIS Security OpenSSH / OpenSSL System Services & C3 Cluster Management Job Management PBS, Maui scheduler. Programming gcc, PVM, MPIch and LAM-MPI Environment Packaging Components integrated and auto-installed. Documentation. *Brands and names are property of their respective owners. The OSCAR Solution Stack for Cluster Computing9 The OSCAR Solution Stack for Cluster Computing Hardware Considerations l Server & Clients uMust be IA32 systems – IA64 in pre-beta. uMust be connected by an Ethernet network (preferably a private one) l Clients uShould contain identical hardware uPXE Enabled NIC or Floppy Drive Installation Overview l Install RedHat 7.2 l Download OSCAR l Copy RPMS to server l Run wizard u Build image per client type (partition layout, HD type) u Define clients (network info, image binding) u Setup networking (collect MAC addresses, configure DHCP, build boot floppy) u Boot clients / build u Complete setup (post install) u Install test suite The OSCAR Solution Stack for Cluster Computing10 The OSCAR Solution Stack for Cluster Computing Agenda l Cluster Computing Software Stacks – What they are and why we desperately need a common base software stack.