Grid Computing with Globus
Total Page:16
File Type:pdf, Size:1020Kb
INITIATIVE FOR GLOBUS IN EUROPE Grid Computing with Globus Dr. Helmut Heller ([email protected]) Leibniz Supercomputing Centre (LRZ) Munich, Germany IGE Project Director IDRIS, December 16th, 2010 INITIATIVE FOR GLOBUS IN EUROPE Overview • Why Grid computing? What is it anyway? • A brief history of Globus in context • Globus toolkit structure • Globus usage worldwide • … and Globus in Europe? IGE! • Outlook: new things to come • Summary 2 INITIATIVE FOR GLOBUS IN EUROPE Why Grid computing? • Common problems of HPC-center users: – A single user computes at more than one HPC center – Authentication is cumbersome: username and password have to be entered over and over again (and they differ from site to site!), security problems (password sniffers!) – Data have to be transferred manually from one site to another (which can lead to confusion: where are the valid data?) – The user has to master different interfaces, e.g., file systems (quota, policies, etc.), queuing systems (NQS, LoadLeveler, etc.), user administration, etc. • GRID-computing addresses these problems so that the user can concentrate on science and is not distracted by computer idiosyncrasies 3 INITIATIVE FOR GLOBUS IN EUROPE What is Grid computing? • GRID- computing simplifies the usage of distributed but coupled resources • This also implies resource sharing and a coordinated usage of shared resources • GRID-computing provides a uniform, simple, and easy-to-learn interface to shared, distributed, coupled resources – GRID-computing introduces a common layer of abstraction – The user no longer has to fight with the idiosyncrasies of each computer system or each computing center • The user interacts now with the GRID as a whole and not with individual components 4 INITIATIVE FOR GLOBUS IN EUROPE Grid Hour Glass Analogy •Local sites are heterogeneous •Each has its own local policies Higher-Level Services •Queuing systems, monitors, and Users firewalls, etc. are different •Grid unifies Standard GT5 •Common management Interfaces abstractions & interfaces •Middleware as an abstraction layer to hide local heterogeneity by a standard interface: the middleware! Local heterogeneity From Jennifer Shopf 5 INITIATIVE FOR GLOBUS IN EUROPE My Vision Meta-Scheduler User at home Middleware: Globus, UNICORE, gLite, .... Resources, world-wide 6 INITIATIVE FOR GLOBUS IN EUROPE A Brief History of Globus Globus Toolkit® History 30000 Does not include downloads from: NMI, UK eScience, EU Datagrid, GT 2.0 GT 2.2 Released Released IBM, Platform, etc. Physiology of the Grid Paper Released 25000 GT 2.0 beta Released NSF GRIDS Center Initiated, DOE begins 20000 Anatomy of the Grid SciDAC program Paper Released Significant GT 1.1.4 and Commercial MPICH-G2 Released Interest in The Grid: Blueprint for a New Computing Grids 15000 Infrastructure published NSF & European Commission Initiate Many New Grid Projects First DARPA, NSF, EuroGlobus and DOE GT 1.1.3 Conference Released Held in begin funding MPICH-G 10000 Grid work released Lecce Early Application GT 1.1.2 Globus Project wins Successes Reported Released Global Information ftp.globus.org from Month per Downloads Infrastructure GT 1.0.0 GT 1.1.1 Award Released Released 5000 NASA initiates Information Power Grid, DOE increases support 0 1997 1998 1999 2000 2001 2002 from: www.globus.org 7 INITIATIVE FOR GLOBUS IN EUROPE More Recent Globus History in Context UNICORE U5 U6 gLite CREAM-CE GT2.4 GT3.0 GT4.0 GT5.0 OGSA WSRF OGSI ARC 2003 2004 2005 2009 8 INITIATIVE FOR GLOBUS IN EUROPE (+;7'*#>:+"? !"#$%&"'()*&+,(-.*(!/,0&1"0! gLite (1) !,+@A>':++,B#:'0'!"#$$$% CD!'!#$$&%'''''''''''''''&,#7E'!#$$&%' 9F!'!#$$(% 89#:7'!#$$)% *+,-./01,2 34546,78420 Main improvements: Globus components: •metascheduler (WMS) *+,-./01,2 •GSI •VO management6*7'89#:7';#$$,7<="7 •myProxy!"#$%&'()*++,-'./%-'012320445 •stability: production! •GridFTP •file catalogs, SRM, dCache... •GRAM •UI - CE - SE From Ariel Garcia 9 INITIATIVE FOR GLOBUS IN EUROPE Computing resource accessgLite (2) Enabling Grids for E-sciencE CREAM Globus User gLite non-gLite User / UI client client component component Resource gLite WMS In production Existing prototype Possible development ICE Condor-G GT4 GRAM CREAM LCG-CE gLiteCE CEMon (GT2) jobmanager Site ! EGEE authZ, Batch InfoSys, BLAH System Accounting From Ludek Matyska EGEE-II INFSO-RI-031688 10 INITIATIVE FOR GLOBUS IN EUROPE ARC • ARC = Advanced Resource Connector • Developed by the Nordic countries • Used in NorduGrid for HEP • Replaced Globus GRAM with own solution based on GridFTP, as GRAM was not fast enough: HTP! • EU KnowARC project brings standards to ARC: JSDL, WS, GLUE2, BES, UR, ... – HED - hosting environment for Web services (WS) 11 INITIATIVE FOR GLOBUS IN EUROPE More Recent Globus History in Context UNICORE U5 U6 gLite CREAM-CE GT2.4 GT3.0 GT4.0 GT5.0 OGSA WSRF OGSI ARC 2003 2004 2005 2009 12 INITIATIVE FOR GLOBUS IN EUROPE … and Globus? • Many other MWs go WebService (WS), while Globus moves back to PRE-WS – why? – GT4 WS-GRAM was never as stable as GT2- GRAM – User survey showed that direct demand from users for WS was very limited, clients use jGlobus, Condor, SAGA, etc. instead – Big Grids were still using GT2.4 GRAM – GT5-GRAM is more stable than GT2-GRAM and more performant than GT4 WS-GRAM 13 Globus Provides Building Blocks Basic components for Grid functionality –Not turnkey solutions, but building blocks & tools for application developers & system integrators Highest-level services are often application specific; we let apps concentrate on that Easier to reuse than to reinvent –Compatibility with other Grid systems comes for free GT provides basic infrastructure to get you one step closer to your science results Amsterdam, November 3rd, 2009 9th DEISA Training Session 14 From J. Schopf Globus Alliance INITIATIVE FOR GLOBUS IN EUROPE Toolkit Components • Globus is a toolkit VOMS with many owners GSI and contributors GSI-ssh • All is open source OGSA-DAI software, freely BES-GRAM usable GRAM • Use the tools (some GridFTP or all) to build a Grid GridWay Globus Globus 15 GT4 Architecture Applications Users Organizations GridEnvironment Data Mgmt. Information Mgmt. Information ResourceMgmt. Grid Security Infrastructure (GSI) Amsterdam, November 3rd, 2009 9th DEISA Training Session 16 Grid Security Infrastructure (GSI) Applications Users Organizations GridEnvironment Data Mgmt. Information Mgmt. Information ResourceMgmt. Grid Security Infrastructure (GSI) Amsterdam, November 3rd, 2009 9th DEISA Training Session 17 Grid Security Infrastructure (GSI) Private Key - known only by owner Public Key- known to everyone What one key encrypts, the other decrypts Guarantees Integrity Authentication And Privacy Amsterdam, November 3rd, 2009 9th DEISA Training Session 18 From J. Schopf Globus Alliance Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch09s03.html Proxy Certificate Proxy allows you to do single sign-on –Credential delegation uses proxy –Limited lifetime (default: 12 hours) Amsterdam, November 3rd, 2009 9th DEISA Training Session 19 From J. Schopf Globus Alliance Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch10s05.html Benefits of Single Sign-on Don’t need to remember (or even know) ID/ passwords for each resource More secure –No ID/password is sent over the wire – not even in encrypted form –Proxy certificate expires in a few hours and then is useless to anyone else –Don’t need to write down 10 passwords It’s fast and it’s easy! Amsterdam, November 3rd, 2009 9th DEISA Training Session 20 From J. Schopf Globus Alliance Data Management Applications Users Organizations GridEnvironment Data Mgmt. Information Mgmt. Information ResourceMgmt. Grid Security Infrastructure (GSI) Amsterdam, November 3rd, 2009 9th DEISA Training Session 21 GridFTP: The Protocol A high-performance, secure, reliable data transfer protocol optimized for high- bandwidth wide-area networks –FTP with well-defined extensions –Uses basic Grid security (GSI) –Multiple data channels for parallel transfers –Partial file transfers –Third-party transfers –Reusable data channels –Command pipelining GGF recommendation GFD.20 Amsterdam, November 3rd, 2009 9th DEISA Training Session 22 From J. Schopf Globus Alliance GridFTP: The Service Multiple TCP streams between sender and receiver possible Sender pushes multiple blocks in parallel streams Blocks reassembled at receiving side and put into correct order Protection against Parallel Transfer dropped packets for each Fully utilizes bandwidth of stream network interface on single nodes Amsterdam, November 3rd, 2009 9th DEISA Training Session 23 From J. Schopf Globus Alliance Striped GridFTP Service Multiple nodes work together as a single logical GridFTP server Multiple nodes of the cluster are used to transfer data into/out of the cluster – Each node reads/writes only pieces it is responsible for – Head node coordinates transfers Multiple levels of parallelism Parallel Filesystem – CPU, bus, NIC, disk etc. Parallel Filesystem – Maximizes use of Gbit + WANs Striped Transfer Fully utilizes bandwidth of Gb+ WAN using multiple nodes Amsterdam, November 3rd, 2009 9th DEISA Training Session 24 From J. Schopf Globus Alliance GridFTP Features TCP buffer size control –Tune buffers to latency of network –Regular FTP optimized for low latency networks, not tunable by user Dramatic improvements for high latency WAN transfers –90% of network utilization possible • Tuning has been done by DEISA staff for DEISA systems: gscp wrapper with syntax similar to scp Amsterdam,