Archer Update
Andrew Washbrook, University of Edinburgh
HPC working group meeting, 10th December 2014

Archer Details
• Archer is the UK's primary academic research supercomputer
• Operational since Nov 2013
• NEW: Phase 2 upgrade completed in Nov 2014
• Cray XC30 system
• Each compute node comprises:
  • 2 x 12-core 2.7 GHz Ivy Bridge processors
  • At least 64 GB of DDR3-1833 MHz main memory
• Cray Aries interconnect (multi-tier all-to-all connectivity)
• 4.4 PB scratch storage (Lustre)
• 3008 → 4920 compute nodes, i.e. 72,192 → 118,080 cores
• 1.56 → more than 2 Petaflops of theoretical peak performance

Questions for ATLAS Weekly
How much are we using for G4 and what are the prospects for using more?
• We currently have access to Archer via a nominal allocation pledged to University of Edinburgh researchers (Archer director's time)
• An effective proof of concept demonstrating good use of opportunistic slots would help bolster the case for requesting additional resources
What are the main challenges for using HPCs in general?
• Specific challenges are covered in the following slides
• Some of these are Archer specific
• (Hopefully) some of these have already been addressed at other HPC sites

Compute Node Connectivity
Challenge: outgoing connectivity is (in general) required throughout the lifetime of a job
• Up until Archer Phase 2 there was no external connectivity available on the compute nodes
Approaches
• External connectivity may now be possible through the new Cray Realm Specific IP addressing (RSIP) for compute nodes
  • Allows compute and service nodes to share the IP addresses configured on the external Gigabit Ethernet interfaces of network nodes
  • Included on Archer for third-party software licence validation
  • Not meant for large-scale data transfer
  • Scaling tests are underway to determine limitations (if any)
Alternatives
• ssh port forwarding to selected servers on the Archer network

Software Delivery
Challenge: availability of ATLAS software on HPC compute nodes
Approaches
• CVMFS is not available from compute nodes
  • External connectivity issues (see previous slide)
  • Problematic to install the CVMFS software, FUSE and autofs on compute nodes
• Currently have a local snapshot of CVMFS as a pragmatic option (a rough sketch of the sync step follows this slide)
  • The CVMFS directory is rsynced to a machine resident at a Tier-2 site
  • The repository copy is cleansed of absolute paths, which are replaced with the local path
  • The copy is then rsynced over to Archer shared storage
  • For now only selected releases are extracted during the testing phase
Alternatives
• Archer will allow an external filesystem resident on an edge server to be mounted on the scheduling nodes
  • Not clear how this would be beneficial – could try a CVMFS-over-NFS solution
• Parrot/CVMFS
• Pacman
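The local snapshot delivery described above can be illustrated with the rough sketch below. It is only a minimal outline of the idea, not the production procedure: the repository path, staging directory, Archer target and release list are hypothetical placeholders, and the real sync would need to handle symlinks, permissions and incremental updates.

#!/usr/bin/env python
# Sketch of the CVMFS "local snapshot" delivery: rsync a release area to a
# Tier-2 staging host, cleanse absolute /cvmfs paths, push to Archer storage.
# Hostnames, paths and the release list below are hypothetical placeholders.
import os
import subprocess

CVMFS_REPO = "/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc47-opt"
STAGING_DIR = "/scratch/cvmfs-snapshot"            # local copy on the Tier-2 host (assumed)
ARCHER_TARGET = "archer-login:/work/atlas/cvmfs"   # shared storage on Archer (assumed path)
RELEASES = ["19.2.0"]                              # only selected releases during testing

def rsync(src, dst):
    """Thin wrapper around rsync; -a preserves the tree, --delete keeps copies in step."""
    subprocess.check_call(["rsync", "-a", "--delete", src, dst])

def rewrite_paths(root, old_prefix, new_prefix):
    """Replace absolute /cvmfs paths in text files (setup scripts, configs) with the
    local install prefix. Files that cannot be read as text are skipped."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path) as f:
                    content = f.read()
            except (UnicodeDecodeError, IOError):
                continue
            if old_prefix in content:
                with open(path, "w") as f:
                    f.write(content.replace(old_prefix, new_prefix))

if __name__ == "__main__":
    for release in RELEASES:
        local_copy = os.path.join(STAGING_DIR, release)
        # 1. snapshot the release area from CVMFS onto the Tier-2 staging host
        rsync(os.path.join(CVMFS_REPO, release) + "/", local_copy + "/")
        # 2. cleanse absolute /cvmfs paths so the copy is relocatable on Archer
        rewrite_paths(local_copy, "/cvmfs/atlas.cern.ch", "/work/atlas/cvmfs")
        # 3. push the cleansed copy over to Archer shared storage
        rsync(local_copy, ARCHER_TARGET)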
Job Environment
Challenge: define the HPC compute node environment before each job
Approaches
• Comparable with the Grid environment script + worker node tarball solution used for some shared Tier-2 facilities (e.g. ECDF)
• HEP-specific libraries (covered by the HEP_OSlibs meta RPM) have to be made available without the rpm installation method
• Need to address potential conflicts with common tools and libraries used by other HPC users (gcc, python)
• Favoured the use of a TCL module to define the path setup, consistent with the software management of other HPC applications on Archer
• The asetup command works with some tweaking of absolute paths

HPC Pilots
Challenge: a pilot submitted to an HPC batch system will (in general) have to request and manage workload across many compute nodes
• The pilot executes on a scheduling node and then requests compute resources using aprun
Approaches
• The easiest approach would be just to submit whole-node pilots (a minimal submission sketch is included after the slides)
• Archer queue limitations: maximum 16 queued jobs and 8 running jobs per user
• Assuming no MPI implementation, the pilot would need to handle many instances of whole-node jobs
  • The job finishes at the speed of the slowest instance if all whole-node jobs are launched simultaneously
  • Inefficient resource allocation over time, and it burns through the quota
Alternatives
• The Yoda Event Service approach makes sense given the job throughput limitations on Archer
• Note that this challenge is not isolated to HEP – an alternative HPC pilot solution (on top of SAGA) is being used for the Molecular Dynamics simulation framework (ExTASY) on Archer
  • RADICAL-Pilot: http://radical-cybertools.github.io/radical-pilot/index.html

Workload Scheduling and Backfilling
Challenge: assuming that pilots handle multi-node workloads, how many compute resources (and how much wallclock time) should a pilot request at any given time?
Approaches
• We could just estimate a fixed value that has a reasonable chance of being scheduled
• Would like to follow the approach taken on Titan
  • The pilot polls the scheduler directly to determine the most efficient resource allocation based on backfilling information
• Unfortunately we cannot do this on Archer due to a different scheduler implementation
  • Titan (PBS Pro) has access to the "showbf" command
  • Archer has Cray ALPS
• Ongoing discussions with Cray on alternatives – the information could be derived from a superset of the output of the apstat, qstat and xtnodestat commands, but this is less trivial than the direct method (a rough sketch of such an estimate is included at the end of these notes)

Other Challenges and Outlook
Grid storage interaction
• May need a two-stage copy to move job output across to local storage
• Could be done via a post-processing node outside of the job lifetime
Dynamic shared libraries vs static libraries
• aprun normally expects static libraries
• Need to explore whether this is a real issue for the ATLAS workload
Outlook
• No major blockers on progress (for now)
• External connectivity was a long-standing issue but could be resolved in the Phase 2 setup
• A troubleshooting session is scheduled with the Archer admins to resolve remaining issues in the validation exercise
• A dormant ARC CE is connected to Archer – will revive the service to allow low-level test jobs once suitable pilots are available
• Would like scheduling and resource allocation to be as efficient as possible
• The Yoda Event Service could be a more efficient use of resources – will perform initial testing in step with its deployment
• Aiming for a reliable service early next year if the challenges can be addressed (or at least mitigated)
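Sketch for the HPC Pilots slide: the whole-node pilot idea, where a pilot on a scheduling node writes a PBS Pro script and launches the payload across a node with aprun. This is an illustrative outline under assumptions, not the actual pilot: the queue name, budget code, walltime and payload command are placeholders, and the real pilot logic (fetching work from PanDA, stage-in/out, monitoring) is omitted.

#!/usr/bin/env python
# Sketch of a "whole-node pilot" submission from an Archer scheduling node.
# Queue name, budget code, walltime and payload command are placeholders.
import subprocess
import textwrap

QUEUE = "standard"          # assumed queue name
BUDGET = "e-atlas"          # hypothetical Archer budget code
WALLTIME = "06:00:00"       # fixed request; see the scheduling slide for caveats
CORES_PER_NODE = 24         # 2 x 12-core Ivy Bridge

def make_job_script(payload_cmd, nodes=1):
    """Build a PBS Pro script that runs one payload instance per node via aprun."""
    return textwrap.dedent("""\
        #!/bin/bash --login
        #PBS -N atlas-pilot
        #PBS -q {queue}
        #PBS -A {budget}
        #PBS -l select={nodes}
        #PBS -l walltime={walltime}
        cd $PBS_O_WORKDIR
        # aprun: -n total instances, -N instances per node, -d cores per instance
        aprun -n {nodes} -N 1 -d {cores} {payload}
        """).format(queue=QUEUE, budget=BUDGET, nodes=nodes,
                    walltime=WALLTIME, cores=CORES_PER_NODE, payload=payload_cmd)

def submit(script_text):
    """Submit the script through qsub and return the batch job identifier."""
    with open("pilot_job.pbs", "w") as f:
        f.write(script_text)
    return subprocess.check_output(["qsub", "pilot_job.pbs"]).strip()

if __name__ == "__main__":
    # Placeholder payload: in practice this would wrap the multi-core ATLAS job
    job_id = submit(make_job_script("./run_atlas_payload.sh", nodes=1))
    print("Submitted pilot job %s" % job_id)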

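Sketch for the Workload Scheduling and Backfilling slide: estimating a sensible resource request from apstat and qstat rather than a direct "showbf"-style query. This is only an outline of the idea under assumptions about the command output (in particular the layout of the apstat compute node summary and the default qstat listing); the actual parsing would need to be validated against the live Archer ALPS/PBS output, as discussed with Cray.

#!/usr/bin/env python
# Sketch of estimating a backfill-friendly node request on Archer from ALPS/PBS
# status commands. The apstat and qstat output formats are assumed here and
# must be checked against the live system; this is not the Titan "showbf" method.
import subprocess

def apstat_summary():
    """Return (up, in_use) compute node counts from the apstat node summary.

    Assumes apstat prints a 'Compute node summary' block whose header row and
    data row list the configured / up / in-use / available node totals.
    """
    lines = subprocess.check_output(["apstat"]).decode().splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith("Compute node summary"):
            counts = dict(zip(lines[i + 1].split(), lines[i + 2].split()))
            return int(counts["up"]), int(counts["use"])
    raise RuntimeError("could not parse apstat output")

def queued_node_demand():
    """Very rough estimate of nodes already requested by queued jobs.

    A real implementation would parse 'qstat -f' per job; here we simply count
    queued jobs ('Q' state in the default listing) as one node each.
    """
    out = subprocess.check_output(["qstat"]).decode()
    return sum(1 for line in out.splitlines() if " Q " in line)

if __name__ == "__main__":
    up, in_use = apstat_summary()
    idle = up - in_use
    # leave headroom for jobs already waiting ahead of us in the queue
    request = max(0, idle - queued_node_demand())
    print("idle nodes: %d, suggested pilot request: %d nodes" % (idle, request))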