HEPiX Report

Helge Meinhard, Pawel Grzywaczewski, Romain Wartel / CERN-IT
Post-C5/Computing Seminar, 03 December 2010
CERN IT Department, CH-1211 Genève 23, Switzerland – www.cern.ch/it
HEPiX report – Helge.Meinhard at cern.ch – 03-Dec-2010

Outline
• Meeting organisation, site reports, (benchmarking,) infrastructure (Helge Meinhard)
• Storage, OS and applications, miscellaneous (Pawel Grzywaczewski)
• Virtualisation, security and networking, grid and cloud (Romain Wartel)

HEPiX
• Global organisation of service managers and support staff providing computing facilities for HEP
• Covers all platforms of interest (Unix/Linux, Windows, Grid, …)
• Aim: present recent work and future plans, share experience, advise managers
• Meets about twice a year (spring in Europe, autumn typically in North America)

HEPiX Autumn 2010 (1)
• Held 01–05 November at Cornell University, Ithaca, NY
  – CESR: electron-positron storage ring; CLEO: experiment that did a lot of interesting B physics
  – A new player in the HEPiX field as a site, but a well-known face: Chuck Boeheim, the previous North American co-chair of HEPiX
  – Good local organisation
  – Nice auditorium in the conference hotel, basically unlimited coffee supply
  – First face-to-face meeting in 2010 for most participants

HEPiX Autumn 2010 (2)
• Format: pre-defined tracks with conveners and invited speakers per track
  – Still room for spontaneous talks, either fitted into one of the tracks or classified as 'miscellaneous'
  – Again proved to be the right approach; in view of the low number of participants, an extremely rich, interesting and packed agenda
  – Judging by the number of submitted abstracts, no real hot spot: 8 infrastructure, 8 grid/clouds, 7 storage, 6 virtualisation, 6 OS and applications, 5 network and security, 3 miscellaneous, 1 benchmarking
  – Some abstracts were submitted late, which made planning difficult
• Full details and slides: http://indico.cern.ch/conferenceDisplay.py?confId=92498
• A trip report by Alan Silverman is available too: http://cdsweb.cern.ch/record/1307061

HEPiX Autumn 2010 (3)
• 47 registered participants, of which 11 from CERN
  – Barring, Bell, Grzywaczewski, Janyst, Kelemen, Meinhard, Salter, Schwickerath, Silverman, T. Smith, Wartel
  – Other sites: ASGC, Caspur, CEA, CNAF, Cornell, DESY Hamburg, DESY Zeuthen, FNAL, FZU, IN2P3, INFN Milano, INFN Pavia, JLAB, KISTI, LAL, NIKHEF, RAL, SFU, SLAC, TRIUMF, Umeå U
  – Compare with Berkeley (autumn 2009): 61 participants, of which 9 from CERN

HEPiX Autumn 2010 (4)
• 62 talks, of which 19 from CERN
  – Compare with Berkeley: 62 talks, of which 16 from CERN
• Next meetings:
  – Spring 2011: GSI Darmstadt (May 2–6)
    • Possibly followed by an LCG workshop over the weekend
  – Autumn 2011: Vancouver (to be confirmed, date to be decided; 20th anniversary of HEPiX!)

Oracle/Sun Policy Concerns
• Recent observations:
  – Significantly increased hardware prices for Thor-style machines
  – Very significantly increased maintenance fees for Oracle (ex-Sun) software running on non-Oracle hardware
    • Sun Grid Engine, Lustre, OpenSolaris, Java, OpenOffice, VirtualBox, …
  – Very limited collaboration with non-Oracle developers
  – Most Oracle software has already been forked into open-source projects
  – (At least) two Oracle-independent consortia around Lustre
• HEP labs very concerned

Site Reports (1)
• Worker-node acquisitions: HPC-style small form-factor boxes (e.g. 4 dual-CPU systems in a 2U enclosure) very popular
  – HP, Dell, Supermicro, Acer, …
  – Overheating CPUs because of missing thermal grease…
• Disk storage
  – Many sites using storage-in-a-box (à la CERN or Thumper/Thor)
  – Some dedicated storage with SAN (FC or iSCSI) or NAS uplink
    • Dell MD1000, DDN 6620 (60 drives in 4U)
  – ASGC reporting servers of 160 TB with 10GE

Site Reports (2)
• Tape storage: large sites appear to prefer SL8500 robots and LTO-5 drives
• Networking: the OPN has not yet reached the Norwegian and Slovenian parts of the Nordic T1
• GPUs mentioned only for non-HEP applications

Site Reports (3)
• Batch schedulers: one singularity (BQS) is finally being eradicated…
  – Replaced by (Sun|Oracle) Grid Engine
• Configuration tools: a random walk of Quattor, Cfengine, Puppet, …
  – One more very positive Quattor report from RAL
• Monitoring: many sites using Nagios; it appears to be becoming a de-facto standard
• Drupal, Jabber, Trac mentioned a number of times

Site Reports (4)
• Other interesting points
  – DESY: turning into a centre for accelerator research, particle physics and photon physics, with new requirements:
    • MacOS, Windows HPC
    • Large-RAM machines (~500 GB!)
    • Lustre on MacOS and Windows
    • 20…100 GB from XFEL from 2012/13 on
    • NUMA, GPU computing
    • Hundreds of VOs with a handful of users each
  – SLAC: similar conversion as DESY
    • Mixed experience with back-charging for scientific computing
    • Tender with a fixed budget for maximum capacity

Site Reports (5)
  – FNAL
    • Return air is 23 °C, too low
    • DOE review recommends physics capacity without UPS
  – JLAB: using a tool (Surveyor) to control power consumption of desktop PCs – large savings
  – RAL: quibble about e-mail addresses
  – GSI: compute requirements similar to those for the LHC
    • "Cube" computer centre prototyped – PUE 1.1
  – FZU Prague: experience with a 100 A circuit breaker tripping…
    • … because it had been running at 96 A constantly…

Infrastructure (1)
• 8 talks, 3 from CERN
  – LSF scalability tests (Schwickerath)
  – CERN-IT procurement (Barring)
  – CERN computer centre upgrade project (Salter)
• Update on Quattor at RAL (Collier)
  – Started with new batch worker nodes; batch done, now covering disk servers gradually
  – Good experience, estimated saving of 0.3…0.5 FTE
  – Good experience with QWG as well
  – NIKHEF starting tests, too

Infrastructure (2)
• Assets server at LEPP, Cornell U (Pulver)
  – OCSNG: multi-platform "moment in time" hardware and software inventory
  – GLPI: full-lifecycle asset management
  – ZENOSS: agent-less monitoring, extensible via Python scripts
• Scientific computing at JLAB (Philpott)
  – Extensive experience with GPUs: a mixture of gaming cards and professional units
    • Mostly using CUDA, interested in OpenCL
  – Lustre: 300 TB on commodity hardware – lots of fun points

Infrastructure (3)
• Batch infrastructure resource at DESY: BIRD (Finnern)
  – Collecting smallish resources; parasitic batch usage based on SGE fair-share
  – Contributing projects are granted fair-share points
• Infrastructure improvements at IN2P3 (Olivero)
  – Current room: new transformers, diesel, cooling unit
  – New building: two levels of 850 m² each, no offices
    • No raised floor, no false ceiling; all services from above
    • Minimal electrical redundancy; will perhaps use the EDF offering of a double dedicated connection
    • Construction started in April 2010, scheduled to finish by February 2011; first production in March 2011
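For reference, the PUE of 1.1 quoted for GSI's "Cube" prototype is the standard power usage effectiveness metric, the ratio of total facility power to the power actually delivered to the IT equipment:

```latex
\mathrm{PUE} = \frac{P_{\text{total facility}}}{P_{\text{IT equipment}}}
```

A PUE of 1.1 therefore means that for every 1 kW reaching the computing equipment, only about 0.1 kW is spent on cooling, power distribution and other overheads; typical data centres of that era were closer to 1.5-2.0.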
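The BIRD idea of granting contributing projects fair-share points can be illustrated with a minimal sketch of fair-share ordering. This is a deliberate simplification, not the actual SGE share-tree algorithm (which additionally applies half-life decay to past usage), and the project names and numbers are invented:

```python
# Minimal fair-share sketch: each project holds share points (its entitlement)
# and accumulated usage; jobs from the project furthest below its entitlement
# are dispatched first. Hypothetical simplification of SGE-style fair-share.

def fair_share_order(projects):
    """Return project names ordered from most under-served to most over-served.

    projects: dict mapping name -> {"share": granted share points,
                                    "usage": accumulated CPU usage}
    """
    total_share = sum(p["share"] for p in projects.values())
    total_usage = sum(p["usage"] for p in projects.values()) or 1

    def deficit(name):
        p = projects[name]
        target = p["share"] / total_share   # fraction the project is entitled to
        actual = p["usage"] / total_usage   # fraction it has actually consumed
        return actual - target              # negative = under-served, runs first

    return sorted(projects, key=deficit)

# Invented example projects:
projects = {
    "theory": {"share": 50, "usage": 10},
    "xfel":   {"share": 30, "usage": 60},
    "ilc":    {"share": 20, "usage": 30},
}
print(fair_share_order(projects))  # → ['theory', 'ilc', 'xfel']
```

Here "theory" holds half the share points but has consumed only 10% of the resources, so its jobs would be scheduled ahead of the over-consuming projects.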
