Selecting a New Batch System at CC-IN2P3
Bernard CHAMBON ([email protected])
HEPiX, Darmstadt, May 2011

Overview
– Why we are giving up our BQS batch system
– The selection process for a new batch system
– The winner is …
– Annex
  – Criteria in detail
  – A graphic view of the ranking

Why give up our BQS batch system

BQS's story
– In-house development, started in 1992 (related to the migration from VM (with BMON) to UNIX)
– Evolved considerably over the years (scalability, robustness, functionality)
– Processes an intensive flow of jobs (>100,000 jobs per day), sequential or parallel, from local or grid users, with recognized availability
– One cluster of 15,000 slots (~1300 WNs), and another for parallel jobs with 1064 slots (64 WNs) (figures for Q1 2011)

Why change
– Only used at CC-IN2P3 (not designed to be easily installed elsewhere)
– Missing functionality that is now expected (e.g. parametric and interactive jobs, reservation, DRMAA compliance, etc.)
– Requires manpower for development (> 2 FTE over the past years)

The batch migration was decided in June 2009.

The process of selection

In July 2009, we started a study of existing solutions, focusing on the batch systems used in the HEP community.

How we proceeded:
– Listing the batch systems used in the HEP world
– Setting up a list of criteria
– Setting up a web survey
– Auditing the products

People involved in the study:
– BQS developers
– Operation team
– System and grid administrators

The process of selection

List of batch systems in use (fall 2009)
– LSF: CERN, SLAC, CNAF (INFN), KEK
– Torque/Maui: FNAL (for parallel jobs), RAL, NIKHEF, PIC, ASGC (Taiwan), TRIUMF, NDGF and many T2s in LCG
– Torque/Moab: NERSC (National Energy Research Scientific Computing Center)
– PBS-Pro: KIT (Grid-KA), NERSC
– SGE: DESY (3 clusters), TACC, NERSC, TSUBAME (Tokyo Institute of Technology), Le-SC (London e-Science Centre)
– Condor: BNL, FNAL
– BQS: CC-IN2P3
– LoadLeveler: product from IBM
– SLURM: product developed at LLNL, used by the French CEA (military research unit)

The process of selection

Setting up a list of criteria

The starting point of the study was a list of criteria combining the batch point of view and the (daily) operation point of view.
This list was made of weighted items (Wi) grouped into weighted topics (Wt).
Each item was given a score Ni in the range [0..10].
We get a topic rating Nt = Σ(Wi × Ni) / Σ(Wi),
then a global rating N = Σ(Wt × Nt) / Σ(Wt).
We ended up with a list of ~15 topics (see the next 2 slides).
See the Annex for more details on those topics!
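
As an illustration, the two-level weighted average described above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the names Item, topic_rating and global_rating are hypothetical, and the weights and scores in the example are invented for illustration; they are not values from the CC-IN2P3 study.

```python
from dataclasses import dataclass


@dataclass
class Item:
    weight: float  # Wi, the weight of the item inside its topic
    score: float   # Ni, the score given to the item, in [0, 10]


def topic_rating(items):
    """Topic rating Nt = sum(Wi * Ni) / sum(Wi)."""
    total_weight = sum(item.weight for item in items)
    if total_weight == 0:
        raise ValueError("topic has no weighted items")
    return sum(item.weight * item.score for item in items) / total_weight


def global_rating(topics):
    """Global rating N = sum(Wt * Nt) / sum(Wt).

    `topics` is a list of (Wt, items) pairs.
    """
    total_weight = sum(wt for wt, _ in topics)
    if total_weight == 0:
        raise ValueError("no weighted topics")
    return sum(wt * topic_rating(items) for wt, items in topics) / total_weight


# Hypothetical weights and scores, for illustration only.
example_topics = [
    (3, [Item(2, 8), Item(1, 5)]),   # topic with Wt = 3, Nt = 21 / 3 = 7.0
    (1, [Item(1, 6), Item(1, 10)]),  # topic with Wt = 1, Nt = 16 / 2 = 8.0
]

print(global_rating(example_topics))  # (3 * 7.0 + 1 * 8.0) / 4 = 7.25
```

Because both levels are normalized by the sum of the weights, every topic rating and the global rating stay in the same [0..10] range as the item scores, whatever weights are chosen.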

The process of selection

List of criteria (1/2)
– Scalability: how many jobs (simultaneous, per day), how many requests per second
– Robustness: behaviour in case of server failure; built-in failover
– Sharing: sharing resources between users, groups, VOs & roles; sharing the cluster between sequential and parallel jobs; sharing heterogeneous worker nodes
– Scheduler: richness of functionality, ease of configuration
– AFS: AFS token management (yes or no, how, customization)
– Worker nodes: limits on job resource consumption; crash recovery behaviour
– Interfacing to the grid: which middleware, and who delivers it
– Administration & operation: range of parameterization possibilities; available monitoring; health status diagnostics

The process of selection

List of criteria (2/2)
– Parallel and interactive jobs: supported or not, how, customization
– Information repository: relevance and ease of access to information (cf. the MySQL DB used in BQS)
– Accounting: information availability, which kind, periods
– Constraints on deployment: dependencies on third-party products (file system, database, OS, etc.)
– Software support: richness of documentation, vitality of the product and of the discussion lists
– Procurement cost: cost of licences and/or support
– Maintenance cost: how many FTE to keep the product running

See the Annex for details on those topics (items and weighting)!

The process of selection

Setting up a web survey
– We set up a web survey in fall 2009.
– We got ~15 responses. Many thanks to the people who took the time to answer the survey.
– Batch systems used:

  Occurrences  Name             Location
  3            LSF              CERN, INFN-T1, INFN-Trieste (T3)
  2            PBS-Pro          KIT (ex FZK), NERSC (for parallel jobs)
  6            Torque/Maui      RAL, Nikhef, Grif, HPC2N, Fermilab (for parallel jobs), DESY
  1            SGE              NERSC
  1            Torque/Moab      NERSC
  1            Torque/Catalina  SDSC (also using LoadLeveler + Catalina)
  1            Slurm            CEA/DAM

– Level of satisfaction: from 0 (very bad) to 5 (very good)
  • 5 for LSF
  • 3 to 4 for Torque/M
  • 2 for PBS-Pro
  • 4 for the others
– The results are partially available (the full information is not publicly disclosed). Ask by email.

The process of selection

Auditing solutions, for the following batch systems:
– LSF (from Platform Computing)
– SGE (from Sun)
– PBS-Pro (from Altair)
– Torque/Maui

Selecting solutions
– Using the criteria grid, we obtained the following rating table. Numbers are in the range [0..10]; a higher number means a better rating.

The process of selection

[Rating table: the first round between LSF, SGE, PBS-Pro and Torque]

Before going ahead …

I'd like to stress the following points:
– The criteria grid was built from our operational point of view and our expertise with the BQS batch system. In other words, the criteria and their weighting are very CC-IN2P3-centric.
– We chose NOT to evaluate some batch systems, such as Condor or Slurm.
– Keep in mind that the scores should be read in terms of their relative values.

The process of selection (continued)

Status after the first round
– LSF and SGE are very close, according to our criteria.
– We needed a more thorough investigation for a better understanding and, ultimately, in order to select one system.

We decided to install and work on each product
– Open source version of Grid Engine (6.2u5 - 12/2009); an evaluation version of LSF 7
– 1 FTE, for 10 days, on each product
– Verifying some basic functionality, first feedback on installation and configuration, first impressions of the commands

Conclusion
– It appeared to us that SGE was more suitable in various ways: clear concepts, easier configuration (thanks to the GUI), database usage (Berkeley DB), better scalability, etc.
– We added this information as two new topics: "Results from 10 days of work" and "Feeling of the products".
– At the same time, we decided to remove the procurement cost and maintenance cost topics from the final scoring.

The process of selection

[Rating table: the last round between LSF and SGE (GE 6.2u5)]

Conclusion

We decided to select Grid Engine in February 2010.
We started to explore it in detail by:
– Getting an exhaustive view of its functionality
– Testing scalability and robustness
– Developing adaptations to CC-IN2P3 requirements

The following presentation, "GridEngine Setup at CCIN2P3", will give you more details.

Annex
– Criteria in detail
– A graphic view of the ranking

Annex: Topics in detail

Annex: Topics rating, first round
Distribution of ratings between BQS, LSF, SGE, PBS-Pro and Torque-Maui

Annex: Topics rating, last round
Distribution of ratings between LSF and SGE