CMS Tier-2 at Purdue
Nov 15, 2012
Fengping Hu - [email protected]
Erik Gough - [email protected]

Outline

● Overview ● Computing Resources ● Community clusters ● Setup, challenges and future work ● Storage Resources ● HDFS ● Lessons Learned

Resources Overview

● Computing ● 1 dedicated cluster, 4 community clusters ● ~10k dedicated batch slots ● Opportunistic slots ● Storage ● Coexists with the dedicated CMS cluster ● 2.4 PB raw disk

Purdue is one of the largest CMS Tier-2 sites.

Community Cluster Program at Purdue

● Shared cluster computing infrastructure ● Foundation of Purdue's research infrastructure ● Peace of Mind ● Low Overhead ● Cost Effective ● CMS bought into 4 clusters in 5 years ● CMS benefits from community clusters it didn't buy into (e.g. recycled clusters) ● Other VOs have opportunistic access

4 clusters in 5 years – Steele

● Installed on May 5, 2008 ● Ranked 104th on the November 2008 TOP500 supercomputer sites list ● 8-core Dell PowerEdge 1950 nodes – 7261/536 cores – 60 teraflops ● Moved to an HP POD, a self-contained, modular, shipping-container-style unit, in 2010 ● Retiring

4 clusters in 5 years – Rossmann

● Went into production on Sep 1, 2010 ● Ranked 126th on the November 2010 TOP500 ● Dual 12-core AMD Opteron 6172 processors – 11040/4416 cores ● 10-gigabit Ethernet interconnects

4 clusters in 5 years – Hansen

● Went into production on September 15, 2011 ● Would rank 198th on the November 2011 TOP500 ● Four 12-core AMD Opteron 6176 processors – 9648/1200 cores ● 10-gigabit Ethernet interconnects

4 clusters in 5 years – Carter

● Launched in Nov 2011 ● Ranked 54th on the November 2011 TOP500 list ● Two 8-core Xeon E5 processors – 10560/1600 cores ● 56 Gb/s FDR InfiniBand interconnects from Mellanox

The dedicated CMS cluster

● 2440 total batch slots: 70 Sun X2200 + 112 Dell 1950 + 41 HP ProLiant DL165 nodes ● Also runs Hadoop storage ● Condor is the only batch scheduler

Job scheduling setups

[Diagram: job scheduling setup. The dedicated cluster and each community cluster run their own Compute Element (gatekeeper); the dedicated cluster's CE hands jobs to a Condor jobmanager, while the community clusters' CEs hand jobs to PBS jobmanagers. The community clusters run PBS alongside Condor, with Condor flocking tying the pools together.]

One CE per cluster; Condor flocking is the only form of federation.
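
As an illustration of the flocking described above, a minimal sketch of the HTCondor configuration involved; the hostnames are placeholders, not the site's actual machines:

    # On the dedicated cluster's submit machine: central managers of the pools to flock to
    FLOCK_TO = rossmann-cm.example.edu, hansen-cm.example.edu

    # On each community cluster's central manager: accept jobs flocked from the CMS schedd
    FLOCK_FROM = cms-schedd.example.edu
    ALLOW_WRITE = $(ALLOW_WRITE), $(FLOCK_FROM)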

Pros and Cons

● Pros ● One CE per cluster is straightforward to set up ● Cons ● One CE per cluster leads to an unnecessary number of CEs ● Unnecessary job preemption ● Inadequate load balancing

Job slot utilization

Alternatives – custom globus jobmanager

[Diagram: a single gatekeeper whose custom Globus jobmanager dispatches to PBS cluster 1, PBS cluster 2, ...]

● Federate clusters at the grid middleware level

● A custom Globus jobmanager that handles load balancing (see the sketch below) ● Not too hard to implement, but discouraged by OSG
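
A minimal Python sketch of the load-balancing idea only, not the site's implementation and not the Globus GRAM API: route each incoming grid job to the least-loaded PBS cluster. free_slots() is a hypothetical stand-in for querying each scheduler (e.g. via qstat or pbsnodes), and the slot counts are placeholders.

    # Placeholder slot counts; a real jobmanager would query each PBS server.
    FREE_SLOTS = {"pbs-cluster-1": 120, "pbs-cluster-2": 45}

    def free_slots(cluster):
        """Hypothetical helper: idle batch slots reported by the cluster."""
        return FREE_SLOTS[cluster]

    def pick_cluster(clusters):
        """Route the next incoming grid job to the least-loaded cluster."""
        return max(clusters, key=free_slots)

    print(pick_cluster(["pbs-cluster-1", "pbs-cluster-2"]))  # -> pbs-cluster-1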

Other Alternatives

● Campus Grid Factory

● OSG model for setting up a campus grid: uses a Condor-based submit host to submit jobs across multiple clusters with different job schedulers (see the sketch below) ● Investigating: PBS p2p scheduling
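
A hedged sketch of what submission from such a Condor-based campus submit host can look like, using HTCondor's grid universe with a BOSCO-style batch/PBS resource; the user, host, and executable names below are placeholders:

    universe      = grid
    grid_resource = batch pbs [email protected]
    executable    = analysis.sh
    output        = job.out
    error         = job.err
    log           = job.log
    queue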

Storage Resources

● Purdue SE integrated into the dedicated CMS cluster ● Mix of CE/SE nodes and large storage nodes ● Migrated from dCache to Hadoop earlier this year ● Last of the T2s to get rid of dCache ● Able to migrate ~1 PB of data with no CE downtime and a small SE downtime for the cutover
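
Running CMS data on HDFS at this scale typically means touching a handful of Hadoop settings, two of which (max.xcievers and rack awareness) reappear in the lessons-learned list below. A minimal configuration fragment is sketched here: the property names are from stock Hadoop 0.20 (topology.script.file.name is usually set in core-site.xml), while the values and script path are illustrative assumptions, not Purdue's actual settings.

    <!-- Hadoop 0.20 configuration fragment (illustrative values) -->
    <configuration>
      <property>
        <name>dfs.datanode.max.xcievers</name>   <!-- note Hadoop's own spelling -->
        <value>4096</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
      <property>
        <name>topology.script.file.name</name>   <!-- usually in core-site.xml; enables rack awareness -->
        <value>/etc/hadoop/conf/rack-topology.sh</value>
      </property>
    </configuration>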

● Access methods ● FUSE mounts on cluster head nodes ● Xrootd used for CMSSW job reads ● SRM ● Hoping to break 3 PB of raw space by 2013

Lessons Learned

● Twiki is your friend ● The CMS T2 community is also your friend ● max.xcievers ● Watch your NN fsimage ● Rack awareness ● SRM supportedProtocolList ● Tune your I/O settings ● Hadoop 0.20 single-disk failures ● Documentation ● Pay it forward

Questions
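
For reference, the FUSE and Xrootd access methods listed above map to commands along these lines; this is a hedged sketch, and the hostnames, port, mount point, and file paths are placeholders rather than the site's actual values:

    # FUSE mount of HDFS on a head node (hadoop-fuse-dfs wrapper from CDH-style 0.20 builds)
    hadoop-fuse-dfs dfs://namenode.example.edu:9000 /mnt/hadoop

    # Direct read of a stored file over Xrootd, as CMSSW jobs do
    xrdcp root://xrootd.example.edu//store/user/somefile.root /tmp/somefile.root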