Data Transfer Performance for the Earth System Grid Federation
Mary Hester, ESnet
Lawrence Berkeley National Laboratory
U.S. Department of Energy | Office of Science
Internet2 Global Summit, Denver, CO
April 8, 2014

Topics
• Network Goals
• Implementing the Project Plan
• Motivation
• Current Status

Network Specific Goals for ESGF
Proposal Sites: ANU/NCI, BADC, DKRZ, KNMI, PCMDI
Goal: 500MB/sec (4Gbps) sustained data transfer rate, disk to disk, between sites
Stretch goal: 1GB/sec (8Gbps) disk to disk by August 2015
Timeline:
• March 2014: deploy perfSONAR systems
• May 2014: deploy 10GE data servers
• May 2014: achieve 500MB/sec (4Gbps) network performance
• August 2014: achieve 500MB/sec (4Gbps) disk to disk

Location of International Sites
[Figure: map of the international site locations]

Project Network Map
[Figure: network diagram connecting the site data nodes (ANU/NCI, BADC at Rutherford Lab (RAL), DKRZ, KNMI, and PCMDI at LLNL) and the ESnet test DTNs through AARNet, the Pacific Wave Exchange (Seattle), ESnet5, the MANLAN Exchange (New York), the WIX Exchange (Washington DC), GEANT, SURFnet, Janet, and DFN. ESnet routers shown: PNWG-CR5, SUNN-CR5, AOFA-CR5, WASH-CR5, LLNL-MR2. Labeled capacities include the AARNet 40G Trans Pacific Backbone (7-10Gbps allocation), the ESnet5 100G Backbone, and Trans Atlantic links of 30G (3x10GE) and 20G (2x10GE). Other links are labeled 1GE, 10GE, or 100GE, and two are labeled 3.5Gbps.]

Plan of Action
1. Start a working group within ESGF
2. Deploy perfSONAR boxes at the participating sites
3. Deploy/upgrade data transfer nodes (DTNs)
4. Gather metrics to track progress
5. Documentation
Long-term: other ESGF sites will be incorporated into this effort (the working group) after these first few sites (via the EYR program) get up and running.

Motivation: We Know This Works
Network engineering efforts typically result in a 10x to 500x performance increase
Three stages of performance engineering:
1. Network
   • Sufficient bandwidth capacity
   • Clean network paths (no packet loss!)
2. Facility
   • Systems
   • Storage
   • Tools (e.g. Globus)
3. User
   • Systems
   • Storage
   • Tools

Small Packet Loss Leads to Huge TCP Performance Problems
With loss, high performance beyond metro distances is essentially impossible
[Figure: TCP performance plotted across path distance classes (Local (LAN), Metro Area, Regional, Continental, International). Series: Measured (TCP Reno), Measured (HTCP), Theoretical (TCP Reno), Measured (no loss).]

Current Status at Each Site
CEDA: Physical and VM perfSONAR hosts deployed
• Physical host is up – good test data already obtained
• Collaboration with JANET ongoing
• Collaboration with GEANT established
DKRZ: Physical perfSONAR host deployed
• Good test data already obtained
• Collaboration with DFN and GEANT established
NCI: perfSONAR host deployment in progress
• Collaboration with AARNet established
KNMI: Just upgraded to 10G connection
• Working on purchasing a perfSONAR host
PCMDI: perfSONAR host up – needs filter adjustments
• Science DMZ design meeting
• DTN upgrades planned
• Some hardware fixes required – work ongoing
Looking forward:
• All sites should be up and running with perfSONAR shortly
• Performance verification – good progress is being made
• DTNs are next

perfSONAR Dashboard: Packet Loss
[Figure: perfSONAR dashboard view of packet loss]

perfSONAR Dashboard: Throughput Tests
[Figure: perfSONAR dashboard view of throughput tests]

Performance Between LLNL and DKRZ
[Figure: path diagram. Elements shown: LLNL Border Router, ESnet, Washington ESnet Router, WIX Exchange, GEANT, Frankfurt GEANT Router, Frankfurt DFN Router, DFN, DKRZ Router. Link labels: 100G, 3x10G, 2x10G, 10G.]
• Wash. to DFN = 0.7 Gbps
• DFN to Wash. = 0.6 Gbps
• RTT (Washington to DFN) = 108 ms
• LLNL to DKRZ = 0.5 Gbps
• DKRZ to LLNL = 0.7 Gbps
• RTT (LLNL to DKRZ) = 166 ms

Performance Between LLNL and CEDA
[Figure: path diagram. Elements shown: LLNL Border Router, ESnet, New York ESnet Router, MANLAN Exchange, GEANT, Amsterdam GEANT Router, London GEANT Router, Janet, Rutherford Lab Router. Link labels: 100G, 3x10G, 40G, 10G.]
• LLNL to RAL = 0.5 Gbps
• RAL to LLNL = 0.5 Gbps
• RTT = 156 ms

Thanks!
Mary Hester: [email protected]
Visit: es.net
Networking Knowledge Base: fasterdata.es.net
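
Supplementary sketch: the goal conversions and the packet-loss claim in the slides can be checked with a little arithmetic. The Python below is an illustration only and is not taken from the talk. It assumes decimal units (1 MB = 10^6 bytes), a hypothetical 1460-byte MSS, and a hypothetical loss rate of 1 in 100,000 packets. It uses the widely cited Mathis et al. approximation for loss-limited TCP Reno throughput; whether the "Theoretical (TCP Reno)" curve on the packet-loss slide uses exactly this model is an assumption. The RTT values are the ones reported on the LLNL to DKRZ and LLNL to CEDA slides.

```python
import math


def mbytes_to_gbps(mbytes_per_sec: float) -> float:
    """Convert decimal megabytes per second to gigabits per second."""
    return mbytes_per_sec * 8 / 1000


def mathis_bound_gbps(mss_bytes: int, rtt_ms: float, loss_rate: float) -> float:
    """Approximate ceiling on TCP Reno throughput under random loss.

    Mathis et al. model: rate <= (MSS / RTT) * sqrt(3/2) / sqrt(p).
    """
    rtt_s = rtt_ms / 1000
    bits_per_sec = (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)
    return bits_per_sec / 1e9


def bdp_mbytes(rate_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product in decimal megabytes.

    This is the amount of data that must be in flight to fill the path.
    """
    return rate_gbps * 1e9 * (rtt_ms / 1000) / 8 / 1e6


# RTTs reported in the slides (milliseconds).
RTTS_MS = {"LLNL-DKRZ": 166.0, "LLNL-RAL": 156.0}

# Assumptions for illustration only; neither value comes from the slides.
ASSUMED_MSS_BYTES = 1460
ASSUMED_LOSS_RATE = 1e-5

if __name__ == "__main__":
    goal_gbps = mbytes_to_gbps(500)  # the 500MB/sec project goal
    print(f"Goal: {goal_gbps:.1f} Gbps")
    for path, rtt in RTTS_MS.items():
        bound = mathis_bound_gbps(ASSUMED_MSS_BYTES, rtt, ASSUMED_LOSS_RATE)
        window = bdp_mbytes(goal_gbps, rtt)
        print(f"{path}: loss-limited bound ~{bound:.3f} Gbps, "
              f"window needed for goal ~{window:.0f} MB")
```

Under these assumed values the loss-limited bound on both paths comes out well under 1 Gbps, far below the 4Gbps goal, which is consistent with the slide's point that clean paths matter more as RTT grows. The window figure indicates roughly how much TCP buffer a single stream would need on the data transfer nodes to sustain the goal rate at these RTTs.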