ESnet4: Networking for the Future of Science

ASCAC August 9, 2006

William E. Johnston

ESnet Department Head and Senior Scientist
Lawrence Berkeley National Laboratory

www.es.net

DOE Office of Science and ESnet – the ESnet Mission
• ESnet's primary mission is to enable the large-scale science that is the mission of the Office of Science (SC):
  – Sharing of massive amounts of data
  – Supporting thousands of collaborators world-wide
  – Distributed data processing
  – Distributed data management
  – Distributed simulation, visualization, and computational steering
• ESnet also provides network and collaboration services to DOE laboratories and other DOE programs in cases where this is cost effective.

Office of Science US Community Drives ESnet Design for Domestic Connectivity
[US map of institutions supported by SC: Pacific Northwest National Laboratory, Idaho National Laboratory, Ames Laboratory, Argonne National Laboratory, Brookhaven National Laboratory, Fermi National Accelerator Laboratory, Lawrence Berkeley National Laboratory, Stanford Linear Accelerator Center, Princeton Plasma Physics Laboratory, Lawrence Livermore National Laboratory, Thomas Jefferson National Accelerator Facility, General Atomics, Oak Ridge National Laboratory, Sandia National Laboratories, Los Alamos National Laboratory, and the National Renewable Energy Laboratory; the legend distinguishes major user facilities, DOE specific-mission laboratories, DOE program-dedicated laboratories, and DOE multiprogram laboratories.]

Footprint of Largest SC Data Sharing Collaborators Drives the International Footprint that ESnet Must Support
• Top 100 data flows generate 50% of all ESnet traffic (ESnet handles about 3×10^9 flows/mo.)
• 91 of the top 100 flows are from the Labs to other institutions (shown on a world map of collaborator sites) (CY2005 data)

ESnet History

ESnet0/MFENet (mid-1970s–1986): 56 Kbps microwave and satellite links
ESnet1 (1986–1995): ESnet formed to serve the Office of Science; 56 Kbps X.25, growing to 45 Mbps T3
ESnet2 (1995–2000): Partnered with Sprint to build the first national-footprint ATM network; IP over 155 Mbps ATM
ESnet3 (2000–2007): Partnered with Qwest to build a national Packet over SONET network and optical channel Metropolitan Area Networks; IP over 10 Gbps SONET (transition in progress)
ESnet4 (2007–2012): Partner with the US Research & Education community to build a dedicated national optical network; IP and virtual circuits on a configurable optical infrastructure with at least 5-6 optical channels of 10-100 Gbps each

ESnet's Place in U.S. and International Science
• ESnet, Internet2/Abilene, and National Lambda Rail (NLR) provide most of the nation's transit networking for basic science
  – Abilene provides national transit networking for most of the US universities by interconnecting the regional networks (mostly via the GigaPoPs)
  – ESnet provides national transit networking and ISP service for the DOE Labs
  – NLR provides various science-specific and network R&D circuits
• GÉANT plays a role in Europe similar to Abilene and ESnet in the US
  – It interconnects the European National Research and Education Networks (NRENs), to which the European R&E sites connect
  – GÉANT currently carries all non-LHC ESnet traffic to Europe, and this is a significant fraction of all ESnet traffic

ESnet3 Today Provides Global High-Speed Internet Connectivity for DOE Facilities and Collaborators (summer, 2006)
[Network map: the ESnet IP core (Packet over SONET optical ring and hubs) and the ESnet Science Data Network (SDN) core connect 42 end user sites – Office of Science sponsored (22), NNSA sponsored (12), joint sponsored (3), other sponsored (NSF LIGO, NOAA), and laboratory sponsored (6). International peerings include GÉANT (France, Germany, Netherlands, Italy, UK, etc.), CERN (USLHCnet, CERN+DOE funded), SINet (Japan), AARNet (Australia), CA*net4 (Canada), GLORIAD (Russia, China), TANet2/ASCC (Taiwan), Kreonet2 (Korea), SingAREN, BINP (Russia), and AMPATH (South America); domestic peerings include Internet2/Abilene high-speed peering points, other R&E peering points, and commercial peering points (MAE-E, PAIX-PA, Equinix, etc.). Link legend: 10 Gb/s SDN core, 10 Gb/s IP core, 2.5 Gb/s IP core, MAN rings (≥ 10 Gb/s), lab-supplied links, OC12 ATM (622 Mb/s), OC12/GigEthernet, OC3 (155 Mb/s), 45 Mb/s and less.]

ESnet is a Highly Reliable Infrastructure

“5 nines” (99.999%) “4 nines” “3 nines”

A Changing Science Environment is the Key Driver of the Next Generation ESnet
• Large-scale collaborative science – big facilities, massive data, thousands of collaborators – is now a significant aspect of the Office of Science ("SC") program

• SC science community is almost equally split between Labs and universities – SC facilities have users worldwide

• Very large international (non-US) facilities (e.g. LHC and ITER) and international collaborators are now a key element of SC science

• Distributed systems for data analysis, simulations, instrument operation, etc., are essential and are now common – in fact, the data analysis they support now generates 50% of all ESnet traffic

Planning the Future Network - ESnet4
There are many stakeholders for ESnet
1. SC programs
   – Advanced Scientific Computing Research
   – Basic Energy Sciences
   – Biological and Environmental Research
   – Fusion Energy Sciences
   – High Energy Physics
   – Nuclear Physics
   – Office of Nuclear Energy
2. Major scientific facilities
   – At DOE sites: large experiments, supercomputer centers, etc.
   – Not at DOE sites: LHC, ITER
3. SC supported scientists not at the Labs (mostly at US R&E institutions)
4. Other collaborating institutions (mostly US, European, and AP R&E)
   [These stakeholders account for 85% of all ESnet traffic]
5. Other R&E networking organizations that support major collaborators
   – Mostly US, European, and Asia Pacific networks
6. Lab operations and general population
7. Lab networking organizations

Planning the Future Network - ESnet4
• Requirements of the ESnet stakeholders are primarily determined by
1) Data characteristics of instruments and facilities that will be connected to ESnet
   • What data will be generated by instruments coming on-line over the next 5-10 years?
   • How and where will it be analyzed and used?
2) Examining the future process of science
   • How will the process of doing science change over 5-10 years?
   • How do these changes drive demand for new network services?
3) Studying the evolution of ESnet traffic patterns
   • What are the trends based on the use of the network in the past 2-5 years?
   • How must the network change to accommodate the future traffic patterns implied by the trends?

(1) Requirements from Instruments and Facilities
DOE SC Facilities that are, or will be, the top network users

• Advanced Scientific Computing Research
  – National Energy Research Scientific Computing Center (NERSC) (LBNL)*
  – National Leadership Computing Facility (NLCF) (ORNL)*
  – Argonne Leadership Class Facility (ALCF) (ANL)*
• Basic Energy Sciences
  – National Synchrotron Light Source (NSLS) (BNL)
  – Stanford Synchrotron Radiation Laboratory (SSRL) (SLAC)
  – Advanced Light Source (ALS) (LBNL)*
  – Advanced Photon Source (APS) (ANL)
  – Spallation Neutron Source (ORNL)*
  – National Center for Electron Microscopy (NCEM) (LBNL)*
  – Combustion Research Facility (CRF) (SNLL)*
• Biological and Environmental Research
  – William R. Wiley Environmental Molecular Sciences Laboratory (EMSL) (PNNL)*
  – Joint Genome Institute (JGI)
  – Structural Biology Center (SBC) (ANL)
• Fusion Energy Sciences
  – DIII-D Tokamak Facility (GA)*
  – Alcator C-Mod (MIT)*
  – National Spherical Torus Experiment (NSTX) (PPPL)*
  – ITER
• High Energy Physics
  – Tevatron Collider (FNAL)
  – B-Factory (SLAC)
  – Large Hadron Collider (LHC: ATLAS, CMS) (BNL, FNAL)*
• Nuclear Physics
  – Relativistic Heavy Ion Collider (RHIC) (BNL)*
  – Continuous Electron Beam Accelerator Facility (CEBAF) (JLab)*

*characterized by current case studies

The Largest Facility: Large Hadron Collider at CERN
[Photo of the LHC CMS detector – 15m x 15m x 22m, 12,500 tons, $700M – with a human figure for scale]

(2) Requirements from Examining the Future Process of Science

• In a major workshop [1], and in subsequent updates [2], requirements were generated by asking the science community how their process of doing science will / must change over the next 5 and next 10 years in order to accomplish their scientific goals
• Computer science and networking experts then assisted the science community in
  – analyzing the future environments
  – deriving middleware and networking requirements needed to enable these environments
• These were compiled as case studies that provide specific 5 & 10 year network requirements for bandwidth, footprint, and new services

Science Networking Requirements Aggregation Summary
(For each science area / facility: end-to-end reliability, connectivity, end-to-end bandwidth today and in 5 years, traffic characteristics, and required network services)

Magnetic Fusion Energy
  – Reliability: 99.999% (impossible without full redundancy)
  – Connectivity: DOE sites, US Universities, Industry
  – Bandwidth: 200+ Mbps today; 1 Gbps in 5 years
  – Traffic: bulk data, remote control
  – Services: guaranteed bandwidth, guaranteed QoS, deadline scheduling

NERSC and ALCF
  – Connectivity: DOE sites, US Universities, International, other ASCR supercomputers
  – Bandwidth: 10 Gbps today; 20 to 40 Gbps in 5 years
  – Traffic: bulk data, remote control, remote file system sharing
  – Services: guaranteed bandwidth, guaranteed QoS, deadline scheduling, PKI / Grid

NLCF
  – Connectivity: DOE sites, US Universities, Industry, International
  – Bandwidth: backbone bandwidth parity today and in 5 years
  – Traffic: bulk data, remote file system sharing

Nuclear Physics (RHIC)
  – Connectivity: DOE sites, US Universities, International
  – Bandwidth: 12 Gbps today; 70 Gbps in 5 years
  – Traffic: bulk data
  – Services: guaranteed bandwidth, PKI / Grid

Spallation Neutron Source
  – Reliability: high (24x7 operation)
  – Connectivity: DOE sites
  – Bandwidth: 640 Mbps today; 2 Gbps in 5 years
  – Traffic: bulk data

Advanced Light Source
  – Connectivity: DOE sites, US Universities, Industry
  – Bandwidth: 1 TB/day (300 Mbps) today; 5 TB/day (1.5 Gbps) in 5 years
  – Traffic: bulk data, remote control
  – Services: guaranteed bandwidth, PKI / Grid

Bioinformatics
  – Connectivity: DOE sites, US Universities
  – Bandwidth: 625 Mbps today (12.5 Gbps in two years); 250 Gbps in 5 years
  – Traffic: bulk data, remote control, point-to-multipoint
  – Services: guaranteed bandwidth, high-speed multicast

Chemistry / Combustion
  – Connectivity: DOE sites, US Universities, Industry
  – Bandwidth: 10s of gigabits per second in 5 years
  – Traffic: bulk data
  – Services: guaranteed bandwidth, PKI / Grid

Climate Science
  – Connectivity: DOE sites, US Universities, International
  – Bandwidth: 5 PB per year (5 Gbps) in 5 years
  – Traffic: bulk data, remote control
  – Services: guaranteed bandwidth, PKI / Grid

High Energy Physics (LHC) – immediate requirements and drivers
  – Reliability: 99.95+% (less than 4 hrs/year downtime)
  – Connectivity: US Tier1 (FNAL, BNL), US Tier2 (universities), International (Europe, Canada)
  – Bandwidth: 10 Gbps today; 60 to 80 Gbps (30-40 Gbps per US Tier1) in 5 years
  – Traffic: bulk data, coupled data analysis processes
  – Services: guaranteed bandwidth, traffic isolation, PKI / Grid

Changing Science Environment ⇒ New Demands on Network
• Increased capacity
  – Needed to accommodate a large and steadily increasing amount of data that must traverse the network
• High network reliability
  – Essential when interconnecting components of distributed large-scale science
• High-speed, highly reliable connectivity between Labs and US and international R&E institutions
  – To support the inherently collaborative, global nature of large-scale science
• New network services to provide bandwidth guarantees
  – Provide for data transfer deadlines for remote data analysis, real-time interaction with instruments, coupled computational simulations, etc.

17 3) These Trends are Seen in Observed Evolution of Historical ESnet Traffic Patterns

1400 1200 top 100 1000 sites 800 600 400

Terabytes / month Terabytes 200 0 Jul, 00 Jul, 02 Jul, 03 Jul, 04 Jul, 05 Jul, 06 Jul, Jul, 01 Jul, Jan, 00 Jan, 01 Jan, 02 Jan, 03 Jan, 04 Jan, 05 Jan, 06 Jan,

ESnet Monthly Accepted Traffic, January, 2000 – June, 2006 •ESnet is currently transporting more than1 petabyte (1000 terabytes) per month •More than 50% of the traffic is now generated by the top 100 sites 18 ESnet Traffic has Increased by 10X Every 47 Months, on Average, Since 1990
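For scale, 1 petabyte per month corresponds to an average rate of roughly 3 Gbps (my conversion, not on the slide):

```python
# Convert a monthly traffic volume to an average bit rate.
# Illustrative arithmetic; assumes 1 PB = 10**15 bytes and a 30-day month.
petabytes_per_month = 1.0
seconds_per_month = 30 * 24 * 3600            # ~2.6 million seconds
bits = petabytes_per_month * 1e15 * 8
print(f"average rate ~ {bits / seconds_per_month / 1e9:.1f} Gbps")   # ~3.1 Gbps
```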

ESnet Traffic has Increased by 10X Every 47 Months, on Average, Since 1990

[Log plot of ESnet Monthly Accepted Traffic, January 1990 – June 2006, in terabytes/month. Milestones: 100 MBy/mo in Aug. 1990; 1 TBy/mo in Oct. 1993 (38 months later); 10 TBy/mo in Jul. 1998 (57 months later); 100 TBy/mo in Nov. 2001 (40 months later); 1 PBy/mo in Apr. 2006 (53 months later). Exponential fit: R² = 0.9898.]

Requirements from Network Utilization Observation
• In 4 years, we can expect a 10x increase in traffic over current levels without the addition of production LHC traffic
  – Nominal average load on busiest backbone links is ~1.5 Gbps today
  – In 4 years that figure will be ~15 Gbps based on current trends (a back-of-the-envelope projection is sketched below)
• Measurements of this type are science-agnostic
  – It doesn't matter who the users are, the traffic load is increasing exponentially
  – Predictions based on this sort of forward projection tend to be conservative estimates of future requirements because they cannot predict new uses
• Bandwidth trends drive requirement for a new network architecture
  – New architecture/approach must be scalable in a cost-effective way
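A back-of-the-envelope projection of the 10x-per-47-months trend, using the ~1.5 Gbps starting load quoted above (my sketch, not from the slides):

```python
# Project link load forward assuming the historical 10x-per-47-months growth rate.
# Illustrative only; the starting load and growth rate are taken from the text above.
growth_factor_per_month = 10 ** (1 / 47)     # 10x every 47 months
current_load_gbps = 1.5                      # nominal average load on busiest links
months_ahead = 48                            # "in 4 years"

projected = current_load_gbps * growth_factor_per_month ** months_ahead
print(f"projected average load in {months_ahead} months: ~{projected:.0f} Gbps")
# ~16 Gbps, consistent with the ~15 Gbps figure quoted above
```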

Large-Scale Flow Trends, June 2006 (subtitle: "Onslaught of the LHC")

[Chart: traffic volume of the top 30 AS-AS flows, June 2006, in terabytes/month; AS-AS = mostly Lab to R&E site, a few Lab to R&E network, a few "other". Flows are color-coded by DOE Office of Science program: LHC / High Energy Physics Tier 0-Tier 1, LHC / HEP Tier 1-Tier 2, HEP, Nuclear Physics, Lab-university, LIGO (NSF), Lab-commodity, and Math. & Comp. (MICS). The largest flows include CERN → BNL, SLAC ↔ IN2P3 (France), SLAC ↔ UK and Italian R&E sites, BNL ↔ RIKEN (Japan), and Fermilab to many US and European Tier 2 sites. Note: FNAL → CERN traffic is comparable to BNL → CERN, but it is carried on layer 2 flows that are not yet monitored for traffic (monitoring coming soon).]

Traffic Patterns are Changing Dramatically

[Four charts of ESnet's largest flows for January 2005, July 2005, January 2006, and June 2006, each annotated with the total traffic (TBy) for that month]

• While the total traffic is increasing exponentially
  – Peak flow – that is, system-to-system – bandwidth is decreasing
  – The number of large flows is increasing

The Onslaught of Grids
Question: Why is peak flow bandwidth decreasing while total traffic is increasing?

[Flow plot: plateaus indicate the emergence of parallel transfer systems – a lot of systems transferring the same amount of data at the same time]

Answer: Most large data transfers are now done by parallel / Grid data movers (a simplified sketch follows this list)
• In June 2006, 72% of the hosts generating the top 1000 flows were involved in parallel data movers (Grid applications)
• This is the most significant traffic pattern change in the history of ESnet
• This has implications for the network architecture that favor path multiplicity and route diversity
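A minimal sketch of the parallel-mover pattern described above (hypothetical code, not the API of any particular Grid tool such as GridFTP): the dataset is split into chunks and several worker streams move chunks concurrently, so each individual flow stays at a modest rate while the aggregate transfer rate scales with the number of streams.

```python
# Hypothetical illustration of a parallel data mover: N concurrent streams,
# each moving a share of the dataset. Not the interface of any real Grid tool.
from concurrent.futures import ThreadPoolExecutor

def transfer_chunk(source, dest, offset, length):
    """Placeholder for one stream's transfer (e.g. one TCP connection)."""
    # A real mover would read [offset, offset + length) from source and write to dest.
    return length

def parallel_transfer(source, dest, total_bytes, num_streams=8):
    chunk = total_bytes // num_streams
    offsets = [i * chunk for i in range(num_streams)]
    with ThreadPoolExecutor(max_workers=num_streams) as pool:
        futures = [pool.submit(transfer_chunk, source, dest, off, chunk) for off in offsets]
        return sum(f.result() for f in futures)

# Eight streams at ~1 Gbps each look like eight moderate flows to the network,
# but deliver ~8 Gbps in aggregate -- hence the plateaus seen in the flow plots.
moved = parallel_transfer("site-a:/data/run42", "site-b:/archive/run42", 80 * 10**9)
```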

Requirements from Traffic Flow Observations
• Most of ESnet science traffic has a source or sink outside of ESnet
  – Drives requirement for high-bandwidth peering
  – Reliability and bandwidth requirements demand that peering be redundant
  – Multiple 10 Gbps peerings today; must be able to add more bandwidth flexibly and cost-effectively
  – Bandwidth and service guarantees must traverse R&E peerings
    • Collaboration with other R&E networks on a common framework is critical
    • Seamless fabric
• Large-scale science is now the dominant user of the network
  – Satisfying the demands of large-scale science traffic into the future will require a purpose-built, scalable architecture
  – Traffic patterns are different than commodity Internet

Network Observation – Circuit-like Behavior
Look at the top 20 traffic generators' historical flow patterns: over 1 year, the flow / "circuit" duration is about 3 months

[Chart: gigabytes/day for the LIGO – CalTech host-to-host flow, 9/23/04 – 9/23/05 (with a gap of missing data)]

The ESnet Response to the Requirements

I) A new network architecture and implementation strategy
• Rich and diverse network topology for flexible management and high reliability
• Dual connectivity at every level for all large-scale science sources and sinks
• A partnership with the US research and education community to build a shared, large-scale, R&E-managed optical infrastructure
  – a scalable approach to adding bandwidth to the network
  – dynamic allocation and management of optical circuits
II) Development and deployment of a virtual circuit service
• Develop the service cooperatively with the networks that are intermediate between DOE Labs and major collaborators to ensure end-to-end interoperability

Next Generation ESnet: I) Architecture and Configuration

• Main architectural elements and the rationale for each element
1) A high-reliability IP core (e.g. the current ESnet core) to address
   – General science requirements
   – Lab operational requirements
   – Backup for the SDN core
   – Vehicle for science services
   – Full-service IP routers
2) Metropolitan Area Network (MAN) rings to provide
   – Dual site connectivity for reliability
   – Much higher site-to-core bandwidth
   – Support for both production IP and circuit-based traffic
   – Multiple connections between the SDN and IP cores
2a) Loops off of the backbone rings to provide
   – Dual site connections where MANs are not practical
3) A Science Data Network (SDN) core for
   – Provisioned, guaranteed-bandwidth circuits to support large, high-speed science data flows
   – Very high total bandwidth
   – Multiply connecting MAN rings for protection against hub failure
   – An alternate path for production IP traffic
   – Less expensive routers/switches
   – An initial configuration targeted at LHC, which is also the first step toward the general configuration that will address all SC requirements
   – The ability to meet other, as yet unknown, bandwidth requirements by adding lambdas

ESnet3 Core Architecture and Implementation

[Diagram: a single ESnet IP core ring with ESnet hubs / core network connection points, metro area rings (MANs), ESnet sites, and connections to other networks]

Strengths
• Single cuts in any ring will not isolate any site
• Most SC Labs are dually connected to the core ring, so no single cut will isolate the network
Weaknesses
• Core ring, MAN, and sites connect in a single hub – a single point of failure
• Two cuts in the core ring on opposite sides of a hub partition the network
• No route diversity

ESnet4 Core Architecture and Implementation

[Diagram: the ESnet IP core and the ESnet Science Data Network (SDN) core as two interconnected rings, with ESnet IP routers, ESnet SDN circuit switches, metro area rings (MANs), and ESnet sites]

Strengths
• No single circuit or equipment failure can isolate SC Labs
• Route diversity permits traffic optimization (especially for SDN circuits) for better utilization of capacity and for failover
• Diverse footprint allows better access to US regional nets and LHC Tier 2 sites (universities)
Weaknesses
• Both IP and SDN circuits (optical channels) are on a single fiber, so simultaneous cuts on opposite sides of an outer ring will still partition the network

ESnet4 Configuration
Core networks: 40-50 Gbps in 2009-2010, 160-400 Gbps in 2011-2012
[Map of the planned ESnet4 footprint: Production IP core (10 Gbps) and Science Data Network core (20-30-40 Gbps) hubs, including Seattle, Boise, Sunnyvale, LA, San Diego, Denver, Albuquerque, El Paso, Kansas City, Tulsa, Houston, Chicago, Atlanta, Jacksonville, Boston, and Washington DC, plus possible additional hubs; MANs (20-60 Gbps) or backbone loops for site access at the primary DOE Labs; high-speed cross-connects with Internet2/Abilene; international connections to Canada (CANARIE), Europe (GÉANT), CERN (30 Gbps), Asia-Pacific, GLORIAD (Russia and China), Australia, and South America (AMPATH). Core network fiber path is ~14,000 miles / 24,000 km.]

Next Generation ESnet: II) Virtual Circuits
• Traffic isolation and traffic engineering
  – Provides for high-performance, non-standard transport mechanisms that cannot co-exist with commodity TCP-based transport
  – Enables the engineering of explicit paths to meet specific requirements, e.g. bypassing congested links by using lower-bandwidth, lower-latency paths
• Guaranteed bandwidth (Quality of Service (QoS))
  – User-specified bandwidth
  – Addresses deadline scheduling (a simple sizing calculation is sketched after this list)
    • Where fixed amounts of data have to reach sites on a fixed schedule, so that the processing does not fall far enough behind that it could never catch up – very important for experiment data analysis
• Reduces cost of handling high-bandwidth data flows
  – Highly capable routers are not necessary when every packet goes to the same place
  – Lower-cost (by a factor of ~5x) switches can forward these relatively static flows
• Secure
  – The circuits are "secure" to the edges of the network (the site boundary) because they are managed by the control plane of the network, which is isolated from the general traffic
• Provides end-to-end connections between Labs and collaborator institutions
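As a simple illustration of the deadline-scheduling idea above (my sketch, not part of any ESnet tool): the bandwidth a circuit must guarantee follows directly from the data volume and the delivery deadline.

```python
# Sketch: size a guaranteed-bandwidth reservation from a data volume and a deadline.
# Illustrative arithmetic only; not part of any ESnet service interface.
def required_gbps(data_gigabytes: float, deadline_seconds: float) -> float:
    """Minimum sustained rate (Gbps) needed to move the data before the deadline."""
    return data_gigabytes * 8 / deadline_seconds

# e.g. 10 TB of experiment data that must arrive within 8 hours:
print(f"{required_gbps(10_000, 8 * 3600):.1f} Gbps")   # ~2.8 Gbps
```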

OSCARS: Guaranteed Bandwidth VC Service for SC Science
• ESnet Virtual Circuit project: On-demand Secured Circuits and Advanced Reservation System (OSCARS)
• To ensure compatibility, the design and implementation is done in collaboration with the other major science R&E networks and end sites
  – Internet2: Bandwidth Reservation for User Work (BRUW) – development of a common code base
  – GÉANT: Bandwidth on Demand (GN2-JRA3), Performance and Allocated Capacity for End-users (SA3-PACE), and Advance Multi-domain Provisioning System (AMPS), extending to the NRENs
  – BNL: TeraPaths – A QoS Enabled Collaborative Data Sharing Infrastructure for Peta-scale Computing Research
  – GA: Network Quality of Service for Magnetic Fusion Research
  – SLAC: Internet End-to-end Performance Monitoring (IEPM)
  – USN: Experimental Ultra-Scale Network Testbed for Large-Scale Science
• In its current phase this effort is being funded as a research project by the Office of Science, Mathematical, Information, and Computational Sciences (MICS) Network R&D Program
• A prototype service has been deployed as a proof of concept
  – To date more than 20 accounts have been created for beta users, collaborators, and developers
  – More than 100 reservation requests have been processed

Federated Trust Services – Support for Large-Scale Collaboration
• Remote, multi-institutional identity authentication is critical for distributed, collaborative science in order to permit sharing of widely distributed computing and data resources and other Grid services
• Public Key Infrastructure (PKI) is used to formalize the existing web of trust within science collaborations and to extend that trust into cyberspace
  – The function, form, and policy of the ESnet trust services are driven entirely by the requirements of the science community and by direct input from the science community
• International-scope trust agreements that encompass many organizations are crucial for large-scale collaborations
  – ESnet has led in negotiating and managing the cross-site, cross-organization, and international trust relationships to provide policies that are tailored for collaborative science
• This service, together with the associated ESnet PKI service, is the basis of the routine sharing of HEP Grid-based computing resources between the US and Europe

Usage Statistics for the DOEGrids Certification Authority that Issues Identity Certificates for Grid Authentication

[Chart: monthly counts of user certificates, service certificates, expired (+revoked) certificates, total certificates issued, and total certificate requests since January 2003; the production service began in June 2003]
• User Certificates: 1999
• Host & Service Certificates: 3461
• Total No. of Certificates: 5479
• Total No. of Requests: 7006
• ESnet SSL Server CA Certificates: 38
• DOEGrids CA 2 CA Certificates (NERSC): 15
• FusionGRID CA certificates: 76
* Report as of Jun 15, 2005

Summary
• ESnet is currently satisfying its mission by enabling SC science that is dependent on networking and distributed, large-scale collaboration:
  "The performance of ESnet over the past year has been excellent, with only minimal unscheduled down time. The reliability of the core infrastructure is excellent. Availability for users is also excellent." – DOE 2005 annual review of LBL
• ESnet has put considerable effort into gathering requirements from the DOE science community, and has a forward-looking plan and the expertise to meet the five-year SC requirements
  – A Lehman review of ESnet (Feb. 2006) strongly endorsed the plan presented here

References
1. High Performance Network Planning Workshop, August 2002 – http://www.doecollaboratory.org/meetings/hpnpw
2. Science Case Studies Update, 2006 (contact [email protected])
3. DOE Science Networking Roadmap Meeting, June 2003 – http://www.es.net/hypertext/welcome/pr/Roadmap/index.html
4. DOE Workshop on Ultra High-Speed Transport Protocols and Network Provisioning for Large-Scale Science Applications, April 2003 – http://www.csm.ornl.gov/ghpn/wk2003
5. Science Case for Large Scale Simulation, June 2003 – http://www.pnl.gov/scales/
6. Workshop on the Road Map for the Revitalization of High End Computing, June 2003 – http://www.cra.org/Activities/workshops/nitrd – http://www.sc.doe.gov/ascr/20040510_hecrtf.pdf (public report)
7. ASCR Strategic Planning Workshop, July 2003 – http://www.fp-mcs.anl.gov/ascr-july03spw
8. Planning Workshops – Office of Science Data-Management Strategy, March & May 2004 – http://www-conf.slac.stanford.edu/dmw2004

Additional Information

LHC Tier 0, 1, and 2 Connectivity Requirements Summary

[Map: CERN (Tier 0) connects via USLHCNet to the US Tier 1 centers BNL (ATLAS) and FNAL (CMS), and via CANARIE to TRIUMF (ATLAS Tier 1, Canada); the ESnet IP core and SDN, the Abilene / GigaPoP footprint, GÉANT, and virtual circuits provide the paths to the Tier 2 sites. Legend: USLHC nodes, Abilene/GigaPoP nodes, ESnet IP core hubs, ESnet SDN/NLR hubs, Tier 1 centers, Tier 2 sites, cross-connects with Internet2/Abilene.]
• Direct connectivity T0-T1-T2
  – USLHCNet to ESnet to Abilene
• Backup connectivity
  – SDN, GLIF, VCs

Example Case Study Summary Matrix: Fusion
• Considers instrument and facility requirements, the process of science drivers, and the resulting network requirements, cross-cut with timelines (the bandwidth arithmetic implied by the near-term and 5-year numbers is sketched after this matrix)

Near-term
  – Instruments and facilities: Each experiment only gets a few days per year, so high productivity is critical (the more that you can analyze between shots, the more effective you can make the next shot). Experiment episodes ("shots") generate 2-3 GBytes every 20 minutes, which has to be delivered to the remote analysis sites in two minutes in order to analyze before the next shot.
  – Process of science: Real-time data access and analysis for experiment steering; shared visualization capabilities (community-wide); highly collaborative experiment and analysis environment.
  – Network services and middleware: PKI certificate authorities that enable strong authentication of the community members and the use of Grid security tools and services; directory services that can be used to provide the naming root and high-level indexing of shared, persistent data that transforms into community information and knowledge; efficient means to sift through large data repositories to extract meaningful information from unstructured data.

5 years
  – Instruments and facilities: 10 GBytes generated by the experiment every 20 minutes (the time between shots), to be delivered in two minutes; GByte subsets of much larger simulation datasets to be delivered in two minutes for comparison with the experiment; simulation data scattered across the United States.
  – Process of science: Real-time data analysis for experiment steering combined with simulation interaction = big productivity increase; real-time visualization and interaction among collaborators across the United States; integrated simulation of the distinct regions of the reactor will produce a much more realistic model of the fusion process.
  – Network: Network bandwidth and data analysis computing capacity guarantees (quality of service) for inter-shot data analysis; Gbits/sec for 20 seconds out of 20 minutes, guaranteed; 5 to 10 remote sites involved for data analysis and visualization.
  – Middleware: Parallel network I/O between simulations, data archives, experiments, and visualization; high-quality 7x24 PKI identity authentication infrastructure; end-to-end quality of service and quality of service management; secure/authenticated transport to ease access through firewalls; reliable data transfer; transient and transparent data replication for real-time reliability; support for human collaboration tools; transparent security; global directory and naming services needed to anchor all of the distributed metadata; support for "smooth" collaboration in a high-stress environment.

5+ years
  – Instruments and facilities: Simulations generate 100s of TBytes; ITER – a TByte per shot, a PB per year.
  – Process of science: Real-time remote operation of the experiment; comprehensive integrated simulation.
  – Network: Quality of service for network latency and reliability, and for co-scheduling computing resources.
  – Middleware: Management functions for network quality of service that provide the request and access mechanisms for the experiment run time, periodic traffic noted above.
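The near-term and 5-year fusion numbers above imply the end-to-end rates that appear in the earlier requirements summary (200+ Mbps today, approaching 1 Gbps in 5 years). A quick check (my arithmetic, not from the slides):

```python
# Check the network rates implied by the fusion case study numbers above.
# Illustrative arithmetic only.
def rate_mbps(gigabytes: float, seconds: float) -> float:
    return gigabytes * 8 * 1000 / seconds

# Near term: 2-3 GB per shot, delivered to remote analysis sites within 2 minutes
print(f"near term: {rate_mbps(3, 120):.0f} Mbps")    # ~200 Mbps

# 5 years: 10 GB per shot, still within 2 minutes
print(f"5 years:   {rate_mbps(10, 120):.0f} Mbps")   # ~667 Mbps, approaching 1 Gbps
```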

Large-Scale Flow Trends – 2004/2005
(Among other things these observations help establish the network footprint requirements)

[Chart: ESnet top 20 host-to-host flows by site, Sept. 2004 to Sept. 2005, in TB/year. Dominant flows: BNL – RIKEN (Japan) (Nuclear Physics, RHIC); FNAL – IN2P3 (France), FNAL – UBC (Canada), SLAC – IN2P3 (France), SLAC – INFN Padova and Bologna (Italy), and SLAC – Rutherford Lab (UK) (High Energy Physics); LIGO – CalTech (instrument – university); and ESnet Bay Area MAN test traffic.]

The Increasing Dominance of Science Traffic
Traffic Volume of the Top 100 AS-AS Flows by Month
(Mostly Lab to R&E site, a few Lab to R&E network – all science)

[Chart: terabytes/month for the top 100 AS-AS flows, by month, 2004 – 2006]

Parallel Data Movers now Predominate
Look at the hosts involved in the top 100 host-to-host flows on 2006-01-31 – the plateaus in the flow plot are all parallel transfers (thanks to Eli Dart for this observation)

[Table of host pairs and TB transferred, showing groups of hosts moving data in parallel: A13201x.N1.Vanderbilt.Edu ↔ lstore1-4.fnal.gov (roughly 6-11 TB per pair); bbr-xfer0x.slac.stanford.edu ↔ babar{,2,3}.fzk.de (roughly 2-3 TB per pair); bbr-xfer0x.slac.stanford.edu ↔ bbr-datamove09/10.cr.cnaf.infn.it (roughly 2-4 TB per pair); bbr-xfer0x.slac.stanford.edu ↔ csfmove01.rl.ac.uk and move03.gridpp.rl.ac.uk (roughly 6-7 TB per pair); babar.fzk.de and bbr-export01/02.pd.infn.it → bbr-xfer0x.slac.stanford.edu (roughly 3-20 TB per pair)]

Network Observation – Circuit-like Behavior (2)
Look at the top 20 traffic generators' historical flow patterns: over 1 year, flow / "circuit" duration is about 1 day to 1 week

[Chart: gigabytes/day for the SLAC – IN2P3 (France) host-to-host flow, 9/23/04 – 9/23/05 (with a gap of missing data)]

ESnet3 Overall Architecture

[Diagram: the ESnet IP core ring with ESnet hubs / core network connection points, metro area rings (MANs), and ESnet sites, peering with the Abilene core, US R&E regional nets, international R&E nets, and US commercial ISPs]

Virtual Circuit Service Functional Requirements
• Support user/application VC reservation requests (a sketch of such a request appears at the end of this section)
  – Source and destination of the VC
  – Bandwidth, start time, and duration of the VC
  – Traffic characteristics (e.g. flow specs) to identify traffic designated for the VC
• Manage allocations of scarce, shared resources
  – Authentication to prevent unauthorized access to this service
  – Authorization to enforce policy on reservation/provisioning
  – Gathering of usage data for accounting
• Provide circuit setup and teardown mechanisms and security
  – Widely adopted and standard protocols (such as MPLS and GMPLS) are well understood within a single domain
  – Cross-domain interoperability is the subject of ongoing, collaborative development
  – Secure end-to-end connection setup is provided by the network control plane
• Enable the claiming of reservations
  – Traffic destined for the VC must be differentiated from "regular" traffic
• Enforce usage limits
  – Per-VC admission control polices usage, which in turn facilitates guaranteed bandwidth

  – Consistent per-hop QoS throughout the network for transport predictability
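A minimal sketch of the information a VC reservation request would carry, based on the fields listed above (hypothetical data structure, not the actual OSCARS message schema):

```python
# Hypothetical representation of a virtual circuit reservation request, covering
# the fields listed above. Not the real OSCARS request format.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class VCReservationRequest:
    source: str              # ingress site / host
    destination: str         # egress site / host
    bandwidth_mbps: int      # guaranteed bandwidth
    start_time: datetime     # when the circuit becomes active
    duration: timedelta      # how long the reservation lasts
    flow_spec: dict          # traffic to be steered onto the VC (e.g. addresses, ports)
    requester_dn: str        # authenticated identity (e.g. an X.509 distinguished name)

# Example request (all values illustrative):
request = VCReservationRequest(
    source="fnal.gov",
    destination="cern.ch",
    bandwidth_mbps=5000,
    start_time=datetime(2006, 9, 1, 0, 0),
    duration=timedelta(hours=12),
    flow_spec={"protocol": "tcp", "dst_port": 2811},
    requester_dn="/DC=org/DC=doegrids/OU=People/CN=Example User",
)
```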