Services to the US Tier-1 Sites LHCOPN
April 4th, 2006
Joe Metzger
ESnet Engineering Group, Lawrence Berkeley National Laboratory
Outline
• Next Generation ESnet
  – Requirements
  – Architecture
  – Studying Architectural Alternatives
  – Reliability
  – Connectivity
  – 2010 Bandwidth and Footprint Goal
• ESnet Circuit Services
  – OSCARS
  – LHCOPN Circuits: BNL, FERMI
Next Generation ESnet
• Current IP backbone contract expires at the end of 2007
  – Backbone circuits
  – Hub colocation space
  – Some site access circuits
• Acquisition
  – Background research in progress
• Implementation
  – Major changes may happen in 2007
• No negative LHC impact
  – Should not change primary LHCOPN paths
  – May change/improve some US Tier 1 to US Tier 2 paths
Next Generation ESnet Requirements
• Greater reliability
  – Multiple connectivity at several levels
    • Two backbones: production IP and Science Data Network (SDN)
    • Redundant site access links
    • Redundant, high-bandwidth US and international R&E connections
  – Continuous, end-to-end monitoring to anticipate problems and assist in debugging distributed applications
• Connectivity
  – Footprint to reach major collaborators in the US, Europe, and Asia
  – Connections to all major R&E peering points
  – Initial build-out that satisfies near-term LHC connectivity requirements
• More bandwidth
  – Multiple-lambda network: SDN
  – Scalable bandwidth
  – Initial build-out that satisfies near-term LHC bandwidth requirements
Next Generation ESnet Architecture
• Main architectural elements and the rationale for each element:
1) A high-reliability IP core (e.g. the current ESnet core) to address
  – General science requirements
  – Lab operational requirements
  – Backup for the SDN core
  – Vehicle for science services
  – Full-service IP routers
2) Metropolitan Area Network (MAN) rings to provide
  – Dual site connectivity for reliability
  – Much higher site-to-core bandwidth
  – Support for both production IP and circuit-based traffic
  – Multiple connections between the SDN and IP cores
2a) Loops off the backbone rings to provide
  – Dual site connections where MANs are not practical
3) A Science Data Network (SDN) core for
  – Provisioned, guaranteed-bandwidth circuits to support large, high-speed science data flows
  – Very high total bandwidth
  – Multiple connections to MAN rings for protection against hub failure
  – An alternate path for production IP traffic
  – Less expensive routers/switches
  – An initial configuration targeted at LHC, which is also the first step toward the general configuration that will address all SC requirements
  – Capacity for other, as-yet-unknown bandwidth requirements by adding lambdas
ESnet Target Architecture: High-reliability IP Core
[Map: the IP core linking Seattle, Sunnyvale, LA, San Diego, Albuquerque, Denver, Chicago, Cleveland, Atlanta, New York, and Washington DC. Legend: primary DOE labs, possible hubs, SDN hubs, IP core hubs.]
ESnet Target Architecture: Metropolitan Area Rings
[Map: the same hub set, with metropolitan area rings added around the core hubs. Legend as on the previous slide.]
ESnet Target Architecture: Loops Off the IP Core
[Map: the same hub set, with loops off the backbone providing dual site connections where MANs are not practical; the CERN connection is shown. Legend as on the previous slides.]
ESnet Target Architecture: Science Data Network
[Map: the Science Data Network core overlaid on the same hub set. Legend as on the previous slides.]
ESnet Target Architecture: IP Core + Science Data Network Core + Metro Area Rings
[Map: the combined architecture showing the IP core, SDN core, metropolitan area rings, loops off the backbone, and international connections at multiple hubs. Legend: 10-50 Gbps circuits; production IP core; Science Data Network core; metropolitan area networks; international connections; primary DOE labs; possible hubs; SDN hubs; IP core hubs.]
Studying Architectural Alternatives
• ESnet has considered a number of technical variations that could result from the acquisition process
• Dual carrier model
  – One carrier provides IP circuits, a second provides SDN circuits
  – Physically diverse hubs, fiber, and conduit
    • Diverse fiber routes in some areas
• Single carrier model
  – One carrier provides both SDN and IP circuits
  – Use multiple smaller rings to improve reliability in the face of partition risks
    • In the event of a dual cut, fewer sites are isolated because of richer cross-connections (see the sketch after this list)
    • Multiple lambdas also provide some level of protection
  – May require additional engineering effort, colo space, and equipment to meet the reliability requirements
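The dual-cut point can be made concrete with a minimal, self-contained sketch. The topologies below are hypothetical 8-hub rings, not the actual ESnet footprint; the code counts how many hubs a worst-case simultaneous two-link failure can isolate, with and without an extra cross-connect.

```python
from itertools import combinations

def components(nodes, edges):
    """Connected components of an undirected graph, by DFS over an edge list."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def worst_dual_cut(nodes, edges):
    """Largest number of hubs separated from the main component under
    any simultaneous failure of two links."""
    worst = 0
    for cut in combinations(edges, 2):
        remaining = [e for e in edges if e not in cut]
        comps = components(nodes, remaining)
        worst = max(worst, len(nodes) - max(len(c) for c in comps))
    return worst

# Hypothetical 8-hub topologies: a single large ring, then the same
# ring with one cross-connect added (richer interconnection).
nodes = list(range(8))
ring = [(i, (i + 1) % 8) for i in range(8)]
print(worst_dual_cut(nodes, ring))             # -> 4 (plain ring)
print(worst_dual_cut(nodes, ring + [(0, 4)]))  # -> 3 (ring + cross-connect)
```

Even one added cross-connect reduces the worst-case isolation; multiple smaller, richly connected rings push it down further.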
Dual Carrier Model
[Map: the IP core and SDN core carried by separate providers across the same hub set (Seattle, Sunnyvale, LA, San Diego, Albuquerque, Denver, Chicago, Cleveland, Atlanta, New York, Washington DC). Legend: primary DOE labs, IP core hubs, possible hubs, SDN hubs; 10-50 Gbps circuits; production IP core; Science Data Network core; metropolitan area networks.]
Single Carrier Model
[Diagram: both cores from one carrier, with SDN and IP as different lambdas on the same fiber. Hub sites: Seattle, Boise, Sunnyvale, Denver, San Diego, Kansas City, Chicago, Cleveland, New York, Washington DC, Albuquerque, San Antonio, Atlanta, Jacksonville. Insets show router+switch sites (an IP core router and an SDN core switch with MAN connections) and switch-only sites, and how sites, peers, and MAN rings attach; one lambda carries the IP core and multiple lambdas carry the SDN core.]
Reliability
• Reliability within ESnet
  – Robust architecture with redundant equipment to reduce or eliminate the risk of single or multiple failures
• End-to-end reliability
  – Close planning collaboration with national and international partners
  – Multiple distributed connections with important national and international R&E networks
  – Support for end-to-end measurement and monitoring across multiple domains (PerfSONAR)
    • A collaboration between ESnet, GEANT, Internet2, and European NRENs
    • Building measurement infrastructure for use by other monitoring and measurement tools (a sketch of this kind of multi-domain probing follows below)
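As a rough illustration of multi-domain path monitoring, here is a minimal sketch that probes measurement points along an end-to-end path and reports round-trip times. The hostnames are hypothetical, and this is not the PerfSONAR code base; real deployments would query perfSONAR measurement infrastructure rather than shelling out to ping.

```python
import re
import subprocess
import time

# Hypothetical measurement points along an end-to-end path: site edge,
# ESnet hub, and the European NREN side of the path.
PROBE_TARGETS = [
    "ps.example-site.gov",
    "ps.example-esnet.net",
    "ps.example-geant.net",
]

def ping_rtt_ms(host: str) -> float | None:
    """Return the average RTT to host in ms, or None if unreachable."""
    result = subprocess.run(
        ["ping", "-c", "4", host],
        capture_output=True, text=True,
    )
    # Parse the "min/avg/max" summary line emitted by ping.
    match = re.search(r"= [\d.]+/([\d.]+)/", result.stdout)
    return float(match.group(1)) if match else None

if __name__ == "__main__":
    for host in PROBE_TARGETS:
        rtt = ping_rtt_ms(host)
        status = f"{rtt:.1f} ms" if rtt is not None else "UNREACHABLE"
        print(f"{time.strftime('%H:%M:%S')} {host:28s} {status}")
```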
Connectivity
[Map: the target architecture with its international connections: CERN, GEANT (Europe), CANARIE (Canada), AMPATH (South America), GLORIAD, and Asia-Pacific/Australia peerings, plus high-speed cross-connects with Abilene gigapops and international peers. Legend: primary DOE labs, IP core hubs, SDN hubs; 10-50 Gbps circuits; production IP core; Science Data Network core; metropolitan area networks; international connections.]
ESnet 2007 SDN+MANs Upgrade Increment
[Map: the first SDN build-out on NLR fiber, touching Seattle, Portland, Boise, Ogden, Sunnyvale, LA, San Diego, Phoenix, Denver, Albuquerque, El Paso-Las Cruces, Dallas, San Antonio, Houston, Tulsa, KC, Chicago, Cleveland, Pittsburgh, NYC, Wash DC, Raleigh, Atlanta, Jacksonville, Pensacola, and Baton Rouge. Legend: ESnet IP core hubs; new hubs; ESnet SDN/NLR switch/router hubs; ESnet SDN/NLR switch hubs; NLR PoPs; ESnet IP core sub-hubs; ESnet Science Data Network core (10G/link); CERN/DOE-supplied links CERN-1, CERN-2, CERN-3 (10G/link); international IP connections GÉANT-1, GÉANT-2 (10G/link).]
ESnet 2008 SDN+MANs Upgrade Increment
[Map: the same footprint as the 2007 increment, adding PPPL, GA, and an ORNL-ATL link. Legend as on the previous slide.]
ESnet 2009 SDN+MANs Upgrade Increment
[Map: further SDN ring build-out over the same footprint (PPPL, GA, and ORNL-ATL shown). Legend as on the previous slides.]
ESnet 2010 SDN+MANs Upgrade Increment (Up to Nine Rings Can Be Supported with the Hub Implementation)
[Map: the 2010 build-out over the same footprint, highlighting SDN links added since the last presentation to DOE. Legend as on the previous slides.]
Bandwidth and Footprint Goal – 2010
[Map: the 2010 target: IP core at 10 Gbps, Science Data Network core at 30 Gbps, metropolitan area rings at 20+ Gbps, and CERN at 30 Gbps, with international connections to Europe (GEANT), Canada (CANARIE), South America (AMPATH), Asia-Pacific/Australia, and GLORIAD, plus high-speed cross-connects with I2/Abilene. 160-400 Gbps is possible in 2011 with an equipment upgrade. Legend: primary DOE labs, IP core hubs, possible new hubs, SDN hubs; production IP core; SDN core; MANs; international connections.]
OSCARS: Guaranteed Bandwidth Virtual Circuit Service
• ESnet On-demand Secured Circuits and Advanced Reservation System (OSCARS)
• To ensure compatibility, the design and implementation are done in collaboration with the other major science R&E networks and end sites
  – Internet2: Bandwidth Reservation for User Work (BRUW)
    • Development of a common code base
  – GEANT: Bandwidth on Demand (GN2-JRA3), Performance and Allocated Capacity for End-users (SA3-PACE), and Advance Multi-domain Provisioning System (AMPS); extends to NRENs
  – BNL: TeraPaths, a QoS-enabled collaborative data sharing infrastructure for petascale computing research
  – GA: network quality of service for magnetic fusion research
  – SLAC: Internet End-to-end Performance Monitoring (IEPM)
  – USN: experimental ultra-scale network testbed for large-scale science
• Its current phase is a research project funded by the Office of Science, Mathematical, Information, and Computational Sciences (MICS) Network R&D Program
• A prototype service has been deployed as a proof of concept
  – To date, more than 20 accounts have been created for beta users, collaborators, and developers
  – More than 100 reservation requests have been processed
  – BRUW interoperability tests successful
  – DRAGON interoperability tests planned
  – GEANT (AMPS) interoperability tests planned
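At the heart of an advance-reservation service like OSCARS is admission control: a request for a guaranteed-bandwidth circuit is accepted only if it fits alongside existing reservations for its whole duration. The following is a toy sketch of that check, not the OSCARS implementation; all names and numbers are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Reservation:
    start: int          # start time (epoch seconds)
    end: int            # end time (epoch seconds)
    bandwidth_mbps: int

def fits(link_capacity_mbps: int,
         existing: list[Reservation],
         request: Reservation) -> bool:
    """Admit the request only if, at every instant it is active, the sum
    of overlapping reservations plus the request stays within the link
    capacity. Load only increases at reservation start times, so it is
    enough to check those boundaries."""
    boundaries = {request.start}
    boundaries.update(r.start for r in existing)
    for t in boundaries:
        if not (request.start <= t < request.end):
            continue
        load = request.bandwidth_mbps + sum(
            r.bandwidth_mbps for r in existing if r.start <= t < r.end
        )
        if load > link_capacity_mbps:
            return False
    return True

# Example: a 10 Gbps SDN lambda with one standing 6 Gbps reservation.
existing = [Reservation(start=0, end=3600, bandwidth_mbps=6000)]
print(fits(10000, existing, Reservation(1800, 5400, 3000)))  # True
print(fits(10000, existing, Reservation(1800, 5400, 5000)))  # False
```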
ESnet Virtual Circuit Service Roadmap (2005-2008)
• Dedicated virtual circuits: dynamic provisioning of Multi-Protocol Label Switching (MPLS) circuits (Layer 3) - initial production service
• Dynamic virtual circuit allocation: interoperability between VLANs and MPLS circuits (Layers 2 & 3) - full production service
• Generalized MPLS (GMPLS): interoperability between GMPLS circuits, VLANs, and MPLS circuits (Layers 1-3)
ESnet Portions of LHCOPN Circuits
• Endpoints are VLANs on a trunk
  – BNL and FERMI will see 3 Ethernet VLANs from ESnet
  – CERN will see 3 VLANs on both interfaces from USLHCnet
• Will be dynamic Layer 2 circuits using AToM
  – Virtual interfaces on the ends will be tied to VRFs
  – VRFs for each circuit will be tied together using an MPLS LSP or LDP
  – Manually configured (an illustrative endpoint sketch follows below)
• Dynamic provisioning of circuits with these capabilities is on the OSCARS roadmap for 2008
• USLHCnet portion will be static initially
  – They may explore using per-VLAN spanning tree
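To make the manual configuration concrete, here is a small generator that emits one end of such a Layer 2 circuit in IOS-style syntax. The "xconnect" statement is how EoMPLS/AToM pseudowires are commonly expressed, but exact syntax varies by platform and software version, and all names and numbers here are hypothetical; treat this as a sketch, not a deployable config.

```python
# Illustrative generator for one endpoint of a manually configured
# AToM Layer 2 circuit of the kind described above.

def atom_endpoint_config(interface: str, vlan: int,
                         peer_loopback: str, vc_id: int) -> str:
    return "\n".join([
        f"interface {interface}.{vlan}",
        f" encapsulation dot1q {vlan}",
        # Tie this VLAN subinterface to a pseudowire toward the far
        # end; the vc-id must match on both endpoints.
        f" xconnect {peer_loopback} {vc_id} encapsulation mpls",
    ])

# Hypothetical values: a BNL-facing VLAN cross-connected toward a
# Chicago hub router's loopback address.
print(atom_endpoint_config("TenGigabitEthernet1/1", 3401,
                           "10.10.10.1", 3401))
```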
Physical Connections
[Diagram (JCM, version 6.0, March 29th): physical connectivity among the CERN routers r01lcg and ro2lcg (Force10 E1200) and r01ext and r02ext (Cisco 6500); the USLHCnet switches e600gva1, e600gva2, e600nyc, and e600chi (Force10 E600); MANLAN (Cisco 6500); Starlight (Force10 E1200); the ESnet devices aoa-mr1 (Cisco 6500), aoa-cr1 (Juniper T320), chi-sl-sdn1 (T320), chi-sl-mr1 (Cisco 6500), and chi-cr1 (T320); bnl-mr1 (Cisco 6500) at BNL; and fnal-mr1 (Cisco 6500) and fnal-rt1 (M20) at FERMI; with peerings to GEANT and CANARIE.]
BNL LHCOPN Circuits
[Diagram (JCM, version 6.0, March 29th): the same physical topology as the previous slide, with BNL's primary, 1st backup, and 2nd backup circuits and the routed path of last resort overlaid.]
• Circuits are carried across the LIMAN in MPLS LSPs and can use either path.
• The circuit to Starlight is carried in an LSP from BNL; it can go either way around the LIMAN and either way around the ESnet backbone between NYC and CHI. It will not go through AOA once the 60 Hudson hub is installed (FY07 or FY08).
• The green and yellow circuits are carried across aoa-cr1 until aoa-mr1 is deployed.
FERMI LHCOPN Circuits
[Diagram (JCM, version 6.0, March 29th): the same physical topology, with FERMI's primary, 1st backup, and 2nd backup circuits and the routed path of last resort overlaid.]
• Only one path across the ESnet links between fnal-mr1 and Starlight is shown for the green and yellow paths, to simplify the diagram; they ride MPLS LSPs and will automatically fail over to any other available links.
• The red path will also fail over to other circuits in the Chicago area, or to other paths between Chicago and New York if available.
• Note: the green primary circuit will be 10 Gbps across ESnet. The red path is shared with other production traffic.
Outstanding Issues
• Is a single point of failure at the Tier 1 edges a reasonable long-term design?
• Bandwidth guarantees in outage scenarios
  – How do the networks signal to the applications that something has failed?
  – How do sites sharing a link during a failure coordinate bandwidth utilization?
• What expectations should be set for fail-over times?
  – Should BGP timers be tuned?
• We need to monitor the backup paths' ability to transfer packets end-to-end, to ensure they will work when needed.
  – How are we going to do it? (One possible approach is sketched below.)
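One possible approach to the last question: periodically source probes onto the backup VLAN and alarm on sustained loss, so a dead backup is noticed before it is needed. A minimal sketch follows; the peer address and interface name are hypothetical, and "ping -I" for selecting the source interface is Linux-specific.

```python
import subprocess
import time

# Hypothetical values: the far-end address of a backup circuit and the
# local subinterface that places traffic onto that backup VLAN.
BACKUP_PEER = "192.0.2.10"
BACKUP_IFACE = "eth1.3402"
LOSS_ALARM_PCT = 20

def probe_loss_pct(peer: str, iface: str, count: int = 10) -> int:
    """Send count pings out iface and return the packet-loss percentage
    (100 if the command fails entirely)."""
    result = subprocess.run(
        ["ping", "-I", iface, "-c", str(count), peer],
        capture_output=True, text=True,
    )
    # ping's summary line contains a token like "0%" for packet loss.
    for token in result.stdout.split():
        if token.endswith("%"):
            return int(float(token.rstrip("%")))
    return 100

while True:
    loss = probe_loss_pct(BACKUP_PEER, BACKUP_IFACE)
    if loss >= LOSS_ALARM_PCT:
        print(f"ALARM: backup path to {BACKUP_PEER} at {loss}% loss")
    time.sleep(300)  # probe every five minutes
```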