1
The Intersection of Grids and Networks: Where the Rubber Hits the Road
William E. Johnston
ESnet Manager and Senior Scientist
Lawrence Berkeley National Laboratory
2
Objectives of this Talk
• How a production R&E network works
• Why some types of services needed by Grids / widely distributed computing environments are hard
3
Outline
• How do Networks Work?
• Role of the R&E Core Network
• ESnet as a Core Network
o ESnet Has Experienced Exponential Growth Since 1992
o ESnet is Monitored in Many Ways
o How Are Problems Detected and Resolved?
• Operating Science Mission Critical Infrastructure
o Disaster Recovery and Stability
o Recovery from Physical Attack / Failure
o Maintaining Science Mission Critical Infrastructure in the Face of Cyberattack
• Services that Grids need from the Network
o Public Key Infrastructure example
4
How Do Networks Work?
• Accessing a service, Grid or otherwise, such as a Web server, FTP server, etc., from a client computer and client application (e.g. a Web browser) involves
o Target host names
o Host addresses
o Service identification
o Routing
5
How Do Networks Work?
• When one types “google.com” into a Web browser to use the search engine, the following takes place
o The name “google.com” is resolved to an Internet address by the Domain Name System (DNS) – a hierarchical directory service
o The address is attached to a network packet (which carries the data – a Google search request in this case), which is then sent out of the computer into the network
o The first place that the packet reaches is a router that must decide how to get that packet to its destination (google.com)
6
How Do Networks Work?
o In the Internet, routing is done “hot potato”
- Routers are in your site LANs and at your ISP, and each router typically communicates directly with several other routers
- The first router to receive your packet takes a quick look at the address and says, “if I send this packet to router B, that will probably take it closer to its destination.” So it sends it to B without further ado.
- Router B does the same thing, and so forth, until the packet reaches google.com
o What makes this work is routing protocols that exchange reachability information between all directly connected routers – “BGP” is the most common such protocol in WANs
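The hop-by-hop decision described above can be sketched in a few lines. This is a toy model, not how any production router is implemented: the three-router topology, the router names, and the address prefixes (taken from the ranges reserved for documentation) are all assumptions made for illustration. Each router looks only at the destination address, picks the most specific matching prefix in its own table, and hands the packet to that neighbor.

```python
import ipaddress


class Router:
    """A toy router: a name plus a forwarding table of (prefix, next hop)."""

    def __init__(self, name, routes):
        self.name = name
        # Each route maps an address prefix to the neighbor believed to be
        # closer to destinations inside that prefix.
        self.routes = [(ipaddress.ip_network(p), hop) for p, hop in routes]

    def next_hop(self, destination):
        """Return the neighbor for the longest prefix that matches."""
        addr = ipaddress.ip_address(destination)
        matches = [(net.prefixlen, hop) for net, hop in self.routes if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0])[1]


def forward(routers, start, destination, max_hops=16):
    """Follow next-hop decisions router by router; return the path taken."""
    path = [start]
    current = start
    for _ in range(max_hops):
        hop = routers[current].next_hop(destination)
        if hop is None or hop == "deliver":
            break  # no route, or the packet has reached the last router
        path.append(hop)
        current = hop
    return path


# Hypothetical topology: a site router, an ISP router, and the router in
# front of the destination host.
ROUTERS = {
    "site": Router("site", [("0.0.0.0/0", "isp")]),
    "isp": Router("isp", [("0.0.0.0/0", "site"), ("203.0.113.0/24", "dest-edge")]),
    "dest-edge": Router("dest-edge", [("203.0.113.0/24", "deliver")]),
}

print(forward(ROUTERS, "site", "203.0.113.10"))
```

In a real network the forwarding tables are not written by hand; they are filled in by the routing protocols (such as BGP) that exchange reachability information between neighbors.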
7
How Do Networks Work?
• Once the packet reaches its destination (the computer called google.com) it must be delivered to the Google search engine, as opposed to the Google mail server that may be running on the same machine.
o This is accomplished with a service identifier that is put on the packet by the browser (the client side application)
- The service identifier says that this packet is to be delivered to the Web server on the destination system – on each system every server/service has a unique identifier called a “port number”
o So when someone says that the Blaster/Lovsan worm is attacking port 135 on the system called google.com, they mean that a worm program somewhere in the Internet is trying to gain access to the service at port 135 on google.com (usually to exploit a vulnerability).
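A minimal sketch of how a destination system uses the port number to pick the receiving service. The service table is hypothetical; ports 80 (Web) and 25 (mail) are the conventional assignments, and the `Packet` type and `deliver` function are names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dst_host: str
    dst_port: int
    payload: str


# Hypothetical table of services listening on one host, keyed by port number.
SERVICES = {
    80: "web server",
    25: "mail server",
}


def deliver(packet, services=SERVICES):
    """Hand the payload to whichever service owns the destination port."""
    service = services.get(packet.dst_port)
    if service is None:
        return "dropped: nothing is listening on port %d" % packet.dst_port
    return "%s received %r" % (service, packet.payload)


print(deliver(Packet("google.com", 80, "search request")))
print(deliver(Packet("google.com", 135, "probe")))
```

The second call shows why a probe of port 135 is harmless on a host that runs no service there, and why it matters on a host that does.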
8
Role of the R&E Core Network: Transit (Deliver Every Packet)
[Figure: path of a packet from LBNL to Google, Inc. The packet leaves the LBNL gateway router, enters ESnet (core network) at a border router, crosses ESnet core routers to a peering router, passes to a peering router of a big ISP (e.g. SprintLink), and crosses that ISP's routers to Google.]
• border/gateway routers
o implement separate site and network provider policy (including site firewall policy)
• peering routers
o implement/enforce routing policy for each provider
o provide cyberdefense
• core routers
o focus on high-speed packet forwarding
10
What is ESnet
• ESnet is a large-scale, very high bandwidth network providing connectivity between DOE Science Labs and their science partners in the US, Europe, and Japan
• Essentially all of the national data traffic supporting US open science is carried by two networks – ESnet and Internet-2 / Abilene (which plays a similar role for the university community)
• ESnet is very different from commercial ISPs (Internet Service Providers) like Earthlink, AOL, etc.
o Most big ISPs provide small amounts of bandwidth to a large number of sites
o ESnet supplies very high bandwidth to a small number of sites
11
ESnet Connects DOE Facilities and Collaborators
[Figure: map of the ESnet core ring (Packet over SONET optical ring and hubs), the connected sites, and the peering points.]
• Hubs: SEA HUB, SNV HUB, ELP HUB, ALB HUB, CHI HUB, NYC HUB, ATL HUB, DC HUB
• 42 end user sites: Office of Science Sponsored (22), NNSA Sponsored (12), Joint Sponsored (3), Other Sponsored (NSF LIGO, NOAA), Laboratory Sponsored (6)
• Sites shown: TWC, JGI, SNLL, LBNL, SLAC, YUCCA MT, BECHTEL, PNNL, LIGO, INEEL, LANL, SNLA, AlliedSignal, PANTEX, ARM, KCP, NOAA, OSTI, ORAU, SRS, ORNL, JLAB, PPPL, ANL-DC, INEEL-DC, ORAU-DC, LLNL/LANL-DC, 4xLAB-DC, MIT, ANL, BNL, FNAL, AMES, NERSC, NREL, LLNL, GA, DOE-ALB, SDSC, GTN&NNSA
• Link types: International (high speed), OC192 (10G/s optical), OC48 (2.5 Gb/s optical), Gigabit Ethernet (1 Gb/s), OC12 ATM (622 Mb/s), OC12, OC3 (155 Mb/s), T3 (45 Mb/s), T1-T3, T1 (1 Mb/s)
• Networks: ESnet IP, QWEST ATM, Abilene
• Peering points: MAE-E, Starlight, Chi NAP, Fix-W, PAIX-W, MAE-W, NY-NAP, PAIX-E, Equinix, PNWG
• International connections: GEANT (Germany, France, Italy, UK, etc.), Sinet (Japan), Japan – Russia (BINP), CA*net4, CERN, MREN, Netherlands, Russia, StarTap, Taiwan (ASCC), KDDI (Japan), France, Switzerland, Taiwan (TANet2), Australia, Singaren
12
Current Architecture
[Figure: an ESnet hub on the ESnet core optical fiber ring, with routers (RTR) joined by 10GE links. The ESnet IP router connects to the site IP router and site LAN at an ESnet site, across the site – ESnet network policy demarcation (“DMZ”).]
Wave division multiplexing
• today typically 64 x 10 Gb/s optical channels per fiber
• channels (referred to as “lambdas”) are usually used in bi-directional pairs
Lambda channels are converted to electrical channels
• usually SONET data framing or Ethernet data framing
• can be clear digital channels (no framing – e.g. for digital HDTV)
A ring topology network is inherently reliable – all single point failures are mitigated by routing traffic in the other direction around the ring.
Peering – ESnet’s Logical Infrastructure – Connects the DOE Community With its Collaborators
[Figure: ESnet peering (connections to other networks) drawn on the hub map (SEA HUB, SNV HUB, NYC HUBS, ATL HUB), with peerings classed as University, International, Commercial, and Abilene. Peer counts appear next to each peering point.]
• Peering points shown: STARLIGHT, CHI NAP, MAE-E, NY-NAP, PAIX-E, EQX-ASH, MAX GPOP, MAE-W, FIX-W, PAIX-W (26 peers), EQX-SJ, PNW-GPOP, CalREN2, CENIC, SDSC, LANL TECHnet, Distributed 6TAP (19 peers)
• Abilene + 7 universities
• International: GEANT (Germany, France, Italy, UK, etc.), SInet (Japan), KEK, Japan – Russia (BINP), KDDI (Japan), France, CA*net4, CERN, MREN, Netherlands, Russia, StarTap, Taiwan (ASCC), Taiwan (TANet2), Australia, Singaren
ESnet provides complete access to the Internet by managing the full complement of Global Internet routes (about 150,000) at 10 general/commercial peering points + high-speed peerings with Abilene and the international networks.
14
What is Peering?
• Peering points exchange routing information that says “which packets I can get closer to their destination”
• This is a lot of work
• ESnet daily peering report (top 20 of about 100)
o Annotation on one peer in the report: peering with this outfit is not random, it carries routes that ESnet needs (e.g. to the Russian Backbone Net)

AS      routes   peer
1239    63384    SPRINTLINK
701     51685    UUNET-ALTERNET
209     47063    QWEST
3356    41440    LEVEL3
3561    35980    CABLE-WIRELESS
7018    28728    ATT-WORLDNET
2914    19723    VERIO
3549    17369    GLOBALCENTER
5511    8190     OPENTRANSIT
174     5492     COGENTCO
6461    5032     ABOVENET
7473    4429     SINGTEL
3491    3529     CAIS
11537   3327     ABILENE
5400    3321     BT
4323    2774     TWTELECOM
4200    2475     ALERON
6395    2408     BROADWING
2828    2383     XO
7132    1961     SBC
15
What is Peering?
• Why so many routes? So that when I want to get to someplace out of the ordinary, I can get there. For example: http://www-sbras.nsc.ru/eng/sbras/copan/microel_main.html (Technological Design Institute of Applied Microelectronics of SB RAS 630090, Novosibirsk, Russia)

         address         router                                 network
Start:   134.55.209.5    snv-lbl-oc48.es.net                    ESnet core
         134.55.209.90   snvrt1-ge0-snvcr1.es.net               ESnet peering at Sunnyvale (peering routers)
         63.218.6.65     pos3-0.cr01.sjo01.pccwbtn.net          AS3491 CAIS Internet
         63.218.6.38     pos5-1.cr01.chc01.pccwbtn.net          AS3491 CAIS Internet
         63.216.0.53     pos6-1.cr01.vna01.pccwbtn.net          AS3491 CAIS Internet
         63.216.0.30     pos5-3.cr02.nyc02.pccwbtn.net          AS3491 CAIS Internet
         63.218.12.37    pos6-0.cr01.ldn01.pccwbtn.net          AS3491 CAIS Internet
         63.218.13.134   rbnet.pos4-1.cr01.ldn01.pccwbtn.net    AS3491->AS5568 (Russian Backbone Network) peering point
         195.209.14.29   MSK-M9-RBNet-5.RBNet.ru                Russian Backbone Network
         195.209.14.153  MSK-M9-RBNet-1.RBNet.ru                Russian Backbone Network
         195.209.14.206  NSK-RBNet-2.RBNet.ru                   Russian Backbone Network
Finish:  194.226.160.10  Novosibirsk-NSC-RBNet.nsc.ru           RBN to AS 5387 (NSCNET-2)
16
ESnet is Engineered to Move a Lot of Data
• The annual growth rate over the past five years has risen from 1.7x to just over 2.0x.
• ESnet is currently transporting about 250 terabytes/mo.
[Figure: ESnet Monthly Accepted Traffic, in TBytes/month]
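To put the monthly volume in perspective, the short calculation below converts terabytes per month into an average bit rate and projects the volume forward at a constant growth factor. It is a back-of-the-envelope sketch: the 30-day month, decimal terabytes, and the assumption that the 2.0x annual growth simply continues are all simplifications introduced here, not figures from the talk.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def average_rate_gbps(terabytes_per_month):
    """Average rate in Gb/s if the volume were spread evenly over the month."""
    bits = terabytes_per_month * 1e12 * 8  # decimal terabytes to bits
    return bits / SECONDS_PER_MONTH / 1e9


def projected_volume(terabytes_per_month, annual_growth, years):
    """Monthly volume after `years` of compounding at `annual_growth`."""
    return terabytes_per_month * annual_growth ** years


print(round(average_rate_gbps(250), 2))  # about 0.77 Gb/s on average
print(projected_volume(250, 2.0, 3))     # 2000.0 TB/month after three doublings
```

The average is well below the capacity of a 10 Gb/s core, but averages hide the peaks: a handful of large science flows can dominate the load at any given time.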
17
Who Generates Traffic, and Where Does it Go?
ESnet Inter-Sector Traffic Summary, Jan 2003 / Feb 2004 (1.7X overall traffic increase, 1.9X OSC increase)
(the international traffic is increasing due to BABAR at SLAC and the LHC tier 1 centers at FNAL and BNL)
[Figure: traffic flows between DOE sites, ESnet, and the peering points (Commercial, R&E (mostly universities), International). Green = traffic coming into ESnet, blue = traffic leaving ESnet; traffic between sites is also shown. Percentages are of total ingress or egress traffic. Value pairs in the figure: 21/14%, 17/10%, 9/26%, 14/12%, 10/13%, 4/6%, ~25/18%, 72/68%, 53/49%. One flow is labeled “DOE collaborator traffic, inc. data”.]
• Note that more than 90% of the ESnet traffic is OSC traffic
• ESnet Appropriate Use Policy (AUP): all ESnet traffic must originate and/or terminate on an ESnet site (no transit traffic is allowed)
• DOE is a net supplier of data because DOE facilities are used by universities and commercial entities, as well as by DOE researchers
18
ESnet Top 20 Data Flows, 24 hrs., 2004-04-20
[Figure: bar chart of the top 20 flows, with a reference level marked at 1 terabyte/day. Endpoint pairs labeled on the chart:]
• Fermilab (US) – CERN
• SLAC (US) – IN2P3 (FR)
• SLAC (US) – INFN Padva (IT)
• Fermilab (US) – U. Chicago (US)
• CEBAF (US) – IN2P3 (FR)
• INFN Padva (IT) – SLAC (US)
• U. Toronto (CA) – Fermilab (US)
• DFN-WiN (DE) – SLAC (US)
• DOE Lab – DOE Lab
• DOE Lab – DOE Lab
• SLAC (US) – JANET (UK)
• Fermilab (US) – JANET (UK)
• Argonne (US) – Level3 (US)
• Argonne – SURFnet (NL)
• IN2P3 (FR) – SLAC (US)
• Fermilab (US) – INFN Padva (IT)
A small number of science users account for a significant fraction of all ESnet traffic
19
Top 50 Traffic Flows Monitoring – 24hr – 1 Int’l Peering Point
• 10 flows > 100 GBy/day
• More than 50 flows > 10 GBy/day
20
Scalable Operation is Essential
• R&E networks typically operate with a small staff
• The key to everything that the network provides is scalability
o How do you manage a huge infrastructure with a small number of people?
o This issue dominates all others when looking at whether to support new services (e.g. Grid middleware)
- Can the service be structured so that its operational aspects do not scale as a function of the use population?
- If not, then it cannot be offered as a service
21
Scalable Operation is Essential
• The entire ESnet network is operated by fewer than 15 people
[Figure: staffing diagram]
o 7X24 Operations Desk (2-4 FTE)
o 7X24 On-Call Engineers (7 FTE)
o Core Engineering Group (5 FTE)
o Infrastructure (6 FTE)
o Management, resource management, circuit accounting, group leads (4 FTE)
o Science Services (middleware and collaboration tools) (5 FTE)
• Automated, real-time monitoring of traffic levels and operating state of some 4400 network entities is the primary network operational and diagnosis tool
[Figure: monitoring screens labeled SecureNet, Network Configuration, OSPF Metrics (internal routing and connectivity), Performance, Hardware Configuration, IBGP Mesh (WAN routing and connectivity)]
23
How Are Problems Detected and Resolved?
[Figure: the ESnet site and hub map (SEA HUB, SNV HUB, ELP HUB, ALB HUB, CHI HUB, NYC HUB, ATL HUB, DC HUB, with the connected sites, link types, and international connections), with one location highlighted.]
When a hardware alarm goes off here, the 24x7 operator is notified
24
ESnet is Monitored in Many Ways
[Figure: monitoring screens labeled SecureNet, ESnet configuration, OSPF Metrics, Performance, IBGP Mesh, Hardware Configuration]
• Drill down into the Configuration DB to the operating characteristics of every device
o e.g. cooling air temperature for the router chassis air inlet, hot-point, and air exhaust for the ESnet gateway router at PNNL
26
Problem Resolution
• Let’s say that the diagnostics have pinpointed a bad module in a router rack in the ESnet hub in NYC
• Almost all high-end routers, and other equipment that ESnet uses, have multiple, redundant modules for all critical functions
• Failure of a module (e.g. a power supply or a control computer) can be corrected on-the-fly, without turning off the power or impacting the continued operation of the router
• Failed modules are typically replaced by a “smart hands” service at the hubs or sites
o One of the many essential scalability mechanisms
27
ESnet is Monitored in Many Ways
[Figure: monitoring screens labeled SecureNet, ESnet configuration, OSPF Metrics, Performance, IBGP Mesh, Hardware Configuration]
• Drill down into the Hardware Configuration DB for every wire connection
o Equipment rack detail at AOA, NYC Hub (one of the 10 Gb/s core optical ring sites)
The Hub Configuration Database
• Equipment wiring detail for two modules at the AOA, NYC Hub
• This allows “smart hands” (e.g., Qwest personnel at the NYC site) to replace modules for ESnet
30
What Does this Equipment Actually Look Like?
[Figure: photograph, with picture detail, of the equipment rack at the NYC Hub, 32 Avenue of the Americas (one of the 10 Gb/s core optical ring sites)]
31
Typical Equipment of an ESnet Core Network Hub
ESnet core equipment @ Qwest 32 AofA HUB, NYC, NY (~$1.8M, list)
• Cisco 7206, AOA-AR1 (low speed links to MIT & PPPL) ($38,150 list)
• Juniper M20, AOA-PR1 (peering RTR) ($353,000 list)
• Juniper T320, AOA-CR1 (core router) ($1,133,000 list)
• Juniper OC192 Optical Ring Interface (the AOA end of the OC192 to CHI) ($195,000 list)
• Juniper OC48 Optical Ring Interface (the AOA end of the OC48 to DC-HUB) ($65,000 list)
• AOA Performance Tester ($4800 list)
• Qwest DS3 DCX
• DC / AC Converter ($2200 list)
• Lightwave Secure Terminal Server ($4800 list)
• Sentry power 48v 30/60 amp panel ($3900 list)
• Sentry power 48v 10/25 amp panel ($3350 list)
33
Operating Science Mission Critical Infrastructure
• ESnet is a visible and critical piece of DOE science infrastructure
o if ESnet fails, tens of thousands of DOE and University users know it within minutes, if not seconds
• Requires high reliability and high operational security in the systems that are integral to the operation and management of the network
o Secure and redundant mail and Web systems are central to the operation and security of ESnet
- trouble tickets are by email
- engineering communication by email
- engineering database interfaces are via Web
o Secure network access to Hub routers
o Backup secure telephone modem access to Hub equipment
o 24x7 help desk and 24x7 on-call network engineer
[email protected] (end-to-end problem resolution)
34
Disaster Recovery and Stability
• The network must be kept available even if, e.g., the West Coast is disabled by a massive earthquake, etc.
[Figure: map of the hubs (SEA HUB, SNV HUB, ALB HUB, ELP HUB, CHI HUB, ATL HUB, DC HUB, NYC HUBS) and the sites LBNL, PPPL, BNL, AMES, and TWC. Annotations on the remote sites: “Remote Engineer, partial duplicate infrastructure” (twice), “DNS”, and “TWC Remote Engineer”.]
• Engineers, 24x7 Network Operations Center, generator backed power
o Spectrum (net mgmt system)
o DNS (name – IP address translation)
o Eng database
o Load database
o Config database
o Public and private Web
o E-mail (server and archive)
o PKI cert. repository and revocation lists
o collaboratory authorization service
• Duplicate Infrastructure: currently deploying full replication of the NOC databases and servers and Science Services databases in the NYC Qwest carrier hub
• Reliable operation of the network involves
o remote Network Operation Centers (3)
o replicated support infrastructure
o generator backed UPS power at all critical network and infrastructure locations
o high physical security for all equipment
o non-interruptible core - ESnet core operated without interruption through
- N. Calif. Power blackout of 2000
- the 9/11/2001 attacks, and
- the Sept., 2003 NE States power blackout
35
Recovery from Physical Attack / Core Ring Failure
[Figure: the ESnet backbone (optical fiber ring) joining the hubs New York (AOA), Chicago (CHI), Sunnyvale (SNV), Atlanta (ATL), Washington, DC (DC), and El Paso (ELP). A site LAN connects through the site gateway router and the DMZ to the ESnet border router, and then over a local loop (hub to local site) to a hub. An X marks a break in the ring; arrows show the normal traffic flow and the reversed traffic flow.]
• Hubs are the backbone routers and local loop connection points
• The Hubs have lots of connections (42 in all)
• We can route traffic either way around the ring, so any single failure in the ring is transparent to ESnet users
• The local loops are still single points of failure
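The “either way around the ring” property can be illustrated with a small path search. This is a sketch only: the hub names come from the slide, but the order in which they sit on the ring is an assumption made here, and real rerouting is done by the routing protocols rather than by a function like the hypothetical `ring_path`.

```python
# Hub order around the ring is assumed for illustration.
RING = ["SNV", "CHI", "AOA", "DC", "ATL", "ELP"]


def ring_path(src, dst, broken_links=()):
    """Return a hub-by-hub path from src to dst, trying both directions.

    broken_links is an iterable of (hub, hub) pairs for failed ring segments.
    Returns None if both directions are cut.
    """
    broken = {frozenset(link) for link in broken_links}
    n = len(RING)
    candidates = []
    for step in (1, -1):  # one direction around the ring, then the other
        path = [src]
        i = RING.index(src)
        usable = True
        while path[-1] != dst:
            j = (i + step) % n
            if frozenset((RING[i], RING[j])) in broken:
                usable = False
                break
            path.append(RING[j])
            i = j
        if usable:
            candidates.append(path)
    return min(candidates, key=len) if candidates else None


print(ring_path("SNV", "AOA"))
print(ring_path("SNV", "AOA", broken_links=[("CHI", "AOA")]))
print(ring_path("SNV", "AOA", broken_links=[("CHI", "AOA"), ("ELP", "ATL")]))
```

With one break the traffic simply takes the longer way around; only a second, simultaneous break separates the two hubs. The sketch says nothing about the local loops, which have no alternate path at all.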
Maintaining Science Mission Critical Infrastructure in the Face of Cyberattack
• A Phased Security Architecture is being implemented to protect the network and the ESnet sites
• The phased response ranges from blocking certain site traffic to a complete isolation of the network, which allows the sites to continue communicating among themselves in the face of the most virulent attacks
o Separates ESnet core routing functionality from external Internet connections by means of a “peering” router that can have a policy different from the core routers
o Provide a rate limited path to the external Internet that will ensure site-to-site communication during an external denial of service attack
o Provide “lifeline” connectivity for downloading of patches, exchange of e-mail and viewing web pages (i.e., e-mail, dns, http, https, ssh, etc.) with the external Internet prior to full isolation of the network
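One way to picture the phased response is as an admission decision that becomes stricter with each phase. The sketch below is a simplification under stated assumptions: the phase names, the `admit` function, and the `ms-sql` service label are invented for illustration, the lifeline set is the list given above, and rate limiting is not modeled. It is not ESnet's actual router configuration.

```python
# Lifeline services as listed in the text.
LIFELINE_SERVICES = {"e-mail", "dns", "http", "https", "ssh"}

# Hypothetical phase names, from least to most restrictive.
PHASES = ("normal", "filter", "lifeline", "isolated")


def admit(phase, origin, service, blocked=frozenset()):
    """Decide whether a packet is forwarded.

    origin is "site" for traffic from an ESnet site and "external" for
    traffic arriving from the outside Internet through a peering router.
    """
    if phase not in PHASES:
        raise ValueError("unknown phase: %s" % phase)
    if phase != "normal" and service in blocked:
        return False  # attack-specific filters apply in every later phase
    if origin == "site":
        return True  # site-to-site communication is always preserved
    if phase in ("normal", "filter"):
        return True
    if phase == "lifeline":
        return service in LIFELINE_SERVICES
    return False  # fully isolated: no external traffic at all


print(admit("filter", "external", "ms-sql", blocked={"ms-sql"}))
print(admit("lifeline", "external", "https"))
print(admit("isolated", "site", "ftp"))
```

The point of the structure is that sites can keep talking to each other in every phase, while external access shrinks first to a filtered path, then to lifeline services, then to nothing.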
37
Cyberattack Defense
[Figure: attack traffic arriving from outside through a peering router, crossing ESnet routers to a border router, and reaching the gateway router of a Lab (LBNL and other Labs are shown). An X marks each point where traffic is blocked.]
• Lab first response – filter incoming traffic at their ESnet gateway router
• ESnet first response – filters to assist a site
• ESnet second response – filter traffic from outside of ESnet
• ESnet third response – shut down the main peering paths and provide only limited bandwidth paths for specific “lifeline” services
The Sapphire/Slammer worm infection created a Gb/s of traffic on the ESnet core until filters were put in place (both into and out of sites) to damp it out.
38
ESnet WAN Security and Cybersecurity
• Cybersecurity is a new dimension of ESnet security
o Security is now inherently a global problem
o As the entity with a global view of the network, ESnet has an important role in overall security
30 minutes after the Sapphire/Slammer worm was released, 75,000 hosts running Microsoft's SQL Server (port 1434) were infected.
(“The Spread of the Sapphire/Slammer Worm,” David Moore (CAIDA & UCSD CSE), Vern Paxson (ICIR & LBNL), Stefan Savage (UCSD CSE), Colleen Shannon (CAIDA), Stuart Staniford (Silicon Defense), Nicholas Weaver (Silicon Defense & UC Berkeley EECS), Jan. 2003, http://www.cs.berkeley.edu/~nweaver/sapphire)
39
ESnet and Cybersecurity
Sapphire/Slammer worm infection hits creating almost a full Gb/s (1000 megabit/sec.) traffic spike on the ESnet backbone
40
Outline
• Role of the R&E Transit Network
• ESnet is Driven by the Requirements of DOE Science
• Terminology – How Do Networks Work?
• How Does it Work? – ESnet as a Backbone Network
o ESnet Has Experienced Exponential Growth Since 1992
o ESnet is Monitored in Many Ways
o How Are Problems Detected and Resolved?
• Operating Science Mission Critical Infrastructure
o Disaster Recovery and Stability
o Recovery from Physical Attack / Failure
o Maintaining Science Mission Critical Infrastructure in the Face of Cyberattack
• Services that Grids need from the Network
o Public Key Infrastructure example
41
Network and Middleware Needs of DOE Science
August 13-15, 2002
Organized by Office of Science: Mary Anne Scott (Chair), Dave Bader, Steve Eckstrand, Marvin Frazier, Dale Koelling, Vicky White
Workshop Panel Chairs: Ray Bair and Deb Agarwal; Bill Johnston and Mike Wilde; Rick Stevens; Ian Foster and Dennis Gannon; Linda Winkler and Brian Tierney; Sandy Merola and Charlie Catlett
• Focused on science requirements that drive
o Advanced Network Infrastructure
o Middleware Research
o Network Research
o Network Governance Model
• The requirements for DOE science were developed by the OSC science community representing major DOE science disciplines
o Climate
o Spallation Neutron Source
o Macromolecular Crystallography
o High Energy Physics
o Magnetic Fusion Energy Sciences
o Chemical Sciences
o Bioinformatics
Available at www.es.net/#research
42
Grid Middleware Requirements (DOE Workshop)
• A DOE workshop examined science driven requirements for network and middleware and identified twelve high priority middleware services (see www.es.net/#research)
• Some of these services have a central management component and some do not
• Most of the services that have central management fit the criteria for ESnet support. These include, for example
o Production, federated RADIUS authentication service
o PKI federation services
o Virtual Organization Management services to manage organization membership, member attributes and privileges
o Long-term PKI key and proxy credential management
o End-to-end monitoring for Grid / distributed application debugging and tuning
o Some form of authorization service (e.g. based on RADIUS)
o Knowledge management services that have the characteristics of an ESnet service are also likely to be important (future)
43
Grid Middleware Services
• ESnet provides several “science services” – services that support the practice of science
• A number of such services have an organization like ESnet as the natural provider
o ESnet is trusted, persistent, and has a large (almost comprehensive within DOE) user base
o ESnet has the facilities to provide reliable access and high availability through assured network access to replicated services at geographically diverse locations
o However, service must be scalable in the sense that as its user base grows, ESnet interaction with the users does not grow (otherwise not practical for a small organization like ESnet to operate)
44
Science Services: PKI Support for Grids
• Public Key Infrastructure supports cross-site, cross-organization, and international trust relationships that permit sharing computing and data resources and other Grid services
• The DOEGrids Certification Authority service, which provides X.509 identity certificates to support Grid authentication, is an example of this model
o The service requires a highly trusted provider, and requires a high degree of availability
o The service provider is a centralized agent for negotiating trust relationships, e.g. with European CAs
o The service scales by adding site based or Virtual Organization based Registration Agents that interact directly with the users
o See DOEGrids CA (www.doegrids.org)
45
Science Services: Public Key Infrastructure
• DOEGrids CA policies are tailored to science Grids
o Digital identity certificates for people, hosts and services
o Provides formal and verified trust management – an essential service for widely distributed heterogeneous collaboration, e.g. in the International High Energy Physics community
• This service was the basis of the first routine sharing of HEP computing resources between US and Europe
• Have recently added a second CA with a policy that supports secondary issuers that need to do bulk issuing of certificates with central private key management
o NERSC will auto issue certs when accounts are set up – this constitutes an acceptable identity verification
o A variant of this will also be set up to support security domain gateways such as Kerberos – X509 – e.g. KX509 – at FNAL
46
Science Services: Public Key Infrastructure
• The rapidly expanding customer base of this service will soon make it ESnet’s largest collaboration service by customer count
Registration Authorities
o ANL
o LBNL
o ORNL
o DOESG (DOE Science Grid)
o ESG (Climate)
o FNAL
o PPDG (HEP)
o Fusion Grid
o iVDGL (NSF-DOE HEP collab.)
o NERSC
o PNNL
47
Grid Network Services Requirements (GGF, GHPN)
• Grid High Performance Networking Research Group, “Networking Issues of Grid Infrastructures” (draft-ggf-ghpn-netissues-3) – what networks should provide to Grids
o High performance transport for bulk data transfer (over 1 Gb/s per flow)
o Performance controllability to provide ad hoc quality of service and traffic isolation
o Dynamic network resource allocation and reservation
o High availability when expensive computing or visualization resources have been reserved
o Security controllability to provide a trusted and efficient communication environment when required
o Multicast to efficiently distribute data to a group of resources
o How to integrate wireless networks and sensor networks in a Grid environment
48
Transport Services
• network tools available to build services
o queue management
- provide forwarding priorities different from best effort
- e.g.
– scavenger (discard if anything behind in the queue)
– expedited forwarding (elevated priority queuing)
– low latency forwarding (highest priority – ahead of all other traffic)
o path management
- tagged traffic can be managed separately from regular traffic
o policing
- limit the bandwidth of an incoming stream
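The queue management classes above can be sketched as a strict-priority scheduler. This is a toy model, not a router implementation: the class names follow the list above, the `PriorityScheduler` type is invented here, and scavenger traffic is simplified to “served only when every other queue is empty” rather than actively discarded.

```python
from collections import deque

# Service classes from highest to lowest forwarding priority.
CLASSES = ("low-latency", "expedited", "best-effort", "scavenger")


class PriorityScheduler:
    """Strict-priority queuing: always send from the highest non-empty class."""

    def __init__(self):
        self.queues = {name: deque() for name in CLASSES}

    def enqueue(self, traffic_class, packet):
        self.queues[traffic_class].append(packet)

    def dequeue(self):
        """Return the next packet to transmit, or None if nothing is queued."""
        for name in CLASSES:
            if self.queues[name]:
                return self.queues[name].popleft()
        return None


sched = PriorityScheduler()
sched.enqueue("best-effort", "b1")
sched.enqueue("scavenger", "s1")
sched.enqueue("low-latency", "l1")
sched.enqueue("expedited", "e1")
print([sched.dequeue() for _ in range(5)])
```

Packets leave in class order regardless of arrival order, which is also why an unpoliced high-priority stream can starve everything below it.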
49
Priority Service: Guaranteed Bandwidth
[Figure: user system1 at site A sends to user system2 at site B through two border routers, across a network marked “?”. A bandwidth broker flags traffic from user system1 for expedited forwarding. Bandwidth management model: the network pipe bandwidth (scale 0 to 1000) is divided into a part reserved for production, best effort traffic and a part available for elevated priority traffic.]
50
Priority Service: Guaranteed Bandwidth
[Figure: user system1 at site A, user system2 at site B, two border routers, a network marked “?”, and a bandwidth broker]
• What is wrong with this? (almost everything)
o there may be several users that want all of the premium bandwidth at the same time
o the user may send data into the high priority stream at a high enough bandwidth that it interferes with production traffic (and not even know it)
o this is at least three independent networks, and probably more
o a user that was a priority at site A may not be at site B
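The first problem, several users wanting all of the premium bandwidth at once, is an admission control problem. The sketch below shows the minimal bookkeeping a broker would need for a single network. The `BandwidthBroker` class and the 200 Mb/s premium capacity are assumptions for illustration; the sketch does not address the harder issues of multiple independent networks or differing site priorities.

```python
class BandwidthBroker:
    """Tracks reservations against a fixed pool of premium bandwidth."""

    def __init__(self, premium_capacity_mbps):
        self.capacity = premium_capacity_mbps
        self.reservations = {}  # user -> reserved Mb/s

    def available(self):
        return self.capacity - sum(self.reservations.values())

    def request(self, user, mbps):
        """Grant the reservation only if it fits in the remaining pool."""
        if mbps <= 0 or user in self.reservations or mbps > self.available():
            return False
        self.reservations[user] = mbps
        return True

    def release(self, user):
        self.reservations.pop(user, None)


broker = BandwidthBroker(premium_capacity_mbps=200)  # hypothetical pool size
print(broker.request("user1", 200))  # first request takes the whole pool
print(broker.request("user2", 200))  # second request must be refused
broker.release("user1")
print(broker.request("user2", 200))  # succeeds once the pool is free again
```

Even this simple version implies state, authentication of requesters, and a policy for who wins, which is part of why the full problem is hard.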
51
Priority Service: Guaranteed Bandwidth
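[Figure: user system1 at site A and user system2 at site B, connected across several networks. Each network has a resource manager; a policer, a shaper, and an authorization function sit on the path; a bandwidth broker and an allocation manager coordinate the resource managers.]
• To address all of the issues is complex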
52
Priority Service
• So, practically, what can be done?
• With available tools we can provide a small number of provisioned circuits
o secure and end-to-end (system to system)
o various Quality of Service possible, including minimum latency
o a certain amount of route reliability (if redundant paths exist in the network)
o end systems can manage these circuits as single high bandwidth paths or multiple lower bandwidth paths (with application level shapers)
o non-interfering with production traffic, so aggressive protocols may be used
53
Priority Service: Guaranteed Bandwidth
[Figure: user system1 at site A and user system2 at site B, connected across networks that each have a resource manager; a policer and an authorization function sit on the path, with a bandwidth broker. Annotation: allocation will probably be relatively static and ad hoc.]
• will probably be service level agreements among transit networks allowing for a fixed amount of priority traffic – so the resource manager does minimal checking and no authorization
• will do policing, but only at the full bandwidth of the service agreement (for self protection)
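Policing at the bandwidth of the service agreement is commonly done with a token bucket. The sketch below is a generic token-bucket policer, offered as an illustration of the idea rather than as what any particular router does; the class name, the rate, and the burst size are assumptions. Time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucketPolicer:
    """Forward packets while tokens last; drop anything beyond the agreed rate."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0

    def allow(self, size_bytes, now):
        """Return True to forward the packet, False to drop it."""
        elapsed = max(0.0, now - self.last)
        # Refill at the agreed rate, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


# Hypothetical agreement: 1000 bytes/s with a 1500-byte burst allowance.
policer = TokenBucketPolicer(rate_bytes_per_s=1000, burst_bytes=1500)
print(policer.allow(1500, now=0.0))  # fits in the initial burst
print(policer.allow(1500, now=0.5))  # only 500 bytes of tokens have accrued
print(policer.allow(1500, now=1.5))  # bucket has refilled
```

Because the policer only looks at the aggregate rate, it protects the network without needing to know who the individual users are, which matches the “minimal checking and no authorization” approach.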
54
Grid Network Services Requirements (GGF, GHPN)
• Grid High Performance Networking Research Group, “Networking Issues of Grid Infrastructures” (draft-ggf-ghpn-netissues-3) – what networks should provide to Grids
o High performance transport for bulk data transfer (over 1 Gb/s per flow)
o Performance controllability to provide ad hoc quality of service and traffic isolation
o Dynamic network resource allocation and reservation
o High availability when expensive computing or visualization resources have been reserved
o Security controllability to provide a trusted and efficient communication environment when required
o Multicast to efficiently distribute data to a group of resources
o Integrated wireless networks and sensor networks in a Grid environment
55
High Throughput Requirements
1) High average throughput
2) Advanced protocol capabilities available and usable at the end-systems
3) Lack of use of QoS parameters
Current issues
1) Low average throughput
2) Semantic gap between socket buffer interface and the protocol capabilities of TCP
Analyzed reasons
1a) End system bottleneck,
1b) Protocol misconfigured,
1c) Inefficient Protocol
1d) Mixing of congestion control and error recovery
2a) TCP connection Set up: Blocking operations vs asynchronous
2b) Window scale option not accessible through the API
Available solutions
1a) Multiple TCP sessions
1b) Larger MTU
1c) ECN
Proposed alternatives
1) Alternatives to TCP (see DT-RG survey document)
2) OS by-pass and protocol off-loading
3) Overlays
4) End to end optical paths
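The link between window size and throughput behind items 1b and 2b can be made concrete with the bandwidth-delay product. The calculation below is a worked example under assumptions: the 100 ms round-trip time is a hypothetical long-distance path, and 65,535 bytes is the largest TCP window that can be advertised without the window scale option.

```python
import math


def bdp_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to keep a path of this speed and delay full."""
    return bandwidth_bps * rtt_s / 8


def window_limited_throughput_bps(window_bytes, rtt_s):
    """Best-case throughput of one TCP session limited by its window."""
    return window_bytes * 8 / rtt_s


def sessions_needed(target_bps, window_bytes, rtt_s):
    """Parallel sessions required to reach the target with this window."""
    per_session = window_limited_throughput_bps(window_bytes, rtt_s)
    return math.ceil(target_bps / per_session)


RTT = 0.100           # assumed round-trip time in seconds
UNSCALED_MAX = 65535  # largest window without the TCP window scale option

print(bdp_bytes(1e9, RTT))                               # about 12.5 million bytes
print(window_limited_throughput_bps(UNSCALED_MAX, RTT))  # about 5.2 Mb/s
print(sessions_needed(1e9, UNSCALED_MAX, RTT))           # 191 parallel sessions
```

On such a path a single unscaled session reaches roughly half a percent of a 1 Gb/s target, which is why the listed remedies are either many parallel sessions or access to larger windows.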
56
A New Architecture
• The essential requirements cannot be met with the current, telecom provided, hub and spoke architecture of ESnet
• The core ring has good capacity and resiliency against single point failures, but the point-to-point tail circuits are neither reliable nor scalable to the required bandwidth
[Figure: the ESnet core/backbone ring joining New York (AOA), Chicago (CHI), Sunnyvale (SNV), Atlanta (ATL), Washington, DC (DC), and El Paso (ELP), with DOE sites attached to the hubs by point-to-point tail circuits]
57
A New Architecture
• A second backbone ring will multiply connect the MAN rings to protect against hub failure
• All OSC Labs will be able to participate in some variation of this new architecture in order to gain highly reliable and high capacity network access
[Figure: the ESnet core/backbone ring joining New York (AOA), Chicago (CHI), Sunnyvale (SNV), Atlanta (ATL), Washington, DC (DC), and El Paso (ELP), with DOE sites attached and with connections to Europe and Asia-Pacific]
58
Conclusions
• ESnet is an infrastructure that is critical to DOE’s science mission and that serves all of DOE
• Focused on the Office of Science Labs
• ESnet is working on providing the DOE mission science networking requirements with several new initiatives and a new architecture
• QoS is hard – but we have enough experience to do pilot studies (which ESnet is just about to start)
• Middleware services for large numbers of users are hard – but they can be provided if careful attention is paid to scaling