30 June 2004
Wide Area Networking Performance Challenges
Olivier Martin, CERN
UK DTI visit
Presentation Outline
• CERN’s connectivity to the Internet
• DataTAG project overview
• Wide Area Networking challenges
• Where do we want to be by the start of the LHC in 2007?
• Where are we now?
CERN External Networking: Main Internet Connections
[Diagram: CERN’s external connections via the CERN Internet Exchange Point (CIXP), with links of 1, 2.5 and 10 Gbps]
• GEANT: general purpose European A&R connectivity
• SWITCH: Swiss national research network
• USLIC: general purpose North American A&R connectivity (combined with DataTAG)
• DataTAG, NetherLight, ATRIUM/VTHD (FR): network research
http://www.datatag.org
Project partners
DataTAG Mission
• EU-US Grid network research
  – High performance transport protocols
  – Inter-domain QoS
  – Advance bandwidth reservation
• EU-US Grid interoperability
• Sister project to EU DataGRID
TransAtlantic Grid
LHC Data Grid Hierarchy
[Diagram: the tiered LHC computing model]
• Experiment / Online System: ~PByte/sec from the detector; ~100-400 MBytes/sec into the CERN centre
• Tier 0 +1 (CERN): 700k SI95, ~1 PB disk, tape robot; CERN/outside resource ratio ~1:2, Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1
• Tier 1 centres (e.g. FNAL: 200k SI95, 600 TB; IN2P3, INFN and RAL centres), reached over 2.5/10 Gbps links
• Tier 2 centres, reached over ~2.5 Gbps links
• Tier 3: institutes (~0.25 TIPS) with a physics data cache, connected at 0.1–1 Gbps
• Tier 4: workstations
• Physicists work on analysis “channels”; each institute has ~10 physicists working on one or more channels
Deploying the LHC Grid
[Diagram, les.robertson@cern.ch: the LHC Computing Centre, CERN Tier 0, at the centre; Tier 1 centres in Germany, France, Italy, the UK, the USA, Japan and Taipei(?), plus a CERN Tier 1; Tier 2 centres formed from labs and universities; Tier 3 physics department resources and desktops; grids can be composed for a physics study group or for a regional group]
Main Networking Challenges
• Fulfill the, as yet unproven, assertion that the network can be « nearly » transparent to the Grid
• Deploy suitable Wide Area Network infrastructure (50-100 Gb/s)
• Deploy suitable Local Area Network infrastructure (matching or exceeding that of the WAN)
• Seamless interconnection of LAN & WAN infrastructures (firewall?)
• End-to-end issues: transport protocols, PCs (Itanium, Xeon), 10GigE NICs (Intel, S2io)
• Where are we today (a rough test sketch follows this list):
  – memory to memory: 6.5 Gb/s
  – memory to disk: 1.2 MB (Windows 2003 Server/NewiSys)
  – disk to disk: 400 MB (Linux), 600 MB (Windows)
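For orientation, here is a minimal sketch of the kind of memory-to-memory TCP test behind such figures. It is illustrative only: the port, buffer size and duration are arbitrary, and it says nothing about the tools or tuning actually used for the 6.5 Gb/s result.

```python
import socket
import time

CHUNK = bytes(1 << 20)  # 1 MiB buffer sent repeatedly straight from memory

def sink(port: int = 5001) -> None:
    """Receive side: accept one connection and report the achieved rate."""
    with socket.create_server(("", port)) as srv:
        conn, _ = srv.accept()
        with conn:
            total, start = 0, time.time()
            while data := conn.recv(1 << 20):
                total += len(data)
            elapsed = time.time() - start
            print(f"memory to memory: {8 * total / elapsed / 1e9:.2f} Gb/s")

def source(host: str, port: int = 5001, seconds: int = 10) -> None:
    """Send side: stream zero-filled buffers from memory for a fixed time."""
    with socket.create_connection((host, port)) as s:
        deadline = time.time() + seconds
        while time.time() < deadline:
            s.sendall(CHUNK)
```

On a high-latency path the outcome of such a test is governed far more by TCP's window and loss behaviour, discussed on the next two slides, than by the code at either end.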
Main TCP issues
• Does not scale to some environments:
  – High speed, high latency
  – Noisy
• Unfair behaviour with respect to (illustrated in the sketch after this list):
  – Round Trip Time (RTT)
  – Frame size (MSS)
  – Access bandwidth
• Widespread use of multiple streams in order to compensate for inherent TCP/IP limitations (e.g. GridFTP, bbFTP):
  – A bandage rather than a cure
• New TCP/IP proposals in order to restore performance in single-stream environments:
  – Not clear if/when they will have a real impact
  – In the meantime there is an absolute requirement for backbones with zero packet losses and no packet re-ordering
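The RTT and MSS unfairness can be made concrete with the steady-state AIMD throughput approximation of Mathis et al., rate ≈ (MSS/RTT)·C/√p. The sketch below is an illustration of that formula, not a DataTAG measurement; the loss rate and RTT values are arbitrary examples.

```python
from math import sqrt

def aimd_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float,
                        c: float = 1.22) -> float:
    """Approximate steady-state TCP throughput (Mathis et al., 1997):
    rate ~ (MSS / RTT) * C / sqrt(p), returned in bits per second."""
    return 8 * mss_bytes * c / (rtt_s * sqrt(loss_rate))

# Two flows seeing the same loss rate but different RTTs: the long-RTT
# (e.g. transatlantic) flow gets roughly ten times less bandwidth.
for label, rtt in (("RTT  10 ms", 0.010), ("RTT 100 ms", 0.100)):
    print(f"{label}: {aimd_throughput_bps(1500, rtt, 1e-6) / 1e9:.2f} Gb/s")

# Larger frames help in the same proportion (one argument for 9000-byte
# jumbo frames on high-speed paths).
print(f"jumbo 9000B, RTT 100 ms: "
      f"{aimd_throughput_bps(9000, 0.100, 1e-6) / 1e9:.2f} Gb/s")
```

Running N parallel streams multiplies the same expression by roughly N, which is exactly the "bandage" that GridFTP and bbFTP apply.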
TCP dynamics (10 Gbps, 100 ms RTT, 1500-byte packets)
• Window size (W) = Bandwidth * Round Trip Time
  – Wbits = 10 Gbps * 100 ms = 1 Gb
  – Wpackets = 1 Gb / (8 * 1500) = 83,333 packets
• Standard Additive Increase Multiplicative Decrease (AIMD) mechanisms:
  – W = W/2 (halving the congestion window on a loss event)
  – W = W + 1 (increasing the congestion window by one packet every RTT)
• Time to recover from W/2 to W (congestion avoidance) at 1 packet per RTT:
  – RTT * Wpackets/2 = 1.157 hours
  – In practice, 1 packet per 2 RTTs because of delayed ACKs, i.e. 2.31 hours
• Packets per second:
  – Wpackets / RTT = 833,333 packets
(These numbers are reproduced in the sketch below.)
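A minimal sketch reproducing the arithmetic above; it is pure calculation, no measurement, and the factor of 2 for delayed ACKs follows the slide's own assumption.

```python
# Bandwidth-delay product and AIMD recovery time for the case above:
# 10 Gbps, 100 ms RTT, 1500-byte packets.
bandwidth_bps = 10e9
rtt_s = 0.100
packet_bytes = 1500

w_bits = bandwidth_bps * rtt_s              # 1 Gb in flight to fill the pipe
w_packets = w_bits / (8 * packet_bytes)     # ~83,333 packets
pkts_per_second = w_packets / rtt_s         # ~833,333 packets/s at full rate

# After one loss, AIMD halves the window and then grows it back by one
# packet per RTT (one per 2 RTTs with delayed ACKs).
recover_s = rtt_s * w_packets / 2
recover_delayed_ack_s = 2 * recover_s

print(f"window: {w_packets:,.0f} packets, {pkts_per_second:,.0f} packets/s")
print(f"recovery W/2 -> W: {recover_s / 3600:.3f} h "
      f"({recover_delayed_ack_s / 3600:.2f} h with delayed ACKs)")
```

A single loss every couple of hours is thus enough to keep a standard TCP flow well below 10 Gb/s on a transatlantic path, which is why the previous slide insists on loss-free, order-preserving backbones.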
10G DataTAG testbed extension to Telecom World 2003 and Abilene/CENIC
On September 15, 2003, the DataTAG project was the first transatlantic testbed offering direct 10GigE access, using Juniper’s VPN layer2/10GigE emulation.
Internet2 land speed record history (IPv4 & IPv6), period 2000-2003
[Charts: evolution of the I2LSR for records set between Mar-00 and Nov-03, one chart in Gigabit/second and one in terabit-meters/second, with separate IPv4 and IPv6 series]
Impact of a single multi-Gb/s flow on the Abilene backbone
Internet2 land speed record history (IPv4 & IPv6), period 2000-2004
[Chart: evolution of the Internet2 land speed record in Gigabit/second for records set between Mar-00 and May-04, with separate series for IPv4 and IPv6, single and multiple streams; the plotted records rise from under 1 Gb/s in 2000 to just over 7 Gb/s in 2004]