

Meeting on ATLAS Remote Farms, Copenhagen, 11 May 2004. R. Hughes-Jones, Manchester

Networking for ATLAS Remote Farms

Richard Hughes-Jones
The University of Manchester

DataGrid WP7 – Dante Tests on the GÉANT Core
End-2-End Measurements from the 4th Year VLBI Project at Manchester
New TCP stacks – the effect on throughput
Some Simple Network Tests CERN-Manchester


DataGrid WP7 – Dante Tests on the GÉANT Core

Set-up

Supermicro PC in: London GÉANT PoP, Amsterdam GÉANT PoP

Smartbits in: London GÉANT PoP, Frankfurt GÉANT PoP

Long link: UK-SE-DE2-IT-CH-FR-BE-NL

Short link: UK-FR-BE-NL


Tests GÉANT Core: UDP throughput

UDP throughput, London-Amsterdam
Available BW up to the packet time on the wire, then 1/t
Wire rate 998 Mbit/s for packets > 1400 bytes

Packet loss
None for large packets
Dips in BW linked to packet loss
SysKonnect NIC: interrupt per packet, so CPU load is important
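The flat-then-1/t shape of the throughput curve can be reproduced with a very simple model. The sketch below is an illustration only, not the tool used for these measurements. It assumes a 1 Gbit/s Ethernet interface at the end hosts and 66 bytes of per-packet overhead (UDP, IP, Ethernet header, CRC, preamble and inter-frame gap); neither figure is stated on the slide, and the function names are hypothetical.

```python
LINE_RATE_BIT_S = 1e9      # assumed Gigabit Ethernet line rate
# Assumed per-packet overhead in bytes: UDP 8 + IP 20 + Ethernet header 14
# + CRC 4 + preamble 8 + inter-frame gap 12.
OVERHEAD_BYTES = 8 + 20 + 14 + 4 + 8 + 12


def wire_time_us(payload_bytes):
    """Time one packet occupies the wire, in microseconds."""
    return (payload_bytes + OVERHEAD_BYTES) * 8 / LINE_RATE_BIT_S * 1e6


def expected_wire_rate_mbit(payload_bytes, spacing_us):
    """Model of the received wire rate for a requested inter-frame spacing.

    Below the packet's wire time the sender cannot exceed line rate (flat
    region); above it the rate falls as 1/spacing.
    """
    effective_spacing_us = max(spacing_us, wire_time_us(payload_bytes))
    wire_bits = (payload_bytes + OVERHEAD_BYTES) * 8
    return wire_bits / effective_spacing_us   # bits per microsecond == Mbit/s


if __name__ == "__main__":
    for spacing in (0, 5, 10, 15, 20, 30, 40):
        print(spacing, round(expected_wire_rate_mbit(1472, spacing), 1))
```

The model ignores per-packet CPU and interrupt costs, so it will overestimate the rate achievable with small packets, where those costs rather than the wire are the limit.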

[Figures: uk-nl_20tg4-hs-w100_01Oct03. Received wire rate (Mbit/s) and % packet loss vs. spacing between frames (µs, 0 to 40), for packet sizes of 50, 100, 200, 400, 600, 800, 1000, 1200, 1400 and 1472 bytes.]


Tests GÉANT Core: Packet re-ordering

Effect of packet size, London-Amsterdam
Packets at 10 µs (line speed)
10,000 sent
Packet loss ~ 0.1%

Re-order distribution

[Figures: packet re-order uk-nl, 10,000 BE packets sent, wait 10 µs, 01 Oct 03. % out of order vs. packet size (0 to 1500 bytes, with a detail panel for 1400 to 1404 bytes); number of packets vs. length out-of-order (0 to 9) for 1400, 1401 and 1402 byte packets.]
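A re-order percentage and a "length out-of-order" distribution of this kind can be derived from the sequence numbers in arrival order. The sketch below is a minimal illustration. The definitions it uses are assumptions, since the slides do not say how the measurement tool defines them: a packet counts as out of order if its sequence number is lower than the highest already seen, and its length is how far behind that highest number it is.

```python
from collections import Counter


def reorder_stats(received_seq):
    """Return (% out of order, Counter of out-of-order lengths).

    Assumed definitions (not taken from the slides): a packet is out of order
    if its sequence number is lower than the highest one already received, and
    its 'length' is how far behind that highest number it is.
    """
    highest = -1
    lengths = Counter()
    for seq in received_seq:
        if seq < highest:
            lengths[highest - seq] += 1
        else:
            highest = seq
    n_received = len(received_seq)
    n_out = sum(lengths.values())
    percent = 100.0 * n_out / n_received if n_received else 0.0
    return percent, lengths


if __name__ == "__main__":
    # Hypothetical arrival order: packet 2 is overtaken by 3, packet 5 by 6 and 7.
    arrivals = [0, 1, 3, 2, 4, 6, 7, 5, 8, 9]
    percent, lengths = reorder_stats(arrivals)
    print(percent, dict(lengths))
```

Lost packets do not appear in the arrival list at all, so loss and re-ordering are counted separately under these definitions.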


Tests GÉANT Core: Packet re-ordering

Effect of LBE background, Amsterdam-London
BE test flow
Packets at 10 µs (line speed)
10,000 sent
Packet loss ~ 0.1%

Re-order distributions:

[Figures: UDP 1472 bytes NL-UK-lbexxx_7nov03: % out of order vs. total offered rate (2 to 3.2 Gbit/s); legend: hstcp, Standard TCP, line speed, 90% line speed. Packet re-order uk-nl 21 Oct 03, 10,000 sent, wait 10 µs: number of packets vs. length out-of-order (1 to 9) for 1472 byte and for 1400 byte packets, with LBE background from 0% to 80% in steps of 10%.]


Tests GÉANT Core: Packet Jitter

Amsterdam-London
BE test flow
IPPremium test flow
Packet spacing 80 µs

[Figures: histograms of frequency vs. packet jitter (µs, 0 to 140) for four cases. BE test flow: Background none, and Background 60% BE 1.4 Gbit + 40% LBE 780 Mbit. IPPremium test flow: Background none, and Background 60% BE 1.4 Gbit + 40% LBE 780 Mbit.]


Tests GÉANT Core: 1-way Delay

Amsterdam-London
IPPremium test flow
Packet spacing 80 µs

BE test flow + Background: 60% BE 1.4 Gbit + 40% LBE 780 Mbit

[Figures: 1-way latency (µs) vs. packet number (0 to 10000) for three cases. Flow IPP, Background none; Flow BE, Background 60% BE 1.4 Gbit + 40% LBE 780 Mbit; Flow IPP, Background 60% BE 1.4 Gbit + 40% LBE 780 Mbit.]


VLBI Project: Test Topology

SuperJANET4

Jodrell

Manchester

SURFnet

JIVE

Dwingeloo

Adam Mathews, Steve O’Toole, Univ. of Manchester


VLBI Project: Throughput

Manchester to Dwingeloo: 2.0 GHz Xeon, 1.2 GHz PIII

Re-ordering vs Offered Load

[Figures: Gnt5-DwMk5 11Nov03 and DwMk5-Gnt5 13Nov03, 1472 bytes. Four panels vs. spacing between frames (µs, 0 to 40): % packet loss, received wire rate (Mbit/s), % kernel sender and % kernel receiver; series Gnt5-DwMk5 and DwMk5-Gnt5.]


VLBI Project: Jitter & 1-way Delay

1472 byte packets, man -> JIVE: FWHM 22 µs (B2B 3 µs)

[Figures: Gnt5-DwMk5, 1472 bytes. Jitter histograms N(t) vs. jitter (µs, 0 to 140), w=50, 28 Oct 03, on linear and logarithmic scales; 1-way delay (µs) vs. packet number, w12, 21 Oct 03, for packets 0 to 5000 and in detail for packets 2000 to 3000.]

1-way delay: note the packet loss (points with zero 1-way delay)
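A width such as the FWHM quoted for the jitter histogram can be estimated from receive timestamps along the following lines. This is a rough sketch under stated assumptions: the slides do not say how the FWHM was obtained, jitter is taken here to be the spacing between consecutive arrivals, and the function names and the 1 µs bin width are illustrative choices.

```python
def interarrival_histogram(arrival_times_us, bin_width_us=1.0):
    """Histogram N(t) of the spacing between consecutive packet arrivals."""
    counts = {}
    for earlier, later in zip(arrival_times_us, arrival_times_us[1:]):
        bin_index = int((later - earlier) // bin_width_us)
        counts[bin_index] = counts.get(bin_index, 0) + 1
    return counts


def fwhm_us(counts, bin_width_us=1.0):
    """Crude full width at half maximum: span of bins at or above half the peak."""
    if not counts:
        return 0.0
    half_max = max(counts.values()) / 2.0
    above = [index for index, n in counts.items() if n >= half_max]
    return (max(above) - min(above) + 1) * bin_width_us


if __name__ == "__main__":
    # Hypothetical arrival times in microseconds, nominal spacing 50 us.
    arrivals = [0, 50, 100, 151, 200, 250, 299, 350]
    hist = interarrival_histogram(arrivals)
    print(sorted(hist.items()), fwhm_us(hist))
```

For a histogram with several separated peaks this span-based estimate covers all of them, so it is only meaningful for a single dominant peak.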


VLBI Project: Packet Loss – Long Range Effects?

Aggregated Variance Method
Divide a time series of length N into blocks of size m
Calculate the mean of each block, Xm(k), k = 1 … N/m
Calculate the variance VXm of these Xm(k)
Vary the block size m

Plot on log-log axes and fit the slope β
Hurst parameter H: β = 2H - 2

Measured: β = -0.355, which gives H = 0.822
H = 0.5 means no long-range dependence

[Figure: aggregate variance Log10(X(m)) vs. sub-sample size Log10(m), for Log10(m) from 0 to 3, with fitted line y = -0.355x + 2.8826.]
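The aggregated variance procedure can be sketched in a few lines. This is a minimal illustration using only the standard library, not the analysis code behind the slide; the function names, the choice of block sizes and the synthetic test series are assumptions made for the example.

```python
import math
import random


def aggregated_variance(series, block_size):
    """Variance of the means of consecutive non-overlapping blocks of size m."""
    n_blocks = len(series) // block_size
    means = [
        sum(series[k * block_size:(k + 1) * block_size]) / block_size
        for k in range(n_blocks)
    ]
    grand_mean = sum(means) / n_blocks
    return sum((x - grand_mean) ** 2 for x in means) / n_blocks


def fit_slope(xs, ys):
    """Ordinary least-squares slope of y against x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def hurst_from_slope(beta):
    """Invert beta = 2H - 2."""
    return 1.0 + beta / 2.0


def estimate_hurst(series, block_sizes):
    """Fit log10(variance) against log10(m) and convert the slope to H."""
    log_m = [math.log10(m) for m in block_sizes]
    log_var = [math.log10(aggregated_variance(series, m)) for m in block_sizes]
    return hurst_from_slope(fit_slope(log_m, log_var))


if __name__ == "__main__":
    random.seed(1)
    # Synthetic independent samples: the estimate should come out near 0.5.
    independent = [random.gauss(0.0, 1.0) for _ in range(20000)]
    print(estimate_hurst(independent, [1, 2, 4, 8, 16, 32, 64, 128]))
    # The slope quoted on the slide:
    print(hurst_from_slope(-0.355))
```

For independent samples the variance of the block means falls as 1/m, so the slope is close to -1 and H is close to 0.5; a shallower slope, such as the measured -0.355, pushes H towards 1.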


Traffic Flows Manchester – NetNorthWest - SuperJANET Access links

Two 1 Gbit/s

Access links: SJ4 to GÉANT, GÉANT to SURFnet


High Performance TCP – DataTAG

Different TCP stacks tested on the DataTAG network
Round trip time 128 ms
Drop 1 in 10^6

HighSpeed: rapid recovery

Scalable: very fast recovery

Standard: recovery would take ~ 10 mins


High Performance TCP – MB-NG

Drop 1 in 25,000
Rtt 6.2 ms
Recover in 1.6 s

Standard, HighSpeed, Scalable
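The recovery times quoted for standard TCP on the DataTAG and MB-NG slides follow from the usual additive-increase behaviour: the congestion window is halved on a loss and then grows by one packet per round trip. The sketch below works through that arithmetic. The 1 Gbit/s path rate and 1500-byte packet size are assumptions, not values stated on the slides, and the function name is hypothetical.

```python
def standard_tcp_recovery_s(rate_bit_s, rtt_s, packet_bytes=1500):
    """Time for standard TCP to regain full rate after a single loss.

    The congestion window is halved on loss and then grows by one packet per
    round trip, so recovery takes (window / 2) round trips.
    """
    window_packets = rate_bit_s * rtt_s / (packet_bytes * 8)
    return (window_packets / 2.0) * rtt_s


if __name__ == "__main__":
    # Assumed 1 Gbit/s path and 1500-byte packets; rtt values are from the slides.
    print(standard_tcp_recovery_s(1e9, 0.128) / 60)   # DataTAG, in minutes
    print(standard_tcp_recovery_s(1e9, 0.0062))       # MB-NG, in seconds
```

Under these assumptions the model gives roughly 11 minutes at 128 ms and about 1.6 s at 6.2 ms, in line with the figures on the slides. Because the recovery time scales with the square of the round trip time, long paths are penalised far more heavily than short ones.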


Some Network Tests: TCP Request – Response

Zero stats -> OK done

Request event -> Send event data (repeated for each event)

Request-Response time (Histogram)

Get remote statistics -> Send statistics: CPU load & no. of interrupts, 1-way delay
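A request-response timing loop of this general shape can be sketched with plain TCP sockets. The example below is a toy illustration run over the loopback interface, not the test program used for the measurements; the 4-byte length-prefixed request format, the zero-length "done" marker and all function names are assumptions made for the example.

```python
import socket
import struct
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from a TCP socket (recv may return short reads)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def serve_one_client(listener):
    """Responder: for each 4-byte request, send back that many bytes of data."""
    conn, _ = listener.accept()
    with conn:
        while True:
            try:
                (length,) = struct.unpack("!I", recv_exact(conn, 4))
            except ConnectionError:
                return
            if length == 0:          # hypothetical 'done' marker
                return
            conn.sendall(b"\0" * length)


def measure(response_bytes, n_requests, spacing_s=0.0):
    """Return request-response times in microseconds over a loopback connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    server = threading.Thread(target=serve_one_client, args=(listener,), daemon=True)
    server.start()

    times_us = []
    with socket.create_connection(listener.getsockname()) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(n_requests):
            start = time.perf_counter()
            client.sendall(struct.pack("!I", response_bytes))
            recv_exact(client, response_bytes)
            times_us.append((time.perf_counter() - start) * 1e6)
            if spacing_s:
                time.sleep(spacing_s)
        client.sendall(struct.pack("!I", 0))
    server.join(timeout=5)
    listener.close()
    return times_us


def histogram(values, bin_width):
    """Count values per bin; keys are the lower bin edges."""
    counts = {}
    for v in values:
        edge = int(v // bin_width) * bin_width
        counts[edge] = counts.get(edge, 0) + 1
    return counts
```

Calling `measure(500000, 100)` and passing the result to `histogram` would give a request-response time distribution for 0.5 Mbyte responses; the `spacing_s` argument corresponds to the request spacing. Timings over loopback say nothing about a real network path.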


Lab Test: TCP Request-Response Histograms

PC – router – PC
BE test flow
Request spacing 0 µs
Request spacing 10 ms

[Figures: man02-3_7may04. Histograms N(t) of request-response latency (µs) for 0.5, 1.0 and 2.0 Mbyte responses, each with request spacing 0 µs and with request spacing 10 ms ("w 10ms"). Latency axes span 4280 to 4460 µs for 0.5 Mbytes, 8580 to 8760 µs for 1.0 Mbytes and 17080 to 17260 µs for 2.0 Mbytes.]


Man-CERN: TCP Request-Response Latency

DataTAG PC – backup link
BE tests
Request spacing 0 µs
Window size 2.5 Mbytes

Compare with UDP latency: large differences

Rtt of 20 ms: delay*bw = 2.5 Mbytes

1 Mbyte data = 690 pkts: interesting bursts!
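The window size and packet count on this slide are simple arithmetic, worked through in the sketch below. The 1 Gbit/s rate is an assumption implied by, but not stated on, the slide, and the 1448-byte payload per segment (a 1500-byte MTU less IP and TCP headers with timestamps) is also an assumption; the function names are hypothetical.

```python
import math


def bandwidth_delay_product_bytes(rate_bit_s, rtt_s):
    """Bytes in flight needed to keep a path of this rate and rtt full."""
    return rate_bit_s * rtt_s / 8


def packets_for_message(message_bytes, payload_bytes=1448):
    """Number of TCP segments needed; 1448 bytes per segment is an assumption."""
    return math.ceil(message_bytes / payload_bytes)


if __name__ == "__main__":
    # Rtt of 20 ms from the slide, assumed 1 Gbit/s path.
    print(bandwidth_delay_product_bytes(1e9, 0.020) / 1e6, "Mbytes")
    # 1 Mbyte of data taken as 10^6 bytes.
    print(packets_for_message(1_000_000), "packets")
```

With these assumptions the bandwidth-delay product comes to 2.5 Mbytes and a 1 Mbyte message needs about 690 segments, matching the values on the slide.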

[Figures: w05gva-gnt5_7May04_TCP. Latency (µs) vs. message length (0 to 160000 bytes). First panel legend: req-resp UDP latency, ave time. Second panel legend: ave time, min time, max time.]


Man-CERN: UDP Throughput & Packet Loss

DataTAG PC – backup link
BE tests

Throughput

Packet loss

[Figures: w05gva-gnt5_7May04_UDP. Received wire rate (Mbit/s) and % packet loss vs. spacing between frames (µs, 0 to 40), for packet sizes of 50, 100, 200, 400, 600, 800, 1000, 1200, 1400 and 1472 bytes.]


Traffic Flows Manchester – NetNorthWest - SuperJANET Access links

Link to PC in M/c Access links: 1 GE Man to NNW

Total Man to NNW

NNW to SuperJANET4


VLBI Project: Packet Loss – Is it Poisson?

Divide time series of packets into 1000 slices of 50 packets

Total lost packets 1410
Average number / slice = 1.4

Calculate the Poisson probability $P(n, \mu) = \frac{\mu^{n} e^{-\mu}}{n!}$

Curves close but not exact
Could be more than 1 process

[Figure: N(n) vs. n, the number lost in a sub-sample (0 to 15); legend: run12b, 1, 1.3, 1.4, 1.8.]
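The Poisson comparison can be reproduced from the two numbers on the slide, the total number of lost packets and the number of slices. The sketch below is a minimal illustration with hypothetical function names; it computes the expected number of slices containing n lost packets, which is what would be plotted against the measured counts.

```python
import math


def poisson_probability(n, mu):
    """P(n, mu) = mu^n * exp(-mu) / n!"""
    return mu ** n * math.exp(-mu) / math.factorial(n)


def expected_slice_counts(total_lost, n_slices, max_n):
    """Expected number of slices containing n lost packets, for n = 0..max_n."""
    mu = total_lost / n_slices
    return [n_slices * poisson_probability(n, mu) for n in range(max_n + 1)]


if __name__ == "__main__":
    # Figures from the slide: 1410 lost packets over 1000 slices.
    for n, expected in enumerate(expected_slice_counts(1410, 1000, 8)):
        print(n, round(expected, 1))
```

If the measured counts sit above the Poisson curve at both small and large n, losses are more bursty than a single random process would give, which is consistent with the slide's remark that more than one process could be involved.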


Traffic QoS Classes on GÉANT Backbone

Normal Traffic

Normal Traffic + Less Than Best Effort 2.0 Gbit/s

Normal Traffic + Radio Astronomy Data 500 Mbit/s

Normal Traffic + Radio Astronomy Data + Less Than Best Effort 2.0 Gbit/s

Max Throughput on 2.5 G PoS


Some Measurements made during ER2002

[Figures: number out of order (num_badorder) and number lost (num_lost) vs. transfer number, in two panels: No LBE, and With 1.8 Gbit LBE.]