
Investigating Network Performance – A Case Study

Ralph Spencer, Richard Hughes-Jones, Matt Strong and Simon Casey

The University of Manchester

G2 Technical Workshop, Cambridge, Jan 2006

Very Long Baseline Interferometry

eVLBI – using the Internet for data transfer

GRS 1915+105: a 15 solar-mass black hole in an X-ray binary: MERLIN observations

[Image: MERLIN observations showing the receding jet component; 600 mas = 6000 A.U. at 10 kpc.]

Sensitivity in Radio Astronomy

• Noise level ∝ 1/√(Bτ), where B is the bandwidth and τ the integration time
• High sensitivity requires large bandwidths as well as large collecting area, e.g. Lovell, GBT, Effelsberg, Cambridge 32-m
• Aperture synthesis needs signals from individual antennas to be correlated together at a central site
• Need for interconnection data rates of many Gbit/s
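For reference, the noise relation in the first bullet is the standard radiometer equation. In the usual notation, with system temperature $T_{\rm sys}$, bandwidth $B$ and integration time $\tau$:

\[ \Delta S \;\propto\; \frac{T_{\rm sys}}{\sqrt{B\,\tau}} \]

Sensitivity therefore improves only as the square root of bandwidth: assuming the recorded data rate scales with $B$, moving from 128 Mb/s to 512 Mb/s recording lowers the noise by a factor of 2.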

New Instruments are making the best use of bandwidth:

• eMERLIN: 30 Gbps
• Atacama Large mm Array (ALMA): 120 Gbps
• EVLA: 120 Gbps
• Upgrade to European VLBI (eVLBI): 1 Gbps
• Square Km Array (SKA): many Tbps

The European VLBI Network (EVN)

• Detailed radio imaging uses antenna networks over 100s-1000s km

• Currently use disk recording at 512Mb/s (Mk5)

• Real-time connection allows greater
– response
– reliability
– sensitivity
• Need the Internet

eVLBI

[Map: EVN-NREN connectivity – Westerbork (Netherlands, dedicated Gbit link), Onsala (Sweden, Gbit link via Chalmers University of Technology, Gothenburg), Jodrell Bank (UK), Cambridge (UK, MERLIN), Medicina (Italy), Torun (Poland, Gbit link); DWDM link to the correlator at Dwingeloo.]

Testing the Network for eVLBI

Aim: to obtain the maximum bandwidth compatible with VLBI observing systems in Europe and the USA.

First sustained data-flow tests in Europe: iGRID 2002, 24-26 September 2002, Amsterdam Science and Technology Centre (WTCW), The Netherlands.

“We hereby challenge the international research community to demonstrate applications that benefit from huge amounts of bandwidth!”

iGRID2002 Radio Astronomy VLBI Demo.

• Web-based demonstration sending VLBI data – a controlled stream of UDP packets at 256-500 Mbit/s
• Production network: Manchester – SuperJANET4 – GÉANT – Amsterdam
• Dedicated lambda: Amsterdam – Dwingeloo

The Works:

[Diagram: the sender reads data from a Raid0 disc into a ring buffer and transmits it as a controlled UDP stream (n bytes per packet, a set wait time between packets); the receiver's ring buffer writes to a Raid0 disc. A TCP control channel and a web interface manage the transfer.]
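The "n bytes / wait time" loop in the diagram is the heart of the controlled UDP stream: choose a packet size and an inter-packet wait to hit a target rate. A minimal sketch in Python (the address and rate below are illustrative; the actual test tools are dedicated C programs):

```python
import socket, time

def paced_udp_send(dest, packet_bytes=1472, rate_mbps=500.0, duration_s=10.0):
    """Send fixed-size UDP packets at a controlled rate.

    The inter-packet wait is chosen so that
    packet_bytes * 8 / wait equals the target user-data rate.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = bytes(packet_bytes)
    wait = packet_bytes * 8 / (rate_mbps * 1e6)   # seconds between sends
    end = time.monotonic() + duration_s
    next_send = time.monotonic()
    sent = 0
    while time.monotonic() < end:
        sock.sendto(payload, dest)
        sent += 1
        next_send += wait
        # busy-wait: microsecond-scale spacing is too fine for time.sleep
        while time.monotonic() < next_send:
            pass
    return sent

# e.g. 1472-byte packets at 500 Mbit/s need a spacing of
# 1472 * 8 / 500e6 ≈ 23.6 microseconds per packet.
# paced_udp_send(("192.0.2.10", 5001))   # hypothetical receiver address
```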

UDP Throughput on the Production WAN

• Manchester – UvA (SARA): 750 Mbit/s over SuperJANET4 + GÉANT + SURFnet, 75% of the Manchester access link
• Manchester – UvA (SARA): 825 Mbit/s

[Plots: UDP Manchester – UvA gigabit tests, 19 May 02 and 28 Apr 02 – received wire rate (Mbit/s, 0-1000) vs transmit time per frame (µs, 0-40), for packet sizes from 50 to 1472 bytes.]

How do we test the network?

• Simple connectivity test from telescope site to correlator (at JIVE, Dwingeloo, The Netherlands, or MIT Haystack Observatory, Massachusetts): traceroute, bwctl
• Performance of link and end hosts: UDPmon, iperf (the measurement idea is sketched below)
• Sustained data tests: vlbiUDP (under development)
• True eVLBI data from a Mk5 recorder: pre-recorded (Disk2Net) or real time (Out2Net)
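UDPmon itself is a dedicated C tool; a minimal Python sketch of the core measurement it reports (the port number and the 4-byte sequence-number packet header are assumptions for illustration – sequence numbers are what make loss and re-ordering visible):

```python
import socket, struct, time

def udp_receive_stats(port=5001, expect=100000):
    """Receive sequence-numbered UDP packets; report rate, loss, re-ordering."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    got, reordered, last_seq = 0, 0, -1
    first = None
    nbytes = 0
    while got < expect:
        data, _ = sock.recvfrom(9000)            # large enough for jumbo frames
        now = time.monotonic()
        if first is None:
            first = now
        (seq,) = struct.unpack_from("!I", data)  # assumed 4-byte sequence header
        if seq < last_seq:
            reordered += 1
        last_seq = max(last_seq, seq)
        got += 1
        nbytes += len(data)
    elapsed = now - first
    rate_mbps = nbytes * 8 / elapsed / 1e6
    # assumes sequence numbers start at 0, so last_seq+1 packets were sent
    loss_pct = 100.0 * (last_seq + 1 - got) / (last_seq + 1)
    print(f"recv rate {rate_mbps:.0f} Mbit/s, loss {loss_pct:.3f} %, "
          f"{reordered} re-ordered")
```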

Mk5s are 1.2 GHz PIII machines with StreamStor cards and 8-pack exchangeable disks giving 1.3 Tbytes of storage (about 2.9 hours of recording at full rate). Capable of 1 Gbps continuous recording and playback. Made by Conduant to a Haystack design.

Telescope connections

[Map: telescope connections to the JIVE correlator – Jodrell Bank (UK), Onsala (Sweden), Medicina (Italy), Torun (Poland), Effelsberg (Germany), Westerbork (Netherlands) and Cambridge (UK, MERLIN); labelled link rates include 1 Gb/s (three links), 155 Mb/s, 2 × 1G, “1 Gb/s lit now”, and e-MERLIN (?? end 06 ???).]

eVLBI Milestones

• January 2004: disk-buffered eVLBI session
– Three telescopes at 128 Mb/s for the first eVLBI image
– Onsala – Westerbork fringes at 256 Mb/s
• April 2004: three-telescope, real-time eVLBI session
– Fringes at 64 Mb/s
– First real-time EVN image, at 32 Mb/s
• September 2004: four-telescope real-time eVLBI
– Fringes to Torun and Arecibo
– First EVN eVLBI science session
• January 2005: first “dedicated light-path” eVLBI
– ?? Gbyte of data from the Huygens descent transferred from Australia to JIVE
– Data rate ~450 Mb/s
• 20 December 2004
– Connection of JBO to Manchester by 2 x 1 GE
– eVLBI tests between Poland, Sweden, UK and the Netherlands at 256 Mb/s
• February 2005
– TCP and UDP memory-to-memory tests at rates up to 450 Mb/s (TCP) and 650 Mb/s (UDP)
– Tests showed inconsistencies between Red Hat kernels; rates of only 128 Mb/s obtained on 10 Feb
– Haystack (US) – Onsala (Sweden) runs at 256 Mb/s
• 11 March 2005: science demo
– JBO telescope wind-stowed; a short run on a calibrator source was done

Summary of EVN eVLBI tests

• Regular tests with eVLBI Mk5 data every ~6 weeks
– 128 Mbps OK, 256 Mbps often, 512 Mbps Onsala – JIVE occasionally
– but not JBO at 512 Mbps – WHY NOT? (NB using jumbo packets, 4470 or 9000 bytes)
• Note the correlator can cope with large error rates, up to ~1%
– but high throughput is needed for sensitivity
– implications for protocols, since TCP throughput is very sensitive to packet loss (see the calculation below)
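That sensitivity can be quantified with the well-known Mathis et al. approximation for steady-state TCP throughput, rate ≈ (MSS/RTT) · 1.22/√p. A quick illustrative calculation, using the 9000-byte jumbo MSS from the slide and a typical intra-European RTT of ~15 ms (the RTT is an assumption, not from the slides):

```python
def mathis_tcp_rate_mbps(mss_bytes, rtt_s, loss_prob):
    """Mathis et al. approximation for steady-state TCP throughput."""
    return (mss_bytes * 8 / rtt_s) * (1.22 / loss_prob ** 0.5) / 1e6

# 1% loss is fine for the correlator, but with 9000-byte jumbo frames
# and a 15 ms RTT, a TCP stream would average only about:
print(mathis_tcp_rate_mbps(9000, 0.015, 0.01))   # ≈ 59 Mbit/s
```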

UDP Throughput, Oct-Nov 2003, Manchester – Dwingeloo production network

[Plots (UDPmon, 1472-byte packets; Manchester: 2.0 GHz Xeon, Dwingeloo Mk5: 1.2 GHz PIII; runs Gnt5-DwMk5 11 Nov 03 and DwMk5-Gnt5 13 Nov 03), all vs spacing between frames (µs, 0-40):
– Throughput vs packet spacing: near wire rate, 950 Mbps
– % packet loss
– % kernel CPU load, sender
– % kernel CPU load, receiver
4th-year project by Adam Mathews and Steve O’Toole.]
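The “near wire rate, 950 Mbps” figure is essentially the Ethernet limit. Each 1472-byte UDP payload also carries 28 bytes of IP/UDP headers plus 38 bytes of Ethernet framing (header, FCS, preamble, inter-frame gap), so the best possible user-data rate on gigabit Ethernet is:

```python
def max_udp_user_rate_mbps(payload=1472, line_rate_bps=1e9):
    """Best-case UDP user-data rate on Ethernet, given per-frame overheads."""
    ip_udp = 20 + 8                     # IPv4 + UDP headers
    ethernet = 14 + 4 + 8 + 12          # header + FCS + preamble + inter-frame gap
    wire_bytes = payload + ip_udp + ethernet
    return line_rate_bps * payload / wire_bytes / 1e6

print(max_udp_user_rate_mbps())   # ≈ 957 Mbit/s: the observed 950 Mbit/s
                                  # is essentially the wire limit
```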

ESLEA

• Packet loss will cause low throughput in TCP/IP
• Congestion will result in routers dropping packets: use switched light paths!
• Tests with the MB-NG network, Jan – Jun 05
• JBO connected to JIVE via UKLight in June (thanks to John Graham, UKERNA)
• Comparison tests between the UKLight connection JBO – JIVE and the production network (SJ4 – GÉANT)

Project Partners

Project Collaborators

The Council for the Central Laboratory of the Research Councils

Funded by

EPSRC GR/T04465/01

www.eslea.uklight.ac.uk

£1.1 M, 11.5 FTE

UKLight Switched light path

Tests on the UKLight switched light path, Manchester : Dwingeloo

• Throughput as a function of inter-packet spacing (2.4 GHz dual Xeon machines)
• Packet loss only for small packet sizes
• Maximum-size packets reach full line rate with no loss, and there was no re-ordering (plot not shown)

[Plots: gig03-jiveg1_UKL_25Jun05 – received wire rate (Mbit/s, 0-1000) and % packet loss (log scale, 0.0001-100) vs spacing between frames (µs, 0-40), for packet sizes from 50 to 1472 bytes.]
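The loss seen only at small packet sizes points to a per-packet, not per-byte, cost in the end hosts: the smaller the packets, the more of them per second the kernel must handle to fill the link. A rough illustration of the frame rates involved:

```python
def packets_per_sec_at_line_rate(payload_bytes, line_rate_bps=1e9):
    """Frame rate needed to saturate the link with a given UDP payload size."""
    wire_bytes = payload_bytes + 28 + 38   # IP+UDP headers, Ethernet framing
    return line_rate_bps / (wire_bytes * 8)

for size in (50, 400, 1472):
    print(f"{size:5d} bytes -> {packets_per_sec_at_line_rate(size):>10,.0f} pkt/s")
# 50-byte payloads need ~1.1 M packets/s at line rate, vs ~81 k/s for
# 1472 bytes: the small-packet loss reflects per-packet CPU cost in the hosts.
```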

Tests on the production network Manchester : Dwingeloo.

• Throughput

• Small (0.2%) packet loss was seen

• Re-ordering of packets was significant

[Plot: gig6-jivegig1_31May05 – % packet loss (log scale, 0.0001-100) vs spacing between frames (µs, 0-40), for packet sizes from 50 to 1472 bytes.]

UKLight using Mk5 recording terminals

[Map: UKLight light paths linking Jodrell Bank (UK), Medicina (Italy) and Torun (Poland) to Dwingeloo (DWDM link).]

e-VLBI at the GÉANT2 Launch, June 2005

UDP performance: 3 flows on GÉANT
• Throughput: 5-hour run, 1500-byte MTU
• Jodrell → JIVE: 2.0 GHz dual Xeon → 2.4 GHz dual Xeon, 670-840 Mbit/s
• Medicina (Bologna) → JIVE: 800 MHz PIII → Mk5 (623) 1.2 GHz PIII, 330 Mbit/s, limited by the sending PC
• Torun → JIVE: 2.4 GHz dual Xeon → Mk5 (575) 1.2 GHz PIII, 245-325 Mbit/s, limited by security policing (>600 Mbit/s → 20 Mbit/s) ?
• Throughput over a 50-minute period shows oscillation with a ~17-minute period

[Plots: BW 14Jun05 – received wire rate (Mbit/s, 0-1000) vs time (10 s steps) for the Jodrell, Medicina and Torun flows; full run (0-2000 steps) and a zoom on steps 200-500 showing the oscillation.]

18-Hour Flows on UKLight, Jodrell – JIVE, 26 June 2005

• Throughput: Jodrell → JIVE, 2.4 GHz dual Xeon → 2.4 GHz dual Xeon, 960-980 Mbit/s
• Traffic through SURFnet
• Packet loss: only 3 groups, with 10-150 lost packets each; no packets lost the rest of the time
• Packet re-ordering: none

[Plots: man03-jivegig1_26Jun05 – received wire rate (Mbit/s) vs time (10 s steps, 0-7000) over the full run; a zoom on steps 5000-5200 showing 960-980 Mbit/s; and packet loss (log scale, 1-1000) vs time showing the three loss events.]

Recent Results 1:

• iGRID 2005 and SC 2005
– Global eVLBI demonstration
– Achieved 1.5 Gbps across the Atlantic using UKLight
– 3 × VC-3-13c (~700 Mbps) SDH links carrying data across the Atlantic from the Onsala, JBO and Westerbork telescopes
– 512 Mbps K4 – Mk5 data from Japan to the USA
– 512 Mbps Mk5 real-time interferometry between the Onsala, Westford and Maryland Point antennas, correlated at Haystack Observatory
– Used VLSR technology from the DRAGON project in the US to set up light paths

[Photos: JBO Mk2, Westerbork array, Onsala 20-m, Kashima 34-m.]

Recent Results 2:

• Why can Onsala achieve 512 Mbps from Mk5 to Mk5, even transatlantic?
– Identical Mk5 to JBO’s, and a longer link
• iperf TCP, JBO Mk5 to Manchester: RTT ~1 ms, 4420-byte packets, 960 Mbps
• iperf TCP, JBO Mk5 to JIVE: RTT ~15 ms, 4420-byte packets, 777 Mbps

Not much wrong with the networks!
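Those two iperf numbers can be read through the bandwidth-delay product: sustained rate × RTT is the data TCP must keep in flight, so the window (and socket buffers) must be at least that large. Illustrative arithmetic, not from the slides:

```python
def implied_window_bytes(rate_mbps, rtt_ms):
    """Bandwidth-delay product: data in flight at a sustained TCP rate."""
    return rate_mbps * 1e6 / 8 * rtt_ms / 1e3

print(implied_window_bytes(960, 1))    # JBO -> Manchester: ~120 kB in flight
print(implied_window_bytes(777, 15))   # JBO -> JIVE: ~1.46 MB in flight,
                                       # so large TCP buffers are essential
```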

• CPU load during transmission (plots below):
– one run shows 94.7% kernel usage and 1.5% idle
– another shows 96.3% kernel usage and 0.06% idle – no CPU left!
• The likelihood is that the Onsala Mk5 has a marginally faster CPU, right at the critical point for 512 Mbps transmission
• Solution: better motherboards for the Mk5s – about 40 machines to upgrade!

[Plots: mk5-606-jive_9Dec05 – % CPU by mode (kernel, user, nice, idle) per trial; mk5-606-g7_10Dec05 – throughput (Mbit/s, 0-1000) vs nice value (large value = low priority), with no other CPU load.]

The Future:

• Regular eVLBI tests in the EVN continue
• Testing the Mk5 StreamStor interface <-> network interaction
• Test upgraded Mk5 recording devices
• Investigate alternatives to TCP/UDP – DCCP, vlbiUDP, tsunami, etc.
• ESLEA comparing UKLight with the production network
• The EU’s EXPReS eVLBI project starts March 2006
– Connection of the 100-m Effelsberg telescope in 2006
– Protocols for distributed processing
– Onsala – JBO correlator test link at 4 Gbps in 2007
• eVLBI will become routine in 2006!

VLBI Correlation: a GRID computation task

[Diagram: processing nodes connected to a controller / data concentrator.]

Questions?
