
Page 1: BNL Wide Area Data Transfer for RHIC & ATLAS: Experience and Plans

Bruce G. Gibbard
CHEP 2006, Mumbai, India

Page 2: Introduction

The scale of computing required by modern High Energy and Nuclear Physics experiments cannot be met by a single institution, funding agency, or even country

Grid computing, integrating widely distributed resources into a seamless facility, is the solution of choice

A critical aspect of such Grid computing is the ability to move massive data sets over great distances in near real time

High bandwidth wide area transfer rates
Long term sustained operations

Page 3: Specific Needs at Brookhaven

HENP Computing at BNL:
Tier 0 center for the Relativistic Heavy Ion Collider – the RHIC Computing Facility (RCF)
US Tier 1 center for the ATLAS experiment at the CERN LHC – the ATLAS Computing Facility (ACF)

RCF requires data transfers to collaborating facilities, such as the RIKEN center in Japan

ACF requires data transfers from CERN and on to ATLAS Tier 2 centers (universities), such as Boston, Chicago, Indiana, Texas/Arlington

Page 4: BNL Staff Involved

Those involved in this work at BNL were members of the RHIC and ATLAS Computing Facility and of the PHENIX and ATLAS experiments

Not named here, there were of course similar contributing teams at the far ends of these transfers: CERN, RIKEN, Chicago, Boston, Indiana, Texas/Arlington

M. Chiu, W. Deng, B. Gibbard, Z. Liu, S. Misawa, D. Morrison, R. Popescu, M. Purschke, O. Rind, J. Smith, Y. Wu, D. Yu

Page 5: PHENIX Transfer of Polarized Proton Data to the RIKEN Computing Facility in Japan

Near real time:
In particular, not to tape storage, so no added tape retrieval is required
The transfer should end very shortly after the end of the RHIC run

Part of RHIC Run in 2005 (~270 TB)

Planned again for RHIC Run in 2006
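For scale, a back-of-the-envelope estimate of the average rate implied by ~270 TB is sketched below; the ~12-week transfer window is an assumption for illustration, not a figure from the slides.

```python
# Rough estimate of the average WAN rate implied by the 2005 transfer.
# The ~270 TB figure is from the slide; the ~12-week transfer window is an
# assumption for illustration only.
data_tb = 270                           # data moved to RIKEN CCJ in 2005 (slide)
window_days = 12 * 7                    # assumed transfer window, not from slide
seconds = window_days * 24 * 3600
avg_mb_per_s = data_tb * 1e6 / seconds  # 1 TB = 1e6 MB (decimal units)
print(f"Average rate over {window_days} days: {avg_mb_per_s:.0f} MB/s")
# ~37 MB/s on average; peaks during data taking would have been higher.
```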


Page 7: Typical Network Activity During PHENIX Data Transfer


Page 9: For ATLAS, (W)LCG Exercises

Service Challenge 3 Throughput Phase (WLCG and computing sites develop, tune and demonstrate data transfer capacities)
July ’05
Rerun in Jan ’06

Service Challenge 4
To begin in April 2006

Page 10: BNL ATLAS dCache/HPSS Based SE

(Architecture diagram: a dCache system with SRM, GridFTP, and DCap doors; separate read and write pools; PnfsManager and PoolManager; HPSS as the tape back end. Clients shown include DCap clients, GridFTP clients, SRM clients, and the Oak Ridge batch system; control and data channels are indicated separately.)
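To make the data path concrete, here is a minimal, hypothetical sketch of pushing a file through one of the GridFTP doors of such a storage element; the host names and paths are placeholders, a valid grid proxy is assumed, and only the basic parallel-stream option of globus-url-copy is used.

```python
# Minimal sketch: copying a file into a GridFTP door of a dCache SE.
# Hostnames and paths below are hypothetical placeholders, and this assumes
# a Globus client installation with a grid proxy already initialized.
import subprocess

def gridftp_copy(src_url: str, dst_url: str, streams: int = 4) -> None:
    """Copy src_url to dst_url using globus-url-copy with parallel streams."""
    cmd = ["globus-url-copy", "-p", str(streams), src_url, dst_url]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    gridftp_copy(
        "file:///data/run2005/sample_file.root",                                # local source (hypothetical)
        "gsiftp://dcache-door.example.bnl.gov/pnfs/atlas/inbox/sample_file.root",  # SE door (hypothetical)
    )
```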

Page 11: Disk to Disk Phase of SC3

Transfer rates of up to 150 MB/sec were achieved during early standalone operations

Even though FTS (the transfer manager) failed to properly support dCache SRMCP, degrading the performance of the BNL Tier 1 dCache based storage element

Page 12: Overall CERN Operations During Disk to Disk Phase

Saturation of the network connection at CERN required throttling of individual site performance
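The sort of throttling involved can be illustrated with a small sketch that scales each site's demand proportionally when the shared link saturates; the link capacity and per-site demand figures below are assumptions, not numbers from the slides.

```python
# Toy sketch of throttling Tier 1 transfer rates when the shared CERN link
# saturates. The 1250 MB/s capacity and per-site demands are assumptions
# for illustration; they are not figures from the slides.
def throttle(demands_mb_s: dict[str, float], capacity_mb_s: float) -> dict[str, float]:
    """Scale each site's demand down proportionally if the total exceeds capacity."""
    total = sum(demands_mb_s.values())
    if total <= capacity_mb_s:
        return dict(demands_mb_s)            # no contention, no throttling needed
    scale = capacity_mb_s / total
    return {site: rate * scale for site, rate in demands_mb_s.items()}

demands = {"BNL": 150.0, "FNAL": 150.0, "IN2P3": 150.0, "FZK": 150.0,
           "RAL": 150.0, "CNAF": 150.0, "PIC": 100.0, "ASGC": 100.0}  # assumed
print(throttle(demands, capacity_mb_s=1250.0))
```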

Page 13: Disk to Tape Phase

Page 14: dCache Activity During Disk to Tape Phase

Tape writing phase:
Green indicates incoming data
Blue indicates data being migrated out to HPSS, the tape storage system
Rates of 60-80 MBytes/sec were sustained
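The write pools effectively buffer between the WAN inflow and HPSS migration. A toy occupancy model is sketched below; the 70 MB/s migration rate reflects the 60-80 MB/s range above, while the inflow rate and buffer size are assumptions for illustration.

```python
# Toy model of dCache write-pool occupancy during the disk-to-tape phase.
# Migration rate (70 MB/s) reflects the 60-80 MB/s range on the slide;
# the inflow rate and buffer size are assumptions for illustration.
inflow_mb_s = 100.0       # assumed WAN inflow from CERN
migrate_mb_s = 70.0       # mid-range of the sustained HPSS migration rate
buffer_tb = 10.0          # assumed capacity of the write pools

net_fill_mb_s = inflow_mb_s - migrate_mb_s
if net_fill_mb_s > 0:
    hours_to_fill = buffer_tb * 1e6 / net_fill_mb_s / 3600
    print(f"Write pools fill in ~{hours_to_fill:.0f} hours; "
          f"inflow must then be throttled to the tape rate.")
else:
    print("Migration keeps up with inflow; occupancy stays bounded.")
```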

Page 15: SC3 T1 – T2 Exercises

Transfers to 4 Tier 2 sites (Boston, Chicago, Indiana, Texas/Arlington) resulted in aggregate rates of up to 40 MB/sec, but typically ~15 MB/sec and quite inconsistent

Tier 2 sites only supported GridFTP on classic storage elements and were not prepared to support sustained operations

Page 16: Potential Network Contention

BNL has been operating with an OC-48 ESnet WAN connection, with 2 x 1 Gb/sec connectivity over to the ATLAS/RHIC network fabric

Traffic contending for this path:
PHENIX sustained transfers to RIKEN CCJ
ATLAS Service Challenge tests
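A rough capacity check of this shared path is sketched below; the OC-48 and 2 x 1 Gb/sec figures come from the bullet above and the 150 MB/sec rate from the SC3 results, while the PHENIX rate used here is an assumed value for illustration.

```python
# Rough contention check for the pre-upgrade network path (decimal units).
# OC-48 payload is ~2.49 Gb/s.
oc48_mb_s = 2.488e9 / 8 / 1e6          # ~311 MB/s ESnet WAN capacity
lan_mb_s = 2 * 1e9 / 8 / 1e6           # ~250 MB/s over the 2 x 1 Gb/s bonded path

phenix_mb_s = 60.0    # assumed PHENIX -> RIKEN CCJ rate during data taking
sc_mb_s = 150.0       # SC3 disk-to-disk rate quoted earlier in the talk

demand = phenix_mb_s + sc_mb_s
print(f"Combined demand: {demand:.0f} MB/s")
print(f"WAN headroom:    {oc48_mb_s - demand:.0f} MB/s of ~{oc48_mb_s:.0f}")
print(f"LAN headroom:    {lan_mb_s - demand:.0f} MB/s of ~{lan_mb_s:.0f}")
# Under these assumptions the bonded 2 Gb/s path, not the OC-48,
# is the first place contention appears.
```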

Page 17: Network Upgrade

ESnet OC48 WAN connectivity is being upgraded to 2 x 10 Gb/sec

BNL site connectivity from the border router to the RHIC/ATLAS facility is being upgraded to redundant 20 Gb/sec paths

Internally, in place of the previous channel bonding:
ATLAS switches are being redundantly connected at 20 Gb/sec
RHIC switches are being redundantly connected at 10 Gb/sec

All will be complete by the end of this month

Page 18: RHIC/PHENIX Plans ’06

RHIC will run again this year with polarized protons, and so the data will again be transferred to the RIKEN Center in Japan

Data taking rates will be somewhat higher, with a somewhat better duty factor, so the transfer may have to support rates as much as a factor of two higher

Such running is likely to begin in early March

Expect to use SRM for the transfers, rather than just GridFTP, for additional robustness
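One simple view of that added robustness is that SRM-managed copies can be queued and retried; a minimal sketch of retrying the dCache srmcp client is shown below, with hypothetical URLs and an illustrative retry policy rather than the actual PHENIX transfer machinery.

```python
# Illustrative retry wrapper around the dCache SRM copy client (srmcp).
# The URLs are hypothetical placeholders; the real PHENIX transfer system
# is not reproduced here.
import subprocess
import time

def srm_copy(src: str, dst: str, attempts: int = 3, backoff_s: int = 60) -> bool:
    """Attempt an SRM-managed copy, retrying on failure."""
    for attempt in range(1, attempts + 1):
        result = subprocess.run(["srmcp", src, dst])
        if result.returncode == 0:
            return True
        print(f"srmcp attempt {attempt} failed; retrying in {backoff_s}s")
        time.sleep(backoff_s)
    return False

ok = srm_copy(
    "srm://dcache.example.bnl.gov/pnfs/phenix/run6/file.root",   # hypothetical
    "srm://ccj.example.riken.jp/data/phenix/run6/file.root",     # hypothetical
)
```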

Page 19: WLCG Service Challenge 4

Service challenge transfer goals are the nominal real transfer rates required by ATLAS to the US Tier 1 in the first years of LHC operation:
200 MB/sec (disk at CERN to tape at BNL)

Disk to Disk to begin in April, with Disk to Tape to follow as soon as possible

BNL Tier 1 expects to be ready with its new tape system in April to do Disk to Tape

BNL is planning on being able to use dCache SRMCP in these transfers

Tier 2 exercises at a much more serious level are anticipated, using dCache/SRM on storage elements
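For scale, sustaining the 200 MB/sec target corresponds to roughly the following daily and yearly volumes (straight arithmetic on the slide's figure, decimal units).

```python
# Volume implied by sustaining the SC4 target of 200 MB/s into tape at BNL.
rate_mb_s = 200.0
per_day_tb = rate_mb_s * 86400 / 1e6          # ~17 TB/day
per_year_pb = per_day_tb * 365 / 1e3          # ~6.3 PB/year if sustained all year
print(f"{per_day_tb:.1f} TB/day, {per_year_pb:.1f} PB/year if sustained")
```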

Page 20: Conclusions

Good success to date in both ATLAS exercises and RHIC real operations

New round with significantly higher demands within next 1-2 months

Upgrades of network, storage elements, tape systems, and storage element interfacing should make it possible to satisfy these demands