The Worldwide LHC Computing Grid WLCG Service Ramp-Up LHCC Referees’ Meeting, January 2007


Page 1

The Worldwide LHC Computing Grid

WLCG Service Ramp-Up

LHCC Referees’ Meeting, January 2007

Page 2

Ramp-Up Outline

The clear goal for 2007 is to be ready for first data taking ahead of the machine itself

• This translates to:
– Dress Rehearsals in the 2nd half of the year
– Preparation for these in the 1st half
– Continuous service operation and hardening
– Continual (quasi-continuous) experiment production

• Different views:
– Experiment, site, Grid-specific, WLCG…

• Will focus on the first and (mainly) the last of these…
– Other views, in particular site views, will come shortly

Page 3


WLCG Commissioning Schedule

Still an ambitious programme ahead

Timely testing of the full data chain from DAQ to T-2 was a major item from the last CR

DAQ → T-0 still largely untested

Page 4

Service Ramp-Up

• As discussed at last week’s WLCG Collaboration Workshop, much work has already been done on service hardening
– Reliable hardware, improved monitoring & logging, middleware enhancements

Much still remains to be done – this will be an on-going activity during the rest of 2007 and probably beyond

• The need to provide as much robustness as possible in the services themselves – as opposed to constant baby-sitting – is well understood

• There are still new / updated services to deploy in full production (see previous slide)

It is unrealistic to expect that all of these will be ready prior to the start of the Dress Rehearsals

• Foresee a ‘staged approach’ – focussing on maintaining and improving both service stability and functionality (‘residual services’)

Must remain in close contact with both experiments and sites on schedule and service requirements – these will inevitably change with time

• Draft of experiment schedule (from December 2006) attached to agenda
• Updated schedules presented last Friday during WLCG w/s (pointer)

Page 5

ATLAS 2007 Timeline

Running continuously throughout the year (increasing rate):
– Simulation production
– Cosmic ray data-taking (detector commissioning)

January to June: Data streaming tests
February and May: Intensive Tier0 tests
From February onwards: Data Distribution tests
From March onwards: Distributed analysis (intensive tests)
May to July: Calibration Data Challenge
June to October: Full Dress Rehearsal
November: GO!

Page 6

CMS Timeline (Stefano Belforte, INFN Trieste)

February: Deploy PhEDEx 2.5; T0-T1, T1-T1, T1-T2 independent transfers; restart job robot; start work on SAM; FTS full deployment
March: SRM v2.2 tests start; T0-T1(tape)-T2 coupled transfers (same data); measure data serving at sites (esp. T1); production/analysis share at sites verified
April: Repeat transfer tests with SRM v2.2, FTS v2; scale up job load; gLite WMS test completed (synch. with ATLAS)
May: Start ramping up to CSA07
June:

Page 7

WLCG Milestones

• These high-level milestones are complementary to the experiment-specific milestones and more detailed goals and objectives listed in the WLCG Draft Service Plan (see attachment to agenda)
– Similar to that prepared and maintained in previous years
– Regularly reviewed and updated through LCG ECM
– Regular reports on status and updates to WLCG MB / GDB

• Focus is on real production scenarios & (moving rapidly to) end-to-end testing
– Time for component testing is over – we learnt a lot, but not enough!
– Time before data taking is very short – let alone the dress rehearsals
• All data rates refer to the Megatable and to pp running
• Any ‘factors’, such as accelerator and/or service efficiency, are mentioned explicitly
– N.B. ‘catch-up’ is a proven feature of the end-to-end FTS service
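The ‘catch-up’ point lends itself to a back-of-envelope check: while a channel is down, data keeps arriving at the nominal rate, and the resulting backlog is drained using whatever headroom exists above nominal. A minimal sketch in Python, with purely illustrative rates rather than Megatable figures:

```python
# Back-of-envelope "catch-up" estimate for a Tier0 -> Tier1 export channel.
# All rates are illustrative assumptions, not Megatable values.

def catchup_hours(nominal_mb_s: float, peak_mb_s: float, downtime_h: float) -> float:
    """Hours of running at the peak rate needed to drain the backlog that
    accumulates while the channel is down but data keeps arriving at the
    nominal rate."""
    headroom = peak_mb_s - nominal_mb_s
    if headroom <= 0:
        raise ValueError("peak rate must exceed nominal rate to catch up")
    backlog_mb = nominal_mb_s * downtime_h * 3600.0  # data accumulated during the outage
    return backlog_mb / (headroom * 3600.0)

# A hypothetical 200 MB/s channel that can burst to 300 MB/s needs
# 16 hours above nominal to recover from an 8-hour outage:
print(catchup_hours(nominal_mb_s=200.0, peak_mb_s=300.0, downtime_h=8.0))  # 16.0
```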

Page 8

Q1 2007 – Tier0 / Tier1s

1. Demonstrate Tier0-Tier1 data export at 65% of full nominal rates per site using experiment-driven transfers
– Mixture of disk / tape endpoints as defined by experiment computing models, i.e. 40% tape for ATLAS; transfers driven by experiments
– Period of at least one week; daily VO-averages may vary (~normal)

2. Demonstrate Tier0-Tier1 data export at 50% of full nominal rates (as above) in conjunction with T1-T1 / T1-T2 transfers
– Inter-Tier transfer targets taken from ATLAS DDM tests / CSA06 targets

3. Demonstrate Tier0-Tier1 data export at 35% of full nominal rates (as above) in conjunction with T1-T1 / T1-T2 transfers and Grid production at Tier1s
– Each file transferred is read at least once by a Grid job
– Some explicit targets for WMS at each Tier1 need to be derived from above

4. Provide SRM v2.2 endpoint(s) that implement(s) all methods defined in the SRM v2.2 MoU; all critical methods pass tests
– See attached list; levels of success: threshold, pass, success, (cum laude)
– This is a requirement if production deployment is to start in Q2!
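Milestone 1 boils down to a simple numerical check per site: take the site’s nominal export rate, apply the 65% factor, and compare against the daily VO-averaged rates over at least one week. A minimal sketch, with a hypothetical nominal rate and invented daily averages:

```python
# Checking WLCG milestone 1 for a single (hypothetical) Tier1 site.
# The nominal rate and the daily averages below are invented for illustration.

NOMINAL_MB_S = 200.0                 # hypothetical nominal export rate for the site
TARGET_MB_S = 0.65 * NOMINAL_MB_S    # milestone 1 target: 130 MB/s

# Daily VO-averaged transfer rates over one week (MB/s), invented:
daily_avg_mb_s = [142.0, 128.5, 155.3, 131.0, 138.7, 149.2, 133.4]

week_avg = sum(daily_avg_mb_s) / len(daily_avg_mb_s)
days_below = sum(1 for r in daily_avg_mb_s if r < TARGET_MB_S)

print(f"target: {TARGET_MB_S:.0f} MB/s, weekly average: {week_avg:.1f} MB/s")
print(f"days below target: {days_below} (daily averages may vary around the target)")
```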

Page 9

Q2 2007 – Tier0 / Tier1s

• As Q1, but using SRM v2.2 services at Tier0 and Tier1, gLite 3.x-based services and SL(C)4 as appropriate (higher rates for T1<->T1/T2?)

• Provide services required for Q3 dress rehearsals
– Includes, for example, production Distributed Database Services at required sites & scale

• Full detail to be provided in coming weeks…

Page 10

Measuring Our Level of Success

• Existing tools and metrics, such as the CMS PhEDEx quality plots and ATLAS DDM transfer status, provide clear and intuitive views

These plots are well known to the sites and provide a good measure of current status as well as showing evolution with time

• Need metrics for WMS related to milestone 3
– CMS CSA06 metrics are a good model
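For illustration, the essence of a PhEDEx-style quality plot is a per-link, per-day success fraction over transfer attempts. A minimal sketch with invented link names and counts (the real plots are produced by PhEDEx itself from its transfer database):

```python
# Per-link, per-day transfer "quality" = successful attempts / total attempts.
# Link names and counts below are invented for illustration.
from collections import defaultdict

# (day, source, destination, succeeded, failed)
attempts = [
    ("2007-01-15", "CERN", "FZK", 940, 60),
    ("2007-01-15", "CERN", "RAL", 720, 280),
    ("2007-01-16", "CERN", "FZK", 880, 120),
]

quality = defaultdict(dict)
for day, src, dst, ok, failed in attempts:
    total = ok + failed
    quality[(src, dst)][day] = ok / total if total else None

for (src, dst), per_day in sorted(quality.items()):
    for day, q in sorted(per_day.items()):
        print(f"{src} -> {dst}  {day}: quality {q:.2f}")
```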

Page 11

Page 12


DDM Functional Test 2006 (9 Tier-1s, 40 Tier-2s)

Tier-1 | Tier-2s | Sept 06 | Oct 06 | Nov 06
ASGC | IPAS, Uni Melbourne | Failed within the cloud | Failed for Melbourne | T1-T1 not tested
BNL | GLT2, NET2, MWT2, SET2, WT2 | done | done | 2+GB & DPM
CNAF | LNF, Milano, Napoli, Roma1 | 65% failure rate | done | –
FZK | CSCS, CYF, DESY-ZN, DESY-HH, FZU, WUP | Failed from T2 to FZK | dCache problem | T1-T1 not tested
LYON | BEIJING, CPPM, LAPP, LPC, LPHNE, SACLAY, TOKYO | done | done, FTS conn =< 6 | –
NG | – | not tested | not tested | not tested
PIC | IFAE, IFIC, UAM | Failed within the cloud | done | –
RAL | CAM, EDINBURGH, GLASGOW, LANCS, MANC, QMUL | Failed within the cloud | Failed for Edinburgh | done
SARA | IHEP, ITEP, SINP | Failed IHEP | not tested | IHEP in progress
TRIUMF | ALBERTA, TORONTO, UniMontreal, SFU, UVIC | Failed within the cloud | Failed T1-T1 | not tested
New DQ2 release (0.2.12) after SC4 test
Page 13

Summary

• 2007 will be an extremely busy and challenging year!

For those of us who have been working on LHC Computing for 15+ years (and others too…), it will nonetheless be extremely rewarding

Is there a more important Computing Challenge on the planet this year?

The ultimate goal – to enable the exploitation of the LHC’s physics discovery potential – is beyond measure