Summary of the Analysis Systems
13th June 2006, David Colling, Imperial College London
Slightly unusual to be asked to summarise a session that everyone has just sat through, so:
• I will try to summarise the important points of each model. This will be a personal view:
- I "manage" a distributed Tier 2 in the UK that currently supports all LHC experiments
- I am involved in CMS computing/analysis
• Then there will be a further opportunity to question the experiment experts about the implementation of the models on the Tier 2s.
Outline
• Firstly, only three of the four LHC experiments plan to do any analysis at the Tier 2s.
• However, conceptually, those three have very similar models.
• They have the majority (if not all) of end-user analysis performed at the Tier 2s. This gives the Tier 2s a crucial role in extracting the physics of the LHC.
• The analyses share the Tier 2s with Monte Carlo production.
Comparing the Models
• The experiments want to be able to control the fraction of the Tier 2 resources used for different purposes (analysis vs. production, analysis A vs. analysis B).
• They all realise that data movement, together with knowledge of the content and location of those data, is vitally important:
- They all separate the data-content and data-location databases
- All have jobs going to the data (see the sketch below)
• The experiments all realise that there is a need to shield the user from the complexity of the WLCG.
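As a rough illustration of that separation (all names and datasets hypothetical): a content catalogue answers "what is in this dataset?", a location catalogue answers "which sites hold it?", and the broker sends the job to a site that already hosts the data.

#include <iostream>
#include <map>
#include <string>
#include <vector>

int main() {
    // Content catalogue: dataset -> logical file names (what the data are).
    std::map<std::string, std::vector<std::string>> content = {
        {"/top/candidates", {"lfn:/store/top/file1.root",
                             "lfn:/store/top/file2.root"}}};

    // Location catalogue: dataset -> sites holding a replica (where they are).
    std::map<std::string, std::vector<std::string>> location = {
        {"/top/candidates", {"T2_London", "T2_SouthGrid"}}};

    // "Jobs go to the data": broker the job to a hosting site
    // rather than moving the files to the job.
    const std::string dataset = "/top/candidates";
    std::cout << "Submit analysis of " << dataset << " ("
              << content[dataset].size() << " files) to "
              << location[dataset].front() << "\n";
}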
So where do they differ?
• Implementation. Here they differ widely on:
- What services need to be installed/maintained at each Tier 2
- What additional software they need above the "standard" grid installations
- Details of the job submission system (e.g. pilot jobs or not, very different UIs, etc.)
• How they handle different Grids
• Maturity:
- CMS has a system capable of running >100K jobs/month, whereas Atlas only has a few hundred GB of appropriate data
Implementations
Let's start with Atlas…
• Different implementations on different Grids.
Looking at the Atlas EGEE implementation:
• No services required at the Tier 2; only software, installed by the SGM (experiment software manager) account.
• All services (the file catalogue, the data-moving services of Don Quixote, etc.) run at the local Tier 1.
• As a Tier 2 "manager" this makes me very happy, as it minimises the support load at the Tier 2 and leaves it to experts at the Tier 1. It means that all sites within the London Tier 2 will be available for Atlas analysis.
Accessing data for analysis in the Atlas EGEE installation
[Diagram: the dataset catalogue at Tier 0 is consulted over http; a VO Box and the LRC at the Tier 1 resolve replicas via the lrc protocol; FTS transfers files to the Tier 2 SE over gridftp; jobs on the Tier 2 CE read from the SE via rfio, dcap or nfs.]
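To make those access protocols concrete: from ROOT an analysis job just opens a URL, and the matching I/O plugin handles the transport. A minimal sketch, with hypothetical endpoints and paths:

// ROOT picks the I/O plugin from the URL scheme; which scheme works
// depends on the Tier 2's storage system (all endpoints hypothetical).
TFile *f = TFile::Open(
    "dcap://se.tier2.example:22125//pnfs/tier2.example/atlas/aod.root");
if (!f || f->IsZombie())  // e.g. fall back to rfio for a DPM/Castor SE
    f = TFile::Open("rfio:/dpm/tier2.example/home/atlas/aod.root");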
Atlas Implementations
Prioritisation mechanism will come from the EGEE Priorities Working Group.
[Diagram: jobs routed to separate CE queues for Production, Long and Short analysis, and Software installation, with shares of 70%, 20%, 9% and 1%.]
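The Working Group mechanism was still to come at this point; purely as an illustration, a site could approximate shares like those in the diagram with ordinary batch-system fair-share, for example in a Maui scheduler configuration (group names hypothetical; this is not the Working Group's actual mechanism):

# maui.cfg fragment: fair-share targets approximating the diagram's shares
FSPOLICY   DEDICATEDPS          # account usage in dedicated processor-seconds
FSDEPTH    7                    # number of fair-share windows remembered
FSINTERVAL 24:00:00             # length of one window
FSWEIGHT   1

GROUPCFG[atlasprd] FSTARGET=70  # production
GROUPCFG[atlaslng] FSTARGET=20  # long analysis
GROUPCFG[atlassht] FSTARGET=9   # short analysis
GROUPCFG[atlassgm] FSTARGET=1   # software installation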
Atlas Implementations and maturity
US: using the Panda system
• Much more work at the Tier 2.
• However, the US Tier 2s seem to be better endowed with support effort, so this may not be a problem.
NorduGrid
• Implementation still ongoing.
Maturity
• Only a few hundred GB of appropriate data.
• Experience of SC4 will be important, especially for Don Quixote.
CMS Implementation
• Requires the installation of some services at the Tier 2s: PhEDEx and a trivial file catalogue (an illustrative catalogue rule appears after this list).
• However, it is possible to run the instances for different sites within a distributed Tier 2 at a single site.
• So as a distributed Tier 2 "manager" I am not too unhappy… for example, in the UK I can see all sites in the London Tier 2 and in SouthGrid running CMS analysis, but this is less likely in NorthGrid and ScotGrid.
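The trivial file catalogue is not a service to babysit: it is a set of site-local rewrite rules from logical file names to physical ones. An illustrative storage.xml in the CMS rule format (hostnames and paths hypothetical):

<storage-mapping>
  <!-- LFN to PFN for direct (POSIX-like) access -->
  <lfn-to-pfn protocol="direct" path-match="/+store/(.*)"
              result="/pnfs/tier2.example/data/cms/store/$1"/>
  <!-- LFN to PFN for dcap access from the worker nodes -->
  <lfn-to-pfn protocol="dcap" path-match="/+store/(.*)"
              result="dcap://se.tier2.example:22125/pnfs/tier2.example/data/cms/store/$1"/>
  <!-- reverse mapping -->
  <pfn-to-lfn protocol="direct"
              path-match="/pnfs/tier2.example/data/cms/(store/.*)"
              result="/$1"/>
</storage-mapping>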
CMS Implementation across Grids
• Installation as similar as possible across EGEE and OSG.
• The same UI for both… called CRAB.
• CRAB can be used to submit to OSG sites via an EGEE WMS, or directly via Condor-G (an illustrative workflow follows).
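A sketch of the user's view (configuration keys approximate for the tool as it stood in 2006; values hypothetical): a task is described in a small crab.cfg and then driven from the command line.

[CRAB]
jobtype   = cmssw       # the analysis framework job to wrap
scheduler = glite       # EGEE WMS; a Condor-G scheduler targets OSG directly

crab -create     # split the task into jobs
crab -submit     # submit them via the chosen scheduler
crab -status     # poll the jobs
crab -getoutput  # retrieve the output of finished jobs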
CMS Maturity
• PhEDEx has proved to be very reliable since DC04.
• CRAB has been in use since the end of 2004.
• A hundred thousand jobs a month.
• Tens of sites, both for execution and submission.
• Note that there are still failures.
Alice Implementation
• Only really running on EGEE and Alice-specific sites.
• Puts many requirements on a site: xrootd, a VO Box running the AliEn SE and CE, the package-management (PackMan) server, a MonALISA server, an LCG UI and AliEn file transfer (FTD).
• All jobs are submitted via AliEn tools.
• All data are accessed only via AliEn (an illustrative JDL follows).
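For flavour, a minimal AliEn JDL of the kind a user might submit (script and file names hypothetical): Executable names a script registered in the AliEn file catalogue, InputData is a logical file name, and Split produces one sub-job per input file. From an AliEn shell it would be submitted with something like "submit analysis.jdl".

Executable = "runAnalysis.sh";
InputData  = { "LF:/alice/sim/2006/run00123/AliESDs.root" };
Split      = "file";
OutputFile = { "histograms.root" };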
Tier-2 Infrastructure/Setup Example
[Diagram: a VO Box running the AliEn SE and CE, FTD, MonALISA, PackMan and an LCG UI, alongside the LCG CE and the LCG FTS/SRM-SE/LFC; an xrootd redirector, which can run on the VO Box, fronts the disk servers of the storage.]
Port requirements (access: outgoing, plus the following incoming services):
8082: incoming from World, SE (Storage Element)
8083: incoming from World, FTD (FileTransferDaemon)
8084: incoming from CERN, CM (ClusterMonitor)
9991: incoming from CERN, PackMan
1094: incoming from World, xrootd file transfer
Worker-node configuration/requirements are the same for batch processing at Tier 0/1/2 centres (2 GB RAM per CPU, 4 GB local scratch space).
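As an illustration of what that port list implies for a site firewall (iptables syntax; CERN's 137.138.0.0/16 range is used here only as an example of the "from CERN" restriction):

iptables -A INPUT -p tcp --dport 8082 -j ACCEPT                    # SE, world
iptables -A INPUT -p tcp --dport 8083 -j ACCEPT                    # FTD, world
iptables -A INPUT -p tcp --dport 1094 -j ACCEPT                    # xrootd, world
iptables -A INPUT -p tcp --dport 8084 -s 137.138.0.0/16 -j ACCEPT  # ClusterMonitor
iptables -A INPUT -p tcp --dport 9991 -s 137.138.0.0/16 -j ACCEPT  # PackMan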
Alice Implementation
• All data access is via xrootd… this allows innovative access to the data. However, it is a requirement on the site (see the ROOT sketch after this list).
• It may be possible to use xrootd front-ends to standard SRM storage.
• Batch analysis implicitly allows prioritisation through a central job queue.
• However, this does involve using glexec-like functionality.
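The ROOT sketch mentioned above: the client opens a file through the redirector, which hands it to whichever disk server actually holds the data (hostname and path hypothetical):

// ROOT's xrootd plugin handles root:// URLs transparently.
TFile *f = TFile::Open("root://xrd.tier2.example//alice/data/run00123/AliESDs.root");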
Alice Implementation – Batch analysis
[Diagram: the user's JDL enters the central AliEn TaskQueue via the API services; an optimiser applies splitting, requirements, FTS replication and policies; the AliEn CE at a Tier 2 matches waiting JDLs and submits agents through an LCG UI and RB to the LCG CE and local batch system; the agents pull the real jobs, which run ROOT, resolve their inputs (XML collections) through the AliEn FC, and read from the Tier-2 SE via xrootd.]
Alice Implementation
• As a distributed Tier 2 "manager", this setup does not fill me with joy.
• I cannot imagine installing such VO Boxes within the London Tier 2, and I would be surprised if any UK Tier 2 sites (with the exception of Birmingham) install such boxes.
Alice Implementation
Interactive Analysis
• More important to Alice than to the other experiments.
• A novel and interesting approach based on PROOF and xrootd (a sketch follows).
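A minimal sketch of such a session from ROOT (cluster URL, tree and selector names hypothetical): the PROOF master partitions the dataset across the workers, which read their share via xrootd.

TProof *p = TProof::Open("proof://caf.tier2.example");  // connect to the cluster
TDSet *data = new TDSet("TTree", "esdTree");            // a set of trees to analyse
data->Add("root://xrd.tier2.example//alice/data/run00123/AliESDs.root");
p->Process(data, "MySelector.cxx+");                    // run the selector in parallel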
Maturity
• Currently, only a handful of people are trying to perform Grid-based analysis.
• Not a core part of the SC4 activity for Alice.
Conclusions
• Three of the four experiments plan to use Tier 2 sites for end-user analysis.
• These three experiments have conceptually similar models (at least for batch analysis).
• The implementations of these similar models have very different implications for the Tier 2s supporting the VOs.
Discussion…