proof - parallel root facility kilian schwarz robert manteufel carsten preuß gsi bring the kb to...

28
PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI http://root.cern.ch Bring the KB to the PB not the PB to the KB

Upload: madlyn-fields

Post on 11-Jan-2016

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

PROOF - Parallel ROOT Facility

Kilian SchwarzRobert Manteufel

Carsten PreußGSI

http://root.cern.ch

Bring the KB to the PB not the PB to the KB

Page 2: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

IntroductionA step towards a solution: (Ali) ROOT + AliEn

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

● ROOT is becoming most popular available physics analysis toolkit.● Interactive analysis work in familiar C++ style syntax● data visualisation, an object-oriented I/O system● crucial role for the LCG project ?!● is successfully used within the AliROOT framework of the ALICE

experiment as an all in one solution ● PROOF extends workstation based concept of ROOT to

the 'parallel ROOT facility'. ● user procedures are kept identical during an analysis session● tasks are distributed automatically in the background

●AliEn as a GRID analysis platform provides two key elements:● a global filesystem

➔ files are indexed and tagged in a virtual file catalogue and everywhere globally accessible

● a global queuesystem➔ global job scheduling according to resource requirements

Page 3: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Parallel Analysis of Event Data

root

Remote PROOF Cluster

proof

proof

proof

TNetFile

TFile

Local PC

$ root

ana.Cstdout/obj

node1

node2

node3

node4

$ root

root [0] tree.Process(“ana.C”)

$ root

root [0] tree.Process(“ana.C”)

root [1] gROOT->Proof(“remote”)

$ root

root [0] tree.Process(“ana.C”)

root [1] gROOT->Proof(“remote”)

root [2] dset->Process(“ana.C”)

ana.C

proof

proof = slave server

proof

proof = master server

#proof.confslave node1slave node2slave node3slave node4

*.root

*.root

*.root

*.root

TFile

TFile

Page 4: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

PROOF - Scalability

Page 5: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

GSI environmentthe prooflogin-script

scanning user-parameters for errorsprocessing user-parameters

scanning LSF-Cluster for PROOF-jobstesting .rootrc

building scripts (cleanup.sh, proofstarter.C and proofd.sh) getting / setting ROOT-version

starting the wanted amount of PROOF-daemonsbuilding .proof.conf and .rootauthrc

starting local rootdstarting ROOT and executing proofstarter.C

killing jobs and processesremoving all builded files

starting PROOF, uploading packages and starting the analysis

Page 6: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

User-parameters/usr/local/bin/prooflogin

-s slave-count-t termination-time-v ROOT-version-f ROOT-files-lib library-files-par file-packages-mol starts the master on the localhost-? / -h / -help help for proof.shoptional : -file a text-file with all parameters

written in

Page 7: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

dedicated batch queue for PROOF

only proof jobs are started in the dedicated Proof queue

Quick Response Queue currently in test operation on 30

nodes of the GSI batch farm In this queue proof jobs have

advantage. Non proof jobs will be set on hold

Page 8: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

PROOF configuration files

$HOME/.rootauthrc (e.g.)proofserv lxb108:kschwarz:0 lxb109:kschwarz:0 lxb110:kschwarz:0

•$HOME/.proof.conf (e.g.)

node lxb108 port=1095 usrpwdslave lxb108 port=1096 usrpwdslave lxb109 port=1095 usrpwdslave lxb109 port=1096 usrpwdslave lxb110 port=1095 usrpwdslave lxb110 port=1096 usrpwd

Page 9: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Interactive Analysis with PROOF

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/JapanAndreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

Basic requirements: ●Analysis data has to be stored as objects derived from TObject in ROOT trees

● proofds have to load extension libraries for user-specific objects toaccess the data members

● Analysis code has to be inserted in the automatically generated selector macro for the object to be analyzed:<classobject>->MakeSelector();

Receipe:-store your objects in trees-use the selector macro for analysis code

Page 10: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Interactive Analysis with AliEn + ROOT/PROOF

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/JapanAndreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

● interactive analysis with Proof steered by a data packetizer

● in a local cluster:➔ cluster wide accessible data can be processed by all

slaves➔ packet takeover by all slaves!

● in a grid environment:➔ site wide accessible data can be processed by all slaves

➔ packet takeover by all slaves within one site !

Work Distribution

Page 11: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Short explanation for creating an analysis script

4 steps: Open ROOT-file. make Selector-files (analysis files) edit header-file edit source-file

Page 12: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Open ROOT-file and create Selector files

Open your file:TFile f(„/u/dvgamma/hitfile.root");

Create Selector files:hittree->MakeSelector(„Anaproof“);

the ROOT-file contains a tree called „hittree“

Page 13: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Edit header file Add your branch:

TBranch *b_myHit;

Set branch address: Fchain->SetBranchAddress(„myHit“,&myHit);

Add user defined objects and some data members(will be explained later)

Page 14: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Class TCounter Dummy class

designed for collecting analysis results from slaves

Keep the data till it‘s catched in SlaveTerminate()

Page 15: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Edit source file (1/2)

Explanation for each function see on top

Analysis is embedded in Anaproof::Process(Int_t entry)

the analysis checks the hits in chambers and counts them

the counterobject collects hitcounters

In Anaproof::SlaveTerminate() add:

fOutput->Add(counterobj);

Page 16: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Edit source file (2/2)

First you get your counterobj as an TObject from the outList

Convert it back before it can be used as an TCounter object

Page 17: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

libraries and packagesTMytrackerhit and TCounter are two user defined classes.To use them in a PROOF session you have to build a package that can beuploaded by a ROOT–demon.You need following dictionary/file mix:

libTMytrackerhit/PROOF-INF/SETUP.C

A look into Setup.C:Int_t SETUP(){ gSystem->Load(„/u/dvgamma/projects/globus/onestep/ROOT/libTMytrackhit.so“); gSystem->Load(„/u/dvgamma/projects/globus/onestep/PROOF/TCounter/TCounter.so“);

return 1;}

To create a package:„tar –czf libTMytrackerhit.par libTMytrackerhit;“

Page 18: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Finally launch a PROOF-Session and start the analysis

Start the PROOF-session via script or manually

Inside of the session:

Upload packages

Enable packages

Create TDSet

Add file

Start analysis

Page 19: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

At least a screenshot

Page 20: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Jointventure of AliEn + ROOT

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/JapanAndreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

APIService

AliEnServices

DBIDB

Proxy Catalogue DB

AliEnC++ API

gSOAPClient

TGridClass

TAlienPlugin

<other>Plugin

●AliEn Services + Catalogue are accessible via TAlien(TGrid) class and global<gGrid> variable in ROOT

● TAlien uses a SOAP based AliEn C++ API Examples:

➔TGrid::Connect(“alien://aliendb1:15000/?direct”,””); // inititate gGrid with AliEn plugin (API server at aliendb1, port 15000)➔gGrid->mkdir(“/alice/acat03”);// create directory in virtual file catalogue

Client Host Client or Remote Host VO Service + DB Hosts

Page 21: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Executable = "root";Packages = "ROOT::3.10.01";Arguments = "Command::ROOT -x macro.C";InputData = {"/alice/production/peters/*Tree.root”};InputFile = {"LF:/alice/user/p/peters/macro.C"};OutputFile = {"myhisto.root"};

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/JapanAndreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

AliEn Job Description ExampleRunning a ROOT macro on registered data

Simple and readable !

Page 22: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Interactive Analysis with AliEn + ROOT/PROOF

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/JapanAndreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

PROOFPROOF

USER SESSIONUSER SESSION

PROOF PROOF SLAVE SLAVE SERVERSSERVERS

PROOF MASTERPROOF MASTER SERVERSERVER

AliEn Grid Proof Setup:

PROOF PROOF SLAVE SLAVE SERVERSSERVERS

PROOF PROOF SLAVE SLAVE SERVERSSERVERS TcpRouter

TcpRouterTcpRouter

TcpRouter

● Guaranteed site access throughmultiplexing TcpRouters

Page 23: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Interactive Analysis with AliEn + ROOT/PROOF

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/JapanAndreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

Sample Session: Connect/Query

Page 24: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Interactive Analysis with AliEn + ROOT/PROOF

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/JapanAndreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

Sample Session: connection to assigned proofds

Page 25: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Interactive Analysis with AliEn + ROOT/PROOF

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/JapanAndreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

Sample Session: data processing

Page 26: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Unification of Batch & Interactive Analysis with AliEn + ROOT/PROOF

Andreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/JapanAndreas J. Peters CERN/Geneva @ ACAT03 – Tokyo/Japan

current implementation: ● datasets are represented by objects of the type TDSet in ROOT● a GRID data query assigns data files to TDSet Objects● the “process” method initiates the interactive processing on the assigned GRID proof cluster

to come:● the same “process” method initiates the batch processing of the same data set and automatic merging of results.

ALICE will test the analysis facilities during the physics data challenge end 2004.

Page 27: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

Thanks to Fons Rademakers Andreas Joachim Peters

For their contributions to the transparencies

Page 28: PROOF - Parallel ROOT Facility Kilian Schwarz Robert Manteufel Carsten Preuß GSI  Bring the KB to the PB not the PB to the KB

http://www-w2k.gsi.de/root/