
Page 1: CHEP031 Analysis of CMS Heavy Ion Simulation Data Using ROOT/PROOF/Grid Jinghua Liu for Pablo Yepes, Jinghua Liu Rice University, Houston, TX Maarten Ballintijn,

CHEP03 1

Analysis of CMS Heavy Ion Simulation Data Using ROOT/PROOF/Grid

Jinghua Liu for

Pablo Yepes, Jinghua Liu (Rice University, Houston, TX)

Maarten Ballintijn, Gunther Roland, Bolek Wyslouch, Jinlong Zhang (MIT, Cambridge, MA)

Supported by NSF grants #0218603, #0219063

CHEP03 2

Outline

From the data analysis user’s point of view

Why: ROOT/PROOF/Grid

How: Step by Step

What: Test Result Summary

Other PROOF talks in this conference:

Fons Rademakers

Maarten Ballintijn

CHEP03 3

ROOT/PROOF

ROOT as a data analysis tool

PROOF: Parallel ROOT Facility, based on and part of ROOT, on clusters of heterogeneous machines

• parallel analysis of objects in a set of files

• parallel execution of scripts

Transparency, Scalability, Adaptability, Error handling, Authentication

“Bring the KB to the PB, not the PB to the KB” (KB: code --> CPU, PB: data)

Use distributed CPUs to analyze distributed data

CHEP03 4

PROOF/Grid Interface

Use a Grid Resource Broker to detect which nodes in a cluster can be used in the parallel session

Use Grid File Catalogue and Replication Manager

Utilize Grid Monitoring Services

Support Globus Authentication

Abstract Grid interface

CHEP03 5

Step by Step

Set up PC cluster(s) (for PROOF/Grid)

Prepare the data files

Write analysis code (algorithm)

Compile a data set for PROOF

Run a PROOF job

Get the results

CHEP03 6

PC Clusters

Client machine (desktop): P4 @ 1.8GHz / 512MB / 40GB

Cluster 1:

2 Dual Xeon @ 2.4GHz / 1GB / 360GB

1 Dual Athlon @ 1.73GHz / 1GB / 240GB

8 Dual PIII @ 400MHz / 512MB / 60GB

Cluster 2:

3 Dual Athlon @ 1.67GHz / 2GB / 200GB

Operating systems: RedHat 6.1, RedHat 7.3, Slackware 8.1

Globus version: 2.2

CHEP03 7

CMS Heavy Ion Simulation

Jet & high-pT particle angular correlation

Use Calorimeters only

CHEP03 8

CMS Heavy Ion Simulation

Pythia (event generator): 10,000 jet events

Hijing (Heavy Ion event generator): 1000 events

Each Hijing event (dN/dy ~ 5000) was divided into ~500 sub-events

Randomly re-combine 500 sub-events (from different events) to form a new Hijing event, a cheap way to obtain more Monte Carlo events

CMSIM (GEANT 3 based simulation program for CMS)
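
The sub-event recombination above could be sketched roughly as follows. This is only an illustration under stated assumptions: the slides do not say how sub-events are indexed or selected, so the `SubEventRef` struct, the function name `recombineSubEvents`, and the rule "one sub-event slot per distinct source event" are all hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical reference to one sub-event: which original Hijing event it
// comes from, and which of the ~500 sub-event slots it fills.
struct SubEventRef {
    std::size_t sourceEvent;
    std::size_t slot;
};

// Build one "new" event by filling every slot with a sub-event taken from a
// different, randomly chosen original event (assumed selection rule).
std::vector<SubEventRef> recombineSubEvents(std::size_t nEvents,
                                            std::size_t nSlots,
                                            std::mt19937& rng) {
    if (nSlots > nEvents) {
        throw std::invalid_argument("need at least one source event per slot");
    }
    // Shuffle the event indices; the first nSlots entries are then distinct.
    std::vector<std::size_t> events(nEvents);
    std::iota(events.begin(), events.end(), std::size_t{0});
    std::shuffle(events.begin(), events.end(), rng);

    std::vector<SubEventRef> mixed;
    mixed.reserve(nSlots);
    for (std::size_t slot = 0; slot < nSlots; ++slot) {
        mixed.push_back(SubEventRef{events[slot], slot});
    }
    return mixed;
}
```

With the numbers quoted on the slide, a call such as `recombineSubEvents(1000, 500, rng)` would describe one recombined event; repeating the call yields further Monte Carlo events from the same 1000 originals.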

CHEP03 9

Data Production: Globus Jobs

[Diagram: Client PC --> Globus Gate Keeper (PBS) --> work nodes; Client PC --> Globus Gate Keeper (Condor) --> work nodes]

Globus used to submit & manage the jobs

No data replication (files were intentionally stored locally)

CHEP03 10

Build ROOT Tree

Superimpose jet events on top of Hijing events and generate ROOT Tree

Standalone code linked with ROOT libraries

CMS:

Ecal (Electromagnetic Calorimeter): barrel 61200 cells, endcap 14648 cells

HCal (Hadronic Calorimeter): 14616 cells (multi-layer), 4032 towers

calotree:

Ecal cells (energy, position)

Hcal towers (energy, position)

10,000 events were split into 100 files, 100 events each, file size ~160MB, total data 16GB

Data distributed, each node got some local files

CHEP03 11

TSelector – The Algorithms

Create TSelector from TTree

$ root

root[0] TFile f("heavyion001.root")

root[1] calotree->MakeSelector("myselector")

root[2] .q

$ ls

myselector.C myselector.h

Add the analysis code (algorithm) into TSelector

$ vi myselector.h

$ vi myselector.C

CHEP03 12

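TSelector – The Algorithms

myselector.h

class myselector : public TSelector {
public:
    TTree *fChain;   // tree or chain being analyzed
    .
    .
private:
    TH1F *hist1d;    // booked as "DeltaPhi" in Begin()
    TH2F *hist2d;    // booked as "EtaPhi" in Begin()
    .
    .
    .
};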

CHEP03 13

TSelector – The Algorithms

myselector.C

void myselector::Begin(TTree *tree) {
    // Book the output histograms and register them in the output list
    hist1d = new TH1F("DeltaPhi", "DeltaPhi", 100, -180., 180.);
    hist2d = new TH2F("EtaPhi", "EtaPhi", 100, -5., 5., 100, -4., 4.);
    fOutput->Add(hist1d);
    fOutput->Add(hist2d);
}

Bool_t myselector::Process(Int_t entry) {
    // user's analysis code goes here!
    for (Int_t i = 0; i < nclusters; i++) {
        if (Et1 > 5) {
            for (Int_t j = i + 1; j < nclusters; j++) {
                if (Et2 > 5) {
                    DeltaPhi = …
                    hist1d->Fill(DeltaPhi);
                }
            }
        }
    }
    return kTRUE;
}
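
The slide leaves the DeltaPhi expression elided, so the following is not the authors' formula. It is a minimal sketch of one common way to take the difference of two azimuthal angles, assuming the angles are in degrees (suggested by the 180 bounds of the DeltaPhi histogram) and that the result should fall in [-180, 180). The function name `deltaPhiDegrees` is hypothetical.

```cpp
#include <cmath>

// Difference of two azimuthal angles in degrees, wrapped into [-180, 180).
// Assumption: inputs are in degrees; the slides do not state the unit.
double deltaPhiDegrees(double phi1, double phi2) {
    // fmod keeps the sign of the dividend, so d lies in (-360, 360).
    double d = std::fmod(phi1 - phi2, 360.0);
    if (d >= 180.0) {
        d -= 360.0;   // e.g. 340 becomes -20
    } else if (d < -180.0) {
        d += 360.0;   // e.g. -340 becomes 20
    }
    return d;
}
```

Wrapping matters because two clusters at 170 and -170 degrees are only 20 degrees apart, not 340; without it the correlation peak would be smeared toward the histogram edges.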

CHEP03 14

TDSet – Data Location

Specify a collection of TTrees or files

[] TDSet *ds = new TDSet("TTree", "calotree");

[] ds->Add("/data1/cms/cmsim/heavyion001.root");

[] ds->Add("/data1/cms/cmsim/heavyion002.root");

[] ds->Add("lfn://pcs21.rice.edu/data5/heavyion110.root");

[] ds->Add("lfn://pcs11.rice.edu/cms/cmsim/heavyion230.root");

[] ds->Print();

Returned by DB or File Catalog query etc

It’s better to put these into a macro
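
As a rough illustration of why a macro helps, the file names on this slide follow a numbered pattern, so the list can be generated rather than typed. The sketch below uses only the C++ standard library; `makeFileList` is a hypothetical helper, and the `heavyionNNN.root` zero-padded pattern is inferred from the examples above. In a ROOT macro, each returned string would be passed to `ds->Add(...)`.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Build the paths dir/heavyionNNN.root for NNN in [first, last].
// Assumption: three-digit, zero-padded numbering as in the slide's examples.
std::vector<std::string> makeFileList(const std::string& dir, int first, int last) {
    std::vector<std::string> files;
    for (int i = first; i <= last; ++i) {
        char name[32];
        std::snprintf(name, sizeof(name), "heavyion%03d.root", i);
        files.push_back(dir + "/" + name);
    }
    return files;
}
```

For example, `makeFileList("/data1/cms/cmsim", 1, 2)` yields the two local paths added by hand on the slide.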

CHEP03 15

Running a PROOF Job

$ root

[] gROOT->Proof("proofmaster.rice.edu");

[] TDSet *ds = new TDSet("TTree", "calotree");

[] ds->Add(". . .");

. . .

[] ds->Process("myselector.C+", "options", nentries, first);

(note: options must be pre-coded in myselector.C)

[] TH1F *h1=(TH1F *)gProof->GetOutput("DeltaPhi");

[] h1->Draw();

CHEP03 16

Angular Correlation

CHEP03 17

Scale plot

[Plot: analysis speed vs. number of CPUs (PIII 1GHz equivalent)]

CPU power/data size balanced

CPU intensive calculations

CHEP03 18

Summary

CMS Heavy Ion Analysis implemented and tested with PROOF

Scales well with CPUs

PROOF/Grid can provide the data analysis power unavailable otherwise. This power can be achieved without much extra effort

PROOF/Grid interface is under rapid development. The plan is to extend the presented study to use the Grid interface

CHEP03 19

The End