ganga: an interface to the lhc computing grid

Matt Williams, University of Birmingham

Upload: matt-williams

Post on 08-Jul-2015


Category: Software



DESCRIPTION

Ganga is a tool designed and used by the large particle physics experiments at CERN. Written in pure Python, it delivers a clean, usable interface that lets thousands of physicists interact with the huge computing resources available to them. Video at https://www.youtube.com/watch?v=SSdluuVNU3Y

TRANSCRIPT

Page 1: Ganga: an interface to the LHC computing grid


Ganga: An interface to the LHC computing grid

Matt Williams, University of Birmingham


Page 2: Ganga: an interface to the LHC computing grid


CERN and the LHC

● Largest particle physics experiment in the world
● 27 km in circumference
● Over 100 m underground
● Thousands of physicists
● 100s of petabytes of data

Page 3: Ganga: an interface to the LHC computing grid


The Grid

Page 4: Ganga: an interface to the LHC computing grid


GANGA

● Around 2001, LHCb started GANGA as an in-house tool
– Specific to our needs
● By 2010, when the LHC turned on, it was used by many more experiments
– ATLAS, NA62, T2K and many smaller experiments
● Python had always been the obvious choice
– Used everywhere in particle physics (along with C++)
– Easy to create new plugins for experiments
● Can be scripted or used through an IPython-based interactive console
● Open source, released under the GPL (like most CERN software)

Page 5: Ganga: an interface to the LHC computing grid


How is it used

j = Job(name = 'Example job')

j.application = Executable()

j.application.exe = File('test.sh')

j.outputfiles = [LocalFile('out.txt')]

j.backend = Local()

j.submit()

Page 6: Ganga: an interface to the LHC computing grid


Retrieving results

In [1]: j.peek()

total 200

-rw-r--r-- 1 phrfbi lhcb 0 Jun 22 2013 __syslog__

-rw-r--r-- 1 phrfbi lhcb 141999 Jun 22 2013 stdout

-rw-r--r-- 1 phrfbi lhcb 53671 Jun 22 2013 stderr

-rw-r--r-- 1 phrfbi lhcb 2463 Jun 22 2013 out.txt

-rw-r--r-- 1 phrfbi lhcb 135 Jun 22 2013 __jobstatus__

In [2]: j.peek('out.txt')

Page 7: Ganga: an interface to the LHC computing grid


Using the Grid

Just change backend from Local() to LCG()

Other backends are Interactive, PBS, LSF, SGE, Panda, Jedi, Dirac, Condor, ARC, CREAM...
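Why a one-line backend change is enough can be pictured with a toy model in plain Python. This is only a sketch of the general plugin idea, not Ganga's actual implementation: the classes ToyJob, ToyLocalBackend and ToyGridBackend and the method run are invented for illustration.

```python
# Toy illustration only: these classes are hypothetical and are NOT Ganga's.
class ToyLocalBackend:
    """Pretends to run a command on the local machine."""

    def run(self, command):
        return f"ran '{command}' locally"


class ToyGridBackend:
    """Pretends to send a command to a remote grid site."""

    def run(self, command):
        return f"sent '{command}' to the grid"


class ToyJob:
    """Holds what to run and delegates where to run it to its backend."""

    def __init__(self, command, backend):
        self.command = command
        self.backend = backend

    def submit(self):
        # The job never needs to know which kind of backend it was given.
        return self.backend.run(self.command)


job = ToyJob('test.sh', ToyLocalBackend())
local_result = job.submit()

job.backend = ToyGridBackend()  # the only line that changes
grid_result = job.submit()

print(local_result)
print(grid_result)
```

Because the job only ever calls the backend through one common method, the rest of the job description stays untouched when the backend is swapped.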

Page 8: Ganga: an interface to the LHC computing grid


Input data and splitting

j = Job(name = 'Input splitter', backend = LCG())

j.application = Executable()

j.application.exe = File('analyse_data')

j.inputfiles = [LocalFile(f.strip()) for f in open('inputs.txt')]

j.splitter = SplitByFiles(filesPerJob = 10)

j.outputfiles = [LocalFile('histogram.root')]

j.submit()
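The effect of splitting by files can be sketched in plain Python, outside Ganga. The sketch assumes that filesPerJob = 10 means the inputs are grouped into consecutive chunks of at most ten files, one chunk per subjob; the helper names read_file_list and chunk_files and the file names are made up for illustration and are not part of Ganga.

```python
from typing import Iterable, List


def read_file_list(lines: Iterable[str]) -> List[str]:
    """Return the non-empty, whitespace-stripped entries of a file listing."""
    return [line.strip() for line in lines if line.strip()]


def chunk_files(files: List[str], files_per_job: int) -> List[List[str]]:
    """Group files into consecutive chunks of at most files_per_job entries."""
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")
    return [files[i:i + files_per_job]
            for i in range(0, len(files), files_per_job)]


# Hypothetical listing standing in for the contents of inputs.txt,
# including a trailing blank line that should be ignored.
listing = [f"data_{n:03d}.root\n" for n in range(25)] + ["\n"]

chunks = chunk_files(read_file_list(listing), files_per_job=10)
print([len(chunk) for chunk in chunks])  # → [10, 10, 5]
```

Under that assumption, 25 input files would give three subjobs, the last one holding the five leftover files.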

Page 9: Ganga: an interface to the LHC computing grid


Mergers

j = Job(name = 'Merger', backend = LCG())

j.application = Executable()

j.application.exe = File('analyse_data')

j.inputfiles = [LocalFile(f.strip()) for f in open('inputs.txt')]

j.splitter = SplitByFiles(filesPerJob = 10)

j.outputfiles = [LocalFile('histogram.root')]

j.merger = RootMerger(files = ['histogram.root'])

j.submit()
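Conceptually, merging the per-subjob histograms means adding their contents bin by bin. The sketch below shows that idea on plain Python lists; it does not read ROOT files and is not how RootMerger is implemented. The function merge_histograms and the bin counts are invented for illustration.

```python
from typing import List, Sequence


def merge_histograms(histograms: Sequence[Sequence[int]]) -> List[int]:
    """Add histograms bin by bin; all of them must share the same binning."""
    if not histograms:
        raise ValueError("need at least one histogram to merge")
    n_bins = len(histograms[0])
    if any(len(hist) != n_bins for hist in histograms):
        raise ValueError("all histograms must have the same number of bins")
    # zip(*histograms) yields, for each bin, the counts from every subjob.
    return [sum(bin_counts) for bin_counts in zip(*histograms)]


# Made-up bin counts standing in for the histogram.root of three subjobs.
subjob_outputs = [[1, 4, 2], [0, 3, 5], [2, 2, 2]]

merged = merge_histograms(subjob_outputs)
print(merged)  # → [3, 9, 9]
```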

Page 10: Ganga: an interface to the LHC computing grid


Job catalogue

In [1]: jobs

Out [1]:

fqid | status    | name           | subjobs | application | backend
-------------------------------------------------------------------
0    | completed | Example job    |         | Executable  | Local
1    | running   | Input splitter | 324     | Executable  | LCG
2    | running   | Merger         | 324     | Executable  | LCG

Page 11: Ganga: an interface to the LHC computing grid


Full API access

In [2]: jobs(2).status

Out [2]: running

In [3]: len([j for j in jobs(2).subjobs if j.status == 'completed'])

Out [3]: 24

In [4]: for subjob in jobs(2).subjobs:
   ...:     if subjob.status == 'failed':
   ...:         subjob.resubmit()

You can define custom functions in ~/.ganga.py, which will be available at runtime
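As an illustration, helpers of the following kind could be placed in ~/.ganga.py. This is a sketch, not code from the Ganga distribution: the function names status_summary and resubmit_failed are invented, and they rely only on the subjobs, status and resubmit() members used on this slide. The stand-in objects at the bottom, including the 'submitted' status they switch to, are assumptions that exist only so the sketch runs outside a Ganga session.

```python
from collections import Counter
from types import SimpleNamespace


def status_summary(job):
    """Count the subjobs of a job by their status string."""
    return Counter(subjob.status for subjob in job.subjobs)


def resubmit_failed(job):
    """Resubmit every failed subjob and return how many were resubmitted."""
    failed = [subjob for subjob in job.subjobs if subjob.status == 'failed']
    for subjob in failed:
        subjob.resubmit()
    return len(failed)


# Stand-in objects (hypothetical) so the sketch runs without Ganga.
# In an interactive session one would pass a real job, e.g. jobs(2).
def _fake_subjob(status):
    subjob = SimpleNamespace(status=status)
    # Assumed behaviour for the fake only: resubmitting changes the status.
    subjob.resubmit = lambda: setattr(subjob, 'status', 'submitted')
    return subjob


fake_job = SimpleNamespace(
    subjobs=[_fake_subjob(s)
             for s in ['completed', 'failed', 'running', 'failed']])

summary_before = status_summary(fake_job)
n_resubmitted = resubmit_failed(fake_job)
summary_after = status_summary(fake_job)
print(summary_before, n_resubmitted, summary_after)
```

With the helpers loaded, the loop above would shrink to a single call such as resubmit_failed(jobs(2)).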

Page 12: Ganga: an interface to the LHC computing grid


Dealing with large files

j = Job(name = 'Large output', backend = Dirac())

j.application = Executable()

j.application.exe = File('analyse_data')

j.inputfiles = [DiracFile('input.root')]

j.outputfiles = [DiracFile('histogram.root')]

j.submit()

Page 13: Ganga: an interface to the LHC computing grid


Find more at cern.ch/ganga

Download code from cern.ch/ganga/download/

Thank you