Ganga: User-friendly Grid job submission and management tool for LHC and beyond

D C Vanderster et al 2010 J. Phys.: Conf. Ser. 219 072022


Ganga: User-friendly Grid job submission and management tool for LHC and beyond

D C Vanderster¹, F Brochu², G Cowan³, U Egede⁴, J Elmsheuser⁵, B Gaidoz¹, K Harrison⁶, H C Lee⁷, D Liko⁸, A Maier¹, J T Mościcki¹, A Muraru¹, K Pajchel⁹, W Reece⁴, B Samset⁹, M Slater⁶, A Soroko¹⁰, C L Tan⁶ and M Williams⁴

¹ European Organization for Nuclear Research (CERN), CH-1211 Genève 23, Switzerland
² Department of Physics, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, United Kingdom
³ Particle Physics Experiments Group, School of Physics, University of Edinburgh, Edinburgh EH9 3JZ, United Kingdom
⁴ Department of Physics, Imperial College London, Prince Consort Road, London SW7 2AZ, United Kingdom
⁵ Ludwig-Maximilians-Universität München, Geschwister-Scholl-Platz 1, 80539 Munich, Germany
⁶ School of Physics and Astronomy, The University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
⁷ Nationaal instituut voor subatomaire fysica (NIKHEF), Science Park 105, 1098 XG Amsterdam, The Netherlands
⁸ Institute of High Energy Physics of the Austrian Academy of Sciences, Nikolsdorfer Gasse 18, A-1050 Wien, Austria
⁹ Experimental Particle Physics Group, Department of Physics, University of Oslo, PO Box 1048 Blindern, NO-0316 Oslo, Norway
¹⁰ Department of Physics, University of Oxford, Parks Road, Oxford OX1 3PU, United Kingdom

E-mail: daniel.colin.vanderster@cern.ch

Abstract. Ganga has been widely used for several years in ATLAS, LHCb and a handful of other communities. Ganga provides a simple yet powerful interface for submitting and managing jobs to a variety of computing backends. The tool helps users configure applications and keep track of their work. With the major release of version 5 in summer 2008, Ganga's main user-friendly features have been strengthened. Examples include a new configuration interface, enhanced support for job collections, bulk operations and easier access to subjobs. In addition to the traditional batch and Grid backends such as Condor, LSF, PBS and gLite/EDG, point-to-point job execution via ssh on remote machines is now supported. Ganga is used as an interactive job submission interface for end-users, and also as a job submission component for higher-level tools: for example, GangaRobot is used to perform automated end-to-end testing of distributed data analysis. Ganga comes with an extensive test suite covering more than 350 test cases. The development model involves all active developers in release management shifts, which is an important and novel approach for distributed software collaborations. Ganga 5 is a mature, stable and widely-used tool with long-term support from the HEP community.

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09), Journal of Physics: Conference Series 219 (2010) 072022, doi:10.1088/1742-6596/219/7/072022. © 2010 IOP Publishing Ltd

1. Introduction
Scientific users are driven by a requirement to get computing results quickly and with minimal effort. These users have a wide variety of computing resources at their disposal, namely workstations, batch systems and computing grids. Each of these resources has a unique user interface, and users bear the burden of continually reconfiguring their applications to take advantage of the diverse systems. In this situation it is apparent that there is a need for an end-user tool that hides these complexities and enables users to concentrate on their science rather than on the computing.

Ganga is one such end-user tool. Built with a modular architecture that maps applications to arbitrary execution backends, Ganga allows users to easily scale up their analyses from development on a local workstation or batch system to full-scale analysis on the Grid. By providing a familiar and consistent user interface to all of the resource types, Ganga allows users to get their computing results quickly, utilizing all of their available resources. Co-developed by the LHCb and ATLAS high energy physics experiments, Ganga is a mature and stable tool that is supported by a number of user communities.

This paper presents an overview of Ganga (section 2), highlighting the basic user workflow and the development model, and then discusses the Ganga user community (section 3), emphasising Ganga usage by the LHC experiments.

2. Introduction to Ganga
Ganga [1] is a user-friendly job management tool for scientific computing. As a modular application written in Python, it supports an extensible suite of execution backends, allowing users to easily run jobs on their local workstation, a number of batch systems and many grids. For job management it allows users to submit and monitor their jobs, and also to manipulate previously submitted job objects to perform repeated analyses. For example, in order to move an analysis from their local workstation to another system, a user need only copy the locally executed job object, change the selected backend to a batch or grid system, and then resubmit the job.
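The copy-and-resubmit pattern just described can be sketched with a toy model of Ganga's job/backend abstraction. The class names below mirror Ganga concepts, but this is an illustrative sketch, not Ganga's actual implementation:

```python
# Toy sketch of the "copy job, change backend, resubmit" workflow.
# Local, LCG and Job are simplified stand-ins, not the real Ganga classes.
import copy

class Local:
    name = "Local"
    def run(self, app):
        return f"{app} executed on the local workstation"

class LCG:
    name = "LCG"
    def run(self, app):
        return f"{app} submitted to the grid via LCG"

class Job:
    def __init__(self, application="echo", backend=None):
        self.application = application
        self.backend = backend or Local()
    def copy(self):
        # A copied job can be reconfigured without touching the original.
        return copy.deepcopy(self)
    def submit(self):
        return self.backend.run(self.application)

local_job = Job(application="my-analysis")
print(local_job.submit())        # runs locally

grid_job = local_job.copy()      # same configuration...
grid_job.backend = LCG()         # ...different execution backend
print(grid_job.submit())         # now runs on the grid
```

Because the application configuration is decoupled from the backend, switching execution resources changes one attribute rather than the whole job description.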

The suite of execution backends available in Ganga is shown in Figure 1. Ganga is one of the main distributed analysis tools for the LHCb and ATLAS experiments. As such, it provides modules that allow participants in these experiments to run their applications on the available resources, including those managed by PanDA [2], gLite [3] and ARC [4] for ATLAS, and DIRAC-managed resources [5] for LHCb. Ganga also provides users from other communities with access to distributed computing resources such as EGEE [6] and the Open Science Grid [7].

Ganga is an open source project driven by the members of its user community. Core development is maintained by LHCb and ATLAS, though contributions and extensions from outside these collaborations are common.

2.1. Ganga User Interfaces
Ganga is a flexible tool with three different user interfaces. The graphical user interface (GUI) provides a simple point-and-click approach to configuring, submitting, monitoring and otherwise managing jobs. Depicted in Figure 2, the GUI makes it simple for users to monitor the progress of many running jobs.

The command line interface (CLI) of Ganga is generally the most commonly used. Providing users with an interactive Python prompt, the CLI offers a powerful interface to quickly manipulate job objects. Finally, Ganga provides an application programming interface (API) to the data and functions available in the CLI. Using the API, users can write Python scripts to execute automated workflows.


Figure 1. The suite of execution backends available in Ganga. Users can develop and debug on the local workstation, then test on a batch system, and finally run a full analysis on one of the many grids.

Figure 2. The graphical user interface provides a point-and-click approach to job management.

2.2. The Job Object
A Ganga Job is a Python object whose members constitute the configuration of an application running on an execution backend. In addition to specifying the application and backend, the Job object also includes the datasets used by the application to read inputs and write outputs. Finally, the Job object allows users to optionally specify a splitter and a merger, the former being a rule to divide a large task into subjobs and the latter being a rule for combining the subjob outputs.

Ganga maintains a persistent repository of previously submitted jobs. Stored in a local database at a configurable location (by default in the user's home directory), the repository allows users to submit jobs, then exit, and later restart Ganga to continue monitoring the set of submitted jobs.

In the Ganga CLI, running the default application is trivially accomplished using Job().submit(). To make the same application run on the EGEE resources, the command Job(backend=LCG()).submit() is used. Previously submitted jobs can be easily monitored with the jobs command, and help(jobs) gives detailed documentation on the interface to the jobs object, such as the functions to select, copy, resubmit, kill and remove jobs from the persistent repository.
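The splitter/merger idea described above can be illustrated with a minimal sketch. The real Ganga splitters and mergers are plugin classes; the functions below are invented stand-ins showing only the division-and-recombination pattern:

```python
# Illustrative sketch of the splitter/merger pattern: a splitter divides a
# large task into subjobs, and a merger combines the subjob outputs.

def range_splitter(n_events, n_subjobs):
    """Divide a task over n_events into per-subjob (start, stop) slices."""
    step = -(-n_events // n_subjobs)  # ceiling division
    return [(i, min(i + step, n_events)) for i in range(0, n_events, step)]

def run_subjob(bounds):
    """Stand-in for a subjob's real work: 'process' its slice of events."""
    start, stop = bounds
    return list(range(start, stop))

def merger(outputs):
    """Combine the subjob outputs back into a single result."""
    merged = []
    for out in outputs:
        merged.extend(out)
    return merged

subjobs = range_splitter(n_events=10, n_subjobs=3)
outputs = [run_subjob(s) for s in subjobs]
result = merger(outputs)
print(subjobs)   # [(0, 4), (4, 8), (8, 10)]
print(result)    # all 10 events, recombined in order
```

Each subjob sees only its own slice of the input dataset, so the subjobs can run independently on whichever backend the Job object specifies.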

2.3. The Remote Backend
One recently developed feature of Ganga is the Remote backend, shown in Figure 3. This backend allows users to submit jobs from any host, even if the client tools for the execution backend are not installed locally. The Remote backend works as follows: first, a user runs a local instance of Ganga (e.g. on his or her laptop) and configures the application for running on the target resources; second, Ganga connects via a secure-shell channel to a remote Ganga instance running on a host that has the execution backend clients installed; finally, the remote Ganga instance submits and monitors the actual jobs via the desired backend. In this way, users can develop their analysis in the familiar environment of their usual workstation and access remote resources that would otherwise be inaccessible locally.
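The three-step delegation flow above can be modelled in a few lines. The fake channel below stands in for the real secure-shell connection, and the handler is a toy, not the actual protocol spoken between Ganga instances:

```python
# Toy model of the Remote backend's delegation flow: the local side forwards
# submit/monitor requests over a channel to a remote Ganga-like instance.
# FakeSSHChannel is an invented stand-in for the real ssh transport.

class RemoteGangaInstance:
    """Runs on a host that has the real backend client tools installed."""
    def __init__(self):
        self.jobs = {}
    def handle(self, command, payload):
        if command == "submit":
            job_id = len(self.jobs)
            self.jobs[job_id] = "submitted"  # would invoke the real client here
            return job_id
        if command == "status":
            return self.jobs[payload]
        raise ValueError(f"unknown command: {command}")

class FakeSSHChannel:
    """Pretend transport: delivers a command to the remote host, returns its reply."""
    def __init__(self, remote):
        self.remote = remote
    def send(self, command, payload):
        return self.remote.handle(command, payload)

class RemoteBackend:
    """Local side: the user's Ganga session only ever talks to this object."""
    def __init__(self, channel):
        self.channel = channel
    def submit(self, app_config):
        return self.channel.send("submit", app_config)
    def status(self, job_id):
        return self.channel.send("status", job_id)

backend = RemoteBackend(FakeSSHChannel(RemoteGangaInstance()))
job_id = backend.submit({"application": "my-analysis"})
print(job_id, backend.status(job_id))   # 0 submitted
```

The point of the pattern is that the local session needs no backend client tools at all; everything backend-specific happens on the remote host.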

2.4. Development and Testing
One of the main strengths of Ganga is its robustness, which is ensured by an organized release procedure and extensive testing. For the creation of stable releases, Ganga developers rotate through six-week terms as release manager. In general, the job of the release manager is quite simple: a cut-off date for a pre-release is announced, the release manager collects tags for the many components, and finally the pre-release is built.

Each pre-release undergoes extensive testing, with tests of both core and extension-specific functionalities; the testing framework includes approximately 500 test cases. In general, all core bugs submitted to the Savannah tracking system [8] get a test case; these validate the bug fixes and help prevent regressions.

After approximately 24 hours of testing, the developers inspect the test results; if all developers are satisfied, the stable release is made. Generally, a new stable release is produced bi-weekly.

3. The Ganga Community
Though Ganga is developed and used primarily by the ATLAS and LHCb experiments, it is built as a generic tool and indeed sees usage from many Virtual Organizations (VOs). A plot of the unique users for the first four months of 2009 is presented in Figure 4. During that period Ganga saw nearly 1000 unique users, and in any given week there were usually between 250 and 300 unique users. About 50% of users are from ATLAS, 25% from LHCb and 25% from other VOs.

Other usage statistics (not shown) indicate that the GUI is not commonly used, with instances using either the CLI or the scripting API making up over 99% of the total. Finally, the tool is used by many institutes: during the past six months Ganga has been used from client machines in more than 130 top-level domains (e.g. cern.ch, gridka.de, etc.).

Figure 3. The Remote backend allows users to submit to otherwise unavailable execution backends via an intermediate Ganga instance.


Figure 4. Weekly unique users from January 1 to April 30, 2009. ATLAS users are in blue, LHCb in green, and others in brown.

3.1. Ganga and the LHCb Experiment
Ganga is the grid user interface for LHCb. As such, many experiment-specific plugins have been incorporated to allow LHCb physicists to easily run large-scale analyses. In particular, the Gaudi [9] application framework has been well integrated into Ganga, allowing users to work with multiple configurations of Gaudi jobs. After specifying the relevant datasets, users can run Gaudi jobs locally, on batch systems, or on the grid resources managed by DIRAC. Complete details of the use of Ganga by LHCb are provided in these proceedings [10].

3.2. Ganga and the ATLAS Experiment
For the ATLAS experiment, Ganga is one of two front-end tools to the ATLAS grids; the other tool, pathena, and Ganga share a common library for the configuration of ATLAS analyses for the grid [11]. As with LHCb, experiment-specific plugins have been developed in order to easily map the ATLAS analysis application Athena [12] to the execution backends. The Athena application object allows users to process the various data types produced by the ATLAS detector, and AthenaMC provides users with a small-scale production system to generate personal Monte Carlo samples. Full details of the ATLAS plugins in Ganga are given in these proceedings [13].

ATLAS has also built distributed analysis testing software using Ganga. The GangaRobot and HammerCloud services perform functional and stress testing of the distributed analysis facilities [14]. The GangaRobot test results are used to provide up-to-date site status information to running Ganga sessions, so that malfunctioning sites can be avoided. The HammerCloud stress tests are used to identify bottlenecks in the grid architecture, for example to discover local storage or network limitations, and to test remote access to shared database resources.

3.3. Ganga and the Wider Community
There are also novel contributions to Ganga that come from outside the core development team. For example, developers at the Korea Institute of Science and Technology Information (KISTI), while working on the WISDOM project [15], have developed a new multiplexing backend. For their studies the KISTI researchers need to make use of both LCG/gLite-accessible sites and new sites managed by the GridWay [16] metascheduler. Since a GridWay backend did not exist, they first developed it using the examples from the pre-existing backends. With all of their execution resources now accessible to Ganga, they then developed the InterGrid backend [17]: using information about the current load on various sites, InterGrid selects between LCG and GridWay for each submitted job. In this way the backend complexity is hidden, and users can focus on getting results with minimal effort.
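A multiplexing backend of this kind might select a target as sketched below. The load metric and the minimum-load rule here are invented for illustration; the actual InterGrid selection policy is described in [17]:

```python
# Hypothetical sketch of a load-based multiplexing backend in the spirit of
# InterGrid: route each job to whichever grid currently reports lower load.
# The load figures and selection rule are illustrative assumptions only.

def select_backend(site_loads):
    """site_loads maps backend name -> current load estimate (e.g. queued jobs)."""
    return min(site_loads, key=site_loads.get)

def submit_multiplexed(job, site_loads):
    backend = select_backend(site_loads)
    # A real implementation would now delegate to the chosen backend's client.
    return f"{job} -> {backend}"

loads = {"LCG": 120, "GridWay": 35}
print(submit_multiplexed("analysis-job", loads))   # analysis-job -> GridWay
```

From the user's point of view a single submit call suffices; the choice between LCG and GridWay is made per job behind the interface.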

4. Conclusions
This paper has presented Ganga, a user-friendly job management tool for grid, batch and local systems. The modular architecture of Ganga allows users to easily make use of all resources available to them, while spending little time reconfiguring their analyses for each execution backend. With a stable development process and a large user community, Ganga is a mature project that will continue to thrive as we move into the data-taking era of the LHC.

References
[1] F Brochu et al, Ganga: a tool for computational-task management and easy access to Grid resources, submitted to Computer Physics Communications, arXiv:0902.2685v1
[2] T Maeno, PanDA: distributed production and distributed analysis system for ATLAS, 2008 J. Phys.: Conf. Ser. 119 062036, doi:10.1088/1742-6596/119/6/062036
[3] E Laure et al, Programming the Grid with gLite, Computational Methods in Science and Technology 12(1) 33-45, 2006
[4] M Ellert et al, The NorduGrid project: using Globus toolkit for building Grid infrastructure, Nucl. Instrum. Methods A502 (2003) 407
[5] A Tsaregorodtsev et al, DIRAC: A Scalable Lightweight Architecture for High Throughput Computing, Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing, pp 19-25, 2004
[6] R Jones, An overview of the EGEE project, pp 1-8 of C Türker, M Agosti and H-J Schek (Eds), Peer-to-peer, Grid, and service-orientation in digital library architectures [Lecture Notes in Computer Science 3664] (Springer, Berlin, 2005)
[7] R Pordes et al, The Open Science Grid, J. Phys.: Conf. Ser. 78 (2007) 012057
[8] Savannah Portal for Ganga, https://savannah.cern.ch/projects/ganga
[9] G Barrand et al, GAUDI: A software architecture and framework for building HEP data processing applications, Computer Physics Communications 140 (2001) 45-55
[10] A Maier et al, User analysis of LHCb data with Ganga, these proceedings
[11] D C Vanderster et al, A PanDA Backend for the Ganga Analysis Interface, these proceedings
[12] Athena Framework, https://twiki.cern.ch/twiki/bin/view/Atlas/AthenaFramework
[13] J Elmsheuser et al, Distributed Analysis in ATLAS using GANGA, these proceedings
[14] D C Vanderster et al, Functional and Large-Scale Testing of the ATLAS Distributed Analysis Facilities with Ganga, these proceedings
[15] WISDOM project website, http://wisdom.eu-egee.fr
[16] E Huedo et al, The GridWay Framework for Adaptive Scheduling and Execution on Grids, Scalable Computing – Practice and Experience 6(3) 1-8, 2005
[17] S Hwang et al, An Approach to Grid Interoperability using Ganga, EGEE User Forum, Catania, March 2-6, 2009


Page 2: Ganga: User-friendly Grid job submission and management tool

Ganga User-friendly Grid job submission and management

tool for LHC and beyond

D C Vanderster1 F Brochu

2 G Cowan

3 U Egede

4 J Elmsheuser

5 B Gaidoz

1

K Harrison6 H C Lee

7 D Liko

8 A Maier

1 J T Mościcki

1 A Muraru

1 K Pajchel

9

W Reece4 B Samset

9 M Slater

6 A Soroko

10 C L Tan

6 and M Williams

4

1 European Organization for Nuclear Research CERN CH-1211 Genegraveve 23

Switzerland 2 Department of Physics University of Cambridge JJ Thomson Avenue Cambridge

CB3 0HE United Kingdom 3 Particle Physics Experiments Group School of Physics University of Edinburgh

Edinburgh EH9 3JZ United Kingdom 4 Department of Physics Imperial College London Prince Consort Road London

SW7 2AZ United Kingdom 5 Ludwig-Maximilians-Universitaumlt Muumlnchen Geschwister-Scholl-Platz 1 80539

Munich Germany 6 School of Physics and Astronomy The University of Birmingham Edgbaston

Birmingham B15 2TT United Kingdom 7 Nationaal instituut voor subatomaire fysica (NIKHEF) Science Park 105 1098 XG

Amsterdam The Netherlands 8 Institute of High Energy Physics of the Austrian Academy of Sciences Nikolsdorfer

Gasse 18 A-1050 Wien Austria 9 Experimental Particle Physics Group Department of Physics University of Oslo PO

Box 1048 Blindern NO-0316 Oslo Norway 10

Department of Physics University of Oxford Parks Road Oxford OX1 3PU

United Kingdom

E-mail danielcolinvanderstercernch

Abstract Ganga has been widely used for several years in ATLAS LHCb and a handful of

other communities Ganga provides a simple yet powerful interface for submitting and

managing jobs to a variety of computing backends The tool helps users configuring

applications and keeping track of their work With the major release of version 5 in summer

2008 Gangas main user-friendly features have been strengthened Examples include a new

configuration interface enhanced support for job collections bulk operations and easier access

to subjobs In addition to the traditional batch and Grid backends such as Condor LSF PBS

gLiteEDG a point-to-point job execution via ssh on remote machines is now supported Ganga

is used as an interactive job submission interface for end-users and also as a job submission component for higher-level tools For example GangaRobot is used to perform automated end-

to-end testing of distributed data analysis Ganga comes with an extensive test suite covering

more than 350 test cases The development model involves all active developers in the release

management shifts which is an important and novel approach for the distributed software

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09) IOP PublishingJournal of Physics Conference Series 219 (2010) 072022 doi1010881742-65962197072022

ccopy 2010 IOP Publishing Ltd 1

collaborations Ganga 5 is a mature stable and widely-used tool with long-term support from

the HEP community

1 Introduction

Scientific users are driven by a requirement to get computing results quickly and with minimal effort These users have a wide variety of computing resources at their disposal namely workstations batch

systems and computing grids Each of these resources has a unique user interface and users have the

burden of continually reconfiguring their applications to take advantage of the diverse systems With this situation it is apparent that there is a need for an end-user tool that hides these complexities and

enables users to concentrate on their science and not the computing

Ganga is one such end-user tool Built with a modular architecture that maps applications to

arbitrary execution backends Ganga allows users to easily scale up their analyses from development on a local workstation or batch system to full scale analysis on the Grid By providing a familiar and

consistent user interface to all of the resource types Ganga allows users to get their computing results

quickly utilizing all of their available resources Co-developed by the LHCb and ATLAS high energy physics experiments Ganga is a mature and stable tool that is supported by a number of user

communities

This paper presents an overview of Ganga (section 2) highlighting the basic user workflow and the development model and then discusses the Ganga user community (section 3) emphasising Ganga

usage by the LHC experiments

2 Introduction to Ganga

Ganga [1] is a user-friendly job management tool for scientific computing As a modular application written in python it supports an extensible suite of execution backends allowing users to easily run

jobs on their local workstation a number of batch systems and many grids For job management it

allows users to submit and monitor their jobs and also to manipulate previously submitted job objects to perform repeated analyses For example in order to change an analysis from running on their local

workstation to another system a user need only copy the locally executed job object change the

selected backend to a batch or grid system and then resubmit the job

The suite of available execution backends available in Ganga is shown in Figure 1 Ganga is one of the main distributed analysis tools for the LHCb and ATLAS experiments As such it provides

modules that allow participants in these experiments to run their applications on the available

resources including those managed by PanDA [2] g-Lite [3] and ARC [4] for ATLAS and DIRAC-managed resources [5] for LHCb Ganga also provides users from other communities with access to

the distributed computing resources such as the EGEE [6] and the Open Science Grid [7]

Ganga is an open source project driven by the members of its user community Core development is maintained by LHCb and ATLAS though contributions and extensions from outside these

collaborations are common

21 Ganga User Interfaces

Ganga is a flexible tool with three different user interfaces The graphical user interface (GUI) provides a simple point-and-click approach to configuring submitting monitoring and otherwise

managing jobs Depicted in Figure 2 the GUI makes it simple for users to monitor the progress of

many running jobs The command line interface (CLI) of Ganga is generally the most commonly used Providing users

with an interactive python prompt the CLI provides a powerful interface to quickly manipulate job

objects Finally Ganga provides an application programming interface (API) to the data and functions

available in the CLI Using the API users can write python scripts to execute automated workflows

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09) IOP PublishingJournal of Physics Conference Series 219 (2010) 072022 doi1010881742-65962197072022

2

Figure 1 The suite of execution backends available in Ganga Users can develop and debug on the local workstation then test on a batch system and finally run a full analysis on one of the many grids

Figure 2 The graphical user interface provides a point-and-click approach to job management

22 The Job Object A Ganga Job is a python object whose members constitute the configuration of an application running

on an execution backend In addition to specifying the application and backend the Job object also

includes the datasets used by the application to read inputs and write outputs Finally the Job object allows users to optionally specify a splitter and a merger the former being a rule to divide a large task

into subjobs and the latter being a rule for combining the subjob outputs

Ganga maintains a persistent repository of previously submitted jobs Stored in a local database at a

configurable location (by default in the userrsquos home directory) the repository allows users to submit jobs then exit and later restart to continue monitoring the set of submitted jobs

In the Ganga CLI running the default application is trivially accomplished using Job()submit()

To make the same application run on the EGEE resources the command

Job(backend=LCG())submit() is used Previously submitted jobs can be easily monitored

with the jobs command and help(jobs) gives detailed documentation on the interface to the jobs

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09) IOP PublishingJournal of Physics Conference Series 219 (2010) 072022 doi1010881742-65962197072022

3

object such as the functions to select copy resubmit kill and remove jobs from the persistent

repository

23 The Remote Backend

One recently developed feature of Ganga is the Remote backend shown in Figure 3 This backend allows users to submit jobs from any host even if the client tools for the execution backend are not

installed The Remote backend works as follows first a user runs a local instance of Ganga (eg on

his or her laptop) and configures the application for running on the target resources second Ganga connects via a secure-shell channel to a remote Ganga instance running on a host that has the

execution backend clients installed finally the remote Ganga instance submits and monitors the actual

jobs via the desired backend In this way users can develop their analysis in the familiar environment of their usual workstation and access remote resources that would be otherwise inaccessible locally

24 Development and Testing

One of the main strengths of Ganga is its robustness which is ensured by an organized release

procedure and extensive testing For the creation of stable releases Ganga developers rotate through six-week terms as release manager In general the job of the release manager is quite simple a cut-off

date for a pre-release is announced the release manager collects tags for the many components and

finally the pre-release is built Each pre-release undergoes extensive testing with tests of both core- and extension-specific

functionalities the testing framework includes approximately 500 test cases In general all core bugs

submitted to the Savannah tracking system [8] get a test case these validate the bug fixes and help

prevent regressions After approximately 24 hours of testing the developers inspect the test results if all developers are

satisfied the stable release is made Generally a new stable release is produced bi-weekly

3 The Ganga Community Though Ganga is developed and used primarily by the ATLAS and LHCb experiments it is built as a

generic tool and indeed sees usage from many Virtual Organizations (VOs) of users A plot of the

unique users for the first four months of 2009 is presented in Figure 4 During that period Ganga saw nearly 1000 unique users and for any given week there were usually between 250 and 300 unique

users About 50 of users are from ATLAS 25 are from LHCb and 25 are from other VOs

Other usage statistics not shown indicate that the GUI is not commonly used with instances using

either the CLI or scripting API making up over 99 of the total Finally the tool is used by many institutes during the past six months Ganga has been used at client machines coming from more than

130 top level domains (eg cernch gridkade etchellip)

Figure 3 The Remote backend allows users to submit to otherwise unavailable execution backends via an intermediate Ganga instance

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09) IOP PublishingJournal of Physics Conference Series 219 (2010) 072022 doi1010881742-65962197072022

4

Figure 4 Weekly unique users from January 1 to April 30 2009 ATLAS users are in blue LHCb in

green and others in brown

31 Ganga and the LHCb Experiment

Ganga is the grid user interface for LHCb As such many experiment-specific plugins have been incorporated to allow LHCb physicists to easily run large-scale analyses In particular the Gaudi [9]

application has been well integrated into Ganga allowing users to work with multiple configurations

of Gaudi jobs After specifying the relevant datasets users can run Gaudi jobs locally on batch systems or on the grid resources managed by Dirac Complete details of the use of Ganga by LHCb

are provided in these proceedings [10]

32 Ganga and the ATLAS Experiment

For the ATLAS experiment Ganga is one of two front-end tools to the ATLAS grids the other tool pathena and Ganga share a common library for the configuration of ATLAS analyses for the grid

[11] As with LHCb experiment-specific plugins have been developed in order easily map the ATLAS

analysis application Athena [12] to the execution backends The application object Athena allows users to process the various data types produced by the ATLAS detector and AthenaMC provides

users with a small scale production system to generate personal Monte Carlo samples Full details of

the ATLAS plugins in Ganga are given in these proceedings [13]

ATLAS has also built distributed analysis testing software using Ganga. The GangaRobot and HammerCloud services perform functional and stress testing of the distributed analysis facilities [14]. The GangaRobot test results are used to provide up-to-date site status information to running Ganga sessions so that malfunctioning sites can be avoided. The HammerCloud stress tests are used to identify bottlenecks in the grid architecture, for example to discover local storage or network limitations, and to test the remote access of shared database resources.
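The site-avoidance step can be illustrated with a short sketch: given recent functional-test outcomes per site, keep only sites whose latest test passed. The data layout below is a hypothetical simplification; GangaRobot's real result format and exclusion policy are not shown here.

```python
def usable_sites(sites, test_results):
    """Keep only sites whose most recent functional test succeeded.

    test_results maps a site name to a list of (timestamp, passed)
    tuples; this layout is a hypothetical simplification.
    """
    ok = []
    for site in sites:
        results = test_results.get(site)
        if not results:
            continue  # no recent test: treat the site as unverified, skip it
        latest = max(results)  # tuples sort by timestamp first
        if latest[1]:
            ok.append(site)
    return ok

results = {
    "CERN-PROD": [(1, True), (2, True)],
    "SITE-A":    [(1, True), (2, False)],  # latest test failed
    "SITE-B":    [(2, True)],
}
# SITE-A (failing) and the untested SITE-C are excluded.
print(usable_sites(["CERN-PROD", "SITE-A", "SITE-B", "SITE-C"], results))
```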

3.3 Ganga and the Wider Community

There are also novel contributions to Ganga that come from outside the core development team. For example, developers at the Korea Institute of Science and Technology Information (KISTI), while working on the WISDOM project [15], have developed a new multiplexing backend. For their studies, the KISTI researchers need to make use of both LCG/gLite-accessible sites and new sites managed by the GridWay [16] metascheduler. Since a GridWay backend did not exist, they first developed one using the pre-existing backends as examples. With all of their execution resources now accessible to Ganga, they then developed the InterGrid backend [17]: using information about the current load on various sites, InterGrid selects between LCG and GridWay for each submitted job. In this way the backend complexity is hidden and users can focus on getting results with minimal effort.
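A load-based dispatch of this kind can be sketched in a few lines. The load metric and tie-breaking rule below are hypothetical, standing in for whatever site information the real InterGrid backend consults [17]:

```python
def choose_backend(lcg_load: float, gridway_load: float) -> str:
    """Route the next job to the less loaded grid (hypothetical rule).

    Loads are fractions of busy slots in [0, 1]; the real InterGrid
    backend's selection policy may differ.
    """
    return "LCG" if lcg_load <= gridway_load else "GridWay"

# Multiplexing: each submitted job is routed independently,
# so the choice tracks the current load at submission time.
snapshots = [(0.40, 0.75), (0.90, 0.30), (0.50, 0.50)]
print([choose_backend(lcg, gw) for lcg, gw in snapshots])
```

The point of the design is that the user never names a grid at all; the multiplexing backend makes the choice per job, exactly as described above.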

4 Conclusions

This paper has presented Ganga, a user-friendly job management tool for grid, batch, and local systems. The modular architecture of Ganga allows users to easily make use of all resources available to them while spending little time reconfiguring their analyses for each execution backend. With a stable development process and a large user community, Ganga is a mature project that will continue to thrive as we move into the data-taking era of the LHC.

References
[1] F. Brochu et al., Ganga: a tool for computational-task management and easy access to Grid resources, submitted to Computer Physics Communications, arXiv:0902.2685v1
[2] T. Maeno, PanDA: distributed production and distributed analysis system for ATLAS, J. Phys.: Conf. Ser. 119 (2008) 062036, doi:10.1088/1742-6596/119/6/062036
[3] E. Laure et al., Programming the Grid with gLite, Computational Methods in Science and Technology 12(1) (2006) 33-45
[4] M. Ellert et al., The NorduGrid project: using Globus toolkit for building Grid infrastructure, Nucl. Instrum. Methods A502 (2003) 407
[5] A. Tsaregorodtsev et al., DIRAC: A Scalable Lightweight Architecture for High Throughput Computing, Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing, pp. 19-25, 2004
[6] R. Jones, An overview of the EGEE project, pp. 1-8 of C. Türker, M. Agosti and H.-J. Schek (Eds.), Peer-to-Peer, Grid, and Service-Orientation in Digital Library Architectures [Lecture Notes in Computer Science 3664] (Springer, Berlin, 2005)
[7] R. Pordes et al., The Open Science Grid, J. Phys.: Conf. Ser. 78 (2007) 012057
[8] Savannah Portal for Ganga, https://savannah.cern.ch/projects/ganga
[9] G. Barrand et al., GAUDI: A software architecture and framework for building HEP data processing applications, Computer Physics Communications 140 (2001) 45-55
[10] A. Maier et al., User analysis of LHCb data with Ganga, these proceedings
[11] D. C. Vanderster et al., A PanDA Backend for the Ganga Analysis Interface, these proceedings
[12] Athena Framework, https://twiki.cern.ch/twiki/bin/view/Atlas/AthenaFramework
[13] J. Elmsheuser et al., Distributed Analysis in ATLAS using GANGA, these proceedings
[14] D. C. Vanderster et al., Functional and Large-Scale Testing of the ATLAS Distributed Analysis Facilities with Ganga, these proceedings
[15] WISDOM project website, http://wisdom.eu-egee.fr
[16] E. Huedo et al., The GridWay Framework for Adaptive Scheduling and Execution on Grids, Scalable Computing: Practice and Experience 6(3) (2005) 1-8
[17] S. Hwang et al., An Approach to Grid Interoperability using Ganga, EGEE User Forum, Catania, March 2-6, 2009



Page 4: Ganga: User-friendly Grid job submission and management tool

Figure 1 The suite of execution backends available in Ganga Users can develop and debug on the local workstation then test on a batch system and finally run a full analysis on one of the many grids

Figure 2 The graphical user interface provides a point-and-click approach to job management

22 The Job Object A Ganga Job is a python object whose members constitute the configuration of an application running

on an execution backend In addition to specifying the application and backend the Job object also

includes the datasets used by the application to read inputs and write outputs Finally the Job object allows users to optionally specify a splitter and a merger the former being a rule to divide a large task

into subjobs and the latter being a rule for combining the subjob outputs

Ganga maintains a persistent repository of previously submitted jobs Stored in a local database at a

configurable location (by default in the userrsquos home directory) the repository allows users to submit jobs then exit and later restart to continue monitoring the set of submitted jobs

In the Ganga CLI running the default application is trivially accomplished using Job()submit()

To make the same application run on the EGEE resources the command

Job(backend=LCG())submit() is used Previously submitted jobs can be easily monitored

with the jobs command and help(jobs) gives detailed documentation on the interface to the jobs

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09) IOP PublishingJournal of Physics Conference Series 219 (2010) 072022 doi1010881742-65962197072022

3

object such as the functions to select copy resubmit kill and remove jobs from the persistent

repository

23 The Remote Backend

One recently developed feature of Ganga is the Remote backend shown in Figure 3 This backend allows users to submit jobs from any host even if the client tools for the execution backend are not

installed The Remote backend works as follows first a user runs a local instance of Ganga (eg on

his or her laptop) and configures the application for running on the target resources second Ganga connects via a secure-shell channel to a remote Ganga instance running on a host that has the

execution backend clients installed finally the remote Ganga instance submits and monitors the actual

jobs via the desired backend In this way users can develop their analysis in the familiar environment of their usual workstation and access remote resources that would be otherwise inaccessible locally

24 Development and Testing

One of the main strengths of Ganga is its robustness which is ensured by an organized release

procedure and extensive testing For the creation of stable releases Ganga developers rotate through six-week terms as release manager In general the job of the release manager is quite simple a cut-off

date for a pre-release is announced the release manager collects tags for the many components and

finally the pre-release is built Each pre-release undergoes extensive testing with tests of both core- and extension-specific

functionalities the testing framework includes approximately 500 test cases In general all core bugs

submitted to the Savannah tracking system [8] get a test case these validate the bug fixes and help

prevent regressions After approximately 24 hours of testing the developers inspect the test results if all developers are

satisfied the stable release is made Generally a new stable release is produced bi-weekly

3 The Ganga Community Though Ganga is developed and used primarily by the ATLAS and LHCb experiments it is built as a

generic tool and indeed sees usage from many Virtual Organizations (VOs) of users A plot of the

unique users for the first four months of 2009 is presented in Figure 4 During that period Ganga saw nearly 1000 unique users and for any given week there were usually between 250 and 300 unique

users About 50 of users are from ATLAS 25 are from LHCb and 25 are from other VOs

Other usage statistics not shown indicate that the GUI is not commonly used with instances using

either the CLI or scripting API making up over 99 of the total Finally the tool is used by many institutes during the past six months Ganga has been used at client machines coming from more than

130 top level domains (eg cernch gridkade etchellip)

Figure 3 The Remote backend allows users to submit to otherwise unavailable execution backends via an intermediate Ganga instance

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09) IOP PublishingJournal of Physics Conference Series 219 (2010) 072022 doi1010881742-65962197072022

4

Figure 4 Weekly unique users from January 1 to April 30 2009 ATLAS users are in blue LHCb in

green and others in brown

31 Ganga and the LHCb Experiment

Ganga is the grid user interface for LHCb As such many experiment-specific plugins have been incorporated to allow LHCb physicists to easily run large-scale analyses In particular the Gaudi [9]

application has been well integrated into Ganga allowing users to work with multiple configurations

of Gaudi jobs After specifying the relevant datasets users can run Gaudi jobs locally on batch systems or on the grid resources managed by Dirac Complete details of the use of Ganga by LHCb

are provided in these proceedings [10]

32 Ganga and the ATLAS Experiment

For the ATLAS experiment Ganga is one of two front-end tools to the ATLAS grids the other tool pathena and Ganga share a common library for the configuration of ATLAS analyses for the grid

[11] As with LHCb experiment-specific plugins have been developed in order easily map the ATLAS

analysis application Athena [12] to the execution backends The application object Athena allows users to process the various data types produced by the ATLAS detector and AthenaMC provides

users with a small scale production system to generate personal Monte Carlo samples Full details of

the ATLAS plugins in Ganga are given in these proceedings [13]

ATLAS has also built distributed analysis testing software using Ganga The GangaRobot and HammerCloud services perform functional and stress testing of the distributed analysis factilities [14]

The GangaRobot test results are used to provide up-to-date site status information to running Ganga

sessions so that malfunctioning sites can be avoided The HammerCloud stress tests are used to identify bottlenecks in the grid architecture for example to discover local storage or network

limitations and to test the remote access of shared database resources

33 Ganga and the Wider Community There are also novel contributions to Ganga that come from outside the core development team For

example developers at Korea Institute of Science and Technology Information (KISTI) while

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09) IOP PublishingJournal of Physics Conference Series 219 (2010) 072022 doi1010881742-65962197072022

5

working on the WISDOM project [15] have developed a new multiplexing backend For their studies

the KISTI researchers need to make use of both LCGg-Lite accessible sites and also new sites

managed by the GridWay [16] metascheduler Since a GridWay backend did not exist they first

developed it using the examples from the pre-existing backends With all of their execution resources now accessible to Ganga they then developed the InterGrid backend [17] using information about the

current load on various sites InterGrid selects between LCG and GridWay for each submitted job In

this way the backend complexity is hidden and users can focus on getting results with minimal effort

4 Conclusions

This paper has presented Ganga a user-friendly job management tool for grid batch and local

systems The modular architecture of Ganga allows users to easily make use of all resources available

to them while spending little time reconfiguring their analyses for each execution backend With a stable development process and large user community Ganga is a mature project that will continue to

thrive as we move into the data taking era of the LHC

References [1] F Brochu et al Ganga a tool for computational-task management and easy access to Grid

resources Submitted to Computer Physics Communications arXiv09022685v1

[2] T Maeno PanDA distributed production and distributed analysis system for ATLAS 2008 J Phys Conf Ser 119 062036 (4pp) doi 1010881742-65961196062036

[3] E Laure et al Programming the Grid with gLite Computational Methods in Science and

Technology Computational Methods in Science and Technology 12(1)33-45 2006

[4] M Ellert et al The NorduGrid project using Globus toolkit for building Grid infrastructure Nucl Instrum Methods A502 (2003) 407

[5] A Tsaregorodtsev et al DIRAC A Scalable Lightweight Architecture for High Throughput

Computing Proceedings of the 5th IEEEACM International Workshop on Grid Computing pp 19-25 2004

[6] R Jones An overview of the EGEE project pp 1-8 of C Tuumlrker M Agosti and H-J Schek

(Eds) Peer-to-peer Grid and service-orientation in digital library architectures [Lecture

Notes in Computer Science 3664] (Springer Berlin 2005) [7] R Pordes et al The Open Science Grid J Phys Conf Ser 78 (2007) 012057

[8] Savannah Portal for Ganga httpssavannahcernchprojectsganga

[9] G Barrand et al GAUDI A software architecture and framework for building HEP data processing applications Computer Physics Communications 140(2001) 45-55

[10] A Maier et al User analysis of LHCb data with Ganga These proceedings

[11] D C Vanderster et al A PanDA Backend for the Ganga Analysis Interface These proceedings [12] Athena Framework httpstwikicernchtwikibinviewAtlasAthenaFramework

[13] J Elmsheuser et al Distributed Analysis in ATLAS using GANGA These proceedings

[14] D C Vanderster et al Functional and Large-Scale Testing of the ATLAS Distributed Analysis

Facilities with Ganga These proceedings [15] WISDOM project website httpwisdomeu-egeefr

[16] E Huedo et al The GridWay Framework for Adaptive Scheduling and Execution on Grids

Scalable Computing ndash Practice and Experience 6(3) 1-8 2005 [17] S Hwang et al An Approach to Grid Interoperability using Ganga EGEE User Forum Catania

March 2-6 2009

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09) IOP PublishingJournal of Physics Conference Series 219 (2010) 072022 doi1010881742-65962197072022

6

Page 5: Ganga: User-friendly Grid job submission and management tool

object such as the functions to select copy resubmit kill and remove jobs from the persistent

repository

23 The Remote Backend

One recently developed feature of Ganga is the Remote backend shown in Figure 3 This backend allows users to submit jobs from any host even if the client tools for the execution backend are not

installed The Remote backend works as follows first a user runs a local instance of Ganga (eg on

his or her laptop) and configures the application for running on the target resources second Ganga connects via a secure-shell channel to a remote Ganga instance running on a host that has the

execution backend clients installed finally the remote Ganga instance submits and monitors the actual

jobs via the desired backend In this way users can develop their analysis in the familiar environment of their usual workstation and access remote resources that would be otherwise inaccessible locally

24 Development and Testing

One of the main strengths of Ganga is its robustness which is ensured by an organized release

procedure and extensive testing For the creation of stable releases Ganga developers rotate through six-week terms as release manager In general the job of the release manager is quite simple a cut-off

date for a pre-release is announced the release manager collects tags for the many components and

finally the pre-release is built Each pre-release undergoes extensive testing with tests of both core- and extension-specific

functionalities the testing framework includes approximately 500 test cases In general all core bugs

submitted to the Savannah tracking system [8] get a test case these validate the bug fixes and help

prevent regressions After approximately 24 hours of testing the developers inspect the test results if all developers are

satisfied the stable release is made Generally a new stable release is produced bi-weekly

3 The Ganga Community Though Ganga is developed and used primarily by the ATLAS and LHCb experiments it is built as a

generic tool and indeed sees usage from many Virtual Organizations (VOs) of users A plot of the

unique users for the first four months of 2009 is presented in Figure 4 During that period Ganga saw nearly 1000 unique users and for any given week there were usually between 250 and 300 unique

users About 50 of users are from ATLAS 25 are from LHCb and 25 are from other VOs

Other usage statistics not shown indicate that the GUI is not commonly used with instances using

either the CLI or scripting API making up over 99 of the total Finally the tool is used by many institutes during the past six months Ganga has been used at client machines coming from more than

130 top level domains (eg cernch gridkade etchellip)

Figure 3 The Remote backend allows users to submit to otherwise unavailable execution backends via an intermediate Ganga instance

17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09), IOP Publishing. Journal of Physics: Conference Series 219 (2010) 072022, doi:10.1088/1742-6596/219/7/072022


Figure 4. Weekly unique users from January 1 to April 30, 2009. ATLAS users are in blue, LHCb in green, and others in brown.

3.1 Ganga and the LHCb Experiment

Ganga is the grid user interface for LHCb. As such, many experiment-specific plugins have been incorporated to allow LHCb physicists to easily run large-scale analyses. In particular, the Gaudi [9] application has been well integrated into Ganga, allowing users to work with multiple configurations of Gaudi jobs. After specifying the relevant datasets, users can run Gaudi jobs locally, on batch systems, or on the grid resources managed by DIRAC. Complete details of the use of Ganga by LHCb are provided in these proceedings [10].

3.2 Ganga and the ATLAS Experiment

For the ATLAS experiment, Ganga is one of two front-end tools to the ATLAS grids; the other tool, pathena, and Ganga share a common library for the configuration of ATLAS analyses for the grid [11]. As with LHCb, experiment-specific plugins have been developed in order to easily map the ATLAS analysis application Athena [12] to the execution backends. The Athena application object allows users to process the various data types produced by the ATLAS detector, and AthenaMC provides users with a small-scale production system to generate personal Monte Carlo samples. Full details of the ATLAS plugins in Ganga are given in these proceedings [13].

ATLAS has also built distributed analysis testing software using Ganga. The GangaRobot and HammerCloud services perform functional and stress testing of the distributed analysis facilities [14]. The GangaRobot test results are used to provide up-to-date site status information to running Ganga sessions, so that malfunctioning sites can be avoided. The HammerCloud stress tests are used to identify bottlenecks in the grid architecture, for example to discover local storage or network limitations, and to test the remote access of shared database resources.
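The site-avoidance mechanism can be sketched as a simple filter over published test results. The status data and helper name below are illustrative assumptions; in reality the status is published by the GangaRobot service, not hard-coded.

```python
# Sketch of site exclusion based on functional-test results (illustrative
# data): jobs are only brokered to sites whose latest test run passed.

robot_status = {
    "SITE_A": "ok",
    "SITE_B": "failing",   # e.g. stage-out errors in the last test run
    "SITE_C": "ok",
}

def usable_sites(candidates, status):
    """Drop any candidate site whose latest functional test did not pass."""
    return [s for s in candidates if status.get(s) == "ok"]

print(usable_sites(["SITE_A", "SITE_B", "SITE_C"], robot_status))
# SITE_B is skipped until its tests pass again
```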

3.3 Ganga and the Wider Community
There are also novel contributions to Ganga that come from outside the core development team. For example, developers at the Korea Institute of Science and Technology Information (KISTI), while working on the WISDOM project [15], have developed a new multiplexing backend. For their studies, the KISTI researchers need to make use of both LCG/gLite-accessible sites and new sites managed by the GridWay [16] metascheduler. Since a GridWay backend did not exist, they first developed it using the examples from the pre-existing backends. With all of their execution resources now accessible to Ganga, they then developed the InterGrid backend [17]: using information about the current load on various sites, InterGrid selects between LCG and GridWay for each submitted job. In this way the backend complexity is hidden and users can focus on getting results with minimal effort.
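The per-job selection step can be sketched as follows. The load figures and function name are invented for illustration; InterGrid's actual metric and interface are described in [17].

```python
# Minimal sketch of load-based backend multiplexing: for each job, pick
# whichever backend currently reports the lower load, hiding the choice
# from the user. The numbers below are illustrative.

def pick_backend(load_by_backend):
    """Return the name of the least-loaded backend."""
    return min(load_by_backend, key=load_by_backend.get)

# Load here could mean e.g. queued jobs per available slot.
current_load = {"LCG": 0.8, "GridWay": 0.35}
print(pick_backend(current_load))   # GridWay, in this snapshot
```

A user submitting through such a multiplexer never names a backend at all, which is exactly the complexity-hiding the text describes.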

4. Conclusions

This paper has presented Ganga, a user-friendly job management tool for grid, batch, and local systems. The modular architecture of Ganga allows users to easily make use of all resources available to them, while spending little time reconfiguring their analyses for each execution backend. With a stable development process and a large user community, Ganga is a mature project that will continue to thrive as we move into the data-taking era of the LHC.

References
[1] F Brochu et al., Ganga: a tool for computational-task management and easy access to Grid resources, submitted to Computer Physics Communications, arXiv:0902.2685v1
[2] T Maeno, PanDA: distributed production and distributed analysis system for ATLAS, 2008 J. Phys.: Conf. Ser. 119 062036, doi:10.1088/1742-6596/119/6/062036
[3] E Laure et al., Programming the Grid with gLite, Computational Methods in Science and Technology 12(1) 33-45, 2006
[4] M Ellert et al., The NorduGrid project: using Globus toolkit for building Grid infrastructure, Nucl. Instrum. Methods A502 (2003) 407
[5] A Tsaregorodtsev et al., DIRAC: A Scalable Lightweight Architecture for High Throughput Computing, Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing, pp. 19-25, 2004
[6] R Jones, An overview of the EGEE project, pp. 1-8 of C Türker, M Agosti and H-J Schek (Eds.), Peer-to-peer, Grid, and service-orientation in digital library architectures [Lecture Notes in Computer Science 3664] (Springer, Berlin, 2005)
[7] R Pordes et al., The Open Science Grid, J. Phys.: Conf. Ser. 78 (2007) 012057
[8] Savannah Portal for Ganga, https://savannah.cern.ch/projects/ganga
[9] G Barrand et al., GAUDI: A software architecture and framework for building HEP data processing applications, Computer Physics Communications 140 (2001) 45-55
[10] A Maier et al., User analysis of LHCb data with Ganga, these proceedings
[11] D C Vanderster et al., A PanDA Backend for the Ganga Analysis Interface, these proceedings
[12] Athena Framework, https://twiki.cern.ch/twiki/bin/view/Atlas/AthenaFramework
[13] J Elmsheuser et al., Distributed Analysis in ATLAS using GANGA, these proceedings
[14] D C Vanderster et al., Functional and Large-Scale Testing of the ATLAS Distributed Analysis Facilities with Ganga, these proceedings
[15] WISDOM project website, http://wisdom.eu-egee.fr
[16] E Huedo et al., The GridWay Framework for Adaptive Scheduling and Execution on Grids, Scalable Computing – Practice and Experience 6(3) 1-8, 2005
[17] S Hwang et al., An Approach to Grid Interoperability using Ganga, EGEE User Forum, Catania, March 2-6, 2009

