
Page 1

Negative improvements, relative validity and elusive goodness

M. Batic1, G. Hoff2, S. H. Kim3, M. G. Pia4, P. Saracco4, G. Weidenspointner5

IEEE NSS 2013, 27 October – 2 November 2013

Seoul, Korea

1 Sinergise, Ljubljana, Slovenia
2 Pontificia Universidade Catolica do Rio Grande do Sul, Porto Alegre, Brazil
3 Hanyang University, Seoul, Korea
4 INFN Genova, Italy
5 Halbleiterlabor, MPI-MPE, München, Germany

Food for thought for Monte Carlo simulation users and developers

Page 2

Aging

• Our Monte Carlo codes are getting old
• (Monte Carlo) simulation gets more and more popular in experimental practice

[Figure: fraction of published papers matching "Monte Carlo" OR "simulation" vs. publication year, 1960-2009, in NIM A+B, TNS, NIM A and NIM B; code names cited: MCNP, EGS, GEANT, Penelope, Geant4.]

Page 3

Revisiting old successes

What does it look like a few years later?

Validation of Geant4 Low Energy Physics Models Against Electron Energy Deposition and Backscattering Data
Anton Lechner, Student Member, IEEE, Maria Grazia Pia and Manju Sudhakar

Abstract—A comprehensive and systematic validation of Geant4 electromagnetic physics models, relevant to the low-energy domain (≤ 1 MeV), was performed. Considering different materials, the energy deposition pattern and backscattering are investigated for electron beams of varying energies and incident angles. The obtained simulation results are compared against high-precision experimental data. The study provides useful guidance for low-energy physics simulation applications based on the Geant4 toolkit.

Index Terms—Geant4, validation, low-energy electrons.

I. INTRODUCTION

THE electron transport in matter is fundamental to simulation applications involving electrons either as primary particles or secondary products; thus a detailed understanding of the accuracy of relevant models with respect to experimental data is of great importance for Monte Carlo developers and users. Systematic investigations of the simulation accuracy over a variety of conditions, covering different materials and energies, allow one to obtain a global picture of the abilities and shortcomings of physics models.

The study presented here addresses a comprehensive validation of Geant4 [1], [2] electron energy deposition and backscattering simulations for a variety of target materials, incident angles and beam energies in the low-energy range (≤ 1 MeV). Simulation results are compared against experimental data in [3], [4]. Earlier attempts exist to validate Geant4 against these measurements, but they consider only a limited subset of the available experimental data (see e.g. [5], [6]). The current validation project has as its primary objective the coverage of the complete experimental data set. An initial, still significant collection of results is presented, focusing on Geant4 library-based interaction models (see Section II).

II. GEANT4 ELECTROMAGNETIC PHYSICS

The current validation process examines the accuracy achievable with Geant4 physics models describing the electromagnetic interactions of electrons and photons. The Geant4 toolkit offers alternative physics process implementations applicable to simulations in the low-energy domain, which are contained in the Low-Energy [7] and in the Standard [8] electromagnetic packages. The relevant physics processes for electrons and photons and the corresponding implementations are summarized in Table I.

Manuscript received November 23, 2007.
A. Lechner is with the Atomic Institute of the Austrian Universities, Vienna University of Technology, Vienna, Austria, and CERN, Geneva, Switzerland.
M. G. Pia is with INFN Sezione di Genova, Via Dodecaneso 33, I-16146 Genova, Italy (phone: +39 010 3536420, fax: +39 010 313358, e-mail: [email protected]).
M. Sudhakar is with the ISRO Satellite Centre, Bangalore, India.

The Low-Energy package contains two alternative approaches for both electrons and photons: one relying on the parametrization of data libraries (EEDL [9] for electrons and EPDL [10] for photons) and the second utilizing analytical descriptions originally developed for the Penelope Monte Carlo code [11], [12]. Only results corresponding to the EEDL and EPDL parametrizations are presented in this paper.

A unique implementation of the multiple-scattering process exists in Geant4, which is part of the Standard package; this process is used in all the simulations described here.

III. EXPERIMENTAL DATA

The experimental data considered for the radiation study derive from [3] and [4]. The first reference includes a comprehensive collection of energy deposition measurements from incident electrons as a function of penetration depth in various targets, while the latter one provides a variety of experimental electron energy and charge albedos. Both data sets cover different target materials, spanning a wide range in atomic number (from beryllium to uranium). The measured data sets were originally intended for validating the TIGER code [13]. The uncertainty in the dose deposition is <2.2% [3].

For both experimental setups, calorimetric measurement techniques were applied to determine the physical observables under investigation:

• Longitudinal energy delivery pattern. To measure the energy deposition at a specified distance from the target surface, a calorimeter foil was placed between a front layer and a "semi-infinite" backward layer, i.e. a slab thicker than the range of most electrons. The calorimeter was made of the same material as the entire target configuration. For a given front slab, the measurement depth was determined as the slab's thickness plus half the calorimeter thickness. Measurements were performed for different thicknesses of the front layers.

• Backscattering. The fractions of energy and charge backscattered from the target surface were determined indirectly by measuring the energy deposition and the current in a calorimeter. The albedos were then obtained by taking the complement, where for the energy albedos a …

2007 IEEE Nuclear Science Symposium Conference Record N36-4

Best Student Paper, IEEE NSS 2007

IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 56, NO. 2, APRIL 2009

Validation of Geant4 Low Energy Electromagnetic Processes Against Precision Measurements of Electron Energy Deposition
Anton Lechner, Maria Grazia Pia, and Manju Sudhakar

Abstract—A comparison of energy deposition profiles produced by Geant4-based simulations against calorimetric measurements is reported, specifically addressing the low energy range. The energy delivered by primary and secondary particles is analyzed as a function of the penetration depth. The experimental data concern electron beams of energy between approximately 50 keV and 1 MeV and several target materials of atomic number between 4 and 92. The simulations involve different sets of physics process models and versions of the Geant4 toolkit. The agreement between simulations and experimental data is evaluated quantitatively; the differences in accuracy observed between Geant4 models and versions are highlighted. The results document the accuracy achievable in use cases involving the simulation of low energy electrons with Geant4 and provide guidance to the experimental applications concerned.

Index Terms—Calorimeter, dosimetry, Geant4, Monte Carlo, simulation.

I. INTRODUCTION

THE transport of electrons in matter is fundamental to the simulation of a large variety of experimental problems, where electrons are involved either as primary particles or secondary products of interaction. Systematic investigations of the accuracy of physics simulation models over a variety of conditions, encompassing different interacting materials and electron energies, allow both Monte Carlo developers and users to obtain a comprehensive appraisal of the abilities and shortcomings of available simulation tools.

The study presented here addresses the validation of processes in the Geant4 [1], [2] Low Energy Electromagnetic package [3], [4]: the energy deposited by electrons in Geant4-based simulations is compared against experimental data from [5] for various target materials, incident angles, beam energies and simulation models.

These high precision experimental measurements have been considered a valuable reference for Monte Carlo systems; they were originally meant for the validation of the ITS (Integrated Tiger Series) [6] system, but they have been exploited in various validation studies of Monte Carlo codes, like [7]–[11]. They still represent the most comprehensive set of reference data for benchmarking simulation models of electron energy deposition in the low energy range, including various materials, primary electron energies and incident angles.

Manuscript received September 10, 2008; revised December 09, 2008. Current version published April 08, 2009.
A. Lechner is with the Atomic Institute of the Austrian Universities, Vienna University of Technology, Vienna, Austria, and also with CERN, 1211 Geneva 23, Switzerland (e-mail: [email protected]).
M. G. Pia is with INFN Sezione di Genova, I-16146 Genova, Italy (e-mail: [email protected]).
M. Sudhakar is with ISRO, Bangalore, India, INFN Sezione di Genova, I-16146 Genova, Italy, and also with the Department of Physics, University of Calicut, Calicut, India (e-mail: [email protected]).
Digital Object Identifier 10.1109/TNS.2009.2013858

Previous evaluations of Geant4 physics models [13]–[16] were performed against these experimental measurements: they concerned a subset of the available data, limited to a few materials and beam settings; [14] was specifically focussed on the optimization of user-defined parameters in the simulation application, while [13] investigated the effect of parameter settings in an early version of the Geant4 multiple scattering model. The Geant4 toolkit has been subject to further evolutions since their publication.

This paper presents a comprehensive set of validation results based on the experimental data concerning single element targets in [5]. The study is characterized by a faithful reproduction of the experimental set-up in the simulation and by a quantitative statistical analysis of the results. Two topics were addressed with particular emphasis in the validation process: the evaluation of effects in dosimetric simulations related to the evolution of the Geant4 multiple scattering algorithm, and the comparison of the accuracy of two Geant4 sets of physics models specifically devoted to the low energy domain.

These results complement the comparison [17] of Geant4 electromagnetic models against the NIST Physical Reference Data [18] as a reference for experimental applications concerned with the reliability of Geant4-based simulations in the low energy range.

II. EXPERIMENTAL DATA

The experimental data exploited in this simulation validation study derive from [5]. The measurement technique is described in detail in [5] and [19].

The experimental set-up consisted of an electron beam impinging on a target equipped with a calorimeter. The target was configured with a semi-infinite geometry, in such a way that the longitudinal profile of the energy deposition could be measured.

The experimental apparatus consisted of a front slab of passive material, a calorimeter and a so-called "infinite" plate. The calorimeter was made of a thin foil of the material under investigation. The front slab consisted of either a single material or a stack of foils of different materials; the present study concerned only the data taken with a front slab of the same material …


TNS, 2009

Page 4

V. Ivanchenko et al., Recent Progress of Geant4 Electromagnetic Physics and Readiness for the LHC Start, XII Advanced Computing and Analysis Techniques in Physics Research, Erice, Italy, 3-7 November 2008

PoS(ACAT08)108


Recent Progress of Geant4 Electromagnetic Physics and Readiness for the LHC Start

Vladimir Ivanchenko1, John Apostolakis CERN, CH-1211 Geneve 23, Switzerland E-mail: [email protected], [email protected]

Andreas Schaelicke DESY, 15738 Zeuthen, Germany E-mail: [email protected]

Laszlo Urban Geant4 Associates International Ltd, United Kingdom E-mail: [email protected]

Toshiyuki Toshito KEK, Tsukuba, Ibaraki 305-0801, Japan E-mail: [email protected]

Sabine Elles, Jean Jacquemier, Michel Maire LAPP, 74941 Annecy-le-vieux, France E-mail: [email protected], [email protected], [email protected]

Alexander Bagulya, Vladimir Grichine Lebedev Physics Institute, 119991 Moscow, Russia E-mail: [email protected], [email protected]

Alexei Bogdanov, Rostislav Kokoulin MEPhI, 115409 Moscow, Russia E-mail: [email protected], [email protected]

Peter Gumplinger TRIUMF, Vancouver, Canada E-mail: [email protected]

The status of Geant4 electromagnetic (EM) physics models is presented, focusing on the models most relevant for collider HEP experiments, at LHC in particular.

XII Advanced Computing and Analysis Techniques in Physics Research Erice, Italy 3-7 November, 2008

1 Speaker

Progress in NUCLEAR SCIENCE and TECHNOLOGY, Vol. 2, pp. 898-903 (2011)
© 2011 Atomic Energy Society of Japan, All Rights Reserved.

REVIEW

Recent Improvements in Geant4 Electromagnetic Physics Models and Interfaces

Vladimir IVANCHENKO1,2,3*, John APOSTOLAKIS1, Alexander BAGULYA4, Haifa Ben ABDELOUAHED5, Rachel BLACK6, Alexey BOGDANOV7, Helmut BURKHARD1, Stéphane CHAUVIE8, Pablo CIRRONE9, Giacomo CUTTONE9, Gerardo DEPAOLA10, Francesco Di ROSA9, Sabine ELLES11, Ziad FRANCIS12,

Vladimir GRICHINE4, Peter GUMPLINGER13, Paul GUEYE6, Sebastien INCERTI14, Anton IVANCHENKO14, Jean JACQUEMIER11, Anton LECHNER1,15, Francesco LONGO16, Omrane KADRI5, Nicolas KARAKATSANIS17,

Mathieu KARAMITROS14, Rostislav KOKOULIN7, Hisaya KURASHIGE18, Michel MAIRE11,19, Alfonso MANTERO20, Barbara MASCIALINO21, Jakub MOSCICKI1, Luciano PANDOLA22, Joseph PERL23, Ivan PETROVIC9,

Aleksandra RISTIC-FIRA9, Francesco ROMANO9, Giorgio RUSSO9, Giovanni SANTIN24, Andreas SCHAELICKE25, Toshiyuki TOSHITO26, Hoang TRAN14, Laszlo URBAN19, Tomohiro YAMASHITA27 and Christina ZACHARATOU28

1 CERN, CH1211 Geneve 23, Switzerland

2 Ecoanalytica,119899 Moscow, Russia 3 Metz University, LPMC, 57078 Metz, France

4 Lebedev Physics Institute, 119991 Moscow, Russia 5 CNSTN, 2020 Tunis, Tunisia,

6 Hampton University, Hampton VA 23668, USA 7 MEPhI, 115409 Moscow, Russia

8 INFN/Torino, Italy 9 INFN/LNS, I-95125 Catania, Italy

10 FAMAF, Argentina 11 LAPP, 74941 Annecy-le-vieux, France

12 IRSN, 92262 Fontenay-aux-Roses, France 13 TRIUMF, Vancouver, Canada

14 Université Bordeaux 1, CNRS/IN2P3, CENBG, Ch. Du Solarium, BP120, 33175 Gradignan, France 15 Atomic Institute of the Austrian Universities, Vienna, Austria

16 INFN/Trieste, Italy 17 NTUA, Greece

18 Kobe University, Japan 19 Geant4 Associates International Ltd, United Kingdom

20 INFN/Genoa, Italy 21 Karolinska Institutet, S-171-76 Stockholm, Sweden

22 INFN/LNGS, I-67100 Assergi (AQ), Italy 23 SLAC, Menlo Park, CA, USA

24 ESA/ESTEC, 2200 AG Noordwijk, The Netherlands 25 DESY, 15738 Zeuthen, Germany

26 Nagoya City Hall, Nagoya, Japan 27 Hyogo Ion Beam Medical Center, Tatsuno, Japan

28 Niels Bohr Institute, Denmark

An overview of the electromagnetic (EM) physics of the Geant4 toolkit is presented. Two sets of EM models are available: the "Standard", initially focused on high energy physics (HEP), and the "Low-energy", developed for medical, space and other applications. The "Standard" models provide a faster computation but are less accurate for keV energies, while the "Low-energy" models are more CPU time consuming. A common interface to EM physics models has been developed, allowing a natural combination of ultra-relativistic, relativistic and low-energy models in the same run, providing both precision and CPU performance. Due to this migration additional capabilities become available. The new developments include relativistic models for bremsstrahlung and e+e- pair production, models of multiple and single scattering, hadron/ion ionization, microdosimetry for very low energies and also improvements in existing Geant4 models. In parallel, validation suites and benchmarks have been intensively developed.

KEYWORDS: Geant4, bremsstrahlung, Monte Carlo, multiple scattering, energy loss

I. Introduction

Geant4 is a general purpose toolkit 1,2) for Monte Carlo simulation of particle transport in matter. The standard EM package of Geant4 1,3-12) provides simulation of EM interactions of particles in the energy interval from 1 keV to 10 PeV. It is used for Monte Carlo production for the Large Hadron Collider …

* Corresponding author, E-mail: [email protected]

SNA+Monte Carlo 2010

Abstract—A review of Geant4 model and benchmark developments in the framework of an Electron Shielding project sponsored by ESA is presented. It includes extensions of electron ionisation models to lower energies, development of a new standard model of bremsstrahlung and a new angular generator of bremsstrahlung, and new standard models of Compton and Rayleigh scattering. New validation benchmarks have been developed for electron scattering and bremsstrahlung; also a simple benchmark tool has been developed for the GRAS tool. With the new benchmarks Geant4 simulation results can be compared with EGSnrc and Penelope predictions.

Index Terms— Geant4, gamma rays, physics computing, simulation, software tools

I. INTRODUCTION

GEANT4 is a toolkit for Monte Carlo simulation of the transportation and interaction of particles in matter [1], [2]. It is applicable for a wide variety of applications including high energy physics, space and medical science. The Geant4 electromagnetic standard physics sub-package [3] is an important component of the toolkit. In the current report we summarise results of recent developments for Standard Geant4 models used for electron transport simulations and new results of validations performed in the framework of the ESA funded project. The goals of the project are to improve the accuracy of Geant4 simulations of radiation transport of electrons for space shielding applications. For these validation studies the standard, low-energy Livermore and low-energy Penelope models of Geant4 were used. The results were compared with experimental data and published results of the EGSnrc and Penelope codes.

Manuscript received September 5, 2011. This work was supported in part by the ESA Contract 22839/10/NL/AT.
J. Allison, V. Grichine, A. Howard, V. Ivanchenko, M. Maire and L. Urban are with Geant4 Associates International, Hebden Bridge, HX7 7BT, United Kingdom (e-mail: [email protected]). V. Grichine is also with Lebedev Physics Institute, Moscow, Russia; A. Howard is with ETHZ, Zurich, Switzerland; V. Ivanchenko is on leave from Ecoanalytica, MSU, Moscow, Russia (corresponding author, phone: +41-22-767-8871; fax: +41-22-767-8830; e-mail: [email protected]).
J. Cueto is with Thales Alenia Space España, 28760 Tres Cantos, Madrid, Spain (e-mail: [email protected]).
S. Ibarmia is with the National Institute for Aerospace Technology, 28850 Torrejón de Ardoz, Madrid, Spain (e-mail: [email protected]).
G. Santin is with ESA-ESTEC, 2200 AG Noordwijk, The Netherlands (e-mail: [email protected]).

II. GEANT4 INTERFACE UPGRADE

Recently Geant4 software interfaces for electromagnetic physics processes were unified between all electromagnetic sub-packages, including Standard, Livermore, Penelope and DNA [4]. As a result, combined Physics Lists collecting a desired set of models from different sub-packages can be built per particle type and energy range. The Standard models are used above 1 GeV in all cases; for low energies a selection should be made to provide the needed accuracy in a concrete Geant4 application efficiently. One of the goals of the current project was to study the accuracy of different models for different energy ranges and to provide recommendations for the Physics List to be used for electron shielding simulations.

The unification of Geant4 interfaces for electromagnetic physics was completed when a common approach for the simulation of atomic de-excitation had been developed [5]. This was achieved by the introduction of a new abstract G4VAtomDeexcitation interface and its first concrete implementation, G4UAtomicDeexcitation. With this new interface the atomic de-excitation module can be activated for any electromagnetic or other physics model of Geant4.
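To illustrate the pattern just described, here is a minimal C++ sketch of an abstract de-excitation interface with one concrete implementation. The class and member names are simplified stand-ins inspired by the text, not the actual Geant4 G4VAtomDeexcitation/G4UAtomicDeexcitation signatures.

#include <vector>

// Simplified stand-in type; the real Geant4 classes differ.
struct Particle { double energy; };

// Abstract interface: a physics model triggers atomic de-excitation
// through this base class without knowing the concrete implementation.
class VAtomDeexcitation {
 public:
  virtual ~VAtomDeexcitation() = default;
  // Generate fluorescence/Auger products for a vacancy in the given
  // shell of element Z (hypothetical signature).
  virtual std::vector<Particle> GenerateParticles(int Z, int shell) = 0;
};

// A first concrete implementation, analogous in role to G4UAtomicDeexcitation.
class UAtomicDeexcitation : public VAtomDeexcitation {
 public:
  std::vector<Particle> GenerateParticles(int Z, int shell) override {
    std::vector<Particle> products;
    // ... sample fluorescence photons and Auger electrons here ...
    (void)Z; (void)shell;
    return products;
  }
};

// Any electromagnetic model can then hold a pointer to the interface,
// which may be set or left null; compare the repeated -fAtomDeexcitation
// associations in the class diagram on Page 6.
class SomeEmModel {
 public:
  void SetAtomDeexcitation(VAtomDeexcitation* d) { fAtomDeexcitation = d; }
 private:
  VAtomDeexcitation* fAtomDeexcitation = nullptr;
};

This is the standard dependency-injection use of an abstract base class: the de-excitation module becomes swappable for every model that talks to the interface rather than to a concrete class.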

III. GEANT4 MODELS UPGRADE

A. Electron multiple scattering

The process of multiple scattering (MSC) of charged particles is an important component of Monte Carlo transport. At high energy it defines the deviation of charged particles from ideal tracks, limiting the spatial resolution of detectors. Scattering of low-energy electrons defines energy flow via volume boundaries, directly affecting all application domains including transmission through shielding, and therefore in-flight predictions of dose inside spacecraft. The Geant4 toolkit offers several models of multiple scattering [6]. The production model is developed by L. Urban [7] and it is used in Geant4 by default.

New Geant4 Model and Interface Developments for Improved Space Electron Transport Simulations: First Results
John Allison, Juan Cueto, Vladimir Grichine, Alexander Howard, Sergio Ibarmia, Vladimir Ivanchenko, Michel Maire, Giovanni Santin and Laszlo Urban

Page 5

Negative improvements

[Figures: (left) longitudinal energy deposition E/Δx (MeV cm²/g) vs. fraction of CSDA range for C, 1 MeV, 0°, experiment vs. Geant4 versions 9.1-9.6; (right) efficiency vs. Geant4 version (9.1-9.6) for the EEDL-EPDL, Penelope and Standard models.]

IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 60, NO. 4, AUGUST 2013

TABLE VIII: p-VALUES OF THE TESTS FOR LONGITUDINAL ENERGY DEPOSITION: SIMULATIONS WITH ELECTRON-PHOTON MODELS BASED ON EEDL–EPDL

… models in these versions, not exceeding 0.3 for any model: this observation suggests that the analysis of contingency tables for these Geant4 versions would have scarce discriminating power to appreciate differences in accuracy associated with any model.

According to Table XIII, the hypothesis of equivalent compatibility with experiment for the electron–photon models based on EEDL–EPDL and the models in the "standard" package is rejected in Geant4 version 9.1, while it is not rejected in the later versions. The efficiency of both sets of models is lower in Geant4 versions later than 9.1, therefore the result of the statistical tests cannot be explained by an improvement of the accuracy of the models in the Geant4 standard electromagnetic package. This result could be due to a worse degradation of the accuracy of the models based on EEDL–EPDL than of the standard ones in versions later than 9.1, or could be explained by the loss of discriminating power of the tests over contingency tables when only a small number of test cases "pass" the test of compatibility between simulation and experiment.

Based on the results in Tables XII and XIII, one can conclude that the three sets of electron–photon models contribute to significantly different simulation accuracy when using Geant4 9.1, while, along with a general degradation of simulation accuracy in later Geant4 versions, the selection of electron–photon models is less critical when using other, more recent versions.

The simulations based on Penelope-like and standard electron–photon models exhibit statistically equivalent compatibility with experimental data: the null hypothesis of equivalence of the two categories is never rejected in any of the test cases summarized in Table XIV.

C. Evolution of Penelope-Like Electron-Photon Modeling

The general trend of the results reported in Section VI-B does not suggest any major variations in the compatibility of the Penelope-like models with experimental data, where differences could have arisen due to the update from Penelope 2001 to Penelope 2008. Since classes reengineered from both Penelope versions coexist in Geant4 9.5, it is possible to compare their accuracy directly in that context.

Longitudinal energy deposition profiles were produced with Geant4 9.5 using both implementations of Penelope-like electron–photon models, keeping all other simulation features unchanged, and were subject to the same analysis procedure. The results of the test appear similar for the two implementations: the number of test cases that "pass" the test is 4 and 5, using code reengineered from Penelope 2008 and 2001 respectively, while the hypothesis of compatibility with experimental measurements is rejected in 26 and 25 test cases respectively.

The analysis of the resulting 2×2 table does not reject the hypothesis of equivalent compatibility with experiment using Penelope 2008 or 2001: the p-value of McNemar's test is 0.317, while it is 1 for McNemar's exact test. These results suggest that the reengineered Penelope 2008 models do not represent a significant improvement with respect to the 2001 ones in the experimental scenario subject to evaluation.
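The quoted numbers are easy to reproduce. The C++ sketch below assumes the discordant-pair counts b = 0 and c = 1 (test cases passing with one Penelope implementation but not the other), an assumption consistent with the reported 4 vs. 5 passing cases; it prints the asymptotic McNemar p-value 0.317 and the exact p-value 1.

#include <cmath>
#include <cstdio>

// Asymptotic McNemar test: chi2 = (b-c)^2/(b+c) with 1 degree of freedom,
// so p = erfc(sqrt(chi2/2)). Requires b + c > 0.
double McNemarAsymptotic(int b, int c) {
  const double chi2 = double((b - c) * (b - c)) / (b + c);
  return std::erfc(std::sqrt(chi2 / 2.0));
}

// Exact McNemar test: two-sided binomial probability of observing a split
// at least as extreme as (b, c) among n = b + c Bernoulli(1/2) trials.
double McNemarExact(int b, int c) {
  const int n = b + c;
  const int k = b < c ? b : c;
  double p = 0.0;
  double coeff = 1.0;                                // C(n, 0)
  for (int i = 0; i <= k; ++i) {
    p += coeff * std::pow(0.5, n);
    coeff = coeff * double(n - i) / double(i + 1);   // C(n, i+1)
  }
  p *= 2.0;
  return p > 1.0 ? 1.0 : p;
}

int main() {
  std::printf("asymptotic p = %.3f, exact p = %.3f\n",
              McNemarAsymptotic(0, 1), McNemarExact(0, 1));  // 0.317, 1.000
}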

D. Evolution Over Geant4 Versions

One can observe in Fig. 13 a similar trend associated with all Geant4 electron–photon models: for each modeling option …

IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 60, NO. 4, AUGUST 2013

Validation of Geant4 Simulation of Electron Energy Deposition
Matej Batič, Gabriela Hoff, Maria Grazia Pia, Paolo Saracco, and Georg Weidenspointner

Abstract—Geant4-based simulations of the energy deposited by electrons in various materials are quantitatively compared to high-precision calorimetric measurements taken at Sandia Laboratories. The experimental data concern electron beams of energy between a few tens of kiloelectronvolt and 1 MeV at various incidence angles. Two experimental scenarios are evaluated: the longitudinal energy deposition pattern in a finely segmented detector, and the total energy deposited in a larger size calorimeter. The simulations are produced with Geant4 versions from 9.1 to 9.6; they involve models of electron–photon interactions in the standard and low energy electromagnetic packages, and various implementations of electron multiple scattering. Significant differences in compatibility with experimental data are observed in the longitudinal energy deposition patterns produced by the examined Geant4 versions, while the total deposited energy exhibits smaller variations across the various Geant4 versions, with the exception of Geant4 9.4. The validation analysis, based on statistical methods, shows that the best compatibility between simulation and experimental energy deposition profiles is achieved using electromagnetic models based on the EEDL and EPDL evaluated data libraries with Geant4 9.1. The results document the accuracy achievable in the simulation of the energy deposited by low energy electrons with Geant4; they provide guidance for application in similar experimental scenarios and for improving Geant4.

Index Terms—Dosimetry, electrons, Geant4, Monte Carlo, simulation.

I. INTRODUCTION

THE simulation of the interactions with matter of electrons and of their secondary particles is one of the main tasks of any Monte Carlo code for particle transport. The resulting energy deposition is relevant to a wide variety of experimental applications, where electrons contribute to determine experimental observables either as primary or secondary particles.

High-precision experimental measurements [1]–[5] were performed at the Sandia National Laboratories, specifically for the validation of the ITS (Integrated Tiger Series) [6] simulation code: they concern electrons with energies ranging from a few tens of keV to 1 MeV and involve various target materials and electron incidence angles. These experimental data are still regarded as the most comprehensive reference for benchmarking the simulation of energy deposition by low energy electrons: they have been exploited in the validation [7]–[15] of numerous general-purpose Monte Carlo codes other than the ITS system, for which the measurements were originally intended, such as EGS [16], EGSnrc [17], Geant4 [18], [19], MCNP [20], MCNPX [21], and Penelope [22].

The validation of simulated electron energy deposition in [15] concerns two versions of Geant4, 8.1p02 and 9.1: the latter was the latest version available at the time when the article was written. Some differences in compatibility with experiment were observed between the two Geant4 versions, which were ascribed to evolutions in the Geant4 multiple scattering implementation.

Statements of improvements to Geant4 simulation of electromagnetic interactions have been reported in the literature [23]–[26] since the publication of [15], and a multiple scattering model specifically addressing the transport of electrons [27] has been introduced in the Geant4 toolkit. This paper documents quantitatively how these evolutions in the Geant4 electromagnetic physics domain affect the accuracy of the simulation of the energy deposited by low energy electrons: it reports comparisons between experimental data in [1]–[4] and simulations based on Geant4 versions from 9.1 to 9.6, which span five years of Geant4 development.

In this respect, it is worthwhile to note that several versions of Geant4 are actively used in the experimental community at any given time, not limited to the latest release: in fact, despite the fast release rate of Geant4 of one or two new versions per year, often complemented by correction patches, experimental projects usually require a stable simulation production environment for large portions of their life-cycle and retain a Geant4 version in their simulation productions for an extended period, even though new versions may become available in the meantime.

Besides the effects due to the evolution of Geant4 electromagnetic physics, this paper evaluates quantitatively another issue of experimental relevance: the sensitivity to physics modeling features in relation to the geometrical granularity of the detector.

The results of this validation analysis provide guidance to experimental users in optimizing the configuration of Geant4-based applications in scenarios concerned by the simulation of the energy deposited by low energy electrons. This investigation may be relevant also to high energy experiments, …

Manuscript received April 11, 2013; revised July 01, 2013; accepted July 02, 2013. Date of publication August 06, 2013; date of current version August 14, 2013. This work has been partly funded by CNPq Grant BEX6460/10-0, Brazil.
M. Batič was with INFN Sezione di Genova, I-16146 Genova, Italy. He is now with Sinergise, 1000 Ljubljana, Slovenia.
G. Hoff is with Pontificia Universidade Catolica do Rio Grande do Sul, Porto Alegre, Brazil.
M. G. Pia and P. Saracco are with the INFN Sezione di Genova, I-16146 Genova, Italy (e-mail: [email protected]; [email protected]).
G. Weidenspointner is with the Max-Planck-Institut für Extraterrestrische Physik, 85740 Garching, Germany, and also with the MPI Halbleiterlabor, 81739 München, Germany.
Digital Object Identifier 10.1109/TNS.2013.2272404

Page 6

How to facilitate negative improvements

c1 = bg2lim*sig0[iZ]  *(1. + hecorr[iZ]  *(beta2 - beta2lim))/bg2;
c2 = bg2lim*sig0[iZ+1]*(1. + hecorr[iZ+1]*(beta2 - beta2lim))/bg2;

static const double hecorr[15] = {120.70, 117.50, 105.00,  92.92,  79.23,
                                   74.510, 68.29,  57.39,  41.97,  36.14,
                                   24.53,  10.21,  -7.855, -16.84, -22.30};
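For contrast, a minimal refactoring sketch in C++ of what the same two statements and table could look like with named, documented constants; this is illustrative, not actual Geant4 code, and the physical meaning attributed to the array in the comments is an assumption.

#include <array>
#include <cstddef>

namespace msc {

// The fifteen values are copied from the hecorr array above; in a
// refactored version their provenance (fit, reference, units) would be
// documented here instead of being left as bare magic numbers.
constexpr std::size_t kNumTabulatedElements = 15;
constexpr std::array<double, kNumTabulatedElements> kHighEnergyCorrection = {
    120.70, 117.50, 105.00,  92.92,  79.23,
     74.510, 68.29,  57.39,  41.97,  36.14,
     24.53,  10.21,  -7.855, -16.84, -22.30};

// One named function replaces the duplicated c1/c2 expressions, making the
// shared formula and the role of each factor explicit.
inline double CorrectedCrossSection(double sig0, double correction,
                                    double bg2, double bg2lim,
                                    double beta2, double beta2lim) {
  return bg2lim * sig0 * (1.0 + correction * (beta2 - beta2lim)) / bg2;
}

}  // namespace msc

The two original assignments would then become two calls to msc::CorrectedCrossSection with indices iZ and iZ+1.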

[Class diagram: Compton scattering in Geant4 9.6-ref-07, 21 August 2013. Packages (legend): standard, lowenergy, polarisation, adjoint. Processes: standard::G4ComptonScattering and polarisation::G4PolarizedCompton, deriving from utils::G4VEmProcess and G4VDiscreteProcess. Models deriving from utils::G4VEmModel: standard::G4KleinNishinaCompton, standard::G4KleinNishinaModel, standard::G4HeatedKleinNishinaCompton, polarisation::G4PolarizedComptonModel (using polarisation::G4PolarizedComptonCrossSection and polarisation::G4VPolarizedCrossSection), lowenergy::G4LivermoreComptonModel, lowenergy::G4LivermoreComptonModifiedModel, lowenergy::G4LivermorePolarizedComptonModel, lowenergy::G4PenelopeComptonModel, lowenergy::G4LowEPComptonModel; plus adjoint::G4AdjointComptonModel deriving from G4VEmAdjointModel. Associations shown: -fAtomDeexcitation (to utils::G4VAtomDeexcitation, repeated across models), -currentModel, -crossSectionCalculator, -selectedModel, -theDirectEMProcess, -emModel.]

M. Fowler’s “stink parade” of code smells

…and more!

Page 7

What is bad may be good

validation: (A) […] (B) The process of providing evidence that the system, software, or hardware and its associated products satisfy requirements allocated to it at the end of each life cycle activity, solve the right problem (e.g., correctly model physical laws, implement business rules, and use the proper system assumptions), and satisfy intended use and user needs.

[Figures: efficiency vs. Geant4 version (9.1-9.6) for the EEDL-EPDL, Penelope and Standard models, for both experimental configurations; diagrams of the two experimental setups: a beam at a given angle on a calorimeter, and a beam at a given angle on a front layer at depth d backed by an infinite layer calorimeter.]

IEEE Standard 1012

Page 8

How good?

• Hardly any systematic, comprehensive, quantitative validation of the “ingredients” of Monte Carlo simulation codes is documented in the literature

• Comparison of simulation results and experimental data in use cases mainly rests on visual appraisal of figures or on indicators (%) deprived of any statistical relevance

[Word cloud around “validation”: “Agreement”, “Good agreement”, “Excellent agreement”, “Satisfactory agreement”.]

Page 9

Elusive goodness

… information is still fairly limited, and insufficient for a quantitative assessment of the accuracy and limitations of the different theoretical approaches. Figure 6 shows cross sections for ionization of the K shell of the elements Al, Ar, Ti, Cr, Ni, and Ge. Calculations are seen to agree reasonably with the experiments and the agreement is better for the most recent measurements, which are also expected to be more accurate.

Figure 7 displays calculated and measured cross sections for the K shell and L subshells of the elements Cu, Sr, Ag, Xe, W, Au, Pb, and Bi. For the K shells, when there is enough experimental information available, the degree of agreement between theory and experiments is seen to be similar to that of the light elements shown in Fig. 6. In the case of L subshells, the comparison is not conclusive because of the considerable uncertainties of experimental data. This is partially due to the fact that ionization cross sections are derived from measured x-ray generation cross sections. In this derivation, use is made of fluorescence yields and Coster-Kronig coefficients which, in the case of L subshells, are affected by large uncertainties (of the order of 20% or larger; see, e.g., Ref. [90]).

As indicated above, published measured data for ionization of inner shells by impact of positrons are very rare. Figure 8 shows a comparison of calculated and experimental cross sections for positron impact ionization of the K shells of copper and silver atoms. Again, due to the scarcity of data, it is not possible to draw any conclusions from this comparison. Figure 8 also displays calculated cross sections for ion…

[FIG. 5. Panels: U (Z = 92), 1s1/2 and 3p1/2; σ (barn) vs. E/U; curves CPWBA and PWBA. Caption: Total cross sections for ionization of the K and M2 shells of uranium atoms by impact of electrons. Circles represent results from our numerical CPWBA calculation, Eq. (98). The continuous curves are the results from Eq. (99), and the dashed curves represent the PWBA cross sections, obtained by integrating the energy-loss DCS given by Eq. (76). The vertical lines mark the matching energy, E = 16U.]

[FIG. 6. σ (barn) vs. projectile energy E (eV) for the 1s1/2 shells of Al, Ar, Ti, Cr, Ni and Ge. Caption: Total cross sections for electron-impact ionization of the K shell of the elements Al, Ar, Ti, Cr, Ni, and Ge, as functions of the energy E of the projectile. Curves are the results from the CPWBA, Eq. (99). Symbols represent experimental data; open circles [65,66], open squares [67], solid squares [68], and open diamonds [69] correspond to recent measurements. The solid circles and crosses are results from various groups (Refs. [70-89]), which were compiled by Liu et al. [4].]


Are these cross sections good?

Are they any better than existing calculations?

Fraction of test cases that “pass” the χ2 test

χ² test, significance α = 0.01: p-value ≥ α → pass; p-value < α → fail
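In code, the pass/fail criterion and the resulting efficiency amount to a few lines. The p-values in this C++ sketch are made-up placeholders, not the analysis data.

#include <cstdio>
#include <vector>

int main() {
  const double alpha = 0.01;  // significance level of the chi2 test
  // Hypothetical per-test-case p-values.
  const std::vector<double> pValues = {0.32, 0.004, 0.08, 0.0002, 0.65};
  int passed = 0;
  for (double p : pValues)
    if (p >= alpha) ++passed;  // p >= alpha: compatibility not rejected
  const double efficiency = double(passed) / pValues.size();
  std::printf("efficiency = %.2f (%d of %zu test cases pass)\n",
              efficiency, passed, pValues.size());
}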

[Figures: efficiency vs. cross section model (EEDL, BEB, DM, Bote) for K shell, L shell and M shell test cases.]

Test         K shell   L shell   M shell
Fisher       0.0098    0.0453    0.4283
Pearson χ²   0.0076    0.0362    n/a
Barnard      0.0079    0.0383    0.3519

Categorical analysis: EEDL vs. B/S (Bote and Salvat)
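Of the three tests above, Pearson's χ² on the 2×2 pass/fail contingency table is the simplest to reproduce; the sketch below shows it in C++ with illustrative counts, not the analysis data. Fisher's exact test and Barnard's test require the hypergeometric and supremum constructions not shown here.

#include <cmath>
#include <cstdio>

// Pearson chi2 for a 2x2 table {{a, b}, {c, d}} (rows: models, columns:
// passed/failed test cases); 1 degree of freedom, p = erfc(sqrt(chi2/2)).
double PearsonChi2P(double a, double b, double c, double d) {
  const double n = a + b + c + d;
  const double chi2 = n * std::pow(a * d - b * c, 2) /
                      ((a + b) * (c + d) * (a + c) * (b + d));
  return std::erfc(std::sqrt(chi2 / 2.0));
}

int main() {
  // Hypothetical counts: model 1 passes 20/30 cases, model 2 passes 9/30.
  std::printf("p = %.4f\n", PearsonChi2P(20, 10, 9, 21));
}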

D. Bote and F. Salvat, Calculations of inner-shell ionization by electron impact with the distorted-wave and plane-wave Born approximations, Phys. Rev. A 77, 042701, 2008

Full paper in preparation

preliminary

Page 10

State-of-the-art, quantified simulation

V&V UQ

Software design

Physics

Positive improvements

IEEE Std 1012-2012 (Revision of IEEE Std 1012-2004), IEEE Standard for System and Software Verification and Validation, sponsored by the Software & Systems Engineering Standards Committee (C/S2ESC), IEEE Computer Society, 3 Park Avenue, New York, NY 10016-5997, USA, 25 May 2012.

methods, technology, statistics, Physics

Page 11

IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 58, NO. 6, DECEMBER 2011

Validation of Proton Ionization Cross Section Generators for Monte Carlo Particle Transport
Matej Batic, Maria Grazia Pia, and Paolo Saracco

Abstract—Three software systems, ERCS08, ISICS 2011 and Smit's code, that implement theoretical calculations of inner shell ionization cross sections by proton impact, are validated with respect to experimental data. The accuracy of the cross sections they generate is quantitatively estimated and inter-compared through statistical methods. Updates and extensions of a cross section data library relevant to PIXE simulation with Geant4 are discussed.

Index Terms—Geant4, ionization, Monte Carlo, Particle Induced X-ray Emission (PIXE), simulation.

I. INTRODUCTION

THE calculation of inner shell ionization cross sections by proton and ion impact is an important component of the simulation of Particle Induced X-ray Emission (PIXE) and the analysis of experimental PIXE spectra. The Energy-loss Coulomb Perturbed Stationary State Relativistic (ECPSSR) [1] theory with its variants is regarded as the standard approach for cross section calculations in the domain of PIXE applications, which typically concern the energy range up to a few tens of MeV and the whole range of elements in the periodic system. It provides inner shell ionization cross sections for PIXE analysis codes such as GeoPIXE [2], GUPIX [3]–[5], PIXAN [6], PIXEF [7], PIXYKLM [8], Sapix [9], and TTPIXAN [10], and for specialized PIXE simulation codes [11]–[13].

Several cross section models for the computation of inner shell ionization by proton and particle impact are available in a package for PIXE simulation [14] released in the Geant4 [15], [16] 9.4 version; they include models based on the plane wave Born approximation (PWBA) [17], the ECPSSR model in a number of variants and a collection of empirical models, deriving from fits to experimental data. The PWBA and ECPSSR cross section models (with variants) exploit tabulations produced by the ISICS [18] code for K, L and M shells, which have been assembled in a data library [19] publicly distributed by RSICC (Radiation Safety Information Computational Center at the Oak Ridge National Laboratory).

A new version of ISICS and an entirely new code for the calculation of ECPSSR cross sections (with variants), ERCS08 [20], have become available since the publication of the previously cited paper on Geant4 PIXE simulation. This paper evaluates the cross sections deriving from these evolutions, along with those calculated by Smit's code [21] (identified in the following as KIO-LIO), which is consistent with the pristine ECPSSR formulation.

The K and L shell proton ionization cross sections produced by these three theoretical generators are compared with reference collections of experimental data to assess their validity, in compliance with the IEEE Standard for Software Verification and Validation [22]. The results of this validation process document quantitatively the relative merits of the three codes, evaluate the impact of the newly available calculations on Geant4 accuracy and identify the state-of-the-art of theoretical cross sections for PIXE simulation with Geant4.

Manuscript received May 18, 2011; revised August 25, 2011; accepted September 06, 2011. Date of publication October 13, 2011; date of current version December 14, 2011.
M. Batic is with INFN Sezione di Genova, I-16146 Genova, Italy, on leave from Jozef Stefan Institute, 1000 Ljubljana, Slovenia (e-mail: [email protected]).
M. G. Pia and P. Saracco are with INFN Sezione di Genova, I-16146 Genova, Italy (e-mail: [email protected]; [email protected]).
Digital Object Identifier 10.1109/TNS.2011.2168539

II. THEORETICAL OVERVIEW

In the PWBA approach [17], the first-order Born approximation is used in scattering theory to describe the interaction between an incident charged particle and an atomic target. This treatment is justified when the atomic number of the projectile is much smaller than the atomic number of the target, and the velocity of the incident particle is much larger than the velocities of the target-atom electrons.

The PWBA cross section in the center of mass system for the ionization of a given shell is given by

\sigma^{\mathrm{PWBA}} = \frac{\sigma_0}{\theta} \, F\!\left(\frac{\eta}{\theta^2}, \theta\right)    (1)

where

\sigma_0 = 8 \pi a_0^2 \, \frac{Z_1^2}{Z_2^4}    (2)

a_0 is the Bohr radius, Z_1 is the projectile atomic number, Z_2 is the effective atomic number of the target atom, F is the reduced universal function, and the reduced atomic electron binding energy and reduced projectile energy are given by

\theta = \frac{n^2 U_2}{Z_2^2 R}    (3)

\eta = \frac{m_e}{M_1} \, \frac{E_1}{Z_2^2 R}    (4)

with E, M and U representing the energy, mass and atomic binding energy of the projectile and the target, respectively identified by the indices 1 and 2; n is the principal quantum number of the shell, m_e the electron mass, and R the Rydberg energy. The analytical formulation of the reduced universal function can be found in [18].
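As a numerical illustration of Eqs. (3) and (4), the following C++ sketch evaluates the reduced parameters for a sample case; the K-shell screening choice (Z2 = Z - 0.3) and the example beam are assumptions for illustration, not values from the paper.

#include <cstdio>

constexpr double kRydberg_eV = 13.605693;        // Rydberg energy R
constexpr double kElectronMass_MeV = 0.51099895;
constexpr double kProtonMass_MeV = 938.27209;

// Reduced binding energy, Eq. (3): theta = n^2 U2 / (Z2^2 R).
double ReducedBindingEnergy(int n, double Z2, double U2_eV) {
  return n * n * U2_eV / (Z2 * Z2 * kRydberg_eV);
}

// Reduced projectile energy, Eq. (4): eta = (m_e / M1) E1 / (Z2^2 R).
double ReducedProjectileEnergy(double E1_eV, double M1_over_me, double Z2) {
  return E1_eV / (M1_over_me * Z2 * Z2 * kRydberg_eV);
}

int main() {
  // Example: 3 MeV protons on the copper K shell (U2 ~ 8979 eV; the
  // screened charge Z2 = Z - 0.3 is the usual K-shell convention).
  const double Z2 = 29.0 - 0.3;
  const double theta = ReducedBindingEnergy(1, Z2, 8979.0);
  const double eta = ReducedProjectileEnergy(
      3.0e6, kProtonMass_MeV / kElectronMass_MeV, Z2);
  std::printf("theta = %.3f, eta = %.4f\n", theta, eta);
}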

The ECPSSR theory [1] was proposed to address the shortcomings of the PWBA approach in the energy range relevant to PIXE experimental practice (approximately up to a few tens of MeV); it accounts for the energy loss and Coulomb deflection …


IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 60, NO. 4, AUGUST 2013

Validation of Geant4-Based Radioactive Decay Simulation
Steffen Hauf, Markus Kuster, Matej Batič, Zane W. Bell, Dieter H. H. Hoffmann, Philipp M. Lang, Stephan Neff, Maria Grazia Pia, Georg Weidenspointner, and Andreas Zoglauer

Abstract—Radioactive decays are of concern in a wide variety of applications using Monte-Carlo simulations. In order to properly estimate the quality of such simulations, knowledge of the accuracy of the decay simulation is required. We present a validation of the original Geant4 Radioactive Decay Module, which uses a per-decay sampling approach, and of an extended package for Geant4-based simulation of radioactive decays, which, in addition to being able to use a refactored per-decay sampling, is capable of using a statistical sampling approach. The validation is based on measurements of calibration isotope sources using a high purity Germanium (HPGe) detector; no calibration of the simulation is performed. For the considered validation experiment equivalent simulation accuracy can be achieved with per-decay and statistical sampling.

Index Terms—Geant4, high purity germanium detector, radioactive decay, simulation, validation.

I. INTRODUCTION

RADIOACTIVE decays and the resulting radiation play an important role for many experiments, either as an observable, as a background source, or as a radiation hazard. Detailed knowledge of the radiation in and around an experiment and its detectors is thus required for successful measurements and the safety of the experimentalist. At the same time the increasing complexity of experiments often makes it prohibitively expensive, if not impossible, to completely determine an experiment's radiation characteristics and response from measurements alone. This is especially true during the design phase of a new project—much of the hardware may either not be available or the environmental conditions the experiment will be subjected to cannot be replicated (e.g. in space-based observatories). In order to circumvent these limitations it has become increasingly common to estimate an experiment's radiation and response characteristics with the help of computer simulations. General-purpose Monte-Carlo systems such as Geant4 [1], [2] are frequently the tool of choice for such simulations, as they allow for the modeling of complex geometries and a wide range of physical processes. The accuracy of these simulation-derived estimates is determined by the accuracy of the individual contributing processes, which in turn is validated by measurements.

Manuscript received January 31, 2013; revised June 03, 2013; accepted June 13, 2013. Date of publication August 06, 2013; date of current version August 14, 2013. This work has been supported by Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR) under grants 50 QR 0902 and 50 QR 1102.
S. Hauf and M. Kuster are with European XFEL GmbH, Hamburg, Germany (e-mail: [email protected]).
M. Batič and M. G. Pia are with the INFN Genova, Genova, Italy.
Z. W. Bell is with the Oak Ridge National Laboratory, Oak Ridge, TN, USA.
D. H. H. Hoffmann, P. M. Lang, and S. Neff are with the Institute for Nuclear Sciences, TU Darmstadt, Darmstadt, Germany.
G. Weidenspointner is with the Max-Planck Halbleiterlabor, Munich, Germany, and the Max-Planck Institut für extraterrestrische Physik, Garching, Germany.
A. Zoglauer is with the Space Science Laboratory, University of California, Berkeley, CA, USA.
Digital Object Identifier 10.1109/TNS.2013.2271047

The radioactive decay code implemented in the released version of Geant4 has been previously compared to experimental measurements in a number of works, e.g. [3]–[6]. In these comparisons the simulated detector geometry is usually calibrated to the experiment in an iterative process so that the simulation better models the measurements. Whereas this method can produce simulation results consistent with experimental data within reasonable error margins determined by the experimental setup, it obfuscates how well a noncalibrated geometry can be simulated by Geant4. The ability to run absolute models is important if simulations are used to aid in the development of new detectors: in that context hardware, and thus measurements to compare against, do not exist.

A validation addressing this issue, like the one presented in this work, requires a self-consistent approach, i.e. only known experimental properties are used as simulation input. Accordingly, we have refrained from iteratively optimizing the detector geometry, with the sole exception of the simulated calibration source itself—a topic discussed in Section IV-B.

The following sections provide an overview of the software for the simulation of radioactive decays in the Geant4 environment, discuss the strategy adopted for the validation process, and document the experimental measurements and the corresponding simulation. The validation results are divided into a comparison of the photo peaks in Section V-A and the complete spectra including the continuum in Section V-B.

II. RADIOACTIVE DECAYS IN GEANT4

A package for the simulation of radioactive decays [7], [8] has been available in the Geant4 simulation toolkit since version 2.0, in which it was named radiative_decay. Since Geant4 version 6.0 it has been named radioactive_decay, although it is conventionally known as Geant4 RDM (Radioactive Decay Module), and identified as "original RDM" in the following. Radioactive decays are simulated in this package by a process implemented in the G4RadioactiveDecay class, which samples secondary particles on a per-decay basis. The software implementation involves the optional use of event biasing techniques. The simulation of radioactive decays is data driven, using a reprocessed ENSDF library [9]. From this library the algorithm samples any direct decay emission (i.e. …
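The difference between the two sampling strategies named in the abstract can be sketched in a few lines of C++; this is an illustrative toy with a made-up two-line emitter, not the actual Geant4 RDM or its extension.

#include <cstdio>
#include <random>
#include <vector>

struct Line { double energy_keV; double intensityPerDecay; };

// Hypothetical emission spectrum of a two-line source.
const std::vector<Line> kLines = {{121.8, 0.28}, {344.3, 0.27}};

int main() {
  std::mt19937 rng(42);
  std::uniform_real_distribution<double> u(0.0, 1.0);

  // Per-decay sampling: one simulated decay may emit zero, one or several
  // photons, each according to its per-decay intensity.
  for (const Line& l : kLines)
    if (u(rng) < l.intensityPerDecay)
      std::printf("per-decay photon: %.1f keV\n", l.energy_keV);

  // Statistical sampling: each event draws a single photon from the
  // intensity-weighted aggregate spectrum.
  double total = 0.0;
  for (const Line& l : kLines) total += l.intensityPerDecay;
  double r = u(rng) * total;
  for (const Line& l : kLines) {
    if ((r -= l.intensityPerDecay) <= 0.0) {
      std::printf("statistical photon: %.1f keV\n", l.energy_keV);
      break;
    }
  }
}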


2934 IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 60, NO. 4, AUGUST 2013

Validation of Geant4 Simulation of ElectronEnergy Deposition

Matej Batiþ, Gabriela Hoff, Maria Grazia Pia, Paolo Saracco, and Georg Weidenspointner

Abstract—Geant4-based simulations of the energy deposited byelectrons in various materials are quantitatively compared to high-precision calorimetric measurements taken at Sandia Laborato-ries. The experimental data concern electron beams of energy be-tween a few tens of kilolectron volt and 1 MeV at various inci-dence angles. Two experimental scenarios are evaluated: the longi-tudinal energy deposition pattern in a finely segmented detector,and the total energy deposited in a larger size calorimeter. Thesimulations are produced with Geant4 versions from 9.1 to 9.6;they involve models of electron–photon interactions in the stan-dard and low energy electromagnetic packages, and various imple-mentations of electron multiple scattering. Significant differencesin compatibility with experimental data are observed in the lon-gitudinal energy deposition patterns produced by the examinedGeant4 versions, while the total deposited energy exhibits smallervariations across the various Geant4 versions, with the exceptionGeant4 9.4. The validation analysis, based on statistical methods,shows that the best compatibility between simulation and experi-mental energy deposition profiles is achieved using electromagneticmodels based on the EEDL and EPDL evaluated data librarieswith Geant4 9.1. The results document the accuracy achievable inthe simulation of the energy deposited by low energy electrons withGeant4; they provide guidance for application in similar experi-mental scenarios and for improving Geant4.

Index Terms—Dosimetry, electrons, Geant4, Monte Carlo, sim-ulation.

I. INTRODUCTION

T HE simulation of the interactions with matter of electronsand of their secondary particles is one of the main tasks

of any Monte Carlo codes for particle transport. The resultingenergy deposition is relevant to a wide variety of experimentalapplications, where electrons contribute to determine experi-mental observables either as primary or secondary particles.High-precision experimental measurements [1]–[5] were

performed at the Sandia National Laboratories, specificallyfor the validation of the ITS (Integrated Tiger Series) [6]

Manuscript received April 11, 2013; revised July 01, 2013; accepted July 02,2013. Date of publication August 06, 2013; date of current version August 14,2013. This work has been partly funded by CNPq Grant BEX6460/10-0, Brazil.M. Batiþ was with INFN Sezione di Genova, I-16146 Genova, Italy. He is

now with Sinergise, 1000 Ljubljana, Slovenia.G. Hoff is with Pontificia Universidade Catolica do Rio Grande do Sul, Porto

Alegre, Brazil.M. G. Pia and P. Saracco are with the INFN Sezione di Genova, I-16146

Genova, Italy (e-mail: [email protected]; [email protected]).G. Weidenspointner is with the Max-Planck-Institut Für Extraterrestrische

Physik, 85740 Garching, Germany, and also with the MPI Halbleiterlabor,81739 München, Germany.Color versions of one or more of the figures in this paper are available online

at http://ieeexplore.ieee.org.Digital Object Identifier 10.1109/TNS.2013.2272404

simulation code: they concern electrons with energies rangingfrom a few tens of keV to 1 MeV and involve various targetmaterials and electron incidence angles. These experimentaldata are still regarded as the most comprehensive referencefor benchmarking the simulation of energy deposition by lowenergy electrons: they have been exploited in the validation[7]–[15] of numerous general-purpose Monte Carlo codesother than the ITS system, for which the measurements wereoriginally intended, such as EGS [16], EGSnrc [17], Geant4[18], [19], MCNP [20], MCNPX [21], and Penelope [22].The validation of simulated electron energy deposition in [15]

The validation of simulated electron energy deposition in [15] concerns two versions of Geant4, 8.1p02 and 9.1: the latter was the latest version available at the time when the article was written. Some differences in compatibility with experiment were observed between the two Geant4 versions, which were ascribed to evolutions in the Geant4 multiple scattering implementation.

Statements of improvements to Geant4 simulation of electromagnetic interactions have been reported in the literature [23]–[26] since the publication of [15], and a multiple scattering model specifically addressing the transport of electrons [27] has been introduced in the Geant4 toolkit. This paper documents quantitatively how these evolutions in the Geant4 electromagnetic physics domain affect the accuracy of the simulation of the energy deposited by low energy electrons: it reports comparisons between experimental data in [1]–[4] and simulations based on Geant4 versions from 9.1 to 9.6, which span five years of Geant4 development.

In this respect, it is worthwhile to note that several versions of Geant4 are actively used in the experimental community at any given time, not limited to the latest release: in fact, despite the fast release rate of Geant4 of one or two new versions per year, often complemented by correction patches, experimental projects usually require a stable simulation production environment for large portions of their life-cycle and retain a Geant4 version in their simulation productions for an extended period, even though new versions may become available in the meantime.

Besides the effects due to the evolution of Geant4 electromagnetic physics, this paper evaluates quantitatively another issue of experimental relevance: the sensitivity to physics modeling features in relation to the geometrical granularity of the detector.

The results of this validation analysis provide guidance to experimental users in optimizing the configuration of Geant4-based applications in scenarios concerned by the simulation of the energy deposited by low energy electrons. This investigation may be relevant also to high energy experiments, …


IEEE Transactions on Nuclear Science, vol. 58, no. 6, December 2011

Evaluation of Atomic Electron Binding Energies for Monte Carlo Particle Transport

Maria Grazia Pia, Hee Seo, Matej Batic, Marcia Begalli, Chan Hyeong Kim, Lina Quintieri, and Paolo Saracco

Abstract—A survey of atomic binding energies used by general purpose Monte Carlo systems is reported. Various compilations of these parameters have been evaluated; their accuracy is estimated with respect to experimental data. Their effects on physical quantities relevant to Monte Carlo particle transport are highlighted: X-ray fluorescence emission, electron and proton ionization cross sections, and Doppler broadening in Compton scattering. The effects due to different binding energies are quantified with respect to experimental data. Among the examined compilations, EADL is found in general a less suitable option to optimize simulation accuracy; other compilations exhibit distinctive capabilities in specific applications, although in general their effects on simulation accuracy are rather similar. The results of the analysis provide quantitative ground for the selection of binding energies to optimize the accuracy of Monte Carlo simulation in experimental use cases. Recommendations on software design dealing with these parameters and on the improvement of data libraries for Monte Carlo simulation are discussed.

Index Terms—Geant4, ionization, Monte Carlo, PIXE, simulation, X-ray fluorescence.

I. INTRODUCTION

THE simulation of particle interactions in matter involves a number of atomic physics parameters, whose values affect physics models applied to particle transport and experimental observables calculated by the simulation. Despite the fundamental character of these parameters, a consensus has not always been achieved about their values, and different Monte Carlo codes use different sets of parameters.

Atomic parameters are especially relevant to simulation scenarios that are sensitive to detailed modeling of the properties of the interacting medium. Examples include the generation of characteristic lines resulting from X-ray fluorescence or Auger electron emission, and precision simulation studies, such as microdosimetry, that involve the description of particle interactions with matter down to energies comparable with the scale of atomic binding energies.
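As an illustration of how a binding energy compilation propagates to a simulated observable, consider the energy of a Kα1 fluorescence line, which in the simplest approximation is the difference between the K and L3 binding energies. The sketch below is ours, not the paper's; the copper values are approximate round numbers standing in for whatever compilation (EADL, Bearden and Burr, etc.) a code adopts.

// Fluorescence line energy from binding energy differences:
//   E(K-alpha1) ~ BE(K) - BE(L3).
// Illustrative sketch; the binding energies below are approximate values for
// copper and do not come from any specific compilation discussed in the paper.
#include <iostream>

int main() {
    const double bindingK  = 8979.0;  // Cu K-shell binding energy, eV (approximate)
    const double bindingL3 = 932.7;   // Cu L3-subshell binding energy, eV (approximate)
    const double kAlpha1 = bindingK - bindingL3;  // ~8046 eV
    // A shift of a few eV in either binding energy moves the simulated
    // fluorescence line by the same amount: this is how the choice of
    // compilation propagates to X-ray observables.
    std::cout << "E(K-alpha1, Cu) ~ " << kAlpha1 << " eV" << std::endl;
    return 0;
}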

Simulation in these domains has been for an extended time the object of specialized Monte Carlo codes; some general purpose Monte Carlo systems have devoted attention to these areas, introducing functionality for the simulation of fluorescence, PIXE (Particle Induced X-ray Emission) and microdosimetry. In this context, emphasis has been placed on the development and validation of the physics models implemented in the simulation systems, while relatively limited effort has been invested into verifying the adequacy of the atomic parameters used by general purpose Monte Carlo codes with regard to the requirements of new application domains.

This paper surveys atomic binding energies used by well known Monte Carlo systems, including EGS [1], EGSnrc [2], Geant4 [3], [4], ITS (Integrated Tiger Series) [5], MCNP/MCNPX [6], [7] and Penelope [8], and by some specialized physics codes. These software systems use a variety of compilations of binding energies, which are derived from experimental data or theoretical calculations; this paper investigates their accuracy and their effects on simulations.

II. COMPILATIONS OF ELECTRON BINDING ENERGIES

The binding energies considered in this study concern neutral atoms in their ground state; several compilations of their values, of experimental and theoretical origin, are available in the literature.

Compilations based on experimental data are the result of the application of selection, evaluation, manipulations (like interpolation and extrapolation) and semi-empirical criteria to available experimental measurements to produce a set of reference values covering the whole periodic system of the elements and the complete atomic structure of each element.

Most of the collections of electron binding energies based on experimental data derive from a review published by Bearden and Burr in 1967 [9]. Later compilations introduced further refinements in the evaluation of experimental data and the calculation of binding energies for which no measurements were available; they also accounted for new data taken after the publication of Bearden and Burr's review.

Experimental atomic binding energies can be affected by various sources of systematic effects; they originate not only from the use of different experimental techniques in the measurements, but also from physical effects: for instance, binding energies of elements in the solid state are different from those of free atoms.


IEEE Transactions on Nuclear Science, vol. 59, no. 4, August 2012

Photon Elastic Scattering Simulation: Validation and Improvements to Geant4

Matej Batić, Gabriela Hoff, Maria Grazia Pia, and Paolo Saracco

Abstract—Several models for the simulation of photon elastic scattering are quantitatively evaluated with respect to a large collection of experimental data retrieved from the literature. They include models based on the form factor approximation, on S-matrix calculations and on analytical parameterizations; they exploit publicly available data libraries and tabulations of theoretical calculations. Some of these models are currently implemented in general purpose Monte Carlo systems; some have been implemented and evaluated for the first time in this paper for possible use in Monte Carlo particle transport. The analysis mainly concerns the energy range between 5 keV and a few MeV. The validation process identifies the newly implemented model based on second order S-matrix calculations as the one best reproducing experimental measurements. The validation results show that, along with Rayleigh scattering, additional processes, not yet implemented in Geant4 nor in other major Monte Carlo systems, should be taken into account to realistically describe photon elastic scattering with matter above 1 MeV. Evaluations of the computational performance of the various simulation algorithms are reported along with the analysis of their physics capabilities.

Index Terms—Geant4, Monte Carlo, simulation, X-rays.

I. INTRODUCTION

PHOTON elastic scattering is important in various experimental domains, such as material analysis applications, medical diagnostics and imaging [1]; more generally, elastic interactions contribute to the determination of photon mass attenuation coefficients, which are widely used parameters in medical physics and radiation protection [2]. In the energy range between approximately 1 keV and a few MeV, which is the object of this paper, the resolution of modern detectors, high intensity synchrotron radiation sources and, in recent years, the availability of resources for large scale numerical computations have concurred to build a wide body of knowledge on photon elastic scattering. Extensive reviews of photon elastic scattering, covering both its theoretical and experimental aspects, can be found in the literature (e.g., [3]–[5]).


This paper addresses this topic under a pragmatic perspective: its simulation in general purpose Monte Carlo codes for particle transport. Photon interactions with matter, both elastic and inelastic, play a critical role in these systems; their modeling presents some peculiarities, because the software must satisfy concurrent requirements of physical accuracy and computational performance.

Photon-atom elastic scattering encompasses various interactions, but Rayleigh scattering, i.e., scattering from bound electrons, is the dominant contribution in the low energy régime and, as energy increases, remains dominant in a progressively smaller angular range of forward scattering. Rayleigh scattering models are implemented in all general-purpose Monte Carlo systems; comparison studies have highlighted discrepancies among some of them [6], nevertheless a comprehensive, quantitative appraisal of their validity is not yet documented in the literature. It is worthwhile to note that the validation of simulation models implies their comparison with experimental measurements [7]; comparisons with tabulations of theoretical calculations or analytical parameterizations, such as those that are reported in [8] as validation of Geant4 [9], [10] photon cross sections, do not constitute a validation of the simulation software.
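To make the form factor approximation mentioned above concrete, the sketch below computes the Rayleigh differential cross section as the Thomson cross section modulated by the squared atomic form factor, dσ/dΩ = (r_e²/2)(1 + cos²θ) F(x, Z)², with x = sin(θ/2)/λ the momentum transfer variable. This is our illustration, not code from Geant4 or from this paper: formFactor() is a hypothetical placeholder for the interpolation of a tabulated library such as EPDL.

// Rayleigh differential cross section in the form factor approximation.
// Illustrative sketch: formFactor() stands in for the interpolation of
// tabulated values; its screening-like fall-off is invented for illustration.
#include <cmath>
#include <iostream>

const double kElectronRadiusCm = 2.8179403e-13;  // classical electron radius, cm
const double kHcKeVAngstrom   = 12.39842;        // h*c, keV * Angstrom

double formFactor(double x, int Z) {
    return Z / (1.0 + 40.0 * x * x);  // F(0) = Z; placeholder fall-off only
}

// dsigma/dOmega in cm^2/sr for photon energy eKeV (keV) and angle theta (rad).
double rayleighDifferentialXS(double eKeV, double theta, int Z) {
    const double lambda = kHcKeVAngstrom / eKeV;      // photon wavelength, Angstrom
    const double x = std::sin(0.5 * theta) / lambda;  // momentum transfer variable
    const double f = formFactor(x, Z);
    const double c = std::cos(theta);
    return 0.5 * kElectronRadiusCm * kElectronRadiusCm * (1.0 + c * c) * f * f;
}

int main() {
    const double pi = std::acos(-1.0);
    // Example: 50 keV photon on copper (Z = 29) scattered at 30 degrees.
    std::cout << rayleighDifferentialXS(50.0, pi / 6.0, 29) << " cm^2/sr" << std::endl;
    return 0;
}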

This paper evaluates the models adopted by general-purpose Monte Carlo systems and other modeling approaches not yet implemented in these codes, to identify the state-of-the-art in the simulation of photon elastic scattering. Computational performance imposes constraints on the complexity of physics calculations to be performed in the course of simulation: hence the analysis is limited to theoretical models for which tabulations of pre-calculated values are available, or that are expressed by means of simple analytical formulations. To be relevant for general purpose Monte Carlo systems, tabulated data should cover the whole periodic table of elements and an extended energy range. The accuracy of elastic scattering simulation models is quantified with statistical methods comparing them with a wide collection of experimental data retrieved from the literature; the evaluation of physics capabilities is complemented by the estimate of their computational performance. These results provide guidance for the selection of physics models in simulation applications in response to the requirements of physics accuracy and computational speed pertinent to different experimental scenarios.

Special emphasis is devoted to the validation and possible improvement of photon elastic scattering modeling in Geant4; nevertheless, the results documented in this paper can be of more general interest also for other Monte Carlo systems.


…and more!


Maria Grazia Pia, INFN Genova 12

Supporting tools

IEEE Transactions on Nuclear Science, vol. 51, no. 5, October 2004

A Goodness-of-Fit Statistical Toolkit

G. A. P. Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, M. G. Pia, A. Pfeiffer, A. Ribon, and P. Viarengo

Abstract—Statistical methods play a significant role throughout the life-cycle of physics experiments, being an essential component of physics analysis. The present project in progress aims to develop an object-oriented software Toolkit for statistical data analysis. The Toolkit contains a variety of Goodness-of-Fit (GoF) tests, from Chi-squared to Kolmogorov–Smirnov, to less known, but generally much more powerful tests such as Anderson–Darling, Goodman, Fisz–Cramer–von Mises, Kuiper. Thanks to the component-based design and the usage of the standard abstract interfaces for data analysis, this tool can be used by other data analysis systems or integrated in experimental software frameworks.

In this paper we describe the statistical details of the algorithms and the computational features of the Toolkit. With the aim of showing the consistency between the code and the mathematical features of the algorithms, we describe the results we obtained reproducing by means of the Toolkit a couple of Goodness-of-Fit testing examples of relevance in the statistics literature.

Index Terms—Data analysis, distributions comparison, Goodness-of-Fit (GoF) tests, software, statistics.

I. INTRODUCTION

STATISTICAL comparison of distributions is an essential section of data analysis in any field and in particular in high energy physics experiments. In spite of this, only a few basic tools for statistical analysis were available in the public domain FORTRAN libraries for high energy physics, such as CERN Libraries (CERNLIB) [1] and PAW [2]. Nowadays the situation is unchanged even among the libraries for data analysis of the new generation, such as for instance Anaphe [3], JAS [4], Open Scientist [5] and Root [6].

For these reasons a new project was launched to build an open-source and up-to-date object-oriented Toolkit for high energy physics experiments and other applications.

In this paper we will focus our attention on a specific component of the statistical Toolkit that we developed. The Statistical Comparison component of the Toolkit provides some Goodness-of-Fit (GoF) algorithms for the comparison of data distributions in a variety of use cases typical of physics experiments, such as:

— regression testing (in various phases of the software life-cycle),
— validation of simulation through comparison with experimental data,
— comparison of expected versus reconstructed distributions,
— comparison of data from different sources, such as different sets of experimental data, or experimental with respect to theoretical distributions.

The GoF Toolkit gathers together some of the most important GoF tests available in the literature. Its aim is to provide a wide set of algorithms in order to test the compatibility of the distributions of two variables.

II. THE GOODNESS-OF-FIT STATISTICAL TOOLKIT

The GoF tests measure the compatibility of a random sample with a theoretical probability distribution function, or between the empirical distributions of two different populations coming from the same theoretical distribution. From a general point of view, the aim may consist also in testing whether the distributions of two random variables are identical against the alternative that they differ in some way.

Specifically, consider two real-valued random variables $X$ and $Y$, and let $x_1, \dots, x_n$ be a sample of independent and identically distributed observations with cumulative distribution functions (cdf) $F(x)$ and $G(y)$. If $X$ is a discrete random variable, then the cdf of $X$ will be discontinuous at the points $x_i$ and constant in between; if $X$ is a continuous random variable, the cdf of $X$ will be continuous. If we consider that $X$ and $Y$ may be continuous or discrete, the GoF tests consist in testing the null hypothesis

$$H_0 : F(x) = G(x) \quad \forall x \qquad (1)$$

against the alternative hypothesis

$$H_1 : F(x) \neq G(x) \ \text{for some } x \qquad (2)$$

Under the assumption $H_0$, the fraction of wrongly rejected experiments is typically fixed to a few percent. In all the tests presented below, $H_0$ is rejected when a pre-defined test statistic is large with respect to some conventional value.
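As a concrete example of such a test, the sketch below implements a plain two-sample Kolmogorov–Smirnov comparison. It is a minimal illustration assumed by us, not the Toolkit's implementation, and the 5% critical value uses the standard asymptotic approximation.

// Two-sample Kolmogorov-Smirnov test (illustrative sketch only).
// D = sup_x |F_n(x) - G_m(x)|, computed by sweeping the two sorted samples
// with a pair of indices (tied values are advanced together).
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

double ksStatistic(std::vector<double> x, std::vector<double> y) {
    std::sort(x.begin(), x.end());
    std::sort(y.begin(), y.end());
    std::size_t i = 0, j = 0;
    double d = 0.0;
    while (i < x.size() && j < y.size()) {
        const double xv = x[i];
        const double yv = y[j];
        if (xv <= yv) while (i < x.size() && x[i] == xv) ++i;
        if (yv <= xv) while (j < y.size() && y[j] == yv) ++j;
        const double diff = std::fabs(double(i) / x.size() - double(j) / y.size());
        if (diff > d) d = diff;
    }
    return d;
}

int main() {
    const std::vector<double> sample1 = {0.11, 0.29, 0.47, 0.68, 0.90};
    const std::vector<double> sample2 = {0.18, 0.37, 0.58, 0.79, 0.98};
    const double n = sample1.size(), m = sample2.size();
    const double d = ksStatistic(sample1, sample2);
    // Asymptotic 5% critical value: c(0.05) * sqrt((n + m) / (n * m)), c ~ 1.358.
    const double dCrit = 1.358 * std::sqrt((n + m) / (n * m));
    std::cout << "D = " << d
              << (d > dCrit ? " -> reject H0" : " -> compatible with H0") << std::endl;
    return 0;
}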

The tests are classified in distribution-dependent (parametric) tests and distribution-free (non-parametric) tests. In the case of weak, very general assumptions, the latter are in principle preferable, because they can be adapted to arbitrary distributions.


N45-8, Data Analysis with R in an Experimental Physics Environment

IEEE Transactions on Nuclear Science, vol. 53, no. 6, December 2006

New Developments of the Goodness-of-Fit Statistical Toolkit

Barbara Mascialino, Andreas Pfeiffer, Maria Grazia Pia, Alberto Ribon, and Paolo Viarengo

Abstract—The Statistical Toolkit is a project for the development of open source software tools for statistical data analysis in experimental particle and nuclear physics. The second development cycle encompassed an extension of the software functionality and new tools to facilitate its usage in experimental environments. The new developments include additional goodness-of-fit tests, new implementations of existing tests to improve their statistical precision or computational performance, a new component to extend the usability of the toolkit with other data analysis systems, and new tools for an easier configuration and build of the system in the user's computing environment. The computational performance of all the algorithms implemented has been studied.

Index Terms—Data analysis, data comparison, goodness-of-fit testing, software, Statistical Toolkit.

I. INTRODUCTION

THE comparison of data distributions with respect to other reference data or functions is a common problem in experimental physics: some typical cases are the validation of simulation results against experimental data, the evaluation of physical quantities reconstructed by the experiment's software against theoretically expected ones, or monitoring the behaviour of a particle detector with respect to its nominal operation reference. Moreover, the regression testing of an experiment's software usually involves some comparisons of data distributions to monitor the software stability or to verify its evolution.

A recent project, named the Statistical Toolkit [1], undertook the development of an open source software system for the comparison of data distributions, especially addressing, but not limited to, applications in particle and nuclear physics. This project is characterized by an iterative and incremental software process, according to established best practices in software development [2]; the first development cycle is documented in [1]. This paper describes the new developments and improvements, which have been released [3] for public usage in version 2. The new features available respond to experimental user requirements. The first two development cycles concern the comparison of two samples of one-dimensional distributions.


Several goodness-of-fit tests have been added to the already extensive collection available in the first released version; some of them introduce new weighted formulations of established tests, for the first time available in a software tool for data analysis. Other tests have been significantly improved, either in their mathematical algorithms or in their computational performance. New developments in the architectural user layer have extended the usability of the Statistical Toolkit, addressing in particular the requirements for data analysis in high energy physics experiments. A significant redesign of the supporting software tools package facilitates the configuration of the system in the user's own computing environment.

The paper also reports a comparative analysis of the computing performance of all the algorithms implemented: these results provide useful guidance to experimental users to select the test appropriate to their requirements among those available in the toolkit. A comparative study of the statistical performance of the various goodness-of-fit tests is the object of current research activity [4], [5]; it will be documented in a dedicated paper.

II. OVERVIEW OF THE GOODNESS-OF-FIT STATISTICAL TOOLKIT

The Statistical Toolkit is a software system for statistical data analysis; it is especially targeted to common applications in experimental nuclear and particle science. It exploits object-oriented technology and generic programming techniques; it is implemented in C++. Its life-cycle is based on the iterative-incremental model of the Unified Process [2]; the process framework adopted emphasizes the role of the software architecture and the relevance of use cases in the software development.

The Statistical Toolkit adopts a component-based architecture, which facilitates its usage in association with other data analysis software systems widely used in particle physics experiments; its sound object-oriented design makes it open to extension and evolution.

A. Statistical Background

Goodness-of-fit testing provides the mathematical foundation for a rigorous, quantitative evaluation of the compatibility of two data samples, or of a data sample against a reference function. A detailed overview of goodness-of-fit testing is reported in [1]; the brief summary below is meant to introduce the basic mathematical concepts involved in the software developments described in the following sections.

Let $X$ and $Y$ be two real-valued random variables, and let $x_1, \dots, x_n$ and $y_1, \dots, y_m$ be, respectively, two samples of independent and identically distributed observations with empirical distribution functions $F_n$ and $G_m$. For every $x$ the empirical distribution function $F_n(x)$ denotes the observed fraction of sample values that do not exceed $x$.
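A minimal sketch of the empirical distribution function just defined (our illustration, not Toolkit code): $F_n(x)$ is simply the fraction of sample values not exceeding $x$.

// Empirical distribution function F_n(x). Illustrative sketch only.
#include <algorithm>
#include <iostream>
#include <vector>

double ecdf(std::vector<double> sample, double x) {
    std::sort(sample.begin(), sample.end());
    // Count observations <= x with a binary search on the sorted sample.
    const auto it = std::upper_bound(sample.begin(), sample.end(), x);
    return static_cast<double>(it - sample.begin()) / sample.size();
}

int main() {
    const std::vector<double> sample = {0.3, 0.1, 0.4, 0.1, 0.5};
    std::cout << "F_n(0.25) = " << ecdf(sample, 0.25) << std::endl;  // 2/5 = 0.4
    return 0;
}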


Enabling quantitative validation of Monte Carlo codes

Monte Carlo simulation as quantitative science


Maria Grazia Pia, INFN Genova 13

Monte Carlo simulation as a science

A collaborative effort across different Monte Carlo codes for the validation of the “bricks” would be beneficial to the experimental community

…ends fall far short of what's needed. The bigger and more complex the code, the harder it is to verify and validate. A sophisticated climate-modeling code might include models for 50 or more effects—for example, ocean evaporation, currents, salinity, seasonal albedo variation, and so forth. The code would predict observables such as mean surface temperature and precipitation levels. The accuracy of the predictions depends on the validity of each component model, the completeness of the set of all the models, the veracity of the solution method, the interaction between component models, the quality of the input data, the grid resolution, and the user's ability to correctly set up the problem, run it, and interpret the results.

In practice, one must first verify and validate each component, and then do the same for progressively larger ensembles of interacting components until the entire integrated code has been verified and validated for the problem regimes of interest.9

Here we list five common verification techniques, each having limited effectiveness:
— comparing code results to a related problem with an exact answer;
— establishing that the convergence rate of the truncation error with changing grid spacing is consistent with expectations (see the sketch after this list);
— comparing calculated with expected results for a problem specially manufactured to test the code;
— monitoring conserved quantities and parameters, preservation of symmetry properties, and other easily predictable outcomes;
— benchmarking—that is, comparing results with those from existing codes that can calculate similar problems.
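The second technique above can be made concrete with a few lines of code. The sketch below (our illustration, not from the article) estimates the observed order of convergence from runs at three grid spacings, using the Richardson-type estimate p = log(|u_4h - u_2h| / |u_2h - u_h|) / log 2; a verified second-order scheme should yield p close to 2.

// Verification sketch: estimate the observed order of convergence from three
// runs at grid spacings 4h, 2h and h. The midpoint-rule quadrature of sin(x)
// on [0, pi] stands in for an arbitrary code under verification.
#include <cmath>
#include <iostream>

double integrate(int cells) {
    const double pi = std::acos(-1.0);
    const double h = pi / cells;
    double sum = 0.0;
    for (int i = 0; i < cells; ++i) sum += std::sin((i + 0.5) * h) * h;  // midpoint rule
    return sum;  // exact value is 2
}

int main() {
    const double coarse = integrate(25);   // grid spacing 4h
    const double medium = integrate(50);   // grid spacing 2h
    const double fine   = integrate(100);  // grid spacing h
    // Richardson-type estimate: p = log(|u_4h - u_2h| / |u_2h - u_h|) / log 2.
    const double p = std::log(std::fabs(coarse - medium) / std::fabs(medium - fine))
                   / std::log(2.0);
    std::cout << "observed order of convergence ~ " << p << std::endl;  // expect ~2
    return 0;
}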

Diligent code developers generally do as much verification as they think feasible. They are also alert to suspicious results. But diligence and alertness are far from a guarantee that the code is free of defects. Better verification techniques are desperately needed.

Once a code has been verified, the next step is validation. Verification must precede validation. If a code is unverified, any agreement between its results and experimental data is likely to be fortuitous. A code cannot be validated for all possible applications, only for specific regimes and problems. Then one must estimate its validity for adjacent regimes.

One has to validate the entire calculational system—including user, computer system, problem setup, running, and results analysis. All those elements are important. An inexperienced user can easily get wrong answers out of a good code in a validated regime.

The need to validate codes presents a number of challenges. Various types of experimental data are used for validation:
— passive observations of physical events—for example, weather or supernovae;
— controlled experiments designed to investigate specific physics or engineering principles—for example, nuclear reactions or spectroscopy;
— experiments designed to certify the performance of a physical component or system—for example, full-scale wind tunnels;
— experiments specifically designed to validate code calculations—for example, laser-fusion facilities.10

Many codes, such as those that predict weather, have to rely mainly on passive observations. Others can avail themselves of controlled experiments. The latter make for more credible validation. The most effective validation method involves comparing predictions made before a validation experiment with the subsequent data from that experiment. The experiments can be designed to test specific elements of a code. Such experiments can often be relatively simple and inexpensive.

Successful prediction before a validation experiment is a better test than successful reproduction after the fact, because agreement is too often achieved by tuning a code to reproduce what's already known. And prediction is generally the code's raison d'être. The National Ignition Facility at Livermore is an example of a validation facility for the ASCI codes.

A paradigm shift

The scientific community needs a paradigm shift with regard to validation experiments. Experimenters and funding agencies understand the value of experiments designed to explore new scientific phenomena, test theories, or examine the performance of design components.


[Figure 4. Effect on Earth's magnetosphere of the arrival of plasma and associated magnetic field from a strong mass ejection off the Sun's corona three days earlier, simulated by a group at the University of Michigan. Color intensity indicates pressure and the white lines are projections of the magnetic field on the plane that cuts Earth's noon and midnight meridians. The x axis marks the direction of the Sun, off to the left. Panels at 1800, 1900, 2000 and 2100 UT, 24 March 2000; axes x and z in Earth radii. (Adapted from ref. 18.)]

But few appreciate the value of experiments explicitly conducted for code validation. Even when experimenters are interested in validating a code, few mechanisms exist for funding such an experiment. It's essential that the scientific community provide support for code-validation experiments.

Verification and validation establish the credibility of code predictions. Therefore, it's very important to have a written record of verification and validation results. In fact, a validation activity should be organized like a project, with goals and requirements, a plan, resources, a schedule, and a documented record. At present, few computational-science projects practice systematic verification or validation along those lines. Almost none have dedicated experimental programs. Without such programs, computational science will never be credible.

Large-scale computer-simulation projects face many of the challenges that have historically confronted large-scale experimental undertakings.11 Both require extensive planning, strong leadership, effective teams, clear goals, schedules, and adequate resources. Both must meet budget and schedule constraints, and both need adequate flexibility and contingency provisions to adjust to changing requirements and the unexpected.

Using a code to address a technical issue is similar to conducting an experiment. Project leaders require a variety of skills. They need a coherent vision of the project, a good technical overview of all its components, and a sense of what's practical and achievable. They must be able to command the respect of the team and the trust of management. Sometimes they must make technical decisions whose correctness will not be apparent for years. They have to estimate needed resources, make realistic schedules, anticipate changing needs, develop and nurture the team, and shield it from unreasonable demands.

Scientists with such an array of talents are relatively rare. But without them, few code projects succeed. We have found that almost every successful project has been carried out by a highly competent team led by one or two scientists with those qualities. Examples are numerous. The ASCI case study1 demonstrates in detail that good software-project management is essential.

Scientific-software specialists can learn much from the broader IT community.12 The IT community has had to address the problem of planning and coordinating the activities of large numbers of programmers writing fairly complex software. But, as we survey the emerging computational-science community, we find that few of even the simplest and best-known proven methods for organizing and managing code-development teams are being employed. The most common approach is the painful rediscovery of lessons already learned by others. Unfortunately, that approach leads to wasted effort and sometimes failure.

Software quality is another key issue. Improvements in quality promise easier maintenance and greater longevity. It's not enough to test a code at the end of its development. Quality has to be built in at every stage. Lack of attention to quality at every manufacturing step nearly destroyed the US automobile industry in the 1970s and 1980s. Many techniques developed by the IT industry for improving software quality are applicable to scientific computing. But each technique must be individually evaluated to match costs and benefits to project goals. It would be counterproductive, for example, to dogmatically apply the rigorous quality-assurance processes necessary for mission-critical software—where one bug could crash a plane—to the development of scientific codes that thrive on risky innovation and experimentation.

The way forward

Computational science has the potential to be an important tool for scientific research. It also promises to provide guidance on key public-policy issues such as global warming. The field is meeting the performance challenge with computers of unprecedented power. It is addressing the programming challenge. But programming for large, massively parallel computers remains formidable. To meet the prediction challenge, however, the computational-science community must achieve the maturity of the older disciplines of theoretical and experimental science and traditional engineering design.

All the essential tasks for reaching that maturity require attention to the process of code development. Such projects must be organized as carefully as experimental …


[Figure 5. Four suspension bridges illustrate typical stages on the way to maturity for an engineering technology: the Széchényi Bridge (Budapest, 1849), the Brooklyn Bridge (1883), the Tacoma Narrows Bridge (1940), and the Akashi Kaikyo Bridge (1998). The two 19th-century bridges, spanning the Danube at Budapest and New York's East River, are still in service. Lessons learned from the spectacular 1940 collapse of the newly built Tacoma Narrows Bridge in Washington State have contributed to the confident design of ambitious modern structures like the Akashi Kaikyo Bridge built near Kobe in a region of Japan prone to earthquakes. Its central span is 2 km long.]

Computational Science Demands a New Paradigm

Douglass E. Post and Lawrence G. Votta

The field has reached a threshold at which better organization becomes crucial. New methods of verifying and validating complex codes are mandatory if computational science is to fulfill its promise for science and society.

Computers have become indispensable to scientific research. They are essential for collecting and analyzing experimental data, and they have largely replaced pencil and paper as the theorist's main tool. Computers let theorists extend their studies of physical, chemical, and biological systems by solving difficult nonlinear problems in magnetohydrodynamics; atomic, molecular, and nuclear structure; fluid turbulence; shock hydrodynamics; and cosmological structure formation.

Beyond such well-established aids to theorists and experimenters, the exponential growth of computer power is now launching the new field of computational science. Multidisciplinary computational teams are beginning to develop large-scale predictive simulations of highly complex technical problems. Large-scale codes have been created to simulate, with unprecedented fidelity, phenomena such as supernova explosions (see figures 1 and 2), inertial-confinement fusion, nuclear explosions (see the box on page 38), asteroid impacts (figure 3), and the effect of space weather on Earth's magnetosphere (figure 4).

Computational simulation has the potential to join theory and experiment as a third powerful research methodology. Although, as figures 1–4 show, the new discipline is already yielding important and exciting results, it is also becoming all too clear that much of computational science is still troublingly immature. We point out three distinct challenges that computational science must meet if it is to fulfill its potential and take its place as a fully mature partner of theory and experiment:
— the performance challenge—producing high-performance computers,
— the programming challenge—programming for complex computers, and
— the prediction challenge—developing truly predictive complex application codes.

The performance challenge requires that the exponential growth of computer performance continue, yielding ever larger memories and faster processing. The programming challenge involves the writing of codes that can efficiently exploit the capacities of the increasingly complex computers. The prediction challenge is to use all that computing power to provide answers reliable enough to form the basis for important decisions.

The performance challenge is being met, at least for the next 10 years. Processor speed continues to increase, and massive parallelization is augmenting that speed, albeit at the cost of increasingly complex computer architectures. Massively parallel computers with thousands of processors are becoming widely available at relatively low cost, and larger ones are being developed.

Much remains to be done to meet the programming challenge. But computer scientists are beginning to develop languages and software tools to facilitate programming for massively parallel computers.

The most urgent challenge

The prediction challenge is now the most serious limiting factor for computational science. The field is in transition from modest codes developed by small teams to much more complex programs, developed over many years by large teams, that incorporate many strongly coupled effects spanning wide ranges of spatial and temporal scales. The prediction challenge is due to the complexity of the newer codes, and the problem of integrating the efforts of large teams. This often results in codes that are not sufficiently reliable and credible to be the basis of important decisions facing society. The growth of code size and complexity, and its attendant problems, bears some resemblance to the transition from small to large scale by experimental physics in the decades after World War II.

A comparative case study of six large-scale scientific code projects, by Richard Kendall and one of us (Post),1 has yielded three important lessons. Verification, validation, and quality management, we found, are all crucial to the success of a large-scale code-writing project. Although some computational science projects—those illustrated by figures 1–4, for example—stress all three requirements, many other current and planned projects give them insufficient attention. In the absence of any one of those requirements, one doesn't have the assurance of independent assessment, confirmation, and repeatability of results. Because it's impossible to judge the validity of such results, they often have little credibility and no impact.

Part of the problem is simply that it's hard to decide whether a code result is right or wrong. Our experience as referees and editors tells us that the peer review process in computational science generally doesn't provide as effective a filter as it does for experiment or theory. Many things that a referee cannot detect could be wrong with a computational-science paper. The code could have hidden defects, it might be applying algorithms improperly, or its spatial or temporal resolution might be inappropriately coarse.

Douglass Post is a computational physicist at Los Alamos National Laboratory and an associate editor-in-chief of Computing in Science and Engineering. Lawrence Votta is a Distinguished Engineer at Sun Microsystems Inc in Menlo Park, California. He has been an associate editor of IEEE Transactions on Software Engineering.


Software for experiments

Experiments for software