
Using Machine Learning to Extract Properties of Systems of Particles

Ellen D. Gulian†, ‡, Michael Kordell II‡, Rainer J. Fries‡, §

†Department of Physics, University of Maryland - Baltimore County
‡Cyclotron Institute, Texas A&M University
§Department of Physics and Astronomy, Texas A&M University


Introduction

In high energy and nuclear physics, one is often presented with complex final-state systems of particles that are products of many dynamical processes. With traditional analysis methods, it is hard to determine the specific processes that created these systems. Here, we explore the application of machine learning in analyzing such systems. We first discuss our Python simulations, which create systems of particles. Then, we apply various machine learning algorithms to pseudo-data generated by our model in order to predict features of the systems of particles.

Model

Our Python code creates systems of particles (events) that resemble several key features of real-world experimental data:

• Thermal sampling of particle momentum based on a relativistic Maxwell-Boltzmann distribution given (in natural units) by:

f(p) = e^(−E/T), where E = √(p² + M²)   (1)

• Collective flow velocity of the particles in the form of a cylindrical blastwave; the transverse velocity of a particle at radial position r is given by

vT = α0 r/R   (2)

where α0 is the boundary velocity and R is the radius of the cylinder.

• Two-body decay of unstable particles (mass M) into daughter particles (masses m1 and m2); the momentum of the daughters in the rest frame of the mother is given by

k = √{[(M² − m1² − m2²)² − 4 m1² m2²] / (4M²)}   (3)

Example: 10,000 events with 10 particles each. Default parameters: T = 0.15 GeV, M = 0.775 GeV, m1 = m2 = 0.14 GeV, and α0 = 0.7.
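The sampling and decay ingredients above can be sketched in Python with NumPy. This is a minimal illustration, not the poster's actual code: the function names, the rejection-sampling envelope, and the momentum cutoff p_max are our assumptions.

```python
import numpy as np

def two_body_k(M, m1, m2):
    """Daughter momentum in the mother's rest frame, Eq. (3):
    k = sqrt(((M^2 - m1^2 - m2^2)^2 - 4 m1^2 m2^2) / (4 M^2))."""
    return np.sqrt(((M**2 - m1**2 - m2**2)**2 - 4 * m1**2 * m2**2)
                   / (4 * M**2))

def sample_thermal_p(M, T, rng, n, p_max=5.0):
    """Rejection-sample momentum magnitudes |p| from the thermal weight
    p^2 exp(-E/T), E = sqrt(p^2 + M^2); the p^2 factor is the
    momentum-space volume element implicit in Eq. (1)."""
    def weight(p):
        return p**2 * np.exp(-np.sqrt(p**2 + M**2) / T)
    w_max = weight(np.linspace(1e-6, p_max, 2000)).max()
    samples = []
    while len(samples) < n:
        p = rng.uniform(0.0, p_max)
        if rng.uniform() < weight(p) / w_max:
            samples.append(p)
    return np.array(samples)

# default parameters from the poster (GeV)
M, m1, m2, T = 0.775, 0.14, 0.14, 0.15
k = two_body_k(M, m1, m2)                  # ~0.361 GeV for these defaults
p_mothers = sample_thermal_p(M, T, np.random.default_rng(0), n=1000)
```

For the equal-mass default, Eq. (3) reduces to k = √(M²/4 − m1²), a quick consistency check on the implementation.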

[Figures: Mother Particle Momentum Distribution (with flow / without flow), Daughter Particle Momentum Distribution, and Invariant Mass Spectrum; momentum magnitudes and mother mass in GeV.]

Note: By analyzing the candidate mass for all pairs of daughter particles in an event, one can reconstruct the mother mass on top of a background, as seen by the peak in the invariant mass spectrum.
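The pairwise invariant-mass reconstruction described in the note can be written as a short check. The function name and the back-to-back test configuration are illustrative assumptions:

```python
import numpy as np

def pair_invariant_mass(E1, p1, E2, p2):
    """Candidate mass of a daughter pair (natural units):
    m^2 = (E1 + E2)^2 - |p1 + p2|^2, with p1, p2 3-momentum vectors."""
    E = E1 + E2
    p = np.asarray(p1, dtype=float) + np.asarray(p2, dtype=float)
    return float(np.sqrt(max(E**2 - p @ p, 0.0)))

# Sanity check: a mother at rest decays back to back with the momentum k
# of Eq. (3); the candidate mass of the true pair recovers the mother mass.
m_pi, k = 0.14, 0.3613                     # GeV
E_pi = np.sqrt(m_pi**2 + k**2)
m_cand = pair_invariant_mass(E_pi, [0, 0, k], E_pi, [0, 0, -k])
# m_cand is ~0.775 GeV; uncorrelated pairs instead fill a smooth background
```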

Results

We analyze the daughter momenta in an event to extract the parameters M, T, and α0. Using Python's Scikit-learn library [1], we trained several machine learning algorithms on the data generated by our model. Each algorithm was trained with 14,000 events and tested with 6,000 events. Each event consisted of 10 particles. We focused on four classifiers to obtain binary classifications:

1. random forest (RF): an ensemble classification algorithm built from many individual decision trees; each tree in the forest yields a class prediction, and the class with the most votes is the winning prediction.

2. adaptive boost (ADA): an ensemble classification algorithm that fits weak learners to reweighted data in order to create a strong learner; very sensitive to noise and outliers.

3. gradient boost (GB): an ensemble classification algorithm that builds a prediction model from weak estimators (typically decision trees); allows for extensive optimization.

4. multilayer perceptron (MLP): a neural network classifier trained with back-propagation, the standard algorithm for computing the gradients used to update the network weights.
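With Scikit-learn [1], the four classifiers can be set up as below. The stand-in features (sorted per-event momentum magnitudes drawn from a toy exponential distribution, with a scaled-down train/test split and a two-temperature binary target) are our assumptions for illustration, not the actual simulation output:

```python
import numpy as np
from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier)
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_per_class, n_particles = 1000, 10

def make_events(n, T):
    # toy stand-in for thermal momentum magnitudes at temperature T;
    # sorting makes the feature vector independent of particle ordering
    return np.sort(rng.exponential(T, size=(n, n_particles)), axis=1)

# binary target: which of two temperatures generated the event
X = np.vstack([make_events(n_per_class, 0.15), make_events(n_per_class, 0.30)])
y = np.repeat([0, 1], n_per_class)
shuffle = rng.permutation(len(y))
X, y = X[shuffle], y[shuffle]
split = int(0.7 * len(y))                  # 70/30 train/test split
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

classifiers = {
    "RF":  RandomForestClassifier(n_estimators=100, random_state=0),
    "ADA": AdaBoostClassifier(n_estimators=100, random_state=0),
    "GB":  GradientBoostingClassifier(n_estimators=100, random_state=0),
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
}
scores = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
          for name, clf in classifiers.items()}
```

The same fit/score loop applies to any event features; only `make_events` and the target definition would change for the mass and flow-velocity targets.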

[Panel 1 figures: Predicting Mother Mass, Predicting Temperature, and Predicting Flow Velocity (RF, ADA, GB); accuracy (%) versus ΔM (GeV), ΔT (GeV), and Δα0.]

Panel 1. Performance of the ensemble classifiers (100 estimators each) as a function of the separation between the two binary target values.

[Panel 2 figures: Predicting Mother Mass, Predicting Temperature, and Predicting Flow Velocity (RF, ADA, GB); accuracy (%) versus the number of estimators.]

Panel 2. Performance of ensemble classifiers in predicting targets as a function of the number of estimators.

[Panel 3 figures: Predicting Mother Mass, Predicting Temperature, and Predicting Flow Velocity with MLP Classifier; accuracy (%) versus ΔM (GeV), ΔT (GeV), and Δα0.]

Panel 3. Performance of the MLP classifier as a function of the separation between the two binary target values.

Discussion and Analysis

1. As expected, the performance of all classifiers improves with increasing contrast between parameter choices (Panel 1).

2. The algorithms can easily infer the masses of decayed particles. However, they have slightly more trouble predicting temperature and are least efficient in predicting the collective flow velocities.

3. Increasing the number of estimators, in general, results in higher accuracy. However, the impact of additional estimators appears to decrease at some point, and one can imagine that eventually there is a trade-off between accuracy and efficiency; adding more estimators to the classifier can increase runtime significantly. Other options, such as increasing the size of the training data set, can also improve accuracy.
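The estimator/runtime trade-off noted above can be illustrated with a toy benchmark (the synthetic data and parameter choices are ours): test accuracy tends to plateau while fit time keeps growing roughly linearly with the number of trees.

```python
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 10))
# noisy linear decision rule so accuracy saturates well below 100%
y = (X.sum(axis=1) + rng.normal(scale=2.0, size=2000) > 0).astype(int)
X_tr, X_te, y_tr, y_te = X[:1500], X[1500:], y[:1500], y[1500:]

accuracy, fit_time = {}, {}
for n in (10, 50, 100, 200):
    t0 = time.perf_counter()
    clf = RandomForestClassifier(n_estimators=n, random_state=0)
    clf.fit(X_tr, y_tr)
    fit_time[n] = time.perf_counter() - t0
    accuracy[n] = clf.score(X_te, y_te)
# fit time grows steadily with n; accuracy changes far less past ~50 trees
```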

4. The multi-layer perceptron classifier in Panel 3 is shown only for comparison. With default parameters, it appears to perform worse than the ensemble classifiers. While neural networks are very powerful, they are also very complex and require a significant amount of parameter tuning. Given the performance of the ensemble classifiers, we chose not to study the parameter optimization of the multi-layer perceptron classifier here.

Remarks

While our work is a first step in training machine learning algorithms to understand systems of particles, further work is needed to develop more sophisticated simulation code. Ongoing work involves adjusting the model to account for particles that undergo N-body decays (N > 2). In addition, further research is needed into optimizing the parameters of the classifiers. In the future, after exploring the ability of machine learning algorithms in our simplified models, we plan to use established simulation codes like JETSCAPE [2] and PYTHIA [3], which create more complex and realistic systems. If proven feasible, one possible application of our work is to increase our understanding of the hadronization process and the phenomenon of confinement by analyzing experimental data with machine learning.

Acknowledgements

This research was supported by NSF grants PHY-1659847, PHY-1812431, and PHY-1550221.

References

[1] F. Pedregosa et al. "Scikit-learn: Machine Learning in Python". In: Journal of Machine Learning Research 12 (2011), pp. 2825–2830.

[2] J. H. Putschke et al. "The JETSCAPE framework". In: (2019). arXiv: 1903.07706 [nucl-th].

[3] Torbjörn Sjöstrand et al. "An Introduction to PYTHIA 8.2". In: Comput. Phys. Commun. 191 (2015), pp. 159–177. doi: 10.1016/j.cpc.2015.01.024. arXiv: 1410.3012 [hep-ph].