
Using Machine Learning to Extract Properties of Systems of Particles

Ellen D. Gulian†, ‡, Michael Kordell II‡, Rainer J. Fries‡, §

†Department of Physics, University of Maryland - Baltimore County
‡Cyclotron Institute, Texas A&M University
§Department of Physics and Astronomy, Texas A&M University


Introduction

In high energy and nuclear physics, one is often presented with complex final-state systems of particles that are products of many dynamical processes. With traditional analysis methods, it is hard to determine the specific processes that created these systems. Here, we explore the application of machine learning in analyzing such systems. We first discuss our Python simulations, which create systems of particles. Then, we apply various machine learning algorithms to pseudo-data generated by our model in order to predict features of the systems of particles.

Model

Our Python code creates systems of particles (events) that resemble several key features of real-world experimental data:

• Thermal sampling of particle momentum based on a relativistic Maxwell-Boltzmann distribution given (in natural units) by:

f(p) = e^(−E/T), where E = √(p² + M²)   (1)

• Collective flow velocity of the particles in the form of a cylindrical blastwave; the transverse velocity of a particle at radial position r is given by

vT = α0 r/R   (2)

where α0 is the boundary velocity and R is the radius of the cylinder.

• Two-body decay of unstable particles (mass M) into daughter particles (masses m1 and m2); the momentum of the daughters in the rest frame of the mother is given by

k = √{[(M² − m1² − m2²)² − 4 m1² m2²] / (4M²)}   (3)

Example: 10,000 events with 10 particles each. Default parameters: T = 0.15 GeV, M = 0.775 GeV, m1 = m2 = 0.14 GeV, and α0 = 0.7.
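The sampling and decay ingredients above can be sketched in Python with NumPy. This is a minimal illustration, not the poster's actual code: the function names, the rejection-sampling envelope, and the momentum cutoff p_max are our assumptions.

```python
import numpy as np

def two_body_k(M, m1, m2):
    """Daughter momentum in the mother's rest frame, Eq. (3):
    k = sqrt(((M^2 - m1^2 - m2^2)^2 - 4 m1^2 m2^2) / (4 M^2))."""
    return np.sqrt(((M**2 - m1**2 - m2**2)**2 - 4 * m1**2 * m2**2)
                   / (4 * M**2))

def sample_thermal_p(M, T, rng, n, p_max=5.0):
    """Rejection-sample momentum magnitudes |p| from the thermal weight
    p^2 exp(-E/T), E = sqrt(p^2 + M^2); the p^2 factor is the
    momentum-space volume element implicit in Eq. (1)."""
    def weight(p):
        return p**2 * np.exp(-np.sqrt(p**2 + M**2) / T)
    w_max = weight(np.linspace(1e-6, p_max, 2000)).max()
    samples = []
    while len(samples) < n:
        p = rng.uniform(0.0, p_max)
        if rng.uniform() < weight(p) / w_max:
            samples.append(p)
    return np.array(samples)

# default parameters from the poster (GeV)
M, m1, m2, T = 0.775, 0.14, 0.14, 0.15
k = two_body_k(M, m1, m2)                  # ~0.361 GeV for these defaults
p_mothers = sample_thermal_p(M, T, np.random.default_rng(0), n=1000)
```

For the equal-mass default, Eq. (3) reduces to k = √(M²/4 − m1²), a quick consistency check on the implementation.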

[Figures: Mother Particle Momentum Distribution (with flow / without flow), Daughter Particle Momentum Distribution, and Invariant Mass Spectrum; momentum magnitudes and mother mass in GeV.]

Note: By analyzing the candidate mass for all pairs of daughter particles in an event, one can reconstruct the mother mass on top of a background, as seen by the peak in the invariant mass spectrum.
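The pairwise invariant-mass reconstruction described in the note can be written as a short check. The function name and the back-to-back test configuration are illustrative assumptions:

```python
import numpy as np

def pair_invariant_mass(E1, p1, E2, p2):
    """Candidate mass of a daughter pair (natural units):
    m^2 = (E1 + E2)^2 - |p1 + p2|^2, with p1, p2 3-momentum vectors."""
    E = E1 + E2
    p = np.asarray(p1, dtype=float) + np.asarray(p2, dtype=float)
    return float(np.sqrt(max(E**2 - p @ p, 0.0)))

# Sanity check: a mother at rest decays back to back with the momentum k
# of Eq. (3); the candidate mass of the true pair recovers the mother mass.
m_pi, k = 0.14, 0.3613                     # GeV
E_pi = np.sqrt(m_pi**2 + k**2)
m_cand = pair_invariant_mass(E_pi, [0, 0, k], E_pi, [0, 0, -k])
# m_cand is ~0.775 GeV; uncorrelated pairs instead fill a smooth background
```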

Results

We analyze the daughter momenta in an event to extract the parameters M, T, and α0. Using Python's Scikit-learn library [1], we trained several machine learning algorithms on the data generated by our model. Each algorithm was trained with 14,000 events and tested with 6,000 events. Each event consisted of 10 particles. We focused on four classifiers to obtain binary classifications:

1. random forest (RF): an ensemble classification algorithm built from many individual decision trees; each tree in the forest yields a class prediction, and the class with the most votes is the winning prediction.

2. adaptive boost (ADA): an ensemble classification algorithm that fits weak learners to reweighted data in order to create a strong learner; very sensitive to noise and outliers.

3. gradient boost (GB): an ensemble classification algorithm that builds a prediction model from weak estimators (typically decision trees); allows for extensive optimization.

4. multilayer perceptron (MLP): a neural network classifier trained with back-propagation, the standard algorithm for computing the gradients used to update the network weights.
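With Scikit-learn [1], the four classifiers can be set up as below. The stand-in features (sorted per-event momentum magnitudes drawn from a toy exponential distribution, with a scaled-down train/test split and a two-temperature binary target) are our assumptions for illustration, not the actual simulation output:

```python
import numpy as np
from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier)
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_per_class, n_particles = 1000, 10

def make_events(n, T):
    # toy stand-in for thermal momentum magnitudes at temperature T;
    # sorting makes the feature vector independent of particle ordering
    return np.sort(rng.exponential(T, size=(n, n_particles)), axis=1)

# binary target: which of two temperatures generated the event
X = np.vstack([make_events(n_per_class, 0.15), make_events(n_per_class, 0.30)])
y = np.repeat([0, 1], n_per_class)
shuffle = rng.permutation(len(y))
X, y = X[shuffle], y[shuffle]
split = int(0.7 * len(y))                  # 70/30 train/test split
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

classifiers = {
    "RF":  RandomForestClassifier(n_estimators=100, random_state=0),
    "ADA": AdaBoostClassifier(n_estimators=100, random_state=0),
    "GB":  GradientBoostingClassifier(n_estimators=100, random_state=0),
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
}
scores = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
          for name, clf in classifiers.items()}
```

The same fit/score loop applies to any event features; only `make_events` and the target definition would change for the mass and flow-velocity targets.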

[Panel 1 figures: Predicting Mother Mass, Predicting Temperature, and Predicting Flow Velocity (RF, ADA, GB); accuracy (%) versus ΔM (GeV), ΔT (GeV), and Δα0.]

Panel 1. Performance of the ensemble classifiers (100 estimators each) as a function of the separation between the two binary target values.

[Panel 2 figures: Predicting Mother Mass, Predicting Temperature, and Predicting Flow Velocity (RF, ADA, GB); accuracy (%) versus the number of estimators.]

Panel 2. Performance of ensemble classifiers in predicting targets as a function of the number of estimators.

[Panel 3 figures: Predicting Mother Mass, Predicting Temperature, and Predicting Flow Velocity with MLP Classifier; accuracy (%) versus ΔM (GeV), ΔT (GeV), and Δα0.]

Panel 3. Performance of the MLP classifier as a function of the separation between the two binary target values.

Discussion and Analysis

1. As expected, the performance of all classifiers improves with increasing contrast between parameter choices (Panel 1).

2. The algorithms can easily infer the masses of decayed particles. However, they have slightly more trouble predicting temperature and are least efficient in predicting the collective flow velocities.

3. Increasing the number of estimators, in general, results in higher accuracy. However, the impact of additional estimators appears to decrease at some point, and one can imagine that eventually there is a trade-off between accuracy and efficiency; adding more estimators to the classifier can increase runtime significantly. Other options, such as increasing the size of the training data set, can also improve accuracy.
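The estimator/runtime trade-off noted above can be illustrated with a toy benchmark (the synthetic data and parameter choices are ours): test accuracy tends to plateau while fit time keeps growing roughly linearly with the number of trees.

```python
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 10))
# noisy linear decision rule so accuracy saturates well below 100%
y = (X.sum(axis=1) + rng.normal(scale=2.0, size=2000) > 0).astype(int)
X_tr, X_te, y_tr, y_te = X[:1500], X[1500:], y[:1500], y[1500:]

accuracy, fit_time = {}, {}
for n in (10, 50, 100, 200):
    t0 = time.perf_counter()
    clf = RandomForestClassifier(n_estimators=n, random_state=0)
    clf.fit(X_tr, y_tr)
    fit_time[n] = time.perf_counter() - t0
    accuracy[n] = clf.score(X_te, y_te)
# fit time grows steadily with n; accuracy changes far less past ~50 trees
```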

4. The multi-layer perceptron classifier in Panel 3 is shown only for comparison. With default parameters, it appears to perform worse than the ensemble classifiers. While neural networks are very powerful, they are also very complex and require a significant amount of parameter tuning. Given the performance of the ensemble classifiers, we chose not to study the parameter optimization of the multi-layer perceptron classifier here.

Remarks

While our work is a first step in training machine learning algorithms to understand systems of particles, further work is needed to develop more sophisticated simulation code. Ongoing work involves adjusting the model to account for particles that undergo N-body decays (N > 2). In addition, further research is needed into optimizing the parameters of the classifiers. In the future, after exploring the ability of machine learning algorithms in our simplified models, we plan to use established simulation codes like JETSCAPE [2] and PYTHIA [3], which create more complex and realistic systems. If proven feasible, one possible application of our work is to increase our understanding of the hadronization process and the phenomenon of confinement by analyzing experimental data with machine learning.

Acknowledgements

This research was supported by NSF grants PHY-1659847, PHY-1812431, and PHY-1550221.

References

[1] F. Pedregosa et al. "Scikit-learn: Machine Learning in Python". In: Journal of Machine Learning Research 12 (2011), pp. 2825–2830.

[2] J. H. Putschke et al. "The JETSCAPE framework". In: (2019). arXiv: 1903.07706 [nucl-th].

[3] Torbjörn Sjöstrand et al. "An Introduction to PYTHIA 8.2". In: Comput. Phys. Commun. 191 (2015), pp. 159–177. doi: 10.1016/j.cpc.2015.01.024. arXiv: 1410.3012 [hep-ph].