Jens Zimmermann, MPI für Physik München, ACAT 2005 Zeuthen 1

Performance of Statistical Learning Methods

Jens Zimmermann

zimmerm@mppmu.mpg.de

Max-Planck-Institut für Physik, München

Forschungszentrum Jülich GmbH

Performance Examples from Astrophysics

Performance vs. Control

H1 Neural Network Trigger

Controlling Statistical Learning Methods

• Overtraining
• Efficiencies
• Uncertainties

Comparison of Learning Methods

Artificial Intelligence

Higgs Parity Measurement at the ILC


Performance of Statistical Learning Methods: MAGIC

Significance and number of excess events scale the uncertainties in the flux calculation.


Performance of Statistical Learning Methods: XEUS

Pileup vs. Single photon

Classical algorithm: "XMM"

Some pileups are not recognised by XMM but are recognised by the NN.


Control of Statistical Learning Methods

There may be many different successful applications of statistical learning methods.

There may be great performance improvements compared to classical methods.

This does not impress people who fear that statistical learning methods are not well under control.

First talk: Understanding and Interpretation

Now: Control and correct Evaluation


The Neural Network Trigger in the H1 Experiment

Trigger Scheme (H1 at HERA ep Collider, DESY)

Trigger levels and their time scales:
• L1: 2.3 µs
• L2 ("L2NN"): 20 µs
• L4: 100 ms

Rates along the trigger chain: 10 MHz, 500 Hz, 50 Hz, 10 Hz

Each neural network on L2 verifies a specific L1 sub-trigger.


Triggering Deeply Virtual Compton Scattering

L1 sub-trigger 41 triggers DVCS by requiring
• Significant energy deposition in SpaCal
• Within time window

L2 neural network uses additional information
• Liquid argon energies
• SpaCal centre energies
• z-vertex information

Triggering with 4 Hz

Must be reduced to 0.8 Hz
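The rate requirement fixes the rejection that the L2 network has to deliver. As a quick worked example, assuming the 4 Hz from the L1 sub-trigger is almost entirely background (an assumption, since the slide does not give the signal fraction), the required rejection follows directly. The function name is illustrative only.

```python
def required_rejection(input_rate_hz: float, target_rate_hz: float) -> float:
    """Fraction of triggered events that must be rejected to reach the target rate.

    Assumes the input rate is dominated by background, so the small
    signal contribution to the rate is neglected.
    """
    if not 0 < target_rate_hz <= input_rate_hz:
        raise ValueError("need 0 < target rate <= input rate")
    return 1.0 - target_rate_hz / input_rate_hz


rejection = required_rejection(4.0, 0.8)
print(f"required rejection: {rejection:.0%}")
```

Under this assumption the network has to reject about four out of five triggered events, which is the same 80% rejection at which the efficiencies are compared later in the talk.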

[Figure labels: Theory; Signal (DVCS); Background (upstream beam-gas interaction)]


Determine the correct efficiency

Data split: 50% training set, 25% selection set, 25% test set

Signal should peak at 1

Background should peak at 0

Tune training parameters to
• avoid overtraining
• optimise performance
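The three-way split can be sketched as follows. This is a minimal illustration, not the code used for the H1 trigger: the function name and the fixed seed are assumptions. It also assumes the usual convention that the selection set is used to tune training parameters, so that the test set stays untouched for the final efficiency.

```python
import random


def split_dataset(events, fractions=(0.5, 0.25, 0.25), seed=0):
    """Shuffle events and split them into training, selection and test sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(fractions[0] * len(shuffled))
    n_select = int(fractions[1] * len(shuffled))
    training = shuffled[:n_train]
    selection = shuffled[n_train:n_train + n_select]
    test = shuffled[n_train + n_select:]  # remainder, never used for tuning
    return training, selection, test


training, selection, test = split_dataset(range(1000))
print(len(training), len(selection), len(test))  # 500 250 250
```

Keeping the three sets disjoint is the point: an efficiency quoted on events that influenced the training or the tuning would be biased upwards.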


Determine the Correct Efficiency

[Figure: curves for the training set and the test set, both axes in %]


Check Statistical Uncertainties

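Efficiency: $\epsilon = k/n$ for $k$ selected out of $n$ events

Statistical uncertainty of the efficiency from propagation of uncertainties: $\sigma_\epsilon = \sqrt{\epsilon (1 - \epsilon) / n}$

e.g. 80% ± 4% for 80 of 100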
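The quoted example of 80% ± 4% for 80 of 100 is what the binomial uncertainty gives. A minimal sketch, assuming that formula is the one intended on the slide (the function name is illustrative):

```python
import math


def efficiency_with_uncertainty(n_selected: int, n_total: int):
    """Return the efficiency k/n and its binomial statistical uncertainty."""
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("need 0 <= n_selected <= n_total and n_total > 0")
    eff = n_selected / n_total
    sigma = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, sigma


eff, sigma = efficiency_with_uncertainty(80, 100)
print(f"{eff:.0%} ± {sigma:.0%}")  # 80% ± 4%
```

Note that this simple expression returns zero uncertainty at efficiencies of exactly 0 or 1, so it should be used with care near those limits.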


Check Systematical Uncertainties

There is only a propagation of systematical uncertainties of the inputs.

Assuming
• x1 with an absolute error
• x2 with a relative error of 5%
• x3 with a relative error of 10%
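One common way to propagate input systematics through a trained classifier is to shift each input up and down by its error, re-evaluate the efficiency, and combine the resulting shifts. The sketch below illustrates that idea under several assumptions: `toy_classifier` is a hypothetical stand-in for a trained network, the absolute error on x1 is set to an invented value of 0.1 because the slide value is not available, the events are randomly generated, and the shifts are added in quadrature as if the inputs were independent.

```python
import math
import random


def toy_classifier(x1, x2, x3):
    """Hypothetical stand-in for a trained network: logistic of a linear sum."""
    return 1.0 / (1.0 + math.exp(-(x1 + 0.5 * x2 - 0.3 * x3)))


def efficiency(events, cut=0.5, shift=lambda e: e):
    """Fraction of events whose output exceeds the cut after shifting the inputs."""
    passed = sum(toy_classifier(*shift(e)) > cut for e in events)
    return passed / len(events)


def systematic_uncertainty(events, shifts, cut=0.5):
    """Nominal efficiency and quadrature sum of the largest shift per input."""
    nominal = efficiency(events, cut)
    total_sq = 0.0
    for up, down in shifts:
        d_up = efficiency(events, cut, up) - nominal
        d_down = efficiency(events, cut, down) - nominal
        total_sq += max(abs(d_up), abs(d_down)) ** 2
    return nominal, math.sqrt(total_sq)


ABS_ERR_X1 = 0.1  # invented value for illustration only
shifts = [
    (lambda e: (e[0] + ABS_ERR_X1, e[1], e[2]), lambda e: (e[0] - ABS_ERR_X1, e[1], e[2])),
    (lambda e: (e[0], e[1] * 1.05, e[2]), lambda e: (e[0], e[1] * 0.95, e[2])),  # 5% relative
    (lambda e: (e[0], e[1], e[2] * 1.10), lambda e: (e[0], e[1], e[2] * 0.90)),  # 10% relative
]

rng = random.Random(1)
events = [(rng.gauss(0.5, 1), rng.gauss(1, 1), rng.gauss(1, 1)) for _ in range(2000)]
nominal, syst = systematic_uncertainty(events, shifts)
print(f"efficiency = {nominal:.3f} ± {syst:.3f} (syst.)")
```

The classifier itself adds no systematic uncertainty of its own here: it is a fixed function, so only the uncertainties of its inputs propagate to the output.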


Check Systematical Uncertainties

example: DVCS dataset


Comparison of Hypotheses

efficiencies for fixed rejection of 80%

NN: 96.5% vs. SVM: 95.7%

Statistically significant?

Build a 95% confidence interval! The spread entering it is the variation over different parts of the test set.
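A confidence interval of this kind can be sketched by evaluating both hypotheses on the same parts of the test set and looking at the per-part difference. The per-part efficiencies below are invented for illustration and are not the values behind the slide. The interval uses a normal approximation, which is an assumption: with only a few parts a Student-t quantile would give a somewhat wider interval.

```python
import math
from statistics import NormalDist, mean, stdev


def difference_confidence_interval(eff_a, eff_b, confidence=0.95):
    """Confidence interval for the mean efficiency difference over test-set parts.

    eff_a and eff_b hold the efficiencies of two hypotheses measured on the
    same parts of the test set, so the differences are paired.
    """
    if len(eff_a) != len(eff_b) or len(eff_a) < 2:
        raise ValueError("need two equally long lists with at least two parts")
    diffs = [a - b for a, b in zip(eff_a, eff_b)]
    centre = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(len(diffs))  # variation over the parts
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return centre - z * std_err, centre + z * std_err


# Invented per-part efficiencies at fixed rejection, for illustration only.
nn = [0.966, 0.963, 0.968, 0.964, 0.965]
svm = [0.958, 0.955, 0.960, 0.957, 0.956]
low, high = difference_confidence_interval(nn, svm)
print(f"95% interval for the difference: [{low:.4f}, {high:.4f}]")
print("significant" if low > 0 or high < 0 else "not significant")
```

The difference counts as statistically significant at this level when the interval does not contain zero.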


Comparison of Learning Methods

Cross-Validation: Divide the dataset into k parts, then train k classifiers by using each part once as test set.

The spread is the variation over the different trainings.

Compare performances over different training sets!

Efficiencies for fixed rejection of 60%
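The cross-validation procedure can be sketched in a few lines. Everything below is illustrative: the learner is a toy one-dimensional cut placed midway between the class means, the data are randomly generated, and the score is the signal efficiency at the learned cut rather than the efficiency at fixed rejection used on the slide.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n_events, k, seed=0):
    """Yield (train_idx, test_idx); each event lands in exactly one test part."""
    idx = list(range(n_events))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def cross_validate(data, labels, train_fn, score_fn, k=5):
    """Train k classifiers and return mean and spread of their test scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(data), k):
        model = train_fn([data[i] for i in train_idx], [labels[i] for i in train_idx])
        scores.append(score_fn(model, [data[i] for i in test_idx],
                               [labels[i] for i in test_idx]))
    return mean(scores), stdev(scores)


def train_midpoint_cut(xs, ys):
    """Toy learner: cut halfway between the signal and background means."""
    sig = [x for x, y in zip(xs, ys) if y == 1]
    bkg = [x for x, y in zip(xs, ys) if y == 0]
    return 0.5 * (mean(sig) + mean(bkg))


def signal_efficiency(cut, xs, ys):
    """Fraction of signal events above the cut."""
    sig = [x for x, y in zip(xs, ys) if y == 1]
    return sum(x > cut for x in sig) / len(sig)


rng = random.Random(42)
labels = [i % 2 for i in range(400)]
data = [rng.gauss(1.0 if y else -1.0, 1.0) for y in labels]
avg, spread = cross_validate(data, labels, train_midpoint_cut, signal_efficiency, k=5)
print(f"signal efficiency over 5 trainings: {avg:.3f} ± {spread:.3f}")
```

Because each of the k classifiers sees a different training set, the spread of the scores reflects the learning method rather than one particular trained hypothesis.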


Artificial Intelligence

H1-L2NN: Triggering Charged Current

Two events with low NN-output

[Event display labels: CC, cosmic, overlay cosmic]


Artificial Intelligence

Background found in J/ψ selection

H1-L2NN: Triggering J/ψ


Higgs Parity Measurement at the ILC

Parity induces a favoured configuration:
• anti-parallel for H
• parallel for A

H/A → τ+ τ−

Classical approach: fit angular distribution

Significance = 5.09

Significance is the amplitude divided by its uncertainty.

Significance measured for 500 events and averaged over 600 pseudo-experiments.

[Figure: fitted angular distribution]
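The pseudo-experiment procedure can be illustrated with a toy model. The sketch below makes several assumptions that are not in the talk: the angular density is taken to be proportional to 1 + A·cos(φ), the true amplitude of 0.3 is invented, and a simple moment estimator (A = 2·mean of cos φ) stands in for the fit. Only the numbers 500 events and 600 pseudo-experiments come from the slide.

```python
import math
import random


def sample_angle(amplitude, rng):
    """Draw phi in [0, 2*pi) from a density proportional to 1 + amplitude*cos(phi)."""
    while True:  # accept-reject with a flat envelope of height 1 + |amplitude|
        phi = rng.uniform(0.0, 2.0 * math.pi)
        if rng.uniform(0.0, 1.0 + abs(amplitude)) < 1.0 + amplitude * math.cos(phi):
            return phi


def amplitude_significance(angles):
    """Moment estimate of the amplitude divided by its statistical uncertainty."""
    n = len(angles)
    cosines = [math.cos(p) for p in angles]
    m = math.fsum(cosines) / n
    var = math.fsum((c - m) ** 2 for c in cosines) / (n - 1)
    amplitude = 2.0 * m                # since E[cos phi] = amplitude / 2
    sigma = 2.0 * math.sqrt(var / n)   # uncertainty of the amplitude estimate
    return amplitude / sigma


def average_significance(true_amplitude, n_events=500, n_experiments=600, seed=0):
    """Average the significance over many independent pseudo-experiments."""
    rng = random.Random(seed)
    values = [
        amplitude_significance([sample_angle(true_amplitude, rng) for _ in range(n_events)])
        for _ in range(n_experiments)
    ]
    return math.fsum(values) / n_experiments


print(f"average significance: {average_significance(0.3):.2f}")
```

Averaging over pseudo-experiments removes the fluctuation of a single 500-event sample, so two analysis methods can be compared on their expected significance.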


Higgs Parity Measurement at the ILC

Statistical learning approach: direct discrimination

[Figure labels: trained towards 0, trained towards 1]

Significance = 6.26

Significance is the difference of measured means divided by its uncertainty.

Significance measured for 500 events and averaged over 600 pseudo-experiments.
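A significance defined as a difference of means divided by its uncertainty can be sketched as below. The slide does not say which two means are compared, so this assumes two samples of classifier outputs with independent statistical errors. The beta-distributed outputs are invented for illustration.

```python
import math
import random


def mean_and_error(values):
    """Sample mean and the standard error of that mean."""
    n = len(values)
    m = math.fsum(values) / n
    var = math.fsum((v - m) ** 2 for v in values) / (n - 1)
    return m, math.sqrt(var / n)


def separation_significance(outputs_a, outputs_b):
    """Difference of mean outputs divided by the uncertainty of the difference."""
    mean_a, err_a = mean_and_error(outputs_a)
    mean_b, err_b = mean_and_error(outputs_b)
    # Independent samples: the errors of the two means add in quadrature.
    return (mean_a - mean_b) / math.hypot(err_a, err_b)


# Invented classifier outputs in [0, 1] for two samples of 500 events each.
rng = random.Random(7)
towards_one = [rng.betavariate(3, 2) for _ in range(500)]
towards_zero = [rng.betavariate(2, 3) for _ in range(500)]
print(f"significance: {separation_significance(towards_one, towards_zero):.1f}")
```

As in the classical approach, a single sample fluctuates, so the value would be averaged over many pseudo-experiments before comparing methods.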


Conclusion

Statistical learning methods are successful in many applications in high energy physics and astrophysics.

Significant performance improvements compared to classical algorithms.

Statistical learning methods are well under control:
- efficiencies can be determined
- uncertainties can be calculated.

Comparison of learning methods reveals statistically significant differences.

Statistical learning methods sometimes show more artificial intelligence than expected.