G. Cowan Computing and Statistical Data Analysis / Stat 5 1
Computing and Statistical Data Analysis Stat 5: Multivariate Methods
London Postgraduate Lectures on Particle Physics;
University of London MSci course PH4515
Glen Cowan Physics Department Royal Holloway, University of London [email protected] www.pp.rhul.ac.uk/~cowan
Course web page: www.pp.rhul.ac.uk/~cowan/stat_course.html
Finding an optimal decision boundary
In particle physics one usually starts by making simple "cuts":
    xi < ci,   xj < cj
Maybe later try some other type of decision boundary:
[Figure: example decision boundaries separating the H0 and H1 regions.]
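A minimal sketch of the cut-based selection described above, on toy Gaussian data. The samples, cut values c_i and c_j, and the helper `passes_cuts` are all hypothetical, chosen only to illustrate the idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "signal" and "background" samples in two variables.
signal = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(1000, 2))
background = rng.normal(loc=[1.5, 1.5], scale=0.5, size=(1000, 2))

c_i, c_j = 0.8, 0.8  # cut values (assumed for this sketch)

def passes_cuts(x):
    """Accept an event if x_i < c_i and x_j < c_j."""
    return (x[:, 0] < c_i) & (x[:, 1] < c_j)

signal_eff = passes_cuts(signal).mean()          # fraction of signal kept
background_eff = passes_cuts(background).mean()  # fraction of background kept
print(f"signal efficiency:     {signal_eff:.3f}")
print(f"background efficiency: {background_eff:.3f}")
```

Moving or tightening the cuts trades signal efficiency against background rejection; the methods below generalize this to more flexible boundaries.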
Multivariate methods Many new (and some old) methods:
Fisher discriminant
Neural networks
Kernel density methods
Support Vector Machines
Decision trees
Boosting
Bagging
New software for HEP, e.g.,
TMVA: Höcker, Stelzer, Tegenfeldt, Voss, Voss, physics/0703039
StatPatternRecognition: I. Narsky, physics/0507143
Overtraining
[Figure: error function for the training sample and an independent validation sample.]
If the decision boundary is too flexible, it will conform too closely to the training points → overtraining. Monitor by applying the classifier to an independent validation sample.
Choose the classifier that minimizes the error function for the validation sample.
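The overtraining check above can be sketched with a k-nearest-neighbour classifier, where small k gives a very flexible boundary. The toy Gaussian samples and the choice of k values are assumptions for illustration, not part of the lecture's example.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_sample(n):
    """Toy sample: n signal events (y=1) and n background events (y=0)."""
    x_sig = rng.normal([0, 0], 1.0, size=(n, 2))
    x_bkg = rng.normal([1, 1], 1.0, size=(n, 2))
    return np.vstack([x_sig, x_bkg]), np.array([1] * n + [0] * n)

x_train, y_train = make_sample(200)
x_valid, y_valid = make_sample(200)  # independent validation sample

def knn_predict(x_ref, y_ref, x, k):
    """Majority vote among the k nearest training events."""
    d2 = ((x[:, None, :] - x_ref[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return (y_ref[nearest].mean(axis=1) > 0.5).astype(int)

errors = {}
for k in (1, 5, 25, 75):
    e_train = (knn_predict(x_train, y_train, x_train, k) != y_train).mean()
    e_valid = (knn_predict(x_train, y_train, x_valid, k) != y_valid).mean()
    errors[k] = (e_train, e_valid)
    print(f"k={k:3d}  train error={e_train:.3f}  validation error={e_valid:.3f}")

# Choose the classifier (value of k) minimizing the validation error.
best_k = min(errors, key=lambda k: errors[k][1])
print("chosen k:", best_k)
```

For k = 1 the training error is zero (each event is its own nearest neighbour), yet the validation error is not: the gap between the two is the signature of overtraining.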
Neural network example from LEP II
Signal: e+e− → W+W− (often 4 well-separated hadron jets)
Background: e+e− → qqgg (4 less well-separated hadron jets)
Input variables based on jet structure, event shape, ...; none by itself gives much separation.
Neural network output:
(Garrido, Juste and Martinez, ALEPH 96-144)
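This is not the ALEPH analysis, but the principle can be sketched with a minimal single-hidden-layer network trained by gradient descent on toy data: several input variables that each separate the classes only slightly are combined into one discriminating output. All data, network sizes, and the learning rate are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

n_in, n_hidden, n_events = 4, 5, 500
# Each input variable alone separates the two classes only weakly.
x_sig = rng.normal(0.3, 1.0, size=(n_events, n_in))
x_bkg = rng.normal(-0.3, 1.0, size=(n_events, n_in))
x = np.vstack([x_sig, x_bkg])
y = np.array([1.0] * n_events + [0.0] * n_events)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weights: input -> hidden (W1, b1) and hidden -> output (w2, b2).
W1 = rng.normal(0, 0.5, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(0, 0.5, size=n_hidden)
b2 = 0.0

eta = 0.5  # learning rate (assumed for this sketch)
for _ in range(2000):
    h = sigmoid(x @ W1 + b1)   # hidden-layer activations
    t = sigmoid(h @ w2 + b2)   # network output in (0, 1)
    # Gradient of the cross-entropy error function, back-propagated.
    delta_out = (t - y) / len(y)
    w2 -= eta * h.T @ delta_out
    b2 -= eta * delta_out.sum()
    delta_h = np.outer(delta_out, w2) * h * (1 - h)
    W1 -= eta * x.T @ delta_h
    b1 -= eta * delta_h.sum(axis=0)

accuracy = ((t > 0.5) == (y > 0.5)).mean()
print(f"classification accuracy: {accuracy:.3f}")
```

The network output t plays the role of the neural network output distribution shown in the slide: cutting on it separates signal from background better than any single input variable.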
Kernel-based PDE (KDE, Parzen window)
Consider d dimensions and N training events x1, ..., xN; estimate f(x) with

    f̂(x) = (1/N) Σi K(x − xi)

Use e.g. a Gaussian kernel:

    K(x − xi) = (2πh²)^(−d/2) exp(−|x − xi|² / 2h²)

where h is the kernel bandwidth (smoothing parameter).
Need to sum N terms to evaluate function (slow); faster algorithms only count events in vicinity of x (k-nearest neighbor, range search).
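A direct implementation of the Gaussian-kernel estimate: f̂(x) is the average of N Gaussian kernels of bandwidth h centred on the training events, so each evaluation sums over all N events. The toy sample and the bandwidth value are assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(3)

d = 2            # dimensions
N = 1000         # training events
h = 0.3          # kernel bandwidth (smoothing parameter, assumed)
train = rng.normal(0.0, 1.0, size=(N, d))  # toy standard-normal sample

def kde(x, train, h):
    """Gaussian-kernel density estimate f_hat(x) at a d-dimensional point x."""
    d = train.shape[1]
    norm = (2.0 * np.pi * h**2) ** (-d / 2.0)
    dist2 = ((train - x) ** 2).sum(axis=1)   # |x - x_i|^2 for all N events
    return norm * np.exp(-dist2 / (2.0 * h**2)).mean()

# The estimate at the origin should be close to the true standard-normal
# density (2*pi)^(-d/2), about 0.159 for d = 2.
f0 = kde(np.zeros(d), train, h)
print(f"f_hat(0) = {f0:.3f}")
```

Note the O(N) cost per evaluation; the k-nearest-neighbour and range-search variants mentioned above avoid it by looking only at events near x.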