The small-n problem in High Energy Physics

Glen Cowan
Department of Physics
Royal Holloway, University of [email protected]/~cowan

Statistical Challenges in Modern Astronomy IV (SCMA4), 12-15 June 2006

Glen Cowan, SCMA4, 12-15 June, 2006

Outline
I. High Energy Physics (HEP) overview
   Theory
   Experiments
   Data
II. The small-n problem, etc.
   Making a discovery
   Setting limits
   Systematic uncertainties
III. Conclusions


The current picture in particle physics
Matter... + force carriers... photon (γ), W, Z, gluon (g) + relativity + quantum mechanics + symmetries... = The Standard Model
almost certainly incomplete
25 free parameters (masses, coupling strengths, ...)
should include Higgs boson (not yet seen)
no gravity yet
agrees with all experimental observations so far


Experiments in High Energy Physics
HEP mainly studies particle collisions in accelerators, e.g., the Large Electron-Positron (LEP) Collider at CERN, 1989-2000.
4 detectors, each collaboration ~400 physicists.


More HEP experiments
LEP tunnel now used for the Large Hadron Collider (LHC): proton-proton collisions, $E_{cm} = 14$ TeV, very high luminosity.
Two general purpose detectors: ATLAS and CMS.
Each detector collaboration has ~2000 physicists.
Data taking to start 2007.
[Figure: The ATLAS Detector]


HEP data
Basic unit of data: an "event".
Ideally, an event is a list of momentum vectors and particle types.
In practice, particles are reconstructed as tracks, clusters of energy deposited in calorimeters, etc.
Resolution, angular coverage, particle id, etc. are imperfect.
[Figure: An event from the ALEPH detector at LEP]


Data samples
At LEP, event rates typically ~Hz or less; ~$10^6$ Z boson events in 5 years for each of 4 experiments.
At LHC, ~$10^9$ events/sec (!!!), mostly uninteresting; do quick sifting, record ~200 events/sec.
Single event ~ 1 Mbyte.
1 "year" ≈ $10^7$ s, $10^{16}$ pp collisions per year, 2 billion / year recorded (~2 Pbyte / year).
For new/rare processes, rates at LHC can be vanishingly small: Higgs bosons detectable per year could be e.g. ~$10^3$ → "needle in a haystack".
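The yearly totals quoted on this slide follow from multiplying the rates by the length of a running year. A minimal sketch of that arithmetic, using only the round numbers above (the variable names are illustrative, not from the talk):

```python
# Back-of-the-envelope LHC data volumes from the round numbers on the slide.
SECONDS_PER_YEAR = 1e7       # one accelerator "year" of running
COLLISION_RATE_HZ = 1e9      # pp collisions per second
RECORD_RATE_HZ = 200.0       # events written to storage per second
EVENT_SIZE_BYTES = 1e6       # about 1 Mbyte per recorded event

collisions_per_year = COLLISION_RATE_HZ * SECONDS_PER_YEAR
recorded_per_year = RECORD_RATE_HZ * SECONDS_PER_YEAR
bytes_per_year = recorded_per_year * EVENT_SIZE_BYTES
kept_fraction = RECORD_RATE_HZ / COLLISION_RATE_HZ

print(f"pp collisions per year : {collisions_per_year:.1e}")
print(f"events recorded per year: {recorded_per_year:.1e}")
print(f"storage per year (Pbyte): {bytes_per_year / 1e15:.1f}")
print(f"fraction of events kept : {kept_fraction:.1e}")
```

Only about two events in ten million survive the online selection, which is why the quick sifting step matters so much for rare processes.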


HEP game plan
Goals include:
Fill in the gaps in the Standard Model (e.g. find the Higgs)
Find something beyond the Standard Model (New Physics)
Example of an extension to SM: Supersymmetry (SUSY)
For every SM particle → SUSY partner (none yet seen!)
Minimal SUSY has 105 free parameters, constrained models ~5 parameters (plus the 25 from SM)
Provides dark matter candidate (neutralino), unification of gauge couplings, solution to hierarchy problem, ...
Lightest SUSY particle can be stable (effectively invisible)


Simulated HEP data
Monte Carlo event generators are available for essentially all Standard Model processes, and also for many possible extensions to the SM (supersymmetric models, extra dimensions, etc.).
SM predictions rely on a variety of approximations (perturbation theory to limited order, phenomenological modeling of non-perturbative effects, etc.).
Monte Carlo programs are also used to simulate detector response.
[Figure: Simulated event for ATLAS]


A simulated event
PYTHIA Monte Carlo: pp → gluino-gluino


The data stream
Experiment records events of different types, with different numbers of particles, kinematic properties, ...


Selecting events
To search for events of a given type (H0: "signal"), need discriminating variable(s) distributed as differently as possible relative to unwanted event types (H1: "background").
Count number of events in acceptance region defined by "cuts".
Expected number of signal events: $s = \sigma_s \varepsilon_s L$
Expected number of background events: $b = \sigma_b \varepsilon_b L$
$\sigma_s$, $\sigma_b$ = cross section for signal, background
Efficiencies: $\varepsilon_s = P(\text{accept} \mid s)$, $\varepsilon_b = P(\text{accept} \mid b)$
$L$ = integrated luminosity (related to beam intensity, data taking time)
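As an illustration of the relation between cross section, efficiency and luminosity on this slide, here is a small sketch. The function name and all numerical inputs are hypothetical, chosen only to show the units working out (cross section in pb, integrated luminosity in inverse pb):

```python
def expected_events(cross_section_pb, efficiency, luminosity_inv_pb):
    """Expected count = cross section x selection efficiency x integrated luminosity."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be a probability between 0 and 1")
    return cross_section_pb * efficiency * luminosity_inv_pb

# Hypothetical inputs, not taken from any real analysis.
LUMINOSITY = 100.0                              # integrated luminosity, pb^-1
s = expected_events(0.5, 0.20, LUMINOSITY)      # signal: 0.5 pb, 20% efficiency
b = expected_events(50.0, 0.01, LUMINOSITY)     # background: 50 pb, 1% efficiency

print(f"expected signal s = {s:.1f}, expected background b = {b:.1f}")
```

Tightening the cuts typically lowers both efficiencies at once, so the choice of acceptance region is a trade-off between losing signal and admitting background.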


Poisson data with background
Count n events, e.g., in fixed time or integrated luminosity.
s = expected number of signal events
b = expected number of background events
n ~ Poisson(s+b): $P(n; s, b) = \frac{(s+b)^n}{n!} e^{-(s+b)}$
Sometimes b known, other times it is in some way uncertain.
Goals: (i) convince people that s ≠ 0 (discovery); (ii) measure or place limits on s, taking into consideration the uncertainty in b.
Widely discussed in HEP community, see e.g. proceedings of PHYSTAT meetings, Durham, Fermilab, CERN workshops...
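The Poisson model for the counted events can be evaluated directly. The following is a minimal sketch using only the Python standard library; the helper name `poisson_pmf` and the example values of s and b are illustrative assumptions:

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability P(n; mean), evaluated in log space for stability."""
    if n < 0:
        return 0.0
    if mean == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

# Hypothetical expectation values, for illustration only.
s, b = 2.0, 1.5
for n in range(6):
    print(f"P(n = {n}; s + b = {s + b}) = {poisson_pmf(n, s + b):.4f}")
```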


Making a discovery
Often compute p-value of the "background only" hypothesis H0 using test variable related to a characteristic of the signal.
p-value = Probability to see data as incompatible with H0, or more so, relative to the data observed.
Requires definition of "incompatible with H0".
HEP folklore: claim discovery if p-value equivalent to a 5σ fluctuation of Gaussian variable (one-sided).
Actual p-value at which discovery becomes believable will depend on signal in question (subjective).
Why not do Bayesian analysis?
Usually don't know how to assign meaningful prior probabilities.
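To make the 5σ convention concrete, the sketch below converts between a one-sided Gaussian significance Z and the corresponding p-value using the standard normal tail probability. The function names are illustrative; the inverse is done by simple bisection and is meant for p-values below 0.5:

```python
import math

def p_value_from_z(z):
    """One-sided tail probability of a standard Gaussian beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def z_from_p_value(p, lo=0.0, hi=40.0):
    """Invert p_value_from_z by bisection (valid for 0 < p <= 0.5)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if p_value_from_z(mid) > p:
            lo = mid      # tail still too large, need a larger z
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(f"p-value for 5 sigma: {p_value_from_z(5.0):.3e}")
print(f"Z for p = 0.09     : {z_from_p_value(0.09):.2f}")
```

A 5σ one-sided fluctuation corresponds to a p-value of roughly 3 in ten million.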


Computing p-values
For n ~ Poisson(s+b) we compute the p-value of H0: s = 0,
$p = P(n \ge n_{obs}; s = 0, b) = \sum_{n = n_{obs}}^{\infty} \frac{b^n}{n!} e^{-b}$
Often we don't simply count events but also measure for each event one or more quantities:
number of events observed n replaced by numbers of events $(n_1, ..., n_N)$ in a histogram.
Goodness-of-fit variable could be e.g. Pearson's $\chi^2$.
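For a histogram, Pearson's statistic compares the observed bin contents with the expected ones. A minimal sketch, assuming the expected contents come from the background-only hypothesis (the bin values below are invented for illustration):

```python
def pearson_chi2(observed, expected):
    """Pearson's chi-square: sum over bins of (n_i - nu_i)^2 / nu_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    return sum((n - nu) ** 2 / nu for n, nu in zip(observed, expected))

# Hypothetical histogram: observed counts and background-only expectations.
observed = [3, 7, 12, 6, 2]
expected = [4.0, 6.5, 8.0, 6.5, 4.0]
print(f"Pearson chi2 = {pearson_chi2(observed, expected):.2f} for {len(observed)} bins")
```

With small expected bin contents the statistic does not follow a chi-square distribution closely, so in the small-n regime its sampling distribution would normally be obtained from Monte Carlo rather than from chi-square tables.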


Example: search for the Higgs boson at LEP
Several usable signal modes.
Important background from $e^+e^- \to ZZ$.
Mass of jet pair = mass of Higgs boson; b jets contain tracks not from interaction point.
b-jet pair of virtual Z can mimic Higgs.


A candidate Higgs event
17 "Higgs like" candidates seen but no claim of discovery; p-value of s = 0 (background only) hypothesis ≈ 0.09


Setting limits
Frequentist intervals (limits) for a parameter s can be found by defining a test of the hypothesized value s (do this for all s):
Specify values of the data n that are "disfavoured" by s (critical region) such that P(n in critical region) ≤ γ for a prespecified γ, e.g., 0.05 or 0.1. (Because of discrete data, need inequality here.)
If n is observed in the critical region, reject the value s.
Now invert the test to define a confidence interval as: the set of s values that would not be rejected in a test of size γ (confidence level is 1 - γ).
The interval will cover the true value of s with probability ≥ 1 - γ.


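Setting limits: classical method
E.g. for upper limit on s, take critical region to be low values of n; limit $s_{up}$ at confidence level 1 - β thus found from
$\beta = P(n \le n_{obs}; s_{up}, b) = \sum_{n=0}^{n_{obs}} \frac{(s_{up}+b)^n}{n!} e^{-(s_{up}+b)}$
Similarly for lower limit $s_{lo}$ at confidence level 1 - α,
$\alpha = P(n \ge n_{obs}; s_{lo}, b) = \sum_{n=n_{obs}}^{\infty} \frac{(s_{lo}+b)^n}{n!} e^{-(s_{lo}+b)}$
Sometimes choose α = β = γ/2 → central confidence interval.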
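The classical upper limit can be found numerically by solving the condition that the probability to observe n_obs or fewer events equals β. The sketch below does this by bisection on the total mean s + b; the function names are illustrative, and the result is allowed to be negative so that the empty-interval case is visible:

```python
import math

def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for n ~ Poisson(mean), summed term by term."""
    term = math.exp(-mean)
    total = term
    for n in range(1, n_obs + 1):
        term *= mean / n
        total += term
    return total

def classical_upper_limit(n_obs, b, beta=0.05):
    """Solve P(n <= n_obs; s_up + b) = beta for s_up (CL = 1 - beta)."""
    lo, hi = 0.0, 1.0                      # bracket for the total mean s + b
    while poisson_cdf(n_obs, hi) > beta:   # the cdf decreases as the mean grows
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid) > beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - b             # may be negative for low n_obs

print(f"n = 0, b = 0: s_up = {classical_upper_limit(0, 0.0):.3f}")
print(f"n = 0, b = 5: s_up = {classical_upper_limit(0, 5.0):.3f}")
```

For n = 0 the solution is s_up = -ln β - b, so with a sizeable expected background the classical limit comes out negative.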


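Calculating classical limits
To solve for $s_{lo}$, $s_{up}$, can exploit relation to $\chi^2$ distribution:
$s_{lo} = \frac{1}{2} F^{-1}_{\chi^2}(\alpha; 2n) - b$
$s_{up} = \frac{1}{2} F^{-1}_{\chi^2}(1 - \beta; 2(n+1)) - b$
where $F^{-1}_{\chi^2}$ is the quantile of the $\chi^2$ distribution.
For low fluctuation of n this can give negative result for $s_{up}$; i.e. confidence interval is empty.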


Likelihood ratio limits (Feldman-Cousins)
Define likelihood ratio for hypothesized parameter value s:
$l(s) = \frac{L(n \mid s, b)}{L(n \mid \hat{s}, b)}$
Here $\hat{s}$ is the ML estimator; note $0 \le l(s) \le 1$.
Critical region defined by low values of likelihood ratio.
Resulting intervals can be one- or two-sided (depending on n).
(Re)discovered for HEP by Feldman and Cousins, Phys. Rev. D 57 (1998) 3873.
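The likelihood-ratio construction can be sketched for the Poisson counting case as follows. This is a simplified illustration, not the authors' code: it assumes the ML estimator is restricted to non-negative signal, scans s on a fixed grid (the grid range, step and `n_max` are arbitrary choices), and reports the smallest and largest accepted s:

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability P(n; mean); handles mean = 0 explicitly."""
    if mean == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def fc_acceptance(s, b, cl=0.90, n_max=100):
    """Values of n accepted for hypothesized s, ranked by likelihood ratio."""
    ranked = []
    for n in range(n_max + 1):
        s_hat = max(0.0, n - b)                   # ML estimate with s >= 0
        p = poisson_pmf(n, s + b)
        ratio = p / poisson_pmf(n, s_hat + b)
        ranked.append((ratio, n, p))
    ranked.sort(reverse=True)                     # highest ratio first
    accepted, total = set(), 0.0
    for ratio, n, p in ranked:
        accepted.add(n)
        total += p
        if total >= cl:                           # stop once coverage is reached
            break
    return accepted

def fc_interval(n_obs, b, cl=0.90, s_max=15.0, step=0.01):
    """Smallest and largest grid value of s whose acceptance set contains n_obs."""
    grid = [i * step for i in range(int(round(s_max / step)) + 1)]
    kept = [s for s in grid if n_obs in fc_acceptance(s, b, cl)]
    return min(kept), max(kept)

print("n = 0, b = 0, 90% CL:", fc_interval(0, 0.0))
```

For larger n_obs the scan range `s_max` would need to be extended, since an interval reaching the edge of the grid is truncated.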


Coverage probability of confidence intervals
Because of discreteness of Poisson data, probability for interval to include true value in general > confidence level ("over-coverage").
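The over-coverage can be checked by summing the Poisson probabilities of all outcomes n whose interval contains the true s. The sketch below does this for the classical upper limit at confidence level 1 - β; the helper names and the truncation `n_max` (adequate only for small s + b) are assumptions made for the illustration:

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability P(n; mean) for mean > 0."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def poisson_cdf(n_obs, mean):
    return sum(poisson_pmf(n, mean) for n in range(n_obs + 1))

def mean_upper_limit(n_obs, beta):
    """Upper limit on the total mean s + b: solves P(n <= n_obs; mu) = beta."""
    lo, hi = 1e-9, 1.0
    while poisson_cdf(n_obs, hi) > beta:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid) > beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def coverage(s_true, b, beta=0.05, n_max=40):
    """Probability that the upper limit s_up(n) is at least s_true."""
    mu_true = s_true + b
    return sum(poisson_pmf(n, mu_true) for n in range(n_max + 1)
               if mean_upper_limit(n, beta) - b >= s_true)

for s in (0.5, 2.0, 3.5):
    print(f"s = {s}: coverage = {coverage(s, 0.0):.4f} (nominal 0.95)")
```

The coverage jumps discontinuously as s crosses each s_up(n), but for this construction it never drops below the nominal value.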


More on intervals from LR test (Feldman-Cousins)
Caveat with coverage: suppose we find n >> b.
Usually one then quotes a measurement: $\hat{s} = n - b$, $\hat{\sigma}_{\hat{s}} = \sqrt{n}$.
If, however, n isn't large enough to claim discovery, one sets a limit on s.
FC pointed out that if this decision is made based on n, then the actual coverage probability of the interval can be less than the stated confidence level ("flip-flopping").
FC intervals remove this, providing a smooth transition from 1- to 2-sided intervals, depending on n.
But, suppose FC gives e.g. 0.1 < s < 5 at 90% CL; p-value of s = 0 still substantial. Part of upper limit "wasted"?


Properties of upper limits
Example: take b = 5.0, confidence level 0.95.
[Figures: upper limit $s_{up}$ vs. n; mean upper limit vs. s]


Upper limit versus b
If n = 0 observed, should upper limit depend on b?
Classical: yes
Bayesian: no
FC: yes
[Figure: upper limit versus b, from Feldman & Cousins, PRD 57 (1998) 3873]


Nuisance parameters and limits
In general we don't know the background b perfectly.
Suppose we have a measurement of b, e.g., $b_{meas} \sim N(b, \sigma_b)$.
So the data are really: n events and the value $b_{meas}$.
In principle the confidence interval recipe can be generalized to two measurements and two parameters.
Difficult and not usually attempted, but see e.g. talks by K. Cranmer at PHYSTAT03, G. Punzi at PHYSTAT05.
[Figure: G. Punzi, PHYSTAT05]


Bayesian limits with uncertainty on b
Uncertainty on b goes into the prior, e.g., $\pi(s, b) = \pi_s(s) \, \pi_b(b)$.
Put this into Bayes' theorem, $p(s, b \mid n) \propto L(n \mid s, b) \, \pi(s, b)$.
Marginalize over b, $p(s \mid n) = \int p(s, b \mid n) \, db$, then use $p(s \mid n)$ to find intervals for s with any desired probability content.
For b = 0, $\sigma_b = 0$, $\pi(s)$ = const. (s > 0), Bayesian upper limit coincides with classical one.
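As a simple special case, the sketch below computes the Bayesian upper limit for a flat prior in s (s ≥ 0) and a background that is known exactly, i.e. without the marginalization over b. Under those assumptions the posterior tail probability beyond s_up equals the ratio P(n' ≤ n; s_up + b) / P(n' ≤ n; b), which is solved by bisection. The function names are illustrative:

```python
import math

def poisson_cdf(n_obs, mean):
    """P(n <= n_obs) for n ~ Poisson(mean)."""
    term = math.exp(-mean)
    total = term
    for n in range(1, n_obs + 1):
        term *= mean / n
        total += term
    return total

def bayesian_upper_limit(n_obs, b, cl=0.95):
    """Upper limit on s for a flat prior (s >= 0) and exactly known b."""
    target = (1.0 - cl) * poisson_cdf(n_obs, b)
    lo, hi = 0.0, 1.0
    while poisson_cdf(n_obs, hi + b) > target:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid + b) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for b in (0.0, 2.0, 5.0):
    print(f"n = 0, b = {b}: s_up = {bayesian_upper_limit(0, b):.3f}")
```

With n = 0 the limit is -ln(1 - CL) for every b, and with b = 0 the equation being solved is identical to the classical one.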


Cousins-Highland method
Regard b as "random", characterized by pdf $\pi_b(b)$.
Makes sense in Bayesian approach, but in frequentist model b is constant (although unknown).
A measurement $b_{meas}$ is random but this is not the mean number of background events, rather, b is.
Compute anyway
$P(n; s) = \int P(n; s, b) \, \pi_b(b) \, db$
This would be the probability for n if Nature were to generate a new value of b upon repetition of the experiment with $\pi_b(b)$.
Now e.g. use this P(n; s) in the classical recipe for upper limit at CL = 1 - β:
$\beta = P(n \le n_{obs}; s_{up})$
Widely used method in HEP.
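The smeared probability can be evaluated by numerical integration. The sketch below assumes, purely for illustration, that the pdf for b is a Gaussian centred on the measured value and truncated at zero; the midpoint rule, the integration range of six standard deviations and the function names are likewise arbitrary choices:

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability P(n; mean) for mean > 0."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def smeared_poisson_pmf(n, s, b_meas, sigma_b, n_steps=2000):
    """P(n; s) averaged over a truncated Gaussian pdf for b (midpoint rule)."""
    lo = max(0.0, b_meas - 6.0 * sigma_b)
    hi = b_meas + 6.0 * sigma_b
    h = (hi - lo) / n_steps
    num = den = 0.0
    for i in range(n_steps):
        b = lo + (i + 0.5) * h
        weight = math.exp(-0.5 * ((b - b_meas) / sigma_b) ** 2)
        num += weight * poisson_pmf(n, s + b)
        den += weight                    # normalizes the truncated Gaussian
    return num / den

s, b_meas = 1.0, 3.0
for sigma_b in (0.01, 0.5, 1.5):
    p = smeared_poisson_pmf(0, s, b_meas, sigma_b)
    print(f"sigma_b = {sigma_b}: P(n = 0; s = {s}) = {p:.4f}")
```

The smeared probabilities can then replace the pure Poisson terms in the sum that defines the classical upper limit.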


Integrated likelihoods
Consider again signal s and background b, suppose we have uncertainty in b characterized by a prior pdf $\pi_b(b)$.
Define integrated likelihood as
$L'(s) = \int L(s, b) \, \pi_b(b) \, db$
also called modified profile likelihood, in any case not a real likelihood.
Now use this to construct likelihood-ratio test and invert to obtain confidence intervals.
Feldman-Cousins & Cousins-Highland (FHC2), see e.g. J. Conrad et al., Phys. Rev. D67 (2003) 012002 and Conrad/Tegenfeldt PHYSTAT05 talk.
Calculators available (Conrad, Tegenfeldt, Barlow).


Digression: tangent plane method
Consider least-squares fit with parameter of interest $\theta_0$ and nuisance parameter $\theta_1$, i.e., minimize $\chi^2(\theta_0, \theta_1)$.
Correlation between $\hat{\theta}_0$ and $\hat{\theta}_1$ causes errors to increase.
Standard deviations from tangent lines to contour $\chi^2 = \chi^2_{min} + 1$.


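The tangent plane method is a special case of using the profile likelihood:
$L'(\theta_0) = L(\theta_0, \hat{\hat{\theta}}_1(\theta_0))$
The profile likelihood is found by maximizing $L(\theta_0, \theta_1)$ with respect to $\theta_1$ for each $\theta_0$.
Equivalently use $\chi'^2(\theta_0) = \chi^2(\theta_0, \hat{\hat{\theta}}_1(\theta_0))$.
The interval obtained from $\chi'^2(\theta_0) = \chi'^2_{min} + 1$ is the same as what is obtained from the tangents to $\chi^2(\theta_0, \theta_1) = \chi^2_{min} + 1$.
Well known in HEP as the "MINOS" method in MINUIT.
See e.g. talks by Reid, Cranmer, Rolke at PHYSTAT05.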


Interval from inverting profile LR test
Suppose we have a measurement $b_{meas}$ of b.
Build the likelihood ratio test with profile likelihood:
$l(s) = \frac{L(s, \hat{\hat{b}}(s))}{L(\hat{s}, \hat{b})}$
and use this to construct confidence intervals.
Not widely used in HEP but recommended in e.g. Kendall & Stuart; see also PHYSTAT05 talks by Cranmer, Feldman, Cousins, Reid.
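A minimal sketch of the profile likelihood ratio for this problem, assuming n ~ Poisson(s + b) and a Gaussian measurement of b with known standard deviation. Under those assumptions the conditional estimate of b for fixed s follows from a quadratic equation. The function names are illustrative, b is not forced to be non-negative, and the unconditional fit restricts s to s ≥ 0:

```python
import math

def log_like(n, b_meas, s, b, sigma_b):
    """ln L(s, b) up to constants: Poisson term for n plus Gaussian term for b_meas."""
    mu = s + b
    poisson_term = (n * math.log(mu) if n > 0 else 0.0) - mu
    return poisson_term - 0.5 * ((b_meas - b) / sigma_b) ** 2

def profiled_b(n, b_meas, s, sigma_b):
    """Value of b maximizing ln L for fixed s (positive root of d lnL / db = 0)."""
    var = sigma_b ** 2
    a = b_meas - s - var
    c = n * var + s * (b_meas - var)
    return 0.5 * (a + math.sqrt(a * a + 4.0 * c))

def q_profile(n, b_meas, s, sigma_b):
    """-2 ln of the profile likelihood ratio for hypothesized s."""
    if n >= b_meas:
        s_hat, b_hat = n - b_meas, b_meas          # unconstrained maximum
    else:
        s_hat, b_hat = 0.0, profiled_b(n, b_meas, 0.0, sigma_b)
    best = log_like(n, b_meas, s_hat, b_hat, sigma_b)
    cond = log_like(n, b_meas, s, profiled_b(n, b_meas, s, sigma_b), sigma_b)
    return 2.0 * (best - cond)

# Hypothetical data: 10 events observed, background measured as 4.0 +- 1.0.
for s in (0.0, 3.0, 6.0, 12.0):
    print(f"s = {s:5.1f}: -2 ln l(s) = {q_profile(10, 4.0, s, 1.0):.3f}")
```

In the large-sample approximation the values of s with -2 ln l(s) ≤ 1 would form an approximate 68.3% interval; for small n the actual coverage of such an interval should be checked with Monte Carlo.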


Wrapping up,
Frequentist methods have been most widely used but for many questions (particularly related to systematics), Bayesian methods are getting more notice.
Frequentist properties such as coverage probability of confidence intervals seen as very important (overly so?)
Bayesian methods remain problematic in cases where it is difficult to enumerate alternative hypotheses and assign meaningful prior probabilities.
Tools widely applied at LEP; some work needed to extend these to LHC analyses (ongoing).


Finally,
The LEP programme was dominated by limit setting: Standard Model confirmed, No New Physics.
The Tevatron discovered the top quark and $B_s$ mixing (both parts of the SM) and also set many limits (but NNP).
By ~2012 either we'll have discovered something new and interesting beyond the Standard Model, or, we'll still be setting limits and HEP should think seriously about a new approach!


Extra slides


A recent discovery: $B_s$ oscillations
Recently the D0 experiment (Fermilab) announced the discovery of $B_s$ mixing: Moriond talk by Brendan Casey, also hep-ex/0603029.
Produce a $B_q$ meson at time t = 0; there is a time dependent probability for it to decay as an anti-$B_q$ (q = d or s).
$|V_{ts}| \gg |V_{td}|$ and so $B_s$ oscillates quickly compared to decay rate.
Sought but not seen at LEP; early on predicted to be visible at Tevatron.
Discovery quickly confirmed by the CDF experiment.


Glen Cowan, Statistics in HEP, IoP Half Day Meeting, 16 November 2005, Manchester


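Confidence interval from likelihood function
In the large sample limit it can be shown for ML estimators:
$\hat{\theta} \sim N(\theta, V)$ (n-dimensional Gaussian, covariance V)
$Q(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^T V^{-1} (\hat{\theta} - \theta) = Q_\gamma$ defines a hyper-ellipsoidal confidence region.
If $\hat{\theta} \sim N(\theta, V)$ then $Q(\hat{\theta}, \theta) \sim \chi^2_n$.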


Approximate confidence regions from $L(\theta)$
So the recipe to find the confidence region with CL = 1 - γ is:
$\ln L(\theta) = \ln L_{max} - \frac{Q_\gamma}{2}$, where $Q_\gamma$ is the 1 - γ quantile of the $\chi^2_n$ distribution.
For finite samples, these are approximate confidence regions.
Coverage probability not guaranteed to be equal to 1 - γ; no simple theorem to say by how far off it will be (use MC).




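Upper limit from test of hypothesized $\Delta m_s$
Base test on likelihood ratio l (here $\theta = \Delta m_s$).
Observed value is $l_{obs}$, sampling distribution is $g(l; \theta)$ (from MC).
$\theta$ is excluded at confidence level CL if the p-value of $\theta$, computed from $g(l; \theta)$ and $l_{obs}$, is less than 1 - CL.
D0 shows the distribution of ln l for $\Delta m_s = 25$ ps$^{-1}$; the observation is equivalent to a 2.1σ effect.
[Figure: D0 result with the 95% CL upper limit indicated]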


The significance of an observed signal
Suppose b = 0.5, and we observe $n_{obs} = 5$.
The p-value of the s = 0 hypothesis is $P(n \ge 5; b = 0.5) = 1.7 \times 10^{-4}$.
Often, however, b has some uncertainty; this can have significant impact on p-value, e.g. if b = 0.8, p-value = $1.4 \times 10^{-3}$.
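The sensitivity of the p-value to the assumed background can be reproduced with a few lines of Python. This is a minimal sketch using the tail sum of the Poisson distribution; the function name is illustrative:

```python
import math

def poisson_p_value(n_obs, b):
    """P(n >= n_obs) for n ~ Poisson(b), i.e. the p-value of the s = 0 hypothesis."""
    term = math.exp(-b)        # P(0; b)
    below = 0.0                # accumulates P(n < n_obs)
    for n in range(n_obs):
        below += term
        term *= b / (n + 1)
    return 1.0 - below

for b in (0.5, 0.8):
    print(f"b = {b}: p-value for n_obs = 5 is {poisson_p_value(5, b):.2e}")
```

A shift of the background expectation from 0.5 to 0.8 increases the p-value by almost an order of magnitude.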


The significance of a peak
Suppose we measure a value x for each event and find a histogram with a peak.
[Figure: histogram of x; each bin (observed) is a Poisson r.v., means are given by dashed lines]
In the two bins with the peak, 11 entries found with b = 3.2.
We are tempted to compute the p-value for the s = 0 hypothesis as:
$P(n \ge 11; b = 3.2, s = 0) = 5.0 \times 10^{-4}$


The significance of a peak (2)
But... did we know where to look for the peak?
→ give P(n ≥ 11) in any 2 adjacent bins
Is the observed width consistent with the expected x resolution?
→ take x window several times the expected resolution
How many bins × distributions have we looked at?
→ look at a thousand of them, you'll find a $10^{-3}$ effect
Did we adjust the cuts to "enhance" the peak?
→ freeze cuts, repeat analysis with new data
How about the bins to the sides of the peak... (too low!)
Should we publish????
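The effect of looking in many places can be illustrated with a simple trials-factor calculation. The sketch below assumes the different places looked at are statistically independent, which is only an approximation for overlapping pairs of adjacent bins; the function names and the numbers of trials are illustrative:

```python
import math

def poisson_p_value(n_obs, b):
    """Local p-value P(n >= n_obs) for n ~ Poisson(b)."""
    term = math.exp(-b)
    below = 0.0
    for n in range(n_obs):
        below += term
        term *= b / (n + 1)
    return 1.0 - below

def global_p_value(local_p, n_trials):
    """Chance of at least one effect this extreme in n_trials independent looks."""
    return -math.expm1(n_trials * math.log1p(-local_p))

local_p = poisson_p_value(11, 3.2)     # 11 entries where 3.2 are expected
print(f"local p-value: {local_p:.2e}")
for n_trials in (1, 20, 1000):
    print(f"{n_trials:5d} independent looks: global p-value = "
          f"{global_p_value(local_p, n_trials):.3f}")
```

With a thousand independent looks, an effect with a local p-value of 10^-3 appears somewhere with a probability of about 63%.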


Statistical vs. systematic errors
Statistical errors:
How much would the result fluctuate upon repetition of the measurement?
Implies some set of assumptions to define probability of outcome of the measurement.
Systematic errors:
What is the uncertainty in my result due to uncertainty in my assumptions, e.g., model (theoretical) uncertainty; modeling of measurement apparatus.
The sources of error do not vary upon repetition of the measurement. Often result from uncertain value of, e.g., calibration constants, efficiencies, etc.


Systematic errors and nuisance parameters
Response of measurement apparatus is never modeled perfectly.
[Figure: y (measured value) versus x (true value), comparing model and truth]
Model can be made to approximate better the truth by including more free parameters.
systematic uncertainty ↔ nuisance parameters

