E. Santovetti – Lesson 4: Maximum likelihood, Interval estimation

Page 1

E. Santovetti
Lesson 4
Maximum likelihood
Interval estimation

Page 2

Extended Maximum Likelihood

Sometimes the total number of measured events n is not fixed but is itself a random variable, for example a Poisson r.v. with mean ν.

The extended likelihood function is then:
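In the usual notation, with f(x; θ) the pdf of a single measurement (a standard reconstruction of the formula):

L(\nu, \theta) = \frac{\nu^n}{n!}\, e^{-\nu} \prod_{i=1}^{n} f(x_i; \theta)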

If ν is a function of θ we have:
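A sketch of the corresponding log-likelihood, dropping terms that do not depend on θ:

\ln L(\theta) = -\nu(\theta) + \sum_{i=1}^{n} \ln\left[ \nu(\theta)\, f(x_i; \theta) \right]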

Example: ν is the expected number of events of a certain process.

● Extended ML uses more information, so the errors on the parameters will be smaller than in the case where n is treated as fixed.

● In case ν does not depend on θ, we recover the usual likelihood.

Page 3

Extended ML example

Consider two types of events (e.g., signal and background), each of which predicts a given pdf for the variable x: fs(x) and fb(x).

We observe a mixture of the two event types: signal fraction θ, expected total number ν, observed total number n. Let s = θν and b = (1 - θ)ν be the expected numbers of signal and background events that we want to estimate.
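In this notation the extended log-likelihood takes the standard form (up to a constant):

\ln L(s, b) = -(s + b) + \sum_{i=1}^{n} \ln\left[ s\, f_s(x_i) + b\, f_b(x_i) \right]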

Page 4

Extended ML example (2)

Consider for the signal a Gaussian pdf and for the background an exponential:
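A sketch of the two pdfs, in an assumed notation (μ, σ for the Gaussian, ξ for the exponential slope):

f_s(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - \mu)^2 / 2\sigma^2}, \qquad f_b(x; \xi) = \frac{1}{\xi}\, e^{-x/\xi}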

Maximize log L to find the estimates of s and b.

Here the errors reflect the total Poisson fluctuation as well as the fluctuation in the signal/background proportion.
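A minimal numerical sketch of such an extended ML fit; all names, shapes and values are illustrative, not taken from the slides:

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, expon

rng = np.random.default_rng(1)
xmin, xmax = 0.0, 10.0
mu, sigma, xi = 5.0, 0.5, 4.0        # assumed "true" shape parameters
s_true, b_true = 200, 800            # expected signal and background yields

# one pseudo-experiment: both yields fluctuate as Poisson variables
x = np.concatenate([rng.normal(mu, sigma, rng.poisson(s_true)),
                    rng.exponential(xi, rng.poisson(b_true))])
x = x[(x > xmin) & (x < xmax)]       # keep events inside the fit window

def nll(pars):
    s, b = pars
    if s <= 0 or b <= 0:
        return np.inf
    f_s = norm.pdf(x, mu, sigma)     # signal pdf (negligible truncation here)
    f_b = expon.pdf(x, scale=xi) / expon.cdf(xmax, scale=xi)  # truncated bkg
    # extended negative log likelihood: (s + b) - sum ln[s f_s + b f_b]
    return (s + b) - np.sum(np.log(s * f_s + b * f_b))

res = minimize(nll, x0=[100.0, 500.0], method="Nelder-Mead")
print("s_hat, b_hat =", res.x)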

Page 5

Unphysical values for estimators

Repeat the entire MC experiment many times, allowing unphysical estimates.

Here the unphysical estimator is unbiased and should nevertheless be reported, since the average of a large number of unbiased estimates converges to the true value (cf. PDG).

Page 6

Extended ML example II

The likelihood does not provide any information on the goodness of fit. This has to be checked separately:

● Simulate toy MC samples according to the estimated pdf (using the fit results from data as "true" parameter values) and compare the maximum likelihood value in the toys to the one in data (see the sketch after this list);

● Draw the data in a (binned) histogram and compare the distribution with the result of the ML fit.
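A minimal sketch of the first check, with an illustrative one-parameter exponential model (for such a simple model the maximum log likelihood is a function of the estimator alone, so this is only a toy illustration of the procedure):

import numpy as np

rng = np.random.default_rng(2)
data = rng.exponential(1.5, 500)        # stand-in for the measured sample

def max_lnL(sample):
    # exponential model: the ML estimate of the mean is the sample mean
    tau_hat = sample.mean()
    return np.sum(-np.log(tau_hat) - sample / tau_hat)

lnL_data = max_lnL(data)
tau_fit = data.mean()                   # fit result used as "true" value

# generate toys from the fitted model and compare the max-lnL values
lnL_toys = np.array([max_lnL(rng.exponential(tau_fit, data.size))
                     for _ in range(2000)])
print("GoF p-value:", np.mean(lnL_toys <= lnL_data))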

Page 7

Extended ML example II

Again we want to distinguish (and count) signal events with respect to background events.

The signal is the process of interest: a B meson decaying into two daughter particles.

The background is combinatorial: a vertex (and hence a candidate particle) reconstructed from the wrong tracks.

To select signal from background we can use two main handles:

1) The invariant mass of the two daughter particles has to peak at the B meson mass;

2) The time of flight of the B meson candidate has to be of the order of the B meson lifetime.

These two variables behave in completely different ways for the two event categories.

Let us look at the distributions of these variables.

Page 8

Extended ML example II

A first look at the distributions (mass and time) allows us to state:

pdf for signal mass: double Gaussian

pdf for signal time: exponential (negative)

pdf for background mass: exponential (almost flat)

pdf for background time: exponential + Lorentzian

We build the pdfs and write the extended likelihood as:
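A sketch of the extended log-likelihood, assuming mass and time are independent for each category (f and g denote the mass and time pdfs):

\ln L(n_s, n_b, m_B, \tau_B) = -(n_s + n_b) + \sum_{i=1}^{n} \ln\left[ n_s\, f_s(m_i; m_B)\, g_s(t_i; \tau_B) + n_b\, f_b(m_i)\, g_b(t_i) \right]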


By maximizing the likelihood, we can estimate the numbers of signal and background events as well as the B meson mass and lifetime.

Page 9

Extended ML example II

[Figure: mass and lifetime distributions of the data with the fit overlaid, showing the signal, background and total (all) components]

The fit is done with the RooFit package (ROOT).

Page 10

Weighted maximum likelihood

Suppose we want to measure the polarization of the J/Ψ meson (1--).

The measurement can be done by looking at the angular distribution of the decay products of the meson itself:
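In the standard parametrization, with λθ, λφ and λθφ the polarization parameters:

\frac{dN}{d\Omega} \propto 1 + \lambda_\theta \cos^2\theta + \lambda_\phi \sin^2\theta \cos 2\phi + \lambda_{\theta\phi} \sin 2\theta \cos\phi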


θ and φ are respectively the polar and azimuthal angles of the positive muon in the decay J/Ψ → μ+μ-, in the J/Ψ rest frame, measured choosing the J/Ψ direction in the lab frame as the polar axis.


Page 11

Weighted likelihood – polarization measurement

We have to measure the angular distribution and fit it with the function above.

There are two main problems to face:

When we select our signal, there is an unavoidable amount of background events (evident from the mass distribution)

The angular distribution of the background events is unknown and also very difficult to parametrize

The likelihood function is:
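A sketch in the notation above:

L(\lambda) = \prod_{i=1}^{n} \frac{\varepsilon(\theta_i, \phi_i)\, P(\theta_i, \phi_i; \lambda)}{\mathrm{Norm}(\lambda)}, \qquad \mathrm{Norm}(\lambda) = \int \varepsilon(\theta, \phi)\, P(\theta, \phi; \lambda)\, d\Omega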

where ε is the total detection efficiency, P is the angular function above, and Norm is a normalization function ensuring the probability is normalized to 1.

Page 12

Weighted likelihood – polarization measurement

The efficiency factor ε(θi, φi) does not depend on the λ parameters and is therefore a constant term in the maximization procedure.

In order to take into account the background events, the likelihood sum is extended to all events, but with proper weights:
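A sketch, assuming the common side-band subtraction choice (weight +1 in the signal window and -1/2 in each of two equal-width side bands, valid for a linear background):

\ln L_w(\lambda) = \sum_{i} w_i \ln \frac{P(\theta_i, \phi_i; \lambda)}{\mathrm{Norm}(\lambda)}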


The background events contribution cancels out if:

The combinatorial background angular distributions are the same in the signal and side-band regions (this can be checked by shifting the three regions 300 MeV up);

The background mass distribution is linear. This hypothesis is well satisfied; otherwise, we can always take it into account by readjusting the weights in a proper way.

[Figure: invariant mass distribution showing the signal window with the left side band and the right side band]

Page 13

Weighted likelihood – polarization measurement

How do we evaluate the Norm function (which depends on the detector efficiency)?

We can again use the MC simulation, considering an unpolarized sample (P = 1):
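A sketch of the resulting estimate, summing over the accepted MC events:

\mathrm{Norm}(\lambda) \propto \frac{1}{N_{\mathrm{MC}}} \sum_{j=1}^{N_{\mathrm{MC}}} P(\theta_j, \phi_j; \lambda)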

and the sum is over the MC events.

Page 14

Weighted likelihood – polarization measurement

Then from the MC events we can compute the Norm function.

Page 15

The sPlot technique

Page 16

Relationship between ML and Bayesian estimators

In Bayesian statistics, both θ and x are treated as random variables.

In the Bayes approach, if θ is a certain hypothesis:
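Bayes' theorem gives the standard posterior:

p(\theta|x) = \frac{L(x|\theta)\, \pi(\theta)}{\int L(x|\theta')\, \pi(\theta')\, d\theta'}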

where π(θ) is the prior probability for θ and p(θ|x) is the posterior pdf for θ (the conditional pdf for θ given x).

Purist Bayesian: p(θ|x) contains all the information about θ.

Pragmatist Bayesian: p(θ|x) can be a complicated function: summarize it by using a new estimator (e.g., the posterior mode or mean).

Looking at p(θ|x): what do we use for π(θ)? There is no golden rule (it is subjective!); 'prior ignorance' is often represented by π(θ) = constant, in which case:
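the posterior is proportional to the likelihood:

p(\theta|x) \propto L(x|\theta)

and the posterior mode coincides with the ML estimate.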

But... we could have used a different parameter, e.g., λ = 1/θ, and if prior π(θ) is constant, then π(λ) is not! ‘Complete prior ignorance’ is not well defined.

Page 17

Relationship between ML and Bayesian estimators

The main concern expressed by frequentist statisticians regarding the use of Bayesian probability is its intrinsic dependence on a prior probability that can be chosen in an arbitrary way. This arbitrariness makes Bayesian probability to some extent subjective.

Adding more measurements increases one's knowledge of the unknown parameter, hence the posterior probability will depend less on, and be less sensitive to, the choice of the prior probability. When a large number of measurements is available, the results of Bayesian calculations tend in most cases to be identical to those of frequentist calculations.

Many interesting statistical problems arise in cases of low statistics, i.e. a small number of measurements. In those cases, Bayesian and frequentist methods usually lead to different results; using the Bayesian approach, the choice of the prior probabilities then plays a crucial role and has great influence on the results.

One main difficulty is how to choose a PDF that models one's complete ignorance of an unknown parameter. One could naively choose a uniform ("flat") PDF in the interval of validity of the parameter. But it is clear that if we change the parametrization from x to a function of x (say log x or 1/x), the transformed parameter will no longer have a uniform prior PDF.


Page 18

The Jeffreys prior

One possible approach has been proposed by Harold Jeffreys: adopt a prior PDF that is invariant under parameter transformations. This choice is:
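In the standard form:

\pi(\theta) \propto \sqrt{\det I(\theta)}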

with:
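the Fisher information matrix, in its usual definition:

I(\theta)_{ij} = -E\left[ \frac{\partial^2 \ln L(x; \theta)}{\partial \theta_i\, \partial \theta_j} \right]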

The determinant of the Fisher information matrix makes this prior invariant under reparametrization.

Examples of Jeffreys prior distributions for some important parameters:
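For reference, the standard results for some common cases are:

● Poisson mean μ: π(μ) ∝ 1/√μ
● Gaussian mean μ (σ known): π(μ) ∝ constant
● Gaussian standard deviation σ (mean known): π(σ) ∝ 1/σ
● Binomial efficiency ε: π(ε) ∝ 1/√(ε(1 - ε))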

Page 19

Interval estimation, setting limits

Page 20

Interval estimation — introduction

In addition to a ‘point estimate’ of a parameter we should report an interval reflecting its statistical uncertainty.

Desirable properties of such an interval may include:

● communicate objectively the result of the experiment;

● have a given probability of containing the true parameter;

● provide the information needed to draw conclusions about the parameter, possibly incorporating stated prior beliefs.

Often one uses ± the estimated standard deviation of the estimator. In some cases, however, this is not adequate:

● estimate near a physical boundary, e.g., an observed event rate consistent with zero.

We will look briefly at Frequentist and Bayesian intervals.


Page 21

Neyman confidence intervals

A rigorous procedure to get confidence intervals in the frequentist approach.

Consider an estimator θ̂ for a (measurable) parameter θ.

We also need the pdf of the estimator, g(θ̂; θ).

Specify upper and lower tail probabilities, e.g., α = 0.05, β = 0.05, then find functions uα(θ) and vβ(θ) such that
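In the standard construction these are defined by the tail probabilities of g(θ̂; θ):

\alpha = P(\hat{\theta} \ge u_\alpha(\theta)) = \int_{u_\alpha(\theta)}^{\infty} g(\hat{\theta}; \theta)\, d\hat{\theta}, \qquad \beta = P(\hat{\theta} \le v_\beta(\theta)) = \int_{-\infty}^{v_\beta(\theta)} g(\hat{\theta}; \theta)\, d\hat{\theta}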


We obtain an interval [v_β(θ), u_α(θ)] that contains the estimator with probability CL = 1 - α - β. It is a function of the true parameter value θ: this is an interval for the estimator, not yet for the parameter.

There is no unique way to define this interval with the same CL.


Page 22

Confidence interval from the confidence belt

[Figure: the confidence belt region.] Find the points where the observed estimate intersects the confidence belt: this gives the confidence interval [a, b] for the true parameter.

Confidence level = 1 - α - β = probability for the interval to cover the true value of the parameter (this holds for any possible true θ), with the belt boundaries u_α(θ) and v_β(θ) being functions of the parameter.

Page 23

Confidence intervals by inverting a test

Confidence intervals for a parameter θ can be found by defining a test of the hypothesized value θ (do this for all θ):

● Define values of the data that are ‘disfavored’ by θ (critical region) such that P(data in critical region) ≤ γ for a specified γ, e.g., 0.05 or 0.1.

● If the data are observed in the critical region, reject the value θ.

Now invert the test to define a confidence interval as:

● the set of θ values that would not be rejected in a test of size γ (the confidence level is 1 - γ). We have to scan over many θ values...

The interval will cover the true value of θ with probability ≥ 1 - γ.

Equivalent to confidence belt construction; confidence belt is acceptance region of a test.

Page 24

Relation between confidence interval and p-value

Equivalently we can consider a significance test for each hypothesized value of θ, resulting in a p-value, pθ.


The confidence interval at CL = 1–γ consists of those values of θ that are not rejected.

E.g. an upper limit on θ is the greatest value for which pθ ≥ γ.

In practice we find the limit by setting pθ = γ and solving for θ.

Page 25

Confidence intervals in practice

In practice, in order to find the interval [a, b] we have to solve:

a is the hypothetical value of θ such that α = P(θ̂ ≥ θ̂_obs; a): we replace u_α(θ) with the observed estimate θ̂_obs and solve for a;

b is the hypothetical value of θ such that β = P(θ̂ ≤ θ̂_obs; b): we replace v_β(θ) with θ̂_obs and solve for b.

Page 26

Meaning of a confidence interval

Important to keep in mind:
● The interval is random
● The true θ is an unknown constant

Often we report this interval as the estimate with asymmetric errors, i.e. θ ∈ [a, b].

This does not mean P(a ≤ θ ≤ b) = 1 - α - β for the specific interval we obtained,

but rather: repeat the measurements many times and build the interval according to the same prescription each time; in a fraction 1 - α - β of the experiments the interval will contain θ.

Page 27

Central vs. one-sided confidence intervals

For fixed CL, the choice of α and β is not unique; in the literature this choice is called the ordering rule.

Sometimes only α or only β is specified: one-sided interval (limit).

Often α = β = γ/2, with coverage probability 1 - γ: central confidence interval.
● N.B.: a central confidence interval does not mean a symmetric interval around θ̂.

In HEP the convention to quote the error is:

α = β = γ/2 with 1 - γ = 68.3% (1σ)

Page 28

Intervals from the likelihood function

In the large sample limit it can be shown for ML estimators:
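A sketch of the standard large-sample result: the estimator is distributed as

g(\hat{\theta}; \theta) = \frac{1}{(2\pi)^{N/2} |V|^{1/2}} \exp\left[ -\frac{1}{2} (\hat{\theta} - \theta)^T V^{-1} (\hat{\theta} - \theta) \right]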

This is an N-dimensional Gaussian with covariance matrix V. If θ̂ follows such a multi-dimensional Gaussian, the condition ln L(θ) ≥ ln L_max - Q_γ/2 (with Q_γ the (1 - γ) quantile of the χ² distribution with N degrees of freedom) defines a hyper-ellipsoidal confidence region.

Page 29

Approximate confidence regions from L(θ)

So the recipe to find the confidence region with CL = 1 - γ is:
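In the standard form, with Q_γ the (1 - γ) quantile of the χ² distribution with N degrees of freedom, take the region of θ values with:

\ln L(\theta) \ge \ln L_{max} - \frac{Q_\gamma}{2}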

For finite samples, these are approximate confidence regions.
● Coverage probability not guaranteed to be exactly equal to 1 - γ;
● no simple theorem to say by how far off it will be (use MC).

Remember here the interval is random, not the parameter.

Page 30

Example of interval from ln L(θ)

For n = 1 parameter and CL = 1 - γ = 0.683, Q_γ = 1.
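This is the familiar rule: the 68.3% interval is the set of θ values with

\ln L(\theta) \ge \ln L_{max} - \frac{1}{2}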

Page 31

Setting limits on Poisson parameter

Consider again the case in which we have a sample of events that contains signal and background (with means s and b), both Poisson variables.
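The observed number of events n then follows a Poisson distribution with mean s + b:

f(n; s, b) = \frac{(s + b)^n}{n!}\, e^{-(s + b)}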

Suppose we can say how many background events we expect (b is known).

Unfortunately we observe a number of events compatible with the expected background.

There is clearly no evidence of signal. This means that we cannot exclude s = 0; we can nevertheless put an upper limit on the number of signal events.

Page 32

Upper limit for Poisson parameter

We have to find the hypothetical s such that there is a given small probability, say γ = 0.05, of finding as few events as we did or fewer.
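In the standard form, the upper limit s_up is defined by:

\gamma = P(n \le n_{obs}; s_{up}, b) = \sum_{n=0}^{n_{obs}} \frac{(s_{up} + b)^n}{n!}\, e^{-(s_{up} + b)}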

Solving numerically for s gives the upper limit s_up at a confidence level of 1 - γ (usually 0.95).

Suppose b = 0 and we find n = 0.
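In this special case the sum reduces to a single term and the limit is analytic (the well-known 'rule of three'):

\gamma = e^{-s_{up}} \quad \Rightarrow \quad s_{up} = -\ln\gamma \approx 3.0 \text{ at } 95\% \text{ CL}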

Page 33

Calculating Poisson parameter limits

To find the lower and upper limits we can use the relation between the cumulative Poisson distribution and the χ² distribution. One finds:
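A sketch of the standard quantile formulas, with F^{-1}_{\chi^2} the χ² quantile function:

s_{lo} = \tfrac{1}{2} F^{-1}_{\chi^2}(\alpha;\, 2n) - b, \qquad s_{up} = \tfrac{1}{2} F^{-1}_{\chi^2}(1 - \gamma;\, 2(n + 1)) - b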

For a downward fluctuation of n this can give a negative result for s_up, i.e. an empty confidence interval.

Page 34

Limits near a physical boundary

Suppose e.g. b = 2.5 and we observe n = 0. If we choose CL = 0.90, we find from the formula for s_up:
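Plugging the numbers into the quantile formula:

s_{up} = \tfrac{1}{2} F^{-1}_{\chi^2}(0.90;\, 2) - 2.5 \approx 2.30 - 2.5 = -0.20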

negative!?

Physicist:

We already knew s ≥ 0 before we started; we can't use a negative upper limit to report the result of an expensive experiment!

Statistician:

The interval is designed to cover the true value only 90% of the time - this was clearly not one of those times.

This is not an uncommon dilemma when the limit of a parameter is close to a physical boundary.

Page 35

Expected limit for s = 0

Physicist: I should have used CL = 0.95, then s_up = 0.496;

even better: for CL = 0.917923 we get s_up = 10^-4!

We are not taking into account the background fluctuation

Reality check: with b = 2.5, the typical Poisson fluctuation in n is at least √2.5 = 1.6. How can the limit be so low?

Look at the mean limit for the no-signal hypothesis (s = 0) (sensitivity).

[Figure: distribution of 95% CL upper limits with b = 2.5, s = 0; mean upper limit = 4.44. With N MC pseudo-experiments, Poisson with μ = 2.5, extract n and evaluate s_up at 95% CL.]
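A minimal sketch of this sensitivity study (illustrative code; it reproduces a mean limit of about 4.4):

import numpy as np
from scipy.stats import chi2

b = 2.5
rng = np.random.default_rng(3)

def s_up(n, b, cl=0.95):
    # classical Poisson upper limit via the chi2 quantile relation
    return 0.5 * chi2.ppf(cl, 2 * (n + 1)) - b

n_obs = rng.poisson(b, 100_000)  # s = 0: counts fluctuate on background only
print("mean 95% CL upper limit:", s_up(n_obs, b).mean())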

Page 36

The “flip-flopping” problem

In order to determine confidence intervals, a consistent choice of ordering rule has to be adopted.

Feldman and Cousins demonstrated that the ordering rule choice must not depend on the outcome of the measurements, otherwise the quoted confidence intervals or upper limits could be incorrect.

In some cases, experiments searching for a rare signal choose, when quoting their result, to switch from a central interval to an upper limit depending on the outcome of the measurement.

A typical choice is to quote an upper limit if the significance of the observed signal is smaller than 3σ, and a central value otherwise.

We then have to quote the error at a fixed CL, say 90%.

If x ≥ 3σ we choose a symmetric interval (5% on each side), while if x < 3σ an upper limit implies a completely asymmetric interval.

Page 37

The “flip-flopping” problem

From a single measurement of x we can decide to quote an interval with a certain CL if x > 3σ,

or we can decide to quote only an upper limit if our measurement is x<3σ.

Page 38

The “flip-flopping” problem

The choice to switch from a central interval to a fully asymmetric interval (upper limit) based on the observation of x clearly spoils the statistical coverage.

Looking at the figure: depending on the value of μ, for the interval [x1, x2] obtained by crossing the confidence belt with a horizontal line, one may have cases where the coverage decreases from 90% to 85%, lower than the desired CL.

To avoid flip-flopping, decide before the measurement whether you will quote a limit or a two-sided interval, and stick to it. Or use Feldman-Cousins.

Page 39

The Feldman Cousins method

The ordering rule proposed by Feldman and Cousins provides a Neyman confidence belt that smoothly changes from a central or quasi-central interval to an upper limit in the case of low observed signal yield.

The ordering rule is based on the likelihood ratio. Given a value θ0 of the unknown parameter under a Neyman construction, the chosen interval on the variable x is defined from the ratio of two pdfs of x: one under the hypothesis that θ equals the considered fixed value θ0, the other under the hypothesis that θ equals the maximum-likelihood estimate θ_best(x) corresponding to the given measurement x.
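In formulas, the ordering variable is the likelihood ratio

R(x) = \frac{f(x; \theta_0)}{f(x; \hat{\theta}_{best}(x))}

and, for each θ0, values of x are added to the acceptance interval in decreasing order of R(x) until the desired probability content 1 - γ is reached.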

Page 40

Feldman Cousins: Gaussian case

Let us apply the Feldman-Cousins method to a Gaussian distribution.
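In the standard textbook case of unit variance and physical boundary μ ≥ 0, the best-fit value is μ_best(x) = max(0, x), and the ordering variable is R(x) = f(x|μ) / f(x|μ_best).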

When we divide by f(x|μ_best) we obtain:
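The standard result is:

R(x) = \begin{cases} e^{-(x - \mu)^2 / 2} & x \ge 0 \\ e^{x\mu - \mu^2 / 2} & x < 0 \end{cases}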

This is an asymmetric function, with a longer tail towards negative x values.

Using the Feldman-Cousins approach, for large x we recover the usual symmetric confidence interval.

Going to small x (close to the boundary), the interval becomes more and more asymmetric, and at a certain point it becomes a completely asymmetric interval (upper limit).

Page 41

The Bayesian approach

In Bayesian statistics we need to start with a prior pdf π(θ); this reflects the degree of belief about θ before doing the experiment.

Bayes’ theorem tells how our beliefs should be updated in light of the data x:
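In the standard form:

p(\theta|x) = \frac{L(x|\theta)\, \pi(\theta)}{\int L(x|\theta')\, \pi(\theta')\, d\theta'}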

Then we integrate the posterior probability up to the desired credibility level.

For the Poisson case, suppose 95% CL, we have:
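A sketch of the defining relation for the 95% credible upper limit:

0.95 = \int_{0}^{s_{up}} p(s|n)\, ds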

Page 42

Bayesian prior for Poisson parameter

Include the knowledge that s ≥ 0 by setting the prior π(s) = 0 for s < 0. One often tries to reflect 'prior ignorance' with e.g.
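a flat prior:

\pi(s) = \begin{cases} 1 & s \ge 0 \\ 0 & s < 0 \end{cases}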

Not normalized but this is OK as long as L(s) dies off for large s.

Not invariant under change of parameter — if we had used instead a flat prior for, say, the mass of the Higgs boson, this would imply a non-flat prior for the expected number of Higgs events.

Does not really reflect a reasonable degree of belief, but often used as a point of reference;

or viewed as a recipe for producing an interval whose frequentist properties can be studied (coverage will depend on true s).

Page 43

Bayesian interval with flat prior for s
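Combining the Poisson likelihood with the flat prior, the upper limit is defined by the standard relation:

1 - \gamma = \int_{0}^{s_{up}} p(s|n)\, ds = \frac{\int_{0}^{s_{up}} L(n|s)\, \pi(s)\, ds}{\int_{0}^{\infty} L(n|s)\, \pi(s)\, ds}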

Solve numerically to find the limit s_up.

For special case b = 0, Bayesian upper limit with flat prior numerically same as classical case (‘coincidence’).

Otherwise the Bayesian limit is everywhere greater than the classical one ('conservative').

Never goes negative.

Doesn’t depend on b if n = 0