Lectures prepared by: Elchanan Mossel, Yelena Shvets
Introduction to Probability
Stat 134, Fall 2005
Berkeley
Follows Jim Pitman’s book:
Probability, Section 2.2
Binomial Distribution: Mean
Recall: the Mean μ of a binomial(n,p) distribution is given by:
μ = #Trials × P(success) = np
• The Mean ≈ Mode (most likely value); the two are equal when np is an integer.
• The Mean = “Center of gravity” of the distribution.
Standard Deviation
The Standard Deviation (SD) of a distribution, denoted σ, measures the spread of the distribution.
Def: The Standard Deviation (SD) of the binomial(n,p) distribution is given by:
$$\sigma = \sqrt{np(1-p)}$$
Binomial Distribution, Example 1:
• Let’s compare Bin(100,1/4) to Bin(100,1/2).
• μ1 = 25, μ2 = 50
• σ1 = 4.33, σ2 = 5
• We expect Bin(100,1/4) to be less spread than Bin(100,1/2).
• Indeed, you are more likely to guess the value of a Bin(100,1/4) variable than of a Bin(100,1/2) variable, since:
For p=1/2 : P(50) ≈ 0.079589237;
For p=1/4 : P(25) ≈ 0.0918;
[Figure: histograms for binomial(100, 1/4) and binomial(100, 1/2); horizontal axis 0 to 100, vertical axis 0 to 0.12]
Binomial Distribution, Example 2:
• Let’s compare Bin(50,1/2) to Bin(100,1/2).
• μ1 = 25, μ2 = 50
• σ1 = 3.54, σ2 = 5
• We expect Bin(50,1/2) to be less spread than Bin(100,1/2).
• Indeed, you are more likely to guess the value of a Bin(50,1/2) variable than of a Bin(100,1/2) variable, since:
For n=100 : P(50) ≈ 0.079589237;
For n=50 : P(25) ≈ 0.1123;
[Figure: histograms for binomial(50, 1/2) and binomial(100, 1/2); horizontal axis 0 to 100, vertical axis 0 to 0.12]
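The comparisons in Examples 1 and 2 can be reproduced with a few lines of Python. This is only a sketch: the helper names `binomial_pmf` and `mean_and_sd` are invented here for illustration and do not come from the lecture.

```python
from math import comb, sqrt

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact P(k successes) for the binomial(n, p) distribution."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def mean_and_sd(n: int, p: float) -> tuple[float, float]:
    """Mean mu = n p and standard deviation sigma = sqrt(n p (1 - p))."""
    return n * p, sqrt(n * p * (1 - p))

# The three distributions compared in Examples 1 and 2.
for n, p in [(100, 0.25), (100, 0.5), (50, 0.5)]:
    mu, sigma = mean_and_sd(n, p)
    peak = binomial_pmf(round(mu), n, p)  # probability of hitting the mean exactly
    print(f"Bin({n},{p}): mu={mu:g}, sigma={sigma:.2f}, P(mu)={peak:.6f}")
```

In each pair, the distribution with the smaller SD should show the larger probability at its mean, which is the sense in which it is "less spread".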
μ, σ, and the normal curve
• We will see today that the Mean (μ) and the SD (σ) give a very good summary of the binom(n,p) distribution via the Normal Curve with parameters μ and σ.
[Figure: histograms of binomial(n, ½) for n = 50, 100, 250, 500, each labeled with its (μ, σ): (25, 3.54), (50, 5), (125, 7.91), (250, 11.2); horizontal axis 0 to 300, vertical axis 0 to 0.12]
Normal Curve
$$y = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
[Figure: the normal curve, y against x; vertical axis 0 to 0.5]
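The Normal Distribution
$$P(a,b) = \int_{a}^{b} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$
$$P(-\infty,a) = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$
[Figures: the area under the normal curve between a and b, and the area to the left of a; vertical axes 0 to 0.5]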
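The Normal Distribution
$$P(a,b) = \int_{a}^{b} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = \int_{-\infty}^{b} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx - \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$
[Figure: the area between a and b on the x axis]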
Standard Normal
Standard Normal Density Function:
$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$$
Corresponds to μ = 0 and σ = 1.
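Standard Normal
Standard Normal Cumulative Distribution Function:
$$\Phi(z) = \int_{-\infty}^{z} \phi(x)\,dx$$
For the normal(μ,σ) distribution:
$$P(a,b) = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)$$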
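In practice Φ is usually evaluated through the error function, using the standard identity Φ(z) = ½(1 + erf(z/√2)). The sketch below assumes Python's `math.erf`; the function names `phi`, `Phi` and `normal_prob` are chosen here for illustration, and the example interval is arbitrary.

```python
from math import erf, exp, pi, sqrt

def phi(x: float) -> float:
    """Standard normal density: exp(-x^2 / 2) / sqrt(2 pi)."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def Phi(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_prob(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a, b) for the normal(mu, sigma) distribution."""
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)

print(Phi(0.0))
print(normal_prob(45, 55, mu=50, sigma=5))  # one SD either side of the mean
```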
Standard Normal
[Figure: graphs of the density φ(x) (vertical axis 0 to 0.5) and of the cumulative distribution function Φ(z) (vertical axis 0 to 1.0); horizontal axes −5 to 5]
Standard Normal
$$\phi(0) = \frac{1}{\sqrt{2\pi}}$$
[Figure: graphs of φ(x) and Φ(z) on horizontal axes −5 to 5, marking the points −z and z, the area Φ(z) to the left of z, and the tail areas 1 − Φ(z) and Φ(−z)]
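Standard Normal
For the normal(μ,σ) distribution:
$$P(a,b) = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)$$
where Φ is the Standard Normal Cumulative Distribution Function.
In order to prove this, it suffices to show:
$$P(-\infty,a) = \Phi\!\left(\frac{a-\mu}{\sigma}\right)$$
Change of variables:
$$s = \frac{x-\mu}{\sigma}, \qquad ds = \frac{dx}{\sigma}, \qquad x = a \;\Rightarrow\; s = \frac{a-\mu}{\sigma}$$
$$\int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = \int_{-\infty}^{\frac{a-\mu}{\sigma}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{s^2}{2}}\,ds = \int_{-\infty}^{\frac{a-\mu}{\sigma}} \phi(s)\,ds = \Phi\!\left(\frac{a-\mu}{\sigma}\right)$$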
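The change of variables can be sanity-checked numerically: integrate the normal(μ,σ) density up to a and compare with Φ((a−μ)/σ). The sketch below uses a midpoint rule and truncates the lower limit at μ − 12σ; the truncation point, the step count and the values of μ, σ and a are all arbitrary choices made for this illustration.

```python
from math import erf, exp, pi, sqrt

def normal_density(x: float, mu: float, sigma: float) -> float:
    """Density of the normal(mu, sigma) distribution at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

def Phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def left_tail_by_integration(a: float, mu: float, sigma: float, steps: int = 100_000) -> float:
    """Midpoint-rule estimate of the integral of the density over (-inf, a).

    The lower limit is truncated at mu - 12 sigma, where the density is
    negligible (an assumption of this sketch, not part of the proof).
    """
    lo = mu - 12 * sigma
    if a <= lo:
        return 0.0
    h = (a - lo) / steps
    return h * sum(normal_density(lo + (i + 0.5) * h, mu, sigma) for i in range(steps))

mu, sigma, a = 50.0, 5.0, 43.0  # arbitrary illustrative values
lhs = left_tail_by_integration(a, mu, sigma)  # integral in the x variable
rhs = Phi((a - mu) / sigma)                   # same area after substituting s
print(lhs, rhs)
```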
Properties of Φ:
Φ(0) = 1/2;
Φ(−z) = 1 − Φ(z);
Φ(−∞) = 0;
Φ(∞) = 1.
Φ does not have a closed form formula!
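Normal Approximation of a binomial
For n independent trials with success probability p:
$$P(a \text{ to } b \text{ successes}) \approx \Phi\!\left(\frac{b + \frac{1}{2} - \mu}{\sigma}\right) - \Phi\!\left(\frac{a - \frac{1}{2} - \mu}{\sigma}\right)$$
where:
$$\mu = np, \qquad \sigma = \sqrt{np(1-p)}$$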
[Figure: histograms of binomial(n, ½) for n = 50, 100, 250, 500, shown alone and then overlaid with the Normal(μ, σ) curves for (μ, σ) = (25, 3.54), (50, 5), (125, 7.91), (250, 11.2); horizontal axis 0 to 300, vertical axis 0 to 0.12]
Normal Approximation of a binomial
When approximating P(a to b successes) for n independent trials with success probability p:
• The 0.5 correction to the endpoints a and b is called the “continuity correction”.
• It is important when σ is small or when a and b are close.
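The effect of the continuity correction can be seen by comparing the approximation, with and without the 0.5 shift, against the exact binomial sum. This is a minimal sketch: the function names and the `correction` parameter are invented for illustration, and the intervals tried are arbitrary.

```python
from math import comb, erf, sqrt

def Phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def exact_prob(a: int, b: int, n: int, p: float) -> float:
    """Exact P(a to b successes) in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(a, b + 1))

def normal_approx(a: int, b: int, n: int, p: float, correction: float = 0.5) -> float:
    """Normal approximation to P(a to b successes).

    correction=0.5 applies the continuity correction; 0.0 switches it off.
    """
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return Phi((b + correction - mu) / sigma) - Phi((a - correction - mu) / sigma)

n, p = 100, 0.5
for a, b in [(45, 55), (50, 50)]:  # a wide interval, then a and b equal
    print(a, b,
          exact_prob(a, b, n, p),
          normal_approx(a, b, n, p),
          normal_approx(a, b, n, p, correction=0.0))
```

When a = b the uncorrected approximation collapses to zero, which illustrates why the correction matters most when a and b are close.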
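Normal Approximation
Question: Find P(H > 40) in 100 tosses.
Normal Approximation to bin(100,1/2):
$$y = \frac{1}{\sqrt{2\pi}\cdot 5}\, e^{-\frac{(x-50)^2}{2\cdot 5^2}}$$
[Figure: histogram of bin(100,1/2) with the approximating normal curve; vertical axis 0 to 0.12]
$$P(\#H > 40) \approx 1 - \Phi\!\left(\frac{40-50}{5}\right) = 1 - \Phi(-2) = \Phi(2) \approx 0.9772$$
Exact = 0.971556
What do we get with continuity correction?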
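One way to work this out, as a sketch: read "more than 40 heads" as 41 to 100 heads, so the corrected lower endpoint is 40.5, and treat the upper term Φ((100.5 − 50)/5) as 1.

$$P(\#H > 40) = P(41 \le \#H \le 100) \approx 1 - \Phi\!\left(\frac{40.5 - 50}{5}\right) = 1 - \Phi(-1.9) = \Phi(1.9) \approx 0.9713$$

Under these assumptions the corrected value is closer to the exact 0.971556 than the uncorrected 0.9772.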
When does the Normal Approximation fail?
[Figure: histogram of bin(1,1/2), with μ = 0.5, σ = 0.5, against the N(0.5, 0.5) curve; vertical axis 0 to 1.00]
When does the NA fail?
[Figure: histogram of bin(100,1/100), with μ = 1, σ ≈ 1, against the N(1, 1) curve; horizontal axis 0 to 10, vertical axis 0 to 1]
Rule of Thumb
Normal works better:
• The larger σ is.
• The closer p is to ½.
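The rule of thumb can be illustrated by measuring how far the normal approximation, with continuity correction, strays from the exact probability of each single value k. The sketch below is only an illustration: `worst_error` is a made-up summary (the largest gap over all k), and the three choices of p are arbitrary.

```python
from math import comb, erf, sqrt

def Phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def exact_pmf(k: int, n: int, p: float) -> float:
    """Exact P(k successes) for binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def approx_pmf(k: int, n: int, p: float) -> float:
    """Normal approximation to P(k successes), with continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return Phi((k + 0.5 - mu) / sigma) - Phi((k - 0.5 - mu) / sigma)

def worst_error(n: int, p: float) -> float:
    """Largest gap between exact and approximate P(k) over k = 0..n."""
    return max(abs(exact_pmf(k, n, p) - approx_pmf(k, n, p)) for k in range(n + 1))

for n, p in [(100, 0.5), (100, 0.1), (100, 0.01)]:
    sigma = sqrt(n * p * (1 - p))
    print(f"bin({n},{p}): sigma={sigma:.2f}, worst error={worst_error(n, p):.4f}")
```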
Fluctuation in the number of successes.
From the normal approximation it follows that:
P(μ − σ to μ + σ successes in n trials) ≈ 68%
P(μ − 2σ to μ + 2σ successes in n trials) ≈ 95%
P(μ − 3σ to μ + 3σ successes in n trials) ≈ 99.7%
P(μ − 4σ to μ + 4σ successes in n trials) ≈ 99.99%
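These percentages are the normal-curve areas within k SDs of the mean, Φ(k) − Φ(−k), which equals erf(k/√2). A short sketch (the function name `within_k_sd` is invented here):

```python
from math import erf, sqrt

def within_k_sd(k: float) -> float:
    """Normal-curve probability of landing within k SDs of the mean.

    Equals Phi(k) - Phi(-k), which simplifies to erf(k / sqrt(2)).
    """
    return erf(k / sqrt(2))

for k in (1, 2, 3, 4):
    print(f"within {k} SD: {100 * within_k_sd(k):.4f}%")
```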
Fluctuation in the proportion of successes.
Typical size of fluctuation in the number of successes is:
$$\sigma = \sqrt{np(1-p)}$$
Typical size of fluctuation in the proportion of successes is:
$$\frac{\sigma}{n} = \sqrt{\frac{p(1-p)}{n}}$$
Square Root Law
Let n be a large number of independent trials with probability of success p on each.
The number of successes will, with high probability, lie in an interval, centered on the mean np, with a width a moderate multiple of √n.
The proportion of successes will, with high probability, lie in a small interval centered on p, with a width a moderate multiple of 1/√n.
Law of large numbers
Let n be a number of independent trials, with probability p of success on each.
For each ε > 0:
P(|#successes/n − p| < ε) → 1, as n → ∞
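The convergence can be watched directly for a fair coin (p = 1/2), where the probability is an exact ratio of integers. The sketch below assumes ε = 1/20 and a handful of values of n, both chosen only for illustration; `prob_within` is a name invented here.

```python
from fractions import Fraction

def prob_within(n: int, eps: Fraction) -> float:
    """Exact P(|#successes/n - 1/2| < eps) for n tosses of a fair coin.

    Binomial coefficients are built up as exact integers, which avoids
    floating-point underflow for large n.
    """
    half = Fraction(1, 2)
    favorable = 0
    c = 1  # C(n, 0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - half) < eps:
            favorable += c
        c = c * (n - k) // (k + 1)  # C(n, k+1) from C(n, k); the division is exact
    return favorable / 2**n

eps = Fraction(1, 20)  # illustrative tolerance of 0.05
for n in (10, 100, 1000, 10000):
    print(n, prob_within(n, eps))
```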
Confidence intervals
Suppose we observe the results of n trials with an unknown probability of success p.
The observed frequency of successes:
$$\hat{p} = \frac{\#\text{successes}}{n}$$
Conf Intervals
The Normal Curve Approximation says that for any fixed p and n large enough, there is a 99.99% chance that the observed frequency p̂ will differ from p by less than
$$4\sqrt{\frac{p(1-p)}{n}}.$$
It's easy to see that √(p(1−p)) ≤ 1/2, so
$$4\sqrt{\frac{p(1-p)}{n}} \le \frac{2}{\sqrt{n}}.$$
Conf Intervals
$$\left(\hat{p} - \frac{2}{\sqrt{n}},\; \hat{p} + \frac{2}{\sqrt{n}}\right)$$
is called a 99.99% confidence interval.
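As a minimal sketch of how the interval is computed from data: the function name `confidence_interval` and the counts used below are hypothetical, made up purely for illustration.

```python
from math import sqrt

def confidence_interval(successes: int, n: int) -> tuple[float, float]:
    """Conservative interval (p_hat - 2/sqrt(n), p_hat + 2/sqrt(n))."""
    p_hat = successes / n  # observed frequency of successes
    margin = 2 / sqrt(n)   # bound on 4 * sqrt(p (1 - p) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data, invented for illustration: 5200 successes in 10000 trials.
low, high = confidence_interval(5200, 10_000)
print(f"p is estimated to lie in ({low:.2f}, {high:.2f})")
```

Because the margin 2/√n does not depend on p, halving the width of the interval takes four times as many trials.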
[Figures: histograms on a horizontal axis 0 to 100, vertical axis 0 to 0.12, with the following labels]
bin(50,1/2): μ = 25, σ = 3.535534
bin(50,1/2) + 25: μ = 50, σ = 3.535534
Normal Approximation
$$y = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
bin(100,1/2): μ = 50, σ = 5
(bin(50,1/2) + 25) × 5/3.535534: μ = 50, σ = 5
Notre Dame de Rheims
Quote
Soon I was able to show a few sketches. No one understood anything of them. Even those who were favorable to my perception of the truths that I wanted afterwards to engrave in the temple congratulated me on having discovered them with a “microscope”, when I had, on the contrary, used a telescope to perceive things that were indeed very small, but only because they were situated at a great distance […]. Where I was looking for great laws, I was called a rummager of details. (TR, p.346)