7 Sampling Distributions and the Central Limit Theorem
(people.stat.sc.edu/wang528/stat 512/stat 512...)
7.1 Introduction
Example 7.1. Suppose that $Y_1, \ldots, Y_n$ is an iid sample from $f_Y(y)$. Each of the following is a statistic:
• $T(Y_1, \ldots, Y_n) = \bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$
• $T(Y_1, \ldots, Y_n) = \frac{1}{2}\left[Y_{(n/2)} + Y_{(n/2+1)}\right]$ if $n$ is even.
• $T(Y_1, \ldots, Y_n) = Y_{(1)}$
• $T(Y_1, \ldots, Y_n) = Y_{(n)} - Y_{(1)}$
• $T(Y_1, \ldots, Y_n) = S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$
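As a minimal sketch, the five statistics listed above can be computed directly from a sample. The function names and the data values below are hypothetical and chosen only for illustration.

```python
# Each function maps a sample y_1, ..., y_n to a single number, i.e. a statistic.

def sample_mean(y):
    return sum(y) / len(y)

def sample_median_even(y):
    # Average of the order statistics Y_(n/2) and Y_(n/2+1); requires even n.
    s = sorted(y)
    n = len(s)
    if n % 2 != 0:
        raise ValueError("this formula applies only when n is even")
    return 0.5 * (s[n // 2 - 1] + s[n // 2])  # 0-based indices

def sample_minimum(y):
    return min(y)  # Y_(1)

def sample_range(y):
    return max(y) - min(y)  # Y_(n) - Y_(1)

def sample_variance(y):
    ybar = sample_mean(y)
    return sum((v - ybar) ** 2 for v in y) / (len(y) - 1)

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # illustrative values only
for stat in (sample_mean, sample_median_even, sample_minimum,
             sample_range, sample_variance):
    print(stat.__name__, stat(data))
```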
7.2 Sampling distributions related to the normal distribution
Example 7.2. Suppose that $Y_1, \ldots, Y_n$ is an iid sample from a $N(\mu, \sigma^2)$ distribution. What is the distribution of the sample mean $\bar{Y}$?
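One possible route to the answer is through moment generating functions. The sketch below assumes the standard normal mgf $m_Y(t) = \exp(\mu t + \sigma^2 t^2/2)$ and the independence of the $Y_i$; the method is an assumption here and may differ from the argument given in lecture.

```latex
\begin{align*}
m_{\bar{Y}}(t)
  &= E\left[e^{t\bar{Y}}\right]
   = \prod_{i=1}^{n} E\left[e^{(t/n) Y_i}\right]
   = \left[m_Y(t/n)\right]^{n} \\
  &= \left[\exp\left(\frac{\mu t}{n} + \frac{\sigma^2 t^2}{2n^2}\right)\right]^{n}
   = \exp\left(\mu t + \frac{(\sigma^2/n)\, t^2}{2}\right).
\end{align*}
```

This is the mgf of a $N(\mu, \sigma^2/n)$ distribution, so $\bar{Y} \sim N(\mu, \sigma^2/n)$.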
Example 7.3. In the interest of pollution control, an experimenter records $Y$, the amount of bacteria per unit volume of water (measured in mg/cm^3). The population distribution for $Y$ is assumed to be normal with mean $\mu = 48$ and variance $\sigma^2 = 100$. That is, $Y \sim N(\mu, \sigma^2)$.
(a) What is the probability that a single water specimen’s bacteria amount will exceed 50 mg/cm^3?
(b) Suppose that the experimenter takes a random sample of $n = 100$ water specimens, and denote the observations by $Y_1, \ldots, Y_{100}$. What is the probability that the sample mean $\bar{Y}$ will exceed 50 mg/cm^3?
(c) How large should the sample size $n$ be so that $P(\bar{Y} > 50) < 0.01$?
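The three parts of Example 7.3 can be checked numerically. This is a sketch that uses the result $\bar{Y} \sim N(\mu, \sigma^2/n)$ and Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import floor, sqrt
from statistics import NormalDist

mu, sigma = 48.0, 10.0  # sigma^2 = 100

# (a) P(Y > 50) for a single specimen, Y ~ N(48, 100)
p_single = 1 - NormalDist(mu, sigma).cdf(50)

# (b) P(Ybar > 50) with n = 100, since Ybar ~ N(mu, sigma^2 / n)
n = 100
p_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(50)

# (c) P(Ybar > 50) < 0.01 requires (50 - mu) / (sigma / sqrt(n)) > z_{0.01},
#     i.e. n > (z_{0.01} * sigma / (50 - mu))^2
z = NormalDist().inv_cdf(0.99)
n_min = floor((z * sigma / (50 - mu)) ** 2) + 1  # smallest integer strictly above the bound

print(round(p_single, 4), round(p_mean, 4), n_min)
```

Under these assumptions, part (a) is roughly 0.42, part (b) roughly 0.023, and part (c) requires at least 136 specimens.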
Now we prove that
$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1).$$
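A simulation cannot replace the proof, but it offers a quick sanity check. The sketch below draws repeated normal samples and compares the simulated mean and variance of $(n-1)S^2/\sigma^2$ with those of a $\chi^2(n-1)$ distribution, namely $n-1$ and $2(n-1)$. The seed, sample size, and parameter values are arbitrary choices.

```python
import random

random.seed(512)
mu, sigma, n, reps = 0.0, 2.0, 5, 20000

scaled = []
for _ in range(reps):
    y = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(y) / n
    s2 = sum((v - ybar) ** 2 for v in y) / (n - 1)  # sample variance S^2
    scaled.append((n - 1) * s2 / sigma ** 2)

sim_mean = sum(scaled) / reps
sim_var = sum((w - sim_mean) ** 2 for w in scaled) / (reps - 1)

# A chi-square(n - 1) random variable has mean n - 1 and variance 2(n - 1).
print("simulated mean:", sim_mean, "theory:", n - 1)
print("simulated variance:", sim_var, "theory:", 2 * (n - 1))
```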
Example 7.4. In an ecological study examining the effects of Hurricane Katrina, researchers choose $n = 9$ plots and, for each plot, record $Y$, the amount of dead weight material (recorded in grams). Denote the nine dead weights by $Y_1, \ldots, Y_9$, where $Y_i$ represents the dead weight for plot $i$. The researchers model the data $Y_1, \ldots, Y_9$ as an iid $N(100, 32)$ sample. What is the probability that the sample variance $S^2$ of the nine dead weights is less than 20? That is, what is $P(S^2 < 20)$? Further, how large should the sample size $n$ be so that $P(S^2 < 20) < 0.01$?
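Example 7.4 can be worked numerically from the result $(n-1)S^2/\sigma^2 \sim \chi^2(n-1)$. The sketch below assumes the second parameter of the normal model is the variance, $\sigma^2 = 32$. The helper `chi2_cdf` is a hand-written illustration (a series for the regularized lower incomplete gamma function), not a library routine.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for X ~ chi-square(df), using the series
    gamma(a, z) = z^a e^{-z} * sum_k z^k / (a (a+1) ... (a+k)), a = df/2, z = x/2."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def prob_s2_below(c, n, sigma2):
    # P(S^2 < c) = P( (n-1) S^2 / sigma^2 < (n-1) c / sigma^2 )
    return chi2_cdf((n - 1) * c / sigma2, n - 1)

sigma2 = 32.0  # assumption: N(100, 32) means variance 32
p9 = prob_s2_below(20.0, 9, sigma2)

# Smallest n with P(S^2 < 20) < 0.01, found by direct search.
n_min = 2
while prob_s2_below(20.0, n_min, sigma2) >= 0.01:
    n_min += 1

print("P(S^2 < 20) with n = 9:", round(p9, 4))
print("smallest n with P(S^2 < 20) < 0.01:", n_min)
```

With $n = 9$ the probability is $P(\chi^2(8) < 5)$, roughly 0.24.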
7.3 The t distribution
Recall that if $Y_1, \ldots, Y_n$ is an iid $N(\mu, \sigma^2)$ sample, the sample mean $\bar{Y} \sim N(\mu, \sigma^2/n)$; i.e.,
$$Z = \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
Suppose we replace $\sigma$ by its estimator $S$. We now want to find the distribution of
$$t = \frac{\bar{Y} - \mu}{S/\sqrt{n}}.$$
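To see why replacing $\sigma$ with $S$ matters, the following sketch simulates both $Z$ and $t$ from small normal samples and counts how often each exceeds 1.96 in absolute value. The sample size $n = 4$, the seed, and the cutoff are illustrative assumptions.

```python
import random
from math import sqrt

random.seed(7)
mu, sigma, n, reps = 0.0, 1.0, 4, 20000

z_exceed = 0
t_exceed = 0
for _ in range(reps):
    y = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(y) / n
    s = sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))  # sample standard deviation S
    z = (ybar - mu) / (sigma / sqrt(n))  # uses the true sigma
    t = (ybar - mu) / (s / sqrt(n))      # uses the estimator S
    z_exceed += abs(z) > 1.96
    t_exceed += abs(t) > 1.96

frac_z = z_exceed / reps
frac_t = t_exceed / reps
print("P(|Z| > 1.96) approx:", frac_z)
print("P(|t| > 1.96) approx:", frac_t)
```

The fraction for $Z$ should be near 0.05, while the fraction for $t$ is noticeably larger, reflecting the extra variability introduced by estimating $\sigma$.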
The t(3) density function (dotted) and the standard normal density (solid)
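The comparison in the figure can be reproduced numerically. This sketch evaluates both densities at a few points, using the standard $t(\nu)$ density formula $f(x) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}(1 + x^2/\nu)^{-(\nu+1)/2}$, which is not written out in this excerpt and is assumed here.

```python
import math

def t_pdf(x, df):
    # Density of the t distribution with df degrees of freedom.
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    # Standard normal density.
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

for x in (0.0, 1.0, 2.0, 3.0):
    print(f"x = {x}: t(3) density = {t_pdf(x, 3):.4f}, N(0,1) density = {normal_pdf(x):.4f}")
```

The $t(3)$ density is lower at the center and higher in the tails than the standard normal density.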
7.4 The F distribution
1. If $W \sim F(\nu_1, \nu_2)$, then $1/W \sim F(\nu_2, \nu_1)$.
2. If $T \sim t(\nu)$, then $T^2 \sim F(1, \nu)$.
3. If $W \sim F(\nu_1, \nu_2)$, then $(\nu_1/\nu_2)W/[1 + (\nu_1/\nu_2)W] \sim \mathrm{Beta}(\nu_1/2, \nu_2/2)$.
Example 7.5. Suppose that $Y_1, \ldots, Y_n$ is an iid sample from a $N(\mu, \sigma^2)$ distribution. Let $T = (\bar{Y} - \mu)/(S/\sqrt{n})$. What is the distribution of $T^2$?
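Since $T \sim t(n-1)$ for a normal sample (a standard result, assumed here), property 2 above gives $T^2 \sim F(1, n-1)$. The sketch below checks this by simulation for $n = 10$: the upper 5% point of $F(1, 9)$ equals the square of the two-sided 5% critical value of $t(9)$, taken from a standard t table as 2.262. The seed and parameter values are arbitrary.

```python
import random
from math import sqrt

random.seed(75)
mu, sigma, n, reps = 5.0, 3.0, 10, 20000

crit = 2.262 ** 2  # t table value for t(9), squared: upper 5% point of F(1, 9)

count = 0
for _ in range(reps):
    y = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(y) / n
    s = sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))
    t = (ybar - mu) / (s / sqrt(n))
    count += t * t > crit

frac = count / reps
print("P(T^2 > 5.117) approx:", frac)  # should be close to 0.05
```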
Then we have
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F(n_1 - 1, n_2 - 1).$$
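The reasoning behind this ratio can be sketched as follows. The setup is assumed rather than quoted: two independent samples of sizes $n_1$ and $n_2$ from $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, with sample variances $S_1^2$ and $S_2^2$, and the usual definition of the $F$ distribution as a ratio of independent chi-square random variables, each divided by its degrees of freedom.

```latex
\begin{align*}
W_1 &= \frac{(n_1 - 1) S_1^2}{\sigma_1^2} \sim \chi^2(n_1 - 1), \qquad
W_2 = \frac{(n_2 - 1) S_2^2}{\sigma_2^2} \sim \chi^2(n_2 - 1), \qquad
W_1 \text{ and } W_2 \text{ independent}, \\
F &= \frac{W_1/(n_1 - 1)}{W_2/(n_2 - 1)}
   = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}
   \sim F(n_1 - 1, n_2 - 1).
\end{align*}
```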
7.5 The Central Limit Theorem
Proof of the central limit theorem:
Example 7.6. A chemist is studying the degradation behavior of vitamin B6 in a multivitamin. The chemist selects a random sample of $n = 36$ multivitamin tablets, and for each tablet, counts the number of days until the B6 content falls below the FDA requirement. Let $Y_1, \ldots, Y_{36}$ denote the measurements for the 36 tablets, and assume that $Y_1, \ldots, Y_{36}$ is an iid sample from a Poisson distribution with mean 50. What is the approximate probability that the average number of days $\bar{Y}$ will exceed 52? How many tablets does the researcher need to observe so that $P(\bar{Y} < 49.5) \approx 0.01$?
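Example 7.6 can be approximated with the central limit theorem. For a Poisson distribution the mean and variance are both 50, so $\bar{Y}$ is approximately $N(50, 50/n)$. The sketch below uses the standard library's `NormalDist`; the variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

lam, n = 50.0, 36  # Poisson mean = variance = 50

# CLT: Ybar is approximately N(lam, lam / n)
approx = NormalDist(lam, sqrt(lam / n))
p_exceed = 1 - approx.cdf(52)

# Solve (49.5 - lam) / sqrt(lam / n) = z, where z is the 0.01 quantile of N(0, 1)
z = NormalDist().inv_cdf(0.01)
n_exact = (z * sqrt(lam) / (49.5 - lam)) ** 2
n_needed = ceil(n_exact)  # round up to a whole number of tablets

print("P(Ybar > 52) approx:", round(p_exceed, 4))
print("n solving the equation:", round(n_exact, 1), "-> use", n_needed)
```

Under this approximation, $P(\bar{Y} > 52)$ is about 0.045, and roughly 1083 tablets are needed for the second question.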
7.6 The normal approximation to the binomial
Let
$$X = \sum_{i=1}^{n} Y_i,$$
the number of “successes.” What is the distribution of $X$?
Define the sample proportion $\hat{p}$ as
$$\hat{p} = \frac{X}{n} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \bar{Y}.$$
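As a small illustration of the approximation this section is named for, the sketch below compares an exact binomial probability with its normal approximation. It assumes that the $Y_i$ are iid Bernoulli($p$), so that $X \sim \text{Binomial}(n, p)$ and, by the central limit theorem, $X$ is approximately $N(np, np(1-p))$. That setup is not shown in this excerpt, and the values $n = 40$, $p = 0.5$, and the cutoff 15 are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    # Exact P(X <= k) for X ~ Binomial(n, p).
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k + 1))

def normal_approx_cdf(k, n, p):
    # CLT approximation: X is approximately N(np, np(1 - p)).
    return NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(k)

exact = binom_cdf(15, 40, 0.5)
approx = normal_approx_cdf(15, 40, 0.5)
print("exact P(X <= 15):", round(exact, 4))
print("normal approximation:", round(approx, 4))
```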
Example 7.7. Use Monte Carlo simulation to approximate the sampling distribution of $\hat{p}$ for the following cases:
Case 1: n = 10, p = 0.1    Case 2: n = 40, p = 0.1    Case 3: n = 100, p = 0.1
Case 4: n = 10, p = 0.5    Case 5: n = 40, p = 0.5    Case 6: n = 100, p = 0.5
One can clearly see that the normal approximation is not good when p = 0.1, except when n is very large. On the other hand, when p = 0.5, the normal approximation is already pretty good when n = 40.
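A minimal Monte Carlo sketch of Example 7.7 follows. It simulates $\hat{p}$ for the six cases and reports the simulated mean, standard deviation, and skewness. Using skewness as a rough numerical stand-in for "how normal the histogram looks" is a choice made here, not something taken from the notes; the seed and the number of replications are also arbitrary.

```python
import random
from math import sqrt

def simulate_p_hat(n, p, reps, rng):
    # Each replication: n Bernoulli(p) trials, returning the sample proportion.
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

def summarize(values):
    m = sum(values) / len(values)
    m2 = sum((v - m) ** 2 for v in values) / len(values)
    m3 = sum((v - m) ** 3 for v in values) / len(values)
    return m, sqrt(m2), m3 / m2 ** 1.5  # mean, sd, skewness

rng = random.Random(2024)
cases = [(10, 0.1), (40, 0.1), (100, 0.1), (10, 0.5), (40, 0.5), (100, 0.5)]
results = {}
for n, p in cases:
    results[(n, p)] = summarize(simulate_p_hat(n, p, 5000, rng))
    mean_, sd_, skew_ = results[(n, p)]
    theory_sd = sqrt(p * (1 - p) / n)  # sd of p-hat under the binomial model
    print(f"n={n:3d} p={p}: mean={mean_:.3f} sd={sd_:.3f} "
          f"(theory {theory_sd:.3f}) skewness={skew_:.2f}")
```

A normal distribution has skewness 0, so a clearly positive skewness (as expected for $p = 0.1$ with small $n$) indicates a poor normal approximation, while values near 0 (as for $p = 0.5$) are consistent with the conclusion stated above.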