Source: people.stat.sc.edu/wang528/stat 512/stat 512...


7 Sampling Distributions and the Central Limit Theorem

7.1 Introduction

Example 7.1. Suppose that $Y_1, \ldots, Y_n$ is an iid sample from $f_Y(y)$. For example, each of the following is a statistic:

• $T(Y_1, \ldots, Y_n) = \bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$

• $T(Y_1, \ldots, Y_n) = \frac{1}{2}\left[Y_{(n/2)} + Y_{(n/2+1)}\right]$ if $n$ is even.

• $T(Y_1, \ldots, Y_n) = Y_{(1)}$

• $T(Y_1, \ldots, Y_n) = Y_{(n)} - Y_{(1)}$

• $T(Y_1, \ldots, Y_n) = S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2$
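To make the list concrete, the following is a minimal Python sketch that evaluates each of the five statistics on one sample. The function name `sample_statistics` and the data values are hypothetical, chosen only for illustration; the standard-library `statistics.median` averages the two middle order statistics when n is even, which matches the second bullet.

```python
import statistics

def sample_statistics(y):
    """Evaluate the five statistics of Example 7.1 on a sample y (a list of numbers)."""
    ordered = sorted(y)  # order statistics Y_(1) <= ... <= Y_(n)
    return {
        "mean": statistics.fmean(y),           # sample mean
        "median": statistics.median(y),        # average of the two middle values if n is even
        "minimum": ordered[0],                 # Y_(1)
        "range": ordered[-1] - ordered[0],     # Y_(n) - Y_(1)
        "variance": statistics.variance(y),    # S^2, with divisor n - 1
    }

# Hypothetical data, used only to illustrate the calculation.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
for name, value in sample_statistics(sample).items():
    print(f"{name}: {value}")
```

Each entry is a function of the sample alone, with no unknown parameters involved.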


7.2 Sampling distributions related to the normal distribution

Example 7.2. Suppose that $Y_1, \ldots, Y_n$ is an iid sample from a $N(\mu, \sigma^2)$ distribution. What is the distribution of the sample mean?
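One standard route to the answer is the moment generating function (mgf) technique. The sketch below assumes the normal mgf $m_Y(t) = \exp(\mu t + \sigma^2 t^2/2)$ and uses independence to factor the expectation; it is not necessarily the argument given in lecture.

```latex
\begin{align*}
m_{\bar{Y}}(t)
  &= E\left[\exp\left(\frac{t}{n}\sum_{i=1}^{n} Y_i\right)\right]
   = \prod_{i=1}^{n} m_{Y_i}\!\left(\frac{t}{n}\right)
   = \left[\exp\left(\frac{\mu t}{n} + \frac{\sigma^2 t^2}{2n^2}\right)\right]^{n} \\
  &= \exp\left(\mu t + \frac{(\sigma^2/n)\, t^2}{2}\right).
\end{align*}
```

This is the mgf of a $N(\mu, \sigma^2/n)$ distribution, so $\bar{Y} \sim N(\mu, \sigma^2/n)$.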

Example 7.3. In the interest of pollution control, an experimenter records $Y$, the amount of bacteria per unit volume of water (measured in mg/cm$^3$). The population distribution for $Y$ is assumed to be normal with mean $\mu = 48$ and variance $\sigma^2 = 100$. That is, $Y \sim N(\mu, \sigma^2)$.

(a) What is the probability that a single water specimen's bacteria amount will exceed 50 mg/cm$^3$?

(b) Suppose that the experimenter takes a random sample of $n = 100$ water specimens, and denote the observations by $Y_1, \ldots, Y_{100}$. What is the probability that the sample mean $\bar{Y}$ will exceed 50 mg/cm$^3$?

(c) How large should the sample size $n$ be so that $P(\bar{Y} > 50) < 0.01$?
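All three parts can be checked numerically with the result $\bar{Y} \sim N(\mu, \sigma^2/n)$. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper names `prob_mean_exceeds` and `smallest_n` are hypothetical, and part (c) is found by a simple search rather than by solving the inequality algebraically.

```python
from math import sqrt
from statistics import NormalDist

MU = 48.0      # population mean (mg/cm^3)
SIGMA = 10.0   # population standard deviation, sqrt(100)
CUTOFF = 50.0  # threshold in the example

def prob_mean_exceeds(n, cutoff=CUTOFF):
    """P(Ybar > cutoff) when Ybar ~ N(MU, SIGMA^2 / n); n = 1 is a single specimen."""
    return 1.0 - NormalDist(MU, SIGMA / sqrt(n)).cdf(cutoff)

def smallest_n(alpha, cutoff=CUTOFF):
    """Smallest n with P(Ybar > cutoff) < alpha, found by stepping n upward."""
    n = 1
    while prob_mean_exceeds(n, cutoff) >= alpha:
        n += 1
    return n

print("(a)", round(prob_mean_exceeds(1), 4))
print("(b)", round(prob_mean_exceeds(100), 4))
print("(c)", smallest_n(0.01))
```

The search in part (c) terminates because the cutoff lies above the mean, so the tail probability shrinks as $n$ grows.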


Now we prove that

$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1).$$


Example 7.4. In an ecological study examining the effects of Hurricane Katrina, researchers choose $n = 9$ plots and, for each plot, record $Y$, the amount of dead weight material (recorded in grams). Denote the nine dead weights by $Y_1, \ldots, Y_9$, where $Y_i$ represents the dead weight for plot $i$. The researchers model the data $Y_1, \ldots, Y_9$ as an iid $N(100, 32)$ sample. What is the probability that the sample variance $S^2$ of the nine dead weights is less than 20? That is, what is $P(S^2 < 20)$? Further, how large should the sample size $n$ be so that $P(S^2 < 20) < 0.01$?
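Both questions reduce to chi-square probabilities, since $P(S^2 < c) = P\big(\chi^2(n-1) < (n-1)c/\sigma^2\big)$. The sketch below reads $N(100, 32)$ as mean 100 and variance 32, which is an assumption about the notation. Because the Python standard library has no chi-square CDF, the hypothetical helper `chi2_cdf` evaluates the regularized lower incomplete gamma function by its power series; a statistics package would normally be used instead.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square(df) <= x), via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a          # k = 0 term of sum_k z^k / (a (a+1) ... (a+k))
    total = term
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def prob_s2_below(c, n, sigma2):
    """P(S^2 < c) for an iid normal sample of size n with variance sigma2."""
    return chi2_cdf((n - 1) * c / sigma2, n - 1)

def smallest_n(c, sigma2, alpha):
    """Smallest n >= 2 with P(S^2 < c) < alpha; assumes c < sigma2 so the search ends."""
    n = 2
    while prob_s2_below(c, n, sigma2) >= alpha:
        n += 1
    return n

print("P(S^2 < 20) with n = 9:", round(prob_s2_below(20.0, 9, 32.0), 4))
print("smallest n with P(S^2 < 20) < 0.01:", smallest_n(20.0, 32.0, 0.01))
```

For $n = 9$ the first probability is $P(\chi^2(8) < 5)$, which can also be read from a chi-square table.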


7.3 The t distribution

Recall that if $Y_1, \ldots, Y_n$ is an iid $N(\mu, \sigma^2)$ sample, the sample mean $\bar{Y} \sim N(\mu, \sigma^2/n)$; i.e.,

$$Z = \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

Suppose we replace $\sigma$ by its estimator $S$. We now want to find the distribution of

$$t = \frac{\bar{Y} - \mu}{S/\sqrt{n}}.$$
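The usual argument rewrites this ratio in terms of $Z$ and the chi-square quantity from Section 7.2. The sketch below assumes two standard facts that are not visible in this part of the notes: that $\bar{Y}$ and $S^2$ are independent under normality, and that a $t(\nu)$ random variable is defined as $Z/\sqrt{W/\nu}$ for independent $Z \sim N(0,1)$ and $W \sim \chi^2(\nu)$.

```latex
\begin{align*}
t = \frac{\bar{Y}-\mu}{S/\sqrt{n}}
  = \frac{(\bar{Y}-\mu)/(\sigma/\sqrt{n})}
         {\sqrt{\dfrac{(n-1)S^2/\sigma^2}{n-1}}}
  = \frac{Z}{\sqrt{W/(n-1)}} \sim t(n-1),
\qquad W = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1).
\end{align*}
```

The denominator simplifies to $S/\sigma$, so the unknown $\sigma$ cancels and the statistic depends on the data and $\mu$ only.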


The t(3) density function (dotted) and the standard normal density (solid)


7.4 The F distribution

1. If $W \sim F(\nu_1, \nu_2)$, then $1/W \sim F(\nu_2, \nu_1)$.

2. If $T \sim t(\nu)$, then $T^2 \sim F(1, \nu)$.

3. If $W \sim F(\nu_1, \nu_2)$, then $(\nu_1/\nu_2)W/[1 + (\nu_1/\nu_2)W] \sim \text{Beta}(\nu_1/2, \nu_2/2)$.


Example 7.5. Suppose that $Y_1, \ldots, Y_n$ is an iid sample from a $N(\mu, \sigma^2)$ distribution. Let $T = (\bar{Y} - \mu)/(S/\sqrt{n})$. What is the distribution of $T^2$?
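A short sketch of the answer, assuming the standard result $T \sim t(n-1)$ and, for the direct form, the representation of an $F(\nu_1, \nu_2)$ random variable as a ratio of independent chi-squares each divided by its degrees of freedom:

```latex
\begin{align*}
T^2 = \frac{Z^2/1}{W/(n-1)} \sim F(1,\, n-1),
\qquad Z = \frac{\bar{Y}-\mu}{\sigma/\sqrt{n}}, \quad Z^2 \sim \chi^2(1),
\quad W = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1).
\end{align*}
```

This agrees with property 2 in the list above, taking $\nu = n - 1$.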

Then we have

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F(n_1 - 1, n_2 - 1).$$


7.5 The Central Limit Theorem


Proof of the central limit theorem:


Example 7.6. A chemist is studying the degradation behavior of vitamin B6 in a multivitamin. The chemist selects a random sample of $n = 36$ multivitamin tablets and, for each tablet, counts the number of days until the B6 content falls below the FDA requirement. Let $Y_1, \ldots, Y_{36}$ denote the measurements for the 36 tablets, and assume that $Y_1, \ldots, Y_{36}$ is an iid sample from a Poisson distribution with mean 50. What is the approximate probability that the average number of days $\bar{Y}$ will exceed 52? How many tablets does the researcher need to observe so that $P(\bar{Y} < 49.5) \approx 0.01$?
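Under the central limit theorem, $\bar{Y}$ is approximately $N(50, 50/n)$, because a Poisson distribution with mean 50 also has variance 50. The sketch below applies that approximation with the standard-library `statistics.NormalDist`. The helper names are hypothetical, no continuity correction is applied, and the sample size is rounded up to the next integer; all three are choices made for this illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

LAM = 50.0  # Poisson mean, which equals the Poisson variance

def approx_prob_mean_exceeds(c, n):
    """CLT approximation to P(Ybar > c), using Ybar ~ N(LAM, LAM / n) approximately."""
    return 1.0 - NormalDist(LAM, sqrt(LAM / n)).cdf(c)

def sample_size_for_lower_tail(c, alpha):
    """n solving Phi((c - LAM) / sqrt(LAM / n)) = alpha for c < LAM, rounded up."""
    z = NormalDist().inv_cdf(alpha)  # lower-tail standard normal quantile (negative)
    return ceil(LAM * (z / (c - LAM)) ** 2)

print("P(Ybar > 52) with n = 36, approx:", round(approx_prob_mean_exceeds(52.0, 36), 4))
print("n for P(Ybar < 49.5) approx 0.01:", sample_size_for_lower_tail(49.5, 0.01))
```

The sample-size formula comes from setting $(49.5 - 50)/\sqrt{50/n}$ equal to the 0.01 quantile of the standard normal and solving for $n$.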


7.6 The normal approximation to the binomial

Let

$$X = \sum_{i=1}^{n} Y_i,$$

the number of "successes." What is the distribution of $X$?

Define the sample proportion $\hat{p}$ as

$$\hat{p} = \frac{X}{n} = \frac{1}{n} \sum_{i=1}^{n} Y_i = \bar{Y}.$$


Example 7.7. Use Monte Carlo simulation to approximate the sampling distribution of $\hat{p}$ for the following cases:

Case 1: n = 10, p = 0.1    Case 2: n = 40, p = 0.1    Case 3: n = 100, p = 0.1
Case 4: n = 10, p = 0.5    Case 5: n = 40, p = 0.5    Case 6: n = 100, p = 0.5

One can clearly see that the normal approximation is not good when p = 0.1, except when n is very large. On the other hand, when p = 0.5, the normal approximation is already pretty good when n = 40.
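A simulation along these lines can be sketched with the Python standard library. The code below draws repeated binomial counts for each of the six cases and reports the largest gap between the empirical CDF of $\hat{p}$ and the CDF of $N(p, p(1-p)/n)$ on the grid $k/n$. The number of replications, the seed, the function names, and the use of a CDF gap as the summary are all assumptions made for this illustration; the original notes compare histograms instead.

```python
import random
from statistics import NormalDist

CASES = [(10, 0.1), (40, 0.1), (100, 0.1), (10, 0.5), (40, 0.5), (100, 0.5)]

def simulate_counts(n, p, reps, rng):
    """Draw reps values of X, the number of successes in n Bernoulli(p) trials."""
    return [sum(rng.random() < p for _ in range(n)) for _ in range(reps)]

def max_cdf_gap(counts, n, p):
    """Largest |empirical CDF of p-hat - normal CDF| over the points k/n, k = 0..n."""
    normal = NormalDist(p, (p * (1 - p) / n) ** 0.5)
    freq = [0] * (n + 1)
    for x in counts:
        freq[x] += 1
    gap, cumulative = 0.0, 0
    for k in range(n + 1):
        cumulative += freq[k]
        gap = max(gap, abs(cumulative / len(counts) - normal.cdf(k / n)))
    return gap

def run(reps=5000, seed=512):
    """Return {(n, p): (mean of p-hat, max CDF gap)} for the six cases."""
    rng = random.Random(seed)
    results = {}
    for n, p in CASES:
        counts = simulate_counts(n, p, reps, rng)
        results[(n, p)] = (sum(counts) / (reps * n), max_cdf_gap(counts, n, p))
    return results

for (n, p), (mean_p_hat, gap) in run().items():
    print(f"n={n:3d} p={p}: mean of p-hat={mean_p_hat:.3f}, max CDF gap={gap:.3f}")
```

A smaller gap indicates a closer match to the normal curve; part of the gap is due to the discreteness of $\hat{p}$ and would shrink under a continuity correction.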
