chapter 22, part 2: computing p-values for …azimmer/lect23_ch21...-0.05 0.00 0.05 0.10 0.15 0.20 0...


Page 1: Chapter 22, Part 2: Computing p-values for …azimmer/Lect23_Ch21...-0.05 0.00 0.05 0.10 0.15 0.20 0 10 20 30 40 Sampling Distribution p p-value is the `more extreme' area under the

Reminders

• Last HW and Last quiz on Thursday

• My office hours will be Today from 11-1

• If you won’t be around during the final week to take theFinal Project, please email me ASAP to arrange for atime for you to take it.



Warmup

• A drug company develops an AIDS treatment that they hope will reduce the proportion of AIDS patients who die within 5 years. In a randomized controlled trial, 35% of patients in the control group died within 5 years. The drug company would like to show that the proportion of patients who die within 5 years in the treatment group is less than this.

• What is the null hypothesis for this experiment? What is the alternative hypothesis for this experiment?

• H0 : p = 0.35

• Ha : p < 0.35



Warmup

• It turns out that 28% of the patients in the treatment group died within 5 years. The drug company calculates that the p-value for the experiment is .014. What does this p-value mean?

• If the null hypothesis were true, there would be a .014 chance (14 in 1000) of observing a result as extreme as ours (a death rate of 28% or smaller).

• Before the trial, the drug company set the significance level of the test at α = 1% = .01. What is the conclusion of this experiment?

• Since the p-value (.014) is larger than the significance level (.01), we fail to reject the null hypothesis: the difference we observe could be due to random chance alone. So, we don’t have enough evidence to conclude that the treatment group has a lower percent of people dying within 5 years.
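The comparison in that last bullet is mechanical once the p-value and α are fixed. A minimal Python sketch of the decision rule follows; the function name `decide` is made up for illustration and is not from the lecture.

```python
def decide(p_value, alpha):
    """Return the conclusion of a significance test at level alpha."""
    # Reject H0 only when the p-value falls below the significance level.
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Warmup numbers: p-value .014 against the pre-set alpha = .01.
print(decide(0.014, 0.01))
# The same p-value compared against alpha = .05 instead.
print(decide(0.014, 0.05))
```

The same p-value leads to different conclusions at different levels, which is one reason the significance level is set before the trial.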


Chapter 22, Part 2: Computing p-values for significance tests

Aaron Zimmerman
STAT 220 - Summer 2014

Department of Statistics
University of Washington - Seattle



Practice

The U.S. military would like to know whether the proportion of women in the military has changed in the last 20 years. In 1992, they know that 4.6% of active-duty soldiers were women. They would like to know if the current proportion is different from this value.

What is the null hypothesis for this experiment? What is the alternative hypothesis?

H0 : p = 0.046

Ha : p ≠ 0.046



Practice

It turns out that 16% of the active-duty soldiers they surveyed are women. The military calculates that the p-value for the experiment is .003.

What does this p-value mean?

It means that there’s a 3 in 1000 chance (.003) that we would observe a result this extreme (a percent of women at least as far from 4.6% as the 16% we observed) if the null hypothesis were true.

Before the trial, the military set the significance level of the test at α = 5% = .05. Remember, the p-value of the test is .003.

What is the conclusion of this experiment?

Since the p-value is less than α, we reject the null hypothesis and conclude that our data gives us evidence suggesting that the percent of active-duty soldiers who are women has changed since 1992.


Steps of a Test of Significance

• Returning to our motivating example from Monday, remember that the prosecution in the Kristin Gilbert case found that there were 34/1384 = .025 deaths per shift when Nurse Gilbert wasn’t working, and that there were 40/257 = .156 deaths per shift when she was working.

• We’d like to know if the high rate of deaths during her shifts can be explained by random variation. That is, we’d like to know if the rate of deaths during her shifts is truly different from .025.

• Question: Is there sufficient evidence against the null hypothesis that the rate of deaths on Nurse Gilbert’s shifts equals the baseline .025 rate if the significance level is α = .05?


Step 0 and 1: Significance Level & The Hypotheses

• Before we even start, we set the significance level (α = .05)

• Remember, the claim being tested in a statistical test is called the null hypothesis (H0).

• Nurse Gilbert’s defense claims that she’s unlucky and that the rate of deaths during her shifts is the same as everyone else’s (.025).

  ◦ So, H0 : p = 0.025

• The statement we hope or suspect is true instead of H0 is called the alternative hypothesis (Ha or H1).

• The prosecution wants to show that the percent of deaths under Nurse Gilbert is larger than .025.

  ◦ So, Ha : p > 0.025


Step 2: The Sampling Distribution (if H0 is true)

• Remember, in a test of significance, we start by assuming that H0 is true

• If H0 (p = 0.025) is true, what is the sampling distribution of p̂?

  ◦ The sampling distribution is Normal

  ◦ The mean is p = 0.025

  ◦ The standard deviation is: $\sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{.025(1-.025)}{257}} = .00974$

[Figure: Sampling Distribution (Normal curve under H0; horizontal axis from −0.05 to 0.20)]
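The standard deviation on this slide can be checked numerically. The sketch below uses only Python's standard library; the helper name `proportion_sd` is an illustrative assumption, not something from the lecture.

```python
import math


def proportion_sd(p0, n):
    """SD of the sampling distribution of p-hat, assuming H0: p = p0."""
    return math.sqrt(p0 * (1 - p0) / n)


# H0 rate of .025 deaths per shift, n = 257 shifts worked by Nurse Gilbert.
sd = proportion_sd(0.025, 257)
print(round(sd, 5))
```

Note that the null value p0, not the observed proportion, goes into the formula, because the sampling distribution is built under the assumption that H0 is true.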


Step 3: The Data

• There were 40 deaths out of 257 shifts under Nurse Gilbert

• So, p̂ = 40/257 = .156

[Figure: Sampling Distribution (Normal curve under H0 with the observation marked; horizontal axis from −0.05 to 0.20)]


Step 4: The p-value (NEW)

• Remember: a p-value is the probability of observing an outcome as extreme or more extreme than what we actually observed if the null hypothesis were true

• In this problem, the alternative hypothesis is one-sided (p > 0.025)

• So, the p-value is the area under the Normal curve that is as far or farther above the mean of the distribution than our observation, that is, the area to the right of the observation.

[Figure: Sampling Distribution. Horizontal axis: p (−0.05 to 0.20). The p-value is the ‘more extreme’ area under the Normal curve.]

NOTE: We’d look at the area under the curve to the left of the observation if the alternative was Ha : p < .025.


Step 4: The p-value (NEW)

• What percent of the sampling distribution is greater than the observation of 40/257 = 0.156?

  ◦ Mean = 0.025

  ◦ SD = 0.0097

  ◦ Standard score: $\frac{.156 - .025}{.0097} = 13.5$!

• Look up the standard score in Table B. Not so helpful - it just tells us that the p-value must be less than 1 - .9997 = .0003

• My computer says the p-value is less than 1/100,000,000

[Figure: Sampling Distribution. Horizontal axis: p (−0.05 to 0.20). The p-value is the ‘more extreme’ area under the Normal curve.]
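Table B runs out long before a standard score of 13.5, but the tail area can be computed directly. The sketch below is one way to do it with Python's standard library, writing the Normal upper tail in terms of `math.erfc`; the helper name `upper_tail` is an assumption for illustration, and this is not necessarily how the slide's computer result was obtained. Because the code does not round p̂ or the SD, its standard score comes out near 13.4 rather than the slide's 13.5.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard Normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


p0, n, deaths = 0.025, 257, 40           # H0 rate, shifts worked, deaths observed
p_hat = deaths / n                        # observed proportion, about .156
sd = math.sqrt(p0 * (1 - p0) / n)         # SD of p-hat if H0 is true
z = (p_hat - p0) / sd                     # standard score of the observation
p_value = upper_tail(z)                   # one-sided p-value for Ha: p > .025

print(f"z = {z:.2f}, p-value = {p_value:.3g}")
```

Using `erfc` rather than `1 - cdf` avoids subtracting two numbers that are nearly equal, which matters when the tail area is this small.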


Step 5: Conclusion

• A p-value below .00000001 means that there is less than a 1 in 100,000,000 chance that Kristin Gilbert would randomly (and unluckily) have a percent of deaths that extreme during her shifts if the proportion of deaths during shifts was actually .025 (H0).

• Since the significance level is α = .05 and α > p-value, this test IS statistically significant.

• Conclusion: We have enough evidence to reject the null hypothesis and conclude that the percent of deaths during Nurse Gilbert’s shifts is larger than the baseline rate of .025.

• REMEMBER: this doesn’t mean she was killing people, but it does imply that something different was happening under her watch.


Note #1: p-values in 2-sided tests

• In practice, when Ha is two-sided (Ha : p ≠ .025), we calculate the area that’s more extreme than the observation in one direction and then multiply by two

• We do this because in the 2-sided setting, “more extreme” could be extreme and large or extreme and small. Either way gives us evidence against the null hypothesis H0

• We’re not doing it for this problem, but you should be aware of it!

[Figure: Sampling Distribution. Horizontal axis: p (−0.05 to 0.20). The 2-sided p-value is found by looking at ‘more extreme’ in both directions!]
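The doubling rule can be sketched in a few lines of Python. The function names are illustrative assumptions, and the standard score of 2.0 in the usage line is an arbitrary value chosen for the example, not a number from the lecture.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard Normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_sided_p(z):
    """Two-sided p-value: one-tail area beyond |z|, multiplied by two."""
    # abs() makes the result the same whether the observation
    # landed above or below the mean of the sampling distribution.
    return 2 * upper_tail(abs(z))


# Arbitrary illustrative standard score (not from the slides).
print(round(two_sided_p(2.0), 4))
```

Taking the absolute value first means an observation two SDs below the mean counts as exactly as much evidence against H0 as one two SDs above it.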


Note #2: Different Sample Sizes

• What if we only saw 7 of Nurse Gilbert’s shifts?

• 7 × 40/257 ≈ 1. So using 1 death in 7 shifts is about the same ratio.

• Then the sampling distribution would have mean .025, but SD = $\sqrt{\frac{.025(1-.025)}{7}} = .059$

• And the standard score would be $\frac{.156 - .025}{.059} = 2.22$

• So the p-value would be 1 - .9861 = .0139

• While we still would reject the null at α = .05, the evidence isn’t as strong, and we wouldn’t reject at α = .01.
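To see the effect of sample size side by side, the sketch below repeats the calculation for n = 7 and n = 257 while holding the observed proportion at .156, as the slide does. The function name `one_sided_test` is an assumption for illustration. The exact tail area for n = 7 comes out slightly below the slide's .0139, because the slide reads Table B at a standard score rounded to 2.2.

```python
import math


def one_sided_test(p_hat, p0, n):
    """Standard score and upper-tail p-value for H0: p = p0 vs Ha: p > p0."""
    sd = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Same observed proportion, very different numbers of shifts.
for n in (7, 257):
    z, p_value = one_sided_test(0.156, 0.025, n)
    print(f"n = {n:3d}: z = {z:.2f}, p-value = {p_value:.3g}")

z7, p7 = one_sided_test(0.156, 0.025, 7)
```

The observed rate is identical in both runs; only the SD of the sampling distribution changes, and that alone moves the result from overwhelming evidence to evidence that would not survive α = .01.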



Significance Tests for Means

• There’s no reason that we can’t apply the significance testing framework for proportions directly to significance tests for means.

• We’ll still use the same steps

• Chat with your neighbor about the strategy we’re going to take to perform significance tests about a mean

Very generally, we find the sampling distribution assuming the null hypothesis is true, and then we see how unlikely it would be to record data as extreme as what we’ve seen (still assuming H0 is true).


Steps 0-5 for Significance Tests on Means

• Step 0: Pick a significance level (usually α = .05 unless you have a reason to use a different level)

• Step 1: Write down the hypotheses (both H0 & Ha)

• Step 2: Determine the sampling distribution if H0 is true. It will be Normal with mean equal to the claim in H0 and a standard error like the standard errors used in confidence intervals (from the CLT)

• Step 3: The data. Figure out what the sample mean is from your data

• Step 4: Find the p-value. It’s the area under the sampling distribution more extreme than the observed sample mean. Multiply this area by 2 if you have a 2-sided alternative.

• Step 5: Make a conclusion. If the p-value is smaller than α, reject H0. If the p-value is larger than α, fail to reject H0
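Steps 2 through 5 can be collected into one small function. The following is a rough Python sketch under the slides' assumption that the sampling distribution is Normal; the function name, its parameters, and the numbers in the usage example (sample mean 52, SD 10, n = 100, H0 mean 50) are all made up for illustration.

```python
import math


def z_test_mean(sample_mean, sample_sd, n, mu0, alternative="greater", alpha=0.05):
    """Normal-based significance test for a mean (steps 2-5)."""
    se = sample_sd / math.sqrt(n)                 # Step 2: standard error under H0
    z = (sample_mean - mu0) / se                  # Step 3: standardize the sample mean
    upper = 0.5 * math.erfc(z / math.sqrt(2))     # area to the right of z
    if alternative == "greater":                  # Step 4: pick the 'more extreme' area
        p_value = upper
    elif alternative == "less":
        p_value = 1 - upper
    elif alternative == "two-sided":
        p_value = 2 * min(upper, 1 - upper)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    reject = p_value < alpha                      # Step 5: compare with alpha
    return z, p_value, reject


# Made-up illustrative data, not from the lecture.
z, p_value, reject = z_test_mean(52, 10, 100, 50)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Step 0 and Step 1 show up as the `alpha` and `alternative` arguments, which is a reminder that both are chosen before looking at the data.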


Your turn

A doctor claims that 17 year olds have an average body temperature that is higher than the commonly accepted average human temperature of 98.6 degrees Fahrenheit. A simple random sample of 25 people, each of age 17, is selected. The average temperature of the 17 year olds is found to be 98.83 degrees, with a standard deviation of 0.6 degrees.

The doctor hires you to perform a statistical significance test to check the validity of his claim.

Perform the significance test.

How would your work change if he instead suspected that 17 year olds have a temperature different from 98.6 but wasn’t sure if they were hotter or colder?


Your turn

• Step 0: Significance level: α = 0.05

• Step 1: Hypotheses: H0 : µ = 98.6 vs. Ha : µ > 98.6

• Step 2: Sampling distribution: Normal with mean = 98.6 and SD = $0.6/\sqrt{25} = .12$

• Step 3: Data: Observation is 98.83, and we standardize it to $\frac{98.83 - 98.6}{.12} = 1.9$

• Step 4: P-value: From Table B, 1 - .9713 = .0287

• Step 5: Since .0287 < .05 (p-value < α), we reject the null hypothesis and claim that we have a significant test result at the .05 level. So, we conclude that we have enough evidence to reject the null hypothesis that 17-year-olds have a 98.6 degree average temperature in favor of the claim that they have a higher average body temperature.
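As a check on the arithmetic, the sketch below reproduces the one-sided calculation twice: once with the standard score rounded to one decimal, as a Table B lookup at 1.9 does, and once unrounded. The helper name `upper_tail` is an illustrative assumption. The two p-values differ slightly, but both fall below .05.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard Normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


mu0, xbar, s, n = 98.6, 98.83, 0.6, 25    # H0 mean, sample mean, sample SD, sample size
se = s / math.sqrt(n)                      # 0.6 / 5 = .12
z = (xbar - mu0) / se                      # about 1.92 before rounding

p_table = upper_tail(round(z, 1))          # standard score rounded to 1.9, as on the slide
p_exact = upper_tail(z)                    # no rounding of the standard score

print(f"rounded z: p-value = {p_table:.4f}")
print(f"exact z:   p-value = {p_exact:.4f}")
```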


Your turn

[Figure: Sampling Distribution of Sample Mean. Horizontal axis: mean body temp (97 to 100). Shows the sampling distribution assuming the null hypothesis is true (µ = 98.6), the sample mean from 17 year olds (98.83), and the p-value (.0287).]


Your turn

And if we instead did the two-sided test,

• Step 0: Significance level: α = 0.05

• Step 1: Hypotheses: H0 : µ = 98.6, Ha : µ ≠ 98.6

• Step 2: Sampling distribution: Normal with mean = 98.6 and SD = $0.6/\sqrt{25} = .12$

• Step 3: Data: Observation is 98.83, and we standardize it to $\frac{98.83 - 98.6}{.12} = 1.9$

• Step 4: P-value: From Table B, 2 × (1 - .9713) = 2 × .0287 = .0574

• Step 5: Since .0574 > .05 (p-value > α), we now fail to reject the null hypothesis! So we don’t have enough evidence at the .05 significance level to suggest that 17-year-olds have a different average body temperature than the average human.
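The flip in the conclusion comes entirely from doubling the tail area. A short Python sketch of the comparison, using the unrounded standard score (so the p-values are slightly smaller than the Table B values above); the variable names are illustrative assumptions.

```python
import math

alpha = 0.05
mu0, xbar, s, n = 98.6, 98.83, 0.6, 25

z = (xbar - mu0) / (s / math.sqrt(n))                   # standard score of the sample mean
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))         # Ha: mu > 98.6
p_two_sided = 2 * 0.5 * math.erfc(abs(z) / math.sqrt(2))  # Ha: mu != 98.6

for label, p in (("one-sided", p_one_sided), ("two-sided", p_two_sided)):
    verdict = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{label}: p-value = {p:.4f} -> {verdict}")
```

Same data, same α, different alternative hypothesis: the one-sided test rejects and the two-sided test does not, which is why the alternative has to be written down in Step 1 rather than chosen after seeing the data.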


Your turn

[Figure: Sampling Distribution of Sample Mean. Horizontal axis: mean body temp (97 to 100). Shows the sampling distribution assuming the null hypothesis is true, the sample mean from 17 year olds, and the p-value.]


Homework

• The final HW is up on the website

• After today you can finish reading Ch. 22

• Do problems:

  22.24 (use significance level α = 5% = .05)
  22.27 (use significance level α = 1% = .01)
  22.28
  22.32
  22.34

• A professor once claimed to Aaron that in a small discussion course, about 10% of the students fall asleep at some point during class. During Monday’s lecture, Aaron counted that 1 out of the 20 students in class was asleep at some point. Is this evidence that the true proportion is different from 10%? (use significance level α = 5% = .05, and note that you will need a two-sided alternative hypothesis)
