chapter 22, part 2: computing p-values for …azimmer/lect23_ch21...-0.05 0.00 0.05 0.10 0.15 0.20 0...
TRANSCRIPT
Reminders
• Last HW and Last quiz on Thursday
• My office hours will be today from 11-1
• If you won’t be around during the final week to take the Final Project, please email me ASAP to arrange a time for you to take it.
1
Warmup
• A drug company develops an AIDS treatment that they hope will reduce the proportion of AIDS patients who die within 5 years. In a randomized controlled trial, 35% of patients in the control group died within 5 years. The drug company would like to show that the proportion of patients who die within 5 years in the treatment group is less than this.
• What is the null hypothesis for this experiment?
• What is the alternative hypothesis for this experiment?
• H0 : p = 0.35 (where p is the proportion of treatment-group patients who die within 5 years)
• Ha : p < 0.35
3
Warmup
• It turns out that 28% of the patients in the treatment group died within 5 years. The drug company calculates that the p-value for the experiment is .014. What does this p-value mean?
• If the null hypothesis were true, there would be a .014 chance (14 in 1000) of observing a result as extreme as ours or more extreme (28% or fewer of the treatment group dying within 5 years).
• Before the trial, the drug company set the significance level of the test at α = 1% = .01. What is the conclusion of this experiment?
• Since the p-value (.014) is larger than the significance level (.01), we fail to reject the null hypothesis: the difference we observe could be due to random chance alone. So, we don’t have enough evidence to conclude that the treatment group has a significantly lower percent of people dying within 5 years.
5
Chapter 22, Part 2: Computing p-values for significance tests
Aaron Zimmerman
STAT 220 - Summer 2014
Department of Statistics
University of Washington - Seattle
6
Practice
The U.S. military would like to know whether the proportion of women in the military has changed in the last 20 years. In 1992, they know that 4.6% of active-duty soldiers were women. They would like to know if the current proportion is different from this value.
What is the null hypothesis for this experiment?
What is the alternative hypothesis?
H0 : p = 0.046
Ha : p ≠ 0.046
8
Practice
It turns out that 16% of the active-duty soldiers they surveyed are women. The military calculates that the p-value for the experiment is .003.
What does this p-value mean?
It means that if the null hypothesis were true, there would be a 3 in 1000 chance (.003) of observing a result at least this extreme, that is, a sample percent of women at least as far from 4.6% as the 16% we saw.
Before the trial, the military set the significance level of the test at α = 5% = .05. Remember, the p-value of the test is .003.
What is the conclusion of this experiment?
Since the p-value is less than α, we reject the null hypothesis and conclude that our data give us evidence that the percent of active-duty soldiers who are women has changed since 1992.
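The decision rule used in the warmup and practice problems can be sketched in a few lines of Python. This is only an illustration and not part of the original slides: the function name `decide` is made up, and treating a p-value exactly equal to α as "not significant" is an assumption, since the slides only cover the smaller and larger cases.

```python
def decide(p_value: float, alpha: float) -> str:
    """Compare a p-value to a significance level that was fixed before the test."""
    # Assumption: a p-value exactly equal to alpha is treated as not significant.
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


# Warmup: p-value .014 with alpha = .01
print(decide(0.014, 0.01))  # fail to reject H0

# Practice: p-value .003 with alpha = .05
print(decide(0.003, 0.05))  # reject H0
```

Note that the same p-value can lead to different conclusions under different significance levels, which is why α is chosen before looking at the data.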
10
Steps of a Test of Significance
• Returning to our motivating example from Monday, remember that the prosecution in the Kristin Gilbert case found that there were 34/1384 = .025 deaths per shift when Nurse Gilbert wasn’t working, and that there were 40/257 = .156 deaths per shift when she was working.
• We’d like to know if the high rate of deaths during her shifts can be explained by random variation. That is, we’d like to know if the rate of deaths during her shifts is truly different from .025.
• Question: At significance level α = .05, is there sufficient evidence against the null hypothesis that the rate of deaths on Nurse Gilbert’s shifts equals the baseline rate of .025?
11
Step 0 and 1: Significance Level & The Hypotheses
• Before we even start, we set the significance level (α = .05)
• Remember, the claim being tested in a statistical test is called the null hypothesis (H0).
• Nurse Gilbert’s defense claims that she’s unlucky and that the rate of deaths during her shifts is the same as everyone else’s (.025).
  ◦ So, H0 : p = 0.025
• The statement we hope or suspect is true instead of H0 is called the alternative hypothesis (Ha or H1).
• The prosecution wants to show that the rate of deaths under Nurse Gilbert is larger than .025.
  ◦ So, Ha : p > 0.025
12
Step 2: The Sampling Distribution (if H0 is true)
• Remember, in a test of significance, we start by assuming that H0 is true
• If H0 (p = 0.025) is true, what is the sampling distribution of p̂?
  ◦ The sampling distribution is Normal
  ◦ The mean is p = 0.025
  ◦ The standard deviation is: $\sqrt{p(1-p)/n} = \sqrt{.025(1-.025)/257} = .00974$
[Figure: Sampling Distribution; horizontal axis from -0.05 to 0.20]
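The standard deviation in Step 2 can be checked numerically. The snippet below is a minimal sketch, not something from the slides; the function name `null_sd_of_proportion` is made up for illustration.

```python
import math


def null_sd_of_proportion(p0: float, n: int) -> float:
    """Standard deviation of the sample proportion, assuming H0: p = p0 is true."""
    return math.sqrt(p0 * (1 - p0) / n)


# Nurse Gilbert example: H0 says p = 0.025, and she worked n = 257 shifts.
sd = null_sd_of_proportion(0.025, 257)
print(round(sd, 5))  # 0.00974
```

The SD is computed from the value of p claimed in H0, not from the observed p̂, because the whole sampling distribution is built under the assumption that H0 is true.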
13
Step 3: The Data
• There were 40 deaths out of 257 shifts under Nurse Gilbert
• So, p̂ = 40/257 = .156
[Figure: Sampling Distribution; horizontal axis from -0.05 to 0.20]
14
Step 4: The p-value (NEW)
• Remember: a p-value is the probability of observing an outcome as extreme or more extreme than what we actually observed if the null hypothesis were true
• In this problem, the alternative hypothesis is one-sided (p > 0.025)
• So, the p-value is the area under the Normal curve to the right of the observation, that is, the area that is as far or further above the mean of the distribution than what we observed.
[Figure: Sampling Distribution; horizontal axis p. Caption: p-value is the 'more extreme' area under the Normal curve]
NOTE: We’d look at the area under the curve to the left of the observation if the alternative was Ha : p < .025.
15
Step 4: The p-value (NEW)
• What percent of the sampling distribution is greater than the observation of 40/257 = 0.156?
  ◦ Mean = 0.025
  ◦ SD = 0.0097
  ◦ Standard score: $(.156 - .025)/.0097 = 13.5$!
• Look up the standard score in Table B. Not so helpful: it just tells us that the p-value must be less than 1 - .9997 = .0003
• My computer says the p-value is less than 1/100,000,000
[Figure: Sampling Distribution; horizontal axis p. Caption: p-value is the 'more extreme' area under the Normal curve]
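When the standard score is far beyond the end of Table B, the tail area can be computed directly. The following is a rough sketch using Python's standard library rather than whatever software was used for the slides; `upper_tail_area` is a made-up helper name. Because it works with unrounded inputs, the standard score comes out near 13.4 rather than the slide's 13.5, which was computed from the rounded values .156 and .0097.

```python
import math


def upper_tail_area(z: float) -> float:
    """Area under the standard Normal curve to the right of z."""
    # P(Z >= z) = 0.5 * erfc(z / sqrt(2)); erfc stays accurate far out in the tail.
    return 0.5 * math.erfc(z / math.sqrt(2))


p0, n, deaths = 0.025, 257, 40
p_hat = deaths / n                   # observed proportion (Step 3)
sd = math.sqrt(p0 * (1 - p0) / n)    # SD of p-hat if H0 is true (Step 2)
z = (p_hat - p0) / sd                # standard score of the observation
p_value = upper_tail_area(z)         # one-sided p-value for Ha: p > p0

print(f"standard score: {z:.1f}")
print(f"p-value below 1/100,000,000: {p_value < 1e-8}")
```

Either way, the conclusion matches the slide: the p-value is far smaller than 1/100,000,000.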
16
Step 5: Conclusion
• A p-value below .00000001 means that there is less than a 1 in 100,000,000 chance that Kristin Gilbert would randomly (and unluckily) have a percent of deaths at least that extreme during her shifts if the proportion of deaths during her shifts was actually .025 (H0).
• Since my significance level is α = .05 and α > p-value, this test IS statistically significant.
• Conclusion: We have enough evidence to reject the null hypothesis and conclude that the percent of deaths during Nurse Gilbert’s shifts is larger than the baseline rate of .025.
• REMEMBER: this doesn’t mean she was killing people, but it does suggest that something different was happening under her watch.
18
Note #1: p-values in 2-sided tests
• In practice, when Ha is two-sided (Ha : p ≠ .025), we calculate the area that’s more extreme than the observation in one direction and then multiply by two
• We do this because in the 2-sided setting, “more extreme” could be extreme and large or extreme and small. Either way gives us evidence against the null hypothesis H0
• We’re not doing it for this problem, but you should be aware of it!
[Figure: Sampling Distribution; horizontal axis p. Caption: 2-sided p-value is found by looking at 'more extreme' in both directions!]
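The "multiply by two" rule can be sketched as follows. This is an illustrative sketch only; the helper names are made up, and the standard scores ±1.96 are arbitrary example values, not numbers from the Gilbert case.

```python
import math


def upper_tail_area(z: float) -> float:
    """Area under the standard Normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_sided_p_value(z: float) -> float:
    """Area at least |z| away from the mean, counting both tails."""
    # Take the tail beyond |z| in one direction, then double it.
    return 2 * upper_tail_area(abs(z))


# A large positive and an equally large negative standard score
# count as equally strong evidence against H0.
for z in (1.96, -1.96):
    print(z, round(two_sided_p_value(z), 3))
```

Both standard scores give a two-sided p-value close to .05, since the Normal curve is symmetric about its mean.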
19
Note #2: Different Sample Sizes
• What if we only saw 7 of Nurse Gilbert’s shifts?
• 7 × 40/257 ≈ 1. So using 1 death in 7 shifts is about the same ratio.
• Then the sampling distribution would have mean .025, but SD = $\sqrt{.025(1-.025)/7} = .059$
• And the standard score would be $(.156 - .025)/.059 = 2.22$
• So the p-value would be 1 - .9861 = .0139
• While we still would reject the null at α = .05, the evidence isn’t as strong, and we wouldn’t reject at α = .01.
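To see how the sample size drives the result, the same observed rate can be run through the test with n = 7 and n = 257. This is a hedged sketch with made-up helper names. The computed p-value for n = 7 comes out near .013 rather than the slide's .0139, apparently because the Table B lookup uses the standard score rounded to one decimal place (2.2).

```python
import math


def upper_tail_area(z: float) -> float:
    """Area under the standard Normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def one_sided_test(p_hat: float, p0: float, n: int) -> tuple[float, float]:
    """Return (standard score, p-value) for H0: p = p0 against Ha: p > p0."""
    sd = math.sqrt(p0 * (1 - p0) / n)  # shrinks as n grows
    z = (p_hat - p0) / sd
    return z, upper_tail_area(z)


# Same observed rate (.156) and same H0 (.025); only the number of shifts changes.
z_small, p_small = one_sided_test(0.156, 0.025, 7)
z_large, p_large = one_sided_test(0.156, 0.025, 257)

print(f"n = 7:   z = {z_small:.2f}, p-value = {p_small:.4f}")
print(f"n = 257: z = {z_large:.2f}, p-value = {p_large:.2g}")
```

With only 7 shifts the p-value falls between .01 and .05, so the decision depends on which significance level was chosen; with 257 shifts it is tiny under either level.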
20
Significance Tests for Means
• There’s no reason that we can’t apply the proportions significance testing framework directly to significance tests for means.
• We’ll still use the same steps
• Chat with your neighbor about the strategy we’re going to take to perform significance tests about a mean
Very generally, we find the sampling distribution if the null hypothesis was true, and then we see how unlikely it was to record data as extreme as what we’ve seen (still assuming H0 is true).
22
Steps 0-5 for Significance Tests on Means
• Step 0: Pick a significance level (usually α = .05 unless you have a reason to use a different level)
• Step 1: Write down the hypotheses (both H0 & Ha)
• Step 2: Determine the sampling distribution if H0 is true. It will be Normal with mean equal to the claim in H0 and standard error like the standard errors used in confidence intervals (from the CLT)
• Step 3: The data. Figure out what the sample mean is from your data
• Step 4: Find the p-value. It’s the area under the sampling distribution more extreme than the observed sample mean. Multiply this area by 2 if you have a 2-sided alternative.
• Step 5: Make a conclusion. If the p-value is smaller than α, reject H0. If the p-value is larger than α, fail to reject H0
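Steps 0-5 can be collected into one small function. This is a sketch under stated assumptions, not the course's own code: the function and argument names are invented, it uses the Normal curve as in the slides, and the numbers in the usage example are made up purely for illustration.

```python
import math


def upper_tail_area(z: float) -> float:
    """Area under the standard Normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def significance_test_for_mean(sample_mean, mu0, sd, n,
                               alternative="greater", alpha=0.05):
    """Return (standard score, p-value, reject H0?) for H0: mu = mu0."""
    se = sd / math.sqrt(n)         # Step 2: standard error under H0
    z = (sample_mean - mu0) / se   # Step 3: standardize the sample mean
    # Step 4: area "more extreme" than the observation, given Ha
    if alternative == "greater":
        p_value = upper_tail_area(z)
    elif alternative == "less":
        p_value = upper_tail_area(-z)
    elif alternative == "two-sided":
        p_value = 2 * upper_tail_area(abs(z))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value, p_value < alpha  # Step 5: compare with alpha


# Made-up illustration: H0 says mu = 50; a sample of 100 has mean 52 and SD 10.
z, p_value, reject = significance_test_for_mean(52.0, 50.0, 10.0, 100, "two-sided")
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Passing the alternative as an argument keeps Step 1 explicit: the direction of Ha has to be decided before the p-value can be computed.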
23
Your turn
A doctor claims that 17-year-olds have an average body temperature that is higher than the commonly accepted average human temperature of 98.6 degrees Fahrenheit. A simple random sample of 25 people, each of age 17, is selected. The average temperature of the 17-year-olds is found to be 98.83 degrees, with a standard deviation of 0.6 degrees.
The doctor hires you to perform a statistical significance test to check the validity of his claim.
Perform the significance test.
How would your work change if he instead suspected that 17-year-olds have a temperature different from 98.6 but wasn’t sure if they were hotter or colder?
24
Your turn
• Step 0: Significance level: α = 0.05
• Step 1: Hypotheses: H0 : µ = 98.6 vs. Ha : µ > 98.6
• Step 2: Sampling distribution: Normal with mean = 98.6 and SD = $0.6/\sqrt{25} = .12$
• Step 3: Data: Observation is 98.83, and we standardize it to $(98.83 - 98.6)/.12 = 1.9$
• Step 4: P-value: From Table B, 1 - .9713 = .0287
• Step 5: Since .0287 < .05 (p-value < α), we reject the null hypothesis and claim that we have a significant test result at the .05 level. So, we conclude that we have enough evidence to reject the null hypothesis that 17-year-olds have a 98.6 degree average temperature in favor of the claim that they have a higher average body temperature.
25
Your turn
[Figure: Sampling Distribution of Sample Mean; horizontal axis: mean body temp, from 97 to 100. Labels: Sampling Dist. Assuming Null Hyp. True: µ = 98.6; Sample Mean from 17 year olds (98.83); P-value (.0287)]
26
Your turn
And if we instead did the two-sided test,
• Step 0: Significance level: α = 0.05
• Step 1: Hypotheses: H0 : µ = 98.6, Ha : µ ≠ 98.6
• Step 2: Sampling distribution: Normal with mean = 98.6 and SD = $0.6/\sqrt{25} = .12$
• Step 3: Data: Observation is 98.83, and we standardize it to $(98.83 - 98.6)/.12 = 1.9$
• Step 4: P-value: From Table B, 2 × (1 - .9713) = 2 × .0287 = .0574
• Step 5: Since .0574 > .05 (p-value > α), we now fail to reject the null hypothesis! So we don’t have enough evidence at the .05 significance level to suggest that 17-year-olds have a different average body temperature than the average human.
27
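The one-sided and two-sided versions of the body temperature test can be compared side by side. This is a rough check written for illustration, not the slides' own computation. It runs the numbers twice: once with the standard score rounded to 1.9, as in the Table B lookup, and once unrounded (about 1.92). The helper name `upper_tail_area` is made up.

```python
import math


def upper_tail_area(z: float) -> float:
    """Area under the standard Normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


mu0, sample_mean, s, n, alpha = 98.6, 98.83, 0.6, 25, 0.05
se = s / math.sqrt(n)         # 0.6 / 5 = 0.12
z = (sample_mean - mu0) / se  # about 1.92 before rounding

for label, score in (("rounded", round(z, 1)), ("unrounded", z)):
    one_sided = upper_tail_area(score)  # Ha: mu > 98.6
    two_sided = 2 * one_sided           # Ha: mu != 98.6
    print(f"{label}: z = {score:.2f}, "
          f"one-sided p = {one_sided:.4f} (reject: {one_sided < alpha}), "
          f"two-sided p = {two_sided:.4f} (reject: {two_sided < alpha})")
```

In both cases the one-sided p-value falls below .05 and the two-sided p-value falls above it, so the same data lead to different conclusions depending on which alternative was chosen in Step 1.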
Your turn
[Figure: Sampling Distribution of Sample Mean; horizontal axis: mean body temp, from 97 to 100. Labels: Sampling Dist. Assuming Null Hyp. True; Sample Mean from 17 year olds; P-value]
28
Homework
• The final HW is up on the website
• After today you can finish reading Ch. 22
• Do problems:
22.24 (use significance level α = 5% = .05)
22.27 (use significance level α = 1% = .01)
22.28
22.32
22.34) A professor once claimed to Aaron that in a small discussion course, about 10% of the students fall asleep at some point during class. During Monday’s lecture, Aaron counted that 1 out of the 20 students in class was asleep at some point. Is this evidence that the true proportion is different from 10%? (use significance level α = 5% = .05, and note that you will need a two-sided alternative hypothesis)
29