Chapter 22: Comparing Two Proportions
![Page 1: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/1.jpg)
Chapter 22: Comparing Two Proportions
AP Statistics
Unit 5
![Page 2: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/2.jpg)
Comparing Two Proportions
• Comparisons between two percentages are much more common than questions about isolated percentages. And they are more interesting.
• We often want to know how two groups differ, whether a treatment is better than a placebo control, or whether this year’s results are better than last year’s.
![Page 3: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/3.jpg)
Another Ruler
• In order to examine the difference between two proportions, we need another ruler: the standard deviation of the sampling distribution model for the difference between two proportions.
• Recall that standard deviations don’t add, but variances do. In fact, the variance of the sum or difference of two independent random quantities is the sum of their individual variances.
![Page 4: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/4.jpg)
The Standard Deviation of the Difference Between Two Proportions
• Proportions observed in independent random samples are independent. Thus, we can add their variances. So…
• The standard deviation of the difference between two sample proportions is

$$SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

• Thus, the standard error is

$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$

Remember: it’s always a +, even though we are looking at a difference.
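The standard error formula above can be sketched in a few lines of Python. This is only an illustration: the function name `se_diff` and the counts (40 of 100 and 30 of 100) are made up for the example and do not come from the slides.

```python
import math

def se_diff(x1, n1, x2, n2):
    """Standard error of p1_hat - p2_hat, computed from observed counts."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    var1 = p1_hat * (1 - p1_hat) / n1   # estimated variance of p1_hat
    var2 = p2_hat * (1 - p2_hat) / n2   # estimated variance of p2_hat
    return math.sqrt(var1 + var2)       # variances add; then take the root

# Hypothetical counts, chosen only for illustration
se = se_diff(40, 100, 30, 100)
se1 = math.sqrt(0.4 * 0.6 / 100)        # SE of the first proportion alone
se2 = math.sqrt(0.3 * 0.7 / 100)        # SE of the second proportion alone
print(round(se, 4), round(se1 + se2, 4))
```

The combined standard error is larger than either individual one but smaller than their sum, which is what "standard deviations don't add, but variances do" means in practice.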
![Page 5: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/5.jpg)
Assumptions and Conditions
• Independence Assumptions:
  • Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.
  • The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.
  • Independent Groups Assumption: The two groups we’re comparing must be independent of each other.
![Page 6: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/6.jpg)
Assumptions and Conditions (cont.)
• Sample Size Assumption:
  • Each of the groups must be big enough…
  • Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.
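The two numeric conditions (the 10% Condition and the Success/Failure Condition) can be checked mechanically. The sketch below is one way to do that; the function names and all the numbers are hypothetical. Randomization and independence of the groups cannot be verified by code and still have to be argued from how the data were collected.

```python
def success_failure_ok(successes, n, minimum=10):
    """At least `minimum` successes and `minimum` failures observed in a group."""
    failures = n - successes
    return successes >= minimum and failures >= minimum

def ten_percent_ok(n, population_size):
    """Sample is no more than 10% of the population it was drawn from."""
    return n <= 0.10 * population_size

def numeric_conditions_ok(x1, n1, pop1, x2, n2, pop2):
    """Check both numeric conditions for both groups."""
    return (success_failure_ok(x1, n1) and success_failure_ok(x2, n2)
            and ten_percent_ok(n1, pop1) and ten_percent_ok(n2, pop2))

# Hypothetical groups: 40 of 100 and 30 of 100, each from a population of 5000
print(numeric_conditions_ok(40, 100, 5000, 30, 100, 5000))
```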
![Page 7: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/7.jpg)
The Sampling Distribution
• We already know that for large enough samples, each of our proportions has an approximately Normal sampling distribution.
• The same is true of their difference.
![Page 8: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/8.jpg)
The Sampling Distribution (cont.)
• Provided that the sampled values are independent, the samples are independent, and the sample sizes are large enough, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is modeled by a Normal model with
  • Mean: $\mu = p_1 - p_2$
  • Standard deviation:

$$SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
![Page 9: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/9.jpg)
Two-Proportion z-Interval
• When the conditions are met, we are ready to find the confidence interval for the difference of two proportions:
• The confidence interval is

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$$

where

$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$

• The critical value z* depends on the particular confidence level, C, that you specify.
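As a rough sketch of the interval calculation, the Python below computes the unpooled standard error, looks up z* from the standard Normal model, and returns the two endpoints. The function name `two_prop_z_interval` and the counts are hypothetical; only the formula comes from the slide.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 (unpooled standard error)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # z* leaves (1 - C)/2 in each tail of the standard Normal model
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Hypothetical counts: 40 of 100 in group 1, 30 of 100 in group 2
low, high = two_prop_z_interval(40, 100, 30, 100)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

Raising the confidence level makes z* larger and the interval wider; an interval that contains 0 is consistent with no difference between the groups.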
![Page 10: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/10.jpg)
Steps for two-proportion z-interval
1. Check Conditions and show that you have checked these!
• Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.
• The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.
• Independent Groups Assumption: The two groups we’re comparing must be independent of each other.
• Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.
![Page 11: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/11.jpg)
Steps for two-proportion z-interval (cont.)
2. State the test you are about to conduct. Ex) Two-Proportion z-Interval
4. Calculate your z-interval
5. State your conclusion IN CONTEXT. Ex) We are 95% confident that the support group program could raise the proportion of smokers who manage to quit using the patch by between 2 and 22 percentage points.
![Page 12: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/12.jpg)
2-Proportion Confidence Interval Example
The table below describes the effect of preschool on later use of social services:
Set up a 95% confidence interval. Interpret your results.
| Population | Population description | Sample size | Sample proportion |
|---|---|---|---|
| 1 | Control | | |
| 2 | Preschool | | |
![Page 13: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/13.jpg)
Everyone into the Pool
• The typical hypothesis test for the difference in two proportions is the one of no difference (when they are equal). In symbols, H0: p1 – p2 = 0.
• Since we are hypothesizing that there is no difference between the two proportions, both groups share one common proportion, and the standard deviation of each sample proportion is built from that same value.
• Since this is the case, we combine (pool) the counts to get one overall proportion.
![Page 14: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/14.jpg)
Everyone into the Pool (cont.)
• The pooled proportion is

$$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$$

where $Success_1 = n_1 \hat{p}_1$ and $Success_2 = n_2 \hat{p}_2$.
• If the numbers of successes are not whole numbers, round them first. (This is the only time you should round values in the middle of a calculation.)
![Page 15: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/15.jpg)
Everyone into the Pool (cont.)
• We then put this pooled value into the formula, substituting it for both sample proportions in the standard error formula:

$$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}}$$
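The pooling step can be sketched as follows. The function name `pooled_se` and the counts are hypothetical and serve only to show the arithmetic: the success counts are combined into one proportion, and that one value replaces both sample proportions in the standard error.

```python
import math

def pooled_se(x1, n1, x2, n2):
    """Pooled standard error of p1_hat - p2_hat, used when testing H0: p1 = p2."""
    p_pooled = (x1 + x2) / (n1 + n2)    # one overall proportion from the combined counts
    q_pooled = 1 - p_pooled
    return math.sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)

# Hypothetical counts: 40 of 100 and 30 of 100, so p_pooled = 70/200 = 0.35
se_pooled = pooled_se(40, 100, 30, 100)
print(round(se_pooled, 4))
```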
![Page 16: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/16.jpg)
Compared to What?
• We’ll reject our null hypothesis if we see a large enough difference in the two proportions.
• How can we decide whether the difference we see is large?
  • Just compare it with its standard deviation.
• Unlike previous hypothesis testing situations, the null hypothesis doesn’t provide a standard deviation, so we’ll use a standard error (here, pooled).
![Page 17: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/17.jpg)
Two-Proportion z-Test
• The conditions for the two-proportion z-test are the same as for the two-proportion z-interval.
• We are testing the hypothesis H0: p1 – p2 = 0, or, equivalently, H0: p1 = p2.
• Because we hypothesize that the proportions are equal, we pool them to find

$$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$$
![Page 18: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/18.jpg)
Steps for two-proportion z-tests
1. Check Conditions and show that you have checked these!
• Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.
• The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.
• Independent Groups Assumption: The two groups we’re comparing must be independent of each other.
• Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.
![Page 19: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/19.jpg)
Steps for two-proportion z-tests (cont.)
2. State the test you are about to conduct. Ex) Two-proportion z-test
3. Set up your hypotheses
H0:
HA:
4. Calculate your test statistic
5. Draw a picture of your desired area under the Normal model, and calculate your P-value.
![Page 20: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/20.jpg)
Steps for two-proportion z-tests (cont.)
6. Make your conclusion.

| P-Value | Action | Conclusion |
|---|---|---|
| Low | Reject H0 | The difference in sample proportions is sufficient evidence to conclude HA in context. |
| High | Fail to reject H0 | The difference in sample proportions does not provide us with sufficient evidence to conclude HA in context. |
![Page 21: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/21.jpg)
2-Proportion z-Test Example
• High levels of cholesterol in the blood are associated with higher risk of heart attacks. Will using a drug to lower blood cholesterol reduce heart attacks? The Helsinki Heart Study looked at this question. Middle-aged men were assigned at random to one of two treatments: 2051 men took the drug gemfibrozil to reduce their cholesterol levels, and a control group of 2030 men took a placebo. During the next five years, 56 men in the gemfibrozil group and 84 men in the placebo group had heart attacks. What are the proportions and is the benefit of the drug statistically significant?
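One way to work this example is sketched below in Python, using the counts given in the problem. Two things are assumptions rather than part of the slide: the variable names, and the one-sided alternative HA: p_drug – p_placebo < 0 (the drug lowers the heart attack rate), which seems the natural reading of "benefit" but is not stated explicitly.

```python
import math
from statistics import NormalDist

# Counts from the Helsinki Heart Study example
x_drug, n_drug = 56, 2051          # heart attacks, gemfibrozil group
x_placebo, n_placebo = 84, 2030    # heart attacks, placebo group

p_drug = x_drug / n_drug
p_placebo = x_placebo / n_placebo

# Under H0: p_drug = p_placebo, pool the counts
p_pooled = (x_drug + x_placebo) / (n_drug + n_placebo)
q_pooled = 1 - p_pooled
se_pooled = math.sqrt(p_pooled * q_pooled / n_drug + p_pooled * q_pooled / n_placebo)

# Test statistic: observed difference measured in pooled standard errors
z = (p_drug - p_placebo) / se_pooled

# Assumed one-sided alternative HA: p_drug - p_placebo < 0, so use the lower tail
p_value = NormalDist().cdf(z)

print(f"p_drug = {p_drug:.4f}, p_placebo = {p_placebo:.4f}")
print(f"z = {z:.2f}, one-sided P-value = {p_value:.4f}")
```

The sample proportions come out near 2.7% and 4.1%, with z about -2.47 and a one-sided P-value below 0.01, so under these assumptions the data give evidence that the drug group had a lower heart attack rate. Because the men were assigned to treatments at random, a causal reading is reasonable here.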
![Page 22: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/22.jpg)
Calculator Tips
• STAT → TESTS
• 6: 2-PropZTest
• Enter values
• Calculate
![Page 23: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/23.jpg)
What Can Go Wrong?
• Don’t use two-sample proportion methods when the samples aren’t independent.
  • These methods give wrong answers when the independence assumption is violated.
• Don’t apply inference methods when there was no randomization.
  • Our data must come from representative random samples or from a properly randomized experiment.
• Don’t interpret a significant difference in proportions causally.
  • Be careful not to jump to conclusions about causality.
![Page 24: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/24.jpg)
Recap
• We’ve now looked at inference for the difference in two proportions.
• Perhaps the most important thing to remember is that the concepts and interpretations are essentially the same; only the mechanics have changed slightly.
![Page 25: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/25.jpg)
Recap (cont.)
• Hypothesis tests and confidence intervals for the difference in two proportions are based on Normal models.
  • Both require us to find the standard error of the difference in two proportions.
• We do that by adding the variances of the two sample proportions, assuming our two groups are independent.
• When we test a hypothesis that the two proportions are equal, we pool the sample data; for confidence intervals we don’t pool.
![Page 26: Chapter 22: Comparing Two Proportions](https://reader035.vdocuments.net/reader035/viewer/2022070421/5681619d550346895dd1527a/html5/thumbnails/26.jpg)
Assignments: pp. 519 – 522
• Day 1: # 1, 7, 9, 18
• Day 2: # 3, 10, 20, 22
• Day 3: # 6, 14, 16, 30