chapter 10apstatsgrabowski.weebly.com/.../3/0/3/4/30343621/ch10_solutions.pdf · chapter 10:...


Upload: truongngoc

Post on 08-Feb-2018



Chapter 10: Comparing Two Proportions 221

Chapter 10 Section 10.1

Check Your Understanding, page 608:

1. Normal. For the crackers from Bag 1, $np = 50(0.25) = 12.5$ and $n(1-p) = 50(0.75) = 37.5$. For the crackers from Bag 2, $np = 40(0.35) = 14$ and $n(1-p) = 40(0.65) = 26$. All of those numbers are at least 10.

2. $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2 = 0.25 - 0.35 = -0.10$ and

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} = \sqrt{\frac{0.25(0.75)}{50} + \frac{0.35(0.65)}{40}} = 0.0971.$$

3. $$P(\hat{p}_1 - \hat{p}_2 \le -0.02) = P\left(z \le \frac{-0.02 - (-0.10)}{0.0971}\right) = P(z \le 0.82) = 0.7939.$$

4. We would not be surprised. In about 79% of samples the value of $\hat{p}_1 - \hat{p}_2$ will be less than or equal to $-0.02$.
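The calculation in parts 2 and 3 can be checked numerically. The sketch below is only an illustration: the function name `diff_sampling_dist` is hypothetical, and it relies on Python's standard-library `statistics.NormalDist` rather than a Normal table, so the probability differs slightly from the table value 0.7939 because z is not rounded to 0.82.

```python
from math import sqrt
from statistics import NormalDist


def diff_sampling_dist(p1, n1, p2, n2):
    """Mean and standard deviation of p1_hat - p2_hat for two independent samples."""
    mean = p1 - p2
    sd = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return mean, sd


# Bag 1: p = 0.25, n = 50; Bag 2: p = 0.35, n = 40 (values from the solution above)
mean, sd = diff_sampling_dist(0.25, 50, 0.35, 40)

# P(p1_hat - p2_hat <= -0.02) under the Normal approximation
prob = NormalDist(mean, sd).cdf(-0.02)

print(f"mean = {mean:.2f}, sd = {sd:.4f}, P = {prob:.4f}")
```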

Check Your Understanding, page 611:

1. State: Our parameters of interest are $p_1$ = proportion of teens who go online every day and $p_2$ = proportion of adults who go online every day. We want to estimate the difference $p_1 - p_2$ at a 90% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. Random: Both samples were selected randomly. Normal: The numbers of successes and failures in both groups are at least 10 (Teens: 504 successes, 296 failures. Adults: 1,532 successes, 721 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 8,000 teens and 22,530 adults in the U.S.). The conditions are met. Do: From the data we find that $n_1 = 800$, $\hat{p}_1 = 0.63$, $n_2 = 2253$, and $\hat{p}_2 = 0.68$. So our 90% confidence interval is

$$(0.63 - 0.68) \pm 1.645\sqrt{\frac{(0.63)(0.37)}{800} + \frac{(0.68)(0.32)}{2253}} = -0.05 \pm 0.0324 = (-0.0824, -0.0176).$$

Conclude: We are 90% confident that the interval from $-0.0824$ to $-0.0176$ captures the true difference (teens minus adults) in the proportions of U.S. teens and adults who go online every day. This interval suggests that between 1.76% and 8.24% more adults than teens are online every day.

Check Your Understanding, page 619:

1. State: We want to perform a test at the $\alpha = 0.05$ significance level of $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 > 0$, where $p_1$ is the actual proportion of children who did not attend preschool that used social services later and $p_2$ is the actual proportion of children who did attend preschool that used social services later. Plan: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: This was a randomized experiment. Normal: The numbers of successes and failures in both groups are at least 10 (No preschool: 49 successes, 12 failures. Preschool: 38 successes, 24 failures). Independent: Due to the random assignment, these two groups of children can be viewed as independent. Individual observations in each group should also be independent: knowing one child’s need of social services gives no information about another child’s need of social services. The conditions are met. Do:


222 The Practice of Statistics for AP*, 4/e

The proportions of those using social services in each group are $\hat{p}_1 = \frac{49}{61} = 0.8033$ and $\hat{p}_2 = \frac{38}{62} = 0.6129$. The pooled proportion is $\hat{p}_C = \frac{49 + 38}{61 + 62} = \frac{87}{123} = 0.7073$. The test statistic is

$$z = \frac{(0.8033 - 0.6129) - 0}{\sqrt{\frac{(0.7073)(1 - 0.7073)}{61} + \frac{(0.7073)(1 - 0.7073)}{62}}} = 2.32.$$

Since this is a one-sided test, the P-value is $P(z > 2.32) = 0.0102$. Conclude: Since the P-value is less than 0.05, we reject the null hypothesis and conclude that children like those in this study who participate in preschool are less likely to use social services later in life.

2. We are 95% confident that the interval from 0.033 to 0.347 captures the true difference (no preschool minus preschool) in the proportions of children who would need social services later in life. This suggests that between 3.3% and 34.7% more children who did not participate in preschool will need social services later in life than those who did participate in preschool. The interval gives us a range of plausible values rather than just making a decision about one specific value.

Exercises, page 621:

10.1 (a) Counts will be obtained from the samples so this is a problem about comparing proportions. (b) This is an observational study comparing random samples selected from two independent populations.

10.2 (a) Counts will be obtained from the samples so this is a problem about comparing proportions. (b) This is an observational study comparing random samples selected from two independent populations.

10.3 (a) Scores (numerical values) will be obtained from the samples so this is a problem about comparing means. (b) This is an example of a randomized experiment.

10.4 (a) Amount charged (numerical values) will be obtained from the samples so this is a problem about comparing means. (b) This is an example of a randomized experiment.

10.5 (a) The sampling distribution of $\hat{p}_1 - \hat{p}_2$, where $p_1$ is the actual proportion of red jelly beans in bags for children and $p_2$ is the actual proportion of red jelly beans in bags for adults, is approximately Normal because $n_1 p_1 = 50(0.30) = 15$, $n_1(1 - p_1) = 50(0.7) = 35$, $n_2 p_2 = 100(0.15) = 15$, and $n_2(1 - p_2) = 100(0.85) = 85$ are all at least 10. The mean is $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2 = 0.30 - 0.15 = 0.15$ and the standard deviation is

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} = \sqrt{\frac{(0.3)(0.7)}{50} + \frac{(0.15)(0.85)}{100}} = 0.0740.$$

So

$$P(\hat{p}_1 - \hat{p}_2 \le 0) = P\left(z \le \frac{0 - 0.15}{0.0740}\right) = P(z \le -2.03) = 0.0212.$$

(b) Yes, we might doubt the company’s claim. If the claim is true, there is only about a 2% chance that the proportion of red jelly beans in the child sample would be less than or equal to the proportion in the adult sample. This is not very likely.

10.6 (a) The sampling distribution of $\hat{p}_1 - \hat{p}_2$, where $p_1$ is the actual proportion of high school graduates who pass a basic literacy test and $p_2$ is the actual proportion of high school dropouts who pass a basic literacy test, is approximately Normal because $n_1 p_1 = 60(0.8) = 48$, $n_1(1 - p_1) = 60(0.2) = 12$, $n_2 p_2 = 75(0.4) = 30$, and


$n_2(1 - p_2) = 75(0.6) = 45$ are all at least 10. The mean is $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2 = 0.80 - 0.40 = 0.40$ and the standard deviation is

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} = \sqrt{\frac{0.8(0.2)}{60} + \frac{0.4(0.6)}{75}} = 0.0766.$$

So

$$P(\hat{p}_1 - \hat{p}_2 \ge 0.2) = P\left(z \ge \frac{0.2 - 0.4}{0.0766}\right) = P(z \ge -2.61) = 0.9955.$$
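As a rough cross-check of the Normal approximation in part (a), the same probability can be estimated by simulation. This is a sketch under stated assumptions, not part of the textbook solution: the function name `simulate_diff_at_least`, the number of trials, and the seed are all hypothetical choices. The comparison is carried out on integer counts so that floating-point rounding cannot misclassify a difference of exactly 0.2.

```python
import random


def simulate_diff_at_least(p1, n1, p2, n2, d_num, d_den, trials=10_000, seed=1):
    """Estimate P(p1_hat - p2_hat >= d_num/d_den) by simulating both samples.

    x1/n1 - x2/n2 >= d_num/d_den is equivalent to
    d_den * (x1*n2 - x2*n1) >= d_num * n1 * n2, which uses integers only.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = sum(rng.random() < p1 for _ in range(n1))  # successes in sample 1
        x2 = sum(rng.random() < p2 for _ in range(n2))  # successes in sample 2
        if d_den * (x1 * n2 - x2 * n1) >= d_num * n1 * n2:
            hits += 1
    return hits / trials


# Graduates: p1 = 0.8, n1 = 60; dropouts: p2 = 0.4, n2 = 75; threshold 0.2 = 1/5
estimate = simulate_diff_at_least(0.8, 60, 0.4, 75, 1, 5)
print(f"simulated P(difference >= 0.2) is about {estimate:.3f}")
```

The estimate should land close to the Normal-approximation value of 0.9955, with small differences expected because the sample proportions are discrete.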

(b) Yes, we might doubt the researcher’s claim. By part (a), there is about a 99.55% chance of getting samples in which the proportion of high school graduates who pass is at least 0.20 higher than the proportion of dropouts who pass, but only about a 0.45% chance of getting samples in which the difference is 0.20 or less.

10.7 The Normal condition is not met because there were only 3 successes in the group from the west side of Woburn. Also, the Random condition is not met. This was not a random sample.

10.8 The Normal condition is not met because there were only 6 successes in the group wearing wrist guards. Also, the Random condition is not met. This was not a random sample.

10.9 The Normal condition is not met. Three of the four counts of successes and failures are less than 10 (indeed, one is 0).

10.10 The Normal condition is not met because there were no successes in the microwave group.

10.11 State: Our parameters of interest are $p_1$ = true proportion of young people who use online instant messaging more than email and $p_2$ = true proportion of older people who use online instant messaging more than email. We want to estimate the difference $p_1 - p_2$ at a 90% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. Random: Both samples were selected randomly. Normal: The numbers of successes and failures in both groups are at least 10 (Young people: 73 successes, 85 failures. Older people: 26 successes, 117 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 1,580 young people and 1,430 older people in the U.S.). The conditions are met. Do: From the data we find that $n_1 = 158$, $\hat{p}_1 = \frac{73}{158} = 0.462$, $n_2 = 143$, and $\hat{p}_2 = \frac{26}{143} = 0.182$. So our 90% confidence interval is

$$(0.462 - 0.182) \pm 1.645\sqrt{\frac{(0.462)(0.538)}{158} + \frac{(0.182)(0.818)}{143}} = 0.28 \pm 0.0841 = (0.1959, 0.3641).$$

Conclude: We are 90% confident that the interval from 0.1959 to 0.3641 captures the true difference in proportions of young people and older people who use online instant messaging more than email. This suggests that between 19.6% and 36.4% more young people than older people use online instant messaging more often than email.

10.12 State: Our parameters of interest are $p_1$ = true proportion of young blacks who listen to rap music every day and $p_2$ = true proportion of young whites who listen to rap music every day. We want to estimate the difference $p_1 - p_2$ at a 95% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. Random: Both samples were selected randomly. Normal: The numbers of successes and failures in both groups are at least 10 (Young blacks: 368 successes, 266 failures. Young whites: 130 successes, 437 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 6,340 young blacks and 5,670 young whites in the U.S.).


The conditions are met. Do: From the data we find that $n_1 = 634$, $\hat{p}_1 = \frac{368}{634} = 0.580$, $n_2 = 567$, and $\hat{p}_2 = \frac{130}{567} = 0.229$. So our 95% confidence interval is

$$(0.580 - 0.229) \pm 1.96\sqrt{\frac{0.580(0.420)}{634} + \frac{0.229(0.771)}{567}} = 0.351 \pm 0.052 = (0.299, 0.403).$$

Conclude: We are 95% confident that the interval from 0.299 to 0.403 captures the true difference in proportions of young blacks and young whites who listen to rap music every day. This suggests that between 29.9% and 40.3% more young blacks than young whites listen to rap music every day.

10.13 (a) State: Our parameters of interest are $p_1$ = actual proportion of young men who live at home and $p_2$ = actual proportion of young women who live at home. We want to estimate the difference $p_1 - p_2$ at a 99% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. Random: Both samples were selected randomly. Normal: The numbers of successes and failures in both groups are at least 10 (Young men: 986 successes, 1,267 failures. Young women: 923 successes, 1,706 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 22,530 young men and 26,290 young women in the U.S.). The conditions are met. Do: From the data we find that $n_1 = 2253$, $\hat{p}_1 = \frac{986}{2253} = 0.438$, $n_2 = 2629$, and $\hat{p}_2 = \frac{923}{2629} = 0.351$. So our 99% confidence interval is

$$(0.438 - 0.351) \pm 2.576\sqrt{\frac{(0.438)(0.562)}{2253} + \frac{(0.351)(0.649)}{2629}} = 0.087 \pm 0.036 = (0.051, 0.123).$$

Conclude: We are 99% confident that the interval from 0.051 to 0.123 captures the true difference in the proportions of young men and young women who live at home. This suggests that between 5.1% and 12.3% more young men than young women live at home. (b) Since the interval does not contain 0, there is convincing evidence that the two proportions are not the same.

10.14 (a) State: Our parameters of interest are $p_1$ = actual proportion of older black men who fear crime and $p_2$ = actual proportion of older black women who fear crime. We want to estimate the difference

$p_1 - p_2$ at a 90% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. Random: Both samples were selected randomly. Normal: The numbers of successes and failures in both groups are at least 10 (Older black men: 46 successes, 17 failures. Older black women: 27 successes, 29 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 630 older black men and 560 older black women in Atlantic City, NJ). The conditions are met. Do: From the data we find that $n_1 = 63$, $\hat{p}_1 = \frac{46}{63} = 0.730$, $n_2 = 56$, and $\hat{p}_2 = \frac{27}{56} = 0.482$. So our 90% confidence interval is

$$(0.730 - 0.482) \pm 1.645\sqrt{\frac{(0.730)(0.270)}{63} + \frac{(0.482)(0.518)}{56}} = 0.248 \pm 0.143 = (0.105, 0.391).$$

Conclude: We are 90% confident that the interval from 0.105 to 0.391 captures the true difference in the proportions of older black men and older black women who fear crime. This interval suggests that between 10.5% and 39.1% more older black men than older black women fear crime. (b) Since the interval does not contain 0, there is convincing evidence that the two proportions are not the same.
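The two-sample z intervals in Exercises 10.11 through 10.14 all follow the same recipe, which can be sketched in a few lines of Python. The function name `two_prop_z_interval` is a hypothetical helper, and the critical value comes from the standard library's `statistics.NormalDist` instead of a table, so results may differ from the hand calculations in the last decimal place because the sample proportions are not rounded first.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, conf_level):
    """Two-sample z interval for p1 - p2 from success counts and sample sizes."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Critical value z* for the requested confidence level
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Unpooled standard error, as used for confidence intervals
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Exercise 10.14: 46 of 63 older black men and 27 of 56 older black women, 90% level
low, high = two_prop_z_interval(46, 63, 27, 56, 0.90)
print(f"90% CI: ({low:.3f}, {high:.3f})")
```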


10.15 $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 \ne 0$, where $p_1$ is the actual proportion of teens who own an iPod or MP3 player and $p_2$ is the actual proportion of young adults who own an iPod or MP3 player.

10.16 $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 \ne 0$, where $p_1$ is the actual proportion of high school freshmen in Illinois who use anabolic steroids and $p_2$ is the actual proportion of high school seniors in Illinois who use anabolic steroids.

10.17 (a) State: We want to perform a test at the $\alpha = 0.05$ significance level of the hypotheses stated in Exercise 10.15. Plan: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: Both samples were randomly selected. Normal: The numbers of successes and failures in both groups are at least 10 (Teens: 632 successes, 168 failures. Young adults: 268 successes, 132 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 8,000 teens and 4,000 young adults who live in the U.S.). The conditions are met. Do: The proportions of those owning iPods or MP3 players in each group are $\hat{p}_1 = \frac{632}{800} = 0.79$ and $\hat{p}_2 = \frac{268}{400} = 0.67$. The pooled proportion is $\hat{p}_C = \frac{632 + 268}{800 + 400} = \frac{900}{1200} = 0.75$. The test statistic is

$$z = \frac{(0.79 - 0.67) - 0}{\sqrt{\frac{(0.75)(0.25)}{800} + \frac{(0.75)(0.25)}{400}}} = 4.53.$$

Since this is a two-sided test, the P-value is $2P(z > 4.53) \approx 0$. Conclude: Since the P-value is less than 0.05, we reject the null hypothesis and conclude that the actual proportions of teens and young adults who own iPods or MP3 players are different. (b) State: We want to estimate the difference $p_1 - p_2$ at a 95% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. We checked the conditions in (a) and they have been met. Do: From the data we find that $n_1 = 800$, $\hat{p}_1 = 0.79$, $n_2 = 400$, and $\hat{p}_2 = 0.67$. So our 95% confidence interval is

$$(0.79 - 0.67) \pm 1.96\sqrt{\frac{(0.79)(0.21)}{800} + \frac{(0.67)(0.33)}{400}} = 0.12 \pm 0.054 = (0.066, 0.174).$$

Conclude: We are 95% confident that the interval from 0.066 to 0.174 captures the true difference in proportions of teens and young adults who own iPods or MP3 players. This suggests that between 6.6% and 17.4% more teens than young adults own iPods or MP3 players. This is consistent with our answer to part (a). In both cases we ruled out 0 as a plausible value for the difference of proportions.

10.18 (a) State: We want to perform a test at the $\alpha = 0.05$ significance level of the hypotheses stated in Exercise 10.16. Plan: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: Both samples were randomly selected. Normal: The numbers of successes and failures in both groups are at least 10 (Freshmen: 34 successes, 1,645 failures. Seniors: 24 successes, 1,342 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 16,790 freshmen and 13,660 seniors in Illinois). The conditions are met. Do: The proportions of those using

anabolic steroids in each group are $\hat{p}_1 = \frac{34}{1679} = 0.0203$ and $\hat{p}_2 = \frac{24}{1366} = 0.0176$. The pooled proportion is $\hat{p}_C = \frac{34 + 24}{1679 + 1366} = \frac{58}{3045} = 0.0190$. The test statistic is

$$z = \frac{(0.0203 - 0.0176) - 0}{\sqrt{\frac{(0.019)(0.981)}{1679} + \frac{(0.019)(0.981)}{1366}}} = 0.54.$$


Since this is a two-sided test, the P-value is $2P(z > 0.54) = 2(0.2946) = 0.5892$. Conclude: Since the P-value is greater than 0.05, we fail to reject the null hypothesis. We do not have enough evidence to conclude that there is a difference in the actual proportions of freshmen and seniors in Illinois who use anabolic steroids. (b) State: We want to estimate the difference $p_1 - p_2$ at a 95% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. We checked the conditions in (a) and they have been met. Do: From the data we find that $n_1 = 1679$, $\hat{p}_1 = 0.0203$, $n_2 = 1366$, and $\hat{p}_2 = 0.0176$. So our 95% confidence interval is

$$(0.0203 - 0.0176) \pm 1.96\sqrt{\frac{0.0203(0.9797)}{1679} + \frac{0.0176(0.9824)}{1366}} = 0.0027 \pm 0.0097 = (-0.007, 0.0124).$$

Conclude: We are 95% confident that the interval from $-0.007$ to 0.0124 captures the true difference in the proportions of freshmen and seniors who use anabolic steroids. This is consistent with our answer to part (a). In both cases we decided that 0 was a plausible value for the difference in the proportions.

10.19 Here are the errors: (1) Holly fails to identify that $p_1$ is the actual proportion of black men who would marry a person from a lower social class than their own and $p_2$ is the actual proportion of black women who would marry a person from a lower social class than their own. (2) Holly says that she will do a one-sample z test when it should be a two-sample z test. (3) Holly checks all the correct conditions, but what she calls her check of the Normal condition is really her check of the Independent condition and vice versa. She also fails to check the 10% condition as part of the Independent condition check. (4) In the Do stage, Holly miscalculates the value of $\hat{p}_2$. It should

be $\hat{p}_2 = \frac{117}{236} = 0.50$. (5) Holly forgets to compute the pooled proportion as $\hat{p}_C = \frac{91 + 117}{149 + 236} = \frac{208}{385} = 0.54$. (6) Holly uses the wrong formula for the test statistic because she didn’t compute the pooled proportion. The correct statistic is

$$z = \frac{(0.61 - 0.50) - 0}{\sqrt{\frac{(0.54)(0.46)}{149} + \frac{(0.54)(0.46)}{236}}} = 2.11.$$

(7) Holly computes the P-value based on a one-sided test instead of a two-sided test. It should be $2P(z > 2.11) = 2(0.0174) = 0.0348$. (8) Finally, she should have a two-sided conclusion rather than a one-sided conclusion. And Holly shouldn’t have used the word “prove.”

10.20 Here are the errors: (1) Min Jae stated in the conditions check that the data came from two random samples. In fact, they came from a randomized experiment. Also, the Independent condition is checked incorrectly. There was no sampling done here. (2) He computed his sample proportions incorrectly. They should be $\hat{p}_1 = \frac{30}{50} = 0.60$ and $\hat{p}_2 = \frac{22}{50} = 0.44$. (3) He also incorrectly calculated the test statistic. It should be

$$z = \frac{(0.60 - 0.44) - 0}{\sqrt{\frac{(0.52)(0.48)}{50} + \frac{(0.52)(0.48)}{50}}} = 1.60.$$

(4) The new P-value is $P(z > 1.60) = 0.0548$. Despite the new P-value, his conclusion is still good.

10.21 (a) State: We want to perform a test at the $\alpha = 0.05$ significance level of $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 \ne 0$, where $p_1$ is the actual proportion of women in the low-fat group who had a family history of breast cancer and $p_2$ is the actual proportion of women in the control group who had a family


history of breast cancer. Plan: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: This was a randomized experiment. Normal: The numbers of successes and failures in both groups are at least 10 (Low-fat: 3,396 successes, 16,145 failures. Control group: 4,929 successes, 24,365 failures). Independent: Due to the random assignment, these two groups of women can be viewed as independent. Individual observations in each group should also be independent: knowing one woman’s family history doesn’t tell us anything about any other woman’s history. The conditions are met. Do: The proportions of women with a family history of breast cancer in each group are $\hat{p}_1 = \frac{3396}{19541} = 0.174$ and $\hat{p}_2 = \frac{4929}{29294} = 0.168$. The pooled proportion is $\hat{p}_C = \frac{3396 + 4929}{19541 + 29294} = \frac{8325}{48835} = 0.170$. The test statistic is

$$z = \frac{(0.174 - 0.168) - 0}{\sqrt{\frac{(0.17)(0.83)}{19541} + \frac{(0.17)(0.83)}{29294}}} = 1.73.$$

Since this is a two-sided test, the P-value is $2P(z > 1.73) = 2(0.0418) = 0.0836$. Conclude: Since the P-value is greater than 0.05, we fail to reject the null hypothesis. We do not have enough evidence to conclude that there is a difference in the actual proportions of women in the two groups who have a family history of breast cancer. We note, however, that the null hypothesis would have been rejected at the 10% significance level. (b) A Type I error would be to say that the groups are significantly different when they are not; a Type II error would be to say that the groups are not significantly different when they are. A Type II error would be more serious because the experiment would proceed assuming that the two groups were similar to begin with. Any conclusions made about the difference between the two groups at the end of the study would then be suspect.

10.22 (a) State: We want to perform a test at the $\alpha = 0.05$ significance level of $H_0: p_1 - p_2 = 0$ versus

$H_a: p_1 - p_2 \ne 0$, where $p_1$ is the actual proportion of patients like the ones in the study who would have a stroke when taking aspirin alone and $p_2$ is the actual proportion of patients like the ones in the study who would have a stroke when taking both drugs. Plan: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: This was a randomized comparative experiment. Normal: The numbers of successes and failures in both groups are at least 10 (Aspirin alone: 206 successes, 1,443 failures. Additional medicine group: 157 successes, 1,493 failures). Independent: Due to the random assignment, these two groups of patients can be viewed as independent. Individual observations in each group should also be independent: knowing whether one patient had a stroke or not gives no information about another patient. The conditions are met. Do: The proportions of stroke victims in each group are $\hat{p}_1 = \frac{206}{1649} = 0.125$ and $\hat{p}_2 = \frac{157}{1650} = 0.095$. The pooled proportion is $\hat{p}_C = \frac{206 + 157}{1649 + 1650} = \frac{363}{3299} = 0.11$. The test statistic is

$$z = \frac{(0.125 - 0.095) - 0}{\sqrt{\frac{(0.11)(0.89)}{1649} + \frac{(0.11)(0.89)}{1650}}} = 2.75.$$

Since this is a two-sided test, the P-value is $2P(z > 2.75) = 2(0.0030) = 0.006$. Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that there is a difference in the proportions of patients like the ones in this study who suffer strokes depending on whether they take aspirin alone or take the additional medication as well. (b) A Type I error would be to conclude that there is a difference between the stroke rates for the two treatments when there is no difference, and a Type II error would be to conclude that there is no difference between the stroke rates of people on the two different treatments when there actually is. A Type II error would be more serious in this case because we would not market a drug that would reduce the number of strokes that people suffer.
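The two-sample z tests in Exercises 10.17 through 10.22 share one computation: pool the two samples, form the standard error from the pooled proportion, and convert z to a P-value. A minimal Python sketch follows. The function name `two_prop_z_test` and its `alternative` argument are hypothetical, and because the sketch does not round the sample proportions before subtracting, its z for Exercise 10.22 comes out near 2.73 rather than the hand-computed 2.75.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Two-sample z test of H0: p1 - p2 = 0 using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    # Standard error under H0, where both groups share the pooled proportion
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: p1 - p2 > 0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: p1 - p2 < 0
        p_value = std_normal.cdf(z)
    else:                             # Ha: p1 - p2 != 0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value


# Exercise 10.22: 206 strokes among 1649 (aspirin alone), 157 among 1650 (both drugs)
z, p_value = two_prop_z_test(206, 1649, 157, 1650)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

Passing `alternative="greater"` with the preschool counts (49 of 61 versus 38 of 62) gives the one-sided version used in the Check Your Understanding on page 619.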


10.23 (a) We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: This was a randomized comparative experiment. Normal: The numbers of successes and failures in both groups are at least 10 (Prayer group: 44 successes, 44 failures. Control group: 21 successes, 60 failures). Independent: Due to the random assignment, these two groups of women can be viewed as independent. Individual observations in each group should also be independent: knowing whether one woman became pregnant gives no information about another woman. The conditions are met. (b) If there is no difference in pregnancy rates of women who are being prayed for and those who are not, there is a 0.07% chance of seeing a difference in pregnancy rates favoring the prayer group as large as or larger than the one we observed. (c) Since the P-value was less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that the proportion of pregnancies among women like these who are prayed for is higher than for those who are not prayed for. (d) If the women had known whether they were being prayed for, this might have affected their behavior in some way (even unconsciously) that would have affected whether they became pregnant or not.

10.24 (a) We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: This was a randomized comparative experiment. Normal: The numbers of successes and failures in both groups are at least 10 (Acupuncture group: 34 successes, 46 failures. Control group: 21 successes, 59 failures). Independent: Due to the random assignment, these two groups of women can be viewed as independent. Individual observations in each group should also be independent: knowing whether one woman became pregnant gives no information about another woman. The conditions are met. (b) If there is no difference in pregnancy rates of women who receive acupuncture and those who don’t, there is a 1.52% chance of seeing a difference in pregnancy rates favoring the acupuncture group as large as or larger than the one we observed. (c) Since the P-value was less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that the proportion of pregnancies among women like these who receive acupuncture is higher than for those who do not. (d) This study was not blind. The women who received acupuncture knew that they had received the treatment and those in the control group knew that they had not received the treatment. This may affect their behavior (even unconsciously) in such a way as to affect whether they became pregnant or not.

10.25 State: Our parameters of interest are $p_1$ = proportion of IVF patients who are prayed for and become pregnant and $p_2$ = proportion of IVF patients who are not prayed for and become pregnant. We want to estimate the difference $p_1 - p_2$ at a 95% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. Conditions were checked in Exercise 10.23 and were met. Do: From the data we find that $n_1 = 88$, $\hat{p}_1 = \frac{44}{88} = 0.5$, $n_2 = 81$, and $\hat{p}_2 = \frac{21}{81} = 0.259$. So our 95% confidence interval is

$$(0.5 - 0.259) \pm 1.96\sqrt{\frac{(0.5)(0.5)}{88} + \frac{(0.259)(0.741)}{81}} = 0.241 \pm 0.141 = (0.100, 0.382).$$

Conclude: We are 95% confident that the interval from 0.100 to 0.382 captures the true difference in proportions of IVF patients who get pregnant between those who are prayed for and those who are not. This suggests that between 10% and 38.2% more women who are prayed for than those who are not prayed for get pregnant. This interval not only excludes 0 as being a plausible difference (which is what the test concluded), it also gives a range of other plausible values.

10.26 State: Our parameters of interest are $p_1$ = proportion of IVF patients who receive acupuncture and become pregnant and $p_2$ = proportion of IVF patients who do not receive acupuncture and become pregnant. We want to estimate the difference $p_1 - p_2$ at a 95% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. Conditions were checked in Exercise


10.24 and were met. Do: From the data we find that \(n_1 = 80\), \(\hat{p}_1 = \frac{34}{80} = 0.425\), \(n_2 = 80\), and \(\hat{p}_2 = \frac{21}{80} = 0.263\). So our 95% confidence interval is

\[(0.425 - 0.263) \pm 1.96\sqrt{\frac{0.425(0.575)}{80} + \frac{0.263(0.737)}{80}} = 0.162 \pm 0.145 = (0.017, 0.307).\]

Conclude: We are 95% confident that the interval from 0.017 to 0.307 captures the true difference in the proportions of IVF patients who get pregnant between those who receive acupuncture and those who don’t. This suggests that between 1.7% and 30.7% more women who receive acupuncture than those who don’t get pregnant. This interval not only excludes 0 as being a plausible difference (which is what the test concluded), it also gives a range of other plausible values.

10.27 State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: p_1 - p_2 = 0\) versus

\(H_a: p_1 - p_2 \neq 0\) where \(p_1\) is the actual proportion of 4- to 5-year-olds who would sort correctly and \(p_2\) is the actual proportion of 6- to 7-year-olds who would sort correctly. Plan: We should use a two-sample z test for \(p_1 - p_2\) if the conditions are satisfied. We check the conditions. Random: Both samples were selected randomly. Normal: The number of successes and failures in both groups are at least 10 (4- to 5-year-olds: 10 successes, 40 failures. 6- to 7-year-olds: 28 successes, 25 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 500 4- to 5-year-olds and 530 6- to 7-year-olds). The conditions are met. Do: From the computer printout, the test statistic is \(z = -3.45\) and the P-value is 0.001. Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that there is a difference in the sorting abilities between 4- to 5-year-olds and 6- to 7-year-olds.

10.28 (a) If a driver changed speeds because of another car’s speed, then the observations would not be independent. (b) State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: p_1 - p_2 = 0\) versus \(H_a: p_1 - p_2 \neq 0\) where \(p_1\) is the actual proportion of people who speed in the presence of radar and \(p_2\) is the actual proportion of people who speed when there is no obvious radar. Plan: We should use a two-sample z test for \(p_1 - p_2\) if the conditions are satisfied. We check the conditions. Random: Both samples were selected randomly. Normal: The number of successes and failures in both groups are at least 10 (With radar: 1,051 successes, 2,234 failures. Without radar: 5,690 successes, 7,241 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 32,850 cars when there is radar and 129,310 cars when there is no radar). The conditions are met. Do: From the computer printout, the test statistic is \(z = 12.47\) and the P-value is approximately 0. Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that there is a difference in proportion of speeders between when there is obvious radar present and when it isn’t present.

10.29 b. 10.30 d. 10.31 b. 10.32 e.
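The test statistic quoted from the computer printout in Exercise 10.27 can be reproduced from the success and failure counts given there. This is a minimal sketch, assuming the printout used the usual pooled two-proportion z statistic; the function name `two_prop_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Return (z, two_sided_p) for H0: p1 - p2 = 0 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided P-value: twice the tail area beyond |z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Exercise 10.27: 10 of 50 younger children and 28 of 53 older children sorted correctly.
z, p_value = two_prop_z_test(10, 50, 28, 53)
print(round(z, 2), round(p_value, 4))
```

The same function applies to the radar counts in Exercise 10.28; the sign of z depends only on which group is labeled group 1.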


10.33 (a) \(\hat{y} = -13{,}832 + 14{,}954x\) where \(\hat{y}\) = the predicted mileage and \(x\) = the age in years of the cars. (b) For each year older the car is, we predict that the mileage will increase by 14,954 miles. (c) For a 10-year-old car we predict a mileage of \(\hat{y} = -13{,}832 + 14{,}954(10) = 135{,}708\) so the residual is \(110{,}000 - 135{,}708 = -25{,}708\) miles.

10.34 (a) 77% of the variation in mileage is explained by the linear relationship with age of the vehicle. (b) The regression equation goes through the point \((\bar{x}, \bar{y})\) so \(\bar{y} = -13{,}832 + 14{,}954\bar{x} = -13{,}832 + 14{,}954(8) = 105{,}800\) miles. (c) The typical residual is 22,723 miles. (d) No, it would not be reasonable to use the least-squares line to predict a car’s mileage from its age for a teacher. The least-squares line is based on a sample of cars owned and driven by students, not teachers.

Section 10.2

Check Your Understanding, page 632:
1. The shape of the distribution is Normal because both population distributions are Normal.
2. \(\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 = 27 - 17 = 10\) ounces and \(\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{(0.8)^2}{20} + \frac{(0.5)^2}{25}} = 0.205\) ounces.
3. \(P(\bar{x}_1 - \bar{x}_2 > 12) = P\left(z > \frac{12 - 10}{0.205}\right) = P(z > 9.76) \approx 0.\)

4. Since the probability in question 3 is so small, we would be very surprised to find a difference of 12 ounces.

Page 638:
1. State: Our parameters of interest are \(\mu_1\) = the true mean price of wheat in July and \(\mu_2\) = the true mean price of wheat in September. We want to estimate the difference \(\mu_1 - \mu_2\) at a 99% confidence level. Plan: We should use a two-sample t interval for \(\mu_1 - \mu_2\) if the conditions are satisfied. Random: Both samples were selected randomly. Normal: Both sample sizes were larger than 30. Independent: Both samples are less than 10% of their respective populations (there are more than 900 wheat producers in July and 450 wheat producers in September). The conditions are met. Do: From the data we find that \(n_1 = 90\), \(\bar{x}_1 = 2.95\), \(s_1 = 0.22\), \(n_2 = 45\), \(\bar{x}_2 = 3.61\), and \(s_2 = 0.19\). We will use the conservative degrees of freedom, which is 44 in this case. So our 99% confidence interval is

\[(2.95 - 3.61) \pm 2.692\sqrt{\frac{(0.22)^2}{90} + \frac{(0.19)^2}{45}} = -0.66 \pm 0.099 = (-0.759, -0.561).\]

Conclude: We are 99% confident that the interval from –0.759 to –0.561 captures the true difference in the mean price of wheat in July and September. This suggests that the mean price of wheat in July is between $0.561 and $0.759 per bushel less than it is in September.

Check Your Understanding, page 644:
1. State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: \mu_1 - \mu_2 = 0\) versus \(H_a: \mu_1 - \mu_2 > 0\) where \(\mu_1\) is the actual mean breaking strength at 2 weeks and \(\mu_2\) is the actual mean breaking strength at 16 weeks. Plan: We should use a two-sample t test if the conditions are satisfied. Random: This was a randomized comparative experiment. Normal: Since the number of observations in both groups is less than 30 we examine the data. The dotplot below gives the data for both groups. Neither group displays strong skewness or large outliers. Independent: Due to the random assignment,


these two groups of pieces of cloth can be viewed as independent. Individual observations in each group should also be independent: knowing one piece of cloth’s breaking strength gives no information about the breaking strength of another piece of cloth. The conditions are met.

Do: From the data we find that \(n_1 = 5\), \(\bar{x}_1 = 123.8\), \(s_1 = 4.60\), \(n_2 = 5\), \(\bar{x}_2 = 116.4\), and \(s_2 = 16.09\). We will use the conservative degrees of freedom, which is 4 in this case. The test statistic is

\[t = \frac{(123.8 - 116.4) - 0}{\sqrt{\frac{(4.60)^2}{5} + \frac{(16.09)^2}{5}}} = 0.989.\]

Since this is a one-sided test, the P-value is \(P(t > 0.989) = 0.1893\).

Conclude: Since the P-value is greater than 0.05, we fail to reject the null hypothesis. We do not have enough evidence to conclude that the actual mean breaking strength of polyester fabric that is buried for 2 weeks is greater than that of fabric that is buried for 16 weeks.

Exercises, page 652:

10.35 (a) The distribution of \(M - B\) is Normal with mean \(\mu_{M-B} = \mu_M - \mu_B = 188 - 170 = 18\) mg/dl and standard deviation \(\sigma_{M-B} = \sqrt{\sigma_M^2 + \sigma_B^2} = \sqrt{(41)^2 + (30)^2} = 50.80\) mg/dl. (b) \(P(M - B < 0) = P\left(z < \frac{0 - 18}{50.80}\right) = P(z < -0.35) = 0.3632.\)
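The calculation in Exercise 10.35 can be sketched with Python's standard library, whose `NormalDist` objects support subtraction of independent Normal variables (means subtract, variances add). The variable names are illustrative only. Because the code does not round z to two decimals before looking up the area, its probability differs slightly from the Table A value 0.3632.

```python
from statistics import NormalDist

men = NormalDist(mu=188, sigma=41)   # value for a randomly chosen man (mg/dl)
boys = NormalDist(mu=170, sigma=30)  # value for a randomly chosen boy (mg/dl)

# Subtracting NormalDist objects treats the two variables as independent.
diff = men - boys
prob_boy_higher = diff.cdf(0)  # P(M - B < 0)

print(round(diff.mean, 1), round(diff.stdev, 2), round(prob_boy_higher, 4))
```

The same three lines with means 69.3 and 64.5 and standard deviations 2.8 and 2.5 cover Exercise 10.36, using `1 - diff.cdf(2)` for the upper-tail probability.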

10.36 (a) The distribution of \(M - W\) is Normal with mean \(\mu_{M-W} = \mu_M - \mu_W = 69.3 - 64.5 = 4.8\) inches and standard deviation \(\sigma_{M-W} = \sqrt{\sigma_M^2 + \sigma_W^2} = \sqrt{(2.8)^2 + (2.5)^2} = 3.75\) inches. (b) \(P(M - W > 2) = P\left(z > \frac{2 - 4.8}{3.75}\right) = P(z > -0.75) = 0.7734.\)

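10.37 (a) The distribution of \(\bar{x}_M - \bar{x}_B\) is Normal with mean \(\mu_{\bar{x}_M - \bar{x}_B} = \mu_M - \mu_B = 188 - 170 = 18\) mg/dl and standard deviation \(\sigma_{\bar{x}_M - \bar{x}_B} = \sqrt{\frac{\sigma_M^2}{n_M} + \frac{\sigma_B^2}{n_B}} = \sqrt{\frac{(41)^2}{25} + \frac{(30)^2}{36}} = 9.60\) mg/dl. (b) \(P(\bar{x}_M - \bar{x}_B < 0) = P\left(z < \frac{0 - 18}{9.60}\right) = P(z < -1.88) = 0.0301.\)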
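For sample means, the only change from the single-observation case is that each variance is divided by its sample size. The helper below, `diff_of_means_dist`, is a hypothetical name for a sketch of that step under the exercise's assumption of Normal populations and independent samples. It works with the unrounded z-score, so the probability is close to, but not exactly, the tabled 0.0301.

```python
from math import sqrt
from statistics import NormalDist


def diff_of_means_dist(mu1, sigma1, n1, mu2, sigma2, n2):
    """Sampling distribution of xbar1 - xbar2 for independent samples."""
    sd = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return NormalDist(mu1 - mu2, sd)


# Exercise 10.37: 25 men (mean 188, SD 41) and 36 boys (mean 170, SD 30).
dist = diff_of_means_dist(188, 41, 25, 170, 30, 36)
prob = dist.cdf(0)  # P(xbar_M - xbar_B < 0)
print(round(dist.stdev, 2), round(prob, 4))
```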

(c) Yes. The likelihood that the sample mean of the boys is greater than the sample mean of the men is only 3%.

10.38 (a) The distribution of \(\bar{x}_M - \bar{x}_W\) is Normal with mean \(\mu_{\bar{x}_M - \bar{x}_W} = \mu_M - \mu_W = 69.3 - 64.5 = 4.8\) inches and standard deviation \(\sigma_{\bar{x}_M - \bar{x}_W} = \sqrt{\frac{\sigma_M^2}{n_M} + \frac{\sigma_W^2}{n_W}} = \sqrt{\frac{(2.8)^2}{16} + \frac{(2.5)^2}{9}} = 1.09\) inches. (b)


\(P(\bar{x}_M - \bar{x}_W \geq 2) = P\left(z \geq \frac{2 - 4.8}{1.09}\right) = P(z \geq -2.57) = 0.9949.\) (c) No. It is almost certain (a 99.49% chance) that the sample mean height for young women is more than 2 inches less than the sample mean height for the young men.

10.39 No. The Normal condition is not met with this data set. There are fewer than 30 observations in each group and the stemplot for Males shows skewness.

10.40 Yes. The conditions are met. Even though there is an outlier in the South African distribution, the two sample sizes are large enough to make the two-sample t procedures fairly accurate.

10.41 No. The Independent condition is not met in this data set. We have data from more than 10% of Islamic nations.

10.42 No. The Random condition was not met in this study. The words chosen from each article were the first words (either 100 or 200) in the article. It may be that the word length differs in different locations in the articles.

10.43 (a) The centers of the two groups seem to be quite different, with people drinking red wine generally having more polyphenol in their blood. The spreads, however, are approximately the same. (b) State: Our parameters of interest are \(\mu_1\) = the actual mean polyphenol level in the blood of people like those in the study after drinking red wine and \(\mu_2\) = the actual mean polyphenol level in the blood of people like those in the study after drinking white wine. We want to estimate the difference \(\mu_1 - \mu_2\) at a 90% confidence level. Plan: We should use a two-sample t interval for \(\mu_1 - \mu_2\) if the conditions are satisfied. Random: This was a randomized experiment. Normal: Both sample sizes were less than 30. The dotplots given in the problem do not indicate serious skewness or outliers. Independent: Due to the random assignment, these two groups of men can be viewed as independent. Individual observations in each group should also be independent: knowing one man’s polyphenol level gives no information about another man’s polyphenol level. The conditions are met. Do: From the data we find that \(n_1 = 9\), \(\bar{x}_1 = 5.5\), \(s_1 = 2.517\), \(n_2 = 9\), \(\bar{x}_2 = 0.23\), and \(s_2 = 3.292\). We will use the conservative degrees of freedom, which is 8 in this case. So our 90% confidence interval is

\[(5.5 - 0.23) \pm 1.860\sqrt{\frac{(2.517)^2}{9} + \frac{(3.292)^2}{9}} = 5.27 \pm 2.569 = (2.701, 7.839).\]

Conclude: We are 90%

confident that the interval from 2.701 to 7.839 captures the difference in actual mean polyphenol level in men who drink red wine and men who drink white wine. This interval suggests that men who drink red wine have a mean change in polyphenol level between 2.701 and 7.839 higher than those who drink white wine. (c) Since this interval does not contain 0, it does support the researcher’s belief that the polyphenol level is different for men who drink red wine than for those who drink white wine.

10.44 (a) The centers of the two groups seem to be quite different, with red flowers being longer. The red flowers also seem to have more variability in their lengths. (b) State: Our parameters of interest are \(\mu_1\) = the actual mean length of red flowers and \(\mu_2\) = the actual mean length of yellow flowers. We want to estimate the difference \(\mu_1 - \mu_2\) at a 95% confidence level. Plan: We should use a two-sample t interval for \(\mu_1 - \mu_2\) if the conditions are satisfied. Random: Both samples were randomly selected. Normal: Both sample sizes were less than 30. However, the dotplots given in the problem do not indicate serious skewness or large outliers. Independent: Both samples are less than 10% of their


respective populations (there are more than 230 red flowers and 150 yellow flowers). The conditions are met. Do: From the data we find that \(n_1 = 23\), \(\bar{x}_1 = 39.698\), \(s_1 = 1.786\), \(n_2 = 15\), \(\bar{x}_2 = 36.18\), and \(s_2 = 0.975\). We will use the conservative degrees of freedom, which is 14 in this case. So our 95% confidence interval is

\[(39.698 - 36.18) \pm 2.145\sqrt{\frac{(1.786)^2}{23} + \frac{(0.975)^2}{15}} = 3.518 \pm 0.964 = (2.554, 4.482).\]
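The two-sample t intervals in this section all follow the same arithmetic, which can be sketched as a small helper. The name `two_sample_t_interval` is hypothetical, and the critical value \(t^*\) is passed in from Table B (here 2.145 for 14 degrees of freedom) rather than computed, since the Python standard library has no t distribution.

```python
from math import sqrt


def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Return (low, high) for mu1 - mu2 given a critical value t_star."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - t_star * se, diff + t_star * se


# Exercise 10.44: red flowers (n = 23) versus yellow flowers (n = 15).
n1, n2 = 23, 15
conservative_df = min(n1, n2) - 1  # smaller sample size minus 1
low, high = two_sample_t_interval(39.698, 1.786, n1, 36.18, 0.975, n2, t_star=2.145)
print(conservative_df, round(low, 3), round(high, 3))
```

Passing the summary statistics and \(t^*\) values from the other interval exercises (for example 1.860 with 8 df, or 2.692 with 44 df) reproduces their margins of error in the same way.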

Conclude: We are 95% confident that the interval from 2.554 to 4.482 captures the difference in actual mean length of red flowers and yellow flowers. This suggests that the mean length of red flowers is between 2.554 mm and 4.482 mm larger than the mean length of the yellow flowers. (c) Since 0 is not in this interval, it does support the researchers’ belief that the two varieties have different lengths.

10.45 (a) The distributions are skewed to the right because the earnings amounts cannot be negative, yet the standard deviation is almost as large as the distance between the mean and 0. The use of the two-sample t procedures is still justified because the t procedures are robust against non-Normality in the populations with such large sample sizes. (b) State: Our parameters of interest are \(\mu_1\) = the actual mean summer earnings of male students and \(\mu_2\) = the actual mean summer earnings of female students. We want to estimate the difference \(\mu_1 - \mu_2\) at a 90% confidence level. Plan: We should use a two-sample t interval for \(\mu_1 - \mu_2\) if the conditions are satisfied. Random: Both samples were randomly selected. Normal: Both sample sizes were at least 30. Independent: Both samples are less than 10% of their respective populations (since this was a large university, there are likely more than 6750 males with summer jobs and 6210 females with summer jobs). The conditions are met. Do: From the data we find that \(n_1 = 675\), \(\bar{x}_1 = 1884.52\), \(s_1 = 1368.37\), \(n_2 = 621\), \(\bar{x}_2 = 1360.39\), and \(s_2 = 1037.46\). We will use the conservative degrees of freedom, which is 620 in this case. So our 90% confidence interval is

\[(1884.52 - 1360.39) \pm 1.647\sqrt{\frac{(1368.37)^2}{675} + \frac{(1037.46)^2}{621}} = 524.13 \pm 110.572 = (413.558, 634.702).\]

Conclude: We are 90% confident that the interval from 413.558 to 634.702 captures the actual difference in mean summer earnings of male students and female students. This suggests that male students have mean summer earnings between $413.56 and $634.70 higher than female students. (c) If we repeatedly took random samples of 675 males and 621 females from this university and each time constructed a 90% confidence interval in this same way, about 90% of the resulting intervals would capture the actual difference in mean earnings.

10.46 (a) The use of the two-sample t procedure is still justified because the t procedures are robust against non-Normality in the populations with such large samples. (b) State: Our parameters of interest are \(\mu_1\) = the actual mean reliability rating of Anglo customers and \(\mu_2\) = the actual mean reliability rating of Hispanic customers. We want to estimate the difference \(\mu_1 - \mu_2\) at a 95% confidence level. Plan: We should use a two-sample t interval for \(\mu_1 - \mu_2\) if the conditions are satisfied. Random: Both samples were randomly selected. Normal: Both sample sizes were at least 30. Independent: Both samples are less than 10% of their respective populations (there are likely more than 920 Anglo customers of the bank and 860 Hispanic customers of the bank). The conditions are met. Do: From the data we find that \(n_1 = 92\), \(\bar{x}_1 = 6.37\), \(s_1 = 0.60\), \(n_2 = 86\), \(\bar{x}_2 = 5.91\), and \(s_2 = 0.93\). We will use the conservative degrees of freedom, which is 85 in this case. So our 95% confidence interval is

\[(6.37 - 5.91) \pm 1.988\sqrt{\frac{(0.60)^2}{92} + \frac{(0.93)^2}{86}} = 0.46 \pm 0.235 = (0.225, 0.695).\]

Conclude: We are 95% confident that the interval from 0.225 to 0.695 captures the actual difference in mean reliability rating for Anglos and Hispanics. This suggests that the mean reliability rating for Anglos is between 0.225 and


0.695 higher than the mean reliability rating for Hispanics. (c) If we repeatedly took random samples of 92 Anglos and 86 Hispanics and each time constructed a 95% confidence interval in this same way, about 95% of the resulting intervals would capture the actual difference in mean reliability rating.

10.47 (a) Answers will vary. Assign each day a number from 01 to 60. Move from left to right, looking at pairs of digits. The first 30 distinct pairs between 01 and 60 tell which days Design A will be used. Use Design B on the remaining days. The first three days are 24, 55, and 21. (b) Use a two-sided alternative (\(H_0: \mu_A = \mu_B\) versus \(H_a: \mu_A \neq \mu_B\)), because we (presumably) have no prior suspicion that one design will be better than the other. (c) Both sample sizes are the same (\(n_1 = n_2 = 30\)), so the appropriate degrees of freedom would be df = 30 − 1 = 29. (d) Because 2.045 < t < 2.150, and the alternative is two-sided, Table B tells us that 0.04 < P-value < 0.05. (Software gives P = 0.0485.) We would reject \(H_0\) and conclude that there is a difference in the actual mean daily sales for the two designs.

10.48 (a) This is a two-sample test because we are comparing two groups of female birds to each other (the food-supplemented group and the control group). (b) The null hypothesis here is that the mean number of days until breeding is the same for both treatments. The smaller of the two groups had only 6 birds in it, so the number of degrees of freedom for the test would be 5. This would be a two-sided test (we don’t have a suspected difference that we are looking for) so the P-value is 0.342. This is not enough evidence to reject the null hypothesis, so the idea that the mean time to breeding is the same for both treatments is plausible.

10.49 State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: \mu_1 - \mu_2 = 0\) versus \(H_a: \mu_1 - \mu_2 \neq 0\) where \(\mu_1\) is the actual mean time to breeding for the birds relying on natural food supply and \(\mu_2\) is the actual mean time to breeding for birds with food-supplementation. Plan: We should use a two-sample t test if the conditions are satisfied. Random: This was a randomized comparative experiment. Normal: Since the number of observations in both groups is less than 30 we examine the data. The comparative dotplot below displays the data for both groups. Neither distribution displays strong skewness or outliers. Independent: Due to the random assignment, these two groups of birds can be viewed as independent. Individual observations in each group should also be independent: knowing one bird’s time to breeding gives no information about another bird’s time to breeding. The conditions are met.

Do: From the data we find that \(n_1 = 6\), \(\bar{x}_1 = 4.0\), \(s_1 = 3.11\), \(n_2 = 7\), \(\bar{x}_2 = 11.3\), and \(s_2 = 3.93\). We will use the conservative degrees of freedom, which is 5 in this case. The test statistic is

\[t = \frac{(4.0 - 11.3) - 0}{\sqrt{\frac{(3.11)^2}{6} + \frac{(3.93)^2}{7}}} = -3.736.\]

Since this is a two-sided test, the P-value is \(2P(t < -3.736) = 2(0.0067) = 0.0134\). Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that there is a difference in the mean time to breeding for birds relying on natural food supply and birds with food supplements.
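The test statistic in Exercise 10.49 can be verified from the summary statistics alone. The function below, `two_sample_t_stat`, is a hypothetical helper that returns the unpooled two-sample t statistic together with the conservative degrees of freedom. It stops short of the P-value, which still comes from Table B or technology.

```python
from math import sqrt


def two_sample_t_stat(xbar1, s1, n1, xbar2, s2, n2, null_diff=0.0):
    """Return (t, conservative_df) for H0: mu1 - mu2 = null_diff."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    t = ((xbar1 - xbar2) - null_diff) / se
    # Conservative df: the smaller sample size minus 1.
    return t, min(n1, n2) - 1


# Exercise 10.49: natural food supply (n = 6) versus food-supplemented birds (n = 7).
t, df = two_sample_t_stat(4.0, 3.11, 6, 11.3, 3.93, 7)
print(round(t, 3), df)
```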


10.50 (a) These are paired t statistics: For each bird, the number of days behind the caterpillar peak was observed, and the t values were computed based on the pairwise differences between the first and second years. (b) For the control group, df = 5, and for the supplemented group, df = 6. (c) The control t is not significant (so the birds in that group did not “advance their laying date in the second year”), while the supplemented group t is significant with a one-sided P-value = 0.0195 (so those birds appear to change their laying date).

10.51 (a) The score distribution for the activities group is slightly left-skewed with a greater mean and smaller standard deviation (\(\bar{x}_1 = 51.48\), \(s_1 = 11.01\)) than the control group (\(\bar{x}_2 = 41.52\), \(s_2 = 17.15\)), which has a more symmetrical DRP score distribution. (b) State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: \mu_1 - \mu_2 = 0\) versus \(H_a: \mu_1 - \mu_2 > 0\) where \(\mu_1\) is the actual mean DRP score for third grade students like those who do the activities and \(\mu_2\) is the actual mean DRP score for those who don’t do the activities. Plan: We should use a two-sample t test if the conditions are satisfied. Random: This was a randomized comparative experiment. Normal: Since the number of observations in both groups is less than 30 we examine the data. From the boxplots we see that neither distribution displays strong skewness or outliers. Independent: Due to the random assignment, these two groups of students can be viewed as independent. Individual observations in each group should also be independent: knowing one student’s DRP score gives no information about another student’s DRP score. The conditions are met. Do: From the data we find that \(n_1 = 21\), \(\bar{x}_1 = 51.48\), \(s_1 = 11.01\), \(n_2 = 23\), \(\bar{x}_2 = 41.52\), and \(s_2 = 17.15\). We will use the conservative degrees of freedom, which is 20 in this case. The test statistic is

\[t = \frac{(51.48 - 41.52) - 0}{\sqrt{\frac{(11.01)^2}{21} + \frac{(17.15)^2}{23}}} = 2.312.\]

Since this is a one-sided test, the P-value is \(P(t > 2.312) = 0.0158\). Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that the actual mean DRP score of third graders like those who do the activities is higher than that of third graders who don’t do the activities. (c) Since this was a randomized controlled experiment we can conclude that the activities caused an increase in the mean DRP score. (d) State: We want to estimate the difference \(\mu_1 - \mu_2\) at a 95% confidence level. Plan: We should use a two-sample t interval for \(\mu_1 - \mu_2\) if the conditions are satisfied. We checked the conditions in part (b) and they were met. Do:

Using 20 df, our 95% confidence interval is

\[(51.48 - 41.52) \pm 2.086\sqrt{\frac{(11.01)^2}{21} + \frac{(17.15)^2}{23}} = 9.96 \pm 8.99 = (0.97, 18.95).\]

Conclude: We are 95% confident that the interval from 0.97 to 18.95 captures the difference in actual mean DRP scores for those students learning with the activities and those learning without the activities. This interval not only addresses the plausibility of the two means being the same, but also gives a range of plausible values for the difference in the two means.

10.52 (a) Breast-feeding mothers have a lower mean mineral content (\(\bar{x}_1 = -3.587\), \(s_1 = 2.506\)) with more variability than other mothers (\(\bar{x}_2 = 0.309\), \(s_2 = 1.298\)). Both distributions appear slightly right-skewed. (b) State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: \mu_1 - \mu_2 = 0\) versus \(H_a: \mu_1 - \mu_2 < 0\) where \(\mu_1\) is the actual mean percent change in mineral content for breastfeeding women and \(\mu_2\) is the actual mean percent change in mineral content for women who were neither pregnant nor lactating. Plan: We should use a two-sample t test if the conditions are satisfied. Random: Both samples were selected randomly. Normal: Since the number of observations in the control group is less than 30 we check the boxplots. Neither group displays strong skewness or any outliers. Independent:


Both samples are less than 10% of their respective populations (there are more than 470 breastfeeding women and 220 non-pregnant and non-lactating women). The conditions are met. Do: From the data we find that \(n_1 = 47\), \(\bar{x}_1 = -3.587\), \(s_1 = 2.506\), \(n_2 = 22\), \(\bar{x}_2 = 0.309\), and \(s_2 = 1.298\). We will use the conservative degrees of freedom, which is 21 in this case. The test statistic is

\[t = \frac{(-3.587 - 0.309) - 0}{\sqrt{\frac{(2.506)^2}{47} + \frac{(1.298)^2}{22}}} = -8.498.\]

Since this is a one-sided test, the P-value is \(P(t < -8.498) \approx 0\).

Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that breastfeeding women have a larger mean percent bone mineral loss than women who are neither pregnant nor lactating. (c) Since this was not a randomized controlled experiment we cannot conclude that breastfeeding causes bone mineral loss. (d) State: We want to estimate the difference \(\mu_1 - \mu_2\) at a 95% confidence level. Plan: We should use a two-sample t interval for \(\mu_1 - \mu_2\) if the conditions are satisfied. We checked the conditions in part (b) and they were met. Do: Using 21 df, our 95% confidence interval is

\[(-3.587 - 0.309) \pm 2.080\sqrt{\frac{(2.506)^2}{47} + \frac{(1.298)^2}{22}} = -3.896 \pm 0.954 = (-4.85, -2.942).\]

Conclude: We are 95% confident that the interval from -4.85 to -2.942 captures the difference in actual mean percent of bone mineral loss in breastfeeding women and non-pregnant and non-lactating women. This interval suggests that the mean percent of bone mineral loss in breastfeeding women is between 2.942% and 4.85% more than for women who are neither pregnant nor lactating. This interval not only addresses the plausibility of the two means being the same, but also gives a range of plausible values for the difference in the two means.

10.53 State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: \mu_1 - \mu_2 = 0\) versus

\(H_a: \mu_1 - \mu_2 \neq 0\) where \(\mu_1\) is the actual mean number of words spoken per day by female students and \(\mu_2\) is the actual mean number of words spoken per day by male students. Plan: We should use a two-sample t test if the conditions are satisfied. Random: Both samples were selected randomly. Normal: Both samples had more than 30 observations. Independent: Both samples are less than 10% of their respective populations (there are more than 560 female students at a large university and 560 male students at a large university). The conditions are met. Do: From the data we find that \(n_1 = 56\), \(\bar{x}_1 = 16{,}177\), \(s_1 = 7520\), \(n_2 = 56\), \(\bar{x}_2 = 16{,}569\), and \(s_2 = 9108\). We will use the conservative degrees of freedom, which is 55 in this case. The test statistic is

\[t = \frac{(16177 - 16569) - 0}{\sqrt{\frac{(7520)^2}{56} + \frac{(9108)^2}{56}}} = -0.248.\]

Since this is a two-sided test, the P-value is \(2P(t < -0.248) = 2(0.4025) = 0.8050\). Conclude: Since the P-value is greater than 0.05, we fail to reject the null hypothesis. We do not have enough evidence to conclude that male students and female students speak a different number of words per day on average. (b) If males and females speak the same number of words per day on average, then we have about an 80% chance of selecting a sample where the difference between the average number of words spoken per day by males and females is as large as or larger than the difference we actually saw.

10.54 State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: \mu_1 - \mu_2 = 0\) versus \(H_a: \mu_1 - \mu_2 \neq 0\) where \(\mu_1\) is the actual mean height of the second spike in rats given DDT and \(\mu_2\) is the actual mean height of the second spike in control rats. Plan: We should use a two-sample t test if the conditions are satisfied. Random: This was a randomized controlled experiment. Normal: Both


samples had less than 30 observations. The dotplot below shows that neither group has strong skewness or outliers. Independent: Due to the random assignment, these two groups of rats can be viewed as independent. Individual observations in each group should also be independent: knowing the height of one rat’s second spike gives no information about the height of another rat’s second spike. The conditions are met.

Do: From the computer output, \(t = 2.9912\). The computer is using 5.9 degrees of freedom and reports the P-value to be 0.0247. This is the P-value for the two-sided test. Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that the actual mean height of the second spike is different for rats poisoned with DDT and control rats. (b) If the actual mean height of the second spike were the same for these two populations of rats, there would be about a 2.47% chance of observing a difference between the means as large as or larger than the one in this study.

10.55 State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: \mu_1 - \mu_2 = 0\) versus \(H_a: \mu_1 - \mu_2 > 0\) where \(\mu_1\) is the actual mean knee velocity for skilled rowers and \(\mu_2\) is the actual mean knee velocity for novice rowers. Plan: We should use a two-sample t test if the conditions are satisfied. Random: Both samples were randomly selected. Normal: Both samples had fewer than 30 observations, but we are told that there were no outliers or strong skewness. Independent: Both samples are less than 10% of their respective populations (there are more than 100 skilled rowers and 80 novice rowers). The conditions are met. Do: From the computer output, \(t = 3.1583\). The computer is using 9.8 degrees of freedom and reports the P-value to be 0.0104. This is the P-value for the two-sided test. We are doing a one-sided test, so the P-value is half of that, or 0.0052. Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that the mean knee velocity is greater for skilled rowers than for novice rowers. (b) State: We want to estimate the difference \(\mu_1 - \mu_2\) at a 90% confidence level. Plan: We should use a two-sample t interval for \(\mu_1 - \mu_2\) if the conditions are satisfied. We checked the conditions in part (a) and they were met. Do: Using a TI-84, the interval is computed to be (0.497, 1.847). This computation uses 9.77 degrees of freedom. Conclude: We are 90% confident that the interval from 0.497 to 1.847 captures the difference in actual mean knee velocity for skilled rowers and novice rowers. This interval suggests that skilled rowers have a mean knee velocity that is between 0.497 and 1.847 higher than for novice rowers. (c) If we had used Table B, we would have used the conservative degrees of freedom which, in this case, would have been 7. This would have led to a larger t* and therefore a slightly wider interval.

10.56 (a) The missing t statistic is

\[t = \frac{70.37 - 68.45}{\sqrt{\frac{(6.10)^2}{10} + \frac{(9.04)^2}{8}}} = 0.5143.\]
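The fractional degrees of freedom that software reports for two-sample t procedures can be reproduced from the summary statistics. The sketch below assumes the software uses the usual Welch-Satterthwaite approximation; that assumption is not stated in the solution, although the result agrees with the 11.8 degrees of freedom quoted for this exercise. The function name `welch_t_and_df` is hypothetical.

```python
from math import sqrt


def welch_t_and_df(xbar1, s1, n1, xbar2, s2, n2):
    """Return (t, df) where df is the Welch-Satterthwaite approximation."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # estimated variances of the two sample means
    t = (xbar1 - xbar2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Exercise 10.56: weights of 10 skilled rowers versus 8 novice rowers.
t, df = welch_t_and_df(70.37, 6.10, 10, 68.45, 9.04, 8)
print(round(t, 4), round(df, 1))
```

The approximate df always lies between the conservative value (the smaller sample size minus 1, here 7) and \(n_1 + n_2 - 2\), which is why the conservative choice gives a somewhat larger P-value.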

(b) State: We want to perform a test at the \(\alpha = 0.05\) significance level of \(H_0: \mu_1 - \mu_2 = 0\) versus \(H_a: \mu_1 - \mu_2 \neq 0\) where \(\mu_1\) is the actual mean weight of skilled rowers and \(\mu_2\) is the actual mean weight of novice rowers. Plan: We should use a two-sample t test if the conditions are satisfied. Random: Both samples were randomly selected. Normal: Both samples had fewer than 30 observations, but we are told that there were no outliers or strong skewness. Independent: Both samples are less than 10% of their respective populations (there are more than 100 skilled rowers and 80 novice rowers). The conditions are met. Do: From part (a), \(t = 0.5143\). The computer is using 11.8 degrees of freedom and reports the P-value to be 0.6165. This is the P-value


for the two-sided test. Conclude: Since the P-value is greater than 0.05, we fail to reject the null hypothesis. We do not have enough evidence to conclude that the mean weight of skilled rowers is different from the mean weight of novice rowers. (c) If we had used the conservative number of degrees of freedom, the degrees of freedom would have been less. This means that the P-value would have been greater than the P-value computed by technology. 10.57 (a) The researchers randomly assigned the subjects to the two groups to help balance out the effects of lurking variables. (b) Based on the dotplot from Fathom, a difference of 4.15 between the means is quite rare. Only about 5 out of the 1000 differences were that big. We would conclude that the mean rating for those with internal reasons is significantly higher than for those with external reasons. (c) Since we rejected the null hypothesis (of no difference), this could have been a Type I error – rejecting the null hypothesis when it is really true. 10.58 (a) If people were allowed to choose which group they wanted to be in, it is likely that all those who choose one particular treatment (sleep deprivation, for example) might be systematically different from those who choose to be in the other treatment group. (b) Based on the dotplot from Fathom, a difference of 15.92 between the means is quite rare. Only about 5 out of the 1000 differences were that big. We would conclude that the mean increase in score is significantly higher for those who were allowed to sleep than for those who were sleep deprived. (c) Since we rejected the null hypothesis (of no difference), this could have been a Type I error – rejecting the null hypothesis when it is really true. 10.59 (a) Two-sample t test. Each car has a different brand of tire on it. There is no obvious way to pair one observation of a Brand A tire with one observation of a Brand B tire. (b) Paired t test. The subjects are each subjected to both treatments. 
So we would take the differences in productivity for each subject from when they listened to music and when they didn’t. (c) Two-sample t test. Each person was only given one treatment. Then the two groups were compared. 10.60 (a) Paired t test. Pairs of pigs who were littermates were used, with one in each pair getting one treatment and the other pig in the pair getting the other treatment. (b) Two-sample t test. There is no connection between the male and female professors. They are not paired in any way. (c) Two-sample t test. The treatments were randomly assigned to the plots. There is no way to pair a plot with one treatment with a plot with the other treatment. 10.61 One-sample z interval for a proportion. 10.62 Paired t test for the mean difference. 10.63 Paired t interval for the mean difference. 10.64 Two-sample z interval for the difference in proportions. 10.65 (a) State: We want to perform a test at the $\alpha = 0.05$ significance level of $H_0: \mu_1 - \mu_2 = 10$ versus $H_a: \mu_1 - \mu_2 > 10$, where $\mu_1$ is the actual mean cholesterol reduction for people like the ones in the study when using the new drug and $\mu_2$ is the actual mean cholesterol reduction for people like the ones in the study when using the current drug. Plan: We should use a two-sample t test if the conditions are satisfied. Random: This was a randomized controlled experiment. Normal: Both samples had less than 30 observations. But we are told that no strong skewness or outliers were detected. Independent: Due to the random assignment, these two groups of patients can be viewed as independent. Individual observations in each group should also be independent: knowing one patient’s reduction in cholesterol gives no information about another patient’s reduction in cholesterol. The conditions are met. Do: From


the data we find that $n_1 = 15$, $\bar{x}_1 = 68.7$, $s_1 = 13.3$, $n_2 = 14$, $\bar{x}_2 = 54.1$, and $s_2 = 11.93$. We will use the conservative degrees of freedom, which is 13 in this case. The test statistic is $t = \dfrac{(68.7 - 54.1) - 10}{\sqrt{\dfrac{13.3^2}{15} + \dfrac{11.93^2}{14}}} = 0.982$. Since this is a one-sided test, the P-value is $P(t > 0.982) = 0.1720$.
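The test statistic and one-sided P-value for 10.65 (a) can be checked with a short script. This is a minimal sketch using only the Python standard library: the function names are hypothetical, and the upper-tail area of the t distribution is approximated by Simpson's rule on the density rather than read from Table B or a calculator, so small differences in the last decimal place are expected.

```python
import math


def two_sample_t(x1, s1, n1, x2, s2, n2, null_diff=0.0):
    """Two-sample t statistic for H0: mu1 - mu2 = null_diff (unpooled)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((x1 - x2) - null_diff) / se


def t_pdf(x, df):
    """Density of the t distribution, via lgamma to avoid overflow for large df."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps must be even)."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, df, steps)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary statistics from 10.65 (a); conservative df = min(15, 14) - 1 = 13.
t_stat = two_sample_t(68.7, 13.3, 15, 54.1, 11.93, 14, null_diff=10)
p_value = t_upper_tail(t_stat, df=13)
print(f"t = {t_stat:.3f}, one-sided P-value = {p_value:.4f}")
```

Setting `null_diff=10` is what distinguishes this test from the more common test of no difference: the hypothesized difference is subtracted from the observed difference before dividing by the standard error.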

Conclude: Since the P-value is greater than 0.05, we fail to reject the null hypothesis. We do not have enough evidence to conclude that mean cholesterol reduction is more than 10 mg/dl more for the new drug than for the current drug. (b) Since we failed to reject the null hypothesis, we could have committed a Type II error (fail to reject the null hypothesis when it is actually false). 10.66 (a) State: We want to perform a test at the $\alpha = 0.05$ significance level of $H_0: \mu_1 - \mu_2 = 0.5$ versus $H_a: \mu_1 - \mu_2 > 0.5$, where $\mu_1$ is the actual mean amount of water used in the current-model toilets and $\mu_2$ is the actual mean amount of water used in the new model of toilets. Plan: We should use a two-sample t test if the conditions are satisfied. Random: Both samples were randomly selected. Normal: Both samples had 30 observations. Independent: Both samples are less than 10% of their respective populations (there are more than 300 current-model toilets and 300 new-model toilets). The conditions are met. Do: From the data we find that $n_1 = 30$, $\bar{x}_1 = 1.64$, $s_1 = 0.29$, $n_2 = 30$, $\bar{x}_2 = 1.09$, and $s_2 = 0.18$. We will use the conservative degrees of freedom, which is 29 in this case. The test statistic is $t = \dfrac{(1.64 - 1.09) - 0.5}{\sqrt{\dfrac{0.29^2}{30} + \dfrac{0.18^2}{30}}} = 0.802$. Since this is a one-sided test, the P-value is $P(t > 0.802) = 0.2145$.

Conclude: Since the P-value is greater than 0.05, we fail to reject the null hypothesis. We do not have enough evidence to conclude that the current-model toilet uses, on average, more than 0.5 gallons more water per flush than the new-model toilet. (b) Since we failed to reject the null hypothesis, we could have committed a Type II error (fail to reject the null hypothesis when it is actually false). 10.67 d. 10.68 a. 10.69 a. 10.70 b. 10.71 Jannie treated the data as if she wanted to create a confidence interval for the difference between two proportions. In fact, she wants to compute a confidence interval for just one proportion: the proportion of students taking the SAT twice who were coached. This confidence interval is $0.135 \pm 2.576\sqrt{\dfrac{0.135(0.865)}{3160}} = 0.135 \pm 0.016 = (0.119, 0.151)$.
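The one-sample z interval in 10.71 can be reproduced as follows. This is a small sketch, not a prescribed method: the function name is hypothetical, and the 99% confidence level is inferred from the critical value 2.576 used in the solution.

```python
import math
from statistics import NormalDist


def one_prop_z_interval(p_hat, n, conf):
    """One-sample z interval for a proportion: p_hat +/- z* sqrt(p_hat(1 - p_hat)/n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 10.71: sample proportion 0.135 of n = 3160 students, 99% confidence.
lo, hi = one_prop_z_interval(0.135, 3160, 0.99)
print(f"({lo:.3f}, {hi:.3f})")
```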

10.72 (a) The appropriate test is the paired t test because we have paired data (two scores for each student). (b) State: We want to perform a test of $H_0: \mu_d = 0$ versus $H_a: \mu_d > 0$, where $\mu_d$ is the actual mean increase in SAT verbal scores of students who were coached. We will perform that test at the $\alpha = 0.05$ significance level. Plan: If conditions are met, we should do a one-sample t test for the population mean $\mu_d$. Random: The sample was randomly selected. Normal: The sample size was at least 30. Independent: There are more than 4270 students taking the SAT for a second time who are


coached so the sample is less than 10% of the population. All conditions have been met. Do: The sample mean and standard deviation of the differences are $\bar{x}_d = 29$ and $s_d = 59$. The corresponding test statistic is $t = \dfrac{29 - 0}{59/\sqrt{427}} = 10.16$. Since this is a one-sided test, the P-value is $P(t > 10.16) \approx 0$. Conclude: Since our P-value is less than 0.05, we reject the null hypothesis. It appears that students who have been coached increase their scores on the SAT verbal test significantly. (c) State: We want to estimate the true mean increase in SAT verbal score $\mu_d$ for students who were coached at a 99% confidence level. Plan: We should construct a one-sample t interval if the conditions are met. We checked the conditions in part (b) and they were met. Do: Using 426 df, the confidence interval is $29 \pm 2.587\left(\dfrac{59}{\sqrt{427}}\right) = 29 \pm 7.386 = (21.614, 36.386)$. Conclude: We are 99% confident that the interval

from 21.614 to 36.386 captures the true mean increase in SAT verbal scores for students who are coached. 10.73 (a) State: We want to perform a test at the $\alpha = 0.05$ significance level of $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 > 0$, where $\mu_1$ is the actual mean increase in SAT verbal scores for coached students and $\mu_2$ is the actual mean increase in SAT verbal scores for students who were not coached. Plan: We should use a two-sample t test if the conditions are satisfied. Random: Both samples were randomly selected. Normal: Both samples had at least 30 observations. Independent: Both samples are less than 10% of their respective populations (there are more than 4270 students taking the SAT twice who were coached and 27,330 students taking the SAT twice who were not coached). The conditions are met. Do: From the data we find that $n_1 = 427$, $\bar{x}_1 = 29$, $s_1 = 59$, $n_2 = 2733$, $\bar{x}_2 = 21$, and $s_2 = 52$. We will use the conservative degrees of freedom, which is 426 in this case. The test statistic is $t = \dfrac{(29 - 21) - 0}{\sqrt{\dfrac{59^2}{427} + \dfrac{52^2}{2733}}} = 2.646$. Since this is a one-sided test, the P-value is $P(t > 2.646) = 0.0042$.

Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that students who receive coaching had a higher mean increase in scores on the SAT verbal exam than those who did not receive coaching. (b) State: We want to estimate the difference $\mu_1 - \mu_2$ at a 99% confidence level. Plan: We should use a two-sample t interval for $\mu_1 - \mu_2$ if the conditions are satisfied. We checked the conditions in part (a) and they were met. Do: Using 426 df, our 99% confidence interval is $(29 - 21) \pm 2.587\sqrt{\dfrac{59^2}{427} + \dfrac{52^2}{2733}} = 8 \pm 7.822 = (0.178, 15.822)$. Conclude: We are

99% confident that the interval from 0.178 to 15.822 captures the difference in true mean increase in scores on the SAT verbal exam for students who were coached and students who were not coached. This interval suggests that the mean increase is between 0.178 and 15.822 points higher for those students who were coached than for those who were not coached. (c) The amount of points gained may not be very large. It does not seem like the money spent on coaching is worth it. 10.74 (a) If those people who are not responding are fundamentally different from those who do respond, we may be missing important information. (b) This was an observational study, not an experiment. The students (or their parents) chose whether or not to be coached; students who choose coaching might have other motivating factors that help them do better the second time. For example, perhaps students who choose coaching have some personality trait that also compels them to try harder the second time.


10.75 (a) By the 68–95–99.7 rule, 95% of all observations fall within the interval $\mu - 2\sigma$ to $\mu + 2\sigma$. Thus, 5% of all observations will fall outside of this interval. Let X = the number of sample means that fall outside of two standard deviations from the mean. X is B(2, 0.05), so $P(X \geq 1) = 1 - P(X = 0) = 1 - 0.9025 = 0.0975$. (b) By the 68–95–99.7 rule, 95% of all observations fall within the interval $\mu - 2\sigma$ to $\mu + 2\sigma$. Thus, 2.5% (half of 5%) of all observations will fall above $\mu + 2\sigma$. Let X = the number of samples that must be taken before we observe one falling above $\mu + 2\sigma$. Then X is geometric with p = 0.025. Thus, $P(X = 4) = (1 - 0.025)^3(0.025) = 0.0232$. (c) By the 68–95–99.7 rule, the probability of any one observation falling within the interval $\mu - \sigma$ to $\mu + \sigma$ is about 0.68. Let X = the number of sample means out of 5 that fall within this interval. Assuming that the samples are independent, X is B(5, 0.68). We want the probability that at least 4 of the 5 sample means fall outside of this interval. This means we are looking for $P(X \leq 1) = \text{binomcdf}(5, 0.68, 1) = 0.039$. This is a good criterion. If the process is under control, only 4% of the time would we conclude that it wasn’t. 10.76 (a) State: We want to estimate the actual proportion of all adults who use the Internet at a 95% confidence level. Plan: We should use a one-sample z interval if the conditions are satisfied. Random: The adults were selected randomly. Normal: There were 1318 successes (use the Internet) and 774 failures (do not use the Internet). Both are at least 10. Independent: The sample is less than 10% of the population of all adults. The conditions are met. Do: A 95% confidence interval is given by $0.63 \pm 1.96\sqrt{\dfrac{0.63(0.37)}{2092}} = 0.63 \pm 0.02 = (0.61, 0.65)$. Conclude: We are 95% confident that between 61%

and 65% of U.S. adults use the Internet. (b) State: Our parameters of interest are $p_1$ = proportion of adult Internet users who expect businesses to have Web sites and $p_2$ = proportion of adult non-Internet users who expect businesses to have Web sites. We want to estimate the difference $p_1 - p_2$ at a 95% confidence level. Plan: We should use a two-sample z interval if the conditions are satisfied. Random: The people were selected randomly. Normal: Both samples have at least 10 successes and failures (Internet users: 1041 successes and 277 failures. Non-Internet users: 294 successes and 480 failures). Independent: There are more than 13,180 adults who use the Internet and 7740 adults who do not use the Internet. The conditions are met. Do: From the data we find that $n_1 = 1318$, $\hat{p}_1 = \dfrac{1041}{1318} = 0.79$, $n_2 = 774$, and $\hat{p}_2 = \dfrac{294}{774} = 0.38$. So our 95% confidence interval is $(0.79 - 0.38) \pm 1.96\sqrt{\dfrac{0.79(0.21)}{1318} + \dfrac{0.38(0.62)}{774}} = 0.41 \pm 0.04 = (0.37, 0.45)$. Conclude: We are 95%

confident that the interval from 0.37 to 0.45 captures the difference in actual proportion of Internet users and non-Internet users who expect businesses to have a web site. This interval suggests that between 37% and 45% more internet users than non-internet users expect businesses to have a Web site.
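The two-sample z interval from 10.76 (b) can be sketched in code. This is an illustrative helper with a hypothetical name; it works from the raw counts rather than the rounded proportions 0.79 and 0.38, so the endpoints agree with the solution only after rounding to two decimal places.

```python
import math
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, conf):
    """Two-sample z interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# 10.76 (b): 1041 of 1318 Internet users and 294 of 774 non-users.
lo, hi = two_prop_z_interval(1041, 1318, 294, 774, 0.95)
print(f"({lo:.2f}, {hi:.2f})")
```

Note that the interval uses each sample's own proportion in the standard error; a pooled proportion is reserved for significance tests of no difference.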


Chapter Review Exercises (page 661) R10.1 (a) Paired t test for the mean difference. (b) Two-sample z interval for the difference in proportions. (c) One-sample t interval for the mean. (d) Two-sample t interval for the difference between two means. R10.2 (a) This is an observational study because the researchers simply observed the random samples of women; drivers were not assigned to the cities. (b) State: Our parameters of interest are $p_1$ = proportion of Hispanic female drivers in New York who wear seat belts and $p_2$ = proportion of Hispanic female drivers in Boston who wear seat belts. We want to estimate the difference $p_1 - p_2$ at a 95% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. Random: The women were selected randomly. Normal: Both samples have at least 10 successes and failures (New York: 183 successes and 37 failures. Boston: 68 successes and 49 failures). Independent: There are more than 2200 Hispanic women drivers in New York and 1170 Hispanic women drivers in Boston. The conditions are met. Do: From the data we find that $n_1 = 220$, $\hat{p}_1 = \dfrac{183}{220} = 0.832$, $n_2 = 117$, and $\hat{p}_2 = \dfrac{68}{117} = 0.581$. So our 95% confidence interval is $(0.832 - 0.581) \pm 1.96\sqrt{\dfrac{0.832(0.168)}{220} + \dfrac{0.581(0.419)}{117}} = 0.251 \pm 0.102 = (0.149, 0.353)$. Conclude: We

are 95% confident that the interval from 0.149 to 0.353 captures the difference in actual proportions of Hispanic women drivers in New York and Boston who wear their seat belts. This interval suggests that between 14.9% and 35.3% more Hispanic women in New York than Hispanic women in Boston wear their seat belts. (c) Since 0 is not in the interval, we have good evidence that a smaller proportion of Hispanic women in Boston wear their seat belts. R10.3 (a) The Random and Independent conditions are met because this is a randomized comparative experiment. (b) The sample sizes are large enough, $n_1 = n_2 = 45$, that the averages will be approximately Normal, so the fact that the individual responses do not follow a Normal distribution has little effect on the reliability of the t procedure. (c) If there were no difference in the ratings of the product under the two treatments, we would have less than a 1% chance of observing a difference as large as or larger than the one in this experiment. R10.4 State: We want to perform a test at the $\alpha = 0.01$ significance level of $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 \neq 0$, where $\mu_1$ is the actual mean NAEP quantitative skills test score for men and $\mu_2$ is the actual mean NAEP quantitative skills test score for women. Plan: We should use a two-sample t test if the conditions are satisfied. Random: Both samples were randomly selected. Normal: Both samples had at least 30 observations. Independent: Both samples are less than 10% of their respective populations (there are more than 8400 males who took the NAEP and 10,770 women who took the NAEP). The conditions are met. Do: From the data we find that $n_1 = 840$, $\bar{x}_1 = 272.40$, $s_1 = 59.2$, $n_2 = 1077$, $\bar{x}_2 = 274.73$, and $s_2 = 57.5$. We will use the conservative degrees of freedom, which is 839 in this case. The test statistic is $t = \dfrac{(272.4 - 274.73) - 0}{\sqrt{\dfrac{59.2^2}{840} + \dfrac{57.5^2}{1077}}} = -0.865$. Since this is a two-sided test, the P-value is $2P(t < -0.865) = 2(0.1936) = 0.3872$. Conclude: Since the P-value is greater than 0.01, we fail to


reject the null hypothesis. We do not have enough evidence to conclude that the males and females have different mean scores on the NAEP quantitative skills test. R10.5 (a) State: We want to perform a test at the $\alpha = 0.05$ significance level of $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 < 0$, where $p_1$ is the actual proportion of patients taking AZT who develop AIDS and $p_2$ is the actual proportion of patients taking placebo who develop AIDS. Plan: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: This was a randomized comparative experiment. Normal: The number of successes and failures in both groups are at least 10 (AZT: 17 successes, 418 failures. Placebo: 38 successes, 397 failures). Independent: Due to the random assignment, these two groups of patients can be viewed as independent. Individual observations in each group should also be independent: knowing whether one patient developed AIDS gives no information about whether another patient developed AIDS. The conditions are met. Do: The proportions of AIDS cases in each group are $\hat{p}_1 = \dfrac{17}{435} = 0.039$ and $\hat{p}_2 = \dfrac{38}{435} = 0.087$. The pooled proportion is $\hat{p}_C = \dfrac{17 + 38}{435 + 435} = \dfrac{55}{870} = 0.063$. The test statistic is $z = \dfrac{(0.039 - 0.087) - 0}{\sqrt{\dfrac{(0.063)(0.937)}{435} + \dfrac{(0.063)(0.937)}{435}}} = -2.91$. Since this is a one-sided test, the P-value is $P(z < -2.91) = 0.0018$. Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that taking AZT lowers the proportion of patients like these who develop AIDS. (b) Neither the subjects nor the researchers who had contact with them (including those determining whether subjects had AIDS) knew which subjects were getting which drug. This is important because if it was not double blind, the results could not be attributed to the treatments. (c) A Type I error is rejecting the null hypothesis when it is true. In this case that means concluding that AZT lowers the risk of developing AIDS when in fact it does not. A Type II error is failing to reject the null hypothesis when it is false. In this case that means concluding that we do not have enough evidence that AZT lowers the risk of developing AIDS when in fact it does. Answers may vary as to which one is more serious. R10.6 (a) The Normal condition is not met because there are only 7 failures in the control area. (b) The Normal condition is not met because there are two outliers in the male data. R10.7 (a) The students in the treatment group had generally higher differences in scores and the middle 50% of their differences were much more compact than the control group. The treatment group differences were also reasonably symmetric whereas the control group differences were more right skewed. (b) State: Our parameters of interest are $\mu_1$ = the mean difference in test scores for students like these who get the treatment message and $\mu_2$ = the mean difference in test scores for students like these who get the neutral message. We want to estimate the difference $\mu_1 - \mu_2$ at a 90% confidence level. Plan: We should use a two-sample t interval for $\mu_1 - \mu_2$ if the conditions are satisfied. Random: This was a randomized controlled experiment.
Normal: Both samples had less than 30 observations, but neither boxplot showed strong skewness or any outliers. Independent: Due to the random assignment, these two groups of students can be viewed as independent. Individual observations in each group should also be independent: knowing one student’s difference in scores gives no information about another student’s difference in scores. The conditions are met. Do:

Using (post – pre) as our difference, we know that $n_1 = 10$, $\bar{x}_1 = 11.4$, $s_1 = 3.169$, $n_2 = 8$, $\bar{x}_2 = 8.25$, and $s_2 = 3.69$. We will use the conservative degrees of freedom, which is 7 in this case. So our 90% confidence interval is


$(11.4 - 8.25) \pm 1.895\sqrt{\dfrac{3.169^2}{10} + \dfrac{3.69^2}{8}} = 3.15 \pm 3.12 = (0.03, 6.27)$. Conclude: We are 90%

confident that the interval from 0.03 to 6.27 captures the true difference in mean scores for those who received the treatment message and those who received the neutral message. This interval suggests that the mean difference in scores is between 0.03 and 6.27 points higher for those students who received the subliminal message in comparison to the control group. (c) We cannot generalize to all students who failed the test because our sample was not a random sample of all students who failed the test. It was a group of students who agreed to participate in the experiment. R10.8 (a) Since both intervals are 90% intervals, the $z^*$ used in creating the intervals is 1.645. The margin of error is half of the length of the interval, so it is $\dfrac{1}{2}(0.942 - 0.858) = 0.042$. This leads to $z^*\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}} = 1.645\sqrt{\dfrac{0.9(0.1)}{n}} = 0.042$. Solving for n we find $n = 138.06$, which suggests that

they actually used $n = 138$. Using the same procedure on the other confidence interval gives $n = 139.83$. The differences we found between the two n values are due to rounding and we will use 139. (b) Properties of the sampling distribution of the difference $\hat{p}_1 - \hat{p}_2$ can be obtained from properties of the individual sampling distributions used in the individual confidence intervals, but the upper and lower limits of the intervals are not directly related. (c) State: Our parameters of interest are $p_1$ = proportion of golf clubs returned in 5 days or fewer before the changes and $p_2$ = proportion of golf clubs returned in 5 days or fewer after the changes. We want to estimate the difference $p_1 - p_2$ at a 95% confidence level. Plan: We should use a two-sample z interval for $p_1 - p_2$ if the conditions are satisfied. Random: The clubs both before and after changes were selected randomly. Normal: Both samples have at least 10 successes and failures (Before: 22 successes and 117 failures. After: 125 successes and 14 failures). Independent: There are more than 1390 golf clubs repaired before the changes and 1390 golf clubs repaired after the changes. The conditions are met. Do: From the data we find that $n_1 = 139$, $\hat{p}_1 = 0.16$, $n_2 = 139$, and $\hat{p}_2 = 0.90$. Subtracting in the order after minus before ($\hat{p}_2 - \hat{p}_1$) so that the difference is positive, our 95% confidence interval is $(0.90 - 0.16) \pm 1.96\sqrt{\dfrac{0.90(0.10)}{139} + \dfrac{0.16(0.84)}{139}} = 0.74 \pm 0.079 = (0.661, 0.819)$. Conclude: We are 95% confident that the interval from 0.661 to 0.819 captures the difference (after minus before) in actual proportions of orders sent back to customers in 5 days or fewer. This interval suggests that between 66.1% and 81.9% more orders are being sent back to the customers in 5 days or fewer after the changes.
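The back-calculation of the sample size in R10.8 (a) can be written out as a short sketch. The function name is hypothetical. The code uses the exact Normal critical value rather than the rounded 1.645, so the result differs from 138.06 in the second decimal place, which is consistent with the solution's remark that the discrepancies come from rounding.

```python
from statistics import NormalDist


def implied_sample_size(p_hat, lower, upper, conf):
    """Solve z* sqrt(p_hat(1 - p_hat)/n) = margin of error for n."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = (upper - lower) / 2  # half the length of the interval
    return z_star ** 2 * p_hat * (1 - p_hat) / margin ** 2


# R10.8 (a): 90% interval (0.858, 0.942) centered at p_hat = 0.90.
n = implied_sample_size(0.90, 0.858, 0.942, 0.90)
print(f"n is approximately {n:.2f}")
```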

R10.9 (a) This was an experiment because a treatment was applied (leaving a message or not). (b) State: We want to perform a test at the $\alpha = 0.05$ significance level of $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 > 0$, where $p_1$ is the actual proportion of people who would be contacted when a message is left and $p_2$ is the actual proportion of people who would be contacted when a message is not left. Plan: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: This was a randomized comparative experiment. Normal: The number of successes and failures in both groups are at least 10 (Message: 200 successes, 91 failures. No message: 58 successes, 42 failures).


Independent: Due to the random assignment, these two groups of people contacted can be viewed as independent. Individual observations in each group should also be independent: knowing whether one person responded gives no information about whether another person responded. The conditions are met.

Do: The proportions of contacts in each group are $\hat{p}_1 = \dfrac{200}{291} = 0.687$ and $\hat{p}_2 = \dfrac{58}{100} = 0.58$. The pooled proportion is $\hat{p}_C = \dfrac{200 + 58}{291 + 100} = \dfrac{258}{391} = 0.66$. The test statistic is $z = \dfrac{(0.687 - 0.58) - 0}{\sqrt{\dfrac{(0.66)(0.34)}{291} + \dfrac{(0.66)(0.34)}{100}}} = 1.95$. Since this is a one-sided test, the P-value is $P(z > 1.95) = 0.0256$. Conclude: Since the P-value is less than 0.05, we reject the null hypothesis. We have enough evidence to conclude that the proportion of people like the ones in the study who would respond when messages are left is higher than the proportion of people who would respond when no messages are left. R10.10 Answers will vary, but here is an example. The difference between average female (55.5) and male (57.9) self-concept scores was so small that it can be attributed to chance variation in the samples (t = −0.83, df = 62.8, P-value = 0.4110). In other words, based on this sample, we have insufficient evidence to suggest that mean self-concept scores differ by gender.
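The pooled two-sample z test used in R10.5 and R10.9 can be sketched as follows. The function and its `tail` parameter are hypothetical names for illustration. Because the code works from the raw counts instead of the rounded proportions, the statistic for R10.5 comes out near −2.93 rather than −2.91; the conclusions are unchanged.

```python
import math
from statistics import NormalDist


def pooled_two_prop_z(x1, n1, x2, n2, tail="greater"):
    """Two-sample z test of H0: p1 - p2 = 0 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # pooled (combined) sample proportion
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    lower = NormalDist().cdf(z)
    if tail == "greater":
        p_value = 1 - lower
    elif tail == "less":
        p_value = lower
    else:  # two-sided
        p_value = 2 * min(lower, 1 - lower)
    return z, p_value


# R10.9: 200 of 291 contacted with a message, 58 of 100 without (Ha: p1 > p2).
z_msg, p_msg = pooled_two_prop_z(200, 291, 58, 100, tail="greater")
# R10.5: 17 of 435 on AZT, 38 of 435 on placebo (Ha: p1 < p2).
z_azt, p_azt = pooled_two_prop_z(17, 435, 38, 435, tail="less")
print(f"R10.9: z = {z_msg:.2f}, P-value = {p_msg:.4f}")
print(f"R10.5: z = {z_azt:.2f}, P-value = {p_azt:.4f}")
```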

AP Statistics Practice Test (page 664) T10.1 e. Because the values are on a scale of 1 to 20, there can be no large outliers and the t test is robust to skewness in such large sample sizes. T10.2 b. Since it is a 95% confidence interval, the z* is 1.96 and we do not use a pooled proportion in confidence intervals. T10.3 a. The variable being measured is a yes/no variable so the population distribution cannot be Normal. T10.4 a. With smaller degrees of freedom, the t* will be larger and so the confidence interval will be wider. T10.5 b. A test for the difference between two means with population standard deviations unknown is a t test. T10.6 e. With the P-value less than 0.05 we reject the null hypothesis which allows us to conclude that we have evidence for the alternative hypothesis. T10.7 c. The confidence interval gives a list of plausible values. Since 0 is given as a plausible value we cannot rule it out, but we do not know that it is the truth.

T10.8 c. The standard deviation is $\sqrt{\dfrac{0.8(0.2)}{100} + \dfrac{0.5(0.5)}{400}} = 0.047$.


T10.9 b. We are measuring rates in the two populations, so the method should be to compare two population proportions. Since we are asking how big the difference is, the method used should be a confidence interval. T10.10 a. If we are trying to detect a smaller difference, the power will be smaller. T10.11 (a) State: Our parameters of interest are $\mu_1$ = the mean hospital stay for patients like those who get heating blankets during surgery and $\mu_2$ = the mean hospital stay for patients like those who have core temperatures reduced during surgery. We want to estimate the difference $\mu_1 - \mu_2$ at a 95% confidence level. Plan: We should use a two-sample t interval for $\mu_1 - \mu_2$ if the conditions are satisfied. Random: This was a randomized controlled experiment. Normal: Both samples had more than 30 observations. Independent: Due to the random assignment, these two groups of patients can be viewed as independent. Individual observations in each group should also be independent: knowing one patient’s length of hospital stay gives no information about another patient’s length of hospital stay. The conditions are met. Do: We know that $n_1 = 104$, $\bar{x}_1 = 12.1$, $s_1 = 4.4$, $n_2 = 96$, $\bar{x}_2 = 14.7$, and $s_2 = 6.5$. We will use the conservative degrees of freedom, which is 95 in this case. So our 95% confidence interval is $(12.1 - 14.7) \pm 1.985\sqrt{\dfrac{4.4^2}{104} + \dfrac{6.5^2}{96}} = -2.6 \pm 1.57 = (-4.17, -1.03)$. Conclude: We are 95% confident

that the interval from –4.17 to –1.03 captures the difference in actual mean hospital stay for patients like those who get heating blankets during surgery and those who have their core temperatures reduced during surgery. This interval suggests that the mean hospital stay is between 1.03 and 4.17 days shorter for those receiving the heating blankets in comparison to those who have their core temperatures reduced. (b) Yes. Since 0 is not in the interval and the entire interval is negative (we subtracted normothermic – hypothermic), it appears that warming patients during surgery decreases the mean hospital stay. T10.12 (a) State: We want to perform a test at the $\alpha = 0.05$ significance level of $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 > 0$, where $p_1$ is the actual proportion of cars that had the brake defect in last year’s model and $p_2$ is the actual proportion of cars that have the brake defect in this year’s model. Plan: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied. Random: The samples were randomly selected. Normal: The number of successes and failures in both groups are at least 10 (Last year: 20 successes, 80 failures. This year: 50 successes, 300 failures). Independent: Both samples are less than 10% of their respective populations (there are more than 1000 cars of last year’s model and 3500 cars of this year’s model). The conditions are met. Do: The proportions of defects in each group are $\hat{p}_1 = \dfrac{20}{100} = 0.2$ and $\hat{p}_2 = \dfrac{50}{350} = 0.143$. The pooled proportion is $\hat{p}_C = \dfrac{20 + 50}{100 + 350} = \dfrac{70}{450} = 0.156$. The test statistic is $z = \dfrac{(0.2 - 0.143) - 0}{\sqrt{\dfrac{(0.156)(0.844)}{100} + \dfrac{(0.156)(0.844)}{350}}} = 1.39$. Since this is a one-sided test, the P-value is $P(z > 1.39) = 0.0823$. Conclude: Since the P-value is greater than 0.05, we fail to reject the null hypothesis. We do not have enough evidence to conclude that the proportion of brake defects is less on cars of this year’s model in comparison to cars of last year’s model. (b) A Type I error occurs if we reject the null hypothesis and it is really true. In this case that would mean concluding that there are fewer brake defects on this year’s car model when in fact there are not fewer. This might result in more accidents because people think that their brakes are safe. A Type II error occurs if we fail to reject the null hypothesis when in fact it is false. In this case that would mean concluding that there are no fewer


brake defects this year than last year when the number of brake defects has actually been reduced. This might result in fewer people buying this particular model of car. T10.13 (a) This is a two-sample t test for comparing the difference between two means. Random: The samples were selected randomly. Normal: Since there are fewer than 30 observations in each sample we need to examine the data. The dotplot below shows no strong skewness or outliers in either distribution. Independent: There are very likely more than 100 one-bedroom and 100 two-bedroom apartments in the area of a college campus.

(b) If the mean rent of the two types of apartments is really the same, we have a 5.8% chance of finding a sample where the observed difference in mean rents is as large as or larger than the one in this study. (c) Since the P-value is greater than 0.05, Pat should fail to reject the null hypothesis. She does not have enough evidence to conclude that the mean rent of two-bedroom apartments is higher than the mean rent of one-bedroom apartments in the area of her college campus.