AP® Statistics 2020 Practice Free-Response Questions

From Introduction to Statistical Investigations, AP Edition, by Nathan Tintle, Ruth Carver, Beth Chance, George Cobb, Allan Rossman, Soma Roy, Todd Swanson, and Jill VanderStoep. ©2019 by John Wiley & Sons Inc. HMH is Wiley's exclusive partner for college-level content used in high schools.

DIRECTIONS: Show all your work. Indicate clearly the methods you use, because you will be scored on the correctness of your method as well as on the accuracy and completeness of your results and explanations.

Question 1

Students in an AP Statistics class participated in an online memory game. All the students first played the game at Level 1 (the lowest difficulty level), and then played the game again at Level 4 (a higher level of difficulty). The graphs below display the distribution of student scores for the two difficulty levels, Level 1 and Level 4.

a. Use the graphical display above to compare the distribution of student scores for the two difficulty levels (Level 1 and Level 4) of the memory game.

The difference in scores (Level 4 − Level 1) on the memory game was calculated for each student. The graph below displays the distribution of the differences.

b. What added information does the graph above of the difference in scores (Level 4 − Level 1) give you about students’ scores on the two different levels of the game that was not apparent in the first graphical display?

Question 2

The National Park Service is interested in determining whether placing predator cages over the loggerhead turtle nests on Cape Lookout National Seashore will keep raccoons from stealing eggs from the nests. Due to budget constraints, funding for the predator cages will only be approved if the Park Service can provide convincing evidence that the predator cages increase the number of turtles that successfully hatch.

The Park Service plans to collect data from the loggerhead turtle nests at Cape Lookout one season, 35 of which will have a predator cage placed over them and 35 of which will not. A test of significance will be conducted at a significance level of α = 0.05 for the following hypotheses:

H0: μC = μNC

Ha: μC > μNC,

where μC is the mean number of eggs that would successfully hatch per nest for all loggerhead turtle nests on Cape Lookout with a cage and μNC is the mean number of eggs that would successfully hatch per nest for all nests on Cape Lookout with no cage.

a. Describe what a Type I error would be in the context of the study, and also describe a consequence of making this type of error.

b. Each season, the Park Service moves approximately half the turtle nests at Cape Lookout very soon after they are laid because they are placed in locations that are vulnerable to extreme high tides. The Park Service decides to collect data for their study by randomly selecting 35 of the nests that were moved and placing a cage over them and comparing the hatching rate to the rate for 35 randomly selected nests that were neither moved nor caged. This resulted in a p-value of 0.0003 for the hypotheses stated above. If it was reasonable to conduct a test of significance for the hypotheses stated above using the data collected, what would the p-value of 0.0003 lead you to conclude?

c. Describe the primary flaw in the study described in part (b), and explain why it is a concern.

Question 3

A smartphone manufacturer is concerned about the proportion of defectives produced at a certain plant that produces many smartphones every day. Historically, approximately 15% of phones produced at this plant have been defective. As part of their quality assurance testing, four smartphones are selected at random from a day’s production.

a. Let X represent the number of defective phones in the sample of four phones. Complete the table below for the probability distribution of X, assuming the historic defective rate holds.

b. What is the expected number of defective phones in this sample?

c. What is the conditional probability that all four phones were defective, given that at least two defectives were found in the sample?

d. Suppose that each day the phones are produced independently of all other days’ phones. If a sample of size 4 is taken every weekday (five total samples), what is the probability that there are no defective phones found in any of the five samples taken?
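The probability model behind Question 3 can be sketched in a few lines. This is only an illustration, assuming that X follows a binomial distribution with n = 4 and p = 0.15 (that is, the four sampled phones are treated as independent, which is reasonable when many phones are produced each day). The function name `binomial_pmf` and the variable names are hypothetical, chosen for this sketch.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumption: historic defective rate of 15%, sample of four phones.
n, p = 4, 0.15
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Expected value as the probability-weighted sum of outcomes (equals n * p).
expected = sum(k * pk for k, pk in enumerate(pmf))

# Conditional probability: P(X = 4 | X >= 2) = P(X = 4) / P(X >= 2).
p_at_least_two = sum(pmf[2:])
p_all_given_at_least_two = pmf[4] / p_at_least_two

# Five independent daily samples, each with zero defectives.
p_no_defects_five_days = pmf[0] ** 5

for k, pk in enumerate(pmf):
    print(f"P(X = {k}) = {pk:.6f}")
```

The same structure works for any sample size or defective rate by changing `n` and `p`.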

Question 4

The National Sleep Foundation conducts an annual survey to track sleep-related behavior of U.S. adults. In their most recent survey, a random sample of 1,018 adults answered the question “About how much actual sleep would you estimate you typically get on work nights or weeknights?” The two-way table below summarizes the responses by whether they were less than, equal to, or more than the recommended 7 to 9 hours of sleep and by the age group of the respondent.

At the α = 0.05 significance level, do the data provide convincing statistical evidence that there is an association between age group and typical weeknight sleep time for adults in the United States? Explain your answer.

Question 5

In 2006, professional tennis introduced a challenge system in which a player can challenge decisions on whether a tennis ball was correctly called “in” or “out” by an official. A video replay then determines whether the call was correct (“upheld”) or incorrect (“reversed”). A player wants to determine whether “in” calls or “out” calls are more likely to be reversed. The table below shows the 2015 Women’s Singles challenges based on whether the tennis ball was called “out” or “in” and whether the official’s call was reversed or upheld.

a. Calculate the proportion of all challenges in which the official’s call was reversed.

b. Use these data as a representative sample of all player challenges in professional tennis to find a 95% confidence interval for the difference in the proportion of reversed calls when the official calls the ball “out” and when the official calls the ball “in” (out − in). Be sure to also provide an interpretation of this confidence interval. You may assume that all conditions for inference were met.

c. Does the confidence interval from (b) provide convincing statistical evidence that there is a difference in the proportion of reversed challenges between when the official calls the ball “out” and when the official calls the ball “in”? Justify your answer.

d. Would your answer to (c) change if the confidence interval in (b) were a 90% confidence interval rather than a 95% confidence interval? Explain how you are arriving at your answer.
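As a rough illustration of the interval in part (b), the sketch below computes a two-sample z interval for a difference in proportions. The counts used here are hypothetical placeholders and are not taken from the table in the question; the function name `two_proportion_interval` is likewise an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, level=0.95):
    """Two-sample z interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    # Critical value z* for the requested confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    difference = p1 - p2
    return difference - z_star * standard_error, difference + z_star * standard_error

# Hypothetical counts (NOT the question's data): reversed calls out of challenges.
out_reversed, out_total = 60, 200
in_reversed, in_total = 30, 100

interval_95 = two_proportion_interval(out_reversed, out_total, in_reversed, in_total, 0.95)
interval_90 = two_proportion_interval(out_reversed, out_total, in_reversed, in_total, 0.90)
print("95% interval:", interval_95)
print("90% interval:", interval_90)
```

Comparing the two printed intervals shows that lowering the confidence level narrows the interval around the same observed difference, which is the idea part (d) asks about.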

Question 6

In a large manufacturing company, every item produced is inspected for defects and will go through a repair process if the defects are serious. Management wanted to investigate whether items produced on Mondays are more likely to require repair than items produced on the midweek day Wednesday. A random sample of 9 weeks from the past 5 years was selected, and the numbers of items that required repair on Monday and on Wednesday of each of the 9 weeks are shown in the table below.

A boxplot of the differences in number of items that required repairing on Monday and Wednesday for the 9 sampled weeks is shown below.

a. Management wanted to test these hypotheses:

H0: μdifference = 0

Ha: μdifference > 0

where μdifference is the mean of the differences in the number of produced items that required repair on Monday and on Wednesday for all weeks in the past 5 years.

Explain why performing a matched-pairs t-test may not be appropriate for this sample of differences.

A different possible set of hypotheses for this investigation could be:

H0: p = 0.5

Ha: p > 0.5

where p is the proportion of all weeks in which Monday produces more items that require repair than Wednesday. Suppose we decide to test these hypotheses using the statistic X, the number of weeks of the 9 sampled weeks in which more items required repair on Monday than Wednesday.

b. Assuming the null hypothesis (that Mondays and Wednesdays are equally likely to have the most items which require repair) is true, what is the probability distribution for the statistic X?

c. Calculate the p-value and use this p-value to provide the conclusion of the test for a significance level of α = 0.05.
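The calculation in parts (b) and (c) can be sketched as an upper-tail probability, assuming that under the null hypothesis X is binomial with n = 9 and p = 0.5 and that no week has a tie between Monday and Wednesday. The observed count used below is a hypothetical placeholder, not the value from the table, and `sign_test_p_value` is a name introduced only for this sketch.

```python
from math import comb

def sign_test_p_value(x_observed, n, p_null=0.5):
    """Upper-tail probability P(X >= x_observed) for X ~ Binomial(n, p_null)."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(x_observed, n + 1)
    )

n_weeks = 9
hypothetical_x = 7  # placeholder count of weeks, NOT taken from the question's table
p_value = sign_test_p_value(hypothetical_x, n_weeks)
print(f"P(X >= {hypothetical_x}) = {p_value:.4f}")
```

The p-value is then compared with α = 0.05 to decide whether to reject the null hypothesis.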

A different possible set of hypotheses for this investigation is:

H0: The distributions of the numbers of items that require repair for Mondays and Wednesdays are the same.

Ha: The distribution of the numbers of items that require repair for Mondays is shifted to the right of the distribution of the number of items that require repair on Wednesdays.

To test these hypotheses using the signed rank test, we first order the absolute values of the differences and then determine the ranks of the 9 values (1 = smallest, 9 = largest). Then we return the original positive or negative sign. The statistic is the sum of the positive ranks.
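The ranking procedure described above can be sketched as follows. The differences in the example are hypothetical and are not the values from the question's table, and the sketch assumes there are no tied absolute values and no zero differences. The function name `sum_of_positive_ranks` is introduced only for illustration.

```python
def sum_of_positive_ranks(differences):
    """Signed rank statistic: rank |d| from smallest to largest, sum ranks where d > 0.

    Assumes no ties among the absolute values and no zero differences.
    """
    # Indices of the differences, ordered by absolute value.
    order = sorted(range(len(differences)), key=lambda i: abs(differences[i]))
    ranks = [0] * len(differences)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    # Restore the sign of each difference and keep only the positive ranks.
    return sum(rank for rank, d in zip(ranks, differences) if d > 0)

# Hypothetical Monday minus Wednesday differences for 9 weeks (NOT the question's data).
hypothetical_differences = [5, -2, 8, 1, -3, 12, 7, -4, 10]
statistic = sum_of_positive_ranks(hypothetical_differences)
print("Sum of positive ranks:", statistic)  # prints 36 for these placeholder values
```

With 9 differences the ranks always total 45, so the statistic ranges from 0 (all differences negative) to 45 (all positive).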

d. Calculate the statistic for the signed rank test by completing the signed rank of the difference column in the table below and then adding up the positive ranks.

Sum of Positive Ranks = _______________

Under the assumption of the null hypothesis, that the distributions of the numbers of items requiring repair for Mondays and Wednesdays are the same, a simulation with 10,000 repetitions was performed and the signed rank statistic was calculated for each repetition. The histogram below shows the distribution of the simulated signed rank statistics.
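The question does not say how the simulation was carried out. One common approach, shown here only as a sketch under that assumption, is to give each of the ranks 1 through 9 a positive or negative sign with equal probability and record the sum of the positive ranks. The observed statistic below is a hypothetical placeholder, and the function name and seed are choices made for this illustration.

```python
import random

def simulate_signed_rank_null(n, repetitions=10_000, seed=0):
    """Simulate the sum of positive ranks when each rank's sign is a fair coin flip."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        sum(rank for rank in range(1, n + 1) if rng.random() < 0.5)
        for _ in range(repetitions)
    ]

simulated = simulate_signed_rank_null(9)

# Hypothetical observed statistic (NOT computed from the question's table).
hypothetical_observed = 36
approx_p_value = sum(s >= hypothetical_observed for s in simulated) / len(simulated)
print("Approximate one-sided p-value:", approx_p_value)
```

The proportion of simulated statistics at least as large as the observed one approximates the one-sided p-value that would be read from the histogram.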

e. Based on the value of the signed rank test statistic calculated in part (d) and the distribution of the 10,000 simulated signed rank test statistics above, what should be the conclusion for the manufacturing company in comparing the number of items requiring repair on Mondays and Wednesdays?
