Categorizing Inference Questions

• 1. Infants who cry easily may be more easily stimulated than others. This may be a sign of higher IQ. Child development researchers explored the relationship between the crying of infants 4 to 10 days old and their later IQ test scores. A snap of a rubber band on the sole of the foot caused the infants to cry. The researchers recorded the crying and measured its intensity by the number of peaks in the most active 20 seconds. They later measured the children’s IQ at age 3 years. The table below contains data from a random sample of 38 infants. Do these data provide convincing evidence that there is a positive linear relationship between the cry counts and IQ in the population of infants?

• A) Bivariate data – positive linear relationship

• B) The true slope of the relationship between cry counts and IQ in the population of infants

• C) One sample of 38 infants

• D) t-test for the slope of a regression line: t = b / SE_b

• E) Assumptions:

o SRS, relationship is linear, responses vary normally about the LSRL, standard deviation of the residuals is constant
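The A) through D) steps used for every question in this review work like a lookup: the type of data, the number of samples, and whether the goal is to test a claim or to estimate a parameter together determine the procedure. The Python sketch below encodes that lookup for the procedures named in this review. The function name `choose_procedure` and its argument labels are illustrative assumptions, not part of the original slides.

```python
def choose_procedure(data_type, samples, goal, paired=False):
    """Return the name of the inference procedure for a question.

    data_type: "means", "proportions", "bivariate", or "categorical"
    samples:   number of samples or treatment groups (1, 2, 3, ...)
    goal:      "test" (evidence for a claim) or "estimate" (confidence interval)
    paired:    True when each individual supplies two matched measurements
    """
    if goal not in ("test", "estimate"):
        raise ValueError("goal must be 'test' or 'estimate'")
    kind = "test" if goal == "test" else "interval"

    if data_type == "bivariate":
        return f"t-{kind} for the slope of a regression line"
    if data_type == "categorical":
        # One sample classified on two variables -> independence;
        # several samples or treatment groups -> homogeneity.
        if samples == 1:
            return "chi-square test of independence"
        return "chi-square test of homogeneity"
    if data_type == "means":
        if samples == 1:
            suffix = " for matched pairs" if paired else ""
            return f"one-sample t-{kind}{suffix}"
        return f"two-sample t-{kind} for a difference of means"
    if data_type == "proportions":
        if samples == 1:
            return f"one-proportion z-{kind}"
        return f"two-proportion z-{kind} for a difference of proportions"
    raise ValueError(f"unknown data type: {data_type}")


# Question 1: bivariate data, one sample, testing for a relationship.
print(choose_procedure("bivariate", 1, "test"))
```

Choosing the procedure is only step D); the assumptions in step E) still have to be checked separately for each question.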

• 2. Trace metals found in wells affect the taste of drinking water, and high concentrations can pose a health risk. Researchers measured the concentration of zinc (in milligrams per liter) near the top and the bottom of 10 randomly selected wells in a large region. The data are provided in the table below. Construct and interpret a 95% confidence interval for the mean difference in the zinc concentrations from these two locations in the wells.

• A) Means

• B) The mean difference in the zinc concentrations from these two locations in the wells

• C) One sample of 10 wells – two measurements at each well

• D) This will be a one-sample t-interval for matched pairs. There are two measurements for each well. Subtract the matching measurements and use the difference data to do the t-interval

• E) Assumptions

o σ unknown, use t

o Random sample of wells - given

o Population of wells > 100

o We will have to check a graph of the difference data to see if a normal approximation is valid, since the sample size (10) is small

• 3. Researchers designed a survey to compare the proportions of children who come to school without eating breakfast in two low-income elementary schools. An SRS of 80 students from School 1 found 19 had not eaten breakfast. At School 2, an SRS of 150 students included 26 who had not had breakfast. More than 1500 students attend each school. Do these data give convincing evidence of a difference in the population proportions?

• A) Proportions

• B) The true difference in the population proportions of children who come to school without eating breakfast in two low-income elementary schools

• C) Two samples

• D) This will be a two-proportion z-test for a difference in proportions

o Remember that the plain “p” is the pooled p, found by putting the two samples together

• E) Assumptions

o SRS of 80 students from one school and independent SRS of 150 students from the second school - given

o n1p1 = 19, n1(1-p1) = 61; n2p2 = 26, n2(1-p2) = 124. Since all are greater than 10 we can use a normal approximation

o Population of students in school 1 > 800 and in school 2 > 1500 – the question states both populations are > 1500
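As a rough illustration of the pooled calculation for the breakfast question, the sketch below computes the two-proportion z statistic and a two-sided P-value from the counts given (19 of 80 and 26 of 150). The function name `two_proportion_z_test` is a hypothetical helper, and the two-sided alternative is an assumption based on the question asking about "a difference".

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided two-proportion z-test using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # both samples put together
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# School 1: 19 of 80 skipped breakfast; School 2: 26 of 150.
z, p_value = two_proportion_z_test(19, 80, 26, 150)
print(f"z = {z:.2f}, two-sided P-value = {p_value:.3f}")
```

The pooled proportion appears only in the standard error; the numerator still uses the two separate sample proportions.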

• 4. Bottles of a popular cola are supposed to contain 300 milliliters of cola. There is some variation from bottle to bottle because the filling machinery is not perfectly precise. From experience, the distribution of the contents of the bottles is approximately normal. An inspector measures the contents of six randomly selected bottles from a single day’s production. Do these data provide convincing evidence that the mean amount of cola in all the bottles filled that day differs from the target value of 300 ml?

• A) Means

• B) The mean amount of cola in all the bottles filled that day

• C) One sample

• D) This will be a one-sample t-test for means: t = (x̄ − μ0) / (s / √n)

• E) Assumptions:

o σ unknown, use t

o Random sample of bottles from the day’s production - given

o Contents of bottles approximately normally distributed - given

o Population of bottles in one day’s production > 60

• 5. Some doctors have begun to use medical magnets to treat patients with chronic pain. Scientists wondered whether this type of therapy really worked, so they designed an experiment to find out. Fifty patients with chronic pain were recruited for the study. A doctor identified a painful site on each patient and asked him or her to rate the pain on a scale from 0 to 10. Then, the doctor selected a sealed envelope containing a magnet from a box that contained both active and inactive magnets. The chosen magnet was applied to the site of the pain for 45 minutes. After treatment, each patient was again asked to rate the level of pain from 0 to 10. In all, 29 patients were given active magnets and 21 patients received inactive magnets. All but one of the patients rated their initial pain as 8, 9, or 10, so scientists decided to focus on the patients’ final pain ratings. Do these data show statistical evidence to suggest that the active magnets help reduce pain?

• A) This question doesn’t explicitly say whether it’s proportions or means, but there are no percentages, and we could find the mean pain rating, so it is means.

• B) The true reduction (difference) in mean pain rating for patients using active magnets as opposed to a placebo

• C) Two samples – even though there were 50 patients originally, they get split into two separate treatments. Since whether they get the magnet or not is random, we can say the two samples are independent

• D) Two-sample t-test for difference of means: t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)

• E) Assumptions:

o σ unknown, use t

o Patients were randomly assigned to active magnets (29) or inactive magnets (21), so the two groups are independent – given

o Reasonable to assume the true population of patients with chronic pain is greater than 290 and 210

o Since each sample size is less than 30, we will need to look at graphs of the pain ratings for each group to determine if we can use a normal approximation

• 6. Tonya wants to estimate what proportion of the seniors in her school plan to attend the prom. She interviews an SRS of 50 of the 750 seniors in her school and finds that 36 plan to go to the prom.

• A) Proportions

• B) The true proportion of seniors at Tonya’s school who plan to go to the prom

• C) One sample of students

• D) One-proportion z-interval since we are estimating the parameter

• E) Assumptions:

o SRS of seniors - given

o np = 36, n(1-p) = 14. Since both are > 10 we can use a normal approximation

o Population of seniors = 750, which is greater than 500 (10n)
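For the prom question, a one-proportion z-interval can be sketched as follows. The question does not state a confidence level, so the 95% default here is an assumption, and `one_proportion_z_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence=0.95):
    """Confidence interval for a population proportion (normal approximation)."""
    p_hat = successes / n
    # Critical value z* that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Tonya: 36 of the 50 sampled seniors plan to attend (95% level assumed).
low, high = one_proportion_z_interval(36, 50)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```

Unlike a test, the interval uses the sample proportion in the standard error because no hypothesized value is available.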

• 7. A local high school makes a change that should improve student satisfaction with the parking situation. Before the change, 37% of the school’s students approved of the parking that was provided. After the change, the principal surveys an SRS of 200 of the over 2500 students at the school. In all, 83 students say that they approve of the new parking arrangement. Is this evidence that the change was effective?

• A) Proportions – you have a % and the fraction 83/200

• B) The true proportion of students who approve of the parking after the change

• C) One sample of students – the 37% describes all of the school’s students before the change, so it is the hypothesized value to compare against, not a second sample

• D) One-sample z-test for a proportion: z = (p̂ − p0) / √(p0(1 − p0)/n), with p0 = 0.37

• E) Assumptions

o SRS of students after the change - given

o np0 = (200)(.37) = 74, n(1-p0) = (200)(.63) = 126. Since both values > 10 we can use a normal approximation

o Population of students is over 2500, which is greater than 2000 (10n)

• 8. Here are data on the time (in minutes) Professor Moore takes to swim 2000 yards and his pulse rate (beats per minute) after swimming on a random sample of 23 days. Is there statistically significant evidence of a linear relationship between Professor Moore’s swim time and his pulse rate in the population of days on which he swims 2000 yards?

• A) Bivariate data

• B) The true slope of the relationship between Professor Moore’s swim time and his pulse rate

• C) One sample of 23 days

• D) t-test for slope of LSRL: t = b / SE_b

• E) Same assumptions as question #1

• 9. Biologists studying the healing of skin wounds measured the rate at which new cells closed a cut made in the skin of an anesthetized newt. Here are data from a random sample of 18 newts, measured in micrometers per hour. We want to estimate the mean healing rate with 95% confidence.

• A) Means

• B) The true mean healing rate of skin wounds

• C) One sample

• D) One-sample t-interval for means

• E) Assumptions:

o σ unknown, use t

o Random sample of newts – given

o Population of newts > 180

o Since the sample size is < 30 we would need to look at a graph of the data to determine if we can use a normal approximation

• 10. Breast-feeding mothers secrete calcium into their milk. Some of the calcium may come from their bones, so mothers may lose bone mineral. Researchers compared a random sample of 47 breast-feeding women with a random sample of 32 women of similar age who were neither pregnant nor lactating. They measured the percent change in the bone mineral content of the women’s spines over three months. Comparative data is given below. Is the mean change in bone mineral content significantly lower for the mothers who are breast-feeding?

• A) Means

• B) The true difference in the mean change of bone mineral content between breast-feeding women and those not breast-feeding

• C) Two samples

• D) Two-sample t-test for difference of means: t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)

• E) Assumptions

o Neither σ is known, so use t

o Random sample of 47 breast-feeding women and random sample of 32 women not breast-feeding – given

o Both sample sizes are greater than 30, so we can use a normal approximation

o Population of breast-feeding women > 470 and population of women not breast-feeding > 320

• 11. Some doctors argue that “normal” human body temperature is not really 98.6°F. One researcher took the oral temperature reading for each of 130 randomly chosen, healthy 18- to 40-year-olds. The mean temperature was 98.25°F, with a standard deviation of 0.73°F. Do these data provide convincing evidence that normal body temperature is not 98.6°F?

• A) Means

• B) The true mean temperature of healthy 18- to 40-year-olds

• C) One sample

• D) One-sample t-test for means: t = (x̄ − μ0) / (s / √n)

• E) Assumptions

o σ unknown, use t

o Random sample of healthy 18- to 40-year-olds – given

o Since the sample size is greater than 30, we can use a normal approximation

o Population of 18- to 40-year-olds > 1300
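The body temperature question supplies summary statistics, so the one-sample t statistic can be computed directly. The sketch below uses only the values given in the question; `one_sample_t_statistic` is a hypothetical helper name, and the P-value is left out because it would require a t-distribution routine (for example from software or a table).

```python
from math import sqrt


def one_sample_t_statistic(sample_mean, sample_sd, n, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    standard_error = sample_sd / sqrt(n)
    t = (sample_mean - mu0) / standard_error
    return t, n - 1


# n = 130, mean 98.25, standard deviation 0.73, hypothesized mean 98.6.
t, df = one_sample_t_statistic(98.25, 0.73, 130, 98.6)
print(f"t = {t:.2f} with df = {df}")
```

A statistic this far from zero with 129 degrees of freedom corresponds to a very small two-sided P-value.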

• 12. A study followed a random sample of 8474 people with normal blood pressure for about four years. All the individuals were free of heart disease at the beginning of the study. Each person took a test which measures how prone a person is to sudden anger. Researchers also recorded whether each individual developed coronary heart disease. Do the data provide convincing evidence of an association between anger level and heart disease in the population of interest?

• A) Categorical data

• B) Whether there is an association between anger level and heart disease in the population of adults

• C) One sample – two variables

• D) Chi-square test of independence

• E) Assumptions

o Random sample of people – given

o Data are counts

o All expected counts are greater than 5 – we would need to actually compute the expected counts, but considering the sample is so large, this condition will probably be met

• 13. A drug manufacturer claims that less than 10% of patients who take its new drug for treating Alzheimer’s disease will experience nausea. To test this claim, researchers conduct an experiment. They give the new drug to a random sample of 300 out of 5000 Alzheimer’s patients whose families have given informed consent for the patients to participate in the study. In all, 25 of the subjects experience nausea.

• A) Proportions

• B) The true proportion of Alzheimer’s patients experiencing nausea when taking the new drug

• C) One sample of patients

• D) One-sample z-test for a proportion: z = (p̂ − p0) / √(p0(1 − p0)/n)

• E) Assumptions

o Random sample of 300 Alzheimer’s patients - given

o np0 = (300)(.10) = 30, n(1-p0) = (300)(.90) = 270. Since both of these values are > 10 we can use a normal approximation

o Population of Alzheimer’s patients > 3000
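A minimal sketch of the one-sample z-test for the nausea question follows. It assumes the alternative hypothesis is p < 0.10 (the manufacturer's claim) and uses the hypothesized value p0 in the standard error; `one_proportion_z_test_lower` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test_lower(successes, n, p0):
    """Lower-tailed one-proportion z-test of H0: p = p0 against Ha: p < p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se
    p_value = NormalDist().cdf(z)  # area to the left of z
    return z, p_value


# 25 of the 300 patients experienced nausea; the claim is "less than 10%".
z, p_value = one_proportion_z_test_lower(25, 300, 0.10)
print(f"z = {z:.2f}, P-value = {p_value:.3f}")
```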

• 14. Glenn wonders what proportion of students at his school think that tuition is too high. He interviews an SRS of 50 of the 240 students at his college. Thirty-eight of those interviewed think tuition is too high.

• A) Proportions

• B) The true proportion of students at Glenn’s school who think that tuition is too high

• C) One sample of students

• D) Since Glenn is trying to determine the proportion, or estimate the proportion, this will be a confidence interval – z-interval for one sample proportion

• E) Assumptions

o SRS of students - given

o np = 38, n(1-p) = 12. Since both values > 10 we can use a normal approx.

o Population of students is 240, which is less than 10n = 500! We can’t assume independence in our sample. We can proceed, but our results may not be reliable.
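The condition checks for a one-sample proportion procedure are simple arithmetic, so they can be sketched as a small helper. The function `check_proportion_conditions` is hypothetical; it checks the two numeric conditions used in this review (both counts at least 10, and population at least 10n) and leaves the randomness condition to the reader.

```python
def check_proportion_conditions(successes, n, population):
    """Check the large-counts and 10% conditions for one sample proportion."""
    failures = n - successes
    return {
        "large_counts": successes >= 10 and failures >= 10,
        "ten_percent": population >= 10 * n,
    }


# Glenn (question 14): 38 of 50 sampled, population of 240 students.
glenn = check_proportion_conditions(38, 50, 240)
# Tonya (question 6): 36 of 50 sampled, population of 750 seniors.
tonya = check_proportion_conditions(36, 50, 750)
print("Glenn:", glenn)
print("Tonya:", tonya)
```

Glenn's sample passes the large-counts check but fails the 10% condition, while Tonya's passes both.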

• 15. Market researchers suspect that background music may affect the mood and buying behavior of customers. One study in a supermarket compared three randomly assigned treatments: no music, French accordion music, and Italian string music. Under each condition, the researchers recorded the numbers of bottles of French, Italian, and other wine purchased. Are the distributions of wine purchases under the three music treatments similar or different?

• A) Categorical data

• B) Whether the population of customers has the same buying habits under the different music treatments

• C) 3 samples (treatment groups)

• D) Chi-square test of homogeneity

• E) Assumptions

o Treatments were randomly assigned - given

o Data are counts

o All expected counts > 5 – would need the data to check this

• 16. A surprising number of young adults (ages 19 to 25) still live in their parents’ homes. The National Institutes of Health planned to estimate the difference in proportions of women and men in this age group who live at home. The random sample included 2253 men and 2629 women in this age group. The survey found that 986 of the men and 923 of the women lived with their parents.

• A) Proportions

• B) The true difference in proportions of men and women ages 19–25 who still live with their parents

• C) Two samples – men and women

• D) Two-sample z-interval for difference of proportions

• E) Assumptions

o Random sample of 19-25 year old men and random sample of 19-25 year old women - given

o n1p1 = 986, n1(1-p1) = 1267

o n2p2 = 923, n2(1-p2) = 1706

o Since all these values are > 10, we can use a normal approximation

o Population of 19-25 year old men > 22530 and population of 19-25 year old women > 26290

• 17. The Wade Tract Preserve in Georgia is an old-growth forest of long-leaf pines that has survived in a relatively undisturbed state for hundreds of years. One question of interest to foresters who study the area is “How do the sizes of long-leaf pine trees in the northern and southern halves of the forest compare?” To find out, researchers took random samples of 30 trees from each half of the forest and measured the trees’ diameter in centimeters. What is the difference in mean diameters of long-leaf pines in the northern and southern halves?

• A) mean – it’s not proportions, and we can find the mean diameter

• B) the true difference of the mean diameters of trees in the northern and southern halves of the Wade Tract Preserve

• C) Two samples of 30 trees

• D) We want to find/estimate the difference, so we use a two-sample t-interval for the difference of means

• E) Assumptions:

o Random sample of trees from the northern part and independent random sample of trees from the southern part of the forest - given

o Since both sample sizes are at least 30, we can use a normal approximation

o Population of trees in each of the northern and southern parts of the forest is > 300

o σ unknown, use t

• 18. Environmentalists, government officials, and vehicle manufacturers are all interested in studying the auto exhaust emissions produced by motor vehicles. The major pollutants in auto exhaust from gasoline engines are hydrocarbons, carbon monoxide, and nitrogen oxides (NOX). Researchers collected data on the NOX levels (in grams per mile) for a random sample of 40 light-duty engines of the same type. The mean NOX reading was 1.2675 and the standard deviation was 0.3332.

• A) Means

• B) The true mean NOX level of the population of light-duty engines

• C) One sample

• D) We’re not testing a claim, so we are going to estimate the true mean using a one-sample t-interval for means

• E) Assumptions

o Random sample of light-duty engines – given

o Sample size 40 > 30 so we can use a normal approximation

o Population of light-duty engines > 400

o σ unknown, use t
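Because the NOX question gives the sample mean and standard deviation, a one-sample t-interval can be sketched from the summary statistics alone. The question does not state a confidence level, so 95% is an assumption here, and the critical value t* ≈ 2.023 for 39 degrees of freedom is entered by hand as if read from software or a t table. `one_sample_t_interval` is a hypothetical helper name.

```python
from math import sqrt


def one_sample_t_interval(sample_mean, sample_sd, n, t_star):
    """Confidence interval for a mean, given a t critical value."""
    margin = t_star * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Assumed critical value for 95% confidence with df = 40 - 1 = 39.
T_STAR_DF39 = 2.023

low, high = one_sample_t_interval(1.2675, 0.3332, 40, T_STAR_DF39)
print(f"95% CI for mean NOX (grams per mile): ({low:.3f}, {high:.3f})")
```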

• 19. Do experienced computer game players earn higher scores when they play with someone present to cheer them on or when they play alone? Fifty teenagers who are experienced at playing a particular computer game have volunteered for a study. We randomly assign 25 of them to play the game alone and the other 25 to play the game with a supporter present. Each player’s score is recorded.

• A) Means (mean score)

• B) The true difference in the mean scores of players with and without a supporter present

• C) Two samples (or randomly assigned groups)

• D) Two-sample t-test for difference of means: t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)

• E) Assumptions:

o Randomly assigned players in each group – with and without a supporter present - given

o Both sample sizes, 25, are less than 30 so we will need to look at the data to determine if we can use a normal approximation

o Population of computer game players > 250 + 250

o σ unknown, use t

• 20. As part of the Pew Internet and American Life Project, researchers conducted two surveys in late 2009. The first survey asked a random sample of 800 U.S. teens about their use of social media and the Internet. A second survey posed the same questions to a random sample of 2253 U.S. adults. In these two studies, 73% of teens and 47% of adults said that they use social-networking sites. Construct and interpret a 95% confidence interval for the difference in the proportion of all US teens and adults who use social-networking sites.

• A) Proportions

• B) The true difference in the proportion of US teens and adults who use social-networking sites

• C) Two samples

• D) Two-sample z-interval for difference of proportions

• E) Assumptions

o Random sample of 800 US teens and independent random sample of 2253 US adults – given

o n1p1 = (800)(.73) = 584, n1(1-p1) = 216

o n2p2 = (2253)(.47) = 1059, n2(1-p2) = 1194

o Since all these values > 10, we can use a normal approximation

o Population of US teens > 8000 and population of US adults > 22530
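The 95% interval requested in the social-networking question can be sketched from the reported percentages and sample sizes. This is an illustrative calculation only; `two_proportion_z_interval` is a hypothetical helper name, and the difference is taken as teens minus adults.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_interval(p1, n1, p2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Teens: 73% of 800; adults: 47% of 2253.
low, high = two_proportion_z_interval(0.73, 800, 0.47, 2253)
print(f"95% CI for p_teens - p_adults: ({low:.3f}, {high:.3f})")
```

An interval does not pool the two samples, because there is no null hypothesis claiming the proportions are equal.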
