13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

Upload: john-dale-green

Post on 17-Jan-2016


TRANSCRIPT

Page 1: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

13.2 Chi-Square Test for

Homogeneity & Independence

AP Statistics

Page 2: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

Homogeneity

The two-sample procedures in Chapter 12 allow us to compare the proportions of successes in two groups. What if we want to compare more than two proportions? We’ll need a new test for that. When data are presented in a two-way table, we are looking at two categorical variables. The same test that compares multiple proportions also tests whether those variables are related. This test is the χ² test for homogeneity/independence.

Page 3: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

Example: Does Background Music Influence Wine Purchases?

A study in a supermarket in Northern Ireland was conducted to determine whether or not the sales of wine changed relative to the type of background music that was played. Researchers recorded the amount and type of wine that was sold while Italian, French, and no music was played.

                    Music
Wine       None   French   Italian   Total
French       30       39        30      99
Italian      11        1        19      31
Other        43       35        35     113
Total        84       75        84     243

Page 4: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

If music had no effect on the type of wine sold, we would expect to see similar distributions of wine types sold under each type of music.

Sketch the three wine distributions and compare:

[Bar chart: counts (0 to 50) of French, Italian, and Other wine sold, with one bar per music type (None, French, Italian). Axes: Wine, Music.]

Page 5: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

To compare the three population distributions, we must determine what counts we would expect to see if the three distributions were the same. To calculate the expected cell counts, we use the following formula (try to determine why, or see page 747):

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{table total}}$$

Calculate the expected counts for each cell and enter them in parentheses next to the observed counts.
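One way to see why the formula works: if the distributions were the same, every column would contain the same fraction of each wine type, namely row total ÷ table total, so the expected count is that fraction times the column total. The following is a minimal Python sketch of that calculation for the wine and music table; the function name `expected_counts` and the list-of-rows layout are illustrative choices, not part of the original slides.

```python
# Observed counts from the wine/music table.
# Rows: wine type (French, Italian, Other); columns: music (None, French, Italian).
observed = [
    [30, 39, 30],   # French wine
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]

def expected_counts(table):
    """Expected count per cell = row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

Rounded to two decimals, the rows come out as 34.22, 30.56, 34.22 for French wine, 10.72, 9.57, 10.72 for Italian wine, and 39.06, 34.88, 39.06 for Other wine.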

Page 6: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

                          Music
Wine      None         French       Italian      Total
French    30 (34.22)   39 (30.56)   30 (34.22)      99
Italian   11 (10.72)    1 (9.57)    19 (10.72)      31
Other     43 (39.06)   35 (34.88)   35 (39.06)     113
Total     84           75           84             243

Page 7: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

To test the significance of the difference between the observed and expected counts, we must calculate a χ² value. If this value is close to zero, then there is not much of a difference between the distributions. However, if this value is large, then we may have evidence that the distributions differ.

H0: p1 = p2 = p3. The distribution of wine types sold is the same for each type of music.
Ha: Not all of the distributions are the same.

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

where the sum is taken over all cells in the table.

Calculate this value. χ² = 18.2688 (computed from the expected counts rounded to two decimals; unrounded expected counts give χ² ≈ 18.28).
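The statistic is the sum of (observed − expected)² ÷ expected over all nine cells. A short Python sketch of that sum for the wine and music table follows; the function name `chi_square_statistic` is hypothetical. Note that with unrounded expected counts the result is about 18.28, while the 18.2688 shown on the slide corresponds to using expected counts rounded to two decimals.

```python
# Observed counts: rows are wine types, columns are music types.
observed = [
    [30, 39, 30],   # French wine
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]

def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp
    return stat

chi2 = chi_square_statistic(observed)
print(round(chi2, 2))
```

Most of the total comes from the Italian wine row: far fewer Italian bottles than expected sold with French music, and far more with Italian music.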

Page 8: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

How likely was this observed difference? To find the P-value, we look up the χ² statistic in a χ² table, using the appropriate degrees of freedom, or use a calculator.

The degrees of freedom in a test for homogeneity are (number of rows – 1)(number of columns – 1).

df = (3 – 1)(3 – 1) = 4

P-value = 0.001093, from the calculator command χ²cdf(χ², 1E99, df)

Conclusion? There is significant evidence at α = 0.05 to reject the null hypothesis. It appears the distributions of wine sales differ across the types of background music.
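The P-value is the area under the chi-square density to the right of the statistic. For an even number of degrees of freedom that area has a closed form, exp(−x/2) · Σ (x/2)^k / k! for k from 0 to df/2 − 1, so it can be checked without a table. The sketch below assumes even df only; the function name `chi2_upper_tail_even_df` is illustrative.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    term = 1.0    # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Statistic and degrees of freedom from the wine/music example.
p_value = chi2_upper_tail_even_df(18.2688, 4)
print(p_value)
```

For the wine and music example this gives roughly 0.00109, in line with the P-value on the slide and well below α = 0.05.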

Page 9: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

Independence

In a sense, the Test for Homogeneity can be used to determine whether or not one categorical variable has an effect on another. If the goal of our analysis is to determine whether there is an association between two categorical variables, we call the test a Test for Independence. If one variable is affecting the other, then we would expect to see differences between the distributions of counts.

Page 10: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

The null and alternative hypotheses in a test for independence are:

H0: There is no association between the two categorical variables.
Ha: There is an association between the two categorical variables.

Page 11: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

Chi-Square procedures can be used for a test of homogeneity or a test of independence if all expected counts are at least 1 and at least 80% of the expected counts are 5 or greater.

If these conditions are met, the distribution of χ² will be approximately Chi-Square with df = (r – 1)(c – 1), where r is the number of rows and c is the number of columns.
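The conditions and the degrees of freedom are easy to check mechanically once the expected counts are known. The sketch below does so for the wine and music table; the helper names are hypothetical, and the threshold is coded as "at least 5", which is an assumption about how the cutoff is meant (some texts phrase it as "greater than 5").

```python
def expected_counts(table):
    """Expected count per cell = row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def conditions_met(expected):
    """All expected counts >= 1, and at least 80% of them >= 5 (assumed cutoff)."""
    cells = [e for row in expected for e in row]
    all_at_least_1 = all(e >= 1 for e in cells)
    share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
    return all_at_least_1 and share_at_least_5 >= 0.8

def degrees_of_freedom(table):
    """df = (rows - 1) * (columns - 1), excluding the total row and column."""
    return (len(table) - 1) * (len(table[0]) - 1)

# Wine (rows) by music (columns), totals excluded.
observed = [[30, 39, 30], [11, 1, 19], [43, 35, 35]]
ok = conditions_met(expected_counts(observed))
df = degrees_of_freedom(observed)
print(ok, df)
```

The conditions apply to the expected counts, not the observed ones: the observed count of 1 for Italian wine with French music does not violate them, because its expected count is about 9.57.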

Page 12: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

Example: Smoking Habits of Students & Parents

How are the smoking habits of students and parents related? Do parents’ habits affect their child’s smoking habits? Consider the following data from eight high schools in Arizona and perform a test for independence. Expected counts are shown in parentheses.

                        Student Smokes   Student Does NOT Smoke   Total
Both Parents Smoke      400 (332.49)     1380 (1447.51)            1780
One Parent Smokes       416 (418.22)     1823 (1820.78)            2239
Neither Parent Smokes   188 (253.29)     1168 (1102.71)            1356
Total                   1004             4371                      5375

Page 13: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

Hypotheses:
H0: There is no association between parent and child smoking behavior.
Ha: There is an association between parent and child smoking behavior.

Conditions: Since we do not know if we have an SRS, we must proceed with caution. All expected counts are greater than 5. We will proceed with a χ² test of independence.

Sampling distribution of χ²: df = (3 – 1)(2 – 1) = 2 × 1 = 2

χ² = 37.5663

p = P(χ² > 37.5663) = 6.9594 × 10⁻⁹ ≈ 0

p < α, so reject H0.

Conclusion: There is significant evidence (α = 0.05) of an association between parent and child smoking behavior.
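The whole smoking example can be reproduced in a few lines as a check on the arithmetic. This is a sketch, not the slides' own method: it uses the fact that for df = 2 the upper-tail probability of the chi-square distribution is simply exp(−x/2), so no table or special library is needed.

```python
import math

# Rows: both parents smoke, one parent smokes, neither parent smokes.
# Columns: student smokes, student does not smoke.
observed = [
    [400, 1380],
    [416, 1823],
    [188, 1168],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - exp) ** 2 / exp

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability; this shortcut is valid only when df == 2.
p_value = math.exp(-chi2 / 2)

print(round(chi2, 2), df, p_value)
```

The statistic comes out near 37.57 with df = 2 and a P-value on the order of 10⁻⁹. If SciPy happens to be installed, `scipy.stats.chi2_contingency` should report matching values for a table of this size.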

Page 14: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

Example:

Because of the stressful working environment, employees at Company X are prone to criminal activities. The following data represent the number of various types of crimes by gender in a random sample of 750 wayward employees at Company X.

Does the evidence suggest that gender is independent of type of crime at a 0.05 significance level?

Observed counts:

Gender   Personal Assault   Property Damage   Drug Abuse   Public Disorder
Female   24                 85                7            28
Male     97                 367               39           103

Observed counts with expected counts in parentheses:

Gender   Personal Assault   Property Damage   Drug Abuse    Public Disorder   Total
Female   24 (23.232)        85 (86.784)       7 (8.832)     28 (25.152)         144
Male     97 (97.768)        367 (365.22)      39 (37.168)   103 (105.85)        606
Total    121                452               46            131                 750

Page 15: 13.2 Chi-Square Test for Homogeneity & Independence AP Statistics

H0: There is NO association between gender and type of crime.
Ha: There is an association between gender and type of crime.

We will assume we have an SRS. Since all expected counts are greater than 5, we will proceed with a χ² test for Independence.

df = (4 – 1)(2 – 1) = 3

χ² = 0.9462

p = P(χ² > 0.9462) = 0.8143

p > α, so do NOT reject H0.

There is NOT significant evidence to suggest an association between gender and type of crime (α = 0.05).
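For an odd number of degrees of freedom the tail area involves the complementary error function. With df = 3 it is erfc(√(x/2)) + 2·√(x/(2π))·exp(−x/2). The sketch below applies that identity to the gender and crime table; the function name `chi2_upper_tail_df3` is hypothetical and the function handles df = 3 only.

```python
import math

# Rows: Female, Male.
# Columns: personal assault, property damage, drug abuse, public disorder.
observed = [
    [24, 85, 7, 28],
    [97, 367, 39, 103],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - exp) ** 2 / exp

def chi2_upper_tail_df3(x):
    """P(chi-square with 3 degrees of freedom > x)."""
    half = x / 2.0
    return math.erfc(math.sqrt(half)) + 2.0 * math.sqrt(half / math.pi) * math.exp(-half)

p_value = chi2_upper_tail_df3(chi2)
print(round(chi2, 4), round(p_value, 4))
```

A statistic this close to zero means the observed counts sit almost exactly on the expected counts, which is why the P-value is so large.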