Test of Homogeneity, Lecture 45, Section 14.4, Wed, Apr 19, 2006


Upload: oscar-simmons

Post on 05-Jan-2016


TRANSCRIPT

Page 1: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Test of Homogeneity

Lecture 45

Section 14.4

Wed, Apr 19, 2006

Page 2: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Homogeneous Populations

Two distributions are called homogeneous if they exhibit the same proportions within categories.

For example, if two colleges' student bodies are each 55% female and 45% male, then the distributions are homogeneous.

Page 3: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Example

Suppose a teacher teaches two sections of Statistics and uses two different teaching methods.

At the end of the semester, he gives both sections the same final exam and he compares the grade distributions.

He wants to know if the differences that he observes are significant.

Page 4: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Example

Does there appear to be a difference?

Or are the two sets (plausibly) homogeneous?

            A    B    C    D    F
Method I    5    7   36   17    7
Method II   7   11   18    7    5

Page 5: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

The Test of Homogeneity

The null hypothesis is that the populations are homogeneous.

The alternative hypothesis is that the populations are not homogeneous.

H0: The populations are homogeneous.

H1: The populations are not homogeneous.

Notice that H0 does not specify a distribution; it just says that whatever the distribution is, it is the same in all rows.

Page 6: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

The Test Statistic

The test statistic is the chi-square statistic, computed as

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed count and E is the expected count in each cell.

The question now is, how do we compute the expected counts?
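As a minimal sketch of the statistic itself, assuming the expected counts are already in hand, the sum can be written in a few lines of Python. The function name `chi_square_statistic` and the toy counts are illustrative only and do not come from the lecture.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all cells, given as flat sequences."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Toy counts, chosen only to illustrate the arithmetic (not lecture data).
observed = [10, 20, 30]
expected = [15, 15, 30]
stat = chi_square_statistic(observed, expected)
print(round(stat, 4))
```

Each cell contributes more to the total the further its observed count lies from its expected count, relative to the size of the expected count.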

Page 7: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Expected Counts

Under the assumption of homogeneity (H0), the rows should exhibit the same proportions.

We can get the best estimate of those proportions by pooling the rows.

That is, add the rows (i.e., find the column totals), and then compute the column proportions from them.

Page 8: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Row and Column Proportions

                  A     B     C     D     F
Method I          5     7    36    17     7
Method II         7    11    18     7     5

Page 9: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Row and Column Proportions

                  A     B     C     D     F
Method I          5     7    36    17     7
Method II         7    11    18     7     5
Col Total        12    18    54    24    12

Page 10: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Row and Column Proportions

                  A     B     C     D     F
Method I          5     7    36    17     7
Method II         7    11    18     7     5
Col Total        12    18    54    24    12
Col Proportion   10%   15%   45%   20%   10%
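The pooling step can be sketched in Python as follows. The counts are the grade data from the table; the variable names are illustrative assumptions, not part of the lecture.

```python
# Observed grade counts from the example (columns A, B, C, D, F).
grades = ["A", "B", "C", "D", "F"]
observed = {
    "Method I": [5, 7, 36, 17, 7],
    "Method II": [7, 11, 18, 7, 5],
}

# Pool the rows: add them column by column to get the column totals.
col_totals = [sum(column) for column in zip(*observed.values())]
grand_total = sum(col_totals)

# Column proportions are the pooled estimate of the common distribution.
col_props = [total / grand_total for total in col_totals]

for grade, total, prop in zip(grades, col_totals, col_props):
    print(f"{grade}: total {total}, proportion {prop:.0%}")
```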

Page 11: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Expected Counts

Similarly, the columns should exhibit the same proportions, so we can get the best estimate by pooling the columns.

That is, add the columns (i.e., find the row totals), and then compute the row proportions from them.

Page 12: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Row and Column Proportions

                  A     B     C     D     F
Method I          5     7    36    17     7
Method II         7    11    18     7     5
Col Total        12    18    54    24    12
Col Proportion   10%   15%   45%   20%   10%

Page 13: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Row and Column Proportions

                  A     B     C     D     F   Row Total
Method I          5     7    36    17     7      72
Method II         7    11    18     7     5      48
Col Total        12    18    54    24    12
Col Proportion   10%   15%   45%   20%   10%

Page 14: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Row and Column Proportions

                  A     B     C     D     F   Row Total   Row Proportion
Method I          5     7    36    17     7      72            60%
Method II         7    11    18     7     5      48            40%
Col Total        12    18    54    24    12
Col Proportion   10%   15%   45%   20%   10%

Page 15: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Row and Column Proportions

                  A     B     C     D     F   Row Total   Row Proportion
Method I          5     7    36    17     7      72            60%
Method II         7    11    18     7     5      48            40%
Col Total        12    18    54    24    12     120 (grand total)
Col Proportion   10%   15%   45%   20%   10%

Page 16: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Expected Counts

Now apply the appropriate row and column proportions to each cell to get the expected count.

Let's use the upper-left cell as an example.

According to the row and column proportions, it should contain 60% of 10% of the grand total of 120.

That is, the expected count is 0.60 × 0.10 × 120 = 7.2.
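The same arithmetic for the upper-left cell, written as a short Python sketch. The totals are taken from the table; the variable names are illustrative assumptions.

```python
# Row and column totals from the grade table.
row_totals = {"Method I": 72, "Method II": 48}
col_totals = {"A": 12, "B": 18, "C": 54, "D": 24, "F": 12}
grand_total = sum(row_totals.values())  # 120

row_prop = row_totals["Method I"] / grand_total  # 60% of students are in Method I
col_prop = col_totals["A"] / grand_total         # 10% of students earned an A

# Under homogeneity, the cell should hold row_prop of col_prop of everyone.
expected_upper_left = row_prop * col_prop * grand_total
print(round(expected_upper_left, 1))  # 7.2
```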

Page 17: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Expected Counts

Notice that this can be obtained more quickly by the following formula.

$$\text{Expected count} = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$$

In the upper-left cell, this formula produces

(72 × 12)/120 = 7.2

Page 18: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Expected Counts

Apply that formula to each cell to find the expected counts and add them to the table. Each cell shows the observed count followed by the expected count in parentheses.

            A          B           C            D            F
Method I    5 (7.2)    7 (10.8)    36 (32.4)    17 (14.4)    7 (7.2)
Method II   7 (4.8)    11 (7.2)    18 (21.6)    7 (9.6)      5 (4.8)
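A minimal Python sketch of applying the row-total × column-total / grand-total formula to every cell at once. The function name `expected_counts` is a hypothetical helper, not something named in the lecture.

```python
def expected_counts(table):
    """Return the expected count for each cell of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Observed grade counts: rows are Method I and Method II, columns A, B, C, D, F.
observed = [[5, 7, 36, 17, 7],
            [7, 11, 18, 7, 5]]

expected = expected_counts(observed)
for row in expected:
    print([round(e, 1) for e in row])
```

The expected counts in each row add up to that row's total, and likewise for each column, which is a quick sanity check on the table.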

Page 19: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

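The Test Statistic

Now compute χ² in the usual way.

$$
\begin{aligned}
\chi^2 &= \frac{(5-7.2)^2}{7.2} + \frac{(7-10.8)^2}{10.8} + \frac{(36-32.4)^2}{32.4} + \frac{(17-14.4)^2}{14.4} + \frac{(7-7.2)^2}{7.2} \\
&\quad + \frac{(7-4.8)^2}{4.8} + \frac{(11-7.2)^2}{7.2} + \frac{(18-21.6)^2}{21.6} + \frac{(7-9.6)^2}{9.6} + \frac{(5-4.8)^2}{4.8} \\
&= 7.2106
\end{aligned}
$$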
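The ten-term sum can be checked with a short Python sketch. The observed and expected counts are the ones from the grade table; the variable names are illustrative.

```python
# Observed and expected counts for the grade example (columns A, B, C, D, F).
observed = [[5, 7, 36, 17, 7],
            [7, 11, 18, 7, 5]]
expected = [[7.2, 10.8, 32.4, 14.4, 7.2],
            [4.8, 7.2, 21.6, 9.6, 4.8]]

# One (O - E)^2 / E term per cell, in row order.
terms = [
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
]

chi2 = sum(terms)
print(round(chi2, 4))  # 7.2106
```

Printing `terms` shows which cells drive the statistic; here the largest contributions come from the B column.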

Page 20: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Degrees of Freedom

The number of degrees of freedom is

df = (no. of rows - 1) × (no. of cols - 1).

In our example, df = (2 - 1) × (5 - 1) = 4.

To find the p-value, calculate

χ²cdf(7.2106, E99, 4) = 0.1252.

Since 0.1252 > 0.05, at the 5% level of significance the differences are not statistically significant.
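For readers without a TI-83, the upper-tail area can be sketched in plain Python. This sketch relies on a closed form for the chi-square upper tail that holds only when df is even (as it is here, df = 4); the function name `chi2_sf_even_df` is a hypothetical helper, not a library call.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, even df only.

    Uses the identity exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is exact when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


rows, cols = 2, 5
df = (rows - 1) * (cols - 1)
p_value = chi2_sf_even_df(7.2106, df)
print(df, round(p_value, 4))  # 4 0.1252
```

The upper limit E99 in the calculator command stands in for infinity, so the calculator result and this tail probability describe the same area.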

Page 21: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

TI-83 – Test of Homogeneity

The tables in these examples are not lists, so we can't use the lists in the TI-83.

Instead, the tables are matrices.

The TI-83 can handle matrices.

Page 22: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

TI-83 – Test of Homogeneity

Enter the observed counts into a matrix.
  Press MATRIX.
  Select EDIT.
  Use the arrow keys to select the matrix to edit, say [A].
  Press ENTER to edit that matrix.
  Enter the number of rows and columns. (Press ENTER to advance.)
  Enter the observed counts in the cells.
  Press 2nd Quit to exit the matrix editor.

Page 23: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

TI-83 – Test of Homogeneity

Perform the test of homogeneity.
  Select STAT > TESTS > χ²-Test…
  Press ENTER.
  Enter the name of the matrix of observed counts.
  Enter the name (e.g., [E]) of a matrix for the expected counts. These will be computed for you by the TI-83.
  Select Calculate.
  Press ENTER.

Page 24: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

TI-83 – Test of Homogeneity

The window displays
  The title "χ²-Test".
  The value of χ².
  The p-value.
  The number of degrees of freedom.

See the matrix of expected counts.
  Press MATRIX.
  Select matrix [E].
  Press ENTER.
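The calculator workflow above (observed matrix in; χ², p-value, df, and an expected-count matrix out) can be mirrored in a standard-library Python sketch. The names `chi2_sf` and `chi_square_test` are hypothetical helpers written for this illustration; they are not TI-83 commands or library functions. The tail probability is built from the standard recurrence for the regularized upper incomplete gamma function, starting from exp(-z) for even df or erfc(sqrt(z)) for odd df.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)             # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # Q(1/2, z)
    # Step up with Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_test(observed):
    """Return (statistic, p-value, df, expected matrix) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, chi2_sf(stat, df), df, expected


# Matrix [A]: the grade counts for Method I and Method II.
stat, p_value, df, expected = chi_square_test([[5, 7, 36, 17, 7],
                                               [7, 11, 18, 7, 5]])
print(f"chi-square = {stat:.4f}, p = {p_value:.4f}, df = {df}")
```

The returned `expected` list plays the role of matrix [E] on the calculator.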

Page 25: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Example

Is the color distribution in Skittles candy the same as in M & M candy?

One package of Skittles:
  Red: 12
  Orange: 14
  Yellow: 10
  Green: 10
  Purple: 12

Page 26: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Example

One package of M & Ms:
  Red: 8
  Orange: 19
  Yellow: 4
  Green: 8
  Blue: 10
  Brown: 6

Page 27: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

The Table

            Red   Orange   Yellow   Green   Brown
Skittles     12       14       10      10      12
M & Ms        8       19        4       8       6

Page 28: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

The Table

            Red         Orange      Yellow     Green       Brown
Skittles    12 (11.3)   14 (18.6)   10 (7.9)   10 (10.1)   12 (10.1)
M & Ms      8 (8.7)     19 (14.4)   4 (6.1)    8 (7.9)     6 (7.9)

df = 4
χ² = 4.787
p-value = 0.3099

Page 29: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Example

Let's gather more evidence. Buy a second package of Skittles and add it to the first package.

Second package of Skittles:
  Red: 10
  Orange: 13
  Yellow: 15
  Green: 13
  Purple: 7

Page 30: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

Example

Buy a second package of M & Ms and add it to the first package.

Second package of M & Ms:
  Red: 5
  Orange: 12
  Yellow: 16
  Green: 9
  Blue: 8
  Brown: 8

Page 31: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

The Table

            Red   Orange   Yellow   Green   Brown
Skittles     22       27       25      23      19
M & Ms       13       31       20      17      14

Page 32: Test of Homogeneity Lecture 45 Section 14.4 Wed, Apr 19, 2006

The Table

            Red         Orange      Yellow      Green       Brown
Skittles    22 (19.2)   27 (31.9)   25 (24.7)   23 (22.0)   19 (18.1)
M & Ms      13 (15.8)   31 (26.1)   20 (20.3)   17 (18.0)   14 (14.9)

df = 4
χ² = 2.740
p-value = 0.6022
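The effect of pooling the two packages can be reproduced with a short Python sketch. The counts are those given in the slides, with the fifth column taken as it appears in the tables (the Purple count for Skittles and the Brown count for M & Ms; Blue does not appear in the tables). The helper names `chi_square` and `p_value_df4` are illustrative assumptions, and the p-value helper uses a closed form that is valid only for df = 4.

```python
import math

# Counts per package in table order: Red, Orange, Yellow, Green, fifth column.
skittles_1 = [12, 14, 10, 10, 12]
skittles_2 = [10, 13, 15, 13, 7]
mms_1 = [8, 19, 4, 8, 6]
mms_2 = [5, 12, 16, 9, 8]


def chi_square(table):
    """Chi-square statistic for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (o - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_totals)
        for o, c in zip(row, col_totals)
    )


def p_value_df4(stat):
    """Upper-tail chi-square probability, valid for df = 4 only."""
    half = stat / 2.0
    return math.exp(-half) * (1.0 + half)


one_package = [skittles_1, mms_1]
two_packages = [
    [a + b for a, b in zip(skittles_1, skittles_2)],
    [a + b for a, b in zip(mms_1, mms_2)],
]

stat_one = chi_square(one_package)
stat_two = chi_square(two_packages)
print(f"one package:  chi-square = {stat_one:.3f}, p = {p_value_df4(stat_one):.4f}")
print(f"two packages: chi-square = {stat_two:.3f}, p = {p_value_df4(stat_two):.4f}")
```

With the larger sample the statistic falls and the p-value rises, so the additional data give even less evidence against homogeneity.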