AP Stats Section 13.1: Test for Goodness of Fit (aka χ²)

AP STATS SECTION 13.1

Test for Goodness of Fit (aka χ²)

What is Chi-Squared?

Definition:

1) A single test that can be applied to see if the observed sample distribution is different from the hypothesized population distribution.

2) A test to compare several proportions at the same time.

What is Chi-Squared?

Shape:

Degrees of freedom = number of categories – 1.

Only positive values (the statistic is built from squared differences, so it can never be negative).

Skewed to the right.

As the number of categories increases, the shape becomes less skewed (see pg. 708).

Area under the curve is still 1.
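To put numbers on the "less skewed" point, here is a small Python sketch. It leans on one standard fact that is not on the slide: the skewness of a chi-squared distribution is sqrt(8/df). The function names are made up for illustration.

```python
import math

def degrees_of_freedom(num_categories):
    # Rule from the slide: df = number of categories - 1
    return num_categories - 1

def chi2_skewness(df):
    # Standard result (an assumption here, not from the slide):
    # the skewness of a chi-squared distribution is sqrt(8 / df)
    return math.sqrt(8 / df)

# More categories -> more degrees of freedom -> smaller (but still positive) skewness
for k in (2, 4, 6, 11, 31):
    df = degrees_of_freedom(k)
    print(f"{k:>2} categories -> df = {df:>2}, skewness = {chi2_skewness(df):.2f}")
```

The skewness shrinks as df grows but never reaches zero, which matches the picture of a right-skewed curve that slowly becomes more symmetric.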

Goodness of Fit Test

Let’s look at how things are different with this test…

Hypothesis:

Ho: observed percents = expected percents

Ha: observed percents ≠ expected percents

**These must be written in context!

Test Stat:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O = observed counts and E = expected counts.
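Written out term by term, the test statistic adds one piece per category. The label k for the number of categories and the subscripts are introduced here for illustration; they are not on the slide.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
       = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \cdots + \frac{(O_k - E_k)^2}{E_k},
\qquad \text{df} = k - 1
```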


Assumptions:

All expected counts are at least 5.

Independent (if needed).

SRS.

P-Value:

Option #1: Table E (pg. 842); look for χ² in the row for your degrees of freedom.

Option #2: Calculator value (sort of).

Conclusions: Stay the same.


Let’s go through an example (pg. 703) with PHANTOMS.

P = Parameter of interest is χ².

H =

Ho: the 1996 age group distribution = the 1980 age group distribution

Ha: the 1996 age group distribution ≠ the 1980 age group distribution

A = We will come back to this in a minute…


We need to organize our data to continue:

Category       Observed   Expected   (O-E)²/E
0-24 years     177
25-44 years    158
45-64 years    101
65+            64

Now we need to figure out what the expected values are.

Look at 0-24 years first. If the two years have the same distribution, then 41.39% of the sample of 500 should fall in this group: 500(0.4139) = 206.95
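The expected-count step can be sketched in a few lines of Python. The only inputs are the observed counts and the 41.39% figure from the slide; the variable names are illustrative, and the other age groups' 1980 percentages are not given here, so only the first group is computed.

```python
# Observed 1996 counts from the table
observed = {"0-24": 177, "25-44": 158, "45-64": 101, "65+": 64}

# Total sample size is the sum of the observed counts
n = sum(observed.values())

# Hypothesized share of the 0-24 group, written as a proportion (41.39% -> 0.4139)
p_0_24 = 0.4139

# Expected count = sample size x hypothesized proportion
expected_0_24 = n * p_0_24

print(n, round(expected_0_24, 2))
```

Note that the percentage has to be converted to a proportion before multiplying; using 41.39 directly would give an expected count far larger than the sample.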


Here are all of the expected counts and calculations:

Category       Observed   Expected   (O-E)²/E
0-24 years     177        206.95     4.33
25-44 years    158        138.4      2.78
45-64 years    101        98.2       0.08
65+            64         56.4       1.02

So now we can go back to the assumptions:

SRS – says so in the problem on pg. 703.

Independence – not needed here.

All expected counts are at least 5.
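The expected-count condition is easy to check by eye here, but a quick sketch shows the idea. It also adds a sanity check that is not on the slide: the expected counts should add up to the sample size, give or take rounding.

```python
observed = [177, 158, 101, 64]
expected = [206.95, 138.4, 98.2, 56.4]

# Condition from the slide: every expected count must be at least 5
all_at_least_5 = all(e >= 5 for e in expected)
smallest = min(expected)

# Extra sanity check (an addition, not from the slide):
# expected counts should total the sample size, up to rounding
gap = abs(sum(expected) - sum(observed))

print(all_at_least_5, smallest, round(gap, 2))
```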


N = χ² goodness of fit test

T = Now we calculate the value of χ² by adding up the last column of the table.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Category       Observed   Expected   (O-E)²/E
0-24 years     177        206.95     4.33
25-44 years    158        138.4      2.78
45-64 years    101        98.2       0.08
65+            64         56.4       1.02

So our χ² value is: 4.33 + 2.78 + 0.08 + 1.02 = 8.21
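Here is a minimal Python sketch of the same arithmetic, using the observed and expected counts from the table. The function name is illustrative.

```python
categories = ["0-24", "25-44", "45-64", "65+"]
observed = [177, 158, 101, 64]
expected = [206.95, 138.4, 98.2, 56.4]

def chi_square_statistic(obs, exp):
    # Sum of (O - E)^2 / E over all categories
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

# One contribution per category (the last column of the table)
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

chi2 = chi_square_statistic(observed, expected)
df = len(categories) - 1  # number of categories - 1

for name, c in zip(categories, contributions):
    print(f"{name:>6}: {c:.2f}")
print(f"chi-squared = {chi2:.2f}, df = {df}")
```

Looking at the individual contributions is worthwhile: the 0-24 group accounts for more than half of the total, so that is where 1996 differs most from 1980.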


O = To find the p-value, go to Table E with df = 3 and χ² = 8.21. From Table E, our p-value is between .025 and .05.
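The table lookup can be mimicked in Python. The critical values below are the standard rounded chi-squared values for df = 3; they are assumed to match the textbook's Table E rather than copied from it, and the function name is hypothetical.

```python
# Rounded chi-squared critical values for df = 3, as (tail probability, critical value).
# Standard values, assumed (not verified) to match the textbook's Table E.
CRITICAL_DF3 = [(0.10, 6.25), (0.05, 7.81), (0.025, 9.35), (0.01, 11.34), (0.005, 12.84)]

def bracket_p_value(chi2, row):
    """Return (lower, upper) bounds on the p-value from one table row.

    The row must be ordered by decreasing tail probability.
    """
    lower, upper = 0.0, 1.0
    for tail_prob, critical in row:
        if chi2 >= critical:
            upper = tail_prob   # statistic is beyond this critical value
        else:
            lower = tail_prob   # first critical value we fall short of
            break
    return lower, upper

print(bracket_p_value(8.21, CRITICAL_DF3))
```

Since 8.21 sits between 7.81 and 9.35, the p-value is bracketed by the tail probabilities for those two columns.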

M = If we still assume α = .10, we would reject Ho, since the p-value is less than α.

S = There is evidence that the age distribution for 1996 is not the same as the age distribution for 1980.

Calculator Issues

There is not really a function for the goodness of fit test. Here is what you can do…

First, plug the observed counts into list 1 and the expected counts into list 2.

In list 3, define it as (L1-L2)²/L2.

Now go to List - Math - #5 (sum) and sum up list 3 to get χ².

Now go to Distr - #7 (χ²cdf) and hit enter.

In the home screen, you have to plug in the following three items: (χ², big number, df). For the big number, 100000 is good.

The number you get is the more accurate p-value.

So now you see why I think that Table E is faster…
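For anyone without the calculator handy, the same upper-tail area can be approximated in plain Python. This is a sketch, not the calculator's actual algorithm: it uses a textbook series for the incomplete gamma function, and the function name is made up. It should be fine for moderate χ² values like the one here but is not tuned for extreme inputs.

```python
import math

def chi2_upper_tail(x, df):
    """Area to the right of x under the chi-squared curve with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z):
    # P(a, z) = z^a * e^(-z) / Gamma(a) * sum over n >= 0 of z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    for n in range(1, 10_000):
        term *= z / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    lower_area = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - lower_area

# Plays the role of chi2cdf(8.21, big number, 3) on the calculator
p_value = chi2_upper_tail(8.21, 3)
print(f"p-value = {p_value:.4f}")
```

The result should land inside the .025 to .05 range that Table E gives for df = 3 and χ² = 8.21.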
