AP STATS SECTION 13.1
Test for Goodness of Fit (aka χ2)
What is Chi-Squared?
Definition:
1) A single test that can be applied to see if the observed sample distribution is different from the hypothesized population distribution.
2) A test to compare several proportions at the same time.
What is Chi-Squared?
Shape:
Degrees of freedom = number of categories – 1.
Only positive values (no such thing as a negative proportion).
Skewed to the right.
As the number of categories increases, the shape becomes less skewed (see pg. 708).
Area under the curve is still 1.
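To put numbers on the "less skewed" point, here is a small Python sketch. It relies on the standard result that a chi-square distribution with df degrees of freedom has skewness sqrt(8/df); the function name and the category counts in the loop are only illustrative and are not from the slides.

```python
import math

def chi_square_skewness(df):
    """Skewness of a chi-square distribution with df degrees of freedom."""
    return math.sqrt(8 / df)

# Illustrative category counts (assumed, not from the textbook).
for categories in (2, 4, 9, 33):
    df = categories - 1  # degrees of freedom = number of categories - 1
    print(categories, "categories, df =", df,
          "skewness =", round(chi_square_skewness(df), 2))
```

The skewness stays positive (right-skewed) but shrinks as the number of categories grows.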
Goodness of Fit Test
Let’s look at how things are different with this test…
Hypothesis:
Ho: observed percents = expected percents
Ha: observed percents ≠ expected percents
**These must be written in context!
Test Stat: $\chi^2 = \sum \frac{(O - E)^2}{E}$
where O = observed counts and E = expected counts.
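As a rough illustration of the test statistic, the sum of (O − E)²/E over all categories can be sketched in Python. The function name and the three-category counts below are made up for this example and are not from the textbook.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up counts for three categories (hypothetical data).
made_up_observed = [18, 22, 20]
made_up_expected = [20, 20, 20]

# (4 + 4 + 0) / 20 = 0.4
print(chi_square_statistic(made_up_observed, made_up_expected))
```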
Goodness of Fit Test
Assumptions:
All expected counts are at least 5.
Independent (if needed)
SRS
P-Value:
Option #1: Table E (pg. 842); look for χ2 in the row for your degrees of freedom.
Option #2: Calculator Value (sort of)
Conclusions: Stay the same.
Goodness of Fit Test
Let’s go through an example (pg. 703) with PHANTOMS.
P = Parameter of interest is χ2.
H =
Ho: the 1996 age group distribution = the 1980 age group distribution
Ha: the 1996 age group distribution ≠ the 1980 age group distribution
A = We will come back to this in a minute…
Goodness of Fit Test
We need to organize our data to continue:
Now we need to figure out what the expected values are.
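Look at 0 – 24 years first. If the distributions are equal, the expected count is the sample size times the hypothesized proportion (41.39% written as a decimal): 500(0.4139) = 206.95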
Category       Observed   Expected   (O-E)^2/E
0-24 years     177
25-44 yrs      158
45-64 yrs      101
65+            64
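The expected-count step for the first category can be sketched in Python as follows. The observed counts and the 41.39% figure come from the slide; the variable names are just for illustration.

```python
# Observed counts from the table.
observed_1996 = {"0-24": 177, "25-44": 158, "45-64": 101, "65+": 64}

# The sample size is the total of the observed counts.
sample_size = sum(observed_1996.values())

# 41.39% must be written as a decimal before multiplying.
proportion_0_24 = 0.4139

# Expected count = sample size * hypothesized proportion.
expected_0_24 = sample_size * proportion_0_24
print(sample_size, round(expected_0_24, 2))
```

The other three expected counts would be found the same way from their own hypothesized percents.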
Goodness of Fit Test
Here are all of the expected counts and calculations:
Category       Observed   Expected   (O-E)^2/E
0-24 years     177        206.95     4.33
25-44 yrs      158        138.4      2.78
45-64 yrs      101        98.2       .08
65+            64         56.4       1.02
So now we can go back to the assumptions:
SRS – says so in the problem on pg. 703.
Independence – not needed here.
All expected counts are at least 5.
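The expected-count condition and the last column of the table can be double-checked with a short Python sketch. The counts are taken from the table; the variable names are illustrative.

```python
# Observed and expected counts from the table, in category order.
observed = [177, 158, 101, 64]
expected = [206.95, 138.4, 98.2, 56.4]

# Assumption check: every expected count must be at least 5.
all_at_least_5 = all(e >= 5 for e in expected)

# Each category's contribution (O - E)^2 / E, rounded to two places.
contributions = [round((o - e) ** 2 / e, 2) for o, e in zip(observed, expected)]

print(all_at_least_5)
print(contributions)
```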
Goodness of Fit Test
N = χ2 goodness of fit test
T = Now we calculate the value of χ2 by adding up the last column of the table:
$\chi^2 = \sum \frac{(O - E)^2}{E} = 4.33 + 2.78 + 0.08 + 1.02 = 8.21$
So our χ2 value is 8.21.
Goodness of Fit Test
O = To find the p-value, go to Table E with df = 3 and χ2 = 8.21. From Table E, our p-value is between .025 and .05.
M = If we still assume α = .10, the p-value is less than α, so we would reject the Ho.
S = There is evidence that the age distribution for 1996 is not the same as the age distribution for 1980.
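For anyone who wants to check the Table E range by computer, here is a minimal Python sketch of the upper-tail area of the chi-square distribution. It uses a standard series for the regularized incomplete gamma function rather than any textbook or calculator routine; the function name chi2_sf is an assumption for this example.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(X > x) for a chi-square distribution with df degrees of freedom.

    Uses the series P(a, z) = e^(-z) z^a * sum_n z^n / Gamma(a + 1 + n),
    with a = df / 2 and z = x / 2, then returns 1 - P(a, z).
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    # First term of the series, computed on the log scale for stability.
    term = math.exp(-z + a * math.log(z) - math.lgamma(a + 1))
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return 1.0 - total

p_value = chi2_sf(8.21, 3)   # chi-square = 8.21 with df = 3
alpha = 0.10
print(round(p_value, 4), "reject Ho" if p_value < alpha else "fail to reject Ho")
```

The result should land inside the .025 to .05 range read from Table E.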
Calculator Issues
There is not really a function for the goodness of fit test. Here is what you can do…
First, plug in the observed counts to list 1, and the expected counts to list 2.
In list 3, define it as (L1-L2)^2/L2.
Now go to List - Math - #5 (sum) and sum up list 3 to get χ2.
Now go to Distr - #7 (χ2cdf) and hit enter.
In the home screen, you have to plug in the following three items: (χ2, big number, df). For the big number, 100000 is good.
The number you get is the more accurate p-value.
So now you see why I think that Table E is faster…
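The list steps on the calculator can also be mirrored in Python as a rough sketch. The names L1, L2, and L3 are chosen only to match the calculator lists, and the counts are the ones from the age-distribution example.

```python
# L1 = observed counts, L2 = expected counts (from the example).
L1 = [177, 158, 101, 64]
L2 = [206.95, 138.4, 98.2, 56.4]

# L3 = (L1 - L2)^2 / L2, computed entry by entry.
L3 = [(o - e) ** 2 / e for o, e in zip(L1, L2)]

# Summing L3 gives the chi-square statistic.
chi_square = sum(L3)
print(round(chi_square, 2))
```

The χ2cdf step then finds the area under the chi-square curve from this value up to the big number, which is the p-value.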