AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Upload: juliette-wiswall

Post on 14-Dec-2015


TRANSCRIPT

Page 1: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

AP Statistics
Topic 7

Chi-Squared Tests
Hypothesis Tests for Linear Regression

Page 2: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

These are the last 2 things we’ll study

• Chi-squared tests
  – Goodness of Fit
  – Independence and Homogeneity
• Hypothesis tests for linear regression
  – The significance of the linear relationship
• We’ll spend one week on each

Page 3: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Chi-Squared Tests

• Analysis of categorical data
• The tests we’ll study are
  – Goodness of Fit test
  – Tests for homogeneity and independence
• The homogeneity and independence tests are performed exactly the same way
• For homogeneity, we look at two or more samples and one characteristic
• For independence, we look at one sample and two characteristics

Page 4: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Goodness of Fit Test

• Measures the extent to which some empirical distribution “fits” the distribution expected under the null hypothesis

[Figure: histogram of Frequency versus Fork length]

Page 5: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

For example

• GEICO Direct magazine had an interesting article concerning the percentage of teenage motor vehicle deaths and the time of day. The following percentages were given from a sample.

Time          %
12-3 AM      17
3-6 AM        8
6-9 AM        8
9 AM-noon     6
Noon-3 PM    10
3-6 PM       16
6-9 PM       15
9 PM-12 AM   19

Page 6: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

The Distribution and Hypothesis Statements

• Is the percentage of teenage motor vehicle deaths the same for each time period? Conduct a hypothesis test at the 1% level.

• Ho: The percent of teenage motor vehicle deaths is the same for each time period.

• Ha: The percent of teenage motor vehicle deaths is not the same for each time period.

Page 7: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Let’s look at this more closely

• In this problem, what type of data are we considering?

• Categorical data – that is, the time of day
• How many classes is our data divided into?
• 8 different classes

Page 8: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Put another way ….

• We want to see if the distribution of our data is consistent with the hypothesized distribution

• In this case, we want to see if the distribution of deaths is uniform across the 8 time periods – about 12.5% per period

Page 9: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

So how can we do this?

• We have our observed occurrences

Time        12-3 AM  3-6 AM  6-9 AM  9 AM-noon  Noon-3 PM  3-6 PM  6-9 PM  9 PM-12 AM
Count            17       8       8          6         10      16      15          19

• Are these consistent with our hypothesis?
• What should we compare with these?
• Expected values

Time        12-3 AM  3-6 AM  6-9 AM  9 AM-noon  Noon-3 PM  3-6 PM  6-9 PM  9 PM-12 AM
Observed         17       8       8          6         10      16      15          19
Expected      12.38   12.38   12.38      12.38      12.38   12.38   12.38       12.38
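A rough sketch of where the expected value of 12.38 appears to come from. It assumes the listed percentages are treated as counts; they add up to 99 rather than 100, so the uniform share works out to 99 / 8 instead of 12.5. The variable names are illustrative only.

```python
# Observed values from the GEICO table, in time-period order.
observed = [17, 8, 8, 6, 10, 16, 15, 19]

# Under the null hypothesis every period is equally likely, so each
# expected count is the total divided by the number of periods.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

print("total =", total)                          # the values sum to 99
print("expected per period =", expected[0])      # 12.375, shown as 12.38 on the slide
```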

Page 10: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Test Statistic

• Our test statistic is

$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

$$X^2 \sim \chi^2 \text{ with } k - 1 \text{ df}$$
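The statistic can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's routine; the function name `chi_square_statistic` is made up for this example, and the data are the GEICO values with a uniform null.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [17, 8, 8, 6, 10, 16, 15, 19]
expected = [sum(observed) / len(observed)] * len(observed)  # uniform null

x2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # k - 1 degrees of freedom
print(f"X^2 = {x2:.2f}, df = {df}")
```

With the unrounded expected count of 12.375 this gives roughly 13.73; the slides work with the rounded 12.38 and report 13.72.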

Page 11: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Chi-squared Distribution

• Family of curves identified by degrees of freedom (k-1)
• Mean = degrees of freedom
• Variance = 2(degrees of freedom)
• As degrees of freedom increase, curves approach normal
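One way to see the mean and variance facts is a small simulation. It relies on a standard result that is not on the slide: a chi-squared variable with df degrees of freedom behaves like the sum of df squared independent standard normal values. The seed and sample size below are arbitrary choices.

```python
import random
import statistics

def simulate_chi_square(df, draws, rng):
    """Each draw is the sum of `df` squared standard normal values."""
    return [sum(rng.gauss(0, 1) ** 2 for _ in range(df)) for _ in range(draws)]

rng = random.Random(7)  # fixed seed (arbitrary) so the run is repeatable
df = 7
sample = simulate_chi_square(df, draws=20_000, rng=rng)

print("sample mean    :", round(statistics.mean(sample), 2), "vs theory", df)
print("sample variance:", round(statistics.variance(sample), 2), "vs theory", 2 * df)
```

The simulated mean and variance should land close to 7 and 14, though not exactly on them.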

Page 12: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

How we’ll use the chi-squared distribution

• We’ll use the chi-squared distribution to determine our p-value

• If our test statistic is large, then we’ll reject the null hypothesis

P-value

Page 13: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

How can we find p-values?

• Calculate the chi-squared statistic
  – Use the chi-squared table
  – Use the chi-squared cdf function on your TI-83

Page 14: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

The Chi-square table

Page 15: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Graphing Calculator

• 2nd DIST
• χ²cdf(lower limit, upper limit, df)

$$P(X^2 \ge a) = \text{p-value}$$
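For readers without the calculator, the same upper-tail area can be sketched in plain Python. This is an assumption-laden stand-in, not the calculator's algorithm: `chi_square_upper_tail` is a hypothetical helper that uses the finite-sum formulas for integer degrees of freedom (Abramowitz and Stegun 26.4.4 and 26.4.5), and it is the same area as χ²cdf with the statistic as the lower limit and a very large upper limit.

```python
import math

def chi_square_upper_tail(x, df):
    """P(X^2 >= x) for a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^i / i! for i = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction sum
    chi = math.sqrt(x)
    normal_tail = math.erfc(chi / math.sqrt(2))           # 2 * P(Z >= chi)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)   # standard normal density at chi
    total, term = 0.0, chi
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return normal_tail + 2 * density * total

p_value = chi_square_upper_tail(13.72, 7)  # statistic and df from the GEICO example
print(f"p-value = {p_value:.3f}")
```

For a statistic of 13.72 with 7 degrees of freedom this should agree with the p-value of .056 reported later in the slides.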

Page 16: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Graphing Calculator

• STAT
• χ² GOF-Test
• Inputs
  – Observed data list
  – Expected data list
  – Degrees of freedom

Page 17: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Assumptions

• We have 2 assumptions for this test
  – First, the observed cell counts are based on a random sample (our sample is random)
  – Our sample is large.
• How do we determine large?
• Expected cell counts must all be greater than or equal to 5
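The large-sample rule is easy to check mechanically. A minimal sketch, with `large_sample_ok` as an illustrative name; the second list is a made-up case included only to show a failing check.

```python
def large_sample_ok(expected_counts, minimum=5):
    """True when every expected cell count is at least `minimum`."""
    return all(e >= minimum for e in expected_counts)

geico_expected = [99 / 8] * 8               # 12.375 in each of the 8 time periods
print(large_sample_ok(geico_expected))      # True

# Hypothetical expected counts with one value below 5.
print(large_sample_ok([20.0, 15.5, 4.2]))   # False
```

Note that the rule is applied to the expected counts, not the observed ones.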

Page 18: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Our conclusions?

• The same as we’ve always done
  – We reject or fail to reject the null based on a comparison of the p-value and our significance level
  – We interpret our conclusion in the context of our alternative hypothesis

Page 19: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Let’s summarize

• Use the same 9 steps for hypothesis testing
  – Identify the parameter
  – Null
  – Alternative
  – Choose significance level
  – Test Statistic
  – Assumptions
  – Calculate Test Statistic
  – Determine P-value
  – Make your conclusion

Page 20: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Let’s finish the Geico Problem

• Let’s identify the parameter
  – Proportion of teenage motor vehicle deaths in each time period
• Null Hypothesis
  – Ho: The percent of teenage motor vehicle deaths is the same for each time period.
• Alternative Hypothesis
  – Ha: The percent of teenage motor vehicle deaths is not the same for each time period.
• Significance level
  – α = .01

Page 21: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Continuing …

• Test Statistic

$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

• Assumptions:
  – The sample is random
  – The sample is large

Page 22: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Continuing …

• Calculate the Test Statistic

Time        12-3 AM  3-6 AM  6-9 AM  9 AM-noon  Noon-3 PM  3-6 PM  6-9 PM  9 PM-12 AM
Observed         17       8       8          6         10      16      15          19
Expected      12.38   12.38   12.38      12.38      12.38   12.38   12.38       12.38

$$X^2 = 13.72$$

P-value = .056

Page 23: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Conclusion

• We fail to reject the null hypothesis because the p-value (.056) is greater than the significance level (.01).

• The data do not provide convincing evidence that the percentage of teenage motor vehicle deaths differs among the time periods.

Page 24: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Homework 7-1

• Read section 12.1 in the textbook
• 12.10
• 12.12
• 12.14

Page 25: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Let’s try this for some practice

• Using a χ² GOF test, investigate whether it’s reasonable to assume the random number table is random. Use a significance level of .05.
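A sketch of how this practice problem could be set up. The slide does not include the random number table, so the digits below are stand-in data produced by Python's own generator (an assumption, clearly not the textbook's table). The critical value 16.919 is the standard table entry for 9 degrees of freedom and an upper-tail area of .05.

```python
import random
from collections import Counter

# Stand-in data: 200 digits from Python's generator, NOT the textbook's table.
rng = random.Random(2015)
digits = [rng.randrange(10) for _ in range(200)]

counts = Counter(digits)
observed = [counts.get(d, 0) for d in range(10)]
expected = len(digits) / 10           # 20 per digit if all digits are equally likely

x2 = sum((o - expected) ** 2 / expected for o in observed)
critical = 16.919                     # chi-squared table, df = 9, upper-tail area .05

decision = "reject Ho" if x2 > critical else "fail to reject Ho"
print("X^2 =", round(x2, 2), "->", decision)
```

To work the actual problem, replace `digits` with digits read from the table; the rest of the steps stay the same.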

Page 26: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Tests for Homogeneity and Independence

• In these tests we’ll be taking n samples and looking at one characteristic.
  – Take samples of 1000 people from 4 different countries and ask how they feel about whether the use of torture against suspected terrorists is justified.
  – In this case we’d like to see if the responses are distributed equally (homogeneous) among the countries.
• Or, we’ll take one sample and look at two characteristics.
  – Take a sample of 300 adults and determine each person’s political philosophy and what television news station they watch
  – In this case we’d like to see if political philosophy and news station are independent.

Page 27: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Let’s do an example of each

• First, let’s do a test for independence

Big Office is a chain of large office supply stores that sell an extensive line of desktop and laptop computers. Company executives want to know whether the demands for these types of computers are related in any way. They might act as complementary products or sales may not be related. Big Office randomly selected 250 business days and categorized demand for each type of computer as Low, MedLow, MedHi and Hi.

Daily demand: Desktops versus Laptops

          Low  MedLow  MedHi   Hi  Total
Low         4      17     17    5     43
MedLow      8      23     22   27     80
MedHi      16      20     14   20     70
Hi         10      17     19   11     57
Total      38      77     72   63    250

Page 28: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

So how many samples do we have?

Is the data we are collecting categorical or numerical?

How many characteristics are we investigating?

How many classes within those characteristics?

Page 29: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Hypothesis Statements

• Ho: The two variables are independent.
• Ha: The two variables are not independent.

Page 30: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

How do we test for independence?

          Low  MedLow  MedHi   Hi  Total
Low         4      17     17    5     43
MedLow      8      23     22   27     80
MedHi      16      20     14   20     70
Hi         10      17     19   11     57
Total      38      77     72   63    250

Recall that if events A and B are independent, then

$$P(A \cap B) = P(A)\,P(B)$$

So, we’ll assume the two variables are independent.
Then we’ll determine expected cell counts for each cell.
We’ll look at the differences between the expected and observed counts for our test.
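The expected cell counts follow directly from the independence rule: the estimated probability of a cell is (row total / n) times (column total / n), so the expected count is row total × column total / n. A minimal sketch using the Big Office table; the variable names are illustrative.

```python
# Big Office observed counts (rows and columns are the four demand levels).
observed = [
    [4, 17, 17, 5],
    [8, 23, 22, 27],
    [16, 20, 14, 20],
    [10, 17, 19, 11],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Independence: P(row i and column j) = P(row i) * P(column j), so
# expected count = n * (row_total / n) * (col_total / n) = row_total * col_total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([f"{e:.2f}" for e in row])
```

For example, the Low/Low cell has an expected count of 43 × 38 / 250, a little over 6.5, compared with an observed count of 4.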

Page 31: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Test Statistic

$$X^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

$$df = (r - 1)(c - 1)$$
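Putting the two-way formula and the degrees of freedom together, a hedged sketch might look like the following. `chi_square_two_way` is a hypothetical helper written for this example, not the calculator's χ²-Test, and it reports only the statistic and df.

```python
def chi_square_two_way(observed):
    """Return (X^2, df) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count for cell (i, j)
            x2 += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df

big_office = [
    [4, 17, 17, 5],
    [8, 23, 22, 27],
    [16, 20, 14, 20],
    [10, 17, 19, 11],
]
x2, df = chi_square_two_way(big_office)
print(f"X^2 = {x2:.2f}, df = {df}")
```

For the 4 × 4 Big Office table, df = (4 − 1)(4 − 1) = 9.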

Page 32: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Assumptions

• The sample is random.
• The sample is large
  – Each expected cell count is at least 5

Page 33: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

At this point …

• We can calculate the p-value using the chi-squared table

• Or the chi-squared CDF function on our calculator

• Or, use the chi-squared test

Page 34: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Let’s look at the Chi-Squared Test

• This test is a piece of cake ….
• First, put your observed matrix into the calculator
• STAT – TEST – χ²-Test
• Now just select Calculate
  – The calculator creates the Expected matrix
  – Output: value of your test statistic and p-value

Page 35: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

So let’s do this problem using our 9 steps of hypothesis testing

Page 36: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Chi-Squared Test for Homogeneity

The paper “No Evidence of Impaired Neurocognitive Performance in Collegiate Soccer Players” compared collegiate soccer players, athletes in sports other than soccer, and a group of students who were not involved in collegiate sports with respect to head injuries.

Three independent random samples were chosen and each person in the sample was asked to complete a medical history survey. The following 2-way contingency table was created based on reported concussions.

Number of concussions

          0    1    2   3+  Total
Soccer   45   25   11   10     91
Other    68   15    8    5     96
Non      45    5    3    0     53
Total   158   45   22   15    240

Page 37: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

So how many samples do we have?

Is the data we are collecting categorical or numerical?

How many characteristics are we investigating?

How many classes within those characteristics?

Page 38: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Hypotheses Statements

• Ho: The populations are homogeneous
  – or, the category proportions are the same for all populations.
• Ha: The populations are not homogeneous.
  – or, the category proportions are not the same for all populations.

Page 39: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Test Statistic

$$X^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

$$df = (r - 1)(c - 1)$$

Page 40: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Assumptions

• The samples are random and independent.
• The samples are large
  – The expected cell counts are at least 5

Page 41: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Everything else is the same

• Let’s finish this test using our 9 steps.

          0    1    2   3+  Total
Soccer   45   25   11   10     91
Other    68   15    8    5     96
Non      45    5    3    0     53
Total   158   45   22   15    240
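Before finishing the 9 steps it may help to check the large-sample assumption for this table. The sketch below computes each expected count as row total × column total / n and lists any cell that falls below 5; the labels and variable names are illustrative.

```python
groups = ["Soccer", "Other", "Non"]
concussions = ["0", "1", "2", "3+"]
observed = [
    [45, 25, 11, 10],
    [68, 15, 8, 5],
    [45, 5, 3, 0],
]

row_totals = [sum(row) for row in observed]          # 91, 96, 53
col_totals = [sum(col) for col in zip(*observed)]    # 158, 45, 22, 15
n = sum(row_totals)                                  # 240

# Collect every cell whose expected count falls below 5.
small_cells = [
    (groups[i], concussions[j], row_totals[i] * col_totals[j] / n)
    for i in range(len(groups))
    for j in range(len(concussions))
    if row_totals[i] * col_totals[j] / n < 5
]
for group, category, e in small_cells:
    print(f"{group}, {category} concussions: expected {e:.2f}")
```

With these counts, the Non row's expected values for 2 and 3+ concussions come out to roughly 4.9 and 3.3, so the "at least 5" condition deserves a closer look before relying on the test.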

Page 42: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

To summarize …

• Test for Independence
  – One sample
  – Two characteristics
  – Assumptions:
    • Sample is random
    • Sample is large
• Test for Homogeneity
  – Multiple samples
  – One characteristic
  – Assumptions:
    • Samples are independent and random
    • Samples are large

Page 43: AP Statistics Topic 7 Chi Squared Tests Hypothesis Tests for Linear Regression

Homework 7-2

• Read Section 12.2
• 12.18
• 12.22