Chapter 11 Chi-Square Distribution

Upload: herrod-warren

Post on 02-Jan-2016

Review

• So far, we have used two probability distributions for hypothesis testing and confidence intervals: the normal distribution and Student’s t distribution.

• In this section, we will use the chi-square distribution.

What is Chi-Square?

• $\chi^2$ = chi-square

• Values of $\chi^2$ begin at 0 and are never negative. The graph of the $\chi^2$ distribution is not symmetrical and, like Student’s t distribution, it depends on the number of degrees of freedom.

• A chi-square test can determine whether two random variables are dependent or independent.

• It can also determine whether different populations share the same proportions of specified characteristics.

Example:

Mode (high point)

• The mode (high point) of a chi-square distribution with n degrees of freedom occurs over $n - 2$ (for $n \geq 3$).

Formula for $\chi^2$

• $\chi^2 = \sum \frac{(O - E)^2}{E}$, summed over all cells

• O = observed frequency

• E = expected frequency
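
As a minimal sketch of the statistic, the sum of $(O - E)^2 / E$ over all cells can be written in a few lines of Python. The function name `chi_square_statistic` and the two-cell input are illustrative assumptions, not part of the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over paired observed and expected frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Two made-up cells for illustration: (12 - 10)^2 / 10 + (8 - 10)^2 / 10
print(chi_square_statistic([12, 8], [10, 10]))
```

Each term is divided by its own expected frequency, so a given difference counts for more in a cell where few observations were expected.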

Degrees of Freedom

• Degrees of freedom = (number of rows − 1)(number of columns − 1), that is, d.f. = $(R - 1)(C - 1)$

• R = number of cell rows

• C = number of cell columns

Example: (The situation)

• Innovative Machines Incorporated has developed two new letter arrangements for computer keyboards. The company wishes to see whether there is any relationship between the arrangement of letters on the keyboard and the number of hours it takes a new typing student to learn to type at 20 words per minute. From another point of view: is the time it takes a student to learn to type independent of the arrangement of the letters on the keyboard? Use a 5% level of significance.

Example: (step 1)

• $H_0$: Keyboard arrangement and learning times are independent

• $H_1$: Keyboard arrangement and learning times are not independent

Example: (chart)

Step 2: Determine E

Answer for E (will show in class)

Keyboard        21-40 h         41-60 h         61-80 h         Row Total
A               O: 25, E: 24    O: 30, E: 40    O: 25, E: 16    80
B               O: 30, E: 36    O: 71, E: 60    O: 19, E: 24    120
Standard        O: 35, E: 30    O: 49, E: 50    O: 16, E: 20    100
Column Total    90              150             60              300 (sample size)
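
The expected frequencies in the table are consistent with the usual rule E = (row total)(column total) / (sample size). A minimal Python sketch of that rule follows. The function name `expected_frequencies` is an assumption made for illustration, and the observed counts are taken from the keyboard table.

```python
def expected_frequencies(observed):
    """Return E = (row total)(column total) / (sample size) for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed counts from the keyboard example (rows: A, B, Standard).
observed = [
    [25, 30, 25],
    [30, 71, 19],
    [35, 49, 16],
]
for row in expected_frequencies(observed):
    print(row)
```

For example, the cell for keyboard A and 21-40 h gives (80)(90) / 300 = 24, matching the table.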

Remember

Chart to find $\chi^2$

Cell    O     E     $O - E$    $(O - E)^2$    $(O - E)^2 / E$
1       25    24    1          1              0.04
2       30    40    -10        100            2.50
3       25    16    9          81             5.06
4       30    36    -6         36             1.00
5       71    60    11         121            2.02
6       19    24    -5         25             1.04
7       35    30    5          25             0.83
8       49    50    -1         1              0.02
9       16    20    -4         16             0.80

What is $\chi^2$ then?

• Add up all the numbers in the last column:

0.04 + 2.50 + 5.06 + 1.00 + 2.02 + 1.04 + 0.83 + 0.02 + 0.80 = 13.31

Example: (Degrees of freedom for test of independence)

• d.f. = $(R - 1)(C - 1) = (3 - 1)(3 - 1) = 4$

Conclusion

• Look up the chi-square table in the book.

• We have $\chi^2 = 13.31$ with d.f. = 4.

• The corresponding P-value falls between 0.005 and 0.010.

• Since 0.005 < P-value < 0.010, the P-value is less than $\alpha = 0.05$, so we reject the null and accept the alternate. At the 5% level of significance, the evidence indicates that keyboard arrangement and learning time are not independent.
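
The table lookup can be cross-checked numerically. The sketch below is one possible way to do it with only the Python standard library, assuming a positive integer number of degrees of freedom. It uses the closed-form right-tail area of the chi-square distribution. The function name `chi_square_p_value` is hypothetical, and this is not the method used in the slides, which rely on the printed table.

```python
import math


def chi_square_p_value(x, df):
    """Right-tail area P(chi-square > x) for a positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        total = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * total
    # Odd df = 2m + 1:
    # erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{m} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    m = (df - 1) // 2
    total = sum(half ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Keyboard example: chi-square = 13.31 with d.f. = 4
print(chi_square_p_value(13.31, 4))
```

For the keyboard example the computed value lies between 0.005 and 0.010, in line with the table lookup, and it is below the 0.05 significance level.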

Group Work (the situation)

• A vending machine company plans to install soda machines in elementary schools and high schools. The market analyst wishes to know whether flavor preference and school level are independent. A random sample of 200 students was taken. Their school levels and soda preferences are given in the table. Is independence indicated at the 1% level of significance?

Group Work (table)

Soda            High School    Elementary     Row Total
Coke            O: 33, E:      O: 57, E:      90
Pepsi           O: 30, E:      O: 20, E:      50
Mountain Dew    O: 5, E:       O: 35, E:      40
Fanta           O: 12, E:      O: 8, E:       20
Column Total    80             120            200 (sample size)

How to Test for independence of two statistical variables

• Look at Pg 582. Copy it and follow it!

Test of homogeneity

• A test of homogeneity tests the claim that different populations share the same proportions of specified characteristics.

Test of Homogeneity

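• The procedure is much the same as the test for independence, except that the hypotheses are different.

• For test of independence: $H_0$: the variables are independent; $H_1$: the variables are not independent

• For test of homogeneity: $H_0$: each population shares the specified characteristics in the same proportions; $H_1$: some populations have different proportions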

Example:

• If you could own one pet, what kind would you choose? The responses are given below. Do the same proportions of males and females prefer each type of pet? Use a 1% level of significance.

Gender    Dog    Cat    Other pet    No Pet
Female    120    132    18           30
Male      135    70     20           25

Fill this out

Gender          Dog           Cat           Other pet     No Pet        Row Total
Female          O: 120, E:    O: 132, E:    O: 18, E:     O: 30, E:
Male            O: 135, E:    O: 70, E:     O: 20, E:     O: 25, E:
Column Total

Answer

Gender          Dog                  Cat                  Other pet          No Pet          Row Total
Female          O: 120, E: 139.09    O: 132, E: 110.18    O: 18, E: 20.73    O: 30, E: 30    300
Male            O: 135, E: 115.91    O: 70, E: 91.82      O: 20, E: 17.27    O: 25, E: 25    250
Column Total    255                  202                  38                 55              550 (sample size)

Fill this out

Cell    O    E    $(O - E)^2 / E$
1
2
3
4
5
6
7
8

Answer

Cell    O      E         $(O - E)^2 / E$
1       120    139.09    2.62
2       132    110.18    4.320
3       18     20.73     0.359
4       30     30        0
5       135    115.91    3.144
6       70     91.82     5.185
7       20     17.27     0.431
8       25     25        0

Final Answer

• $\chi^2 = 16.059$

• d.f. = 3

• P-value ≈ 0.001

• Since the P-value is less than $\alpha = 0.01$, we reject the null, which says the preferences are the same, and accept the alternate, which says the preferences are different. At the 1% level of significance, the evidence indicates that male and female students have different preferences when it comes to selecting a pet.
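
The whole calculation for a contingency table (expected frequencies, the chi-square sum, and the degrees of freedom) can be sketched in one short Python function. The name `chi_square_from_table` is an illustrative assumption, and the counts are the observed values from the pet example.

```python
def chi_square_from_table(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency for this cell
            statistic += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Observed counts from the pet example (rows: Female, Male).
pets = [
    [120, 132, 18, 30],
    [135, 70, 20, 25],
]
statistic, df = chi_square_from_table(pets)
print(round(statistic, 2), df)
```

Because the code keeps the expected frequencies unrounded, its statistic can differ in the last digit or two from a hand calculation that rounds each E to two decimal places.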

Homework Practice

• Pg 588 #1-15 even

CHI-SQUARE: GOODNESS OF FIT

Reason Behind Goodness of Fit

• Set up a test to investigate how well a sample distribution fits a given distribution

• Use observed and expected frequencies to compute the sample chi-square statistic

• Find or estimate the P-value and complete the test

Hypothesis Testing

• $H_0$: The population fits the given distribution

• $H_1$: The population has a different distribution

Sample statistic

• $\chi^2 = \sum \frac{(O - E)^2}{E}$, with degrees of freedom = $k - 1$

• E = expected frequency

• O = observed frequency

• k = number of categories in the distribution

Question

• Is the present distribution of favorable responses the same as last year’s, or different? To test this hypothesis, a random sample of 500 employees was taken. The chart is on the next slide. Use a 1% level of significance.

Example

Category                          Percentage of Favorable Responses
Vacation time                     4%
Salary                            65%
Safety regulations                13%
Health and retirement benefits    12%
Overtime policy and pay           6%

Category                          Observed
Vacation time                     30
Salary                            290
Safety regulations                70
Health and retirement benefits    70
Overtime                          40

Answer

Category                          O      E      $(O - E)^2$    $(O - E)^2 / E$
Vacation time                     30     20     100            5.00
Salary                            290    325    1225           3.77
Safety regulations                70     65     25             0.38
Health and retirement benefits    70     60     100            1.67
Overtime                          40     30     100            3.33
Total                             500    500                   14.15

Answer

• d.f. = $k - 1 = 5 - 1 = 4$

• 0.005 < P-value < 0.010, so the P-value is less than $\alpha = 0.01$

• Reject null, accept alternate

• At the 1% level of significance, the evidence supports the conclusion that this year’s responses to the issues are different from last year’s: we reject the null, which says they are the same, and accept the alternate, which says they are different.
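
A goodness-of-fit calculation follows the same pattern, with each expected frequency obtained as (sample size)(hypothesized proportion). The sketch below assumes the proportions are given as decimals. The function name `goodness_of_fit` is hypothetical, and the data are the percentages and observed counts from the survey example.

```python
def goodness_of_fit(observed, proportions):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in proportions]  # E = (sample size)(proportion)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Survey example: hypothesized proportions and observed counts (n = 500).
proportions = [0.04, 0.65, 0.13, 0.12, 0.06]
observed = [30, 290, 70, 70, 40]
statistic, df = goodness_of_fit(observed, proportions)
print(round(statistic, 2), df)
```

The degrees of freedom here are the number of categories minus one, not (R − 1)(C − 1), because there is only a single row of categories.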

Group Work

• The table below gives the age distribution of the Canadian population and the age distribution of a random sample of 455 residents in the Indian community (Red Lake Village).

• Use a 5% level of significance to test the claim that the age distribution of the Canadian population fits the age distribution of Red Lake Village.

Age        % of population    Observed in Red Lake Village
Under 5    7.2%               47
5-14       13.6%              75
15-64      67.1%              288
65 +       12.1%              45

Answer

• 0.005 < P-value < 0.01

• Reject null; accept alternate

• ***insert conclusion***

Homework Practice

• Pg 597 #1-18 even