The Chi-Square Distribution

Upload: daniela-french

Post on 19-Jan-2016


Page 1: The Chi-Square Distribution. Preliminary Idea Sum of n values of a random variable

The Chi-Square Distribution

Page 2:

Preliminary Idea: Sum of n values of a random variable

We know that if a random variable $X$ is normal, then the sample mean

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n},$$

once standardized as $(\bar{x} - \mu)/(s/\sqrt{n})$, has a Student's t-distribution with $(n-1)$ degrees of freedom.

Since $x_1 + x_2 + x_3 + \dots + x_n = n\bar{x}$, it follows that the sum of $n$ values of a normal random variable $X$, standardized in the same way, also follows a Student's t-distribution with $(n-1)$ degrees of freedom.

Page 3:

Sum of Squares of random numbers

Question:

If $x_1 + x_2 + x_3 + \dots + x_n$ follows a Student's t-distribution with $(n-1)$ degrees of freedom, what can we say about the sum of the squares of these values?

In other words, what is the distribution of

$$x_1^2 + x_2^2 + x_3^2 + \dots + x_n^2\,?$$

Answer:

Statisticians have shown that if $X$ is normal, then the sum of squares of $n$ values of $X$, namely

$$x_1^2 + x_2^2 + x_3^2 + \dots + x_n^2,$$

has a $\chi^2$ distribution with $(n-1)$ degrees of freedom, provided each value is measured as a deviation from the sample mean $\bar{x}$ and the sum is divided by the population variance $\sigma^2$.

Page 4:

The $\chi^2$ distribution

1. It is called the chi-square distribution.

2. “Chi” rhymes with “High”, and the “ch” is pronounced like “k”.

3. It is the distribution of a continuous random variable.

4. It has $n-1$ degrees of freedom.

5. Its values are non-negative (i.e. ≥ 0).

6. It is always skewed to the right.

7. It becomes more symmetrical as $n$ increases.

8. It approximates a normal distribution for large values of $n$.
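Properties 5 to 7 can be checked with a small simulation. The sketch below is only an illustration: it assumes the standard construction of a chi-square value as a sum of `df` squared standard normal values, and the helper names `simulate_chi_square` and `sample_skewness`, the seed, and the choice of 3 and 30 degrees of freedom are all hypothetical.

```python
import random
import statistics


def simulate_chi_square(df, draws, rng):
    """Draw chi-square values as sums of df squared standard normal values."""
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(draws)]


def sample_skewness(values):
    """Average cubed standardized deviation; positive means skewed to the right."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


rng = random.Random(0)  # fixed seed so the run is repeatable
few_df = simulate_chi_square(3, 20000, rng)
many_df = simulate_chi_square(30, 20000, rng)

skew_few = sample_skewness(few_df)
skew_many = sample_skewness(many_df)

print("smallest simulated value:", min(few_df))    # never negative
print("skewness with 3 df:", round(skew_few, 2))   # clearly right-skewed
print("skewness with 30 df:", round(skew_many, 2)) # closer to symmetrical
```

Both skewness figures should come out positive, with the 30 df figure noticeably smaller, which is what properties 6 and 7 describe.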

Page 5:

Two Chi-square distributions

Page 6:

The sample variance $s^2$ follows a chi-square distribution

The sample variance is defined by

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}.$$

It follows that

$$(n-1)s^2 = \sum (x_i - \bar{x})^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \dots + (x_n - \bar{x})^2.$$

Since the right-hand side of this expression is a sum of squares it follows that, when $X$ is normal, $(n-1)s^2$, divided by the population variance $\sigma^2$, has a $\chi^2$ distribution with $(n-1)$ degrees of freedom.
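A quick way to see this claim at work is to simulate it. The sketch below is a rough check under stated assumptions, not a proof: the population mean of 50, the standard deviation of 4, the sample size of 10, and the helper name `scaled_sample_variance` are all made up for illustration. It relies on the known facts that a chi-square variable has a mean equal to its degrees of freedom and a variance equal to twice its degrees of freedom.

```python
import random
import statistics


def scaled_sample_variance(n, mu, sigma, rng):
    """Return (n - 1) * s^2 / sigma^2 for one random sample of n normal values."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return (n - 1) * statistics.variance(sample) / sigma ** 2


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 10                  # hypothetical sample size, so df = n - 1 = 9
values = [scaled_sample_variance(n, 50.0, 4.0, rng) for _ in range(20000)]

mean_value = statistics.fmean(values)
variance_value = statistics.variance(values)

print("average of (n-1)s^2/sigma^2:", round(mean_value, 2))       # close to n - 1
print("variance of (n-1)s^2/sigma^2:", round(variance_value, 2))  # close to 2(n - 1)
```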

Page 7:

Standardizing the Test Statistic

In a test of hypothesis for a population variance $\sigma^2$, the test statistic is the sample variance $s^2$. The standardized test statistic is denoted by $\chi^{2*}$ and is defined by:

$$\chi^{2*} = \frac{(n-1)s^2}{\sigma_0^2}$$

Note: The standardized values are found in the standard chi-square tables on page 7 in the Formulas and Tables handout.

Page 8:

Chi-square table characteristics

The chi-square tables are not symmetrical.

Therefore lower-tail values and upper-tail values must be listed separately.

In the extract of the chi-square tables shown in the next slide, lower-tail areas are shaded in yellow, upper tail areas are shaded in blue.

Page 9:

Chi-square table (Page 7 in Formulas & Tables)

Column headings are cumulative areas; each row is one value of df.

df    0.005      0.01      0.025     0.05     0.1      0.9     0.95    0.975   0.99    0.995
1     0.0000393  0.000157  0.000982  0.00393  0.0158   2.71    3.84    5.02    6.63    7.88
2     0.0100     0.020     0.0506    0.103    0.211    4.61    5.99    7.38    9.21    10.60
3     0.0717     0.115     0.216     0.352    0.584    6.25    7.81    9.35    11.34   12.84
4     0.207      0.297     0.484     0.711    1.064    7.78    9.49    11.14   13.28   14.86
5     0.412      0.554     0.831     1.145    1.61     9.24    11.07   12.83   15.09   16.75
6     0.676      0.872     1.24      1.64     2.20     10.64   12.59   14.45   16.81   18.55
7     0.989      1.24      1.69      2.17     2.83     12.02   14.07   16.01   18.48   20.28
8     1.34       1.65      2.18      2.73     3.49     13.36   15.51   17.53   20.09   21.95
9     1.73       2.09      2.70      3.33     4.17     14.68   16.92   19.02   21.67   23.59
10    2.16       2.56      3.25      3.94     4.87     15.99   18.31   20.48   23.21   25.19
11    2.60       3.05      3.82      4.57     5.58     17.28   19.68   21.92   24.73   26.76
12    3.07       3.57      4.40      5.23     6.30     18.55   21.03   23.34   26.22   28.30
...
25    10.52      11.52     13.12     14.61    16.47    34.38   37.65   40.65   44.31   46.93
26    11.16      12.20     13.84     15.38    17.29    35.56   38.89   41.92   45.64   48.29
27    11.81      12.88     14.57     16.15    18.11    36.74   40.11   43.19   46.96   49.65
28    12.46      13.56     15.31     16.93    18.94    37.92   41.34   44.46   48.28   50.99
29    13.12      14.26     16.05     17.71    19.77    39.09   42.56   45.72   49.59   52.34
30    13.79      14.95     16.79     18.49    20.60    40.26   43.77   46.98   50.89   53.67

Page 10:

Chi-square table

Examples

Lower Tail

$\chi^2(.05; 10) = 3.94$

Upper Tail

$\chi^2(.95; 10) = 18.31$
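The two table lookups above can also be reproduced numerically. The following is a minimal sketch that assumes no statistics library is available: it evaluates the chi-square cumulative area with the series expansion of the regularized incomplete gamma function and inverts it by bisection. The function names `chi_square_cdf` and `chi_square_quantile` are hypothetical, and the tolerances are arbitrary choices.

```python
import math


def chi_square_cdf(x, df):
    """Cumulative area to the left of x for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2.0, x / 2.0
    # Series for the lower incomplete gamma function:
    # sum over k of half_x^k / (a (a + 1) ... (a + k)).
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-15 * abs(total):
        k += 1
        term *= half_x / (a + k)
        total += term
    # Multiply by half_x^a * exp(-half_x) / Gamma(a) to regularize.
    return total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))


def chi_square_quantile(area, df):
    """Value with the given cumulative area to its left, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi_square_cdf(hi, df) < area:  # widen the bracket until it contains the answer
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi_square_cdf(mid, df) < area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


lower_tail = chi_square_quantile(0.05, 10)
upper_tail = chi_square_quantile(0.95, 10)
print(f"chi-square(.05; 10) is about {lower_tail:.2f}")
print(f"chi-square(.95; 10) is about {upper_tail:.2f}")
```

The printed values can be compared with the table entries for df = 10 under the 0.05 and 0.95 columns.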

Page 11:

Two-Tail Test of Hypothesis

$H_0: \sigma^2 = \sigma_0^2$

$H_1: \sigma^2 \ne \sigma_0^2$

TS: $s^2$, standardized as $\chi^{2*} = \dfrac{(n-1)s^2}{\sigma_0^2}$

AL: $A_1 = \chi^2(\alpha/2;\ n-1)$

$A_2 = \chi^2(1-\alpha/2;\ n-1)$

DR: Do not reject $H_0$ if $A_1 \le \chi^{2*} \le A_2$

Reject $H_0$ if $\chi^{2*} < A_1$ or $\chi^{2*} > A_2$

Page 12:

Lower Tail Test of Hypothesis

$H_0: \sigma^2 \ge \sigma_0^2$

$H_1: \sigma^2 < \sigma_0^2$

TS: $s^2$, standardized as $\chi^{2*} = \dfrac{(n-1)s^2}{\sigma_0^2}$

AL: $A = \chi^2(\alpha;\ n-1)$

DR: Do not reject $H_0$ if $\chi^{2*} \ge A$

Reject $H_0$ if $\chi^{2*} < A$

Page 13:

Upper Tail Test of Hypothesis

$H_0: \sigma^2 \le \sigma_0^2$

$H_1: \sigma^2 > \sigma_0^2$

TS: $s^2$, standardized as $\chi^{2*} = \dfrac{(n-1)s^2}{\sigma_0^2}$

AL: $A = \chi^2(1-\alpha;\ n-1)$

DR: Do not reject $H_0$ if $\chi^{2*} \le A$

Reject $H_0$ if $\chi^{2*} > A$
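The decision rules for the two-tail, lower tail and upper tail tests differ only in which action limits apply, so they can be sketched as one small function. This is an illustrative sketch: the function name `decision` and the test statistic of 19.2 are hypothetical, while the action limits are the df = 10 entries of the chi-square table for a 5% level of significance.

```python
def decision(test_statistic, lower_limit=None, upper_limit=None):
    """Apply the decision rule; pass only the action limit(s) that the test uses."""
    if lower_limit is not None and test_statistic < lower_limit:
        return "Reject H0"
    if upper_limit is not None and test_statistic > upper_limit:
        return "Reject H0"
    return "Do not reject H0"


chi_square_star = 19.2  # hypothetical standardized test statistic, df = 10

# Two-tail test: limits chi-square(.025; 10) = 3.25 and chi-square(.975; 10) = 20.48
two_tail = decision(chi_square_star, lower_limit=3.25, upper_limit=20.48)

# Lower tail test: single limit chi-square(.05; 10) = 3.94
lower_tail_test = decision(chi_square_star, lower_limit=3.94)

# Upper tail test: single limit chi-square(.95; 10) = 18.31
upper_tail_test = decision(chi_square_star, upper_limit=18.31)

print("two-tail:", two_tail)
print("lower tail:", lower_tail_test)
print("upper tail:", upper_tail_test)
```

The same statistic is rejected by the upper tail test but not by the two-tail test, because the two-tail test splits the significance level between both tails and so has a larger upper action limit.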

Page 14:

Example

A professor claims that the standard deviation of grades in an exam is 10%, i.e. $\sigma = .10$.

A random sample of 20 students' grades had a standard deviation of 14.2%. Test the professor's claim with $\alpha = 0.05$.

Note: We cannot test for a population standard deviation directly; we must first convert it to the equivalent test for a population variance.

Thus, a test for $\sigma = .10$ is changed to a test for $\sigma^2 = (.10)^2 = .01$. In this example the sample standard deviation is 14.2%, or $s = .142$, so the test statistic is $s^2 = (.142)^2 = .020164$.

Page 15:

The test of hypothesis

$H_0: \sigma^2 = .01$

$H_1: \sigma^2 \ne .01$

TS: $s^2 = .020164$, standardized as $\chi^{2*} = \dfrac{(n-1)s^2}{\sigma_0^2} = \dfrac{(20-1)(.020164)}{.01} = 38.3116$

AL: $A_1 = \chi^2(.025;\ 19) = 8.91$

$A_2 = \chi^2(.975;\ 19) = 32.85$

DR: Do not reject $H_0$ if $8.91 \le \chi^{2*} \le 32.85$

Reject $H_0$ if $\chi^{2*} < 8.91$ or $\chi^{2*} > 32.85$

Conclusion: Reject $H_0$. The professor is wrong; the standard deviation is not equal to .10. Since the test statistic is greater than the upper action limit, we can conclude that the standard deviation of grades is greater than 10%.
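The arithmetic of this example can be retraced in a few lines. The sketch below uses only the figures given in the example (20 grades, sample standard deviation .142, claimed standard deviation .10, action limits 8.91 and 32.85); the variable names are illustrative.

```python
n = 20          # sample size
s = 0.142       # sample standard deviation (14.2%)
sigma_0 = 0.10  # standard deviation claimed by the professor (10%)

# Convert to variances, then standardize: (n - 1) * s^2 / sigma_0^2
s_squared = s ** 2
chi_square_star = (n - 1) * s_squared / sigma_0 ** 2

# Action limits chi-square(.025; 19) and chi-square(.975; 19) for alpha = 0.05
lower_limit, upper_limit = 8.91, 32.85

reject_h0 = chi_square_star < lower_limit or chi_square_star > upper_limit

print(f"s^2 = {s_squared:.6f}")
print(f"standardized test statistic = {chi_square_star:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```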

Page 16:

The F distribution

Page 17:

Comparison of Two Population variances

We want to test the hypothesis that two population variances are equal, i.e.

$H_0: \sigma_1^2 = \sigma_2^2$

$H_1: \sigma_1^2 \ne \sigma_2^2$

We need to rewrite the null and alternative hypotheses so that we can use a single value to represent the test statistic.

Page 18:

Ratio of Variances

The null and alternative hypotheses are converted to the following form.

$H_0: \dfrac{\sigma_1^2}{\sigma_2^2} = 1$

$H_1: \dfrac{\sigma_1^2}{\sigma_2^2} \ne 1$

Page 19:

The Test Statistic

A natural candidate to be the test statistic for the ratio of two population variances is the ratio of the corresponding sample variances

$$\frac{s_1^2}{s_2^2}$$

Page 20:

The F-distribution

Statisticians have shown that the ratio of two independent chi-square variables, each divided by its degrees of freedom, follows a new distribution known as the F-distribution.

If we have one $\chi^2$ variable with $n_1-1$ degrees of freedom, and another with $n_2-1$ degrees of freedom, then the ratio

$$F = \frac{\chi_1^2/(n_1-1)}{\chi_2^2/(n_2-1)}$$

has an F-distribution with $n_1-1$ degrees of freedom for the numerator and $n_2-1$ degrees of freedom for the denominator.

Therefore, for a specified cumulative area $a$, the corresponding value of this ratio is written $F(a;\ n_1-1,\ n_2-1)$.

Page 21:

Extract of F-tables (1-α=.95)

Rows are denominator df; columns are numerator df.

df    1      2      3      4      5      6      7      8      9      10
1     161.4  199.5  215.7  224.6  230.2  234.0  236.8  238.9  240.5  241.9
2     18.51  19.00  19.16  19.25  19.30  19.33  19.35  19.37  19.38  19.40
3     10.13  9.55   9.28   9.12   9.01   8.94   8.89   8.85   8.81   8.79
4     7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96
5     6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74
6     5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06
7     5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.64
8     5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.35
9     5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.14
10    4.96   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98
11    4.84   3.98   3.59   3.36   3.20   3.09   3.01   2.95   2.90   2.85
12    4.75   3.89   3.49   3.26   3.11   3.00   2.91   2.85   2.80   2.75
13    4.67   3.81   3.41   3.18   3.03   2.92   2.83   2.77   2.71   2.67
14    4.60   3.74   3.34   3.11   2.96   2.85   2.76   2.70   2.65   2.60
15    4.54   3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54

Page 22:

F-distribution examples

F(.95;4,9) = 3.63

F(.95;8,3) = 8.85

F(.99;15,20) = 3.09

F(.99;40,30) = 2.30

Page 23:

Ratio of Variances

We have already seen that for a sample of size $n$ the sample variance, suitably scaled, has a $\chi^2$ distribution with $n-1$ degrees of freedom.

It follows that, when the two population variances are equal, the ratio of two sample variances

$$\frac{s_1^2}{s_2^2}$$

has an F-distribution with $df_{\text{numerator}} = n_1 - 1$ and $df_{\text{denominator}} = n_2 - 1$.
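This claim can be checked against a table value by simulation. The sketch below is an illustration under stated assumptions: both samples are drawn from the same standard normal population (so the population variances are equal), the sample sizes 5 and 10 are chosen to give 4 and 9 degrees of freedom, and the helper name `variance_ratio` is hypothetical. If the ratio follows the F-distribution, about 95% of simulated ratios should fall at or below the tabulated F(.95;4,9) = 3.63.

```python
import random
import statistics


def variance_ratio(n1, n2, rng, mu=0.0, sigma=1.0):
    """Ratio s1^2 / s2^2 for two independent samples from one normal population."""
    sample_1 = [rng.gauss(mu, sigma) for _ in range(n1)]
    sample_2 = [rng.gauss(mu, sigma) for _ in range(n2)]
    return statistics.variance(sample_1) / statistics.variance(sample_2)


rng = random.Random(42)  # fixed seed so the run is repeatable
n1, n2 = 5, 10           # numerator df = 4, denominator df = 9
table_value = 3.63       # F(.95;4,9) from the F-table extract

ratios = [variance_ratio(n1, n2, rng) for _ in range(20000)]
share_below = sum(r <= table_value for r in ratios) / len(ratios)

print(f"share of ratios at or below {table_value}: {share_below:.3f}")
```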

Page 24:

Test of Hypothesis for two variances

$H_0: \sigma_1^2 = \sigma_2^2$

$H_1: \sigma_1^2 \ne \sigma_2^2$

Rewrite the hypotheses as:

$H_0: \dfrac{\sigma_1^2}{\sigma_2^2} = 1$

$H_1: \dfrac{\sigma_1^2}{\sigma_2^2} \ne 1$

TS: $F^* = \dfrac{s_1^2}{s_2^2}$

AL: $A_1 = F(\alpha/2;\ n_1-1,\ n_2-1)$

$A_2 = F(1-\alpha/2;\ n_1-1,\ n_2-1)$

DR: Do not reject $H_0$ if $A_1 \le F^* \le A_2$. Reject $H_0$ if $F^* < A_1$ or $F^* > A_2$.

Page 25:

One-Tail Tests

For Lower Tail Tests: $A = F(\alpha;\ n_1-1,\ n_2-1)$

For Upper Tail Tests: $A = F(1-\alpha;\ n_1-1,\ n_2-1)$

Page 26:

Formula for Lower Tail F-values

Since the lower tail F-values are not given in the table we must use the formulas:

For one-tail tests:

$$F(\alpha;\ n_1-1,\ n_2-1) = \frac{1}{F(1-\alpha;\ n_2-1,\ n_1-1)}$$

For two-tail tests:

$$F(\alpha/2;\ n_1-1,\ n_2-1) = \frac{1}{F(1-\alpha/2;\ n_2-1,\ n_1-1)}$$

Page 27:

Examples of Lower tail F-values

F(.05;5,9) = 1/F(.95;9,5)

= 1/4.77

= 0.2096

F(.05;7,4) = 1/F(.95;4,7)

= 1/4.12

= 0.2427
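The reciprocal rule used in these two examples amounts to swapping the degrees of freedom before looking up the upper tail value. A minimal sketch, assuming a small dictionary holding just the two F(.95) table entries needed here; the names `F_95` and `lower_tail_f` are hypothetical.

```python
# Upper tail values F(.95; numerator df, denominator df) taken from the F-table extract
F_95 = {
    (9, 5): 4.77,
    (4, 7): 4.12,
}


def lower_tail_f(numerator_df, denominator_df, upper_table):
    """Lower tail F-value: reciprocal of the upper tail value with the df swapped."""
    return 1.0 / upper_table[(denominator_df, numerator_df)]


f_05_5_9 = lower_tail_f(5, 9, F_95)  # F(.05;5,9) = 1/F(.95;9,5)
f_05_7_4 = lower_tail_f(7, 4, F_95)  # F(.05;7,4) = 1/F(.95;4,7)

print(f"F(.05;5,9) = {f_05_5_9:.4f}")
print(f"F(.05;7,4) = {f_05_7_4:.4f}")
```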

Page 28:

EXAMPLE

The production manager of a textile company wants to test the hypothesis that the mean cost of producing a polyester fabric is the same for two different production processes. Assume that production costs are normally distributed for both processes.

Random samples of production costs for several production runs using the two different production processes are as follows:

Process I (cost in dollars): 20, 15, 20, 23, 24, 21

Process II (cost in dollars): 27, 19, 41, 30, 16

Test the hypothesis that the two population variances are equal with a 2% level of significance.

Page 29:

Sample Data

              Pop 1     Pop 2
Sample size   n1 = 6    n2 = 5
Mean          20.5      26.6
Variance      9.9       97.3

Page 30:

Testing the Hypothesis

$H_0: \dfrac{\sigma_1^2}{\sigma_2^2} = 1$

$H_1: \dfrac{\sigma_1^2}{\sigma_2^2} \ne 1$

TS: $F^* = \dfrac{s_1^2}{s_2^2} = \dfrac{9.9}{97.3} = .1017$

AL: $A_1 = F(\alpha/2;\ n_1-1,\ n_2-1) = F(.01;5,4) = 1/F(.99;4,5) = 1/11.39 = .088$

$A_2 = F(1-\alpha/2;\ n_1-1,\ n_2-1) = F(.99;5,4) = 15.52$

DR: Do not reject $H_0$ if $.088 \le F^* \le 15.52$. Reject $H_0$ if $F^* < .088$ or $F^* > 15.52$.

Conclusion: Do not reject $H_0$.
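The whole example can be retraced from the raw production costs. The sketch below uses the two samples and the two F-table values quoted in the example, F(.99;4,5) = 11.39 and F(.99;5,4) = 15.52; the variable names are illustrative.

```python
import statistics

process_1 = [20, 15, 20, 23, 24, 21]  # production costs, Process I
process_2 = [27, 19, 41, 30, 16]      # production costs, Process II

# Sample means and sample variances (divisor n - 1)
mean_1, mean_2 = statistics.mean(process_1), statistics.mean(process_2)
s1_squared = statistics.variance(process_1)
s2_squared = statistics.variance(process_2)

# Test statistic: ratio of the sample variances
f_star = s1_squared / s2_squared

# Action limits for alpha = .02, numerator df = 5, denominator df = 4
lower_limit = 1 / 11.39  # F(.01;5,4) = 1/F(.99;4,5)
upper_limit = 15.52      # F(.99;5,4)

reject_h0 = f_star < lower_limit or f_star > upper_limit

print(f"means: {mean_1}, {mean_2}")
print(f"variances: {s1_squared:.1f}, {s2_squared:.1f}")
print(f"F* = {f_star:.4f}, limits = ({lower_limit:.3f}, {upper_limit})")
print("Reject H0" if reject_h0 else "Do not reject H0")
```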