Chapter 14: Nonparametric Statistics

Statistics

McClave, Statistics, 11th ed. Chapter 14: Nonparametric Statistics


Where We’ve Been

Presented methods for making inferences about means and correlation

Methods required the data or the sampling distributions to be normally distributed



Where We’re Going

Inferential techniques requiring fewer or less stringent assumptions

Nonparametric tests based on ranks


14.1: Distribution-Free Tests

Applying a test based on normality to non-normal data may lead to P(Type I error) > α and to less than the maximum power of the test (1 - β).



Parametric tests (z, t, F)

Data or sampling distribution are normal

Nonparametric tests (rank-ordered, no assumed distribution)

Data or sampling distribution are skewed, or data are ordinal




Nonparametric statistics (or tests) based on the ranks of measurements are called rank statistics (or rank tests.)

14.2: Single-Population Inferences

The sign test provides inferences about population medians, or central tendencies, when skewed data or an outlier would invalidate tests based on normal distributions.


One-tailed test for a Population Median, η

H0: η = η0

Ha: η > η0 (or Ha: η < η0)

Test Statistic:

S = number of sample measurements greater than η0 (or less than η0 when Ha: η < η0)

Two-tailed test for a Population Median, η

H0: η = η0

Ha: η ≠ η0

Test Statistic:

S = larger of S1 and S2, where S1 is the number of measurements less than η0 and S2 the number greater than η0

One-tailed test for a Population Median, Observed significance level:

p-value = P(x ≥ S)

Two-tailed test for a Population Median, Observed significance level:

p-value = 2P(x ≥ S)

where x has a binomial distribution with parameters n and p = .5.

Reject H0 if p-value ≤ α.
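The observed significance level of the sign test is a binomial tail probability, so it can be computed exactly. The sketch below is a minimal illustration using only the Python standard library; the function name `sign_test_p_value` and the inputs S = 14, n = 20 are illustrative choices, not part of the slides.

```python
from math import comb

def sign_test_p_value(s: int, n: int, two_tailed: bool = False) -> float:
    """Exact sign-test p-value: P(x >= s) for x ~ Binomial(n, 0.5)."""
    # Upper-tail probability: sum of C(n, k) * 0.5**n for k = s..n.
    upper_tail = sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n
    # A two-tailed test doubles the tail probability (capped at 1).
    return min(1.0, 2 * upper_tail) if two_tailed else upper_tail

# Illustrative input: S = 14 of n = 20 measurements on one side of the median.
p_one = sign_test_p_value(14, 20)
p_two = sign_test_p_value(14, 20, two_tailed=True)
print(f"one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

H0 would be rejected whenever the printed p-value is at or below the chosen α.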



Median time to failure for a brand of compact disc players is 5,250 hours. Twenty players from a competitor are tested, with failure times from 5 hours to 6,575 hours. Fourteen of the players exceed 5,250 hours.

Do the competitor’s machines perform differently?



H0: η = 5,250

Ha: η ≠ 5,250

α = .10

$z_{\alpha/2} = z_{.05} = 1.645$

$z = \dfrac{(S - .5) - .5n}{.5\sqrt{n}} = \dfrac{13.5 - 10}{.5\sqrt{20}} = 1.565 < 1.645$

Do not reject H0
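The large-sample arithmetic for the compact disc example, z = ((S - .5) - .5n) / (.5√n) with S = 14 and n = 20, can be checked with a short sketch. The function name `sign_test_z` is an illustrative choice; the critical value 1.645 is the one used in the example.

```python
from math import sqrt

def sign_test_z(s: int, n: int) -> float:
    """Large-sample sign-test statistic with continuity correction."""
    # Under H0, S ~ Binomial(n, 0.5): mean 0.5*n, standard deviation 0.5*sqrt(n).
    return ((s - 0.5) - 0.5 * n) / (0.5 * sqrt(n))

z = sign_test_z(14, 20)   # 14 of 20 players exceeded 5,250 hours
z_crit = 1.645            # z_.05, the two-tailed critical value for alpha = .10
print(f"z = {z:.3f}, reject H0: {z > z_crit}")  # z = 1.565, reject H0: False
```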

14.3: Comparing Two Populations: Independent Samples

Wilcoxon Rank Sum Test

Used to test whether two independent samples have the same probability distribution.

Samples must be random and independent.

Probability distributions must be continuous.

One-tailed test

H0: D1 and D2 are identical

Ha: D1 is shifted right of D2 (or Ha: D1 is shifted left of D2)

Test statistic:

T1, if n1 < n2

T2, if n1 > n2

Either if n1 = n2

Rejection region:

T1: T1 ≥ TU (or T1 ≤ TL)

T2: T2 ≤ TL (or T2 ≥ TU)


Wilcoxon Rank Sum Test

Two-tailed test

H0: D1 and D2 are identical

Ha: D1 is shifted either right or left of D2

Test statistic:

T1, if n1 < n2

T2, if n1 > n2

Either if n1 = n2

Rejection region:

T ≤ TL or T ≥ TU

Reaction Times of Subjects Under the Influence of Drug A or B

Rank:   1     2     3     4     5     6     7
Value:  1.62  1.71  1.93  1.96  2.07  2.11  2.24

Rank:   8     9     10    11    12    13
Value:  2.41  2.43  2.50  2.71  2.84  2.88

H0: DA and DB are identical

Ha: DA is shifted right of DB or Ha: DA is shifted left of DB

α = .05

TA = 1 + 2 + 3 + 4 + 7 + 8 = 25

TB = 5 + 6 + 9 + 10 + 11 + 12 + 13 = 66

Test statistic is TA, since nA < nB

TL (α = .05, nA = 6, nB = 7) = 28 > TA = 25

Reject H0
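The rank sums in this example can be reproduced by ranking the pooled measurements and summing the ranks per sample. In the sketch below, the split of the thirteen values between Drug A and Drug B is an assumption inferred from the ranks that appear in TA and TB; the function name `rank_sums` is illustrative, and tied values (none occur here) would receive midranks.

```python
def rank_sums(sample_a, sample_b):
    """Return (T_A, T_B): rank sums after ranking the pooled samples together."""
    pooled = sorted(sample_a + sample_b)
    # Midrank for each distinct value: average of the 1-based positions it occupies.
    rank_of = {}
    for value in set(pooled):
        first = pooled.index(value) + 1
        count = pooled.count(value)
        rank_of[value] = first + (count - 1) / 2
    t_a = sum(rank_of[x] for x in sample_a)
    t_b = sum(rank_of[x] for x in sample_b)
    return t_a, t_b

# Assumed split, inferred from the ranks summed in T_A (1, 2, 3, 4, 7, 8) and T_B.
drug_a = [1.62, 1.71, 1.93, 1.96, 2.24, 2.41]
drug_b = [2.07, 2.11, 2.43, 2.50, 2.71, 2.84, 2.88]
t_a, t_b = rank_sums(drug_a, drug_b)
print(t_a, t_b)  # 25.0 66.0
```

As a sanity check, the two rank sums always add up to n(n + 1)/2 for n pooled observations, here 13(14)/2 = 91.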

Wilcoxon Rank Sum Test for Large Samples

One-tailed test

H0: D1 and D2 are identical

Ha: D1 is shifted right of D2 (or Ha: D1 is shifted left of D2)

Rejection region:

z > zα (or z < -zα)

Two-tailed test

H0: D1 and D2 are identical

Ha: D1 is shifted either right or left of D2

Rejection region:

| z | > zα/2

Test Statistic:

$z = \dfrac{T_1 - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$
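The large-sample statistic standardizes T1 by its mean n1(n1 + n2 + 1)/2 and standard deviation √(n1 n2 (n1 + n2 + 1)/12) under H0. A minimal sketch follows; the function name `rank_sum_z` is illustrative, and the inputs reuse T = 25, n1 = 6, n2 = 7 from the drug example purely to show the arithmetic, since those samples are small.

```python
from math import sqrt

def rank_sum_z(t1: float, n1: int, n2: int) -> float:
    """Large-sample z statistic for the Wilcoxon rank sum T1."""
    mean_t1 = n1 * (n1 + n2 + 1) / 2            # expected rank sum under H0
    sd_t1 = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # standard deviation under H0
    return (t1 - mean_t1) / sd_t1

z = rank_sum_z(25, 6, 7)
print(f"z = {z:.3f}")  # z = -2.429
```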

14.4: Comparing Two Populations: Paired Difference Experiment


Judge  Product A  Product B  A - B  |A - B|  Rank of |A - B|
1      6          4           2     2        5
2      8          5           3     3        7.5
3      4          5          -1     1        2
4      9          8           1     1        2
5      4          1           3     3        7.5
6      7          9          -2     2        5
7      6          2           4     4        9
8      5          3           2     2        5
9      6          7          -1     1        2
10     8          2           6     6        10



T- = Sum of negative ranks = 9

T+ = Sum of positive ranks = 46



H0: The probability distributions of the ratings for products A and B are identical

Ha: The probability distributions of the ratings differ

α = .05, two-tailed test

Test statistic: T = Smaller of T+ and T-

Rejection region: T ≤ 8 (see Table XIII in Appendix A)

T- = 9 > 8

Do not reject H0
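The signed rank sums for the ten judges can be recomputed from the raw ratings: take the differences A - B, rank their absolute values (tied values share the average rank), and sum the ranks separately by sign. The sketch below assumes zero differences would be dropped (none occur in this data); the function name `signed_rank_sums` is illustrative.

```python
def signed_rank_sums(pairs):
    """Return (T_plus, T_minus) for a list of (a, b) paired measurements."""
    # Differences a - b; zero differences are dropped (an assumption; none occur here).
    diffs = [a - b for a, b in pairs if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def midrank(value):
        # Average of the 1-based positions occupied by this absolute difference.
        first = abs_sorted.index(value) + 1
        count = abs_sorted.count(value)
        return first + (count - 1) / 2

    t_plus = sum(midrank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(midrank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus

# (Product A, Product B) ratings for judges 1..10, from the table.
ratings = [(6, 4), (8, 5), (4, 5), (9, 8), (4, 1),
           (7, 9), (6, 2), (5, 3), (6, 7), (8, 2)]
t_plus, t_minus = signed_rank_sums(ratings)
t = min(t_plus, t_minus)
print(t_plus, t_minus, t <= 8)  # 46.0 9.0 False
```

The final value is False because T = 9 falls outside the rejection region T ≤ 8, matching the conclusion not to reject H0.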

14.5: Comparing Three or More Populations: Completely Randomized Design


Kruskal-Wallis H-test

Compares probability distributions for k populations or treatments

No assumption about the distributions

H0: The k probability distributions are identical

Ha: At least two of the k probability distributions differ



Kruskal-Wallis H-test: conditions

The k samples are random and independent.

For each sample, nj ≥ 5.

The k probability distributions are continuous.

Kruskal-Wallis H-test

Test statistic:

$H = \dfrac{12}{n(n+1)} \sum \dfrac{R_j^2}{n_j} - 3(n+1)$

n = total sample size

nj = measurements in sample j

Rj = rank sum of sample j

Kruskal-Wallis H-test

H = 0: All samples have the same mean rank

Large H: Larger differences between sample mean ranks

If H0 is true, H ~ $\chi^2$, with df = (k - 1)

Reject H0 if H > $\chi^2_{\alpha}$

A study of three populations yielded the following:

Population   A        B         C
nj           15       15        15
Rj           235      439       361
Rj²          55,225   192,721   130,321

H0: The k probability distributions are identical

Ha: At least two of the k probability distributions differ

α = .05

df = 3 - 1 = 2

$\chi^2_{.05} = 5.99147$

$H = \dfrac{12}{n(n+1)} \sum \dfrac{R_j^2}{n_j} - 3(n+1)$

$H = \dfrac{12}{45(45+1)} \left( \dfrac{55{,}225}{15} + \dfrac{192{,}721}{15} + \dfrac{130{,}321}{15} \right) - 3(45+1)$

$H = 8.19$

Since H = 8.19 > $\chi^2_{.05}$ = 5.99147, reject H0
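The value H = 8.19 follows from H = 12/(n(n + 1)) · Σ Rj²/nj - 3(n + 1) applied to the three rank sums. The sketch below recomputes it; the function name `kruskal_wallis_h` is illustrative. It also uses the fact that, for df = 2 only, the chi-square upper-tail area has the closed form exp(-x/2), which gives the critical value and an approximate p-value without a table.

```python
from math import exp, log

def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H from per-sample rank sums R_j and sample sizes n_j."""
    n = sum(sizes)
    total = sum(r ** 2 / nj for r, nj in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

h = kruskal_wallis_h([235, 439, 361], [15, 15, 15])

# Valid for df = 2 only: P(chi-square > x) = exp(-x / 2).
critical = -2 * log(0.05)   # upper .05 point of chi-square with 2 df
p_value = exp(-h / 2)       # approximate p-value under H0

print(f"H = {h:.2f}, critical value = {critical:.4f}, reject H0: {h > critical}")
```

For other degrees of freedom the closed form does not apply, and a chi-square table or a statistics library would be needed instead.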

14.6: Comparing Three or More Populations: Randomized Block Design


Friedman Fr-statistic

H0: The p probability distributions are identical

Ha: At least two of the p probability distributions differ in location

Test statistic:

$F_r = \dfrac{12}{bk(k+1)} \sum R_j^2 - 3b(k+1)$

b = number of blocks (> 5)

k = number of treatments (> 5); the hypotheses call this number p

Rj = rank sum of treatment j

Friedman Fr-statistic: conditions

Treatments are randomly assigned to experimental units within the blocks.

Measurements can be ranked within blocks.

The p probability distributions from which the samples within each block are drawn are continuous.

Fr ~ $\chi^2$ with k - 1 degrees of freedom.

A study of four treatments and six blocks yielded the following:

Population   A     B     C     D
Rj           11    21    21    7
Rj²          121   441   441   49

H0: The probability distributions for the p treatments are identical

Ha: At least two of the p probability distributions differ in location

α = .05

df = 4 - 1 = 3

$\chi^2_{.05} = 7.81473$

$F_r = \dfrac{12}{bk(k+1)} \sum R_j^2 - 3b(k+1)$

$F_r = \dfrac{12}{(6)(4)(4+1)} (121 + 441 + 441 + 49) - 3(6)(4+1)$

$F_r = 15.2$

Since Fr = 15.2 > $\chi^2_{.05}$ = 7.81473, reject H0
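The Friedman statistic for this example, Fr = 12/(bk(k + 1)) · Σ Rj² - 3b(k + 1) with b = 6 blocks and k = 4 treatments, can be verified with a few lines. The function name `friedman_fr` is illustrative; the critical value is the tabled one quoted in the example.

```python
def friedman_fr(rank_sums, b):
    """Friedman F_r from treatment rank sums R_j and the number of blocks b."""
    k = len(rank_sums)  # number of treatments
    return 12 / (b * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * b * (k + 1)

fr = friedman_fr([11, 21, 21, 7], b=6)
critical = 7.81473  # chi-square upper .05 point with k - 1 = 3 df, from the example
print(f"Fr = {fr:.1f}, reject H0: {fr > critical}")  # Fr = 15.2, reject H0: True
```

A quick consistency check on the inputs: the rank sums must total bk(k + 1)/2, here 6(4)(5)/2 = 60 = 11 + 21 + 21 + 7.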

14.7: Rank Correlation

Spearman’s Rank Correlation Coefficient


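$r_s = \dfrac{SS_{uv}}{\sqrt{SS_{uu}\,SS_{vv}}}$

where

$SS_{uv} = \sum (u_i - \bar{u})(v_i - \bar{v}) = \sum u_i v_i - \dfrac{\left(\sum u_i\right)\left(\sum v_i\right)}{n}$

$SS_{uu} = \sum (u_i - \bar{u})^2 = \sum u_i^2 - \dfrac{\left(\sum u_i\right)^2}{n}$

$SS_{vv} = \sum (v_i - \bar{v})^2 = \sum v_i^2 - \dfrac{\left(\sum v_i\right)^2}{n}$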

where

ui = Rank of the ith observation in sample 1

vi = Rank of the ith observation in sample 2

n = Number of pairs of observations

Shortcut Formula for rs*

$r_s = 1 - \dfrac{6 \sum d_i^2}{n(n^2 - 1)}$

where

$d_i = u_i - v_i$

* A good approximation when there are few ties relative to n

rs = -1: Perfect negative correlation

rs = 0: No correlation

rs = +1: Perfect positive correlation

Spearman’s Nonparametric Test for Rank Correlation

One-Tailed Test

H0: ρ = 0

Ha: ρ > 0 (or Ha: ρ < 0)

Rejection region: rs > rs,α (or rs < -rs,α when Ha: ρ < 0)

Two-Tailed Test

H0: ρ = 0

Ha: ρ ≠ 0

Rejection region: |rs| > rs,α/2

Test Statistic: rs

Conditions

1. The sample of experimental units on which the two variables are measured must be randomly selected, and

2. The probability distributions of the two variables must be continuous.



Preseason Predictions for 2007 ACC Atlantic Division Football

Team             Predictor A   Predictor B
Boston College   1             5
Florida State    2             1
Wake Forest      3             2
Clemson          4             3
Maryland         5             6
N.C. State       6             4


$SS_{uv} = 5.5$

$SS_{uu} = 17.5$

$SS_{vv} = 17.5$

$r_s = \dfrac{SS_{uv}}{\sqrt{SS_{uu}\,SS_{vv}}} = \dfrac{5.5}{\sqrt{(17.5)(17.5)}} = .314$


H0: ρ = 0, tested against the alternative Ha

rs = .314

From Table XIV, with n = 6, rs,α = rs,.05 = .829. Since rs = .314 < .829, H0 is not rejected.
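The coefficient rs = .314 for the two predictors can be recomputed from the ranks in the table, both through the sums of squares and through the shortcut formula rs = 1 - 6Σd²/(n(n² - 1)). The function names below are illustrative; because the rankings contain no ties, the two routes should agree.

```python
from math import sqrt

def spearman_rs(u, v):
    """Spearman's r_s from two lists of ranks via SS_uv / sqrt(SS_uu * SS_vv)."""
    n = len(u)
    ss_uv = sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / n
    ss_uu = sum(a * a for a in u) - sum(u) ** 2 / n
    ss_vv = sum(b * b for b in v) - sum(v) ** 2 / n
    return ss_uv / sqrt(ss_uu * ss_vv)

def spearman_rs_shortcut(u, v):
    """Shortcut formula using the rank differences d_i = u_i - v_i."""
    n = len(u)
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Ranks from the table, in team order: Boston College ... N.C. State.
predictor_a = [1, 2, 3, 4, 5, 6]
predictor_b = [5, 1, 2, 3, 6, 4]
print(round(spearman_rs(predictor_a, predictor_b), 3))           # 0.314
print(round(spearman_rs_shortcut(predictor_a, predictor_b), 3))  # 0.314
```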
