Chapter 14: Nonparametric Statistics
Statistics
McClave, Statistics, 11th ed. Chapter 14: Nonparametric Statistics
Where We’ve Been
Presented methods for making inferences about means and correlation
Methods required the data or the sampling distributions to be normally distributed
Where We’re Going
Inferential techniques requiring fewer or less stringent assumptions
Nonparametric tests based on ranks
14.1: Distribution-Free Tests
Applying a test based on normality to non-normal data may lead to P(Type I error) > α and to less than the maximum power of the test (1 - β).
Parametric tests (z, t, F)
Data or sampling distributions are normal
Nonparametric tests (rank-ordered, no assumed distribution)
Data or sampling distributions are skewed, or data are ordinal
Nonparametric statistics (or tests) based on the ranks of measurements are called rank statistics (or rank tests).
14.2: Single-Population Inferences
The sign test provides inferences about population medians, or central tendencies, when skewed data or an outlier would invalidate tests based on normal distributions.
One-tailed test for a population median, η
H0: η = η0
Ha: η > η0 (or Ha: η < η0)
Test statistic:
S = number of sample measurements greater than (less than) η0
Two-tailed test for a population median, η
H0: η = η0
Ha: η ≠ η0
Test statistic:
S = larger of S1 and S2, where S1 is the number of measurements less than η0 and S2 is the number greater than η0
One-tailed test for a population median, η
Observed significance level:
p-value = P(x ≥ S)
Two-tailed test for a population median, η
Observed significance level:
p-value = 2P(x ≥ S)
where x has a binomial distribution with parameters n and p = .5.
Reject H0 if p-value ≤ α.
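The p-value rule above (x binomial with parameters n and p = .5) can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's procedure verbatim: the function name `sign_test_p_value`, the `alternative` argument, the sample values, and the choice to drop observations equal to η0 are all assumptions made for the example.

```python
from math import comb

def sign_test_p_value(sample, eta0, alternative="two-sided"):
    """Exact sign-test statistic S and p-value for H0: median = eta0."""
    # Assumed convention: observations equal to eta0 are dropped.
    above = sum(1 for x in sample if x > eta0)
    below = sum(1 for x in sample if x < eta0)
    n = above + below
    if alternative == "greater":      # Ha: median > eta0
        s = above
    elif alternative == "less":       # Ha: median < eta0
        s = below
    else:                             # two-sided: S = larger of S1 and S2
        s = max(above, below)
    # P(x >= S) for x ~ Binomial(n, 0.5)
    upper_tail = sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n
    if alternative == "two-sided":
        return s, min(1.0, 2 * upper_tail)
    return s, upper_tail

# Hypothetical data: eight measurements, H0: median = 10, Ha: median > 10
data = [12.1, 9.4, 13.8, 15.0, 11.2, 10.9, 14.3, 12.7]
s_stat, p_value = sign_test_p_value(data, eta0=10, alternative="greater")
print(s_stat, round(p_value, 4))
```

With these made-up values, seven of the eight measurements exceed 10, so the p-value is P(x ≥ 7) for n = 8, which would be compared with the chosen α.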
Median time to failure for a brand of compact disc players is 5,250 hours. Twenty players from a competitor are tested, with failure times from 5 hours to 6,575 hours. Fourteen of the players exceed 5,250 hours.
Do the competitor’s machines perform differently?
H0: η = 5,250
Ha: η ≠ 5,250
α = .10
$$z_{\alpha/2} = z_{.05} = 1.645$$
$$z = \frac{(S - .5) - .5n}{.5\sqrt{n}} = \frac{13.5 - 10}{.5\sqrt{20}} = 1.565 < 1.645$$
Do not reject H0
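As a rough check of the large-sample sign test in this example, the z statistic can be recomputed from n = 20 and S = 14. The variable names below are illustrative, and the critical value 1.645 is taken from the slide rather than computed.

```python
from math import sqrt

n = 20               # competitor players tested
s = 14               # players whose failure time exceeds 5,250 hours
z_critical = 1.645   # z_.05 for a two-tailed test at alpha = .10 (from the slide)

# Large-sample sign test statistic with continuity correction
z = ((s - 0.5) - 0.5 * n) / (0.5 * sqrt(n))
reject_h0 = z > z_critical

print(round(z, 3), reject_h0)
```

The computed z falls below the critical value, which agrees with the decision not to reject H0.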
14.3: Comparing Two Populations: Independent Samples
Wilcoxon Rank Sum Test
Used to test whether two independent samples have the same probability distribution.
Samples must be random and independent.
Probability distributions must be continuous.
One-tailed test
H0: D1 and D2 are identical
Ha: D1 is shifted right of D2 or Ha: D1 is shifted left of D2
Test statistic:
T1, if n1 < n2
T2, if n1 > n2
Either if n1 = n2
Rejection region:
T1: T1 ≥ TU (or T1 ≤ TL)
T2: T2 ≤ TL (or T2 ≥ TU)
Wilcoxon Rank Sum Test: Two-tailed test
H0: D1 and D2 are identical
Ha: D1 is shifted either right or left of D2
Test statistic:
T1, if n1 < n2
T2, if n1 > n2
Either if n1 = n2
Rejection region:
T ≤ TL or T ≥ TU
Reaction Times of Subjects Under the Influence of Drug A or B
Rank:   1     2     3     4     5     6     7
Value:  1.62  1.71  1.93  1.96  2.07  2.11  2.24
Rank:   8     9     10    11    12    13
Value:  2.41  2.43  2.50  2.71  2.84  2.88
H0: DA and DB are identical
Ha: DA is shifted right of DB or Ha: DA is shifted left of DB
α = .05
TA = 1 + 2 + 3 + 4 + 7 + 8 = 25
TB = 5 + 6 + 9 + 10 + 11 + 12 + 13 = 66
Test statistic is TA, since nA < nB
TL (α = .05, nA = 6, nB = 7) = 28 > TA = 25
Reject H0
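The rank sums in this example can be reproduced with a short Python sketch. The split of the 13 reaction times between the two drugs is inferred here from the ranks listed in the TA and TB sums, so treat the two lists as an assumption; the helper name `rank_sums` is also illustrative, and it assumes no tied values.

```python
# Reaction times, assigned to drugs by the ranks used in the TA and TB sums (inferred)
drug_a = [1.62, 1.71, 1.93, 1.96, 2.24, 2.41]
drug_b = [2.07, 2.11, 2.43, 2.50, 2.71, 2.84, 2.88]

def rank_sums(sample1, sample2):
    """Rank the pooled observations and return the rank sum of each sample."""
    pooled = sorted(sample1 + sample2)
    # Assumes no ties, so each value has a unique 1-based rank
    rank = {value: position + 1 for position, value in enumerate(pooled)}
    return sum(rank[x] for x in sample1), sum(rank[x] for x in sample2)

t_a, t_b = rank_sums(drug_a, drug_b)
t_lower = 28   # TL for nA = 6, nB = 7, as quoted on the slide
print(t_a, t_b, t_a <= t_lower)
```

Because nA < nB, TA is the test statistic, and it falls at or below TL, matching the rejection of H0.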
Wilcoxon Rank Sum Test for Large Samples
Test statistic:
$$z = \frac{T_1 - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$
One-tailed test
H0: D1 and D2 are identical
Ha: D1 is shifted right of D2 or Ha: D1 is shifted left of D2
Rejection region:
z > zα (or z < -zα)
Two-tailed test
H0: D1 and D2 are identical
Ha: D1 is shifted either right or left of D2
Rejection region:
| z | > zα/2
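The large-sample statistic standardizes T1 by its mean n1(n1 + n2 + 1)/2 and variance n1 n2 (n1 + n2 + 1)/12 under H0. A minimal sketch follows; the function name `wilcoxon_z` and the input values are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt

def wilcoxon_z(t1, n1, n2):
    """Large-sample z statistic for the Wilcoxon rank sum T1."""
    mean_t1 = n1 * (n1 + n2 + 1) / 2             # expected rank sum under H0
    sd_t1 = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation under H0
    return (t1 - mean_t1) / sd_t1

# Hypothetical inputs: T1 = 145 with n1 = 10 and n2 = 12
z = wilcoxon_z(t1=145, n1=10, n2=12)
print(round(z, 3))
```

For a two-tailed test, |z| would then be compared with zα/2.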
14.4: Comparing Two Populations: Paired Difference Experiment
Judge   Product A   Product B   A - B   |A - B|   Rank of |A - B|
1       6           4           2       2         5
2       8           5           3       3         7.5
3       4           5           -1      1         2
4       9           8           1       1         2
5       4           1           3       3         7.5
6       7           9           -2      2         5
7       6           2           4       4         9
8       5           3           2       2         5
9       6           7           -1      1         2
10      8           2           6       6         10
T- = Sum of negative ranks = 9
T+ = Sum of positive ranks = 46
H0: The probability distributions of the ratings for products A and B are identical
Ha: The probability distributions of the ratings differ
α = .05, two-tailed test
Test statistic: T = Smaller of T+ and T-
Rejection region: T ≤ 8 (see Table XIII in Appendix A)
T- = 9 > 8
Do not reject H0
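The signed rank sums for the ten judges can be recomputed as a check. The sketch below uses the ratings from the table; the helper name `signed_rank_sums` is illustrative, tied absolute differences receive the average of the ranks they occupy, and dropping zero differences is an assumed convention (none occur in these data).

```python
product_a = [6, 8, 4, 9, 4, 7, 6, 5, 6, 8]
product_b = [4, 5, 5, 8, 1, 9, 2, 3, 7, 2]

def signed_rank_sums(x, y):
    """Return (T+, T-) for paired samples, averaging ranks over ties."""
    # Assumed convention: zero differences are dropped before ranking
    diffs = [a - b for a, b in zip(x, y) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def average_rank(value):
        first = abs_sorted.index(value) + 1   # lowest rank held by this value
        count = abs_sorted.count(value)       # number of tied observations
        return first + (count - 1) / 2

    t_plus = sum(average_rank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(average_rank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus

t_plus, t_minus = signed_rank_sums(product_a, product_b)
t_stat = min(t_plus, t_minus)
print(t_plus, t_minus, t_stat <= 8)
```

The smaller sum exceeds the tabled value of 8, so the sketch reaches the same decision as the slide.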
14.5: Comparing Three or More Populations: Completely Randomized Design
Kruskal-Wallis H-test
Compares probability distributions for k populations or treatments
No assumption about the distributions
H0: The k probability distributions are identical
Ha: At least two of the k probability distributions differ
Kruskal-Wallis H-test
k samples are random and independent.
For each sample, nj ≥ 5.
The k probability distributions are continuous.
Kruskal-Wallis H-test
Test statistic:
$$H = \frac{12}{n(n+1)} \sum_j \frac{R_j^2}{n_j} - 3(n+1)$$
n = total sample size
nj = measurements in sample j
Rj = rank sum of sample j
Kruskal-Wallis H-test
H = 0: All samples have the same mean rank
Large H: Larger differences between sample mean ranks
If H0 is true, H ~ χ², with df = (k - 1)
Reject H0 if H > χ²α
A study of three populations yielded the following:
Population   A        B         C
nj           15       15        15
Rj           235      439       361
Rj²          55,225   192,721   130,321
H0: The k probability distributions are identical
Ha: At least two of the k probability distributions differ
α = .05
df = 3 - 1 = 2
χ².05 = 5.99147
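$$H = \frac{12}{n(n+1)} \sum_j \frac{R_j^2}{n_j} - 3(n+1)$$
$$H = \frac{12}{45(45+1)} \left( \frac{55{,}225}{15} + \frac{192{,}721}{15} + \frac{130{,}321}{15} \right) - 3(45+1)$$
$$H = 8.19$$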
Since H = 8.19 > χ².05 = 5.99147, reject H0
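The H statistic for the three-population study can be recomputed directly from the rank sums. In this sketch the function name `kruskal_wallis_h` is illustrative, and the critical value is the χ² value quoted on the slide rather than one computed in code.

```python
def kruskal_wallis_h(rank_sums, sample_sizes):
    """Kruskal-Wallis H computed from per-sample rank sums and sample sizes."""
    n = sum(sample_sizes)   # total sample size
    weighted = sum(r ** 2 / nj for r, nj in zip(rank_sums, sample_sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)

h = kruskal_wallis_h(rank_sums=[235, 439, 361], sample_sizes=[15, 15, 15])
chi2_critical = 5.99147   # chi-square value for alpha = .05, df = 2 (from the slide)
print(round(h, 2), h > chi2_critical)
```

The result matches the slide's H = 8.19 and the decision to reject H0.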
14.6: Comparing Three or More Populations: Randomized Block Design
Friedman Fr-statistic
H0: The p probability distributions are identical
Ha: At least two of the p probability distributions differ in location
Test statistic:
$$F_r = \frac{12}{bk(k+1)} \sum_j R_j^2 - 3b(k+1)$$
b = number of blocks (>5)
k = number of treatments (>5)
Rj = rank sum of treatment j
Friedman Fr-statistic
Treatments are randomly assigned to experimental units within the blocks.
Measurements can be ranked within blocks.
The p probability distributions from which the samples within each block are drawn are continuous.
Fr ~ χ² with k - 1 degrees of freedom.
A study of four treatments and six blocks yielded the following:
Treatment   A     B     C     D
Rj          11    21    21    7
Rj²         121   441   441   49
H0: The probability distributions for the p treatments are identical
Ha: At least two of the p probability distributions differ in location
α = .05
df = 4 - 1 = 3
χ².05 = 7.81473
$$F_r = \frac{12}{bk(k+1)} \sum_j R_j^2 - 3b(k+1)$$
$$F_r = \frac{12}{(6)(4)(4+1)} (121 + 441 + 441 + 49) - 3(6)(4+1)$$
$$F_r = 15.2$$
Since Fr = 15.2 > χ².05 = 7.81473, reject H0
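The Friedman statistic for the four-treatment, six-block study can be verified with a few lines of Python. The function name `friedman_fr` is illustrative, and the critical value is the χ² value quoted on the slide.

```python
def friedman_fr(rank_sums, b):
    """Friedman Fr computed from treatment rank sums and the number of blocks b."""
    k = len(rank_sums)   # number of treatments
    sum_sq = sum(r ** 2 for r in rank_sums)
    return 12 / (b * k * (k + 1)) * sum_sq - 3 * b * (k + 1)

fr = friedman_fr(rank_sums=[11, 21, 21, 7], b=6)
chi2_critical = 7.81473   # chi-square value for alpha = .05, df = 3 (from the slide)
print(round(fr, 1), fr > chi2_critical)
```

The sum of squared rank sums is 1,052, which gives Fr = 15.2 and the same rejection decision.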
14.7: Rank Correlation
Spearman’s Rank Correlation Coefficient
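$$r_s = \frac{SS_{uv}}{\sqrt{SS_{uu}\,SS_{vv}}}$$
where
$$SS_{uv} = \sum (u_i - \bar{u})(v_i - \bar{v}) = \sum u_i v_i - \frac{\left(\sum u_i\right)\left(\sum v_i\right)}{n}$$
$$SS_{uu} = \sum (u_i - \bar{u})^2 = \sum u_i^2 - \frac{\left(\sum u_i\right)^2}{n}$$
$$SS_{vv} = \sum (v_i - \bar{v})^2 = \sum v_i^2 - \frac{\left(\sum v_i\right)^2}{n}$$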
where (cont.)
ui = Rank of the ith observation in sample 1
vi = Rank of the ith observation in sample 2
n = Number of pairs of observations
Shortcut Formula for rs*
$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
where
$$d_i = u_i - v_i$$
* A good approximation when there are few ties relative to n
rs = -1: Perfect negative correlation
rs = 0: No correlation
rs = +1: Perfect positive correlation
Spearman's Nonparametric Test for Rank Correlation
One-Tailed Test
H0: ρ = 0
Ha: ρ > 0 (or Ha: ρ < 0)
Rejection region: rs > rs,α (or rs < -rs,α)
Two-Tailed Test
H0: ρ = 0
Ha: ρ ≠ 0
Rejection region: |rs| > rs,α/2
Test statistic: rs
Conditions
1. The sample of experimental units on which the two variables are measured must be randomly selected, and
2. The probability distributions of the two variables must be continuous.
Preseason Predictions for 2007 ACC Atlantic Division Football
Team             Predictor A   Predictor B
Boston College   1             5
Florida State    2             1
Wake Forest      3             2
Clemson          4             3
Maryland         5             6
N.C. State       6             4
$$SS_{uv} = 5.5, \quad SS_{uu} = 17.5, \quad SS_{vv} = 17.5$$
$$r_s = \frac{SS_{uv}}{\sqrt{SS_{uu}\,SS_{vv}}} = \frac{5.5}{\sqrt{(17.5)(17.5)}} = .314$$
H0: ρ = 0
Ha: ρ ≠ 0
rs = .314
From Table XIV, with n = 6, rs,.05 = .829, so H0 is not rejected.
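The rank correlation for the two predictors can be recomputed both from the sums of squares and from the shortcut formula. The sketch below uses the six pairs of ranks from the table; the variable names are illustrative, and the critical value .829 is the tabled value quoted above.

```python
from math import sqrt

# Ranks assigned by each predictor, in team order from the table
u = [1, 2, 3, 4, 5, 6]   # Predictor A
v = [5, 1, 2, 3, 6, 4]   # Predictor B
n = len(u)

# Sums of squares and cross products of the ranks
ss_uv = sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / n
ss_uu = sum(a * a for a in u) - sum(u) ** 2 / n
ss_vv = sum(b * b for b in v) - sum(v) ** 2 / n
r_s = ss_uv / sqrt(ss_uu * ss_vv)

# Shortcut formula; it agrees with r_s here because there are no ties
d_squared = sum((a - b) ** 2 for a, b in zip(u, v))
r_s_shortcut = 1 - 6 * d_squared / (n * (n ** 2 - 1))

r_critical = 0.829   # from Table XIV with n = 6, as quoted on the slide
print(round(r_s, 3), round(r_s_shortcut, 3), abs(r_s) > r_critical)
```

Both routes give rs ≈ .314, which is below the tabled value, so H0 is not rejected.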