nonparametric statistical techniques
1
Nonparametric Statistical Techniques
Chapter 17
2
The statistical techniques introduced in this chapter deal with ordinal data.
We test to determine whether the population locations differ.
In testing the locations we will not refer to any parameter, thus the procedure’s name.
17.1 Introduction
3
When comparing two populations the hypotheses generally are:
H0: The population locations are the same.
H1: (i) The locations differ, or (ii) population 1 is located to the right (left) of population 2.
In case (ii), the random variable X1 is generally larger (smaller) than X2.
4
17.2 Wilcoxon Rank Sum Test
The problem characteristics of this test are:
• The problem objective is to compare two populations.
• The data are either ordinal or interval (but not normal).
• The samples are independent.
5
Wilcoxon Rank Sum Test – Example
Example 17.1
Based on the two samples shown below, can we infer at the 5% significance level that the location of population 1 is to the left of the location of population 2?
Sample 1: 22, 23, 20
Sample 2: 18, 27, 26
The hypotheses are:
H0: The two population locations are the same.
H1: The location of population 1 is to the left of the location of population 2.
6
Graphical Demonstration: Why use the sum of ranks to test locations?
Two hypothetical populations and their corresponding samples are presented, the GREEN population and the PURPLE population.
Let us rank the observations of the two samples together.
[Figure: the two populations and the 12 pooled observations, ranked 1 to 12. Sum of ranks = 37 for one sample and 41 for the other.]
If the locations of the two populations are about the same (the null hypothesis is true), we would expect the ranks to be evenly spread between the samples.
In this case the sums of ranks for the two samples will be close to one another.
7
Graphical Demonstration: Why use the sum of ranks to test locations?
Allow the GREEN population to shift to the left of the PURPLE population.
8
Graphical Demonstration: Why use the sum of ranks to test locations?
The green sample is expected to shift to the left too. As a result, several observations exchange location.
What happens to the sum of ranks?
[Figure: as observations exchange location, the two rank sums change from 37 and 41 to 38 and 40, and then to 45 and 33.]
9
Graphical Demonstration: Why use the sum of ranks to test locations?
The "green" sum decreases, and the "purple" sum increases.
Changing the relative location of the two populations changes the rank sum of each sample.
10
Example 17.1 – continued
Test statistic
1. Rank all six observations (1 for the smallest).
           Observation (rank)
Sample 1   22 (3)   23 (4)   20 (2)
Sample 2   18 (1)   27 (6)   26 (5)
2. Calculate the sum of ranks for each sample: 3 + 4 + 2 = 9 for sample 1, and 1 + 6 + 5 = 12 for sample 2.
3. Let T = 9 be the test statistic. (We arbitrarily define the test statistic as the rank sum of sample 1.)
Wilcoxon Rank Sum Test – Example
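The ranking steps of Example 17.1 can be sketched in a few lines of Python. This is only an illustrative sketch: the helper name `rank_sums` is hypothetical (it is not part of the original slides), and tied values are given the average of the ranks they occupy, the convention used later in this chapter.

```python
def rank_sums(sample1, sample2):
    """Rank both samples together and return the rank sums (T1, T2).

    Tied observations share the average of the ranks they occupy.
    """
    pooled = sorted(sample1 + sample2)
    avg_rank = {}
    for value in set(pooled):
        first = pooled.index(value) + 1      # 1-based rank of the first copy
        count = pooled.count(value)          # number of tied copies
        avg_rank[value] = first + (count - 1) / 2
    t1 = sum(avg_rank[x] for x in sample1)
    t2 = sum(avg_rank[x] for x in sample2)
    return t1, t2


# Data of Example 17.1
t1, t2 = rank_sums([22, 23, 20], [18, 27, 26])
print(t1, t2)  # 9.0 12.0
```

With T defined as the rank sum of sample 1, the sketch reproduces T = 9.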
11
Example 17.1 - continued
If T is sufficiently small, then most of the smaller observations come from population 1, which supports H1. In that case we reject the null hypothesis.
Question: How small is sufficiently small? We need to look at the distribution of T.
Wilcoxon Rank Sum Test – Rationale
12
The distribution of T under H0 for two samples of size 3
T is the rank sum of a sample of size 3. Each set of three ranks the sample can receive is listed next to the resulting value of T.
T     Rank sets                   P(T)
6     1,2,3                       .05
7     1,2,4                       .05
8     1,2,5   1,3,4               .10
9     1,2,6   1,3,5   2,3,4       .15
10    1,3,6   1,4,5   2,3,5       .15
11    1,4,6   2,3,6   2,4,5       .15
12    1,5,6   2,4,6   3,4,5       .15
13    2,5,6   3,4,6               .10
14    3,5,6                       .05
15    4,5,6                       .05
For example, a sample that received the ranks 1, 2, 3 has T = 6, and a sample that received the ranks 3, 4, 5 has T = 12.
If H0 is true (the two populations have the same location), each of the 20 possible rank sets is equally likely, with probability 1/20.
13
The distribution of T under H0 for two samples of size 3
[Figure: the distribution of T, with T running from 6 to 15 and probabilities of .05, .10, or .15.]
The significance level is 5%, and under H0 P(T ≤ 6) = .05. Thus, the critical value of T is 6.
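The distribution of T under H0 can be checked by brute-force enumeration. The sketch below assumes no ties and uses a hypothetical helper name, `rank_sum_distribution`. It lists every set of ranks that sample 1 could receive and counts how often each rank sum occurs.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def rank_sum_distribution(n1, n2):
    """Exact distribution of T (rank sum of sample 1) under H0, no ties."""
    ranks = range(1, n1 + n2 + 1)
    # Every set of n1 ranks is equally likely when H0 is true.
    counts = Counter(sum(combo) for combo in combinations(ranks, n1))
    total = sum(counts.values())
    return {t: Fraction(c, total) for t, c in sorted(counts.items())}


dist = rank_sum_distribution(3, 3)
for t, p in dist.items():
    print(t, float(p))

# Lower-tail probability used to find the critical value at the 5% level
lower_tail = sum(p for t, p in dist.items() if t <= 6)
print(float(lower_tail))  # 0.05
```

For two samples of size 3 there are 20 equally likely rank sets, and only one of them (1, 2, 3) gives T = 6, which is why P(T ≤ 6) = 1/20 = .05.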
14
• Example 17.1 - continued
Conclusion
H0 is rejected if T ≤ 6. Since T = 9, there is insufficient evidence to conclude that population 1 is located to the left of population 2 at the 5% significance level.
Wilcoxon Rank Sum Test – Example
15
Critical values of the Wilcoxon Rank Sum Test
α = .025 for a one-tail test, or α = .05 for a two-tail test
          n1 = 3      n1 = 4      n1 = 5     . . .    n1 = 10
n2        TL   TU     TL   TU     TL   TU             TL   TU
4          6   18     11   25     17   33    . . .    61   89
5          6   21     12   28     18   37    . . .    64   96
. . .
10         9   33     16   44     24   56    . . .    79  131
Using the table: for two given samples of sizes n1 and n2, P(T ≤ TL) = P(T ≥ TU) ≈ α, where α is the one-tail level.
For a one-tail test with n1 = 4 and n2 = 4: P(T ≤ 11) = P(T ≥ 25) ≈ .025.
For a two-tail test with n1 = 4 and n2 = 4: rejecting H0 when T ≤ 11 or T ≥ 25 gives α ≈ .05.
A similar table exists for α = .05 (one-tail test) and α = .10 (two-tail test).
16
Wilcoxon rank sum test for samples where n > 10
• The test statistic is approximately normally distributed with the following parameters:
$E(T) = \dfrac{n_1(n_1 + n_2 + 1)}{2}$
$\sigma_T = \sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$
Therefore,
$z = \dfrac{T - E(T)}{\sigma_T}$
17
• Example 17.2 (using Wilcoxon rank sum test with ordinal data)
A pharmaceutical company is planning to introduce a new painkiller.
To determine the effectiveness of the drug, 30 people were randomly selected.
15 were given the tested drug (Sample 1). 15 were given aspirin (Sample 2).
Each participant was asked to indicate which one of five statements best represented the effectiveness of the drug they took.
Wilcoxon rank sum test for samples where n > 10, Example
18
Example 17.2 – continued
Summary of the experiment results:
The drug taken was…         Painkiller   Aspirin
extremely effective (5)          6           1
quite effective (4)              3           5
somewhat effective (3)           4           3
slightly effective (2)           1           4
not at all effective (1)         1           2
Solution
• The objective is to compare two populations of ordinal data.
• The two samples are independent.
• The Wilcoxon rank sum test is the appropriate technique to apply.
Wilcoxon test for samples where n > 10, Example
19
The hypotheses
H0: The locations of populations 1 and 2 are the same.
H1: The location of population 1 is to the right of the location of population 2.
Here population 1 received the new painkiller and population 2 received aspirin.
Note: A high score selected from among the five possible scores 1, 2, 3, 4, 5 indicates high effectiveness.
Wilcoxon rank sum test for samples where n > 10, Example
Solving by hand
To reject the null hypothesis, we need to show that z is "large enough". First we rank the observations; second, we run a z-test with rejection region $z > z_\alpha$.
20
Ranking the raw data
These are the effectiveness scores provided by the experiment participants for each drug, with their ranks:
Painkiller   Rank     Aspirin   Rank
    1          2         1        2
    2          6         1        2
    3         12         2        6
    3         12         2        6
    3         12         2        6
    3         12         2        6
    4         19.5       3       12
    4         19.5       3       12
    4         19.5       3       12
    5         27         4       19.5
    5         27         4       19.5
    5         27         4       19.5
    5         27         4       19.5
    5         27         4       19.5
    5         27         5       27
There are three observations with an effectiveness score of 1. The original ranks for these observations are 1, 2, and 3. This tie is broken by giving each observation the average rank of 2.
Sum of ranks: T1 = 276.5, T2 = 188.5
Wilcoxon rank sum test for samples where n > 10, Example
21
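To standardize the test statistic we need:
$E(T) = \dfrac{n_1(n_1+n_2+1)}{2} = \dfrac{(15)(31)}{2} = 232.5$
$\sigma_T = \sqrt{\dfrac{n_1 n_2 (n_1+n_2+1)}{12}} = \sqrt{\dfrac{(15)(15)(31)}{12}} = 24.1$
$z = \dfrac{T - E(T)}{\sigma_T} = \dfrac{276.5 - 232.5}{24.1} = 1.83$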
Wilcoxon rank sum test for samples where n > 10, Example
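The hand calculation for Example 17.2 can be reproduced as follows. This is a sketch under stated assumptions: the raw scores are rebuilt from the frequency table of the experiment results, the helper name `average_ranks` is hypothetical, and, as on the slides, no tie correction is applied to the standard deviation.

```python
import math


def average_ranks(values):
    """Map each distinct value to the average of the ranks it occupies."""
    ordered = sorted(values)
    ranks = {}
    for v in set(ordered):
        first = ordered.index(v) + 1
        count = ordered.count(v)
        ranks[v] = first + (count - 1) / 2
    return ranks


# Scores rebuilt from the summary table (score 5 = extremely effective)
painkiller = [5] * 6 + [4] * 3 + [3] * 4 + [2] * 1 + [1] * 1
aspirin = [5] * 1 + [4] * 5 + [3] * 3 + [2] * 4 + [1] * 2

ranks = average_ranks(painkiller + aspirin)
t1 = sum(ranks[x] for x in painkiller)   # rank sum of sample 1
t2 = sum(ranks[x] for x in aspirin)      # rank sum of sample 2

n1, n2 = len(painkiller), len(aspirin)
expected = n1 * (n1 + n2 + 1) / 2
sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (t1 - expected) / sigma

print(t1, t2, expected, round(sigma, 1), round(z, 2))
```

The rank sums, E(T), and z agree with the values worked out by hand (276.5, 188.5, 232.5, and about 1.83).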
22
For a 5% significance level, $z_\alpha = z_{.05} = 1.645$. Since z = 1.83 > 1.645, there is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. At the 5% significance level, the new drug is perceived as more effective than aspirin.
Wilcoxon rank sum test for samples where n > 10, Example
23
Excel solution (Xm17-02)
Wilcoxon rank sum test for samples where n > 10, Example
Wilcoxon Rank Sum Test
             Rank sum   Observations
New           276.5         15
Aspirin       188.5         15
z Stat                      1.83
P(Z<=z) one-tail            0.034
z Critical one-tail         1.6449
P(Z<=z) two-tail            0.068
z Critical two-tail         1.96
24
Wilcoxon rank sum test for non-normal interval data, Example
The human resource manager of a large company wanted to compare how long business and non-business graduates worked for the company before quitting.
Two samples of 25 business graduates and 20 non-business graduates were randomly selected.
The data representing their time with the company were recorded.
• Retaining Workers
25
Business   Non-Bus
   60         25
   11         60
   18         22
   19         24
    5         23
   25         36
    .          .
    .          .
    .          .
Can the personnel manager conclude at 5% significance level that a difference in duration of employment exists between business and non-business graduates?
• Retaining workers - continued
Wilcoxon rank sum test for non-normal interval data, Example
26
Solution
• The problem objective is to compare two populations of interval data.
• The samples are independent.
• The non-normality of the two populations is apparent from the sample histograms.
[Figure: histograms of time with the company for non-business graduates and for business graduates.]
Wilcoxon rank sum test for non-normal interval data, Example
27
Solution – continued
The Wilcoxon rank sum test is the correct procedure to run.
H0: The two population locations are the same.
H1: The location of population 1 (business graduates) is different from the location of population 2 (non-business graduates).
Wilcoxon rank sum test for non-normal interval data, Example
28
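Solution – continued
Solving by hand
The rejection region is $|z| > z_{\alpha/2} = z_{.025} = 1.96$.
After the ranking process is completed, we have:
T = T(business graduates) = 463
$E(T) = \dfrac{n_1(n_1+n_2+1)}{2} = \dfrac{(25)(46)}{2} = 575$
$\sigma_T = \sqrt{\dfrac{n_1 n_2 (n_1+n_2+1)}{12}} = 43.8$
$z = \dfrac{T - E(T)}{\sigma_T} = \dfrac{463 - 575}{43.8} = -2.56$
Since |z| = 2.56 > 1.96, reject the null hypothesis.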
Wilcoxon rank sum test for non-normal interval data, Example
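As a rough check of the two-tail test above, the normal approximation can be coded directly from the formulas for E(T) and the standard deviation of T. The function names `rank_sum_z` and `normal_cdf` are illustrative assumptions. The standard normal CDF is computed with `math.erfc` from the Python standard library, and no tie correction is applied.

```python
import math


def rank_sum_z(t, n1, n2):
    """z statistic for the rank sum T of sample 1 (normal approximation)."""
    expected = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (t - expected) / sigma


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


# Retaining workers: T = 463 for the 25 business graduates, 20 non-business
z = rank_sum_z(463, 25, 20)
p_two_tail = 2 * normal_cdf(-abs(z))

print(round(z, 2))            # -2.56
print(abs(z) > 1.96)          # True, so H0 is rejected at the 5% level
print(p_two_tail)
```

The computed two-tail p-value is close to .01. Small differences from printed software output can arise from rounding and from tie handling.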
29
Excel solution (Workers.xls)
There is strong evidence to infer that the duration of employment is different for business and non-business graduates.
Wilcoxon rank sum test for non-normal interval data, Example
Wilcoxon Rank Sum Test
             Rank Sum   Observations
Business       463          25
Non-Bus        572          20
z Stat                      -2.56
P(Z<=z) one-tail            0.0053
z Critical one-tail         1.6449
P(Z<=z) two-tail            0.0106
z Critical two-tail         1.96
30
Required conditions for nonparametric tests
A rejection of the null hypothesis when performing a nonparametric test can occur due to:
• different location,
• different spread (variance),
• different shape (distribution).
Since we are interested in the location, we require that the two distributions be identical except for location.
31
17.3 Sign Test and Wilcoxon Signed Rank Sum Test
Two techniques for matched pairs experiments are introduced. In both:
• The objective is to compare two populations.
• The data are either ordinal or interval (but not normal).
• The samples are matched by pairs.
32
The Sign Test
This test is employed when:
• The problem objective is to compare two populations, and
• The data are ordinal, and
• The experimental design is matched pairs.
The hypotheses
H0: The two population locations are the same.
H1: The two population locations differ, or population 1 is located to the right (left) of population 2.
33
The Sign Test –Statistic and Sampling Distribution
A matched pairs experiment calls for a test of the matched pair differences.
The test statistic and sampling distribution
• Record the sign of all the matched pair differences.
• The number of positive (or negative) differences is the test statistic.
34
The number of positive (or negative) differences is binomial, with:
• n = the number of non-zero differences
• p = the probability that a difference is positive (negative)
If the two populations have the same location (H0 is true), it is expected that
Number of positive differences = Number of negative differences.
Thus, under H0: p = 0.5.
The Sign Test - Rationale
35
The test statistic and sampling distribution
The hypotheses:
H0: The two population locations are the same.
H1: The two population locations are different.
In terms of p, these become:
H0: p = .5
H1: p ≠ .5
The Sign Test - Rationale
36
The Test – continued
The hypotheses tested:
H0: p = .5
H1: p ≠ .5
The binomial variable can be approximated by a normal variable if np and n(1 − p) are at least 5. The z-statistic becomes
$z = \dfrac{x - np}{\sqrt{np(1-p)}} = \dfrac{x - .5n}{\sqrt{n(.5)(.5)}} = \dfrac{x - .5n}{.5\sqrt{n}}$, where $n \ge 10$.
The Sign Test –Statistic and Sampling Distribution
37
Example 17.3 (Xm17-03)
In an experiment to determine which car is perceived to have the more comfortable ride, 25 people took two rides:
• One ride in a European model.
• One ride in a North American car.
Each person rated both cars on a scale of 1 (ride is very uncomfortable) to 5 (ride is very comfortable).
The Sign Test – Example
38
Respondent   European   American
    1            4          5
    2            2          1
    3            5          4
    4            3          2
    5            2          1
    6            5          3
    7            1          3
    8            4          2
    9            4          2
    .            .          .
Do these data allow us to conclude at 5% significance level that the European car is perceived to be more comfortable?
The Sign Test – Example
39
• Solution
• We compare two populations.
• The data are ordinal.
• The experiment is a matched pairs experiment.
Respondent   European   American   Difference
    1            4          5          -1
    2            2          1           1
    3            5          4           1
    4            3          2           1
    5            2          1           1
    6            5          3           2
    7            1          3          -2
    8            4          2           2
    9            4          2           2
    .            .          .           .
The Sign Test – Example
40
• Solution
– The hypotheses are:
H0: The two population locations are the same.
H1: The European car population is located to the right of the American car population.
– The test.
• There were 18 positive, 5 negative, and 2 zero differences. Thus, x = 18 and n = 23 (zero differences are excluded).
• $z = \dfrac{x - np}{\sqrt{np(1-p)}} = \dfrac{18 - .5(23)}{.5\sqrt{23}} = 2.71$
• The rejection region is $z > z_\alpha$. For α = .05 we have z > 1.645.
• The p-value = P(Z > 2.71) = .0034
The Sign Test – Example
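The sign test calculation can be sketched in Python as below. The function names are hypothetical. The normal tail is computed with `math.erfc`, and the exact binomial tail is included only as an extra cross-check that the slides themselves do not show.

```python
import math


def sign_test_z(positives, negatives):
    """z statistic of the sign test; zero differences are left out of n."""
    n = positives + negatives
    return (positives - 0.5 * n) / (0.5 * math.sqrt(n))


def upper_tail_normal(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def upper_tail_binomial(positives, n):
    """Exact P(X >= positives) when X ~ Binomial(n, 0.5)."""
    return sum(math.comb(n, k) for k in range(positives, n + 1)) / 2 ** n


# Example 17.3: 18 positive and 5 negative differences (2 zeros dropped)
z = sign_test_z(18, 5)
p_normal = upper_tail_normal(z)
p_exact = upper_tail_binomial(18, 23)

print(round(z, 2))         # 2.71
print(round(p_normal, 4))  # 0.0034
print(p_exact)
```

Both tail probabilities are well below .05, so the two approaches lead to the same decision for these data.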
41
Using the computer: Tools > Data Analysis Plus > Sign Test
• Excel – Solution (Xm17-03)
The Sign Test – Example
Sign Test
Difference: European - American
Positive Differences      18
Negative Differences       5
Zero Differences           2
z Stat                     2.71
P(Z<=z) one-tail           0.0034
z Critical one-tail        1.6449
P(Z<=z) two-tail           0.0068
z Critical two-tail        1.96
42
Conclusion: Since the p-value = .0034 < α = .05, we reject the null hypothesis. At the 5% significance level there is sufficient evidence to infer that the European car is perceived as more comfortable than the American car.
The Sign Test – Example
43
Checking the required conditions
Observe the sample histograms (Xm17-03).
[Figure: histograms of the ratings (1 to 5) for European cars and for American cars.]
The populations are similar in shape and spread.
The Sign Test – Example
44
This test is used when
• the problem objective is to compare two populations,
• the data are interval but not normal,
• the samples are matched pairs.
The test statistic and sampling distribution
• T is based on the rank sums of the absolute values of the positive and negative differences.
• When n <= 30, reject H0 if T > TU or T < TL (TL and TU are tabulated values related to n).
• When n > 30, T is approximately normally distributed. Use a z-test.
Wilcoxon Signed Rank Sum Test
45
Example 17.4
Does a “flextime” work schedule help reduce the travel time of workers to work?
A random sample of 32 workers was selected, and workers recorded their travel time before and after the program was implemented.
The hypotheses tested are:
H0: The two population locations are the same.
H1: The two population locations are different.
Wilcoxon Signed Rank Sum Test,Example
46
Example 17.4 - continued
The rejection region: $|z| > z_{\alpha/2}$
Wilcoxon Signed Rank Sum Test, Example
47
Worker   8:00-Arr   Flextime   Difference   ABS(Diff.)   Rank
   3        43         44          -1            1         4.5
   5        16         15           1            1         4.5
   8        38         39          -1            1         4.5
  12        13         12           1            1         4.5
  16        18         19          -1            1         4.5
  23        19         18           1            1         4.5
  27        51         50           1            1         4.5
  30        20         19           1            1         4.5
   4        46         44           2            2        13
   6        26         28          -2            2        13
   9        61         63          -2            2        13
  10        52         54          -2            2        13
  13        69         71          -2            2        13
  15        53         55          -2            2        13
  18        25         23           2            2        13
  28        40         38           2            2        13
  31        19         21          -2            2        13
   1        34         31           3            3        21
  11        68         65           3            3        21
  17        41         38           3            3        21
The data were sorted by the absolute value of the differences.
Ties were broken by assigning the average rank to the tied observations. The eight differences with absolute value 1 occupy ranks 1 through 8, so:
Average rank = (1 + 8)/2 = 4.5
48
T is the rank sum of the positive differences: T = T+ = 367.5
$E(T) = \dfrac{n(n+1)}{4} = \dfrac{32(33)}{4} = 264$
$\sigma_T = \sqrt{\dfrac{n(n+1)(2n+1)}{24}} = 53.48$
The test statistic is:
$z = \dfrac{T - E(T)}{\sigma_T} = \dfrac{367.5 - 264}{53.48} = 1.94$
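A minimal sketch of the signed rank sum calculation follows. The helper names are hypothetical, and the short `demo` list is made-up illustration data, because only part of the 32 observations is listed in the table. The z statistic is then checked against the values T+ = 367.5 and n = 32 reported for this example.

```python
import math


def signed_rank_sums(differences):
    """Return (T+, T-): rank sums of the positive and negative differences.

    Zero differences are dropped; tied absolute values get average ranks.
    """
    nonzero = [d for d in differences if d != 0]
    ordered = sorted(abs(d) for d in nonzero)

    def avg_rank(a):
        first = ordered.index(a) + 1
        count = ordered.count(a)
        return first + (count - 1) / 2

    t_plus = sum(avg_rank(abs(d)) for d in nonzero if d > 0)
    t_minus = sum(avg_rank(abs(d)) for d in nonzero if d < 0)
    return t_plus, t_minus


def signed_rank_z(t, n):
    """Normal approximation for the signed rank sum T with n differences."""
    expected = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - expected) / sigma


demo = [-1, 1, 2, -2, 3]            # hypothetical differences
print(signed_rank_sums(demo))       # (10.0, 5.0)

z = signed_rank_z(367.5, 32)        # values reported for Example 17.4
print(round(z, 2))                  # 1.94
```

T+ and T- always add up to n(n + 1)/2, which gives a quick consistency check on any hand ranking.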
49
Excel – solution (Xm17-04)
Wilcoxon Signed Rank Sum Test,Example
Wilcoxon Signed Rank Sum Test
Difference: 8:00-Arr - Flextime
T+                         367.5
T-                         160.5
Observations (for test)    32
z Stat                     1.94
P(Z<=z) one-tail           0.0265
z Critical one-tail        1.6449
P(Z<=z) two-tail           0.053
z Critical two-tail        1.96
50
The rejection region for α = .05 is $|z| > z_{.025} = 1.96$.
Conclusion: Since |1.94| < 1.96, there is insufficient evidence to infer that the flextime program was effective at the 5% significance level.
Solution – continued
Wilcoxon Signed Rank Sum Test,Example
51
17.4 Kruskal-Wallis Test
The problem characteristics for this test are:
• The problem objective is to compare two or more populations.
• The data are either ordinal or interval but not normal.
• The samples are independent.
The hypotheses are
H0: The locations of all the k populations are the same.
H1: At least two population locations differ.
52
• Rank the data from 1 (smallest) to n (largest).
• Calculate the rank sums T1, T2, …, Tk for all the k samples.
• Calculate the statistic H as follows:
$H = \left[\dfrac{12}{n(n+1)} \sum_{j=1}^{k} \dfrac{T_j^2}{n_j}\right] - 3(n+1)$
Kruskal-Wallis Test Statistic
53
Test Rationale and Rejection region
If all the populations have the same location (H0 is true):
• The ranks should be evenly distributed among the k samples.
• The statistic H will be small.
Uneven distribution of ranks:
Sample 1   Sample 2   Sample 3
   1          4          7
   2          5          8
   3          6          9
T1 = 6, T2 = 15, T3 = 24, H = 7.2
Even distribution of ranks:
Sample 1   Sample 2   Sample 3
   1          2          3
   4          5          6
   9          8          7
T1 = 14, T2 = 15, T3 = 16, H = .0889
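The two H values in this rationale can be verified with a short sketch of the Kruskal-Wallis statistic. The function name `kruskal_wallis_h` is an illustrative assumption, and the function takes rank sums and sample sizes that have already been computed.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = [12 / (n(n+1))] * sum(T_j^2 / n_j) - 3(n+1)."""
    n = sum(sizes)
    weighted = sum(t * t / nj for t, nj in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Ranks received by each of the three samples in the two layouts
uneven = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
even = [[1, 4, 9], [2, 5, 8], [3, 6, 7]]

for layout in (uneven, even):
    sums = [sum(sample) for sample in layout]
    sizes = [len(sample) for sample in layout]
    print(sums, round(kruskal_wallis_h(sums, sizes), 4))
```

The uneven layout gives H = 7.2 while the even layout gives H of about 0.089, which illustrates why large values of H point away from H0.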
54
Sampling distribution
• When the sample sizes are all ≥ 5, H is approximately chi-squared distributed with k − 1 degrees of freedom.
The rejection region
• Since a large value of H justifies the rejection of H0, we have: $H > \chi^2_{\alpha, k-1}$
Test Rationale and Rejection Region
55
Example 17.5
How do customers rate three shifts with respect to speed of service in a certain restaurant?
Three samples of 10 customer response cards were randomly selected, one sample from each shift. Customer ratings were recorded.
The Kruskal-Wallis Test Example
56
4:00-mid   Mid-8:00   8:00-4:00
    4          3          3
    4          4          1
    3          2          3
    4          2          2
    3          3          1
    3          4          3
    3          3          4
    3          3          2
    2          2          4
    3          3          1
Can we conclude that customers perceive the speed of service to be different among the three shifts at 5% significance level?
The Kruskal-Wallis Test Example
57
Solution
• The problem objective is to compare three populations.
• The data are ordinal.
The hypotheses:
H0: The locations of all three populations are the same.
H1: At least two population locations differ.
The Kruskal-Wallis Test Example
58
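Solution - continued
Test statistic:
Ranking (each rating is followed by its rank in parentheses):
4:00-mid     Mid-8:00     8:00-4:00
4 (27)       3 (16.5)     3 (16.5)
4 (27)       4 (27)       1 (2)
3 (16.5)     2 (6.5)      3 (16.5)
4 (27)       2 (6.5)      2 (6.5)
3 (16.5)     3 (16.5)     1 (2)
3 (16.5)     4 (27)       3 (16.5)
3 (16.5)     3 (16.5)     4 (27)
3 (16.5)     3 (16.5)     2 (6.5)
2 (6.5)      2 (6.5)      4 (27)
3 (16.5)     3 (16.5)     1 (2)
T1 = 186.5, T2 = 156.0, T3 = 122.5
n = n1 + n2 + n3 = 10 + 10 + 10 = 30
$H = \left[\dfrac{12}{n(n+1)} \sum_{j=1}^{k} \dfrac{T_j^2}{n_j}\right] - 3(n+1) = \dfrac{12}{30(30+1)}\left[\dfrac{186.5^2}{10} + \dfrac{156.0^2}{10} + \dfrac{122.5^2}{10}\right] - 3(30+1) = 2.64$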
The Kruskal-Wallis Test Example
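The whole Kruskal-Wallis calculation for this example, from raw ratings to H, can be sketched as follows. The helper names are hypothetical, and ties receive average ranks as in the hand solution. The p-value line relies on the fact that a chi-squared variable with 2 degrees of freedom has upper-tail probability exp(-H/2), so it applies only to k = 3 samples.

```python
import math


def pooled_average_ranks(samples):
    """Average rank of each distinct value across all samples pooled."""
    ordered = sorted(x for sample in samples for x in sample)
    return {v: ordered.index(v) + 1 + (ordered.count(v) - 1) / 2
            for v in set(ordered)}


def kruskal_wallis(samples):
    """Return (rank sums, H) for a list of independent samples."""
    ranks = pooled_average_ranks(samples)
    rank_sums = [sum(ranks[x] for x in sample) for sample in samples]
    n = sum(len(sample) for sample in samples)
    weighted = sum(t * t / len(s) for t, s in zip(rank_sums, samples))
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    return rank_sums, h


# Ratings from Example 17.5, one list per shift
shift_4_to_mid = [4, 4, 3, 4, 3, 3, 3, 3, 2, 3]
shift_mid_to_8 = [3, 4, 2, 2, 3, 4, 3, 3, 2, 3]
shift_8_to_4 = [3, 1, 3, 2, 1, 3, 4, 2, 4, 1]

rank_sums, h = kruskal_wallis([shift_4_to_mid, shift_mid_to_8, shift_8_to_4])
p_value = math.exp(-h / 2)   # chi-squared upper tail, valid for df = 2 only

print(rank_sums)             # [186.5, 156.0, 122.5]
print(round(h, 2))           # 2.64
print(round(p_value, 4))
```

Like the hand calculation, this sketch does not apply a correction for ties to H.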
59
Solution - continued
The critical value
For α = .05, $\chi^2_{\alpha, k-1} = \chi^2_{.05, 2} = 5.99147$
The Kruskal-Wallis Test Example
60
The Kruskal-Wallis Test Example
Solution – Excel (Xm17-05)
Kruskal-Wallis Test
Group        Rank Sum   Observations
4:00-mid      186.5         10
Mid-8:00      156           10
8:00-4:00     122.5         10
H Stat                      2.64
df                          2
p-value                     0.2665
chi-squared Critical        5.9915
61
Conclusion: Since H=2.64 < 5.99147, do not reject the null hypothesis. There is insufficient evidence to conclude at 5% significance level, that there is a difference in customers’ perception regarding service speed among the three shifts.
The Kruskal-Wallis Test Example
62
17.5 Friedman Test
The problem characteristics of this test are:
• The problem objective is to compare two or more populations.
• The data are either ordinal or interval but not normal. (For normal populations we use ANOVA.)
• The data are generated from a blocked experiment (samples are not independent).
The hypotheses are
H0: The locations of all the k populations are the same.
H1: At least two population locations differ.
63
Test Statistic and Rejection Region
The test statistic is
$F_r = \left[\dfrac{12}{b\,k(k+1)} \sum_{j=1}^{k} T_j^2\right] - 3b(k+1)$
The rejection region is
$F_r > \chi^2_{\alpha, k-1}$
b = the number of blocks
k = the number of treatments
64
The Friedman Test Example
Example 17.6
Four managers evaluate applicants for a job in an accounting firm on several dimensions.
Eight applicants were randomly selected, and their evaluations by the four managers recorded.
Applicant   Manager 1   Manager 2   Manager 3   Manager 4
    1           2           1           2           2
    2           4           2           3           2
    3           2           2           2           3
    4           3           1           3           2
    5           3           2           3           5
    6           2           2           3           4
    7           4           1           5           5
    8           3           2           5           3
Can we conclude at the 5% significance level that there are differences in the way managers evaluate candidates?
65
Solution
• The problem objective is to compare four populations.
• Data are ordinal.
• This is a randomized block design experiment because each applicant (block) was evaluated four times.
• The appropriate procedure is the Friedman test.
The Friedman Test Example
66
Solution
The hypotheses are
H0: The locations of all four populations are the same.
H1: At least two population locations differ.
The Friedman Test Example
67
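The Friedman Test Example
Evaluation scores, with the within-block ranks in parentheses:
Applicant   Manager 1   Manager 2   Manager 3   Manager 4
    1        2 (3)       1 (1)       2 (3)       2 (3)
    2        4 (4)       2 (1.5)     3 (3)       2 (1.5)
    3        2 (2)       2 (2)       2 (2)       3 (4)
    4        3 (3.5)     1 (1)       3 (3.5)     2 (2)
    5        3 (2.5)     2 (1)       3 (2.5)     5 (4)
    6        2 (1.5)     2 (1.5)     3 (3)       4 (4)
    7        4 (2)       1 (1)       5 (3.5)     5 (3.5)
    8        3 (2.5)     2 (1)       5 (4)       3 (2.5)
T1 = 21, T2 = 10, T3 = 24.5, T4 = 24.5
How to rank, block by block.
Applicant 1:
Scores:           2  1  2  2
Actual ranks:     2  1  3  4
Averaged ranks:   3  1  3  3
The three tied scores of 2 occupy ranks 2, 3, and 4, so each receives the average rank (2 + 3 + 4)/3 = 3.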
68
Solution
In our problem:
b = 8 (number of blocks)
k = 4 (number of treatments, populations)
The Friedman Test Example
$F_r = \left[\dfrac{12}{b\,k(k+1)} \sum_{j=1}^{k} T_j^2\right] - 3b(k+1) = \dfrac{12}{(8)(4)(4+1)}\left(21^2 + 10^2 + 24.5^2 + 24.5^2\right) - 3(8)(4+1) = 10.61$
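The block-by-block ranking and the Friedman statistic can be sketched as below. The names `block_ranks` and `friedman_statistic` are hypothetical. Each applicant (block) is ranked separately, and tied scores within a block share the average rank.

```python
def block_ranks(scores):
    """Rank the scores within one block, averaging the ranks of ties."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 + (ordered.count(s) - 1) / 2 for s in scores]


def friedman_statistic(blocks):
    """Return (rank sums per treatment, F_r) for a list of blocks."""
    b = len(blocks)        # number of blocks
    k = len(blocks[0])     # number of treatments
    rank_sums = [0.0] * k
    for scores in blocks:
        for j, r in enumerate(block_ranks(scores)):
            rank_sums[j] += r
    total = sum(t * t for t in rank_sums)
    fr = 12 / (b * k * (k + 1)) * total - 3 * b * (k + 1)
    return rank_sums, fr


# Example 17.6: one row per applicant, one column per manager
evaluations = [
    [2, 1, 2, 2],
    [4, 2, 3, 2],
    [2, 2, 2, 3],
    [3, 1, 3, 2],
    [3, 2, 3, 5],
    [2, 2, 3, 4],
    [4, 1, 5, 5],
    [3, 2, 5, 3],
]

rank_sums, fr = friedman_statistic(evaluations)
print(block_ranks(evaluations[0]))  # [3.0, 1.0, 3.0, 3.0]
print(rank_sums)                    # [21.0, 10.0, 24.5, 24.5]
print(round(fr, 2))                 # 10.61
```

The rank sums and the statistic match the hand calculation for the four managers.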
69
Solution
We have: $F_r = 10.61$. Let α = .05; then $\chi^2_{.05,\,4-1} = 7.8147$.
The Friedman Test Example
70
The Friedman Test Example
Solution – Excel (Xm17-06)
Friedman Test
Group        Rank Sum
Manager1      21
Manager2      10
Manager3      24.5
Manager4      24.5
Fr Stat                   10.61
df                        3
p-value                   0.0140
chi-squared Critical      7.8147
71
Conclusion: Since Fr =10.61> 7.8147, reject the null hypothesis. There is sufficient evidence to conclude at 5% significance level, that the managers’ evaluations differ.
The Friedman Test Example