Chapter 7 Nonparametric Statistics
7.1 - Introduction
• Parametric tests
– Have a requirement about the distribution of the population
• Nonparametric tests
– Have no such requirement
7.2 – The Sign Test
Purpose: To determine if a set of data consisting of + and − signs has an unusually small number of one sign or the other.
Let
– $n$ = total number of + and − signs in the data
– $x$ = number of times the less frequent sign appears
The test statistic is
$$\text{test statistic} = \begin{cases} x & \text{for } n \le 29 \\[1ex] z = \dfrac{(x + 0.5) - 0.5n}{\sqrt{0.25n}} & \text{for } n > 29 \end{cases}$$
The Sign Test
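Critical value
– $n \le 29$: See Table C.5
– $n > 29$: the critical $z$-value for $\alpha$ (one-tail test) or $\alpha/2$ (two-tail test)
– Reject H0 if test statistic ≤ critical value
P-value
– $P(X \le x)$ for $n \le 29$ or $P(Z \le z)$ for $n > 29$ (one-tail test), where $X$ is binomial with $n$ trials and success probability 0.5 and $Z$ is standard normal
Requirement
1. The sample is random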
Example 7.2.1
The first row of the table below shows the number of children’s books in 11 randomly selected homes with children in a town. Use the data to test the claim that the median number of books in homes with children in this town is greater than 12.
Example 7.2.1
Parameter: $m$ = the median number of books
Hypotheses: H0: $m = 12$, H1: $m > 12$
Test statistic:
– $n = 10$, $x = 1$
– Critical value: 1
P-value:
$$P(X \le 1) = \binom{10}{0} 0.5^{0} (1 - 0.5)^{10} + \binom{10}{1} 0.5^{1} (1 - 0.5)^{9} = 0.0107$$
– Reject H0: The data support the claim
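The exact P-value of the sign test is a lower-tail binomial probability with success probability 0.5, so it can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `sign_test_p_value` is hypothetical, and the inputs $n = 10$ and $x = 1$ are the values used in Example 7.2.1.

```python
from math import comb


def sign_test_p_value(n: int, x: int) -> float:
    """One-tail exact P-value P(X <= x) for X ~ Binomial(n, 0.5).

    n: total number of + and - signs
    x: number of times the less frequent sign appears
    """
    # Each outcome with k occurrences has probability C(n, k) * 0.5**n.
    return sum(comb(n, k) for k in range(x + 1)) * 0.5 ** n


# Values from Example 7.2.1 (n = 10 signs, x = 1).
p = sign_test_p_value(10, 1)
print(round(p, 4))  # 0.0107
```

For a two-tail claim, the usual approach is to double this probability.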
7.3 – The Wilcoxon Signed-Rank Test
Do scores on a writing survey improve from the beginning to the end of the semester?
– Test the claim that the median of the population of the differences of the scores, denoted $m$, is positive
– Hypotheses: H0: $m = 0$, H1: $m > 0$
Test Statistic
Calculate the sum of the positive signed ranks and the sum of the negative signed ranks (the latter taken in absolute value):
$$S^{+} = 8 + 5 + 3 + 6 + 2 + 9 + 7 = 40 \quad \text{and} \quad S^{-} = 4 + 1 = 5$$
Theorem 7.3.1 Assuming the distribution of the differences is symmetric around 0, the random variable $S^{+}$ is approximately normally distributed with mean and variance
$$\mu = \frac{n(n+1)}{4} \quad \text{and} \quad \sigma^{2} = \frac{n(n+1)(2n+1)}{24}$$
The Wilcoxon Signed-Rank Test
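Purpose: To test the null hypothesis H0: $m = 0$ where $m$ is the population median of the differences of a set of paired data.
Let
– $T$ = smaller of $S^{+}$ and $S^{-}$ (the sums of the positive and negative signed ranks, in absolute value)
Test statistic:
$$\text{test statistic} = \begin{cases} T & \text{for } n \le 30 \\[1ex] z = \dfrac{T - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}} & \text{for } n > 30 \end{cases}$$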
The Wilcoxon Signed-Rank Test
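Critical value
– $n \le 30$: See Table C.6
– $n > 30$: the critical $z$-value for $\alpha$ (one-tail test) or $\alpha/2$ (two-tail test)
– Reject H0 if test statistic ≤ critical value
P-value
$$\text{P-value} \approx P\left(Z \le \frac{(T + 0.5) - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}\right)$$
Requirements
1. The sample is random
2. The population of differences has a distribution that is symmetric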
Example
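Test statistic: $T = 5$ (with $n = 9$)
Critical value: 8
P-value:
$$\mu = \frac{9(9+1)}{4} = 22.5 \quad \text{and} \quad \sigma^{2} = \frac{9(9+1)(2 \cdot 9 + 1)}{24} = 71.25$$
$$\text{P-value} \approx P\left(Z \le \frac{(5 + 0.5) - 22.5}{\sqrt{71.25}}\right) = P(Z \le -2.01) = 0.0222$$
– Reject H0: The data support the claim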
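The signed-rank calculation can be reproduced numerically. The sketch below assumes the nine signed ranks are 8, 5, 3, 6, 2, 9, 7 (positive) and 4, 1 (negative), as in the sums computed earlier; their order in the list is arbitrary because the original data table is not available here. The helper names `normal_cdf` and `signed_rank_test` are hypothetical, and the standard normal CDF is built from `math.erf` to keep the example dependency-free.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def signed_rank_test(signed_ranks):
    """Return (T, z, one-tail P-value) using the normal approximation
    with a continuity correction of 0.5."""
    n = len(signed_ranks)
    s_plus = sum(r for r in signed_ranks if r > 0)
    s_minus = sum(-r for r in signed_ranks if r < 0)  # absolute value
    t = min(s_plus, s_minus)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    z = (t + 0.5 - mean) / sqrt(variance)
    return t, z, normal_cdf(z)


# Signed ranks assumed from the example (order is illustrative only).
t, z, p = signed_rank_test([8, 5, 3, 6, 2, 9, 7, -4, -1])
print(t, round(z, 2), round(p, 3))  # 5 -2.01 0.022
```

The computed probability is slightly below the slide's 0.0222 because the slide rounds $z$ to $-2.01$ before looking it up in a table.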
7.4 – The Wilcoxon Rank-Sum Test
There are two commonly used techniques for throwing a shot put: glide and rotational. The first and third rows of the table below give the maximum distances (in m) of 12 different athletes using the glide method and 13 athletes using the rotational method at international competitions (data collected by David Meyer, 2010). Use this data to test the claim that there is not a significant difference between these two methods.
The Wilcoxon Rank-Sum Test
Parameters
– $m_1$ = median of all distances using the glide method
– $m_2$ = median of all distances using the rotational method
Hypotheses: H0: $m_1 = m_2$, H1: $m_1 \ne m_2$
Test statistic: Add ranks from the first sample
$$R_1 = 1 + 2 + 4 + \cdots + 25 = 129$$
Theorem 7.4.1
If there are no ties and both populations have the same continuous distribution, then $R_1$ is an observed value of a random variable with mean and variance
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} \quad \text{and} \quad \sigma_R^{2} = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$
The Wilcoxon Rank-Sum Test
Purpose: To test the null hypothesis H0: $m_1 = m_2$ where $m_1$ and $m_2$ are the medians of two independent populations with continuous distributions.
Let
– $n_1$ and $n_2$ = sample sizes where $n_1 \le n_2$
– Rank all the values from 1 to $n_1 + n_2$
– $R_1$ = sum of the ranks from the first sample
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2}$$
The Wilcoxon Rank-Sum Test
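Test statistic:
$$R = \begin{cases} R_1 & \text{if } R_1 \le \mu_R \\ 2\mu_R - R_1 & \text{otherwise} \end{cases}$$
Critical value: See Table C.7
– Reject H0 if $R \le$ critical value
P-value: For a one-tail test
$$\text{P-value} \approx P\left(Z \le \frac{(R + 0.5) - \mu_R}{\sigma_R}\right) \quad \text{where} \quad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$
– Double this probability for a two-tail test
Requirement
1. The samples are random and independent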
Example
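$n_1 = 12$, $n_2 = 13$
$$\mu_R = \frac{12(12 + 13 + 1)}{2} = 156 \quad \text{and} \quad \sigma_R^{2} = \frac{12 \cdot 13\,(12 + 13 + 1)}{12} = 338$$
$$R_1 = 129 \le 156 = \mu_R \;\Rightarrow\; R = 129$$
Critical value: 119
$$\text{P-value} \approx 2\,P\left(Z \le \frac{(129 + 0.5) - 156}{\sqrt{338}}\right) = 2\,P(Z \le -1.44) = 0.1498$$
– Do not reject H0: There is not a statistically significant difference between the medians of the two methods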
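The rank-sum arithmetic for the shot put example can be sketched in Python. This is an illustration under stated assumptions: the function names `normal_cdf` and `rank_sum_test` are hypothetical, the rank sum $R_1 = 129$ and sample sizes 12 and 13 are taken from the example rather than recomputed from the raw distances, and the P-value is the two-tail normal approximation with a 0.5 continuity correction.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def rank_sum_test(r1: float, n1: int, n2: int):
    """Return (R, z, two-tail P-value) for the Wilcoxon rank-sum test.

    r1: sum of the ranks of the first sample
    n1, n2: sample sizes
    """
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    # Reflect R1 about the mean so that R always lies in the lower tail.
    r = r1 if r1 <= mu else 2 * mu - r1
    z = (r + 0.5 - mu) / sigma
    return r, z, 2 * normal_cdf(z)


# Values from the shot put example: R1 = 129, n1 = 12, n2 = 13.
r, z, p = rank_sum_test(129, 12, 13)
print(r, round(z, 2), round(p, 2))  # 129 -1.44 0.15
```

Because the statistic is reflected about $\mu_R$, the same function handles a first-sample rank sum that falls above the mean.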
7.5 – The Runs Test for Randomness
Purpose: To test the claim that a set of data with two types of values is arranged randomly
Definition 7.5.1 A run is a sequence of data of the same type preceded and followed by data of a different type or by no data at all
Example 7.5.1
A classical music fan has a collection of songs composed by Bach and Vivaldi on her MP3 player which is supposed to randomly choose songs. The order of the composer of the songs played is shown below. Test the claim that the composers are arranged randomly.
– $n_1 = 10$ songs by Bach
– $n_2 = 12$ songs by Vivaldi
– Total of $R = 13$ runs
Theorem 7.5.1
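If the data values are chosen randomly, then $R$ is an observed value of a random variable with p.m.f.
$$f(r) = \begin{cases} \dfrac{2\dbinom{n_1 - 1}{r/2 - 1}\dbinom{n_2 - 1}{r/2 - 1}}{\dbinom{n_1 + n_2}{n_1}} & \text{if } r \text{ is even} \\[3ex] \dfrac{\dbinom{n_1 - 1}{(r-1)/2}\dbinom{n_2 - 1}{(r-3)/2} + \dbinom{n_1 - 1}{(r-3)/2}\dbinom{n_2 - 1}{(r-1)/2}}{\dbinom{n_1 + n_2}{n_1}} & \text{if } r \text{ is odd} \end{cases}$$
for $r = 2, 3, \ldots$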
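A quick numerical check of the p.m.f. of the number of runs is sketched below. It assumes the standard even/odd form of the runs distribution, with $n_1$ and $n_2$ items of the two types; the function name `runs_pmf` is hypothetical. The sketch confirms that the probabilities sum to 1 and computes the mean number of runs for $n_1 = 10$ and $n_2 = 12$, the sample sizes in the Bach and Vivaldi example.

```python
from math import comb


def runs_pmf(r: int, n1: int, n2: int) -> float:
    """P(R = r) for the number of runs in a random arrangement of
    n1 items of one type and n2 items of another type."""
    total = comb(n1 + n2, n1)  # number of distinct arrangements
    if r % 2 == 0:
        k = r // 2 - 1
        return 2 * comb(n1 - 1, k) * comb(n2 - 1, k) / total
    a = (r - 1) // 2
    b = (r - 3) // 2
    return (comb(n1 - 1, a) * comb(n2 - 1, b)
            + comb(n1 - 1, b) * comb(n2 - 1, a)) / total


n1, n2 = 10, 12
# math.comb returns 0 when k > n, so impossible run counts get probability 0.
probs = {r: runs_pmf(r, n1, n2) for r in range(2, n1 + n2 + 1)}
total_prob = sum(probs.values())
mean_runs = sum(r * p for r, p in probs.items())
print(round(total_prob, 6), round(mean_runs, 2))  # 1.0 11.91
```

Summing `probs` over the tails gives an exact P-value when the sample sizes are small enough that the normal approximation is in doubt.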
Theorem 7.5.1
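As $n_1$ and $n_2$ grow large, the distribution of $R$ approaches a normal distribution with mean and variance
$$\mu_R = \frac{2 n_1 n_2}{n_1 + n_2} + 1 \quad \text{and} \quad \sigma_R^{2} = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^{2} (n_1 + n_2 - 1)}$$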
The Runs Test for Randomness
Null hypothesis: H0: The data are arranged randomly
Test statistic: $R$ = the number of runs
Critical values: See Table C.8
– Reject H0 if $R \le$ the smaller c.v. or $R \ge$ the larger c.v.
P-value: Approximately twice the area of the tail region under the standard normal bell curve bounded by
$$z = \frac{R - \mu_R}{\sigma_R}$$
Example 7.5.1
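Hypotheses:
H0: The composers are arranged randomly
H1: The composers are not arranged randomly
Test statistic: $R = 13$
Critical values: 7 and 17
P-value:
$$\mu_R = \frac{2 \cdot 10 \cdot 12}{10 + 12} + 1 = 11.91 \quad \text{and} \quad \sigma_R^{2} = \frac{2 \cdot 10 \cdot 12\,(2 \cdot 10 \cdot 12 - 10 - 12)}{(10 + 12)^{2} (10 + 12 - 1)} = 5.15$$
$$\text{P-value} \approx 2\,P\left(Z > \frac{13 - 11.91}{\sqrt{5.15}}\right) = 2\,P(Z > 0.48) = 0.6312$$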
Example 7.5.1
Conclusion
– Do not reject H0
– It appears that the songs are randomly chosen
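The normal approximation for the runs test in Example 7.5.1 can be reproduced with a short Python sketch. The function names `normal_cdf` and `runs_test` are hypothetical, and the inputs (13 runs, $n_1 = 10$, $n_2 = 12$) are the values used in the example. The two-tail P-value is taken as twice the tail area beyond $|z|$.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def runs_test(runs: int, n1: int, n2: int):
    """Return (mean, variance, z, two-tail P-value) for the runs test
    using the large-sample normal approximation."""
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (runs - mean) / sqrt(variance)
    p_value = 2 * normal_cdf(-abs(z))  # twice the tail beyond |z|
    return mean, variance, z, p_value


# Values from Example 7.5.1: 13 runs, 10 Bach songs, 12 Vivaldi songs.
mean, variance, z, p = runs_test(13, 10, 12)
print(round(mean, 2), round(variance, 2), round(z, 2), round(p, 2))
# 11.91 5.15 0.48 0.63
```

The result agrees with the slide's 0.6312 up to the rounding of $z$ to two decimals before the table lookup.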