nonparametric statistics


Page 1: Nonparametric Statistics

Nonparametric Statistics

Also known as distribution-free statistics.

Makes no assumption about the underlying distribution, other than that it is continuous.

The data can be non-quantitative, rank order, etc.

Competitors of the t- and F-procedures we used in chapters 11 and 12.

Generally less efficient: they require larger sample sizes for the same confidence level and power.

Page 2: Nonparametric Statistics

Some Commonly Used Statistical Tests

Normal theory based test | Corresponding nonparametric test | Purpose of test
t test for independent samples | Mann-Whitney U test; Wilcoxon rank-sum test | Compares two independent samples
Paired t test | Wilcoxon matched pairs signed-rank test | Examines a set of differences
Pearson correlation coefficient | Spearman rank correlation coefficient | Assesses the linear association between two variables.
One way analysis of variance (F test) | Kruskal-Wallis analysis of variance by ranks | Compares three or more groups
Two way analysis of variance | Friedman Two way analysis of variance | Compares groups classified by two different factors

Source: Gerard E. Dallal, Ph.D., Nonparametric Statistics. http://www.jerrydallal.com/LHSP/npar.htm

ETM 620 - 09U2

Page 3: Nonparametric Statistics

Test of the median: the Sign Test

Tests hypotheses about the median of a continuous distribution, i.e.,

$$H_0: \tilde{\mu} = \tilde{\mu}_0 \qquad H_1: \tilde{\mu} \neq \tilde{\mu}_0$$

Recall that the median is that value for which

$$P(X \le \tilde{\mu}) = P(X \ge \tilde{\mu}) = 0.5$$

Therefore, the sign test looks at the number of values above (R+) and below (R-) the hypothesized median. When the null hypothesis is true, each of these counts follows the binomial distribution with sample size n and p = 0.5, so the probability that such a count R is no larger than the observed R_min = min(R+, R-) is

$$P(R \le R_{\min}) = \sum_{r=0}^{R_{\min}} \binom{n}{r} (0.5)^r (0.5)^{n-r}$$
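The binomial tail probability for the sign test can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's or Minitab's algorithm: the function name `sign_test_tail` and the sample values are made up for the example. Dropping values equal to the hypothesized median and doubling the tail for a two-sided alternative are common conventions assumed here.

```python
from math import comb

def sign_test_tail(values, median0):
    """Return (r_plus, r_minus, tail) for a sign test against median0."""
    # Assumption: values equal to the hypothesized median carry no sign
    # and are dropped from the count.
    r_plus = sum(1 for v in values if v > median0)
    r_minus = sum(1 for v in values if v < median0)
    n = r_plus + r_minus
    r_min = min(r_plus, r_minus)
    # Lower binomial tail: sum of C(n, r) * 0.5^r * 0.5^(n-r) for r = 0..r_min.
    tail = sum(comb(n, r) for r in range(r_min + 1)) * 0.5 ** n
    return r_plus, r_minus, tail

# Made-up illustration data (NOT the textbook's shear-strength data).
sample = [1.4, 0.9, 1.6, 1.3, 1.5, 1.1, 1.7, 1.8, 1.25, 1.35]
r_plus, r_minus, tail = sign_test_tail(sample, 1.2)

# Assumed convention: two-sided P-value is twice the tail, capped at 1.
two_sided_p = min(1.0, 2 * tail)
print(r_plus, r_minus, tail, two_sided_p)
```

With these made-up values, 8 observations fall above 1.2 and 2 below, so the tail is the sum of the first three binomial terms for n = 10.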

Page 4: Nonparametric Statistics

An example:

Recall the example comparing two methods for testing shear strength in steel girders. Suppose we are interested in testing whether or not the actual median of the Karlsruhe method is 1.2, that is …

$$H_0: \tilde{\mu} = 1.2 \qquad H_1: \tilde{\mu} \neq 1.2$$

given the data as shown on pg 293 and in the Excel data file.

Note the difference between the algorithm given in the textbook (as done in Excel) and the results from Minitab …

Page 5: Nonparametric Statistics

The Sign Test for paired samples

Same as for single samples, but the null hypothesis is that the median difference = 0, i.e.

$$H_0: \tilde{\mu}_D = 0 \qquad H_1: \tilde{\mu}_D \neq 0$$

Example: paired comparison of example 11-17, ignoring the normality assumption …

Calculate the P-value as the probability that the number of differences on one side of zero is less than or equal to the minimum R value, given a binomial distribution with p = 0.5, i.e.

$$P(R \le R_{\min}) = \sum_{r=0}^{R_{\min}} \binom{n}{r} (0.5)^r (0.5)^{n-r}$$
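For paired samples the same calculation is applied to the signs of the pairwise differences. The sketch below is illustrative only: `paired_sign_test` is a hypothetical helper, the two data lists are invented (they are not the data of example 11-17), and discarding zero differences is an assumed convention.

```python
from math import comb

def paired_sign_test(x, y):
    """Return (r_plus, r_minus, tail) for the signs of x[i] - y[i]."""
    # Assumption: pairs with identical values (zero difference) are discarded.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    r_plus = sum(1 for d in diffs if d > 0)
    r_minus = n - r_plus
    r_min = min(r_plus, r_minus)
    # Lower binomial tail with p = 0.5.
    tail = sum(comb(n, r) for r in range(r_min + 1)) * 0.5 ** n
    return r_plus, r_minus, tail

# Made-up paired measurements for illustration only.
method_a = [1.2, 1.5, 1.1, 1.8, 1.4, 1.6, 1.3, 1.9]
method_b = [1.0, 1.3, 1.2, 1.5, 1.1, 1.4, 1.3, 1.6]
r_plus, r_minus, tail = paired_sign_test(method_a, method_b)
print(r_plus, r_minus, tail)
```

Here one pair is tied and dropped, leaving n = 7 differences with six positive and one negative.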

Page 6: Nonparametric Statistics

Determining β

Recall that β is the probability of a Type II error, i.e.

$$\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$$

This is highly dependent on the shape of the underlying distribution; see, for example, the example on pg. 491 of your textbook.

Page 7: Nonparametric Statistics

Wilcoxon signed rank test

The sign test focuses only on whether the data are above or below the presumed median, ignoring the magnitude of the differences.

If we assume a symmetrical continuous distribution, we can use the Wilcoxon signed rank test.

Similar to the sign test, but now we rank the absolute differences from the hypothesized mean in order of magnitude and add the ranks together, separately for the positive and the negative differences.

Let’s do this once on Excel and once on Minitab. (Note the differences!)
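The ranking step of the Wilcoxon signed rank test can be sketched as follows. This is a minimal illustration under stated assumptions, not the Excel or Minitab procedure mentioned on the slide: the helper name `signed_rank_sums` and the data are hypothetical, zero differences are dropped, and tied absolute differences share their average rank.

```python
def signed_rank_sums(values, mu0):
    """Return (r_plus, r_minus): rank sums of positive and negative differences."""
    # Assumption: observations equal to mu0 are dropped.
    diffs = sorted((v - mu0 for v in values if v != mu0), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        # Find the block of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(diffs) and abs(diffs[j + 1]) == abs(diffs[i]):
            j += 1
        # Assumption: ties receive the average of the ranks they span.
        avg_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus

# Made-up data: differences from 10 are 2, -3, 5, -1, 4, 8.
r_plus, r_minus = signed_rank_sums([12, 7, 15, 9, 14, 18], 10)
print(r_plus, r_minus, min(r_plus, r_minus))
```

The two sums always add up to n(n+1)/2, which is a quick sanity check on the ranking.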

Page 8: Nonparametric Statistics

Large sample approximation

Given n > 20, it can be shown that R is approximately normally distributed with

$$\mu_R = \frac{n(n+1)}{4} \qquad \sigma_R^2 = \frac{n(n+1)(2n+1)}{24}$$

and a test of H0: µ = µ0 can be based on the statistic

$$Z_0 = \frac{R - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}$$
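The large-sample statistic is simple to evaluate directly. The sketch below just codes the mean and variance given on this slide; the function name `signed_rank_z` and the numbers n = 24, R = 80 are hypothetical values chosen so the arithmetic is easy to follow.

```python
from math import sqrt

def signed_rank_z(r, n):
    """Normal approximation Z0 for the Wilcoxon signed rank statistic R."""
    mean_r = n * (n + 1) / 4                  # mu_R
    var_r = n * (n + 1) * (2 * n + 1) / 24    # sigma_R squared
    return (r - mean_r) / sqrt(var_r)

# Hypothetical values: n = 24 gives mu_R = 150 and sigma_R = 35.
z0 = signed_rank_z(80, 24)
print(z0)
```

Z0 is then compared with standard normal critical values in the usual way.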

Page 9: Nonparametric Statistics

Comparing 2 means: Wilcoxon rank sum

Order all data from lowest to highest, keeping track of which data point belongs to which group. For example, see example 16-5, pg 500.

Then, R1 = sum(rank order for sample 1) and R2 = sum(rank order for sample 2).

From table IX, obtain the critical value $R^*_\alpha$ for n1 and n2 at α of 0.01 and 0.05.

Alternatively, using Mann-Whitney on Minitab …
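The pooling and ranking step can be sketched as below. This is an illustrative helper, not the textbook's worked example: `rank_sums` and the two samples are made up, and giving tied values their average rank is an assumed convention.

```python
def rank_sums(sample1, sample2):
    """Return (R1, R2): sums of pooled ranks for each sample."""
    # Tag each value with its group so membership survives the sort.
    pooled = sorted([(v, 1) for v in sample1] + [(v, 2) for v in sample2])
    totals = {1: 0.0, 2: 0.0}
    i = 0
    while i < len(pooled):
        # Find the block of tied values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        # Assumption: tied values share the average of the ranks they span.
        avg_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            totals[pooled[k][1]] += avg_rank
        i = j + 1
    return totals[1], totals[2]

# Made-up samples for illustration only.
r1, r2 = rank_sums([3, 5, 8, 10], [4, 9, 12, 15, 16])
print(r1, r2)
```

As a check, R1 + R2 must equal N(N+1)/2 for N = n1 + n2 pooled observations.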

Page 10: Nonparametric Statistics

Large sample approximation

Given n1 and n2 > 8, it can be shown that R1 is approximately normally distributed with

$$\mu_{R_1} = \frac{n_1(n_1+n_2+1)}{2} \qquad \sigma_{R_1}^2 = \frac{n_1 n_2 (n_1+n_2+1)}{12}$$

and a test of H0: µ1 = µ2 can be based on the statistic

$$Z_0 = \frac{R_1 - \mu_{R_1}}{\sigma_{R_1}}$$
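The rank-sum approximation can be evaluated the same way. The sketch codes the mean and variance of R1 given on this slide; `rank_sum_z` and the values n1 = 12, n2 = 36, R1 = 210 are hypothetical, picked only because they give round numbers.

```python
from math import sqrt

def rank_sum_z(r1, n1, n2):
    """Normal approximation Z0 for the Wilcoxon rank-sum statistic R1."""
    mean_r1 = n1 * (n1 + n2 + 1) / 2          # mu_R1
    var_r1 = n1 * n2 * (n1 + n2 + 1) / 12     # sigma_R1 squared
    return (r1 - mean_r1) / sqrt(var_r1)

# Hypothetical values: mu_R1 = 294 and sigma_R1 = 42.
z0 = rank_sum_z(210, 12, 36)
print(z0)
```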

Page 11: Nonparametric Statistics

Analysis of Variance: the Kruskal-Wallis Test

Expands the rank-sum method to more than two factor levels

Use Minitab to perform the statistical analysis …

Look at example 16-6, pg. 503
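The slide defers the calculation to Minitab, so the following is only a sketch of what such software computes. It uses the standard Kruskal-Wallis statistic H = 12/(N(N+1)) Σ R_i²/n_i − 3(N+1), which is not stated in the slides, and it omits the tie correction. The function name and data are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    # Pool all observations, remembering which group each came from.
    pooled = sorted((v, g) for g, grp in enumerate(groups) for v in grp)
    n_total = len(pooled)
    rank_totals = [0.0] * len(groups)
    i = 0
    while i < n_total:
        # Tied values share the average of the ranks they span.
        j = i
        while j + 1 < n_total and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            rank_totals[pooled[k][1]] += avg_rank
        i = j + 1
    weighted = sum(r * r / len(grp) for r, grp in zip(rank_totals, groups))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Made-up, fully separated groups for illustration only.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h)
```

For reasonably large groups, H is usually compared with a chi-square distribution with k − 1 degrees of freedom, where k is the number of groups.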

Page 12: Nonparametric Statistics

Other nonparametric tests …

Mood’s Median Test: similar to Kruskal-Wallis, more robust against outliers but less robust when samples are from different distributions

Friedman Test: test of the randomized block design (nonparametric equivalent to the two-way ANOVA)

Runs test: checks for data runs (more than the expected number of consecutive observations above or below the median)