parametric & non-parametric tests

Basics of PARAMETRIC & NON-PARAMETRIC TESTS

Upload: ohlyanaarti

Post on 03-Apr-2018


TRANSCRIPT

Page 1: parametric & non-parametric tests

7/28/2019 parametric & non-parametric tests

http://slidepdf.com/reader/full/parametric-non-parametric-tests 1/34

Basics of PARAMETRIC & NON-PARAMETRIC TESTS

Page 2: parametric & non-parametric tests


4/7/2013 

HYPOTHESIS

• An unproved theory;

• Something taken to be true for the purpose

of investigation;

• An assumption; 

• The antecedent of a conditional statement;

• A supposition

Page 3: parametric & non-parametric tests


Measurement

1. Nominal or Classificatory Scale

• Gender, ethnic background

2. Ordinal or Ranking Scale

• Hardness of rocks, beauty, military ranks

3. Interval Scale

• Celsius or Fahrenheit

4. Ratio Scale

• Kelvin temperature, speed, height, mass or weight

Page 4: parametric & non-parametric tests


DATA 

MEAN: The average (arithmetic mean) of the data.

MEDIAN: The value that falls halfway when the data are ranked in order.

MODE: The most common value observed.

• In a normal distribution, the mean and median are the same.

• If the median and mean differ markedly, this indicates that the data are not normally distributed.

• The mode is of little use.
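The three measures can be sketched with Python's standard `statistics` module. The sample below is hypothetical (it does not come from the slides) and is deliberately right-skewed, so the mean is pulled above the median.

```python
import statistics

# Hypothetical right-skewed sample (e.g. hours until pain relief); not from the slides.
sample = [1.2, 1.5, 1.5, 1.8, 2.0, 2.3, 2.9, 4.1, 7.6]

mean = statistics.mean(sample)      # arithmetic mean
median = statistics.median(sample)  # middle value of the ranked data
mode = statistics.mode(sample)      # most common value

# A mean clearly above the median hints that the data are not normally distributed.
print(f"mean={mean:.2f} median={median:.2f} mode={mode:.2f}")
```

A gap between mean and median is only an informal warning sign; it does not replace a formal normality test.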

Page 5: parametric & non-parametric tests


Parametric Data

• Data that can be measured.

• For example: heights, weights, depths, amounts of money, etc.

• Interval and ratio measurements are considered parametric.

Data are considered parametric if they satisfy the following three assumptions:

• Normality:

The distribution is normal.

• Equal variances:

The populations from which the data are obtained should have equal variances. The F-test can be used to test the hypothesis that the samples have been drawn from populations with equal variances.

• Independence:

The observations should be independent of one another. The data should also be measured on an interval scale.

• If any of these assumptions is untrue, the results of the test may be invalid.

• In that case it is safest to use a non-parametric test.
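As a rough illustration of the equal-variances check, the variance ratio used by the F-test can be computed as below. This is a minimal sketch: the two groups are hypothetical, the function name `variance_ratio` is an assumption, and the resulting F value would still have to be compared with a tabulated critical value for the given degrees of freedom.

```python
import statistics

def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger sample variance on top."""
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 in the denominator)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical measurements for two groups; not from the slides.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
group_b = [4.2, 6.9, 5.5, 7.4, 3.8, 6.1]

f_stat, df1, df2 = variance_ratio(group_a, group_b)
print(f"F = {f_stat:.2f} on ({df1}, {df2}) degrees of freedom")
```

An F value far above 1 suggests that the equal-variances assumption is doubtful for these groups.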

Page 6: parametric & non-parametric tests


PARAMETRIC TESTS

Page 7: parametric & non-parametric tests


• These techniques are termed parametric because they focus on specific parameters of the population, usually the mean and variance.

• Also known as standard tests of hypothesis.

• They use mean values, standard deviation and variance to estimate differences between measurements that characterize particular populations.

• Applied to parametric data.

• Parametric tests require data from which means and variances can be calculated, i.e., interval and ratio data. As long as the actual data meet the parametric assumptions, regardless of the origin of the numbers, parametric tests can be conducted.

Page 8: parametric & non-parametric tests


TYPES

• Student's t-tests ("Student" was the pen name of W. S. Gosset, the statistician who developed the test)

• Z-test:

Based on the normal probability distribution;

used for judging the significance of several statistical measures, particularly the mean;

• Chi-square test;

• F-test;

• Analyses of variance (ANOVA);

• Linear regression, and others.

Page 9: parametric & non-parametric tests


Example: t-test

Suppose you have two independent groups (corresponding to two drugs) on which some measurement has been made, for example the length of time until relief of pain. You want to determine whether one drug has a better overall (shorter) time to relief than the other. However, when you examine the data it is obvious that the distribution is not normal. (You can test for normality using a statistical test.) If the data had been normally distributed, you would have performed a standard independent-groups t-test on these data.
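For the normally distributed case, the independent-groups t statistic could be computed roughly as follows. This is a sketch under stated assumptions: the relief times are hypothetical, the pooled-variance (equal variances) form of the test is assumed, and `pooled_t` is an illustrative name rather than a library function.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t, df

# Hypothetical minutes until pain relief for two drugs; not from the slides.
drug_a = [22, 25, 19, 24, 21, 23]
drug_b = [28, 30, 27, 31, 26, 29]

t_stat, df = pooled_t(drug_a, drug_b)
print(f"t = {t_stat:.3f}, df = {df}")
```

With 10 degrees of freedom the two-sided 5% critical value is 2.228, so a |t| larger than that would be declared significant.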

Page 10: parametric & non-parametric tests


NON-PARAMETRIC TESTS

Page 11: parametric & non-parametric tests


Although nonparametric techniques do not require the stringent assumptions associated with their parametric counterparts, this does not imply that they are 'assumption free'.

CHARACTERISTICS:

• Fewer assumptions regarding the population distribution

• Sample-size requirements are often less stringent; samples need not be > 30.

• Measurement level may be nominal or ordinal

• Independence of randomly selected observations, except when paired 

• Primary focus is on the rank ordering or frequencies of data 

• Hypotheses are posed regarding ranks, medians, or frequencies of data

It is needed because:

• The sample distribution is unknown.

• The population distribution is not normal, e.g. when too many variables are involved.

Page 12: parametric & non-parametric tests


Methods

Chi-square test (χ²):

Used to compare observed and expected data. Also known as the 'test of goodness of fit', 'test of independence', or 'test of homogeneity'.

Kruskal-Wallis test:

Tests whether samples originate from the same distribution. Used for comparing more than two samples that are independent (not related). An alternative to ANOVA.

Wilcoxon signed-rank test:

Used when comparing two related samples, or repeated measurements on a single sample, to assess whether their population mean ranks differ.

Median test:

Used to test the null hypothesis that the medians of the populations from which two samples are drawn are identical. The data in each sample are assigned to two groups: one consisting of values higher than the median of the two samples combined, and the other consisting of values at the median or below.

Sign test:

Can be used to test the hypothesis that there is "no difference in medians" between the continuous distributions of two random variables X and Y.

Fisher's exact test:

Used in the analysis of contingency tables where sample sizes are small.
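Of the methods listed, the sign test is the simplest to sketch by hand. The example below assumes paired observations, drops ties, and uses an exact two-sided binomial p-value; the data and the function name `sign_test` are hypothetical.

```python
from math import comb

def sign_test(x, y):
    """Exact two-sided sign test for paired samples; returns (n_plus, n_minus, p_value)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # pairs with no difference are dropped
    n = len(diffs)
    n_plus = sum(1 for d in diffs if d > 0)
    n_minus = n - n_plus
    k = min(n_plus, n_minus)
    # Under the null hypothesis each sign is + or - with probability 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_plus, n_minus, min(1.0, 2 * tail)

# Hypothetical paired scores; not from the slides.
x = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
y = [10, 16, 9, 12, 11, 13, 13, 12, 11, 10]

n_plus, n_minus, p_value = sign_test(x, y)
print(n_plus, n_minus, round(p_value, 4))
```

Because the sign test uses only the direction of each difference, it is usually less powerful than the Wilcoxon signed-rank test on the same data.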

Page 13: parametric & non-parametric tests


Statistical tests for paired or matched observations

Variable                                 Test
Nominal                                  McNemar's test
Ordinal                                  Wilcoxon
Quantitative (discrete or non-normal)    Wilcoxon
Quantitative (normal*)                   Paired t-test

Page 14: parametric & non-parametric tests


Note: These tables should be considered as guides only, and each case should be considered on its merits.

• The Kruskal-Wallis test is used for comparing ordinal or non-Normal variables for more than two groups, and is a generalisation of the Mann-Whitney U test.

• Analysis of variance is a general technique; one version (one-way analysis of variance) is used to compare Normally distributed variables for more than two groups, and is the parametric equivalent of the Kruskal-Wallis test.

• If the outcome variable is the dependent variable, then provided the residuals are plausibly Normal, the distribution of the independent variable is not important.

• There are a number of more advanced techniques, such as Poisson regression, for dealing with these situations. However, they require certain assumptions, and it is often easier to either dichotomise the outcome variable or treat it as continuous.

• When valid, use parametric tests.

• Non-parametric tests are useful for non-normal data.

• If possible, use some transformation first.

• Use non-parametric tests if normalization is not possible.

• Commonly used non-parametric tests:

• Wilcoxon signed-rank test

• Wilcoxon rank-sum test

• Spearman rank correlation

• Chi-square, etc.

Page 15: parametric & non-parametric tests


ADVANTAGES

• Can treat samples from several different populations.

• If the sample size is small, there is no alternative.

• Can be used if the data are nominal or ordinal.

• Easier to learn and apply than parametric tests.

Page 16: parametric & non-parametric tests


DISADVANTAGES

• Discard information by converting to ranks

• Less powerful

• Tables of critical values may not be easily available.

• False sense of security

Page 17: parametric & non-parametric tests


Wilcoxon signed rank test

To test the difference between paired data.

CALCULATION:

Step 1:

• Exclude any differences which are zero

• Ignore the signs of the remaining differences

• Put them in ascending order

• Assign them ranks

• If any differences are equal, average their ranks

Step 2:

• Sum the ranks of the positive differences as T+

• Sum the ranks of the negative differences as T-

Page 18: parametric & non-parametric tests


Step 3:

• If there is no difference between drug and placebo, then T+ and T- would be similar

• If there is a difference, one sum would be much smaller and the other much larger than expected

• The larger sum is denoted as T

• T = larger of T+ and T-

Step 4:

• Compare the value obtained with the critical values (5%, 2% and 1%) in the table.

• N is the number of differences that were ranked (not the total number of differences), so the zero differences are excluded.

EXAMPLE

Page 19: parametric & non-parametric tests


Hours of sleep

Patient   Drug   Placebo
1         6.1    5.2
2         7.0    7.9
3         8.2    3.9
4         7.6    4.7
5         6.5    5.3
6         8.4    5.4
7         6.9    4.2
8         6.7    6.1
9         7.4    3.8
10        5.8    6.3

Null hypothesis: Hours of sleep are the same using the placebo and the drug.

Page 20: parametric & non-parametric tests


Hours of sleep

Patient   Drug   Placebo   Difference   Rank (ignoring sign)
1         6.1    5.2        0.9         3.5*
2         7.0    7.9       -0.9         3.5*
3         8.2    3.9        4.3         10
4         7.6    4.7        2.9         7
5         6.5    5.3        1.2         5
6         8.4    5.4        3.0         8
7         6.9    4.2        2.7         6
8         6.7    6.1        0.6         2
9         7.4    3.8        3.6         9
10        5.8    6.3       -0.5         1

* The 3rd and 4th ranks are tied and hence averaged.

T = larger of T+ (50.5) and T- (4.5).

Here, the calculated value of T = 50.5; the tabulated value of T = 47 (at 5%).

The result is significant at the 5% level, indicating that the drug (hypnotic) is more effective than the placebo.
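The rank sums in this example can be reproduced with a short script. This is a sketch rather than a reference implementation: `average_ranks` is an illustrative helper, and rounding the differences to one decimal place is an added assumption so that floating-point noise does not hide the tie between 0.9 and -0.9.

```python
def average_ranks(values):
    """Rank values in ascending order (1-based), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks

# Hours of sleep from the example table.
drug = [6.1, 7.0, 8.2, 7.6, 6.5, 8.4, 6.9, 6.7, 7.4, 5.8]
placebo = [5.2, 7.9, 3.9, 4.7, 5.3, 5.4, 4.2, 6.1, 3.8, 6.3]

# Step 1: differences (rounded, see the note above), zeros excluded, ranked ignoring sign.
diffs = [round(d - p, 1) for d, p in zip(drug, placebo)]
diffs = [d for d in diffs if d != 0]
ranks = average_ranks([abs(d) for d in diffs])

# Step 2: rank sums of the positive and negative differences.
t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

# Step 3: T is the larger of the two sums.
t_stat = max(t_plus, t_minus)
print(t_plus, t_minus, t_stat)
```

The comparison with the critical value in Step 4 still relies on the printed table; the script only covers the ranking and summing.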

Page 21: parametric & non-parametric tests


Wilcoxon rank sum test

Compares two groups.

Calculation:

Step 1:

• Rank the data of both the groups in ascending order.

• If any values are equal, average their ranks.

Step 2:

• Add up the ranks in the group with the smaller sample size.

• If the two groups are of the same size, either one may be picked.

• T = sum of ranks in the group with the smaller sample size.

Step 3:

• Compare this sum with the critical ranges given in the table.

• Look up the rows corresponding to the sample sizes of the two groups.

• A range will be shown for the 5% significance level.

Page 22: parametric & non-parametric tests


Birth weight (kg)

Non-smokers (n=15)   Heavy smokers (n=14)
3.99                 3.18
3.79                 2.84
3.60*                2.90
3.73                 3.27
3.21                 3.85
3.60*                3.52
4.08                 3.23
3.61                 2.76
3.83                 3.60*
3.31                 3.75
4.13                 3.59
3.26                 3.63
3.54                 2.38
3.51                 2.34
2.71

Null hypothesis: Mean birth weight is the same for non-smokers and smokers.

Page 23: parametric & non-parametric tests


Non-smokers (n=15)        Heavy smokers (n=14)

Birth wt (kg)   Rank      Birth wt (kg)   Rank
3.99            27        3.18            7
3.79            24        2.84            5
3.60*           18        2.90            6
3.73            22        3.27            11
3.21            8         3.85            26
3.60*           18        3.52            14
4.08            28        3.23            9
3.61            20        2.76            4
3.83            25        3.60*           18
3.31            12        3.75            23
4.13            29        3.59            16
3.26            10        3.63            21
3.54            15        2.38            2
3.51            13        2.34            1
2.71            3

Sum = 272                 Sum = 163

* Ranks 17, 18 and 19 are tied, hence the ranks are averaged.

Hence the calculated value of T = 163; the tabulated value of T (14, 15) = 151.

Mean birth weights are not the same for non-smokers and smokers; they are significantly different.
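The ranking and the rank sum T for this example can be checked with the sketch below. Instead of the printed table of critical ranges, it uses the large-sample normal approximation (without a tie correction) to obtain an approximate p-value; that substitution, like the helper name `average_ranks`, is an assumption and not the method used on the slide.

```python
import math
from statistics import NormalDist

def average_ranks(values):
    """Rank values in ascending order (1-based), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks

# Birth weights (kg) from the example table.
non_smokers = [3.99, 3.79, 3.60, 3.73, 3.21, 3.60, 4.08, 3.61,
               3.83, 3.31, 4.13, 3.26, 3.54, 3.51, 2.71]
smokers = [3.18, 2.84, 2.90, 3.27, 3.85, 3.52, 3.23, 2.76,
           3.60, 3.75, 3.59, 3.63, 2.38, 2.34]

# Step 1: rank both groups together.
ranks = average_ranks(non_smokers + smokers)
sum_non_smokers = sum(ranks[:len(non_smokers)])

# Step 2: T is the rank sum of the smaller group (the heavy smokers).
t_stat = sum(ranks[len(non_smokers):])

# Assumed normal approximation to the distribution of T under the null hypothesis.
n_small, n_large = len(smokers), len(non_smokers)
expected = n_small * (n_small + n_large + 1) / 2
std_dev = math.sqrt(n_small * n_large * (n_small + n_large + 1) / 12)
z = (t_stat - expected) / std_dev
p_value = 2 * NormalDist().cdf(-abs(z))
print(t_stat, sum_non_smokers, round(z, 2), round(p_value, 3))
```

Under this approximation z is about -2.05 and the two-sided p-value is about 0.04, which agrees with the conclusion of a significant difference at the 5% level.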

Page 24: parametric & non-parametric tests


Spearman’s Rank Correlation Coefficient 

•  Based on the ranks of the items rather than actual values.

• Can be used even with the actual values

Example: 

• To measure the correlation between the honesty and wisdom of the boys of a class.

• It can also be used to find the degree of agreement between the judgments of two examiners or two judges.

Page 25: parametric & non-parametric tests


R (rank correlation coefficient) = $1 - \frac{6 \sum D^2}{N(N^2 - 1)}$

D = Difference between the ranks of two items

N = The number of observations.

Note: $-1 \le R \le 1$.

i) When R = +1: perfect positive correlation, or complete agreement in the same direction.

ii) When R = -1: perfect negative correlation, or complete agreement in the opposite direction.

iii) When R = 0: no correlation.

Page 26: parametric & non-parametric tests


COMPUTATION

• Assign ranks to the values of the items.

• Generally the item with the highest value is ranked 1, and the others are then given ranks 2, 3, 4, ... according to their values in decreasing order.

• Find the difference D = R1 - R2, where R1 = rank of x and R2 = rank of y.

Note that $\sum D = 0$ (always).

• Calculate $D^2$ and then find $\sum D^2$.

• Apply the formula.

Page 27: parametric & non-parametric tests


If there is a tie between two or more items, give each tied item the average rank. If m is the number of items of equal rank, the factor $(m^3 - m)/12$ is added to $\sum D^2$. If there is more than one such case, this factor is added once for each case, so that

$R = 1 - \frac{6\left[\sum D^2 + \sum \frac{m^3 - m}{12}\right]}{N(N^2 - 1)}$

Page 28: parametric & non-parametric tests


Student No.   Rank in Maths (R1)   Rank in Stats (R2)   D = R1 - R2   D^2 = (R1 - R2)^2
1             1                    3                    -2            4
2             3                    1                     2            4
3             7                    4                     3            9
4             5                    5                     0            0
5             4                    6                    -2            4
6             6                    9                    -3            9
7             2                    7                    -5            25
8             10                   8                     2            4
9             9                    10                   -1            1
10            8                    2                     6            36

N = 10, $\sum D = 0$, $\sum D^2 = 96$
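Applying the rank correlation formula R = 1 - 6ΣD²/(N(N² - 1)) to this table can be sketched as follows. The function name `spearman_from_ranks` is illustrative, and the code assumes untied ranks, as in the table.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's R from two lists of untied ranks; returns (R, sum of D, sum of D^2)."""
    n = len(rank_x)
    d = [a - b for a, b in zip(rank_x, rank_y)]
    sum_d2 = sum(v * v for v in d)
    r = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return r, sum(d), sum_d2

# Ranks of the ten students from the table.
maths_rank = [1, 3, 7, 5, 4, 6, 2, 10, 9, 8]
stats_rank = [3, 1, 4, 5, 6, 9, 7, 8, 10, 2]

r, sum_d, sum_d2 = spearman_from_ranks(maths_rank, stats_rank)
print(f"sum D = {sum_d}, sum D^2 = {sum_d2}, R = {r:.3f}")
```

With ΣD² = 96 and N = 10 this gives R = 1 - 576/990, roughly 0.418, a moderate positive agreement between the two rankings.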

Page 29: parametric & non-parametric tests


Page 30: parametric & non-parametric tests


Chi-square

• Commonly used to compare observed data with the data we would expect to obtain according to a specific hypothesis.

• Chi-square requires that you use numerical values (counts), not percentages or ratios.

• The formula for calculating chi-square ($\chi^2$) is:

$\chi^2 = \sum \frac{(o - e)^2}{e}$

• That is, chi-square is the sum, over all possible categories, of the squared difference between the observed (o) and expected (e) data (the deviation, d), divided by the expected data.
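The chi-square sum can be written in a few lines. The counts below are hypothetical (a 3:1 expectation over 400 observations), and `chi_square` is an illustrative name rather than a library call.

```python
def chi_square(observed, expected):
    """Sum of (o - e)^2 / e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 400 observations with a 3:1 expected ratio; not from the slides.
observed = [290, 110]
expected = [300, 100]

chi2 = chi_square(observed, expected)
print(f"chi-square = {chi2:.3f}")
```

Here each category contributes its own squared deviation scaled by its expected count, so the smaller category (100 expected) contributes more to the total than the larger one for the same absolute deviation of 10.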

Page 31: parametric & non-parametric tests


Procedure

1. State the hypothesis being tested and the predicted results. Gather the data by conducting the proper experiment (or, if working genetics problems, use the data provided in the problem).

2. Determine the expected numbers for each observational class. Remember to use numbers, not percentages.

• Chi-square should not be calculated if the expected value in any category is less than 5.

3. Calculate $\chi^2$ using the formula. Complete all calculations to three significant digits. Round off your answer to two significant digits.

4. Use the chi-square distribution table to determine the significance of the value.

• Determine the degrees of freedom and find the corresponding row of the table.

• Locate the value closest to your calculated $\chi^2$ on that degrees-of-freedom (df) row.

• Move up the column to determine the p value.

5. State your conclusion in terms of your hypothesis.

• If the p value for the calculated $\chi^2$ is p > 0.05, do not reject your hypothesis: the deviation is small enough that chance alone could account for it. A p value of 0.6, for example, means that if the hypothesis is true, a deviation at least this large would arise by chance about 60% of the time. This is within the range of acceptable deviation.

• If the p value for the calculated $\chi^2$ is p < 0.05, reject your hypothesis and conclude that some factor other than chance is operating for the deviation to be so great. For example, a p value of 0.01 means that, if the hypothesis were true, there would be only a 1% chance of a deviation this large arising by chance alone. Therefore, other factors are likely to be involved.
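The procedure can be walked through in code for a goodness-of-fit problem. This is a sketch under stated assumptions: the counts and the 9:3:3:1 ratio are hypothetical, degrees of freedom are taken as the number of categories minus one, and the table lookup is replaced by a small dictionary of standard 5% critical values for 1 to 4 degrees of freedom.

```python
# Standard chi-square critical values at the 5% level, keyed by degrees of freedom.
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def goodness_of_fit(observed, ratio):
    """Return (chi_square, df, reject) for observed counts against a hypothesised ratio."""
    total = sum(observed)
    # Step 2: expected counts (numbers, not percentages).
    expected = [total * part / sum(ratio) for part in ratio]
    if min(expected) < 5:
        raise ValueError("an expected count is below 5; chi-square should not be calculated")
    # Step 3: the chi-square sum.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Step 4: degrees of freedom and the 5% critical value.
    df = len(observed) - 1
    # Step 5: reject the hypothesis only if chi-square exceeds the critical value (p < 0.05).
    return chi2, df, chi2 > CRITICAL_5_PERCENT[df]

# Hypothetical counts for four classes expected in a 9:3:3:1 ratio; not from the slides.
observed = [90, 30, 28, 12]
chi2, df, reject = goodness_of_fit(observed, [9, 3, 3, 1])
print(f"chi-square = {chi2:.2f}, df = {df}, reject hypothesis: {reject}")
```

Comparing against a critical value is equivalent to checking whether p falls below 0.05; reading an actual p value still requires the full table or a statistics library.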

Page 32: parametric & non-parametric tests


                                   Parametric                             Non-parametric

Assumed distribution               Normal                                 Any
Assumed variance                   Homogeneous                            Any
Typical data                       Ratio or interval                      Ordinal or nominal
Data set relationships             Independent                            Any
Usual central measure              Mean                                   Median
Benefits                           Can draw more conclusions              Simplicity; less affected by outliers

Tests

Choosing                           Choosing a parametric test             Choosing a non-parametric test
Correlation test                   Pearson                                Spearman
Independent measures, 2 groups     Independent-measures t-test            Mann-Whitney test
Independent measures, >2 groups    One-way, independent-measures ANOVA    Kruskal-Wallis test
Repeated measures, 2 conditions    Matched-pair t-test                    Wilcoxon test
Repeated measures, >2 conditions   One-way, repeated-measures ANOVA       Friedman's test
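The "Tests" half of the comparison can also be expressed as a small lookup. The sketch below only transcribes the rows above; the dictionary keys and the function name `choose_test` are illustrative assumptions, and the decision about whether the parametric assumptions hold is left to the caller.

```python
# (parametric test, non-parametric test) for each study design in the comparison table.
TESTS = {
    "correlation": ("Pearson", "Spearman"),
    "independent_2_groups": ("Independent-measures t-test", "Mann-Whitney test"),
    "independent_more_than_2_groups": ("One-way, independent-measures ANOVA",
                                       "Kruskal-Wallis test"),
    "repeated_2_conditions": ("Matched-pair t-test", "Wilcoxon test"),
    "repeated_more_than_2_conditions": ("One-way, repeated-measures ANOVA",
                                        "Friedman's test"),
}

def choose_test(design, parametric_assumptions_met):
    """Pick the test for a design, falling back to the non-parametric counterpart."""
    parametric, non_parametric = TESTS[design]
    return parametric if parametric_assumptions_met else non_parametric

print(choose_test("independent_2_groups", parametric_assumptions_met=False))
```

As with the tables themselves, such a lookup is a guide only; each case still has to be considered on its merits.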

Page 33: parametric & non-parametric tests


Limitations of the tests of hypothesis 

The tests should not be used in a mechanical fashion, i.e. testing is not decision-making itself; the tests are only useful aids for decision-making.

The tests do not explain why a difference exists, say between the means of two samples.

Results of significance tests are based on probabilities and as such cannot be expressed with full certainty.

The inferences based on the tests cannot be said to be entirely correct evidence concerning the truth of the hypothesis. This is particularly so for small samples, where the probability of drawing erroneous inferences is generally higher. For greater reliability, the sample size should be sufficiently enlarged.

Page 34: parametric & non-parametric tests
