Non-parametric Tests

Upload: mya-sumitra

Post on 24-May-2017

Page 1: Non-Parametric Tests

Non-parametric Tests

Page 2: Non-Parametric Tests

Learning Objectives

1. Distinguish Parametric & Nonparametric Test Procedures
2. Explain Commonly Used Nonparametric Test Procedures
3. Perform Hypothesis Tests Using Nonparametric Procedures

Page 3: Non-Parametric Tests

Introduction

• The word parametric comes from parameter, or characteristic of a population.
• Parametric tests include assumptions about the shape of the population distribution (e.g. normally distributed).
• Non-parametric techniques do not have such stringent requirements and do not make assumptions about the underlying population distribution (which is why they are sometimes referred to as distribution-free tests).

Page 4: Non-Parametric Tests

Hypothesis Testing Procedures

Hypothesis testing procedures fall into two families (many more tests exist!):

• Parametric: Z Test, t Test, One-Way ANOVA
• Nonparametric: Wilcoxon Rank Sum Test, Kruskal-Wallis H-Test

Page 5: Non-Parametric Tests
Page 6: Non-Parametric Tests

Parametric Test Procedures

1. Involve Population Parameters (Mean)

2. Have Stringent Assumptions (Normality)

3. Examples: Z Test, t Test, F test

Page 7: Non-Parametric Tests

Nonparametric Test Procedures

1. Do Not Involve Population Parameters (Examples: Probability Distributions, Independence)

2. Data Measured on Any Scale (Ratio or Interval, Ordinal or Nominal)

3. Examples: Mann-Whitney, Kruskal-Wallis, Wilcoxon Rank Sum Test, χ² Test

Page 8: Non-Parametric Tests

Advantages of Nonparametric Tests

1. Used With All Scales
2. Easier to Compute
3. Make Fewer Assumptions
4. Need Not Involve Population Parameters
5. Results May Be as Exact as Parametric Procedures

Page 9: Non-Parametric Tests

Disadvantages of Nonparametric Tests

1. May Waste Information (a parametric model is more efficient if the data permit)
2. Difficult to Compute by Hand for Large Samples
3. Tables Not Widely Available

Page 10: Non-Parametric Tests

Popular Nonparametric Tests

1. Mann-Whitney
2. Rank Sum Test
3. Sign Test
4. Wilcoxon Test
5. Kruskal-Wallis Test
6. Friedman Test
7. Spearman's Rank Correlation
8. Kolmogorov-Smirnov
9. Chi-square for Independence

Page 11: Non-Parametric Tests

Nonparametric Tests vs. Parametric Equivalents

Comparison          Nonparametric Test              Parametric Equivalent
2 Independent       Mann-Whitney / Rank Sum Test    Independent t-test
2 Matched/Related   Sign Test / Wilcoxon Test       Paired-samples t-test
>2 Independent      Kruskal-Wallis Test             One-way ANOVA
Two-way             Friedman Test                   Two-way ANOVA
Correlation         Spearman's Rank Correlation     Pearson's Correlation
Distribution        Kolmogorov-Smirnov              None
Independence        Chi-square for independence     None

Page 12: Non-Parametric Tests

Assumptions for non-parametric techniques

Random samples.

Independent observations. Each person or case can be counted only once; they cannot appear in more than one category or group, and the data from one subject cannot influence the data from another.

The exception to this is the repeated measures techniques (Wilcoxon Signed Rank Test, Friedman Test), where the same subjects are retested on different occasions or under different conditions.

Some of the techniques discussed in this lecture have additional assumptions that should be checked.

Page 13: Non-Parametric Tests

1. Chi-square

There are two different types of chi-square tests, both involving categorical data:

1. The chi-square test for goodness of fit (also referred to as the one-sample chi-square) explores the proportion of cases that fall into the various categories of a single variable, and compares these with hypothesised values.

2. The chi-square test for independence is used to determine whether two categorical variables are related. It compares the frequency of cases found in the various categories of one variable across the different categories of another variable. For example: Is the proportion of smokers to non-smokers the same for males and females? Or, expressed another way: Are males more likely than females to be smokers?

Page 14: Non-Parametric Tests

1. Chi-square test for independence

This test is used when you wish to explore the relationship between two categorical variables. Each of these variables can have two or more categories.

Page 15: Non-Parametric Tests

Summary for chi-square

Example of research question: There are a variety of ways questions can be phrased: Are males more likely to be smokers than females? Is the proportion of males who smoke the same as the proportion of females? Is there a relationship between gender and smoking behaviour?

What you need: Two categorical variables, with two or more categories in each, for example:
• Gender (Male/Female); and
• Smoker (Yes/No).

Page 16: Non-Parametric Tests

Cont. Additional assumptions:

The lowest expected frequency in any cell should be 5 or more. Some authors suggest a less stringent criterion: at least 80 per cent of cells should have expected frequencies of 5 or more. If you have a 1 by 2 or a 2 by 2 table, it is recommended that the expected frequency be at least 10. If you have a 2 by 2 table that violates this assumption, you should consider using Fisher's Exact Probability Test instead (also provided as part of the output from chi-square).

Parametric alternative: none

Page 17: Non-Parametric Tests

Procedure for chi-square

1. From the menu at the top of the screen click on: Analyze, then click on Descriptive Statistics, then on Crosstabs.
2. Click on one of your variables (e.g. sex) to be your row variable, and click on the arrow to move it into the box marked Row(s).
3. Click on the other variable to be your column variable (e.g. smoker), and click on the arrow to move it into the box marked Column(s).
4. Click on the Statistics button. Choose Chi-square. Click on Continue.
5. Click on the Cells button.
6. In the Counts box, click on the Observed and Expected boxes.
7. In the Percentage section click on the Row, Column and Total boxes. Click on Continue and then OK.
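Outside SPSS, the same analysis can be sketched in a few lines. The example below is a hypothetical illustration in Python with scipy (it is not part of the original slides, and the counts are invented): it runs the chi-square test for independence with Yates' continuity correction, the value SPSS reports in the "Continuity Correction" row for a 2 by 2 table, and checks the minimum-expected-frequency assumption.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 crosstabulation: rows = sex (male, female),
# columns = smoking status (smoker, non-smoker).
observed = np.array([[33, 151],
                     [45, 173]])

# correction=True applies Yates' continuity correction,
# recommended for 2x2 tables (see the procedure above).
chi2, p, dof, expected = chi2_contingency(observed, correction=True)

print(f"chi2({dof}) = {chi2:.3f}, p = {p:.3f}")

# Check the 'minimum expected cell frequency' assumption
# (every expected count should be 5 or more):
print(f"lowest expected cell frequency: {expected.min():.2f}")
```

The `expected` array returned alongside the statistic plays the same role as the Expected counts requested in step 6 of the SPSS procedure.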

Page 18: Non-Parametric Tests

The output

Page 19: Non-Parametric Tests

Cont.

Page 20: Non-Parametric Tests

Interpretation of output from chi-square

Assumptions

The first thing you should check is whether you have violated one of the assumptions of chi-square concerning the 'minimum expected cell frequency', which should be 5 or greater (or at least 80 per cent of cells should have expected frequencies of 5 or more). This information is given in a footnote below the final table (labelled Chi-Square Tests). Footnote b in the example provided indicates that '0 cells (.0%) have expected count less than 5'. This means that we have not violated the assumption, as all our expected cell sizes are greater than 5 (in our case, greater than 35.87).

Page 21: Non-Parametric Tests

Cont. Chi-square tests

The main value that you are interested in from the output is the Pearson chi-square value. If you have a 2 by 2 table (i.e. each variable has only two categories), then you should use the value in the second row (Continuity Correction). In the example presented above the corrected value is .337, with an associated significance level of .56 (Asymp. Sig. (2-sided)). In this case the value of .56 is larger than the alpha value of .05, so we can conclude that our result is not significant. This means that the proportion of males who smoke is not significantly different from the proportion of females who smoke.

Page 22: Non-Parametric Tests

Summary information

To find what percentage of each sex smoke, you will need to look at the summary information provided in the table labelled SEX*SMOKE Crosstabulation. This table may look a little confusing to start with, with a fair bit of information presented in each cell. To find out what percentage of males are smokers you need to read across the page in the first row, which refers to males. In this case we look at the values next to '% within sex'. For this example, 17.9 per cent of males were smokers, while 82.1 per cent were non-smokers. For females, 20.6 per cent were smokers and 79.4 per cent non-smokers. If we wanted to know what percentage of the sample as a whole smoked, we would move down to the Total row, which summarises across both sexes. In this case we would look at the values next to '% of total'. According to these results, 19.5 per cent of the sample smoked and 80.5 per cent were non-smokers.

Page 23: Non-Parametric Tests


2. The Chi-Square Test for Goodness-of-Fit

The chi-square test for goodness-of-fit uses frequency data from a sample to test hypotheses about the shape or proportions of a population.

Each individual in the sample is classified into one category on the scale of measurement.

The data, called observed frequencies, simply count how many individuals from the sample are in each category.
Page 24: Non-Parametric Tests


The Chi-Square Test for Goodness-of-Fit (cont.)

The null hypothesis specifies the proportion of the population that should be in each category.

The proportions from the null hypothesis are used to compute expected frequencies that describe how the sample would appear if it were in perfect agreement with the null hypothesis.
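As a hypothetical illustration (not from the slides), this computation can be sketched in Python with scipy. When no expected frequencies are supplied, `scipy.stats.chisquare` tests the observed counts against equal proportions in every category, which is the usual null for a simple preference question; the counts below are invented.

```python
from scipy.stats import chisquare

# Hypothetical observed frequencies: preference counts for three sodas,
# one category per individual (N = 55).
observed = [28, 15, 12]

# With no expected frequencies given, the null hypothesis is equal
# proportions: expected = N / 3 ≈ 18.33 per category.
stat, p = chisquare(observed)

print(f"chi2(2, N = {sum(observed)}) = {stat:.2f}, p = {p:.3f}")
```

Supplying a `f_exp=` argument would test against unequal hypothesised proportions instead.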

Page 25: Non-Parametric Tests
Page 26: Non-Parametric Tests

Procedure for chi-square

Page 27: Non-Parametric Tests

3. Mann-Whitney U Test

This technique is used to test for differences between two independent groups on a continuous measure. For example, do males and females differ in terms of their self-esteem?

This test is the non-parametric alternative to the t-test for independent samples.

The Mann-Whitney U Test compares medians. It converts the scores on the continuous variable to ranks across the two groups, and then evaluates whether the ranks for the two groups differ significantly.

As the scores are converted to ranks, the actual distribution of the scores does not matter.

Page 28: Non-Parametric Tests

Summary for Mann-Whitney U Test

Example of research question: Do males and females differ in terms of their levels of self-esteem? Do males have higher levels of self-esteem than females?

What you need: Two variables:
• one categorical variable with two groups (e.g. sex); and
• one continuous variable (e.g. total self-esteem).

Assumptions: the general assumptions for non-parametric techniques presented at the beginning of this presentation.

Parametric alternative: Independent-samples t-test.

Page 29: Non-Parametric Tests

Procedure for Mann-Whitney U Test

1. From the menu at the top of the screen click on: Analyze, then click on Nonparametric Tests, then on 2 Independent Samples.
2. Click on your continuous (dependent) variable (e.g. total self-esteem) and move it into the Test Variable List box.
3. Click on your categorical (independent) variable (e.g. sex) and move it into the Grouping Variable box.
4. Click on the Define Groups button. Type in the value for Group 1 (e.g. 1) and for Group 2 (e.g. 2). These are the values that were used to code your values for this variable (see your codebook). Click on Continue.
5. Make sure that the Mann-Whitney U box is ticked under the section labelled Test Type. Click on OK.
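The same test can be illustrated outside SPSS. This is a hedged sketch in Python with scipy, using invented self-esteem scores rather than the data behind the slides; the two-sided alternative mirrors the Asymp. Sig. (2-tailed) value discussed below.

```python
from scipy.stats import mannwhitneyu

# Hypothetical total self-esteem scores for two independent groups.
male_scores   = [32, 28, 35, 30, 27, 31, 29, 33]
female_scores = [30, 26, 29, 34, 28, 25, 27]

# Scores from both groups are pooled and ranked; U summarises how the
# ranks are distributed between the groups.
u, p = mannwhitneyu(male_scores, female_scores, alternative="two-sided")

print(f"U = {u}, p = {p:.3f}")
```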

Page 30: Non-Parametric Tests

The output

Page 31: Non-Parametric Tests

Interpretation of output from Mann-Whitney U Test

The two values that you need to look at in your output are the Z value and the significance level, which is given as Asymp. Sig (2-tailed).

If your sample size is larger than 30, SPSS will give you the value for a Z-approximation test which includes a correction for ties in the data.

In the example given above, the Z value is –1.23 (rounded) with a significance level of p=.22. The probability value (p) is not less than or equal to .05, so the result is not significant. There is no statistically significant difference in the self-esteem scores of males and females.

Page 32: Non-Parametric Tests

4. Wilcoxon Signed Rank Test

The Wilcoxon Signed Rank Test (also referred to as the Wilcoxon matched pairs signed ranks test) is designed for use with repeated measures: that is, when your subjects are measured on two occasions, or under two different conditions.

It is the non-parametric alternative to the repeated measures t-test, but instead of comparing means the Wilcoxon converts scores to ranks and compares them at Time 1 and at Time 2.

The Wilcoxon can also be used in situations involving a matched subject design, where subjects are matched on specific criteria.

Page 33: Non-Parametric Tests

Summary for Wilcoxon Signed Rank Test

Example of research question: Is there a change in the scores on the Fear of Statistics test from Time 1 to Time 2?

What you need: One group of subjects measured on the same continuous scale or criterion on two different occasions. The variables involved are scores at Time 1 or Condition 1, and scores at Time 2 or Condition 2.

Assumptions: See general assumptions for non-parametric techniques presented at the beginning of this presentation.

Parametric alternative: Paired-samples t-test.

Page 34: Non-Parametric Tests

Procedure for Wilcoxon Signed Rank Test

1. From the menu at the top of the screen click on: Analyze, then click on Nonparametric Tests, then on 2 Related Samples.
2. Click on the variables that represent the scores at Time 1 and at Time 2 (e.g. fost1, fost2). Move these into the Test Pairs List box.
3. Make sure that the Wilcoxon box is ticked in the Test Type section. Click on OK.
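As a hypothetical sketch of the same repeated-measures test in Python with scipy (the paired scores below are invented, not the slides' fost1/fost2 data):

```python
from scipy.stats import wilcoxon

# Hypothetical Fear of Statistics scores for the SAME subjects
# measured at Time 1 and Time 2 (repeated measures).
fost1 = [40, 38, 42, 35, 37, 39, 41, 36]
fost2 = [35, 33, 39, 30, 34, 35, 38, 31]

# The test ranks the paired differences and compares the sums of the
# positive and negative ranks; here every subject's score dropped.
stat, p = wilcoxon(fost1, fost2)

print(f"W = {stat}, p = {p:.4f}")
```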

Page 35: Non-Parametric Tests

The output

Page 36: Non-Parametric Tests

Interpretation of output from Wilcoxon Signed Rank Test

The two things you are interested in from the output are the Z value and the associated significance level, presented as Asymp. Sig. (2-tailed).

If the significance level is equal to or less than .05 (e.g. .04, .01, .001) then you can conclude that the difference between the two scores is statistically significant.

In this example the Sig. value is .000 (which really means less than .0005). Therefore we can conclude that the two sets of scores are significantly different.

Page 37: Non-Parametric Tests

5. Kruskal-Wallis Test

The Kruskal-Wallis Test (sometimes referred to as the Kruskal-Wallis H Test) is the non-parametric alternative to a one-way between-groups analysis of variance.

It allows you to compare the scores on some continuous variable for three or more groups. It is similar in nature to the Mann-Whitney test presented earlier in this presentation, but it allows you to compare more than just two groups.

Scores are converted to ranks and the mean rank for each group is compared. This is a ‘between groups’ analysis, so different people must be in each of the different groups.

Page 38: Non-Parametric Tests

Summary for Kruskal-Wallis Test

Example of research question: Is there a difference in optimism levels across three age levels?

What you need: Two variables:
• one categorical independent variable with three or more categories (e.g. agegp3: 18–29, 30–44, 45+); and
• one continuous dependent variable (e.g. total optimism).

Assumptions: See general assumptions for non-parametric techniques presented at the beginning of this presentation.

Parametric alternative: One-way between-groups analysis of variance.

Page 39: Non-Parametric Tests

Procedure for Kruskal-Wallis Test

1. From the menu at the top of the screen click on: Analyze, then click on Nonparametric Tests, then on K Independent Samples.
2. Click on your continuous (dependent) variable (e.g. total optimism) and move it into the Test Variable List box.
3. Click on your categorical (independent) variable (e.g. agegp3) and move it into the Grouping Variable box.
4. Click on the Define Range button. Type the first value of your categorical variable (e.g. 1) in the Minimum box. Type the largest value for your categorical variable (e.g. 3) in the Maximum box. Click on Continue.
5. In the Test Type section make sure that the Kruskal-Wallis H box is ticked. Click on OK.
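A hedged Python/scipy illustration of the same between-groups test (the optimism scores and group sizes below are invented for the example):

```python
from scipy.stats import kruskal

# Hypothetical total optimism scores for three independent age groups.
age_18_29 = [18, 20, 19, 22, 17, 21]
age_30_44 = [20, 23, 21, 24, 22, 25]
age_45_up = [24, 26, 23, 27, 25, 28]

# All scores are pooled and ranked, then the mean rank of each group is
# compared; the H statistic is evaluated against a chi-square distribution.
h, p = kruskal(age_18_29, age_30_44, age_45_up)
df = 3 - 1  # degrees of freedom = number of groups - 1

print(f"chi2({df}) = {h:.2f}, p = {p:.3f}")
```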

Page 40: Non-Parametric Tests

The output

Page 41: Non-Parametric Tests

Interpretation of output from Kruskal-Wallis Test

The main pieces of information you need from this output are the Chi-Square value, the degrees of freedom (df) and the significance level (presented as Asymp. Sig.).

If this significance level is a value less than .05 (e.g. .04, .01, .001), then you can conclude that there is a statistically significant difference in your continuous variable across the three groups.

You can then inspect the Mean Rank for the three groups presented in your first output table. This will tell you which of the groups had the highest overall ranking, corresponding to the highest score on your continuous variable.

In the output presented above the significance level was .01 (rounded). This is less than the alpha level of .05, so these results suggest that there is a difference in optimism levels across the different age groups. An inspection of the mean ranks for the groups suggests that the older group (45+) had the highest optimism scores, with the younger group reporting the lowest.

Page 42: Non-Parametric Tests

6. Friedman Test

The Friedman Test is the non-parametric alternative to the one-way repeated measures analysis of variance.

It is used when you take the same sample of subjects or cases and measure them at three or more points in time, or under three different conditions.

Page 43: Non-Parametric Tests

Summary for Friedman Test

Example of research question: Is there a change in Fear of Statistics scores across three time periods (pre-intervention, post-intervention and at follow-up)?

What you need: One sample of subjects, measured on the same scale at three different time periods, or under three different conditions.

Assumptions: See general assumptions for non-parametric techniques.

Parametric alternative: Repeated measures (within-subjects) analysis of variance.

Page 44: Non-Parametric Tests

Procedure for Friedman Test

1. From the menu at the top of the screen click on: Analyze, then click on Nonparametric Tests, then on K Related Samples.
2. Click on the variables that represent the three measurements (e.g. fost1, fost2, fost3).
3. In the Test Type section check that the Friedman option is selected. Click on OK.
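A hypothetical Python/scipy sketch of the same within-subjects test (the three sets of scores are invented, mirroring the fost1/fost2/fost3 layout only in shape):

```python
from scipy.stats import friedmanchisquare

# Hypothetical Fear of Statistics scores for the SAME seven subjects
# at three time points.
fost1 = [41, 39, 44, 37, 40, 38, 42]   # pre-intervention
fost2 = [37, 35, 40, 33, 36, 35, 38]   # post-intervention
fost3 = [34, 32, 36, 30, 33, 31, 35]   # follow-up

# Scores are ranked WITHIN each subject across the three occasions,
# and the rank sums per occasion are compared.
stat, p = friedmanchisquare(fost1, fost2, fost3)

print(f"chi2(2) = {stat:.2f}, p = {p:.4f}")
```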

Page 45: Non-Parametric Tests

The output

Page 46: Non-Parametric Tests

Interpretation of output from Friedman Test

The results of this test suggest that there are significant differences in the Fear of Statistics scores across the three time periods.

This is indicated by a Sig. level of .000 (which really means less than .0005). Comparing the ranks for the three sets of scores, it appears that there was a steady decrease in Fear of Statistics scores over time.

Page 47: Non-Parametric Tests

Reporting Statistics in APA Style

The following examples illustrate how to report statistics in the text of a research report. You will note that significance levels in journal articles, especially in tables, are often reported as either "p > .05," "p < .05," "p < .01," or "p < .001."

APA style dictates reporting the exact p value within the text of a manuscript (unless the p value is less than .001).

Please pay attention to issues of italics and spacing; APA style is very precise about these. Also, with the exception of some p values, most statistics should be rounded to two decimal places.
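The p-value rule above is mechanical enough to encode. This is a minimal, hypothetical helper (the function name is ours, not APA's), following the conventions of exact p values, "p < .001" for very small values, and no leading zero for statistics that cannot exceed 1:

```python
def format_p(p: float) -> str:
    """Format a p value in APA style: exact to three decimals with the
    leading zero dropped, or 'p < .001' when p is below .001."""
    if p < 0.001:
        return "p < .001"
    # "0.034" -> ".034": APA drops the leading zero for p values.
    return "p = " + f"{p:.3f}".lstrip("0")

print(format_p(0.034))    # p = .034
print(format_p(0.0004))   # p < .001
```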

Page 48: Non-Parametric Tests

EXAMPLES

Mean and Standard Deviation are most clearly presented in parentheses:

The sample as a whole was relatively young (M = 19.22, SD = 3.45).
The average age of students was 19.22 years (SD = 3.45).

Percentages are also most clearly displayed in parentheses with no decimal places:

Nearly half (49%) of the sample was married.

Page 49: Non-Parametric Tests

CONT.

Reporting a significant single-sample t-test (μ ≠ μ0):

Students taking statistics courses in psychology at the University of Washington reported studying more hours for tests (M = 121, SD = 14.2) than did UW college students in general, t(33) = 2.10, p = .034.

Reporting a significant t-test for dependent groups (μ1 ≠ μ2):

Results indicate a significant preference for pecan pie (M = 3.45, SD = 1.11) over cherry pie (M = 3.00, SD = .80), t(15) = 4.00, p = .001.

Page 50: Non-Parametric Tests

CONT.

Reporting a t-test for independent groups (μ1 ≠ μ2):

UW students taking statistics courses in Psychology had higher IQ scores (M = 121, SD = 14.2) than did those taking statistics courses in Statistics (M = 117, SD = 10.3), t(44) = 1.23, p = .09.

Over a two-day period, participants drank significantly fewer drinks in the experimental group (M = 0.667, SD = 1.15) than did those in the wait-list control group (M = 8.00, SD = 2.00), t(4) = -5.51, p = .005.

Page 51: Non-Parametric Tests

CONT.

Reporting a significant omnibus F test for a one-way ANOVA:

An analysis of variance showed that the effect of noise was significant, F(3, 27) = 5.94, p = .007. Post hoc analyses using the Scheffé post hoc criterion for significance indicated that the average number of errors was significantly lower in the white noise condition (M = 12.4, SD = 2.26) than in the other two noise conditions (traffic and industrial) combined (M = 13.62, SD = 5.56), F(3, 27) = 7.77, p = .042.

Page 52: Non-Parametric Tests

CONT.

Reporting the results of a chi-square test of independence:

A chi-square test of independence was performed to examine the relation between religion and college interest. The relation between these variables was significant, χ²(2, N = 170) = 14.14, p < .01. Catholic teens were less likely to show an interest in attending college than were Protestant teens.

Reporting the results of a chi-square test of goodness of fit:

A chi-square test of goodness-of-fit was performed to determine whether the three sodas were equally preferred. Preference for the three sodas was not equally distributed in the population, χ²(2, N = 55) = 4.53, p < .05.

Page 53: Non-Parametric Tests

Cont.

Regression results are often best presented in a table. APA doesn't say much about how to report regression results in the text, but if you would like to report the regression in the text of your Results section, you should at least present the unstandardized or standardized slope (beta), whichever is more interpretable given the data, along with the t-test and the corresponding significance level. (Degrees of freedom for the t-test is N - k - 1, where k equals the number of predictor variables.) It is also customary to report the percentage of variance explained along with the corresponding F test.

Social support significantly predicted depression scores, b = -.34, t(225) = 6.53, p < .001. Social support also explained a significant proportion of variance in depression scores, R² = .12, F(1, 225) = 42.64, p < .001.

Page 54: Non-Parametric Tests

Cont.

Correlations are reported with the degrees of freedom (which is N - 2) in parentheses and the significance level: The two variables were strongly correlated, r(55) = .49, p < .01.

Tables are useful if you find that a paragraph has almost as many numbers as words. If you do use a table, do not also report the same information in the text; it's either one or the other.
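The same reporting pattern applies to Spearman's rank correlation, the non-parametric analogue from the comparison table earlier. A hedged Python/scipy sketch with invented paired data, printing the result in the r(df) = value, p = value form shown above:

```python
from scipy.stats import spearmanr

# Hypothetical paired observations, e.g. hours studied vs. exam score.
x = [2, 5, 1, 8, 4, 7, 3, 6, 9, 10]
y = [50, 62, 45, 80, 63, 74, 55, 70, 83, 90]

# Spearman's rho correlates the RANKS of x and y, so only the
# monotonic relationship matters, not the raw distribution.
rho, p = spearmanr(x, y)
df = len(x) - 2   # degrees of freedom reported in parentheses

print(f"r({df}) = {rho:.2f}, p = {p:.3f}")
```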

Based on:  American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: Author.

Page 55: Non-Parametric Tests

THE END