PS215: Methods in Psychology II

Week 8

Next Friday (Week 9)

Evaluating research, class test

First ten minutes of lecture (2.05-2.15)

Please come a little early

Please sit one seat space apart if possible

Please do not talk once seated, until the test finishes

There will be a lecture after the test

Learning objectives

• specific contrasts are sometimes more useful than ANOVA main effects

• linear contrasts and pairwise comparison are important examples of contrasts

• effect size can be more relevant than significance

• multiplicity affects the interpretation of results

• distinction between planned and unplanned comparisons affects interpretation of p-value

Study: Development of motor skill

50 children at five ages (11, 12, 13, 14, 15)

record how well they play a new video game

Is Age a good predictor of Game Score?

Age (years)   11   12   13   14   15
Mean score    25   30   40   50   55

Example from Rosenthal, Rosnow, and Rubin (2000)

[Figure: Score (points), axis from 0 to 60, plotted against Age (years), 11 to 15]

ANOVA

Source          SS       df   MS      F      p
-----------------------------------------------
Age levels       6,500    4   1,625   1.03   .40
Within error    70,875   45   1,575

not significant!

Should we conclude that age is not a useful predictor?

Should we conclude that age is not a useful predictor?

ANOVA main effect did not use information about the order of the ages

ANOVA tests an unfocused question: "are there any differences among the five age levels?"

A more focused question gives a more powerful test

A specific contrast

Choose a weight for each level

• weights reflect the contrast you want to test

• weights add up to zero

Age        11   12   13   14   15
Contrast   -2   -1    0   +1   +2

The contrast weights represent a specific model: the form you expect the relationship to take

ANOVA: scores are different at different ages

Linear contrast: scores go up in a straight line as age increases

In this example, the linear contrast is statistically significant:

t(45) = 2.02, p = .025 (one-tailed)
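The arithmetic behind the two tests can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own procedure: it uses the age means and the within-groups mean square from the ANOVA table, and it assumes the 50 children are split equally, 10 per age group (which is consistent with 45 error degrees of freedom). The variable names are illustrative.

```python
import math

means = [25, 30, 40, 50, 55]   # mean game score at ages 11 to 15
weights = [-2, -1, 0, 1, 2]    # linear contrast weights
n_per_group = 10               # assumption: 50 children split equally over 5 ages
ms_within = 1575               # within-groups mean square from the ANOVA table

# Contrast weights must add up to zero.
assert sum(weights) == 0

# Unfocused test: the ANOVA main effect of Age.
grand_mean = sum(means) / len(means)
ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in means)
ms_between = ss_between / (len(means) - 1)
f_omnibus = ms_between / ms_within

# Focused test: the linear contrast.
contrast_value = sum(w * m for w, m in zip(weights, means))
standard_error = math.sqrt(ms_within * sum(w ** 2 for w in weights) / n_per_group)
t_contrast = contrast_value / standard_error

print(f"Omnibus F(4, 45) = {f_omnibus:.2f}")
print(f"Linear contrast t(45) = {t_contrast:.2f}")
```

Under these assumptions the sketch reproduces F = 1.03 for the main effect and t = 2.02 for the linear contrast: the same data, but the focused question yields a much larger test statistic.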

Pairwise comparisons

Overall main effect often not an especially interesting hypothesis

Week 5 ANOVA tested whether the average comfort score was different for different drugs (main effect of 'Drug')

Effect significant, but what can you conclude?

"The drugs did not all have the same effect"

Pairwise comparisons

A more interesting question would be:

'Is aspirin more effective than Tylenol?'

When two groups are compared, it's called a pairwise comparison

You can express a pairwise comparison as a contrast too:

Drug       Aspirin   Tylenol   Nuprin   Bufferin
Contrast     +1        -1        0         0

Effect size

If aspirin is significantly better than Tylenol, should we stop ordering Tylenol for the pharmacy?

Significance level (p-value) & sample size

a very large sample can detect tiny effects

too small a sample can miss even a large effect

A very small p (e.g. p = .001) does not in itself mean a strong effect

Significance and effect size are different things
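One way to see why significance and effect size differ is to hold the effect fixed and vary the sample size. The sketch below is an illustration under stated assumptions: two independent groups of equal size n, and an effect expressed as a standardised mean difference d (the difference in group means divided by the standard deviation). In that case the two-sample t statistic is d × sqrt(n / 2). The function name is illustrative.

```python
import math


def t_statistic(d, n_per_group):
    """t for two equal-sized independent groups with standardised difference d."""
    return d * math.sqrt(n_per_group / 2)


# A tiny effect in a very large sample gives a large t.
print(f"d = 0.1, n = 10000 per group: t = {t_statistic(0.1, 10000):.2f}")

# A large effect in a very small sample gives a small t.
print(f"d = 0.8, n = 5 per group:     t = {t_statistic(0.8, 5):.2f}")
```

The first case gives t of about 7.07 and the second about 1.26, so the tiny effect would be highly significant while the large effect would not reach significance.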

To measure effect size

$$ d = \frac{M_1 - M_2}{\sigma} $$

Where:

$M_1$ and $M_2$ are the respective group means

$\sigma$ is an estimate of the population s.d.

0.2 is a "small" effect; 0.8 is a "large" effect (Cohen, 1977)
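As a rough sketch, the effect size can be computed from raw scores as follows. The slide says only that the denominator is an estimate of the population s.d.; using the pooled within-group standard deviation is an assumption made here, and the scores are made-up values for illustration.

```python
import math
import statistics


def cohens_d(group1, group2):
    """Standardised mean difference, using the pooled s.d. (an assumed choice)."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd


# Hypothetical comfort scores for two drugs (invented numbers).
drug_a = [5, 6, 7, 8, 9]
drug_b = [3, 4, 5, 6, 7]
print(f"d = {cohens_d(drug_a, drug_b):.2f}")
```

With these invented scores d is about 1.26, which would count as a large effect on Cohen's benchmarks.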

Multiplicity

Take 15 measures of individual differences

Correlate each with all the others

There will be 105 different correlations

So we expect about 5 (105 × .05 ≈ 5) to reach the 5% significance level (p < .05) by chance, even if there are no real relationships
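The counting behind this slide can be checked with a short sketch. It assumes nothing beyond the numbers on the slide; the variable names are illustrative.

```python
import math

n_measures = 15
alpha = 0.05

# Each pair of measures gives one correlation: 15 choose 2.
n_correlations = math.comb(n_measures, 2)

# Expected number reaching p < .05 if every null hypothesis is true.
expected_false_positives = n_correlations * alpha

print(f"{n_correlations} correlations")
print(f"about {expected_false_positives:.2f} expected to be 'significant' by chance")
```

This gives 105 correlations and an expected 5.25 chance results, which the slide rounds to 5.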

Not appropriate to claim statistical significance for results in such circumstances

Choice

• use a stricter, more conservative, criterion

• attempt to replicate your result

More conservative criterion

Bonferroni adjustment

For 105 comparisons

set required p-value to 0.05 / 105 ≈ .0005

Simple approach, wide applicability
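A minimal sketch of the adjustment, assuming the usual form of the Bonferroni rule (divide the overall alpha by the number of comparisons). The function names and the p-values are hypothetical, chosen only to show how the stricter criterion changes the decisions.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison p-value required to keep the overall level at alpha."""
    return alpha / n_comparisons


def significant_after_bonferroni(p_values, alpha, n_comparisons):
    """Flag each p-value as significant or not under the adjusted criterion."""
    threshold = bonferroni_threshold(alpha, n_comparisons)
    return [p < threshold for p in p_values]


# Hypothetical p-values taken from a set of 105 correlations.
p_values = [0.0001, 0.004, 0.03, 0.2]
threshold = bonferroni_threshold(0.05, 105)
flags = significant_after_bonferroni(p_values, 0.05, 105)

print(f"required p-value: {threshold:.6f}")
for p, flag in zip(p_values, flags):
    print(f"p = {p}: {'significant' if flag else 'not significant'}")
```

Only the first of these hypothetical p-values survives the adjustment, although three of the four would pass the unadjusted .05 criterion.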

Replication

Does the result continue to appear?

If it is real, it should appear again in another study

Meta-analysis takes this method further by aggregating results from several studies

Planned and unplanned comparisons

Planned (“a priori”)

contrast envisaged at the outset

follows from the logic of the study design

Treat significance values straightforwardly

Unplanned comparisons

Unplanned (“post hoc” tests)

chosen on the basis of looking at the data

often – is an unexpected difference or pattern statistically reliable?

Multiplicity issue

-- even if you actually do just one, effectively you looked at them all

Unplanned comparisons

Choice

• use a stricter, more conservative, criterion

    Bonferroni adjusted tests

    Special purpose tests, e.g. Tukey HSD

• attempt to replicate your result

Learning objectives

• specific contrasts are sometimes more useful than ANOVA main effects

• linear contrasts and pairwise comparison are important examples of contrasts

• effect size can be more relevant than significance

• multiplicity affects the interpretation of results

• distinction between planned and unplanned comparisons affects interpretation of p-value

Getting a contrast in SPSS

Syntax window (start setting up ANOVA, then choose paste)

For a two-way ANOVA

IVs: a (2 levels) x b (4 levels)

DV: y

To contrast the four means within b

Note: the F-ratio for this contrast will generally be bigger than if you collapse over a and analyse b as a one-way design, because including the extra predictor a reduces the error variance

GLM y BY a b
  /CONTRAST(b) = SPECIAL(0 0 1 -1).

Actually, the following is what you get from GLM if you set up a two-way ANOVA, and the same contrast can be added.

UNIANOVA y BY a b
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /CRITERIA = ALPHA(.05)
  /DESIGN = a b a*b
  /CONTRAST(b) = SPECIAL(0 0 1 -1).
