PS215: Methods in Psychology II
Week 8
2
Next Friday (Week 9)
Evaluating research, class test
First ten minutes of lecture (2.05-2.15)
Please come a little early
Please sit one seat space apart if possible
Please do not talk once seated, until the test finishes
There will be a lecture after the test
3
Learning objectives
• specific contrasts are sometimes more useful than ANOVA main effects
• linear contrasts and pairwise comparison are important examples of contrasts
• effect size can be more relevant than significance
• multiplicity affects the interpretation of results
• distinction between planned and unplanned comparisons affects interpretation of p-value
4
Study: Development of motor skill
50 children at five ages (11, 12, 13, 14, 15)
record how well they play a new video game
Is Age a good predictor of Game Score?
Age 11 12 13 14 15
Score 25 30 40 50 55
Example from Rosenthal, Rosnow, and Rubin (2000)
5
[Figure: Score (points) plotted against Age (years) for ages 11 to 15; vertical axis runs from 0 to 60]
6
ANOVA
Source          SS      df   MS     F     p
Age levels      6,500    4   1,625  1.03  .40
Within error    70,875  45   1,575
not significant!
Should we conclude that age is not a useful predictor?
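The F-ratio in the table can be checked from the group means. This is a minimal sketch, assuming the 50 children are split equally (10 per age group, which the slide does not state explicitly) and taking the within-groups mean square straight from the table. The variable names are illustrative only.

```python
# Reproduce the one-way ANOVA F-ratio from the summary numbers on the slide.
means = [25, 30, 40, 50, 55]   # mean Score at ages 11..15 (from the slide)
n_per_group = 10               # ASSUMPTION: 50 children split equally over 5 ages
ms_within = 1575               # within-groups mean square (from the ANOVA table)

# With equal group sizes the grand mean is the mean of the group means.
grand_mean = sum(means) / len(means)

# Between-groups sum of squares, degrees of freedom, and mean square.
ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in means)
df_between = len(means) - 1
ms_between = ss_between / df_between

f_ratio = ms_between / ms_within
print(f"SS = {ss_between:.0f}, df = {df_between}, MS = {ms_between:.0f}, F = {f_ratio:.2f}")
# prints: SS = 6500, df = 4, MS = 1625, F = 1.03
```

Note that nothing in this calculation uses the order of the ages: shuffling the five means gives the same F.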
7
Should we conclude that age is not a useful predictor?
ANOVA main effect did not use information about the order of the ages
ANOVA tests an unfocused question: "any differences among the five age levels"
A more focused question gives a more powerful test
8
A specific contrast
Choose
– a weight for each level
– weights reflect the contrast you want to test
– weights add up to zero
Age 11 12 13 14 15
Contrast -2 -1 0 1 2
The contrast weights represent a specific model: the form you expect the relationship to take
9
ANOVA - scores are different at different ages
Linear contrast - scores go up in a straight line as age increases
In this example, the linear contrast is statistically significant:
t(45) = 2.02, p = .025
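The t value for the linear contrast can be reproduced from the means, the weights -2, -1, 0, 1, 2, and the ANOVA error term. This is a sketch under the assumption of 10 children per age group (50 children, five ages); the variable names are illustrative.

```python
import math

means = [25, 30, 40, 50, 55]    # mean Score at ages 11..15 (from the slide)
weights = [-2, -1, 0, 1, 2]     # linear contrast weights (from the slide)
n_per_group = 10                # ASSUMPTION: equal group sizes, 50 / 5
ms_within = 1575                # error mean square from the ANOVA table
df_within = 45                  # error degrees of freedom from the ANOVA table

# Contrast weights must add up to zero.
assert sum(weights) == 0

# Contrast value: weighted sum of the group means.
contrast_value = sum(w * m for w, m in zip(weights, means))

# Standard error of the contrast for equal group sizes.
standard_error = math.sqrt(ms_within * sum(w ** 2 for w in weights) / n_per_group)

t_stat = contrast_value / standard_error
print(f"L = {contrast_value}, t({df_within}) = {t_stat:.2f}")
# prints: L = 80, t(45) = 2.02
```

The same error term that gave F = 1.03 for the unfocused test gives t = 2.02 for the focused one. The p = .025 on the slide appears to be a one-tailed value, which fits a directional prediction that scores rise with age.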
10
Pairwise comparisons
Overall main effect often not an especially interesting hypothesis
Week 5 ANOVA tested whether the average comfort score was different for different drugs (main effect of 'Drug')
Effect significant, but what can you conclude?
"The drugs did not all have the same effect"
11
Pairwise comparisons
A more interesting question would be:
'Is aspirin more effective than Tylenol?'
When two groups are compared, it's called a pairwise comparison
You can express a pairwise comparison as a contrast too:
Drug      Aspirin  Tylenol  Nuprin  Bufferin
Contrast  +1       -1       0       0
12
Effect size
If aspirin is significantly better than Tylenol,
should we stop ordering Tylenol for the pharmacy?
Significance level (p-value) & sample size
a very large sample can detect tiny effects
too small a sample can miss even a large effect
A very small p (e.g. p = .001) does not in itself mean a strong effect
Significance and effect size are different things
13
To measure effect size
$d = \dfrac{M_1 - M_2}{\sigma}$
Where:
$M_1$ and $M_2$ are the respective group means
$\sigma$ is an estimate of the population s.d.
0.2 is a "small" effect; 0.8 is a "large" effect (Cohen, 1977)
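As an illustration, d (the difference between two group means divided by an estimate of the population s.d.) can be computed for the youngest and oldest groups in the video game study. This sketch assumes the square root of the ANOVA within-groups mean square (1,575) is an acceptable s.d. estimate; the function name `cohens_d` is hypothetical.

```python
import math

def cohens_d(mean_1, mean_2, sd_estimate):
    """Standardised difference between two group means."""
    return (mean_1 - mean_2) / sd_estimate

# ASSUMPTION: use the square root of the within-groups mean square
# from the earlier ANOVA table as the estimate of the population s.d.
ms_within = 1575
sd_estimate = math.sqrt(ms_within)

# Mean scores for age 15 (55) and age 11 (25) from the video game study.
d = cohens_d(55, 25, sd_estimate)
print(f"s.d. estimate = {sd_estimate:.1f}, d = {d:.2f}")
# prints: s.d. estimate = 39.7, d = 0.76
```

On Cohen's benchmarks this is close to a "large" effect, even though the overall ANOVA main effect was not significant.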
14
Multiplicity
Take 15 measures of individual differences
Correlate each with all the others
There will be 105 different correlations
So we expect about 5 (105 x .05 = 5.25) to reach the 5% significance level (p < .05) by chance, even if there are no real relationships
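The counts on this slide can be checked directly. The last line is an extra illustration that assumes the 105 tests are independent, which correlations among the same 15 measures are not, so treat it as a rough guide only.

```python
import math

n_measures = 15
alpha = 0.05

# Number of distinct pairs among 15 measures: 15 * 14 / 2.
n_correlations = math.comb(n_measures, 2)

# Expected number of "significant" results if no real relationships exist.
expected_false_positives = n_correlations * alpha

# ASSUMPTION: independent tests (not strictly true here).
# Chance that at least one correlation reaches p < .05 purely by chance.
p_at_least_one = 1 - (1 - alpha) ** n_correlations

print(n_correlations)                       # prints: 105
print(f"{expected_false_positives:.2f}")    # prints: 5.25
print(f"{p_at_least_one:.3f}")
```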
15
Not appropriate to claim statistical significance for results in such circumstances
Choice
• use a stricter, more conservative, criterion
• attempt to replicate your result
16
More conservative criterion
Bonferroni adjustment
For 105 comparisons
set required p-value to 0.05 / 105
Simple approach, wide applicability
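A minimal sketch of the Bonferroni adjustment for the 105 correlations. The function name and the three p-values are hypothetical, made up purely to show how the stricter criterion is applied.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Required p-value per test so the overall error rate stays near alpha."""
    return alpha / n_comparisons

threshold = bonferroni_threshold(0.05, 105)
print(f"{threshold:.5f}")   # prints: 0.00048

# HYPOTHETICAL p-values, for illustration only (not from any real study).
p_values = [0.0001, 0.003, 0.04]
still_significant = [p <= threshold for p in p_values]
print(still_significant)    # prints: [True, False, False]
```

Results that would pass at .05 (here .003 and .04) no longer count as significant under the adjusted criterion.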
17
Replication
Does the result continue to appear?
If it is real, it should appear again in another study
Meta-analysis takes this method further by aggregating results from several studies
18
Planned and unplanned comparisons
Planned (“a priori”)
contrast envisaged at the outset
follows from the logic of the study design
Treat significance values straightforwardly
19
Unplanned comparisons
Unplanned (“post hoc” tests)
chosen on the basis of looking at the data
often – is an unexpected difference or pattern statistically reliable?
Multiplicity issue
-- even if you actually do just one, effectively you looked at them all
20
Unplanned comparisons
Choice
• use a stricter, more conservative, criterion
Bonferroni adjusted tests
Special purpose tests
e.g. Tukey HSD
• attempt to replicate your result
21
Learning objectives
• specific contrasts are sometimes more useful than ANOVA main effects
• linear contrasts and pairwise comparison are important examples of contrasts
• effect size can be more relevant than significance
• multiplicity affects the interpretation of results
• distinction between planned and unplanned comparisons affects interpretation of p-value
22
Getting a contrast in SPSS
Syntax window (start setting up ANOVA, then choose paste)
For a two-way ANOVA
IVs: a (2 levels) x b (4 levels)
DV: y
To contrast the four means within b
Note: the F-ratio for this contrast is bigger than if you collapse over a and treat b as a one-way design, because including the extra predictor a will reduce error variance
GLM y BY a b
  /CONTRAST(b) = SPECIAL(0 0 1 -1).
Actually, the following is what you get from GLM if you set up a two-way ANOVA, and the same contrast can be added.
UNIANOVA y BY a b
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /CRITERIA = ALPHA(.05)
  /DESIGN = a b a*b
  /CONTRAST(b) = SPECIAL(0 0 1 -1).