psy2004 research methods psy2005 applied research methods week six
Post on 13-Jan-2016
PSY2004 Research Methods / PSY2005 Applied Research Methods
Week Six
Last Week
General principles: how it works, what it tells you, etc.
Today
Extra bits and bobs: assumptions, follow-on analyses, effect sizes
last week: independent groups (IG) ANOVA
can also use ANOVA with repeated measures (RM) designs
e.g., same participants tested after three different time periods
[last week]
comparison of drug treatments for offenders
12 step programme
cognitive-behavioural motivational intervention
standard care
DV: no. of days drugs taken in a month
[this week]
effect of drug treatments over time [ignoring, for now, different treatments]
1 month
6 months
12 months
DV: no. of days drugs taken in a month
repeated measures ANOVA
same general principle as IG ANOVA
some computational differences [don’t worry about these for now]
somewhat different ‘assumptions’ [more of this later]
generally (all other things being equal) more sensitive, more ‘powerful’
[revision]
why stats?
variability in the data
lots of different, random sources of variability
[revision]
we’re trying to see if changes in the
Independent Variable
• treatment type
affect scores on the
Dependent Variable
• no. of days drugs taken
[revision]
lots of other things affect the DV
• individual differences
• time of day
• mood
• level of attention
• etc etc etc etc
Lots of random, unsystematic, sources of variation, unrelated to IV
‘noise’
[revision]
trying to see any effect due to the Independent Variable on the Dependent Variable
through the ‘noise’
multiple scores from the same participants
allow you to identify
& remove the ‘noise’ in the data
due to individual differences
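A rough sketch of why this helps, using made-up scores for four participants (every number and variable name below is an illustrative assumption, not data from the module). Treating the scores as if they came from independent groups leaves individual differences in the error term; the repeated-measures partition takes them out first, so the same effect is compared against much less 'noise'.

```python
from statistics import mean

# Hypothetical scores (days of drug use in a month): one row per participant,
# columns = three time points. Illustrative numbers only.
scores = [
    [20, 17, 15],
    [12, 10, 7],
    [25, 22, 21],
    [8, 6, 4],
]

n = len(scores)       # number of participants
k = len(scores[0])    # number of conditions (time points)

grand = mean(x for row in scores for x in row)
cond_means = [mean(row[j] for row in scores) for j in range(k)]
subj_means = [mean(row) for row in scores]

ss_total = sum((x - grand) ** 2 for row in scores for x in row)
ss_conditions = n * sum((m - grand) ** 2 for m in cond_means)
ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)

# Independent-groups view: everything not due to conditions counts as error.
ss_error_ig = ss_total - ss_conditions
# Repeated-measures view: stable individual differences are removed first.
ss_error_rm = ss_total - ss_conditions - ss_subjects

f_ig = (ss_conditions / (k - 1)) / (ss_error_ig / (k * (n - 1)))
f_rm = (ss_conditions / (k - 1)) / (ss_error_rm / ((k - 1) * (n - 1)))

print(f"error SS treated as IG: {ss_error_ig:.2f}, F = {f_ig:.2f}")
print(f"error SS treated as RM: {ss_error_rm:.2f}, F = {f_rm:.2f}")
```

The IG calculation is shown only for contrast: with repeated measures data it would not be a valid analysis, because the observations are not independent.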
assumptions
characteristics our data should possess for statistical tests to be valid & accurate
tests ‘assume’ such characteristics
if data don’t meet the assumptions then the outcome of the test is likely to be wrong
need to know what they are and check they are met
IG ANOVA assumptions
independence of observations [no correlations across groups]
normally distributed scores [for population, within groups]
homogeneity of variance [HoV] [all groups come from populations with same variance]
only difference (if any) in terms of means
Levene’s test for HoV
what it doesn’t do:
does not test for differences between means
not relevant to our hypotheses about or interest in any differences between means
Levene’s test for HoV
what it does:
compares variances of the different groups
to make inference about whether the variances of the populations are different
Levene’s test for HoV
another NHST:
H0 – population variances are all the same
H1 – population variances not all the same
if statistically significant [i.e., p < 0.05] reject H0 and so accept H1
Levene’s test for HoV
a statistically significant Levene’s test suggests we should not assume HoV
should ‘correct’ the F ratio to reduce likelihood of error
two commonly used options [Welch, Brown-Forsythe] available within SPSS
impacts on choice of post-hoc test as well [more on this later]
Levene’s test for HoV
This test is unusual in that we don’t want it to be statistically significant!
if Levene’s test NOT significant then you can usually assume you have HoV
but be careful if your sample size is low: your Levene’s test [like any other NHST] might not have enough ‘power’ to detect differences in variance
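To make 'compares variances of the different groups' concrete, here is a minimal sketch of the statistic behind Levene's test in its mean-centred form: a one-way ANOVA run on each score's absolute deviation from its own group mean. The data and function names are hypothetical, and the sketch stops at the statistic W; turning W into a p value needs the F distribution with (k - 1, N - k) degrees of freedom, which is left to statistical software here.

```python
from statistics import mean

def one_way_f(groups):
    """F ratio from a one-way independent-groups ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand = mean(all_scores)
    k, n_total = len(groups), len(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def levene_w(groups):
    """Levene's W: one-way ANOVA on absolute deviations from each group mean."""
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    return one_way_f(deviations)

# Hypothetical scores (days of drug use in a month), five participants per group.
twelve_step = [10, 12, 11, 13, 9]
cog_behavioural = [8, 14, 5, 17, 11]    # noticeably more spread out
standard_care = [15, 16, 14, 17, 13]

w = levene_w([twelve_step, cog_behavioural, standard_care])
print(f"Levene's W = {w:.2f}")

# Groups that differ in their means but not in their spread give W = 0.
same_spread = levene_w([[10, 12, 11, 13, 9],
                        [20, 22, 21, 23, 19],
                        [15, 16, 14, 17, 13]])
print(f"W for equal spreads = {same_spread:.2f}")
```

The second call shows that the test ignores differences between means, which is why it says nothing about the hypotheses the ANOVA itself is testing.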
‘Robustness’
ANOVA generally quite ‘robust’
not-so-serious breaking of assumptions doesn’t increase probability of error very much
as long as sample sizes are equal
RM ANOVA assumptions
‘sphericity’
equality of variances for differences between conditions
e.g., measures taken at 3 points in time (1 month, 3 months, 12 months)
variance(1-3) = variance(3-12) = variance(1-12) [each is the variance of the difference scores between that pair of time points]
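A small sketch of what is being compared, using invented scores for five participants at the three time points in the example (the data and the helper name `difference_variances` are assumptions for illustration). For each pair of conditions it computes every participant's difference score and then the variance of those differences; sphericity says these variances should be roughly equal in the population.

```python
from itertools import combinations
from statistics import variance

# Hypothetical scores (days of drug use in a month); position i in each list
# is the same participant measured at each time point.
scores = {
    "1 month": [20, 12, 25, 8, 15],
    "3 months": [17, 10, 22, 6, 14],
    "12 months": [15, 3, 21, 4, 6],
}

def difference_variances(scores):
    """Sample variance of the difference scores for every pair of conditions."""
    result = {}
    for a, b in combinations(scores, 2):
        diffs = [x - y for x, y in zip(scores[a], scores[b])]
        result[(a, b)] = variance(diffs)
    return result

variances = difference_variances(scores)
for (a, b), v in variances.items():
    print(f"variance of ({a} - {b}) differences: {v:.2f}")
```

This only describes the sample; judging whether the population variances differ is the job of a formal test.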
Mauchly’s test for sphericity
another NHST
H0 = variances of differences between conditions are equal
H1 = variances of differences between conditions are not equal
if statistically significant [i.e., p < 0.05] reject H0 and so accept H1
Mauchly’s test for sphericity
a statistically significant Mauchly’s test suggests we should not assume sphericity
increased probability of making type I error
adjust degrees of freedom to control this [various options – covered in lab class]
impacts on choice of post-hoc test as well [more on this later]
Mauchly’s test for sphericity
This test is unusual in that we don’t want it to be statistically significant!
if Mauchly’s test NOT significant then you can usually assume you have sphericity
but be careful if your sample size is low: your Mauchly’s test [like any other NHST] might not have enough ‘power’ to detect differences in variance
Effect Size
NHST tests limited to binary decision
reject H0 or not
evidence for an effect or not
doesn’t distinguish between small and big effects
With a large enough sample, i.e., high power or sensitivity,
even a very very small effect can reach statistical significance
only rejecting hypothesis that effect is exactly equal to zero
non-zero effect can be statistically significant but small enough to be trivial
interesting and important to also
consider the size of any effect
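A quick illustration of the sample-size point, deliberately simplified: instead of an ANOVA it uses a two-group z test with a known population standard deviation, and all the numbers (a 0.2-day difference, SD of 8 days, the two sample sizes) are assumptions chosen for the example. The effect is identical in both calls; only the sample size changes.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p value for a difference between two group means,
    assuming equal group sizes and a known population SD (z test)."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same tiny effect: groups differ by 0.2 days, SD of 8 days in both.
p_small_sample = two_sided_p(mean_diff=0.2, sd=8, n_per_group=50)
p_huge_sample = two_sided_p(mean_diff=0.2, sd=8, n_per_group=50_000)

print(f"n = 50 per group:     p = {p_small_sample:.3f}")
print(f"n = 50,000 per group: p = {p_huge_sample:.5f}")
```

With the huge sample the p value falls below 0.05 even though a 0.2-day difference would probably be of no practical interest, which is why a separate effect size measure is needed.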
PSY1017 – correlation coefficients
magnitude of coefficient
strength of relationship – i.e., size of effect
‘benchmarks’: .1 = small, .3 = medium, .5 = large
want something similar for ANOVA
a statistically significant F ratio suggests some kind of effect
need measure of effect size to tell us how big (or small) the effect is
various different effect size measures
we shall use partial eta-squared (partial η²)
proportion of total variability in the sample ‘explained’ by the effect [i.e., the IV]
(and which is not explained by any other variable in the analysis)
partial η² = between-groups variability / (between-groups variability + within-groups variability)
NB – measure of between-groups variability used as basis for estimate of population variance [see last week] but isn’t equal to it.
NB 2 – for one-way ANOVA [only one IV] you don’t have to worry about the “and which is not explained by any other variable in the analysis” bit because there aren’t any other variables! But things are different when you’ve got 2 or more IVs [covered later in the module]
[Diagram: between-groups and within-groups variability, and the estimates of population variance based on them]
why partial eta-squared? (partial η²)
• it’s probably the most widely used
• SPSS can compute it for us
  • [this is a rubbish reason]
• rough ‘benchmarks’ exist
  • .01 = small; .06 = medium; .14 = large
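A minimal sketch of the calculation for a one-way independent-groups design, where 'variability' is taken to mean sums of squares. The scores, the function names and the use of the benchmarks as hard cut-offs are all assumptions for illustration; the benchmarks are only rough guides.

```python
from statistics import mean

def partial_eta_squared(groups):
    """Between-groups SS divided by (between-groups SS + within-groups SS)."""
    all_scores = [x for g in groups for x in g]
    grand = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_between / (ss_between + ss_within)

def benchmark_label(value):
    """Rough label using the .01 / .06 / .14 benchmarks as cut-offs."""
    if value >= 0.14:
        return "large"
    if value >= 0.06:
        return "medium"
    if value >= 0.01:
        return "small"
    return "below 'small'"

# Hypothetical scores (days of drug use in a month), five participants per group.
twelve_step = [10, 12, 11, 13, 9]
cog_behavioural = [8, 14, 5, 17, 11]
standard_care = [15, 16, 14, 17, 13]

eta = partial_eta_squared([twelve_step, cog_behavioural, standard_care])
print(f"partial eta-squared = {eta:.3f} ({benchmark_label(eta)})")
```

With only one IV there is nothing else in the analysis to partial out, so this is the same number as ordinary eta-squared.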
[last week]
ANOVA is another NHST
probability of getting F-ratio (or more extreme) if H0 true
If p < 0.05, reject H0
H0 – all the population means are the same
and so accept
H1 – the population means are not all the same
NB this doesn’t say anything about which means are different to which other ones
Follow-on analyses
if you have very specific hypotheses
only interested in a few specific comparisons
can specify these in advance (a priori)
Planned Comparisons (Contrasts)
not following this route in lab classes
see Field section 10.2.11 (3rd ed); 11.4 (4th ed)
if interested in all comparisons
can’t specify those of interest in advance
only looking after study completed (post hoc)
multiple comparisons / post hoc tests
12-step vs cog-behavioural
12-step vs standard care
cog-behavioural vs standard care
[last week]
Moving from comparing two means to considering three has complicated matters
we seem to face either
increased type I error rate
or
increased type II error rate [lower power]
[last week]
This [finally] is where ANOVA comes in
It can help us detect any difference
between our 3 (or more) group means
without
increasing type I error rate
or
reducing power
so, I hear you cry,
you end up doing the very procedures ANOVA is meant to spare us the pitfalls of!
why bother with the ANOVA?
only bother with post hocs if you have a statistically significant ANOVA
the ANOVA serves the purpose of detecting that there is some kind of effect in the first place [controlling Type I error rate & without cost in terms of power]
corrected multiple comparisons (post hocs) [with their reduced power]
might have missed that effect
[ANOVA can do other stuff; more on that later in the module]
Many different post hoc procedures [18 available in SPSS]
Bonferroni correction very popular
• easy to understand
• very good control over Type I error rate
• pays a heavy price in terms of power
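The arithmetic behind the Bonferroni correction can be sketched as follows, under the simplifying assumption that the comparisons are independent (the three pairwise comparisons among three groups are not strictly independent, so treat the familywise figures as approximate). The function names are illustrative.

```python
def familywise_error(alpha, n_comparisons):
    """Chance of at least one Type I error across independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons

alpha = 0.05
n_comparisons = 3   # e.g., three pairwise comparisons among three groups

uncorrected = familywise_error(alpha, n_comparisons)
per_test = bonferroni_alpha(alpha, n_comparisons)
corrected = familywise_error(per_test, n_comparisons)

print(f"familywise error rate, uncorrected: {uncorrected:.3f}")
print(f"Bonferroni per-comparison alpha:    {per_test:.4f}")
print(f"familywise error rate, corrected:   {corrected:.3f}")
```

The corrected familywise rate stays at or below .05, but each comparison now has to reach a much stricter threshold, which is where the loss of power comes from.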
procedures vary in terms of
particular balance they strike between Type I error & Type II error [power]
• those that favour controlling Type I error over concern with power [‘conservative’]
• those that relax Type I error control so as to maintain power [‘liberal’]
‘robustness’
• accuracy when sample sizes unequal and/or small, when you haven’t got HoV or Sphericity
See Field 10.2.12 (3rd ed); 11.5 (4th ed)
• need to make a judgment
• nature of your data (assumptions, sample sizes)
• which type of error worst in your situation
• no single correct answer, but has to be justified
Can’t take a ‘cookbook’ approach