TRANSCRIPT
PSY2004 Research Methods / PSY2005 Applied Research Methods
Week Five
Today: general principles (how it works, what it tells you, etc.)
Next week: extra bits and bobs (assumptions, follow-on analyses, effect sizes)
ANalysis Of VAriance
A ‘group’ of statistical tests
Useful, hence widely used
used with a variety of designs
start with independent groups & 1 independent variable.
other designs later
revision [inferential stats]
trying to make inference about POPULATION on basis of SAMPLE
but sampling ‘error’ means sample not quite equal to population
Two hypotheses for, e.g., a difference between group means in your sample
H0 - just sampling error, no difference between population means
H1 - a difference between the population means
Decide on the basis of probability of getting your sample were H0 to be true
if that probability is low
if such a difference between sample means would be unlikely were H0 true
if it would be a rare event
we reject H0 (and so accept H1)
low / unlikely / rare = < 0.05 (5%)
what is ANOVA for?
despite its name (analysis of variance)
looks at differences between means
looks at differences between group means in sample
to make inference about differences between group means in the population
but we already have the t-test
• used to compare means
• e.g., PSY1016 last year: difference between males’ & females’ mean Trait Emotional Intelligence scores
• can only compare two means at a time
• what if we have more than two groups/means?
e.g.,
comparison of drug treatments for offenders
12 step programme
cognitive-behavioural motivational intervention
standard care
DV: no. of days drugs taken in a month
e.g.,
comparison of coaching methods
professional coaching
peer coaching
standard tutorial
DV: self-reported goal attainment (1-5 scale)
we can use the t-test here
• just lots of them
• multiple comparisons
• 12-step vs cog-behavioural
• 12-step vs standard care
• cog-behavioural vs standard care
• professional coaching vs peer
• professional vs standard tutorial
• peer vs standard tutorial
• bit messy / longwinded, but computer does the hard work
• far more serious potential problem …
increased chance of making a mistake
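The list of comparisons above can be generated mechanically. A minimal sketch, assuming each pair of groups gets one t-test; the function name `pairwise_comparisons` is made up for illustration.

```python
from itertools import combinations

def pairwise_comparisons(groups):
    """Return every unordered pair of groups (one t-test per pair)."""
    return list(combinations(groups, 2))

drug_treatments = ["12-step", "cog-behavioural", "standard care"]
pairs = pairwise_comparisons(drug_treatments)
for first, second in pairs:
    print(f"{first} vs {second}")

# The number of pairs is k * (k - 1) / 2, so it grows quickly with more groups.
n_pairs_three_groups = len(pairs)
n_pairs_five_groups = len(pairwise_comparisons(range(5)))
print(n_pairs_three_groups, n_pairs_five_groups)
```

Three groups give three comparisons; five groups would already need ten.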
statistical inference based on probability
not certain
always a chance that we will make the wrong decision
two types of mistake
type I – reject H0 when it is in fact true
decide there’s a difference between population means when there isn’t
[false positive]
type II – fail to reject H0 when it is in fact false
[false negative]
type I error [false positive]
we reject H0 when p < 0.05
i.e., less than 5% chance of getting our data (or more extreme) were H0 to be true
5% is small, but it isn’t zero
still a chance of H0 being true and getting our data
still a chance of rejecting H0 but it being true
alpha (α) [criterion for rejecting H0 - typically 5%]
sets a limit on probability of making type I error
if H0 true we would only reject it 5% of the time
but multiple comparisons change the situation ….
russian roulette [with six chamber revolver]
one bullet, spin the cylinder
muzzle to temple, pull trigger
one-in-six chance of blowing your brains out
russian roulette
with three guns
each gun on its own
a one-in-six chance of blowing your brains out
for the ‘family’ of three guns
the probability of you getting a bullet in your brain is worryingly higher
russian roulette [with twenty chamber revolver]
one-in-twenty (5%) chance of blowing your brains out
with three such guns
probability = 1 – (.95 x .95 x .95) = .14
just the same with multiple comparisons [with one obvious difference]
each individual t-test
maximum type I error rate (α) of 5%
for a ‘family’ of three such t-tests
error rate = 1 – (.95 x .95 x .95) = .14
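The same sum can be done for any number of comparisons. A rough sketch, assuming the tests are independent of one another (pairwise t-tests on the same groups are not strictly independent, so treat the figures as approximate); `family_wise_error_rate` is an illustrative name.

```python
def family_wise_error_rate(alpha, k):
    """Chance of at least one type I error across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

# Three comparisons at alpha = .05, as in the example above
fwer_three = family_wise_error_rate(0.05, 3)
print(round(fwer_three, 2))

# The rate keeps climbing as the 'family' gets bigger
for k in (1, 3, 6, 10):
    print(k, round(family_wise_error_rate(0.05, k), 3))
```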
controlling family-wise error
various techniques
e.g., Bonferroni correction
adjust α for each individual comparison
adjusted α = α / k
where k = number of comparisons
for our ‘family’ of three comparisons
use adjusted α of .05 / 3 = .0167 for each t-test
family-wise Type I error rate limited to 5%
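A quick check of the arithmetic, as a sketch only. It assumes independent tests when working out the family-wise rate; the function names are illustrative.

```python
def bonferroni_alpha(alpha, k):
    """Adjusted per-comparison alpha: the overall alpha divided by the number of comparisons."""
    return alpha / k

def family_wise_error_rate(alpha, k):
    """Chance of at least one type I error across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

adjusted = bonferroni_alpha(0.05, 3)
fwer_adjusted = family_wise_error_rate(adjusted, 3)

print(round(adjusted, 4))       # per-test criterion
print(round(fwer_adjusted, 4))  # family-wise rate with the adjusted criterion
```

With the adjusted criterion the family-wise rate comes out just under .05.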
and all is well ….
… actually it isn’t.
Such ‘corrections’ come at a price
increased chance of making a type II error
failing to reject H0 when it is in fact false[false negative]
less chance of detecting an effect when there is one
aka low ‘power’ [more of this in Week 10]
Moving from comparing two means to considering three has complicated matters
we seem to face either
increased type I error rate
or
increased type II error rate [lower power]
This [finally] is where ANOVA comes in
It can help us detect any difference
between our 3 (or more) group means
without
increasing type I error rate
or
reducing power
ANOVA is another NHST [Null Hypothesis Significance Test]
need to know what your H0 and H1 are
H0 – all the population means are the same,
any differences between sample means are simply due to sampling error
H1 – the population means are not all the same
NB one-tailed vs two-tailed doesn’t apply
How does ANOVA work?
the heart of ANOVA is the F ratio
a ratio of two estimates of the population variance, both based on the sample
what’s that got to do with differences between means?
Be patient.
e.g.,
comparison of drug treatments for offenders
12 step programme
cognitive-behavioural motivational intervention
standard care
DV: no. of days drugs taken in a month
Random data generated by SPSS [just like PSY1017 W09 labs last year]
3 samples (N=48)
all from the same population
H0 [null hypothesis]
(no difference between population means)
TRUE
the sample means are not all the same
due to ‘sampling error’
they vary around the overall mean of 6.61
between-group variability
the standard deviations show how varied individual scores are for each group
within-group variability
both
between-groups variability
and
within-groups variability
can be used to estimate the population variance
Don’t worry [for now] how this is done
estimate of population variance based on
between-groups variability (differences of group means around overall mean)
= 3.07
estimate of population variance based on
within-groups variability (how varied individual scores are for each group)
= 2.02
F ratio = between-groups estimate / within-groups estimate
= 3.07 / 2.02
= 1.52
estimates unlikely to be exactly the same, but similar, and so F ratio will be approximately 1 WHEN H0 IS TRUE
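For anyone curious how the two estimates are obtained, here is a minimal sketch using the standard one-way ANOVA mean squares. It does not use the SPSS data from the slides: the population mean (6.6), standard deviation (1.4) and 16 scores per group are assumptions chosen to look roughly similar, and the function names are illustrative.

```python
import random
from statistics import mean, variance

def variance_estimates(groups):
    """Return (between-groups, within-groups) estimates of the population variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(score for g in groups for score in g)
    # Between-groups: how far the group means sit from the overall mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: how varied individual scores are inside each group, pooled
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    return ss_between / (k - 1), ss_within / (n_total - k)

def f_ratio(groups):
    between, within = variance_estimates(groups)
    return between / within

rng = random.Random(2005)  # arbitrary seed so the run is repeatable

def sample_groups():
    """Three samples drawn from ONE hypothetical population, so H0 is true."""
    return [[rng.gauss(6.6, 1.4) for _ in range(16)] for _ in range(3)]

# One set of samples: the two estimates differ, but not wildly
between, within = variance_estimates(sample_groups())
print(round(between, 2), round(within, 2), round(between / within, 2))

# Many sets of samples: F bounces around, but averages out close to 1
mean_f = mean(f_ratio(sample_groups()) for _ in range(2000))
print(round(mean_f, 2))
```

Any single F can be well above or below 1 through sampling error alone; it is the long-run average that sits near 1 when H0 is true.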
Random data generated by SPSS
3 samples (N=48), all as before but
+1 to all ‘Standard Care’ scores
H0 [null hypothesis]
(no difference between population means)
FALSE
previous example: H0 true
new example: H0 false
within-groups variability UNCHANGED
only between-groups variability affected
estimate of population variance based on
between-groups variability (differences of group means around overall mean)
= 3.07 [previous], = 9.79 [new]
estimate of population variance based on
within-groups variability (how varied individual scores are for each group)
= 2.02 [previous], = 2.02 [new]
F ratio = between-groups estimate / within-groups estimate
= 3.07 / 2.02 [previous], = 9.79 / 2.02 [new]
= 1.52 [previous], = 4.85 [new]
F ratio will tend to be larger WHEN H0 IS FALSE
as only between-groups estimate affected by differences between means.
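The same effect can be seen on a toy data set. The scores below are made up for illustration (they are not the SPSS data), and the helper uses the standard one-way ANOVA mean squares.

```python
from statistics import mean, variance

def variance_estimates(groups):
    """Return (between-groups, within-groups) estimates of the population variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(score for g in groups for score in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    return ss_between / (k - 1), ss_within / (n_total - k)

# Hypothetical scores: days drugs taken in a month
twelve_step = [5, 7, 6, 8, 6]
cog_behavioural = [6, 8, 7, 5, 7]
standard_care = [7, 6, 8, 6, 7]

between_old, within_old = variance_estimates(
    [twelve_step, cog_behavioural, standard_care])

# +1 to every 'Standard Care' score, as in the new example
shifted_care = [score + 1 for score in standard_care]
between_new, within_new = variance_estimates(
    [twelve_step, cog_behavioural, shifted_care])

print("within: ", round(within_old, 2), "->", round(within_new, 2))
print("between:", round(between_old, 2), "->", round(between_new, 2))
print("F:      ", round(between_old / within_old, 2), "->",
      round(between_new / within_new, 2))
```

Adding a constant to a group moves its mean but leaves the spread of its scores alone, which is why only the between-groups estimate (and so F) changes.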
ANOVA is another NHST
probability of getting F-ratio (or more extreme) if H0 true
If p < 0.05, reject H0
H0 – all the population means are the same
and so accept
H1 – the population means are not all the same
NB this doesn’t say anything about which means are different to which other ones
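The 'probability of getting F-ratio (or more extreme) if H0 true' can be approximated by brute force: generate lots of data sets where H0 is true and see how often F comes out at least as big as ours. A rough sketch only. It assumes N=48 means 16 scores in each of the 3 groups and normally distributed scores, and it is a simulation rather than the calculation SPSS performs; all names are illustrative.

```python
import random
from statistics import mean, variance

def f_ratio(groups):
    """One-way ANOVA F: between-groups estimate divided by within-groups estimate."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(score for g in groups for score in g)
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    within = sum((len(g) - 1) * variance(g) for g in groups) / (n_total - k)
    return between / within

def null_f_values(k=3, n_per_group=16, n_sims=2000, seed=0):
    """F ratios from simulated data sets in which H0 is true (one population)."""
    rng = random.Random(seed)
    return [
        f_ratio([[rng.gauss(0, 1) for _ in range(n_per_group)] for _ in range(k)])
        for _ in range(n_sims)
    ]

def simulated_p(observed_f, null_fs):
    """Proportion of H0-true F ratios at least as large as the observed one."""
    return sum(f >= observed_f for f in null_fs) / len(null_fs)

null_fs = null_f_values()
p_previous = simulated_p(1.52, null_fs)  # F from the H0-true example
p_new = simulated_p(4.85, null_fs)       # F from the H0-false example
print(p_previous, p_new)
```

Under these assumptions the first F is nothing unusual, while the second is rare enough to fall below the .05 criterion.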