psy2004 research methods psy2005 applied research methods week five

PSY2004 Research Methods / PSY2005 Applied Research Methods

Week Five

Today: general principles (how it works, what it tells you, etc.)

Next week: extra bits and bobs (assumptions, follow-on analyses, effect sizes)

ANalysis Of VAriance

A ‘group’ of statistical tests

Useful, hence widely used

used with a variety of designs

start with independent groups & 1 independent variable.

other designs later

revision [inferential stats]

trying to make inference about POPULATION on basis of SAMPLE

but sampling ‘error’ means sample not quite equal to population

Two hypotheses for e.g., difference between group means in your sample

H0 - just sampling error, no difference between population means

H1 - a difference between the population means

Decide on the basis of probability

• of getting your sample were H0 to be true

if that probability is low

if such a difference between sample means would be unlikely were H0 true

if it would be a rare event

we reject H0 (and so accept H1)

low / unlikely / rare = < 0.05 (5%)

what is ANOVA for?

despite its name (analysis of variance)

looks at differences between means

looks at differences between group means in the sample

to make inference about differences between group means in the population

but we already have the t-test

• used to compare means

• e.g., PSY1016 last year: difference between males’ & females’ mean Trait Emotional Intelligence scores

• can only compare two means at a time

• what if we have more than two groups/means?

e.g.,

comparison of drug treatments for offenders

12 step programme

cognitive-behavioural motivational intervention

standard care

DV: no. of days drugs taken in a month

e.g.,

comparison of coaching methods

professional coaching

peer coaching

standard tutorial

DV: self-reported goal attainment (1-5 scale)

we can use the t-test here

• just lots of them

• multiple comparisons

• 12-step vs cog-behavioural

• 12-step vs standard care

• cog-behavioural vs standard care

• bit messy / longwinded, but computer does the hard work

• far more serious potential problem …

• professional coaching vs peer

• professional vs standard tutorial

• peer vs standard tutorial
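A small sketch of where the list of comparisons comes from: every unordered pair of groups needs its own t-test, so k groups give k(k - 1) / 2 tests. The group labels are taken from the drug-treatment example; the helper name `n_pairwise` is made up for illustration.

```python
from itertools import combinations

# Group labels from the drug-treatment example.
treatments = ["12-step", "cog-behavioural", "standard care"]

# Every unordered pair of groups is one t-test.
pairs = list(combinations(treatments, 2))
for first, second in pairs:
    print(f"{first} vs {second}")


def n_pairwise(k):
    """Number of pairwise comparisons among k groups: k(k - 1) / 2."""
    return k * (k - 1) // 2


print(len(pairs), n_pairwise(3))      # 3 3
print(n_pairwise(4), n_pairwise(5))   # 6 10
```

The number of tests grows quickly: four groups already need six t-tests, five groups need ten.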

increased chance of making a mistake

statistical inference based on probability

not certain

always a chance that we will make the wrong decision

two types of mistake

type I – reject H0 when it is in fact true

decide there’s a difference between population means when there isn’t

[false positive]

type II – fail to reject H0 when it is in fact false [false negative]

type I error [false positive]

we reject H0 when p < 0.05

i.e., less than 5% chance of getting our data (or more extreme) were H0 to be true

5% is small, but it isn’t zero

still a chance of H0 being true and getting our data

still a chance of rejecting H0 but it being true

alpha (α) [criterion for rejecting H0 - typically 5%]

sets a limit on probability of making type I error

if H0 true we would only reject it 5% of the time
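The 5% limit can be checked by simulation. The sketch below is only an illustration: it draws two samples from the same population many times (so H0 is always true) and counts how often p < .05. To stay within the Python standard library it swaps the t-test for a two-sample z-test with a known population SD; the sample size and number of simulated studies are arbitrary choices, not values from the module.

```python
import random
from statistics import NormalDist, mean

random.seed(1)
ALPHA = 0.05
N_PER_GROUP = 16       # hypothetical sample size
N_EXPERIMENTS = 5000   # number of simulated studies


def two_sample_z_p(a, b, sigma=1.0):
    """Two-tailed p value for a difference in means, population SD known."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


rejections = 0
for _ in range(N_EXPERIMENTS):
    # Both samples come from the same population, so H0 is true.
    a = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    b = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    if two_sample_z_p(a, b) < ALPHA:
        rejections += 1

type_i_rate = rejections / N_EXPERIMENTS
print(f"Proportion of false positives: {type_i_rate:.3f}")  # should be near 0.05
```

Every rejection here is a type I error, and the proportion should settle close to α.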

but multiple comparisons change the situation ….

russian roulette [with six chamber revolver]

one bullet, spin the cylinder

muzzle to temple, pull trigger

one-in-six chance of blowing your brains out

russian roulette

with three guns

each gun on its own: a one-in-six chance of blowing your brains out

for the ‘family’ of three guns: the probability of you getting a bullet in your brain is worryingly higher

russian roulette [with twenty chamber revolver]

one-in-twenty (5%) chance of blowing your brains out

with three such guns: probability = 1 – (.95 x .95 x .95) = .14

just the same with multiple comparisons [with one obvious difference]

each individual t-test: maximum type I error rate (α) of 5%

for a ‘family’ of three such t-tests: error rate = 1 – (.95 x .95 x .95) = .14
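The same calculation can be written as a short function to see how the family-wise rate grows with the number of groups. This is a sketch under the same assumption as the formula above, namely that the tests are independent; pairwise t-tests that share groups are not strictly independent, so the figures are approximate. The function names are made up for illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


def n_pairwise(k):
    """Number of pairwise comparisons among k groups."""
    return k * (k - 1) // 2


for groups in (3, 4, 5):
    tests = n_pairwise(groups)
    rate = familywise_error_rate(0.05, tests)
    print(f"{groups} groups, {tests} t-tests: family-wise rate = {rate:.3f}")
# 3 groups, 3 t-tests: family-wise rate = 0.143
# 4 groups, 6 t-tests: family-wise rate = 0.265
# 5 groups, 10 t-tests: family-wise rate = 0.401
```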

controlling family-wise error

various techniques

e.g., Bonferroni correction

adjust α for each individual comparison

adjusted α = α / k

where k = number of comparisons

for our ‘family’ of three comparisons

use adjusted α of .0167 (= .05 / 3) for each t-test

family-wise Type I error rate limited to 5%
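A minimal sketch of the Bonferroni arithmetic, again treating the three tests as independent when computing the resulting family-wise rate. The function names are illustrative only.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-comparison criterion: the overall alpha shared equally across tests."""
    return alpha / n_tests


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


adjusted = bonferroni_alpha(0.05, 3)
print(round(adjusted, 4))                             # 0.0167
print(round(familywise_error_rate(adjusted, 3), 4))   # 0.0492
```

With the adjusted criterion the family-wise rate comes out just under 5%.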

and all is well ….

… actually it isn’t.

Such ‘corrections’ come at a price

increased chance of making a type II error

failing to reject H0 when it is in fact false[false negative]

less chance of detecting an effect when there is one

aka low ‘power’ [more of this in Week 10]

Moving from comparing two means to considering three has complicated matters

we seem to face either

increased type I error rate

or

increased type II error rate [lower power]

This [finally] is where ANOVA comes in

It can help us detect any difference

between our 3 (or more) group means

without increasing type I error rate

or

reducing power

ANOVA is another NHST [Null Hypothesis Significance Test]

need to know what your H0 and H1 are

H0 – all the population means are the same,

any differences between sample means are simply due to sampling error

H1 – the population means are not all the same

NB one-tailed vs two-tailed doesn’t apply

How does ANOVA work?

the heart of ANOVA is the F ratio

a ratio of two estimates of the population variance, both based on the sample

what’s that got to do with differences between means?

Be patient.

e.g.,

comparison of drug treatments for offenders

12 step programme

cognitive-behavioural motivational intervention

standard care

DV: no. of days drugs taken in a month

Random data generated by SPSS [just like PSY1017 W09 labs last year]

3 samples (N=48)

all from the same population

H0 [null hypothesis]

(no difference between population means)

TRUE

the sample means are not all the same

due to ‘sampling error’

they vary around the overall mean of 6.61

between-group variability

the standard deviations show how varied individual scores are for each group

within-group variability

both

between-groups variability

and

within-groups variability

can be used to estimate the population variance

Don’t worry [for now] how this is done

estimate of population variance based on

between-groups variability (differences of group means around overall mean) = 3.07

within-groups variability (how varied individual scores are for each group) = 2.02

F ratio = between-groups estimate / within-groups estimate

= 3.07 / 2.02

= 1.52

estimates unlikely to be exactly the same, but similar, and so F ratio will be approximately 1 WHEN H0 IS TRUE
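For anyone curious about how the two estimates are obtained, here is a rough sketch using the standard one-way independent-groups calculation. The scores are made up for illustration and are NOT the SPSS data behind the figures above; the function name is also hypothetical.

```python
# Hypothetical scores (days drugs taken in a month); not the lecture's data.
groups = {
    "12-step": [5, 7, 6, 8, 4],
    "cog-behavioural": [6, 5, 7, 9, 8],
    "standard care": [7, 8, 6, 9, 10],
}


def variance_estimates(samples):
    """Return (between-groups, within-groups) estimates of population variance."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    means = [sum(s) / len(s) for s in samples]
    grand_mean = sum(sum(s) for s in samples) / n_total
    # Between-groups: how far the group means sit from the overall mean.
    between = sum(len(s) * (m - grand_mean) ** 2
                  for s, m in zip(samples, means)) / (k - 1)
    # Within-groups: how far individual scores sit from their own group mean.
    within = sum((x - m) ** 2
                 for s, m in zip(samples, means) for x in s) / (n_total - k)
    return between, within


between, within = variance_estimates(list(groups.values()))
f = between / within
print(between, within, f)   # 5.0 2.5 2.0
```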

Random data generated by SPSS

3 samples (N=48), all as before but

+1 to all ‘Standard Care’ scores

H0 [null hypothesis]

(no difference between population means)

FALSE

previous example: H0 true

new example: H0 false

within-groups variability UNCHANGED

only between-groups variability affected


between-groups variability (differences of group means around overall mean)

= 3.07 [previous], = 9.79 [new]

within-groups variability (how varied individual scores are for each group)

= 2.02 [previous], = 2.02 [new]

F ratio = between-groups estimate / within-groups estimate

= 3.07 / 2.02 = 1.52 [previous], = 9.79 / 2.02 = 4.85 [new]

F ratio will tend to be larger WHEN H0 IS FALSE

as only between-groups estimate affected by differences between means.
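The pattern can be checked by repeating the random-data exercise many times. The sketch below rests on assumptions not stated in the slides: a normal population with mean 6.6 and SD 1.4 (chosen only to resemble the figures above), and 16 scores per group (reading N=48 as the total). It averages the F ratio over many simulated studies, first with H0 true and then with +1 added to the third group's population mean.

```python
import random

random.seed(2004)


def f_ratio(samples):
    """One-way independent-groups F ratio."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    means = [sum(s) / len(s) for s in samples]
    grand_mean = sum(sum(s) for s in samples) / n_total
    between = sum(len(s) * (m - grand_mean) ** 2
                  for s, m in zip(samples, means)) / (k - 1)
    within = sum((x - m) ** 2
                 for s, m in zip(samples, means) for x in s) / (n_total - k)
    return between / within


def simulate_mean_f(shift, n_sims=2000, n_per_group=16):
    """Average F over simulated studies; `shift` is added to group 3's mean."""
    total = 0.0
    for _ in range(n_sims):
        samples = [
            [random.gauss(6.6, 1.4) for _ in range(n_per_group)],
            [random.gauss(6.6, 1.4) for _ in range(n_per_group)],
            [random.gauss(6.6 + shift, 1.4) for _ in range(n_per_group)],
        ]
        total += f_ratio(samples)
    return total / n_sims


mean_f_null = simulate_mean_f(shift=0.0)     # H0 true
mean_f_shifted = simulate_mean_f(shift=1.0)  # H0 false
print(f"H0 true:  average F = {mean_f_null:.2f}")
print(f"H0 false: average F = {mean_f_shifted:.2f}")
```

With H0 true the average F should sit close to 1; with the shifted group it should be clearly larger.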

ANOVA is another NHST

probability of getting F-ratio (or more extreme) if H0 true

If p < 0.05, reject H0

H0 – all the population means are the same

and so accept

H1 – the population means are not all the same

NB this doesn’t say anything about which means are different to which other ones
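One way to picture the p value is as the proportion of F ratios, from studies where H0 really is true, that are at least as large as the one observed. The sketch below estimates this by simulation for the two F ratios from the examples (1.52 and 4.85). It assumes three groups of 16 (reading N=48 as the total) and normally distributed scores; SPSS obtains p from the F distribution rather than by simulation, so these are rough approximations only.

```python
import random

random.seed(5)


def f_ratio(samples):
    """One-way independent-groups F ratio."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    means = [sum(s) / len(s) for s in samples]
    grand_mean = sum(sum(s) for s in samples) / n_total
    between = sum(len(s) * (m - grand_mean) ** 2
                  for s, m in zip(samples, means)) / (k - 1)
    within = sum((x - m) ** 2
                 for s, m in zip(samples, means) for x in s) / (n_total - k)
    return between / within


def null_f_ratios(n_sims=10000, k=3, n_per_group=16):
    """F ratios from studies in which every group shares one population."""
    return [
        f_ratio([[random.gauss(0, 1) for _ in range(n_per_group)]
                 for _ in range(k)])
        for _ in range(n_sims)
    ]


null_fs = null_f_ratios()


def approx_p(observed_f):
    """Proportion of null F ratios at least as large as the observed one."""
    return sum(f >= observed_f for f in null_fs) / len(null_fs)


p_first = approx_p(1.52)    # first example, H0 true
p_second = approx_p(4.85)   # second example, H0 false
print(f"F = 1.52: p is roughly {p_first:.3f}")
print(f"F = 4.85: p is roughly {p_second:.3f}")
```

Under these assumptions the first F ratio is not a rare event when H0 is true, while the second is, so only the second would lead us to reject H0.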
