psy2004 research methods psy2005 applied research methods week six

PSY2004 Research Methods

PSY2005 Applied Research Methods

Week Six

Last Week: general principles (how it works, what it tells you, etc.)

Today: extra bits and bobs (assumptions, follow-on analyses, effect sizes)

last week: independent groups (IG) ANOVA

can also use ANOVA with repeated measures (RM) designs

e.g., same participants tested after three different time periods

[last week]

comparison of drug treatments for offenders

12 step programme

cognitive-behavioural motivational intervention

standard care

DV: no. of days drugs taken in a month

[this week]

effect of drug treatments over time [ignoring, for now, different treatments]

1 month

6 months

12 months

DV: no. of days drugs taken in a month

repeated measures ANOVA

same general principle as IG ANOVA

some computational differences [don’t worry about these for now]

somewhat different ‘assumptions’ [more of this later]

generally (all other things being equal) more sensitive, more ‘powerful’

[revision]

why stats?

variability in the data

lots of different, random sources of variability

[revision]

we’re trying to see if changes in the

Independent Variable
• treatment type

affect scores on the

Dependent Variable
• no. of days drugs taken

[revision]

lots of other things affect the DV
• individual differences
• time of day
• mood
• level of attention
• etc.

Lots of random, unsystematic, sources of variation, unrelated to IV

‘noise’

[revision]

trying to see any effect due to the Independent Variable on the Dependent Variable

through the ‘noise’

multiple scores from the same participants

allow you to identify

& remove the ‘noise’ in the data

due to individual differences
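A minimal sketch of this idea, using made-up scores (not data from the module). It splits the total variability into a part due to conditions, a part due to stable differences between participants, and what is left over. The variable names are illustrative assumptions.

```python
from statistics import mean

# Hypothetical data: days of drug use in a month for 4 participants,
# each measured at three time points (rows = participants).
scores = [
    [20, 16, 12],
    [10, 7, 4],
    [26, 21, 19],
    [14, 12, 7],
]

n = len(scores)        # number of participants
k = len(scores[0])     # number of conditions (time points)
grand = mean(x for row in scores for x in row)
cond_means = [mean(row[j] for row in scores) for j in range(k)]
subj_means = [mean(row) for row in scores]

# Sums of squares: total, due to conditions, due to participants
ss_total = sum((x - grand) ** 2 for row in scores for x in row)
ss_conditions = n * sum((m - grand) ** 2 for m in cond_means)
ss_participants = k * sum((m - grand) ** 2 for m in subj_means)

# 'Noise' if the scores are treated as coming from independent groups
ss_error_ig = ss_total - ss_conditions
# 'Noise' left once individual differences are identified and removed
ss_error_rm = ss_error_ig - ss_participants

print(ss_error_ig, ss_error_rm)
```

With these invented numbers most of the 'noise' is down to participants differing from one another, so the repeated measures error term is much smaller than the independent groups one.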

assumptions

characteristics our data should possess for statistical tests to be valid & accurate

tests ‘assume’ such characteristics

if data don’t meet the assumptions then the outcome of the test is likely to be wrong

need to know what they are and check they are met

IG ANOVA assumptions

independence of observations [no correlations across groups]

normally distributed scores [for population, within groups]

homogeneity of variance [HoV] [all groups come from populations with same variance]; only difference (if any) is in terms of means

Levene’s test for HoV

what it doesn’t do:

does not test for differences between means

not relevant to our hypotheses about or interest in any differences between means

what it does:

compares variances of the different groups

to make inference about whether the variances of the populations are different

another NHST:

H0 – population variances are all the same

H1 – population variances not all the same

if statistically significant [i.e., p < 0.05] reject H0 and so accept H1

Levene’s test for HoV

a statistically significant Levene’s test suggests we should not assume HoV

should ‘correct’ the F ratio to reduce likelihood of error

two commonly used options [Welch, Brown-Forsythe] available within SPSS

impacts on choice of post-hoc test as well [more on this later]

This test is unusual in that we don’t want it to be statistically significant!

if Levene’s test NOT significant then you can usually assume you have HoV

but be careful if your sample size is low: your Levene’s test [like any other NHST] might not have enough ‘power’ to detect differences in variance
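To make "compares variances of the different groups" concrete, here is a rough sketch of the Levene statistic (W) in its mean-centred form, written from the standard textbook formula rather than taken from SPSS. The function name and data are hypothetical. The p-value would come from comparing W with an F distribution on (k − 1, N − k) degrees of freedom, which this sketch does not compute.

```python
from statistics import mean

def levene_w(groups):
    """Levene's W using absolute deviations from each group's own mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every score from its group mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_group_means = [mean(zi) for zi in z]
    z_grand = mean(v for zi in z for v in zi)
    # A one-way ANOVA carried out on the absolute deviations
    between = sum(len(zi) * (m - z_grand) ** 2
                  for zi, m in zip(z, z_group_means))
    within = sum((v - m) ** 2
                 for zi, m in zip(z, z_group_means) for v in zi)
    return ((n_total - k) / (k - 1)) * between / within

# Hypothetical groups: same mean, clearly different spread
unequal_spread = [[4, 5, 6], [1, 5, 9], [3, 5, 7]]
# Hypothetical groups: different means, identical spread
equal_spread = [[1, 2, 3, 4, 5], [11, 12, 13, 14, 15], [21, 22, 23, 24, 25]]

w_unequal = levene_w(unequal_spread)
w_equal = levene_w(equal_spread)
print(w_unequal, w_equal)
```

Note that the second data set has very different means but W is zero: the test responds only to differences in spread, not to differences between means.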

‘Robustness’

ANOVA generally quite ‘robust’

not-so-serious breaking of assumptions doesn’t increase probability of error very much

as long as sample sizes are equal

RM ANOVA assumptions

‘sphericity’

equality of variances for differences between conditions

e.g., measures taken at 3 points in time (1 month, 3 months, 12 months)

variance(1−3) = variance(3−12) = variance(1−12) [each is the variance of the difference scores between two time points]
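As an informal illustration, the three variances can be computed directly from the difference scores. The data below are invented and the function name is an assumption. Eyeballing the variances like this is only a description of the sample; Mauchly's test is the formal check.

```python
from itertools import combinations
from statistics import variance

# Hypothetical scores for 5 participants at each time point
scores = {
    "1 month": [20, 10, 26, 14, 18],
    "3 months": [16, 7, 21, 12, 13],
    "12 months": [12, 4, 19, 7, 10],
}

def difference_variances(scores):
    """Sample variance of the difference scores for every pair of conditions."""
    result = {}
    for a, b in combinations(scores, 2):
        diffs = [x - y for x, y in zip(scores[a], scores[b])]
        result[(a, b)] = variance(diffs)
    return result

diff_vars = difference_variances(scores)
for pair, v in diff_vars.items():
    print(pair, round(v, 2))
```

Sphericity holds, roughly speaking, when these values are similar to one another in the population.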

Mauchly’s test for sphericity

another NHST

H0 = variances of differences between conditions are equal

H1 = variances of differences between conditions are not equal

if statistically significant [i.e., p < 0.05] reject H0 and so accept H1

Mauchly’s test for sphericity

a statistically significant Mauchly’s test suggests we should not assume sphericity

increased probability of making type I error

adjust degrees of freedom to control this [various options – covered in lab class]

impacts on choice of post-hoc test as well [more on this later]

Mauchly’s test for sphericity

This test is unusual in that we don’t want it to be statistically significant!

if Mauchly’s test NOT significant then you can usually assume you have sphericity

but be careful if your sample size is low: your Mauchly’s test [like any other NHST] might not have enough ‘power’ to detect differences in variance

Effect Size

NHSTs limited to binary decision

reject H0 or not

evidence for an effect or not

doesn’t distinguish between small and big effects

With a large enough sample [i.e., high power or sensitivity]

even a very very small effect can reach statistical significance

only rejecting hypothesis that effect is exactly equal to zero

non-zero effect can be statistically significant but small enough to be trivial
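A simplified illustration of this point, using a two-group z-test with an assumed known standard deviation rather than an ANOVA (a deliberate simplification so the p-value can be computed with the standard library). All numbers are invented.

```python
from math import erfc, sqrt

def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means,
    assuming a known population SD (z-test; a simplification)."""
    se = sd * sqrt(2 / n_per_group)       # standard error of the difference
    z = mean_diff / se
    return erfc(abs(z) / sqrt(2))          # equals 2 * (1 - Phi(|z|))

# Hypothetical tiny effect: 0.1 days difference, SD of 5 days
p_small_sample = two_sided_p(0.1, 5, 100)
p_huge_sample = two_sided_p(0.1, 5, 100_000)

print(p_small_sample > 0.05)   # not significant with 100 per group
print(p_huge_sample < 0.05)    # significant with 100,000 per group
```

The effect is identical in both cases; only the sample size has changed. This is why the p-value alone says nothing about whether an effect is big enough to matter.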

interesting and important to also

consider the size of any effect

PSY1017 – correlation coefficients

magnitude of coefficient = strength of relationship, i.e., size of effect

‘benchmarks’: .1 = small, .3 = medium, .5 = large

want something similar for ANOVA

a statistically significant F ratio suggests some kind of effect

need measure of effect size to tell us how big (or small) the effect is

various different effect size measures

we shall use partial eta-squared (partial η²)

proportion of total variability in the sample ‘explained’ by the effect [i.e., the IV]

(and which is not explained by any other variable in the analysis)

partial η² = between-groups variability / (between- & within-groups variability)

NB - the measure of between-groups variability is used as the basis for an estimate of population variance [see last week] but isn’t equal to it.

NB 2 – for one-way ANOVA [only one IV] you don’t have to worry about the “and which is not explained by any other variable in the analysis” bit because there aren’t any other variables! But things are different when you’ve got 2 or more IVs [covered later in the module]
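For the one-way case the ratio can be worked through by hand. The sketch below uses invented scores for the three treatment groups and takes "variability" to mean sums of squares; the function name is illustrative, and in practice SPSS reports the value for you.

```python
from statistics import mean

# Hypothetical data (not from the module): days of drug use in a month
groups = {
    "12 step": [10, 12, 14],
    "cognitive-behavioural": [6, 8, 10],
    "standard care": [14, 16, 18],
}

def partial_eta_squared(groups):
    """Between-groups SS divided by (between-groups SS + within-groups SS)."""
    all_scores = [x for g in groups.values() for x in g]
    grand_mean = mean(all_scores)
    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2
                     for g in groups.values())
    # Variability of scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2
                    for g in groups.values() for x in g)
    return ss_between / (ss_between + ss_within)

effect = partial_eta_squared(groups)
print(round(effect, 2))
```

These made-up groups barely overlap, so the value is far larger than anything you would normally expect from real data.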

[Diagram: between-groups and within-groups variability as estimates of population variance]

why partial eta-squared (partial η²)?

• it’s probably the most widely used

• SPSS can compute it for us [this is a rubbish reason]

• rough ‘benchmarks’ exist: .01 = small; .06 = medium; .14 = large

[last week]

ANOVA is another NHST

probability of getting F-ratio (or more extreme) if H0 true

If p < 0.05, reject H0

H0 – all the population means are the same

and so accept

H1 – the population means are not all the same

NB this doesn’t say anything about which means are different to which other ones

Follow-on analyses

if you have very specific hypotheses, only interested in a few specific comparisons

can specify these in advance (a priori)

Planned Comparisons (Contrasts)

not following this route in lab classes

see Field section 10.2.11 (3rd ed); 11.4 (4th ed)

if interested in all comparisons, can’t specify those of interest in advance

only looking after study completed (post hoc)

multiple comparisons / post hoc tests

12-step vs cog-behavioural

12-step vs standard care

cog-behavioural vs standard care

[last week]

Moving from comparing two means to considering three has complicated matters

we seem to face either

increased type I error rate

or

increased type II error rate [lower power]
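The first horn of this dilemma can be put in numbers. A short sketch, assuming the comparisons are independent (which pairwise comparisons on the same groups are not, strictly, so treat the figure as approximate):

```python
def familywise_error_rate(alpha, n_comparisons):
    """Probability of at least one Type I error across several tests,
    assuming the tests are independent (an approximation here)."""
    return 1 - (1 - alpha) ** n_comparisons

# Three pairwise comparisons among three treatment groups
fwe = familywise_error_rate(0.05, 3)
print(round(fwe, 3))
```

With three uncorrected comparisons at the .05 level, the chance of at least one false positive is roughly 14% rather than 5%.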

[last week]

This [finally] is where ANOVA comes in

It can help us detect any difference

between our 3 (or more) group means

without

increasing type I error rate

or

reducing power

so, I hear you cry,

you end up doing the very procedures ANOVA is meant to spare us the pitfalls of!

why bother with the ANOVA?

only bother with post hocs if you have a statistically significant ANOVA

the ANOVA serves the purpose of detecting that there is some kind of effect in the first place [controlling Type I error rate & without cost in terms of power]

corrected multiple comparisons (post hocs) [with their reduced power] might have missed that effect

[ANOVA can do other stuff; more on that later in the module]

Many different post hoc procedures [18 available in SPSS]

Bonferroni correction very popular
• easy to understand
• very good control over Type I error rate
• pays a heavy price in terms of power
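A minimal sketch of the Bonferroni idea: divide the significance level by the number of comparisons and test each p-value against that stricter threshold. The p-values below are invented for illustration, and the function name is an assumption.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which comparisons stay significant once alpha is divided
    by the number of comparisons."""
    corrected_alpha = alpha / len(p_values)
    return [p < corrected_alpha for p in p_values]

# Hypothetical p-values for the three pairwise comparisons
p_values = [0.010, 0.030, 0.200]
decisions = bonferroni_significant(p_values)
print(decisions)
```

The second comparison (p = .030) would count as significant uncorrected but fails the corrected threshold of about .0167, which is the price in power.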

procedures vary in terms of

particular balance they strike between Type I error & Type II error [power]

• those that favour controlling Type I error over concern with power [‘conservative’]

• those that relax Type I error control so as to maintain power [‘liberal’]

‘robustness’
• accuracy when sample sizes unequal and/or small, when you haven’t got HoV or sphericity

See Field 10.2.12 (3rd ed); 11.5 (4th ed)
• need to make a judgment
• nature of your data (assumptions, sample sizes)
• which type of error is worst in your situation
• no single correct answer, but has to be justified

Can’t take a ‘cookbook’ approach
