
Page 1: Analysis of Variance

Analysis of Variance

• ANOVA and its terminology
• Within and between subject designs
• Case study

Slide deck by Saul Greenberg. Permission is granted to use this for non-commercial purposes as long as general credit to Saul Greenberg is clearly maintained. Warning: some material in this deck is used from other sources without permission. Credit to the original source is given if it is known.

Page 2: Analysis of Variance

Analysis of Variance (Anova)

Statistical workhorse
– supports moderately complex experimental designs and statistical analysis
– lets you examine multiple independent variables at the same time
– examples:
• There is no difference between people’s mouse typing ability on the Random, Alphabetic and Qwerty keyboards
• There is no difference in the number of cavities of people aged under 12, between 12-16, and older than 16 when using Crest vs No-teeth toothpaste

Page 3: Analysis of Variance

Analysis of Variance (Anova)

Terminology
– Factor = independent variable
– Factor level = specific value of independent variable

Factor: Keyboard
Factor levels: Qwerty, Random, Alphabetic

Factor: Toothpaste type
Factor levels: Crest, No-teeth

Factor: Age
Factor levels: <12, 12-16, >16

Page 4: Analysis of Variance

Anova terminology

Factorial design
– cross combination of levels of one factor with levels of another
– e.g., keyboard type (3) x size (2)

Cell
– unique treatment combination
– e.g., qwerty x large

                 Keyboard
          Qwerty   Random   Alphabetic
Size
  large
  small
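The cells of a factorial design are every combination of one level per factor. A minimal Python sketch, using the keyboard (3) x size (2) example from this slide; the variable names are illustrative only:

```python
from itertools import product

# Factors and levels taken from the slide: keyboard type (3) x size (2).
factors = {
    "keyboard": ["Qwerty", "Random", "Alphabetic"],
    "size": ["large", "small"],
}

# Each cell is one unique combination of one level from every factor.
cells = list(product(*factors.values()))

for keyboard, size in cells:
    print(f"{keyboard} x {size}")

print(len(cells), "cells")
```

Adding a further factor to the dictionary multiplies the number of cells by that factor's number of levels.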

Page 5: Analysis of Variance

Anova terminology

Between subjects (aka nested factors)
– subject assigned to only one factor level of treatment
– control is general population
– advantage: guarantees independence, i.e., no learning effects
– problem: greater variability, requires more subjects

              Keyboard
Qwerty     Random     Alphabetic
S1-20      S21-40     S41-60

different subjects in each cell
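One way to fill the cells of a between-subjects design is to shuffle the subject pool and split it into equal groups. The sketch below assumes random assignment, which the slide does not specify (it simply numbers subjects S1-20, S21-40, S41-60). The function name and the fixed seed are hypothetical choices for illustration:

```python
import random

def assign_between_subjects(subjects, levels, seed=0):
    """Randomly split subjects into equal-sized groups, one group per level."""
    if len(subjects) % len(levels) != 0:
        raise ValueError("subjects must divide evenly among levels")
    shuffled = list(subjects)
    # A seeded generator makes the (assumed) random assignment repeatable.
    random.Random(seed).shuffle(shuffled)
    group_size = len(shuffled) // len(levels)
    return {
        level: shuffled[i * group_size:(i + 1) * group_size]
        for i, level in enumerate(levels)
    }

subjects = [f"S{n}" for n in range(1, 61)]  # 60 subjects, as on the slide
groups = assign_between_subjects(subjects, ["Qwerty", "Random", "Alphabetic"])

for level, members in groups.items():
    print(level, len(members), "subjects")
```

Each subject appears in exactly one group, which is what makes the factor between-subjects.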

Page 6: Analysis of Variance

Anova terminology

Within subjects (aka crossed factors)
– subjects assigned to all factor levels of a treatment
– advantages:
• requires fewer subjects
• subjects act as their own control
• less variability as subject measures are paired
– problems:
• order effects

              Keyboard
Qwerty     Random     Alphabetic
S1-20      S1-20      S1-20

same subjects in each cell

Page 7: Analysis of Variance

Anova terminology

Order effects
– within subjects only
– doing one factor level affects performance in doing the next factor level, usually through learning

Example:
• learning to mouse type on any keyboard likely improves performance on the next keyboard
• even if there was really no difference between keyboards: Alphabetic > Random > Qwerty performance

S1: Q then R then A
S2: Q then R then A
S3: Q then R then A
S4: Q then R then A
…

Page 8: Analysis of Variance

Anova terminology

Counter-balanced ordering
– mitigates order problem
– subjects do factor levels in different orders
– distributes the order effect across all conditions, but does not remove it

Works only if order effects are equal between conditions
– it breaks down if, e.g., people’s performance improves when starting on Qwerty but worsens when starting on Random

S1: Q then R then A    q > (r < a)
S2: R then A then Q    r << a < q
S3: A then Q then R    a < q < r
S4: Q then A then R    q > (a < r)
…
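A counter-balanced schedule can be generated mechanically. The sketch below uses full counterbalancing (every possible order of the three keyboards, cycled across subjects), which is one common scheme and not necessarily the one used in the deck. The subject count of 12 is a hypothetical value chosen as a multiple of the 6 possible orders:

```python
from collections import Counter
from itertools import permutations

levels = ["Q", "R", "A"]  # Qwerty, Random, Alphabetic
orders = list(permutations(levels))  # all 6 possible orders

def order_for(subject_index):
    """Cycle through the orders so each is used equally often."""
    return orders[subject_index % len(orders)]

n_subjects = 12  # hypothetical; a multiple of 6 keeps the design balanced
schedule = {f"S{i + 1}": order_for(i) for i in range(n_subjects)}

# Count how often each level lands in each position (0 = first).
position_counts = Counter(
    (position, level)
    for order in schedule.values()
    for position, level in enumerate(order)
)

for subject, order in schedule.items():
    print(subject, " then ".join(order))
```

With a balanced schedule every keyboard is used first, second and third equally often, so any learning effect is spread evenly over the keyboards and no longer favours one of them.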

Page 9: Analysis of Variance

Anova terminology

Mixed factor
– contains both between and within subject combinations
– within subjects: keyboard type
– between subjects: size

                    Keyboard
           Qwerty    Random    Alphabetic
Size
  Large    S1-20     S1-20     S1-20
  Small    S21-40    S21-40    S21-40

Page 10: Analysis of Variance

Single Factor Analysis of Variance

Compare means between two or more factor levels within a single factor

example:
– independent variable (factor): keyboard
– dependent variable: mouse-typing speed

between subject design

                  Keyboard
Qwerty          Alphabetic       Random
S1: 25 secs     S21: 40 secs     S51: 17 secs
S2: 29          S22: 55          S52: 45
…               …                …
S20: 33         S40: 33          S60: 23

within subject design

                  Keyboard
Qwerty          Alphabetic       Random
S1: 25 secs     S1: 40 secs      S1: 41 secs
S2: 29          S2: 55           S2: 54
…               …                …
S20: 33         S20: 43          S20: 47
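For the between subject case, the single-factor F-ratio is the variance between group means divided by the variance within groups. The sketch below is plain Python and is an illustration, not the deck's own procedure. The data are only the three values visible per keyboard in the between subject table; the elided subjects are not filled in, so the resulting F is illustrative only. A within subject design needs a repeated-measures analysis instead, which this sketch does not cover.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a between-subjects one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)            # number of factor levels
    n_total = len(all_values)  # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Visible values only (seconds); three per keyboard, far too few for a real test.
qwerty = [25, 29, 33]
alphabetic = [40, 55, 33]
random_layout = [17, 45, 23]

f_ratio, df_b, df_w = one_way_anova([qwerty, alphabetic, random_layout])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The F-ratio is then compared against the critical value of the F distribution for the same degrees of freedom to obtain p.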

Page 11: Analysis of Variance

Anova

Compares relationships between many factors

In reality, we must look at multiple variables to understand what is going on

Provides more informed results
– considers the interactions between factors

Page 12: Analysis of Variance

Anova Interactions

Example interaction
– typists are:
• faster on Qwerty-large keyboards
• slower on Alpha-small keyboards
• the same on all other keyboards
– cannot simply say that one layout is best without talking about size

          Qwerty     Random     Alpha
large     S1-S10     S11-S20    S21-S30
small     S31-S40    S41-S50    S51-S60

Page 13: Analysis of Variance

Anova Interactions

Example interaction
– typists are faster on Qwerty than the other keyboards
– non-typists perform the same across all keyboards
– cannot simply say that one keyboard is best without talking about typing ability

              Qwerty     Random     Alpha
non-typist    S1-S10     S11-S20    S21-S30
typist        S31-S40    S41-S50    S51-S60
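An interaction shows up in the cell means as non-parallel profiles: the gap between typists and non-typists changes from one keyboard to the next. The sketch below checks for that pattern. The cell means are hypothetical numbers invented to match the description on this slide; they are not data from the deck.

```python
# Hypothetical mean typing times in seconds (invented for illustration).
cell_means = {
    "typist": {"Qwerty": 20.0, "Random": 40.0, "Alpha": 40.0},
    "non-typist": {"Qwerty": 45.0, "Random": 45.0, "Alpha": 45.0},
}

def row_gaps(means, row_a, row_b):
    """Difference between two rows of cell means, keyboard by keyboard."""
    return {col: means[row_a][col] - means[row_b][col] for col in means[row_a]}

gaps = row_gaps(cell_means, "non-typist", "typist")

# If the gap were identical on every keyboard, the profiles would be parallel
# and there would be no interaction pattern in the means.
has_interaction_pattern = len(set(gaps.values())) > 1

print(gaps)
print("interaction pattern:", has_interaction_pattern)
```

This is only a descriptive check. The Anova interaction term is what tests whether such non-parallelism is larger than chance variation would explain.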

Page 14: Analysis of Variance

Anova - Interactions

Example:
– t-test: crest vs no-teeth
• subjects who use crest have fewer cavities
– interpretation: recommend crest

[Plot: cavities (axis 0 to 5) for crest vs no-teeth; statistically different]

Page 15: Analysis of Variance

Anova - Interactions

Example:
– anova: toothpaste x age
• subjects 14 or less have fewer cavities with crest
• subjects older than 14 have fewer cavities with no-teeth
– interpretation?
• the sweet taste of crest makes kids use it more, while it repels older folks

[Plot: cavities (axis 0 to 5) for crest vs no-teeth, one line per age group (age 0-6, age 7-14, age >14); statistically different]

Page 16: Analysis of Variance

Anova case study

The situation
– text-based menu display for large telephone directory
– names listed as a range within a selectable menu item
– users navigate menu until unique names are reached

1) Arbor - Kalmer
2) Kalmerson - Ulston
3) Unger - Zlotsky

1) Arbor - Farquar
2) Farston - Hoover
3) Hover - Kalmer

1) Horace - Horton
2) Hoster, James
3) Howard, Rex

Page 17: Analysis of Variance

Anova case study

The problem
– we can display these ranges in several possible ways
– expected users have varied computer experiences

General question
– which display method is best for particular classes of user expertise?

Page 18: Analysis of Variance

Range Delimiters

Full                        Lower             Upper
                                              -- (Arbor)
1) Arbor - Barney           1) Arbor          1) Barney
2) Barrymore - Dacker       2) Barrymore      2) Dacker
3) Danby - Estovitch        3) Danby          3) Estovitch
4) Farquar - Kalmer         4) Farquar        4) Kalmer
5) Kalmerson - Moreen       5) Kalmerson      5) Moreen
6) Moriarty - Praleen       6) Moriarty       6) Praleen
7) Proctor - Sageen         7) Proctor        7) Sageen
8) Sagin - Ulston           8) Sagin          8) Ulston
9) Unger - Zlotsky          9) Unger          9) Zlotsky
                            --(Zlotsky)

Page 19: Analysis of Variance

Range Delimiters x Truncation

Truncation: None

Full                        Lower             Upper
                                              -- (Arbor)
1) Arbor - Barney           1) Arbor          1) Barney
2) Barrymore - Dacker       2) Barrymore      2) Dacker
3) Danby - Estovitch        3) Danby          3) Estovitch
4) Farquar - Kalmer         4) Farquar        4) Kalmer
5) Kalmerson - Moreen       5) Kalmerson      5) Moreen
6) Moriarty - Praleen       6) Moriarty       6) Praleen
7) Proctor - Sageen         7) Proctor        7) Sageen
8) Sagin - Ulston           8) Sagin          8) Ulston
9) Unger - Zlotsky          9) Unger          9) Zlotsky
                            --(Zlotsky)

Truncation: Truncated

Full                        Lower             Upper
                                              -- (A)
1) A - Barn                 1) A              1) Barn
2) Barr - Dac               2) Barr           2) Dac
3) Dan - E                  3) Dan            3) E
4) F - Kalmerr              4) F              4) Kalmera
5) Kalmers - More           5) Kalmers        5) More
6) Mori - Pra               6) Mori           6) Pra
7) Pro - Sage               7) Pro            7) Sage
8) Sagi - Ul                8) Sagi           8) Ul
9) Un - Z                   9) Un             9) Z
                            --(Z)

Page 20: Analysis of Variance

Span
as one descends the menu hierarchy, name suffixes become similar

Wide Span           Narrow Span
1) Arbor            1) Danby
2) Barrymore        2) Danton
3) Danby            3) Desiran
4) Farquar          4) Desis
5) Kalmerson        5) Dolton
6) Moriarty         6) Dormer
7) Proctor          7) Eason
8) Sagin            8) Erick
9) Unger            9) Fabian
--(Zlotsky)         --(Farquar)

Page 21: Analysis of Variance

Null Hypothesis

– six menu display systems based on combinations of truncation and range delimiter methods do not differ significantly from each other as measured by people’s scanning speed and error rate
– menu span and user experience have no significant effect on these results

Design: 2 levels (truncation) x 2 levels (menu span) x 2 levels (experience) x 3 levels (delimiter)

                      Truncated             Not Truncated
                   narrow     wide        narrow     wide
Full    Novice     S1-8       S1-8        S1-8       S1-8
        Expert     S9-16      S9-16       S9-16      S9-16
Upper   Novice     S17-24     S17-24      S17-24     S17-24
        Expert     S25-32     S25-32      S25-32     S25-32
Lower   Novice     S33-40     S33-40      S33-40     S33-40
        Expert     S41-48     S41-48      S41-48     S41-48
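The size of this mixed design can be checked by counting. The sketch below reads the layout above as: delimiter and experience are between subjects (each subject group sits in one row), truncation and span are within subjects (each group appears in all four columns). It assumes 8 subjects per group, which is what subject ranges such as S1-8 suggest. Variable names are illustrative.

```python
from itertools import product

# Between-subjects factors: each subject sees only one combination.
between = {
    "delimiter": ["Full", "Upper", "Lower"],
    "experience": ["Novice", "Expert"],
}
# Within-subjects factors: each subject sees every combination.
within = {
    "truncation": ["Truncated", "Not truncated"],
    "span": ["narrow", "wide"],
}
subjects_per_group = 8  # assumption read off ranges such as S1-8

groups = list(product(*between.values()))     # subject groups
conditions = list(product(*within.values()))  # conditions per subject

total_cells = len(groups) * len(conditions)
total_subjects = len(groups) * subjects_per_group
observations = total_subjects * len(conditions)

print(total_cells, "cells,", total_subjects, "subjects,", observations, "observations")
```

Under these assumptions the 2 x 2 x 2 x 3 design has 24 cells, filled by 48 subjects who each contribute one measurement in 4 of them.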

Page 22: Analysis of Variance

Statistical results

Scanning speed          F-ratio     p
Range delimiter (R)       2.2*     <0.05
Truncation (T)            0.4
Experience (E)            5.5*     <0.05
Menu Span (S)           216.0**    <0.01
RxT                       0.0
RxE                       1.0
RxS                       3.0
TxE                       1.1
TxS                      14.8*     <0.05
ExS                       1.0
RxTxE                     0.0
RxTxS                     1.0
RxExS                     1.7
TxExS                     0.3
RxTxExS                   0.5
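The rows of an Anova table for a full factorial design are every main effect plus every interaction among the factors. A short sketch that enumerates them for the four factors here, using the same single-letter labels as the table (R, T, E, S):

```python
from itertools import combinations

factors = ["R", "T", "E", "S"]  # Range delimiter, Truncation, Experience, Span

# Main effects are subsets of size 1; interactions are subsets of size 2 to 4.
effects = [
    "x".join(subset)
    for size in range(1, len(factors) + 1)
    for subset in combinations(factors, size)
]

for effect in effects:
    print(effect)

print(len(effects), "effects")
```

With n factors there are 2^n - 1 such rows, so four factors give 4 main effects, 6 two-way, 4 three-way and 1 four-way interaction.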

Page 23: Analysis of Variance

Statistical results

Scanning speed:

• Truncation x Span

[Plot: speed (axis 4 to 6) against span (wide, narrow), one line each for not truncated and truncated]

• Main effects (means)

Range delimiter:
          Full     Lower     Upper
Full      ----     1.15*     1.31*
Lower              ----      0.16
Upper                        ----

Span:        Wide 4.35      Narrow 5.54
Experience:  Novice 5.44    Expert 4.36

Results on selection time
• Full range delimiters slowest
• Truncation has very minor effect on time: ignore
• Narrow span menus are slowest
• Novices are slower

Page 24: Analysis of Variance

Statistical results

Error rate              F-ratio     p
Range delimiter (R)       3.7*     <0.05
Truncation (T)            2.7
Experience (E)            5.6*     <0.05
Menu Span (S)            77.9**    <0.01
RxT                       1.1
RxE                       4.7*     <0.05
RxS                       5.4*     <0.05
TxE                       1.2
TxS                       1.5
ExS                       2.0
RxTxE                     0.5
RxTxS                     1.6
RxExS                     1.4
TxExS                     0.1
RxTxExS                   0.1

Page 25: Analysis of Variance

Statistical results

Error rates

[Plot: Range x Experience. errors (axis 0 to 16) against range delimiter (full, upper, lower), one line each for novice and expert]

[Plot: Range x Span. errors (axis 0 to 16) against span (wide, narrow), one line each for lower, upper and full]

Results on errors
– more errors with lower range delimiters at narrow span
– truncation has no effect on errors
– novices have more errors at lower range delimiter

Page 26: Analysis of Variance

Conclusions

Upper range delimiter is best
Truncation up to the implementers
Keep users from descending the menu hierarchy
Experience is critical in menu displays

Page 27: Analysis of Variance

You now know

Anova terminology
– factors, levels, cells
– factorial design
• between, within, mixed designs

You should be able to:
Find a paper in CHI proceedings that uses Anova
Draw the Anova table, and state
– dependent variables
– independent variables / factors
– factor levels
– between/within subject design