
Post on 22-Dec-2015

Concept Map For Statistics as taught in IS271 (a work in progress)

Rashmi Sinha

Analysis of Differences

• Between Two Groups
  - Independent Groups: Independent Samples t-test
  - Dependent Groups: Repeated Measures t-test

• Between Multiple Groups
  - Independent Groups: Independent Samples ANOVA
  - Dependent Groups: Repeated Measures ANOVA

Analysis of Relationships

• One Predictor

• Multiple Predictors: Multiple Regression

Other nodes in the map: Type of Data; Interval Data; Nominal / Ordinal Data; Frequency: Chi Square; Correlation: Pearson; Correlation: Spearman; Regression; Ordinal Regression; Some kinds of Regression

Analysis of Variance or F test

ANOVA is a technique for using differences between sample means to draw inferences about the presence or absence of differences between population means.

• The logic of ANOVA and calculation in SPSS

• Magnitude of effect: eta squared, omega squared

Note: With only two groups, ANOVA is equivalent to the t-test.

Logic of Analysis of Variance

• Null hypothesis ($H_0$): Population means from different conditions are equal: $\mu_1 = \mu_2 = \mu_3 = \mu_4$

• Alternative hypothesis ($H_1$): Not all population means are equal.

Let's visualize the total amount of variance in an experiment

• Between Group Differences (Mean Square Group)

• Error Variance (Individual Differences + Random Variance): Mean Square Error

• Total Variance = Mean Square Total

The F ratio is MS Group / MS Error. The larger the group differences, the bigger the F. The larger the error variance, the smaller the F.

Logic

• Create a measure of variability among group means: $MS_{group}$

• Create a measure of variability within groups: $MS_{error}$

• Form the ratio $MS_{group} / MS_{error}$
  - Ratio approximately 1 if null true
  - Ratio significantly larger than 1 if null false
  - "Approximately 1" can actually be as high as 2 or 3, but not much higher
  - Look up statistical tables to see if the F ratio is significant for the specified degrees of freedom
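The logic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are made up (they are not the slide's data), groups are assumed to be equal in size, and the function name `one_way_f` is hypothetical.

```python
def one_way_f(groups):
    """Return (ms_group, ms_error, f_ratio) for equal-sized independent groups."""
    g = len(groups)                      # number of groups
    n = len(groups[0])                   # observations per group (assumed equal)
    all_scores = [x for grp in groups for x in grp]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(grp) / n for grp in groups]

    # Variability among group means
    ss_group = n * sum((m - grand_mean) ** 2 for m in means)
    # Variability within groups
    ss_error = sum((x - m) ** 2 for grp, m in zip(groups, means) for x in grp)

    ms_group = ss_group / (g - 1)
    ms_error = ss_error / (g * (n - 1))
    return ms_group, ms_error, ms_group / ms_error


# Made-up scores for three groups of three observations each
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ms_group, ms_error, f_ratio = one_way_f(scores)
print(ms_group, ms_error, f_ratio)
```

A ratio well above 1, as here, still has to be compared with the critical F for the relevant degrees of freedom before the null hypothesis is rejected.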

Hypothetical Data

Grand mean = 3.78

Calculations

• Start with Sum of Squares (SS). We need:
  - $SS_{total}$
  - $SS_{groups}$
  - $SS_{error}$

• Compute degrees of freedom (df)

• Compute mean squares and F

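Calculations--cont.

$$SS_{total} = \sum (X - \bar{X}_{..})^2 = (1 - 3.78)^2 + (3 - 3.78)^2 + \dots + (1 - 3.78)^2 = 216.444$$

$$SS_{groups} = n \sum (\bar{X}_j - \bar{X}_{..})^2 = 18\left[(3.22 - 3.78)^2 + (4.50 - 3.78)^2 + \dots + (1.89 - 3.78)^2\right] = 18(7.364) = 132.556$$

$$SS_{error} = SS_{total} - SS_{groups} = 216.444 - 132.556 = 83.889$$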

Degrees of Freedom (df)

• Number of "observations" free to vary

• $df_{total} = N - 1$
  - N observations

• $df_{groups} = g - 1$
  - g means

• $df_{error} = g(n - 1)$
  - n observations in each group = n - 1 df
  - times g groups
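As a quick check on these formulas, the sketch below computes the three df values and confirms that the group and error df add up to the total. The design (3 groups of 10) is hypothetical, and `one_way_df` is an illustrative name, not something from the slides.

```python
def one_way_df(g, n):
    """Degrees of freedom for a one-way ANOVA with g groups of n observations each."""
    df_total = g * n - 1        # N - 1, with N = g * n
    df_groups = g - 1
    df_error = g * (n - 1)
    return df_total, df_groups, df_error


# Hypothetical design: 3 groups, 10 observations per group
df_total, df_groups, df_error = one_way_df(g=3, n=10)
print(df_total, df_groups, df_error)
print(df_groups + df_error == df_total)
```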

Summary Table

When there are more than two groups

• A significant F only shows that not all groups are equal
  - We want to know which groups are different.

• Multiple comparison procedures are designed to control the familywise error rate.
  - Familywise error rate defined
  - Contrast with per comparison error rate

In case of multiple comparisons: Bonferroni adjustment

• The more tests we run, the more likely we are to make a Type I error.
  - Good reason to hold down the number of tests

• Run t tests between pairs of groups, as usual
  - Hold down the number of t tests
  - Reject if t exceeds the critical value in the Bonferroni table

• Works by using a stricter level of significance for each comparison

Bonferroni t--cont.

• Critical value of $\alpha$ for each test set at .05/c, where c = number of tests run
  - Assuming familywise $\alpha$ = .05
  - e.g., with 3 tests, each t must be significant at the .05/3 = .0167 level.

• With computer printout, just make sure the calculated probability < .05/c

• Necessary table is in the book
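The rule "calculated probability < .05/c" can be sketched as follows. The three p-values are invented for illustration, and the helper name `bonferroni` is an assumption, not part of any package the slides mention.

```python
def bonferroni(p_values, familywise_alpha=0.05):
    """Return the per-test alpha and a reject/retain decision for each p-value."""
    c = len(p_values)                        # number of tests run
    per_test_alpha = familywise_alpha / c
    decisions = [p < per_test_alpha for p in p_values]
    return per_test_alpha, decisions


# Invented p-values from three pairwise t tests
alpha_each, decisions = bonferroni([0.010, 0.030, 0.200])
print(round(alpha_each, 4))
print(decisions)
```

Note that the second test (p = .030) would have been significant at an unadjusted .05 level but is not rejected once the level is divided by the number of tests.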

Magnitude of Effect

• Why you need to compute magnitude of effect indices

• Eta squared ($\eta^2$)
  - Easy to calculate
  - Somewhat biased on the high side
  - Percent of variation in the data that can be attributed to treatment differences

Magnitude of Effect--cont.

• Omega squared ($\omega^2$)
  - Much less biased than $\eta^2$
  - Not as intuitive
  - We adjust both numerator and denominator with $MS_{error}$
  - Formula on next slide

$$\eta^2 = \frac{SS_{groups}}{SS_{total}} = \frac{507.8}{2786.6} = .18$$

$$\omega^2 = \frac{SS_{groups} - (k - 1)MS_{error}}{SS_{total} + MS_{error}} = \frac{507.8 - 3(55.6)}{2786.6 + 55.6} = .12$$

$\eta^2$ and $\omega^2$ for Foa, et al.

• $\eta^2$ = .18: 18% of variability in symptoms can be accounted for by treatment

• $\omega^2$ = .12: This is a less biased estimate, and note that it is 33% smaller.
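The two indices can be reproduced from the sums of squares quoted for Foa, et al. The sketch below uses those numbers; k = 4 is inferred from the "3(55.6)" term in the formula, and the function names are illustrative.

```python
def eta_squared(ss_groups, ss_total):
    """Proportion of total variation attributable to group differences."""
    return ss_groups / ss_total


def omega_squared(ss_groups, ss_total, ms_error, k):
    """Less biased effect size; k is the number of groups."""
    return (ss_groups - (k - 1) * ms_error) / (ss_total + ms_error)


# Values taken from the slide; k = 4 is inferred from the (k - 1) = 3 term
eta2 = eta_squared(507.8, 2786.6)
omega2 = omega_squared(507.8, 2786.6, 55.6, k=4)
print(round(eta2, 2), round(omega2, 2))
```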

Factorial Analysis of Variance

• What is a factorial design?

• Main effects

• Interactions

• Simple effects

• Magnitude of effect

What is a Factorial

• At least two independent variables

• All combinations of each variable

• Rows X Columns factorial

• Cells

2 X 2 Factorial

                  Source of Product Review
Product Type      Expert Review    Peer Review
Laptop
Digital Camera

Main effects

There are two factors in the experiment: Source of Review and Type of Product.

• If you examine the effect of Source of Review (ignoring Type of Product for the time being), you are looking at the main effect of Source of Review.

• If you examine the effect of Type of Product (ignoring Source of Review), you are looking at the main effect of Type of Product.

                  Source of Product Review
Product Type      Expert Review    Peer Review
Laptop
Digital Camera

Simple effects

A simple effect is what you see when you restrict yourself to one level of one IV for the time being and look at the effect of the other IV within that level.

• The effect of Source of Review at one level of Product Type (e.g., for one kind of product) is a simple effect.

• The effect of Product Type at one level of Source of Review (e.g., for one kind of source) is a simple effect.

                  Source of Product Review
Product Type      Expert Review    Peer Review
Laptop
Digital Camera

Simple effect of Product Type at one level of Source of Review (i.e., one kind of review, Expert Review)

Interactions (Effect of one variable on the other)

                  Source of Product Review
Product Type      Expert Review    Peer Review
Laptop            5                12
Digital Camera    10               6

[Figure: line graph "Effect: Source of Review on Product Type"; x-axis: Expert Review, Peer Review; y-axis: Satisfaction with Advisor; lines: Laptop, Digital Camera]

                  Source of Product Review
Product Type      Expert Review    Peer Review
Laptop            8                5
Digital Camera    9                6

[Figure: line graph "Effect: Source of Review on Product Type"; x-axis: Expert Review, Peer Review; y-axis: Satisfaction with Advisor; lines: Laptop, Digital Camera]
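One way to see the difference between the two tables of cell means is to compare the simple effects of Source of Review for each product. The sketch below does this for both 2 X 2 tables; the "difference of differences" contrast and the function name are illustrative choices, not a procedure given on the slides, and a nonzero contrast in sample means would still need a significance test.

```python
def interaction_contrast(cells):
    """Difference between the two simple effects in a 2 x 2 table of cell means.

    cells[row][col]: rows are Product Type, columns are Source of Review.
    """
    (laptop_expert, laptop_peer), (camera_expert, camera_peer) = cells
    simple_effect_laptop = laptop_expert - laptop_peer
    simple_effect_camera = camera_expert - camera_peer
    return simple_effect_laptop - simple_effect_camera


crossing = [[5, 12], [10, 6]]   # first table: the lines cross
parallel = [[8, 5], [9, 6]]     # second table: the lines are parallel

print(interaction_contrast(crossing))   # nonzero: simple effects differ
print(interaction_contrast(parallel))   # zero: simple effects are equal
```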

Types of Interactions

[Figure: six line-graph panels, each with levels A1, A2, A3 on the x-axis]

And this is when there are only two variables!

Magnitude of Effect

The F ratio is a biased index of effect size because it goes up with sample size. To estimate the size of the treatment effect, use eta squared (the treatment effect as a proportion of the total variance in the experiment).

Eta squared is a better estimate than F, but it is still a biased estimate. A better index is omega squared.

• Eta squared
  - Interpretation

$$\eta^2 = \frac{SS_{effect}}{SS_{total}}$$

• Omega squared
  - Less biased estimate

$$\omega^2 = \frac{SS_{effect} - (k - 1)MS_{error}}{SS_{total} + MS_{error}}$$

k = number of levels for the effect in question

[Diagram labels: Treatment Effect, Error Variance]

Omega Squared

$R^2$ is also often used. It is based on the sums of squares. For experiments, use omega squared. For correlations, use R squared.

The value of R squared is greater than that of omega squared.

Cohen classified effects as:

• Small effect: .01

• Medium effect: .06

• Large effect: .15
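A small helper can attach these labels to a computed effect size. This is a sketch under an assumption the slide does not state: each benchmark is treated as the lower bound of its category. The "negligible" label and the function name are also illustrative.

```python
def classify_effect(value, small=0.01, medium=0.06, large=0.15):
    """Label an effect size using the benchmarks above as lower bounds (an assumption)."""
    if value >= large:
        return "large"
    if value >= medium:
        return "medium"
    if value >= small:
        return "small"
    return "negligible"   # label not from the slide


print(classify_effect(0.12))
print(classify_effect(0.18))
```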

The Data (cell means and standard deviations)

          No Instructions    Instructions    Means
Male      7.7 (4.6)          6.2 (3.5)       6.95
Female    6.5 (4.2)          5.1 (2.8)       5.80
Means     7.1                5.65            6.375
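The marginal means in the table are simple averages of the cell means, which is valid when every cell has the same number of observations (assumed here). A short sketch, with illustrative variable names:

```python
# Cell means from the table above (standard deviations omitted)
cell_means = {
    "Male": {"No Instructions": 7.7, "Instructions": 6.2},
    "Female": {"No Instructions": 6.5, "Instructions": 5.1},
}

# Row (gender) means: average across instruction conditions
row_means = {sex: sum(row.values()) / len(row) for sex, row in cell_means.items()}

# Column (instruction) means: average across genders
conditions = ["No Instructions", "Instructions"]
col_means = {
    cond: sum(cell_means[sex][cond] for sex in cell_means) / len(cell_means)
    for cond in conditions
}

# Grand mean: average of all four cell means (equal cell sizes assumed)
grand_mean = sum(row_means.values()) / len(row_means)

print(row_means, col_means, grand_mean)
```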

Plotting Results

[Figure: plot of cell means; x-axis: No Instructions, Instructions; series: Male, Female; y-axis from 0 to 10]

Effects to be estimated

• Differences due to instructions
  - More errors in the condition without instructions

• Differences due to gender
  - Males appear higher than females

• Interaction of video (instructions) and gender
  - What is an interaction?
  - Do instructions affect males and females equally?

Estimated Effects--cont.

• Error
  - Average within-cell variance

• Sum of squares and mean squares
  - Extension of the same concepts in the one-way ANOVA

Calculations

• Total sum of squares

$$SS_{total} = \sum (X - \bar{X}_{..})^2$$

• Main effect sum of squares

$$SS_{video} = ng \sum (\bar{X}_V - \bar{X}_{..})^2$$

$$SS_{gender} = nv \sum (\bar{X}_G - \bar{X}_{..})^2$$

Calculations--cont.

• Interaction sum of squares
  - Calculate $SS_{cells}$ and subtract $SS_V$ and $SS_G$

$$SS_{cells} = n \sum (\bar{X}_{ij} - \bar{X}_{..})^2$$

• $SS_{error} = SS_{total} - SS_{cells}$
  - or, $MS_{error}$ can be found as the average of the cell variances

Degrees of Freedom

• df for main effects = number of levels - 1

• df for interaction = product of the df for the main effects

• $df_{error} = N - ab = N - \#\,\text{cells}$

• $df_{total} = N - 1$

Calculations for Data

• $SS_{total}$ requires raw data.
  - The slide gives 171.50, although the summary table lists 3056.980 as the total SS.

• $SS_{video}$

$$SS_{video} = ng \sum (\bar{X}_V - \bar{X}_{..})^2 = 50(2)\left[(7.1 - 6.375)^2 + (5.65 - 6.375)^2\right] = 105.125$$

Calculations--cont.

• $SS_{gender}$

$$SS_{gender} = nv \sum (\bar{X}_G - \bar{X}_{..})^2 = 50(2)\left[(6.95 - 6.375)^2 + (5.80 - 6.375)^2\right] = 66.125$$

Calculations--cont.

• $SS_{cells}$

$$SS_{cells} = n \sum (\bar{X}_{cell} - \bar{X}_{..})^2 = 50\left[(7.7 - 6.375)^2 + (6.2 - 6.375)^2 + (6.5 - 6.375)^2 + (5.1 - 6.375)^2\right] = 50(3.4275) = 171.375$$

• $SS_{VXG} = SS_{cells} - SS_{instruction} - SS_{gender} = 171.375 - 105.125 - 66.125 = 0.125$
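The main-effect, cell, and interaction sums of squares can be recomputed from the four cell means. The sketch below assumes n = 50 per cell, as in the calculations, and uses illustrative variable names; rows are gender and columns are the instruction (video) conditions.

```python
n = 50                                  # observations per cell
cells = [[7.7, 6.2],                    # Male:   No Instructions, Instructions
         [6.5, 5.1]]                    # Female: No Instructions, Instructions

g = len(cells)                          # levels of gender
v = len(cells[0])                       # levels of instructions (video)

grand = sum(sum(row) for row in cells) / (g * v)
gender_means = [sum(row) / v for row in cells]
video_means = [sum(cells[i][j] for i in range(g)) / g for j in range(v)]

ss_video = n * g * sum((m - grand) ** 2 for m in video_means)
ss_gender = n * v * sum((m - grand) ** 2 for m in gender_means)
ss_cells = n * sum((x - grand) ** 2 for row in cells for x in row)
ss_interaction = ss_cells - ss_video - ss_gender

print(round(ss_video, 3), round(ss_gender, 3),
      round(ss_cells, 3), round(ss_interaction, 3))
```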

Calculations--cont.

• $MS_{error}$ = average of cell variances = $(4.6^2 + 3.5^2 + 4.2^2 + 2.8^2)/4 = 58.89/4 = 14.723$

• Note that this is $MS_{error}$ and not $SS_{error}$

Summary Table

Source          df     SS          MS         F
Instructions    1      105.125     105.125    7.14
Gender          1      66.125      66.125     4.49
VXG             1      0.125       0.125      .01
Error           196    2885.610    14.723
Total           199    3056.980
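The MS and F columns of the summary table follow from the sums of squares, the degrees of freedom, and the cell standard deviations. A minimal sketch, using the table's values and illustrative variable names:

```python
# Cell standard deviations from the data table; MS_error is the mean cell variance
cell_sds = [4.6, 3.5, 4.2, 2.8]
ms_error = sum(sd ** 2 for sd in cell_sds) / len(cell_sds)

df_error = 196
ss_error = ms_error * df_error

# (SS, df) for each effect, taken from the summary table
effects = {
    "Instructions": (105.125, 1),
    "Gender": (66.125, 1),
    "VXG": (0.125, 1),
}

# F for each effect is its mean square divided by MS_error
f_ratios = {name: (ss / df) / ms_error for name, (ss, df) in effects.items()}

print(round(ms_error, 3), round(ss_error, 2))
for name, f in f_ratios.items():
    print(name, round(f, 2))
```

Each F still has to be compared with the critical value for 1 and 196 degrees of freedom to judge significance.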

Elaborate on Interactions

• Diagrammed on next slide as line graph

• Note parallelism of lines
  - Instruction differences did not depend on gender

Line Graph of Interaction

[Figure: line graph of cell means; x-axis: No Instructions, Instructions; lines: Male, Female; y-axis from 0 to 9]