
Structural Equation Modeling: Problems and Ambiguities with “Well Fitting” Models

Andrew Tomarken

1

Overview of Talk

• Brief introduction to structural equation modeling (SEM) with emphasis on core concept of model fit

• Review of several ambiguities and problems associated with well-fitting models that are typically ignored by users

• Conclusions:

– It is important for users to bear in mind what precisely is being tested when assessing model fit

– Users need to look beyond omnibus measures of fit

2

What is SEM?

• A set of methods for estimating and testing models that are hypothesized to account for the variances and covariances (and possibly mean structures) among a set of variables

• Such models typically consist of sets of linear equations containing free, fixed, or otherwise constrained parameters

• Two types of linear relations can be specified

– between latent constructs (or factors) and their observable indicators (measurement model)

– between latent constructs (structural model)

• One way to think about it: Combines simultaneous equation/econometric approaches and factor-analytic/ psychometric approaches

3

SEM as a General Statistical Approach

• Most statistical procedures conventionally used to test hypotheses can be considered special cases of SEM

• Parallels the development of GLMs in the 1970s and 1980s as a liberalization of classical linear models

• Recent development of multilevel and mixture modeling within the SEM domain represents a further extension of GLMs to latent continuous and categorical variables

• Thus SEM may arguably be most general data-analytic framework at present time (Tomarken & Waller, 2005)

4

Some Advantages of SEM

• High level of explicitness: Forces researchers to specify a model with a high level of detail

• Typically aligns the statistical null hypothesis with the research hypothesis

• In principle, allows for separate assessments of relations between observable indicators and latent variables (measurement model) and among latent variables (structural model)

• Can test models that are difficult or impossible to test with other procedures (e.g., factor of curves, associative growth)

• Allows you to test the overall fit of even very complex models – and that’s the focus of today’s talk

5

Path Analysis Model (Lynam et al., 1993)

6

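[Figure: path diagram. SES, Test Effort, and VIQ predict Impulsivity (paths a, b, c); SES and Impulsivity predict Delinquency (paths d, e); e1 and e2 are the residual terms.]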

The Figure Implies Linear Equations

Equations:

Imp = a SES + b TE + c VIQ + e1

Del = d SES + e Imp + e2

[Figure: the path diagram of the Lynam et al. (1993) model, shown next to the equations.]

7

Confirmatory Factor Analysis Model

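[Figure: two-factor path diagram. The spatial factor has indicators visperc, cubes, and lozenges; the verbal factor has indicators paragrap, sentence, and wordmean. Free loadings are labeled a, b, c, d; the remaining loadings are fixed at 1. Error terms are e_v, e_c, e_l, e_p, e_s, e_w.]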

8

The Figure Implies Linear Equations

Equations:

Visperc = Spatial + e_v

Cubes = a Spatial + e_c

Lozenges = b Spatial + e_l

Paragraph = Verbal + e_p

Sentence = c Verbal + e_s

Wordmean = d Verbal + e_w

[Figure: the two-factor (spatial, verbal) path diagram, shown next to the equations.]

9

Latent Variable Causal Model (Trull, 2001)

[Figure: latent variable path diagram. Latent constructs: Parental Mood Disorder, Parental Disinhibitory Disorder, Abuse, Trait Disinhibition, Trait Negative Affectivity, and Borderline Features. Indicators: SA, PA; DEP, ANX, HOS; DEL, SD, IMP; PAI, MMPI, DIBR, SIDP. Disturbances D1-D3; error terms e1-e12.]

10

A SEM Analysis: What Do We Want to Do?

11

[Figure: the two-factor (spatial, verbal) confirmatory factor model.]

Estimate Coefficients and Standard Errors

12

Path                   Estimate   S.E.   C.R.    P     Label
visperc  <--- spatial  1.000
cubes    <--- spatial   .610      .143   4.250   ***   a
lozenges <--- spatial  1.198      .272   4.405   ***   b
paragrap <--- verbal   1.000
sentence <--- verbal   1.334      .160   8.322   ***   c
wordmean <--- verbal   2.234      .263   8.482   ***   d
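The table can be read back with a few lines of arithmetic. The sketch below assumes that C.R. is the estimate divided by its standard error and that a normal reference distribution is used; the function name `wald_summary` is hypothetical. Because the printed estimates and standard errors are rounded, the recomputed ratios differ slightly from the C.R. column.

```python
import math

# Free loadings copied from the table above: (estimate, standard error).
estimates = {
    "cubes <- spatial":    (0.610, 0.143),
    "lozenges <- spatial": (1.198, 0.272),
    "sentence <- verbal":  (1.334, 0.160),
    "wordmean <- verbal":  (2.234, 0.263),
}

def wald_summary(est, se, z_crit=1.96):
    """Critical ratio, two-sided normal p-value, and 95% Wald interval.

    Assumes C.R. = estimate / SE (an assumption about how the table was built).
    """
    cr = est / se
    p = math.erfc(abs(cr) / math.sqrt(2.0))  # two-sided p under a standard normal
    return cr, p, (est - z_crit * se, est + z_crit * se)

for label, (est, se) in estimates.items():
    cr, p, (lo, hi) = wald_summary(est, se)
    print(f"{label:22s} C.R. = {cr:5.2f}  p = {p:.2g}  95% CI = [{lo:.3f}, {hi:.3f}]")
```

The intervals are worth printing alongside the tests: a loading can be clearly nonzero and still be estimated imprecisely.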

Assess Overall Fit

13

Model                NPAR   Chi-square   DF   P
Correlated Factors   13     7.853        8    .448

Model                RMSEA   LO 90   HI 90   PCLOSE
Correlated Factors   .000    .000    .137    .577
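The degrees of freedom in this output follow from a simple count: the number of distinct variances and covariances minus the number of free parameters. A minimal sketch, in which the helper name `sem_df` and the breakdown of the 13 parameters are assumptions rather than something stated in the output:

```python
def sem_df(n_observed, n_free_params):
    """Degrees of freedom = distinct variances/covariances minus free parameters."""
    n_knowns = n_observed * (n_observed + 1) // 2
    return n_knowns - n_free_params

# Correlated-factors model: 6 indicators, NPAR = 13 in the output above.
# Assumed breakdown: 4 free loadings + 6 error variances
# + 2 factor variances + 1 factor covariance.
npar = 4 + 6 + 2 + 1
print(sem_df(6, npar))  # 21 knowns - 13 free parameters = 8
```

Each degree of freedom corresponds to one restriction on the covariance matrix, so the chi-square test with 8 df is a test of 8 restrictions.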

Model Comparisons

14

[Figure: two versions of the two-factor (spatial, verbal) confirmatory factor model, one with orthogonal factors and one with correlated factors.]

Model                DF   Chi-square   P
Orthogonal Factors   9    19.860
Correlated Factors   8    7.853
Nested Comparison    1    12.008       .001
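The nested comparison is the difference between the two chi-square values, referred to a chi-square distribution with 1 df. A sketch using only the standard library; the function names are hypothetical, and the closed form used for the tail probability holds only for 1 df.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0

def nested_test_1df(chi2_restricted, chi2_full):
    """Chi-square difference test when the models differ by one parameter."""
    diff = chi2_restricted - chi2_full
    return diff, chi2_sf_df1(diff)

# Orthogonal factors (restricted) versus correlated factors (full).
diff, p = nested_test_1df(19.860, 7.853)
print(f"difference = {diff:.3f}, p = {p:.4f}")
```

The difference computed from the rounded table entries is 12.007 rather than the 12.008 printed in the table, presumably because the program worked from unrounded values.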

The Concept of Model Fit in SEM

• The question: Does the structure implied by the model account for the observed variances and covariances among a set of variables?

• We compare the observed covariance matrix to the covariance matrix implied by the model

• A fitting function (=F) assesses the discrepancy between S (sample cov. matrix) and $\hat{\Sigma}$ (estimated population covariance matrix implied by the model)

• Example: ML fitting function:

$F_{ML} = \log|\hat{\Sigma}| + \operatorname{tr}(S\hat{\Sigma}^{-1}) - \log|S| - p$

where p is the number of observed variables

• F, or something very much like F, appears in the formulae for all conventionally used statistical tests of fit and fit indices

• Estimates of free parameters are chosen that meet two potentially competing goals:

– minimize the discrepancy between the implied and observed matrices

– respect the restrictions (constraints) on the covariance matrix implied by the model
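To make the ML discrepancy concrete, here is a minimal pure-Python sketch of the standard form log|Sigma| + tr(S Sigma^-1) - log|S| - p for small matrices. The helper names and the 2 x 2 example matrices are hypothetical; real SEM software minimizes this quantity over the free parameters rather than evaluating it at a fixed matrix.

```python
import math

def det_and_inverse(a):
    """Determinant and inverse of a small square matrix (Gauss-Jordan)."""
    n = len(a)
    # Augment a copy of the matrix with the identity.
    m = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    det = 1.0
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        piv = m[col][col]
        det *= piv
        m[col] = [v / piv for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return det, [row[n:] for row in m]

def f_ml(s, sigma):
    """ML discrepancy between sample matrix s and implied matrix sigma."""
    p = len(s)
    det_s, _ = det_and_inverse(s)
    det_sigma, sigma_inv = det_and_inverse(sigma)
    trace = sum(s[i][k] * sigma_inv[k][i] for i in range(p) for k in range(p))
    return math.log(det_sigma) + trace - math.log(det_s) - p

# Hypothetical example: the model forces the covariance to be zero.
S = [[1.0, 0.3], [0.3, 1.0]]
Sigma_restricted = [[1.0, 0.0], [0.0, 1.0]]
print(f_ml(S, S))                 # zero up to rounding: S is reproduced exactly
print(f_ml(S, Sigma_restricted))  # positive: the restriction costs fit
```

F is zero only when the implied matrix reproduces S exactly, and it grows as the model's restrictions pull the implied matrix away from S.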

15

Example of Model-Imposed Restrictions: 3 Variable Mediational Model

16

[Figure: path diagram X → M → Y with path coefficients a and b and residual terms e1 and e2.]

Equations:

M = a X + e1

Y = b M + e2

Implied Restriction: Cov(X,Y) * Var(M) = Cov(X,M) * Cov(M,Y)

Standardized: R(X,Y) = R(X,M) R(M,Y)
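The restriction can be checked by writing out the covariances the two equations imply. This sketch assumes that X, e1, and e2 are mutually uncorrelated; the parameter values are hypothetical and chosen only for illustration.

```python
def implied_moments(a, b, var_x, var_e1, var_e2):
    """Implied variances and covariances for M = a*X + e1, Y = b*M + e2,
    assuming X, e1, and e2 are mutually uncorrelated."""
    var_m = a * a * var_x + var_e1
    cov_xm = a * var_x
    cov_my = b * var_m
    cov_xy = a * b * var_x
    var_y = b * b * var_m + var_e2
    return {"var_x": var_x, "var_m": var_m, "var_y": var_y,
            "cov_xm": cov_xm, "cov_my": cov_my, "cov_xy": cov_xy}

# Hypothetical parameter values.
m = implied_moments(a=0.7, b=-1.2, var_x=2.0, var_e1=0.5, var_e2=1.5)
lhs = m["cov_xy"] * m["var_m"]    # Cov(X,Y) * Var(M)
rhs = m["cov_xm"] * m["cov_my"]   # Cov(X,M) * Cov(M,Y)
print(abs(lhs - rhs) < 1e-12)
```

Whatever values a, b, and the variances take, the equality holds, which is why the restriction is a property of the model rather than of the estimates.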

Example of Model-Imposed Restrictions: Confirmatory Factor Model

[Figure: one-factor model. Factor F1 with indicators v1-v4, factor loadings a-d, and error terms e1-e4.]

10 knowns: the 4 variances and 6 covariances among v1-v4

8 free parameters to estimate: 4 factor loadings (a-d) and the variances of the four error terms (e1-e4)

This model also implies a set of constraints on the covariances among the observable variables

C(1,3)C(2,4) = C(1,4)C(2,3) = C(1,2)C(3,4)

This model will result in an estimated or implied covariance matrix that respects these constraints

If the sample and implied matrices agree, the model fits
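The constraints on the covariances can be verified numerically. The sketch below builds the implied covariance matrix of a one-factor model, assuming the factor variance is fixed at 1 (consistent with the count of 8 free parameters) and using hypothetical loadings and error variances.

```python
def one_factor_cov(loadings, error_vars, factor_var=1.0):
    """Implied covariance matrix of a one-factor model with uncorrelated errors."""
    n = len(loadings)
    return [[loadings[i] * loadings[j] * factor_var
             + (error_vars[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]

# Hypothetical loadings a-d and error variances for v1-v4.
C = one_factor_cov([0.9, 0.7, 0.5, 0.3], [0.4, 0.6, 0.8, 1.0])

t1 = C[0][2] * C[1][3]   # C(1,3) C(2,4)
t2 = C[0][3] * C[1][2]   # C(1,4) C(2,3)
t3 = C[0][1] * C[2][3]   # C(1,2) C(3,4)
print(abs(t1 - t2) < 1e-12 and abs(t1 - t3) < 1e-12)

# 10 knowns minus 8 free parameters leaves 2 df: the chain of equalities above
# amounts to two independent restrictions.
print(4 * 5 // 2 - 8)
```

Each product equals the product of all four loadings, so the equalities hold for any parameter values; the test of fit asks whether the sample covariances respect them within sampling error.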

17

This Should Sound Familiar

• Although the models and the specific criteria minimized may differ, the notion that statistical tests and fit indices evaluate model-imposed restrictions is completely consistent with general principles of statistical modeling, particularly in specific contexts (e.g., ML estimation)

18

How Do Users Typically Assess Overall Fit?

• Hypothesis-testing using inferential statistical tests

– Likelihood ratio chi-square test of exact fit: compares target model to a saturated (just-identified) model

– Nested chi-square tests for competing models (very important for model comparisons)

• Fit indices that indicate degree of fit

• Historically, more methodological papers on SEM have focused on measures of fit than any other topic

19

Below the Radar

• Both methodological literature and empirical applications heavily emphasize statistical tests and descriptive indices of fit

• This focus can blind users to an important point: Even “well-fitting” models can have substantial problems and uncertainties that are often ignored by researchers

• Tomarken and Waller’s (2003) review indicated a number of respects in which users ignore several potential problems with models that appear to fit well

• Ironically, these issues are not particularly subtle. Rather they are linked to core features of the concept of “model fit” in the SEM context

20

Potential Problems/Ambiguities with Well-Fitting Models

---- and/or the Researchers Who Test Them

1. Lack of clarity concerning what exactly is being tested.

2. A poorly fitting structural (i.e., path) component that is masked by a well-fitting composite model

3. A large number of equivalent models that will always yield identical fit to the target model

4. Questionable lower-order components of fit

5. Omitted variables that influence constructs included in the model

6. The presence of a number of non-equivalent and non-nested alternative models that could fit better but are rarely ever tested

7. Low power or sensitivity to detect critical misspecifications

8. Specifications driven by hidden post-hoc modifications that lower the validity and replicability of the results

21

Issue # 1: Do you Know What Exactly is Being Tested?

• SEM models impose restrictions on variances and covariances among the observed variables (and sometimes on means too).

• Unfortunately:

– Researchers are often unaware of the restrictions tested by even simple models

– Such restrictions sometimes do not reflect what the researcher would identify as core features of the model, i.e., the questions that motivated the study in the first place

– Many models impose so many restrictions that it’s typically impossible for even specialists to figure them all out or render them comprehensible in a more global way

• In short:

– People often are unaware of what exactly is being assessed by statistical tests of fit or fit measures, and what is being assessed is often not exactly what the researcher had in mind

22

[Figure: two-wave, two-variable (2W2V) panel model with variables X1, Y1, X2, Y2, disturbances DX2 and DY2, and parameters labeled f1-f9.]

2W 2V Panel Model

9 free parameters, 1 degree of freedom

Q: What's This Model Testing?

23

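[Figure: the same 2W2V panel model, with the covariance between the disturbances DX2 and DY2 marked as fixed at 0.]

Answer: COV(X2,Y2.X1,Y1) = 0 (the partial covariance of X2 and Y2, adjusting for X1 and Y1, is zero)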

24

[Figure: three-wave cross-lagged panel model with variables X1, X2, X3, Y1, Y2, Y3 and disturbances DX2, DY2, DX3, DY3.]

Cross-Lagged Panel Model

17 Free Parameters

4 Restrictions = 4 df

Q: What's This Model Testing?

25

[Figure: the same cross-lagged panel model, with the four lag-2 paths marked as fixed at 0.]

Answer: Lag-2 Paths = 0

26

This Does Not Mean Overall Model Fit is Irrelevant!

• One might argue: Let’s just ignore fit indices and look at what we’re really interested in

• Flawed argument: One would not want to test coefficients, estimate direct and indirect effects, estimate proportions of variance, etc., in a model that does not fit well and appears to be misspecified. Parameter estimates and standard errors will be inaccurate.

• Don’t ignore fit but see it as a first step or necessary condition for looking at what you really are interested in. It is not an end in itself.

27

Why the Problem?

• Educational

• Perceptual/cognitive biases

– Feature-positive effect: We attend more to presence (what’s there) than to absence (what’s not there)

– Model restrictions are usually characterized by absence (e.g., coefficients that are fixed at 0).

• Reliance on graphics and other user-friendly mechanisms for specifying models in software

28

[Figure: two drawings of a path model for X, Y, and Z with residuals Ry and Rz.]

Can You See It Now?

29

Why the Problem?


• Complexity of many models makes it impossible to catalogue all restrictions

30

[Figure: full path diagram of the Trull (2001) model. Latent variables PMD, PDD, Abuse, TDIS, TNA, and BOR; indicators SA, PA, NEO_DEP, NEO_HOS, NEO_ANX, NEO_DEL, NEO_SD, NEO_IMP, PAI_BOR, MM_BPD, DIB_R, and SIDP; error terms e1-e12; disturbances D1-D3.]

Trull (2001) Borderline Personality Disorder Model

Identify the 62 Restrictions this Model is Testing and Win a Prize!

31

Issue # 2: A well-fitting composite model that masks an ill-fitting structural (path) model

• In many latent variable SEM models it’s important to distinguish between:

– Measurement model: Relations between manifest indicators and latent constructs

– Structural (path) model: Relations among latent constructs

– Composite model: The whole model that combines both the measurement and structural components

• Typically, in latent variable models the clear majority of the restrictions are imposed at the level of the measurement model -- and that often fits well

• Common result: A well-fitting composite model that masks an ill-fitting structural component

• But the main motivation for the study typically is the structural component!

32

[Figure: two path diagrams, the composite (i.e., target) model and the measurement model. Each has latent variables LX, LY, and LZ with indicators X1-X3, Y1-Y3, and Z1-Z3 and error terms e1-e9; the composite model also shows residuals RLY and RLZ.]

33

Chi-Square Tests of the C, M, and S Models

• Composite: Global test of the composite model ($\chi^2_C$)

• Measurement: Global test of the measurement model ($\chi^2_M$)

• Structural: Nested test assessing relative fit of the composite and measurement models ($\chi^2_S$)

$\chi^2_S = \chi^2_C - \chi^2_M$

$df_S = df_C - df_M$

34

Illustrating the Problem

Model         Chi-square   df   p      RMSEA
Composite     35.46        25   .080   .0290
Measurement   24.66        24   .425   .0074
Structural    10.80        1    .001   .1402
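The three rows of the table can be reproduced from the composite and measurement chi-squares alone. The sketch below uses a standard closed-form series for the chi-square tail probability with integer df, so it needs only the standard library; the function name `chi2_sf` is hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with positive integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite series in x/2.
        term = math.exp(-x / 2.0)
        total = term
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return total
    # Odd df: normal tail plus a finite series.
    total = math.erfc(math.sqrt(x / 2.0))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return total

chi2_c, df_c = 35.46, 25   # composite model
chi2_m, df_m = 24.66, 24   # measurement model
chi2_s, df_s = chi2_c - chi2_m, df_c - df_m   # structural component

for name, x, df in [("Composite", chi2_c, df_c),
                    ("Measurement", chi2_m, df_m),
                    ("Structural", chi2_s, df_s)]:
    print(f"{name:12s} chi-square = {x:6.2f}  df = {df:2d}  p = {chi2_sf(x, df):.3f}")
```

The composite test spreads the misfit of a single structural restriction over 25 df, which is why the composite model looks acceptable while the 1-df structural test rejects.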

35

Issue # 3: Equivalent Models

• Two models are equivalent when their assessed fit across all possible samples is identical because they impose identical restrictions on the data

• Such models are ubiquitous in statistics

• In the context of SEM, two models are equivalent when their implied covariance matrices are identical because they impose the same restrictions on the variances and covariances

• If their implied covariance matrices are identical, then for any given sample, their discrepancy functions will be identical.

• If their discrepancy functions are identical, the values of all conventionally used fit indices will be identical.

36

The Problem

• The typical structural equation model has many equivalent models that impose the same restrictions on the data

• Typically, at least some are compelling theoretical alternatives to the target model of interest

• Such equivalent models are almost never acknowledged by researchers

37

3 Equivalent Causal Models

• These 3 models share the same restriction: [Cov(x,z)*Var(y)]-[Cov(x,y)*Cov(y,z)] = 0

• If variables are standardized, this restriction is: rxz-rxyryz=0

• All 3 models predict that the partial correlation between x and z, adjusting for y equals 0

• The overall fit of these 3 models will always be the same

• However, they represent three radically different claims about causal structure
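The equivalence can be seen by computing the covariance between x and z that each model implies. The sketch assumes that Model 1A is the chain X → Y → Z, Model 1B the reversed chain Z → Y → X, and Model 1C the fork X ← Y → Z (which is what the residual labels in the diagrams suggest), and it uses equation-by-equation regression estimates. The sample moments are hypothetical.

```python
def implied_cov_xz(model, var_x, var_y, var_z, cov_xy, cov_yz):
    """Covariance of X and Z implied by each model, given the other moments."""
    if model == "1A":            # X -> Y -> Z
        b_yx = cov_xy / var_x
        b_zy = cov_yz / var_y
        return b_zy * (b_yx * var_x)
    if model == "1B":            # Z -> Y -> X
        b_yz = cov_yz / var_z
        b_xy = cov_xy / var_y
        return b_xy * (b_yz * var_z)
    if model == "1C":            # X <- Y -> Z
        b_xy = cov_xy / var_y
        b_zy = cov_yz / var_y
        return b_xy * b_zy * var_y
    raise ValueError(f"unknown model: {model}")

# Hypothetical sample moments.
moments = dict(var_x=4.0, var_y=9.0, var_z=16.0, cov_xy=3.0, cov_yz=6.0)
for model in ("1A", "1B", "1C"):
    print(model, implied_cov_xz(model, **moments))
```

All three reduce to Cov(x,y) * Cov(y,z) / Var(y), so no sample covariance matrix can favor one of these causal orderings over the others.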

[Figure: three path diagrams over X, Y, and Z. Model 1A has residuals Ry and Rz; Model 1B has residuals Ry and Rx; Model 1C has residuals Rx and Rz.]

38

Three Equivalent Measurement Models

[Figure: three measurement models for indicators X1-X4 with error terms e1-e4. Models 2A and 2B each have a single factor F1; Model 2C has two factors, F1 and F2.]

All 3 models impose the same restriction on the implied covariance matrix:

[Cov(x1,x3)*Cov(x2,x4)]-[Cov(x1,x4)*Cov(x2,x3)] = 0

39

[Figure: path diagram of the Trull (2001) model, with latent variables PMD, PDD, Abuse, TDIS, TNA, and BOR, indicators including SA, PA, PAI_BOR, MM_BPD, DIB_R, and SIDP, error terms e1-e12, and disturbances D1-D3.]

How Many Equivalent Models?

Lower Bound Estimate = 33,925

40

[Figure: four path diagrams relating Parental Mood Disorder, Parental Disinhibitory Disorder, Trait Disinhibition, Trait Negative Affectivity, Borderline Features, and Abuse. Model 3A is the original model; Models 3B, 3C, and 3D are equivalent models.]

41

Recommendations

• Researchers need to acknowledge the presence of equivalent models

• Use designs that limit the number of plausible equivalent models (a rarely noted advantage of longitudinal relative to cross-sectional designs).

42

Issue # 4: Fixated on Fit: Inattention to Lower-order Components

• What are “lower-order components”?

– Specific model parameters (e.g., path coefficients)

– Measures that can be derived from parameters

  • Direct, indirect, and total effects

  • Proportion of variance

• In most other statistical procedures that we use (e.g., multiple regression), the focus is on lower-order components

• There can be dissociations between measures of overall fit and lower-order components

– A model can fit perfectly, yet have problematic or disappointing lower-order components

– Lower-order components can indicate very strong effects, yet the overall fit can be terrible

• Problem: Applied researchers often inappropriately de-emphasize lower-order components in favor of reliance on global fit indices

43

[Figure: path diagram of the model tested. X and Q predict Y, and Y predicts Z; Ry and Rz are the residual terms.]

Model Tested

44

Sample Covariance Matrix SA

      X      Q      Y      Z
X     100
Q     30     100
Y     6.5    6.5    100
Z     0.52   0.52   8.0    100

Sample Covariance Matrix SB

      X      Q      Y      Z
X     100
Q     20     100
Y     55     65     100
Z     65     75     80     100

45

Overall Fit and Components of Fit when SA and SB are Analyzed

Measure               Good Fit Region   Matrix SA        Matrix SB
Overall Fit
χ² (df = 2)                             0.00, p = 1.00   421.26, p < .0001
RMSEA Estimate        ≤ .06             0.000            0.648
  Lower 90% Limit                       --------         0.596
  Upper 90% Limit                       --------         0.701
SRMSR                 ≤ .08             0.000            0.098
TLI                   ≥ .95             1.126            0.108
CFI                   ≥ .95             1.000            0.703
Components of Fit
Path Coefficients
  PYX                                   0.05, NS         0.44, p < .0001
  PYQ                                   0.05, NS         0.56, p < .0001
  PZY                                   0.08, NS         0.80, p < .0001
% of Variance
  R² for Y                              < 1%             61%
  R² for Z                              < 1%             64%

Note: N = 500. χ² = chi-square test of exact fit; RMSEA = root mean squared error of approximation; SRMSR = standardized root mean squared residual; TLI = Tucker-Lewis Index; CFI = Comparative Fit Index; PYX = path coefficient denoting effect of X on Y; PYQ = path coefficient denoting effect of Q on Y; PZY = path coefficient denoting effect of Y on Z; NS = not significant. For these models, the non-standardized and standardized coefficients are identical.
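The RMSEA point estimates in the table can be recovered from the chi-square, its df, and N = 500. The sketch uses one common form of the RMSEA, with N - 1 in the denominator; some programs use N instead, which is an assumption to check against the software at hand.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate, using the N - 1 convention (an assumption)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

print(round(rmsea(421.26, 2, 500), 3))  # 0.648, the SB value in the table
print(rmsea(0.00, 2, 500))              # 0.0, the SA value in the table
```

Because the numerator is truncated at zero, any model whose chi-square does not exceed its df gets an RMSEA of exactly 0, regardless of how weak its coefficients are.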

46

How Can a Model with Problematic Lower-order Components Fit Well?

• Residuals are part of the model!

• Two types of residuals in SEM

– Residual matrix that is the difference between observed and implied covariance matrices

– Residual variances and covariances (e.g., variance of an endogenous variable not accounted for by its predictors) that are model parameters

• Residual variances

– Typically are just-identified (impose no restrictions)

– Can easily “fill in the difference” to reproduce the observed variance of a variable even when predictors account for a very small proportion of variance

• In essence, a weak theory can be bailed out by residual terms
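The perfectly fitting example from the earlier slides shows how this works. The sketch assumes the tested model is Y = PYX*X + PYQ*Q + Ry and Z = PZY*Y + Rz, and that the perfectly fitting sample matrix is the one with Cov(X,Q) = 30; the function name `fit_weak_model` is hypothetical. The residual variances are simply set to whatever is left over.

```python
def fit_weak_model(p_yx, p_yq, p_zy, var_x, var_q, cov_xq, var_y, var_z):
    """Implied covariances for Y = p_yx*X + p_yq*Q + Ry and Z = p_zy*Y + Rz,
    with residual variances chosen to reproduce var_y and var_z."""
    explained_y = p_yx**2 * var_x + p_yq**2 * var_q + 2 * p_yx * p_yq * cov_xq
    res_y = var_y - explained_y      # residual variance fills in the difference
    explained_z = p_zy**2 * var_y
    res_z = var_z - explained_z
    cov_xy = p_yx * var_x + p_yq * cov_xq
    cov_qy = p_yx * cov_xq + p_yq * var_q
    implied = {
        "cov_xy": cov_xy, "cov_qy": cov_qy,
        "cov_xz": p_zy * cov_xy, "cov_qz": p_zy * cov_qy,
        "cov_yz": p_zy * var_y,
        "var_y": explained_y + res_y, "var_z": explained_z + res_z,
    }
    r2 = {"Y": explained_y / var_y, "Z": explained_z / var_z}
    return implied, r2, (res_y, res_z)

# Coefficients 0.05, 0.05, 0.08 from the fit table; variances of 100 throughout.
implied, r2, residual_vars = fit_weak_model(0.05, 0.05, 0.08,
                                            var_x=100, var_q=100, cov_xq=30,
                                            var_y=100, var_z=100)
print(implied)        # matches the sample values 6.5, 6.5, 0.52, 0.52, 8.0
print(r2)             # both proportions of variance are below 1%
print(residual_vars)  # residual variances carry almost all of the variance
```

The model reproduces every sample covariance, so fit is perfect, yet the residual terms account for more than 99% of the variance in Y and in Z.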

47

Residual Covariances are Often Critical Too

[Figure: path diagram of the Trull (2001) model, with latent variables PMD, PDD, Abuse, TDIS, TNA, and BOR, error terms e1-e12, and disturbances D1-D3.]

48

Other Respects in Which Local Features of a Model are Ignored

• Confidence intervals around parameter estimates rarely reported

• Potential problems with tests of parameters often ignored

– Reliance on Wald tests

– Incorrect chi-square distributions for tests at the boundary of the parameter space

– Invariance across different parameterizations is often mistakenly assumed

• Issue of assessment of fit at the level of individual subjects is typically ignored (e.g., no analysis of residuals or of individual contributions to fit)

• Irony: In many cases, a more rigorous assessment of a model is afforded by a more traditional multiple regression approach!

49

Issue # 5: Omitted Variables

• Sometimes measures of fit are sensitive to the problem of omitted variables (4A tested model, 4B true model)

• Sometimes they are not (4A tested, 4C true model)

• Thus, a well-fitting model could -- and typically does -- omit important variables

[Figure: three path diagrams. Model 4A (hypothesized model): X, Y, and Z with residuals Ry and Rz. Model 4B (correct alternative structure #1) and Model 4C (correct alternative structure #2): the same variables plus the omitted variable Q.]

50

Omitted Variables: Can Residual Covariance Terms Bail us Out?

• By representing the omitted influences that variables may share in common, residual covariance terms can improve model fit

• However, there are limits on the covariances that can be specified

• In addition, they typically do not correct for biased estimates due to omitted variables

51

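[Figure: path diagram of the true model, with variables X1, X2, X3, Y1, and Y2 and disturbances D1 and D2 (each labeled .34); the coefficients shown take the values 0.2, 0.4, and 0.6.]

True Model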

52

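[Figure: path diagram of the model tested, with variables X1, X2, Y1, and Y2 and disturbances D1 and D2; all four path estimates are 0.37.]

Model Tested

Bad Fit: Chi-square (1) = 114.33, p < .0001

Note Also Biased Path Estimates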

53

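[Figure: path diagram of the revised model, with variables X1, X2, Y1, and Y2 and disturbances D1 and D2; the four path estimates remain 0.37 and one additional estimate of 0.28 appears.]

Revised Model

Now Perfect Fit (Saturated)

But Path Estimates Unchanged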

54

Summary

• SEM is a powerful and comprehensive data-analytic technique

• There are a number of issues regarding model-imposed restrictions and assessment of fit that commonly operate under the radar of the applied user

• A well-fitting model can have substantial problems and ambiguities

55
