Charles Darwin University



INTRODUCTION TO STRUCTURAL EQUATION MODELLING

by Simon Moss

Introduction

Structural equation modelling is an extension of linear regression, sometimes called multiple regression. In particular, researchers conduct linear regression to explore whether a set of predictors—such as IQ, motivation, age, and gender—are associated with a numerical outcome—such as income. On some occasions, however,

one or more of the variables are actually composites of multiple items or indicators

for example, to evaluate motivation, participants might answer five questions

the researcher then averages the answers to these questions to gauge motivation

Whenever one or more of the variables are composites of multiple items or questions, structural equation modelling tends to be more accurate, and certainly more respected, than linear regression. This document outlines how to conduct structural equation modelling and assumes at least some familiarity with linear regression.

Example

To introduce this topic, consider a researcher who wants to predict which research candidates are likely to be especially motivated during their degree. To achieve this goal, research candidates are invited to indicate the degree to which they agree or disagree with nine statements. These statements appear in the following table. Although these measures are not especially valid

three questions were designed to gauge motivation

three questions were designed to gauge emotional intelligence, and

three questions were designed to estimate the IQ of participants indirectly

Questions in this survey

Motivation

When I awake in the morning, I feel excited about the day ahead

I often feel inspired to study and to complete my research in the evening or on the weekend

I tend to feel energized and motivated throughout the day


Emotional intelligence

The negative feelings I experience, such as anxiety or anger, subside very quickly

I can often decipher the feelings of other people from subtle changes in their facial expression

I can often appease people who seem angry and aggressive

IQ

The grades I received at school were very high, given the level of effort I devoted to my work

I have developed a very extensive word vocabulary

I tend to solve puzzles more rapidly than other people

The researcher, then, distributes this survey to 100 research candidates. The following table presents an excerpt of the data

             Motivation    Emotional intelligence    IQ
Participant  m1  m2  m3    e1  e2  e3                i1  i2  i3    age
1            4   1   1     5   2   5                 4   3   2     4
2            5   4   3     4   1   3                 5   4   1     3
3            1   2   1     2   2   4                 1   3   4     3
4            2   4   4     3   4   3                 5   2   2     4

Multiple regression

To assess whether emotional intelligence, IQ, and age predict motivation, the researcher could, in principle, conduct a linear regression analysis, also called a multiple regression analysis. Linear regression analysis explores whether a set of predictors, such as IQ and age, predicts or is associated with a numerical outcome, such as motivation. However, to conduct this regression, the researcher would most likely

on each measure, average the three items to generate one score, sometimes called a composite or scale—an extract of which is shown in the following table

subject these composites or scales to linear regression


Original data

             Motivation    Emotional intelligence    IQ
Participant  m1  m2  m3    e1  e2  e3                i1  i2  i3    age
1            4   1   1     5   2   5                 4   3   2     4
2            5   4   3     4   1   3                 5   4   1     3
3            1   2   1     2   2   4                 1   3   4     3
4            2   4   4     3   4   3                 5   2   2     4

Composites

Participant  Motivation  Emotional intelligence  IQ   age
1            2           4                       3    4
2            4           2.7                     3.3  3
3            1.3         2.7                     2.7  3
4            3.3         3.3                     3    4
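The averaging step can be sketched in R. This is only a minimal illustration: the data frame name survey and the composite names are hypothetical, and the values are the four participants shown in the excerpt.

```r
# Excerpt of the survey data: the four participants shown in the table
survey <- data.frame(
  m1 = c(4, 5, 1, 2), m2 = c(1, 4, 2, 4), m3 = c(1, 3, 1, 4),
  e1 = c(5, 4, 2, 3), e2 = c(2, 1, 2, 4), e3 = c(5, 3, 4, 3),
  i1 = c(4, 5, 1, 5), i2 = c(3, 4, 3, 2), i3 = c(2, 1, 4, 2),
  age = c(4, 3, 3, 4)
)

# Average the three items of each measure to form one composite per person
composites <- data.frame(
  motivation = rowMeans(survey[, c("m1", "m2", "m3")]),
  eq         = rowMeans(survey[, c("e1", "e2", "e3")]),
  iq         = rowMeans(survey[, c("i1", "i2", "i3")]),
  age        = survey$age
)

# Round to one decimal place, as in the composites table
print(round(composites, 1))
```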

In particular, in this instance, the outcome or criterion is motivation and the predictors are emotional intelligence, IQ, and age. The output appears in the following table. This output reveals that emotional intelligence, but not IQ or age, is significantly associated with motivation.

Predictor  Unstandardised B  SE   Standardised B or beta  t
Constant   .52               .14
eq         .42               .04  .31                     4.43**
iq         .14               .07  .14                     1.82
age        .12               .29  .15                     1.32

* p < .05, ** p < .01
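If you want to see how such a table is produced, the following sketch runs a linear regression in R with the built-in lm function. The data are simulated purely for illustration, so the numbers will not match the table above; the names composites and reg are hypothetical.

```r
set.seed(1)
n <- 100

# Simulated composites; these values are invented and are NOT the author's data
eq  <- rnorm(n, mean = 3, sd = 1)
iq  <- rnorm(n, mean = 3, sd = 1)
age <- sample(1:5, n, replace = TRUE)
motivation <- 0.5 + 0.4 * eq + 0.1 * iq + rnorm(n, sd = 1)
composites <- data.frame(motivation, eq, iq, age)

# Regress the outcome (motivation) on the three predictors
reg <- lm(motivation ~ eq + iq + age, data = composites)

# Unstandardised B, SE, t, and p value for the constant and each predictor
print(summary(reg)$coefficients)
```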

Limitations of linear regression

Despite the prevalence of linear regression, several limitations diminish the utility of this technique. The following table outlines these limitations.

Limitation: Linear regression overlooks the error in the variables

Clarification:

That is, linear regression assumes the predictors and criterion are measured with no error

In this instance, motivation, emotional intelligence, and IQ are assumed to be measured completely accurately

Yet, few measures are devoid of random error. Because this error is overlooked, the parameters, such as the B coefficients, are slightly inaccurate

Limitation: Linear regression might show that two variables are related, but only because they share an item in common

Clarification:

In this example, motivation, emotional intelligence, and IQ are composites of several items or questions, sometimes called indicators

An item on one variable might overlap closely with an item on another variable

For example, an item on motivation and an item on emotional intelligence might both allude to family

Hence, scores on these two items might be very similar, inflating the relationship between motivation and emotional intelligence

Limitation: Linear regression can examine the predictors of only one criterion at a time

Clarification:

In this example, the research comprises only one criterion; other studies might explore several criteria, like motivation, satisfaction, and support

Linear regression explores only one criterion at a time and thus might overlook key insights

For example, perhaps one predictor is marginally correlated with all criteria; but, if each criterion is examined separately, this predictor may be erroneously deemed unimportant

Limitation: To examine mediators, that is, variables that explain why the predictors and criterion are related, researchers need to conduct a cumbersome set of regression equations

Clarification:

In this example, the researcher could explore the possibility that emotional intelligence mediates or explains the relationship between age and motivation

According to Baron and Kenny, researchers need to report at least three distinct regression equations to test this hypothesis
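The three regression equations mentioned in the final row of this table could be sketched as follows. The data are simulated and invented for illustration only, and the object names step1, step2, and step3 are hypothetical; the sketch merely shows why the regression approach to mediators is cumbersome.

```r
set.seed(2)
n <- 200

# Simulated scores, invented for illustration only
age <- rnorm(n)
eq <- 0.5 * age + rnorm(n)
motivation <- 0.4 * eq + 0.1 * age + rnorm(n)

# Equation 1: does the predictor (age) relate to the criterion (motivation)?
step1 <- lm(motivation ~ age)

# Equation 2: does the predictor (age) relate to the mediator (eq)?
step2 <- lm(eq ~ age)

# Equation 3: does the mediator relate to the criterion after controlling the predictor?
step3 <- lm(motivation ~ eq + age)

print(coef(step1))
print(coef(step2))
print(coef(step3))
```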


Structural equation modelling, or SEM, circumvents all these limitations. In particular

when researchers conduct structural equation modelling, they explicitly include the errors of each variable in the model

when researchers conduct structural equation modelling, they include the specific items or indicators in the model and can therefore uncover items that overlap inordinately

structural equation modelling can predict several criteria, or dependent variables, simultaneously

structural equation modelling can examine mediators efficiently

Example of a structural equation model

To conduct SEM, the researcher first constructs a model—like a diagram—to represent all the relationships of interest. The following diagram represents a typical model. At first glance, this model looks complicated. But, in essence, this model simply assumes that

three items correspond to each of the key attributes: motivation, emotional intelligence, and IQ

the items cannot be measured entirely accurately but contain some error

emotional intelligence, IQ, and age predict motivation

How to implement the model

Next, the researcher utilises a software tool to conduct this technique. Researchers can utilise a variety of tools, such as AMOS and LISREL. This document will later reveal how you can utilise R, a free software package, to conduct structural equation modelling. But, in essence, the researcher merely needs to

download the software at no cost

upload the data

copy, paste, and then adapt the following code

interpret the results

install.packages("lavaan", dependencies=TRUE)
library(lavaan)

model1 <- '
#latent variables
motivation =~ m1 + m2 + m3
eq =~ e1 + e2 + e3
iq =~ i1 + i2 + i3

#regressions
motivation ~ eq + iq + age

#variances and covariances
eq~~iq

#residual correlations can be added
i1~~i2
'

#fit
fit1<-sem(model1, data=data.file)
summary(fit1, fit.measures=TRUE, standardized=TRUE)

This technique will generate a lot of output. However, as indicated by the yellow rectangles in the following display, conclusions can be derived from only a subset of the output.


Interpret the output: Regression equations

Despite the mounds of output that SEM can generate, the most important information is usually the B coefficients, the values that appear under the heading Regressions. In this example

two of the three p values, in the column called P(>|z|), are less than .05 and thus significant

specifically, eq and iq are significantly associated with motivation

the B coefficients appear in the column called Estimate

the B coefficient associated with eq is positive; hence, eq is positively associated with motivation after controlling for the other predictors

the B coefficient associated with iq is negative; hence, iq is negatively associated with motivation after controlling for the other predictors
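Rather than scanning the printed summary, you could extract the regression rows directly. The sketch below is an illustration under stated assumptions: it uses HolzingerSwineford1939, a dataset that ships with lavaan, as a stand-in for the survey data, so x1 to x9 and ageyr merely play the role of the nine items and age, and the estimates are unrelated to the example in this document.

```r
library(lavaan)

# Stand-in model: three latent variables with three indicators each,
# plus one observed predictor (ageyr), mirroring the structure of model1
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
  speed ~ visual + textual + ageyr
'
fit <- sem(model, data = HolzingerSwineford1939)

# parameterEstimates returns one row per parameter;
# rows whose operator is "~" are the regression coefficients
est <- parameterEstimates(fit)
regressions <- est[est$op == "~", c("lhs", "rhs", "est", "se", "z", "pvalue")]
print(regressions)
```

The column est corresponds to Estimate in the printed summary, and pvalue corresponds to P(>|z|).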

Interpret the output: Chi-square values

If the model you constructed does not explain the data, these B coefficients might be inaccurate. To clarify, instead of the model you tested, you could have constructed another model, such as the following example.

As the red arrows underscore, this updated model diverges from the original model. For example, in this updated model

IQ is assumed to predict EQ

age is assumed to predict IQ

the errors associated with two of the items or questions, e3 and i1, are related. To illustrate, perhaps both of these questions refer to some topic in common, such as family. Therefore, high scores on one item may tend to coincide with high scores on the other item.

So, how can you determine whether the original model is sufficient? How does SEM evaluate models? In essence, SEM first estimates how strongly the variables should be correlated if the model were correct. For example, roughly speaking, if the original model was correct

e1, e2, and e3 should be more related to each other than to i1, i2, and i3

age should be more related to motivation than to emotional intelligence, and so forth

Second, SEM determines whether these estimated correlations match the observed correlations in the data. To illustrate, roughly speaking, the computer would assess whether

e1, e2, and e3 are indeed more related to each other than to i1, i2, and i3 in the data file

age is indeed more related to motivation than to emotional intelligence in the data file, and so forth

Finally, SEM calculates the differences between these estimated and actual correlations. A statistic, called χ² or chi-square, represents the degree to which the estimated correlations diverge from the actual correlations. This value appears in a box called Model Test User Model, as reproduced in the following table

Test statistic 99.596

Degrees of freedom 31

P value (Chi-square) 0.000

In this example, the χ² or chi-square is 99.60, the degrees of freedom is 31, and the p value is less than .001. We would report these values as χ²(31) = 99.60, p < .001. To interpret this output,

because the p value is significant, the chi-square value is significantly higher than 0

consequently, the estimated correlations diverge from the actual correlations

the model, therefore, is not entirely accurate
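The p value in this table can be checked with base R, which needs no extra packages. This small sketch simply feeds the reported test statistic and degrees of freedom into the chi-square distribution; the variable names are hypothetical.

```r
# Values reported in the Model Test User Model box
chisq <- 99.596
df <- 31

# Probability of a chi-square value at least this large if the model were correct
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)

print(p_value)
print(p_value < .001)
```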

Interpret the output: Other fit indices

Unfortunately, whenever you have collected extensive data, such as many participants, animals, specimens, or units, the χ² statistic is almost always significant. The model is then virtually always deemed to be inaccurate. Therefore, researchers tend to utilise other statistics, called fit indices, to evaluate the model. For example

two of these indices, called CFI and TLI, appear in a box called User model versus Baseline model, as reproduced in the following table

according to many scholars, if these indices exceed 0.9, the model is deemed to be adequate (e.g., Schumacker & Lomax, 2010)

in this example, the model, and hence the B coefficients, can be deemed adequate

Comparative fit index or CFI   .974
Tucker Lewis index or TLI      .962
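If you prefer to retrieve CFI and TLI without reading the full summary, lavaan provides the fitMeasures function. The sketch below again assumes the HolzingerSwineford1939 dataset bundled with lavaan as a stand-in for the survey data, so the values it prints are not the .974 and .962 reported above.

```r
library(lavaan)

# Stand-in model with the same structure as model1
model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
  speed ~ visual + textual + ageyr
'
fit <- sem(model, data = HolzingerSwineford1939)

# Request only the two indices discussed here
indices <- fitMeasures(fit, c("cfi", "tli"))
print(round(indices, 3))

# Compare each index with the 0.9 guideline
print(indices > 0.9)
```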

How to conduct structural equation modelling

Researchers can utilise many software packages, such as AMOS, a tool that complements SPSS, to conduct SEM. However, the free software platform R offers a package, called lavaan, that novice researchers can utilise to conduct SEM. This tool includes several defaults that simplify this technique.

Step 1: Download R and R studio

If you have not used R before, you can download and install this software at no cost. To achieve this goal

visit https://www.cdu1prdweb1.cdu.edu.au/files/2020-08/Introduction%20to%20R.docx to download an introduction to R

read the section called Download R and R studio

although not essential, you could also skim a few of the other sections of this document to familiarize yourself with R.

Step 2: Upload the data file

Your next step is to upload the data into R. To achieve this goal

open Microsoft Excel

enter your data into Excel; you might need to copy your data from another format. Or, your data might already have been entered into Excel

In particular, as the following example shows

each column should correspond to one variable

each row should correspond to one individual, animal, specimen, and so forth

the first row labels the variables

to prevent complications, use labels that comprise only lowercase letters, although you could end the label with a number, such as age3

To convert this file into a csv file—such as a file called research.data—and then to upload this file into R studio

visit https://www.cdu1prdweb1.cdu.edu.au/files/2020-08/Introduction%20to%20R.docx to download the introduction to R—unless you have already downloaded this document

read the section called “Upload some data”
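For orientation only, the following sketch shows one common way to read a csv file into R with the base function read.csv. The linked introduction remains the reference for this step; here the file is a tiny temporary file created by the code itself, and its contents are hypothetical, so the example runs without any file of your own.

```r
# Write a tiny csv file to a temporary location so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("m1,m2,m3,age", "4,1,1,4", "5,4,3,3"), csv_path)

# Read the file; the first row supplies the variable names
research.data <- read.csv(csv_path)

# Inspect the variables and the number of rows
str(research.data)
```

With your own data, you would replace csv_path with the location of the csv file you saved from Excel.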

Step 3: Enter the code and interpret the results

To conduct SEM, you need to enter some code. The code might resemble the following display. At first glance, this code looks absolutely terrifying. But actually, this code is straightforward once explained.

install.packages("lavaan", dependencies=TRUE)
library(lavaan)

model1 <- '
#latent variables
motivation =~ m1 + m2 + m3
eq =~ e1 + e2 + e3
iq =~ i1 + i2 + i3

#regressions
motivation ~ eq + iq + age

#variances and covariances
eq~~iq

#residual correlations can be added
i1~~i2
'

#fit
fit1<-sem(model1, data=research.data)
summary(fit1, fit.measures=TRUE, standardized=TRUE)

To enter code, you could write one row, called a command, at a time in the Console. But, if you want to enter code more efficiently,

in R studio, choose the File menu and then “New File” as well as “R script”

in the file that opens, paste the code that appears in the left column of the following table

to execute this code, highlight all the instructions and press the “Run” button, a button that appears at the top of this file

You should not change the bold characters in the left column. You might change the other characters, depending on the name of your data file, the name of your variables, and so forth. The right column of the following table explains this code. You do not, however, need to understand all the code.

Code to enter Explanation or clarification

install.packages("lavaan", dependencies=TRUE)

library(lavaan)

R comprises many distinct sets of formulas or procedures, each called a package

lavaan is a package that conducts structural equation modelling

install.packages merely installs this package onto the computer

library then activates this package

the quotation marks should be typed in R rather than Word; the reason is that R recognises the simple format, ", but not the more elaborate format that often appears in Word, such as “ or ”.

model1 <- '

The ' in this code merely instructs the computer that you are about to specify the model

That is, in essence, you will convert the diagram to a series of simple equations

You will call this model model1—although you are welcome to choose another name

#latent variables

The computer skips any lines that start with a #

These lines are usually comments, designed to remind the researcher of the aim or purpose of the following code

In this example, the comment latent variables indicates the following code will define the latent variables

So, what are latent variables? To illustrate

The researcher assessed some of the variables, such as e1, e2, and e3, directly in the survey; these variables are called observed variables or indicators

The researcher, however, did not assess other variables directly in the survey, such as motivation, eq, and iq

The researcher can only derive motivation, eq, and iq from specific indicators, such as e1, e2, and e3

Variables that are not measured directly are called latent—and tend to appear as ovals, instead of rectangles, in the diagram

motivation =~ m1 + m2 + m3
eq =~ e1 + e2 + e3
iq =~ i1 + i2 + i3

To describe the model, researchers tend to define the latent variables first

That is, they specify which indicators, such as m1, m2, and m3, correspond to which latent variables, such as motivation

In these instances, researchers use the symbols =~ to connect the latent variable with the indicators.

In response to this symbol, R will automatically attach an error term to each indicator

In contrast, if you use other tools or packages, you need to remember to stipulate these error terms yourself

#regressions

motivation ~ eq + iq + age

After defining the latent variables, researchers tend to stipulate the regression equations

That is, they specify which predictors should be associated with each outcome or criterion

This example comprises only one regression equation; but SEM usually comprises more than one regression equation

R will also automatically attach an error term to each regression equation

#variances and covariances
eq~~iq

#residual correlations can be added
i1~~i2

After defining the latent variables and regression equations, researchers might sometimes include some more information.

Variances and covariances

First, researchers might indicate that the predictors, that is, all latent variables that are not outcomes or criteria, can be correlated with each other

The code eq~~iq enables R to include and estimate the correlation between eq and iq

Residual correlations

Second, researchers sometimes permit specific items or indicators to be correlated with each other

This provision is common if the fit is inadequate—an issue that will be discussed later

'

The quotation mark is then closed

fit1<-sem(model1, data=research.data)

This code then applies SEM to evaluate model1 using the data stored in research.data

The outcome of this SEM is stored in fit1

summary(fit1, fit.measures=TRUE, standardized=TRUE)

This code then prints the output—including the fit indices—that was stored in fit1


Variations to SEM

This document has outlined the fundamental principles of SEM. Specifically, this document has

delineated the benefits of SEM over linear regression

briefly illustrated how to implement and interpret SEM

Nevertheless, in practice, researchers often need to manage a range of complications and variations. This section introduces some of these complications.

Comparisons between nested models

Often, researchers want to assess whether removing one or more relationships—or adding one or more relationships—significantly affects the model. To illustrate, consider the following two models. In this example

the left model is nested within the right model

that is, the left model is identical to the right model except one relationship has been removed, as underscored by the red arrows

When one model is nested within another model, researchers often apply a procedure that can generate some helpful insights. In particular, researchers will often

calculate the difference in the χ² statistic and df that each model generates

determine whether this difference is significant

if the difference is significant, conclude the relationships that differ between the models must be significant as well.

To illustrate, consider the following table. According to this table,

the difference in the χ² statistic and df between the two models is 10.0 and 1 respectively

to determine whether this χ² statistic of 10 is significant, the researcher must compare this value to a critical value, the χ² statistic that corresponds to a p value of 0.05

to estimate this critical value, enter “=CHISQ.INV(0.95, 1)” into a cell in Excel but delete the quotation marks; also, instead of the 1, enter the difference in degrees of freedom that you computed

                    Left model  Right model  Difference  Critical value
Test statistic      99.596      89.596       10.0        3.84
Degrees of freedom  31          30           1

In this example, the difference in the χ² statistic, 10, exceeds the critical value, 3.84. You would thus conclude that

the right model explains the data significantly better than does the left model

consequently, the relationship that differs between the models, the path from IQ to emotional intelligence, must be significant
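The same comparison can be carried out in R instead of Excel. This sketch uses only base R and the values from the table; the variable names are hypothetical. The function qchisq plays the role of CHISQ.INV.

```r
# Test statistics and degrees of freedom of the two nested models
chisq_left  <- 99.596; df_left  <- 31
chisq_right <- 89.596; df_right <- 30

# Differences between the models
diff_chisq <- chisq_left - chisq_right
diff_df    <- df_left - df_right

# Critical value that corresponds to a p value of 0.05
critical <- qchisq(0.95, df = diff_df)

# Exact p value of the difference
p_value <- pchisq(diff_chisq, df = diff_df, lower.tail = FALSE)

print(round(critical, 2))
print(diff_chisq > critical)
print(p_value)
```

If both models have been fitted in lavaan, the function lavTestLRT can perform this chi-square difference test on the two fitted objects directly.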

Sample size

Typically, to conduct SEM, researchers prefer large datasets—datasets comprising about 250 to 1000 or so participants, animals, specimens, or units. Nevertheless, researchers have not reached consensus on how to estimate the appropriate sample size. For example, according to Kline, researchers should

calculate the number of parameters—roughly the number of arrows in the model, including the arrows that connect the indicators to error terms

multiply this number of parameters by 20 to estimate the minimum sample size

Yet, other researchers suggest that samples do not have to be as extensive. Bentler and Chou (1987) recommend the sample size should be at least 5 times the number of parameters (for more detailed considerations, see Westland, 2010; Wolf et al., 2013).
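To compare these two rules of thumb, a short calculation suffices. The function name min_sample_size and the parameter count of 25 are hypothetical; you would substitute the number of parameters in your own model.

```r
# Minimum sample size = number of parameters multiplied by a chosen ratio
min_sample_size <- function(n_parameters, ratio) {
  n_parameters * ratio
}

# Hypothetical count of parameters, not taken from the model in this document
n_parameters <- 25

n_ratio_20 <- min_sample_size(n_parameters, 20)  # the 20-to-1 rule
n_ratio_5  <- min_sample_size(n_parameters, 5)   # the 5-to-1 rule

print(n_ratio_20)
print(n_ratio_5)
```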

Plots


Furthermore, if you want to construct a plot that displays the model, you can enter the following code. Replace fit1 with the label you used to store the output.

install.packages("lavaanPlot")
library(lavaanPlot)
lavaanPlot(model = fit1, name = "plot")

Parcelling items

Because, according to one guideline, the sample size should exceed 20 times the number of parameters, your sample size might be inadequate if some latent variables comprise too many indicators. To illustrate, if one measure of motivation comprises 50 questions, your sample size will need to be at least 1000 participants and probably many more. To overcome this problem, researchers often reduce the questions or items to a smaller number of subsets, called parcels. For example, as the following table shows

12 items could be reduced to 3 parcels, each derived from 4 items

to generate the parcels, the researcher could merely average the 4 items

only the parcels, instead of the original items, are subjected to the SEM

Person  m1  m2  m3  m4  m5  m6  m7  m8  m9  m10  m11  m12
1       4   1   1   5   2   5   4   3   2   4    5    4
2       5   4   3   4   1   3   5   4   1   3    3    5
3       1   2   1   2   2   4   1   3   4   3    4    1
4       2   4   4   3   4   3   5   2   2   4    3    5

Person  Parcel1  Parcel2  Parcel3
1       2        4        3
2       4        2.7      3.3
3       1.3      2.7      2.7
4       3.3      3.3      3
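Parcels can be generated with the same averaging approach used for composites. The sketch below assumes that consecutive items form a parcel (m1 to m4, m5 to m8, and m9 to m12); the source does not state how items are grouped, so this grouping is an assumption and the printed values follow from it rather than from the illustrative table.

```r
# The 12 item scores of the four persons shown above
items <- matrix(c(
  4, 1, 1, 5, 2, 5, 4, 3, 2, 4, 5, 4,
  5, 4, 3, 4, 1, 3, 5, 4, 1, 3, 3, 5,
  1, 2, 1, 2, 2, 4, 1, 3, 4, 3, 4, 1,
  2, 4, 4, 3, 4, 3, 5, 2, 2, 4, 3, 5
), nrow = 4, byrow = TRUE, dimnames = list(NULL, paste0("m", 1:12)))

# Assumed grouping: each parcel is the mean of four consecutive items
parcels <- data.frame(
  parcel1 = rowMeans(items[, 1:4]),
  parcel2 = rowMeans(items[, 5:8]),
  parcel3 = rowMeans(items[, 9:12])
)

print(parcels)
```

In the model, the parcels would then replace the original items, for example motivation =~ parcel1 + parcel2 + parcel3.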


In the majority of models, each latent variable is assigned at least three, and usually fewer than eight, items or indicators. If the number of indicators exceeds eight, researchers will often reduce these items or questions to three or more parcels.

Other fit indices

Besides the χ² statistic, CFI, and TLI, researchers can utilise a range of other fit indices to evaluate the models. Most of these fit indices are derived from the χ² statistic. Therefore, to appreciate the various fit indices, you need to understand the χ² statistic. In essence, to calculate the χ² statistic, the computer first estimates the correlation between all variables, including the error terms. A subset of these correlations appears in the following table.

     m1   m2   m3   e1   e2   e3   i1   i2   i3   error1
m1   1    .13  .14  .54  .25  .58  .42  .30  .26  .42
m2        1    .31  .47  .10  .32  .54  .44  .15  .35
m3             1    .43  .14  .64  .25  .58  .42  .03
e1                  1    .13  .04  .64  .05  .58  .64

The computer then multiplies these correlations by the standard deviation of the corresponding variables. For example

suppose the standard deviation of m1 is 2 and the standard deviation of m2 is 5

the computer would thus multiply the correlation between m1 and m2, which is .13, by 2 x 5 or 10

the answer would be 1.3, as illustrated in the following table

these correlations, when multiplied by the standard deviation of each variable, are called covariances

covariances are like correlations but are not restricted to the range from -1 to 1.

     m1    m2    m3    e1    e2    e3    i1    i2    i3    error1
m1   3.2   1.3   5.14  1.54  1.25  1.58  3.42  1.30  4.26  1.42
m2         4.1   1.31  2.47  2.10  2.32  2.54  2.44  2.15  2.35
m3               1.92  1.43  3.14  3.64  4.25  4.58  1.42  3.03
e1                     3.1   1.13  4.04  1.64  1.05  3.58  3.64
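The conversion from a correlation to a covariance can be reproduced in a few lines of R. This sketch uses the correlation of .13 and the standard deviations of 2 and 5 from the example; the object names are hypothetical.

```r
# Correlation between m1 and m2 and their standard deviations
r_m1_m2 <- 0.13
sd_m1 <- 2
sd_m2 <- 5

# Covariance = correlation multiplied by both standard deviations
cov_m1_m2 <- r_m1_m2 * sd_m1 * sd_m2
print(cov_m1_m2)

# The same conversion applied to a whole correlation matrix
R <- matrix(c(1, 0.13,
              0.13, 1), nrow = 2, byrow = TRUE)
sds <- c(2, 5)
S <- diag(sds) %*% R %*% diag(sds)
print(S)

# cov2cor reverses the conversion and recovers the correlations
print(cov2cor(S))
```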

The computer then applies a range of formulas to estimate the covariances the data would generate if the model were true. For example, as mentioned previously, if the model were true

e1, e2, and e3 should be more related to each other than to i1, i2, and i3

age should be more related to motivation than to emotional intelligence, and so forth

the following table presents the expected covariances, as predicted from the model

     m1    m2    m3    e1    e2    e3    i1    i2    i3    error1
m1   1.21  1.34  2.19  3.64  1.25  1.51  3.12  1.39  4.29  1.48
m2         4.12  1.31  2.17  2.90  2.39  2.58  2.44  2.15  2.38
m3               1.99  1.43  3.89  3.61  4.25  4.58  1.42  3.03
e1                     3.1   1.13  4.04  1.68  1.05  3.58  3.84

Finally, the computer will

for each cell, compute the difference between the actual and expected covariance

square this number

divide this square by the expected covariance

sum these answers to generate the χ² statistic
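The four steps above can be traced in R on a handful of cells. This sketch applies the simplified description to the first four cells of the m1 row in the two tables; the object names are hypothetical. It is only a teaching illustration: the test statistic that lavaan reports by default is based on a maximum likelihood discrepancy function rather than on this cell-by-cell sum.

```r
# First four cells of the m1 row: actual and expected covariances
actual   <- c(3.2, 1.3, 5.14, 1.54)
expected <- c(1.21, 1.34, 2.19, 3.64)

# Steps 1 to 3: difference, squared, divided by the expected covariance
contributions <- (actual - expected)^2 / expected

# Step 4: sum the contributions
statistic <- sum(contributions)

print(round(contributions, 3))
print(statistic)
```

Cells in which the actual and expected covariances are close, such as 1.3 and 1.34, contribute almost nothing; cells that diverge markedly dominate the sum.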

This demonstration might seem cumbersome. But, in essence, the demonstration merely indicates the χ² statistic represents the difference between the actual and expected covariances. The following table outlines the other indices that researchers might utilise instead.

Loglikelihood and information criteria

Akaike information criterion or AIC
Rationale: Reflects the extent to which the expected and observed covariances differ from each other, but penalizes too many parameters
Interpretation: The raw number is meaningless. Instead, researchers compare more than one model. The model that generates the smallest AIC is chosen, because this model explains the data effectively but with fewer relationships

Bayesian information criterion or BIC
Rationale: Similar to the Akaike information criterion, but the penalty for additional parameters also depends on the sample size
Interpretation: Similar to the Akaike information criterion. The lowest value reflects the most efficient model

Sample-size adjusted Bayesian information criterion
Rationale: Similar to the Akaike information criterion
Interpretation: Similar to the Akaike information criterion

Root mean square error of approximation indices

RMSEA, 90% confidence interval (lower and upper), and p-value of RMSEA
Interpretation: According to some researchers, if the RMSEA is less than 0.08, the fit is regarded as adequate (Browne & Cudeck, 1993). But, according to other researchers, if the upper limit of the confidence interval is less than 0.08, the model is regarded as accurate (Hu & Bentler, 1998)

Standardized root mean square residual

SRMR
Rationale: To calculate this index, the computer calculates the difference between each estimated and actual covariance, squares these differences, averages these squares, then calculates the square root of this average
Interpretation: According to some researchers, if this value is less than 0.08, the fit is regarded as adequate
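To make these indices more concrete, the following R sketch applies the standard textbook definitions of AIC and BIC, together with one common formula for the RMSEA point estimate. The function names and every input value are hypothetical, chosen only for illustration. SEM programs may use slightly different conventions, such as dividing by n rather than n - 1, so the results need not match software output exactly.

```r
# Standard definitions of the information criteria.
# loglik: model log-likelihood; k: number of free parameters; n: sample size.
aic <- function(loglik, k) -2 * loglik + 2 * k
bic <- function(loglik, k, n) -2 * loglik + k * log(n)

# One common point-estimate formula for RMSEA (an assumption of this sketch;
# some programs divide by n rather than n - 1).
rmsea <- function(chisq, df, n) sqrt(max(chisq - df, 0) / (df * (n - 1)))

# Hypothetical values, chosen only for illustration.
loglik <- -2500
k <- 21
n <- 301
chisq <- 48
df <- 24

aic_value   <- aic(loglik, k)
bic_value   <- bic(loglik, k, n)
rmsea_value <- rmsea(chisq, df, n)

# Compare against the 0.08 cut-off attributed to Browne & Cudeck (1993).
adequate_fit <- rmsea_value < 0.08
```

Because the raw AIC and BIC values are meaningless in isolation, these functions would typically be applied to two or more competing models, and the model with the smaller value preferred.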

Modification indices

Finally, when researchers conduct SEM, they sometimes instruct the computer to present modification indices. Modification indices represent the extent to which the χ² statistic would change if the model were adjusted. For example, the computer might indicate that

the χ² statistic might diminish by 4.3 if the error terms that correspond to i1 and m2 were correlated

or the χ² statistic might diminish by 1.90 if the error terms that correspond to e2 and m3 were correlated


When researchers receive this information, they might permit some of these error terms to be correlated. These adjustments diminish the simplicity, but enhance the accuracy, of these models. To generate these modification indices, researchers might enter code that resembles the following box.

mi <- modindices(fit1)  # modification indices for the fitted model fit1
mi

This code will then generate output that resembles the following display. To utilise this output,

the researcher might identify the highest number in the column called mi or modification index

in this instance, the highest modification index is 16.8, corresponding to the covariance between m1 and m2

the researcher might then evaluate the model again, except permit m1 and m2 to be correlated

that is, the researcher would include the code m1 ~~ m2 in the model

the researcher might continue this procedure, until the fit is reasonable
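The procedure above can be sketched in R as follows. To keep the example self-contained, the data frame below is a hypothetical stand-in for the output of modindices, using only the three modification indices mentioned in this section. The model string model1 is likewise a hypothetical placeholder, not the model specified earlier in this document.

```r
# Hypothetical stand-in for the output of modindices(fit1).
# The columns lhs, op, rhs, and mi mirror the layout that lavaan reports.
mi <- data.frame(
  lhs = c("m1", "i1", "e2"),
  op  = c("~~", "~~", "~~"),
  rhs = c("m2", "m2", "m3"),
  mi  = c(16.8, 4.3, 1.9)
)

# Identify the row with the highest modification index
top <- mi[which.max(mi$mi), ]

# Build the extra line of model syntax that permits this covariance
extra_line <- paste(top$lhs, top$op, top$rhs)

# Append the line to a placeholder model (hypothetical, for illustration only)
model1 <- "motivation =~ m1 + m2 + m3"
model2 <- paste(model1, extra_line, sep = "\n")
cat(model2)

# The revised model could then be evaluated again, for example:
# fit2 <- lavaan::sem(model2, data = your_data)  # your_data is hypothetical
```

With real output, sorting the data frame by the mi column, for example with mi[order(-mi$mi), ], places the largest modification indices at the top.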

Further considerations


Sometimes, you might want to explore whether the relationships vary across groups, such as genders. To learn how to perform this analysis, visit the document on path analysis and read the section called Comparisons Between Groups.

References

Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-246.

Bentler, P. M., & Chou, C. P. (1987). Practical issues in structural modeling. Sociological Methods & Research, 16(1), 78-117.

Bollen, K. A. (1990). Overall fit in covariance structure models: Two types of sample size effects. Psychological Bulletin, 107, 256-259.

Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Newbury Park, CA: Sage.

Hipp, J. R., & Bollen, K. A. (2003). Model fit in structural equation models with censored, ordinal, and dichotomous variables: Testing vanishing tetrads. Sociological Methodology, 33, 267-305.

Hu, L. T., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 76-99). Thousand Oaks, CA: Sage.

Kline, R. B. (2015). Principles and practice of structural equation modeling. Guilford Publications.

Schumacker, R. E., & Lomax, R. G. (2010). A beginner’s guide to structural equation modeling (3rd ed.). New York, NY: Routledge Academic.

Westland, J. C. (2010). Lower bounds on sample size in structural equation modeling. Electronic Commerce Research and Applications, 9(6), 476-487.

Wolf, E. J., Harrington, K. M., Clark, S. L., & Miller, M. W. (2013). Sample size requirements for structural equation models: An evaluation of power, bias, and solution propriety. Educational and Psychological Measurement, 73(6), 913-934.