TRANSCRIPT
Unit 7a: Factor Analysis
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 1
http://xkcd.com/419/
• From principal components to factor analysis
• From factor analysis to structural equation modeling
• Interpreting factor analytic results
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 2
Multiple Regression Analysis (MRA)

Yi = β0 + β1·X1i + β2·X2i + εi
Do your residuals meet the required assumptions?
Test for residual
normality
Use influence statistics to
detect atypical datapoints
If your residuals are not independent,
replace OLS by GLS regression analysis
Use Individual
growth modeling
Specify a Multi-level
Model
If time is a predictor, you need discrete-
time survival analysis…
If your outcome is categorical, you need to
use…
Binomial logistic
regression analysis
(dichotomous outcome)
Multinomial logistic
regression analysis
(polytomous outcome)
If you have more predictors than you
can deal with,
Create taxonomies of fitted models and compare
them.
Form composites of the indicators of any common
construct.
Conduct a Principal Components Analysis
Use Cluster Analysis
Use non-linear regression analysis.
Transform the outcome or predictor
If your outcome vs. predictor relationship
is non-linear,
Use Factor Analysis: EFA or CFA?
Course Roadmap: Unit 7a
Today’s Topic Area
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 3
Dataset FOWLER.txt
Overview
In this study, which was her qualifying paper, Amy Fowler reports on the piloting and construct validation of a self-designed survey instrument to measure teachers’ professional defensiveness within the context of a professional relationship with a peer. The dataset contains responses from teachers on this Teacher Professional Defensiveness Scale (TPDS), and on two other published scales – the Specific Interpersonal Trust Scale (SITS) and the Fear of Negative Evaluation Scale (FNES) (Robinson et al., 1991).
Source
Fowler, A.M. (2009). Measuring Teacher Defensiveness: An Instrument Validation Study. Unpublished Qualifying Paper, Harvard Graduate School of Education.
Robinson, J. P., Shaver, P. R., Wrightsman, L. S., & Andrews, F. M. (1991). Measures Of Personality And Social Psychological Attitudes. San Diego: Academic Press.
Additional Info
The three instruments are described more fully in Fowler (2009), but briefly are:
· Specific Interpersonal Trust Scale (SITS): The SITS scale assesses the level of trust one individual places in another. Participants choose a response that reflects their feelings towards a referent peer, on a five-point scale (see Metric below). There are 8 items, 3 of which must be reverse coded when totaling the responses; larger total scores then represent a higher level of respondents' trust in their selected peers.
· Teacher Professional Defensiveness Scale (TPDS): The TPDS scale was designed by the investigator to assess teacher sensitivity to peer feedback. Respondents are directed to imagine their referent peer making a series of comments regarding the respondent's teaching practice and are then asked to respond to these comments on a seven-point scale (see Metric below). The scale contains 21 items presented in random order; statement stems are alternately complimentary, neutral, or critical.
· Fear of Negative Evaluation Scale (FNES): The FNES scale assesses the degree of fear respondents have regarding negative evaluation from others. Respondents are asked to assess how characteristic each statement is of their own feelings, on a five-point scale (see Metric below). There are 12 items, including 4 for which responses must be reverse coded when totaling the responses; larger total scores then indicate higher levels of fear of negative evaluation.
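The reverse coding the SITS and FNES descriptions mention can be sketched in a few lines of Python (a generic illustration, not part of the course's Stata code; the responses and reverse-keyed positions below are hypothetical, not the actual SITS or FNES keying):

```python
def reverse_code(response, scale_max=5, scale_min=1):
    """Flip a Likert response so larger values point the same direction
    as the rest of the scale (e.g., 1 -> 5 and 2 -> 4 on a five-point metric)."""
    return scale_max + scale_min - response

# A hypothetical respondent on a five-point scale, with two reverse-keyed items
responses = [4, 2, 5, 1, 3]
reverse_keyed = {1, 3}  # zero-based positions of the reverse-coded items (hypothetical)
total = sum(reverse_code(r) if i in reverse_keyed else r
            for i, r in enumerate(responses))
print(total)  # 4 + 4 + 5 + 5 + 3 = 21
```

After reverse coding, a higher total consistently means more of the construct (more trust on the SITS, more fear on the FNES).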
Sample size
411 Teachers
Construct Validation: A New Instrument to Measure Teacher Professional Defensiveness
Three Sub-Scales, each with 7 items:
• "Criticisms"
• "Compliments"
• "Neutral"
Research Objective:
• Check that the three sub-scales of the TPDS are each separately unidimensional.
• Check that the constructs measured by each sub-scale are distinct from each other (but perhaps correlated).
• Check the relationship between the TPD sub-scales and other accepted measures (SITS, FNES).
Teacher “Professional Defensiveness”
Col#  VarName  Variable Description           Variable Metric/Labels
1     TID      Teacher identification code    Integer
2     SID      School identification code     Integer
Teacher Professional Defensiveness Scale (TPDS):
16  D1   I like the tone of your room; it is friendly but serious about school at the same time.
17  D2   I think you underestimated student's prior knowledge of today's topic.
20  D5   I'm not clear how today's lesson related to the curriculum standards.
21  D6   The examples you used to explain the main concept helped students to understand the big ideas.
25  D10  I'm not sure the task you had kids do required kids to achieve the objective you had on the board.
26  D11  It seems like you worry about whether or not the students like you.
28  D13  I can tell by the way you talk to the students that you believe they can learn this material.
29  D14  I wonder if the students learned the concept you wanted them to learn through that hands-on lesson.
31  D16  I wonder if students in the class understand your sarcasm in the same way you mean it.
33  D18  Students seem to follow the classroom rules.
36  D21  The way you started the class with students' interests got them involved and attentive to the lesson.

Metric for all TPDS items:
1 = High Criticism
2 = Criticism
3 = Mild Criticism
4 = Neither
5 = Mild Compliment
6 = Compliment
7 = High Compliment
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 4
Indicators of Teacher Professional Defensiveness: "Criticisms" Sub-Scale (We'll look at Amy's final 5 items)

Indicators of Teacher Professional Defensiveness: "Compliments" Sub-Scale (Final 4 items)
Read Each Item. Take the Test. Know your scale.
The assumption here is that some teachers consistently score higher (or lower) than others on these items, a possible measure of defensiveness or sensitivity to criticism from peers.
[Five histograms, Percent (y) by response category 1–7 (x), one per item: "You underestimated students knowledge of topic," "Not clear how lesson related to curric standards," "Task didnt help kids achieve your objective," "You like to worry about whether students like you," "Wonder if students understand your sarcasm"]
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 5
Teachers' responses to the items on the Criticisms sub-scale have different levels of variability across items. This is common and raises the question of whether this variability is comparable.

Variable  Label
D2    You underestimated students knowledge of today's topic
D5    Not clear how lesson related to curriculum standards.
D10   Task didn't help kids achieve your objective.
D11   You like to worry about whether students like you.
D16   Wonder if students understand your sarcasm.

Let's focus first on the "Criticisms" subscale of Teacher Professional Defensiveness…
On average, teachers were more sensitive/defensive to certain items than others.
This is common and usually a good thing (some items are more “difficult” than others).
Exploratory Data Analysis – Univariate
1 = High Compliment
2 = Compliment
3 = Mild Compliment
4 = Neither
5 = Mild Criticism
6 = Criticism
7 = High Criticism
[Scatterplot matrix of the five Criticisms items, each axis on the 1–7 scale: "You underestimated students knowledge of topic," "Not clear how lesson related to curric standards," "Task didnt help kids achieve your objective," "You like to worry about whether students like you," "Wonder if students understand your sarcasm"]
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 6
Let's focus first on the "Criticisms" subscale of Teacher Professional Defensiveness…

Pearson Correlation Coefficients, N = 399
(Estimated under listwise deletion)

        D2      D5      D10     D11     D16
D2   1.0000
D5   0.3452  1.0000
D10  0.4411  0.5932  1.0000
D11  0.3212  0.7997  0.6338  1.0000
D16  0.4677  0.5575  0.7501  0.5721  1.0000
The sample bivariate correlations among the indicators on the “Criticisms” sub-scale are all positive, and the relationships do not appear to be obviously nonlinear. Could the subscale be a useful indicator of a single common construct? Could the subscale be unidimensional?
Exploratory Data Analysis – Bivariate
Variable  Label
D2    You underestimated students knowledge of today's topic
D5    Not clear how lesson related to curriculum standards.
D10   Task didn't help kids achieve your objective.
D11   You like to worry about whether students like you.
D16   Wonder if students understand your sarcasm.
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 7
Research Question?
We could ask … "Are there a number of unseen (latent) factors (constructs) acting 'beneath' these indicators to forge their observed values?"
Instead, we would need … Factor Analysis (CFA or EFA?)
Path Model of Factor Analysis

[Path diagram: latent factors η1i and η2i with arrows to indicators D2i, D5i, D10i, D11i, D16i; each indicator also receives an error term ε2i, ε5i, ε10i, ε11i, ε16i]
Rather than asking … "Can we forge these several indicators together into a smaller number of composites with defined statistical properties?"
Then, we would need … Principal Components Analysis (PCA)
Path Model of Principal Components Analysis

[Path diagram: standardized indicators D*2i, D*5i, D*10i, D*11i, D*16i with arrows to components C1i, C2i, C3i, C4i, C5i, C6i]
PCA vs. Exploratory Factor Analysis, Graphically
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 8
Statistical Model

For Factor Analysis …

D2i  = λ2,1·η1i  + λ2,2·η2i  + … + ε2i
D5i  = λ5,1·η1i  + λ5,2·η2i  + … + ε5i
D10i = λ10,1·η1i + λ10,2·η2i + … + ε10i
D11i = λ11,1·η1i + λ11,2·η2i + … + ε11i
D16i = λ16,1·η1i + λ16,2·η2i + … + ε16i

Given the X's (the D's), estimate the λ's, guess the η's, compute the ε's … Too many parameters!
For Principal Components Analysis …

Given the X's (the D's), pick the a's, & compute the PC's; this is the "Eigenvalue" problem, solved!!!

Unambiguous Set of Orthogonal Composites
The answer is determined completely by the data.

As written, there is no unique solution.
But by setting a scale for the latent variables and constraining their interrelationships, we can arrive at useful and informative solutions.
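The "solved eigenvalue problem" for PCA can be sketched numerically in Python with numpy (an illustration, not the course's Stata code; it uses the sample correlation matrix for the Criticisms items reported on a later slide):

```python
import numpy as np

# Sample correlation matrix for D2, D5, D10, D11, D16 (from the EDA slide)
R = np.array([
    [1.0000, 0.3452, 0.4411, 0.3212, 0.4677],
    [0.3452, 1.0000, 0.5932, 0.7997, 0.5575],
    [0.4411, 0.5932, 1.0000, 0.6338, 0.7501],
    [0.3212, 0.7997, 0.6338, 1.0000, 0.5721],
    [0.4677, 0.5575, 0.7501, 0.5721, 1.0000],
])

# PCA: eigendecompose the correlation matrix; the component weights a
# are the eigenvectors, and the eigenvalues are the component variances
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(eigvals.round(3))                  # a dominant first component
print(eigvals.sum().round(3))            # eigenvalues of a correlation matrix sum to p = 5
```

The decomposition is unique (up to sign), which is exactly the sense in which the PCA answer "is determined completely by the data."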
PCA vs. EFA, Formulaically
C1i = a1,2·D*2i + a1,5·D*5i + a1,10·D*10i + a1,11·D*11i + a1,16·D*16i
C2i = a2,2·D*2i + a2,5·D*5i + a2,10·D*10i + a2,11·D*11i + a2,16·D*16i
C3i = a3,2·D*2i + a3,5·D*5i + a3,10·D*10i + a3,11·D*11i + a3,16·D*16i
C4i = a4,2·D*2i + a4,5·D*5i + a4,10·D*10i + a4,11·D*11i + a4,16·D*16i
C5i = a5,2·D*2i + a5,5·D*5i + a5,10·D*10i + a5,11·D*11i + a5,16·D*16i
*-----------------------------------------------------------------------
* Exploratory factor analysis of TPD "Criticisms" sub-scale, on its own
*-----------------------------------------------------------------------
factor D2 D5 D10 D11 D16, pf
factor D2 D5 D10 D11 D16, pcf
factor D2 D5 D10 D11 D16, ipf
factor D2 D5 D10 D11 D16, ml
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 9
Because the problem is ill-specified, there has been an oversaturated literature of methods to operationalize the number of dimensions and arrive at "ideal" solutions. This is like arguing over the best way to visualize a univariate relationship: stem and leaf? histogram? dot-plot?
Ways of Obtaining an "Initial" Factor Solution (of either the covariance or the correlation matrix):
• Alpha factor analysis,
• Harris component analysis,
• Image component analysis,
• ML factor analysis,
• Principal axis factoring,
• Pattern, specified by user,
• Prinit factor analysis,
• Unweighted least-squares factor analysis.

Ways of Obtaining Initial Estimates of the Measurement Error Variances:
• Absolute SMC,
• Input from external file,
• Maximum absolute correlation,
• Set to One,
• Set to Random,
• SMC.

Ways of Rotating to a Final Factor Solution:
• None,
• Biquartimax,
• Equamax,
• Orthogonal Crawford-Ferguson,
• Generalized Crawford-Ferguson,
• Orthomax,
• Parsimax,
• Quartimax,
• Varimax,
• Biquartimin,
• Covarimin,
• Harris-Kaiser Ortho-Oblique,
• Oblique Biquartimax,
• Oblique Equamax,
• Oblique Crawford-Ferguson,
• Oblique Generalized Crawford-Ferguson,
• Oblimin,
• Oblique Quartimax,
• Oblique Varimax,
• Procrustes,
• Promax,
• Quartimin, … etc.
# of Methods of EFA … at least (7)(2)(6)(22) = 1848
“Exploratory” Factor Analysis
*--------------------------------------------------------------------------
* Unidimensional factor analysis of TPD "Criticisms" sub-scale, with
* standardized variables. Compares the "factor" and "sem" commands.
*--------------------------------------------------------------------------

* The principal factor method (a classic exploratory technique)
factor D2 D5 D10 D11 D16, factor(1)

* Maximum likelihood
factor D2 D5 D10 D11 D16, ml factor(1)

* Single factor EFA
sem (ETA1 -> D2 D5 D10 D11 D16), nocapslatent latent(ETA1) standardized
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 10
EFA is a specific and unstructured special case of a broader modeling framework known as "structural equation modeling" (SEM). SEM is often more "confirmatory" than "exploratory," incorporating both "measurement" components (multiple indicators of a latent construct) and "structural" components (prediction for latent variables).
D*2i  = λ2,1·η1i  + ε2i
D*5i  = λ5,1·η1i  + ε5i
D*10i = λ10,1·η1i + ε10i
D*11i = λ11,1·η1i + ε11i
D*16i = λ16,1·η1i + ε16i
See Unit7a.do
Hypothesis: Indicators of the “Criticisms” subscale have a “unidimensional” factor structure in the population.
Hypothesized Uni-Dimensional Factor Structure for the "Criticisms" Sub-Scale…
[Path diagram: one latent factor η1i with arrows to indicators D2i, D5i, D10i, D11i, D16i; each indicator also receives an error term ε2i, ε5i, ε10i, ε11i, ε16i]
Towards Structural Equation Modeling
We can contrast this with the equation for the first principal component, C1i = a1,2·D*2i + a1,5·D*5i + … + a1,16·D*16i.
In unidimensional FA, each indicator is predicted by a single factor.
In single-component PCA, the first principal component is the weighted sum of all indicators.
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 11
More than conventional analyses, SEM techniques operationalize "the data" not as values for individual observations but as the elements of the covariance matrix.
Structural Equation Modeling: The Data
. correlate D2 D5 D10 D11 D16
(obs=399)

        D2      D5      D10     D11     D16
D2   1.0000
D5   0.3452  1.0000
D10  0.4411  0.5932  1.0000
D11  0.3212  0.7997  0.6338  1.0000
D16  0.4677  0.5575  0.7501  0.5721  1.0000

The Sample Correlation Matrix

. correlate D2 D5 D10 D11 D16, covariance
(obs=399)

        D2       D5       D10      D11      D16
D2   1.3894
D5   .685621  2.83883
D10  .903685  1.73714  3.02106
D11  .636824  2.2661   1.85272  2.82878
D16  .798598  1.36083  1.88873  1.39402  2.09873

The Sample Covariance Matrix
In structural equation modeling, the "Goodness of Fit" of the model is most often represented by how well the model's "implied covariance matrix" reproduces the sample covariance matrix.

Recall that a loose operational definition of the "degrees of freedom" is the number of observations minus the number of parameters estimated in the model.

In the SEM world, the number of observations is 15, the number of unique elements in the variance-covariance matrix. For a number of variables p, the number of "observations" is p(p + 1)/2.
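This "observations" arithmetic can be sketched with a pair of helpers (a generic Python illustration, not a Stata command):

```python
def sem_observations(p):
    """Unique elements of a p x p covariance matrix:
    p variances plus p*(p-1)/2 covariances."""
    return p * (p + 1) // 2

def sem_df(p, n_params):
    """Loose SEM degrees of freedom: observed moments minus parameters."""
    return sem_observations(p) - n_params

print(sem_observations(5))  # 5 variables -> 15 "observations"
print(sem_df(5, 10))        # unidimensional model (5 loadings + 5 error variances) -> 5 df
print(sem_df(5, 5))         # independence model (5 variances only) -> 10 df
```

The same arithmetic reappears on later slides: with 9 indicators, sem_observations(9) gives the 45 unique moments used there.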
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 12
The "principal factor" method for "extracting factors" simply replaces the 1's in the correlation matrix (the correlation of a variable with itself) with the R-squared value from a regression of that variable on all other variables (an extremely ad hoc estimate of the reliability of that variable). Then, we run a PCA as usual!
Unidimensional EFA: The “Principal Factor” Extraction Method
. correlate D2 D5 D10 D11 D16
(obs=399)

        D2      D5      D10     D11     D16
D2   1.0000
D5   0.3452  1.0000
D10  0.4411  0.5932  1.0000
D11  0.3212  0.7997  0.6338  1.0000
D16  0.4677  0.5575  0.7501  0.5721  1.0000
Hypothesized Uni-Dimensional Factor Structure for the "Criticisms" Sub-Scale…
[Path diagram: one latent factor η1i with arrows to indicators D2i, D5i, D10i, D11i, D16i; each indicator also receives an error term ε2i, ε5i, ε10i, ε11i, ε16i]
. factor D2 D5 D10 D11 D16, pf factor(1)
(obs=399)

Factor analysis/correlation            Number of obs    =      399
    Method: principal factors          Retained factors =        1
    Rotation: (unrotated)              Number of params =        5

    Factor     Eigenvalue   Difference   Proportion   Cumulative
    Factor1       2.83696      2.53604       1.0067       1.0067
    Factor2       0.30092      0.34179       0.1068       1.1135
    Factor3      -0.04086      0.09232      -0.0145       1.0990
    Factor4      -0.13318      0.01252      -0.0473       1.0517
    Factor5      -0.14570            .      -0.0517       1.0000

    LR test: independent vs. saturated: chi2(10) = 1079.76 Prob>chi2 = 0.0000
We can model error variance after running PCA on the adjusted correlation matrix, where the diagonals are the R-squared values of a regression of each variable on all the remaining variables. For D2, this R-squared value is .2423. We substitute that for the 1 in the correlation matrix, for each variable, and run a PCA on the resulting matrix.
The "factor" command does this automatically. Familiar PCA results. Negative eigenvalues appear because the matrix is not well formed due to its ad hoc construction, but we only keep one factor anyway.
. matrix list RFAC, format(%6.0g)

symmetric RFAC[5,5]
        D2     D5     D10    D11    D16
D2   .2423
D5   .3452  .6587
D10  .4411  .5932  .6337
D11  .3212  .7997  .6338  .6805
D16  .4677  .5575  .7501  .5721  .603
Factor loadings (pattern matrix) and unique variances

    Variable   Factor1   Uniqueness
    D2          0.4886     0.7613
    D5          0.8016     0.3574
    D10         0.8168     0.3328
    D11         0.8187     0.3297
    D16         0.7863     0.3818
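The principal-factor recipe (SMCs on the diagonal, then an eigendecomposition) can be reproduced in a few lines of numpy. This is an illustrative sketch, not Stata's exact implementation, and because the correlation matrix below is rounded to four decimals the results match the slide only approximately:

```python
import numpy as np

# Sample correlation matrix for D2, D5, D10, D11, D16
R = np.array([
    [1.0000, 0.3452, 0.4411, 0.3212, 0.4677],
    [0.3452, 1.0000, 0.5932, 0.7997, 0.5575],
    [0.4411, 0.5932, 1.0000, 0.6338, 0.7501],
    [0.3212, 0.7997, 0.6338, 1.0000, 0.5721],
    [0.4677, 0.5575, 0.7501, 0.5721, 1.0000],
])

# Squared multiple correlations: R-squared of each variable on the others
Rinv = np.linalg.inv(R)
smc = 1 - 1 / np.diag(Rinv)              # ~ [.24, .66, .63, .68, .60]

# Replace the unit diagonal with the SMCs, then eigendecompose as in PCA
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)
eigvals, eigvecs = np.linalg.eigh(R_reduced)
k = np.argmax(eigvals)                   # retain the single largest factor

# Loadings on the retained factor: eigenvector scaled by sqrt(eigenvalue)
loadings = eigvecs[:, k] * np.sqrt(eigvals[k])
loadings *= np.sign(loadings.sum())      # resolve the arbitrary sign

print(np.round(eigvals[k], 3))           # ~ 2.837, as in the pf output
print(np.round(loadings, 3))             # ~ [.489, .802, .817, .819, .786]
```

Note the check that the retained eigenvalue and loadings land near the values Stata reports (2.83696 and .4886 through .7863).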
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 13
The "factor loadings" are the estimated correlations between each variable and the respective factor. The "uniqueness" is the estimated error variance left unaccounted for by the factor structure.
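With one standardized factor, the loading and the uniqueness are tied together as uniqueness = 1 − loading². A quick Python check against the principal-factor loadings reported above (agreement is to rounding, since the printed loadings have four decimals):

```python
# Loadings from the principal-factor output for D2, D5, D10, D11, D16
loadings = [0.4886, 0.8016, 0.8168, 0.8187, 0.7863]

# Uniqueness is the variance the factor leaves unexplained:
# 1 minus the squared loading (the communality)
uniqueness = [round(1 - l ** 2, 4) for l in loadings]
print(uniqueness)  # [0.7613, 0.3574, 0.3328, 0.3297, 0.3817]
```

Compare with the Uniqueness column in the output (0.7613, 0.3574, 0.3328, 0.3297, 0.3818); the last entry differs by one in the fourth decimal because of rounding in the printed loading.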
Unidimensional EFA: The “Principal Factor” Extraction Method
Hypothesized Uni-Dimensional Factor Structure for the "Criticisms" Sub-Scale…
[Path diagram: one latent factor η1i with arrows to indicators D2i, D5i, D10i, D11i, D16i; each indicator also receives an error term ε2i, ε5i, ε10i, ε11i, ε16i]

Again, we can contrast this with the equation for the first principal component.
In unidimensional FA, each indicator is predicted by a single factor. The unexplained variation is formally modeled as an error variance.
In single-component PCA, the first principal component is the weighted sum of all indicators. The unexplained variation is informally represented by the remaining principal components.
Factor analysis/correlation            Number of obs    =      399
    Method: maximum likelihood         Retained factors =        1
    Rotation: (unrotated)              Number of params =        5
    Log likelihood = -79.87649         (Akaike's) AIC   =  169.753
                                       Schwarz's BIC    =  189.698

    Factor     Eigenvalue   Difference   Proportion   Cumulative
    Factor1       2.83182            .       1.0000       1.0000

    LR test: independent vs. saturated: chi2(10) = 1079.76 Prob>chi2 = 0.0000
    LR test: 1 factor vs. saturated:    chi2(5)  =  158.49 Prob>chi2 = 0.0000

Factor loadings (pattern matrix) and unique variances

    Variable   Factor1   Uniqueness
    D2          0.4697     0.7794
    D5          0.8279     0.3146
    D10         0.7972     0.3645
    D11         0.8470     0.2826
    D16         0.7569     0.4271
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 14
By assuming multivariate normality (not always a realistic assumption), we gain the ability to conduct likelihood ratio tests.
Unidimensional EFA: The Maximum Likelihood Extraction Method
The reference point is the “perfectly fitting” saturated model, that reproduces the covariance matrix exactly. This is the best possible fit, with 0 degrees of freedom.
This is the OPPOSITE of our conventional approach, which usually starts from bad fit!
The independent model estimates the five indicator variances but no covariances. (15 observations minus 5 parameters = 10 df)
Compared to the saturated model, the independent model effectively fixes all pairwise relationships (10 degrees of freedom). The null hypothesis is that this fits just as well as the saturated model. It does not. It is worse.
The badness of fit of the independent model is rarely surprising but can serve as a reference.
The unidimensional model estimates 10 parameters (5 error variances and 5 loadings), for 15 observations – 10 parameters = 5 degrees of freedom.
The provided test shows that relative badness of fit is not due to chance and is worse in the population (not good news, but not uncommon).
These chi-square tests are "badness of fit" tests, where we want to fail to reject. Chi-square tests are seen as ultra-conservative in this literature. There is a ridiculous number of alternative fit statistics that will make you look better: estat gof, stats(all), and google your hearts out.
                     Saturated Model          Fitted Model                     Independent Model
Fit                  Perfect                  Hypothesized                     Usually Bad
Degrees of Freedom   15 obs – 15 params = 0   15 obs – 5 errvar – 5 load = 5   15 obs – 5 errvar/var = 10
Chi-square           0                        158.49                           1079.76
Results differ slightly from the principal factor extraction method.
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 15
EFA is a special case of a universe of SEM models that can be specified either graphically or on the command line in Stata:
sem (ETA1 -> D2 D5 D10 D11 D16), nocapslatent latent(ETA1) standardized
The sem Command in Stata 12+ (Graphical User Interface)
Estimated error variance (.78). Due to standardization, this can be interpreted as the proportion of variance in D2 left unexplained by the factor structure. This is called "uniqueness" in the factor analytic literature and output.
The "factor loading," in this case standardized, is the estimated correlation between the factor and the observed variable. Two individuals that differ by one unit on the unobserved factor differ by 0.47 standard deviation units on D2.
Because of the “standardized” option, the variance of the latent variable and all indicators have been set to 1.
The scale of the latent variable MUST be set somehow. This is one way to do it (but not the only way). This is sometimes called the “unit variance constraint” or “unit variance identification.”
We also often see “unit loading identification,” where the variance of the latent variable is estimated, but a single loading is set to 1.
This choice between UVI and ULI does not change model fit.
These are the “constants” in the latent regression equation and are often ignored.
. sem (ETA1 -> D2 D5 D10 D11 D16), nocapslatent latent(ETA1) standardized
(12 observations with missing values excluded; specify option 'method(mlmv)' to use all observations)

Endogenous variables
    Measurement: D2 D5 D10 D11 D16
Exogenous variables
    Latent: ETA1
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 16
EFA is a special case of a universe of SEM models that can be specified either graphically or on the command line in Stata:
sem (ETA1 -> D2 D5 D10 D11 D16), nocapslatent latent(ETA1) standardized
Interpreting output directly from the sem command
Because we capitalize our variables (not conventional), we have to use the nocapslatent option. Otherwise, Stata assumes capitalized variables are latent variables.
The standardized option reports standardized output and is often used for reporting when scales are not meaningful. Note, however, that the model is fit to the covariance matrix, and then values are standardized.
Endogenous – On the receiving end of the arrows; the outcome variables being "impacted."
Exogenous – At the start of the arrows; the predictor variables "having an impact." Latent in this case.
Structural equation model             Number of obs  =  399
Estimation method = ml
Log likelihood    = -3214.554

( 1)  [D2]ETA1 = 1

                                 OIM
  Standardized     Coef.      Std. Err.      z      P>|z|    [95% Conf. Interval]
  Measurement
  D2 <-
    ETA1         .4696598    .0445749     10.54    0.000     .3822946    .557025
    _cons        3.502062    .1336983     26.19    0.000     3.240018    3.764105
  D5 <-
    ETA1         .82798      .0282468     29.31    0.000     .7726174    .8833426
    _cons        3.142568    .1219913     25.76    0.000     2.90347     3.381667
  D10 <-
    ETA1         .7971297    .0309432     25.76    0.000     .7364822    .8577772
    _cons        2.743128    .109251      25.11    0.000     2.529       2.957256
  D11 <-
    ETA1         .8470608    .0270608     31.30    0.000     .7940225    .900099
    _cons        3.116814    .1211605     25.72    0.000     2.879344    3.354284
  D16 <-
    ETA1         .7567983    .0340795     22.21    0.000     .6900037    .823593
    _cons        3.133521    .1216993     25.75    0.000     2.894995    3.372047
  Variance
    e.D2         .7794197    .0418701                        .7015283    .8659594
    e.D5         .3144491    .0467755                        .2349259    .4208912
    e.D10        .3645842    .0493314                        .2796552    .4753055
    e.D11        .282488     .0458443                        .205523     .3882752
    e.D16        .4272563    .0515827                        .3372274    .54132
    ETA1         1           .                                .           .

LR test of model vs. saturated: chi2(5) = 159.75, Prob > chi2 = 0.0000
All the loadings, error variances, (and constants) seen in the previous path diagram.
The same chi-square "badness of fit" test from the previous slide. Degrees of freedom equal the 15 observations in the covariance matrix minus the 10 parameters estimated (5 loadings, 5 error variances): 15 - 10 = 5 df.
Rejection means a significantly worse fit than the saturated (perfectly fitting) model.
The unidimensional model implies a covariance (in this case, correlation) matrix whose fit is significantly worse than the saturated model's.
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 17
Building Models with SEM
. alpha D2-D21, label item casewise std

Test scale = mean(standardized items)

                        item-test  item-rest  interitem
  Item   Obs  Sign      corr.      corr.      corr.      alpha    Label
  D2     383  +         0.4885     0.3492     0.4375     0.8615   You underestimated students knowledge of topic
  D5     383  +         0.8130     0.7446     0.3665     0.8223   Not clear how lesson related to curric standards
  D10    383  +         0.8013     0.7295     0.3691     0.8239   Task didnt help kids achieve your objective
  D11    383  +         0.8194     0.7530     0.3651     0.8215   You like to worry about whether students like you
  D16    383  +         0.7728     0.6927     0.3753     0.8278   Wonder if students understand your sarcasm
  D6     383  -         0.6588     0.5500     0.4002     0.8422   Examples helped students understand main concept
  D13    383  -         0.6550     0.5454     0.4010     0.8427   You believe students can learn the material
  D18    383  -         0.5811     0.4567     0.4172     0.8513   Students seem to follow classroom rules
  D21    383  -         0.5321     0.3992     0.4279     0.8568   Starting with students interests made them attentive
  Test scale                                  0.3955     0.8548   mean(standardized items)
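The standardized alpha in this output follows directly from the average inter-item correlation, via the formula α_std = k·r̄ / (1 + (k − 1)·r̄). A quick Python check, plugging in the r̄ = 0.3955 and k = 9 items from the output above:

```python
def standardized_alpha(k, mean_r):
    """Cronbach's alpha for k standardized items with
    average inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)

print(round(standardized_alpha(9, 0.3955), 4))  # 0.8548, matching Stata's test scale
```

This makes clear why dropping an item with a low item-rest correlation (like D2) can raise alpha: it raises the average inter-item correlation of what remains.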
Model 1: Maybe there are two CORRELATED factors underlying item responses on teacher defensiveness scale.
Model 2: Maybe there are two correlated factors that have a “simple structure,” where one factor explains responses on items where teachers take criticism, and one factor explains responses on items where teachers take compliments.
Model 3: If the simple structure model holds, how well can we predict your latent "ability to hear compliments" from your latent "ability to hear criticism"?
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 18
An SEM Progression: Step 1, Correlated Factors, Fully Crossed Loadings
  Covariance
    ETA1
      ETA2     .3144086    56.3613     0.01    0.996    -110.1517   110.7805

LR test of model vs. saturated: chi2(17) = 271.15, Prob > chi2 = 0.0000
sem (ETA1 ETA2 -> D2 D5 D10 D11 D16 D6 D13 D18 D21), ///
    nocapslatent latent(ETA1 ETA2) standardized
With 9 indicators, there are 9(10)/2 = 45 observations; 45 observations – (18 loadings + 9 error variances + 1 correlation) = 17 degrees of freedom. The model does not fit as well as the saturated model, in the population.
Standardized        Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
Covariance
  ETA1
    ETA2        -.7626425    .0372352  -20.48   0.000     -.8356222   -.6896629

LR test of model vs. saturated: chi2(26) = 685.10, Prob > chi2 = 0.0000
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 19
An SEM Progression: Step 2, "Simple Structure" Confirmatory FA

sem (ETA1 -> D2 D5 D10 D11 D16) (ETA2 -> D6 D13 D18 D21), ///
    nocapslatent latent(ETA1 ETA2) standardized
With 9 indicators, 45 observations – (9 loadings + 9 error variances + 1 correlation) = 26 degrees of freedom. The model does not fit as well as the saturated model, in the population.
The fit is also significantly worse than the fully crossed model. We will continue as an illustration, as if the other fit statistics from estat gof, stats(all) were agreeable (they aren't, but let's continue):
                              OIM
Standardized        Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
Structural
  ETA2 <-
    ETA1        -.7626425    .0372352  -20.48   0.000     -.8356222   -.6896628
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 20
An SEM Progression: Step 3, Latent Variable Regression
sem (ETA1 -> D2 D5 D10 D11 D16) (ETA2 -> D6 D13 D18 D21) ///
    (ETA1 -> ETA2), nocapslatent latent(ETA1 ETA2) standardized
With 9 indicators, 45 observations – (9 loadings + 9 error variances + 1 regression coefficient) = 26 degrees of freedom. The fit is equivalent to the last model, because regressing ETA2 on ETA1 is a reparameterization of their correlation, and the model still does not fit as well as the saturated model, in the population.
I encourage you to see fit as comparative, not absolute, and do not play the "SEM fit game," where we add and remove loadings, latent factors, and error covariances until we find a model that fits.
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 21
Comparing correlations between observed and latent variables
[Path diagram: the observed composite correlation r(X1, X2) = -.5891, alongside the correlation between the latent variables X1* and X2*.]

. generate X1 = D2+D5+D10+D11+D16
(12 missing values generated)

. generate X2 = D6+D13+D18+D21
(17 missing values generated)

. correlate X1 X2
(obs=383)

             X1      X2
  X1     1.0000
  X2    -0.5891  1.0000

Correlations between observed variables will always be "attenuated," smaller in magnitude and closer to 0.
The sem command estimates the correlation between the latent variables and takes the lack of reliability of the indicators into account.
This “disattenuates” the correlation, attempting to reinflate the observed correlation to account for the measurement error as informed by the intercorrelations between the indicators.
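The classical correction for attenuation can make this concrete. A minimal sketch of the formula, where the two reliability values are hypothetical round numbers chosen for illustration, not estimates from the Fowler data:

```python
import math

# Classical disattenuation: r(X1*, X2*) = r(X1, X2) / sqrt(rel_1 * rel_2).
# rel_1 and rel_2 are the reliabilities of the two composites
# (hypothetical values here, for illustration only).
def disattenuate(r_obs, rel_1, rel_2):
    return r_obs / math.sqrt(rel_1 * rel_2)

r_obs = -0.5891   # observed composite correlation from `correlate X1 X2`
print(round(disattenuate(r_obs, 0.77, 0.77), 3))  # -> -0.765
```

With plausible subscale reliabilities, the corrected correlation moves from -.5891 toward the SEM estimate of -.7626, which is exactly the logic sem applies using the indicator intercorrelations.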
. regress X2 X1, beta

      Source         SS       df       MS              Number of obs =     383
                                                       F(1, 381)     =  202.54
       Model   2225.00881      1  2225.00881           Prob > F      =  0.0000
    Residual   4185.45594    381  10.9854487           R-squared     =  0.3471
                                                       Adj R-squared =  0.3454
       Total   6410.46475    382  16.7813213           Root MSE      =  3.3144

          X2       Coef.   Std. Err.      t    P>|t|        Beta
          X1   -.3813722    .0267974  -14.23   0.000    -.5891435
       _cons    27.90022     .660665   42.23   0.000            .
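One internal consistency check on this output: in a simple regression the standardized slope equals the correlation, so its magnitude is recoverable as the square root of R-squared. A quick sketch using the reported R-squared:

```python
import math

# In simple regression, |standardized beta| = |r| = sqrt(R^2).
r_squared = 0.3471                 # R-squared from the regress output
beta_magnitude = math.sqrt(r_squared)
print(round(beta_magnitude, 3))  # -> 0.589, matching the Beta column (-.5891435)
```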
© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 22
Comparing correlations between observed and latent variables
[Path diagram: the observed standardized regression coefficient, -.5891, alongside the latent regression coefficient of ETA2 on ETA1, -.7626.]

Simple regression coefficients between observed variables will always be "attenuated," smaller in magnitude and closer to 0.
In multiple regression, the bias due to measurement error will be unpredictable!
The sem command estimates the regression coefficients between the latent variables and takes the lack of reliability of the indicators into account.
This corrects the regression coefficients, attempting to account for the measurement error as informed by the intercorrelations between the indicators.
This seems important, no?
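The "unpredictable bias" claim for multiple regression can be illustrated with a hypothetical population example (all numbers invented for this sketch): a true model where y depends only on X1*, with X1 measured with error and X2 measured perfectly. Solving the normal equations from the observed moments shows X1's coefficient shrinks while X2 picks up a spurious nonzero coefficient:

```python
# Hypothetical population: y = 0.5*X1star + 0*X2star, corr(X1star, X2star) = 0.5,
# X1 = X1star + e with Var(e) = 0.5 (reliability 2/3), X2 = X2star error-free.
def solve2(a11, a12, a21, a22, b1, b2):
    """Solve the 2x2 system [[a11, a12], [a21, a22]] @ [x, y] = [b1, b2] (Cramer)."""
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

# Observed moments implied by the setup:
# Var(X1) = 1.5, Var(X2) = 1, Cov(X1, X2) = 0.5, Cov(y, X1) = 0.5, Cov(y, X2) = 0.25
g1, g2 = solve2(1.5, 0.5, 0.5, 1.0, 0.5, 0.25)
print(round(g1, 3), round(g2, 3))  # -> 0.3 0.1
```

X1's true coefficient of 0.5 is attenuated to 0.3, and X2's true coefficient of 0 becomes 0.1: measurement error in one predictor can bias another predictor's coefficient away from zero, in either direction.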