
Page 1: Unit 7a: Factor Analysis

© Andrew Ho, Harvard Graduate School of Education, Unit 7a – Slide 1

http://xkcd.com/419/

Page 2:

• From principal components to factor analysis
• From factor analysis to structural equation modeling
• Interpreting factor analytic results


Multiple Regression Analysis (MRA)

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$

Do your residuals meet the required assumptions?
• Test for residual normality.
• Use influence statistics to detect atypical datapoints.
• If your residuals are not independent, replace OLS by GLS regression analysis:
  - Use individual growth modeling.
  - Specify a multi-level model.
  - If time is a predictor, you need discrete-time survival analysis…
• If your outcome is categorical, you need to use:
  - Binomial logistic regression analysis (dichotomous outcome).
  - Multinomial logistic regression analysis (polytomous outcome).
• If you have more predictors than you can deal with:
  - Create taxonomies of fitted models and compare them.
  - Form composites of the indicators of any common construct.
  - Conduct a Principal Components Analysis.
  - Use Cluster Analysis.
  - Use Factor Analysis: EFA or CFA? (Today's Topic Area)
• If your outcome vs. predictor relationship is non-linear:
  - Use non-linear regression analysis.
  - Transform the outcome or predictor.

Course Roadmap: Unit 7a

Page 3:


Dataset FOWLER.txt

Overview

In this study, which was her qualifying paper, Amy Fowler reports on the piloting and construct validation of a self-designed survey instrument to measure teachers’ professional defensiveness within the context of a professional relationship with a peer.  The dataset contains responses from teachers on this Teacher Professional Defensiveness Scale (TPDS), and on two other published scales – the Specific Interpersonal Trust Scale (SITS) and the Fear of Negative Evaluation Scale (FNES) (Robinson et al., 1991).

Source

Fowler, A.M. (2009).  Measuring Teacher Defensiveness:  An Instrument Validation Study. Unpublished Qualifying Paper, Harvard Graduate School of Education.

Robinson, J. P., Shaver, P. R., Wrightsman, L. S., & Andrews, F. M. (1991).  Measures Of Personality And Social Psychological Attitudes.  San Diego: Academic Press.

Additional Info

The three instruments are described more fully in Fowler (2009), but briefly they are:

• Specific Interpersonal Trust Scale (SITS): The SITS scale assesses the level of trust one individual places in another. Participants choose a response that reflects their feelings towards a referent peer, on a five-point scale (see Metric below). There are 8 items, 3 of which must be reverse coded when totaling the responses; larger total scores represent a higher level of respondents' trust in their selected peers.

• Teacher Professional Defensiveness Scale (TPDS): The TPDS scale was designed by the investigator to assess teacher sensitivity to peer feedback. Respondents are directed to imagine their referent peer making a series of comments regarding the respondent's teaching practice and are then asked to respond to these comments on a seven-point scale (see Metric below). The scale contains 21 items presented in random order; statement stems are alternately complimentary, neutral, or critical.

• Fear of Negative Evaluation Scale (FNES): The FNES scale assesses the degree of fear respondents have regarding negative evaluation from others. Respondents are asked to assess how characteristic each statement is of their own feelings, on a five-point scale (see Metric below). There are 12 items, including 4 for which responses must be reverse coded when totaling the responses; larger total scores indicate higher levels of fear of negative evaluation.
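The reverse-coding-and-totaling procedure described above can be sketched in a few lines. This is a minimal illustration in Python (not part of the course's Stata materials); the item names and responses are hypothetical, and the assumed rule is the standard one for a five-point scale, reversed value = 6 - response.

```python
def score_scale(responses, reverse_items, scale_max=5):
    """Total a Likert scale, reverse coding the flagged items.

    responses: dict mapping item name -> integer response in 1..scale_max.
    reverse_items: set of item names to reverse code (e.g., 3 of the 8
    SITS items), using reversed = (scale_max + 1) - response.
    """
    total = 0
    for item, value in responses.items():
        if item in reverse_items:
            value = (scale_max + 1) - value  # 5 -> 1, 4 -> 2, ..., 1 -> 5
        total += value
    return total

# Hypothetical 8-item response vector; items T3, T5, T7 are reverse coded.
responses = {"T1": 4, "T2": 5, "T3": 2, "T4": 4,
             "T5": 3, "T6": 5, "T7": 1, "T8": 4}
print(score_scale(responses, reverse_items={"T3", "T5", "T7"}))  # 34
```

With all items pointing the same direction after reverse coding, larger totals represent higher trust, as in the SITS description above.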

Sample size

411 Teachers

Construct Validation: A New Instrument to Measure Teacher Professional Defensiveness

Teacher "Professional Defensiveness"

Three sub-scales, each with 7 items:
• "Criticisms"
• "Compliments"
• "Neutral"

Research Objective:
• Check that the three sub-scales of TPDS are each separately uni-dimensional.
• Check that the constructs measured by each sub-scale are distinct from each other (but perhaps correlated).
• Check the relationship between the TPD sub-scales and other accepted measures (SITS, FNES).

Page 4:

Col#  VarName  Variable Description

1     TID      Teacher identification code (integer)
2     SID      School identification code (integer)

Teacher Professional Defensiveness Scale (TPDS) items, rated on:
1 = High Criticism, 2 = Criticism, 3 = Mild Criticism, 4 = Neither,
5 = Mild Compliment, 6 = Compliment, 7 = High Compliment

16    D1    I like the tone of your room; it is friendly but serious about school at the same time.
17    D2    I think you underestimated students' prior knowledge of today's topic.
20    D5    I'm not clear how today's lesson related to the curriculum standards.
21    D6    The examples you used to explain the main concept helped students to understand the big ideas.
25    D10   I'm not sure the task you had kids do required kids to achieve the objective you had on the board.
26    D11   It seems like you worry about whether or not the students like you.
28    D13   I can tell by the way you talk to the students that you believe they can learn this material.
29    D14   I wonder if the students learned the concept you wanted them to learn through that hands-on lesson.
31    D16   I wonder if students in the class understand your sarcasm in the same way you mean it.
33    D18   Students seem to follow the classroom rules.
36    D21   The way you started the class with students' interests got them involved and attentive to the lesson.


Indicators of Teacher Professional Defensiveness, "Criticisms" Sub-Scale (we'll look at Amy's final 5 items).

Indicators of Teacher Professional Defensiveness, "Compliments" Sub-Scale (final 4 items).

Read each item. Take the test. Know your scale.

The assumption here is that some teachers consistently score higher (or lower) than others on these items, a possible measure of defensiveness or sensitivity to criticism from peers.

Page 5:

[Figure: five histograms of the percent of teachers choosing each response (1–7) on the Criticisms items: "You underestimated students knowledge of topic" (D2), "Not clear how lesson related to curric standards" (D5), "Task didnt help kids achieve your objective" (D10), "You like to worry about whether students like you" (D11), and "Wonder if students understand your sarcasm" (D16).]


Teachers' responses to the items on the Criticisms sub-scale have different levels of variability across items. This is common and raises the question of whether this variability is comparable.

Variable  Label
D2        You underestimated students' knowledge of today's topic.
D5        Not clear how lesson related to curriculum standards.
D10       Task didn't help kids achieve your objective.
D11       You like to worry about whether students like you.
D16       Wonder if students understand your sarcasm.

Let's focus first on the "Criticisms" subscale of Teacher Professional Defensiveness…

On average, teachers were more sensitive/defensive to certain items than others. This is common and usually a good thing (some items are more "difficult" than others).

Response coding shown on the histograms:
1 = High Compliment, 2 = Compliment, 3 = Mild Compliment, 4 = Neither,
5 = Mild Criticism, 6 = Criticism, 7 = High Criticism

Exploratory Data Analysis – Univariate

Page 6:

[Figure: scatterplot matrix (1–7 scales on both axes) of the five Criticisms items: "You underestimated students knowledge of topic," "Not clear how lesson related to curric standards," "Task didnt help kids achieve your objective," "You like to worry about whether students like you," and "Wonder if students understand your sarcasm."]


Let's focus first on the "Criticisms" subscale of Teacher Professional Defensiveness…

Pearson Correlation Coefficients, N = 399 (estimated under listwise deletion)

        D2      D5      D10     D11     D16
D2    1.0000
D5    0.3452  1.0000
D10   0.4411  0.5932  1.0000
D11   0.3212  0.7997  0.6338  1.0000
D16   0.4677  0.5575  0.7501  0.5721  1.0000

The sample bivariate correlations among the indicators on the “Criticisms” sub-scale are all positive, and the relationships do not appear to be obviously nonlinear. Could the subscale be a useful indicator of a single common construct? Could the subscale be unidimensional?

Exploratory Data Analysis – Bivariate

Variable  Label
D2        You underestimated students' knowledge of today's topic.
D5        Not clear how lesson related to curriculum standards.
D10       Task didn't help kids achieve your objective.
D11       You like to worry about whether students like you.
D16       Wonder if students understand your sarcasm.
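One quick, informal check on the unidimensionality question is to eigendecompose the sample correlation matrix: a single dominant eigenvalue is consistent with one common construct. This is a sketch in Python with NumPy (not the course's Stata workflow), using the correlations exactly as printed above:

```python
import numpy as np

# Sample correlation matrix for D2, D5, D10, D11, D16 (N = 399, from above).
R = np.array([
    [1.0000, 0.3452, 0.4411, 0.3212, 0.4677],
    [0.3452, 1.0000, 0.5932, 0.7997, 0.5575],
    [0.4411, 0.5932, 1.0000, 0.6338, 0.7501],
    [0.3212, 0.7997, 0.6338, 1.0000, 0.5721],
    [0.4677, 0.5575, 0.7501, 0.5721, 1.0000],
])

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]  # largest first
print(eigvals)
# The eigenvalues of a p x p correlation matrix always sum to p (here, 5);
# a first eigenvalue far above 1 suggests a strong common dimension.
```

This is only a screening device; the factor models later in the unit address the question formally.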

Page 7:


Research Question?

Rather than asking, "Can we forge these several indicators together into a smaller number of composites with defined statistical properties?" (then we would need Principal Components Analysis, PCA), we could instead ask, "Are there a number of unseen (latent) factors (constructs) acting 'beneath' these indicators to forge their observed values?" (then we would need Factor Analysis: CFA or EFA?).

Path Model of Factor Analysis

[Path diagram: latent factors η1i and η2i predict the indicators D2i, D5i, D10i, D11i, and D16i; each indicator has its own error term (ε2i, ε5i, ε10i, ε11i, ε16i).]

Path Model of Principal Components Analysis

[Path diagram: the standardized indicators D*2i, D*5i, D*10i, D*11i, and D*16i are forged into orthogonal composites C1i, C2i, …, with later components carrying the remaining variation.]

PCA vs. Exploratory Factor Analysis, Graphically

Page 8:


Statistical Model

For Factor Analysis …

$D_{2i} = \lambda_{2,1}\eta_{1i} + \lambda_{2,2}\eta_{2i} + \varepsilon_{2i}$
$D_{5i} = \lambda_{5,1}\eta_{1i} + \lambda_{5,2}\eta_{2i} + \varepsilon_{5i}$
$D_{10i} = \lambda_{10,1}\eta_{1i} + \lambda_{10,2}\eta_{2i} + \varepsilon_{10i}$
$D_{11i} = \lambda_{11,1}\eta_{1i} + \lambda_{11,2}\eta_{2i} + \varepsilon_{11i}$
$D_{16i} = \lambda_{16,1}\eta_{1i} + \lambda_{16,2}\eta_{2i} + \varepsilon_{16i}$

Given the X's (the D's), estimate the λ's, guess the η's, compute the ε's … Too many parameters! As written, there is no unique solution. But by setting a scale for the latent variables and constraining their interrelationships, we can arrive at useful and informative solutions.

For Principal Components Analysis …

$C_{1i} = a_{1,2}D^{*}_{2i} + a_{1,5}D^{*}_{5i} + a_{1,10}D^{*}_{10i} + a_{1,11}D^{*}_{11i} + a_{1,16}D^{*}_{16i}$
$C_{2i} = a_{2,2}D^{*}_{2i} + a_{2,5}D^{*}_{5i} + a_{2,10}D^{*}_{10i} + a_{2,11}D^{*}_{11i} + a_{2,16}D^{*}_{16i}$
$\vdots$
$C_{5i} = a_{5,2}D^{*}_{2i} + a_{5,5}D^{*}_{5i} + a_{5,10}D^{*}_{10i} + a_{5,11}D^{*}_{11i} + a_{5,16}D^{*}_{16i}$

Given the X's (the D*'s), pick the a's, and compute the PC's; this is the "eigenvalue" problem, solved! An unambiguous set of orthogonal composites: the answer is determined completely by the data.

PCA vs. EFA, Formulaically
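The "eigenvalue problem" for PCA can be made concrete: the weight vectors a are the eigenvectors of the correlation matrix of the standardized indicators. A sketch in Python with NumPy (illustrative, not the course code), reusing the Criticisms correlation matrix from the earlier slide:

```python
import numpy as np

# Correlation matrix of the standardized indicators D*2 ... D*16 (slide 6).
R = np.array([
    [1.0000, 0.3452, 0.4411, 0.3212, 0.4677],
    [0.3452, 1.0000, 0.5932, 0.7997, 0.5575],
    [0.4411, 0.5932, 1.0000, 0.6338, 0.7501],
    [0.3212, 0.7997, 0.6338, 1.0000, 0.5721],
    [0.4677, 0.5575, 0.7501, 0.5721, 1.0000],
])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]       # components in decreasing order
eigvals = eigvals[order]
A = eigvecs[:, order].T                 # row k holds the weights a_k for C_k

# The weight vectors are orthonormal, so the composites C = A D* are
# uncorrelated with variances equal to the eigenvalues; the solution is
# determined completely by the data, up to the sign of each row.
print(np.round(A @ A.T, 10))            # identity: orthogonal composites
print(eigvals)                          # component variances, summing to 5
```

This determinacy is exactly what the factor model gives up in exchange for its explicit error terms.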

Page 9:

*-----------------------------------------------------------------------
* Exploratory factor analysis of TPD "Criticisms" sub-scale, on its own
*-----------------------------------------------------------------------
factor D2 D5 D10 D11 D16, pf
factor D2 D5 D10 D11 D16, pcf
factor D2 D5 D10 D11 D16, ipf
factor D2 D5 D10 D11 D16, ml


Because the problem is ill-specified, there has been an oversaturated literature of methods to operationalize the number of dimensions and arrive at "ideal" solutions. This is like arguing over the best way to visualize a univariate relationship: stem and leaf? histogram? dot-plot?

Ways of obtaining an "initial" factor solution (of either the covariance or the correlation matrix):
• Alpha factor analysis
• Harris component analysis
• Image component analysis
• ML factor analysis
• Principal axis factoring
• Pattern, specified by user
• Prinit factor analysis
• Unweighted least-squares factor analysis

Ways of obtaining initial estimates of the measurement error variances:
• Absolute SMC
• Input from external file
• Maximum absolute correlation
• Set to one
• Set to random
• SMC

Ways of rotating to a final factor solution:
• Orthogonal: None, Biquartimax, Equamax, Orthogonal Crawford-Ferguson, Generalized Crawford-Ferguson, Orthomax, Parsimax, Quartimax, Varimax
• Oblique: Biquartimin, Covarimin, Harris-Kaiser Ortho-Oblique, Oblique Biquartimax, Oblique Equamax, Oblique Crawford-Ferguson, Oblique Generalized Crawford-Ferguson, Oblimin, Oblique Quartimax, Oblique Varimax, Procrustes, Promax, Quartimin, … etc.

# of methods of EFA … at least 2 × 7 × 6 × 22 = 1848

“Exploratory” Factor Analysis

Page 10:

*--------------------------------------------------------------------------
* Unidimensional factor analysis of TPD "Criticisms" sub-scale, with
* standardized variables. Compares the "factor" and "sem" commands.
*--------------------------------------------------------------------------

* The principal factor method (a classic exploratory technique)
factor D2 D5 D10 D11 D16, factor(1)

* Maximum likelihood
factor D2 D5 D10 D11 D16, ml factor(1)

* Single-factor EFA
sem (ETA1 -> D2 D5 D10 D11 D16), nocapslatent latent(ETA1) standardized


EFA is a specific and unstructured special case within a broader modeling framework known as "structural equation modeling" (SEM). SEM is often more "confirmatory" than "exploratory," incorporating both "measurement" components (multiple indicators of a latent construct) and "structural" components (prediction for latent variables).

$D^{*}_{2i} = \lambda_{2,1}\eta_{1i} + \varepsilon_{2i}$
$D^{*}_{5i} = \lambda_{5,1}\eta_{1i} + \varepsilon_{5i}$
$D^{*}_{10i} = \lambda_{10,1}\eta_{1i} + \varepsilon_{10i}$
$D^{*}_{11i} = \lambda_{11,1}\eta_{1i} + \varepsilon_{11i}$
$D^{*}_{16i} = \lambda_{16,1}\eta_{1i} + \varepsilon_{16i}$

See Unit7a.do

Hypothesis: Indicators of the “Criticisms” subscale have a “unidimensional” factor structure in the population.

Hypothesized Uni-Dimensional Factor Structure for the "Criticisms" Sub-Scale…

[Path diagram: a single latent factor η1i predicts the indicators D2i, D5i, D10i, D11i, and D16i, each with its own error term (ε2i, ε5i, ε10i, ε11i, ε16i).]

Towards Structural Equation Modeling

We can contrast this with the equation for the first principal component. In unidimensional FA, each indicator is predicted by a single factor. In single-component PCA, the first principal component is the weighted sum of all indicators.

Page 11:


More than conventional analyses, SEM techniques operationalize "the data" not as values for individual observations but as the elements of the covariance matrix.

Structural Equation Modeling: The Data

The Sample Correlation Matrix

. correlate D2 D5 D10 D11 D16
(obs=399)

        D2      D5      D10     D11     D16
D2    1.0000
D5    0.3452  1.0000
D10   0.4411  0.5932  1.0000
D11   0.3212  0.7997  0.6338  1.0000
D16   0.4677  0.5575  0.7501  0.5721  1.0000

The Sample Covariance Matrix

. correlate D2 D5 D10 D11 D16, covariance
(obs=399)

        D2       D5       D10      D11      D16
D2    1.3894
D5    .685621  2.83883
D10   .903685  1.73714  3.02106
D11   .636824  2.2661   1.85272  2.82878
D16   .798598  1.36083  1.88873  1.39402  2.09873

In structural equation modeling, the "goodness of fit" of the model is most often represented by how well the model's "implied covariance matrix" reproduces the sample covariance matrix.

Recall that a loose operational definition of the "degrees of freedom" is the number of observations minus the number of parameters estimated in the model.

In the SEM world, the number of observations is 15, the number of unique elements in the variance-covariance matrix. For p variables, the number of "observations" is p(p+1)/2.
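This counting rule can be verified in a few lines of plain arithmetic; the sketch below (Python, illustrative) reproduces the degrees-of-freedom figures used on this slide and the next:

```python
# SEM "observations" for p observed variables: the p(p+1)/2 unique elements
# of the covariance matrix. Degrees of freedom = observations minus the
# number of free parameters.
p = 5
n_obs = p * (p + 1) // 2            # 15 for the five Criticisms items

df_saturated = n_obs - n_obs        # 0: all 15 elements reproduced exactly
df_independent = n_obs - p          # 10: only the 5 variances are estimated
df_one_factor = n_obs - (p + p)     # 5: 5 loadings + 5 error variances

print(n_obs, df_saturated, df_independent, df_one_factor)  # 15 0 10 5
```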

Page 12:


The "principal factor" method for "extracting factors" simply replaces the 1's in the correlation matrix (the correlation of a variable with itself) with the R-squared value from a regression of that variable on all other variables (an extremely ad hoc estimate of the reliability of that variable). Then, we run a PCA as usual!

Unidimensional EFA: The “Principal Factor” Extraction Method

. correlate D2 D5 D10 D11 D16
(obs=399)

        D2      D5      D10     D11     D16
D2    1.0000
D5    0.3452  1.0000
D10   0.4411  0.5932  1.0000
D11   0.3212  0.7997  0.6338  1.0000
D16   0.4677  0.5575  0.7501  0.5721  1.0000

Hypothesized Uni-Dimensional Factor Structure for the "Criticisms" Sub-Scale…

[Path diagram: a single latent factor η1i predicts the indicators D2i, D5i, D10i, D11i, and D16i, each with its own error term (ε2i, ε5i, ε10i, ε11i, ε16i).]

. factor D2 D5 D10 D11 D16, pf factor(1)
(obs=399)

Factor analysis/correlation            Number of obs    = 399
Method: principal factors              Retained factors = 1
Rotation: (unrotated)                  Number of params = 5

Factor     Eigenvalue   Difference   Proportion   Cumulative
Factor1    2.83696      2.53604      1.0067       1.0067
Factor2    0.30092      0.34179      0.1068       1.1135
Factor3    -0.04086     0.09232      -0.0145      1.0990
Factor4    -0.13318     0.01252      -0.0473      1.0517
Factor5    -0.14570     .            -0.0517      1.0000

LR test: independent vs. saturated: chi2(10) = 1079.76  Prob>chi2 = 0.0000

We can model error variance after running PCA on the adjusted correlation matrix, where the diagonal entries are the R-squared values from regressions of each variable on all the remaining variables.

For D2, this R-squared value is .2423. We substitute that for the 1 in the correlation matrix, do the same for each variable, and run a PCA on the resulting matrix.

The "factor" command does this automatically.

These are familiar PCA results. There are negative eigenvalues because the matrix is not well formed, due to its ad hoc construction, but we only keep one factor anyway.

. matrix list RFAC, format(%6.0g)

symmetric RFAC[5,5]
        D2     D5     D10    D11    D16
D2    .2423
D5    .3452  .6587
D10   .4411  .5932  .6337
D11   .3212  .7997  .6338  .6805
D16   .4677  .5575  .7501  .5721  .603
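The principal factor recipe can be reproduced numerically: compute each variable's squared multiple correlation (SMC) from the inverse of the correlation matrix, substitute it on the diagonal, and eigendecompose. This is a sketch in Python with NumPy (illustrative, not the course's Stata run; working from the 4-decimal correlations, results should land near the slide's .2423 and 2.837 rather than match them exactly):

```python
import numpy as np

# Sample correlation matrix for D2, D5, D10, D11, D16.
R = np.array([
    [1.0000, 0.3452, 0.4411, 0.3212, 0.4677],
    [0.3452, 1.0000, 0.5932, 0.7997, 0.5575],
    [0.4411, 0.5932, 1.0000, 0.6338, 0.7501],
    [0.3212, 0.7997, 0.6338, 1.0000, 0.5721],
    [0.4677, 0.5575, 0.7501, 0.5721, 1.0000],
])

# SMC of each variable regressed on the others, via the standard identity
# SMC_j = 1 - 1 / [R^{-1}]_{jj}.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

R_adj = R.copy()
np.fill_diagonal(R_adj, smc)        # replace the 1's with the SMCs

eigvals = np.sort(np.linalg.eigvalsh(R_adj))[::-1]
print(np.round(smc, 4))             # first entry (D2) should be near .2423
print(np.round(eigvals, 5))         # first eigenvalue near 2.837; trailing
                                    # eigenvalues negative, as in the output
```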

Page 13:

Factor loadings (pattern matrix) and unique variances

Variable   Factor1   Uniqueness
D2         0.4886    0.7613
D5         0.8016    0.3574
D10        0.8168    0.3328
D11        0.8187    0.3297
D16        0.7863    0.3818


The "factor loadings" are the estimated correlations between each variable and the respective factor. The "uniqueness" is the estimated error variance left unaccounted for by the factor structure.

Unidimensional EFA: The “Principal Factor” Extraction Method

Hypothesized Uni-Dimensional Factor Structure for the "Criticisms" Sub-Scale…

[Path diagram: a single latent factor η1i predicts the indicators D2i, D5i, D10i, D11i, and D16i, each with its own error term (ε2i, ε5i, ε10i, ε11i, ε16i).]

Again, we can contrast this with the equation for the first principal component. In unidimensional FA, each indicator is predicted by a single factor, and the unexplained variation is formally modeled as an error variance. In single-component PCA, the first principal component is the weighted sum of all indicators, and the unexplained variation is informally represented by the remaining principal components.

Page 14:

. factor D2 D5 D10 D11 D16, ml factor(1)
(obs=399)

Factor analysis/correlation            Number of obs    = 399
Method: maximum likelihood             Retained factors = 1
Rotation: (unrotated)                  Number of params = 5
Log likelihood = -79.87649             (Akaike's) AIC   = 169.753
                                       Schwarz's BIC    = 189.698

Factor     Eigenvalue   Difference   Proportion   Cumulative
Factor1    2.83182      .            1.0000       1.0000

LR test: independent vs. saturated: chi2(10) = 1079.76  Prob>chi2 = 0.0000
LR test: 1 factor vs. saturated: chi2(5) = 158.49  Prob>chi2 = 0.0000

Factor loadings (pattern matrix) and unique variances

Variable   Factor1   Uniqueness
D2         0.4697    0.7794
D5         0.8279    0.3146
D10        0.7972    0.3645
D11        0.8470    0.2826
D16        0.7569    0.4271


By assuming multivariate normality (not always a realistic assumption), we gain the ability to conduct likelihood ratio tests.

Unidimensional EFA: The Maximum Likelihood Extraction Method

The reference point is the "perfectly fitting" saturated model, which reproduces the covariance matrix exactly. This is the best possible fit, with 0 degrees of freedom. This is the OPPOSITE of our conventional approach, which usually starts from bad fit!

The independent model estimates the five indicator variances but no covariances (15 observations minus 5 parameters = 10 df). Compared to the saturated model, the independent model effectively fixes all pairwise relationships (10 degrees of freedom). The null hypothesis is that it fits just as well as the saturated model. It does not; it is worse. The badness of fit of the independent model is rarely surprising but can serve as a reference.

The unidimensional model estimates 10 parameters (5 error variances and 5 loadings), for 15 observations - 10 parameters = 5 degrees of freedom. The provided test shows that its relative badness of fit is not due to chance and is worse in the population (not good news, but not uncommon).

These chi-square tests are "badness of fit" tests, where we want to fail to reject. Chi-square tests are seen as ultra-conservative in this literature. There is a ridiculous number of alternative fit statistics that will make you look better! estat gof, stats(all) and google your hearts out.

                     Saturated Model              Fitted Model                     Independent Model
Fit                  Perfect                      Hypothesized                     Usually bad
Degrees of freedom   15 obs - 15 parameters = 0   15 obs - 5 errvar - 5 load = 5   15 obs - 5 errvar/var = 10
Chi-square           0                            158.49                           1079.76

Results differ slightly from the principal factor extraction method.

Page 15:


EFA is a special case of a universe of SEM models that can be specified either graphically or on the command line in Stata:

sem (ETA1 -> D2 D5 D10 D11 D16), nocapslatent latent(ETA1) standardized

The sem Command in Stata 12+ (Graphical User Interface)

Estimated error variance (.78). Due to standardization, this can be interpreted as the proportion of variance in D2 left unexplained by the factor structure. It is called "uniqueness" in the factor analytic literature and output.

The "factor loading," in this case standardized, is the estimated correlation between the factor and the observed variable. Two individuals who differ by one unit on the unobserved factor differ by 0.47 standard deviation units on D2.

Because of the “standardized” option, the variance of the latent variable and all indicators have been set to 1.

The scale of the latent variable MUST be set somehow. This is one way to do it (but not the only way). This is sometimes called the “unit variance constraint” or “unit variance identification.”

We also often see “unit loading identification,” where the variance of the latent variable is estimated, but a single loading is set to 1.

This choice between UVI and ULI does not change model fit.

These are the “constants” in the latent regression equation and are often ignored.

Page 16:

. sem (ETA1 -> D2 D5 D10 D11 D16), nocapslatent latent(ETA1) standardized
(12 observations with missing values excluded;
 specify option 'method(mlmv)' to use all observations)

Endogenous variables
  Measurement: D2 D5 D10 D11 D16

Exogenous variables
  Latent: ETA1



Interpreting output directly from the sem command

Because we capitalize our variables (not conventional), we have to use the nocapslatent option. Otherwise, Stata assumes that capitalized variables are latent variables.

The standardized option reports standardized output and is often used for reporting when scales are not meaningful. Note, however, that the model is fit to the covariance matrix, and then the values are standardized.

Endogenous – on the receiving end of the arrows: the outcome variables being "impacted."
Exogenous – at the start of the arrows: the predictor variables "having an impact." Latent in this case.

Structural equation model                       Number of obs = 399
Estimation method = ml
Log likelihood = -3214.554

 ( 1)  [D2]ETA1 = 1

                           OIM
Standardized     Coef.     Std. Err.   z       P>|z|   [95% Conf. Interval]
Measurement
D2  <- ETA1      .4696598  .0445749    10.54   0.000   .3822946   .557025
       _cons     3.502062  .1336983    26.19   0.000   3.240018   3.764105
D5  <- ETA1      .82798    .0282468    29.31   0.000   .7726174   .8833426
       _cons     3.142568  .1219913    25.76   0.000   2.90347    3.381667
D10 <- ETA1      .7971297  .0309432    25.76   0.000   .7364822   .8577772
       _cons     2.743128  .109251     25.11   0.000   2.529      2.957256
D11 <- ETA1      .8470608  .0270608    31.30   0.000   .7940225   .900099
       _cons     3.116814  .1211605    25.72   0.000   2.879344   3.354284
D16 <- ETA1      .7567983  .0340795    22.21   0.000   .6900037   .823593
       _cons     3.133521  .1216993    25.75   0.000   2.894995   3.372047
Variance
  e.D2           .7794197  .0418701                    .7015283   .8659594
  e.D5           .3144491  .0467755                    .2349259   .4208912
  e.D10          .3645842  .0493314                    .2796552   .4753055
  e.D11          .282488   .0458443                    .205523    .3882752
  e.D16          .4272563  .0515827                    .3372274   .54132
  ETA1           1         .

LR test of model vs. saturated: chi2(5) = 159.75, Prob > chi2 = 0.0000

All the loadings, error variances, (and constants) seen in the previous path diagram.

The same chi-square "badness of fit" test from the previous slide. Degrees of freedom equal the 15 observations in the covariance matrix minus the 10 parameters estimated (5 loadings, 5 error variances), making a chi-square test with five df.

Rejection means a significantly worse fit than the saturated (perfectly fitting) model.

The unidimensional model implies a covariance (in this case, correlation) matrix whose fit is significantly worse than that of the saturated model.
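The degrees-of-freedom bookkeeping used throughout these slides can be sketched in a few lines of Python (a hedged illustration of the counting rule, not output from any fitted model; the parameter counts are the ones stated on the slides):

```python
# df for an SEM likelihood-ratio test against the saturated model:
# unique elements of the covariance matrix minus free parameters.

def df_vs_saturated(p, n_params):
    """p indicators give p*(p+1)//2 unique variances and covariances."""
    return p * (p + 1) // 2 - n_params

# One-factor model, 5 indicators: 5 loadings + 5 error variances = 10 params
print(df_vs_saturated(5, 10))   # 5

# Fully crossed two-factor model, 9 indicators:
# 18 loadings + 9 error variances + 1 factor covariance = 28 params
print(df_vs_saturated(9, 28))   # 17

# Simple-structure model: 9 loadings + 9 error variances + 1 covariance
print(df_vs_saturated(9, 19))   # 26
```

The same function reproduces the 5, 17, and 26 df reported for the three LR tests in this unit.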


© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 17

Building Models with SEM

. alpha D2-D21, label item casewise std

Test scale = mean(standardized items)

                                                 average
                      item-test  item-rest  interitem
Item        Obs Sign    corr.      corr.      corr.     alpha   Label
------------------------------------------------------------------------------
D2          383   +    0.4885     0.3492     0.4375    0.8615   You underestimated students knowledge of topic
D5          383   +    0.8130     0.7446     0.3665    0.8223   Not clear how lesson related to curric standards
D10         383   +    0.8013     0.7295     0.3691    0.8239   Task didnt help kids achieve your objective
D11         383   +    0.8194     0.7530     0.3651    0.8215   You like to worry about whether students like you
D16         383   +    0.7728     0.6927     0.3753    0.8278   Wonder if students understand your sarcasm
D6          383   -    0.6588     0.5500     0.4002    0.8422   Examples helped students understand main concept
D13         383   -    0.6550     0.5454     0.4010    0.8427   You believe students can learn the material
D18         383   -    0.5811     0.4567     0.4172    0.8513   Students seem to follow classroom rules
D21         383   -    0.5321     0.3992     0.4279    0.8568   Starting with students interests made them attentive
------------------------------------------------------------------------------
Test scale                                   0.3955    0.8548   mean(standardized items)
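For standardized items, Stata's reported alpha is just the standardized-item (Spearman–Brown-style) formula applied to the average interitem correlation. A minimal check in Python, using the two values from the output above:

```python
def standardized_alpha(k, mean_r):
    """Cronbach's alpha for k standardized items with average
    interitem correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# 9 items, average interitem correlation 0.3955, as in the alpha output
print(round(standardized_alpha(9, 0.3955), 4))  # 0.8548
```

This reproduces the 0.8548 on the “Test scale” row.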

Model 1: Maybe there are two CORRELATED factors underlying item responses on the teacher defensiveness scale.

Model 2: Maybe there are two correlated factors that have a “simple structure,” where one factor explains responses on items where teachers take criticism, and one factor explains responses on items where teachers take compliments.

Model 3: If the simple structure model holds, how well can we predict your latent “ability to hear compliments” from your latent “ability to hear criticism”?


© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 18

An SEM Progression: Step 1, Correlated Factors, Fully Crossed Loadings

-------------+----------------------------------------------------------------
Covariance   |
  ETA1       |
        ETA2 |   .3144086    56.3613     0.01   0.996    -110.1517   110.7805
------------------------------------------------------------------------------
LR test of model vs. saturated: chi2(17) = 271.15, Prob > chi2 = 0.0000

sem (ETA1 ETA2 -> D2 D5 D10 D11 D16 D6 D13 D18 D21), ///
    nocapslatent latent(ETA1 ETA2) standardized

With 9 indicators, the covariance matrix has 45 unique observations; 45 − (18 loadings + 9 error variances + 1 correlation) = 17 degrees of freedom. The model does not fit as well as the saturated model, in the population.


-------------+----------------------------------------------------------------
Covariance   |
  ETA1       |
        ETA2 |  -.7626425   .0372352   -20.48   0.000    -.8356222   -.6896629
------------------------------------------------------------------------------
LR test of model vs. saturated: chi2(26) = 685.10, Prob > chi2 = 0.0000

© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 19

An SEM Progression: Step 2, “Simple Structure” Confirmatory FA

sem (ETA1 -> D2 D5 D10 D11 D16) (ETA2 -> D6 D13 D18 D21), ///
    nocapslatent latent(ETA1 ETA2) standardized

With 9 indicators, the covariance matrix has 45 unique observations; 45 − (9 loadings + 9 error variances + 1 correlation) = 26 degrees of freedom. The model does not fit as well as the saturated model, in the population.

The fit is also significantly worse than the fully crossed model. But we will continue as an illustration, or assuming the other fit statistics from estat gof, stats(all) are agreeable (they aren’t, but let’s continue):
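Because the simple-structure model is nested within the fully crossed model, the two LR statistics reported above can be compared directly with a chi-square difference test. A sketch using the values from the Stata output (16.92 is the conventional .05 critical value for 9 df):

```python
# Chi-square difference test for nested SEMs, values from the slides
chi2_crossed, df_crossed = 271.15, 17   # fully crossed loadings
chi2_simple, df_simple = 685.10, 26     # simple structure

delta_chi2 = chi2_simple - chi2_crossed   # 413.95
delta_df = df_simple - df_crossed         # 9

# 16.92 is the .05 critical value for chi-square with 9 df
print(delta_chi2 > 16.92)  # True: simple structure fits significantly worse
```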


             |                 OIM
Standardized |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Structural   |
  ETA2 <-    |
        ETA1 |  -.7626425   .0372352   -20.48   0.000    -.8356222   -.6896628

© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 20

An SEM Progression: Step 3, Latent Variable Regression

sem (ETA1 -> D2 D5 D10 D11 D16) (ETA2 -> D6 D13 D18 D21) ///
    (ETA1 -> ETA2), nocapslatent latent(ETA1 ETA2) standardized

With 9 indicators, the covariance matrix has 45 unique observations; 45 − (9 loadings + 9 error variances + 1 regression coefficient) = 26 degrees of freedom. The fit is equivalent to the last model and does not fit as well as the saturated model, in the population.

I encourage you to see fit as comparative, not absolute, and do not play the “SEM fit game,” where we add and remove loadings, latent factors, and error covariances until we find a model that fits.


© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 21

Comparing correlations between observed and latent variables


. generate X1 = D2+D5+D10+D11+D16
(12 missing values generated)

. generate X2 = D6+D13+D18+D21
(17 missing values generated)

. correlate X1 X2
(obs=383)

             |       X1       X2
-------------+------------------
          X1 |   1.0000
          X2 |  -0.5891   1.0000

Correlations between observed variables (here, −0.5891) will always be “attenuated”: smaller in magnitude and closer to 0.

The sem command estimates the correlation between the latent variables and takes the lack of reliability of the indicators into account.

This “disattenuates” the correlation, attempting to reinflate the observed correlation to account for the measurement error as informed by the intercorrelations between the indicators.
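The classical Spearman correction for attenuation conveys the flavor of what sem is doing. A sketch in Python; the reliabilities below are hypothetical placeholders for illustration, not estimates from these data:

```python
import math

r_obs = -0.5891              # observed correlation between composites X1, X2
rel_x1, rel_x2 = 0.85, 0.80  # HYPOTHETICAL scale reliabilities, for illustration

# Spearman's correction: divide by the geometric mean of the reliabilities
r_disattenuated = r_obs / math.sqrt(rel_x1 * rel_x2)
print(round(r_disattenuated, 3))  # -0.714: larger in magnitude than r_obs
```

The corrected value moves away from 0, in the same direction as the sem estimate of −.76 (which uses the indicator intercorrelations rather than assumed reliabilities).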


. regress X2 X1, beta

      Source |       SS       df       MS              Number of obs =     383
-------------+------------------------------           F(  1,   381) =  202.54
       Model |  2225.00881     1  2225.00881           Prob > F      =  0.0000
    Residual |  4185.45594   381  10.9854487           R-squared     =  0.3471
-------------+------------------------------           Adj R-squared =  0.3454
       Total |  6410.46475   382  16.7813213           Root MSE      =  3.3144

------------------------------------------------------------------------------
          X2 |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
          X1 |  -.3813722   .0267974   -14.23   0.000                -.5891435
       _cons |   27.90022    .660665    42.23   0.000                        .
------------------------------------------------------------------------------

© Andrew Ho, Harvard Graduate School of Education Unit 7a – Slide 22

Comparing correlations between observed and latent variables


Simple regression coefficients between observed variables will always be “attenuated”: smaller in magnitude and closer to 0 (note that the standardized coefficient, −.5891, equals the observed correlation).

In multiple regression, the bias due to measurement error will be unpredictable!

The sem command estimates the regression coefficients between the latent variables and takes the lack of reliability of the indicators into account.

This corrects the regression coefficients, attempting to account for the measurement error as informed by the intercorrelations between the indicators.

This seems important, no?
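A small simulation (a sketch, not the slides' analysis) makes the attenuation concrete: when the predictor is measured with reliability 0.5, the observed simple-regression slope shrinks toward half the true slope.

```python
import random

random.seed(1)
n = 20000
beta_true = 1.0

# True predictor and outcome
x_true = [random.gauss(0, 1) for _ in range(n)]
y = [beta_true * x + random.gauss(0, 0.5) for x in x_true]

# Observed predictor: add error with variance 1, so reliability = 1/(1+1) = 0.5
x_obs = [x + random.gauss(0, 1) for x in x_true]

def ols_slope(xs, ys):
    """Simple-regression slope: cov(x, y) / var(x)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / len(xs)
    var = sum((a - mx) ** 2 for a in xs) / len(xs)
    return cov / var

print(ols_slope(x_true, y))  # near 1.0, the true slope
print(ols_slope(x_obs, y))   # near 0.5 = beta_true * reliability
```

The expected observed slope is beta_true times the predictor's reliability, which is exactly the bias that latent variable regression in sem attempts to remove.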