What is Factor Analysis?
- Factor analysis examines the interrelationships among a large number of variables and then attempts to explain them in terms of their common underlying dimensions
  - Common underlying dimensions are referred to as factors
- Interdependence technique
  - No IVs or DVs
  - All variables are considered simultaneously
Why do Factor Analysis?
- Data summarization
  - Research question is to better understand the interrelationships among the variables
  - Identify latent dimensions within the data set
  - Identification and understanding of these underlying dimensions is the goal
- Data reduction
  - Discover underlying dimensions to reduce data to fewer variables so all dimensions are represented in subsequent analyses
  - Surrogate variables, aggregated scales, factor scores
- Precursor to subsequent multivariate techniques
  - Data summarization: latent dimensions become research questions answered with other MV techniques
  - Data reduction: avoid multicollinearity problems; improve reliability of aggregated scales
Assumptions
- Variables must be interrelated
  - 20 unrelated variables = 20 factors
  - Matrix must have a sufficient number of correlations
  - Some underlying factor structure
- Sample must be homogeneous
- Metric variables assumed
- Multivariate normality not required
- Sample size
  - Minimum 50, prefer 100
  - Minimum 5 observations per item, prefer 10 observations per item
Types of Factor Analysis
- Exploratory factor analysis (EFA)
  - Used to discover underlying structure
  - Principal components analysis (PCA) (Thurstone): treats individual items or measures as though they have no unique error
  - Factor analysis (common factor analysis) (Spearman): treats individual items or measures as having unique error
  - Both PCA and FA give similar answers most of the time
- Confirmatory factor analysis (CFA)
  - Used to test whether data fit a priori expectations for data structure
  - Structural equation modeling
Purpose of EFA
- EFA is a data reduction technique
  - Scientific parsimony
  - Which items are measuring virtually the same thing?
- Objective: simplification of items into a subset of concepts or measures
- Part of construct validation (what are the underlying patterns in the data?)
- EFA assesses dimensionality or homogeneity
- Issues:
  - Use principal components analysis (PCA) or factor analysis (FA)?
  - How many factors?
  - What type of rotation?
  - How to interpret? (loadings, cross-loadings)
Types of EFA
- Principal components analysis
  - A composite of the observed variables serves as a summary of those variables
  - Assumes no error in items
  - No assumption of an underlying construct
  - Often used in the physical sciences
  - Precise mathematical solutions possible
  - Unity (1.0) inserted on the diagonal of the matrix
- Factor (or common factor) analysis
  - In SPSS known as principal axis factoring
  - Explains relationships between observed variables in terms of latent variables or factors
  - A factor is a hypothesized construct
  - Assumes error in items
  - Precise math not possible; solved by iteration
  - Communalities (shared variance) on the diagonal
Basic Logic of EFA
- Start with the items you want to reduce
- Create a mathematical combination of variables that maximizes the variance you can predict in all variables: the first principal component or factor
- Form a new combination of items from the residual variance that maximizes the variance you can predict in what is left: the second principal component or factor
- Continue until all variance is accounted for; select the minimal number of factors that captures the most variance
- Interpret the factors; the rotated matrix and loadings are more interpretable
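The extraction logic above can be sketched as repeated eigendecomposition of the correlation matrix, peeling off the component that explains the most remaining variance at each step. This is an illustrative sketch, not any particular package's implementation:

```python
import numpy as np

def extract_components(R, n_components):
    """Sequentially extract principal components from a correlation matrix R.
    Each component is the linear combination explaining the most remaining
    variance; its loadings are eigenvector * sqrt(eigenvalue)."""
    loadings = []
    residual = R.copy()
    for _ in range(n_components):
        evals, evecs = np.linalg.eigh(residual)
        i = np.argmax(evals)                    # largest remaining eigenvalue
        l = evecs[:, i] * np.sqrt(evals[i])     # loadings for this component
        loadings.append(l)
        residual = residual - np.outer(l, l)    # remove the explained variance
    return np.column_stack(loadings)
```

Extracting all k components reproduces R exactly, which is why the procedure "continues until all variance is accounted for"; stopping early keeps only the components that explain the most variance.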
Concepts and Terms
PCA starts with a data matrix of N persons arranged in rows and k measures arranged in columns:

              Measures
  Persons   A   B   C   D ... k
     1
     2
     3
     .
     .
     N

N persons, k different measures. The objective is to explain the data in fewer than the total number of items. PCA is a method to transform the original set of variables into a new set of principal components that are unrelated to each other.
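That transformation can be demonstrated in a few lines; the data here are made up purely for illustration:

```python
import numpy as np

# Illustrative sketch: transform correlated measures into uncorrelated
# principal components. N, k, and the data are made up for the demo.
rng = np.random.default_rng(0)
N, k = 200, 4
X = rng.standard_normal((N, k))
X[:, 1] += 0.8 * X[:, 0]                       # induce correlation between measures
X = (X - X.mean(axis=0)) / X.std(axis=0)       # standardize

R = np.corrcoef(X, rowvar=False)               # k x k correlation matrix
evals, evecs = np.linalg.eigh(R)
scores = X @ evecs                             # component scores per person

C = np.cov(scores, rowvar=False)               # off-diagonals ~0: components unrelated
```

The covariance matrix of the component scores is diagonal: the new variables are unrelated to each other, which is the defining property of principal components.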
Concepts and Terms
- Factor: a linear composite; a way of turning multiple measures into one thing
- Factor score: a measure of one person's score on a given factor
- Factor loadings: correlations of a factor with the items. Variables with high loadings are the distinguishing features of the factor
- Communality (h²): variance in a given item accounted for by all factors; the sum of squared factor loadings in a row of the factor analysis results. These are placed on the diagonal in common factor analysis
- Factorially pure: a test that loads on only one factor
- Scale score: a score for an individual obtained by adding together the items making up a factor
The Process
- Because we are trying to reduce the data, we don't want as many factors as items
- Because each new component or factor is the best linear combination of residual variance, the data can be explained relatively well with many fewer factors than the original number of items
- When to stop taking additional factors is a difficult decision. Primary methods:
  - Scree rule
  - Kaiser criterion (eigenvalues > 1)
How Many Factors?
- Scree plot (Cattell) - not a test
  - Look for the bend in the plot
  - Include the factor located right at the bend point
- Kaiser (or latent root) criterion
  - Eigenvalues greater than 1
  - Also, 1 is the amount of variance accounted for by a single item (r² = 1.00). If an eigenvalue < 1.00, the factor accounts for less variance than a single item
  - Tinsley & Tinsley: the Kaiser criterion can underestimate the number of factors
- A priori hypothesized number of factors
- Percent of variance criterion
- Parallel analysis - eigenvalues higher than expected by chance
- Use these criteria plus theory to make the determination
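Parallel analysis can be sketched directly from its definition: compare the observed eigenvalues with eigenvalues from random data of the same N x k shape, and retain factors whose eigenvalue exceeds the chance benchmark. The simulation count and percentile below are common but illustrative choices:

```python
import numpy as np

def parallel_analysis(X, n_sims=100, percentile=95, seed=0):
    """Retain components whose eigenvalue exceeds the chosen percentile
    of eigenvalues from random normal data of the same N x k shape."""
    rng = np.random.default_rng(seed)
    N, k = X.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    sims = np.empty((n_sims, k))
    for s in range(n_sims):
        Z = rng.standard_normal((N, k))
        sims[s] = np.sort(np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False)))[::-1]
    threshold = np.percentile(sims, percentile, axis=0)
    return int(np.sum(obs > threshold)), obs, threshold
```

On data with one strong common factor this retains a single component, where the Kaiser criterion might retain chance eigenvalues that drift just above 1.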
Example
R matrix (correlation matrix):

        BlPr  LSat  Chol  LStr  BdWt  JSat  JStr
BlPr    1.00
LSat    -.18  1.00
Chol     .65  -.17  1.00
LStr     .15  -.45   .22  1.00
BdWt     .45  -.11   .52   .16  1.00
JSat    -.21   .85  -.12  -.35  -.05  1.00
JStr     .19  -.21   .02   .79   .19  -.35  1.00
Principal Components Analysis (PCA)

Initial Statistics:
Variable  Communality  *  Factor  Eigenvalue  %Var   Cum%
BLPR      1.00000      *  1       2.85034     40.7   40.7
LSAT      1.00000      *  2       1.74438     24.9   65.6
CHOL      1.00000      *  3       1.16388     16.6   82.3
LSTR      1.00000      *  4        .56098      8.0   90.3
BDWT      1.00000      *  5        .44201      6.3   96.6
JSAT      1.00000      *  6        .20235      2.9   99.5
JSTR      1.00000      *  7        .03607       .5  100.0
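The eigenvalues in this SPSS-style output can be recomputed directly from the R matrix above; a sketch (values should match the table up to the rounding of the printed correlations):

```python
import numpy as np

# Correlation matrix from the example (lower triangle mirrored).
R = np.array([
    [1.00, -.18,  .65,  .15,  .45, -.21,  .19],
    [-.18, 1.00, -.17, -.45, -.11,  .85, -.21],
    [ .65, -.17, 1.00,  .22,  .52, -.12,  .02],
    [ .15, -.45,  .22, 1.00,  .16, -.35,  .79],
    [ .45, -.11,  .52,  .16, 1.00, -.05,  .19],
    [-.21,  .85, -.12, -.35, -.05, 1.00, -.35],
    [ .19, -.21,  .02,  .79,  .19, -.35, 1.00],
])

evals = np.sort(np.linalg.eigvalsh(R))[::-1]   # descending eigenvalues
pct = 100 * evals / evals.sum()                # % variance per component
# The eigenvalues sum to the trace of R (= 7, the number of variables),
# and three of them exceed 1, so the Kaiser criterion retains three.
```

Note that the eigenvalues and % variance are identical in the initial PCA output regardless of how many factors are later retained; only the communality column changes.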
Example (continued)
Factor Matrix (Unrotated):

         Factor 1  Factor 2  Factor 3  ...  Fac7
LSTR      .73738   -.32677    .47575
LSAT     -.71287    .38426    .52039
JSAT     -.70452    .42559    .48553
JSTR      .64541   -.32867    .62912
CHOL      .54945    .68694   -.10453
BDWT      .48867    .60471    .13043
BLPR      .58722    .60269   -.08534

Eigenvalue  2.85034   1.74438   1.16388
Example
Final Statistics:
Variable  Communality  *  Factor  Eigenvalue  %Var  Cum%
BLPR      .71533       *  1       2.85034     40.7  40.7
LSAT      .92665       *  2       1.74438     24.9  65.6
CHOL      .78470       *  3       1.16388     16.6  82.3
LSTR      .87684       *
BDWT      .62149       *
JSAT      .91321       *
JSTR      .92037       *
VARIMAX Rotated Factor Matrix:

        Factor 1  Factor 2  Factor 3     h²
CHOL     .87987   -.10246   -.00574   .78470
BLPR     .83043   -.14875    .05988   .71533
BDWT     .76940    .05630    .16234   .62149
LSAT    -.09806    .94430   -.15917   .92665
JSAT    -.05790    .93376   -.19479   .91321
JSTR     .06542   -.10717    .95110   .92036
LSTR     .12381   -.26465    .88965   .87684

Eigenvalue  2.0883   1.8809   1.7893
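A rotation like the one above can be sketched with the standard SVD-based formulation of the varimax algorithm; this is an illustrative implementation, not SPSS's own code:

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix L (items x factors) to
    maximize the variance of the squared loadings within each column."""
    p, k = L.shape
    T = np.eye(k)                # accumulated rotation matrix
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        # Gradient of the varimax criterion, projected back onto
        # the set of orthogonal rotations via SVD.
        G = L.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(G)
        T = u @ vt
        if s.sum() < d * (1 + tol):
            break
        d = s.sum()
    return L @ T
```

Because the rotation is orthogonal, each item's communality (the row sum of squared loadings) is unchanged; the rotation only redistributes variance across the factors.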
Scree Plot

[Scree plot: eigenvalues (y-axis, 0.0 to 3.5) plotted against factor number (x-axis, 1 to 7); the curve drops steeply and then flattens - look for the bend.]

"Scree" comes from a word for the loose rock and debris at the base of a cliff!
Information from EFA

                 FACTOR
Msr       F1     F2     F3     h²
a        .60   -.06    .02    .36
b        .81    .12   -.03    .67
c        .77    .03    .08    .60
d        .01    .65   -.04    .42
e        .03    .80    .07    .65
f        .12    .67   -.05    .47
g        .19   -.02    .68    .50
h        .08   -.10    .53    .30
i        .26   -.13    .47    .31
Sum Sq Ldng  1.76   1.56    .98    Total
% Variance   .195   .173   .109    47.7%
           (1.76/9) (1.56/9) (.98/9)

- A factor loading is the correlation between a factor and an item
- When factors are orthogonal, squared factor loadings give the amount of variance in one variable explained by that factor (F1 explains 36% of the variance in Msr a; F3 explains 46% of the variance in Msr g)
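The summary rows of this table follow mechanically from the loadings; a sketch that recomputes them (small discrepancies come from the loadings being printed to two decimals):

```python
import numpy as np

# Loadings for measures a..i on factors F1..F3, from the table above.
L = np.array([
    [ .60, -.06,  .02],
    [ .81,  .12, -.03],
    [ .77,  .03,  .08],
    [ .01,  .65, -.04],
    [ .03,  .80,  .07],
    [ .12,  .67, -.05],
    [ .19, -.02,  .68],
    [ .08, -.10,  .53],
    [ .26, -.13,  .47],
])

h2 = (L**2).sum(axis=1)     # communalities: squared loadings across each row
ssq = (L**2).sum(axis=0)    # sums of squared loadings down each column
pct = ssq / L.shape[0]      # % variance per factor (divide by k = 9 items)
total = h2.mean()           # average communality = total variance explained
```

The same row/column sums underlie every table in this section: rows give communalities, columns give the (rotated) factor variances, and their grand total gives the percent of variance explained.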
Information from EFA (continued)
- Eigenvalue: the sum of squared loadings down a column (associated with a factor); the total variance in all variables explained by one factor. Factors with eigenvalues less than 1 predict less than the variance of 1 item
- Communality (h²): variance in a given item accounted for by all factors; the sum of squared loadings across a row. It will equal 1 if you retain all possible factors
Information from EFA (continued)
- Average of all communalities (Σh² / k) = proportion of variance in all variables explained by all factors
- If all variables were reproduced perfectly by the factors, the correlation between two original variables would equal the sum of the products of their factor loadings. When reproduction is not perfect, this gives an estimate of the correlation
  - e.g., r_ab ≈ (.60 × .81) + (-.06 × .12) + (.02 × -.03) ≈ .48
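With orthogonal factors, the whole reproduced correlation matrix is just L Lᵀ (with communalities on the diagonal); a sketch of the r_ab example:

```python
import numpy as np

# Loadings for measures a and b from the table above.
a = np.array([.60, -.06,  .02])
b = np.array([.81,  .12, -.03])

r_ab = a @ b   # reproduced correlation: sum of products of loadings
# (.60)(.81) + (-.06)(.12) + (.02)(-.03) = .4782, i.e. about .48
```

Comparing reproduced correlations like this one with the original R matrix is a standard check on how well the retained factors account for the data.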
Information from EFA (continued)
- 1 - h² is the uniqueness of an item: the variance of an item not shared with other items. Unique variance could be random error or systematic
- The factor matrix above is shown after rotation. Eigenvalues are computed on the unrotated and unreduced factor loading matrix, because we are interested in the total variance accounted for in the data. The eigenvalues and % variance accounted for in SPSS are not reordered after rotation
Important Properties of PCA
- Each factor in turn maximizes the variance explained from an R matrix
- For any number of factors obtained, PCs maximize the variance explained
- The amount of variance explained by each PC equals the corresponding characteristic root (eigenvalue)
- All characteristic roots of PCs are positive
- The number of PCs derived equals the number of factors needed to explain all the variance in R
- The sum of the characteristic roots equals the sum of the diagonal elements of R
Rotations
- All original PC and PF solutions are orthogonal
- Once you obtain the minimal number of factors, you have to interpret them
- Interpreting original solutions is difficult; rotation aids interpretation
- You are looking for simple structure
  - Component loadings should be very high for a few variables and near 0 for the remaining variables
  - Each variable should load highly on only 1 component

       Unrotated Matrix    Rotated Matrix
Var      F1     F2           F1     F2
a       .75    .63          .14    .95
b       .69    .57          .14    .90
c       .80    .49          .18    .92
d       .85   -.42          .94    .09
e       .76   -.42          .92    .07
Rotation
- After rotation, the variance accounted for by a factor is spread out. The first factor no longer accounts for the maximum variance possible; the others get more variance. The total variance accounted for is the same
- Two types of rotation:
  - Orthogonal (factors uncorrelated)
  - Oblique (factors correlated)
Rotation
- Orthogonal rotation (rigid, 90 degrees): PCs or PFs remain uncorrelated after transformation
  - Varimax: simplifies column weights toward 1s and 0s. Some items load highly on a factor, the others don't load. Not appropriate if you expect a single factor
  - Quartimax: simplifies toward 1s and 0s in a row. An item loads high on 1 factor, almost 0 on the others. Appropriate if you expect a single general factor
  - Equimax: a compromise of the varimax and quartimax rotations
  - In practice, the choice of rotation makes little difference
Rotation
- Oblique or correlated components (less or more than 90 degrees): accounts for the same % of variance, but the factors are correlated
  - Some say not meaningful with PCA
  - Many factors are theoretically related, so the rotation method should not "force" orthogonality
  - Allows the loadings to more closely match simple structure; correlated solutions will get you closer to simple structure
  - Oblimin (Kaiser) and promax are good
  - Provides a structure matrix of loadings and a pattern matrix of partial weights - which to interpret?
Orthogonal Rotation

       Unrotated Matrix    Rotated Matrix
Var      F1     F2           F1     F2
a       .75    .63          .14    .95
b       .69    .57          .14    .90
c       .80    .49          .18    .92
d       .85   -.42          .94    .09
e       .76   -.42          .92    .07

[Plot: variables a-e in the F1/F2 factor space (axes from -1.00 to 1.00); the rotated axes RF1 and RF2 pass through the two clusters of points while staying at 90 degrees.]
Simple Structure (Thurstone)
(1) Each row of the factor matrix should have at least one 0 loading
(2) The number of items with 0 loadings equals the number of factors; each column has 1 or more 0 loadings
(3) Items load highly on one factor or the other
(4) If there are more than 4 factors, a large portion of items should have zero loadings
(5) For every pair of columns, there should be few cross-loadings
(6) Few if any negative loadings
Simple Structure
           Factor
Msr     1    2    3
a       x    0    0
b       x    0    0
c       x    0    0
d       0    x    0
e       0    x    0
f       0    x    0
g       0    0    x
h       0    0    x
i       0    0    x
j       0    0    x
Oblique Rotation
Example:

       Unrotated Matrix    Rotated Matrix
Var      F1     F2           F1     F2
a       .75    .63          .04    .98
b       .69    .57          .02    .99
c       .80    .49          .01    .97
d       .85   -.42          .99    .01
e       .76   -.42          .98    .02

[Plot: the same variables a-e in the F1/F2 space (axes from -1.00 to 1.00); the rotated axes RF1 and RF2 are allowed to be correlated, so they pass through the clusters at less than 90 degrees.]
Orthogonal or Oblique Rotation?
- Nunnally suggests using orthogonal as opposed to oblique rotations
  - Orthogonal is simpler
  - Leads to the same conclusions
  - Oblique can be misleading
- Ford et al. suggest using oblique unless the orthogonality assumption is tenable
Interpretation
- Factors are usually interpreted by observing which variables load highest on each factor
  - Set a priori criteria for loadings (minimum .3 or higher)
- Name the factor. Always provide the factor loading matrix in the study
- Cross-loadings are problematic
  - Set a priori criteria for a "large" cross-loading
  - Decide a priori what you will do about them
- Factor loadings or summated scales are used to define the new scale. You can go back to the correlation matrix rather than relying only on factor loadings; loadings can be inflated
PCA and FA
- PCA: no constructs of theoretical meaning assumed; a simple mechanical linear combination (1s in the diagonal of R)
- FA: assumes underlying latent constructs and allows for measurement error (communalities in the diagonal of R)
  - Also called PAF or common factor analysis
- PCA uses all the variance; FA uses ONLY shared variance
- In FA you can have indeterminate (unsolvable) solutions. You have to iterate (the computer makes its best "guess") to get the solutions
FA
- Also known as principal axis factoring or common factor analysis
- Steps:
  - Estimate the communalities of the variables (shared variance)
  - Substitute the communalities in place of the 1s on the diagonal of R
  - Perform a principal components analysis on the reduced matrix
  - Iterated FA:
    - Estimate h²
    - Solve for the factor model
    - Calculate new communalities
    - Substitute the new estimates of h² into the matrix and redo
    - Iterate until the communalities don't change much
  - Rotate for interpretation
Estimating Communalities
- Highest correlation of a given variable with the other variables in the data set
- Squared multiple correlation (SMC) of each variable predicted by all other variables in the data set
- Reliability of the variable
- Because you are estimating, and the factors are no longer combinations of actual variables, you can get funny results:
  - Communalities > 1.00
  - Negative eigenvalues
  - Negative uniqueness
Example FA
R matrix (correlation matrix with h² on the diagonal):

        BlPr  LSat  Chol  LStr  BdWt  JSat  JStr
BlPr     .54
LSat    -.18   .89
Chol     .65  -.17   .67
LStr     .15  -.45   .22   .87
BdWt     .45  -.11   .52   .16   .41
JSat    -.21   .85  -.12  -.35  -.05   .86
JStr     .19  -.21   .02   .79   .19  -.35   .87
Principal Axis Factoring (PAF)

Initial Statistics:
Variable  Communality  *  Factor  Eigenvalue  %Var   Cum%
BLPR      .53859       *  1       2.85034     40.7   40.7
LSAT      .88573       *  2       1.74438     24.9   65.6
CHOL      .66685       *  3       1.16388     16.6   82.3
LSTR      .87187       *  4        .56098      8.0   90.3
BDWT      .41804       *  5        .44201      6.3   96.6
JSAT      .86448       *  6        .20235      2.9   99.5
JSTR      .86966       *  7        .03607       .5  100.0
FA
Principal Axis Factoring (PAF)
Factor Matrix (Unrotated):

         Factor 1  Factor 2  Factor 3
LSAT     -.75885    .31104    .54455
LSTR      .70084   -.20961    .36388
JSAT     -.70038    .31502    .39982
JSTR      .68459   -.29044    .66213
CHOL      .48158    .74399   -.07267
BLPR      .48010    .56066   -.02253
BDWT      .36699    .47668    .08381
FA: Principal Axis Factoring (PAF)

Final Statistics:

Variable  Communality  *  Factor  Eigenvalue  %Var  Cum%
BLPR        .54535     *    1       2.62331   37.5  37.5
LSAT        .96913     *    2       1.41936   20.3  57.8
CHOL        .79071     *    3       1.04004   14.9  72.6
LSTR        .66752     *
BDWT        .36893     *
JSAT        .74962     *
JSTR        .99144     *
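The difference between the Initial and Final communalities reflects the core PAF idea: replace the 1s on the diagonal of R with communality estimates and re-extract until the estimates stabilize. A rough sketch of that loop, assuming the Example slide's R matrix and squared multiple correlations as starting values; `paf` is my name for it, and this simplified version is not SPSS's exact implementation:

```python
import numpy as np

def paf(R, n_factors, iters=50):
    """Simplified principal-axis factoring: iterate communalities on the diagonal."""
    Rw = R.copy()
    # Initial communality estimate: squared multiple correlation, 1 - 1/diag(R^-1)
    h2 = 1 - 1 / np.diag(np.linalg.inv(R))
    for _ in range(iters):
        np.fill_diagonal(Rw, h2)                  # reduced correlation matrix
        w, V = np.linalg.eigh(Rw)
        idx = np.argsort(w)[::-1][:n_factors]     # keep the largest eigenvalues
        L = V[:, idx] * np.sqrt(np.clip(w[idx], 0, None))
        h2 = (L**2).sum(axis=1)                   # updated communalities
    return L, h2

# Correlation matrix from the Example slide (BlPr, LSat, Chol, LStr, BdWt, JSat, JStr)
R = np.array([
    [ 1.00, -0.18,  0.65,  0.15,  0.45, -0.21,  0.19],
    [-0.18,  1.00, -0.17, -0.45, -0.11,  0.85, -0.21],
    [ 0.65, -0.17,  1.00,  0.22,  0.52, -0.12,  0.02],
    [ 0.15, -0.45,  0.22,  1.00,  0.16, -0.35,  0.79],
    [ 0.45, -0.11,  0.52,  0.16,  1.00, -0.05,  0.19],
    [-0.21,  0.85, -0.12, -0.35, -0.05,  1.00, -0.35],
    [ 0.19, -0.21,  0.02,  0.79,  0.19, -0.35,  1.00],
])
loadings, h2 = paf(R, 3)
```

Because the diagonal holds estimated communalities rather than 1s, the extracted eigenvalues and loadings differ from the PCA-style Initial Statistics above.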
Rotated Factor Matrix (VARIMAX):

       Factor 1  Factor 2  Factor 3
LSAT    .96846   -.10483   -.14223
JSAT    .83532   -.07092   -.21643
CHOL   -.08425    .88520   -.00547
BLPR   -.11739    .72364    .08898
BDWT   -.00430    .59379    .12778
JSTR   -.10474    .07011    .98770
LSTR   -.28514    .15273    .75026
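VARIMAX itself is a short algorithm: find an orthogonal rotation of the loading matrix that concentrates each variable's loading on one factor. A sketch of the standard SVD-based version in numpy, applied to the unrotated loadings above; the signs and row order of the result may differ from the SPSS table:

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-6):
    """Varimax rotation of a loading matrix L (p variables x k factors)."""
    p, k = L.shape
    T = np.eye(k)          # accumulated orthogonal rotation
    d_old = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        # SVD of the gradient of the varimax criterion
        u, s, vt = np.linalg.svd(
            L.T @ (Lr**3 - (gamma / p) * Lr @ np.diag((Lr**2).sum(axis=0)))
        )
        T = u @ vt
        d = s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
        d_old = d
    return L @ T

# Unrotated loadings from the slide (LSAT, LSTR, JSAT, JSTR, CHOL, BLPR, BDWT)
L = np.array([
    [-0.75885,  0.31104,  0.54455],
    [ 0.70084, -0.20961,  0.36388],
    [-0.70038,  0.31502,  0.39982],
    [ 0.68459, -0.29044,  0.66213],
    [ 0.48158,  0.74399, -0.07267],
    [ 0.48010,  0.56066, -0.02253],
    [ 0.36699,  0.47668,  0.08381],
])
Lr = varimax(L)
```

Because the rotation is orthogonal, each variable's communality (its row sum of squared loadings) is unchanged by VARIMAX.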
Logic of FA

[Path diagram: observed variables BlPr, LSat, Chol, LStr, BdWt, JSat, JStr with unknown factors above them]

How many? What are the factors?

What we found:

[Path diagram: the same seven variables grouped under the three extracted factors]
PCA vs. FA

Pros & cons:
– Pro PCA: has solvable equations. "The math is right."
– Con PCA: lumps garbage together; no underlying concepts.
– Pro FA: considers the role of measurement error; gets at concepts.
– Con FA: mathematical gymnastics.

Practically, there is usually not much difference:
– PCA will tend to converge more consistently
– FA is more meaningful conceptually
PCA vs. FA

Situations where you might want to use FA:
– Where there are 12 or fewer variables (the diagonal will have a large impact)
– Where the correlations between the variables are small (again, the diagonal will have a large impact)

If you have a clear factor structure, the choice won't make much difference.

Otherwise:
– PCA will tend to overfactor
– If you are doing exploratory analysis, you may not mind overfactoring
Using FA Results

Single surrogate measure – choose a single item with a high loading to represent the factor

Summated scale*
– Form a composite from items loading on the same factor
– Average all items that load on a factor (unit weighting)
– Calculate the alpha for reliability
– Name the scale/construct

Factor scores
– Composite measures for each factor, computed for each subject
– Based on all factor loadings for all items
– Not easily replicated
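The summated-scale steps above can be sketched directly: average the items that load on one factor, then compute Cronbach's alpha for the composite. The data here are simulated purely for illustration (the two "satisfaction" indicators and their loadings are hypothetical, not the slide deck's data set):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n subjects x k items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: two indicators driven by one latent factor
rng = np.random.default_rng(0)
f = rng.normal(size=200)                         # latent factor score per subject
lsat = f + 0.5 * rng.normal(size=200)            # two noisy indicators of it
jsat = f + 0.5 * rng.normal(size=200)

scale = (lsat + jsat) / 2                        # summated scale (unit weighting)
alpha = cronbach_alpha(np.column_stack([lsat, jsat]))
```

Unit weighting (a plain average) keeps the scale easy to replicate in new samples, which is the usual argument for summated scales over factor scores.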
Reporting

If you create a factor-based scale, describe the process.

For a factor analytic study, report:
– Theoretical rationale for EFA
– Detailed description of subjects and items, including descriptive stats
– Correlation matrix
– Methods used (PCA/FA, communality estimates, factor extraction, rotation)
– Criteria employed for the number of factors and meaningful loadings
– Factor matrix (aka pattern matrix)
Confirmatory Factor Analysis

Part of the construct validation process (do the data conform to expectations regarding the underlying patterns?)

Use SEM packages to perform CFA. An EFA with a specified number of factors as the criterion is NOT a CFA.

Basically, start with a correlation matrix and the expected relationships, then look at whether the expected relationships can reproduce the correlation matrix well.

Tested with a chi-square goodness-of-fit test: if significant, the data don't fit the expected structure, so there is no confirmation. Alternative measures of fit are available.
Logic of CFA

Let's say I believe:

[Path diagram: BlPr, LSat, Chol, LStr, BdWt, JSat, JStr loading on three factors: Phys Hlth, Life Happ, Job Happ]

But the reality is:

[Path diagram: the same seven variables loading on Phys Hlth, Stress, Satisfact]

The data won't confirm the expected structure.
Example

R matrix (correlation matrix):

        BlPr   LSat   Chol   LStr   BdWt   JSat   JStr
BlPr    1.00
LSat    -.18   1.00
Chol     .65   -.17   1.00
LStr     .15   -.45    .22   1.00
BdWt     .45   -.11    .52    .16   1.00
JSat    -.21    .85   -.12   -.35   -.05   1.00
JStr     .19   -.21    .02    .79    .19   -.35   1.00
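The "can the expected relationships reproduce R?" question can be made concrete by reconstructing R from a small number of factors and inspecting the residual correlations. This unrestricted three-component sketch conveys only the flavor of the idea: a real CFA fixes specific loadings to zero and tests fit (e.g., with chi-square) in an SEM package:

```python
import numpy as np

# R matrix from the Example slide (BlPr, LSat, Chol, LStr, BdWt, JSat, JStr)
R = np.array([
    [ 1.00, -0.18,  0.65,  0.15,  0.45, -0.21,  0.19],
    [-0.18,  1.00, -0.17, -0.45, -0.11,  0.85, -0.21],
    [ 0.65, -0.17,  1.00,  0.22,  0.52, -0.12,  0.02],
    [ 0.15, -0.45,  0.22,  1.00,  0.16, -0.35,  0.79],
    [ 0.45, -0.11,  0.52,  0.16,  1.00, -0.05,  0.19],
    [-0.21,  0.85, -0.12, -0.35, -0.05,  1.00, -0.35],
    [ 0.19, -0.21,  0.02,  0.79,  0.19, -0.35,  1.00],
])

# Three-component loading matrix from the eigendecomposition of R
w, V = np.linalg.eigh(R)
idx = np.argsort(w)[::-1][:3]
L = V[:, idx] * np.sqrt(w[idx])     # 7 x 3 loadings

# Reproduced correlations and residuals; good fit means small off-diagonals
R_hat = L @ L.T
resid = R - R_hat
off = resid[~np.eye(7, dtype=bool)]
print("max |residual| off-diagonal:", np.abs(off).max())
```

Large residual correlations flag pairs of variables the hypothesized structure fails to explain, which is the intuition behind the chi-square and residual-based fit measures.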
Do the data fit?

[Path diagram: BlPr, LSat, Chol, LStr, BdWt, JSat, JStr loading on the hypothesized factors Phys Hlth, Life Happ, Job Happ]