1
All about variable selection in factor analysis and structural equation modeling
Yutaka Kano
Osaka University
School of Human Sciences
IMPS2001, July 15-19, 2001
Osaka, Japan
2
Today's talk
Motivation for variable selection
How SEFA (and SCoFA) works
Derivation of the statistics
Theoretical properties
What does variable selection with model fit mean?
Summary
3
Needs for variable selection
Variable selection in EFA is an important but time-consuming process
  Composite scale construction
  Reliability analysis
Variable selection in SEM should be less important but …
  Indicator selection
  Improvement of model fit
4
Recent literature
Little et al. (1999). On selecting indicators for multivariate measurement and modeling with latent variables. Psychological Methods, 4, 192-211.
Fabrigar et al. (1999). Evaluating the use of EFA in psychological research. Psychological Methods, 4, 272-299.
Kano et al. (in press, 2000, 1994).
5
Procedures for variable selection in EFA
Usual procedure
  Magnitude of communalities
  Interpretability
  Towards simple structures
Our approach
  Model fit
6
Programs for variable selection in factor analysis
Exploratory analysis
  SEFA (Stepwise variable selection in EFA)
  http://koko15.hus.osaka-u.ac.jp/~harada/sefa2001/stepwise/
Confirmatory analysis
  SCoFA (Stepwise Confirmatory FA)
  http://koko16.hus.osaka-u.ac.jp/~harada/scofa/input.html
7
Example_1
A questionnaire on perception of physical exercise
n=653, p=15, one-factor model
Data were collected by Dr. Oka (Waseda U.)
Conclusion: Remove X2, X9, X13, X14
8
Example_2
9
Example_3
10
Example_4
11
Example_5
12
Example_6
13
SCoFA: 24 psychological variables
14
$\chi^2_{231}(0.05) = 267.45$
Original Model (p=24)
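As a rough check on the critical value shown on this slide (read here as the upper 5% point of the chi-square distribution with 231 degrees of freedom, about 267.45), the sketch below uses the Wilson-Hilferty normal approximation. The function name `chi2_upper_point` is hypothetical, and the approximation is a stand-in for an exact quantile routine, not the method used by SCoFA.

```python
from statistics import NormalDist

def chi2_upper_point(df, alpha):
    """Wilson-Hilferty approximation to the upper-alpha point of chi-square(df)."""
    z = NormalDist().inv_cdf(1.0 - alpha)   # standard normal quantile
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3

# Approximate 5% critical value for 231 degrees of freedom.
critical_231 = chi2_upper_point(231, 0.05)
print(round(critical_231, 2))
```

A model chi-square larger than this value would be rejected at the 5% level.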
15
Theory of SEFA and SCoFA
Obtain estimates for a current model
Construct predicted chi-square for each one-variable-deleted model using the estimates, without tedious iterations
Take a sort of LM approach
16
Known quantities and goal_1
Current Model and Statistics
MODEL: $V(\mathbf{X}) = \Sigma(\theta)$
MLE: $\hat{\theta}$
STATISTICS: $T_0$: $H_0: V(\mathbf{X}) = \Sigma(\theta)$ vs $A: V(\mathbf{X})$ is saturated
17
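Known quantities and goal_2
What we want is
$T_2$: $H_2: V(\mathbf{X}_2) = \Sigma_{22}(\theta)$ vs $A: V(\mathbf{X}_2)$ is saturated
where
$\mathbf{X} = [X_1, X_2, \ldots, X_p]'$: observed vector (in a current model)
$X_1$: possibly inconsistent variable to be examined
$\mathbf{X}_2$: $\mathbf{X}$ with $X_1$ deleted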
18
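Basic idea
$T_0$: $H_0: V(\mathbf{X}) = \Sigma(\theta)$ vs $A: V(\mathbf{X})$ is saturated
$T_2$: $H_2: V(\mathbf{X}_2) = \Sigma_{22}(\theta)$ vs $A: V(\mathbf{X}_2)$ is saturated
$T_{2'}$: $H_{2'}: V(\mathbf{X}) = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \Sigma_{22}(\theta) \end{bmatrix}$ vs $A: V(\mathbf{X})$ is saturated
$T_{02'}$: $H_0: V(\mathbf{X}) = \Sigma(\theta)$ vs $H_{2'}: V(\mathbf{X}) = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \Sigma_{22}(\theta) \end{bmatrix}$
$T_2 = T_{2'} \overset{a}{=} T_0 - T_{02'}$
New test statistic to be used: $T_0 - T_{02'}$
We construct $T_{02'}$ as an LM test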
19
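Final formula for T2
$$T_2 = T_0 - T_{02'} = n\,\bigl(v(S_{22}) - \sigma_{22}(\hat\theta)\bigr)'\,\bigl[\Gamma_N^{-1} - \Gamma_N^{-1}\Delta_{22}\,(\Delta_{22}'\Gamma_N^{-1}\Delta_{22})^{-1}\,\Delta_{22}'\Gamma_N^{-1}\bigr]\,\bigl(v(S_{22}) - \sigma_{22}(\hat\theta)\bigr)$$
where $\sigma_{22}(\theta) = v(\Sigma_{22}(\theta))$, $\Delta_{22} = \partial\sigma_{22}(\theta)/\partial\theta'$, and $\Gamma_N$ is the normal-theory asymptotic covariance matrix of $\sqrt{n}\,v(S_{22})$, both evaluated at $\hat\theta$.
Note: This is Browne's (1982) goodness-of-fit statistic using general estimates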
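A minimal sketch of a Browne-type statistic for a one-variable-deleted one-factor model is given below, evaluated from current-model parameter values without refitting. Several things here are assumptions for illustration: the function names are hypothetical, the normal-theory weight is applied through the identity $v(A)'\Gamma_N^{-1}v(B) = \tfrac{1}{2}\mathrm{tr}(\Sigma^{-1}A\Sigma^{-1}B)$, the loadings and the error covariance of 0.1 between the first two variables are made-up numbers, and the true parameter values stand in for estimates (in practice the current-model estimates would be biased by the misfit). It is not the SEFA implementation.

```python
import numpy as np

def one_factor_sigma(lam, psi):
    """Implied covariance of a one-factor model with unit factor variance."""
    return np.outer(lam, lam) + np.diag(psi)

def one_factor_derivatives(lam):
    """Matrices dSigma/dtheta_k for theta = (lam_1..lam_p, psi_1..psi_p)."""
    p = lam.size
    eye = np.eye(p)
    d_lam = [np.outer(eye[k], lam) + np.outer(lam, eye[k]) for k in range(p)]
    d_psi = [np.outer(eye[k], eye[k]) for k in range(p)]
    return d_lam + d_psi

def browne_statistic(S, Sigma, derivs, n):
    """n * e'[W - W D (D'W D)^{-1} D'W] e with the normal-theory weight W."""
    Sigma_inv = np.linalg.inv(Sigma)

    def inner(A, B):
        # v(A)' Gamma_N^{-1} v(B) for symmetric A, B under normal theory
        return 0.5 * np.trace(Sigma_inv @ A @ Sigma_inv @ B)

    E = S - Sigma                                    # residual matrix
    g = np.array([inner(D, E) for D in derivs])      # D'W e
    H = np.array([[inner(Di, Dj) for Dj in derivs] for Di in derivs])  # D'W D
    return float(n * (inner(E, E) - g @ np.linalg.solve(H, g)))

def predicted_statistic_without(S, lam, psi, n, k):
    """Statistic for the model with variable k deleted, using current values."""
    keep = [i for i in range(lam.size) if i != k]
    S22 = S[np.ix_(keep, keep)]
    lam2, psi2 = lam[keep], psi[keep]
    return browne_statistic(S22, one_factor_sigma(lam2, psi2),
                            one_factor_derivatives(lam2), n)

# Illustrative population: one factor plus an error covariance between
# the first two variables (hypothetical numbers).
lam = np.array([0.8, 0.7, 0.6, 0.5, 0.4])
psi = 1.0 - lam ** 2
S = one_factor_sigma(lam, psi)
S[0, 1] += 0.1
S[1, 0] += 0.1
n = 653

t_without_first = predicted_statistic_without(S, lam, psi, n, 0)
t_without_last = predicted_statistic_without(S, lam, psi, n, 4)
print(t_without_first, t_without_last)
```

In this toy setting, deleting a variable involved in the error covariance gives a statistic of zero, while deleting an unrelated variable leaves the misfit in place and the statistic stays positive.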
20
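Properties_1
$$V(\mathbf{X}) = \Sigma(\theta) + \frac{1}{\sqrt{n}} \begin{bmatrix} 0 & 0 \\ 0 & D_{22} \end{bmatrix}$$
$$T_2 \xrightarrow{L} \chi^2_{df_2}(0) \quad \text{if } D_{22} = O$$
$$T_2 \xrightarrow{L} \chi^2_{df_2}(\delta),\ \delta \neq 0 \quad \text{if } D_{22} \neq O$$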
21
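Question 1
Can $T_2$ work even if $X_1$ is inconsistent?
Estimate for $\theta$ is biased.
$$V(\mathbf{X}) = \Sigma(\theta) + \frac{1}{\sqrt{n}} \begin{bmatrix} 0 & 0 \\ 0 & D_{22} \end{bmatrix}$$
$$T_2 \xrightarrow{L} \chi^2_{df_2}(0) \quad \text{if } D_{22} = O$$
$$T_2 \xrightarrow{L} \chi^2_{df_2}(\delta),\ \delta \neq 0 \quad \text{if } D_{22} \neq O$$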
22
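Properties_2
$$V(\mathbf{X}) = \Sigma(\theta) + \frac{1}{\sqrt{n}} \begin{bmatrix} d_{11} & d_{12} \\ d_{21} & D_{22} \end{bmatrix}$$
Either $d_{12} = 0$ or not, we can prove
$$T_2 \xrightarrow{L} \chi^2_{df_2}(0) \quad \text{if } D_{22} = O$$
$$T_2 \xrightarrow{L} \chi^2_{df_2}(\delta),\ \delta \neq 0 \quad \text{if } D_{22} \neq O$$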
23
Question 2
Can SEFA identify an uncorrelated variable?
Unfortunately, no
We have developed a way of testing zero communality in SEFA (see Harada-Kano, IMPS)
24
Question 3
What is the actual meaning of variable selection with model fit?
The following shows an illustrative example:
25
Answer 3_1: Example again
X2, X9, X13, X14 are to be removed
26
Answer 3_2: Example again
Best-fitting model with correlated errors
SEFA conclusion: X2, X9, X13, X14 are to be removed
27
Answer 3_3: Example again
Variables to be deleted are identified so as to break up the correlated errors
Correlated errors may cause
  Different interpretation of FA results
    Common factors considered are not enough to explain correlations between observed variables
    Such variables are not good indicators (e.g., in SEM)
  Inaccurate reliability estimates
    Green-Hershberger (2000), Raykov (2001)
    Kano-Azuma (2001, IMPS)
28
Question 4
What should one do if SEFA or SCoFA identifies a variable with a large factor loading estimate as inconsistent?
29
Answer 4_1: Reliability
If one employs the alpha coefficient or
$$\rho = \frac{\left(\sum_i \lambda_i\right)^2}{\left(\sum_i \lambda_i\right)^2 + \sum_i \psi_i}$$
(s)he has to delete it to have a good-fit model.
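The two coefficients named on this slide, the alpha coefficient and the one-factor model-based $\rho$, can be sketched in a few lines. The function names and all numeric values below are hypothetical, and the factor variance is assumed to be 1; the example only illustrates the known relation that alpha equals $\rho$ when loadings are equal and falls below it otherwise, under a one-factor model with uncorrelated errors.

```python
def cronbach_alpha(cov):
    """Cronbach's alpha from a p x p covariance matrix (list of lists)."""
    p = len(cov)
    total = sum(sum(row) for row in cov)        # variance of the sum score
    trace = sum(cov[i][i] for i in range(p))    # sum of item variances
    return p / (p - 1) * (1.0 - trace / total)

def one_factor_rho(loadings, uniquenesses):
    """Model-based reliability of the sum score: one factor with unit
    variance and uncorrelated errors."""
    common = sum(loadings) ** 2
    return common / (common + sum(uniquenesses))

def implied_cov(loadings, uniquenesses):
    """Covariance matrix implied by the one-factor model."""
    p = len(loadings)
    return [[loadings[i] * loadings[j] + (uniquenesses[i] if i == j else 0.0)
             for j in range(p)] for i in range(p)]

# Illustrative values only (not from the talk's data).
equal = ([0.7, 0.7, 0.7, 0.7], [0.51, 0.51, 0.51, 0.51])
unequal = ([0.9, 0.7, 0.5, 0.3], [0.19, 0.51, 0.75, 0.91])

alpha_equal = cronbach_alpha(implied_cov(*equal))
rho_equal = one_factor_rho(*equal)
alpha_unequal = cronbach_alpha(implied_cov(*unequal))
rho_unequal = one_factor_rho(*unequal)
print(alpha_equal, rho_equal, alpha_unequal, rho_unequal)
```

Both coefficients presuppose that the one-factor model with uncorrelated errors fits, which is why a misfitting variable has to go before they are used.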
30
Answer 4_2: Reliability
If one employs
$$\rho' = \frac{\left(\sum_i \lambda_i\right)^2}{\left(\sum_i \lambda_i\right)^2 + \sum_i \sum_j \psi_{ij}}$$
(s)he can retain it, and compare reliability between models.
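A short sketch of $\rho'$, the reliability coefficient that allows correlated errors, follows. It assumes unit factor variance and takes the full error covariance matrix, so that off-diagonal error covariances enter the denominator. The function name `rho_prime` and the numbers are hypothetical.

```python
def rho_prime(loadings, error_cov):
    """Reliability of the sum score when errors may be correlated:
    (sum of loadings)^2 over itself plus the sum of ALL elements of the
    error covariance matrix (variances and covariances)."""
    common = sum(loadings) ** 2
    error = sum(sum(row) for row in error_cov)
    return common / (common + error)

# Illustrative values only: four indicators, errors of the first two correlated.
loadings = [0.7, 0.7, 0.7, 0.7]
error_cov = [
    [0.51, 0.20, 0.00, 0.00],
    [0.20, 0.51, 0.00, 0.00],
    [0.00, 0.00, 0.51, 0.00],
    [0.00, 0.00, 0.00, 0.51],
]
# Same model with the error covariance ignored (diagonal part only).
diagonal_only = [[error_cov[i][j] if i == j else 0.0 for j in range(4)]
                 for i in range(4)]

rho_with_covariance = rho_prime(loadings, error_cov)
rho_ignoring_covariance = rho_prime(loadings, diagonal_only)
print(rho_with_covariance, rho_ignoring_covariance)
```

With a positive error covariance the coefficient that accounts for it is smaller than the one that ignores it, which is the direction of the bias discussed in the example slides.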
31
Answer 4_3: Example
ρ'  0.64
α   0.74
Bad-fitted one-factor model based ρ  0.76
32
Answer 4_4: Example
ρ'  0.64  0.63
α   0.74  0.63
33
Answer 4_5: Example
ρ'  0.60  0.63
α   0.78  0.63
34
Summary_1
A new option for variable selection was introduced, which is based on model fit.
You can easily access the programs on the internet
  SEFA (Stepwise variable selection in EFA)
  http://koko15.hus.osaka-u.ac.jp/~harada/sefa2001/stepwise/
  SCoFA (Stepwise Confirmatory FA)
  http://koko16.hus.osaka-u.ac.jp/~harada/scofa/input.html
35
Summary_2
It enjoys preferable theoretical properties
Testing null communality is important
  Uncorrelated variables cannot be identified
Variable selection with model fit can find out error correlations
Traditional reliability coefficients based on a poor-fit model have serious bias
36
Summary_3
High communality variables can be inconsistent
Whether such variables should be removed depends
Reliability has to be figured out using a nonstandard factor model
37
References
Harada, A. and Kano, Y. (2001). Variable selection and test of communality in EFA. IMPS2001, Osaka.
Kano, Y. (in press). Variable selection for structural models. Journal of Statistical Planning and Inference.
Kano, Y. and Harada, A. (2000). Stepwise variable selection in factor analysis. Psychometrika, 65, 7-22.
Kano, Y. and Ihara, M. (1994). Identification of inconsistent variates in factor analysis. Psychometrika, 59, 5-20.
38
Thank you for coming to Osaka and being at my talk
TakoYaki performance will start soon
You can understand how octopus relates to Osaka, if you see and taste it