introduction to multivariate analysis of variance, factor analysis, and logistic regression
Post on 18-Mar-2016
Introduction to Multivariate Analysis of Variance, Factor Analysis, and Logistic Regression
  Rubab G. ARIM, MA
  University of British Columbia
  December 2006
  rubab@interchange.ubc.ca

Topics
  Multivariate Analysis of Variance (MANOVA)
  Factor Analysis
    – Principal Component Analysis
  Logistic Regression

MANOVA
  Extension of ANOVA
  More than one dependent variable (DV)
    – Conceptual reason
    – Statistically related
  Compares the groups and tells whether there are group mean differences on the combination of the DVs

Why not just conduct a series of ANOVAs?
  Risk of an inflated Type 1 error:
    – The more analyses you run, the more likely you are to find a significant result, even if in reality there are no differences between groups.
  If you choose to do so:
    – Bonferroni adjustment: divide your alpha value (.05) by the number of tests that you intend to perform

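The inflation argument and the Bonferroni adjustment can be illustrated with a little arithmetic. The sketch below assumes the separate tests are independent (a simplification, since DVs in practice are usually correlated) and uses a hypothetical count of three ANOVAs; the function names are illustrative only.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha after a Bonferroni adjustment."""
    return alpha / n_tests


n_tests = 3  # hypothetical: one ANOVA per DV
inflated = familywise_error_rate(0.05, n_tests)
adjusted = bonferroni_alpha(0.05, n_tests)
controlled = familywise_error_rate(adjusted, n_tests)

print(f"Familywise error rate without adjustment: {inflated:.4f}")
print(f"Bonferroni-adjusted alpha per test:       {adjusted:.4f}")
print(f"Familywise error rate after adjustment:   {controlled:.4f}")
```

Under these assumptions, three unadjusted tests at .05 carry roughly a 14% chance of at least one false positive, while the adjusted per-test alpha keeps the overall rate just under .05.
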
MANOVA: Pros and Cons
  MANOVA prevents the inflation of Type 1 error
  Controls for correlation among a set of DVs by combining them
  However,
    – A complex set of procedures
    – Additional assumptions required

Example
  Research Question:
    Do adolescent boys and girls differ in their problem behaviors?
  What you need
    – One categorical IV (i.e., gender)
    – Two or more continuous DVs (e.g., depression, aggression, etc.)

Example (cont’)
  What MANOVA does
    – Tests the null hypothesis that the population means on a set of DVs do not vary across different levels of a grouping variable
  Assumptions
    – Sample size, normality, outliers, linearity, multicollinearity, homogeneity of variance-covariance matrices

Interpretation of the output
  Descriptive Statistics
    – Check N values (more subjects in each cell than the number of DVs)
  Box’s Test
    – Checking the assumption of homogeneity of variance-covariance matrices
  Levene’s Test
    – Checking the assumption of equality of variance

Interpretation (cont’)
  Multivariate tests
    – Wilks’ Lambda (most commonly used)
    – Pillai’s Trace (most robust)
    (see Tabachnick & Fidell, 2007)
  Tests of between-subjects effects
    – Use a Bonferroni Adjustment
    – Check Sig. column

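To make Wilks' Lambda and Pillai's Trace less of a black box, here is a minimal sketch of how both can be computed from the between-groups (H) and within-groups (E) sums-of-squares-and-cross-products matrices. The data are synthetic (randomly generated scores standing in for two problem-behavior DVs), the function name is hypothetical, and NumPy is assumed to be available. It does not reproduce the full SPSS output (no F approximation or p-values).

```python
import numpy as np


def manova_statistics(groups):
    """Return (Wilks' Lambda, Pillai's Trace) for a one-way MANOVA.

    groups: list of (n_i x p) arrays, one array per level of the IV.
    """
    all_data = np.vstack(groups)
    grand_mean = all_data.mean(axis=0)
    p = all_data.shape[1]
    H = np.zeros((p, p))  # between-groups SSCP matrix
    E = np.zeros((p, p))  # within-groups SSCP matrix
    for g in groups:
        group_mean = g.mean(axis=0)
        diff = (group_mean - grand_mean).reshape(-1, 1)
        H += g.shape[0] * (diff @ diff.T)
        centered = g - group_mean
        E += centered.T @ centered
    wilks = np.linalg.det(E) / np.linalg.det(E + H)
    pillai = np.trace(H @ np.linalg.inv(H + E))
    return wilks, pillai


# Synthetic example: 40 boys and 40 girls measured on two DVs
rng = np.random.default_rng(0)
boys = rng.normal(loc=[10.0, 12.0], scale=2.0, size=(40, 2))
girls = rng.normal(loc=[11.5, 10.5], scale=2.0, size=(40, 2))

wilks, pillai = manova_statistics([boys, girls])
print(f"Wilks' Lambda:  {wilks:.3f}")
print(f"Pillai's Trace: {pillai:.3f}")
```

Smaller values of Wilks' Lambda (and larger values of Pillai's Trace) indicate stronger group separation. With only two groups the two statistics carry the same information and sum to 1.
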
Interpretation (cont’)
  Effect size
    – Partial Eta Squared: the proportion of the variance in the DV that can be explained by the IV (see Cohen, 1988)
  Comparing group means
    – Estimated marginal means
  Follow-up analyses (see Hair et al., 1998; Weinfurt, 1995)

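As a rough illustration of the effect size named above, partial eta squared can be computed as SS_effect / (SS_effect + SS_error). The sketch below does this for a single DV and a single grouping variable (where partial eta squared equals ordinary eta squared); the scores are made up for illustration and the function name is hypothetical.

```python
def partial_eta_squared(groups):
    """SS_effect / (SS_effect + SS_error) for one DV and one grouping variable."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]
    ss_effect = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_error = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    return ss_effect / (ss_effect + ss_error)


# Hypothetical scores on one DV
boys = [4, 5, 6, 5]
girls = [7, 8, 6, 7]

eta_sq = partial_eta_squared([boys, girls])
print(f"Partial eta squared: {eta_sq:.3f}")
```

For these made-up scores SS_effect is 8 and SS_error is 4, so about two thirds of the variance in the DV is associated with group membership.
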
Factor Analysis (FA)
  Not designed to test hypotheses
  Data reduction technique
    – Asks whether the data may be reduced to a smaller set of components or factors
  Used in the development and evaluation of tests and scales

Two main approaches in FA
  Exploratory factor analysis (EFA)
    – Explore the interrelationships among a set of variables
  Confirmatory factor analysis (CFA)
    – Confirm specific hypotheses or theories concerning the structure underlying a set of variables

Principal Component Analysis (PCA)
  A technique similar to Factor Analysis in the sense that PCA also produces a smaller number of variables that account for most of the variability in the pattern of correlations
  However,
  Factor Analysis
    – Mathematical model: only the shared variance in the variables is analyzed
  Principal Component Analysis
    – All the variance in the variables is used

PCA or FA?
  If you are interested in a theoretical solution, use FA
  If you want an empirical summary of your data set, use PCA
  (see Tabachnick & Fidell, 2001)

Steps involved in PCA
  Assessment of the suitability of the data
    – Sample size (see Stevens, 1996)
    – Strength of the relationship among the items: an inspection of the correlation matrix (r > .30)
    – Bartlett’s test of sphericity (p < .05)
    – Kaiser-Meyer-Olkin (KMO): this index ranges from 0 to 1, with .6 suggested as the minimum value

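The two suitability indices above can be computed directly from a correlation matrix. The following is a sketch under stated assumptions: the 4 x 4 correlation matrix and the sample size of 200 are hypothetical, the function names are illustrative, and NumPy is assumed to be available. The KMO index compares squared correlations with squared partial correlations; Bartlett's statistic is based on the determinant of the correlation matrix.

```python
import numpy as np


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                      # partial correlations
    off_diag = ~np.eye(R.shape[0], dtype=bool)  # ignore the diagonal
    r2 = (R[off_diag] ** 2).sum()
    p2 = (partial[off_diag] ** 2).sum()
    return r2 / (r2 + p2)


def bartlett_sphericity(R, n_cases):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    p = R.shape[0]
    chi2 = -(n_cases - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df


# Hypothetical correlation matrix: four items, all correlated at .5
R = np.full((4, 4), 0.5)
np.fill_diagonal(R, 1.0)

kmo_value = kmo(R)
chi2, df = bartlett_sphericity(R, n_cases=200)
print(f"KMO = {kmo_value:.2f}")
print(f"Bartlett chi-square = {chi2:.1f} on {df} df")
# The .05 critical value of chi-square with 6 df is about 12.59,
# so a statistic above that corresponds to p < .05.
```

For this made-up matrix the KMO works out to .80, above the suggested .6 minimum.
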
Steps involved in PCA (cont’)
  Factor Extraction
    – Determine the smallest number of factors that best represent the interrelations among the set of items
    – Various techniques (e.g., principal factor analysis, maximum likelihood factoring)
    – Determine the number of factors
        Kaiser’s criterion (eigenvalue > 1)
        Scree test (plot each eigenvalue and find the point where the curve becomes horizontal)

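Kaiser's criterion can be sketched in a few lines: take the eigenvalues of the correlation matrix, sort them, and keep the components whose eigenvalue exceeds 1. The correlation matrix below is hypothetical (two pairs of items that correlate strongly within a pair and weakly across pairs), and NumPy is assumed to be available. The sorted eigenvalues are also what a scree plot would display.

```python
import numpy as np

# Hypothetical correlation matrix for four items
R = np.array([
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
])

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order
eigenvalues = np.linalg.eigvalsh(R)[::-1]
explained = eigenvalues / eigenvalues.sum()

n_retained = int((eigenvalues > 1.0).sum())  # Kaiser's criterion

for i, (ev, share) in enumerate(zip(eigenvalues, explained), start=1):
    print(f"Component {i}: eigenvalue = {ev:.2f}, variance explained = {share:.0%}")
print(f"Components retained by Kaiser's criterion: {n_retained}")
```

For this made-up matrix the eigenvalues are 1.8, 1.4, 0.4, and 0.4, so two components are retained and together they account for 80% of the total variance.
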
Steps involved in PCA (cont’)
  Factor rotation and interpretation
    – Orthogonal (uncorrelated) factor solutions: Varimax is the most common technique
    – Oblique (correlated) factor solutions: Direct Oblimin is the most common technique
    – Simple structure (Thurstone, 1947): each factor is represented by a number of strongly loading items

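For readers curious about what an orthogonal rotation does, below is a compact sketch of a widely used SVD-based varimax iteration. It is not necessarily the exact routine SPSS runs (for instance, it omits Kaiser normalization), the unrotated loadings are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (items x factors) loading matrix toward simple structure."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        grad = loadings.T @ (rotated ** 3 - rotated @ col_ss / p)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation


# Hypothetical unrotated loadings: every item loads on both components
unrotated = np.array([
    [0.67,  0.59],
    [0.67,  0.59],
    [0.67, -0.59],
    [0.67, -0.59],
])

rotated = varimax(unrotated)
print(np.round(rotated, 2))

# An orthogonal rotation leaves each item's communality unchanged
communalities_before = (unrotated ** 2).sum(axis=1)
communalities_after = (rotated ** 2).sum(axis=1)
```

Because the rotation is orthogonal, it redistributes variance among the components without changing how much of each item is explained overall, which is why the communalities before and after match.
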
Example
  Research Question:
    – What is the underlying factor structure of the Subjective Age Identity (SAI) scale?
  What you need
    – A set of correlated continuous variables (i.e., items of the SAI scale)
  What PCA does
    – Attempts to identify a small set of factors that represents the underlying relationships among a group of related variables (i.e., SAI items)

Example (cont’)
  Assumptions
    – Sample size: N of 150 or more and a ratio of at least five cases for each of the items
    – Factorability of the correlation matrix: r = .3 or greater; KMO ≥ .6; Bartlett (p < .05)
    – Linearity
    – Outliers among cases

Interpretation of the output
  Is PCA appropriate?
    – Check Correlation Matrix
    – Check KMO and Bartlett’s test
  How many factors? Eigenvalue > 1
    – Check the Total Variance Explained
    – Look at the Scree Plot

Interpretation (cont’)
  How many components are extracted?
    – Component Matrix
    – Rotated Component Matrix
  Look for the highest loading items on each of the components; this can be used to identify the nature of the underlying latent variable represented by each component

Logistic Regression
  Three types of regression
    – Bivariate
    – Multiple
    – Logistic*
  Relationships among variables (NOT mean differences)
  One DV + 2 or more predictors or explanatory variables
  *The DV is dichotomous
  *Core concept: Odds Ratio (OR)

Logistic Regression
              Program A   Program B
    Male         200         100
    Female        50         150
  For males, the odds of watching Program A are 200/100 (or 2 to 1).
  For females, the odds of watching Program A are 50/150 (or 1 to 3).
  To obtain the ratio of the odds for gender relative to Program A:
    OR = (2/1) / (1/3) = 6
  > The odds of watching Program A are six times higher for males than for females.

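The odds-ratio arithmetic from the table can be reproduced in a few lines. The counts come from the slide; the variable and function names are illustrative. The sketch also prints the plain probabilities, to show that an odds ratio of 6 is a statement about odds rather than about probabilities.

```python
# Counts from the Program A / Program B table
counts = {
    "male": {"A": 200, "B": 100},
    "female": {"A": 50, "B": 150},
}


def odds_of_a(row):
    """Odds of watching Program A rather than Program B."""
    return row["A"] / row["B"]


def probability_of_a(row):
    """Proportion of the group watching Program A."""
    return row["A"] / (row["A"] + row["B"])


odds_male = odds_of_a(counts["male"])
odds_female = odds_of_a(counts["female"])
odds_ratio = odds_male / odds_female

prob_male = probability_of_a(counts["male"])
prob_female = probability_of_a(counts["female"])

print(f"Odds (male)   = {odds_male:.2f}")
print(f"Odds (female) = {odds_female:.2f}")
print(f"Odds ratio    = {odds_ratio:.2f}")
print(f"P(A | male) = {prob_male:.2f}, P(A | female) = {prob_female:.2f}")
```

The odds ratio is 6, while the probabilities are about .67 for males and .25 for females, which is a ratio of roughly 2.7.
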
Example
  Research Question:
    Are adolescent girls more likely to have anxiety/depression?
  What you need
    – One categorical IV (i.e., gender)
    – One dichotomous DV (non-depressed = 0 and depressed = 1)

Interpretation of the output
  Nagelkerke R²
  Is the model significant?
  Wald’s Test
    – At the parameter level of inference, is the gender variable significant?

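The slides give no data for the depression example, so the sketch below reuses the Program A / Program B counts from the earlier table as stand-in data. With a single binary predictor, the logistic regression slope equals the log of the odds ratio and its standard error has a closed form, so the Wald statistic and Nagelkerke R² can be computed without fitting software. The variable names are illustrative.

```python
import math

# Stand-in 2 x 2 counts (from the Program A / Program B table)
a, b = 200, 100  # predictor = 1: outcome = 1, outcome = 0
c, d = 50, 150   # predictor = 0: outcome = 1, outcome = 0

# Slope for a single binary predictor is the log odds ratio
slope = math.log((a / b) / (c / d))
se_slope = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Wald chi-square (1 df) and its p-value via the normal distribution
wald = (slope / se_slope) ** 2
p_value = math.erfc(math.sqrt(wald / 2))


def log_likelihood(successes, failures):
    """Binomial log-likelihood evaluated at the observed proportion."""
    p = successes / (successes + failures)
    return successes * math.log(p) + failures * math.log(1 - p)


n = a + b + c + d
ll_null = log_likelihood(a + c, b + d)                    # intercept only
ll_model = log_likelihood(a, b) + log_likelihood(c, d)    # with predictor

cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
nagelkerke = cox_snell / (1 - math.exp(2 * ll_null / n))

print(f"Slope (log OR) = {slope:.3f}, odds ratio = {math.exp(slope):.2f}")
print(f"Wald chi-square = {wald:.2f}, p = {p_value:.3g}")
print(f"Nagelkerke R^2 = {nagelkerke:.3f}")
```

A Wald chi-square above 3.84 (the .05 critical value with 1 df) indicates that the predictor is significant at the parameter level; Nagelkerke R² is a pseudo R² that rescales the Cox and Snell value so that it can reach 1.
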
Selected References
  Pallant, J. (2004). SPSS survival manual: A step by step guide to data analysis using SPSS (2nd ed.). Maidenhead: Open University Press.
  Pett, M. A., Lackey, N. R., & Sullivan, J. J. (2003). Making sense of factor analysis: The use of factor analysis for instrument development in health care research. Thousand Oaks, CA: Sage.
  Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston: Allyn & Bacon.