an introduction to factor analysis ppt

AN INTRODUCTION TO EXPLORATORY FACTOR ANALYSIS: A presentation by Rakesh Kumar and Mukesh Chandra Bisht (PhD Scholar, LNIPE)

Upload: mukesh-bisht

Post on 12-Feb-2017

TRANSCRIPT

Page 1: An Introduction to Factor analysis ppt

AN INTRODUCTION TO EXPLORATORY FACTOR ANALYSIS

A Presentation by

• RAKESH KUMAR

• MUKESH CHANDRA BISHT (PhD Scholar, LNIPE)

Page 2: An Introduction to Factor analysis ppt

“When you CAN MEASURE what you are speaking about and express it in numbers, you know something about it; but when you CANNOT express it in numbers your knowledge is of a meagre and unsatisfactory kind.”

LORD KELVIN, British Scientist

Measurement is necessary.

Page 3: An Introduction to Factor analysis ppt

FIRST NOTABLE MENTION

Charles Edward Spearman is known for his seminal work on testing and measuring HUMAN INTELLIGENCE using FACTOR ANALYSIS, which he introduced in the early 1900s.

CHARLES EDWARD SPEARMAN (BRITISH PSYCHOLOGIST)

Page 4: An Introduction to Factor analysis ppt

What is a factor?

A factor is a linear combination of variables. It is a construct that is not directly observed but needs to be inferred from the input variables.

Page 5: An Introduction to Factor analysis ppt

Factor Analysis

• Variable reduction technique.

• Reduces a set of variables to a small number of latent (unobservable) factors.

• Factor analysis is a correlational method used to find and describe the underlying factors driving data values for a large set of variables.

Page 6: An Introduction to Factor analysis ppt

SIMPLE PATH DIAGRAM FOR A FACTOR ANALYSIS MODEL

• F1 and F2 are two common factors. Y1, Y2, Y3, Y4, and Y5 are observed variables, possibly 5 subtests or measures of other observations such as responses to items on a survey.

• e1, e2, e3, e4, and e5 represent residuals or unique factors, which are assumed to be uncorrelated with each other.
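
The path diagram can also be read as a set of equations. The sketch below assumes the usual common factor model; the loading symbols \lambda_{j1} and \lambda_{j2} are notation introduced here for illustration and do not appear on the slide.

```latex
Y_j = \lambda_{j1} F_1 + \lambda_{j2} F_2 + e_j, \qquad j = 1, \dots, 5,
\qquad \operatorname{Cov}(e_j, e_k) = 0 \quad \text{for } j \neq k .
```

Each observed variable is a weighted sum of the two common factors plus its own residual, and the residuals do not correlate with one another.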

Page 7: An Introduction to Factor analysis ppt

Uses of Factor Analysis

• Questionnaire construction

• Test battery construction

Page 8: An Introduction to Factor analysis ppt

Conducting Factor Analysis

1. Testing the Assumptions

2. Construction of Correlation Matrix

3. Method of Factor Analysis

4. Determination of Number of Factors

5. Rotation of Factors

6. Interpretation of Factors

Page 9: An Introduction to Factor analysis ppt

Assumptions to be fulfilled for running Factor analysis

1. No outliers in the data set.

2. Normality of the data set.

3. Adequate sample size (the KMO test is used to check this).

4. No multicollinearity or singularity among the variables.

5. Homoscedasticity between the variables is not required, because factor analysis is a linear function of the measured variables.

6. Relationships among the variables should be linear.

7. Data should be metric in nature, i.e. on an interval or ratio scale.

Page 10: An Introduction to Factor analysis ppt

Bartlett test of sphericity

It tests the null hypothesis that all the correlations between the variables are zero, that is, whether the correlation matrix is an identity matrix. If it is an identity matrix, factor analysis is inappropriate.

Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy

This test checks the adequacy of the data for running factor analysis. The value of KMO ranges from 0 to 1. The larger the value of KMO, the more adequate the sample is for running factor analysis. Kaiser recommends values greater than 0.5 as acceptable.
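
Both checks can be computed directly from a correlation matrix. The following is a minimal sketch, assuming the textbook formulas: Bartlett's statistic is -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, and KMO compares squared correlations with squared partial correlations obtained from the inverse of R. The function names and the simulated data are illustrative assumptions, not part of the presentation.

```python
import numpy as np
from scipy.stats import chi2


def bartlett_sphericity(R, n):
    """Bartlett's test that the p x p correlation matrix R is an identity matrix."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)              # ln|R|, computed stably
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    p_value = chi2.sf(statistic, df)              # upper-tail probability
    return statistic, df, p_value


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(scale, scale)       # partial correlations
    off_diag = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off_diag] ** 2)
    a2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + a2)


# Simulated data (assumption): 200 cases, 5 variables driven by one common factor.
rng = np.random.default_rng(0)
n = 200
common = rng.normal(size=(n, 1))
data = 0.8 * common + 0.6 * rng.normal(size=(n, 5))
R = np.corrcoef(data, rowvar=False)

statistic, df, p_value = bartlett_sphericity(R, n)
kmo_value = kmo(R)
print(f"Bartlett chi-square = {statistic:.3f}, df = {df}, p = {p_value:.4f}")
print(f"KMO = {kmo_value:.3f}")
```

With 11 variables the same degrees-of-freedom formula gives 11 x 10 / 2 = 55, which is the df reported in the SPSS output of the field example.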

Page 11: An Introduction to Factor analysis ppt

1. Problem formulation

2. Testing the Assumptions

3. Construction of Correlation Matrix

4. Method of Factor Analysis

5. Determination of Number of Factors

6. Rotation of Factors

7. Interpretation of Factors

Page 12: An Introduction to Factor analysis ppt

Construction of the Correlation Matrix

• Analyses the pattern of correlations between variables in the correlation matrix.

• Which variables tend to correlate highly together?

• If variables are highly correlated, it is likely that they represent the same underlying dimension.

Factor analysis pinpoints the clusters of high correlations between variables and assigns a factor to each cluster.

Page 13: An Introduction to Factor analysis ppt

Correlation Matrix

        Q1      Q2      Q3      Q4      Q5      Q6
Q1      1
Q2      .987    1
Q3      .801    .765    1
Q4      -.003   -.088   0       1
Q5      -.051   .044    .213    .968    1
Q6      -.190   -.111   .102    .789    .864    1

• Q1-3 correlate strongly with each other and hardly at all with Q4-6.

• Q4-6 correlate strongly with each other and hardly at all with Q1-3.

• Two factors!
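
The "two clusters" reading of this matrix can be checked numerically. The sketch below is only an illustration: it rebuilds the six-item matrix from the slide and compares the average absolute correlation inside each cluster with the average between clusters. The helper name `mean_abs_corr` is an assumption made for this example.

```python
import numpy as np

# Lower triangle of the six-item correlation matrix shown on the slide.
lower = [
    [1.0],
    [0.987, 1.0],
    [0.801, 0.765, 1.0],
    [-0.003, -0.088, 0.0, 1.0],
    [-0.051, 0.044, 0.213, 0.968, 1.0],
    [-0.190, -0.111, 0.102, 0.789, 0.864, 1.0],
]

# Fill a full symmetric matrix from the lower triangle.
R = np.eye(6)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r


def mean_abs_corr(R, rows, cols):
    """Mean absolute correlation between two item groups (diagonal excluded)."""
    block = np.abs(R[np.ix_(rows, cols)])
    if rows == cols:
        mask = ~np.eye(len(rows), dtype=bool)
        return block[mask].mean()
    return block.mean()


group_a, group_b = [0, 1, 2], [3, 4, 5]   # Q1-3 and Q4-6
within_a = mean_abs_corr(R, group_a, group_a)
within_b = mean_abs_corr(R, group_b, group_b)
between = mean_abs_corr(R, group_a, group_b)

print(f"within Q1-3: {within_a:.3f}")
print(f"within Q4-6: {within_b:.3f}")
print(f"between:     {between:.3f}")
```

High within-group values next to a between-group value close to zero are the pattern that suggests two separate factors.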

Page 14: An Introduction to Factor analysis ppt

Page 15: An Introduction to Factor analysis ppt

Method of Factor Analysis

(A) Principal component analysis

• Provides a unique solution, so that the original data can be reconstructed from the results.

• It looks at the total variance among the variables, that is, the unique as well as the common variance.

• In this method, the factor explaining the maximum variance is extracted first.
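
A minimal sketch of principal component extraction follows, assuming the common approach of taking the eigenvalues and eigenvectors of the correlation matrix and scaling each eigenvector by the square root of its eigenvalue to obtain loadings. The function name and the simulated data are assumptions for illustration; this is not the SPSS implementation.

```python
import numpy as np


def principal_component_loadings(data, n_components):
    """Eigenvalues (largest first) and loadings of the first n_components."""
    R = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    return eigvals, loadings


# Simulated data (assumption): six variables generated from two latent factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(300, 2))
pattern = np.array([
    [0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
    [0.0, 0.9], [0.0, 0.8], [0.0, 0.7],
])
data = factors @ pattern.T + 0.5 * rng.normal(size=(300, 6))

eigvals, loadings = principal_component_loadings(data, n_components=2)
print("eigenvalues:", np.round(eigvals, 3))
print("loadings:\n", np.round(loadings, 3))
```

The eigenvalues add up to the number of variables, which reflects the point that this method works with the total variance, and the first component is the one with the largest eigenvalue.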

Page 16: An Introduction to Factor analysis ppt

(B) Common factor analysis

Uses an estimate of the common variance among the original variables to generate the factor solution. Because of this, the number of factors will always be less than the number of original variables.

Other methods include:

Unweighted least squares, generalized least squares, maximum likelihood, principal axis factoring, alpha factoring, and image factoring.

Page 17: An Introduction to Factor analysis ppt

Variance of a variable

• Common variance: variance that a variable shares with other variables in a matrix.

• Specific variance: variance unique to the variable itself.

• Error variance: variance due to measurement error or some random, unknown source.

When searching for the factors underlying the relationships between a set of variables, we are interested in detecting and explaining the common variance.

Total variance = common variance + specific variance + error variance

Page 18: An Introduction to Factor analysis ppt

Page 19: An Introduction to Factor analysis ppt

Determination of Number of Factors

EIGENVALUE

• The eigenvalue for a given factor measures the variance in all the variables which is accounted for by that factor.

• It is the amount of variance explained by a factor. It is also called the characteristic root.

Kaiser-Guttman Criterion

This method states that the number of factors to be extracted should equal the number of factors having an eigenvalue of 1 or greater.
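
The cut-off of 1 can be motivated with a short derivation. The notation is assumed here for illustration: a_{jk} is the loading of variable j on factor k, and p is the number of standardized variables in a principal component solution on the correlation matrix.

```latex
\lambda_k = \sum_{j=1}^{p} a_{jk}^{2}, \qquad
\sum_{k=1}^{p} \lambda_k = p, \qquad
\text{proportion of variance explained by factor } k = \frac{\lambda_k}{p} .
```

Each standardized variable contributes a variance of 1, so a factor with \lambda_k < 1 explains less variance than a single original variable, which is why such factors are dropped under this criterion.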

Page 20: An Introduction to Factor analysis ppt

The Scree Plot

The scree plot gives a visual of the total variance associated with each factor.

The steep slope shows the large factors.

The gradual trailing off (scree) shows the rest of the factors, usually with eigenvalues lower than 1.

Page 21: An Introduction to Factor analysis ppt

[Figure: Scree Plot, plotted against Component Number (1 to 6)]

Take the components above the elbow

Page 22: An Introduction to Factor analysis ppt

Page 23: An Introduction to Factor analysis ppt

Rotation of Factors

• Maximizes high item loadings and minimizes low item loadings, thereby producing a more interpretable and simplified solution.

• Two common rotation techniques are orthogonal rotation and oblique rotation.

Rotation

• Orthogonal: Varimax, Quartimax, Equamax

• Oblique: Direct Oblimin, Promax
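
To make orthogonal rotation concrete, here is a hedged sketch of a raw varimax rotation using a widely used SVD-based iteration. It is applied to the unrotated loadings of the swimmers example that appears later in this deck. It omits Kaiser normalization, so the numbers need not match the SPSS rotated matrix exactly, and the sign or order of columns may differ. The function name and parameters are assumptions for illustration.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation; returns rotated loadings and the rotation matrix."""
    A = np.asarray(loadings, dtype=float)
    p, k = A.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = A @ rotation
        # Target matrix whose alignment with A defines the next orthogonal rotation.
        target = rotated ** 3 - rotated * (np.sum(rotated ** 2, axis=0) / p)
        u, s, vt = np.linalg.svd(A.T @ target)
        rotation = u @ vt                      # product of orthogonal matrices
        current = s.sum()
        if previous > 0 and current < previous * (1 + tol):
            break                              # improvement has levelled off
        previous = current
    return A @ rotation, rotation


# Unrotated component matrix from the field example (11 variables, 3 components).
unrotated = np.array([
    [0.814, -0.179, 0.020],    # Standing Broad Jump
    [-0.682, 0.587, -0.136],   # Shuttle Run
    [-0.469, 0.108, 0.808],    # Fifty Meter Dash
    [0.549, -0.694, -0.230],   # Twelve Meter run and walk
    [0.731, -0.484, 0.053],    # Anaerobic capacity
    [0.700, 0.650, 0.050],     # Weight
    [0.647, 0.663, -0.159],    # Height
    [0.762, 0.396, -0.087],    # Leg Length
    [0.863, 0.088, -0.051],    # Calf Girth
    [0.835, 0.138, 0.199],     # Thigh Girth
    [0.560, -0.082, 0.660],    # Shoulder Width
])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, each variable's communality (the row sum of squared loadings) is the same before and after rotation; only the way the variance is spread across factors changes.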

Page 24: An Introduction to Factor analysis ppt
Page 25: An Introduction to Factor analysis ppt
Page 26: An Introduction to Factor analysis ppt

Page 27: An Introduction to Factor analysis ppt

KEY TERMINOLOGIES TO KNOW

Factor Loading

• It can be defined as the correlation coefficient between the variable and the factor.

• The squared factor loading of a variable indicates the percentage of variability in that variable explained by the factor. A factor loading of 0.7 is considered to be sufficient.

Page 28: An Introduction to Factor analysis ppt

COMMUNALITY

• The communality is the amount of variance each variable in the analysis shares with other variables.

• It is the squared multiple correlation for the variable, taken as the dependent variable with the factors as predictors, and is denoted by h².

• The value of communality may be considered as the indicator of reliability of a variable.

Page 29: An Introduction to Factor analysis ppt

Variables                 Component 1   Component 2   Component 3   Communality
Vividness Qu              -.198         -.805         .061          69%
Control Qu                .173          .751          .306          69%
Preference Qu             .353          .577          -.549         76%
Generate Test             -.444         .251          .543          55%
Inspect Test              -.773         .051          -.051         60%
Maintain                  .734          -.003         .384          69%
Transform (P&P) Test      .759          -.155         .188          64%
Transform (Comp) Test     -.792         .179          .304          75%
Visual STM Test           .792          -.102         .215          69%
Eigenvalues               3.36          1.677         1.018         /
% Variance                37.3%         18.6%         11.3%         /

Communality of Variable 1 (Vividness Qu) = (-.198)² + (-.805)² + (.061)² = .69 or 69%

Eigenvalue of Comp 1 = (-.198)² + (.173)² + (.353)² + (-.444)² + (-.773)² + (.734)² + (.759)² + (-.792)² + (.792)² = 3.36

3.36 / 9 = 37.3%
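
The same arithmetic can be run for every row and column of the table. This sketch uses the loadings exactly as printed on the slide; the variable names in the code are assumptions for illustration. Communalities are row sums of squared loadings and eigenvalues are column sums of squared loadings.

```python
import numpy as np

variables = [
    "Vividness Qu", "Control Qu", "Preference Qu", "Generate Test",
    "Inspect Test", "Maintain", "Transform (P&P) Test",
    "Transform (Comp) Test", "Visual STM Test",
]

# Loadings on components 1 to 3, copied from the slide.
loadings = np.array([
    [-0.198, -0.805, 0.061],
    [0.173, 0.751, 0.306],
    [0.353, 0.577, -0.549],
    [-0.444, 0.251, 0.543],
    [-0.773, 0.051, -0.051],
    [0.734, -0.003, 0.384],
    [0.759, -0.155, 0.188],
    [-0.792, 0.179, 0.304],
    [0.792, -0.102, 0.215],
])

squared = loadings ** 2
communalities = squared.sum(axis=1)          # one value per variable
eigenvalues = squared.sum(axis=0)            # one value per component
percent_variance = 100 * eigenvalues / len(variables)

for name, h2 in zip(variables, communalities):
    print(f"{name:<24} communality = {h2:.2f}")
print("eigenvalues:", np.round(eigenvalues, 3))
print("% variance: ", np.round(percent_variance, 1))
```

Small differences in the last decimal place against the slide are expected, since the printed loadings are already rounded to three decimals.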

Page 30: An Introduction to Factor analysis ppt

Field Example

In a study on swimmers, eleven physical and physiological parameters were measured. Apply the factor analysis technique to study the factor structure and suggest the test battery that can be used for screening talent in swimming.

Page 31: An Introduction to Factor analysis ppt
Page 32: An Introduction to Factor analysis ppt
Page 33: An Introduction to Factor analysis ppt
Page 34: An Introduction to Factor analysis ppt

Click on this arrow

Page 35: An Introduction to Factor analysis ppt
Page 36: An Introduction to Factor analysis ppt

Click on Descriptives

Click on Continue

Page 37: An Introduction to Factor analysis ppt

Click on Extraction

Select Principal components

Click on Continue

Page 38: An Introduction to Factor analysis ppt

Click on Rotation

Select VARIMAX Rotation

Click on Continue

Click on OK

Page 39: An Introduction to Factor analysis ppt

Interpretation of various outputs

Descriptive Statistics

Variable                      Mean         Std. Deviation   Analysis N
Standing Broad Jump           212.3810     15.45793         21
Shuttle Run                   10.2514      .51167           21
Fifty Meter Dash              7.6938       .80880           21
Twelve Meter run and walk     2488.9524    222.46696        21
Anaerobic capacity            39.9071      12.70207         21
Weight                        37.8095      7.67215          21
Height                        148.3810     10.18566         21
Leg Length                    76.3333      5.18009          21
Calf Girth                    28.5238      1.99045          21
Thigh Girth                   40.5238      3.51595          21
Shoulder Width                38.1429      4.43041          21

Page 40: An Introduction to Factor analysis ppt

Correlation Matrix

Columns (1) to (11) follow the same order as the rows.

Variable                          (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)     (10)    (11)
(1) Standing Broad Jump           1.000
(2) Shuttle Run                   -.651   1.000
(3) Fifty Meter Dash              -.359   .277    1.000
(4) Twelve Meter run and walk     .539    -.691   -.492   1.000
(5) Anaerobic capacity            .608    -.709   -.322   .686    1.000
(6) Weight                        .469    -.087   -.231   -.045   .255    1.000
(7) Height                        .416    -.048   -.358   .010    .142    .947    1.000
(8) Leg Length                    .513    -.321   -.354   .151    .292    .687    .675    1.000
(9) Calf Girth                    .606    -.495   -.400   .366    .602    .577    .522    .739    1.000
(10) Thigh Girth                  .584    -.515   -.186   .269    .589    .632    .543    .646    .773    1.000
(11) Shoulder Width               .455    -.483   .128    .279    .410    .405    .244    .322    .377    .451    1.000

Page 41: An Introduction to Factor analysis ppt

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy      .687
Bartlett's Test of Sphericity   Approx. Chi-Square   165.579
                                df                   55
                                Sig.                 .000

Since the value of KMO is more than 0.5, the sample taken in the study is adequate to run factor analysis.

Since the significance value in Bartlett's test of sphericity is less than 0.05, the null hypothesis that all the correlations between the variables are zero is rejected. The correlation matrix is therefore not an identity matrix, which is what factor analysis requires.

Page 42: An Introduction to Factor analysis ppt

Total Variance Explained

             Initial Eigenvalues             Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component    Total    % of Var.   Cum. %     Total    % of Var.   Cum. %           Total    % of Var.   Cum. %
1            5.429    49.355      49.355     5.429    49.355      49.355           3.890    35.364      35.364
2            2.157    19.608      68.963     2.157    19.608      68.963           3.692    33.559      68.924
3            1.241    11.285      80.247     1.241    11.285      80.247           1.246    11.324      80.247
4            .595     5.407       85.654
5            .421     3.831       89.485
6            .367     3.336       92.821
7            .243     2.214       95.035
8            .216     1.967       97.001
9            .180     1.637       98.638
10           .137     1.241       99.880
11           .013     .120        100.000

Extraction Method: Principal Component Analysis.

In the Total column, we are looking for eigenvalues above 1.0.

The Cumulative % column gives the cumulative percent of variance explained.

Page 43: An Introduction to Factor analysis ppt

These three factors will be extracted, as they have eigenvalues greater than 1.
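
The selection can be reproduced from the initial eigenvalues in the Total Variance Explained output. This is a small sketch in plain Python; the variable names are assumptions for illustration. Each eigenvalue is divided by the number of variables (11) to get its percent of variance, and components with eigenvalues of at least 1 are retained.

```python
# Initial eigenvalues copied from the Total Variance Explained output.
eigenvalues = [5.429, 2.157, 1.241, 0.595, 0.421, 0.367,
               0.243, 0.216, 0.180, 0.137, 0.013]

n_variables = len(eigenvalues)

# Percent of total variance for each component.
percent = [100 * ev / n_variables for ev in eigenvalues]

# Running total of the percentages.
cumulative = []
running = 0.0
for value in percent:
    running += value
    cumulative.append(running)

# Kaiser-Guttman criterion: keep components with eigenvalue >= 1.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev >= 1.0]

for i, (ev, pct, cum) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"component {i:>2}: eigenvalue {ev:.3f}  {pct:6.3f}%  cumulative {cum:7.3f}%")
print("retained components:", retained)
```

The percentages agree with the SPSS columns up to rounding of the printed eigenvalues, and the three retained components together account for roughly 80% of the total variance.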

Page 44: An Introduction to Factor analysis ppt

Factor loadings of all the variables on each of the three factors are shown here. Since this is an unrotated factor solution, some of the variables may contribute to more than one factor. To avoid this, the factors are rotated using the varimax rotation technique.

Unrotated Component Matrix

                              Component
Variable                      1        2        3
Standing Broad Jump           .814     -.179    .020
Shuttle Run                   -.682    .587     -.136
Fifty Meter Dash              -.469    .108     .808
Twelve Meter run and walk     .549     -.694    -.230
Anaerobic capacity            .731     -.484    .053
Weight                        .700     .650     .050
Height                        .647     .663     -.159
Leg Length                    .762     .396     -.087
Calf Girth                    .863     .088     -.051
Thigh Girth                   .835     .138     .199
Shoulder Width                .560     -.082    .660

Extraction Method: Principal Component Analysis.
a. 3 components extracted.

Page 45: An Introduction to Factor analysis ppt

Rotated Component Matrix

                              Component
Variable                      1        2        3
Standing Broad Jump           .469     .689     -.003
Shuttle Run                   -.091    -.901    -.090
Fifty Meter Dash              -.292    -.356    .820
Twelve Meter run and walk     -.069    .868     -.279
Anaerobic capacity            .200     .855     .012
Weight                        .954     .010     .079
Height                        .930     -.047    -.128
Leg Length                    .828     .230     -.074
Calf Girth                    .690     .524     -.058
Thigh Girth                   .696     .483     .194
Shoulder Width                .332     .479     .646

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 4 iterations.

After varimax rotation, the factors have largely non-overlapping variables. If a variable has a factor loading above 0.7, the factor extracts sufficient variance from that variable. Thus, all variables with loadings of 0.7 or more on a particular factor are identified with that factor.
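
The 0.7 rule can be applied mechanically to the rotated component matrix. The sketch below is an illustration only: it uses the rotated loadings as printed and compares the absolute value of each loading with the threshold, since a strong negative loading is as informative as a strong positive one. The names `threshold`, `groups`, and `unassigned` are assumptions made for this example.

```python
variables = [
    "Standing Broad Jump", "Shuttle Run", "Fifty Meter Dash",
    "Twelve Meter run and walk", "Anaerobic capacity", "Weight",
    "Height", "Leg Length", "Calf Girth", "Thigh Girth", "Shoulder Width",
]

# Rotated loadings on components 1 to 3, copied from the SPSS output.
rotated = [
    [0.469, 0.689, -0.003],
    [-0.091, -0.901, -0.090],
    [-0.292, -0.356, 0.820],
    [-0.069, 0.868, -0.279],
    [0.200, 0.855, 0.012],
    [0.954, 0.010, 0.079],
    [0.930, -0.047, -0.128],
    [0.828, 0.230, -0.074],
    [0.690, 0.524, -0.058],
    [0.696, 0.483, 0.194],
    [0.332, 0.479, 0.646],
]

threshold = 0.7
groups = {1: [], 2: [], 3: []}
unassigned = []

for name, row in zip(variables, rotated):
    hits = [k for k, loading in enumerate(row, start=1) if abs(loading) >= threshold]
    if hits:
        for k in hits:
            groups[k].append(name)
    else:
        unassigned.append(name)   # no loading reaches the threshold

for k, names in groups.items():
    print(f"Factor {k}: {', '.join(names)}")
print("Below threshold on every factor:", ", ".join(unassigned))
```

Variables that fall just short of the cut-off, such as those with loadings of .690 or .696, are listed separately so that the analyst can decide how strictly to apply the rule.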

Page 46: An Introduction to Factor analysis ppt

Name each factor as per your wish

PHYSICAL

• Shuttle Run

• Fifty Meter Dash

• Twelve Meter run and walk

ANTHROPOMETRIC

• Weight

• Height

• Leg Length

Page 47: An Introduction to Factor analysis ppt

THANK YOU FOR YOUR KIND ATTENTION