Lecture 11 Factor Analysis
By
Amir Iqbal
Introduction
Factor analysis is a data reduction technique for identifying the internal structure
of a set of variables.
Factor analysis is a decompositional procedure that identifies the underlying
relationships that exist within a set of variables.
Factor analysis forms groups of metric variables (interval or ratio scaled). These
groups are called factors.
These factors can be thought of as underlying constructs that cannot be measured
by a single variable (e.g. happiness).
Common factors have effects shared in common with more than one observed
variable.
Unique factors have effects that are unique to a specific variable.
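In equation form (a standard statement of the common factor model, added here for clarity rather than taken from the slides):

```latex
% Common factor model: each observed variable x_i is a weighted sum of
% k common factors F_1, ..., F_k plus a unique factor u_i.
x_i = \lambda_{i1} F_1 + \lambda_{i2} F_2 + \dots + \lambda_{ik} F_k + u_i
% The weights \lambda_{ij} are the factor loadings of variable i on factor j.
```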
Assumptions
Linear Relationship
The variables used in factor analysis should be linearly
related to each other. This can be checked by looking at
scatterplots of pairs of variables.
Moderately Correlated
The variables must also be at least moderately correlated to
each other; otherwise the number of factors will be almost
the same as the number of original variables, which means
that carrying out a factor analysis would be pointless.
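A minimal way to run both checks in Python (a sketch; the file name survey_items.csv and its columns are assumptions, not from the slides):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical metric (interval/ratio scaled) items; substitute your own data.
df = pd.read_csv("survey_items.csv")

# Linearity: inspect pairwise scatterplots for roughly linear patterns.
pd.plotting.scatter_matrix(df, figsize=(8, 8))
plt.show()

# Moderate correlation: most off-diagonal correlations should be non-trivial.
print(df.corr().round(2))
```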
Correlation matrix
It presents the inter-correlations between the studied variables.
The dimensionality of this matrix can be reduced by looking for variables
that correlate highly with a group of other variables but correlate weakly
with variables outside of that group (Field 2000).
Variables with high inter-correlations could well measure one underlying
variable, which is called a factor.
Correlation Matrix (lower triangle shown; V1-V4 are placeholder variable names)

      V1    V2    V3    V4
V1  1.00
V2  0.77  1.00
V3  0.66  0.87  1.00
V4  0.09  0.12  0.08  1.00
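To make the pattern concrete, here is the same matrix in a short Python sketch (the cluster interpretation follows the text above):

```python
import numpy as np

# The slide's lower-triangular correlation matrix, filled out symmetrically.
R = np.array([
    [1.00, 0.77, 0.66, 0.09],
    [0.77, 1.00, 0.87, 0.12],
    [0.66, 0.87, 1.00, 0.08],
    [0.09, 0.12, 0.08, 1.00],
])

# Variables 1-3 correlate highly with each other (0.66-0.87) but only weakly
# with variable 4 (about 0.10): variables 1-3 plausibly measure one factor.
print(R.round(2))
```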
Important: two things
The variables have to be inter-correlated,
but there must be no extreme multi-collinearity, as this would cause
difficulties in determining the unique contribution of the variables
to a factor (Field 2000: 444).
Factor loadings:
A common rule of thumb regards loadings of .7 or higher as "high", on the
rationale that the .7 level corresponds to about half of the variance in the
indicator being explained by the factor (0.7 squared is about 0.49).
However, the .7 standard is a high one, and real-life data may well not meet
this criterion.
Some researchers therefore use a lower level, such as .4 for the central
factor and .25 for other factors, or call loadings above .6 "high" and those
below .4 "low".
In any event, factor loadings must be interpreted in the light of theory, not
by arbitrary cut-off levels.
Kaiser criterion:
The Kaiser rule is to drop all components with eigenvalues under 1.0.
The Kaiser criterion is the default in SPSS and most other statistical
programs, but it is not recommended as the sole cut-off criterion for
estimating the number of factors.
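A sketch of the rule with NumPy (the random placeholder data is an assumption; in practice R would be the correlation matrix of your own items):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 6))        # placeholder data: 200 cases, 6 items
R = np.corrcoef(data, rowvar=False)     # correlation matrix of the variables

eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]  # sorted descending
n_factors = int((eigenvalues > 1.0).sum())          # Kaiser: eigenvalue > 1.0
print(eigenvalues.round(2), "->", n_factors, "component(s) retained")
```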
Scree plot:
The Cattell scree test plots the components on the X-axis and the
corresponding eigenvalues on the Y-axis.
As one moves to the right, toward later components, the eigenvalues drop.
When the drop ceases and the curve makes an elbow toward a less steep
decline, Cattell's scree test says to drop all further components after the
one starting the elbow.
This rule is sometimes criticised for being amenable to researcher-controlled
"fudging": because picking the "elbow" can be subjective (the curve may have
multiple elbows or be a smooth curve), the researcher may be tempted to set
the cut-off at the number of factors desired by his or her research agenda.
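A matplotlib sketch of the plot described above (placeholder data again; with real items you would use your own correlation matrix):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 6))                      # placeholder data
R = np.corrcoef(data, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Components on the X-axis, eigenvalues on the Y-axis; look for the elbow.
plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, "o-")
plt.xlabel("Component number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
```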
If fewer than 10 variables have a loading of more than 0.40 and the
sample size is less than 300, the loading structure is likely to be
random.
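A toy check of this rule of thumb (the loading matrix and sample size here are made up for illustration; real loadings would come from your factor extraction):

```python
import numpy as np

# Hypothetical (variables x factors) loading matrix.
loadings = np.array([
    [0.72, 0.10],
    [0.65, 0.05],
    [0.48, 0.30],
    [0.12, 0.81],
])
n_cases = 250  # hypothetical sample size

# Count variables with at least one loading above 0.40 in absolute value.
salient = int((np.abs(loadings) > 0.40).any(axis=1).sum())
if salient < 10 and n_cases < 300:
    print("Loading structure may be random: few salient loadings, small sample")
```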
THANKS