Factor Analysis And Its Applications | Understanding Factor Analysis

February 09, 2018

Let's say your data-set contains 200 variables.

Can you imagine how cumbersome it is going to be to analyse your data-set using all 200 variables?

Using Factor Analysis, you can reduce a large number of variables to a smaller set of variables (factors) that is capable of explaining the observed variance in the larger set.

In short, Factor Analysis summarizes your large data-set so that relationships and patterns can be easily interpreted and understood.

Five steps to Factor Analysis:
- Create a correlation matrix for all the variables
- Factor Extraction
- Calculate Initial Factor Loadings
- Factor Rotation
- Calculation of Factor Scores

Correlation Matrix
- The correlation matrix is used to find variables that are strongly correlated with each other.
- If the correlations between variables are relatively small, it is very unlikely that they share a common factor.
- The aim is to extract factors that account for as much variation in the observed variables as possible.
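
As a rough illustration of this step, the sketch below builds a correlation matrix with NumPy. The data is synthetic and purely hypothetical: four observed variables are generated from two hidden factors, so the first two variables should correlate strongly with each other and only weakly with the last two.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: two hidden factors drive four observed variables.
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
X = np.column_stack([
    f1 + 0.3 * rng.normal(size=n),  # variable 0, driven by factor 1
    f1 + 0.3 * rng.normal(size=n),  # variable 1, driven by factor 1
    f2 + 0.3 * rng.normal(size=n),  # variable 2, driven by factor 2
    f2 + 0.3 * rng.normal(size=n),  # variable 3, driven by factor 2
])

# Columns are variables, rows are observations.
R = np.corrcoef(X, rowvar=False)
print(np.round(R, 2))
```

Variables that share a hidden factor show up as blocks of large correlations in the printed matrix.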

Factor Extraction
- The main purpose of Factor Analysis is to identify combinations of variables, and those combinations are called factors.
- Different Factor Extraction methods:
-- Maximum Likelihood
-- Principal axis factoring
-- Unweighted Least Squares
-- Generalized Least Squares
-- Image Factoring
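
To give a feel for one of these methods, here is a minimal sketch of principal axis factoring written with NumPy. It is an illustration under stated assumptions, not a reference implementation: the function name `principal_axis_factoring`, the fixed iteration count, and the example correlation matrix (built from made-up loadings) are all hypothetical.

```python
import numpy as np

def principal_axis_factoring(R, n_factors, n_iter=200):
    """Sketch of iterative principal axis factoring on a correlation matrix."""
    R = np.asarray(R, dtype=float)
    # Start from squared multiple correlations as communality estimates.
    communalities = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        reduced = R.copy()
        # Replace the unit diagonal with the current communalities.
        np.fill_diagonal(reduced, communalities)
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        # Clip tiny negative eigenvalues before taking the square root.
        scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
        loadings = eigvecs[:, top] * scale
        communalities = np.sum(loadings ** 2, axis=1)
    return loadings

# Hypothetical example: six variables, three per factor.
true_loadings = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
    [0.0, 0.9], [0.0, 0.8], [0.0, 0.7],
])
R = true_loadings @ true_loadings.T
np.fill_diagonal(R, 1.0)

loadings = principal_axis_factoring(R, n_factors=2)
communalities = np.sum(loadings ** 2, axis=1)
print(np.round(communalities, 2))
```

The signs and order of the extracted factors are arbitrary, so it is safer to compare communalities (row sums of squared loadings) than raw loadings.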

How to decide the number of factors?
- Look at the factor correlations: if the correlation between two factors is too high (> 0.7), there is a high possibility that the factors are very similar, and in that case you can merge the two related factors.
- Easily explainable? Are you able to easily interpret and explain the items associated with each factor?
- The more items a factor contains, the stronger the case for considering it for further analysis.
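
The first check above can be sketched in a few lines. The factor correlation matrix here is made up for illustration, and the helper name `similar_factor_pairs` is hypothetical; the 0.7 cut-off is the one mentioned above.

```python
import numpy as np

# Hypothetical factor correlation matrix (for example, from an oblique solution).
factor_corr = np.array([
    [1.00, 0.75, 0.20],
    [0.75, 1.00, 0.15],
    [0.20, 0.15, 1.00],
])

def similar_factor_pairs(corr, threshold=0.7):
    """Return index pairs of factors whose absolute correlation exceeds the threshold."""
    k = corr.shape[0]
    return [
        (i, j)
        for i in range(k)
        for j in range(i + 1, k)
        if abs(corr[i, j]) > threshold
    ]

pairs = similar_factor_pairs(factor_corr)
print(pairs)  # [(0, 1)]
```

Any pair returned by the helper is a candidate for merging, subject to whether the merged factor is still easy to interpret.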

Factor Loadings
- A Factor Loading represents the correlation between a factor and a variable.
- It tells you how much a factor explains a variable.
- Factor Loadings close to:
=> -1 or 1 indicate that the factor strongly influences the variable
=> 0 indicate that the factor has a weak influence on the variable

- For example, let's say we have nine variables, i.e. Algebra, Chemistry, Geometry, Physics, Game theory, Number theory, Set Theory, Probability and Biology.

Subjects	Factor-1	Factor-2
Algebra	0.788	0.542
Chemistry	0.368	0.912
Geometry	0.729	0.367
Physics	0.541	0.875
Game theory	0.891	0.333
Number theory	0.795	0.412
Set Theory	0.832	0.390
Probability	0.955	0.324
Biology	0.289	0.816

- Algebra, Geometry, Game theory, Number theory, Set Theory and Probability have high Factor Loadings on Factor-1.

- Chemistry, Physics and Biology have high Factor Loadings on Factor-2.

- The items of Factor-1 share a common latent relationship, so Factor-1 can be labelled 'Mathematics'; similarly, Factor-2 can be labelled 'Science'.
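
The grouping described here can be reproduced mechanically by assigning each subject to the factor on which it has the largest absolute loading. This is only a sketch of that rule, using the loadings from the table above; the variable names are illustrative.

```python
# Loadings copied from the table: (Factor-1, Factor-2).
loadings = {
    "Algebra": (0.788, 0.542),
    "Chemistry": (0.368, 0.912),
    "Geometry": (0.729, 0.367),
    "Physics": (0.541, 0.875),
    "Game theory": (0.891, 0.333),
    "Number theory": (0.795, 0.412),
    "Set Theory": (0.832, 0.390),
    "Probability": (0.955, 0.324),
    "Biology": (0.289, 0.816),
}

labels = ("Factor-1", "Factor-2")
groups = {label: [] for label in labels}

for subject, row in loadings.items():
    # Pick the factor with the largest absolute loading for this subject.
    best = max(range(len(row)), key=lambda i: abs(row[i]))
    groups[labels[best]].append(subject)

for label, subjects in groups.items():
    print(label, subjects)
```

In practice, subjects with sizeable loadings on both factors (Physics, for instance) deserve a closer look before being assigned to a single factor.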

Factor Rotation
- Once the Initial Factor Loadings have been calculated, the factors are rotated.
- Rotation is the process of adjusting the factor axes in order to achieve a simpler and more meaningful factor solution.
- Rotation creates a simpler factor structure and makes the factors more clearly distinguishable.
- Orthogonal Rotation - It assumes that factors are not correlated.
- Oblique Rotation - Unlike Orthogonal Rotation, it allows for factor correlation.
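
As an illustration of orthogonal rotation, the sketch below implements varimax, one widely used orthogonal method (the choice of varimax is an assumption made for this example; it is not named above). The function name, tolerance and iteration limit are illustrative, and the loadings from the table above are reused purely as sample input.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Sketch of varimax rotation; returns rotated loadings and the rotation matrix."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Gradient-like term of the varimax criterion.
        target = rotated ** 3 - (gamma / p) * rotated @ np.diag(np.sum(rotated ** 2, axis=0))
        u, s, vt = np.linalg.svd(L.T @ target)
        rotation = u @ vt  # the product of orthogonal matrices stays orthogonal
        current = s.sum()
        if previous != 0.0 and current / previous < 1.0 + tol:
            break
        previous = current
    return L @ rotation, rotation

# Sample input: the loadings from the table above.
L = np.array([
    [0.788, 0.542], [0.368, 0.912], [0.729, 0.367],
    [0.541, 0.875], [0.891, 0.333], [0.795, 0.412],
    [0.832, 0.390], [0.955, 0.324], [0.289, 0.816],
])

rotated, rotation = varimax(L)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, each variable's communality (the sum of its squared loadings) is unchanged; only the way that variance is split between the factors changes.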

Factor Scores
- Factor Scores are the estimated values of the factors for each observation.
- They are used to prioritize and rank the factors.
- With the help of Factor Scores, you can more easily decide which factors are more important or which factors you need to focus on.
- In most cases, you look for Factor Scores (positive or negative) with an absolute value >= 0.7.
- The Factor Scores obtained initially can be low, but after some iterations they can reach a higher value.
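
One common way to estimate Factor Scores is the regression method, where the standardized data is multiplied by weights derived from the correlation matrix and the loadings. The sketch below assumes synthetic data and, to stay short, takes its loadings from the top eigenvectors of the correlation matrix (a principal-component stand-in, not one of the extraction methods listed earlier). All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Hypothetical data: two hidden factors, six observed variables.
hidden = rng.normal(size=(n, 2))
true_loadings = np.array([
    [0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
    [0.0, 0.9], [0.0, 0.8], [0.0, 0.7],
])
X = hidden @ true_loadings.T + 0.5 * rng.normal(size=(n, 6))

# Standardize the variables and compute their correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(X, rowvar=False)

# Stand-in loadings: top two eigenvectors scaled by sqrt(eigenvalue).
eigvals, eigvecs = np.linalg.eigh(R)
top = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, top] * np.sqrt(eigvals[top])

# Regression-method weights: solve R @ weights = loadings.
weights = np.linalg.solve(R, loadings)
scores = Z @ weights  # one row of factor scores per observation

print(scores.shape)
print(np.round(scores[:3], 2))
```

Each row of `scores` holds one observation's estimated position on the two factors.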

Deciding questions before using Factor Analysis
- Are there any outliers in the data? Factor Analysis assumes that there are none.
- Is there any multi-collinearity between the variables? For Factor Analysis, there should not be any perfect multi-collinearity between the variables.
- What is the minimum number of factors that can explain the variation in the data-set?
- How well do these factors describe all the data?
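
The first two questions can be screened with simple checks before running the analysis. The sketch below is one possible approach, not a definitive test: the helper names and the z-score cut-off of 3 are assumptions, and the data is synthetic.

```python
import numpy as np

def has_perfect_multicollinearity(X, tol=1e-8):
    """True if the correlation matrix is rank deficient (some variable is a linear combination of others)."""
    R = np.corrcoef(X, rowvar=False)
    return bool(np.linalg.matrix_rank(R, tol=tol) < R.shape[0])

def outlier_rows(X, z_threshold=3.0):
    """Indices of rows where any variable lies more than z_threshold standard deviations from its mean."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    return np.where(np.any(np.abs(Z) > z_threshold, axis=1))[0]

rng = np.random.default_rng(2)
base = rng.normal(size=(100, 3))

# Third column is an exact sum of the first two: perfect multi-collinearity.
collinear = np.column_stack([base[:, 0], base[:, 1], base[:, 0] + base[:, 1]])

# Copy of the data with one extreme value planted in row 10.
spiked = base.copy()
spiked[10, 0] = 50.0

print(has_perfect_multicollinearity(base))
print(has_perfect_multicollinearity(collinear))
print(outlier_rows(spiked))
```

A z-score screen is a crude univariate check; it will miss observations that are unusual only in combination across variables.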

Comments

Anonymous, 28 April 2018 at 21:38
Can you explain more about orthogonal rotation and oblique rotation?

Anonymous, 6 May 2018 at 06:10
Could you compare PCA with factor analysis in one of your posts?

Anonymous, 6 May 2018 at 06:26
How is factor analysis different from clustering, given that factor analysis also groups similar variables into dimensions?
