The Difference Between Cluster & Factor Analysis

Cluster analysis and factor analysis are two statistical methods of data analysis, both heavily used in the natural and behavioral sciences. Each allows the user to group parts of the data: cluster analysis into "clusters," factor analysis onto "factors." Researchers new to these methods may feel that the two are similar overall. While cluster analysis and factor analysis seem similar on the surface, they differ in many ways, including in their overall objectives and applications.

Objective

Cluster analysis and factor analysis have different objectives. The usual objective of factor analysis is to explain the correlation in a set of data and relate variables to each other, while the objective of cluster analysis is to address heterogeneity in a set of data by sorting it into more homogeneous groups. In spirit, cluster analysis is a form of categorization, whereas factor analysis is a form of simplification.
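One way to picture the difference in objective is to look at what each method typically starts from. The sketch below is only an illustration under stated assumptions: the small data table is invented, and the helper names (pearson, euclidean) are hypothetical. It assumes the common setup in which factor analysis works from correlations between variables (columns), while cluster analysis works from distances between observations (rows).

```python
from math import sqrt

# Hypothetical data: 4 observations (rows) measured on 3 variables (columns).
data = [
    [1.0, 2.0, 9.0],
    [2.0, 4.1, 7.0],
    [8.0, 16.2, 3.0],
    [9.0, 17.9, 1.0],
]

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def euclidean(p, q):
    """Euclidean distance between two observations."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

columns = list(zip(*data))

# Typical starting point for factor analysis: how the variables relate (3 x 3).
var_corr = [[pearson(c1, c2) for c2 in columns] for c1 in columns]

# Typical starting point for cluster analysis: how far apart the observations are (4 x 4).
obs_dist = [[euclidean(r1, r2) for r2 in data] for r1 in data]

for row in var_corr:
    print([round(v, 3) for v in row])
for row in obs_dist:
    print([round(v, 2) for v in row])
```

In this made-up table the first two variables are almost perfectly correlated, which is the kind of redundancy a factor would summarize, while the first two observations sit close together and far from the last two, which is the kind of grouping a cluster would capture.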

Complexity

Complexity is one point on which factor analysis and cluster analysis differ: data size affects each analysis differently. As the set of data grows, searching exhaustively for the best clustering becomes computationally intractable, because the number of possible cluster solutions grows combinatorially with the number of data points. For example, the number of ways to divide 20 objects into 4 clusters of equal size is over 488 million. This makes direct computational methods, the category to which factor analysis methods belong, infeasible for cluster analysis.
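The 488 million figure can be checked with a short calculation. The sketch below assumes the clusters are unlabeled (swapping the names of two clusters does not count as a new solution), so the count is n! / ((n/k)!^k * k!). The function name equal_partitions is a hypothetical helper introduced here for illustration.

```python
from math import factorial

def equal_partitions(n_objects: int, n_clusters: int) -> int:
    """Count the ways to split n_objects into n_clusters unlabeled groups of equal size."""
    if n_objects % n_clusters != 0:
        raise ValueError("n_objects must be divisible by n_clusters")
    size = n_objects // n_clusters
    # Ordered assignment to labeled groups, then divide out the cluster labels.
    return factorial(n_objects) // (factorial(size) ** n_clusters * factorial(n_clusters))

count = equal_partitions(20, 4)
print(count)  # 488864376
```

Doubling the problem to 40 objects in 4 equal clusters gives a count with more than twenty digits, which is why exhaustive enumeration stops being an option so quickly.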

Solution

Even though the solutions to both factor analysis and cluster analysis problems are subjective to some degree, factor analysis allows a researcher to obtain a "best" solution, in the sense that the researcher can optimize a certain aspect of the solution (orthogonality, ease of interpretation and so on). This is not so for cluster analysis, since the known algorithms that are guaranteed to find the best cluster solution are computationally impractical for all but small data sets. Hence, researchers employing cluster analysis generally cannot guarantee an optimal solution.
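The lack of a guarantee is easy to see with a common clustering heuristic. The sketch below uses k-means (Lloyd's algorithm), which the article does not name and which is chosen here only as a familiar example; the four points and both sets of starting centers are invented for illustration. The same data converges to two different stable solutions depending on where the algorithm starts, and only one of them has the lower within-cluster sum of squares.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd's algorithm. Returns (labels, centers, within-cluster sum of squares)."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse

# Hypothetical data: corners of a wide, flat rectangle.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start A splits the rectangle into bottom and top; start B splits it into left and right.
labels_a, centers_a, sse_a = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])
labels_b, centers_b, sse_b = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])

print(labels_a, sse_a)  # [0, 1, 0, 1] 16.0
print(labels_b, sse_b)  # [0, 0, 1, 1] 1.0
```

Both runs stop because no center moves any further, yet the first solution is sixteen times worse by the algorithm's own criterion. In practice, researchers often rerun such heuristics from many starting points and keep the best result, which improves the odds but still does not prove optimality.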

Applications

Factor analysis and cluster analysis differ in how they are applied to real data. Because factor analysis can reduce an unwieldy set of variables to a much smaller set of factors, it is suitable for simplifying complex models. Factor analysis also has a confirmatory use, in which the researcher develops a set of hypotheses regarding how variables in the data are related and then runs factor analysis on the data set to confirm or reject these hypotheses. Cluster analysis, on the other hand, is suitable for classifying objects according to certain criteria. For example, a researcher can measure certain aspects of a group of newly discovered plants and place these plants into species categories by employing cluster analysis.
