Principal Components Analysis
The PCA.xls workbook provides more complex analyses than SPC.xls does, and it uses data reduction techniques to make large data sets manageable. It combines principal components analysis with Varimax factor rotation; see the link for more information.
PCA.xls uses multivariate techniques to reduce a mass of related variables to a smaller set of underlying factors. These factors often cannot be measured directly but are instead expressed through variables that we can quantify. In business analysis, for example, we often see a lengthy list of financial and operational indicators such as these:
- Variable and fixed costs of production, such as materials and plant
- Unit and extended costs
- Advertising and promotion
- Market research and product management
- Warranty costs
- Distribution channel management costs
Even this abbreviated list can become unmanageable when it is repeated across several product lines. When the variables and their values are subjected to principal components analysis (PCA) and rotation, it is often possible to isolate a few underlying factors that drive profitable revenue.
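To make the reduction step concrete, here is a minimal Python sketch of a principal components extraction. It is not the PCA.xls implementation: the function name `principal_components`, the use of NumPy, and the synthetic data (six made-up indicators driven by two hidden factors) are all assumptions chosen for illustration.

```python
import numpy as np

def principal_components(data, n_factors):
    """Extract the leading principal components from a data matrix.

    Rows are observations and columns are variables. This is an
    illustrative sketch, not the method used inside PCA.xls.
    """
    # Work from the correlation matrix so every variable has equal weight.
    corr = np.corrcoef(data, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    # Keep the n_factors largest eigenvalues, largest first.
    order = np.argsort(eigenvalues)[::-1][:n_factors]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Loadings are eigenvectors scaled by the square root of their eigenvalue.
    loadings = eigenvectors * np.sqrt(eigenvalues)
    return eigenvalues, eigenvectors, loadings

# Synthetic data (an assumption): two hidden factors drive six indicators.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(200, 2))
weights = np.array([[0.9, 0.8, 0.7, 0.0, 0.1, 0.0],
                    [0.0, 0.1, 0.0, 0.9, 0.8, 0.7]])
data = hidden @ weights + 0.3 * rng.normal(size=(200, 6))

eigenvalues, eigenvectors, loadings = principal_components(data, n_factors=2)
# Share of total variance captured by the retained components.
explained = eigenvalues.sum() / data.shape[1]
print("eigenvalues:", eigenvalues)
print("share of variance explained:", explained)
```

Because the six synthetic indicators were built from only two hidden factors, two components should account for most of the variance, which is the sense in which a long list of variables can reduce to a few factors.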
Business analysis is just one field where PCA can provide guideposts. We've included a chart here that shows how per capita crime rates in the United States (for example, burglary, murder, larceny, auto theft and so on) resolve to two underlying factors: crimes against property and crimes against persons.
The chart shows how plotting the individual states on those two factors tends to cluster them into regions. For example:
- The Western region tends to load high on property crime
- The Southeastern region tends to load high on personal crime but low on property crime
- The Northeastern region tends to load low on both personal and property crime
Of course, we know that Washington, Oregon, California, Colorado, Nevada and Arizona are in the West. But if we didn't know that, PCA would help us cluster those states together by means of their similarity on the underlying factors. PCA and its close cousin, cluster analysis, support everything from resource allocation to customer acquisition and retention, in fields from health care to real estate.
We have designed a Microsoft Excel utility that you can use to test your own data, and we've made it available here. Simply download the Excel workbook and follow the instructions on the first worksheet. You'll get full PCA results including eigenvectors, factor loadings and scores, along with a Varimax rotation of the factors to an interpretable structure.
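For readers curious about what a Varimax rotation does to a set of loadings, the following Python sketch shows one common SVD-based iteration for it. It is an assumption-laden illustration rather than the workbook's own code: the function names `varimax` and `varimax_criterion` and the small loading matrix are hypothetical, and the rotation is the raw variant without Kaiser normalization.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return np.sum(np.var(loadings**2, axis=0))

def varimax(loadings, max_iter=500, tol=1e-8):
    """Orthogonally rotate loadings toward simple structure (raw Varimax)."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the Varimax criterion with respect to the rotation.
        column_ss = np.sum(rotated**2, axis=0)
        gradient = loadings.T @ (rotated**3 - rotated * column_ss / n_vars)
        # The nearest orthogonal matrix to the gradient is the next rotation.
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        current = s.sum()
        # Stop once the objective no longer improves noticeably.
        if previous != 0.0 and current < previous * (1.0 + tol):
            break
        previous = current
    return loadings @ rotation, rotation

# Hypothetical loadings: a clean two-factor structure, deliberately
# turned by 30 degrees so that every variable loads on both factors.
simple = np.array([[0.9, 0.0],
                   [0.8, 0.0],
                   [0.0, 0.7],
                   [0.0, 0.6]])
angle = np.deg2rad(30.0)
mix = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

After rotation, each variable in this example should load strongly on one factor and close to zero on the other (possibly with columns swapped or signs flipped), which is what makes a rotated solution easier to interpret. The rotation is orthogonal, so each variable's communality (its row sum of squared loadings) is unchanged.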