Learn how to carry out cluster analysis and principal components analysis using R, the open-source statistical computing software.
- [Instructor] Welcome to this course on data reduction. My name is Conrad Carlberg, and I've been using the techniques described in this course for many years and with a variety of companies and institutions. This course focuses on a need that has sharpened in recent years, as analytics packages now collect truly enormous amounts of data. That need is for data reduction. We have too many variables and too many observations to make sense of them all in their raw form. In response, theoreticians have developed ways to group people into clusters.
Cluster analysis makes it possible to develop inferences about a handful of groups instead of an entire population of individual web users. We also have methods, such as principal components analysis, that expose latent variables, ones that aren't directly measured but that underlie a large number of the variables that we do measure. This course focuses on how to carry out cluster analysis and principal components analysis, and it shows how to merge their results so that you can analyze a few hidden factors that find expression in a few clusters of people.
In this course, Conrad Carlberg explains how to carry out cluster analysis and principal components analysis using Microsoft Excel, which tends to show more clearly what's going on in the analysis. Then he explains how to carry out the same analyses using R, the open-source statistical computing software, which is faster and richer in analysis options than Excel. Plus, he walks through how to merge the results of cluster analysis and factor analysis to help you break down a few underlying factors according to individuals' membership in just a few clusters.
- Reviewing the problems created by an overabundance of data
- Understanding the rationale for clustering and principal components analysis
- Using Excel to extract principal components
- Using R to extract principal components
- Using R for cluster analysis
- Using Excel for cluster analysis
- Setting up confusion tables in Excel
- Using cluster analysis and factor analysis in concert