Join Conrad Carlberg for an in-depth discussion in this video, Using R for cluster analysis, part of Business Analytics: Data Reduction Techniques Using Excel and R.
- [Instructor] You can get a sense of how to do k-means cluster analysis in R by analyzing what's called the iris data set. This data set comes with the basic R installation, and so does the kmeans function. So as long as you have R already installed on your computer, you don't need to download anything else to demonstrate k-means cluster analysis for yourself. You do have to bring the iris data set into R's workspace, so we might as well start by using the library command on the datasets package, as shown here.
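The on-screen command is not reproduced in the transcript, so the following is a minimal sketch of what that step might look like. One assumption worth flagging: in a default R session the datasets package is usually attached already, so the `library` call is typically harmless rather than strictly required.

```r
# Attach the datasets package that ships with the standard R installation.
# (Assumption: in a default session it is usually attached already.)
library(datasets)

# Confirm that the iris data set is visible and is a data frame.
iris_available <- exists("iris")
iris_is_df <- is.data.frame(iris)
print(iris_available)
print(iris_is_df)
```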
Now you have access to all of the data sets that come with R's datasets package. One of those data sets is named iris, lower case throughout, and it has data on 150 iris plants. The data include the species of iris represented by each plant as well as the length and width of a sepal and a petal from that plant. If you want to get a glimpse of what the data look like, use the head function on the iris data set. That's all the preparation that's necessary.
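As a rough illustration of that inspection step, the sketch below calls `head` on iris and also checks its dimensions. The `dim` call and the variable names are additions for illustration and are not part of the transcript.

```r
library(datasets)

# head() returns the first six rows by default.
first_rows <- head(iris)
print(first_rows)

# Rows and columns: 150 plants, four measurements plus the species label.
iris_dims <- dim(iris)
print(iris_dims)
```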
The next command actually carries out the k-means analysis.
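The transcript stops before the command itself appears, so the call below is only a sketch of one plausible way to run the analysis, not necessarily the instructor's exact command. The assumptions are: three clusters (one per iris species), clustering on the four numeric measurement columns only, and a seed and `nstart` value chosen purely for illustration.

```r
library(datasets)

# k-means starts from random centers, so fix the seed for repeatable results.
# (The seed value is an arbitrary choice for this sketch.)
set.seed(42)

# Use only the four numeric measurements; column 5 is the species label.
iris_measurements <- iris[, 1:4]

# Assumption: 3 clusters, with 25 random starts to reduce the risk of a
# poor local optimum.
fit <- kmeans(iris_measurements, centers = 3, nstart = 25)

# Cluster sizes and cluster centers (mean of each measurement per cluster).
print(fit$size)
print(fit$centers)

# Cross-tabulate cluster membership against the known species.
confusion <- table(cluster = fit$cluster, species = iris$Species)
print(confusion)
```

The cluster numbers are arbitrary labels, so the rows of the cross-tabulation need not line up with the species in any particular order.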
In this course, Conrad Carlberg explains how to carry out cluster analysis and principal components analysis using Microsoft Excel, which tends to show more clearly what's going on in the analysis. Then he explains how to carry out the same analysis using R, the open-source statistical computing software, which is faster and richer in analysis options than Excel. Plus, he walks through how to merge the results of cluster analysis and factor analysis to help you break down a few underlying factors according to individuals' membership in just a few clusters.
- Reviewing the problems created by an overabundance of data
- Understanding the rationale for clustering and principal components analysis
- Using Excel to extract principal components
- Using R to extract principal components
- Using R for cluster analysis
- Using Excel for cluster analysis
- Setting up confusion tables in Excel
- Using cluster analysis and factor analysis in concert