From the course: Text Analytics and Predictions with Python Essential Training

k-means optimization

- [Instructor] One of the challenges of using k-means clustering is determining the optimal number of clusters. A widely used technique for this is called the elbow method. We execute k-means clustering on the given dataset iteratively, for one to 15 clusters. For each cluster count, we find the sum of squared distances between each data point and the center of its assigned cluster. As the number of clusters goes up, the sum of squared distances goes down. Then we plot the sum of squared distances against the number of clusters. Let us now execute this code and look at the plot. This graph usually has an elbow shape. The cluster count where the elbow occurs, meaning the point after which adding more clusters gives only small further reductions, is taken as the optimal number of clusters. In this case the elbow occurs at three. Even though the elbow is not very pronounced, we can conclude that three is the optimal number of clusters for this given dataset.
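
The course code itself is not included in this transcript, so the following is only a minimal sketch of the elbow method as described, assuming scikit-learn is available. The synthetic dataset from `make_blobs`, the helper names `elbow_curve` and `plot_elbow`, and the `seed` value are illustrative assumptions, not the instructor's code. The sum of squared distances is read from the `inertia_` attribute of a fitted `KMeans` model.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def elbow_curve(X, max_k=15, seed=42):
    """Fit k-means for k = 1..max_k and collect the sum of squared distances.

    Hypothetical helper: returns (cluster counts, inertia values).
    """
    ks = list(range(1, max_k + 1))
    sse = []
    for k in ks:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        # inertia_ is the sum of squared distances from each sample
        # to the center of its assigned cluster.
        sse.append(model.inertia_)
    return ks, sse


def plot_elbow(ks, sse):
    """Plot the sum of squared distances against the number of clusters."""
    import matplotlib.pyplot as plt  # imported here so the curve can be computed without it

    plt.plot(ks, sse, marker="o")
    plt.xlabel("Number of clusters")
    plt.ylabel("Sum of squared distances")
    plt.title("Elbow method")
    plt.show()


# Assumed stand-in data: 300 synthetic points around 3 centers.
# In the course, X would be the feature matrix of the dataset being clustered.
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)
ks, sse = elbow_curve(X)
```

Calling `plot_elbow(ks, sse)` draws the curve. The elbow is then read off by eye as the cluster count after which the curve flattens, which is a judgment call when the bend is gentle.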
