From the course: Text Analytics and Predictions with Python Essential Training


k-means clustering


- [Instructor] Let us use clustering to group the courses that we have already loaded. We will use the k-means clustering function available in the scikit-learn library. Let us try to create three clusters from this group of courses. We run the fit method to generate these clusters. Finally, we print each cluster label and the names of the courses grouped inside that cluster. For this, we iterate through the unique list of cluster values, which are zero, one, and two. Then we iterate through the hashtags DataFrame. If a record's cluster ID matches the cluster being printed, we print the course's title. Let's execute this code now. As you will notice from the groups, Java courses are mostly grouped into group zero. Group one has data science courses, and group two has Python courses. Clustering has automatically grouped them into similar buckets based on their hashtags. How do we determine the optimal number of clusters? Let's look at that next.
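
The code shown in the video is not part of this transcript, so the following is only a minimal sketch of the steps described above, not the instructor's actual notebook. The sample course data, the column names (`Title`, `HashTags`, `Cluster`), the TF-IDF vectorization step, and the `n_init` and `random_state` values are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample data standing in for the courses loaded earlier in the course.
courses = pd.DataFrame({
    "Title": [
        "Java Basics", "Advanced Java", "Java Web Services",
        "Intro to Data Science", "Statistics for Data Science", "Machine Learning Foundations",
        "Python Basics", "Python Scripting", "Python Web Apps",
    ],
    "HashTags": [
        "java programming oop", "java programming concurrency", "java programming web",
        "datascience statistics analytics", "datascience statistics probability",
        "datascience machinelearning analytics",
        "python programming basics", "python scripting automation", "python web flask",
    ],
})

# Assumption: hashtags are turned into numeric features with TF-IDF before clustering.
vectorizer = TfidfVectorizer()
hashtag_matrix = vectorizer.fit_transform(courses["HashTags"])

# Ask k-means for three clusters and run fit to generate them.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(hashtag_matrix)

# Store each course's cluster ID next to its title.
courses["Cluster"] = kmeans.labels_

# Iterate through the unique cluster values, then through the DataFrame,
# printing the title of every course whose cluster ID matches.
for cluster_id in sorted(courses["Cluster"].unique()):
    print(f"Cluster {cluster_id}:")
    for _, row in courses.iterrows():
        if row["Cluster"] == cluster_id:
            print("   ", row["Title"])
```

Cluster numbering is arbitrary in k-means, so which group ends up as zero, one, or two depends on the data and the random initialization; only the grouping itself is meaningful.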
