Data needs to be pre-processed before it can be used for clustering. Learn about the steps needed to pre-process text data.
- In this example, we will use data about courses and their hashtags, available in the file Course-Hashtags.csv under the "hashtags" directory. This contains a list of course items and the hashtags used in their course descriptions. Let's assume that we did prior pre-processing to extract these hashtags from the text. We will now use these hashtags to group courses into clusters. The code for this example is available in the file code_03_XX Clustering.R. We will use the tm library for data processing. First, we load the hashtag CSV file into a data frame. Let's run the code and review the contents. The course name and the hashtags are shown. We will use the hashtags to cluster courses into similar groups. We load these hashtags from the data frame into a VCorpus in R. Then, we replace the commas in the hashtag list with spaces, using the content transformer capabilities available in the tm_map function. We then inspect the hashtags to see how the cleansing happened.
- Creating a word cloud
- Analyzing sentiment
- Extracting emotions from text
- Clustering similar entities based on text
- Using classification for supervised learning
- Recommending items to users based on text data analytics
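The pre-processing steps described in the transcript (load the hashtags into a data frame, build a VCorpus, replace commas with spaces through `tm_map`, then inspect the result) could look roughly like the sketch below. This is not the course's own code file. The column names `Course` and `HashTags` and the three sample rows are hypothetical stand-ins, since the transcript does not show the actual layout of Course-Hashtags.csv.

```r
library(tm)

# Hypothetical stand-in for the course data. The real column names and
# rows in Course-Hashtags.csv are not given in the transcript, so
# "Course", "HashTags" and these three rows are assumptions.
course_hashtags <- data.frame(
  Course   = c("Course A", "Course B", "Course C"),
  HashTags = c("R,Statistics,DataScience", "Java,Programming", "R,Programming"),
  stringsAsFactors = FALSE
)
# With the real file, a call along these lines would replace the block above
# (path assumed from the transcript's description):
# course_hashtags <- read.csv("hashtags/Course-Hashtags.csv",
#                             stringsAsFactors = FALSE)

# Review the contents of the data frame.
print(course_hashtags)

# Load the hashtag strings into a volatile corpus, one document per course.
hashtag_corpus <- VCorpus(VectorSource(course_hashtags$HashTags))

# Wrap gsub() in content_transformer() so that tm_map() can apply it
# to the text content of every document in the corpus.
replace_commas <- content_transformer(function(x) gsub(",", " ", x, fixed = TRUE))
clean_corpus <- tm_map(hashtag_corpus, replace_commas)

# Inspect the first document, then collect all cleaned strings.
inspect(clean_corpus[[1]])
cleaned_text <- sapply(clean_corpus, as.character)
print(cleaned_text)
```

Replacing commas with spaces means each hashtag becomes a separate whitespace-delimited token, which is the form tm's later steps (such as building a document-term matrix) expect.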
Skill Level Intermediate
Predictive Customer Analytics with Kumaran Ponnambalam, 1h 37m, Intermediate
1. Word Cloud
2. Sentiment Analysis
5. Predictive Text