In this video, analyze a corpus object to understand its key metrics.
- [Instructor] The NLTK library provides a number of functions to analyze the distribution of data as well as aggregate data in the corpus. First, we use the frequency distribution method to understand the distribution of words in the corpus. This helps us identify the most popular words in the corpus. We then print the top 10 words in the corpus. This is a list of tuples. It prints out each word and the number of times it occurs in the corpus. We can also look up a specific word to see its frequency in the corpus. We use the get method to get the frequency for the word spark. In the output, you see the top 10 words listed as tuples, each containing the word and the number of times it occurs in the corpus. We also see that the word spark occurs three times.
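The steps described in the transcript can be sketched roughly as follows. This is a minimal illustration, not the course's actual exercise file: the `tokens` list is a made-up stand-in for the words of the corpus (in the course they would come from the corpus object), and the variable names are assumptions. NLTK's `FreqDist` is a subclass of `collections.Counter`, so the sketch falls back to `Counter` if NLTK is not installed.

```python
# Frequency distribution sketch. FreqDist subclasses collections.Counter,
# so most_common() and get() behave the same way on both.
try:
    from nltk import FreqDist
except ImportError:  # fallback so the sketch still runs without NLTK
    from collections import Counter as FreqDist

# Hypothetical sample text standing in for the words of the corpus.
tokens = "spark is fast and spark is general and spark runs everywhere".split()

# Build the frequency distribution over all words.
freq_dist = FreqDist(tokens)

# Top 10 words as a list of (word, count) tuples, most frequent first.
top_words = freq_dist.most_common(10)
print("Top 10 words:", top_words)

# Look up the count for one specific word.
spark_count = freq_dist.get("spark")
print("Frequency of 'spark':", spark_count)  # 3 for this sample text
```

With a real corpus, the word list passed to `FreqDist` would typically come from the corpus reader rather than from a hard-coded string.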
- Text mining today
- Reading text files using Python
- Cleansing text data
- Building n-gram databases for text predictions
- Preparing TF-IDF matrices for machine learning
- Scaling text processing for performance
Skill Level: Intermediate
1. Text Mining
2. Reading Text
3. Text Cleansing and Extraction
4. Advanced Text Processing
5. Best Practices