In this video, learn what a corpus is and how it is used in text analytics.
- [Instructor] The next concept we need to look into is that of a corpus. The plural of corpus is corpora. In text mining, a corpus is a collection of documents. Documents inside a corpus are related to each other either by entity or by time period. For example, a corpus may contain all reviews for a given product in a month, or log files generated in a day by a software process, or all tweets by a Twitter user. For comparison, a corpus is the equivalent of a table in a database. What makes up a corpus may vary depending upon the specific use case. For example, all reviews by a user, all reviews for a product, or the global list of reviews in the system can all be examples of corpora. Text mining libraries work with a corpus, hence converting text data to a corpus and understanding its structure are important capabilities.
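The idea that the same documents can form different corpora depending on the use case can be sketched in a few lines of plain Python. This is only an illustration, not the course's own code: the review records, the field order, and the helper name `build_corpora` are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical review records: (user, product, review text).
reviews = [
    ("alice", "kettle", "Boils water quickly."),
    ("bob", "kettle", "Lid feels flimsy."),
    ("alice", "toaster", "Even browning every time."),
]


def build_corpora(records, key_index):
    """Group review texts into one corpus (list of documents) per key value."""
    corpora = defaultdict(list)
    for record in records:
        corpora[record[key_index]].append(record[2])
    return dict(corpora)


# Three ways to define a corpus over the same documents.
by_user = build_corpora(reviews, 0)             # all reviews by a user
by_product = build_corpora(reviews, 1)          # all reviews for a product
all_reviews = [text for _, _, text in reviews]  # global list of reviews

print(by_product["kettle"])
```

Each value in `by_user` and `by_product` is a corpus in the sense described above: a list of documents related by a shared entity. A text mining library would typically take such a list of documents as its input.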
- Text mining today
- Reading text files using Python
- Cleansing text data
- Build n-grams databases for text predictions
- Preparing TF-IDF matrices for machine learning
- Scaling text processing for performance
Skill Level Intermediate
1. Text Mining
2. Reading Text
3. Text Cleansing and Extraction
4. Advanced Text Processing
5. Best Practices