In this video, learn some best practices for processing text data.
- [Instructor] What are some of the key practices to consider while processing text? First, filter text data as early as possible in the process. Text data is heavy, and the earlier we make it lighter, the fewer resources the later stages consume. Use an exhaustive, context-specific stop-word list to eliminate stop-words. Stop-words do not carry any insights, so eliminating most of them is important for efficiency. Identify domain-specific data for special use. Examples of such strings would be product names that occur in text data. These special words indicate a specific purpose for the text and can be used to index and classify it. While building TF-IDF matrices, eliminate tokens that occur rarely. They are usually not useful in classification or analysis. Build a clean, indexed corpus based on the language and business context, and persist it for future use.
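As a rough illustration of the practices above, here is a minimal Python sketch that uses only the standard library. It is not the instructor's code. The stop-word list, the product names (`acmephone`, `acmetab`), the sample documents, and the `min_df` threshold are all made-up assumptions for the example. It filters stop-words at tokenization time, tags domain-specific terms for indexing, and drops tokens that appear in too few documents before computing a simple TF-IDF weight (term frequency times `log(N / df)`, one of several common variants).

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and product names, for illustration only.
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "it", "on", "my"}
DOMAIN_TERMS = {"acmephone", "acmetab"}


def tokenize(text, stop_words=STOP_WORDS):
    """Lowercase, split on non-alphanumerics, and drop stop-words early."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stop_words]


def domain_tags(tokens, domain_terms=DOMAIN_TERMS):
    """Return the domain-specific terms found in a document, for indexing."""
    return sorted(set(tokens) & domain_terms)


def tfidf(docs_tokens, min_df=2):
    """Return one {token: weight} dict per document.

    Tokens that appear in fewer than min_df documents are dropped
    before any weights are computed.
    """
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))  # count each token once per document
    vocab = {t for t, count in df.items() if count >= min_df}

    rows = []
    for tokens in docs_tokens:
        counts = Counter(t for t in tokens if t in vocab)
        total = sum(counts.values())
        rows.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return rows


# Made-up sample documents.
docs = [
    "The AcmePhone battery is great",
    "The battery of my AcmeTab drains fast",
    "AcmePhone screen cracked and it is slow",
]
docs_tokens = [tokenize(d) for d in docs]
tags = [domain_tags(t) for t in docs_tokens]
rows = tfidf(docs_tokens, min_df=2)
```

With these sample documents, only `acmephone` and `battery` appear in at least two documents, so every other token is pruned from the TF-IDF rows. In practice, a library vectorizer with a minimum document-frequency setting would likely replace the hand-written `tfidf` function, and the cleaned tokens and tags would be written to storage for reuse.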
- Text mining today
- Reading text files using Python
- Cleansing text data
- Build n-grams databases for text predictions
- Preparing TF-IDF matrices for machine learning
- Scaling text processing for performance
Skill Level Intermediate
1. Text Mining
2. Reading Text
3. Text Cleansing and Extraction
4. Advanced Text Processing
5. Best Practices