In this video, discover techniques for cleansing text in Python.
- [Instructor] Text data needs to undergo a series of cleansing steps before it's ready for analytics and machine learning. We will review some of these steps in this video. Text cleansing covers a number of general-purpose and specific cleansing activities. Text may need to be formatted and standardized. For example, dates inside text may need to be converted to a standard format. Language conversions may also need to be done. Punctuation marks like periods and commas also need to be removed, since they don't carry any insight value in a text corpus. Abbreviations need to be removed or converted to their full form. Case conversion may also be necessary to standardize text. Elements like hashtags, mentions, and URLs need to be cleaned up as well. In this example, we do two cleansing activities: removing punctuation and converting to lowercase. To remove punctuation, we use the Punkt package in NLTK. Using a lambda function, we run each token, …
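The two cleansing activities described in the transcript, removing punctuation and converting to lowercase, can be sketched roughly as follows. This is not the code from the video: the video uses the Punkt package in NLTK, while this sketch swaps in the standard library's `string.punctuation` so that it runs without extra downloads. The function names `remove_punctuation` and `to_lowercase` and the sample token list are hypothetical, and the input is assumed to be already tokenized.

```python
import string


def remove_punctuation(tokens):
    """Drop tokens made up entirely of punctuation characters.

    Assumption: a stand-in for the NLTK Punkt check used in the video;
    it relies on the ASCII set in string.punctuation.
    """
    return list(filter(
        lambda token: any(ch not in string.punctuation for ch in token),
        tokens,
    ))


def to_lowercase(tokens):
    """Standardize case by lowercasing every token."""
    return [token.lower() for token in tokens]


# Hypothetical, pre-tokenized sample input
tokens = ["Text", "cleansing", ",", "in", "Python", ".", "It's", "#NLP", "!"]

cleaned = to_lowercase(remove_punctuation(tokens))
print(cleaned)
```

Tokens that mix letters and punctuation, such as `It's` or `#NLP`, are kept here; only pure punctuation tokens are dropped. Hashtags, mentions, and URLs would need a separate cleanup step.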
- Interpret the relationship of documents inside a corpus.
- Distinguish between the different text processing capabilities that NLTK provides.
- Explain why text cleansing and extraction occur when processing text with Python.
- Apply advanced text processing steps to find and create TF-IDF and the TF-IDF array.
- Explain best practices when processing text with Python.
Skill Level Intermediate
Processing Text with R Essential Training with Kumaran Ponnambalam, 55m 57s, Intermediate
1. Text Mining
2. Reading Text
3. Text Cleansing and Extraction
4. Advanced Text Processing
5. Best Practices