From the course: NLP with Python for Machine Learning Essential Training
Introducing stemming - Python Tutorial
Introducing stemming
- [Instructor] In the last chapter, we learned the basics of preparing our text to build the model. We learned how to remove punctuation, tokenize, and remove stop words to provide a clean list of words to Python. In this chapter, we're going to learn how to take our cleaning one step further, to provide the model with better information for classifying the text. We'll introduce the concepts of stemming and lemmatizing in this chapter. Let's start with stemming. So what is stemming? The formal definition of stemming is the process of reducing inflected or derived words to their word stem or root. More simply put, stemming often means crudely chopping off the end of a word to leave only the base. So this means taking words with various suffixes and condensing them under the same root word. Recall that when we removed stop words, it was to reduce the number of words Python has to look at or consider. Stemming is shooting for the same goal by reducing variations of the…
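To make the idea of "crudely chopping off the end of a word" concrete, here is a minimal sketch in plain Python. It is a toy illustration, not the stemmer used in the course and not a real algorithm such as Porter's: the suffix list, the `crude_stem` name, and the `min_stem_len` parameter are all assumptions made up for this example.

```python
# Toy suffix-stripping stemmer (illustrative only; real stemmers use
# more careful rule sets than this hypothetical suffix list).
SUFFIXES = ("ing", "ed", "es", "s")  # assumed list, longest suffixes first


def crude_stem(word, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when enough of the word is left to count as a stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


# Several surface forms collapse onto the same root.
tokens = ["grows", "growing", "grow", "jumped", "jumps"]
stems = [crude_stem(t) for t in tokens]
unique_stems = sorted(set(stems))

print(stems)
print(unique_stems)
```

Five distinct tokens go in and only two distinct stems come out, which is the reduction in vocabulary that stemming aims for. Because the rules are this crude, a stemmer of this kind will also mangle some words, so treat the sketch as a way to see the mechanism rather than something to use on real data.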