From the course: NLP with Python for Machine Learning Essential Training

Introducing stemming

- [Instructor] In the last chapter, we learned the basics of preparing our text to build the model. We learned how to remove punctuation, tokenize, and remove stop words to provide a clean list of words to Python. In this chapter, we're going to learn how to take our cleaning one step further, to provide the model with better information for classifying the text. We'll introduce the concepts of stemming and lemmatizing in this chapter. Let's start with stemming. So what is stemming? The formal definition of stemming is the process of reducing inflected or derived words to their word stem or root. More simply put, stemming often means crudely chopping off the end of a word to leave only the base. So this means taking words with various suffixes and condensing them under the same root word. Recall that when we removed stop words, it was to reduce the number of words Python has to look at or consider. Stemming is shooting for the same goal by reducing variations of the…
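To make the idea of "crudely chopping off the end of a word" concrete, here is a minimal sketch of a toy suffix stripper. It is not the stemmer used in the course and not the Porter algorithm; the function name `crude_stem`, the suffix list, and the minimum stem length are all hypothetical choices made for illustration.

```python
# Toy suffix-stripping stemmer: an illustrative sketch only, not a real stemming algorithm.
# The suffix list and minimum stem length below are assumptions chosen for this example.
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order


def crude_stem(word, min_stem_len=3):
    """Chop the first matching suffix off a word, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word  # no suffix matched, or the word is too short to chop


words = ["grows", "growing", "grow", "jumped", "jumps"]
stems = [crude_stem(w) for w in words]

# Five different tokens collapse to two distinct stems.
unique_stems = sorted(set(stems))
print(stems)
print(unique_stems)
```

Here five tokens condense to just two roots, which is the reduction in vocabulary that stemming aims for. The approach is deliberately crude: `crude_stem("running")` returns `"runn"` rather than `"run"`, which is the kind of rough edge that real stemmers handle with additional rules.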
