From the course: NLP with Python for Machine Learning Essential Training

Using lemmatizing

- [Instructor] Now that we've learned what lemmatizing means, we're going to put it to use. We'll do this in two steps. First, we're going to test out the lemmatizer on specific words to understand how it works, and then we'll apply it to the SMS Spam Collection dataset to further clean it up. This is the same process that we saw in the stemming notebook. Just as with stemmers, there are a few different lemmatizers that handle words in slightly different ways. We're going to use the WordNet lemmatizer, which is probably the most popular one. WordNet is a collection of nouns, verbs, adjectives, and adverbs that are grouped together in sets of synonyms, each expressing a distinct concept. This lemmatizer runs off of this corpus of synonyms, so given a word, it will trace that word to its synonyms and then to the distinct concept that that group of words represents. You can read more about it at the WordNet website. We're going to import the nltk package…
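The first step described above, trying the lemmatizer on a few specific words, might look roughly like the sketch below. It assumes NLTK is installed and that the WordNet corpus can be downloaded. The helper names `build_wordnet_lemmatizer` and `lemmatize_words` are hypothetical and are not taken from the course notebook, which may set things up differently.

```python
from typing import Iterable, List


def build_wordnet_lemmatizer():
    """Return NLTK's WordNet lemmatizer (hypothetical helper, not from the course)."""
    # Imported here so the rest of the module can be loaded without NLTK.
    import nltk
    from nltk.stem import WordNetLemmatizer

    # The lemmatizer looks words up in the WordNet corpus, so fetch it if missing.
    nltk.download("wordnet", quiet=True)
    return WordNetLemmatizer()


def lemmatize_words(words: Iterable[str], lemmatizer) -> List[str]:
    """Lemmatize each word with any object that exposes a .lemmatize(word) method."""
    return [lemmatizer.lemmatize(word) for word in words]
```

Calling `lemmatize_words(["goose", "geese", "cactus", "cacti"], build_wordnet_lemmatizer())` is one way to compare how singular and plural forms are treated. The same function could later be applied to a list of tokens from each message in the dataset.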
