From the course: NLP with Python for Machine Learning Essential Training

Introducing lemmatizing

- [Instructor] In the previous lessons, we learned about stemming as a tool to reduce our corpus and correlate related words. In this lesson and the one to follow, we'll be learning about a related but slightly different tool called lemmatizing. So what is lemmatizing? The formal definition is that it's the process of grouping together the inflected forms of a word so they can be analyzed as a single term, identified by the word's lemma. The lemma is the canonical form of a set of words. For instance, type, typed, and typing would all be forms of the same lemma. More simply put, lemmatizing uses vocabulary analysis of words to remove inflectional endings and return the dictionary form of a word. So again, type, typed, and typing would all be simplified down to type, because that's the root of the word. Each variation carries the same meaning, just with a slightly different tense. So you might be thinking that sounds an awful lot like stemming, and you wouldn't be wrong. They…
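
To make the definition above concrete, here is a minimal sketch of the idea of grouping inflected forms under one lemma. It is not a real lemmatizer: `TOY_LEMMAS`, `toy_lemmatize`, and `group_by_lemma` are hypothetical names, and the hand-written lookup table is an assumption standing in for the vocabulary analysis a real lemmatizing tool would perform.

```python
from collections import defaultdict

# Hypothetical, hand-written lookup table (an assumption for illustration).
# A real lemmatizer consults a full vocabulary instead of three entries.
TOY_LEMMAS = {
    "type": "type",
    "typed": "type",
    "typing": "type",
}


def toy_lemmatize(word):
    """Return the dictionary form if the word is known, else the word unchanged."""
    return TOY_LEMMAS.get(word.lower(), word)


def group_by_lemma(words):
    """Group words so that all inflected forms sit under a single lemma."""
    groups = defaultdict(list)
    for word in words:
        groups[toy_lemmatize(word)].append(word)
    return dict(groups)


tokens = ["type", "typed", "typing", "keyboard"]
grouped = group_by_lemma(tokens)
print(grouped)
# {'type': ['type', 'typed', 'typing'], 'keyboard': ['keyboard']}
```

Words missing from the table pass through unchanged, which is why keyboard ends up in a group of its own. The sketch only shows the grouping behavior; it says nothing about how a production lemmatizer decides what the dictionary form is.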
