From the course: NLP with Python for Machine Learning Essential Training
Introducing lemmatizing - Python Tutorial
Introducing lemmatizing
- [Instructor] In the previous lessons, we learned about stemming as a tool to reduce our corpus and correlate related words. In this lesson and the one to follow, we'll be learning about a related but slightly different tool called lemmatizing.

So what is lemmatizing? The formal definition is that it's the process of grouping together the inflected forms of a word so they can be analyzed as a single term, identified by the word's lemma. The lemma is the canonical form of a set of words. For instance, type, typed, and typing would all be forms of the same lemma. More simply put, lemmatizing is using vocabulary analysis of words to remove inflectional endings and return the dictionary form of a word. So again, type, typed, and typing would all be simplified down to type, because that's the root of the word. Each variation carries the same meaning, just with a slightly different tense.

So you might be thinking that this sounds an awful lot like stemming, and you wouldn't be wrong. They…
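To make the definition above concrete, here is a minimal sketch of the idea of grouping inflected forms under a single lemma. It is not a real lemmatizer: the `LEMMA_TABLE` dictionary and the `lemmatize` and `group_by_lemma` helpers are hypothetical names invented for this illustration, and the table covers only the type, typed, typing example from the lesson. A real lemmatizer would draw on a full vocabulary resource rather than a hand-written lookup.

```python
from collections import defaultdict

# Hypothetical toy vocabulary: maps each inflected form to its lemma
# (dictionary form). Only the lesson's example words are covered.
LEMMA_TABLE = {
    "type": "type",
    "typed": "type",
    "typing": "type",
}


def lemmatize(word, table=LEMMA_TABLE):
    """Return the lemma for `word`, or the lowercased word if it is unknown."""
    key = word.lower()
    return table.get(key, key)


def group_by_lemma(words):
    """Group words so that all inflected forms share one lemma key."""
    groups = defaultdict(list)
    for word in words:
        groups[lemmatize(word)].append(word)
    return dict(groups)


print(group_by_lemma(["type", "typed", "typing"]))
# {'type': ['type', 'typed', 'typing']}
```

Words missing from the table fall through unchanged (apart from lowercasing), which is one simple way to handle vocabulary the lookup does not know about.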