From the course: NLP with Python for Machine Learning Essential Training
Model selection: Data prep - Python Tutorial
- [Instructor] We've now gone through nearly the entire machine learning process. We've read in raw text, cleaned that text, created and transformed features during feature engineering, fit a simple model and evaluated it on a holdout test set, and tuned hyperparameters, evaluating each setting with GridSearchCV. Now we're going to cap it all off by comparing our best-performing models to select the very best one. But before we do that, I have to mention that we've been bending the rules a little with regard to our vectorizers. Vectorizers are like models: they need to be fit on a training set and then stored in order to transform the test set. In the context of a vectorizer, fitting on the training set basically means storing all of the words that appear in the training set. Then, when it transforms the test set, it will only create columns for the words that were in the training set. Any words that appear in the test set but not in the training set…
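To make the fit-then-transform idea concrete, here is a minimal sketch in plain Python. `ToyCountVectorizer`, the sample sentences, and the whitespace tokenization are all hypothetical and are not the course's code; the class only imitates the `fit` / `transform` split that scikit-learn vectorizers such as `CountVectorizer` expose. It records the training vocabulary during `fit`, then builds columns only for those words when transforming the test set.

```python
from collections import Counter


class ToyCountVectorizer:
    """Hypothetical, simplified stand-in for a count-based vectorizer."""

    def fit(self, documents):
        # "Fitting" only records the vocabulary seen in the training documents.
        words = sorted({word for doc in documents for word in doc.lower().split()})
        self.vocabulary_ = {word: index for index, word in enumerate(words)}
        return self

    def transform(self, documents):
        # One column per training word; words unseen during fit get no column.
        rows = []
        for doc in documents:
            counts = Counter(doc.lower().split())
            rows.append([counts.get(word, 0) for word in self.vocabulary_])
        return rows


# Made-up example data (an assumption for illustration only).
train_docs = ["free prize now", "call me now"]
test_docs = ["free lunch now"]

vectorizer = ToyCountVectorizer().fit(train_docs)  # fit on training data only
train_matrix = vectorizer.transform(train_docs)
test_matrix = vectorizer.transform(test_docs)      # reuse the stored vocabulary

print(list(vectorizer.vocabulary_))  # ['call', 'free', 'me', 'now', 'prize']
print(test_matrix)                   # [[0, 1, 0, 1, 0]]
```

In this sketch the test sentence contains "lunch", which never appeared in the training sentences, so it contributes nothing to the test matrix; the columns are fixed by the training vocabulary alone.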
Contents
- What is machine learning? (4m 2s)
- Cross-validation and evaluation metrics (7m 48s)
- Introducing random forest (3m 4s)
- Building a random forest model (8m 11s)
- Random forest with holdout test set (12m 2s)
- Random forest model with grid search (8m 48s)
- Evaluate random forest model performance (8m 44s)
- Introducing gradient boosting (4m 13s)
- Gradient-boosting grid search (9m 44s)
- Evaluate gradient-boosting model performance (9m 32s)
- Model selection: Data prep (8m 25s)
- Model selection: Results (9m 52s)