In this video, learn how to deconstruct model error into bias and variance, and how to find the trade-off between them that minimizes total error.
- [Narrator] In the last few lessons, we've talked about bias, variance, underfitting, and overfitting. In this lesson, we're going to put all of these concepts together to explain what it means to find the optimal trade-off that minimizes total error. So let's revisit this plot one more time. It highlights that underfitting is on the left, overfitting is on the right, and it calls out that there is some optimal model complexity right in the middle where we have an ideal trade-off between bias and variance that achieves minimum total error.
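For reference (this formula isn't shown in the video), the standard bias-variance decomposition under squared-error loss makes that trade-off precise: the expected total error at a point splits into squared bias, variance, and irreducible noise.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\Big(\operatorname{Bias}\big[\hat{f}(x)\big]\Big)^2}_{\text{falls with complexity}}
  \;+\; \underbrace{\operatorname{Var}\big[\hat{f}(x)\big]}_{\text{rises with complexity}}
  \;+\; \underbrace{\sigma^2}_{\text{irreducible}}
```

As complexity grows, the bias term shrinks while the variance term grows, which is exactly why the total error curve is U-shaped with a minimum in the middle.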
That's what we're trying to find. Now let's look at this in a slightly different way, just along a scale of complexity. At one end, we have a very simple model. The simple model will underfit our data and won't learn the true pattern, and based on the plot in the prior slide, we know this means high bias and low variance. At the other extreme, we have an overly complex model. This model is overfitting, essentially just memorizing the training set, and we know this means low bias and high variance.
Now the goal is to find something in the middle, a model with some medium level of complexity. It would learn the true pattern in the data, with a curved decision boundary, but without memorizing every example in the training data, and it would have low (not minimal, but low) bias and low variance. So again, overlaying this with the other plot: on the left side we have low complexity and underfitting, on the right we have high complexity and overfitting, and in the middle we have some optimal complexity with the right decision boundary, giving low bias, low variance, and low total error.
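Here is a minimal sketch of that spectrum, assuming scikit-learn and using polynomial degree as the complexity dial. The dataset, degrees, and seed are illustrative picks, not from the video: degree 1 underfits, a middling degree captures the underlying curve, and a very high degree effectively memorizes the 30 training points.

```python
# Sketch: three models of increasing complexity fit to the same noisy data.
# Degrees 1, 4, and 15 stand in for underfitting, a good fit, and overfitting.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(30, 1))                      # 30 training points
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=30)  # noisy sine

for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    # Training R^2 only: it keeps improving with degree, even past the
    # point where the model has started fitting the noise.
    print(f"degree={degree:2d}  training R^2={model.score(X, y):.3f}")
```

Running this, the training score climbs toward 1.0 as the degree grows; only held-out data, which the next plot brings in, reveals that the high-degree fit is actually worse.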
Now let's summarize this in table form. The table captures everything I just said, and keep in mind this is all relative. So the right model might actually be quite complex, but it's not as complex as the overfit model. Now up to this point, we've been talking entirely about the attributes of an underfit or overfit model, but we haven't really talked about how to identify them. Sure, high variance means an overfit model, but how do we detect high variance in practice? The next plot illustrates how.
It shows training and test error across different levels of complexity. On the left side of the plot we have an underfit, simple model; in the middle is a medium-complexity, good model; and on the right we have an overly complex, overfit model. What the plot shows is that an underfit model has high training error and high test error, a good model has low training error and low test error, and an overfit model still has low training error, but it has gotten to the point of memorizing the training set, so its performance on unseen examples in the test set really suffers.
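Here's a small sketch of how you might generate those curves yourself, assuming scikit-learn and using tree depth as the complexity axis. The synthetic dataset and depth grid are assumptions for illustration; exact values will vary.

```python
# Sketch: train vs. test error as model complexity (tree depth) increases.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (1, 3, 5, 10, 20):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    train_err = 1 - model.score(X_train, y_train)  # error = 1 - accuracy
    test_err = 1 - model.score(X_test, y_test)
    print(f"depth={depth:2d}  train error={train_err:.3f}  test error={test_err:.3f}")
```

Typically, training error falls monotonically toward zero as depth grows, while test error falls and then flattens or rises again at the largest depths, which is the U-shape from the plot.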
So lastly, here are the takeaways in table form. This is how you diagnose overfitting versus underfitting. Typically, you'll look at the test error first. If you see low test error, then generally that means you have a good model. If you have high test error, look at the training error to understand whether you're overfitting (low training error) or underfitting (high training error), and then you can work from there to improve your model.
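Those takeaways reduce to a simple decision rule. Below is a hypothetical helper sketching it; the `diagnose` name and the 0.1 threshold are arbitrary choices of mine, since what counts as "low" error depends entirely on your metric and problem.

```python
def diagnose(train_error: float, test_error: float, threshold: float = 0.1) -> str:
    """Rough fit diagnosis from train/test error (threshold is an arbitrary cutoff)."""
    if test_error <= threshold:
        return "good fit: low test error"
    if train_error <= threshold:
        return "overfitting: low train error but high test error"
    return "underfitting: high train and test error"

print(diagnose(train_error=0.02, test_error=0.30))  # -> overfitting
print(diagnose(train_error=0.28, test_error=0.31))  # -> underfitting
print(diagnose(train_error=0.04, test_error=0.06))  # -> good fit
```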