Learn about regularization.
- [Instructor] In our recommendation system, we take review data and, from there, extract user attributes and movie attributes to act as the model. Using that model, we can make recommendations. A common problem that can happen when building a model like this is called overfitting. Overfitting is when the model doesn't learn the overall pattern of the data, but instead picks up too much on specific data points. Let's explain with an example. Imagine that we have two movies. The first movie's a horror comedy. The second is a serious bloody horror movie with no comedy at all.
Both movies have horror elements, but some viewers might prefer the funny movie and other viewers might prefer the serious movie. A good recommendation system will be able to separate those two movies and see that, while both have some similar elements, they are very different movies that appeal to different audiences. A bad recommendation system that's overfitting would focus entirely on the horror attribute and disregard everything else. That system would make worse recommendations, because it would recommend both movies to anyone who likes horror. It would ignore the other attributes that make the two movies unique.
To prevent this, we use regularization. Regularization limits how much weight we will place on one single attribute when finding user or movie attributes with matrix factorization. The higher we set the regularization amount, the less weight we'll put on any single attribute. This helps us make sure that the horror comedy will be recognized for both its horror and comedy elements and not for only one of those elements. So far, we've used a regularization amount of 0.1 in the code. That works for a smaller data set, like we have here with less than 100 products. But, for a larger data set with thousands of products, you should use a larger value.
When you're building a recommendation system with your own data set, you'll want to experiment with different regularization values to see how it affects the quality of your recommendations. In the next section, we'll learn how to measure the performance of our model. Which will be helpful when experimenting with different regularization settings.
Recommendation systems are a key part of almost every modern consumer website. The systems help drive customer interaction and sales by helping customers discover products and services they might not ever find themselves. The course uses the free, open source tools Python 3.5, pandas, and numpy. By the end of the course, you'll be equipped to use machine learning yourself to solve recommendation problems. What you learn can then be directly applied to your own projects.
- Building a machine learning system
- Training a machine learning system
- Refining the accuracy of the machine learning system
- Evaluating the recommendations received