Learn how to use the machine learning model to make recommendations.
- [Voiceover] Let's talk about the best way to put our recommendations system to use in the real world. Let's open up train recommender dot py. This file contains the code to factor our review data set. We read the data set using the read csv function, then we create the ratings matrix using the pivot table function, factor the matrix to create U and M, and then multiply U and M to get the predicted ratings. Since our movie review data set is fairly small, this process runs pretty quickly. But with a larger data set, the factoring process can take several minutes or even hours to run.
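As a rough sketch of those four steps, the snippet below builds a tiny ratings matrix and factors it. The column names (user_id, movie_id, value), the toy reviews, and the use of NumPy's truncated SVD on a mean-filled matrix are all assumptions made for illustration; the course's own script reads its reviews from a CSV file and may use a different factorization routine.

```python
import numpy as np
import pandas as pd

# Hypothetical toy reviews; the real script would load these with pd.read_csv.
reviews_df = pd.DataFrame({
    "user_id":  [1, 1, 2, 2, 3, 3],
    "movie_id": [10, 20, 10, 30, 20, 30],
    "value":    [5, 3, 4, 2, 1, 5],
})

# Rows are users, columns are movies, cells are ratings (NaN where unrated).
ratings_df = pd.pivot_table(
    reviews_df, index="user_id", columns="movie_id", values="value", aggfunc="max"
)

# Stand-in factorization (an assumption, not the course's routine):
# fill each gap with that movie's mean rating, then keep the top singular vectors.
filled = ratings_df.fillna(ratings_df.mean()).to_numpy(dtype=float)
num_features = 2
left, singular_values, right = np.linalg.svd(filled, full_matrices=False)
U = left[:, :num_features] * singular_values[:num_features]  # users x features
M = right[:num_features, :]                                  # features x movies

# Multiplying U and M gives a predicted rating for every user and movie pair.
predicted_ratings = U @ M
print(predicted_ratings.shape)
```

On a data set this small the factorization is instant; the cost grows with the number of users and movies, which is why the step is worth running ahead of time.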
We don't have to factor the matrix every time we want to make a user recommendation. Instead, it's a lot more practical to factor the matrix once and save the resulting model to files. Then, we can use those files later to make recommendations without needing to perform any slow calculations. Python provides a feature called Pickle for easily saving and loading data from files. So at the bottom here, we're going to use Python's Pickle dot dump function to write out the U matrix to a file called user features dot dat. Then, we'll write out the M matrix to product features dot dat.
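A minimal sketch of that saving step might look like the following. The exact file names (user_features.dat, product_features.dat) are an assumed spelling of the spoken names, the two small arrays are placeholders for the real U and M, and the temporary directory is only there to keep the example self-contained.

```python
import os
import pickle
import tempfile

import numpy as np

# Placeholder matrices standing in for the factored U and M.
U = np.array([[0.9, 0.1], [0.2, 0.8]])
M = np.array([[4.0, 1.0, 3.0], [1.0, 5.0, 2.0]])

# A temporary directory keeps the sketch from cluttering the working folder.
output_dir = tempfile.mkdtemp()
user_path = os.path.join(output_dir, "user_features.dat")
product_path = os.path.join(output_dir, "product_features.dat")

# Pickle files must be opened in binary mode ("wb" to write, "rb" to read).
with open(user_path, "wb") as f:
    pickle.dump(U, f)
with open(product_path, "wb") as f:
    pickle.dump(M, f)

# Reading a file back returns an array equal to the one that was saved.
with open(user_path, "rb") as f:
    U_loaded = pickle.load(f)
```

The same dump and load pattern works for any other array the training script produces.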
The M matrix is useful to save in case we want to calculate product similarity. We'll also write out the predicted ratings to a file called predicted ratings dot dat. This is the data file we'll need to make recommendations. Let's run the script to generate these files. Right click, choose run, great, the files were generated. In the real world, you would want to set up the script to run automatically on a regular basis. That way, as users write new reviews, you're feeding new data into your recommendations system regularly. Now, let's switch over to make recommendations from data files dot py.
This file contains just the logic to recommend movies using the data files we just created. First, we'll use Pickle to reload the U, M, and predicted ratings files. We'll also load the list of movie titles from movies dot csv into the movies df dataframe, because we'll want to have access to the movie titles. Then, we'll ask for a user ID, and print out the movies with the highest predicted rating for the user. Let's run the program, right click, choose run. Notice that when I type in the user ID and hit enter, the recommendations are nearly instantaneous.
That's because all of the hard work has already been done, and saved in the data files. It's important to separate the slow step of model generation from the faster step of making recommendations, so that the users don't have to wait for recommendations.
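The fast lookup step described above can be sketched roughly as follows. The small arrays stand in for what would be loaded from the saved files and from movies dot csv. The recommend helper, the title column, and the assumption that user IDs start at 1 and match the row order of the matrix are all hypothetical, chosen only to illustrate the idea.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the pickled predicted ratings and the movie list.
predicted_ratings = np.array([
    [4.8, 2.1, 3.9],
    [1.5, 4.4, 4.9],
])
movies_df = pd.DataFrame(
    {"title": ["Movie A", "Movie B", "Movie C"]},
    index=pd.Index([1, 2, 3], name="movie_id"),
)

def recommend(user_id, top_n=2):
    """Return the top_n movies with the highest predicted rating for user_id."""
    # Assumes user IDs start at 1 and follow the row order of the matrix.
    user_ratings = predicted_ratings[user_id - 1]
    ranked = movies_df.assign(rating=user_ratings)
    return ranked.sort_values(by="rating", ascending=False).head(top_n)

top_movies = recommend(user_id=2)
print(top_movies[["title", "rating"]])
```

Because the lookup is only an array index and a sort, it stays fast no matter how long the original factoring took.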
Recommendation systems are a key part of almost every modern consumer website. The systems help drive customer interaction and sales by helping customers discover products and services they might not ever find themselves. The course uses the free, open-source tools Python 3.5, pandas, and NumPy. By the end of the course, you'll be equipped to use machine learning yourself to solve recommendation problems. What you learn can then be directly applied to your own projects.
- Building a machine learning system
- Training a machine learning system
- Refining the accuracy of the machine learning system
- Evaluating the recommendations received