Understand the idea of recommendations based on user attributes.
- [Instructor] If we can fill in all the blanks in our user review matrix, we'll know how each user would rate each movie. Then, we can use that information to recommend movies that are highly rated. Let's learn how to complete this matrix by hand, assuming we have extra information about each user and each movie. This will teach us the basic idea that we'll use to calculate each user's interest in each movie. To understand how to predict a user's rating, let's think about how someone decides what rating to give a movie. Every human is unique. There's probably no way to completely understand the thought process that went into a certain rating.
So let's assume that a user's rating is a reflection of how much a particular movie appeals to that user's unique set of interests. This gives us a way to calculate a user's rating. First, we'll create a model of how much a movie would appeal to every possible interest. Then, we'll make a model of a user's specific interests. Finally, we can calculate the user's rating based on how well the user's interests match the movie. Let's start by modeling each movie's appeal. What are some good attributes that we can use to describe different types of movies? Let's start with these five: an action appeal rating, a drama appeal rating, a romance appeal rating, an arthouse appeal rating, and a crowd-pleaser appeal rating.
Now that we have a list of attributes, we can go through the movies in our database and give each movie a rating in these areas from negative five to five. Let's start with an imaginary movie called Attack on Earth. Let's assume it's an action and science fiction adventure movie, a typical summer blockbuster, so we'll give it these ratings: action five, drama negative two, romance zero, arthouse negative five, and crowd-pleaser appeal four. The second movie is a slow-moving drama about family issues, so we'll give it these ratings: action negative five, drama five, romance one, arthouse appeal four, and crowd-pleaser appeal negative five.
These ratings are subjective, but we'll do our best to assign ratings as consistently as possible across different movies. Now that we have scores for the movies, we need to model the user's interests. Let's use the same categories to model each user. We'll assign each user a score based on how strong their interests are in each of these categories. To get these scores, we can give each user a personality quiz. We'll use the results of the personality quiz to capture how much the user likes each of these characteristics to appear in movies. For example, if we ask a user, "How much do you like explosions?" And they respond, "Very much," we might give them a five for their preference for action.
This quiz will help us come up with an idea of how the user's preferences will map to these attributes. Let's assume that after giving the personality quiz we got these results: action five, drama negative two, romance one, arthouse negative five, and crowd-pleaser interest five. Let's see which movie is the best match for this user. For the first movie, let's multiply each of the user's ratings by the movie's rating for the same attribute. This will give us a score for each attribute. Then we'll add up those scores and get a total score of 74 points. Now, let's do the same thing for the second movie.
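Here is a minimal sketch of that multiply-and-add step in plain Python, using the ratings from the lesson for the user and for Attack on Earth. The function name `match_score` and the variable names are assumptions made for illustration; they are not part of the course code.

```python
# Attribute order: action, drama, romance, arthouse, crowd-pleaser.
user = [5, -2, 1, -5, 5]             # personality quiz results from the lesson
attack_on_earth = [5, -2, 0, -5, 4]  # appeal ratings for the first movie


def match_score(user_prefs, movie_appeal):
    """Multiply matching attributes, then add up the products.

    Hypothetical helper for illustration only.
    """
    per_attribute = [u * m for u, m in zip(user_prefs, movie_appeal)]
    return per_attribute, sum(per_attribute)


per_attribute, total = match_score(user, attack_on_earth)
print(per_attribute)  # [25, 4, 0, 25, 20]
print(total)          # 74
```

The same helper can be called with any other movie's five appeal ratings, as long as they are listed in the same attribute order.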
This time, when we add up these attribute scores, we get a total of negative 79 points. Here are the final results. The user's interests matched Attack on Earth with a score of 74 points. The second movie had a score of negative 79, so based on our simple attribute rating system, the user would strongly prefer movie number one over movie number two because it's a closer fit to their preferences. If we repeat this process for every user and every movie, we can work out how to estimate every rating in our review matrix. If you are familiar with linear algebra, you might recognize how we can represent this problem as matrix multiplication.
If not, don't worry, just follow along. In linear algebra terms, we're defining a user matrix called U that contains the user attributes as a row, here five, negative two, one, negative five, and five. Then, we're also defining a movie attribute matrix where each column contains the ratings for one movie. If we multiply these two matrices together, the result gives us a total rating for each movie. Here's the result: 74, negative 79. The math works out the same as before, but the advantage is that we can use NumPy to calculate this in one line of code since it's a standard matrix multiplication operation.
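As a rough sketch of that one-line calculation, the snippet below builds the two matrices from the numbers in the lesson and multiplies them with `np.matmul`. The name `U` comes from the lesson; the names `M` and `ratings` are assumptions chosen for this example.

```python
import numpy as np

# Attribute order: action, drama, romance, arthouse, crowd-pleaser.
# U has one row per user (here a single user) and one column per attribute.
U = np.array([[5, -2, 1, -5, 5]])

# M (assumed name) has one row per attribute and one column per movie:
# column 0 is Attack on Earth, column 1 is the family drama.
M = np.array([
    [ 5, -5],   # action
    [-2,  5],   # drama
    [ 0,  1],   # romance
    [-5,  4],   # arthouse
    [ 4, -5],   # crowd-pleaser
])

# The one-line calculation: (1 x 5) times (5 x 2) gives (1 x 2).
ratings = np.matmul(U, M)

print(ratings.tolist())  # [[74, -79]]
```

Because each row of `U` is one user, stacking more user rows into `U` would produce one row of estimated ratings per user from the same multiplication.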
NumPy is optimized to perform these calculations efficiently, which matters once there are many users and movies. In this video, we learned how to estimate how much a user will like a movie if we know the user's interests and how much the movie will appeal to those interests. The problem is that it's difficult to assign attribute ratings to lots of movies and users in a consistent way. In the next chapter, we'll learn how to automatically extract the interest attributes without having to manually assign them.
Recommendation systems are a key part of almost every modern consumer website. The systems help drive customer interaction and sales by helping customers discover products and services they might never find on their own. The course uses the free, open source tools Python 3.5, pandas, and NumPy. By the end of the course, you'll be equipped to use machine learning yourself to solve recommendation problems. What you learn can then be directly applied to your own projects.
- Building a machine learning system
- Training a machine learning system
- Refining the accuracy of the machine learning system
- Evaluating the recommendations received