Understand how matrix completion is a good model for predicting user ratings.
- [Narrator] Let's take a look at create_review_matrix_as_csv.py. This code will generate a CSV representation of the review matrix that we can open in a spreadsheet. First we load the data using pandas' read_csv function. Then we use pandas' pivot_table function to create the review matrix. Finally, we'll use pandas' to_csv function to save the result as a CSV file. Let's run the code. Right-click and choose Run. Great. Now let's open that file in our spreadsheet application. I'm using Numbers, but any spreadsheet application should work.
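The three steps just described (load, pivot, save) can be sketched as follows. This is a minimal illustration, not the course's actual script: the column names `user_id`, `movie_id`, and `value` and the inline sample ratings are assumptions made up for the example, and the data is read from an in-memory string so the sketch runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical stand-in for the course's ratings file.
# The real file and its column names may differ.
raw_csv = io.StringIO(
    "user_id,movie_id,value\n"
    "1,1,5\n"
    "1,2,4\n"
    "2,1,3\n"
    "3,1,4\n"
    "3,2,4\n"
    "3,3,5\n"
)

# Step 1: load the raw ratings, one row per (user, movie, rating).
ratings_df = pd.read_csv(raw_csv)

# Step 2: pivot into a review matrix with one row per user and one
# column per movie. Cells with no review become NaN.
review_matrix = pd.pivot_table(
    ratings_df,
    index="user_id",
    columns="movie_id",
    values="value",
    aggfunc="max",
)

# Step 3: serialize as CSV. With no path argument, to_csv returns a string;
# passing a file name instead would write the file to disk.
csv_text = review_matrix.to_csv()
print(csv_text)
```

In the CSV output, unreviewed movies appear as empty fields, which is why they show up as blank cells in a spreadsheet.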
This is our review matrix. There's one row for each user and one column for each movie. Each number represents a review entered by a user. Blank spaces represent movies that the user has not yet reviewed. Imagine if we could figure out a way to fill in all the blank spaces based on the numbers we know. For example, let's look at user number three. We can see that user number three gave four stars to movie one and movie two, and five stars to movie number three. What if we could use the ratings we know and the ratings from other users to predict how this user would most likely rate movie number four? Once we know the rating that a user would give a movie, we know whether or not we should recommend that movie.
If we think this user would give movie number four a five-star rating, this is a movie we definitely want to recommend to that user. So in order to build a recommendation system, what we really need is an algorithm that helps us fill in all the blanks in the matrix based on the numbers we already know. If we can fill in every blank in the matrix with the rating the user would have given that movie, then we'll know everything we need to know to make a recommendation to every user.
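To make the idea of filling in the blanks concrete, here is a minimal sketch of one common approach: low-rank matrix factorization fitted by gradient descent on the known entries only. The narration above does not say which algorithm is used, so treat the technique, the made-up ratings, and the settings (`n_factors`, `lr`, `reg`, the number of steps) as assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 4-user x 4-movie review matrix; np.nan marks a blank cell.
# The third row mirrors the example above: 4, 4, and 5 stars, movie four unrated.
ratings = np.array([
    [5.0, 4.0, np.nan, 5.0],
    [4.0, np.nan, 5.0, 4.0],
    [4.0, 4.0, 5.0, np.nan],
    [np.nan, 3.0, 4.0, 4.0],
])
known = ~np.isnan(ratings)  # True where a real review exists


def masked_error(user_f, movie_f):
    """Rating minus prediction on known cells, 0 on blank cells."""
    return np.where(known, ratings - user_f @ movie_f.T, 0.0)


def rmse_on_known(user_f, movie_f):
    err = masked_error(user_f, movie_f)
    return float(np.sqrt((err ** 2).sum() / known.sum()))


rng = np.random.default_rng(0)
n_factors = 2  # assumed number of latent features per user and per movie
user_f = rng.normal(scale=0.1, size=(ratings.shape[0], n_factors))
movie_f = rng.normal(scale=0.1, size=(ratings.shape[1], n_factors))
initial_rmse = rmse_on_known(user_f, movie_f)

lr, reg = 0.01, 0.02  # assumed learning rate and L2 regularization strength
for _ in range(2000):
    err = masked_error(user_f, movie_f)
    # Gradient steps that reduce squared error on known cells only.
    user_step = err @ movie_f - reg * user_f
    movie_step = err.T @ user_f - reg * movie_f
    user_f += lr * user_step
    movie_f += lr * movie_step

final_rmse = rmse_on_known(user_f, movie_f)

# Keep real reviews as they are and fill only the blanks with predictions.
predicted = user_f @ movie_f.T
completed = np.where(known, ratings, predicted)

print(np.round(completed, 2))
print("Predicted rating, user 3 / movie 4:", round(float(completed[2, 3]), 2))
```

With every blank filled, recommending comes down to looking along a user's row for unreviewed movies with the highest predicted ratings.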
Recommendation systems are a key part of almost every modern consumer website. The systems help drive customer interaction and sales by helping customers discover products and services they might never find on their own. The course uses the free, open source tools Python 3.5, pandas, and NumPy. By the end of the course, you'll be equipped to use machine learning yourself to solve recommendation problems. What you learn can then be directly applied to your own projects.
- Building a machine learning system
- Training a machine learning system
- Refining the accuracy of the machine learning system
- Evaluating the recommendations received