Join Lillian Pierson, P.E. for an in-depth discussion in this video Evaluating recommendation systems, part of Introduction to Python Recommendation Systems for Machine Learning.
- [Instructor] The last thing that I want to discuss for this course is Model Evaluation. To ascertain how reliable our models are we need to determine the quality of the predictions that they make. To do that in Python we can use scikit-learn's metrics module. Within this module you can find all sorts of functions for scoring your models and evaluating their predictive performance. Use these results to help you select the best model for your given situation. The first metric we'll look at is precision. Precision is a measure of a model's relevancy.
To represent it algebraically, you can think of precision as the number of items that I liked that were also recommended to me divided by the number of items that were recommended to me. Or, in other words, how relevant were the recommendations that were made? So, for example, if a system recommended eight items and four of those items were items that you like, then the system would have achieved 50% precision. Another important metric is recall.
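The precision arithmetic described here can be sketched in a few lines of plain Python. This is only an illustration: the `precision` helper and the `item_N` names are hypothetical and are not part of the course materials.

```python
def precision(recommended, liked):
    """Share of the recommended items that the user actually liked."""
    recommended, liked = set(recommended), set(liked)
    if not recommended:
        return 0.0  # nothing was recommended, so there is nothing to score
    return len(recommended & liked) / len(recommended)


# Hypothetical data: eight items recommended, four of them liked
recommended_items = [f"item_{i}" for i in range(1, 9)]
liked_items = [f"item_{i}" for i in range(1, 5)]

precision_value = precision(recommended_items, liked_items)
print(precision_value)  # 0.5
```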
Recall is a measure of the model's completeness. To represent it algebraically, you can think of recall as equal to the number of items that I liked that were also recommended to me divided by the number of items that I liked. In other words, how completely did the recommender system predict the items that I liked? As an example, if a system recommended eight out of ten items that you liked, then the system would have achieved 80% completeness, or 80% recall.
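The recall example works the same way, only the denominator changes to the number of liked items. Again, this is a minimal sketch with a hypothetical `recall` helper and made-up item names.

```python
def recall(recommended, liked):
    """Share of the liked items that were actually recommended."""
    recommended, liked = set(recommended), set(liked)
    if not liked:
        return 0.0  # the user liked nothing, so there is nothing to recover
    return len(recommended & liked) / len(liked)


# Hypothetical data: ten liked items, eight of which were recommended
liked_items = [f"item_{i}" for i in range(1, 11)]
recommended_items = [f"item_{i}" for i in range(1, 9)]

recall_value = recall(recommended_items, liked_items)
print(recall_value)  # 0.8
```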
Before going into the demo, let me explain its scenario. This demo is based on the same demonstration that you saw in the segment on classification-based collaborative filtering. As you may recall, we used logistic regression as a classifier there. The scenario is that you are a marketing data scientist for a bank and you need to decide whether one of your existing clients is a good candidate for a special term deposit offer that the bank is currently running. If the client is a good candidate then you will put his name on the list and a bank representative will reach out to him.
The client we considered was Sam. He is a single divorcee who works in management and is not in credit default. He doesn't have a home loan or any other personal loans. Let's move over to the Jupyter Notebook now. Now it's time to look at how to evaluate recommendation systems within Python and we're going to look at the precision and recall within this demonstration. We're going to bring in the demonstration that we did in 2.1 which was logistic regression as a classifier.
And then I'm just going to show you how to evaluate the performance of that model. So the first thing we need to do is import our libraries, so we'll say import numpy as np. Import pandas as pd. From pandas we want to import Series and DataFrame. And then from sklearn.linear_model we import LogisticRegression.
And from sklearn.metrics we import classification_report. And when we run this we have our necessary libraries. Now, as for the dataset, we're going to use the bank marketing dataset that I got from the UCI Machine Learning Repository. You saw this in 2.1 earlier. Just a reminder, I went ahead and transformed this dataset so that it has binary variables for use in the logistic regression model.
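Written out as a notebook cell, the imports described here might look like the following. This is a sketch that assumes NumPy, pandas, and scikit-learn are installed in the environment.

```python
# Numerical and data-handling libraries
import numpy as np
import pandas as pd
from pandas import Series, DataFrame

# The classifier and the evaluation report used in this demo
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
```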
And so the first thing we want to do is read the dataset in. Let's call it bank_full, and then we'll use the read_csv function and we'll pass in a string with the name of the dataset. This dataset comes with the download for the course. This is the transformed version of the dataset I got from UCI. So we'll say bank_full_w_dummy_vars.csv and then we'll look at the first five records by calling the head method.
And you can see here, here's our dataset. These are the original variables on the left here, and then on the right here are some of the dummy variables that I created to use in the model. So, now let's generate a description of the dataset. To do that we'll use the info method. So we'll say bank_full.info. We'll run this. And I just want to point out here that all of these from y_binary down to divorced, these are all dummy variables that I created.
This y_binary variable describes whether past users subscribed. We'll use this to build our model and predict whether new users will subscribe based on their user attributes. Just reminding you, I'm basically repeating the analysis that we created in section 2.1. So you can get a deeper explanation of that there, but I just want to point out how to evaluate the predictive performance of that model in this demonstration. So, we're going to create our dataset x is equal to, and we're going to select the variables we need for our model.
bank_full.ix. We'll use a special indexer and we're going to bring in the variables that are indexed at 18 through 36, and we only want the values so we'll say .values. Our target is y_binary because we want our model to predict whether a user will accept the marketing offer. So we'll say y is equal to bank_full. And then we're just going to use a special indexer to select the y_binary variable, which is located at index position 17.
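One caveat for anyone following along today: the `.ix` indexer used in the demo was removed in pandas 1.0, so position-based selection is now usually done with `.iloc`. The sketch below swaps in `.iloc` on a tiny hypothetical frame (the `toy` DataFrame, its values, and the `single` column are made up for illustration). Because `.iloc` slices exclude the end position, selecting positions 18 through 36 inclusive on the real dataset would presumably be written as `18:37`.

```python
import pandas as pd

# Hypothetical stand-in for bank_full: one original column, the target,
# and two dummy variables
toy = pd.DataFrame({
    "age": [30, 45, 52, 38],
    "y_binary": [0, 1, 0, 1],
    "single": [1, 0, 0, 1],
    "divorced": [0, 1, 0, 0],
})

# Predictors: positions 2 and 3 (the slice end, 4, is excluded)
X = toy.iloc[:, 2:4].values

# Target: the single column at position 1
y = toy.iloc[:, 1].values

print(X.shape)     # (4, 2)
print(y.tolist())  # [0, 1, 0, 1]
```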
We'll write that here and then .values. Then when we run this we've got our datasets. Now we need to build and train the model, so let's instantiate our logistic regression object. We'll say that it's called LogReg and we'll set it equal to the LogisticRegression function. And then we want to fit the model to the data, so we say LogReg.fit and we pass in our variables x and y.
Next we want to use the model to predict values of y, or y_pred. So we'll say y_pred is equal to LogReg.predict and we'll pass in our x dataset. And then when we run this we've got all of the variables we need to evaluate the performance of the model. To evaluate the performance we're going to use the classification_report function.
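The instantiate, fit, and predict steps can be sketched as follows. Since the course dataset is not reproduced here, the example assumes randomly generated binary features and a made-up target rule, both of which are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: 200 rows of five binary (dummy-style) features
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 5))

# Made-up target: 1 only when the first two features are both 1
y = X[:, 0] & X[:, 1]

# Instantiate the model, fit it, and predict on the same rows
LogReg = LogisticRegression()
LogReg.fit(X, y)
y_pred = LogReg.predict(X)

print(y_pred[:10])
```

As in the demo, the predictions here are made on the same rows the model was trained on, so any scores computed from them will tend to look better than they would on data the model has not seen.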
And we imported that from scikit-learn's metrics module. The classification_report function builds a text report that shows the main performance metrics for a classification algorithm. So, precision and recall. To generate the report we call the classification_report function on our y and y_pred values, and then we call the print function on this entire thing to print it out. So we'll say classification_report.
We pass in our y and our y_pred and then we want to print this whole thing, so we'll call the print function on it and then run it. And here we go. We have our results and now I'm going to take you back over to the other screen to discuss these results. So, what I've done is I've copied the results over to this screen so I can explain them better.
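Here is a minimal sketch of that call on its own. The `y` and `y_pred` lists are hypothetical labels chosen only to make the block self-contained; they are not the results from the bank dataset.

```python
from sklearn.metrics import classification_report

# Hypothetical true labels and predictions for ten clients
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

# Build the text report (true labels first, predictions second) and print it
report = classification_report(y, y_pred)
print(report)
```

The printed table lists precision, recall, f1-score, and support for each class, followed by summary rows.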
Now, as I mentioned earlier, precision is a measure of a model's relevancy and recall is a measure of the model's completeness. So we see we have a precision of 87 here, and what that means is that of all the offers that were made, 87% of them were made to users that liked them. This metric is an indicator of how precise the predictions were. When we look over at recall we see that we get an 89. And what that is really saying is of all the products that users liked, 89% of those products were offered to them.
So in other words, recall is expressing how complete the predictions were that the model made or model completeness. And that's it for finding precision and recall. Pretty simple, right?
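If you only need the two numbers rather than the full text report, scikit-learn's metrics module also provides `precision_score` and `recall_score`. The sketch below uses hypothetical labels where 1 means the client accepted the offer; the values are made up for illustration.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 1 = accepted the offer, 0 = did not
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# Five offers predicted, three of them correct
precision_value = precision_score(y_true, y_pred)

# Four clients actually accepted, three of them were found
recall_value = recall_score(y_true, y_pred)

print(precision_value)  # 0.6
print(recall_value)     # 0.75
```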