Join Keith McCormick for an in-depth discussion in this video, Lift and gains chart, part of Machine Learning & AI Foundations: Decision Trees.
- [Instructor] A great way to compare models visually is the evaluation node. I'm going to go ahead and hook that up, again using the wheel mouse, and it's worth noting that the evaluation node automatically recognizes the naming conventions that have been passed to it from the modeling nodes. So, I don't have to give this node a lot of instruction. Let's edit it to take a look. The one thing I am going to do is change the chart type. There are several, and I'm going to choose the lift chart type. Now, we already know that our model has told us which passengers are most likely to survive and which passengers are least likely to survive.
The lift chart's going to give us some sense of how much more likely: twice as likely, three times as likely, and so on. Let's run it and take a look. Now, on the left we have the results on the training data set, and on the right we have the results from the testing data set. The drop from the upper left-hand corner, a little bit lower, suggests at first that maybe we're not doing as well on the testing data set, but we're actually doing okay, because if you take a closer look, we're going from a lift of about 2.5 to a lift of about 2.2, so it's close enough that we're okay.
So, let's define the axes. Along the bottom, we're basically arranging passengers from the most likely to survive, near the zero percentile, all the way over to the least likely to survive at the 100 mark. You can picture it in the following way: imagine that you have your tree, and you've got your leaf segments. One of those leaf segments is the most likely to survive. They're over here on the left-hand part of the chart.
The other end of the chart is those leaf segments that are the least likely to survive. What about the y axis? Well, the y axis is the lift. So, if we take a look at the 20% of passengers that the model ranks as most likely to survive, and we follow up to here, we find out that this top 20% is more than twice as likely to survive as the general population on the ship. And if we move over to the 40th percentile, they're almost twice as likely to survive as well.
Their lift is 1.83. Notice that it starts to drop. Well, of course, that's not a coincidence, because only about a third of passengers survived. So, once we get to the 40th percentile, we've identified most of the passengers that are going to survive. So, what does the lift chart tell us? It tells us a couple of things in this instance. The first thing that we notice is that, other than this odd little twist in the blue line, the maroon line and the blue line are similar.
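As a rough check on why the lift has to fall, here is the arithmetic under two assumptions that are not stated in the video: the chart plots cumulative lift, and the overall survival rate p is taken as exactly one third. Here q is the fraction of passengers selected from the top of the ranking, and r(q) is the survival rate within that selection.

```latex
\[
\text{lift}(q) = \frac{r(q)}{p}, \qquad
r(q) \le \min\!\left(1, \frac{p}{q}\right)
\;\Longrightarrow\;
\text{lift}(q) \le \min\!\left(\frac{1}{p}, \frac{1}{q}\right)
\]
\[
p = \tfrac{1}{3}: \qquad
\text{lift}(0.2) \le 3, \qquad
\text{lift}(0.4) \le 2.5, \qquad
\text{lift}(1) = 1
\]
\[
\text{share of all survivors captured at } q = 0.4: \qquad
q \cdot \text{lift}(q) = 0.4 \times 1.83 \approx 0.73
\]
```

So, under those assumptions, a lift of 1.83 at the 40th percentile corresponds to roughly 73% of all survivors, which fits the remark that most of them have been identified by that point.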
Now, remember, the CHAID was first in the stream, so R is the CHAID, and the maroon line, R1, is the C&RT. For the most part they match each other, so the models were agreeing most of the time. That's one of the things that we learn. The other thing we learn from the lift chart is we get some indication, again, of how much greater the survival rate is for the top 20%. So, imagine now that we're going to use things like lift charts and analysis nodes in lots of different application areas. We're not always going to be talking about the Titanic. We could be talking about credit card fraud, or we could be talking about the likelihood that a cellphone customer is going to churn. So, when we go to tell management that the top 20 percent of our customers are two and a half times as likely, or four times as likely, to churn, that's quantifying that in an important way.
So, along with the analysis node, the lift chart is a powerful way of comparing our models.
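For anyone who wants to see the numbers behind a chart like this outside of Modeler, here is a minimal Python sketch of cumulative lift at a given depth. It is not what the evaluation node runs internally. The function name `cumulative_lift`, the two score lists, and the ten toy records are all made up for illustration and are not the Titanic data.

```python
from typing import Sequence


def cumulative_lift(outcomes: Sequence[int], scores: Sequence[float], depth: float) -> float:
    """Lift of the top `depth` fraction of records, ranked by descending score.

    outcomes: 1 if the event happened (for example, survived), 0 otherwise.
    scores:   model confidence that the event happens, one per record.
    depth:    fraction of records to keep from the top of the ranking, in (0, 1].
    """
    if len(outcomes) != len(scores) or len(outcomes) == 0:
        raise ValueError("outcomes and scores must be non-empty and the same length")
    if not 0 < depth <= 1:
        raise ValueError("depth must be in (0, 1]")

    base_rate = sum(outcomes) / len(outcomes)
    if base_rate == 0:
        raise ValueError("lift is undefined when no record has the event")

    # Rank records from most likely to least likely, as on the chart's x axis.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(depth * len(ranked)))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n

    # Lift is the event rate in the selected group relative to the overall rate.
    return top_rate / base_rate


# Hypothetical toy data: 10 records, 3 events, scored by two hypothetical models.
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
scores_b = [0.9, 0.6, 0.8, 0.7, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

for depth in (0.2, 0.4, 1.0):
    lift_a = cumulative_lift(outcomes, scores_a, depth)
    lift_b = cumulative_lift(outcomes, scores_b, depth)
    print(f"top {depth:.0%}: model A lift = {lift_a:.2f}, model B lift = {lift_b:.2f}")
```

Running the same function over the training and testing partitions, and over each model's scores, gives the same kind of side-by-side comparison the chart shows. At a depth of 100% the lift is always 1, because the selected group is the whole population.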
- Using the SPSS Modeler
- Building a CHAID model
- Adding a second model with C&RT
- Analysis nodes
- Using a lift and gains chart
- Exploring algorithms
- Building a tree interactively
- The Bonferroni adjustment
- Handling nominal, ordinal, and continuous variables
- Examining the CHAID tree
- The Gini coefficient
- Weighing purity and balance
- Understanding pruning
- Examining the C&RT tree
- Applying stopping rules
- Using the Auto Classifier tuning trick