Join Keith McCormick for an in-depth discussion in this video, The Auto Classifier tuning trick, part of Machine Learning & AI Foundations: Decision Trees.
- [Narrator] I'm about to show you a software trick in SPSS Modeler, but it's not the trick itself that we should focus on; it's the thought behind it. There's a stream that's already been prepared for us. It's called Autoclassifier Tuning Trick. The star of the show is the Auto Classifier in Modeler. Other software packages will have some option like it. Let's take a look inside. The idea is to be able to run all the different classification techniques.
Now, they're numerous. This isn't just trees; it also includes things like neural nets, support vector machines, and so on. I've already set it up to do two different kinds of CHAID and two different kinds of CART. And therein lies the trick: you can also choose different parameters. So for CHAID, I've chosen two levels of confidence, and for CART, I've chosen two thresholds for the minimum change in impurity. So let's take a look at how they did.
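To make the "minimum change in impurity" setting concrete, here is a minimal Python sketch of how such a threshold can decide whether a CART-style node gets split. It uses Gini impurity and one common definition of the impurity decrease; the class counts and the two threshold values are hypothetical and are not the values used in the stream, and the function names are made up for illustration.

```python
def gini(counts):
    """Gini impurity of a node, given the number of records in each class."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def impurity_decrease(parent, left, right):
    """Drop in Gini impurity when `parent` is split into `left` and `right`."""
    n = sum(parent)
    weighted_children = (
        (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
    )
    return gini(parent) - weighted_children


def should_split(decrease, threshold):
    """Split only if the improvement reaches the minimum change in impurity."""
    return decrease >= threshold


# Hypothetical node with 500 records per class, and a weak candidate split.
parent = [500, 500]
left, right = [260, 240], [240, 260]
decrease = impurity_decrease(parent, left, right)

# Two hypothetical thresholds, standing in for the two CART configurations.
for threshold in (0.0001, 0.001):
    decision = "split" if should_split(decrease, threshold) else "stop"
    print(f"threshold={threshold}: decrease={decrease:.4f} -> {decision}")
```

With these made-up counts, the looser threshold lets the tree keep growing while the stricter one stops it, which is why two settings of the same algorithm can produce two different models.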
I've already prepared the results, and we can see that they've been ranked by overall accuracy, although they could also be ranked by lift. The winner is the winner because it has the highest overall accuracy. However, it also has the best lift and the best area under the curve. You can rank on any of these three criteria, and they don't always agree. So what was it about the winner and its settings that caused it to be the best? There's a detailed printout here, but what we find is that the winner was a CHAID, as we saw in the previous screen.
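The point that the three criteria don't always agree can be illustrated with a small Python sketch. The labels and scores below are invented, "model_a" and "model_b" are hypothetical names, and the definitions of lift (response rate in the top-scored 20% divided by the overall rate) and accuracy (a 0.5 cutoff) are assumptions that may differ from how SPSS Modeler computes them.

```python
def accuracy(y_true, scores, cutoff=0.5):
    """Share of records whose thresholded score matches the true label."""
    hits = sum((s >= cutoff) == bool(y) for y, s in zip(y_true, scores))
    return hits / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve by pairwise comparison (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def lift(y_true, scores, top_fraction=0.2):
    """Response rate in the top-scored fraction divided by the overall rate."""
    ranked = sorted(zip(scores, y_true), reverse=True)
    k = max(1, int(round(top_fraction * len(ranked))))
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Invented data: ten records, four of them positive.
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
models = {
    "model_a": [0.80, 0.70, 0.60, 0.55, 0.90, 0.30, 0.25, 0.20, 0.15, 0.10],
    "model_b": [0.95, 0.90, 0.45, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10],
}

results = {
    name: {"accuracy": accuracy(y, s), "lift": lift(y, s), "auc": auc(y, s)}
    for name, s in models.items()
}

# Pick the winner separately under each ranking criterion.
winners = {
    criterion: max(results, key=lambda name: results[name][criterion])
    for criterion in ("accuracy", "lift", "auc")
}
print(winners)
```

On this invented data, one model wins on overall accuracy while the other wins on lift and area under the curve, so the choice of ranking criterion changes which model looks best.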
But, also, this is the one that's been set at 99% confidence. Now, in a real-world setting, you wouldn't just do two CHAID and two CART. You would try CHAID versus Exhaustive CHAID. You might try a whole variety of different things. Also keep in mind that on a real-world project, you would be trying the other classifiers. So over the years, I've had many projects where I have attempted 80 or 100 or more than 100 models and then compared them.
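As a rough sketch of how the number of candidate models grows once several settings are varied together, the Python snippet below expands a small grid of configurations. Every setting name and value here is hypothetical and chosen only for illustration; they are not SPSS Modeler option names, and the snippet only enumerates configurations without training anything.

```python
from itertools import product

# Hypothetical search space: setting names and values are illustrative only.
search_space = {
    "CHAID": {
        "method": ["CHAID", "Exhaustive CHAID"],
        "confidence": [0.95, 0.99],
        "max_depth": [4, 5, 6],
    },
    "CART": {
        "min_impurity_change": [0.0001, 0.001],
        "max_depth": [4, 5, 6],
        "prune": [True, False],
    },
}


def expand(space):
    """List every combination of settings for every algorithm in the space."""
    configs = []
    for algorithm, params in space.items():
        names = list(params)
        for values in product(*(params[name] for name in names)):
            configs.append({"algorithm": algorithm, **dict(zip(names, values))})
    return configs


candidates = expand(search_space)
print(f"{len(candidates)} candidate models to build and compare")
for config in candidates[:3]:
    print(config)
```

Even this small grid yields a couple of dozen candidates from two tree algorithms, so adding other classifiers and a few more settings reaches the larger counts quickly.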
So again, even if you have a trick like this available in the software that you might be using, the real key is the thought behind it, which is: be thorough, do due diligence on all these different settings, and really put some effort into trying to get the best model that you can.
- Using the SPSS Modeler
- Building a CHAID model
- Adding a second model with C&RT
- Analysis notes
- Using a lift and gains chart
- Exploring algorithms
- Building a tree interactively
- The Bonferroni adjustment
- Handling nominal, ordinal, and continuous variables
- Examining the CHAID tree
- The Gini coefficient
- Weighing purity and balance
- Understanding pruning
- Examining the C&RT tree
- Applying stopping rules
- Using the Auto Classifier tuning trick