Join Keith McCormick for an in-depth discussion in this video, Stopping rules in QUEST, part of Machine Learning & AI: Advanced Decision Trees.
- [Instructor] Let's talk about the Stopping Rules in QUEST. There are three. The first Stopping Rule is tree depth. This is literally how deep the tree can grow as it adds branches. The default setting in Modeler happens to be five. I often favor increasing that to as high as eight, which allows the other Stopping Rules, like statistical significance, to play a role in how much the tree grows. The second Stopping Rule is the parent and child limits. To put that in the context of our dataset of 715 records, with limits of 2% for the parent and 1% for the child, a node that has fewer than 14 records won't even be considered for a split, and any resulting split has to leave at least seven records in each child node.
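The 14 and 7 quoted for the 715-record dataset are consistent with taking 2% and 1% of the record count. Here is a minimal sketch of that arithmetic for a tool that only accepts absolute counts. The function name `percent_to_min_records` is hypothetical, and rounding down is an assumption; a given tool may round differently.

```python
import math


def percent_to_min_records(n_records: int, percent: float) -> int:
    """Convert a percentage-of-data limit into an absolute record count.

    Rounding down is an assumption made for this sketch.
    """
    return math.floor(n_records * percent / 100)


n_records = 715
min_parent = percent_to_min_records(n_records, 2.0)  # 2% parent limit
min_child = percent_to_min_records(n_records, 1.0)   # 1% child limit

print(f"minimum records in parent: {min_parent}")
print(f"minimum records in child:  {min_child}")
```

With 715 records this gives 14 for the parent and 7 for the child, matching the numbers in the discussion.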
Now, you have to know your tool of choice. Modeler happens to have this option of 1% and 2%, which I think is a good rule of thumb. If your tool of choice doesn't allow you to put in a percentage, you can calculate one or two percent of your record count and put in the appropriate value. The final Stopping Rule is statistical significance. The default setting is 0.05, meaning that we're operating at 95% confidence.
Now remember, in QUEST, whether the variable is categorical, ordinal, or scale, some kind of statistical test is being performed. So we're at 95% confidence, but you could tighten that to 99% confidence by making this 0.01, or you could relax the requirement to 90% confidence by making it 0.1. Tightening it would tend to shrink the tree, and relaxing it would tend to grow it.
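To see how the three Stopping Rules interact, here is a rough sketch of a single "may this node split?" check. It is not Modeler's or QUEST's actual implementation. The names `StoppingRules` and `may_split` are hypothetical, the root is assumed to sit at depth 0, and a split is assumed to require a p-value strictly below the significance level.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StoppingRules:
    max_depth: int = 5    # default tree depth mentioned in the discussion
    min_parent: int = 14  # about 2% of 715 records
    min_child: int = 7    # about 1% of 715 records
    alpha: float = 0.05   # significance level, i.e. 95% confidence


def may_split(rules: StoppingRules, depth: int, n_parent: int,
              child_sizes: Sequence[int], p_value: float) -> bool:
    """Return True only if none of the three stopping rules blocks the split."""
    if depth >= rules.max_depth:            # rule 1: tree depth
        return False
    if n_parent < rules.min_parent:         # rule 2a: parent too small
        return False
    if min(child_sizes) < rules.min_child:  # rule 2b: a child too small
        return False
    return p_value < rules.alpha            # rule 3: statistical significance


# The same candidate split under three significance settings.
for alpha in (0.01, 0.05, 0.1):
    rules = StoppingRules(alpha=alpha)
    allowed = may_split(rules, depth=2, n_parent=100,
                        child_sizes=[60, 40], p_value=0.03)
    print(f"alpha={alpha}: split allowed = {allowed}")
```

A candidate split with a p-value of 0.03 is allowed at 0.05 and 0.1 but rejected at 0.01, which is why a stricter setting tends to produce a smaller tree.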
- Understanding QUEST functions and applications
- C5.0 concepts and practical applications
- Understanding information gain
- Random forests
- Boosting and bagging
- Costs and priors