Join Keith McCormick for an in-depth discussion in this video, Rule sets, part of Machine Learning & AI: Advanced Decision Trees.
- [Narrator] One of the nice features that Quinlan has built into C5.0 is the ability to display your results in the form of a rule set instead of a tree. The motivation is that trees can get quite complex. Let's zoom out a bit and take a look. We can see that there's a lot going on in this tree. It's going to be somewhat easier to look at the entire result in the form of a rule set. Let's go back into the C5.0 settings and request that.
We can see that for output type in Modeler we can choose rule set, and this should be available to you in any C5.0 implementation. We'll click on Run. Let's take a look at the resulting model. I'll maximize this and click on All so that we can see all the rules. By "Rules for 1," it means rules for survived, because the survivors are coded 1, and we see several of them; there are six.
The first rule for survivors is: if male and under 13, and sibling/spouse (now remember, that's the number of siblings and spouses traveling together, but they're less than or equal to 13, so that's obviously a sibling), then 1, meaning that group is expected to survive. We can add additional detail by clicking on the rule set symbol, and we can see the count as well as the confidence of that rule. Rule 2, for instance: if passenger class is first class, age is less than or equal to 36, and parent/child is greater than one, then they're also expected to survive.
If we scroll down, we can see the five rules for not surviving. The first is: if passenger class equals three, age is greater than 30, and sibling/spouse is zero (probably a bachelor traveling alone), then they're not expected to survive. That rule has a sample size of 66 with a confidence of about 88%. There are quite a few rules here, but all of them can be seen on a page or a little more of material, as opposed to the large, complicated tree.
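To make the count and confidence figures concrete, here is a minimal Python sketch of how a single rule's coverage and confidence could be computed. The `Rule` class, the `rule_stats` helper, the column names, and the toy passenger rows are all hypothetical and are not taken from the video or from Modeler. The Laplace ratio shown is the one Quinlan's C5.0 documentation describes for rule sets; whether Modeler reports exactly that figure is an assumption here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]


@dataclass
class Rule:
    """A rule is a list of conditions that must all hold, plus a predicted class."""
    conditions: List[Callable[[Row], bool]]
    predicted: int

    def covers(self, row: Row) -> bool:
        return all(cond(row) for cond in self.conditions)


def rule_stats(rule: Rule, rows: List[Row], target: str = "survived") -> Tuple[int, float, float]:
    """Return (count, raw accuracy, Laplace ratio) for the rows the rule covers."""
    covered = [r for r in rows if rule.covers(r)]
    n = len(covered)
    correct = sum(1 for r in covered if r[target] == rule.predicted)
    raw = correct / n if n else 0.0
    # Laplace ratio: (correct + 1) / (n + 2), which pulls small samples toward 0.5.
    laplace = (correct + 1) / (n + 2)
    return n, raw, laplace


# The first "not survived" rule from the narration, written against
# hypothetical column names: pclass == 3, age > 30, sibsp == 0 -> 0.
rule = Rule(
    conditions=[
        lambda r: r["pclass"] == 3,
        lambda r: r["age"] > 30,
        lambda r: r["sibsp"] == 0,
    ],
    predicted=0,
)

# Made-up rows for illustration only; these are not the Titanic data.
rows = [
    {"pclass": 3, "age": 40, "sibsp": 0, "survived": 0},
    {"pclass": 3, "age": 35, "sibsp": 0, "survived": 0},
    {"pclass": 3, "age": 50, "sibsp": 0, "survived": 1},
    {"pclass": 3, "age": 31, "sibsp": 0, "survived": 0},
    {"pclass": 1, "age": 45, "sibsp": 0, "survived": 1},
    {"pclass": 3, "age": 25, "sibsp": 0, "survived": 0},
    {"pclass": 3, "age": 40, "sibsp": 1, "survived": 0},
]

count, raw, laplace = rule_stats(rule, rows)
print(f"count={count} raw={raw:.2f} laplace={laplace:.2f}")
```

On the toy rows the rule covers four passengers and gets three right, so the raw accuracy and the Laplace ratio differ noticeably; with a larger count, such as the 66 in the narration, the two would be much closer.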
The tree has its place; it's a powerful diagram. But rule sets are a powerful way to get all that information in a more compact format.
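One way to see the relationship between a tree and a rule list is to read each root-to-leaf path as an if-then rule. The sketch below does only that, on a tiny hypothetical tree (the nested-dict layout, the `tree_to_rules` name, and the splits are made up for illustration and are not the tree from the video). It is not C5.0's procedure: C5.0 also simplifies and drops conditions and rules, so its rule sets are generally not one rule per leaf.

```python
def tree_to_rules(node, path=()):
    """Collect one (conditions, outcome) pair per leaf by walking every path."""
    if "predict" in node:
        return [(list(path), node["predict"])]
    rules = []
    for condition, child in node["branches"]:
        rules.extend(tree_to_rules(child, path + (condition,)))
    return rules


# Hypothetical toy tree: internal nodes hold labeled branches, leaves hold a class.
toy_tree = {
    "branches": [
        ("sex = male", {
            "branches": [
                ("age <= 13", {"predict": 1}),
                ("age > 13", {"predict": 0}),
            ]
        }),
        ("sex = female", {"predict": 1}),
    ]
}

rules = tree_to_rules(toy_tree)
for conditions, outcome in rules:
    print("if " + " and ".join(conditions) + f" then {outcome}")
```

Listing the paths this way gives the flat, page-readable format the narration describes, at the cost of repeating shared conditions that the tree draws only once.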
- Understanding QUEST functions and applications
- C5.0 concepts and practical applications
- Understanding information gain
- Random forests
- Boosting and bagging
- Costs and priors