Join Keith McCormick for an in-depth discussion in this video, ID3 and C4.5, part of Machine Learning: Advanced Decision Trees.
- [Narrator] As you explore and read about C5, you may come across two other algorithms, C4.5 and ID3. Let's take a moment to connect the dots between the three. John Ross Quinlan is the common denominator. He developed all three of them. Let's start with the first one. Back in the 80s, he developed something called the Iterative Dichotomiser 3, usually just called ID3. What's interesting is that you'll still encounter this.
You may find it in R and some other places, but there have been substantial improvements made over the years. Then he developed an algorithm called C4.5. What's interesting about this one is that if you read in this area, you'll probably encounter authors talking about C4.5 more often than they talk about C5. Why is that true? Well, back in 1993, he published a whole book explaining how C4.5 worked, but then, for many years, he licensed the improvements he made in C5, so the details weren't generally available, and that has only changed in the last couple of years.
We're going to be talking primarily about C5, but now you know a little bit about these other algorithms. If you can't find an implementation of C5 and can only find an implementation of C4.5, that's okay, because the similarities outweigh the differences. Briefly, a couple of the improvements made in C5 include the fact that it's faster, it offers boosting, and it offers something called winnowing. These are all topics we'll have a chance to revisit.
- Understanding QUEST functions and applications
- C5.0 concepts and practical applications
- Understanding information gain
- Random forests
- Boosting and bagging
- Costs and priors