Join Keith McCormick for an in-depth discussion in this video, Defining terms, part of Machine Learning and AI Foundations: Classification Modeling.
- [Narrator] There's a couple of terms that come up so frequently in the course I want to take a moment to define them. The first thing I want to talk about is level of measurement. You won't hear me use that exact phrase very often, but what it refers to is types of variables and the way it's typically described is that there's three. The first are nominal variables. These are separate and distinct categories that are not meaningfully ranked. Ordinal variables are separate and distinct categories that are meaningfully ranked, like freshman, sophomore, junior, senior, or low, medium, high risk.
For the most part, we're gonna be talking about nominal and ordinal collectively as categorical variables. Then there's scale variables. There's a lot of similar terms for scale. Sometimes folks call them continuous, sometimes folks call them interval. There's a slightly different term, ratio, that sometimes folks use. As far as we're concerned for this course, all of those are interchangeable.
Scale, interval, continuous, ratio for us are basically the same. Finally, although the gentleman, Stevens, who originally came up with this taxonomy didn't use this term, there's one that's gonna be really important for us. The notion of a binary. You might think of binaries by another name, a Boolean. But it's really just a categorical variable with exactly two categories. True, false, yes, no, on, off.
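As a rough illustration of the variable types described so far, here is a minimal Python sketch. The `Categorical` class, its `level` property, and the `eye_color` example are hypothetical names made up for this illustration; they are not part of the course or of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Categorical:
    """Hypothetical helper: a categorical variable and its allowed categories."""
    categories: tuple
    ordered: bool = False  # True only when the ranking is meaningful

    @property
    def level(self) -> str:
        # A binary is simply a categorical variable with exactly two categories.
        if len(self.categories) == 2:
            return "binary"
        return "ordinal" if self.ordered else "nominal"

    def rank(self, value) -> int:
        # Ranking only makes sense for ordered (ordinal) variables.
        if not self.ordered:
            raise ValueError("categories are not meaningfully ranked")
        return self.categories.index(value)


# Ordinal: separate categories with a meaningful order.
class_year = Categorical(("freshman", "sophomore", "junior", "senior"), ordered=True)
# Nominal: separate categories with no meaningful order (illustrative example).
eye_color = Categorical(("brown", "blue", "green"))
# Binary: exactly two categories.
answer = Categorical(("no", "yes"))

for name, var in [("class_year", class_year), ("eye_color", eye_color), ("answer", answer)]:
    print(name, "->", var.level)
```

Scale variables (continuous, interval, ratio) are left out of the sketch because they are ordinary numbers rather than categories.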
These are all binaries. Okay, in addition to level of measurement, let's also talk about binary classification. So there it is. There's that term again. Binary classification techniques, which is really what this entire course is about, is when you're trying to build a predictive model and what you're trying to predict has only two categories. So, classic examples, churn, not churn. Respond to an email campaign, not respond to an email campaign.
A machine in predictive maintenance, let's say, fails or doesn't fail. So binary classification is really at the heart of this course. Finally, when binary classification models are built, they make predictions, but they make predictions in a particular form, called a propensity score. For instance, we're gonna be seeing the Titanic data set and folks on the ship, of course, either survived or, unfortunately, died. So a propensity score would essentially be a percentage chance, or propensity, for surviving.
So if you structure the problem in that way, scores near 1.0 would be a high propensity to survive and scores near 0 would be a low propensity to survive. That's the way virtually all binary classification models are set up, using propensity scores.
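To make the idea of a propensity score concrete, here is a small Python sketch that turns scores between 0 and 1 into one of two categories. The 0.5 cutoff, the `predict_label` function, and the passenger scores are all assumptions invented for illustration; they do not come from the Titanic data set or from any particular model.

```python
def predict_label(score: float, threshold: float = 0.5) -> str:
    """Map a propensity-to-survive score onto one of two categories.

    The 0.5 threshold is an assumed default, not something the course prescribes.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("a propensity score must lie between 0 and 1")
    return "survived" if score >= threshold else "died"


# Made-up scores for three hypothetical passengers.
example_scores = {"passenger_a": 0.92, "passenger_b": 0.08, "passenger_c": 0.55}

for passenger, score in example_scores.items():
    print(f"{passenger}: propensity={score:.2f} -> predicted {predict_label(score)}")
```

Scores near 1.0 land in the "survived" category and scores near 0 in the "died" category; where exactly to draw the line in between is a separate modeling decision.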
Note: These tutorials are focused on the theory and practical application of binary classification algorithms. No software is required to follow along with the course.
- Why do you need classification?
- Statistical algorithms versus machine learning algorithms
- Combining models using ensembles
- Classification modeling challenges