From the course: Executive Guide to Predictive Modeling Strategy at Scale

Data and supervised machine learning

- [Instructor] Now we're going to do a high-level review of supervised machine learning, which essentially means using historical data to build a model that we then use to score new data. So let's take a closer look.

In our historical data, we have to have an established outcome. So if we're talking about loans, the borrower paid, or perhaps they defaulted on their mortgage. We need that end result to be known. Then we need predictors. There can be numerous predictors; hopefully we have dozens or hundreds of them. Examples could include whether the loan in question is the borrower's primary mortgage or a second mortgage, or what percentage of their income goes to housing. These can help us predict that end result.

In order to map those predictors to the end result, we need a modeling algorithm. Modelers are experts in this, and there are many algorithms they might draw upon. They have names like Decision Tree, Support Vector Machine, Logistic Regression, or Neural Network. There are dozens more. We're going to use that historical data to establish the model. Inside, the model is just some kind of formula, very frequently in the form of a rule set, like: if variable x is true and the value of y is less than 100, then we think the borrower might default.

Now, the past data is the basis of that rule set or model. But we've got to apply it to new data on a regular basis: every month, every night, and so on. And this is going to produce a score. It's not just going to tell us default or not. It's going to give us a probability, like this. So we have just four loan IDs, and we have these scores: 6% chance, 80% chance, 1% chance that they're going to default. And we use that score to predict the end result.
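
To make the build-then-score cycle concrete, here is a minimal sketch in plain Python. It is not one of the algorithms named in the transcript; it is a deliberately simple lookup model that groups past loans into rule segments and uses each segment's observed default rate as the score. All loan records, field names (`second_mortgage`, `housing_pct`, `defaulted`), loan IDs, and the 35% cutoff are hypothetical values invented for illustration.

```python
from collections import defaultdict

# Hypothetical historical loans: two predictors plus a KNOWN outcome.
HISTORICAL_LOANS = [
    {"second_mortgage": False, "housing_pct": 20, "defaulted": False},
    {"second_mortgage": False, "housing_pct": 25, "defaulted": False},
    {"second_mortgage": False, "housing_pct": 30, "defaulted": False},
    {"second_mortgage": False, "housing_pct": 28, "defaulted": True},
    {"second_mortgage": False, "housing_pct": 40, "defaulted": False},
    {"second_mortgage": False, "housing_pct": 45, "defaulted": True},
    {"second_mortgage": True, "housing_pct": 20, "defaulted": False},
    {"second_mortgage": True, "housing_pct": 30, "defaulted": False},
    {"second_mortgage": True, "housing_pct": 50, "defaulted": True},
    {"second_mortgage": True, "housing_pct": 42, "defaulted": True},
]

# Assumed threshold on the share of income paid to housing (illustrative only).
HOUSING_PCT_CUTOFF = 35


def segment(loan):
    """Map a loan's predictors to a rule segment (a pair of yes/no answers)."""
    return (loan["second_mortgage"], loan["housing_pct"] >= HOUSING_PCT_CUTOFF)


def fit(historical):
    """Build the model: the observed default rate within each segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [defaults, total loans]
    for loan in historical:
        tally = counts[segment(loan)]
        tally[0] += loan["defaulted"]
        tally[1] += 1
    return {seg: defaults / total for seg, (defaults, total) in counts.items()}


def score(model, loan, fallback):
    """Score a new loan: look up its segment's estimated default probability."""
    return model.get(segment(loan), fallback)


model = fit(HISTORICAL_LOANS)
# Fallback for a segment never seen in the historical data.
overall_rate = sum(l["defaulted"] for l in HISTORICAL_LOANS) / len(HISTORICAL_LOANS)

# Hypothetical new loans: same predictors, outcome not yet known.
NEW_LOANS = {
    "A-101": {"second_mortgage": False, "housing_pct": 24},
    "A-102": {"second_mortgage": True, "housing_pct": 48},
    "A-103": {"second_mortgage": False, "housing_pct": 41},
    "A-104": {"second_mortgage": True, "housing_pct": 18},
}

scores = {loan_id: score(model, loan, overall_rate) for loan_id, loan in NEW_LOANS.items()}
for loan_id, probability in scores.items():
    print(f"loan {loan_id}: {probability:.0%} estimated chance of default")
```

In practice a modeler would replace `fit` with a real algorithm such as a decision tree or logistic regression and would rerun only the scoring step each month or each night as new loans arrive. The shape of the process stays the same: known outcomes go in once to build the model, and probabilities come out every time it is applied to new data.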
