From the course: Advanced Predictive Modeling: Mastering Ensembles and Metamodeling

What is an ensemble? - SPSS Tutorial

What is an ensemble?

- [Instructor] Rather than simply defining ensembles, let's go ahead and build one. I have three models that I built, and they're all attempting to predict the miles per gallon value for a bunch of cars. The first model is a linear regression model. Now, I didn't use only the variable weight, but we can see there's some potential for a linear model here, with a little taste of curvilinearity, too. We're going to give it a shot and build a regression model, and here are the regression coefficients. You can see that, in addition to weight, which is the most important variable, I also used the number of cylinders, the displacement, the horsepower, the acceleration, and the country of origin of the car. The second model that I built is a neural network: same variables, completely different approach. Frankly, with neural nets, diagrams like this don't really help you understand the model terribly well. The point is that we've used a completely different approach from the linear regression. The third model that I built is a regression tree; again, the same variables, but yet another completely different approach from the other two. You can see here that the average miles per gallon is 23.8, and the regression tree apparently liked displacement, because that was its most important variable and it shows up at the top of the tree. But if I scroll down, you can see there's a lot more going on.

Okay, so here I am in an Excel spreadsheet that has all of my variables: cylinders, displacement, horsepower, weight, acceleration, model year, and country of origin of the car. I've even got the names; there's the good old Chevy Malibu. And I've got the predictions from all of my models. Let's take a closer look. Our Chevy Malibu has an actual miles per gallon of 18, so let's see what the different models said: the neural net is predicting 16.8; the C&RT, 15.4; and the linear regression, 18.6. The whole idea of the ensemble is that we're going to take the simple average of these three, and we should get a better result from the combination than we would from any of them alone. In this instance, we get 16.9 for the average, and it looks like our regression, which has been around for more than a century, was the best predictor there.

But the rows where regression was the top performer seem a bit thin on this first page. We really need our models to be pulling their weight, because if the regression is thrown into the average but is rarely the best predictor, it's not clear that the regression belongs in the ensemble. So, let's take a look at this from a different point of view and look at all of the data. We actually find that our old-school linear regression was the best prediction almost 30% of the time; so, remarkably, the wins are divided almost equally between the three models. What you want to watch out for, folks, is a situation where one dominant model is always producing the best prediction and the other two models are almost like backup singers. This happens in ensembles all the time, and a lot of folks don't take the time to examine the individual contributing models. It's important that you do.
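The demo above is built in SPSS Modeler, but the same simple-average ensemble, and the check of how often each contributing model wins, can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the course's actual stream or data: the scikit-learn model choices, the auto_mpg.csv file, and the column names are all hypothetical.

```python
# Minimal sketch (not SPSS Modeler): a simple-average ensemble of three
# regressors on an mpg-style dataset, plus a check of how often each base
# model gives the closest prediction. File name, columns, and model settings
# are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

cars = pd.read_csv("auto_mpg.csv")  # hypothetical file with the cars data
features = ["cylinders", "displacement", "horsepower", "weight",
            "acceleration", "model_year", "origin"]
X_train, X_test, y_train, y_test = train_test_split(
    cars[features], cars["mpg"], random_state=42)

# Three very different model families, mirroring the video.
models = {
    "regression": LinearRegression(),
    "neural_net": MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000,
                               random_state=42),  # scaling omitted for brevity
    "tree": DecisionTreeRegressor(max_depth=4, random_state=42),
}
preds = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    preds[name] = model.predict(X_test)

# The ensemble prediction is just the simple average of the three models.
ensemble = np.column_stack(list(preds.values())).mean(axis=1)

# How often is each base model the one closest to the actual value?
abs_errors = np.column_stack([np.abs(preds[n] - y_test) for n in models])
best_model = np.array(list(models))[abs_errors.argmin(axis=1)]
print(pd.Series(best_model).value_counts(normalize=True))
```

If one model wins nearly every row and the other two are just along for the ride, the ensemble is suspect; a roughly even split, like the near-thirds in the video, is the healthier pattern.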
Let's go a level deeper on that exact topic. Imagine, for a moment, that we've got a single model, and the small circle represents the portion of the data where that model is making an error. Now let's further imagine that we've got three models, and they all have their own errors. But when we start to look at the models together, we find they're very often making the same errors. This is a frequent phenomenon in ensembles, and it's not good. For the ensemble to be effective, the models need to be making different errors. So, let's assume instead that our three models were making very different errors, always different ones. If this were a classification problem, we could imagine that, as we started to combine the models into an ensemble, the errors were not overlapping. The implications of this are actually quite remarkable, because if we were voting, just a simple vote where two out of three wins, then every time one model makes an error, the other two models will catch it, as long as the errors don't overlap. So, in this instance, one model makes a mistake and the other two models catch it. And here, a different model is making the mistake, but, once more, the other two models catch it. This is a theme we're going to come back to from time to time, because all of the ensemble modeling algorithms try to shuffle up the models enough that their errors are different, and that's what makes the ensemble sing.
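To make the two-out-of-three voting argument concrete, here is a tiny, purely illustrative Python sketch. The three "models" are hand-made prediction vectors whose errors never overlap; the data is fabricated for the illustration and is not taken from the course.

```python
# Minimal sketch: majority voting with non-overlapping errors.
# Each "model" below is a fabricated 0/1 prediction vector, built so that
# every model errs on exactly one (different) case.
import numpy as np

truth = np.array([1, 0, 1, 1, 0, 1])

model_a = np.array([0, 0, 1, 1, 0, 1])  # errs on case 0
model_b = np.array([1, 1, 1, 1, 0, 1])  # errs on case 1
model_c = np.array([1, 0, 0, 1, 0, 1])  # errs on case 2

# Two-out-of-three vote: whenever one model slips, the other two outvote it.
votes = model_a + model_b + model_c
ensemble = (votes >= 2).astype(int)

print((ensemble == truth).all())  # True: the vote corrects every single error
```

Because no two models are ever wrong on the same case, the majority vote is right on every case, which is exactly the "the other two models catch it" behavior described above.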
