From the course: Machine Learning and AI Foundations: Classification Modeling

Missing data

- [Instructor] One of the themes that we've investigated is certainly missing data. Decision trees really are the exception to the rule. Virtually all of the other algorithms, at least on default settings, operate by listwise deletion, meaning any record with a missing value in any input is dropped. So, you really want to be careful about the following phenomenon. In working with clients and building my own models over many years, whenever a client starts to say that they're always having the best luck with trees, I double check to see how many records were actually processed by the other algorithms. And I can't tell you the number of times I've checked and discovered that zero records were run by the neural net, zero records were run by the support vector machine, and so on. And a lot of times they didn't realize this. So if trees are consistently the best performers, you want to make sure the other algorithms were even running properly. Now if, perchance, you do have a situation where you've lost a lot of records and maybe even lost…
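The check the instructor describes, counting how many records survive listwise deletion before trusting a model comparison, might be sketched as follows. This is a minimal illustration in plain Python, not the course's own tooling: the helper names `complete_cases` and `usable_record_report`, the field names, and the toy records are all hypothetical, and missing values are assumed to be represented as `None`.

```python
from typing import Any, Dict, List, Optional

# A record maps field names to values; None stands for a missing value (assumption).
Record = Dict[str, Optional[Any]]


def complete_cases(records: List[Record], fields: List[str]) -> List[Record]:
    """Listwise deletion: keep only records with no missing value in `fields`."""
    return [r for r in records if all(r.get(f) is not None for f in fields)]


def usable_record_report(records: List[Record], fields: List[str]) -> Dict[str, int]:
    """Count how many records an algorithm using listwise deletion would see."""
    kept = complete_cases(records, fields)
    return {
        "total": len(records),
        "kept": len(kept),
        "dropped": len(records) - len(kept),
    }


# Hypothetical toy data: every record is missing a different field,
# so listwise deletion across all three inputs leaves nothing to train on.
records: List[Record] = [
    {"age": 34, "income": None, "tenure": 5},
    {"age": None, "income": 52000, "tenure": 2},
    {"age": 41, "income": 61000, "tenure": None},
]

report = usable_record_report(records, ["age", "income", "tenure"])
print(report)  # {'total': 3, 'kept': 0, 'dropped': 3}
```

No single field here is mostly missing, yet zero records are complete across all three inputs. That is the situation in which a neural net or support vector machine can end up processing no records while a tree still runs.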
