From the course: Data Science Foundations: Data Assessment for Predictive Modeling

Assessing imputation as a potential solution

- [Instructor] We're going to start by looking at the data in Excel. Our mission is to figure out whether it's appropriate to impute age. Age is the kind of variable that people normally think about imputing. It's a scale variable. You figure, just replace it with the mean, or just replace it with the median. It's exactly the kind of thing that's often automated in software. Now remember that the actual act of imputation is usually done in the data preparation phase, but here we're doing the detective work: does it make sense to do it? So let's start by exploring age. I've turned the filter on and I'm going to grab all of the blanks. Then we just investigate a bit, to see if there's anything obvious in what we see. Well, here in column H, we've got all zeros. Now those date of birth values probably shouldn't be zero, but the point is that date of birth is missing at the same time that age is missing. That makes sense. In column R here, we've…
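
For anyone following along outside Excel, the same detective step can be sketched in a few lines of Python: filter to the rows where age is blank, then check how often date of birth holds the zero placeholder in those rows. This is only a rough sketch under assumptions. The column names `age` and `date_of_birth`, the date encoding, and the toy rows are all invented for illustration and are not the course data.

```python
# Toy rows standing in for the spreadsheet. Column names, the date encoding,
# and every value here are invented for illustration (not the course data).
rows = [
    {"age": 34, "date_of_birth": 19850412},
    {"age": None, "date_of_birth": 0},
    {"age": 51, "date_of_birth": 19680930},
    {"age": None, "date_of_birth": 0},
    {"age": 27, "date_of_birth": 19920105},
]


def is_blank(value):
    """Treat None and empty or whitespace-only strings as blank."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def missing_age_rows(rows):
    """Keep only the rows where age is blank, like filtering on blanks."""
    return [r for r in rows if is_blank(r["age"])]


def share_with_zero(rows, column):
    """Fraction of the given rows where `column` holds the placeholder 0."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r[column] == 0) / len(rows)


blank_age = missing_age_rows(rows)
zero_dob_share = share_with_zero(blank_age, "date_of_birth")
print(len(blank_age), zero_dob_share)
```

If the share comes back close to 1, the two fields tend to go missing together, which is the pattern spotted by eye in column H. This check only describes the missingness; it does not impute anything.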
