From the course: Python: Working with Predictive Analytics

Convert categorical data into numbers

- [Instructor] We are still at the Data Preparation step in our Predictive Analytics Roadmap. We need to convert categorical data into numbers, because most prediction models only accept numerical data. We have two ways to handle this. One is label encoding, the other is one-hot encoding. Label encoding works well if we have two distinct values. One-hot encoding works well if we have three or more distinct values. In the insurance data set we used before, we have two distinct values for smoker, yes or no, which means we can handle it with a label encoder by replacing yes with one and no with zero. Now, let's look at what happens if we have more than two distinct values. For example, colors: blue, red, and green. When we apply label encoding here, we end up with the numbers zero, one, and two. In this case, two is larger than zero, which would mean green is larger than blue. We cannot impose such an ordering on categorical values, so we need another method. One-hot encoding is the…
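The two encodings described in the transcript can be sketched in plain Python. This is a minimal illustration, not the course's own code: the sample lists and the helper names `label_encode` and `one_hot_encode` are assumptions made up for this example, and a real project would more likely use a library encoder.

```python
# Hypothetical sample values; the course's insurance data set is not reproduced here.
smoker = ["yes", "no", "no", "yes"]
colors = ["blue", "red", "green", "blue"]


def label_encode(values, mapping=None):
    """Replace each category with an integer code."""
    if mapping is None:
        # Number the categories in order of first appearance.
        categories = list(dict.fromkeys(values))
        mapping = {category: code for code, category in enumerate(categories)}
    return [mapping[value] for value in values], mapping


def one_hot_encode(values):
    """Give each category its own 0/1 column, so no category outranks another."""
    categories = list(dict.fromkeys(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return rows, categories


# Two distinct values: a yes/no flag maps cleanly to 1/0.
smoker_codes, smoker_mapping = label_encode(smoker, {"no": 0, "yes": 1})

# Three distinct values: label encoding yields 0, 1, 2, which implies an
# ordering (green > blue) that the colors do not actually have.
color_codes, color_mapping = label_encode(colors)

# One-hot encoding avoids that implied ordering.
color_rows, color_columns = one_hot_encode(colors)

print(smoker_codes)
print(color_codes, color_mapping)
print(color_columns)
for row in color_rows:
    print(row)
```

Each one-hot row contains exactly one 1, so the model sees the colors as separate indicators instead of points on a numeric scale.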
