From the course: Python: Working with Predictive Analytics
Convert categorical data into numbers - Python Tutorial
- [Instructor] We are still at the Data Preparation step in our Predictive Analytics Roadmap. We need to convert categorical data into numbers, because most prediction models only accept numerical data. We have two ways to handle this: label encoding and one-hot encoding. Label encoding works well if we have two distinct values. One-hot encoding works well if we have three or more distinct values. In the insurance data set we used before, smoker has two distinct values, yes or no, which means we can handle it with a label encoder by replacing yes with one and no with zero. Now, let's look at what happens if we have more than two distinct values. For example, colors: blue, red, and green. When we apply label encoding here, we end up with the numbers zero, one, and two. In this case, two is larger than zero, which would imply that green is larger than blue. Categorical values like these have no such ordering, so we need another method. One-hot encoding is the…
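To make the contrast concrete, here is a minimal sketch of both encodings in plain Python. The sample values and the helper names `label_encode` and `one_hot_encode` are hypothetical, chosen for illustration; they are not taken from the course's exercise files or its insurance data set.

```python
# Hypothetical sample values, not the course's insurance data.
smoker = ["yes", "no", "no", "yes"]
colors = ["blue", "red", "green", "blue"]


def label_encode(values, categories):
    """Replace each value with the position of its category in `categories`."""
    mapping = {category: index for index, category in enumerate(categories)}
    return [mapping[value] for value in values]


def one_hot_encode(values, categories):
    """Build one 0/1 column per category; each row has exactly one 1."""
    return [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]


# Two distinct values: label encoding (no -> 0, yes -> 1) is enough.
smoker_codes = label_encode(smoker, ["no", "yes"])

# Three distinct values: label encoding yields 0, 1, 2, which suggests an
# ordering (green > blue) that the colors do not actually have.
color_codes = label_encode(colors, ["blue", "red", "green"])

# One-hot encoding avoids the implied ordering by using one column per color.
color_rows = one_hot_encode(colors, ["blue", "red", "green"])

print(smoker_codes)
print(color_codes)
for row in color_rows:
    print(row)
```

In practice, libraries offer equivalents, for example `LabelEncoder` and `OneHotEncoder` in scikit-learn or `get_dummies` in pandas. This excerpt of the transcript does not show which of them the course uses, so the sketch stays library-free.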