In this video, learn how to clean up continuous features by filling missing values, creating new features, etc.
- [Instructor] Previously we talked about how exploratory data analysis will inform our data cleaning. In this lesson we'll take what we learned in the last two lessons and actually implement some of the necessary cleaning. So let's start by importing our data. We discussed in the last lesson how passenger ID doesn't really factor into whether somebody survived or not in any way, so we're going to go ahead and drop this feature in place, just like we did in the previous section. We do that by calling the drop method on the titanic data set and telling it to drop passenger ID. We pass in axis=1, which tells it to drop the column instead of trying to drop rows. And lastly we'll pass in inplace=True, which tells pandas to alter the titanic data set as it stands now instead of creating a new data frame. So we'll run that, and you can see that passenger ID is no longer in our data frame. Now in the last lesson, we learned that age has some missing values. …
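The drop step described above can be sketched roughly as follows. This is a minimal illustration, not the instructor's notebook: the small inline data frame stands in for the real Titanic file, the column names (`PassengerId`, `Survived`, `Age`, `Fare`) are assumed from the common public version of the data set, and filling `Age` with the column mean is only one common choice, assumed here because the transcript cuts off before naming the method.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Titanic data (assumption: the real lesson reads a CSV).
titanic = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Survived": [0, 1, 1, 0],
    "Age": [22.0, 38.0, np.nan, 35.0],
    "Fare": [7.25, 71.28, 7.92, 8.05],
})

# axis=1 targets a column rather than a row; inplace=True changes `titanic`
# itself and returns None instead of producing a new data frame.
titanic.drop("PassengerId", axis=1, inplace=True)
remaining_columns = list(titanic.columns)

# Count missing values per column to see which features still need cleaning.
missing_counts = titanic.isnull().sum()

# Assumption: fill missing ages with the mean of the observed ages.
titanic["Age"] = titanic["Age"].fillna(titanic["Age"].mean())
```

Because `inplace=True` returns `None`, assigning the result of that `drop` call back to `titanic` would discard the data frame; either use `inplace=True` on its own, or omit it and reassign.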
Released 5/10/2019
- What is machine learning (ML)?
- ML vs. deep learning vs. AI
- Handling common challenges in ML
- Plotting continuous features
- Continuous and categorical data cleaning
- Measuring success
- Overfitting and underfitting
- Tuning hyperparameters
- Evaluating a model
Skill Level Beginner
Introduction
- Leveraging machine learning (1m 57s)
- What you should know (1m 6s)
- Using the exercise files (1m 24s)
1. Machine Learning Basics
- Why Python? (5m 49s)
- Common challenges (6m 4s)
2. Exploratory Data Analysis and Data Cleaning
- Plotting continuous features (7m 35s)
- Continuous data cleaning (5m 44s)
- Categorical data cleaning (4m 33s)
3. Measuring Success
- Why do we split up our data? (5m 54s)
4. Optimizing a Model
- What is underfitting? (2m 26s)
- What is overfitting? (2m 47s)
- Finding the optimal tradeoff (3m 16s)
- Hyperparameter tuning (6m 22s)
- Regularization (2m 31s)
5. End-to-End Pipeline
- Overview of the process (1m 48s)
- Clean categorical features (4m 18s)
- Tune hyperparameters (6m 34s)
Conclusion
- Next steps (1m 23s)
Video: Continuous data cleaning