A mistake that data scientists should avoid is leaving outliers in their data unaddressed. In this video, learn how to address outliers by first identifying them and then removing them.
- [Instructor] A mistake that data scientists should avoid is not addressing outliers in data. For example, let's say that I have a variable named Airbnb containing a pandas DataFrame that consists of information about Airbnb listings in New York City from 2019, and I want to eventually build a model that predicts the prices of future Airbnb listings in New York City based on this data. I've displayed the first few rows of the dataset here. And I've created a visualization that displays the distribution of listing prices here. From this visualization, it appears that the majority of listing prices are under 1,000, and listing prices over 1,000 seem to be outliers. If I do not address the outliers in the data, the data may not be an accurate representation of listing prices, and when I go on to build a predictive model based on this data, my model may not make good predictions. So I need to address these outliers. In general, there are many ways of addressing outliers. …
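The identify-then-remove approach named in the video description can be sketched roughly as follows. This is a minimal illustration, not the instructor's code: the tiny DataFrame, the `price` column name, and the `PRICE_CUTOFF` constant are all assumptions made for the example, and the cutoff of 1,000 is simply the value the instructor reads off the visualization.

```python
import pandas as pd

# Hypothetical stand-in for the listings DataFrame; the rows are made up
# and the "price" column name is an assumption about the real dataset.
airbnb = pd.DataFrame({
    "name": ["Cozy studio", "Sunny loft", "Midtown room",
             "Penthouse suite", "Brooklyn flat"],
    "price": [120, 250, 85, 4500, 999],
})

# Cutoff taken from the transcript: prices over 1,000 look like outliers.
PRICE_CUTOFF = 1000

# Step 1: identify the outliers with a boolean mask.
is_outlier = airbnb["price"] > PRICE_CUTOFF
outliers = airbnb[is_outlier]

# Step 2: remove them by keeping only the rows that are not flagged.
airbnb_clean = airbnb[~is_outlier].reset_index(drop=True)

print(f"Flagged {len(outliers)} outlier(s); {len(airbnb_clean)} listings kept.")
```

A fixed cutoff is the simplest choice here because it matches what the visualization suggests. As the instructor notes, there are many other ways of addressing outliers, so the threshold should be treated as a judgment call for this particular dataset.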
Skill Level Intermediate
1. Avoid Mistakes in Coding Practices
2. Avoid Mistakes in Structuring Code
3. Avoid Mistakes in Handling Data
4. Avoid Mistakes in Machine Learning