Understand options for estimating missing values.
- Now we are going to look at imputing missing values. This is the process of filling in missing values with a reasonable value. So now that we've identified missing values in our dataset, we have to decide what to do about that missing data. There are several options. First, we could just continue without replacing the missing values. We could define a default value and use that for all of the missing values in a column. We could compute a value based on neighboring rows.
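As a rough illustration of two of these options, the sketch below fills a column with a single default value and, separately, with a value computed from neighboring rows. The column `items_sold`, its values, and the helper names `fill_default` and `fill_from_neighbors` are hypothetical; using the mean of the observed values as the default is only one possible choice.

```python
from statistics import mean

# Hypothetical column; None marks a missing value.
items_sold = [12, None, 15, 14, None, 18]


def fill_default(values, default):
    """Replace every missing entry with one fixed default value."""
    return [default if v is None else v for v in values]


def fill_from_neighbors(values):
    """Replace each missing entry with the mean of the nearest known
    values before and after it (or the single one that exists)."""
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            prev = next((x for x in reversed(values[:i]) if x is not None), None)
            nxt = next((x for x in values[i + 1:] if x is not None), None)
            known = [x for x in (prev, nxt) if x is not None]
            filled[i] = mean(known) if known else None
    return filled


# Assumption: use the mean of the observed values as the column default.
observed_mean = mean([v for v in items_sold if v is not None])

with_default = fill_default(items_sold, observed_mean)
with_neighbors = fill_from_neighbors(items_sold)

print(with_default)
print(with_neighbors)
```

The default-value approach gives every gap the same number, while the neighbor-based approach lets each gap follow the rows around it, which may suit ordered data such as consecutive shifts.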
Or, we could use more advanced techniques, like regression. We could continue our analysis and not treat records with missing values any differently than any other record. Now there are some advantages to this. For example, it makes no extra work for us. But, there are disadvantages. For example, rows with missing values will not contribute anything to the total, so that could throw averages off. Also, it may not be possible to perform some calculations, such as those using division.
So for example, if we wanted to calculate the number of items sold per employee on a shift, …
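A minimal sketch of how leaving missing values in place can affect totals, averages, and division, using made-up shift records. The field names `items_sold` and `employees`, the data, and the helper `items_per_employee` are all assumptions for illustration, not taken from the course material.

```python
# Hypothetical shift records; None marks a missing value.
shifts = [
    {"shift": "A", "items_sold": 120, "employees": 4},
    {"shift": "B", "items_sold": None, "employees": 5},
    {"shift": "C", "items_sold": 90, "employees": None},
]


def items_per_employee(row):
    """Return items sold per employee, or None when an input is missing."""
    items, staff = row["items_sold"], row["employees"]
    if items is None or staff is None or staff == 0:
        return None  # the division cannot be performed
    return items / staff


ratios = [items_per_employee(r) for r in shifts]

# Shift B has no items_sold value, so it adds nothing to the total.
known = [r["items_sold"] for r in shifts if r["items_sold"] is not None]
total_items = sum(known)

# The average depends on whether rows with missing values are counted.
mean_over_all_rows = total_items / len(shifts)
mean_over_known_rows = total_items / len(known)

print(ratios)
print(total_items, mean_over_all_rows, mean_over_known_rows)
```

Only one of the three shifts yields a usable ratio here, and the two averages differ depending on whether the row with the missing value is counted in the denominator.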
Released 6/7/2018
- Exploratory data analysis vs. hypothesis-driven statistical analysis
- Performing data quality checks
- Calculating quartiles
- Using box plot to understand the distribution of values
- Using histograms to understand the frequency of values
- Using chi square to understand the correlation between values
Video: Imputing missing values