Learn how to clean data.
- [Voiceover] It's really important to remove duplicates from your data set in order to preserve the data set's accuracy and avoid producing incorrect and misleading statistics. For example, imagine you're analyzing retail sales data, and shopaholic Sally came in three times and used three different credit cards to make purchases, but provided the cashier the same zip code, 32803, for each sale. Just based on the card number, Sally looks like three different customers, all from the 32803 zip code.
If you fail to examine other attributes of the customer so that you can identify and remove duplicates, shopaholic Sally's records would skew the results of any customer demographic analysis, because Sally would be counted as three people rather than one. To market to 32803 customers effectively, you need to understand their characteristics. Don't let duplicate records skew your analysis. It's time for me to show you how to actually remove duplicates from your data set.
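The Sally scenario can be sketched in a few lines of plain Python. This is only an illustration, not the method shown in the video: the records, the field names (`card_number`, `name`, `zip_code`), and the helper `drop_duplicate_customers` are all hypothetical, and it assumes that name plus zip code is enough to identify one person.

```python
# Hypothetical sales records; field names and values are made up for illustration.
sales = [
    {"card_number": "1111", "name": "Sally", "zip_code": "32803"},
    {"card_number": "2222", "name": "Sally", "zip_code": "32803"},
    {"card_number": "3333", "name": "Sally", "zip_code": "32803"},
    {"card_number": "4444", "name": "Tom", "zip_code": "32803"},
]


def drop_duplicate_customers(records, key_fields):
    """Keep the first record seen for each combination of key_fields."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(record[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Deduplicating on the card number alone leaves Sally counted three times.
by_card = drop_duplicate_customers(sales, ["card_number"])

# Deduplicating on other customer attributes collapses her into one customer.
by_person = drop_duplicate_customers(sales, ["name", "zip_code"])

print(len(by_card))    # number of "customers" when only the card number is checked
print(len(by_person))  # number of customers when name and zip code are checked
```

With a pandas DataFrame, the same idea is usually expressed with `DataFrame.drop_duplicates`, passing the identifying columns through its `subset` parameter.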
Author: Lillian Pierson, P.E.
- Getting started with Jupyter Notebooks
- Visualizing data: basic charts, time series, and statistical plots
- Preparing for analysis: treating missing values and data transformation
- Data analysis basics: arithmetic, summary statistics, and correlation analysis
- Outlier analysis: univariate, multivariate, and linear projection methods
- Introduction to machine learning
- Basic machine learning methods: linear and logistic regression, Naïve Bayes
- Reducing dataset dimensionality with PCA
- Clustering and classification: k-means, hierarchical, and k-NN
- Simulating a social network with NetworkX
- Creating Plot.ly charts
- Scraping the web with Beautiful Soup
Skill Level: Beginner
1. Data Munging Basics
2. Data Visualization Basics
3. Basic Math and Statistics
4. Dimensionality Reduction
Explanatory factor analysis (6m 39s)
5. Outlier Analysis
6. Cluster Analysis
7. Network Analysis with NetworkX
8. Basic Algorithmic Learning
9. Web-based Data Visualizations with Plotly
10. Web Scraping with Beautiful Soup