Principal component analysis (PCA) is a critical tool for dimensionality reduction and visualization. In this video, learn how to perform PCA for data visualization using the Python library scikit-learn.
- [Narrator] Are all the features in our dataset needed? Say you have some flowers and you measure their petal length. If you have one column with that measurement in centimeters and another column with the same measurement in inches, do you need both columns? In that circumstance, you could probably drop either column without losing information. In other cases, dropping a column could lead to issues. Principal component analysis, better known as PCA, is a technique that you can use to smartly reduce the dimensionality of your dataset while losing the least amount of information possible. One use of PCA is for data visualization. In this video, I'll share with you how you can use PCA to help visualize your data. The first step is to import libraries. From there, you can load your dataset. The dataset used in this notebook is the Iris dataset. The next step is to standardize your data. PCA, like a lot of different algorithms, is affected by scale. You can transform your data onto unit scale …
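The steps the narrator lists (import libraries, load the data, standardize it) might look roughly like the sketch below. It is not the notebook from the video. It assumes the dataset is scikit-learn's built-in Iris flower data, loaded with `load_iris`, and the variable names are made up for illustration. The final projection onto two components goes past the point where this excerpt cuts off and is included only to show where the standardized data would go next.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load the data (assumed here to be Iris: 150 flowers, 4 measurements each).
iris = load_iris()
X, y = iris.data, iris.target

# Standardize each feature to mean 0 and variance 1 (unit scale),
# so that no single measurement dominates because of its units.
X_scaled = StandardScaler().fit_transform(X)

# Project the 4 standardized features onto 2 principal components,
# which gives one (x, y) point per flower for a scatter plot.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print(X_2d.shape)
# Share of the total variance kept by each of the two components.
print(pca.explained_variance_ratio_)
```

The two columns of `X_2d` can then be passed to any plotting library as x and y coordinates, with `y` used to color the points by species.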
This course was created by Madecraft. We are pleased to host this content in our library.
- Why use scikit-learn?
- Supervised vs. unsupervised learning
- Linear and logistic regression
- Decision trees and random forests
- K-means clustering
- Principal component analysis (PCA)