Join Lillian Pierson, P.E. for an in-depth discussion in this video, What you should know, part of Python for Data Science Essential Training.
- [Instructor] To get the most out of this course, you'll want to already be familiar with Python: how it works, its syntax, and the basics of object-oriented programming. You should also have a basic working knowledge of math up to the algebra 2 level. If you have some experience with statistics, that will be helpful, but it's definitely not a requirement. While you certainly don't have to be a pro, we'll be using Python's NumPy and Pandas libraries throughout this course, and I'll be showing you how to use these libraries to meet your data science objectives. If you want to brush up on those topics before diving into this course, I recommend taking the Introduction to Data Analysis with Python course from our library.
I'll also be working with a locally installed Python that I got from Anaconda, as well as the Jupyter Notebook from that installation, but you don't need either of these to watch this course. You'll need them only if you plan to follow along with me exactly. I'm going to show you Anaconda in just a few seconds. Also, this course is taught in Python 2.7, so make sure you're not running the 3.5 release, or else you'll be headed for trouble. The only other thing you need before starting this course is an excitement to begin generating valuable insights from data.
Whether it be at work or even for a hobby project, knowing how to use data science methods to generate insights will take your performance to the next level. Brace yourself for success and welcome to the course. In case you're not familiar with Anaconda, it's put out by Continuum Analytics and you can find it at the link shown below. This is the front page of the website, where you can download Anaconda. For me, I'm running Windows, so I would choose the Windows icon. Then you just want to make sure to select the correct version; for this course, it's 2.7.
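Once Anaconda is installed, a quick way to confirm which Python you are actually running is to ask the interpreter itself. The following is only a minimal sketch: the helper name `matches_course_version` and the constant `COURSE_VERSION` are made up for illustration and are not part of the course materials. It relies only on the standard `sys.version_info` value.

```python
import sys

# Assumption for illustration: the course states it is taught in Python 2.7.
COURSE_VERSION = (2, 7)


def matches_course_version(version_info=None):
    """Return True when the major.minor version equals COURSE_VERSION.

    Pass a tuple such as (3, 5, 0) to check a specific version; with no
    argument, the running interpreter's version is checked.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == COURSE_VERSION


if __name__ == "__main__":
    running = "%d.%d" % tuple(sys.version_info[:2])
    if matches_course_version():
        print("Running Python " + running + ", which matches the course.")
    else:
        print("Running Python " + running + "; the course uses 2.7, "
              "so some examples may not run as shown.")
```

You can paste this into a notebook cell or a script. The check compares only the major and minor numbers, so any 2.7.x release counts as a match.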
Let me show you what comes with the Anaconda install. You get all of these different libraries that will be available to you. Some of them are pre-installed, and some of them you'll need to pip install before you can use them in your Jupyter Notebook. But don't worry too much about that, because, in this course, I'm going to show you how to do a pip install when it's needed. One other thing: Anaconda comes with Jupyter Notebook, and we're going to be using Jupyter Notebook as our programming console. I'm going to show you how to use that next.
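If you'd like to see ahead of time which libraries would need a pip install, one possible approach is to try importing each one and note the failures. This is a rough sketch, not the method used in the course: `find_missing` is a hypothetical helper, and the module names in the example list (`numpy`, `pandas`, `networkx`, `bs4`) are assumptions based on the libraries and topics mentioned for this course.

```python
import importlib


def find_missing(module_names):
    """Return the names from module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not available in this Python environment.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Illustrative list only; adjust it to the libraries you plan to use.
    wanted = ["numpy", "pandas", "networkx", "bs4"]
    missing = find_missing(wanted)
    if missing:
        print("Not installed yet: " + ", ".join(missing))
    else:
        print("All listed libraries can be imported.")
```

Note that the import name is not always the name you give to pip. For example, Beautiful Soup is imported as `bs4` but installed with `pip install beautifulsoup4`.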
- Getting started with Jupyter Notebooks
- Visualizing data: basic charts, time series, and statistical plots
- Preparing for analysis: treating missing values and data transformation
- Data analysis basics: arithmetic, summary statistics, and correlation analysis
- Outlier analysis: univariate, multivariate, and linear projection methods
- Introduction to machine learning
- Basic machine learning methods: linear and logistic regression, Naïve Bayes
- Reducing dataset dimensionality with PCA
- Clustering and classification: k-means, hierarchical, and k-NN
- Simulating a social network with NetworkX
- Creating Plot.ly charts
- Scraping the web with Beautiful Soup