Michele demonstrates how to set up your analysis environment and provides a refresher on the basics of working with data structures in Python. Then, he jumps into the big stuff: the power of arrays, indexing, and tables in NumPy and pandas—two popular third-party packages designed specifically for data analysis. He also walks through two sample big-data projects: using NumPy to identify and visualize weather patterns and using pandas to analyze the popularity of baby names over the last century. Challenges issued along the way help you practice what you've learned.
Note: This version of the course was updated to reflect recent changes in Python 3, NumPy, and pandas.
- Describe how to install and start Python, and load necessary libraries.
- Explain examples of uses for lists and ranges in Python.
- Explain string processing methods in Python.
- Describe the characteristics and specifications of NumPy arrays.
- Explain examples of uses for NumPy methods in generating and analyzing data.
- Use matplotlib to create xy plots.
- Describe the characteristics and specifications of DataFrames in pandas.
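As a rough taste of the first few objectives, here is a minimal sketch of lists, ranges, and string methods using only the Python standard library. The names and years are invented sample values for illustration, not data from the course, and the variable names are hypothetical.

```python
# Illustrative only: these names are made-up sample values, not course data.
names = ["  mary", "JOHN ", "anna", "Robert", "ann"]

# String processing: strip stray whitespace and normalize capitalization.
cleaned = [name.strip().capitalize() for name in names]

# Ranges: evenly spaced integers, here the first year of each decade.
decades = list(range(1900, 2000, 10))

# Lists: sort, and filter with a comprehension.
sorted_names = sorted(cleaned)
a_names = [name for name in cleaned if name.startswith("A")]

print(cleaned)
print(decades[:3], len(decades))
print(sorted_names)
print(a_names)
```

NumPy arrays and pandas DataFrames build on the same ideas of indexing and slicing, applied to whole tables of data at once.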
Skill Level: Intermediate
- Data science, it powers so much of modern life: the internet, social media, artificial intelligence. But also on a personal level, the statistics from your Fitbit or the next song recommended by Pandora. And, truly, data science is driving a personal and social evolution. We're constantly learning and getting better and accomplishing monumental goals. However, do you feel like you're missing the boat? Maybe you're watching all these advances, but you don't really know how to get in the game. And you wonder, "What goes on under the hood? How does someone do data science?" You don't know where to start. Do not worry, this is where I can help. My name is Michele Vallisneri, and I'm a research scientist at NASA. I use data science concepts and tools every day to analyze astronomy datasets, and my tool of choice is Python. It's an expressive and pragmatic computer language that has its own spirit and style. And it's supported by a diverse and helpful user community. My goal with this course is to get you started with data science, and more specifically, data analysis with Python, in a friendly and approachable way. It's not all-encompassing. I don't recommend applying for a PhD program right after this course, but it will get you started, and, I really hope, inspired. That's what matters, and that's what you need: a jumping-off point. I will take you through the foundations of doing data analysis with Python. We will look at the most important programming constructs, data structures, and third-party packages. With this, you will be able to complete simple data analysis tasks, and you will be ready to move on to more advanced topics. I like to teach by example rather than in the abstract, so throughout this course, we will write and execute practical code and analyze real-world data. So let's enter the friendly but exciting world of Python data analysis.