Join Michele Vallisneri for an in-depth discussion in this video, Series in Pandas, part of Introduction to Data Analysis with Python.
- In Pandas, series objects can be used as one-dimensional NumPy arrays, but they provide additional features. I will show you how to make series objects from Python lists and dicts. I will show you how to extract the index and the values from a series. And I'll show you how to do indexing on series both implicitly and explicitly. That means using numerical indices, similar to NumPy, and explicit indices that you assign to the series yourself. Let's open the IPython notebook.
I will select the 0602 series begin notebook. Of course, I will import the Pandas package and give it the nickname pd. We can make a Pandas series from a simple Python list. We can also give it a name. The series contains a NumPy array. We can extract it by asking for the values. The series also contains an index which, in this case, has been given implicitly.
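As a rough sketch of the steps just described, the snippet below builds a series from a list, names it, and pulls out the values and the implicit index. The numbers and the name 'example' are made up for illustration and are not taken from the course notebook.

```python
import numpy as np
import pandas as pd

# Build a series from a plain Python list and give it a name.
# The data and the name are illustrative only.
s = pd.Series([10, 20, 30, 40], name='example')

# The underlying data is a NumPy array.
values = s.values

# No index was given, so pandas supplies an implicit one: 0, 1, 2, 3.
index = s.index

# With the implicit index, numeric lookups and slices work as in NumPy.
first = s[0]      # 10
middle = s[1:3]   # the elements 20 and 30

print(s)
print(values)
print(index)
```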
So we can use numeric indices to extract elements from the series. We can also use standard slicing. Now, I will make a series with an explicit index. I will use data about computer language popularity in 2014 that I found in the (mumbles) magazine. I'll call this series, "Pop 2014." Here, I specify the index. As you can see, this series is nicely displayed with the index to the left and the value to the right.
In 2014, Java was the most popular language according to (mumbles). We can then extract the index of Pop 2014, which is now a collection of Python objects. We can still index Pop 2014 by number, or slice by numbers. But now, we can also select an element with an explicit value from the index. We can even slice using such values. You'll notice that here Pandas deviates from the standard slicing convention, since the ending element of the slice, C#, is included.
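Here is a small sketch of a series with an explicit index. The variable name pop2014 follows the spoken "Pop 2014"; the language list and the scores are invented placeholders, not the magazine's figures. The only details kept from the narration are that Java ranks first and that C# can end a slice.

```python
import pandas as pd

# Placeholder scores: NOT the real 2014 popularity figures.
pop2014 = pd.Series(
    [100.0, 90.0, 80.0, 70.0, 60.0],
    index=['Java', 'C', 'C++', 'Python', 'C#'],
)

# The index now holds the labels we supplied.
labels = pop2014.index

# Select a single element by its explicit label.
python_score = pop2014['Python']

# Slice by labels: unlike ordinary Python slicing, the end label is included.
label_slice = pop2014['C++':'C#']

print(pop2014)
print(label_slice)
```

Indexing by number is left out of this sketch on purpose, because how plain brackets treat integers on a labeled series has changed between pandas versions.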
If you just use brackets for indexing, Pandas will do its best to decide whether you're trying to use numbers or explicit values for indices. However, you can also be explicit. In that case, you can use the iloc object to specify that you're using numbers and the loc object to make it clear that you're using explicit values. Of course, you can also use advanced indexing. For instance, a boolean mask.
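The explicit accessors might be used along these lines. The series is redefined here so the snippet runs on its own, and its scores are invented placeholders rather than real popularity data. Recent pandas releases have been moving away from guessing positions with plain brackets, so iloc and loc are the safer habit.

```python
import pandas as pd

# Placeholder scores: NOT real popularity data.
pop2014 = pd.Series(
    [100.0, 90.0, 80.0, 70.0, 60.0],
    index=['Java', 'C', 'C++', 'Python', 'C#'],
)

# iloc: positions only, with the usual end-exclusive slicing.
by_position = pop2014.iloc[0]       # first element
pos_slice = pop2014.iloc[1:3]       # positions 1 and 2

# loc: explicit labels only, with end-inclusive slicing.
by_label = pop2014.loc['Python']
label_slice = pop2014.loc['C':'Python']

# Boolean mask: keep only the entries where the condition holds.
above_75 = pop2014[pop2014 > 75]

print(pos_slice)
print(label_slice)
print(above_75)
```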
Another way to create a series is from a Python dictionary. Indeed, sometimes it's useful to think of Pandas series as akin to dictionaries rather than NumPy arrays. So let me write down what the (mumbles) magazine thought of computer language popularity in 2015. As you can see, I'm writing this as a Python dict, with the keys as the values of the index. And, of course, I need round parentheses, not square brackets. Here's the popularity of languages in 2015.
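A dictionary-based series could be sketched as follows. The name pop2015 and every score in it are placeholders chosen for illustration, since the actual 2015 figures are not given in the narration.

```python
import pandas as pd

# Placeholder scores: NOT the real 2015 figures.
# The dict keys become the index; the dict values become the data.
pop2015 = pd.Series({
    'Java': 100.0,
    'C': 95.0,
    'C++': 85.0,
    'Python': 80.0,
    'C#': 65.0,
})

# A series behaves much like a dictionary in several ways.
has_python = 'Python' in pop2015      # membership tests the index
missing = pop2015.get('Fortran')      # None when the label is absent
keys = list(pop2015.keys())           # same labels as pop2015.index

print(pop2015)
```

Note the round parentheses of the pd.Series(...) call around the curly braces of the dict.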
- Writing and running Python in IPython
- Using Python lists and dictionaries
- Creating NumPy arrays
- Indexing and slicing in NumPy
- Downloading and parsing data files into NumPy and Pandas
- Using multilevel series in Pandas
- Aggregating data in Pandas
Skill Level Intermediate
Q: The course shows how to download files from FTP and web servers using Python 3.X. How do I do the same thing with Python 2.7?
A: First import urllib, then use urllib.urlretrieve(URL, filename). For instance, to download the stations.txt file used in the chapter 5 video “Downloading and parsing data files,” you’d do urllib.urlretrieve('ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt', 'stations.txt').
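If the same script has to run under both Python 3.X and Python 2.7, one possible approach is to try the Python 3 import first and fall back to the Python 2.7 one. This is only a sketch: the helper name download is hypothetical, and the call is left commented out because it needs network access and the server from the answer may no longer be reachable.

```python
try:
    # Python 3.X: urlretrieve lives in urllib.request.
    from urllib.request import urlretrieve
except ImportError:
    # Python 2.7: urlretrieve lives directly in urllib.
    from urllib import urlretrieve


def download(url, filename):
    """Hypothetical helper: save the resource at url to a local file."""
    path, _headers = urlretrieve(url, filename)
    return path


# Example call, using the URL from the answer above (requires network access):
# download('ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt',
#          'stations.txt')
```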