- We are now going to work through a quick Python refresher, focusing on the features that are most useful for dealing with data: Python containers, such as lists and dictionaries. The interface to Python containers is a blueprint for the interfaces to more advanced data analysis libraries, such as NumPy and Pandas, which we will discuss later in this course. So we'll look at Python lists, where the indexing and slicing syntax is very similar to that of NumPy arrays. We will look at Python dictionaries, which are in a sense similar to Pandas DataFrames, and we will look at Python comprehensions, a very useful feature of the language that lets us create and modify lists and dicts with great ease.
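As a rough preview of the three features mentioned in the transcript, here is a minimal sketch of list indexing and slicing, a dictionary, and comprehensions. The variable names and values are hypothetical and are not taken from the course's exercise files.

```python
# Hypothetical temperature readings, used only for illustration.
temps = [12.5, 14.0, 9.5, 11.0, 15.5]

first = temps[0]          # indexing starts at 0
last = temps[-1]          # negative indices count from the end
middle = temps[1:4]       # slice: items at positions 1, 2, 3 (end is exclusive)
every_other = temps[::2]  # slice with a step of 2

# A dictionary maps keys to values.
capitals = {"Italy": "Rome", "France": "Paris"}
capitals["Spain"] = "Madrid"  # add a new key/value pair

# Comprehensions build new lists and dicts from existing containers.
warm = [t for t in temps if t > 12]
by_capital = {city: country for country, city in capitals.items()}
```

The same `start:stop:step` slicing notation carries over to NumPy arrays, which is why the course starts with lists.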
Released
11/12/2015
- Writing and running Python in iPython
- Using Python lists and dictionaries
- Creating NumPy arrays
- Indexing and slicing in NumPy
- Downloading and parsing data files into NumPy and Pandas
- Using multilevel series in Pandas
- Aggregating data in Pandas
Skill Level Intermediate
Q: The course shows how to download files from FTP and web servers using Python 3.X. How do I do the same thing with Python 2.7?
A: First import urllib, then use urllib.urlretrieve(URL, filename). For instance, to download the stations.txt file used in the chapter 5 video “Downloading and parsing data files,” you’d do urllib.urlretrieve('ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt', 'stations.txt').
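If the same script has to run under both Python 2.7 and Python 3.X, one possible approach is to import urlretrieve from whichever module provides it. This is only a sketch: the helper name download is hypothetical, and the FTP address in the comment is the one quoted above, which may no longer be served at that location.

```python
try:
    # Python 3.X: urlretrieve lives in urllib.request.
    from urllib.request import urlretrieve
except ImportError:
    # Python 2.7: urlretrieve lives directly in urllib.
    from urllib import urlretrieve


def download(url, filename):
    """Fetch url, save it as filename, and return the local path.

    `download` is a hypothetical helper, not part of the course files.
    """
    local_path, _headers = urlretrieve(url, filename)
    return local_path


# Example call (not executed here because it needs network access):
# download('ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt',
#          'stations.txt')
```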
Q. What are the issues with DataFrame.sort()?
A: Since Pandas version 0.17, the DataFrame method sort() has been deprecated in favor of sort_values(), and it was removed altogether in version 0.20. Unlike sort(), the new method does not sort records in place unless it is given the option "inplace=True". The following lines of code in the video need changing:
- In Chapter 6: Introduction to Pandas/DataFrames in Pandas
- twoyears = twoyears.sort('2015', ascending=False) -> twoyears = twoyears.sort_values('2015', ascending=False)
- In Chapter 7: Baby names with Pandas/A yearly top ten
- allyears_indexed.loc['M',:,2008].sort_values('number', ascending=False).head()
- pop2008 = allyears_indexed.loc['M',:,2008].sort_values('number', ascending=False).head()
- def topten(sex,year):
- simple = allyears_indexed.loc[sex,:,year].sort_values('number', ascending=False).reset_index()
- In Chapter 7: Baby names with Pandas/Name Fads
- [in addition to lines above, which are used to initialize the "name fads" computation]
- spiky_common = spiky_common.sort_values(ascending=False)
- spiky_common = spiky_common.sort_values(ascending=False); spiky_common.head(10)
- In Chapter 7: Baby names with Pandas/Solution
- [in addition to lines above, which are used to initialize the "name fads" computation]
- totals_both = totals_both.sort_values(ascending=False)
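The reason every corrected line reassigns its result can be seen in a small sketch. The DataFrame below is a hypothetical stand-in for the course's twoyears table, with made-up values.

```python
import pandas as pd

# Hypothetical stand-in for the course's `twoyears` DataFrame.
twoyears = pd.DataFrame(
    {"2014": [3, 1, 2], "2015": [10, 30, 20]},
    index=["a", "b", "c"],
)

# sort_values() returns a new, sorted DataFrame and leaves the original
# untouched, so the result has to be assigned to a name.
by_2015 = twoyears.sort_values("2015", ascending=False)

# With inplace=True the object itself is reordered and the call returns None.
in_place = twoyears.copy()
returned = in_place.sort_values("2015", ascending=False, inplace=True)
```

Here by_2015 and in_place end up in the order b, c, a, while twoyears keeps its original order a, b, c.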
Q. What are the issues with Pandas categorical data?
A. Since version 0.6, seaborn.load_dataset converts certain columns to Pandas categorical data (see http://pandas.pydata.org/
Q. What are the issues with matplotlib.pyplot.stackplot?
A. In recent versions of matplotlib, the function matplotlib.pyplot.stackplot now throws an error if given the keyword argument "label". This problem occurs in the "Baby names with Pandas/Name popularity" exercise file, and it can be ignored. In the video, matplotlib does not complain, but nevertheless shows no legend for the plot. The tutorial moves on to show how to make a legend using matplotlib.pyplot.text.
Video: Python containers overview