Join Michele Vallisneri for an in-depth discussion in this video NumPy overview, part of Python: Data Analysis.
- In this chapter, we're going to look at NumPy, a third-party package for Python that extends the language with multidimensional arrays. NumPy is a very important part of the Python ecosystem, and it has become the fundamental package for scientific computing with Python. Here, we should take "scientific" to mean "dealing with numbers and math." So, whenever you have long sets of numbers or are doing math with them, NumPy is a good choice. Let's talk a bit about how NumPy arrays are different from Python containers.
You may have heard Python variables described as labels. They are not little cubbies in computer memory, ready to receive a value. Rather, the values are independent objects with their own space in memory, and a Python variable just points there; it's a name for that object. You can have more than one variable referring to the same object. This mechanism is very flexible, and it also allows for lists and dictionaries with heterogeneous elements. However, doing that is not very efficient when you are dealing with lots of values of the same type.
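The "labels" picture can be checked directly in the interpreter. The following is a minimal sketch, not taken from the video; all variable names and values are made up for illustration.

```python
import numpy as np

# Two names for the same list object: a change made through one
# name is visible through the other.
a = [1, 2, 3]
b = a
b.append(4)
same_object = a is b          # both names point at one object

# A Python list can hold elements of different types, because each
# slot is just a reference to an independent object.
mixed = [1, "two", 3.0]
element_types = [type(x).__name__ for x in mixed]

# A NumPy array stores values of a single type (its dtype), which
# suits large collections of same-type numbers.
values = np.array([1.0, 2.0, 3.0])
doubled = values * 2          # element-wise, no explicit loop
```

Here `a` and `b` end up as the same four-element list, while `values` carries one dtype for all of its elements.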
Released 11/12/2015
- Writing and running Python in IPython
- Using Python lists and dictionaries
- Creating NumPy arrays
- Indexing and slicing in NumPy
- Downloading and parsing data files into NumPy and Pandas
- Using multilevel series in Pandas
- Aggregating data in Pandas
Skill Level Intermediate
Q: The course shows how to download files from FTP and web servers using Python 3.X. How do I do the same thing with Python 2.7?
A: First import urllib, then use urllib.urlretrieve(URL, filename). For instance, to download the stations.txt file used in the chapter 5 video "Downloading and parsing data files," you'd do urllib.urlretrieve('ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt', 'stations.txt').
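For code that has to run under both Python 2.7 and Python 3, one option is to try the Python 3 import first and fall back to the Python 2 one. This is a sketch rather than the course's own code; the helper name `download` is hypothetical.

```python
try:
    from urllib.request import urlretrieve   # Python 3 location
except ImportError:
    from urllib import urlretrieve           # Python 2.7 location


def download(url, filename):
    """Fetch url, save it to filename, and return the local path."""
    path, _headers = urlretrieve(url, filename)
    return path


# Example call (needs network access, so it is left commented out):
# download('ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt',
#          'stations.txt')
```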
Q. What are the issues with DataFrame.sort()?
A: The DataFrame method sort() was deprecated in Pandas version 0.17 in favor of sort_values(), and it has been removed in later versions. The new method does not sort records in place unless it is given the option "inplace=True", so its result must be assigned back to a variable. The following lines of code in the video need changing:
- In Chapter 6: Introduction to Pandas/DataFrames in Pandas
  - twoyears = twoyears.sort('2015', ascending=False) -> twoyears = twoyears.sort_values('2015', ascending=False)
- In Chapter 7: Baby names with Pandas/A yearly top ten
  - allyears_indexed.loc['M',:,2008].sort_values('number', ascending=False).head()
  - pop2008 = allyears_indexed.loc['M',:,2008].sort_values('number', ascending=False).head()
  - def topten(sex,year):
        simple = allyears_indexed.loc[sex,:,year].sort_values('number', ascending=False).reset_index()
- In Chapter 7: Baby names with Pandas/Name Fads
  - [in addition to lines above, which are used to initialize the "name fads" computation]
  - spiky_common = spiky_common.sort_values(ascending=False)
  - spiky_common = spiky_common.sort_values(ascending=False); spiky_common.head(10)
- In Chapter 7: Baby names with Pandas/Solution
  - [in addition to lines above, which are used to initialize the "name fads" computation]
  - totals_both = totals_both.sort_values(ascending=False)
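To see why the corrected lines assign the result back, here is a small illustration with made-up data. The names `twoyears` and `'2015'` mirror the course's code, but the values and the `totals` Series are invented for this sketch.

```python
import pandas as pd

# Toy data standing in for the course's twoyears DataFrame.
twoyears = pd.DataFrame({'2014': [3, 1, 2], '2015': [10, 30, 20]},
                        index=['a', 'b', 'c'])

# sort_values returns a new, sorted DataFrame ...
by_2015 = twoyears.sort_values('2015', ascending=False)

# ... and leaves the original untouched unless inplace=True is given,
# so the result has to be assigned back to keep it.
original_order = list(twoyears.index)   # still ['a', 'b', 'c']
sorted_order = list(by_2015.index)      # ['b', 'c', 'a']

# A Series has no columns, so sort_values takes no column name.
totals = pd.Series([5, 9, 7], index=['x', 'y', 'z'])
totals = totals.sort_values(ascending=False)
```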
Q. What are the issues with Pandas categorical data?
A. Since version 0.6, seaborn.load_dataset converts certain columns to Pandas categorical data (see http://pandas.pydata.org/
Q. What are the issues with matplotlib.pyplot.stackplot?
A. In recent versions of matplotlib, the function matplotlib.pyplot.stackplot now throws an error if given the keyword argument "label". This problem occurs in the "Baby names with Pandas/Name popularity" exercise file, and it can be ignored. In the video, matplotlib does not complain, but nevertheless shows no legend for the plot. The tutorial moves on to show how to make a legend using matplotlib.pyplot.text.
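As an alternative to the text-based legend shown in the video, stackplot accepts a plural "labels" keyword (one entry per stacked series), which legend() can pick up. The sketch below is not the course's approach; the data and label strings are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")            # draw off-screen; no display needed
import matplotlib.pyplot as plt

years = [2000, 2001, 2002, 2003]
name_a = [10, 12, 15, 11]        # invented counts for illustration
name_b = [5, 7, 6, 9]

fig, ax = plt.subplots()
# labels (plural) takes one label per stacked series.
ax.stackplot(years, name_a, name_b, labels=["name A", "name B"])
legend = ax.legend(loc="upper left")
legend_labels = [t.get_text() for t in legend.get_texts()]
```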