Join Michele Vallisneri for an in-depth discussion in this video, Doing math with arrays, part of Introduction to Data Analysis with Python.
- In this video, we're going to look at doing mathematics with NumPy arrays. We will learn how to apply simple mathematical operations to an array and between two arrays. I will also show you how to plot one-dimensional arrays. And in case you need it, I will show you how to do some simple linear-algebra operations with them. Let's go back to the IPython notebook and open the exercise file for video 0403. Again, we need to import numpy, and we'll call it np, and we will also import matplotlib.pyplot and call it pp.
We will also instruct the IPython notebook to keep plots inline in the notebook itself. Let's start by generating a NumPy array of numbers between zero and 10. We'll use linspace, which lets us specify how many we want. Let's say 40. We can then try to apply a simple trigonometric function to this array, for instance, the sine. For this, we cannot use Python's standard math.sin function. We need to use the NumPy version, which can take a full array as an argument.
These NumPy functions are known as universal functions for this reason. So we'll assign the result of applying numpy.sin to x to the variable sinx. Let's have a look. The best way to see the result of the operation is actually to plot it. So we'll call the matplotlib function plot and give it the arrays x and sinx as arguments. Here we go. As you can see, the IPython notebook kept this plot inline.
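The steps so far could look roughly like the sketch below. The variable names x and sinx follow the narration; the try/except around math.sin is an added illustration of why the NumPy version is needed, and in a notebook you would normally run the `%matplotlib inline` magic first.

```python
import math

import numpy as np
import matplotlib.pyplot as pp

# 40 evenly spaced values from 0 to 10, both endpoints included
x = np.linspace(0, 10, 40)

# The standard-library sine only accepts a single number, so this fails
try:
    math.sin(x)
except TypeError as err:
    print("math.sin cannot take a full array:", err)

# The NumPy universal function is applied element by element
sinx = np.sin(x)

# Plot sinx against x; plot returns a list with one line object
lines = pp.plot(x, sinx)
```
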
If you save this notebook, the figure will be saved with it. We can also plot a couple of functions together. So let's get the cosine, as well as the sine. I'm copying the plot instruction from the cell above and repeating it with the cosine. Here we go. As in MATLAB, we can modify the style of the plotted lines. For instance, we could give different symbols to the two lines. There are many, many options in matplotlib, and you can look at the documentation if you want to learn more.
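As a minimal sketch of plotting both curves with different symbols: the format strings 'x' and 'o' are just one possible choice of markers (the narration does not say which symbols were used), and cosx is an assumed variable name.

```python
import numpy as np
import matplotlib.pyplot as pp

x = np.linspace(0, 10, 40)
sinx = np.sin(x)
cosx = np.cos(x)

# A MATLAB-style format string selects the marker: crosses for the sine,
# circles for the cosine. With only a marker given, no connecting line is drawn.
sin_line, = pp.plot(x, sinx, 'x')
cos_line, = pp.plot(x, cosx, 'o')
```
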
Just like we can apply a unary function, like sine, to a NumPy array, we can also do arithmetic between arrays. For instance, let's take the product of the sine and cosine arrays, and let's take a slightly more complicated function, the difference of their squares. And let's again plot them. Both times, we will be using the array x as the horizontal axis. We can also add a legend to this plot, so that we can tell the two curves apart.
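A sketch of this arithmetic and of a labeled plot follows. The names y and z, the label texts, and the order of the subtraction (cosine squared minus sine squared) are assumptions, since the narration only says "the difference of their squares".

```python
import numpy as np
import matplotlib.pyplot as pp

x = np.linspace(0, 10, 40)
sinx = np.sin(x)
cosx = np.cos(x)

# Arithmetic between arrays of the same shape works element by element
y = sinx * cosx            # product of the two arrays
z = cosx**2 - sinx**2      # difference of their squares (order assumed)

pp.figure()                # start a fresh figure for this example
pp.plot(x, y, label='sin(x) cos(x)')
pp.plot(x, z, label='cos(x)^2 - sin(x)^2')

# legend() picks up the label given to each curve
legend = pp.legend()
```
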
This is done with the matplotlib function legend. Normally, mathematical operations are applied to arrays element by element. However, if you want to do linear algebra, that's not what you need. For instance, you may want to take the inner product of two vectors, that is, the sum of the element-by-element products. You can do this in NumPy with the function dot. This will treat the two one-dimensional arrays as vectors. We could also take the outer product, which builds the product of every possible pair of elements from the two vectors.
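To make the two products concrete, here is a small sketch with two short example vectors (a and b are made-up values, not the arrays from the video).

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Inner product: sum of the element-by-element products,
# 1*4 + 2*5 + 3*6 = 32
inner = np.dot(a, b)

# Outer product: a 3-by-3 matrix whose entry [i, j] is a[i] * b[j]
outer = np.outer(a, b)

print(inner)
print(outer)
```
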
The result is a matrix. NumPy always tries to be helpful and to guess what you want to do. So it has some broadcasting rules, with which it will try to make sense of operations between arrays of different shapes. For instance, if you have a one-dimensional vector, and you add a number to it, just a single number, the number will be added to every element. Let me fix this simple typo, it's linspace, not space.
Broadcasting also applies if we try to add a one-dimensional array to a two-dimensional array. So let's see what happens if I add a one-dimensional array of size n to a two-dimensional array of size n-by-n. The result is a two-dimensional array where the one-dimensional array has been added to every row. If instead we wanted to add it to every column, we'd first have to turn it into a proper n-by-one column vector, by adding a dimension with numpy.newaxis.
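The broadcasting cases described above can be sketched with small arrays. The names v and m and the choice n = 3 are illustrative assumptions.

```python
import numpy as np

v = np.arange(3)           # one-dimensional array [0, 1, 2], size n = 3
m = np.ones((3, 3))        # two-dimensional n-by-n array of ones

# A single number is added to every element
shifted = v + 10           # [10, 11, 12]

# A size-n array is added to every row of the n-by-n array
by_row = m + v             # each row is [1, 2, 3]

# Adding an axis turns v into an n-by-1 column vector,
# which is then added to every column instead
by_col = m + v[:, np.newaxis]   # each column is [1, 2, 3]

print(by_row)
print(by_col)
```
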
- Writing and running Python in IPython
- Using Python lists and dictionaries
- Creating NumPy arrays
- Indexing and slicing in NumPy
- Downloading and parsing data files into NumPy and Pandas
- Using multilevel series in Pandas
- Aggregating data in Pandas
Skill Level Intermediate
Q: The course shows how to download files from FTP and web servers using Python 3.X. How do I do the same thing with Python 2.7?
A: First import urllib, then use urllib.urlretrieve(URL, filename). For instance, to download the stations.txt file used in the chapter 5 video “Downloading and parsing data files,” you’d do urllib.urlretrieve('ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt', 'stations.txt').
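If the same notebook has to run under both Python versions, one possible approach is to import urlretrieve from whichever location exists. This is only a sketch: the helper name download is hypothetical, and nothing is downloaded until you call it yourself.

```python
try:
    # Python 3: urlretrieve lives in urllib.request
    from urllib.request import urlretrieve
except ImportError:
    # Python 2.7: urlretrieve lives directly in urllib
    from urllib import urlretrieve


def download(url, filename):
    """Hypothetical helper: save the resource at url to a local file."""
    urlretrieve(url, filename)
    return filename


# Example call, left commented out because it needs network access:
# download('ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt',
#          'stations.txt')
```
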