Learn how to perform arithmetic operations on data.
- [Instructor] The benefit of NumPy is that it makes it really easy to do math on data that's stored in arrays and matrices. I know we've talked a lot about arrays and matrices in this course already, but just to give you a formal definition: an array is a one-dimensional container for elements that are all of the same data type. In contrast, a matrix is a two-dimensional container for elements of the same data type; in NumPy it's stored as a two-dimensional array. Let me give you an example of where NumPy can come in handy. Have you ever tried to use a spreadsheet application to perform mathematical operations on a data set that has more than 300,000 rows? What happened? If the application didn't crash, then it took a lot of time and effort to get the program to make the computation.
With NumPy, on the other hand, you can quickly and easily do mathematical and statistical operations on data sets with even millions of records. Simply put, NumPy makes it easy to do math on large data sets. Here are the arithmetic operators that you use in Python. You use the same standard symbols that you would with any scientific calculator. So there's not a lot to cover here, but you might want to familiarize yourself with them. And let me show how multiplication and division are done between a matrix and an array.
Up here we have a matrix and then we're actually going to do multiplication. You would just multiply the values. So a1, here, is going to be multiplied times d1. And so in the first cell it'll be a1 times d1. And similarly, c3 here will be multiplied times d3 here. The two values will be multiplied and the resultant number will be placed in the last row and column. And it works the same with division. It's really pretty easy to understand how arithmetic is done with matrices and arrays.
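Here is a minimal sketch of the element-wise multiplication just described. The names `m` and `d` and their values are made up for illustration, and the exact layout on the slide can't be recovered from the narration, so both orientations of the array are shown.

```python
import numpy as np

# Hypothetical 3x3 matrix and 3-element array (not from the course files)
m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
d = np.array([10, 100, 1000])

# A shape-(3,) array is matched against each row: column j is scaled by d[j]
by_column = m * d

# Reshaped to a column (3, 1): row i is scaled by d[i]
by_row = m * d.reshape(3, 1)

print(by_column)
print(by_row)

# Division follows the same element-by-element pattern
print(m / d)
```

Either way, each result cell is the product (or quotient) of one matrix value and one array value, which is the behavior the narration walks through.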
If you just look here, if you look at the a1 value and we do a multiplication and we were multiplying times d1, then the product of a1 and d1 would be returned in the resultant matrix. NumPy will do that for every single row and column in the exact same workflow. You can also calculate dot products with NumPy. A dot product is a mathematical operation that takes two vectors, or sequences of numbers that have an equal number of elements each, multiplies them element by element, and adds up the products to generate a single number,
otherwise known as a scalar value. And you can also use NumPy for matrix multiplication. Matrix multiplication is a mathematical operation that takes two matrices, or two-dimensional arrays, where the number of columns in the first equals the number of rows in the second, and multiplies them out to generate a single matrix. Let me show you how this works in practice. The first thing I want to show is how to use NumPy to perform arithmetic operations on data. So in this demonstration we're only going to use NumPy and its random number generator.
So we'll say import numpy as np, and from numpy dot random import randn. Okay, and execute this line. We've got our libraries imported and it never looks good to see more than two digits after a decimal point. Let's limit the number of decimal places returned in this demonstration by calling the set print options function and passing in the argument precision equal to two.
We'll say np set underscore print options and then just pass in precision equal to two. Did I spell that right? Yeah it looks good. Okay, before I can show you how to do math using arrays we need to create some arrays to work with. So, we'll create an array a here. Call the array constructor and then pass in a list of six values, and then print that out and see what it looks like.
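The setup narrated so far might look like the following in a notebook cell. The values one through six for `a` come from later in the narration; treat the rest as a sketch rather than the exact course file.

```python
import numpy as np
from numpy.random import randn

# Show at most two digits after the decimal point when printing arrays
np.set_printoptions(precision=2)

# A one-dimensional array built from a list of six values
a = np.array([1, 2, 3, 4, 5, 6])
print(a)
```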
Okay, there we go. And another array we can create will be array b and we'll make that a matrix. I ran the code and we got 10, 20 and 30 in the first row and 40, 50 and 60 in the second row. I want to show you another way to create arrays and that's via assignment. So we'll generate a series of random numbers and we'll assign them to a variable name. So before we create the series of numbers let's just set our seed and then we'll call this array c and we'll say 36 times np random dot randn, and then we'll pass in the value six.
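A sketch of these two steps follows. The narration does not say which seed is used, so the value 25 below is an assumption, and the printed random numbers depend on it.

```python
import numpy as np

np.set_printoptions(precision=2)

# A two-dimensional array (a matrix) with two rows and three columns
b = np.array([[10, 20, 30],
              [40, 50, 60]])
print(b)

# Fix the seed so the random numbers are reproducible.
# The seed value 25 is an assumption; the narration does not state it.
np.random.seed(25)

# Six draws from the standard normal distribution, scaled by 36
c = 36 * np.random.randn(6)
print(c)
```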
So we're going to get six random numbers. And we'll print it out. One thing I would like to point out here is we're using the randn function of the random number generator in NumPy. And so what that does is it actually generates both positive and negative random numbers. And moving on to our last array. This is just going to be a series of whole numbers from one to 34. So in order to create that we'll say np dot arange, use the arange function, and then pass in one and 35. The stop value, 35, is not included.
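As a quick sketch of that last array (the variable name `d` is a placeholder, since the narration doesn't name it):

```python
import numpy as np

# arange(start, stop) counts from start up to, but not including, stop
d = np.arange(1, 35)
print(d)
```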
Print that out and there we have it. I want to point out one thing here, 'cause I'm calling all of these objects arrays but some of them are actually matrices. What you want to keep in mind is that a matrix is actually just a two-dimensional array. That's why you're hearing me say array when I'm actually creating a matrix. But now let's move on to doing math with arrays and matrices. Like I said earlier, it's really easy to do math on arrays and matrices using NumPy. So let me just show you.
If you want to take our a object, one through six, and multiply it by 10, let's see what we get. Well, we get an array that ranges from 10 to 60. That's because we multiplied each value by 10. And if we wanted to add our arrays we could do, say, c plus a. Let's see what we have for c. And so what this is actually going to do is take each value in the c array and add it to the value in the same position in the a array.
So we would expect the first value to come out at nine point two two. Let's see, and there we have it, nine point two two. We could say c minus a and this is going to subtract the two arrays, so it'll subtract a from c. So we would expect we'll get a seven point two two here for our first value. And there we have it. Let's multiply them together, so c times a. Our c array, the first value is eight point two two. In a our first value is one.
So when we multiply them it's just going to be eight point two two. Let's just check to make sure. Yes that's how it works with multiplication. And then division would be c divided by a. And so that's going to be eight point two two divided by one for the first value. So we'll also get eight point two two. So the same way that we calculate the first value is how each of the values in the resultant array is calculated. I also want to show you about multiplying matrices and matrix multiplication.
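The arithmetic in this part of the demonstration can be sketched as below. The seed is an assumed value, so the numbers printed for `c` may not match the ones read out in the narration; the element-by-element pattern is the point.

```python
import numpy as np

np.set_printoptions(precision=2)

a = np.array([1, 2, 3, 4, 5, 6])

np.random.seed(25)             # assumed seed; not stated in the narration
c = 36 * np.random.randn(6)

times_ten = a * 10             # each element of a multiplied by the scalar 10
total = c + a                  # c[i] + a[i] for every position i
difference = c - a             # c[i] - a[i]
product = c * a                # c[i] * a[i]
quotient = c / a               # c[i] / a[i]

for result in (times_ten, total, difference, product, quotient):
    print(result)
```

Because the first value of `a` is one, the first values of `c * a` and `c / a` are both equal to the first value of `c`, as the narration points out.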
So let's just create some objects we can use. We'll say np dot array to generate an array and then let's make three rows and three columns. The first row is going to have a value of two, four and six. And then in the second row you're going to have a one, a three and a five. And then in the third row you'll have a 10, a 20 and a 30.
We'll print that out. Yay, we have a three by three matrix. Now let's create a second matrix, we'll call the second matrix bb. I'll just bring in the code from the last cell and customize it. So this one'll be zero, one and two in the first row. And then three, four, five, six, seven and eight in the remaining elements. And we print this out and there we go, we have that. Now let's see how we can use these together.
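Written out, the two matrices described above would look something like this (integer values are assumed, since the narration doesn't specify a data type):

```python
import numpy as np

# First 3x3 matrix
aa = np.array([[2, 4, 6],
               [1, 3, 5],
               [10, 20, 30]])
print(aa)

# Second 3x3 matrix: zero through eight, filled row by row
bb = np.array([[0, 1, 2],
               [3, 4, 5],
               [6, 7, 8]])
print(bb)
```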
Arithmetic multiplication of two matrices works element by element, just like regular multiplication. So, in Python you can multiply these two matrices simply by writing aa times bb. And as you'll see here, what's really happened is that the values in the first row and column of each of the matrices have been multiplied together, so two times zero returns a zero. Then four has been multiplied by one, so it returns a four. And then the third elements in the first row are multiplied together, so six times two is twelve.
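A short sketch of that element-wise product, using the two matrices as described in the narration:

```python
import numpy as np

aa = np.array([[2, 4, 6],
               [1, 3, 5],
               [10, 20, 30]])
bb = np.array([[0, 1, 2],
               [3, 4, 5],
               [6, 7, 8]])

# The * operator multiplies values that sit in the same position
elementwise = aa * bb
print(elementwise)
# First row: 2*0, 4*1, 6*2
```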
And so on. This is different from the dot product and formal matrix multiplication, which I'm going to show you in just a second. We already talked about dot products and matrix multiplication, but I just want to point out that with NumPy you use the same function to perform both of these operations. That function is the np dot dot function. When called on one-dimensional arrays, this function takes the dot product of the arrays. When it's called on matrices, that is, two-dimensional arrays, the function performs matrix multiplication and returns a single matrix.
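To illustrate the two behaviors of `np.dot`, here is a small sketch with made-up inputs (`u`, `v`, `p`, and `q` are hypothetical names, not from the course):

```python
import numpy as np

# One-dimensional inputs: the result is a single scalar
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
scalar = np.dot(u, v)      # 1*4 + 2*5 + 3*6
print(scalar)

# Two-dimensional inputs: the result is a matrix
p = np.array([[1, 2],
              [3, 4]])
q = np.array([[5, 6],
              [7, 8]])
matrix = np.dot(p, q)
print(matrix)
```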
So let me show you how to do a matrix multiplication. We'll just call the dot function and then we'll pass in our two matrix objects. Just to really quickly go over the procedure for matrix multiplication, you can see here that the first row of the first matrix is multiplied by the first column of the second matrix. When multiplied out and added together, these values all add up to 48. Then for the second step, you would multiply the first row in the first matrix by the second column in the second matrix.
And when you multiply all of these values and add them up, you would get the value of 60. So this is the general procedure for matrix multiplication. Hold on to these ideas because they're going to be really important when we get to talking about singular value decomposition and principal component analysis. But next let's talk about summary statistics.
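The row-by-column procedure walked through above can be checked with a few lines. This is a sketch that rebuilds the two matrices from the narration and compares a hand calculation with `np.dot`:

```python
import numpy as np

aa = np.array([[2, 4, 6],
               [1, 3, 5],
               [10, 20, 30]])
bb = np.array([[0, 1, 2],
               [3, 4, 5],
               [6, 7, 8]])

# Matrix multiplication: each cell is a row of aa dotted with a column of bb
result = np.dot(aa, bb)
print(result)

# First row of aa with first column of bb: 2*0 + 4*3 + 6*6
first_cell = 2 * 0 + 4 * 3 + 6 * 6
# First row of aa with second column of bb: 2*1 + 4*4 + 6*7
second_cell = 2 * 1 + 4 * 4 + 6 * 7
print(first_cell, second_cell)
```

The hand-computed values are 48 and 60, matching the first two entries in the top row of `result`.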