From the course: Faster pandas

Measuring performance

- [Instructor] Once you know your performance goals, you need to see if your code meets them. This process is known as benchmarking. Before you start benchmarking, gather some real data to run the benchmarks against. I can't stress enough how important this step is. I've seen code that passes benchmarks with flying colors on made-up data, then fails miserably in production. Assume you're writing an outlier detection function. So we have find_outliers, which takes data. We find all the places where the data is more than two standard deviations away from the mean, and return the indices of this data. And now we'd like to benchmark this function. We did a survey on the data and found that the median data size is around 10,000 items. So we start IPython, and then we're going to use the %run magic to load the code into IPython, and now we're going to generate some data. So we're going to import numpy as np and import pandas as pd. And now we're going to say that data is a pd.Series of np.random, and…
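The transcript is cut off before the code appears, so here is a minimal sketch of what the described steps might look like. It is not the instructor's actual code: the body of find_outliers, the normally distributed values, the seed, and the number of runs are all assumptions chosen for illustration. Only the two-standard-deviation rule and the 10,000-item size come from the narration.

```python
import timeit

import numpy as np
import pandas as pd


def find_outliers(data: pd.Series) -> pd.Index:
    """Return the index labels of values more than two standard
    deviations away from the mean (the rule described in the narration)."""
    distance = (data - data.mean()).abs()
    return data[distance > 2 * data.std()].index


# Assumption: synthetic, normally distributed values stand in for the data.
# The narration recommends benchmarking against real data whenever possible.
rng = np.random.default_rng(seed=0)
data = pd.Series(rng.normal(size=10_000))  # median size from the survey

outliers = find_outliers(data)

# Time repeated calls with the standard-library timeit module.
n_runs = 100  # assumption: an arbitrary repeat count
seconds = timeit.timeit(lambda: find_outliers(data), number=n_runs)
print(f"{len(outliers)} outliers, {seconds / n_runs * 1e6:.1f} microseconds per call")
```

Inside an IPython session, the same measurement could be taken with the %timeit magic, for example `%timeit find_outliers(data)`, after loading the function with %run.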
