You have a collection of data. You would like to quickly identify the range of the data.
- [Narrator] We begin with the data range descriptive statistic. This will be the easiest descriptive statistic that we cover in section one. All it involves is grabbing the maximum and minimum of a set of values. So in this video we're going to be taking a look at using the maximum and minimum functions in order to find the range of a data set, and we're going to be combining those functions into a single function, which returns a tuple of values. And finally, we're going to compute the range of our away team runs using the function that we prototype.
Let me find my Haskell notebooks in the Jupyter system. So when we last left our video, we pulled the listing of all the away team scores for each game in the 2015 season of Major League Baseball. If you're rejoining this video series after a break, then you may have to open the Kernel menu and use the Restart & Run All feature inside the notebook system. And I'm just going to do that, to make sure that we get things going. We do get a warning message saying that this will clear all of our variables, and that's okay because all of the variables are going to be rebuilt by the notebook.
As for this menu option up here, we're going to close that down. So this was the last thing that we did in our notebook in the last video. And what I would like to do is to store this in a variable called awayRuns. Now in order to find the range of this data set, we're going to utilize two functions, maximum and minimum. With maximum awayRuns, we see that the maximum number of runs scored by any away team in the 2015 season was 21, and with minimum awayRuns, we see that the minimum was zero.
21 sounds like a lot of runs. Let's take a moment and examine the type signatures of the maximum and minimum functions. They both take a list of values and return a single value. And the values are constrained by the Ord type class. With that knowledge, we're going to create a function called range that takes a list of values and returns a tuple of values constrained by the Ord type class. Let's go. So our quick function should probably look like this.
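To make those type signatures concrete, here is a small sketch that restates them for lists. The wrapper names maxOfList and minOfList and the sampleRuns list are made up for illustration; they are not from the video, and the sample is not the real 2015 data. In recent versions of base, maximum and minimum are generalized over Foldable, so the list signatures below are a specialization of what the notebook may display.

```haskell
-- Hypothetical list-specialized wrappers that spell out the signatures
-- described above: a list of Ord values in, a single value out.
maxOfList :: Ord a => [a] -> a
maxOfList = maximum

minOfList :: Ord a => [a] -> a
minOfList = minimum

-- Hypothetical sample scores (an assumption, not the real season data).
sampleRuns :: [Int]
sampleRuns = [3, 0, 7, 21, 4]
```

Evaluating maxOfList sampleRuns and minOfList sampleRuns picks out the largest and smallest entries of the sample list.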
So we're going to call this range, and we're going to be constraining our values by the Ord type class. We're going to accept a list of values and return our tuple of values. And so we can just say range xs, and that will equal a tuple of minimum xs and maximum xs. Good. Now let's test this function. Alright, so: range awayRuns.
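Written out, the quick function described here would look roughly like the following. The sampleRuns list is a hypothetical stand-in for awayRuns, which comes from the course data and is not reproduced in this transcript.

```haskell
-- First draft of range: pair the minimum with the maximum.
-- Note that this version fails at runtime on an empty list.
range :: Ord a => [a] -> (a, a)
range xs = (minimum xs, maximum xs)

-- Hypothetical sample scores (an assumption, not the real season data).
sampleRuns :: [Int]
sampleRuns = [3, 0, 7, 21, 4]
```

Calling range sampleRuns returns the smallest and largest values of the sample as a tuple, minimum first.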
And we see that we get a range of zero to 21. Now what if we pass an empty list? Or what if we just pass a list of one value? These are some things that we didn't consider in this function that I just wrote. So let's explore that for a little bit. So range and then an empty list. We see that we get an error message, Prelude.minimum: empty list, and that's because when our data was passed to the minimum function, it saw that we had an empty list and it threw an error. What we really ought to do is package our return in a Maybe.
That way, we could return Nothing and adjust in case we are given an empty list. So what I would like to do is to find my pre-written range code. I'm just going to copy this. Now I'm going to paste this into the window. So this is our improved range function. And we use a little bit of pattern matching in order to adjust to some of the conditions that we should be looking for in a proper range function.
So I still have a list of values that are constrained by the Ord type class, but now I am packaging my return inside of a Maybe. That way I can adjust to circumstances in which an empty list is passed, such as by returning Nothing. If I have a single value, I can just return that value twice and not even have to worry about the minimum and maximum. But if I get anything else, I can utilize our minimum and maximum functions. That way I can type in range and an empty list.
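The pasted code itself is not reproduced in this transcript, so the following is only a sketch that is consistent with the description above: three pattern-matching cases, with the result wrapped in a Maybe. The author's actual code may differ in detail.

```haskell
-- Improved range, reconstructed from the narration (an assumption,
-- not the author's verbatim code).
range :: Ord a => [a] -> Maybe (a, a)
range []  = Nothing                          -- no data, so no range
range [x] = Just (x, x)                      -- one value is both min and max
range xs  = Just (minimum xs, maximum xs)    -- general case
```

With this version, an empty list yields Nothing instead of a runtime error, and a single-element list skips the calls to minimum and maximum entirely.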
Then range of a list holding just the value 1, and range awayRuns. Great. So this improved function is going to be our prototype for the remaining descriptive statistics in our video series. We're going to be adjusting accordingly, based on the inputs given, and returning Nothing in cases where no result should be given. In our next video, we're going to be discussing how to compute the mean of a data set.
Note: This course was created by Packt Publishing. We are pleased to host this training in our library.
- Data ranges, means, and medians
- Standard deviation
- SQLite3 command line
- Slices of data
- Regular expressions
- Atoms and modifiers
- Character classes
- Line plots of a single variable
- Plotting a moving average
- Feature scaling
- Scatter plots
- Normal distribution
- Kernel density estimation (KDE)