Join Kevin Skoglund for an in-depth discussion in this video, Introducing Lake Pend Oreille, part of Code Clinic: Ruby.
Hello, and welcome to Code Clinic. My name is Kevin Skoglund. Code Clinic is a monthly course where a unique problem is introduced to a collection of lynda.com authors. In response, each author will create a solution using their programming language of choice. You can learn several things from Code Clinic: different approaches to solving a problem, the pros and cons of different languages, and some tips and tricks to incorporate into your own coding practices.
This month, we'll work on a problem in statistical analysis and, to some extent, handling big data. It's common to use a computer to manipulate and summarize large amounts of information, providing important insights on how to improve or handle a situation. In this problem, we'll use weather data collected by the U.S. Navy from Lake Pend Oreille in northern Idaho. Lake Pend Oreille is the fifth-deepest freshwater lake in the United States. So deep, in fact, that the U.S. Navy uses it to test submarines.
As part of that testing, the U.S. Navy compiles an exhaustive list of weather statistics: wind speed, air temperature, barometric pressure, and more. You can browse this data by pointing your web browser at http://lpo.dt.navy.mil. You'll find several weather summaries, a web cam, and the raw data that they collect every five minutes, archived as standard text files. For anyone living or working on Lake Pend Oreille, weather statistics are an important part of everyday life.
Average wind speed can be very different from median wind speed, especially if you're on a small boat in the middle of the lake. In this challenge, each of our authors will use their favorite language to calculate the mean and the median of the wind speed, air temperature, and barometric pressure recorded at the Deep Moor station for a given range of dates. First, let's briefly review mean and median. These are both statistics. To explain statistics, we need to start with a set of numbers. So how about 14 readings for wind gust at the Deep Moor weather station on January 1st, 2014?
You can see the data at this website. The first column is the day the wind gust was recorded. The second column is the time the data was recorded. And the third column is the wind gust in miles per hour. The mean is also known as the average. To calculate the mean of a range of numbers, simply add the values in the set, and then divide by the number of values. In this example we add 14 plus 14, plus 11 plus 11 plus 11 plus 11 plus 11, plus 3, plus 7 plus 7 plus 7 plus 7, plus 4, plus 8.
Then we divide the sum, 126, by 14, the count of numbers that were in the set. In this case, the mean is equal to 9. The median is the number halfway between all the values in a sorted range of values. Think of the median as the median strip of a road. It always marks the center of the road. To calculate the median, first sort the numbers from lowest to highest. If there's an odd number of values, then just take the middle number. But if there's an even number of values, then calculate the average of the two central numbers.
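The two calculations just described can be sketched in a few lines of Ruby. This is only a minimal illustration using the 14 wind gust values read out above; the method names `mean` and `median` are my own choices rather than anything taken from the course solution, and the sketch assumes a non-empty list of numbers.

```ruby
# The 14 wind gust readings (mph) from the worked example.
gusts = [14, 14, 11, 11, 11, 11, 11, 3, 7, 7, 7, 7, 4, 8]

# Mean: add the values, then divide by how many there are.
# Assumes the array is not empty.
def mean(values)
  values.sum.to_f / values.size
end

# Median: sort the values, then take the middle one. With an even
# count, average the two central values instead.
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  if sorted.size.odd?
    sorted[mid].to_f
  else
    (sorted[mid - 1] + sorted[mid]) / 2.0
  end
end

puts mean(gusts)    # prints 9.0
puts median(gusts)  # prints 9.5
```

Because there are 14 readings, an even count, the median here is the average of the two central sorted values, 8 and 11.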
In our case, if I make this HTTP call, I would expect to receive the mean and median for wind speed, air temperature, and barometric pressure for the range of dates starting with March 19th, 2011 and running until midnight on March 20th, 2011. The JSON data would look something like this. So, there's our first challenge: pull statistics from a data set available online. Perhaps you want to pause and create a solution of your own.
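As a rough sketch of the shape of the challenge, the Ruby below filters a handful of rows by date range and prints a JSON summary. Everything specific in it is an assumption made for illustration: the sample rows are invented and are not real Navy readings, the field names (`wind_speed`, `air_temp`, `pressure`) and the JSON layout are hypothetical, and the date range is treated as inclusive of both days. It does not fetch anything from the Navy site.

```ruby
require "date"
require "json"

# Invented sample rows for illustration only; NOT real Navy readings.
# The field names are hypothetical, not the format of the real files.
SAMPLE_ROWS = [
  { date: Date.new(2011, 3, 18), wind_speed: 2.0,  air_temp: 30.0, pressure: 29.0 },
  { date: Date.new(2011, 3, 19), wind_speed: 4.0,  air_temp: 40.0, pressure: 29.5 },
  { date: Date.new(2011, 3, 19), wind_speed: 6.0,  air_temp: 42.0, pressure: 30.0 },
  { date: Date.new(2011, 3, 20), wind_speed: 11.0, air_temp: 44.0, pressure: 30.5 },
  { date: Date.new(2011, 3, 21), wind_speed: 9.0,  air_temp: 50.0, pressure: 31.0 }
].freeze

FIELDS = %i[wind_speed air_temp pressure].freeze

# Returns nil for an empty list so the summary can still be serialized.
def mean(values)
  return nil if values.empty?
  values.sum.to_f / values.size
end

def median(values)
  return nil if values.empty?
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Keep rows whose date falls within from..to (inclusive, an assumption),
# then compute the mean and median of each field.
def summarize(rows, from, to)
  selected = rows.select { |row| (from..to).cover?(row[:date]) }
  FIELDS.each_with_object({}) do |field, summary|
    values = selected.map { |row| row[field] }
    summary[field] = { mean: mean(values), median: median(values) }
  end
end

summary = summarize(SAMPLE_ROWS, Date.new(2011, 3, 19), Date.new(2011, 3, 20))
puts JSON.pretty_generate(summary)
```

A real solution would also need to download and parse the archived text files, which this sketch leaves out because their exact layout is not described here.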
How would you solve this problem? In the next videos, I'll show you how I solved the challenge.
Kevin introduces challenges and provides an overview of his solutions in Ruby. Challenges include topics such as statistical analysis, searching directories for images, and accessing peripheral devices.
Visit other courses in the series to see how to solve the exact same challenges in languages like C#, C++, Java, PHP, and Python.