Hello and welcome to Code Clinic. My name is Ray Villalobos. Code Clinic is a monthly course where a unique problem is introduced to a collection of Lynda.com authors. In response, each author will create a solution using their programming language of choice. You can learn several things from Code Clinic: different approaches to solving a problem, the pros and cons of different languages, and some tips and tricks to incorporate into your own coding practices.
This time we'll work on a problem in statistical analysis and, to some extent, try to handle big data. It's common to use a computer to manipulate and summarize large amounts of information, providing important insight on how to improve and handle a situation. In this problem we'll use weather data collected by the US Navy from Lake Pend Oreille in Northern Idaho. Lake Pend Oreille is the fifth deepest freshwater lake in the United States. So deep, in fact, that the US Navy uses it to test submarines.
As part of that testing, the US Navy compiles an exhaustive list of weather statistics such as wind speed, air temperature, and barometric pressure. You can browse this data by pointing your web browser at http://lpo.dt.navy.mil. You'll find several weather summaries, a webcam, and the raw data they collect, archived as standard text files. For anyone living or working on Lake Pend Oreille, weather statistics are an important part of everyday life.
Average wind speeds can be very different from median wind speeds, especially if you are in a small boat in the middle of the lake. In this challenge, each of our authors will use their favorite language to calculate the mean and median of the wind speed, air temperature, and barometric pressure recorded at the Deep Moor station for a given range of dates. First, let's briefly review mean and median. These are both statistics. Now, to explain statistics we need a set of numbers. How about 14 wind gust readings from the Deep Moor weather station on January 1st, 2014?
You can see the data at this website. The first column is the day the wind gust was recorded. The second column is the time it was recorded. And the third column is the wind gust in miles per hour. The mean is also known as the average. To calculate the mean of a range of numbers, you simply add the values in the set, then divide by the number of values. In this example, we add 14 plus 14 plus 11 plus 11 plus 11 plus 11 plus 11 plus 3 plus 7 plus 7 plus 7 plus 7 plus 4 plus 8, which gives 126, then divide the sum by 14, the count of numbers in the set.
In this case, the mean is equal to 9. The median is the number halfway between all the values in a sorted range of values. Think of the median as the median strip of the road. It always marks the center of the road. To calculate the median, first sort the numbers from lowest to highest. If there is an odd number of values, then just take the middle number. If there's an even number of values, then calculate the average of the central two numbers.
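The two calculations just described can be sketched in a few lines of Python. This is only an illustration, not the solution from the course: the function names `mean` and `median` and the variable `gusts` are my own choices, and the list holds the 14 wind gust readings read out above.

```python
def mean(values):
    """Add the values, then divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted list.

    With an even count there is no single middle value,
    so average the two central values instead.
    """
    ordered = sorted(values)          # lowest to highest
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:         # odd count: take the middle number
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# The 14 wind gust readings (mph) quoted in the transcript.
gusts = [14, 14, 11, 11, 11, 11, 11, 3, 7, 7, 7, 7, 4, 8]

print(mean(gusts))    # 9.0
print(median(gusts))  # 9.5
```

Sorted, the readings are 3, 4, 7, 7, 7, 7, 8, 11, 11, 11, 11, 11, 14, 14. There are 14 of them, so the median is the average of the two central values, 8 and 11, which is 9.5, slightly higher than the mean of 9.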
This is called the same-origin policy, and it means that files that you are loading into a web application should be on the same website that's requesting them. To solve the first problem, I'm letting PHP handle processing the original data into manageable chunks. Those can be requested through a RESTful service. In our case, if we make this HTTP call, we'll get the data for two of the dates in the Lake Pend Oreille statistics. Now, I'm really thankful to David Powers for providing this service for me.
Each one of those will be a series of arrays with the raw data for each type of measurement from Lake Pend Oreille. For example, here's the speed on this particular day. So, there's our first challenge: pull statistics from a data set available online. Perhaps you'll want to pause and create a solution of your own. How would you solve the problem? In the next videos, I'll show you how I solved the challenge.
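As a rough sketch of what working with such a response might look like, the Python below parses a JSON body and reports the mean and median of every series in it. The payload shape is an assumption: the transcript does not show the real response, so the key name `windGust` is hypothetical and the function `summarize` is my own. The sample values are the 14 gust readings quoted earlier, not output from the actual service.

```python
import json
from statistics import mean, median


def summarize(payload_text):
    """Return {series name: (mean, median)} for each non-empty series.

    Assumes the body is a JSON object whose values are lists of numbers.
    """
    payload = json.loads(payload_text)
    return {
        name: (mean(readings), median(readings))
        for name, readings in payload.items()
        if readings  # skip empty series, which have no mean or median
    }


# Hypothetical response body. "windGust" is an assumed key name;
# the readings are the 14 gust values from earlier in the transcript.
sample = '{"windGust": [14, 14, 11, 11, 11, 11, 11, 3, 7, 7, 7, 7, 4, 8]}'

print(summarize(sample))
```

Using the standard library's `statistics` module keeps the sketch short. A real solution would replace `sample` with the text returned by the HTTP call and adjust the key handling to match the actual response.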
Visit other courses in the series to see how to solve the exact same challenges in languages like C++, C#, Java, PHP, Python, R, Ruby, and Swift.