Join Dan Gookin for an in-depth discussion in this video Exploring Lake Pend Oreille, part of Code Clinic: C.
(electronic noises and beeps) - Hello, and welcome to Code Clinic. My name is Dan Gookin. Code Clinic is a course where a unique problem is introduced to a collection of lynda.com authors. In response, each author creates a solution by using their programming language of choice. You can learn several things from Code Clinic: different approaches to solving a problem, the pros and cons of different languages, and some tips and tricks to incorporate into your own coding practices.
This chapter's problem is on statistical analysis, and to some extent, handling big data. It's common to use a computer to manipulate and summarize large amounts of information, providing important insights on how to improve or handle a situation. In this problem, I'll use weather data collected by the US Navy from Lake Pend Oreille in northern Idaho. Lake Pend Oreille is the fifth-deepest freshwater lake in the United States, so deep, in fact, that the US Navy uses it to test submarines.
As part of that testing, the US Navy compiles an exhaustive list of weather statistics: wind speed, air temperature, and barometric pressure. You can browse this data by pointing your web browser at http://lpo.dt.navy.mil. You'll find several weather summaries, a webcam, and the raw data they've collected every five minutes, archived as standard text files. For anyone living or working on or around Lake Pend Oreille, weather statistics are an important part of everyday life.
Average wind speed can be very different from median wind speed, especially if you're on a small boat in the middle of the lake. In this challenge, each Code Clinic author uses their favorite language to calculate the mean and median of the wind speed, air temperature, and barometric pressure recorded at the Deep Moor Station for a given range of dates. First I'll briefly review the concepts of mean and median. These are both statistics. To help explain statistics, I'll use some sample values. These numbers represent 14 readings for wind gust at Deep Moor Weather Station on January 1st, 2014.
You can see the data at this website. The first column is the day the wind gust was recorded. The second column is the time it was recorded, and the third column is the wind gust in miles per hour. The mean is also known as the average. To calculate the mean of a range of numbers, simply add the values in the set, then divide by the number of values. In this example, we add 14 twice, 11 five times, a three, four sevens, a four, and an eight.
Divide that sum, 126, by 14, the number of values in the set, and in this case, the mean equals nine. The median is the middle value in a sorted range of values. Think of the median as the median strip of a road. It always marks the center of the road. To calculate the median, first sort the numbers from lowest to highest. For an odd number of values, just take the middle number. For an even number of values, calculate the average of the two central numbers.
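The mean calculation described above can be sketched in C roughly as follows. This is a minimal illustration, not the solution presented later in the course: the function name `mean`, the array name `gusts`, and the macro `GUST_COUNT` are made up for this example, and the order of the readings in the array is arbitrary, since only the values themselves are given.

```c
#include <stddef.h>

/* The 14 sample wind gust readings (mph) from the example:
   14 twice, 11 five times, a 3, four 7s, a 4, and an 8.
   The order shown here is illustrative only. */
const double gusts[] = { 14, 14, 11, 11, 11, 11, 11, 3, 7, 7, 7, 7, 4, 8 };

/* Number of readings in the gusts array. */
#define GUST_COUNT (sizeof gusts / sizeof gusts[0])

/* Return the arithmetic mean of n values: add them up, then divide
   by how many there are. The caller must pass n greater than zero. */
double mean(const double *values, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += values[i];

    return sum / (double)n;
}
```

Calling `mean(gusts, GUST_COUNT)` on these readings adds up to 126 and divides by 14, giving 9.0, which matches the result worked out by hand.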
For this example, the number of values is even, so you sort the numbers, then take the average of the middle two values, eight and 11. The median is nine and a half. That's the first challenge. Pull data from the website for a specific date regarding wind speed, air temperature, and barometric pressure. Then calculate the mean and median for each set. Perhaps you want to pause and create a solution of your own. How would you solve this problem? In the next few movies, I'll show you how I solved the challenge.
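The median steps described above (sort, then take the middle value or average the two central values) could look something like this in C. It is a sketch under stated assumptions rather than the solution shown in the following movies: the names `median`, `compare_doubles`, and `gusts` are hypothetical, and the function sorts the caller's array in place with the standard library's `qsort`, which is one reasonable choice among several.

```c
#include <stdlib.h>

/* The 14 sample wind gust readings (mph) from the example,
   in an arbitrary, unsorted order. */
double gusts[] = { 14, 14, 11, 11, 11, 11, 11, 3, 7, 7, 7, 7, 4, 8 };

/* Comparison callback for qsort: orders doubles from lowest to highest. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;

    return (x > y) - (x < y);
}

/* Sort the n values in place, then return the median.
   The caller must pass n greater than zero. */
double median(double *values, size_t n)
{
    qsort(values, n, sizeof values[0], compare_doubles);

    if (n % 2 == 1)
        return values[n / 2];   /* odd count: the single middle value */

    /* even count: average of the two central values */
    return (values[n / 2 - 1] + values[n / 2]) / 2.0;
}
```

With the 14 sample readings, the sorted array's two central values are 8 and 11, so `median(gusts, 14)` returns 9.5. If the original order of the readings matters elsewhere in a program, pass a copy of the array, because the sort rearranges it.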
Dan introduces challenges and then provides an overview of his solutions in C. Challenges include topics such as statistical analysis, searching directories for images, and accessing peripheral devices.