We organize and visualize distributions so that we can see trends in our data. To organize and graph a distribution, Joe Schmuller will provides a dataset of maximum temperatures for the month of January in the years 1972-2015 in Madison, Wisconsin. Joe will also show you how to organize the distribution into intervals of five degrees, regardless of year, and count the frequency in each interval to demonstrate frequency distribution.
- [Narrator] We organize and visualize distributions so that we can see trends in our data. Let's learn how to do this. Here, we work with a distribution. That's a set of numbers resulting from measurements. Our measurements are the maximum temperatures in degrees Fahrenheit for the month of January in the years 1972 through 2015 in Madison, Wisconsin. We'll refer to this as Jan Max. And here they are. I put them into a table with 11 rows and four columns so all the values would fit on the screen.
Looking at 44 numbers isn't always the easiest way to spot trends. One thing we can do is put the years into intervals of, say, four years and find the average in each interval. Here's a table of years and means with the average formulas already entered, so you wouldn't have to watch me enter them 11 times. Click on a cell, shows you the details in a formula bar. We can easily visualize this table with a line graph. Select the averages in cells E2 through E12.
Click the Insert tab, then the Charts area. Select Line Chart with lines and markers. After the chart opens, right click inside it and choose Select Data. And then, click the Edit button on the right side. Now, we can put the years on the X axis. Click OK, then OK again, and there they are. Still another way to organize a distribution is to put the temperatures into intervals, of say, five degrees each, regardless of year, and count the frequency in each interval.
So here's the resulting frequency distribution. And here's a graph of the frequency distribution which is called a histogram. On the spreadsheet, click on the Frequency tab. We will use the statistical array function frequency to create a frequency distribution from our data. First, select the array where the frequencies will appear. That's cell I2 through I8. From the statistical menu, select Frequency.
In the data array box, enter the cells that hold the data. Cells B2 through B45. In the bins array box, enter the cells H2 through H8 that hold the upper ends of the intervals. Now, because this is an array function, don't click OK, press these three keys together - Control, Shift, and Enter.
Then the frequencies appear in cells I2 through I8. To graph this distribution, select I2 through I8, and insert a column chart, and let's put the intervals on the X axis by right clicking and selecting data, click Edit, access label range gets the cells G2 through G8. Click OK, and OK again.
Then you can use a plus sign to add elements to the graph like access labels. In a cumulative frequency distribution, each frequency in an interval is added to all the frequencies in the interval below it. Here's a graph of the cumulative frequency distribution. It's a cumulative frequency histogram. Here's how to create the cumulative frequencies and the histogram in Excel. In cell J2, type equals, and then click cell I2, and Enter. Now in cell J3, type equals, click cell I3, plus J2, and Enter.
And now, auto fill the values. To create the histogram, select the cumulative frequencies in cells J2 through J8, and insert a column chart. And once again, we'll put the intervals on the X axis by right clicking, Select Data, and Edit, and the axis label range with cells G2 through G8. Click on OK, OK again, and there's our cumulative frequency histogram.
So, several different ways to organize and visualize a distribution.
He explains how to organize and present data and how to draw conclusions from data using Excel's functions, calculations, and charts, as well as the free and powerful Excel Analysis ToolPak. The objective is for the learner to fully understand and apply statistical concepts—not to just blindly use a specific statistical test for a particular type of data set. Joseph uses Excel as a teaching tool to illustrate the concepts and increase understanding, but all you need is a basic understanding of algebra to follow along.
- Identify functions and charts available for use in Excel.
- Recognize the definition of the Bayesian probability.
- List the three measures of central tendency.
- Compare the usage of inferential statistics to the usage of standard normal distribution.
- Calculate the confidence level when provided the alpha level of uncertainty.
- Define a Type I and Type II error of hypothesis testing.
- Explain when to use a z-test or a t-test.
- Recall the types of variance that occur in multi-factored studies.
- Summarize the function of variables in multiple regression.
- Name three types of correlation values.