Join Curt Frye for an in-depth discussion in this video Inferring results from matched samples selected from two populations, part of Excel 2007: Business Statistics.
If you've ever been to an eye doctor, you're probably familiar with the comparison sequence, "which is better, A or B?" It is much the same for businesses. In many cases, you'll need to determine which of two competing processes produces more desirable results. In this movie, I'll show you how to analyze two data sets that use matched samples. A matched sample means that for every sample in list one there is a matching B sample in list two. In this scenario, Two Trees Olive Oil Company is testing two different processes to purify olive oil.
The values represent the purity level after Process 1 and Process 2 were used on oil from the same batch. You can perform this type of analysis, which is a paired t-test analysis, using the Paired t-Test tool that's part of the Analysis ToolPak, to determine whether the differences between two sets of matched pair measurements are significant. To use the Analysis ToolPak, you click on the Data tab of the Ribbon, and then at the far right edge of the Ribbon, you click Data Analysis. If you don't see the Data Analysis item on the Ribbon on the Data tab, then you need to install the Analysis ToolPak, and I showed you how to do that in a movie early on in the course.
But for now, we'll assume it's installed. So we can click Data Analysis, and then in the Data Analysis dialog box we can scroll down and click t-Test: Paired Two Sample for Means. Click OK and the dialog box of the same name appears. So the first thing we need to do is select the ranges that contain our variables. So I'll click the collapse dialog button. And then rather than selecting this entire table column by moving the mouse pointer until it changes into a downward-pointing black arrow, I'm going to select the cells individually, so the mouse pointer should be what's called a Greek cross, or a white four-way cross, and I will drag down to select cells B1 through B15.
And the reason I'm doing that is because I want to select the column label, so that it will appear in our results and make them easier to understand. So I'll click Expand Dialog button, and we see that Variable 1 Range is correct. Now I will click in Variable 2 Range, click the Collapse Dialog button next to that field, and select again. The mouse pointer should be the white four-way cross. Drag down to cell C15, click the expand dialog button, and we see that we have our correct cell ranges selected.
Now the dialog box asks us to enter a hypothesized mean difference. It is an optional argument, but it helps with the calculations if you can put one in. And in this case, I am going to hypothesize that the difference between the two values is .01. And if you look at the values in the table, and you look at the pairs, you'll see that that's not an unreasonable assumption. I am going to select the Labels check box, because the labels are in the cells B1 and C1. And the Alpha, which is the significance level, I am going to leave at .05.
If you subtract your significance level from 1, then you find your confidence level. So 1-.05 is 95%, which is the usual confidence level that's used in statistics. I do want to put my results on a new worksheet, so I will leave that option button selected. Click OK and I get my results. I'm going to reconfigure the worksheet so that we can read everything. I'll double-click the column margin at the right of the A column, so that we see every value there and then Method 1 and Method 2. I'll double-click for column edges to expand those columns as well. And I'll zoom in a little bit, so that the values are easier to read.
Now we can examine our statistics. So, we find the t statistic is 1.821, and we can compare that to our one-tailed t statistic of 1.77, and the corresponding probability of .045 or 4.5%. The t statistic is larger than the critical one-tail statistic, so that means that the value is significant. However, it is very close to the .05 value, or 5%, so you will probably want to take a larger sample to analyze this data more thoroughly.
Because this example used only 14 observations, you should probably run another batch of tests, preferably with larger sample sizes, to see if there is a significant difference between the two processes.
- Understanding statistical terms
- Creating a basic Excel table
- Auditing formulas
- Creating frequency distributions for qualitative data
- Calculating a running total
- Creating a histogram
- Using PivotTables
- Calculating mean, median, mode, and other numerical data
- Using probability distributions
- Population sampling
- Testing hypotheses
- Developing liner and multiple regression models