In the previous video, you learned about a way to draw a graph that visualizes the way probability is combined to perform a Bayesian analysis. In this video, learn how to create a classification matrix which can take you one step closer to implementing th
- [Instructor] In the previous movie, I showed you a way to draw a graph that visualizes the way probability is combined to perform Bayesian analysis. In this movie, I will show you how to create a classification matrix, which will take us one step closer to implementing the analysis in Excel. So let's review what we know about our base rates and accuracy. Based on what we stated before, 85% of cabs are green, and 15% are blue. The accuracy of witness testimony is 80%, so 80% of the time, they'll be right, 20% wrong.
The cab can be either green or blue. Those are the only options. And the witness can be accurate or not. And using those facts, we can create a table by evaluating the probability of each case. So how do we create a classification matrix? Well, again, we have green cabs 85% of the time, blue cabs 15%, the witness is correct 80% of the time, and incorrect 20%. Here is what the matrix looks like. We have our four cases. The first is where the cab was reported as green and was actually green, and that will occur 68% of the time, .85 times .8.
When the cab was reported green and it was actually blue, that's the 15% blue cab base rate times the 20% incorrect guess, and that is 3% or 0.03. And we have the corresponding values for the cab being reported blue: 17% when it was actually green and 12% when it was actually blue. There are a couple of things that I would like to point out. The first is accuracy. You can see here that the cab is green and reported as green 68% of the time, and it's actually blue and reported blue 12% of the time. So there are those two cells in the table.
.68 plus .12 is .8, and that adds up to the amount that the witness is correct, 80% or .8 of the time. The color of the cab is reported incorrectly, that is, it's actually green when reported blue .17 of the time, and actually blue when reported green .03 of the time, so there you have 17% plus 3% equals 20%, and that's the total time that the witness is incorrect. If you look at the times that the cab is actually green, you have .68 plus .17, that's in the top row.
That's .85, our base rate for green cabs, and at the bottom, we have .03 plus .12, or 15%, .15, and that equals the actual base rate for the blue cabs. So if you're uncertain whether you have done your calculations correctly, you can check that they line up with your assumptions, and then you'll know that your calculations are almost certainly correct. So how do we calculate these probabilities? The probability that a cab reported as blue is actually blue, as mentioned before, is about 41%. That's .12 divided by .29, where .29 is the .17 plus .12 of the time that a cab is reported as blue.
And the probability that a cab reported as blue is actually green is about 59%, and you can check your calculations because 41% and 59% add up to 100%. Now, let's say that the cab is reported as green. It will actually be green about 96% of the time, and it will actually be blue only about 4% of the time. So as you can see, the higher base rate for green cabs means that even with only 80% accuracy, reports that a cab is green are much more likely to be correct than reports that a cab is blue.
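The arithmetic in this walkthrough can also be sketched outside of Excel. The Python below is only an illustration, not the course's workbook: the names `base_rate`, `accuracy`, `joint_matrix`, and `posterior` are hypothetical, and the sketch assumes exactly two cab colors and a witness who is equally accurate for both. It builds the four cells of the matrix, runs the row and accuracy checks described above, and then divides each cell by its "reported" column total to get the final percentages.

```python
import math

# Inputs quoted in the transcript (the variable names are assumptions).
base_rate = {"green": 0.85, "blue": 0.15}  # share of cabs of each color
accuracy = 0.80                            # witness names the true color


def joint_matrix(base_rate, accuracy):
    """Return P(actual color AND reported color) for all four cases.

    Assumes two colors, so a wrong report always names the other color.
    """
    matrix = {}
    for actual, p_actual in base_rate.items():
        for reported in base_rate:
            p_report = accuracy if reported == actual else 1 - accuracy
            matrix[(actual, reported)] = p_actual * p_report
    return matrix


def posterior(matrix, reported):
    """Return P(actual color | reported color) for one reported color."""
    column = {a: p for (a, r), p in matrix.items() if r == reported}
    total = sum(column.values())  # how often this color is reported at all
    return {actual: p / total for actual, p in column.items()}


matrix = joint_matrix(base_rate, accuracy)

# Sanity checks from the walkthrough: each "actual" row sums to its base
# rate, and the two correct cells sum to the witness accuracy.
for color, rate in base_rate.items():
    row_total = sum(p for (a, _), p in matrix.items() if a == color)
    assert math.isclose(row_total, rate)
correct = matrix[("green", "green")] + matrix[("blue", "blue")]
assert math.isclose(correct, accuracy)

for reported in base_rate:
    for actual, p in posterior(matrix, reported).items():
        print(f"reported {reported}, actually {actual}: {p:.0%}")
```

Run as written, the last loop prints roughly 96% and 4% for a cab reported as green, and 59% and 41% for a cab reported as blue, which lines up with the figures quoted above.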
- Distinguish between the mean, median, and mode.
- Describe the relationship between variance and standard deviation.
- Identify a nondirectional hypothesis.
- Point out the difference between COVARIANCE.P and COVARIANCE.S.
- Explain correlation.
- Analyze Bayes’ rule.