- [Instructor] If you haven't worked with Bayesian analysis before, then the concept can be a little unclear. I'd like to work through a classic example, the Tversky and Kahneman taxicab problem, to demonstrate how the process works and give you a better intuition. We'll start with the following facts. A cab was involved in a hit-and-run accident at night, and only two cab companies, green and blue, operate in the city. You, the analyst, have been given the following data. The first is that a witness identified the cab as blue.
The base rate is that 85% of the cabs in the city are green and only 15% are blue. Also, and any attorneys in the audience will find this next claim a bit questionable, but we're illustrating a point, the court tested the reliability of the witness under the circumstances that existed on the night of the accident and concluded that the witness correctly identified each of the two colors 80% of the time and failed 20% of the time. So in this scenario, when the cab is green, the witness says green 80% of the time, and when the cab is blue, the witness says blue 80% of the time, with a 20% failure rate in each case.
So your question is: what is the probability that the cab involved in the accident was blue rather than green? And remember, we have a witness saying the cab was blue, but that witness is only 80% accurate. The answer, which I will give in a second, is explained by Neel Ocean, who provides an excellent, intuitive explanation of how to arrive at it. Think about what it would be. The answer is 41% or thereabouts.
And you can see the URL for Neel's website where he gives his own explanation. So let's visualize the answer. And again, I'm working off of Neel Ocean's work here. We have our base rate. 85% of the cabs in the city are green and 15% are blue. And remember that our witness said the cab was blue. There's an 80% accuracy rate, so 85% of the time the cab will be green, and the witness will either correctly say it's green or incorrectly say it's blue.
And we have the same scenario but with a 15% base rate on the other side for blue. When the cab is blue, the witness will say it's blue 80% of the time and say it's green 20% of the time. Now we need to calculate the compound, or joint, probabilities for each of these scenarios: actually green and reported as green, actually green and reported as blue, actually blue and reported as blue, and actually blue and reported as green. With an 85% base rate and 80% accuracy, the cab will actually be green and reported as green 68% of the time.
A green cab will be reported as blue 17% of the time, and 68 plus 17 adds up to 85. On the blue side, the witness will report blue 80% of the time, and .15 times .8 equals .12, and the witness will report green 3% of the time, .15 times .2. And again, .12 plus .03 adds up to the total of 15%. Now how do we calculate our conditional probability? We need to focus on the times that the cab was reported as blue.
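To make the four scenarios concrete, here is a minimal Python sketch of the arithmetic just described. The variable names (`p_green`, `p_blue`, `accuracy`, `joint`) are illustrative choices, not something from the video; the numbers are the base rates and witness accuracy given above.

```python
# Base rates and witness accuracy as stated in the problem.
p_green = 0.85   # share of cabs that are green
p_blue = 0.15    # share of cabs that are blue
accuracy = 0.80  # witness names the correct color this often

# Joint probability of each (actual color, reported color) pair:
# multiply the base rate by the chance of that report.
joint = {
    ("green", "green"): p_green * accuracy,        # 0.68
    ("green", "blue"): p_green * (1 - accuracy),   # 0.17
    ("blue", "blue"): p_blue * accuracy,           # 0.12
    ("blue", "green"): p_blue * (1 - accuracy),    # 0.03
}

for (actual, reported), p in joint.items():
    print(f"actually {actual:5s} reported {reported:5s}: {p:.2f}")

# The four scenarios cover every possibility, so they sum to 1.
print(f"total: {sum(joint.values()):.2f}")
```

The two rows where the report is blue, .17 and .12, are the ones that matter for the next step.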
So we have 17% of the time a green cab was reported as blue and 12% of the time a blue cab was reported as blue, which again is .15 times .8. So what we will do is take the times that the cab was correctly identified as blue, that's .12, and divide that by the total number of times that the cab was reported as blue, correctly and incorrectly, and that is .12 plus .17, or .29.
So the probability that a cab reported as blue was actually blue, assuming that the witness is 80% accurate and we have our base rate as stated, is .12 divided by .29, or about 41%.
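The division described above is Bayes' rule applied to this problem. As a rough sketch, it can be wrapped in a small Python function. The name `posterior_blue` and its parameters are hypothetical, chosen only for this illustration, and the function assumes, as the problem does, that the witness is equally accurate for both colors.

```python
def posterior_blue(p_blue: float, accuracy: float) -> float:
    """Probability the cab was blue, given the witness said blue.

    Assumes only two colors and the same accuracy for both.
    """
    true_blue = p_blue * accuracy               # blue cab, reported blue
    false_blue = (1 - p_blue) * (1 - accuracy)  # green cab, reported blue
    return true_blue / (true_blue + false_blue)


result = posterior_blue(p_blue=0.15, accuracy=0.80)
print(f"{result:.1%}")  # prints 41.4%
```

Changing the inputs shows how strongly the base rate pulls on the answer: with a 50/50 split of cabs, the same 80%-accurate witness would give 80%.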
- Distinguish between the mean, median, and mode.
- Describe the relationship between variance and standard deviation.
- Identify a nondirectional hypothesis.
- Point out the difference between COVARIANCE.P and COVARIANCE.S.
- Explain correlation.
- Analyze Bayes’ rule.