Wednesday, April 18, 2012

An Intuitive Explanation of Bayes' Theorem

Yudkowsky presents Bayes' Theorem in an "excruciatingly gentle introduction." While many articles explain the application of Bayes' Theorem to one field or another, the theorem itself can be difficult to understand at face value. As Yudkowsky writes, "It's this equation. That's all. Just one equation ... It looks like this random statistics thing." Compounding the problem, the concept is very counter-intuitive; it seems to be one of those things that is inherently difficult for humans to grasp. Yudkowsky's paper attempts to simplify Bayes' Theorem.

After introducing Bayes' Theorem with the standard mammography example, Yudkowsky introduces an even simpler example:

Suppose that a barrel contains many small plastic eggs.  Some eggs are painted red and some are painted blue.    40% of the eggs in the bin contain pearls, and 60% contain nothing.   30% of eggs containing pearls are painted blue, and 10% of eggs containing nothing are painted blue.  What is the probability that a blue egg contains a pearl?  

He then sets up the problem using standard Bayesian notation:

  • p(pearl) = 40%
  • p(blue|pearl) = 30%
  • p(blue|~pearl) = 10%
  • p(pearl|blue) = ?
When set up in this manner, it is easier to plug the information into the formula and arrive at the correct answer, which is 66.7%. However, the most effective way to introduce Bayes' Theorem is using natural frequencies: saying that 40 out of 100 eggs contain pearls, 12 out of 40 eggs containing pearls are painted blue, and 6 out of 60 eggs containing nothing are painted blue. A natural frequencies presentation is one in which the information about the prior probability is included in presenting the conditional probabilities. Counted this way, 12 of the 18 blue eggs contain pearls, which gives the same 66.7%.
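Both routes to the answer can be checked with a short Python sketch. The function names `posterior` and `posterior_from_counts` are illustrative only and do not come from Yudkowsky's article; the numbers are the ones given in the egg problem.

```python
from fractions import Fraction

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' Theorem: p(H|E) from p(H), p(E|H), and p(E|~H)."""
    numerator = prior * p_e_given_h
    evidence = numerator + (1 - prior) * p_e_given_not_h
    return numerator / evidence

def posterior_from_counts(blue_with_pearl, blue_empty):
    """Natural-frequency form: blue eggs with pearls over all blue eggs."""
    return Fraction(blue_with_pearl, blue_with_pearl + blue_empty)

# p(pearl) = 40%, p(blue|pearl) = 30%, p(blue|~pearl) = 10%
by_formula = posterior(Fraction(40, 100), Fraction(30, 100), Fraction(10, 100))

# 12 of the 40 pearl eggs are blue, 6 of the 60 empty eggs are blue
by_counts = posterior_from_counts(12, 6)

print(by_formula, by_counts, f"{float(by_formula):.1%}")
```

Exact fractions are used here so that the two results can be compared for equality without floating-point rounding getting in the way.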

Yudkowsky mentions that presenting the problem in this way seems like cheating. However, in the real world, it is important to cheat, in the sense that the correct answer should be as obvious as possible. 

Bayes' Theorem is significant because it helps people make sense of how probabilities change in light of new evidence. Because test cases measure a sample, not the total population, it makes sense that forecasting accuracy increases with each test. Bayes is thus a method for increasing forecasting accuracy given new information. If the new evidence is statistically independent of the hypothesis, the calculated probability will not change. But if the evidence is statistically linked to it, Bayes will allow the researcher to update the probability, however strong or weak that evidence is.
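A minimal sketch of that last point, assuming the egg problem's 40% prior. The "unrelated" and "weak" likelihoods below are made-up values chosen for illustration, and `update` is a hypothetical helper name.

```python
from fractions import Fraction

def update(prior, p_e_given_h, p_e_given_not_h):
    """Return p(H|E) given the prior and the two likelihoods."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1 - prior) * p_e_given_not_h)

prior = Fraction(40, 100)

# Evidence that is equally likely either way (assumed 25% in both cases)
# is independent of the hypothesis, so the probability stays at the prior.
unrelated = update(prior, Fraction(25, 100), Fraction(25, 100))

# Weakly linked evidence (assumed 30% vs. 29%) still moves the estimate a little.
weak = update(prior, Fraction(30, 100), Fraction(29, 100))

# The blue paint from the egg problem (30% vs. 10%) moves it a lot.
linked = update(prior, Fraction(30, 100), Fraction(10, 100))

for label, value in [("unrelated", unrelated), ("weak", weak), ("linked", linked)]:
    print(f"{label}: {float(value):.1%}")
```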

  • Tests are not the event. We have a cancer test, separate from the event of actually having cancer. We have a test for spam, separate from the event of actually having a spam message.
  • Tests are flawed. Tests detect things that don’t exist (false positive), and miss things that do exist (false negative).
  • Tests give us test probabilities, not the real probabilities. People often consider the test results directly, without considering the errors in the tests.
  • False positives skew results. Suppose you are searching for something really rare (1 in a million). Even with a good test, it’s likely that a positive result is really a false positive on somebody in the 999,999.
  • People prefer natural numbers. Saying “100 in 10,000” rather than “1%” helps people work through the numbers with fewer errors, especially with multiple percentages (“Of those 100, 80 will test positive” rather than “80% of the 1% will test positive”).
  • Even science is a test. At a philosophical level, scientific experiments can be considered “potentially flawed tests” and need to be treated accordingly. There is a test for a chemical, or a phenomenon, and there is the event of the phenomenon itself. Our tests and measuring equipment have some inherent rate of error.
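The false-positive point can be worked through with natural frequencies. In the sketch below, only the "1 in a million" base rate comes from the list above. The test's accuracy figures (99% of real cases caught, 1% of everyone else wrongly flagged) and the population size are assumptions picked to stand in for "a good test."

```python
from fractions import Fraction

population = 100_000_000                 # assumed, large enough for whole counts
base_rate = Fraction(1, 1_000_000)       # "1 in a million", from the text
sensitivity = Fraction(99, 100)          # assumption: catches 99% of real cases
false_positive_rate = Fraction(1, 100)   # assumption: flags 1% of everyone else

# Count people instead of multiplying percentages.
affected = population * base_rate
true_positives = affected * sensitivity
false_positives = (population - affected) * false_positive_rate

# Of everyone who tests positive, what share really has the condition?
p_real_given_positive = true_positives / (true_positives + false_positives)

print(f"affected: {int(affected)}")
print(f"true positives: {int(true_positives)}")
print(f"false positives: {int(false_positives)}")
print(f"p(real | positive): {float(p_real_given_positive):.4%}")
```

Under these assumed figures, 99 true positives are buried among 999,999 false positives, so a positive result is real only about 1 time in 10,102.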

In any case, the formula is still difficult to understand mathematically and cognitively. 

Source: Yudkowsky, Eliezer S. (n.d.). An Intuitive Explanation of Bayes' Theorem.


  1. I searched for more specific applications of Bayes, but as I did not understand any of them, I thought that perhaps an introductory article could be just as useful.

  2. I found this very useful as an introduction to Bayes. How did you find this as an analytic technique, and could you apply it in any way to game theory?

  3. This was the most succinct introduction to Bayes. It was actually better than the YouTube introductory videos. Any chance you found an online calculator/applet to go with this? This would really really make things better for those who are not mathematically inclined.

  4. The biggest problem I have with understanding any advanced mathematics is the way notation (symbols, Greek letters) is thrown around without any explanation. In this example, what is the meaning of "Vertical Bar"? Apparently it is not the transitive "Bitwise OR" that I'm used to from Computer Science - here, it seems to roughly mean "Also is", like "A blue egg _also is_ an egg containing a pearl". Is that a reasonable explanation, or is there more to | than meets the eye?