## Thursday, April 19, 2012

### Summary of Findings (White Team): Bayes’ Theorem (4 out of 5 stars)

Note: This post represents the synthesis of the thoughts, procedures and experiences of others as represented in the articles read in advance (see previous posts) and the discussion among the students and instructor during the Advanced Analytic Techniques class at Mercyhurst University on 19 April 2012 regarding Bayesian Analysis specifically. This technique was evaluated based on its overall validity, simplicity, flexibility, its ability to effectively use unstructured data, and its ease of communication to a decision maker.

Description:
Bayesian analysis is a statistical methodology that uses the formula known as “Bayes’ Rule” to produce a probability estimate. This estimate can be updated as the analyst learns additional evidence relevant to the situation. Where prior information is known about a target population, Bayesian inference allows analysts to evaluate consequential, related probabilities that were not previously obvious. While the formula is fairly complicated and difficult to apply by hand, computer software automates the calculation and makes Bayes easier to apply to complex intelligence problems.

Strengths:

• Useful for situations where frequentist statistics are insufficient or unable to deal with variables and probabilities.
• Immediately generates an estimate and applies a quantitative value to variables that were previously purely qualitative.
• Adjusts for bias or uncalibrated estimates.
• Various software packages lessen the difficulty associated with the formula.
• Natural frequencies can allow users to instinctively pick up on some applications of Bayesian statistics.

Weaknesses:

• Formal Bayesian analysis is not very intuitive - it takes training and insight to apply most effectively.
• Difficult to do by hand in many situations, requiring access to Bayesian software/technology to use most effectively.
• Genuinely difficult to learn and effectively explain to others.

How-To:

1. Define your two events: A (the outcome you want to estimate) and B (the evidence you have observed).
2. Define the probability of A, the probability of its complement (~A), and the probability of B under each of these.
3. Plug these values into the formula:
P(A|B) = [P(B|A) P(A)] / [P(B|A) P(A) + P(B|~A) P(~A)]
In the formula above:
P(A|B) = Probability of A given that B occurs (the probability the analyst is seeking).

P(B|A) = Probability of B given that A occurs.

P(A) = Probability of event A occurring.

P(~A) = Probability of A not occurring.

P(B|~A) = Probability of B given that A did not occur.
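
The three steps above can be sketched in a few lines of Python. This is only a minimal illustration of the formula, not the procedure used in class: the function name `posterior` and its parameter names are made up for this example, and the disease-test numbers are hypothetical, since the class figures are not recorded here.

```python
def posterior(prior_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return P(A|B) using Bayes' Rule as written in the How-To section."""
    p_not_a = 1.0 - prior_a                       # P(~A)
    numerator = p_b_given_a * prior_a             # P(B|A) P(A)
    denominator = numerator + p_b_given_not_a * p_not_a
    if denominator == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return numerator / denominator


# Hypothetical inputs: 1% of people have the disease (A), the test is
# positive (B) for 90% of sick people and for 10% of healthy people.
p_disease_given_positive = posterior(
    prior_a=0.01, p_b_given_a=0.90, p_b_given_not_a=0.10
)
print(f"P(disease | positive test) = {p_disease_given_positive:.3f}")
```

With these assumed inputs the result is roughly 8%, far below the 90% hit rate of the test, because the disease itself is rare.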

Personal Application of Technique:
For the activity, the class was given a brief introduction to the basic concepts of Bayesian analysis and formulas with a hypothetical situation involving diseases and tests with a base accuracy rate. Then, the educators introduced the class to software (Netica) designed to enable and enhance the creation of Bayesian analytic networks. The class was walked through the creation of a basic network based on the prior examples, step-by-step. After this instruction, the class was asked to construct a Bayesian network from the following prompt:

“A crime was committed, and the perpetrator was seen getting into a taxi and driving away. In this city, 85% of the taxis are green, and 15% are blue. You have an eyewitness who claims he saw the man getting into a blue taxi. Your experts estimate a 95% chance that what he saw was the correct color, given that it was nighttime and foggy. What is the actual probability that he saw the right color (blue)?”
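
One way to work the prompt is with natural frequencies, counting taxis instead of multiplying probabilities. The sketch below is not the Netica network built in class. It assumes a hypothetical fleet of 10,000 taxis and reads the 95% figure as the witness naming the correct color 95% of the time for either color, so he calls a green taxi "blue" 5% of the time.

```python
# Hypothetical fleet size, chosen only to give round numbers.
TOTAL_TAXIS = 10_000

blue = TOTAL_TAXIS * 15 // 100    # 15% of taxis are blue
green = TOTAL_TAXIS - blue        # the remaining 85% are green

# Assumption: the witness is right 95% of the time for either color.
blue_called_blue = blue * 95 // 100    # blue taxis correctly reported as blue
green_called_blue = green * 5 // 100   # green taxis mistakenly reported as blue

# Of all the times the witness would say "blue", how often is the taxi blue?
p_blue_given_report = blue_called_blue / (blue_called_blue + green_called_blue)

print(f"Blue taxis reported as blue:  {blue_called_blue}")
print(f"Green taxis reported as blue: {green_called_blue}")
print(f"P(taxi was blue | witness says blue) = {p_blue_given_report:.1%}")
```

Under these assumptions, 1,425 of the 1,850 "blue" reports involve a blue taxi, so the probability is about 77%, not 95%.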

This theory is highly counterintuitive. Bayes’ Rule, stated as a formula of probabilities, runs against the natural instinct to reason in naturally occurring frequencies.