Monday, October 6, 2014

Summary of Findings: Bayesian Forecasting (4 out of 5 stars)

Note: This post represents the synthesis of the thoughts, procedures and experiences of others as represented in the 5 articles read in advance (see previous posts) and the discussion among the students and instructor during the Advanced Analytic Techniques class at Mercyhurst University in October 2014  regarding Bayesian Forecasting specifically. This technique was evaluated based on its overall validity, simplicity, flexibility and its ability to effectively use unstructured data.

Description:
Bayesian analysis is a statistical procedure used to combine prior information with new evidence to produce an estimate.  Bayes' theorem, which formalizes conditional probability, is a means for revising predictions as new relevant evidence is collected.

Strengths:
1. Bayes presents a mathematical approach to the way analysts already do intelligence work
2. Outputs a single percentage as opposed to a range, which can be updated as new evidence is found
3. Bayes is the accepted gold standard of probabilistic mathematics
4. Clearly outlines the process, allowing for modifiers, such as nominal group technique, to be incorporated
5. Simple Bayes can be learned quickly by analysts who lack strong mathematical skills

Weaknesses:
1. Quality of estimate contingent on appropriate selection and weighing of evidence
2. Confirmation of appropriate selection and weighing of evidence not possible until after the fact due to present unknowns and future uncertainties
3. Evidence factored into Bayesian model can fluctuate significantly in importance
4. Output of Bayesian model not always in format that satisfies intelligence requirement
5. The absence of standard deviation and other “traditional” statistical measures creates barriers to communicating estimate to decision makers

Step by Step:  
Note: There are many different ways to complete a Bayesian model. This step-by-step process was identified as one common to the different Bayesian exercises reviewed.
  1. Identify a conceptual problem
  2. Create a baseline
  3. Take pieces of evidence and apply them to potential outcomes of the problem
  4. Apply Bayes' theorem, following Nate Silver's example, to solve for the probability
  5. Repeat the process as new information and/or evidence is introduced to the problem
     
Silver, N. (2012). The signal and the noise: Why so many predictions fail--but some don't. New York: Penguin Press.

Exercise:
Participants were provided a Bayesian calculation Excel sheet and a form containing 8 pieces of evidence pertaining to Iran's nuclear development (available here).  Participants were required to assess the prior probability, the probability of the evidence being directly related to Iran's nuclear development, and the probability of the evidence occurring/appearing regardless of Iran's nuclear weapon intentions.  The prior probability for each evidence evaluation was the revised probability from the previous evaluation.  For example, the revised probability after evaluating Evidence A served as the prior probability before evaluating Evidence B.
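The chaining described above, with each revised probability serving as the next prior, can be sketched using Silver's form of Bayes' theorem. The evidence values below are hypothetical, not those from the exercise:

```python
def bayes_update(prior, p_if_true, p_if_false):
    """Silver's form of Bayes' theorem: returns the revised probability
    given the prior, P(evidence | hypothesis true), and
    P(evidence | hypothesis false)."""
    return (prior * p_if_true) / (prior * p_if_true + p_if_false * (1 - prior))

# Hypothetical evaluations for three pieces of evidence, written as
# (P(evidence | nuclear program), P(evidence | no nuclear program)).
evidence = [(0.7, 0.3), (0.6, 0.5), (0.8, 0.4)]

prob = 0.5  # initial prior probability before Evidence A
for p_true, p_false in evidence:
    prob = bayes_update(prob, p_true, p_false)  # posterior becomes next prior
    print(round(prob, 3))  # -> 0.7, then 0.737, then 0.848
```

Each pass through the loop plays the role of one evidence evaluation in the Excel sheet.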

Once the evaluations were completed, all of the participants' results were automatically recorded and graphed.  An aggregated average was also included (available here).

What did we learn from the Bayesian Forecasting Exercise?
Although Bayes is considered the 'gold standard' for probability revision, the complex nature of intelligence work makes simple Bayesian models troublesome to use.  First, it is difficult to weigh evidence accurately without hindsight.  Second, selecting which evidence is worth considering at all is equally cumbersome without hindsight.

Friday, October 3, 2014

Bayesian Reasoning Method For Intelligence Using Natural Frequencies

Summary
Jennifer Lee's 2007 Mercyhurst thesis contains three experiments that suggest an efficient way to teach a Bayesian reasoning technique to intelligence analysts. The experiments indicate that modeling a problem with natural frequencies increases forecasting accuracy, compared to modeling it with conditional probabilities, when solving intelligence-related Bayesian problems. However, Lee found no significant difference in forecasting accuracy between a group using an experimental paper-and-pencil natural frequency tree method and a group using its own form of calculation to solve an intelligence-related Bayesian problem worded with natural frequencies, as applied in an earlier experiment by Ulrich Hoffrage and Gerd Gigerenzer. Both groups using natural frequencies to solve Bayesian problems outperformed the group using traditional statistical language.


Performance of traditional statistics group forecasts compared to Hoffrage/Gigerenzer group forecasts. In the traditional group, 6 out of 34 analysts were within ten points of the correct answer. In the Hoffrage/Gigerenzer group, 26 of the 33 analysts were within ten points of the correct answer.
Lee tested two hypotheses: 
  1. Intelligence-related Bayesian problems worded using natural frequencies will elicit higher Bayesian reasoning amongst intelligence analysts compared to intelligence-related Bayesian problems that are worded using traditional statistical language. 
    1. The findings supported this hypothesis. 
  2. A paper and pencil frequency tree method that utilizes natural frequencies can easily be taught to intelligence analysts within 90 minutes and elicit a higher level of Bayesian reasoning than Bayesian problems that are only worded using natural frequencies.    
    1. The findings supported the first part (ease of teaching) but not the second (superiority of the frequency tree method).
Lee outlines five specific benefits to using Bayesian techniques in intelligence analysis:
  1. Bayes forces analysts to assess all pieces of evidence in a systematic way, reducing biases resulting from recency or visibility. 
  2. Bayesian procedures are transparent and therefore can be reproduced by other analysts who disagree with the final estimate. 
  3. The way in which analysts model problems using Bayes forces them to consider alternative explanations of the facts perceived. 
  4. Bayes forces numerical assignments when weighing pieces of evidence 
  5. Bayes is less conservative than informal opinion and tends to drive probabilities away from 50/50 faster and farther than subjective intuitive judgments do. 
Natural frequencies put statistical information into a form that analysts without a strong quantitative background can intuitively understand. Peter Sedlmeier and Gigerenzer developed a computer program that teaches users how to use natural frequency methods to solve problems that require Bayesian reasoning. While the accuracy of estimates from the group using the experimental paper and pencil natural frequency tree methodology did not differ significantly from the group using Peter Sedlmeier and Gigerenzer's method, the group using the experimental paper and pencil method did in fact learn more about Bayesian methods at a statistically significant level.
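The advantage of natural frequencies can be illustrated with a small sketch (the numbers are invented for illustration, not taken from Lee's experiments): phrased as counts of cases, the Bayesian answer is just a ratio, and it matches the conditional-probability calculation exactly.

```python
# Illustrative population of 1,000 cases, phrased as natural frequencies:
# 10 of 1,000 cases are attacks; 8 of those 10 produce the warning signal;
# 99 of the 990 non-attacks also produce the signal.
hits = 8
false_alarms = 99

# Natural-frequency reasoning: just compare counts of signal-bearing cases.
posterior_nf = hits / (hits + false_alarms)

# The same answer via conditional probabilities (Bayes' theorem).
p_attack, p_signal_attack, p_signal_none = 0.01, 0.8, 0.1
posterior_cp = (p_attack * p_signal_attack) / (
    p_attack * p_signal_attack + (1 - p_attack) * p_signal_none
)

print(posterior_nf, posterior_cp)  # both ~0.0748
```

The count-based version asks for no algebra beyond a ratio, which is why analysts without a strong quantitative background tend to find it more intuitive.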

Critique
The article indicates that it is possible to teach analysts a Bayesian method in a small amount of time. Existing literature indicates that Bayesian reasoning complements other analytic approaches when applied to strategic warning intelligence because Bayes helps ensure that analysts don't assign more meaning to the evidence available than is warranted. Strategic warning analysis focuses on the odds favoring an imminent attack over no imminent attack. Bayes enables analysts to quantify judgments, to evaluate evidence against hypotheses, and to make judgments about individual pieces of evidence instead of the sum of the evidence overall.

One of the hypotheses Lee describes to account for the failure of the natural frequency tree method to statistically outperform the natural frequency methods developed by Sedlmeier and Gigerenzer is the composition of the natural frequency tree group, which consisted primarily of freshmen and sophomores. Seniors made up the sample population of the natural frequency group without the tree method. The qualitative differences in maturity between the two groups could have been enough to affect the results of the natural frequency tree forecasts, but this could not be proven. An implication for future research is to select participants who are more similar in class level.

Source:
Bayesian Reasoning Method For Intelligence Using Natural Frequencies

Bayes' Theorem for Intelligence Analysis

Summary
In his research on Bayesian probabilities, Zlotnick (1970) identifies Bayes as a potentially useful tool for intelligence analysts. The reasoning lies in the nature of intelligence work.  As intelligence analysts, we work with uncertainty every day. Bayesian analysis accommodates this, using probabilities to estimate the likelihood of events based on past occurrences, much as an intelligence analyst would do.

In a very simplistic form, Zlotnick's Bayes formula is R = PL.  R is the revised estimate of the probability of an event occurring after new evidence has been taken into consideration. P is the previous estimate of the likelihood of the event occurring before the new piece of evidence. L is the estimate of the probability of the event occurring based only on this new piece of evidence.
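R = PL is most naturally read in odds form, with R and P as odds favoring the hypothesis and L as the likelihood ratio of the new evidence. A minimal sketch, with illustrative numbers:

```python
def revise(prior_odds, likelihood_ratio):
    """Zlotnick's R = PL: revised odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio

def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def prob(o):
    """Convert odds back to a probability."""
    return o / (1 + o)

# An event judged 50% likely; the new evidence is twice as likely under the
# hypothesis as under its alternative (L = 2).
r = revise(odds(0.5), 2.0)
print(prob(r))  # -> 0.666...
```

Because multiplication is associative, successive pieces of evidence can be folded in one at a time, which is exactly the piece-by-piece evaluation Zlotnick recommends.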

Zlotnick identifies two main differences between intelligence analysis and Bayes:
1) Bayes requires that analysts assign a quantitative value to an estimate. Analysts are familiar with providing estimative words of probability, such as likely, unlikely, and almost certain. Using Bayes, they must turn those words into numbers.

2) Bayes allows the analyst to examine pieces of evidence individually, instead of creating an estimate from a cumulative summary of the evidence; the math provides the overall estimate of all the evidence combined. According to Zlotnick, his experience and research suggest that analysts are often better at making estimates from one piece of evidence than at summarizing a conglomerate of evidence.

There are some problems with Bayes, one being the presence of suspect evidence. Not all evidence comes from completely reliable and accurate sources. The analyst often has to judge the reliability of the information and account for it in his estimate. Unfortunately, analysts will often weigh source reliability against their own opinion of the hypothesis: if information from a questionable source runs counter to what the analyst believes, the analyst may see the source as having low credibility.

In addition, another issue faced by analysts using Bayes is the passage of time and the erosion of evidence. As time passes, pieces of evidence often lose weight; a piece of evidence found a month ago is unlikely to carry the same weight a month later.  Analysts must find a way to account for this in their analysis. One simple approach is for the analyst to periodically go back and re-analyze pieces of evidence to see if they are still relevant or out of date.

Critique

Zlotnick's article on the use of Bayes, while dated, is informative and well written. Some of the issues he identified have since been partially resolved. For example, research has been done to help match words of estimative probability with percentages (Kesselman). Also, to address source reliability, there has been research into the characteristics of a good source and into giving a source a credibility ranking (Norman).  Zlotnick provided an excellent outline of the benefits of Bayes as well as many of the issues we as analysts face. Fortunately, some of these issues have been identified and there are now ways around them.

Source

Zlotnick, J. (1970). Bayes' theorem for intelligence analysis. CIA.

Representing Variable Source Credibility in Intelligence Analysis with Bayesian Networks

By: Ken McNaught and Peter Sutovsky

Summary:
McNaught and Sutovsky used a Bayesian network (BN) as a computational platform to support information fusion. The authors make the point that they are “not advocating the routine use of this approach in intelligence analysis, partly due to the difficulty of quantifying such models.”  They do, however, believe that BNs can help analysts understand aspects of uncertainty. The paper explores the possibility of using BNs to combine evidence from sources of varying credibility.

One strength of a BN is that it can help overcome cognitive biases. Other occupations can also utilize BNs thanks to their flexible and powerful probabilistic modeling framework. According to the authors, “risk modeling and forensic analysis are two fields which share some commonalities with intelligence analysis and in which applications of BNs are increasing.” Using a BN would permit the exploration of “what if” statements, which could help create hypotheses and questions. 

The paper includes an example BN covering three main categories of information: a summary of the “analyst’s current understanding of the situation, an analysis of evidence gaps and key uncertainties, and finally a list of actions required.” Using a BN encourages additional thinking to find missing information, other resources, and new investigative leads. This method also supports a collaborative environment, so colleagues can build on each other’s work.    


Although the authors believe BNs support logical reasoning, they do not advocate routinely quantifying such networks to calculate the probabilities of various hypotheses.  In order to perform probabilistic inference, every node would need to be quantified with a probability distribution, which would require eliciting a large number of probabilities, many of them uncertain.
The authors conclude that BNs could help intelligence analysts overcome some cognitive biases and help “provide important insights regarding the combination of evidence and the sensitivity of inferences to source credibilities.”
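As a toy illustration (not the authors' actual network), a three-node fragment with a hypothesis, a source-credibility node, and a report can be evaluated by marginalizing over the source's credibility:

```python
# Toy BN fragment: hypothesis H, a source that is credible with probability
# p_cred, and a report supporting H. All numbers are hypothetical.
p_h = 0.3              # prior P(H)
p_cred = 0.7           # P(source is credible)
p_rep_h_cred = 0.9     # P(report | H true, credible source)
p_rep_noth_cred = 0.1  # P(report | H false, credible source)
p_rep_uncred = 0.5     # P(report | non-credible source), regardless of H

def p_report_given(h):
    """P(report | H = h), marginalizing over source credibility."""
    cred = p_rep_h_cred if h else p_rep_noth_cred
    return p_cred * cred + (1 - p_cred) * p_rep_uncred

num = p_h * p_report_given(True)
den = num + (1 - p_h) * p_report_given(False)
print(num / den)  # posterior P(H | report)
```

Lowering `p_cred` pulls the report's conditional probabilities toward the uninformative 0.5, so the posterior slides back toward the prior, which is the sensitivity to source credibility the authors describe.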

*For more BN examples in this research, please see the source below.

Critique:
Although this research is somewhat complex to follow without prior knowledge of the subject, the authors did a good job breaking down the material and providing background on the theory. However, further explanation through real-world situations would have made the material easier to understand. I found it interesting when the authors related the information to other academic fields, making it a useful tool for other analysts.

Source

McNaught, K., & Sutovsky, P. Representing variable source credibility in intelligence analysis with Bayesian networks.

Monitoring Murder Crime in Namibia Using Bayesian Space-Time Models

This paper focuses on the murder rate within Namibia and the development of a Bayesian space-time model to monitor the issue. Namibia, a developing sub-Saharan African nation, has one of the highest murder rates in the world. In 2006, the murder rate in Namibia was 0.168 per 1,000 people, placing it among the top six countries in the world with the highest murder rates. This is a decrease from a murder rate of 0.480 per 1,000 people in 1997.

The purpose of this paper was to apply a Bayesian approach to monitoring changes in the risk of being murdered in the 13 regions of Namibia over time.  Utilizing murder data between 2002 and 2006, Neema and Bohning developed a Bayesian model to assess the chances of being murdered over time in the regions. Data was gathered from uniform crime reports in the Department of Crime Information Unit (CI) of the Namibian Police.
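The paper's space-time model is far richer than this, but the basic Bayesian mechanics of estimating a regional murder rate can be sketched with a conjugate Gamma-Poisson model. All numbers below are hypothetical, not taken from the study:

```python
# Hypothetical region: 12 murders observed over 5 years in a population of
# 100,000, with exposure measured in 1,000 person-years.
alpha0, beta0 = 1.0, 1.0                    # vague Gamma prior on the rate
murders = 12
exposure = 5 * 100_000 / 1000               # 500 thousand person-years

# Gamma prior + Poisson counts is conjugate: the posterior is also Gamma.
alpha, beta = alpha0 + murders, beta0 + exposure

posterior_mean = alpha / beta               # murders per 1,000 person-years
print(posterior_mean)
```

Fitting such a model per region and per year, with shared random effects for clustering, is the kind of structure Neema and Bohning's space-time model adds on top of this core idea.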

Neema and Bohning incorporated various factors into their Bayesian model to assess which potential variable has the greatest impact on the likelihood of being murdered. The Bayesian model produced the following posterior mean estimates:

The overall risk of being murdered is 0.889 per 1,000 people, with a credible interval of 0.72 to 1.083, when no random parameters are included in the model. Additionally, the study concluded that regional clustering accounted for the greatest amount of variation in the relative risk of murder (σu = 0.683).

The results of the study revealed that the greatest amount of variation in the relative risk for murder was due to regional clustering, and that population density was insignificant.

Critique

This study properly used a Bayesian model to identify the bounds of the relative risk for murder in Namibia; however, the Bayesian model was not used in a predictive context. Rather than predicting the relative risk for murder in the future, the study identified the key factors that may be contributing to murders. Also, this study did not provide a great deal of evidence supporting its findings. The Bayesian model produced mean estimates correlated with factors, but Neema and Bohning did not provide any further support for why regional clustering contributes to a higher likelihood of being murdered.

Source:

Neema, I. & Bohning, D. (2012). Monitoring murder crime in Namibia using Bayesian space-time models. Journal of Probability and Statistics, 1-13.

Fusion of intelligence information: A Bayesian approach



Summary
The September 11th attacks should be viewed as one defensive failure in a series of otherwise foiled attempts.  In risk management, only defensive failures are broadcast, while successes often go unnoticed by the public.  The attacks succeeded partly due to a lack of information sharing.  As for the little information that was shared, intelligence professionals were not able to measure the value of particular pieces of it.  In other words, they did not effectively find the signals in the noise.

According to Paté-Cornell, Bayesian models provide an efficient means of finding signals in voluminous amounts of noisy information.  Bayesian models calculate the probability of an event given a new signal and a baseline probability, that is, the prior before the new signal.  A basic Bayesian model should contain a prior probability estimate, a probability estimate given a new piece of evidence, and a probability assessment of the relevance of the new piece of evidence.

Paté-Cornell takes a Bayesian model for intelligence fusion one step further.  She introduces additional steps in the model to account for the possibilities of a false positive and a false negative given a new piece of evidence (or missed evidence).  This approach is normally used in engineering risk analysis, but it is logically applicable to intelligence work.  The generalized formula is shown below.
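Her generalized formula appeared as an image in the original post; a simplified sketch of the idea treats the signal source as a detector with a miss rate and a false-alarm rate (the numbers below are illustrative, not hers):

```python
def p_attack_given_signal(p_attack, p_detect, p_false_alarm):
    """Posterior probability of an attack given an observed signal, where the
    detector may miss real attacks (miss rate = 1 - p_detect) and may fire
    when nothing is there (p_false_alarm). A sketch in the spirit of
    Pate-Cornell's model, not her full formula."""
    num = p_attack * p_detect
    return num / (num + (1 - p_attack) * p_false_alarm)

# A rare threat (1%) with a fairly good but noisy detector.
print(p_attack_given_signal(0.01, 0.9, 0.05))  # -> ~0.154
```

Even a 90%-reliable detector leaves the posterior well below certainty here, because the false-alarm rate acts on the far larger population of non-events; this is exactly why she insists on estimating both error rates.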


Critique
Unfortunately for Paté-Cornell, not all intelligence analysts are math fans.  The extra requirements of estimating false positives and false negatives, while logically sound, are difficult to meet.  Intelligence professionals operate in a world of unknown unknowns, arguably more than any other profession.  Quadrant Crunching™ provides a qualitative and visual alternative for those who opt not to stress over numbers.  That said, it is highly commendable to incorporate uncertainty into the equation.

A simpler, more user-friendly Bayesian model is the one given by Nate Silver in his book, The Signal and the Noise: Why So Many Predictions Fail – but Some Don’t.  Silver’s Bayes contains the three known necessary variables, with only one unknown dealing with the new signal.  The way it is structured, the prior probability, or baseline, is resilient in the face of new information.
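Silver's formula, which appeared as an image in the original post, can be reconstructed from the variable definitions that follow:

revised probability = xy / (xy + z(1 - x))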


“x” is the prior probability, “y” is the probability of the signal resulting from a particular event,  and “z” is the probability of the signal being unrelated to the same event considered in “y”.  This formula is much simpler and, according to Silver, can lead to “vast predictive insights.”

Source
Paté-Cornell, E. (2002). Fusion of intelligence information: A Bayesian approach. Risk Analysis, 22(3), 445–454.