Tuesday, April 23, 2013

Applied Visual Analytics for Economic Decision-Making

Summary:

Savikhin et al. (2008) apply visual analytics to improve individuals' economic decision-making skills.   The authors investigated the application of visual analytics to two common problems in economics: the winner's curse and the loser's curse.   The winner's curse occurs when an individual overpays for an item or service, either leaving the buyer worse off or meaning the asset is worth less than the bidder perceived.  The loser's curse occurs when an individual bids below the profit-maximizing level, or a competing entity wins the bid.   The main problem is that decision-makers struggle to form a profit-maximizing bidding strategy, since most are unable to consider all the information that could guide these decisions.   Thus, the authors apply visual analytics to improve decision-making in both winner's-curse and loser's-curse situations.   Savikhin et al. (2008) hypothesized that subjects who used their interactive visual analytic tool would bid closer to the profit-maximizing decision than those who used simple visuals or tabular displays.

Savikhin et al. (2008) ran six treatment groups: three for winner's-curse scenarios and three for loser's-curse scenarios.  The three visual aids participants used to support their decision-making were an interactive visual analytic model, a simple visual, and a tabular display.  Each subject acted independently of the other subjects in the study.   All were given the scenario of being a decision-maker who had to decide how much to bid for a company, and each was given a possible range of values to bid for each company.  Bids were entered into a computer-generated program that randomly decided the value of the company and displayed the three types of graphics.  Over the course of the experiment the participants switched between the three types of visual aids and based their bid values on their interpretation of what the visuals portrayed.   With each of the three visual representations, individuals were given 30 opportunities to bid on various companies.

Overall, subjects given the interactive visual analytics treatment learned the best bid, or optimal solution, more often than those given the simple visual or tabular representation of the bidding information.  Moreover, for both the winner's-curse and loser's-curse groups, the periods using interactive visual analytics outperformed the periods using the other visual treatments, and these results were statistically significant.  Each additional use of the interactive visual analytics model allowed participants to learn from past bidding decisions and make more optimal bids than participants who received the other two visual treatments.  It is also important to note that even a simple visual aid supported more effective decision-making than viewing the information in tabular displays.

Critique: 

I found this study useful because it shows how interactive visual aids can help individuals in the business realm make more efficient decisions by analyzing their situation.  It is important to note that this study suggests that presenting information visually can help overcome cognitive biases in decision-making and improve learning.   Moreover, it would be interesting for a future study to examine why interactive visuals seem to engage our thinking more than simple visuals do.  One limitation of this study was its small sample size, so it would be worthwhile to replicate the study with a much larger sample.  Another limitation is that the authors only looked at bidding patterns in winner's-curse and loser's-curse scenarios, not other economic conditions.  Even though these situations arise often in the business environment, it would be interesting to see which other business decision-making scenarios interactive visual analytics could improve.   I would hypothesize that interactive visual analytics could be applied to multiple areas in the business realm, especially for individuals who learn most effectively through visuals.



Source: Savikhin, A., Maciejewski, R., & Ebert, D. S. (2008). Applied visual analytics for economic decision-making. IEEE Symposium on Visual Analytics Science and Technology, 107-114.  Retrieved from https://www.bioinformatics.purdue.edu/discoverypark/vaccine/assets/pdfs/publications/pdf/Applied%20Visual%20Analytics%20for%20Economic%20Decision-Making.pdf.

Monday, April 22, 2013

Empirical Studies of Information Visualization: A Meta-Analysis


Summary:

The study Empirical Studies of Information Visualization: A Meta-Analysis by Chaomei Chen and Yue Yu provides a meta-analysis of a variety of empirical studies of information visualization.  The intent of the research is to capture the theories and practices in empirical examinations of information visualization. The analysis focuses on three areas of information visualization: users, tasks, and tools.  As a meta-analysis, this article provides a simplified description of, and displays the underlying relations in, the large amount of convoluted, contradictory, and confusing information often found in the literature.

The article first provides an overview of the meta-analytical method and the selection of studies used; a subjective review of the studies is then presented, followed by an identification of the most commonly used hypotheses, independent variables, and dependent variables, and finally by the results of the study.  The research includes experimental studies with independent variables related to one of the three contextual variables (users, tasks, and tools).  The two types of dependent variables used are accuracy and efficiency measures.

The study’s results come in two parts, looking at both users and tools.  Each section compares the empirical findings of individual studies, synthesized in terms of effect sizes and significance levels.  The study found that users with strong cognitive abilities benefit significantly more from visual-spatial interfaces than those with weaker cognitive abilities, and that users with stronger cognitive abilities perform more efficiently than users with weaker cognitive abilities while using visualization. Additionally, the study showed that visual-spatial information-retrieval interfaces enable users to perform better than traditional retrieval interfaces, and that users of visualization interfaces in information retrieval perform more efficiently than those using a non-visualization interface.  The following are the major, all-encompassing conclusions of the study.
                1. Empirical studies of information visualization are diverse and applying meta-analysis methods is difficult.
                2. Future studies would benefit from systematically investigating individual differences, including a variety of cognitive abilities.
                3. When users displayed the same level of cognitive abilities they tended to perform better with simpler visual-spatial interfaces.
                4. The combined effect size of visualization is not statistically significant. A larger homogeneous sample of studies is necessary for conclusive results.

Critique:

This meta-analysis is especially helpful due to the increasing amount of literature on the topic of visualization.  Although its findings were not significant, it provides a very effective start at an overview of the current literature on visualization.  As technology progresses, this study will be one to build upon.
Considering the study came out when visualization was in its infancy, it is understandable that a minimal number of articles was available.  Still, had the authors slightly expanded their criteria, the study would have provided a better overview of visualization overall.  Furthermore, only five studies tested the effects of visualization on accuracy; to strengthen the argument, the authors should have considered increasing the number of articles used in this category.  Additionally, only three studies tested the efficiency of visualization, and the same critique applies: more sources would have greatly benefited the analysis as well as the argument overall.  Again, by expanding the criteria for an article's inclusion, as well as the categories tested in the study, more articles may have been available.

Overall, this study provided a very useful synthesis of visualization and its benefits.  Expanding upon the approach by conducting a similar study in terms of visualization today would provide an interesting comparison and an overview of its progression.   

Source:

Chen, C., & Yu, Y. (2000). Empirical studies of information visualization: A meta-analysis. International Journal of Human-Computer Studies, 53(5), 851-866. Retrieved from: http://www.sciencedirect.com/science/article/pii/S1071581900904221.

Geographical Information Systems–Based Marketing Decisions: Effects of Alternative Visualizations on Decision Quality

Ana-Marija Ozimec, Martin Natter, and Thomas Reutterer conducted a study to examine the effectiveness of different quantitative symbolization methods on maps. The researchers wanted to know which type of visualization method was most effective for decision makers.

The symbolization styles they examined were size-based circles and bars, shadings (value), pure (circles, bars, and shadings), combined (shadings and distortion), circles, and bars. To measure the effectiveness of these symbolization styles, the researchers examined decision accuracy, decision confidence, decision efficiency, and perceived ease of task. Below is a chart of their findings.

The researchers found that using circles was the most effective visualization method for decision makers. Across all four measures of effectiveness, circle symbolization was the best-performing style. Other results showed that the combined shadings-and-distortion style ranked second in decision accuracy but lowest in decision confidence. Value shadings ranked lowest in decision efficiency, while shadings and bars ranked lowest in perceived ease of task.

Critique 

This study has direct implications for intelligence analysts. It found that circle symbolization of quantitative information is the most effective symbolization tool for decision makers. As intelligence analysts, it is our job to continually improve communication between ourselves and our decision makers, and the results of this study can help us do that. Since GIS data can be used by all types of intelligence (national security, law enforcement, and competitive), this study can be applied across all of those fields as well.

The only criticism I have is that circles as a form of quantitative symbolization may not be applicable to all types of scenarios or quantitative data. Though at this moment I cannot think of any, there might be a situation where circles are not the best form of visualization. 

Source: Ozimec, A., Natter, M., & Reutterer, T. (2010). Geographical Information Systems–Based Marketing Decisions: Effects of Alternative Visualizations on Decision Quality. Journal of Marketing, 74(6), 94-110. Retrieved from http://ehis.ebscohost.com/eds/pdfviewer/pdfviewer?sid=8fe30abf-7ab8-4068-be2c-26bac62fd9d7%40sessionmgr4&vid=2&hid=4#

Sunday, April 21, 2013

Investigative Visual Analysis of Global Terrorism

Summary:
This article applies visual analytics to global terrorism.  The authors, Wang, Miller, Smarick, Ribarsky, and Chang, used an existing database with information on terrorist organizations to apply visual methods.  The Global Terrorism Database (GTD) contains information on both domestic and international terrorist organizations.  The authors applied their visual analytic system to this database to examine the five W's (who, what, where, when, and why) of terrorist organizations in a manner that is easier for decision makers to understand.

Prior to this tool, visual analytics for this kind of data typically fell into two groups: social network analysis and geo-temporal visualizations. The system implemented by the authors attempts to combine the two.  The tool has a number of layers that can be activated to look at the various elements of terrorist organizations.  There are layers that show the locations of attacks, which can be drilled into to see the specifics of what took place at each location. Through the different layers, various elements can be visually depicted, making the underlying data easier to understand.

The authors indicated that three types of individuals typically visit the GTD website: the general public, investigative analysts, and terrorism experts.  Through the visual analytics tool, individuals with varying levels of exposure to the subject matter can gain a significant understanding of the material.  When the system was demonstrated to individuals in various organizations, they were all interested in applying the method to their own fields.

Critique:
One element the authors identified in their conclusion was that certain aspects could be enhanced over time.  For example, there were instances of over-plotting of data and geographic lines, which made the display difficult to read.  Overall, this method appears extremely useful to apply to existing data. Not only does it analyze various elements, but it also increases the ease of communicating with decision makers and decreases the ambiguity that may be present in large data sets.  This method is certainly something that should be incorporated when possible, and it certainly enhances the dissemination of information.

Wang, X., Miller, E., Smarick, K., Ribarsky, W., & Chang, R. (2008). Investigative visual analysis of global terrorism. Computer Graphics Forum, 27(3), 919-926. Retrieved from: http://ehis.ebscohost.com/ehost/pdfviewer/pdfviewer?sid=74559524-5a41-4427-a79e-c5e026f76e72%40sessionmgr110&vid=3&hid=17

Thursday, April 18, 2013

Summary of Findings (Green Team): Bayesian Analysis (3.75 out of 5 Stars)



Note: This post represents the synthesis of the thoughts, procedures and experiences of others as represented in the 8 articles read in advance (see previous posts) and the discussion among the students and instructor during the Advanced Analytic Techniques class at Mercyhurst University in April 2013 regarding Bayesian Analysis specifically. This technique was evaluated based on its overall validity, simplicity, flexibility and its ability to effectively use unstructured data.

Description:
Bayesian analysis, developed from the work of Thomas Bayes, is a statistical approach that uses prior knowledge together with updated information, which separates the Bayesian approach from frequentist statistics. Bayesian analysis makes the analysis process iterative, allowing more information to be added as it is learned or deemed relevant. According to Hubbard (2010), Bayes' theorem is a relationship of probabilities and conditional probabilities, that is, the chance of something given a certain condition (pp. 178-179). Bayesian analysis allows analysts to calculate probabilities or make estimates in terms of certain base assumptions as well as new developments. This is especially important in the intelligence field, as analysts should be able to incorporate new signals or indicators into their analysis to update estimates, further reducing their levels of uncertainty.

Strengths:
  • Most helpful in situations in which your initial estimate is at either end of the spectrum
  • Helps analysts to combat emotional/irrational estimates
  • Particularly effective when the updated estimate is compared with the probability held before the newest piece of evidence was added
  • Has the potential to be applied across multiple disciplines, as evidenced more frequently in recent academic journals
Weaknesses:
  • Complex, difficult to learn without a statistics background
  • The volume of evidence makes it difficult to continuously evaluate the information
  • No set way to decide what constitutes evidence
  • Unable to know what is important at the moment, particularly with real-world situations
  • Assumes that each piece of evidence carries the same weight
  • Is less useful in intelligence applications when probabilities are initially determined to be 50/50

How-To:
  1. Start with a rudimentary estimation for the likelihood of an event occurring
  2. Take evidence and apply/use probabilities for potential outcomes
  3. Apply Bayes' formula, where P denotes probability, E is the event of interest, H is a relevant factor or piece of evidence, and P(E|H) is the probability of the event of interest given that the factor occurred in conjunction with it.
  4. Do the math to find the probability percentage.
  5. This process can be repeated as additional information is given/discovered in order to improve the probability estimate.
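The steps above can be sketched numerically. This is a minimal illustration of the repeated-update cycle; the function name and the numbers are mine, not from the class discussion:

```python
def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem for a single hypothesis H and new evidence E."""
    # Steps 3-4: P(H|E) = P(E|H) * P(H) / P(E), where P(E) comes from
    # the law of total probability over H and not-H.
    p_e = p_h * p_e_given_h + (1 - p_h) * p_e_given_not_h
    return p_h * p_e_given_h / p_e

# Step 5: repeat as new evidence arrives, using the old posterior
# as the new prior (illustrative likelihoods).
belief = 0.5
for likelihood_h, likelihood_not_h in [(0.8, 0.3), (0.7, 0.4)]:
    belief = posterior(belief, likelihood_h, likelihood_not_h)
```

Each pass through the loop is one application of steps 1 through 4, with the estimate sharpening as evidence accumulates.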

Personal Application of Technique:
The class was tasked with finding out the likelihood that an individual, Bob, drove to work based on the fact that he was late. Bob can choose between three different ways to get to work: car, bus, or commuter train. If he takes his car, there is a 50% chance that he will be late. The bus has a 20% chance of making him late while the train has a 1% chance of making him late.

The first question asked the class to find the likelihood that Bob drove to work, assuming he chose evenly among the three options. Using Bayesian analysis, the class came up with a 70.4% probability that Bob drove to work that day.

The second question added new information about Bob’s normal transportation habits. Bob’s co-worker knew that Bob almost always takes the commuter train, never takes the bus, and takes the car 10% of the time. With this new information, the class updated the probability that Bob drove to work to 84.75%.
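The class exercise can be reproduced numerically. The priors and likelihoods below come from the post itself; the function is simply a generic Bayes computation:

```python
# P(late | mode of transport), as given in the exercise.
p_late = {"car": 0.50, "bus": 0.20, "train": 0.01}

def p_car_given_late(priors):
    """P(car | late) = P(late | car) * P(car) / P(late)."""
    evidence = sum(priors[m] * p_late[m] for m in priors)  # P(late)
    return priors["car"] * p_late["car"] / evidence

# Part A: the boss assumes Bob picks each mode with equal probability.
part_a = p_car_given_late({"car": 1/3, "bus": 1/3, "train": 1/3})

# Part B: the co-worker's priors -- car 10%, bus never, train 90%.
part_b = p_car_given_late({"car": 0.10, "bus": 0.00, "train": 0.90})

print(round(part_a * 100, 1))   # 70.4
print(round(part_b * 100, 2))   # 84.75
```

The calculation matches the class results: the stronger prior toward the train makes a late arrival even more diagnostic of the car.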

This exercise demonstrated that Bayesian analysis is not always the most direct route to the probability of a situation: in this particular case, the boss could simply ask Bob how he got to work that morning. As individuals not seeking degrees in mathematics, the class showed significant hesitation in applying the formula and presenting the findings.

Rating:  3.75 out of 5 stars
Note: The analysts feel this methodology has very strong benefits and is widely applicable; however, it is relatively weak in its application to the intelligence community, particularly in its reliance on numerous factors that affect its utility.
For Further Information:
Hubbard, D. W. (2010). How to measure anything: Second edition. Hoboken, NJ: John Wiley & Sons, Inc.

Summary of Findings (White Team): Bayesian Theory (4.5 out of 5 Stars)

Note: This post represents the synthesis of the thoughts, procedures and experiences of others as represented in the 8 articles read in advance (see previous posts) and the discussion among the students and instructor during the Advanced Analytic Techniques class at Mercyhurst University in April 2013 regarding Bayesian Probability Theory specifically. This technique was evaluated based on its overall validity, simplicity, flexibility and its ability to effectively use unstructured data.


Description:
According to Hubbard, Bayesian theory is a relationship of probabilities and “conditional” probability (Hubbard, 2010). Bayesian analysis allows for the addition of new information as it is learned, decreasing uncertainty in the hypothesis. This method allows a quantitative value to be applied to intelligence questions (ISBA, 2009), and it is very effective in reducing uncertainty.

Strengths:
  1. Applicable to the intelligence field because it gives a probability of likelihood rather than a measure of statistical significance, as in frequentist statistics.
  2. Allows for updates to probabilities with addition of new information.
  3. Helps analysts move out of the confirmation-bias mindset by asking questions such as, "What would I expect to observe if X were true?"
  4. Effective when odds are very high or very low.
  5. Is more intuitive than traditional frequentist statistics.

Weaknesses:
  1. Less helpful when initial probability is close to 50/50.
  2. A piece of evidence that is not accounted for may turn out to be the most important.
  3. It is difficult to decide how much weight to put on each new piece of evidence.
  4. Sometimes difficult to see the real-world application and relevance.
  5. Possibly time consuming to apply Bayesian theory to situations.
  6. Can sometimes be fairly complex and difficult to understand.

Step by Step Action:
  1. Find a topic for which you want to estimate a probability.
  2. Assign probabilities to each hypothesis.
  3. Update the equation with new hypotheses as new evidence becomes available.
  4. Calculate the likelihood that a certain scenario will occur based on the probabilities of each hypothesis.
  5. For the numerator, multiply the prior probability of the hypothesis being tested by the likelihood of the evidence under that hypothesis.
  6. For the denominator, do the same for each individual hypothesis and find the sum of the resulting probabilities.
  7. Divide the numerator by the denominator to find the final probability.
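The numerator/denominator recipe above, combined with the update step, can be sketched for any number of hypotheses. The hypothesis labels and numbers here are illustrative, not from the class material:

```python
def update(priors, likelihoods):
    """One pass of steps 5-7: multiply each prior by its likelihood
    (numerator), sum over all hypotheses (denominator), and divide."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())                  # the denominator
    return {h: j / total for h, j in joint.items()}

# Step 3 in action: yesterday's posterior becomes today's prior as
# each new piece of evidence arrives (illustrative likelihoods).
beliefs = {"H1": 0.5, "H2": 0.5}
for evidence in [{"H1": 0.8, "H2": 0.4}, {"H1": 0.9, "H2": 0.3}]:
    beliefs = update(beliefs, evidence)
```

Because the denominator normalizes over every hypothesis, the posteriors always sum to one, which is what makes the repeated updating coherent.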

Exercise:
In class we worked through a simple Bayesian theory problem to determine the likelihood that an employee who showed up to work late had come by car. The formula used was the general form of Bayes' theorem. There were two parts to the exercise: Part A used the boss's assumed probabilities of Bob coming to work by car, bus, or train to determine the probability that Bob, having arrived late, came by car. Part B analyzed the same situation using a co-worker's probabilities instead. By calculating both parts, the class saw how the probability of Bob coming to work late by car changed as different evidence was placed into the general-form Bayesian equation.

In terms of real-world applicability, the exercise demonstrated that applying Bayesian theory to real-life scenarios can be difficult to initiate, or may not seem like the best approach to figuring out a situation; it would be much easier simply to ask Bob what mode of transportation he took. However, the exercise does demonstrate how Bayesian theory can reduce the uncertainty of a question as more evidence is added, which is essential to the work conducted by intelligence analysts. Overall, Bayesian theory can provide the analyst with a much more reliable estimate.





Tuesday, April 16, 2013

Is It Safe To Go Out Yet? Statistical Inference in a Zombie Outbreak Model

Summary: 
The authors Calderhead, Girolami, and Higham (2010) wrote a paper dealing with the potential outcomes of a zombie outbreak, using Bayes' theorem to support their conclusions.  Since there has never been a zombie outbreak, it is logical to use Bayesian theory to handle unknown quantities that can only be estimated.  Estimates can be made regarding a zombie outbreak, and in turn these estimates can be combined to arrive at likely outcomes.

The application of Bayesian theory to zombie outbreaks starts with a logical prior probability.  In the case of this paper, the authors state that over one day the extremes of probability are that no human turns into a zombie and that all humans are converted into zombies.  This prior is then updated as new data are used (such as a different number of days); the resulting posterior distribution becomes the new prior, and the process is repeated.
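The prior-to-posterior cycle described above can be sketched with a conjugate Beta prior on the daily conversion rate. The choice of a Beta-binomial model and all of the numbers here are mine, for illustration, and not necessarily the authors' model:

```python
# Beta(1, 1) is the uniform prior: every conversion rate from 0
# ("no human turns into a zombie") to 1 ("all humans are converted")
# is considered equally plausible at the start.
alpha, beta = 1.0, 1.0

# Each day's (converted, survived) counts update the distribution;
# the posterior then serves as the prior for the next day's update.
daily_counts = [(3, 97), (5, 95)]  # invented observations
for converted, survived in daily_counts:
    alpha += converted
    beta += survived

posterior_mean = alpha / (alpha + beta)  # point estimate of the rate
```

The same two-line update runs each day, which is exactly the posterior-becomes-prior loop the paper describes, and it naturally yields a distribution (a range) for the conversion coefficient rather than a single number.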

The authors then state that many questions can be answered by successfully finding a likely distribution for human to zombie conversion rates.  Such questions include how many soldiers should be mobilized, the scale of quarantine needed, and whether or not it is alright to leave a hiding spot given the number of zombie sightings during a particular time span.  The authors also emphasize that since the rate of change from human to zombie is likely to not be constant, the beta (conversion coefficient) should be a range and not a singular number.

One of the model comparisons the authors make is between two models: one assumes zombies can attack alone, while the other, following a circulated rumor, assumes zombies travel only in pairs.  The authors then seek to disprove the second model through Bayes factors (posterior odds = Bayes factor * prior odds), in which the statistical evidence for the first model is weighed more heavily than for the second.  The authors find that the first model with the least amount of noise (introduced as Gaussian distributed noise) is most likely, meaning the experimental data deviated least from the expected curve.  Adding additional noise reduced the Bayes factor (which shows how strong the evidence was against the second model).
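The odds relationship used here (posterior odds = Bayes factor * prior odds) is simple enough to sketch directly. The Bayes factor value below is invented for illustration, not taken from the paper:

```python
def posterior_odds(bayes_factor, prior_odds):
    """Posterior odds in favor of model 1 over model 2."""
    return bayes_factor * prior_odds

def to_probability(odds):
    """Convert odds in favor of a model to a probability."""
    return odds / (1.0 + odds)

# With no prior preference between the two models (odds of 1:1), a
# Bayes factor of 9 for model 1 yields posterior odds of 9:1, i.e. a
# 90% posterior probability that model 1 is the better description.
p_model_1 = to_probability(posterior_odds(9.0, 1.0))
```

Adding noise to the data, as the authors describe, would lower the Bayes factor and pull this probability back toward 50%.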


The authors also used Bayesian theory to answer the question of whether it is safer to leave a hiding spot based on the number of zombies spotted in the past few days.  They use two types of analysis: one with no observations of previous days' zombie sightings, and one with five daily totals of zombie sightings.  The stated sightings for the second case were 123, 127, 104, 92, and 74.  The left column of the figure below shows Bayesian factors applied to the first model mentioned above, and the potential outcomes.  The right column shows the same process with a second layer of data (the five days of observations), which greatly reduces the uncertainty regarding potential zombie totals for the next 45 days.  Thus, with these observations and Bayesian theory, uncertainty can be greatly reduced and the chance of surviving longer during a zombie outbreak is much higher.



Critique:
I found this article fairly complex to read with no prior experience with Bayesian theory.  However, I really appreciate the application of Bayesian theory to a zombie outbreak.  Although the topic is (most likely) fantastical, the article is well constructed and thoughtful.  Honestly, the topic caught my eye, and I doubt I would have tried as hard as I did to understand Bayesian theory had it been about something drier.

One issue I had with this article is that it was clearly meant for readers with previous experience with Bayesian theory.  At times the authors referenced aspects of Bayesian theory without defining them; for example, rather than explaining Bayes factors, they simply cited another article.  For the average reader this does not make the topic any easier to understand.  Additionally, basic Bayesian theory is not long or difficult to write out, and including it would have saved me time looking it up to double-check my understanding.

This article is not directly related to intelligence, except that, were there ever a zombie attack, it would help intelligence analysts extend their lifespans.  However, the model could be applied in medicine to the spread of infectious diseases when the transmission rate is unknown; instead of zombies and humans, there would be infected and healthy individuals.

Source:
Calderhead, B., Girolami, M., & Higham, D. (2010). Is it safe to go out yet? Statistical inference in a zombie outbreak model. University of Strathclyde, United Kingdom.  Retrieved from http://www.strath.ac.uk/media/departments/mathematics/researchreports/2010/6zombierep.pdf

Bayesian Inference Analysis of the Uncertainty Linked to the Evaluation of Potential Flood Damage in Urban Areas

Summary:
Fontanazza, Freni, and Notaro explain that flood impacts on highly urbanized areas can be severe and have the potential to increase with the effects of climate change. Thus, decision-makers prefer reduced uncertainty when planning flood mitigation and prevention. This analysis is beneficial because uncertainty exists both in the physical processes that must be simulated in hydraulic models and in the limited data available for model calibration. Additionally, there are sometimes measurement errors in the depth-damage curves, which can affect the data.

In this article, the authors applied Bayesian probability analysis to a case study of Palermo, Italy to determine whether uncertainty decreases with the addition of data. Bayesian analysis has two benefits: "parameter estimation and uncertainty analysis" in both hydraulic model parameters and the depth-damage curve coefficients. They create a mathematical probability model using Bayesian analysis including values in the equation for "the uncertainty of a generic model parameter", "observed values" and a "likelihood function."

The authors split the historical data into three sections: January 1994 to April 1999, May 1999 to January 2003, and February 2003 to December 2008, to determine whether uncertainty would decrease with each subsequent addition of a data group. The land use in the Palermo case study was mostly residential, with 88 percent of the area impervious. The following three images show the reduction in uncertainty as more data became available, demonstrating that Bayesian probability analysis did in fact reduce uncertainty. With the addition of only the second set of data (in the second image), the reduction in uncertainty was about 40%, without a reduction in reliability.
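The pattern the authors report, where each extra block of observations shrinks the posterior uncertainty of a parameter, can be illustrated with a simple conjugate normal-normal update. The model and all numbers below are invented for illustration and are not the Palermo data or the authors' hydraulic model:

```python
import math

def normal_update(mean, var, observations, noise_var):
    """Conjugate normal-normal update for one batch of observations
    of a parameter, assuming known observation noise variance."""
    for y in observations:
        post_var = 1.0 / (1.0 / var + 1.0 / noise_var)  # precisions add
        mean = post_var * (mean / var + y / noise_var)
        var = post_var
    return mean, var

# Three batches of data, standing in for the three historical periods.
stds = []
mean, var = 0.0, 100.0  # vague prior on the parameter
for batch in [[2.1, 1.9], [2.0, 2.2], [1.8, 2.0]]:
    mean, var = normal_update(mean, var, batch, noise_var=1.0)
    stds.append(math.sqrt(var))  # posterior uncertainty after each batch
```

The posterior standard deviation falls with every batch, mirroring the stepwise uncertainty reduction the case study demonstrates.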





Critique:
There were some limitations to the Bayesian analysis: it relies on an initial hypothesis, which can often be subjective, and the approach may not be objective if the parameter distributions are not based on physical observations. Nevertheless, I noticed many advantages to the methodology. The authors successfully demonstrated its effectiveness with a case study, showing with real historical data that a significant reduction in uncertainty was possible. They also accounted for the aforementioned limitations with additional probabilistic analyses of the parameter choices to ensure that they did not skew the results.

The interest in reducing uncertainty for a decision-maker seems common to every profession. I would be curious to see how this could apply to a study of crime mapping, to determine whether a decrease in uncertainty actually occurs as data increase. This could perhaps be applied to the "Newton-Swoope Buffers" in the ATAC Workshop, which are intended to determine the location of an offender's home or business. These buffers change with each additional piece of information, seemingly becoming more accurate with more data. A Bayesian probability analysis could be applied to this tool to determine its effectiveness and, additionally, its application to law enforcement intelligence.

Source:
Fontanazza, C.M., Freni, G., & Notaro, V. (2012). Bayesian inference analysis of the uncertainty linked to the evaluation of potential flood damage in urban areas. Water Science and Technology, 1669-1677. doi: 10.2166/wst.2012.359

Fusion of Intelligence Information: A Bayesian Approach

Elizabeth Paté-Cornell presents a classical probabilistic Bayesian model that she believes can be utilized by the intelligence community to aid in the fusion of intelligence information. The need for such fusion is apparent in the wake of September 11, 2001, and the author suggests the probability of impending attacks can be estimated through Bayesian analysis. The author's two major arguments for the use of the Bayesian model in the IC, particularly related to terrorist attacks, are that it allows for the computation of the posterior probability of an event given the probability of the event prior to observing signals, and that it accounts for the quality of the signals through the probabilities of false positives and false negatives.

Summary:
The author begins by discussing the problems associated with the fusion of information within the US intelligence community, namely the difficulty of ensuring internal communications and the difficulty of merging the content of multiple signals, some sharper than others, some dependent on others and some independent. This research claims that Bayesian analysis can help solve the latter difficulty, and explains the approach in terms of identifying the probability of an impending terrorist attack. It should be noted the author does not claim the model will better detect impending terrorist attacks, rather that it can increase the probability that an attack plan is foiled by guiding "clear thinking at a time when the amount of information is large and confusing and intuitions can be seriously misleading" (Paté-Cornell, 2002, p. 454).

Paté-Cornell's model defines a set of notations for the event of interest and the observed signals; throughout the article, the event of interest is an impending terrorist attack. Through the model, the author presents formulas that address both the prior probability of the event occurring before reading signals, such as intercepted telephone conversations, and the quality of those signals. The formula, shown in a figure in the original article, considers what alternatives to the event of interest could produce the signal, a very important consideration in the intelligence field.
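The posterior computation described above follows the standard form of Bayes' theorem. A minimal sketch with hypothetical numbers (the prior and the signal reliabilities below are invented for illustration, not taken from the article):

```python
def posterior(p_event, p_signal_given_event, p_signal_given_no_event):
    """Bayes' theorem: P(event | signal).

    The denominator weighs the alternative explanation for the signal:
    the signal occurring even when no attack is planned.
    """
    numerator = p_signal_given_event * p_event
    evidence = numerator + p_signal_given_no_event * (1 - p_event)
    return numerator / evidence

# Hypothetical numbers: a 5% prior of an impending attack, a signal
# seen 80% of the time when an attack is coming but 10% of the time
# under normal circumstances.
p = posterior(0.05, 0.80, 0.10)
print(round(p, 3))  # the signal raises the 5% prior to roughly 30%
```

Note how much of the answer is driven by the denominator's second term: if the signal were common in normal circumstances, the same observation would move the prior very little.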


Additionally, the formulas the author presents address the chances that the observed signals are false positives, or that some signal has been missed (a false negative), and how these affect the probability of a future terrorist attack. The probability of a false positive can be calculated by considering the prior probability of the impending attack, before the signals are considered, in conjunction with the rate at which the signal occurs during normal sensor operation when the event does not occur. She explains that her definition of false positives and its application in Bayesian analysis is most useful to the intelligence community because of its consideration of the prior probability of the event, especially considering how drastically that prior probability has increased post-September 11.

The prior probability of an impending attack can be estimated as a combination of the intention of the enemy to attack, the effective planning of that attack (i.e., the ability of the perpetrators to coordinate a plan and avoid detection), and the successful implementation of the plan on a given day (i.e., the ability of the perpetrators to carry out the plan and defeat the target's safeguards). The author argues that the identification of these probabilities alone is of use to the intelligence community, given the chance to reduce the probability of an attack attempt through measures targeting each of these areas (e.g., cutting the flow of funds or increasing security).
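This decomposition of the prior lends itself to a simple worked example; the three component probabilities below are hypothetical, not the author's:

```python
from math import isclose

# Hypothetical numbers illustrating the decomposition described above:
# an attack attempt requires intent, successful planning, and
# successful execution, so the prior is (roughly) their product.
p_intent = 0.30      # enemy intends to attack
p_planning = 0.20    # plan is coordinated without being detected
p_execution = 0.10   # plan defeats the target's safeguards that day

p_prior = p_intent * p_planning * p_execution
print(round(p_prior, 4))

# Countermeasures work by driving any single factor down: halving the
# planning probability (say, by cutting the flow of funds) halves the
# prior as a whole.
assert isclose(p_intent * (p_planning / 2) * p_execution, p_prior / 2)
```

The multiplicative structure is what makes the author's point about countermeasures concrete: reducing any one factor reduces the whole prior proportionally.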

Critique:
The research applies a Bayesian model using hypothetical numerical illustrations for the interpretation and fusion of intelligence information, and could be strengthened through the use of real-life numerical examples, however sensitive in nature. Additionally, the author switches between examples for the various formulas, sometimes relating them back to the overarching theme of terrorist attacks and other times relying on the unrelated example of testing chemicals for poison. This back-and-forth detracts from the overall readability of the research and does not add to the application of the model to the intelligence community. The author uses good examples of potential signals used in intelligence but does not carry them throughout the research.

The author further admits some limitations of the research. First, the model assumes that both the event and the signals are black and white (either they occur or they do not), which is not always the case, particularly in the intelligence community. Further, the research assumes that the likelihood of false signals, whether positive or negative, remains the same over time, also unlikely in the intelligence field. Finally, many of the sources of data for such a model are difficult to accurately quantify, including the frequency of past observations, reliability data for sensors or links, and expert opinions. For instance, how can we accurately, and quantitatively, determine the reliability of human intelligence?

Overall, the research is very interesting and provides insight into the intelligence community and process. Admittedly, the approach only helps solve the second half of the information fusion problem, doing nothing to aid internal communication within the intelligence community. However, any reduction in uncertainty, particularly through objective means, improves the success rate of thwarting planned terrorist attacks and addressing the other problems facing the intelligence community.

Source: 
Paté-Cornell, E. (2002). Fusion of Intelligence Information: A Bayesian Approach. Risk Analysis: An International Journal, 22(3), 445-454.

The Deterrent Effect of Arrest in Incidents of Domestic Violence: A Bayesian Analysis of Four Field Experiments

Summary:
The authors of this study, Berk, Campbell, Klap, and Western (1992), looked at a number of different studies conducted following a study of the Minneapolis Police Department.  The initial study looked at police responses to misdemeanor domestic assault.  There were three response options available to the police officers, and these measures were supposed to be assigned randomly: (1) arrest the suspect, (2) remove the suspect from the premises for 24 hours, or (3) attempt to restore order at that moment.  Through a series of initial and follow-up interviews, it was determined that arrest of the suspect was the most effective way to reduce further violence.  Based on these results, police departments were encouraged to arrest suspects as soon as possible in domestic assault cases.  In addition, the National Institute of Justice funded six replications of the Minneapolis experiment across the United States.

The authors took results from the initial Minneapolis study as well as the six follow-up studies, applied Bayesian analysis, and attempted to determine whether an applicable theory, labeling theory or social control theory, explained the results.  The authors used a combination of Bayesian analysis and meta-analysis to attempt to replicate the original study and its results, with the subsequent studies used as different levels of the Bayesian analysis.

The findings of this analysis determined that there was no generalizable approach to effectively reducing further violence in domestic assault incidents.  Berk et al. determined that there were "good" and "bad" risks, and that the different positions and relations individuals held in society determined the effectiveness of arrest.  Individuals who did not feel constrained by their social standing, or were not constrained by social controls, are seen as "bad" risks: they are likely to offend repeatedly, since they are not as deterred.

The study concluded that social control elements, such as familial ties, relationships, and public perception, are only indicators, not actual measures of attachment.  Therefore, there is no generalizable finding applicable to offenders across the United States, or even to offenders in the same region over time; no overall statement applies to all offenders, nor does any statement apply specifically to a site's past, present, and future offenders.

Critique:
The application of Bayesian analysis was interesting since it not only looked at a statistical element but also included a meta-analysis in an attempt to identify the method most effective at curbing domestic violence.

The study did note that the detailed steps of the Bayesian analysis were located in another document, which made it slightly difficult to understand the larger picture, including the specific elements that went into the analysis.  The overall findings are presented and analyzed in a manner that is coherent to individuals outside of the field.  That being said, it would have been beneficial to include a more detailed depiction of the numerical application of Bayesian analysis rather than the written description alone.

Source:
Berk, R., Campbell, A., Klap, R., & Western, B. (1992). The deterrent effect of arrest in incidents of domestic violence: A Bayesian analysis of four field experiments. American Sociological Review, 57(5), 698-708. Retrieved from http://www.jstor.org/stable/10.2307/2095923

Bayesian Analysis of Intelligence or Improved Advice to Decision-Makers

Summary:
Decisions associated with protecting critical infrastructure are facilitated by collecting intelligence and formulating courses of action to protect it from adversaries. Sometimes time constraints prevent analysts from evaluating the full scope of a situation and every piece of available data. During a crisis, analysts must make assessments based on new pieces of information as they arrive. Bayesian analysis allows decision-makers to assess the credibility of potential threats. Researchers have often used Bayesian analysis alongside other techniques, such as probabilistic risk analysis and game theory, to model threat scenarios in hopes of formulating effective responses by identifying vulnerabilities and risks.


According to the authors, Bayesian analysis is not very popular in the intelligence community for identifying indicators and warnings. The following are the reasons why Bayesian analysis has not been adopted by the intelligence community.


1.      Bayesian inferences in intelligence have not been defined because analysts are uncomfortable applying probability distributions to events.

2.      It is assumed that analysts often pre-process raw intelligence to produce intelligence reports.

3.      A large number of Bayesian tools evaluate only one hypothesis and cannot be applied to situations where adversaries have more than one strategic interest.
4.      Current Bayesian models cannot handle the short time horizon during a crisis.
5.      There is no clear confidence threshold defined for decision makers.
6.      Updating prior beliefs about a specific scenario with new pieces of evidence is considered insignificant.

With these reasons in mind, the authors proposed ways of improving the effectiveness of decisions made during a crisis. First, they plan to incorporate a moving time horizon into the new model. Second, they plan to create a model that is not hindered by the above assumptions common in the intelligence community. The new model would include the following characteristics:

1.      Generalize the Bayesian approach of analyzing intelligence.

2.      Include signals intercepted knowingly and ones gathered through clandestine means.

3.      Recognize denial and deception.

4.      Evaluate the scenario using temporal elements.
5.      Model games in which the data may be misinterpreted, including signals directed at a third party.
6.      Identify and apply a threshold for decision making.
7.      Define prior beliefs based on available military and intelligence resources.
8.      Develop conditional probabilities of the existence or the absence of a threat based on new evidence.   

Critique:

Elisabeth Paté-Cornell and David Blum's article provides a good introduction to Bayesian analysis with regard to intelligence analysis. Although the authors proposed a new model to make Bayesian analysis more relevant to intelligence analysis, they did not sufficiently test the model by conducting an experiment; therefore, no evidence is present to assess its effectiveness. In addition, they failed to provide a good definition of the theory, as well as the advantages and disadvantages of adopting Bayesian analysis for analyzing intelligence. Bayesian analysis allows prior knowledge to be used alongside, and updated by, current knowledge of a scenario. However, Bayesian analysis is sometimes restricted by small sample sizes, and there is no universally valid method of choosing priors: members of a team working to resolve the same problem may come to different conclusions depending on the priors they chose.
 
Source:
Paté-Cornell, E., & Blum, D. (n.d.). Bayesian analysis of intelligence or improved advice to decision-makers. Retrieved from http://create.usc.edu/2010/06/bayesian_analysis_of_intellige.html

Monday, April 15, 2013

Bayesian Analysis for Intelligence: Some Focus on the Middle East

In a report written for the CIA and declassified in 1994, Nicholas Schweitzer discusses the use of Bayesian analysis and how it can benefit traditional analysis.

Schweitzer begins his report by discussing how much information is generated yearly and how it is the analyst's job to be the funnel for this information. Along with the ever-increasing amount of information, the analyst must also deal with very complex situations, usually analyzed with models that fail to account for a large number of complexities. The author believes that by using Bayesian analysis, some of the complexity of a situation can be reduced.

Schweitzer discusses the use of Bayesian inference in the analysis of IMINT. He gives the example of an analyst attempting to determine whether a military unit is a motorized rifle battalion or an infantry regiment. In his example, the analyst finds that there are 10 tanks stationed with this unknown military unit. Using either expert opinion or historical observation, the analyst assigns a 90% probability that the unit is a motorized rifle battalion and a 10% probability that it is an infantry regiment.
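Read in Bayesian terms, the analyst's 90/10 assessment can be reproduced by normalizing prior times likelihood over the two competing hypotheses. The even prior and the likelihoods of observing 10 tanks below are hypothetical stand-ins, chosen only so the posterior matches the split in Schweitzer's example:

```python
def update(priors, likelihoods):
    """Normalize prior x likelihood over competing hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}

# Two competing hypotheses about the unknown unit, starting from an
# even prior; the likelihoods of seeing 10 tanks under each hypothesis
# come from expert opinion or historical observation.
priors = {"motorized rifle battalion": 0.5, "infantry regiment": 0.5}
likelihoods = {"motorized rifle battalion": 0.9, "infantry regiment": 0.1}

posterior = update(priors, likelihoods)
for hypothesis, p in posterior.items():
    print(hypothesis, round(p, 3))
```

The same `update` function can be applied repeatedly as new evidence arrives, each posterior becoming the prior for the next observation.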

In discussing more complex applications, Schweitzer warns the reader that objective probabilities of events are rare and that historical observation is not very useful. However, if the analyst can overcome the difficulties of assigning subjective probabilities to events, then Bayesian analysis will allow him to "squeeze a little more information from the data we do receive." With this in mind, he warns that analysts tend to attribute more precision to a number than they should.

The complex example that Schweitzer creates involves four hypothetical scenarios, each concerning the potential for war between Israel and Egypt and Syria: (1) no hostilities are planned by either country for 30 days; (2) Syria, either alone or with other Arab nations, plans to attack Israel within 30 days; (3) Israel is planning an attack on an Arab nation within 30 days; and (4) Egypt will disavow the disengagement treaty in the next 30 days. Among the analysts used, the initial probability assigned to continued peace was between 70% and 95%. The analysts then revised their estimates as new evidence was presented.

The author concludes by discussing the applicability of Bayesian analysis and the types of questions to which it can be applied. The question must have mutually exclusive outcomes (war, no war); it must be expressed as specific hypothetical outcomes; there must be a rich flow of data related to the situation; and the question must revolve around an activity that produces signs, rather than a largely chance or random event.

Critique

The biggest criticism that I see in this report is the lack of an assessment in the Middle East example. The author sets up the four scenarios and gives a preliminary probability of continued peace. He then begins introducing evidence and assigns probabilities to the pieces of evidence and how they apply to each scenario. However, he stops there. He does not take the next step of showing the new probability of each scenario playing out once the various pieces of evidence are taken into account. He has an appendix of the formulas used, but he does not show the answer.

The author does a very good job of showing how difficult it can be to set up a question to which Bayesian analysis can be applied. This includes his warning about subjectively assigned numbers and how analysts tend to put more belief in them than they should.

Overall, the paper was a good introduction to Bayesian analysis and intelligence; it just felt like an important part was missing.

Source:

Schweitzer, N. Bayesian Analysis for Intelligence: Some Focus on the Middle East. Retrieved from: https://www.cia.gov/library/center-for-the-study-of-intelligence/kent-csi/vol20no2/html/v20i2a03p_0001.htm



Sunday, April 14, 2013

Bayesian Analysis of Longitudinal Data Using Growth Curve Models

Summary:

Bayesian Analysis of Longitudinal Data Using Growth Curve Models bases Bayesian analysis on the principle that probability reflects the degree to which a person believes a hypothesis or proposition.  The article provides a summary of the basics of Bayesian terms and methods, beginning with terms such as priors, posteriors, and the Markov chain Monte Carlo (MCMC) method.  Following the introduction, the authors explain the basic concepts of the latent basis growth model.  Lastly, they present an empirical example fitting a latent basis growth curve model to achievement data from the National Longitudinal Survey of Youth, demonstrating how to analyze data using both noninformative and informative priors.  The findings show that Bayesian methods are an alternative to the maximum likelihood estimation (MLE) method, with additional strengths including the systematic incorporation of prior information from previous studies.  The authors found the Bayesian method to be a more plausible way to analyze small-sample data than MLE.
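The small-sample advantage of informative priors can be illustrated with a simple conjugate normal-normal update. This is a toy example, not the authors' growth curve model, and all numbers below are invented:

```python
from statistics import mean

def posterior_normal_mean(prior_mean, prior_var, data, data_var):
    """Conjugate normal-normal update for an unknown mean with known
    observation variance: a precision-weighted average of prior and data."""
    n = len(data)
    prior_prec = 1 / prior_var
    data_prec = n / data_var
    post_var = 1 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * mean(data))
    return post_mean, post_var

# A hypothetical small sample; the MLE is just the sample mean.
data = [4.1, 5.6, 3.9]
mle = mean(data)

# With a noninformative (very diffuse) prior, the posterior ~ the MLE.
flat_mean, _ = posterior_normal_mean(0.0, 1e6, data, data_var=1.0)

# An informative prior from earlier studies pulls the estimate toward
# prior knowledge, stabilizing small-sample results.
inf_mean, _ = posterior_normal_mean(5.0, 0.5, data, data_var=1.0)

print(round(mle, 3), round(flat_mean, 3), round(inf_mean, 3))
assert abs(flat_mean - mle) < 0.001
```

As the sample grows, the data precision term dominates and the two estimates converge, which is why the prior matters most exactly when the sample is small.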

Bayesian methods are applicable to item response models, factor analytic models, structural equation models, genetic models, and multilevel models.  Both applied and theoretical measurement can benefit from the opportunities Bayesian methods bring.  The authors point out that the strenuous programming and computational demands of Bayesian methods, as well as the complexity of the models that usually require them, often make the methods seem remote and frustrating to empirical researchers.  Through this study and the examples provided, the authors attempt to offer an easy way to implement Bayesian analysis.
  
Critique:

This article provides a summary of the basis for Bayesian methods.  While it is not related to intelligence analysis it does provide a breakdown of the technique, especially useful to those with no experience using Bayesian analysis.  Additionally, the systematic example provided using data from the National Longitudinal Survey of Youth provides a fairly easy to follow, step-by-step example of the method.

An issue that arose not only with this article but with all articles on Bayesian analysis is that although they discuss its importance for analysis, very few apply the method directly to intelligence analysis.  Statements describing Bayesian analysis, such as that Bayes' theorem is useful because it provides a way to calculate the probability of a hypothesis based on the evidence or data, are very relevant to intelligence analysis.  Additionally, the discussion of the probability of the data, also called the likelihood, would allow analysts to use the correct WEP when making statements of estimated probability.  Lastly, the ability of Bayesian methods to estimate complex models in data analysis is extremely effective.  All of these findings and statements from the article are applicable and beneficial to the intelligence community, and applying them to a problem faced by an intelligence analyst would demonstrate the usefulness of Bayesian analysis in those scenarios.

Finally, the article mentions that Bayesian methods using informative priors are an alternative to meta-analysis.  The authors provide a short explanation of this ability but, considering the reliability and weight placed on meta-analytic studies, more information should be given to back up the claim.  A claim like this deserves an entire study rather than a short paragraph and leaves the reader wondering exactly how the Bayesian method can act as an alternative to a meta-analysis.  A more comprehensive explanation is necessary.

Source: 

Grimm, K.J., Hamagami, F., Nesselroade, J.R., Wang, L., & Zhang, Z. (2007). Bayesian analysis of longitudinal data using growth curve models.  International Journal of Behavioral Development. 31 (4), 374-383.

A Bayesian Analysis of Human Decision-Making on Bandit Problems



Summary: 

Steyvers, Lee, and Wagenmakers (2009) conducted their study on how individuals balance exploration and exploitation when solving bandit problems, using Bayesian analysis.  In a bandit problem, the individual must choose between a set of alternatives that have inherently different reward levels and try to maximize the total reward received over a set number of trials (Steyvers et al., 2009).  Bandit problems require the individual to analyze the environment in two distinctive manners, explorative and exploitative: it is crucial to exploit options that are familiar while exploring those that are less familiar (Steyvers et al., 2009).  Thus, striking a happy medium between exploitation and exploration in bandit problems is critical to effective decision-making.
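The exploration-exploitation trade-off can be sketched with a minimal Thompson-sampling strategy on a two-armed bandit. This is a stand-in illustration, not the latent decision models Steyvers et al. actually fit, and the reward rates are invented:

```python
import random

random.seed(1)

# Two arms with hidden reward rates; the decision-maker must balance
# exploiting the arm that looks best with exploring the other.
true_rates = [0.3, 0.6]

# Thompson sampling: keep a Beta(successes+1, failures+1) belief for
# each arm, draw one sample from each belief, and pull the arm with
# the best draw. Uncertain arms get explored; good arms get exploited.
wins = [0, 0]
losses = [0, 0]
pulls = [0, 0]

for _ in range(1000):
    draws = [random.betavariate(wins[i] + 1, losses[i] + 1) for i in range(2)]
    arm = max(range(2), key=lambda i: draws[i])
    reward = random.random() < true_rates[arm]
    pulls[arm] += 1
    if reward:
        wins[arm] += 1
    else:
        losses[arm] += 1

print(pulls)  # the better arm ends up pulled far more often
```

The Bayesian beliefs here are exactly what makes the balance automatic: early on the Beta distributions are wide and either arm can win a draw (exploration), while later the belief about the better arm concentrates and dominates (exploitation).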

Steyvers et al. (2009) utilized a Bayesian extension of optimal decision-making processes to display differences in human decision-making when reward rates differ within the individual's environment.  The ultimate goal was to determine in which situations an individual would be more willing to make optimistic, as opposed to pessimistic, assumptions about reward rates.   The sample for the study included 451 participants who completed a series of bandit problems as well as a series of psychological tests measuring cognitive ability, intelligence, and personality traits (Steyvers et al., 2009).

Over the course of the study, it was determined that by completing a larger number of problems, the participants learned more effective decision-making processes; becoming more familiar with a certain environment improved decision-making abilities.  Completing more tests allowed the individuals to form more efficient assumptions about how to maximize rewards and minimize losses in different environments.  In environments with high reward rates, participants were more likely to explore, whereas in limited-reward environments participants chose to be more exploitative in their decision-making (Steyvers et al., 2009).  Moreover, Steyvers et al. (2009) found that standard psychometric measurements of intelligence correlated directly with the choice to be explorative or exploitative in one's decision-making.

Critique: 

Overall, not being strongly familiar with the concept of Bayesian analysis, I found the article a little hard to follow at times; however, the authors did describe the decision-making variables within the study that were part of the various Bayesian analysis equations.  A more thorough explanation would certainly help the reader follow how the experiment and calculations were conducted.  Most significantly, I found the study interesting in the way it utilized Bayesian analysis to examine bandit problems.  Bayesian analysis's ability to update the probability of an event as more evidence is added makes it well suited to analyzing bandit problems, which challenge the decision-maker to explore unfamiliar ground or exploit situations they have more direct experience with.

I agree with the authors that they would need to expand their study in order to determine whether more individuals conduct decision-making in bandit situations in the manner displayed in this study.  It would be necessary to include more factors that affect the cognitive capabilities of the respondents.  One such factor to consider as a variable would be learning.  The authors found that continued testing allowed the participants to choose the correct decision-making strategy, either exploitative or exploratory.  It would be interesting to find out at what point, over a certain number of tests, the respondents sense that they are making the right decision, or whether this type of decision-making is inherently present in our cognitive abilities without testing.

Source:

Steyvers, M., Lee, M.D., & Wagenmakers, E.J. (2009). A Bayesian analysis of human decision-making on bandit problems. Journal of Mathematical Psychology, 53(3), 168-179. Retrieved from http://www.sciencedirect.com/science/article/pii/S0022249608001090