
Thursday, April 18, 2013

Summary of Findings (Green Team): Bayesian Analysis (3.75 out of 5 Stars)



Note: This post represents the synthesis of the thoughts, procedures and experiences of others as represented in the 8 articles read in advance (see previous posts) and the discussion among the students and instructor during the Advanced Analytic Techniques class at Mercyhurst University in April 2013 regarding Bayesian Analysis specifically. This technique was evaluated based on its overall validity, simplicity, flexibility and its ability to effectively use unstructured data.

Description:
Bayesian analysis, built on the theorem formulated by Thomas Bayes, is a statistical approach that combines prior knowledge with updated information, which separates it from frequentist statistics. Bayesian analysis makes the analytic process iterative, allowing more information to be added as it is learned or deemed relevant. According to Hubbard (2010), Bayes' theorem is a relationship of probabilities and conditional probabilities, that is, the chance of something given a certain condition (pp. 178-179). Bayesian analysis allows analysts to calculate probabilities or make estimates based on certain starting assumptions as well as new developments. This is especially important in the intelligence field, since analysts should be able to incorporate new signals or indicators into their analysis to update estimates, further reducing their levels of uncertainty.
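
Written out (the post leaves the formula implicit), Bayes' theorem in its standard form is

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$

where H is a hypothesis, E is the evidence, P(H) is the prior probability, P(E|H) is the likelihood, and P(H|E) is the updated (posterior) probability.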

Strengths:
  • Most helpful in situations in which your initial estimate is at either end of the spectrum
  • Helps analysts to combat emotional/irrational estimates
  • Particularly effective when each updated estimate is weighed against the probability that held before the newest piece of evidence was added
  • Has the potential to be applied across multiple disciplines, as evidenced by its increasingly frequent appearance in academic journals
Weaknesses:
  • Complex, difficult to learn without a statistics background
  • The volume of evidence makes it difficult to continuously evaluate the information
  • No set way to decide what constitutes evidence
  • Difficult to know in the moment which evidence matters, particularly in real-world situations
  • Assumes that each piece of evidence carries the same weight
  • Is less useful in intelligence applications when probabilities are initially determined to be 50/50

How-To:
  1. Start with a rudimentary estimation for the likelihood of an event occurring
  2. Assign a probability to the observed evidence under each potential outcome (hypothesis)
  3. Apply Bayes' theorem (above), where P denotes probability, E is the evidence observed, H is the hypothesis of interest, and P(E|H) is the probability of observing the evidence given that the hypothesis is true.
  4. Do the math to find the posterior probability.
  5. This process can be repeated as additional information is given/discovered in order to improve the probability estimate.
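
A minimal sketch of steps 1-5 in Python (the function and variable names are illustrative, not from the class materials):

def bayes_update(priors, likelihoods):
    """Combine prior probabilities with evidence likelihoods (steps 2-4).

    priors      -- dict mapping each hypothesis to its prior probability
    likelihoods -- dict mapping each hypothesis to P(evidence | hypothesis)
    Returns P(hypothesis | evidence) for every hypothesis.
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # P(evidence), the normalizing constant
    return {h: joint[h] / total for h in joint}

Step 5 amounts to feeding the returned posteriors back in as the priors for the next piece of evidence.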

Personal Application of Technique:
The class was tasked with finding out the likelihood that an individual, Bob, drove to work based on the fact that he was late. Bob can choose between three different ways to get to work: car, bus, or commuter train. If he takes his car, there is a 50% chance that he will be late. The bus has a 20% chance of making him late while the train has a 1% chance of making him late.

The first question the class was given was to find the likelihood that Bob drove to work, assuming Bob chose evenly among the three options. Using Bayesian analysis, the class came up with a 70.4% probability that Bob drove to work that day.

The second question added new information about Bob’s normal transportation habits. Bob’s co-worker knew that Bob almost always takes the commuter train, never takes the bus, and takes the car 10% of the time. With this new information, the class was able to update the probability that Bob drove to work to 84.75%.
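
The class arithmetic can be reproduced in a few lines of Python (a sketch; the numbers come from the exercise, the variable names are mine):

# P(late | mode of transport), as given in the exercise
p_late = {"car": 0.50, "bus": 0.20, "train": 0.01}

scenarios = {
    "equal priors":       {"car": 1/3, "bus": 1/3, "train": 1/3},
    "co-worker's priors": {"car": 0.10, "bus": 0.00, "train": 0.90},
}

for name, prior in scenarios.items():
    joint = {m: prior[m] * p_late[m] for m in p_late}  # P(mode) * P(late | mode)
    p_car = joint["car"] / sum(joint.values())         # Bayes' theorem
    print(f"{name}: P(car | late) = {p_car:.2%}")
# equal priors: P(car | late) = 70.42%
# co-worker's priors: P(car | late) = 84.75%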

This exercise demonstrated that Bayesian analysis is not always the most direct route to a probability: in this particular case, the boss could simply ask Bob how he got to work that morning. And as individuals not seeking degrees in mathematics, the class hesitated significantly before applying the formula and presenting the findings.

Rating:  3.75 out of 5 stars
Note: The analysts feel this methodology has very strong benefits and is widely applicable; however, it is relatively weak in its application to the intelligence community, particularly in its reliance on numerous factors that affect its utility.
For Further Information:
Hubbard, D. W. (2010). How to measure anything (2nd ed.). Hoboken, NJ: John Wiley & Sons, Inc.

Tuesday, April 16, 2013

Is It Safe To Go Out Yet? Statistical Inference in a Zombie Outbreak Model

Summary: 
The authors Calderhead, Girolami, and Higham (2010) wrote a paper examining the potential outcomes of a zombie outbreak, using Bayes' theorem to support their conclusions.  Since there has never been a zombie outbreak, it is logical to use Bayesian methods to account for unknown quantities that can only be estimated.  Estimates can be made about a zombie outbreak, and these estimates can in turn be combined to arrive at likely outcomes.

The application of Bayesian theory to zombie outbreaks starts with a logical probability.  In this paper, the authors state that over one day the far extremes of probability are that no human turns into a zombie and that all humans are converted into zombies.  This probability (the prior) is then updated as new data arrive (for example, over different numbers of days); the resulting posterior distribution becomes the new prior, and the process is repeated.
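
The paper itself fits differential-equation outbreak models, but the prior-to-posterior loop described above can be sketched with a much simpler conjugate update on a single daily conversion rate (all numbers below are made up, not from the paper):

# Beta(1, 1) = flat prior over the daily human-to-zombie conversion rate
alpha, beta_ = 1.0, 1.0

daily_counts = [(3, 97), (7, 90), (12, 78)]  # (converted, still human) per day

for converted, survived in daily_counts:
    alpha += converted   # yesterday's posterior becomes today's prior
    beta_ += survived

print(f"posterior mean conversion rate: {alpha / (alpha + beta_):.3f}")  # ~0.080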

The authors then state that many questions can be answered by successfully finding a likely distribution for human to zombie conversion rates.  Such questions include how many soldiers should be mobilized, the scale of quarantine needed, and whether or not it is alright to leave a hiding spot given the number of zombie sightings during a particular time span.  The authors also emphasize that since the rate of change from human to zombie is likely to not be constant, the beta (conversion coefficient) should be a range and not a singular number.

One of the model comparisons the authors make is between two models: one assumes zombies can attack alone, while the other, prompted by a circulated rumor, assumes zombies only travel in pairs.  The authors then seek to disprove the second model through Bayes factors (posterior odds = Bayes factor * prior odds), in which the statistical evidence for the first model is weighted more heavily than that for the second.  The authors find that the first model with the least amount of noise (introduced as Gaussian distributed noise) is most likely, meaning the experimental data deviated the least from the expected curve.  Adding additional noise reduced the Bayes factor (which shows how strong the evidence was against the second model).
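
Written out (the notation is mine), the relationship the authors lean on is

$$\underbrace{\frac{P(M_1 \mid D)}{P(M_2 \mid D)}}_{\text{posterior odds}} = \underbrace{\frac{P(D \mid M_1)}{P(D \mid M_2)}}_{\text{Bayes factor}} \times \underbrace{\frac{P(M_1)}{P(M_2)}}_{\text{prior odds}}$$

where M1 is the lone-zombie model, M2 is the pairs-only model, and D is the observed data.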


Another question the authors applied Bayesian theory to was whether or not it is safer to leave a hiding spot based on the number of zombies spotted in the past few days.  The authors use two types of analysis: one with no observations of previous days' zombie sightings and one with five days' totals of zombie sightings.  The stated zombie sightings for the second case were 123, 127, 104, 92, and 74.  In the paper's figure (not reproduced here), the left column shows Bayes factors applied to the first model mentioned above and the potential outcomes; the right column shows the same process with a second layer of data (the five days of observations), which greatly reduces the uncertainty about potential zombie totals for the next 45 days.  Thus, with these observations and Bayesian theory, uncertainty can be greatly reduced and the chance of surviving longer during a zombie outbreak is much higher.



Critique:
I found this article fairly complex to read, having had no experience with Bayesian theory.  However, I really appreciate the application of Bayesian theory to a zombie outbreak.  Although the topic is (most likely) fantastical, the article is well constructed and thoughtful.  Honestly, the topic caught my eye, and I doubt I would have tried as hard as I did to understand Bayesian theory had it been about something drier.

One issue I had with this article is that it was clearly meant for someone with previous experience with Bayesian theory.  At times the authors referenced aspects of Bayesian theory without defining them; for example, rather than spelling out Bayes factors, they referenced another article.  For the average reader this does not make the topic any easier to understand.  Additionally, simple Bayes' theorem is not long or difficult to write out, and including it would have saved me the time of looking it up to double-check my understanding.

This article is not related to intelligence, except that, if there were ever a zombie attack, it would prove useful for intelligence analysts trying to extend their lifespans.  However, the model could be applied in medicine to the spread of infectious diseases with an unknown transmission rate: instead of zombies and humans, there would be infected and healthy individuals.

Source:
Calderhead, B., Girolami, M., & Higham, D. (2010). Is it safe to go out yet? Statistical inference in a zombie outbreak model. University of Strathclyde, United Kingdom. Retrieved from http://www.strath.ac.uk/media/departments/mathematics/researchreports/2010/6zombierep.pdf

Bayesian Inference Analysis of the Uncertainty Linked to the Evaluation of Potential Flood Damage in Urban Areas

Summary:
Fontanazza, Freni and Notaro explain that flood impacts on highly urbanized areas can be severe and are likely to increase with climate change; decision-makers therefore want reduced uncertainty when planning flood mitigation and prevention. Bayesian analysis is beneficial here because uncertainty exists both in the physical processes that hydraulic models must simulate and in the limited data available for model calibration. Additionally, measurement errors in the depth-damage curves can affect the data.

In this article, the authors applied Bayesian probability analysis to a case study of Palermo, Italy to determine whether uncertainty decreases with the addition of data. Bayesian analysis has two benefits: "parameter estimation and uncertainty analysis" in both hydraulic model parameters and the depth-damage curve coefficients. They create a mathematical probability model using Bayesian analysis including values in the equation for "the uncertainty of a generic model parameter", "observed values" and a "likelihood function."

The authors split the historical data into three periods (January 1994 to April 1999, May 1999 to January 2003, and February 2003 to December 2008) to determine whether uncertainty would decrease with each subsequent addition of a data group. Land use in the Palermo case study was mostly residential, with 88 percent of the area impervious. Three images in the paper (not reproduced here) show the reduction in uncertainty as more data became available, demonstrating that Bayesian probability analysis did in fact reduce uncertainty: with the addition of only the second data set, the reduction in uncertainty was about 40%, without any loss of reliability.
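
The authors' hydraulic model is far more involved, but the basic effect (posterior uncertainty shrinking as data batches are added) can be sketched with a textbook conjugate normal update on a single parameter; every number here is illustrative:

import math

prior_mean, prior_var = 0.0, 100.0  # vague prior on one model parameter
noise_var = 4.0                     # assumed known measurement noise

batches = [                         # three made-up calibration batches,
    [2.1, 1.8, 2.4],                # standing in for 1994-1999,
    [2.0, 2.2],                     # 1999-2003,
    [1.9, 2.1, 2.0, 2.3],           # and 2003-2008
]

mean, var = prior_mean, prior_var
for i, batch in enumerate(batches, 1):
    for y in batch:                 # standard normal-normal conjugate update
        precision = 1 / var + 1 / noise_var
        mean = (mean / var + y / noise_var) / precision
        var = 1 / precision
    print(f"after batch {i}: mean = {mean:.2f}, sd = {math.sqrt(var):.2f}")

Each pass tightens the posterior standard deviation, mirroring the uncertainty reduction the authors report.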





Critique:
Bayesian analysis has some limitations: it relies on an initial hypothesis, which can be subjective, and the approach may not be objective if the parameter distributions are not based on physical observations. Nevertheless, I noticed many advantages to the methodology. The authors successfully demonstrated its effectiveness with a case study, showing with real historical data that a significant reduction in uncertainty was possible. They also accounted for the aforementioned limitations with additional probabilistic analyses of the parameter choices to ensure they did not skew the results.

The interest in reducing uncertainty for a decision-maker seems common to every profession. I would be curious to see how this could apply to a crime-mapping study examining whether a decrease in uncertainty actually occurs as data increase. It could perhaps be applied to the "Newton-Swoope Buffers" in the ATAC Workshop, which are intended to determine the location of an offender's home or business. These buffers change with each additional piece of information, seemingly because they become more accurate with more data. A Bayesian probability analysis could be applied to this tool to test its effectiveness and, by extension, its usefulness for law enforcement intelligence.

Source:
Fontanazza, C.M., Freni, G., & Notaro, V. (2012). Bayesian inference analysis of the uncertainty linked to the evaluation of potential flood damage in urban areas. Water Science and Technology, 1669-1677. doi: 10.2166/wst.2012.359

Fusion of Intelligence Information: A Bayesian Approach

Elizabeth Paté-Cornell presents a classical probabilistic Bayesian model that she believes the intelligence community can use to aid the fusion of intelligence information. The need for such fusion became apparent in the wake of September 11, 2001, and the author suggests the probability of impending attacks can be estimated through Bayesian analysis. Her two major arguments for the use of the Bayesian model in the IC, particularly in relation to terrorist attacks, are that it allows the computation of the posterior probability of an event given (1) the probability of the event prior to observing signals and (2) the quality of the signals, based on the probabilities of false positives and false negatives.

Summary:
The author begins by discussing the problems associated with the fusion of information within the US intelligence community, namely the difficulties of ensuring internal communications and of merging the content of multiple signals, some sharper than others, some dependent on and some independent of others. The research claims that Bayesian analysis can help solve the latter difficulty and explains it in terms of identifying the probability of an impending terrorist attack. It should be noted the author does not claim the model will better detect impending terrorist attacks, but rather that it can increase the probability that an attack plan is foiled by guiding "clear thinking at a time when the amount of information is large and confusing and intuitions can be seriously misleading" (Paté-Cornell, 2002, p. 454).

The elements of Paté-Cornell's Bayesian model are explained through a set of notations (presented in a figure in the article). The event of interest throughout the article is an impending terrorist attack. Through the model, the author presents a formula that addresses both the prior probability of the event occurring before reading signals, such as intercepted telephone conversations, and the quality of the signals. The formula considers what alternatives to the event of interest could occur in conjunction with the signal, a very important consideration in the intelligence field.


Additionally, the formulas the author presents address the chances that the observed signals are false positives, or that some signals have been missed (false negatives), and how these affect the probability of a future terrorist attack. The probability attached to a false positive is calculated by combining the prior probability of the impending attack, before considering the signals, with the rate at which the signal occurs during normal sensor operation when the event does not occur. She explains that her treatment of false positives in Bayesian analysis is especially useful to the intelligence community because it accounts for the prior probability of the event, which has risen drastically post-September 11.
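
A sketch of that update for a single binary signal (my notation; the article's own formulas are more general):

def p_attack_given_signal(prior, p_detect, p_false_positive):
    """P(attack | signal observed).

    prior            -- P(attack) before the signal
    p_detect         -- P(signal | attack) = 1 - false negative rate
    p_false_positive -- P(signal | no attack)
    """
    hit = prior * p_detect
    return hit / (hit + (1 - prior) * p_false_positive)

# Hypothetical numbers: a 5% prior, a source that reports 80% of real plots
# and raises a false alarm 10% of the time.
print(f"{p_attack_given_signal(0.05, 0.80, 0.10):.1%}")  # 29.6%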

Estimating the prior probability of an impending attack can be treated as a combination of the intention of the enemy to attack, the effective planning of that attack (i.e., the ability of the perpetrators to coordinate a plan and avoid detection), and the successful implementation of the plan on a given day (i.e., the ability of the perpetrators to carry out the plan and avoid the target's safeguards). The author argues that identifying these probabilities is itself of use to the intelligence community, given the chance to reduce the probability of an attack attempt through measures aimed at each area (i.e., cutting the flow of funds or increasing security).
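
As a rough formula (my notation, not the author's), the decomposition reads

$$P(\text{attack}) = P(\text{intent}) \times P(\text{plan} \mid \text{intent}) \times P(\text{success} \mid \text{plan})$$

so a countermeasure that lowers any one factor lowers the overall prior.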

Critique:
The research applies the Bayesian model using hypothetical numerical illustrations for the interpretation and fusion of intelligence information and could be strengthened through the use of real-life numerical examples, however sensitive in nature. Additionally, the author switches between examples across multiple formulas, sometimes relating them back to the overarching theme of terrorist attacks and other times relying on an unrelated example of testing chemicals for poison. This back-and-forth detracts from the overall readability of the research and does not strengthen the model's application to the intelligence community. The author uses good examples of potential signals used in intelligence but does not carry them throughout the research.

The author further admits some limitations of the research. First, it assumes that both the event and the signals are black and white (they either occur or they do not), which is not always the case, particularly in the intelligence community. Further, the research assumes that the likelihood of false signals, whether positive or negative, remains constant over time, which is also unlikely in the intelligence field. Finally, many of the data sources for such a model are difficult to quantify accurately, including the frequency of past observations, reliability data for sensors or links, and expert opinions. For instance, how can we accurately, and quantitatively, determine the reliability of human intelligence?

Overall, the research is very interesting and provides insight into the intelligence community and process. Admittedly, the approach addresses only the second half of the information-fusion problem and does not aid internal communication within the intelligence community. However, any reduction in uncertainty, particularly through objective means, improves the odds of thwarting terrorist attack plans and of addressing the other problems the intelligence community faces.

Source: 
Paté-Cornell, E. (2002). Fusion of Intelligence Information: A Bayesian Approach. Risk Analysis: An International Journal, 22(3), 445-454.

The Deterrent Effect of Arrest in Incidents of Domestic Violence: A Bayesian Analysis of Four Field Experiments

Summary:
The authors of this study, Berk, Campbell, Klap and Western (1992), looked at a number of studies conducted following an initial study of the Minneapolis Police Department.  The initial study examined police responses to misdemeanor domestic assault.  Three response options were available to the officers, and these responses were supposed to be assigned randomly: (1) arrest the suspect, (2) remove the suspect from the premises for 24 hours, or (3) attempt to restore order on the spot.  Through a series of initial and follow-up interviews, it was determined that arresting the suspect was the most effective way to reduce further violence.  Based on these results, police departments were encouraged to arrest suspects as soon as possible in domestic assault cases.  In addition, the National Institute of Justice funded six replications of the Minneapolis experiment across the United States.

The authors took the results of the initial Minneapolis study and the six subsequent studies, applied Bayesian analysis, and attempted to determine whether an applicable theory existed: labeling theory or social control theory.  They combined Bayesian analysis with meta-analysis to attempt to replicate the original study and its results, using the subsequent studies as different levels of the Bayesian analysis.
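
The paper's own machinery is spelled out elsewhere, but the flavor of pooling per-site results in a Bayesian meta-analysis can be sketched as a precision-weighted average of effect estimates (all numbers invented):

# Hypothetical per-site arrest effects (log odds ratios) and their variances.
effects = [-0.30, -0.10, 0.05, -0.20, 0.15, -0.05, 0.10]
variances = [0.04, 0.06, 0.05, 0.08, 0.05, 0.07, 0.06]

# Updating a flat normal prior with each site in turn is equivalent to a
# precision-weighted average (a fixed-effect approximation of the pooling).
precisions = [1 / v for v in variances]
pooled = sum(p * e for p, e in zip(precisions, effects)) / sum(precisions)
print(f"pooled arrest effect: {pooled:.3f}")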

The findings of this analysis determined that there was no generalizable approach to effectively reducing further violence in domestic assault incidents.  Berk et al. determined that there were "good" and "bad" risks, and that the positions and relations individuals held in society determined the effectiveness of arrest.  Individuals who did not feel constrained by their social standing, or were not constrained by social controls, are considered "bad" risks: they are likely to offend repeatedly, since they are not as deterred.

This study concluded that social control elements, such as familial ties, relationships, and public perception, are only indicators, not actual measures, of attachment.  There is therefore no generalizable finding applicable to offenders across the United States, or even to offenders in the same region over time, nor any overall statement that applies to a site's past, present, and future offenders.

Critique:
The application of the Bayesian analysis was interesting since it not only looked at a statistical element but also included a meta-analysis in an attempt to identify the method most effective at curbing domestic violence.

The study did note that the detailed steps of the Bayesian analysis were located in another document, which made it slightly difficult to understand the larger picture, including the specific elements that went into the analysis.  The overall findings are presented and analyzed in a manner coherent to individuals outside the field.  That being said, it would have been beneficial to include a more detailed depiction of the numerical application of Bayesian analysis rather than only the written description.

Berk, R., Campbell, A., Klap, R., & Western, B. (1992). The deterrent effect of arrests in incidents of domestic violence: A Bayesian Analysis of four field experiments. American Sociological Review, 57(5), 698-708. Retrieved from http://www.jstor.org/stable/10.2307/2095923

Sunday, April 14, 2013

A bayesian analysis of human decision-making on bandit problems



Summary: 

Steyvers, Lee, and Wagenmakers (2009) conducted their study on how individuals balance exploration and exploitation in solving bandit problems, using Bayesian analysis.  In a bandit problem, the individual must choose among a set of alternatives with inherently different reward levels and must try to maximize the total reward received over a set number of trials (Steyvers et al., 2009).  Bandit problems require the individual to analyze their environment in two distinct manners, explorative and exploitative: it is crucial to exploit options one is familiar with and to explore areas one is less familiar with (Steyvers et al., 2009).  Thus, striking a happy medium between exploitation and exploration in bandit problems is critical to effective decision-making.

Steyvers et al. (2009) used a Bayesian extension of optimal decision-making processes to display differences in human decision-making when reward rates differ within an individual's environment.  The ultimate goal is to determine in which situations an individual is more willing to make optimistic, as opposed to pessimistic, assumptions about reward rates.  The sample for the study included 451 participants who completed a series of bandit problems as well as a series of psychological tests; the tests provided psychometric assessments of the cognitive, intelligence, and personality traits of the participants (Steyvers et al., 2009).

Over the course of the study, it was determined that completing more problems allowed the participants to learn more effective decision-making processes; becoming more familiar with a certain environment improved decision-making abilities.  Completing more trials let individuals form better assumptions about how to maximize rewards and minimize losses in different environments.  In environments with high reward rates, participants were more likely to explore, whereas in limited-reward environments they chose to be more exploitative (Steyvers et al., 2009).  Moreover, Steyvers et al. (2009) found that standard psychometric measurements of intelligence correlated directly with choosing to be explorative or exploitative in one's decision-making.
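
The authors fit their own latent-state models to the participants' choices; as a standard illustration of how Bayesian updating balances exploration and exploitation on a bandit problem, here is a short Thompson-sampling sketch with Beta priors on each arm's unknown reward rate (the rates are made up):

import random

true_rates = [0.2, 0.5, 0.7]   # hidden per-arm reward probabilities
alpha = [1.0, 1.0, 1.0]        # Beta(1, 1) prior on each arm
beta_ = [1.0, 1.0, 1.0]

for trial in range(200):
    # Draw a plausible rate for each arm from its posterior and play the
    # highest draw: uncertain arms occasionally win (exploration), arms
    # known to be good usually win (exploitation).
    draws = [random.betavariate(alpha[i], beta_[i]) for i in range(3)]
    arm = draws.index(max(draws))
    reward = random.random() < true_rates[arm]
    alpha[arm] += reward       # posterior becomes the next trial's prior
    beta_[arm] += 1 - reward

print("posterior means:", [round(a / (a + b), 2) for a, b in zip(alpha, beta_)])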

Critique: 

Overall, not being strongly familiar with the concept of Bayesian analysis, I found the article a little hard to follow at times; however, the authors did describe the various decision-making variables that appeared in the Bayesian equations.  A more thorough explanation would certainly help the reader follow how the experiment and calculations were conducted.  Most significantly, I found the study interesting in the way it used Bayesian analysis to examine bandit problems.  Bayesian analysis' ability to update the probability of an event as more evidence is added suits bandit problems well, since they challenge the decision-maker to explore unfamiliar ground or exploit situations they have more direct experience with.

I agree with the authors that they would need to expand their study to determine whether more individuals make decisions in bandit situations in the manner displayed here.  It would be necessary to consider more factors affecting the cognitive abilities of the respondents; one such factor is learning.  The authors found that continued testing allowed participants to choose the correct decision-making approach, either exploitative or exploratory.  It would be interesting to find out at what point, over a certain number of tests, respondents sense they are making the right decision, or whether this type of decision-making is inherently present in our cognitive abilities without testing.

Source:

Steyvers, M., Lee, M.D., & Wagenmakers, E.J. (2009). A bayesian analysis of human decision-making on bandit problems. Journal of Mathematical Psychology, 53(3), 168-179. Retrieved from http://www.sciencedirect.com/science/article/pii/S0022249608001090

Tuesday, April 17, 2012

Bayesian Analysis of Intelligence or Improved Advice to Decision-Makers

Introduction:

Although not a standard article, M. Elisabeth Paté-Cornell and David M. Blum’s ongoing research into the use of Bayesian analysis in intelligence problems is extremely relevant to the current subject matter. Their work builds on previous and ongoing research conducted by the National Center for Risk and Economic Analysis of Terrorism Events (CREATE).

(http://create.usc.edu/)

Summary:

According to the article, one of the main problems facing US national and homeland security is the response to very-near future threats. While longer term threats allow the time to build reports and plan courses of action, near term threats do not. As a result, analysts need to be able to judge the reliability of the new threat information in the context of all available intelligence in order to both minimize risk as well as responses to false threats. Researchers at CREATE have previously determined that Bayesian analysis is useful in such situations, as a way to gauge the credibility of potential threat scenarios. Furthermore, Bayesian analysis has been used in conjunction with various other analytical approaches, including probabilistic risk analysis, game theory, and Markov models.

Although the use of Bayesian analysis to measure threats is not new, it has not yet been adopted by the intelligence community, for several reasons:

1) the idea of the prior in intelligence has not been well defined;

2) academic research tends to assume a substantial amount of pre-processing by analysts to produce intelligence reports from raw intelligence feeds;

3) many Bayesian tools evaluate only a single hypothesis, ignoring multiple strategic interests;

4) crises imply a short but moving time horizon, which current models lack;

5) the process through which new intelligence data relating to a threat updates the prior belief about the threat has been considered trivial.

This new research seeks to remove these obstacles by incorporating a moving time horizon into dynamic signaling games to better simulate crises, and by creating a new model intended to overcome the intelligence community’s resistance to Bayesian techniques. The researchers then walk through an in-depth research proposal, a case study, and the deliverables; final results are to be released in August 2012.

Further Readings:

Another research project on a similar topic is Bayesian Approach to Intelligence Analysis: (http://create.usc.edu/2011/03/bayesian_approach_to_intellige.html)

History of Bayesian analysis in risk assessment: http://www.usc.edu/dept/create/assets/001/50765.pdf

Probabilistic Modeling of Terrorist Threats: A Systems Analysis Approach to Setting Priorities Among Countermeasures: http://www.ingentaconnect.com/content/mors/mor/2002/00000007/00000004/art00004 (Purchase required)

Source:

http://create.usc.edu/2010/06/bayesian_analysis_of_intellige.html

Monday, April 16, 2012

Measuring Sustained Competitive Advantage Using Bayesian Reasoning

Introduction:
In this article, Tang and Liou develop a theoretical framework to understand the causal relationships among (1) sustainable competitive advantage, (2) configuration, (3) dynamic capability, and (4) sustainable superior performance. They propose that a firm’s competitive advantage, resource bundle configuration, and dynamic learning capability cannot be comprehended by outsiders. Its operational performance, however, can be captured by financial indicators.

They promote an inductive Bayesian interpretation of the sustainable competitive advantage proposition. From this viewpoint, the presence or absence of competitive advantage may be reflected in the causal relationship between resource configuration, dynamic capability, and observable financial performance. They then apply this theoretical framework to an example drawn from the global semiconductor industry, an area in which resource configuration and dynamic capability are essential to performance.

Summary:
The paper expounds the theory supporting the use of Bayesian reasoning to measure competitive advantage over other frameworks such as Porter’s competitive strategy or the resource-based view (valuable, rare, inimitable, and non-substitutable resources).

Powell’s premise – sustainable competitive advantage is more probable in firms that have already achieved sustained superior performance – is developed further by periodically updating its propositions or hypotheses in the face of empirical evidence. Tang and Liou use a Bayesian discriminant model to reveal the functional dependence of superior performance on unique business processes. The primary sources of competitive advantage are considered embedded in and inseparable from the organization itself, along with its business units and functional departments; the process of managing these resources, termed strategic fit, is assumed to be impossible for outsiders to comprehend or imitate. The model is then demonstrated using the semiconductor industry.
(Figure in the original post: explanation of sustainable competitive advantage.)
Model:
The final data set for the study contained 147 companies and 786 firm-year observations. Of those, 118 companies are located in developed countries (the US, Europe, and Japan); the other 29 are in the Asia/Pacific region. Using the firms' observable financial data, certain underlying traits can be inferred.

The study began by applying principal component analysis (PCA) to the financial indicators to identify the traits or factors. Three principal factors accounted for 60 percent of the total variance.
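
That step can be sketched with scikit-learn on a hypothetical indicator matrix (illustrative only; the authors' actual data and factor loadings are in the paper):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(786, 10))  # rows = firm-year observations,
                                # columns = standardized financial indicators

pca = PCA(n_components=3)       # keep three principal factors
pca.fit(X)
print("share of variance explained:", pca.explained_variance_ratio_.sum())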

Factor 1: Relationship management. This factor includes customer relationship management (accounts receivable turnover), three variables related to supplier relationship management (accounts payable turnover, inventory turnover, and CGS/sales) and one variable associated with the government (tax to sales ratio). The factor illustrates the sustainable competitive advantage of firms that manage upstream (suppliers), downstream (customers), and governmental relationships. The variance indicated that good relationship management can pay off with respect to a lower CGS.

Factor 2: Management ability. This factor consists of indicators related to fixed-asset management capabilities, including the depreciation/sales ratio and fixed asset turnover. The correlation between fixed asset turnover and Factor 2 indicates that firms with greater competence in asset management generate revenue at a lower unit cost and with lower asset depreciation.

Factor 3: Knowledge management. This factor includes R&D/sales and SG&A/sales ratios to measure a firm’s effectiveness in resource deployment. The high correlation indicates that lower unit costs are associated with efficient management.

The findings, therefore, support the idea that resource configurations or factors of a firm can be inferred from their observable financial indicators.

Conclusion:
In this paper, Tang and Liou advance Powell’s idea of using Bayesian probabilistic reasoning as a means of distinguishing sustained competitive advantage from sustained superior performance.
They propose that particular resource configurations can be shown to link the two – sustained competitive advantage and sustained superior performance.

Through a discussion of Bayes’ theory and subsequent semi-conductor example, the paper describes how empirical data on past financial performance in a population of firms can be used to generate the posterior probability of sustainable competitive advantage, given the prior probabilities of both competitive advantage and competitive disadvantage.

Source:
Tang, Y-C., & Liou, F-M. (2010). Does firm performance reveal its own causes? The role of Bayesian inference. Strategic Management Journal, 31, 39-57. Retrieved from http://web.it.nctu.edu.tw/~etang/SMJ2010_TangLiou.pdf

Likelihood of Global Warming Given X.

Introduction:

I’m going to profile a rather wacky article here. A Norwegian physicist applied Bayes’ Theorem to a number of things in a peripheral way in order to provide a basis for applying the theory to global warming. In “Testing Hypotheses about Climate Change: the Bayesian Approach”, Kristoffer Rypdal applies Bayes to Russian roulette, the learning process of individuals, hurricanes and global warming, and the melting of the arctic ice caps. Essentially the author applied probability theory and hypothesis testing, where the concept of probability is defined subjectively as “a degree of knowledge” about a hypothesis. He defines knowledge as something generated by four processes: 1) the inspired formulation of new hypotheses; 2) prediction (here deduction enters as a central element); 3) collection of new data through experiment or observation; 4) verification/falsification by comparing predictions and observations.

Summary:

I’ll go ahead and skip Rypdal’s lengthy explanation of what Bayes’ theorem does via a parable about Mafiosos and their desire to watch him play Russian roulette. After bypassing his .38, Rypdal takes us to Section VI, Hurricanes and Global Warming. He essentially states that, for the sake of the formula, the existence of human-caused global warming is bivalent, 50/50 (H, HN). The scientific community (again, for the sake of the formula) thinks the odds of a massive hurricane occurring more than once per century in the absence of global warming are 10%, i.e. p(B|HN)=0.1, and that in the presence of global warming the odds of massive hurricanes occurring more than once per century are 50%, i.e. p(B|H)=0.5.


Rypdal then applies essentially the same procedure to the arctic ice caps in Section VIII, Bayesian Learning and Arctic Ice Cap Melting. He notes the well-documented scientific effort to monitor the arctic ice caps (true) and the observed large reduction in summer sea ice (also true). He then gives “scientific estimates of probability” for the sake of applying Bayes’ Theorem: the probability of the arctic ice caps melting without human-caused global warming is 10%, or p(C|HN)=0.1, while the odds of this kind of melting occurring in the presence of human-caused global warming are 50%, or p(C|H)=0.5. These numbers are identical to those above and likewise give p(H|C)=0.83, or 83%. Chaining the two pieces of evidence, with p(HN)=1-p(H)=0.17, the combined posterior rises to 0.96, turning the result of the previous equation from ‘highly likely’ to ‘virtually certain’.
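
The arithmetic behind those figures, reconstructed from the probabilities quoted above:

$$p(H \mid B) = \frac{p(B \mid H)\,p(H)}{p(B \mid H)\,p(H) + p(B \mid H_N)\,p(H_N)} = \frac{0.5 \times 0.5}{0.5 \times 0.5 + 0.1 \times 0.5} \approx 0.83$$

and, using that posterior as the new prior for the ice-cap evidence,

$$p(H \mid B, C) = \frac{0.5 \times 0.83}{0.5 \times 0.83 + 0.1 \times 0.17} \approx 0.96$$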

Conclusion:

Overall I thought Rypdal’s application was interesting, though only theoretical in nature. It would be interesting to conduct polling studies on the topic of global warming in both the scientific community and the general public and apply Bayes’ theorem to the results. This could be applied to any polling of public perception, really, so long as there was actual information to prove something correct or incorrect. As a study of the causes of global warming, Rypdal’s article is speculative and not particularly edifying, but as a study of human perception and likelihood, it is rather interesting.

Source:

Rypdal, K. (2008). Testing hypotheses about climate change: the Bayesian approach. Department of Physics and Technology, University of Tromsø, 9037 Tromsø, Norway. Retrieved from http://web.me.com/kristofferrypdal/Themes_Site/Courses_files/Bayesian%20approach%20to%20Climate%20Change.pdf

Saturday, April 14, 2012

A Test of Empirical Bayes Journey-to-Crime Estimation in The Hague


Introduction:
In the article, “Finding a Serial Burglar’s Home Using Distance Decay and Conditional Origin–Destination Patterns: A Test of Empirical Bayes Journey-to-Crime Estimation in The Hague”, the authors test a new method, empirical Bayes journey-to-crime estimation, for estimating where an offender lives from where he or she commits crimes. With the new method, the profiler asks not only ‘what distances did previous offenders travel between their homes and the crime scenes?’ but also ‘where did previous offenders live who offended at the locations included in the crime series I am investigating right now?’

Summary:
The empirical Bayes method uses not only the distance of the journey to crime but also exploits knowledge of origins (where previous offenders lived) and destinations (where they offended), and the links between them, to predict the home of a serial offender. It uses more specific information about past offenders, and in contradistinction to previous methods, distance does not completely dictate the outcome of the prediction. Thus, given distance, if some destinations have been associated with a particular origin relatively frequently in the past, the new method will identify that origin as a likely home area of the offender.

Empirical Bayes journey-to-crime estimation is an extension of the earlier, purely distance-based method. Based on connections between offenders and the incidents they committed, three risk surfaces are calculated:
  1. The first risk surface is generated by the regular journey-to-crime/distance decay method in CrimeStat (labeled the distance decay risk surface, P(JTC)).
  2. The second is a ‘usual suspects’ risk surface that prioritizes zones where previous offenders lived, independent of where they committed their crimes and independent of the locations of the crimes in the series being investigated (labeled the general risk surface, P(O)).
  3. The third is based on the origin-destination zone matrix (labeled the conditional probability risk surface, P(O|JTC)).
The empirical Bayes journey-to-crime method generates two further risk surfaces by combining the above three. One of these is the product risk surface, which explicitly recognizes both distance decay and the home-to-incident histories of prior offenders. The product surface is mathematically the numerator of the other combination surface, the Bayesian risk probability, which is calculated by applying Bayes’ formula:
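
(The formula appeared as an image in the original post; a plausible reconstruction from the surface labels above, writing O for origins and JTC for the journey-to-crime evidence, is

$$P(O \mid JTC) = \frac{P(JTC \mid O)\,P(O)}{P(JTC)}$$

with the product surface corresponding to the numerator.)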
Thus, in addition to the three basic risk surfaces (distance decay, general, and conditional), two combination risk surfaces (product and Bayesian risk) are analyzed in this paper.


Conclusion:
Based on the study of 62 burglars, the homes of serial burglars were more successfully estimated with the new conditional risk surface than with the other risk surfaces. The method may seem complex, but the authors are confident that in practice it will not be. They do note disadvantages of the Bayes method: it requires more data and more upfront work, and it may only be applicable to relatively common crimes such as burglary or robbery.

Source:
Block, R., & Bernasco, W. (2009). Finding a serial burglar's home using distance decay and conditional origin–destination patterns: A test of empirical Bayes journey-to-crime estimation in The Hague. Journal of Investigative Psychology & Offender Profiling, 6(3), 187-211. doi:10.1002/jip.108

A Bayes factor analysis of Extrasensory Perception (ESP) claims

Introduction:
Daryl Bem, a social psychologist at Cornell University, has claimed that people can feel or sense important future events that could not otherwise be anticipated.  In Bem’s experiments, he claimed that people can feel future reward and punishment events and were able to anticipate a random choice at a rate above chance.  Bem used a conventional approach in which the same basic phenomenon was targeted from slightly different angles, relying on null-hypothesis significance testing and p-values as evidence for his judgments.  The authors of this article, Rouder and Morey, employed a Bayes factor analysis to assess the evidence in the Bem experiments.

Summary:
The authors of the article employed a Bayes factor measurement to evaluate Bem’s evidence.  A Bayes factor is a method of model selection based on Bayesian posterior odds and is used as an alternative to frequentist hypothesis testing (which is what Bem used).  The posterior odds are the ratio of the probabilities of two competing models given the data.  The formula for the posterior odds is usually given as:
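
(The formula appeared as an image in the original post; its standard form, with M denoting a model and D the data, is:)

$$\frac{\Pr(M_1 \mid D)}{\Pr(M_0 \mid D)} = \frac{\Pr(D \mid M_1)}{\Pr(D \mid M_0)} \times \frac{\Pr(M_1)}{\Pr(M_0)}$$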

Where:

  • Pr(M|D) is the posterior probability of model M, describing the analyst’s degree of belief in the model after observing the data (read as the probability of M given D); the ratio of these values for two models gives the posterior odds.
  • Pr(D|M) is a likelihood and represents the probability that the data would be produced under the assumption of the model.

The basic move in Bayes factor analysis is that the prior odds and the likelihood ratio are combined into posterior odds that quantify the evidence in favor of one model versus another.

The general form given for Bayes factor is:
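
(Again reconstructed from the definitions above, the Bayes factor K is the likelihood ratio:)

$$K = \frac{\Pr(D \mid M_1)}{\Pr(D \mid M_0)}$$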


The interpretation of K is given as:
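
(The interpretation table was an image in the original post; a commonly cited scale, following Jeffreys, runs roughly as follows:)

  • K < 1: evidence favors the competing model
  • 1 to 3: barely worth mentioning
  • 3 to 10: substantial
  • 10 to 30: strong
  • 30 to 100: very strong
  • over 100: decisive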


Rouder and Morey’s analysis found the evidence that people can “feel” the future with neutral and erotic stimuli to be slight, with Bayes factors (K values) of 3.23 and 1.57, respectively.  There was, however, some evidence for Bem’s hypothesis that people can “feel” the future with emotionally valenced nonerotic stimuli: a Bayes factor of 40 was recorded for this type of stimulus.  The authors consider the K value of 40 noteworthy but believe it is still an order of magnitude lower than what is required to overcome appropriate skepticism of ESP.

A summary table of Rouder and Morey’s findings appears in the original article.

Conclusion:
Based on the articles read to create this summary, Bayes factors appear flexible and allow the comparison of multiple hypotheses simultaneously.  Statisticians claim the Bayes factor is intuitive; however, the factors are difficult to calculate, and Bayes factor analysis is unlikely to be undertaken by someone with only a cursory understanding of statistics.

Source:
Rouder, J., & Morey, R. (2011). A Bayes factor meta-analysis of Bem's ESP claim. Psychonomic Bulletin & Review, 18(4), 682-689. Retrieved from http://drsmorey.org/bibtex/upload/Rouder:Morey:2011a.pdf