Friday, October 17, 2014

Effects of Tyrosine, Phentermine, Caffeine d-amphetamine, and Placebo on Cognitive and Motor Performance Deficits During Sleep Deprivation

Summary
A 2003 Department of Defense-sponsored experiment studied the effects of various drugs on analytic performance over the course of 40.5 hours of sleep deprivation. All of the medications tested significantly improved performance on running memory, logical reasoning, mathematical processing, tracking, and visual vigilance tasks during continuous sleep deprivation, compared to a placebo. The most pronounced effects appeared during sleep deprivation on tasks requiring both speed and accuracy.


The literature on the effects of sleep deprivation attributes performance failures to cognitive slowing, memory encoding problems, retrieval problems, reductions in vigilance, deterioration of response time, increased frequency of non-responses, and increased frequency of false responses.  

This experiment investigated and compared the effects of a placebo versus treatment groups receiving 20 mg of D-amphetamine, 300 mg per 70 kg of body weight of caffeine, 37.5 mg of phentermine, or 150 mg per kg of body weight of tyrosine after 32.5 hours without sleep, using a series of eleven performance tests administered to 76 individuals in a laboratory set up for the experiment.

The experiment took place over 5 days divided into four phases: Baseline (Days 1-3), Sleep Deprivation (Days 3-4), Medication (Day 4), and Recovery (Day 5). The researchers collected performance test data eight times during the Baseline period and again every 4 hours during the night without sleep. After 32.5 hours without sleep, the researchers administered the drug doses to the applicable treatment groups, after which two further performance test batteries took place at 34 and 38 hours without sleep. Four additional test sessions took place during the Recovery period.

Outline of protocol from Figure 1.

The performance tests in the test battery were (in order administered): 
  1. Visual Scanning Task: Subject required to scan a matrix of letters to locate the letter K. Subject performed 20 trials per session and performance measure was amount of time to locate the correct letter. 
  2. Running Memory Task: Subject viewed a series of 80 individually presented letters. As each letter was presented, subject indicated whether it was the same as or different from the letter shown previously. Performance measures were percent correct and response delay time (RT). 
  3. Logical Reasoning Task: Subject viewed a statement in the form "A is followed by B," then a letter pair was presented and subject indicated whether or not the letter pair was congruent with the preceding statement. Subject performed 80 trials per session. Performance measures were response latency per response (RT) and number of errors per session. 
  4. Mathematical Reasoning: Subjects solved addition or subtraction problem and indicated whether the answer is greater than or less than five. Subject performed 80 trials per session. Performance measures were percent correct and average response latency (RT) per correct response. 
  5. Stroop Task: Subject viewed the words RED, GREEN, and BLUE one at a time. On each presentation, the letters could be red, blue, or green in color. Subject responded according to the color of the letters. Subject performed 80 trials per session. Performance measures were percent correct for congruent and incongruent word-color items and average response latency (RT). 
  6. Four-choice Serial Reaction Time Task: Subject saw blinking + sign in one of four quadrants and responded as quickly as possible by pressing a corresponding keyboard key. The + remained visible until subject pressed a key and randomly appeared in one of four quadrants for next trial. Subject performed 75 trials per session. Performance measure was reaction time (RT) for correct responses. 
  7. Time Wall Task: Subject observed an object descend from the top of the monitor at a constant rate toward a target at the bottom; the target disappeared after the object descended two-thirds of the way down the screen. Subject pressed a key at the estimated time the object would contact the target, 20 trials per session. Performance measure was amount of timing error. 
  8. Pursuit Tracking Task: Subject saw two cursors in the center of a monitor. Subject moved the mouse to make the top cursor follow the bottom target cursor as closely as possible as it moved at a constant rate across the screen for 3 minutes. Performance measure was amount of error per unit of time for subject cursor deviations from target cursor. 
  9. Visual Vigilance Task: Subject observed darkened computer monitor for 40 minutes. At random intervals, a small dim light appeared somewhere on the monitor. The subject pressed an appropriate key when subject detected the dim light on monitor. Performance measures were number of correct responses (out of 40 possible) and the average response latency for correct responses.
  10. Trails (B) Task: Subject given sheet of paper with a series of randomly arranged letters and numbers. Using a pencil and starting at number 1, subject traced a path between each succeeding number and letter (i.e. 1-A-2-B etc). Performance measure was time to completion. 
  11. Long-term Memory Task: Subject was verbally given a sentence at the beginning of the session. Included in the sentence were 12 pieces of factual information. After 90 minutes, subject wrote down as much of the sentence as could be remembered. Performance measure was number of correct pieces of information recalled.
Interestingly enough, not every task showed performance deficits due to sleep deprivation. 


Results for tasks that showed performance deficits related to sleep deprivation from Table II.

Tasks involving performance measures related to time stress such as speed and accuracy showed more consistent performance deficits during sleep deprivation than performance measures related to accuracy alone. All four drugs had a primary performance benefit of an improved response delay time, with auxiliary benefits in reduction of error for visual vigilance and pursuit tracking tasks. 

Prior to this research, the performance improvements from D-amphetamine and caffeine were well documented in the literature; phentermine and tyrosine were not. The article indicates that phentermine has very low abuse potential and mimicked the performance effects of D-amphetamine. The effects of tyrosine supplementation were delayed compared to those of the other substances (effects on performance were not realized until 5.5 hours after dose administration, compared to 1.5 hours for the other substances), but a perceived benefit is that tyrosine is a naturally occurring, non-essential amino acid with no known potential for physiological addiction or abuse.

Critique
An issue with performing the same type of tests repeatedly within the same treatment groups is improvement related to learning how to "game" a test rather than improvement related to the consumption of performance-enhancing drugs. The researchers administered a freebie "pretest" on the first day of the experiment, yet they indicate that eight sessions during the Baseline period were needed to overcome learning effects, based on pilot research findings. The statistical analysis assessed performance data for 10 of the 18 testing sessions: only the four sessions on Day 3 were considered baseline performance, and the Day 5 sessions representing the recovery period after sleep deprivation were excluded.

An organization-wide adoption of drugs to mitigate performance deficits due to sleep deprivation during critical time crunches must additionally factor in tolerance and withdrawal effects. Even a psychoactive drug perceived as relatively benign, like caffeine, can induce symptoms of psychosis such as hallucinations and hearing voices at large enough doses in otherwise healthy individuals, according to 2009 research at Durham University, and consistent use of 300 mg per day of caffeine is enough to cause tolerance side effects.


Sources
Richard A. Magill, William F. Waters, George A. Bray, Julia Volaufova, Steven R. Smith, Harris R. Lieberman, Nancy McNevin, and Donna H. Ryan. "Effects of Tyrosine, Phentermine, Caffeine d-amphetamine, and Placebo on Cognitive and Motor Performance Deficits During Sleep Deprivation." Nutritional Neuroscience. 2003.

Simon R. Jones; Charles Fernyhough. "Caffeine, stress, and proneness to psychosis-like experiences: A preliminary investigation." Personality and Individual Differences. 2009

Thursday, October 16, 2014

The Effects of Caffeine on Cognitive Fatigue

Summary
A study performed by Sunni Newton (2009) sought to examine the effect of caffeine on cognitive fatigue. The study examined the effects of caffeine on performance and self-report mood measures during the execution of complex cognitive tasks. The results showed that complex task performance improved with caffeine.

Cognitive fatigue is a factor that decreases task engagement and increases resistance to exerting further energy toward a task. Cognitive fatigue is observed through changes in a person's work performance, physiological processes, and subjective feelings.

To perform this study, Newton split 116 participants into two groups: a control group given a placebo and an experimental group administered a dose of caffeine. Participants received the caffeine dose (170 mg) or the placebo by chewing two pieces of gum after the first hour of the test. The dose was given after an hour to ensure that a baseline was gathered for the experimental group.

Groups of 2 to 12 persons participated in a 4.5-hour testing session that alternated between self-report questionnaires and exam questions. The questionnaires solicited responses from the participants regarding alertness and their perceived level of fatigue, while the exams were problem sets composed of 8 questions. Newton compiled the questions from college textbooks on the following subjects: science, history, English, and human interest.

The results of the study, which Newton analyzed using an ANOVA, showed that the placebo group reported a greater amount of fatigue throughout the test while the caffeine group reported less fatigue and performed better on the complex tasks.
Figure 1. Mean hourly test performance for caffeine and placebo conditions
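The kind of ANOVA comparison Newton reports can be sketched in a few lines; the hourly fatigue ratings below are invented for illustration (they are not Newton's data), and the one-way F statistic is computed from scratch rather than with a stats package.

```python
# Minimal one-way ANOVA between two groups (caffeine vs placebo),
# using invented illustrative fatigue ratings -- not Newton's data.

def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA across groups."""
    all_vals = [x for g in groups for x in g]
    grand_mean = sum(all_vals) / len(all_vals)
    # Between-group sum of squares, weighted by group size
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares around each group's own mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

caffeine_fatigue = [3, 4, 4, 5, 5, 6]   # hypothetical hourly fatigue ratings
placebo_fatigue = [5, 6, 7, 7, 8, 9]

f_stat = one_way_anova_f([caffeine_fatigue, placebo_fatigue])
print(round(f_stat, 2))
```

A large F relative to the critical value for (1, 10) degrees of freedom would indicate the group difference is unlikely to be chance, which is the test behind Newton's placebo-versus-caffeine comparison.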


Critique
This study was well constructed and executed. The choice to administer the caffeine dose an hour into the test was well thought out and further highlighted the effects of caffeine on task performance. An examination of the effects of caffeine on performance over a longer test time would be interesting to see.

Additionally, participants only had 6 minutes to complete each of the test questions. The constant changing of topic and requirements may have served as a stimulus to the placebo group. It would be beneficial to examine performance on tasks that take a greater amount of time to complete.

Source:

Newton, Sunni. (2009). The effects of caffeine on cognitive fatigue. Georgia Institute of Technology.

Effects of acute bouts of exercise on cognition

Summary:

In his 2003 journal article, Phillip Tomporowski addresses how short periods of exercise affect our cognitive thinking abilities.  He identifies that there is plenty of literature showing that brief bouts of exercise improve both the mood of the participant and their ability to think clearly.  Exercise has also been linked to reduced stress levels as well as reduced anxiety and depression.  At the time of this research, though, there was very little evidence to support the hypothesis that short periods of exercise improve cognitive brain functions.

Tomporowski conducted a literature review of the relevant studies relating exercise to cognitive abilities.  The research was broken into three categories: studies that focused on 1) intense exercise, 2) exercise-induced arousal, and 3) long aerobic physical activities.

1) Relationship of intense exercise on cognitive thinking abilities

At the time, research on intense exercise and its link to cognitive thinking pushed the subject to near exhaustion (using fast running or biking exercises) and then conducted either a visual or oral test.  Visual cognitive tasks, such as viewing maps and determining distances and routes, tended not to be affected by intense cardiac workouts.

Studies did find a transient effect on participants' cognitive thinking abilities when presented with problems that required greater response preparation.  Wrisberg and Herbert (1976) found that this deterioration in response time and accuracy amounted to only a small reduction in thinking abilities.

2) Relationship of exercise induced arousal on cognitive thinking abilities.

Exercise-induced arousal has produced many different findings on participants' cognitive thinking abilities.  There is some evidence suggesting the relationship follows an inverted-U shape.  For example, Salmela and Ndoye (1986) found that cognitive abilities were higher at 115 beats per minute than at rest or at 145 beats per minute.  Levitt and Gutin (1971) had similar findings, where cognitive abilities peaked at 115 beats per minute, returned to resting level at 145 beats per minute, and decreased at 175 beats per minute.

Many of the journals Tomporowski reviewed did not show an inverted-U relationship with participants' cognitive abilities.  Much of the literature on visual recognition showed an increase in visual identification responses.  Allard et al. (1989) found visual recognition to be at its highest when participants were cycling at the highest level of intensity, contradicting the inverted-U model suggested by other researchers.

3) Long aerobic physical activities on cognitive thinking abilities

As with the other two categories, there were no agreed-upon conclusions.  Travlos and Marisi (1995) found no increase in cognitive abilities during 50-minute cycling exercises in which participants' reaction times were tested.  Tomporowski et al. (1987) found the same when testing college students' memory during extended periods of running.

Other studies found that prolonged exercise helped facilitate decision-making processes.  Tests were administered during the exercise as well as during the cool-down phases.  In both cases, participants' speed in making decisions greatly increased.

Critique:

Tomporowski's literature review of the effects of exercise on cognitive abilities presents a great deal of unanswered questions.  A majority of the findings are inconclusive or contradict other similar findings.  There are two key takeaways that I find important for intelligence analysts.

1) In all three of the categories, there was evidence to suggest that exercise helps improve cognitive tasks involving visual recognition and interpretation.  These results were best while the participant had an elevated heart rate.  Geospatial analysis is one area that I feel could best benefit from exercise and an increased heart rate.  Simple exercises, such as push-ups, jumping jacks, sit-ups, and running in place, may help offer boosts in visual recognition to analysts.

2) A majority of the evidence suggests that exercise does have some kind of influence on cognitive thinking.  There are many ways to help analysts maintain an elevated heart rate at work; examples include standing desks, treadmills with desks attached, and exercise facilities located in the building.  While I agree that research must continue into the influence exercise has on cognitive thinking, the intelligence community should start considering exercise as a tool to improve the short-term cognitive abilities of its analysts.

Source:

Tomporowski, Phillip D. (2003).  Effects of acute bouts of exercise on cognition.  Acta Psychologica, pp. 297-324.

Monday, October 13, 2014

Summary of Findings: Prediction Markets (3.5 out of 5 stars)



Note: This post represents the synthesis of the thoughts, procedures and experiences of others as represented in the 5 articles read in advance (see previous posts) and the discussion among the students and instructor during the Advanced Analytic Techniques class at Mercyhurst University in October 2014  regarding Prediction Markets specifically. This technique was evaluated based on its overall validity, simplicity, flexibility and its ability to effectively use unstructured data.


Description:
A prediction market is an analytic method in which participants buy and sell estimates based on the probability they assign to an event.  For example, a participant may pay 45 cents for a prediction stock of an event that they believe has a 45 percent chance of occurring.  There are also prediction markets that do not have the 'stock market' element.  The estimates, regardless of the type of prediction market used, are aggregated to create a more accurate estimate of a specific event.


Strengths:
1. Prediction markets have been used successfully across multiple fields (economics, finance, intelligence)
2. Can incorporate insight from experts across many different fields
3. Output can be expressed as a single point or a range of values
4. Similar structure to that of Nominal Group Technique (NGT)
Weaknesses:
1. Prediction markets require a large number of analysts to create the number of estimates needed
2. Constant traffic is required within a prediction market to create the volume of estimates needed
3. The integrity of the prediction market is susceptible to manipulators
4. The purpose of the prediction market must be to create accurate estimates
5. Long term estimates are at risk of forecaster apathy
6. Some modest level of expertise is required to be a forecaster


Step by Step:  
  1. Define a question resolvable by prediction markets
  2. Design a prediction market to reduce uncertainty about future outcomes by aggregating individual estimates within a predetermined time frame
  3. Define rules of the prediction market constraining participant behavior and applicable payouts
  4. Open prediction market to participants
  5. Readjust prediction market as new information comes to light
  6. Close prediction market at predetermined date and declare winners
  7. Use output of prediction market to support decision-making or feed into another technique
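Steps 2 through 5 above can be sketched with an automated market maker.  The post does not name a specific mechanism, so this sketch assumes Hanson's logarithmic market scoring rule (LMSR); the liquidity parameter b and the trade shown are invented for illustration.

```python
import math

# Sketch of a two-outcome prediction market run by an LMSR market
# maker.  The mechanism choice (LMSR) and the numbers are assumptions,
# not details from the post above.

class PredictionMarket:
    def __init__(self, outcomes, b=10.0):
        self.b = b                              # liquidity parameter
        self.q = {o: 0.0 for o in outcomes}     # shares sold per outcome

    def _cost(self, q):
        # LMSR cost function: C(q) = b * ln(sum_i exp(q_i / b))
        return self.b * math.log(sum(math.exp(v / self.b) for v in q.values()))

    def price(self, outcome):
        """Current price = the market's implied probability of the outcome."""
        denom = sum(math.exp(v / self.b) for v in self.q.values())
        return math.exp(self.q[outcome] / self.b) / denom

    def buy(self, outcome, shares):
        """Sell `shares` of `outcome` to a trader; return what they pay."""
        before = self._cost(self.q)
        self.q[outcome] += shares
        return self._cost(self.q) - before

market = PredictionMarket(["event occurs", "event does not occur"])
market.buy("event occurs", 5)        # a trader who thinks the event is likely
p = market.price("event occurs")
print(round(p, 2))                    # the implied probability rises above 0.5
```

Prices always sum to 1 across outcomes, so the closing prices can be read directly as the aggregated probability estimate in step 7.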


Exercise:
Participants were provided a jar filled with a number of Tootsie Rolls known only to the administrator and asked to estimate the amount of candy in the jar. The participants were not allowed to talk to other participants during the exercise and were not allowed to open the jar. Participants were allowed to use a computer for ten minutes in any way to help them form their estimates. After the ten minutes, each participant wrote down one estimate on a post-it note and turned it face down. The proctor collected all of the estimates, displayed them on the board, and took the average. The average of the participants' estimates was closer to the true count than most individual estimates.

What did we learn from the Prediction Market Exercise?
After researching this topic, students were able to learn the methodology of prediction markets. Participants in this exercise used various approaches to develop estimates. The average of the participants' responses was closer to the actual number of Tootsie Rolls in the jar.
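The averaging effect behind the jar exercise is easy to demonstrate.  The true count and the guesses below are invented stand-ins for the class's data, chosen only to show the typical pattern.

```python
import statistics

# Sketch of the jar exercise: the average of independent guesses tends
# to land closer to the truth than the typical individual guess.
# The count and guesses are invented, not the class's actual numbers.

true_count = 350
guesses = [180, 220, 260, 300, 310, 340, 360, 400, 450, 520]

crowd_estimate = statistics.mean(guesses)
individual_errors = [abs(g - true_count) for g in guesses]

print(crowd_estimate)                       # the "board average"
print(abs(crowd_estimate - true_count))     # crowd error
print(statistics.mean(individual_errors))   # typical individual error
```

Because overestimates and underestimates partially cancel, the crowd's error here is a fraction of the average individual error, which is the result the exercise reproduced.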

Friday, October 10, 2014

Affecting Policy by Manipulating Prediction Markets: Experimental Evidence

Summary
Can prediction markets that forecast successfully in the absence of manipulators be corrupted when manipulators actively undermine them? An experimental design developed by Deck, Lin, and Porter (2011) provides evidence suggesting that well-funded manipulators who care only about misleading market observers can disrupt a prediction market's ability to aggregate information and mislead those who make forecasts based upon market predictions, effectively eliminating the prediction market's ability to improve a forecaster's performance. However, evidence from the experiment also suggests that detecting manipulators might be possible due to the observed increases in trade volume and decreases in price variance. Organizations aspiring to base policy on prediction markets need to consider the threats of manipulation to their prediction markets. The experiment models the confrontation between agents attempting to deviously manipulate prediction markets and decision-makers attempting to use prediction markets to guide a course of action. The research objective was to understand how manipulators influence forecasters, not how well the market aggregates information.

Prior research indicated that prediction markets are robust to manipulative attacks and that the resulting market outcomes improve forecasting accuracy regardless. The profit motive usually proves sufficient to ensure that attempts at manipulating prediction markets are unsuccessful. The authors use an analogy from the now-defunct Policy Analysis Market: if a wealthy terrorist's only motivation is to cause significant loss, his own financial profit or loss does not enter the decision process. This is a separate issue from whether or not prices in prediction markets or asset markets reflect all available information. The limitation of prior research on the manipulability of prediction markets is that the manipulators suffered the financial losses associated with manipulation. The authors assert this is true in any market, but in some cases the relative value of manipulating the market dominates the financial losses associated with attempting to do so.

This experiment differs from previous market manipulation experiments. Whereas previous research concerned itself with so-called "trade-based" manipulators in financial markets who move prices with current trading in order to profit from later trades, this experiment has manipulators who control the event being forecasted but do not want decision-makers to uncover the outcome. For instance, a terrorist makes plans for an attack but does not want security forces to discover the target. Decision-makers make investments to counter possible attacks. If decision-makers use prediction markets to assist them in detecting terrorist plans and making investments, manipulators would like to mislead them into incorrect investments by manipulating market prices. In this experiment, manipulators were not paid in any way for their market earnings. Instead, they were paid solely based upon the average amount that Forecasters invested in the incorrect event, giving manipulators a strong incentive to mislead Forecasters. A critique of previous research on manipulation was that the incentives to mislead were too weak; this schema more accurately models the confrontation between manipulators and decision-makers attempting to use prediction market outcomes to set policy.

Another way the experiment differed from prior research was enabling the Forecasters to have a range of investment opportunities to measure the intensity of their confidence instead of prompting Forecasters to make binary predictions. The implication is that prior research cannot distinguish between a Forecaster who thinks the chance a particular event will occur is 51% and a forecaster that believes the likelihood the event will occur is 90%. 

The experiments were conducted at the Economic Science Institute at Chapman University over the course of three days. When manipulators were absent, the research found that market prices correlated with the true state and forecasters successfully used price information to make predictions. When manipulators were present, however, the prediction markets failed to aggregate good information and forecasters consistently failed to predict events. Additionally, manipulator trading increased trade volume compared to markets without manipulators. An unintended finding was that manipulators earned positive profits in almost 70% of the periods in which they were active, which mitigates concerns over financial losses on the part of manipulators.

The results suggest that manipulators can reduce the predictive power of prediction markets and create situations where Forecasters are unable to make good decisions by actively trading in the markets, which provides a means of identifying the likelihood of manipulator presence. At a statistically significant level, markets with active manipulators had greater trade volume and less variation in prices.
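The detection signal the study reports (higher trade volume, lower price variance in manipulated markets) can be sketched as a simple screen.  The two trade logs below are invented for illustration; each entry is a hypothetical (price, units traded) pair.

```python
import statistics

# Sketch of the manipulation signature Deck, Lin, and Porter report:
# manipulated markets showed greater trade volume and less price
# variation.  Both trade logs are invented illustrative data.

normal_market = [(0.40, 3), (0.55, 2), (0.62, 4), (0.48, 3), (0.70, 2)]
manipulated_market = [(0.52, 9), (0.50, 8), (0.53, 10), (0.51, 9), (0.52, 11)]

def summarize(trades):
    """Return (total volume, population variance of trade prices)."""
    prices = [price for price, _ in trades]
    volume = sum(units for _, units in trades)
    return volume, statistics.pvariance(prices)

vol_n, var_n = summarize(normal_market)
vol_m, var_m = summarize(manipulated_market)

# The reported signature: more volume, flatter prices
print(vol_m > vol_n and var_m < var_n)
```

A screen like this only flags the statistical signature; it cannot identify which traders are the manipulators, which matches the paper's modest claim that detection "might be possible."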

What market information should forecasters use to make a prediction? When manipulators are present, only excess bids have predictive power.
The research identified a case where manipulators cause forecasters to make predictions that are no better than random guessing and concludes that decision-makers should not indiscriminately rely upon prediction markets. An unintended finding was that even though the manipulators were solely motivated by misleading market prices, their strategies resulted in trading profits rather than trading losses, which deviates from the prior research in the literature review.
 
Critique
The prerequisite for successful prediction market manipulation identified in the research is sufficient liquidity to have a measurable impact on trade volume and excess bids. In this particular experiment, manipulators in the manipulation treatment group started with 4 times the experimental currency units of regular traders, which represented one-third of the money in the market overall. How much money aspiring manipulators need relative to the market to successfully get forecasters to predict incorrect outcomes remains a gap in the academic literature.

Source
Deck, C., Lin, S., and Porter, D. (2011). "Affecting Policy by Manipulating Prediction Markets: Experimental Evidence."

Who’s Good at Forecasts?

By: The Economist

Summary:
The Economist took a look at prediction markets in its special edition, The World in 2014. It revisited Philip Tetlock's forecasting tournament, begun in the 1980s, involving 284 economists, political scientists, intelligence analysts, and journalists. This research collected around 28,000 predictions and concluded that "the average expert did only slightly better than random guessing." The forecasts were expressed numerically so an expert could not hide behind vague words such as "may" or "possible."  The results also concluded that "experts with the most inflated views of their batting averages tended to attract the most media attention."

The Intelligence Advanced Research Projects Activity (IARPA) used Tetlock's forecasting tournament as a pilot and sponsored a more ambitious tournament called The Good Judgment Project. The project has collected over one million forecasts from 5,000 forecasters on 250 questions, ranging from the euro zone to the Syrian civil war. From this research, IARPA has been able to discover which methods of training promote accuracy.

This research also explores the super-forecaster hypothesis. Within the first year of the tournament, the top two percent of forecasters stood out, though it was unclear how much of their success was luck. Over time, however, their performance persisted, and the super-forecasters were assigned to teams. These forecasters beat the "unweighted average (wisdom-of-overall-crowd) by 65%; beat the best algorithm for four competitor institutions by 35-60%; and beat two prediction markets by 20-35%."

Critique:
Although the Economist did a very good job explaining both forecasting tournaments, I found the analysis of the research lacking. I would have liked to see a more in-depth look at how the tournaments came to their conclusions in addition to how super forecasters are grouped into teams.  

Source:

Who’s good at forecasts? (2013, November 18). The Economist http://www.economist.com/news/21589145-how-sort-best-rest-whos-good-forecasts 

Wanna Bet There Will Be War? A Time-Series Analysis of Prediction Markets During the Libya Conflict 2011

Summary:

To assess the accuracy of prediction markets in forecasting international conflicts, Sebastian Worle analyzed the results of a prediction market (PM) centered on the ousting of Muammar Gaddafi.  The prediction market ran from 19 February 2011 until 29 August 2011, during which participants were allowed to trade futures on whether Gaddafi would still be in power on 31 December 2011.  The closing price of the market would establish the probability that Gaddafi would not be in power.

Worle's hypotheses for this experiment were,

  1. A PM's price correctly forecasts the outcome of international conflicts
  2. A PM's price approximates the event's true probability as the future approaches the end date
  3. Good news (evidence suggesting Gaddafi will be removed from power, such as a country joining in airstrikes against him) makes the market price rise, bad news (evidence suggesting Gaddafi will remain in power, such as a failed UN operation) makes it fall, and irrelevant news has no influence on the market price
  4. Good news leads to relatively lower increase in volatility as opposed to bad news which will increase volatility.
  5. PM's anticipate publicly foreseeable events and do not show significant reactions once the event takes place
  6. PM's do not anticipate events that are not foreseeable and react once the event takes place
To answer these questions, Worle broke the analysis into two parts.  Worle used a GARCH regression model in order to examine patterns in trading.  The GARCH model allowed Worle to analyze how the market reacted to different types of events (news stories) over the course of the experiment.  Worle then used an event study design to examine when and how the market reacted to certain events.
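The event-study half of Worle's design can be sketched roughly as follows.  The price series, news events, and 24-hour window are all invented for illustration; Worle's actual analysis used a GARCH regression over real trading data.

```python
# Sketch of an event-study check in the spirit of Worle's design:
# did the market price move in the expected direction within 24 hours
# of a news event?  All numbers below are invented illustrative data.

# (timestamp_in_hours, price) observations from a hypothetical market
prices = [(0, 0.50), (10, 0.51), (25, 0.58), (40, 0.57), (60, 0.49)]

# (timestamp_in_hours, "good" or "bad") hypothetical news events
events = [(12, "good"), (45, "bad")]

def price_at_or_before(t):
    """Most recent observed price at or before time t."""
    return max((p for p in prices if p[0] <= t), key=lambda p: p[0])[1]

def reaction_within_24h(event_time, kind):
    """True if the price moved in the direction the news predicts."""
    before = price_at_or_before(event_time)
    after = price_at_or_before(event_time + 24)
    moved = after - before
    return moved > 0 if kind == "good" else moved < 0

results = [reaction_within_24h(t, kind) for t, kind in events]
print(results)
```

Running this over every coded news story, as Worle did, gives the hit rate used to evaluate Hypothesis 3 and the reaction-speed claim.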

Worle was not able to definitively confirm hypotheses one or two due to the research design.  The PM put the chances of Gaddafi being removed from power by 31 December 2011 at 68.5%. Worle recognizes that he did not have a way of comparing the results to a benchmark.  While he was able to identify a probability of Gaddafi being removed from power, the accuracy of that percentage could not be confirmed.

Worle found that the market moved as expected when good and bad news was received.  News articles suggesting Gaddafi's ouster made the price rise, while those suggesting otherwise made it fall (Hypothesis 3).  He found that the change occurred rather quickly, usually taking fewer than 24 hours after a news article was published.  Worle rejected hypothesis 4, as he found that good news increased trading volume more than bad news, which he believes may be due to investors' risk aversion levels.

Worle "cautiously accepts" hypotheses 5 and 6, adding that they are only "semi-strong efficient".  The PM correctly anticipated publicly foreseeable events, such as the Security Council resolution.  Throughout the course of the PM, very few "unforeseeable" events were forecasted by the market (no significant increase or decrease in trading price before the event).
  
Critique:

Worle's research into prediction markets was rather intriguing and well put together.  I warn against using this research to prove the effectiveness of PMs, though.  First, there was no way to confirm the accuracy of the forecast.  He did not compare it to polls or other research on the subject.  Worle could have asked analysts, independent of the PM, what they believed the chances were of Gaddafi being removed.  This would have given him a benchmark against which to examine the accuracy.

Secondly, he found that this PM did a decent job of identifying publicly foreseeable events but was not efficient at identifying those that were harder to predict.  As analysts, we are most concerned with exactly those harder-to-predict events, the ones most people struggle to anticipate.  If PMs have limited application in identifying them, their forecasting abilities are limited to big-picture events.

Source:
Worle, M.S. (2013). Wanna bet there will be war? A time-series analysis of prediction markets during the Libya conflict 2011.  The Journal of Prediction Markets.

Are Sports Betting Markets Prediction Markets? Evidence from a New Test

Summary
In their paper, Kain and Logan (2011) argue that sports betting markets are not accurate prediction markets. Kain and Logan examined two of the possible bets made on sporting contests, margin of victory (the line) and over/under (the sum of scores), to determine the accuracy of predictions of the outcomes of sporting contests. This study argues that in order for the sports betting market to be an accurate prediction market, it must be able to accurately predict both the sum and the difference of scores in sporting contests.

In this study, Kain and Logan examined the predictions and outcomes for NFL, NBA, NCAAF, and NCAABB contests between 2004 and 2010 to determine the accuracy of house predictions. The test was performed by comparing the predicted margin of victory and total score against the actual outcomes. The results showed that the margin of victory is an accurate predictor; the over/under, however, is not.
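The basic shape of such an accuracy test can be sketched as follows. The data below are made up for illustration, and this is only a simplified stand-in for Kain and Logan's actual econometric test: if the line is an unbiased predictor, forecast errors (actual margin minus the line) should average roughly zero.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical (invented) data: predicted point spreads vs. actual margins.
predicted_line = [3.5, -7.0, 6.5, -2.5, 10.0, 4.0, -3.0, 1.5]
actual_margin  = [7,   -3,   3,   -6,   14,   2,    1,  -4]

# Forecast errors and a simple one-sample t-statistic on them.
# A small |t| is consistent with the line being an unbiased predictor.
errors = [a - p for a, p in zip(actual_margin, predicted_line)]
t_stat = mean(errors) / (stdev(errors) / sqrt(len(errors)))
```

The same check run on over/under predictions is where, per Kain and Logan, the betting market's forecasts break down.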

Kain and Logan attribute the results of this study to the problematic position of those creating the lines, for the house is a profit maximizer. Casinos and others producing lines for sporting contests are in the business of making money. When they produce a line, they want a greater number of losers than winners, not a 50/50 split. Kain and Logan continue by stating that the house does not desire to accurately predict the outcome of games, but instead wants to shape bettor belief. Finally, Kain and Logan observe that there is less money in betting over/unders; therefore, casinos are less likely to invest in developing accurate predictions for them.

Critique
This study brings to light flaws in sports betting markets that many tend to overlook. The lines made for sporting contests are set in the interest of making money, not necessarily of making accurate predictions. Because of this, analysts should be wary of putting too much stock in sports betting as a forecasting tool. Studies into the validity of sports betting as an accurate prediction market should be done carefully, since accuracy is not what the market is designed to reward.


Source
Kain, K. & Logan, T. (2012). Are sports betting markets prediction markets? Evidence from a new test. Journal of Sports Economics.

Thursday, October 9, 2014

Do prediction markets produce well-calibrated probability forecasts?



Summary:

Page and Clemen (2012) examined the accuracy of prediction markets in making probability estimates.  Prediction markets are used to aggregate estimates from market players who have access to different sources of information.  Theoretically, as these various sources of information are taken into account, the buying and selling of predictions in the market generates a more accurate estimate.  Prediction markets usually involve short-term predictions, but it was widely assumed that long-term predictions in the market are just as accurate as the short-term ones.  Page and Clemen’s findings suggest otherwise.



Prices in prediction markets are often calibrated in order to get closer to a true estimate of a particular event.  There are several inter-related reasons why prices deviate from an event’s true probability, many of which are simply manifestations of stock market tactics in the prediction market.  Players with limited budgets will often take longshots on predictions in hopes of selling them later at an inflated price.  Another source of deviation is price manipulation: players may be encouraged to buy or sell predictions for the sole purpose of moving the price up or down.  An example of an encouraged player would be one who buys predictions of an event whose occurrence he or she can influence.



Page and Clemen found that as predictions in the market cover an extended timespan, players in the market are less likely to trade them.  Players undervalue the probability estimates of long-term predictions.  Long-term predictions, consequently, have little use in prediction markets: the lack of trade volume undermines whatever importance the prediction may actually have.  A long-term prediction in this case is one involving an event that will take a year or longer to occur.  However, Page and Clemen did find that short-term predictions (those within 100 days) are consistent enough to be calibrated for predictive insight.  The shorter the timespan, the more accurate a calibrated estimate is.
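The kind of calibration check Page and Clemen perform can be sketched simply: bin forecasts by their stated probability and compare each bin's predicted rate to the realized frequency of the event. The data below are invented for illustration; real analyses would use thousands of market forecasts per bin.

```python
from collections import defaultdict

# Hypothetical (prob_forecast, outcome) pairs; outcome is 1 if the
# event occurred, 0 if it did not.
data = [(0.1, 0), (0.1, 0), (0.1, 1), (0.3, 0), (0.3, 0),
        (0.7, 1), (0.7, 1), (0.7, 0), (0.9, 1), (0.9, 1)]

# Group forecasts into probability bins.
bins = defaultdict(list)
for p, y in data:
    bins[round(p, 1)].append(y)

# Realized frequency of the event within each bin; a well-calibrated
# market has realized frequencies close to the bin probabilities.
calibration = {p: sum(ys) / len(ys) for p, ys in bins.items()}
```

Page and Clemen's finding amounts to saying this predicted-versus-realized agreement holds up for short-horizon contracts but degrades as the horizon lengthens.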



Critique:

These findings speak to one of the biggest weaknesses of prediction markets: they cannot handle questions over too long a horizon.  Any use of prediction markets for strategic decision support therefore has a remote chance of being helpful.  However, their usefulness for tactical to operational decisions would be worth exploring.  The authors did not specify whether the prediction markets they examined were specific to any particular subjects (i.e. sports, stock market, etc.), so further exploration of intelligence-related prediction markets is required.



Source:

Page, L., & Clemen, R. T. (2012). Do prediction markets produce well-calibrated probability forecasts? The Economic Journal, 123, 491–531.