Thursday, March 28, 2013

Summary of Findings (White Team): News Analysis (1 out of 5 Stars)

Note: This post represents the synthesis of the thoughts, procedures and experiences of others as represented in the 8 articles read in advance (see previous posts) and the discussion among the students and instructor during the Advanced Analytic Techniques class at Mercyhurst University in March 2013 regarding News Analysis specifically. This technique was evaluated based on its overall validity, simplicity, flexibility and its ability to effectively use unstructured data.


Description:
A vaguely defined technique that includes the analysis of news in some way but generally relies on an established analytic technique such as sentiment or content analysis.

Strengths:
1. Fast way to find information on a topic of interest.

Weaknesses:
1. Technique is not well defined.
2. News sources can be skewed by whether they lean left or right politically.
3. The analysis of the news sources can be biased by the analyst's inherent biases on the proposed topic or by personal experiences.
4. Articles affected by high emotion.
5. Extremely affected by circular reporting.
6. Just examining news sources is not an effective approach to creating accurate intelligence estimates.
7. Difficult to approach or carry out due to the lack of structure.
8. Produces estimates of low analytical confidence.

Step by Step Action:
1. Analyze a specific topic utilizing news articles from various publishers.
2. Determine whether information related to the topic can be corroborated between the sources.
3. Produce an estimate based on information found.
4. Provide the analytic confidence for the findings.
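
The corroboration step is where circular reporting does the most damage, so a rough sketch may help. The Python below is only an illustration under assumed inputs: the `Report` record, its `origin` field, and the sample outlets and claims are hypothetical and were not part of the class exercise. It counts how many independent origins stand behind a claim rather than how many articles repeat it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    publisher: str  # outlet that ran the story
    claim: str      # the assertion being checked
    origin: str     # where the outlet got it (itself, a wire service, another outlet)


def independent_support(reports):
    """Map each claim to the set of distinct origins behind it."""
    origins = defaultdict(set)
    for r in reports:
        origins[r.claim].add(r.origin)
    return dict(origins)


# Hypothetical sample: three outlets repeat one claim, but two of them
# are simply relaying the same wire story.
reports = [
    Report("Outlet A", "training is under way", "Wire X"),
    Report("Outlet B", "training is under way", "Wire X"),
    Report("Outlet C", "training is under way", "Outlet C"),
    Report("Outlet D", "training has ended", "Wire X"),
]

support = independent_support(reports)
for claim, srcs in support.items():
    print(f"{claim}: {len(srcs)} independent origin(s)")
```

In this toy data the first claim appears in three articles but rests on only two origins, which is the kind of gap that should lower analytic confidence.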

Exercise:
Provided the class with a topic to research using only online news sources. The question was, “Has the U.S. government been training Syrian rebels?” The class was given ten minutes to research the topic. Then, as a class, we discussed our findings on the proposed topic. We discussed whether news analysis is an effective tool for providing reliable intelligence estimates and what the term “news analysis” really means.

Summary of Findings (Green Team): News Analysis (1 out of 5 Stars)


News Analysis
Green Team
Rating (1 out of 5 Stars)

Note: This post represents the synthesis of the thoughts, procedures and experiences of others as represented in the 8 articles read in advance (see previous posts) and the discussion among the students and instructor during the Advanced Analytic Techniques class at Mercyhurst University in March 2013 regarding News Analysis specifically. This technique was evaluated based on its overall validity, simplicity, flexibility and its ability to effectively use unstructured data.


Description:
News analysis is a poorly defined analytic technique that is often confused with separate techniques, such as sentiment analysis, content analysis, and computational linguistics, when they are applied to news sources. News analysis is meant to analyze the qualitative and quantitative attributes of news sources, with particular focus on sentiment, context, and novelty. News analysis, or other techniques referred to as news analysis, is frequently used in the financial industry to predict stock movements and consumer confidence.

Strengths:
  • Can be applied in a relatively short amount of time
  • Can be done individually or within a group
  • Can examine multiple facets of news, including sentiment and novelty

Weaknesses:
  • Not easily defined -- ambiguous with regard to what is measured, how to conduct it, and the information it provides
  • Unable to separate human bias from news analysis
  • Frequently used as a guise for a separate technique -- sentiment analysis, content analysis, and computational linguistics
  • Often limited in use to textual news sources

How-To:
  1. Determine a topic that is likely to be covered in the news.
  2. Search for relevant news articles from news sources (online, print, or other)
  3. Take into account the different biases that certain sources may contain along with personal biases from previous knowledge or personal interpretation.
  4. Note which information is important, relevant trends, and anything else noteworthy.
  5. Estimate the likelihood of the answer and assign an analytic confidence.

Personal Application of Technique:
The class was tasked with using only online news sources to produce an estimate answering the question: Has the US government been training Syrian rebels?  The class had ten minutes to look at different news sources and answer the question.  In addition to answering the question with an estimate, the class had to assign an analytic confidence to the assessment.  This exercise reiterated the difficulty of using this method, since it is not easily defined, nor is it possible to eliminate biases or the framing imposed by information already known.  Additionally, the issue of circular reporting was raised through this application and is an additional factor that should be taken into account when conducting news analysis.

Rating: 1 out of 5 stars

Tuesday, March 26, 2013

Does Public Financial News Resolve Asymmetric Information?

Summary:
The author, Paul C. Tetlock, tests four predictions about financial news and the role it plays in stock prices. He was interested in analyzing the role of news analysis in stock pricing; to test this, Tetlock used news from the Dow Jones news archive covering 29 years.  For this study, public information is measured by the return on stocks in the Dow Jones on news days.

The author determined that there is a correlation between the news reports released and the stock price the few days following the release.  This demonstrates a definite relation between the media and the prices of publicly traded stocks.  The number of informed investors increases due to the release of news, which in turn affects the stocks for that time period.  The author's presentation of the paper and research added to the previous body of knowledge surrounding the influence and impact of news releases.

Additionally, the author looked at how the gap in knowledge among investors changed, specifically with regard to the release of information by the media.  The findings of this study are not only relevant to trading and finance, but also have the potential to influence the manner in which firms view their stocks and the analysis that may result from that.

Critique:
This article provided a good application of news analysis, and while it has a strong financial application, it does not specifically discuss the intelligence field.  While the use of news analysis should be supported by other techniques or sources, it does provide a level of insight into public perception of certain issues that are taking place. Public sentiment and the information portrayed by the media have the potential to influence the intelligence field and should certainly be considered as one of the methods employed in analysis.

Another element, which the author did note, is the fact that this study did not account for the potential behavioral biases which are involved in the processing and interpretation of news.  While this does not discount the method, it is something which needs to be taken into consideration when utilizing this tool as an analytic approach.

Tetlock, P. C. (2010). Does public financial news resolve asymmetric information? AFA 2010 Atlanta Meetings Paper. Available from http://ssrn.com/abstract=1303612

Sentiment Analysis in the News

Summary:
For this article, the authors use opinion mining on 1592 quotes from English-language newspapers for which both the target and the source were known.  The authors acknowledge that the author, the reader, and the text itself could all carry different interpretations of the content.  Perhaps one of the most important points of this article is the authors' statement that by putting these news feed results into categories (e.g., 'disaster', 'flood', and 'accident' can all be put into one category) they may miss some things through misinterpretation.  However, especially with news analysis, having lists that are easily translated to apply to many different languages is a useful way to save time and to sometimes preserve the meaning.  They state that their technique of analysis could also be used in tests where quotes are not used.  A key assumption is that the text in quotes is more subjective than the text as a whole.

The authors conducted this experiment by taking the 1592 quotes and limiting them to 1292, the number on which the authors agreed about the sentiment.  Of these, the sentiment analysis system identified the target sentiment in 1114 quotes.  Additionally, these quotes were broken into four categories of expressed opinion: positive, negative, high positive, and high negative.  Some of the issues the authors found were that the software did not detect sarcasm and produced many false neutral results when no sentiment words were present.  One of the solutions suggested was to increase the amount of text examined, so that sarcasm could be offset and sentiment words would be present.  The authors did not include foreign news media, but suggested it for future research.
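
To make the categories and the false-neutral problem concrete, here is a minimal lexicon-based sketch in Python. It is not the authors' system: the word list, the weights, and the `strong` threshold are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon: word -> sentiment weight (not taken from the paper).
LEXICON = {
    "good": 1, "support": 1, "excellent": 2, "triumph": 2,
    "bad": -1, "fail": -1, "disaster": -2, "catastrophe": -2,
}


def classify_quote(quote, strong=2):
    """Sum lexicon weights and bucket the total into a sentiment category."""
    words = re.findall(r"[a-z']+", quote.lower())
    score = sum(LEXICON.get(w, 0) for w in words)
    if score >= strong:
        return "high positive"
    if score > 0:
        return "positive"
    if score <= -strong:
        return "high negative"
    if score < 0:
        return "negative"
    return "neutral"  # also returned when no lexicon word is present


print(classify_quote("The talks were a triumph"))
print(classify_quote("We support the plan"))
print(classify_quote("Oh great, another meeting"))
```

The last quote comes back neutral because none of its words are in the toy lexicon, and even if "great" were listed, the sarcasm would be scored as positive. Both failure modes match the issues the authors report.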

Critique:

One of my major issues with this article is that terminology was either poorly defined or not defined at all.  The authors did not define what EMM was, and with a general search, I found results ranging anywhere from Enterprise Mobility Management to Eastern Mennonite Missions, neither of which I believe the authors wished to analyze.  My best guess is that this article was about the Europe Media Monitor.  Additionally, the techniques and resources were only mentioned briefly and not defined.  WordNet Affect and SentiWordNet were two of the resources mentioned, but the authors never explained what they are or why they were used.  I feel that addressing these two issues would have made this article much easier to understand.

Another major issue I had with this article is that the authors disregard background knowledge and interpretation of the quotes in order to simplify the process.  I argue that this would be nearly impossible to do.  I cannot read a quote and disregard any knowledge I may have on the issue, nor can I stop myself from interpreting a quote.  These mental processes operate automatically and are difficult, if not impossible, to stop.

Lastly, I feel that this article does not apply specifically to the intelligence field; however, it does explain a process that could be used for intelligence analysis.  Examining articles for sentiment is a useful procedure when examining areas of interest, such as how Iranian leaders perceive certain issues through quotes from their newspapers.  I agree that using multiple resources to check for sentiment helps to confirm that the sentiment is consistently detected correctly.


Source:
Balahur, A., Steinberger, R., Kabadjov, M., Zavarella, V., van der Goot, E., Halkia, M., Pouliquen, B., & Belyaeva, J. (2009). Opinion Mining on Newspaper Quotations. Proceedings of the workshop 'Intelligent Analysis and Processing of Web News Content' (IAPWNC), held at the 2009 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology. Milano, Italy.  Retrieved from: http://lexitron.nectec.or.th/public/LREC-2010_Malta/pdf/909_Paper.pdf


Monday, March 25, 2013

Reading the Markets: Forecasting Public Opinion of Political Candidates by News Analysis

Lerman, Gilder, Dredze, & Pereira (2008) used computational linguistics to predict the impact of news on public perceptions of political candidates in the 2004 US Presidential election. The system predicts shifts in public opinion by analyzing daily newspaper articles. Their research assumes that mass media affects world events, such as elections, by swaying the opinions of both the general public and decision makers.

Summary:
The research of Lerman et al. applies the predictive capability of news analysis, typically associated with financial performance, to the political field of election results. Unlike opinion polls, which are conducted and published sporadically and are often not comparable, the authors claim daily news analysis can predict how public perception of political candidates will change on a day-to-day basis. The work differs from other opinion analysis in that the system uses objective news, not extracted opinions, to predict future opinions, treating the two as a cause and effect relationship.

Their computational system incorporates both external linguistic information (provided by the news coverage) and internal market indicators to forecast public opinion measured by prediction markets. The political prediction markets act like a stock market for elections with investors buying shares in an outcome they believe most likely to occur in exchange for a payout if correct. Internal market indicators include overall market mood, momentum, and history, citing the example that a positive news story regarding a candidate otherwise disliked will have less of an impact on public opinion (Lerman et al, 2008, p. 474). The system employed takes morning news articles, looks for particular features, and computes based on market history the price movement the news will cause, and compares the prediction with the actual day's end movement.

The system looks for certain features that will affect public opinion. Bag-of-words features are words that occur more than 20 times in an article, excluding common stop words. News focus features refer to a particular topic that is reported multiple times and to what degree the amount of reporting on this single topic changes. Entity features look to connect a subject entity to the topic, such as a political candidate to a scandal. Dependency features take entity features a step further and identify both the subject and the object of particular topics, such as which candidate defeated the other in a particular debate. Dependency features proved to be the most influential in forecasting public opinion using news analysis for the 2004 US Presidential election.
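
As an illustration of the simplest of these, the bag-of-words feature, here is a minimal Python sketch. It follows the description above (a frequency threshold plus stop-word removal), but the stop-word list, the tokenizer, and the function name are assumptions and not the authors' implementation.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for"}


def bag_of_words(text, min_count=20):
    """Return words occurring more than min_count times, minus stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return {word: n for word, n in counts.items() if n > min_count}


# The threshold is lowered here only so the short sample produces features.
sample = "The debate and the poll. " * 3 + "Poll numbers moved after the debate."
features = bag_of_words(sample, min_count=2)
print(features)
```

The other feature types (news focus, entity, and dependency features) need topic tracking and parsing, which is beyond a sketch of this size.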

Critique:
The research presented by Lerman et al. succeeds in identifying certain important aspects of news analysis. First, the authors note their system best tracks negative news impact (Lerman et al., 2008, p. 479). This is not surprising given the media's propensity to publish negative news stories, which attract readership. Additionally, the work disproves the notion that the number of mentions a candidate has is the sole factor in forecasting election results. The authors note that while Bush had more mentions than Kerry and did win, Kerry had the fewest mentions compared to fellow DNC contenders, yet he won the nomination (Lerman et al., 2008, pp. 478-79).

Despite these positive contributions, the research fails in a few areas. For instance, the authors do not identify or at all address the types of news sources they compiled their data from, beyond stating that they were daily, early-morning publications in various markets. It would be interesting to see if these are local, regional, or national papers, or papers with known biases, and how the authors addressed this, if at all. Additionally, the research only looked at morning articles, specifically print articles, which leaves out the vast amount of news likely to affect public opinion. For instance, while the reasoning for focusing on morning articles was to make a prediction for that day, the researchers fail to address how news from the previous day, after or during market hours, affected the next day, particularly for those, like myself, who read the news in the evening, not the morning. Additionally, a vast amount of news does not come from print sources, and now more than ever social media is being analyzed to make similar predictions. Though the authors could not have predicted this in 2004, by the time of publication (2008) social media was an ever-important election resource, particularly for President Obama.

Source:
Lerman, K., Gilder, A., Dredze, M., & Pereira, F. (2008). Reading the Markets: Forecasting Public Opinion of Political Candidates by News Analysis. Proceedings of the 22nd International Conference on Computational Linguistics. Manchester, UK: Coling. Retrieved from http://delivery.acm.org/10.1145/1600000/1599141/p473-lerman.pdf?ip=76.189.160.87&acc=OPEN&CFID=302768080&CFTOKEN=33442324&__acm__=1364267932_9637607646da1dbf0b73a76e46a9b775

International sentiment analysis for news and blogs

Summary:
In International Sentiment Analysis for News and Blogs, Bautin, Vijayarenu and Skiena performed a sentiment analysis on the English translation of foreign-language news texts from May 1 to May 10, 2007. They used the Lydia sentiment analysis system to conduct the experiment using both English-language papers and sources in eight other languages: Arabic, Chinese, French, German, Italian, Japanese, Korean and Spanish, totaling about 21,000 articles per language. The entities used in polarity calculation were 14 countries and four cities around the world. Previous research has shown that the Lydia system has been effective in capturing sentiment from English-language sources, and this article attempted to discover whether the same is true for machine-translated texts.

The authors conducted several preliminary experiments to test the validity of the method. To "isolate the effects" of variance in the news, they first conducted the analysis on a body of text of European Law that should show no explicit sentiment. They presented techniques to counter the bias that arose from the different language sources. They also concluded that their results were mostly translator independent.

The Lydia sentiment analysis system receives all of the news articles and carries out tasks such as part-of-speech tagging and extraction of entity descriptions, followed by a comparison against synonym and antonym sets, to produce the final sentiment score calculation. Two different scores are given, the polarity score and the subjectivity score. The former shows whether the entity is associated with a positive or negative polarity and the latter shows how much sentiment the entity receives of the particular polarity. When analyzing results on news entity polarity correlations, they noticed a common underlying factor when the time periods were highly correlated for most pairs of languages. This occurred during major world events, such as the sentiment drop among all four different languages for the entity "London" on May 10, the day that four people were arrested in the United Kingdom in connection with the 2005 London bombings. This is shown in the figure below.




Critique: 
The authors of the article conducted the analysis in a way that accounted for every bias I would have found myself, which highly validated the results of the experiment. Their use of two different Spanish translators and comparison of results to observe to what degree their results were translator dependent was also a very effective strategy.

A limitation of the experiment was that it covered only a ten-day period, which was too short to analyze "long-time country sentiment". Additionally, translation programs often make errors, which hinder their analysis. Nevertheless, the system proved valuable and its application to the intelligence community (IC) was evident, particularly in the section covering news entity polarity correlations mentioned in the summary. Cross-cultural observations also showed an interesting result potentially valuable to the IC from a cultural perspective: Italian is the most biased language toward negative sentiment while Korean is the most biased language toward positive sentiment.

The methodology did rely heavily on statistical methods that were sometimes not explained fully and thus, could be misunderstood by the layperson. Although the methodology in general was understandable, it would have been more effective for the authors to have provided more explanation of the equations or definitions of terms. I would have also liked to see more analysis of the actual results but the purpose of the article seemed to be geared more towards the application of the methodology and its effectiveness rather than the analysis.



Bautin, M., Vijayarenu, L., & Skiena, S. (2008). International sentiment analysis for news and blogs. Association for the Advancement of Artificial Intelligence, Retrieved from http://www.aaai.org/Papers/ICWSM/2008/ICWSM08-010.pdf

Large-scale Comparative Sentiment analysis of News Articles


Summary:
Although a large volume of news items is available to the public due to online media, there is a lack of efficient ways to analyze them.  This article presents sentiment analysis as a visual tool for analyzing news items.  This tool, in combination with text analysis, is designed to analyze news feeds from Europe Media Monitor (EMM) to determine if they have positive or negative connotations.  Since this technique is semi-automatic, it costs less and is capable of monitoring a specific topic in real time.  The news articles gathered focused on two categories: terrorist attacks and natural disasters.  A sentiment score was given to each of the 6000 news articles related to terrorist attacks and the 1000 news articles related to natural disasters.

Sentiment analysis not only provides an effective visual representation of the data but also highlights trends and patterns. For example, figure one displays the news articles of one week. The vertical axis shows the week divided into days and the horizontal axis shows the time the article was published. Each triangle represents a news feed, which is color coded as red (negative news item), blue (positive news item), or white (neutral news item). The upper line contains news associated with terrorist attacks, while the lower line contains news associated with natural disasters. News articles associated with both categories are placed between the two lines.  News articles with more saturated colors have higher sentiment scores.  Utilizing the zoom function, one may look at each triangle without data overlap.
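
The color coding described above can be sketched in a few lines of Python. This is a hypothetical mapping, not the authors' software: it assumes sentiment scores fall in the range -1 to 1, which the summary does not confirm, and the function name is made up for illustration.

```python
def sentiment_color(score):
    """Map a score in [-1, 1] to an (R, G, B) tuple with 0-255 channels."""
    s = max(-1.0, min(1.0, score))    # clamp out-of-range scores
    fade = round(255 * (1 - abs(s)))  # 255 = white, 0 = fully saturated
    if s < 0:
        return (255, fade, fade)      # negative: white toward red
    if s > 0:
        return (fade, fade, 255)      # positive: white toward blue
    return (255, 255, 255)            # neutral: white


print(sentiment_color(-1.0))
print(sentiment_color(0.0))
print(sentiment_color(1.0))
```

Scores closer to zero produce paler triangles, so strongly negative or positive items stand out at a glance.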

From this experiment, the authors were able to retrieve and assess a large volume of news articles from EMM. The data was organized by the time and date of publication. A sentiment score was given to each article based on its tone.

Critique:
This article explains a scenario that uses sentiment analysis to evaluate news articles.  However, it does not provide a good introduction to the experiment.  For example, the authors do not explain why the study was conducted or how a sentiment score was assigned to each news article. The authors mostly emphasized sentiment analysis as a visual tool, but did not explain the software used to analyze the articles. In addition, the article does not assess the advantages and disadvantages of this type of study. Sentiment analysis can analyze current activities on social networking sites. However, it can be subjective in interpretation because information found in articles is prone to biases.

Utilizing sentiment analysis for assessing media is a fairly new concept that has become popular in the past few years.  It has similarities to social media analysis.  Sentiment analysis is a great technique for businesses attempting to improve the quality of customer care.  They may utilize sentiment analysis to evaluate what is being discussed on media sites. If the feedback is negative, then the company can take proper action to improve the quality of customer service.  Also, based on positive and negative reviews, decision makers can formulate strategies to improve sales and marketing efforts.
 
Source:
Rohrdantz, F., Mansmann, F., Stoffel, A., Kristajic, M., & Keim, D. (n.d.). Large-scale comparative sentiment analysis of news articles. Retrieved from http://www.inf.uni-konstanz.de/gk/pubsys/publishedFiles/WaRoMa09b.pdf

Further Decline in Credibility Ratings for Most News Organizations

In August of 2012, the Pew Research Center published a report on a poll conducted in late July on the perceived believability of newspapers, cable news, and network news. The poll also checked the perceived believability of local newspapers and broadcasts.

Summary

Using a 4-point scale, respondents were asked to rate 13 news organizations according to how accurate they believe the organizations to be. Aside from examining the previously mentioned areas, the study also examined how the perception of news organizations differs between Republicans and Democrats. Overall believability in news organizations has sharply declined since 2002. In 2002, 71% of individuals who could answer the question assigned a positive rating of 3 or 4, and 30% assigned a negative rating of 1 or 2. In 2012, the positive rating had dropped to 56% while the negative rating had increased to 44%.

The newspapers that were used were the Wall Street Journal, New York Times, USA Today, and the local newspaper. Believability for all four dropped since 2002. The New York Times and USA Today dropped from the low-to-mid 60s to 49%, the Wall Street Journal dropped from 77% to 58%, while local newspapers saw the least significant drop (64% to 57%).

Cable and local news still have generally positive ratings, but these ratings have declined since 2002. CNN (76% to 58%) and MSNBC (73% to 50%) both maintained at least 50% overall believability. Fox News dropped just below the 50% mark (67% to 49%). Local TV news barely changed, staying in the mid 60s (68% to 65%).

Network news maintained a relatively high credibility rating compared to newspapers and cable news. In 2002, ABC, CBS, and NBC News all had a believability rating of 72%. Ten years later, the numbers fell to 59%, 57%, and 55%. 60 Minutes went from 77% to 64% while NPR went from 62% to 52%.
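
For a quick side-by-side comparison, the figures quoted above can be turned into percentage-point drops. The short Python sketch below uses only the numbers in this summary; it assumes the three network figures (59%, 57%, 55%) are listed in the same order as ABC, CBS, and NBC, and it leaves out the New York Times and USA Today because only a range is given for their 2002 ratings.

```python
# Positive believability ratings (percent, 2002 and 2012) as quoted in the summary.
ratings = {
    "Wall Street Journal": (77, 58),
    "Local newspaper": (64, 57),
    "CNN": (76, 58),
    "MSNBC": (73, 50),
    "Fox News": (67, 49),
    "Local TV news": (68, 65),
    "ABC News": (72, 59),   # network order assumed to match the text
    "CBS News": (72, 57),
    "NBC News": (72, 55),
    "60 Minutes": (77, 64),
    "NPR": (62, 52),
}

# Percentage-point drop for each organization, largest first.
drops = {name: y2002 - y2012 for name, (y2002, y2012) in ratings.items()}
for name, drop in sorted(drops.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} -{drop} points")

largest = max(drops, key=drops.get)
smallest = min(drops, key=drops.get)
```

On these numbers, MSNBC shows the largest decline and local TV news the smallest.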

When political affiliation is brought into the mix, the results change dramatically. Of the 13 organizations, Republicans were found to trust only five of them. These were 60 Minutes (51%), Wall Street Journal (57%), USA Today (50%), Local TV News (68%), and Fox News (67%). Democrats were the opposite, with the majority trusting all the organizations with the exception of Fox News (37%). Independents were in the middle, trusting eight organizations. As with Democrats, Fox News was the lowest scoring organization, with 43%.

Critique

While the poll and its results neither discuss news analysis nor serve as an example of it, the results show the problems that relying only on news analysis can cause. The issue that was examined was not the accuracy of the news that these organizations report but its perceived accuracy. As the poll shows, there has been a decline in the perception of accuracy in news media. This in turn harms a method such as news analysis, which relies on these organizations for sources.

This poll also shows the biases that political leanings can cause and how they can be exploited. An example of possible exploitation is knowing the political leanings of a management team or CEO and releasing false information on certain networks or newspapers. These biases could lead to more effective deception if exploited properly.

As for the study itself, there are some elements that could be improved on. One element that was done correctly was using a four point scale. By keeping the range simple, this avoided confusion that using a scale of 1 to 10 can cause. The biggest item that the study missed out on was polling on the believability of online news sources. Polling on the perceived believability of online news sources would have made this report more beneficial to the intelligence community, as the majority of sources that we use in our program are online.

Source: Pew Research Center for the People & the Press. (2012). Further Decline in Credibility Ratings for Most News Organizations. Retrieved from http://www.people-press.org/files/2012/08/8-16-2012-Media-Believability1.pdf

Large-Scale Sentiment Analysis for News and Blogs


Summary:

The article Large-Scale Sentiment Analysis for News and Blogs is a study by Namrata Godbole, Manjunath Srinivasaiah, and Steven Skiena assessing news analysis with a particular focus on sentiment.  Their method utilizes a system that assigns scores indicating a positive or negative opinion to each distinct entity in the text corpus.  The system includes a sentiment identification phase and a sentiment aggregation and scoring phase.  The sentiment identification phase associates the expressed opinions with each relevant entity, while the sentiment aggregation and scoring phase scores each entity relative to those in the same class.  The study ends with an evaluation of the scoring techniques over a large corpus of news and blogs.
By building on the Lydia text analysis system, the authors determine the public sentiment on thousands of entities and how that sentiment varies with time.

Various aspects of the sentiment analysis system include Algorithmic Construction of Sentiment Dictionaries, Sentiment Index Formulation, and Evaluation of Significance.   The Algorithmic Construction of Sentiment Dictionaries portion of the study includes tracking the reference frequencies of adjectives with positive and negative connotations.  The authors incorporate a method that expands small candidate seed lists of positive and negative words into full sentiment lexicons using path-based analysis of synonym and antonym sets in WordNet.  Furthermore, the authors use sentiment-alternation hop counts to determine the polarity strength of the candidate terms and eliminate any ambiguous terms.  Sentiment Index Formulation includes constructing a statistical index to reflect the significance of sentiment term juxtaposition.  The overall entity sentiment is scored using the juxtaposition of sentiment terms and entities together with a frequency-weighted interpolation with world happiness levels.   Finally, the Evaluation of Significance element provides statistical evidence of the validity of the sentiment evaluation.  It does this by correlating the index with real-world events.
               
After presenting the overall structure of the study, the article includes a section describing a method to determine the semantic orientation of words.  An overview of sentiment analysis systems is also incorporated.  The next section focuses on sentiment lexicon generation.  The authors define separate lexicons for the seven sentiment dimensions used in the study: general, health, crime, sports, business, politics, and media.  The sentiment word generation algorithm used in the study expands a set of seed words by using synonym and antonym queries in multiple ways.  First, a polarity is associated with each word and query.  Second, the significance of a path decreases as a function of its length or depth from a seed word.  The final score of each word is the summation of the scores received over all the paths.  Two iterations are run on each word. The first iteration calculates a preliminary score estimate while the second re-enumerates the paths while calculating the number of apparent sentiment alternations.  Next, WordNet orders the synonyms and antonyms by sense.  Overall, the algorithm generates over 18,000 words.
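
The path-based expansion described above can be sketched in Python. This is a simplified, single-pass illustration and not the authors' algorithm: the tiny `LINKS` thesaurus stands in for WordNet, the decay factor of 2 per hop is an assumption, and the second iteration that counts sentiment alternations is omitted.

```python
from collections import defaultdict

# Hypothetical toy thesaurus standing in for WordNet: word -> [(neighbor, relation)].
LINKS = {
    "good": [("fine", "syn"), ("bad", "ant")],
    "fine": [("good", "syn"), ("decent", "syn")],
    "decent": [("fine", "syn")],
    "bad": [("good", "ant"), ("poor", "syn")],
    "poor": [("bad", "syn")],
}


def expand_lexicon(seeds, max_depth=3, decay=2.0):
    """Score words by summing decayed, sign-aware paths from the seed words."""
    scores = defaultdict(float)

    def walk(word, polarity, depth, visited):
        if depth > 0:
            scores[word] += polarity / decay ** depth  # longer paths count less
        if depth == max_depth:
            return
        for neighbor, relation in LINKS.get(word, []):
            if neighbor in visited:
                continue  # keep paths simple: never revisit a word
            sign = -1 if relation == "ant" else 1  # an antonym hop flips polarity
            walk(neighbor, polarity * sign, depth + 1, visited | {neighbor})

    for seed, polarity in seeds.items():
        scores[seed] += polarity  # seeds keep their own polarity
        walk(seed, polarity, 0, {seed})
    return dict(scores)


lexicon = expand_lexicon({"good": 1.0})
for word, score in lexicon.items():
    print(f"{word}: {score:+.2f}")
```

Starting from the single seed "good", the synonyms pick up positive scores that shrink with distance, while "bad" and its synonym "poor" pick up negative scores through the antonym link.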

The sentiment lexicon generation was evaluated in two different ways.  The evaluation was done using an “un-test” as well as by comparing the sentiment lexicons against the lexicons obtained by Wiebe.   To interpret and score the data, the authors utilized sentiment lexicons to mark up all the sentiment words and associated entities in the corpus.  Finally, a section preceding the conclusion discusses news versus blogs and the significant differences between them.

Critique:
The article compares sentiment analysis in both news sources and blogs but lacks an intelligence perspective. The authors note that the two sources are not directly comparable, since the issues and people discussed in blogs varied considerably from those in newspapers; even so, both analyses are important and deserve their own study. After conducting separate studies, a comparative case study may prove to be effective. The authors state they are interested in how sentiment can vary by demographic group and geographic location. These findings can vary drastically between news sources and blogs, reiterating the need for separate studies.
The overall study would have been clearer and more effective if it were broken into two different studies, one testing blogs and another testing news sources. It would have likely allowed for a more detailed analysis of each topic.

It also seems as though certain dimensions, such as politics, may receive more negative sentiment than sports. Additionally, the time period may have a very significant impact on sentiment. For instance, election years will generate more emotion than non-election years; other examples include the Super Bowl or World Series, when people are paying more attention to sports. In business, dips in the stock market or times when the market is doing very well will generate more emotion. Other events, such as the release of a new movie followed by reviews, can skew results in terms of media. Location is also a factor not included in the study that can affect sentiment. For example, conservative newspapers or blogs generated in more conservative areas will likely produce different sentiment than more liberal newspapers and blog posts coming from more liberal areas. Overall, although this research generates useful conclusions, many potential factors that could skew the data are not accounted for in the study.


Source:
Godbole, N., Skiena, S., Srinivasaiah, M. (2007). Large-Scale Sentiment Analysis for News and Blogs.  Proceedings of the International Conference on Weblogs and Social Media (ICWSM). 

Saturday, March 23, 2013

Internet news media and issue development: a case study on the roles of independent online news services as agenda-builders for anti-US protests in South Korea




Summary:

Song (2007) conducted a study that compared how online news sources and mainstream newspapers reacted to the deaths of two schoolgirls struck by a manned U.S. military vehicle in 2002. Several groups and news agencies in South Korea used this event as a way to share their grievances against the Status of Forces Agreement (SOFA). SOFA is a treaty between the U.S. and South Korea that outlines the legal procedure for crimes involving U.S. military personnel. Many South Koreans believe SOFA allows U.S. military personnel to commit crimes that go unpunished. Despite Korean authorities asking for jurisdiction to try the military personnel involved in South Korean courts, the U.S. intervened under SOFA and delivered not-guilty verdicts for the deaths of the two South Korean schoolgirls, resulting in organized protests throughout South Korea (Song, 2007). Song (2007) conducted a news analysis to determine if there were differences between traditional news media and online news sources in the number and timing of news stories pertaining to the deaths of the two schoolgirls and related issues.

Song (2007) analyzed the published material of five South Korean news organizations, which included three national newspapers and two online news sources, over a 30-week period after the deaths of the two South Korean schoolgirls in 2002. The newspapers Chosun and JoongAng are the largest South Korean news organizations and are deemed to be conservative in nature. The third newspaper, the Hankyoreh, is deemed to be more progressive in its content. The two online news sources Song (2007) added to his study were PRESSian and OhmyNews. PRESSian at the time of the study was the leading independent online news source in South Korea, while OhmyNews is a heavily opinionated news source that relies on citizen participation for its content (Song, 2007).

The results of the study demonstrated that the progressive news sources, especially the online news services, had the largest influence on news publications. The not-guilty verdict of the U.S. military court increased the number of stories published by progressive activists, who used online news publications to convey disapproval. This increase influenced traditional news sources to increase their coverage as well, though not as frequently, as the chart below demonstrates (Song, 2007). This seemed to suggest that the online news sources under study were able to control how much traditional news sources reported on the deaths of the two South Korean schoolgirls. The online media sources that were part of the sample seemed to be the catalyst for the escalation of reported news publications and the escalating resentment towards SOFA, leading to large organized protests against the U.S. in South Korea. Furthermore, the most intriguing aspect of the study was that during the studied 30-week period the volume of news publications declined until the U.S. military court issued the not-guilty verdict during week 23, at which point Internet news media agencies significantly increased news publications. Song (2007) states that significant triggering news events have a stronger influence than inter-media effects on what the media report.

Critique:

Even though Song (2007) found interesting trends in news reporting for a particular event, both in traditional print sources and in Internet reporting, his conclusions should be taken with some caution. The scope of the study was limited to South Korea and to a short duration of time, 30 weeks. To validate the results of the study, it would be necessary to either choose another country to study or track another event in South Korea to see if similar findings occur.

In terms of intelligence practice, it is important to be familiar with the use of the Internet to publish politicized material. Song’s (2007) study found that the reviewed Internet news sources published far more stories than traditional news sources and influenced traditional news sources to increase their coverage of events. The author found that Internet media sources were able to escalate the reactions of South Korean citizens towards the perceived injustice of the U.S. court provisions associated with SOFA. Most significantly, South Korea’s robust Internet infrastructure has the capability to further escalate the power of online news sources and their influence on citizen perceptions and behavior. The influence of online news media led to organized protests expressing disapproval of SOFA and increased anti-U.S. sentiment. Not only was the Internet able to help create organized protests, it was also a factor in the presidential race that year, when the progressive candidate beat the conservative opponent in South Korea (Song, 2007).

Ultimately, a news analysis such as the one conducted by Song (2007) demonstrates the importance of being familiar with the many different sources of news publications, both traditional and online, and how they influence societal behavior in a certain region. The author’s study found that online media sources had a greater effect on the reader’s behavior and attitudes. Understanding both forms of open source data is key for an intelligence analyst in terms of probabilistic thinking when predicting future actions and the likelihood of potential outcomes. Although Song’s (2007) results need to be tested again to prove reliability, the trends in the escalation of media coverage suggest the increasing role of the Internet in providing news coverage over traditional sources of news. Online news media coupled with social media need to be analyzed more frequently by the intelligence analyst of today because they offer more insight into predicting behavior in foreign regions.



Source: Song, Yonghoi. (2007). Internet News Media And Issue Development: A Case Study On The Roles Of Independent Online News Services As Agenda-Builders For Anti-US Protests In South Korea. New Media & Society, 9(1), 71-92. Retrieved from http://nms.sagepub.com/content/9/1/71.full.pdf+html