Vistology Fusion 2011
Toward Formal Reasoning with Epistemic Policies about Information Quality in the Twittersphere

Brian Ulicny
VIStology, Inc
Framingham, MA, U.S.A.

Mieczyslaw M. Kokar
Department of Electrical and Computer Engineering
Northeastern University
Boston, MA, USA

Abstract – Some recent systems have had success in producing an accurate awareness of situations by mining traffic in Twitter. Where these systems have been successful, there has been no issue of evaluating Twitter streams for source reliability and information credibility because the situations have not been adversarial. Recent uses of Twitter in political dissent in the Mideast make the need for computationally tractable approaches to evaluating source reliability and information credibility more acute in order to achieve accurate situation awareness on the basis of Twitter streams in the face of deliberate mis- or disinformation efforts.

Keywords: Twitter; soft data fusion; situation awareness; information evaluation; reliability; credibility; source independence

1 Introduction

Twitter has become the best-known example of a broadcast system for short “status update” messages. Such platforms have become associated with organizing and mobilizing political dissent and disruption [1]. In the recent 2011 uprisings in the Middle East, in Tunisia and Egypt [2], Twitter and Facebook are widely believed to have played a major part in organizing and mobilizing elements of society to overthrow the governments in those countries, although some observers have stated that the role of social media platforms like Twitter in sparking similar uprisings in Iran and Moldova has been overstated [3]. As unrest continues in the Mideast, regardless of whether Twitter and similar social media are an essential technology for initiating or organizing such dissent or not, it is clear that the use of technologies like Twitter cannot be ignored as a source of situation awareness.

Twitter, on which we will focus here, is a platform by which users can sign up for a free password-authenticated account anywhere in the world. Users can post short messages with a 140-character limit associated with their username via their computer, smartphone or SMS; currently, approximately 55 million tweets are sent each day. Messages are time stamped. Users can address another user with an @tag: a username prepended with ‘@’. Users can annotate a message by topic with a hashtag: a folksonomy term prepended with ‘#’. Users can subscribe to the messages of other users by following them. Users can send private messages to someone who follows them by prefixing their message with ‘DM’ (direct message) and the username. Users typically shorten URLs in their tweets via various services (e.g. bit.ly) to maximize the 140-character message length. These shortened URLs are unique to the originating message. Users can also retweet a message, indicating whom it came from by simply prepending the message with ‘RT’ and the originator's username. Users can automatically associate a geolocation with their message if their phone or other device supports this and they turn this option on. (Less than 1% of Twitter status updates are currently geolocated.) Twitter messages are archived after six months.

Users can provide a short profile message, a profile picture, and a URL to provide more background. Twitter, and other such platforms, are particularly interesting because they are public. Anyone can follow what is going on in the Twittersphere simply by ‘following’ users or topics (called hashtags) or keywords. Twitter verifies some famous users’ identities, and indicates this status on their profile. In general, however, users are not verified, and anyone can tweet under whatever name they like. Twitter suggests that linking to one’s Twitter feed from one’s own website can also serve as user authentication.

Although we focus on Twitter here, Facebook and Google Buzz provide similar functionality. Also, the Ushahidi platform (ushahidi.org) combines a map overlay with the ability to post reports by location, via cell phone texts or from Twitter or anonymously from the web, primarily in humanitarian relief situations. It has been used to monitor election fraud in Afghanistan and responses to the 2010 Haitian earthquake.

By monitoring Twitter, in principle we can discover what users are talking about and interested in from moment to moment. Although individual tweets may not provide much insight, aggregated tweets may convey a strong signal about the situation they reflect. For example, Figure 1, from the Twitter blog, shows tweets per second over time for the hashtag #superbowl during the 2011 NFL Super Bowl game. The spikes in the graph of tweets per second clearly correlate strongly with scoring in the game. Other spikes correlate with moments in the game’s half-time show, particularly the surprise appearance of one performer. Armed only with these tweets, it is likely that one could recreate an account of what happened in the game and when, by looking for commonalities in the messages at the times corresponding to spikes.

Similarly, Culotta has shown [13] that influenza outbreaks can be tracked in near-real time quite effectively just by looking for simple keywords in tweets. Culotta validated his models by comparing Twitter results with weekly epidemiological reports from the Centers for Disease Control.

Figure 1 NFL Super Bowl 2011 #Superbowl Tweets per second (from Twitter blog)

What the Super Bowl and flu situations have in common is that there is little reason for a Twitter user to publish mis- or disinformation. Therefore, the tweets can be taken at face value. In this paper, our focus will be not primarily on aggregating situation awareness from a multitude of presumably sincere tweets, but on formally evaluating tweets for their information quality along several dimensions that are relevant to adversarial, or partially adversarial, situations. That is, while current approaches to situation awareness via Twitter treat every tweet equally, because of the adversarial nature of the struggles in which Twitter plays a big part, it may be prudent to treat tweets differentially in terms of their reliability, credibility, and other epistemic properties before constructing a depiction of the situation from them.

2 Information Evaluation

NATO STANAG (Standardization Agreement) 2022 “Intelligence Reports” states that [5] where possible, “an evaluation of each separate item of information included in an intelligence report, and not merely the report as a whole” should be made. It presents an alpha-numeric rating of “confidence” in a piece of information which combines a measurement of the reliability of the source of the information and a numeric measurement of the credibility of a piece of information “when examined in the light of existing knowledge”.¹

Reliability of the source is designated by a letter A to F signifying various degrees of confidence as follows:

A: Completely reliable. It refers to a tried and trusted source which can be depended upon with confidence.
B: Usually reliable. It refers to a source which has been successfully used in the past but for which there is still some element of doubt in particular cases.
C: Fairly reliable. It refers to a source which has occasionally been used in the past and upon which some degree of confidence can be based.
D: Not usually reliable. It refers to a source which has been used in the past but has proved more often than not unreliable.
E: Unreliable. It refers to a source which has been used in the past and has proved unworthy of any confidence.
F: Reliability cannot be judged. It refers to a source which has not been used in the past.

Credibility: The credibility of a piece of information is rated numerically from 1 to 6 as follows:

1: If it can be stated with certainty that the reported information originates from another source than the already existing information on the same subject, then it is classified as “confirmed by other sources”.²
2: If the independence of the source of any item of information cannot be guaranteed, but if, from the quantity and quality of previous reports, its likelihood is nevertheless regarded as sufficiently established, then the information should be classified as “probably true”.
3: If, despite there being insufficient confirmation to establish any higher degree of likelihood, a freshly reported item of information does not conflict with the previously reported behaviour pattern of the target, the item may be classified as “possibly true”.
4: An item of information which tends to conflict with the previously reported or established behaviour pattern of an intelligence target should be classified as “doubtful” and given a rating of 4.
5: An item of information that positively contradicts previously reported information or conflicts with the established behaviour pattern of an intelligence target in a marked degree should be classified as “improbable” and given a rating of 5.
6: An item of information the truth of which cannot be judged.

As such, the credibility metric involves notions of source independence, (in)consistency with past reports, and the quality and quantity of previous reports.

¹ The same matrix is presented in Appendix B “Source and Information Reliability Matrix” of FM 2-22.3 “Human Intelligence Collector Operations” (2006) without citing STANAG 2022. JC3IEDM [6] includes a reporting-data-reliability-code rubric that is nearly identical, with some quantitative guidance (“not usually reliable” means less than 70% accurate over time).
² JC3IEDM’s reporting-data-accuracy codes are nearly identical to these except that the top three categories refer to confirmation by 3, 2 or 1 independent sources, respectively. JC3IEDM also contains an additional, unrelated reporting-data-credibility-code (reported as fact, reported as plausible, reported as uncertain, indeterminate); it is not clear how it relates to the others.

2.1 Current Approaches to Reliability

documents (those that are themselves pointed to by high quality documents) more highly.
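
As an illustration of the alphanumeric rating described in Section 2, the following minimal Python sketch represents a source reliability letter (A to F) and an information credibility digit (1 to 6) as a single value. The class names, the `parse_rating` helper, and the two-character string form such as "B2" are assumptions made for this example; they are not taken from STANAG 2022 or from the system described in this paper.

```python
from dataclasses import dataclass
from enum import Enum


class Reliability(Enum):
    """Source reliability letters, labelled as in Section 2."""
    A = "completely reliable"
    B = "usually reliable"
    C = "fairly reliable"
    D = "not usually reliable"
    E = "unreliable"
    F = "reliability cannot be judged"


class Credibility(Enum):
    """Information credibility digits, labelled as in Section 2."""
    CONFIRMED_BY_OTHER_SOURCES = 1
    PROBABLY_TRUE = 2
    POSSIBLY_TRUE = 3
    DOUBTFUL = 4
    IMPROBABLE = 5
    TRUTH_CANNOT_BE_JUDGED = 6


@dataclass(frozen=True)
class Rating:
    """One reliability letter paired with one credibility digit."""
    reliability: Reliability
    credibility: Credibility

    @property
    def code(self) -> str:
        # Two-character form such as "B2" (an assumed convention).
        return f"{self.reliability.name}{self.credibility.value}"


def parse_rating(code: str) -> Rating:
    """Parse a code such as 'B2' into a Rating (hypothetical helper)."""
    code = code.strip().upper()
    if len(code) != 2:
        raise ValueError(f"expected a letter and a digit, got {code!r}")
    letter, digit = code[0], code[1]
    if letter not in Reliability.__members__ or not digit.isdigit():
        raise ValueError(f"invalid rating code: {code!r}")
    # Credibility(...) raises ValueError for digits outside 1..6.
    return Rating(Reliability[letter], Credibility(int(digit)))


if __name__ == "__main__":
    rating = parse_rating("B2")
    print(rating.code, "-", rating.reliability.value, "/",
          rating.credibility.name.lower().replace("_", " "))
```

Keeping the two dimensions as separate fields reflects the point made in Section 2 that reliability is a property of the source, while credibility is a property of the individual item of information.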