toward formal reasoning with epistemic policies about information quality in the twittersphere
DESCRIPTION
Presentation by VIStology Inc. at Fusion 2011 Conference, Chicago, IL. July 2011. Automatically evaluating the reliability and credibility of messages on Twitter.
TRANSCRIPT
VIStology, Inc - Fusion 2011 1
TOWARD FORMAL REASONING WITH EPISTEMIC POLICIES ABOUT INFORMATION QUALITY IN THE TWITTERSPHERE
Brian Ulicny
VIStology, Inc.
Mieczyslaw Kokar
Northeastern University and VIStology, Inc.
VIStology, Inc - Fusion 2011 2
Arab Spring Uprisings 2011
VIStology, Inc - Fusion 2011 3
Situation Awareness (?): Al Jazeera’s Twitter Monitor
VIStology, Inc - Fusion 2011 4
Situation Awareness: Attention Spikes from Twitter
VIStology, Inc - Fusion 2011 5
Situation Awareness: Flu Trends from Social Media
Detecting influenza outbreaks by analyzing Twitter messages
Aron Culotta
arXiv:1007.4748v1 [cs.IR] 27 Jul 2010
VIStology, Inc - Fusion 2011 6
Twitter as Open Source Intel
VIStology, Inc - Fusion 2011 7
Reliability (Source)
A: Completely reliable. It refers to a tried and trusted source which can be depended upon with confidence.
B: Usually reliable. It refers to a source which has been successfully used in the past but for which there is still some element of doubt in particular cases.
C: Fairly reliable. It refers to a source which has occasionally been used in the past and upon which some degree of confidence can be based.
D: Not usually reliable. It refers to a source which has been used in the past but has proved more often than not unreliable. (JC3IEDM: The probability of producing erroneous information is high (>30%).)
E: Unreliable. It refers to a source which has been used in the past and has proved unworthy of any confidence.
F: Reliability cannot be judged. It refers to a source which has not been used in the past.
Credibility (Reported Information)
1: Confirmed by Other Sources. It can be stated with certainty that the reported information originates from another source than the already existing information on the same subject. (JC3IEDM: 3 Independent Sources)
2: Probably True. The independence of the source of any item of information cannot be guaranteed, but from the quantity and quality of previous reports, its likelihood is nevertheless regarded as sufficiently established. (JC3IEDM: 2 Independent Sources)
3: Possibly True. Despite there being insufficient confirmation to establish any higher degree of likelihood, a freshly reported item of information that does not conflict with previously reported behaviour pattern of target. (1 …)
4: Doubtful. An item of information which tends to conflict with the previously reported or established behaviour pattern of an intelligence target.
5: Improbable. An item of information that positively contradicts previously reported information or conflicts with the established behaviour pattern of an intelligence target in a marked degree.
6: Truth of information cannot be judged.
Confidence = <Reliability, Credibility>
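As a rough illustration, the confidence pair can be held in a small validated record. This is only a sketch: the class name `Confidence` and the two lookup tables are hypothetical names introduced here, and the labels are abbreviated from the rubric above.

```python
from dataclasses import dataclass

# Short labels abbreviated from the STANAG 2022 rubric on this slide.
RELIABILITY = {
    "A": "Completely reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Reliability cannot be judged",
}
CREDIBILITY = {
    1: "Confirmed by other sources",
    2: "Probably true",
    3: "Possibly true",
    4: "Doubtful",
    5: "Improbable",
    6: "Truth cannot be judged",
}


@dataclass(frozen=True)
class Confidence:
    """Hypothetical container for a <Reliability, Credibility> pair."""

    reliability: str  # one of "A".."F"
    credibility: int  # one of 1..6

    def __post_init__(self):
        # Reject ratings that are outside the rubric.
        if self.reliability not in RELIABILITY:
            raise ValueError(f"unknown reliability rating: {self.reliability!r}")
        if self.credibility not in CREDIBILITY:
            raise ValueError(f"unknown credibility rating: {self.credibility!r}")

    def __str__(self):
        return (
            f"<{self.reliability}: {RELIABILITY[self.reliability]}, "
            f"{self.credibility}: {CREDIBILITY[self.credibility]}>"
        )
```

For example, `str(Confidence("A", 1))` renders the pair in the same angle-bracket notation used on the slide.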
VIStology, Inc - Fusion 2011 8
Problem Statement
• How can we assess not only the volume of tweets per time period
• And the frequency of terms they contain
• But the reliability, credibility & confidence of the information they convey
• In a potentially adversarial situation?
VIStology, Inc - Fusion 2011 9
Naïve STANAG 2022 for Twitter
• Reliability = F: Cannot Be Judged
– All “sources not used in the past”
• Credibility = 1: Confirmed by Other Sources
– More than two string-identical tweets?
• Or Credibility = 3: Possibly True
– Because sources not independent
– Because path between all sources in Twitter graph
VIStology, Inc - Fusion 2011 10
Need
• Tractable Way to Calculate:
– Twitter Source Reliability
– Twitter Content Credibility
– Twitter Source Independence
• Where
– Entire Twitter graph contains 105 Million Users (as of April 2010)
– 55 Million Tweets per Day
– 3 Billion Requests per day to Twitter API
VIStology, Inc - Fusion 2011 11
The Argument from Google
• There are too many Twitter sources to evaluate their reliability directly.
• However, Google has shown that there is great value in using eigenvector centrality (PageRank) as a proxy for reliability.
• Therefore, we assume that a PageRank-like metric correlates with Reliability because
• (1) We assume that people do not pass along information they believe to be unreliable
• (2) Eigenvector centrality/retweet influence, unlike simple indegree centrality, is difficult to fake.
VIStology, Inc - Fusion 2011 12
Not Every Twitter User is Real
CENTCOM Operation Earnest Voice
VIStology, Inc - Fusion 2011 13
TunkRank as Reliability
• Influence(X) = Expected number of people who will read a tweet that X tweets, including all retweets of that tweet. For simplicity, we assume that, if a person reads the same message twice (because of retweets), both readings count.
• If X is a member of Followers(Y), then there is a 1/||Following(X)|| probability that X will read a tweet posted by Y, where Following(X) is the set of people that X follows.
• If X reads a tweet from Y, there’s a constant probability p that X will retweet it.
• Influence(X) = Σ over Y in Followers(X) of (1 + p · Influence(Y)) / ||Following(Y)||
D. Tunkelang. 2009. A Twitter Analog to PageRank. http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/
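The recurrence implied by the bullets above can be approximated by fixed-point iteration. The following is a minimal sketch, not the presenters' implementation: the function name `tunkrank`, the `following` dictionary layout, the retweet probability `p=0.05` and the iteration count are all assumptions chosen for illustration.

```python
def tunkrank(following, p=0.05, iterations=50):
    """Approximate TunkRank influence by repeated substitution.

    `following` maps each user to the set of users that user follows.
    Each follower Y of X contributes (1 + p * influence[Y]) / |Following(Y)|
    to influence[X]; p is an assumed constant retweet probability.
    """
    # Collect every user that appears as a follower or as a followed account.
    users = set(following)
    for followed in following.values():
        users |= set(followed)

    # Invert the relation: followers[x] is the set of users who follow x.
    followers = {u: set() for u in users}
    for user, followed in following.items():
        for target in followed:
            followers[target].add(user)

    influence = {u: 0.0 for u in users}
    for _ in range(iterations):
        # Build the next estimate from the previous one (synchronous update).
        influence = {
            x: sum(
                (1 + p * influence[y]) / len(following[y])
                for y in followers[x]
            )
            for x in users
        }
    return influence
```

With `p` below 1 each pass shrinks the remaining error, so a modest number of iterations is enough for small graphs; a graph at Twitter scale would need a sparse or distributed formulation instead.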
VIStology, Inc - Fusion 2011 14
TunkRank as Reliability
[Figure: TunkRank vs Indegree Centrality (log scale)]
Mapping TunkRank to STANAG 2022 Reliability
TunkRank               STANAG 2022 Reliability
> 90th percentile      A: Completely Reliable
> 80th percentile      B: Usually Reliable
> 50th percentile      C: Fairly Reliable
< 50th percentile      D: Not Usually Reliable
< 10th percentile      E: Unreliable
Undefined              F: Cannot Be Determined
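The mapping above can be sketched as a simple threshold lookup. The function names are hypothetical, and two details are assumptions because the table does not settle them: a percentile of exactly 50 is treated as D, and the "< 10th percentile" row takes precedence over the "< 50th percentile" row.

```python
from bisect import bisect_left


def percentile_rank(score, all_scores):
    """Percentage of scores in `all_scores` that are strictly below `score`."""
    ordered = sorted(all_scores)
    return 100.0 * bisect_left(ordered, score) / len(ordered)


def reliability_from_percentile(pct):
    """Map a TunkRank percentile to a STANAG 2022 reliability letter.

    `pct` is None when no TunkRank score is defined for the user.
    """
    if pct is None:
        return "F"  # Undefined: cannot be determined
    if pct > 90:
        return "A"
    if pct > 80:
        return "B"
    if pct > 50:
        return "C"
    if pct < 10:
        return "E"  # assumed to override the "< 50th percentile" row
    return "D"  # 10th to 50th percentile, boundaries included by assumption
```

A score is first converted with `percentile_rank` against the scores of all users under consideration, and the result is then passed to `reliability_from_percentile`.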
VIStology, Inc - Fusion 2011 15
Unreliability Indicators
• If X retweets a message, e.g.:
RT @Whitehouse Zombie uprising in Scranton
• And there is no corresponding original tweet
• Then X is E: Unreliable.
• If X tweets a message with the same URL (shortened or dereferenced)
• But different content
• More than twice
• Then X is D: Not Usually Reliable.
• (On the other hand: Verification: Reliability)
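Both indicators can be sketched over an in-memory list of tweets. Everything here is an assumption for illustration: the tweet layout (dictionaries with `user` and `text` keys), the function names, the retweet pattern, and the reading of "more than twice" as more than two distinct message bodies sharing one URL. The optional `resolve` callable stands in for URL dereferencing and defaults to leaving URLs unchanged.

```python
import re
from collections import defaultdict

RT_PATTERN = re.compile(r"^RT @(\w+):?\s+(.*)$")
URL_PATTERN = re.compile(r"https?://\S+")


def false_retweeters(tweets):
    """Users who 'retweet' a message with no matching original in `tweets`."""
    # Index every (author, text) pair so attributed originals can be looked up.
    originals = {(t["user"].lower(), t["text"].strip()) for t in tweets}
    flagged = set()
    for t in tweets:
        match = RT_PATTERN.match(t["text"].strip())
        if match is None:
            continue
        claimed = (match.group(1).lower(), match.group(2).strip())
        if claimed not in originals:
            flagged.add(t["user"])  # candidate for E: Unreliable
    return flagged


def url_recyclers(tweets, resolve=lambda url: url):
    """Users who attach one URL to more than two different message bodies."""
    bodies_by_user_url = defaultdict(set)
    for t in tweets:
        # Compare message bodies with all URLs stripped out.
        body = URL_PATTERN.sub("", t["text"]).strip()
        for url in URL_PATTERN.findall(t["text"]):
            bodies_by_user_url[(t["user"], resolve(url))].add(body)
    # Candidates for D: Not Usually Reliable.
    return {
        user
        for (user, _url), bodies in bodies_by_user_url.items()
        if len(bodies) > 2
    }
```

The check for a missing original is only as good as the collected sample: an original tweet that exists on Twitter but was not retrieved would wrongly flag its retweeters.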
VIStology, Inc - Fusion 2011 16
Source Independence
• There is a path connecting (nearly) every user in the Twitter graph.
• This does not mean that there is no source independence in Twitter.
• We count any sources as independent if they originate the message, and
• The shortest path between them is ≥ 4.
• In T.H. dataset, 4/20 tweets cite same NY Times URL via 3 shortened URLs.
• So, not independent.
• Other news sources: 2 cite Guardian, 1 BBC, 1 Der Spiegel, 1 WaPo, 1 Times of London
• No explicit Retweets
• No Implicit Retweets
• => 16 originating sources
• Compute distance between remaining sources
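The distance test for independence can be sketched with a bounded breadth-first search. This is an illustration only: the adjacency-dictionary layout and the function names are hypothetical, and treating follow relationships as undirected edges is an assumption the slide does not state.

```python
from collections import deque


def hops(graph, start, goal, limit):
    """Shortest hop count from start to goal, or None if it exceeds `limit`.

    `graph` maps each user to the set of users they are connected to.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == limit:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def independent(graph, a, b, min_distance=4):
    """True if the shortest path between a and b is at least `min_distance`."""
    # If b is not reachable within min_distance - 1 hops, the pair qualifies.
    return hops(graph, a, b, limit=min_distance - 1) is None
```

Bounding the search at three hops keeps the check tractable, since only the neighbourhood around each originating source has to be fetched rather than the whole Twitter graph.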
VIStology, Inc - Fusion 2011 17
Sameness of Content
• String-identical tweets are not independent. Implicit retweets:
– @BWJones: Tim Hetherington, photographer and 'Restrepo' co-director, killed in Misrata, Libya http://nyti.ms/dIm29T 4/20/2011 6:16:25 PM
– @Frieze_magazine: Tim Hetherington, photographer and 'Restrepo' co-director, killed in Misrata, Libya http://nyti.ms/dIm29T 4/20/2011 7:01:30 PM
• Custom Regexes to handle dead/alive
– Tweet =~ (<subject> .* (dead|died|killed|not alive|RIP)) &&
– Tweet !~ (<subject> .* (not (dead|died|killed))) => Dead
• Tim Hetherington, Restrepo director has been killed in Misurata
– OR: Tweet =~ (<subject> .* (alive|(not (killed|dead|died)))) &&
– Tweet !~ (<subject> .* (not alive|RIP)) => Alive
• E.g. C H still alive. (true positive) Wish T H were still alive (false positive)
• Misses: C H in serious condition ( |= alive)
• > 2x P vs not-P: Confirmed P, Improbable not-P; > 1.5x P vs not-P: Probably True P, Doubtful not-P; ~same P, not-P: Possibly True P, Possibly True not-P
• 435 Tweets report C H dead vs 7 C H alive: Confirmed: C H Dead; Improbable: C H not Dead.
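The dead/alive patterns and the ratio rule above can be sketched as follows. This is a hedged approximation: the function names are hypothetical, the word boundaries and case-insensitive matching are additions not present on the slide, and ratios at or below 1.5x are all treated as "roughly the same" because the slide leaves that range open. Returning 6 when no tweets are counted at all is likewise an assumption.

```python
import re


def classify(tweet, subject):
    """Return "dead", "alive" or None for a tweet mentioning `subject`."""
    s = re.escape(subject)
    flags = re.IGNORECASE
    dead = re.search(rf"{s}.*\b(dead|died|killed|not alive|RIP)\b", tweet, flags)
    not_dead = re.search(rf"{s}.*\bnot (dead|died|killed)\b", tweet, flags)
    alive = re.search(rf"{s}.*\b(alive|not (killed|dead|died))\b", tweet, flags)
    not_alive = re.search(rf"{s}.*\b(not alive|RIP)\b", tweet, flags)
    if dead and not not_dead:
        return "dead"
    if alive and not not_alive:
        return "alive"
    return None  # e.g. "in serious condition" is missed, as the slide notes


def credibility(p_count, not_p_count):
    """Return STANAG credibility codes for (P, not-P) from tweet counts."""
    if p_count == 0 and not_p_count == 0:
        return (6, 6)  # assumed: nothing to judge
    if p_count < not_p_count:
        # Apply the same rule with the roles of P and not-P swapped.
        not_p, p = credibility(not_p_count, p_count)
        return (p, not_p)
    if p_count > 2 * not_p_count:
        return (1, 5)  # Confirmed P, Improbable not-P
    if p_count > 1.5 * not_p_count:
        return (2, 4)  # Probably True P, Doubtful not-P
    return (3, 3)  # Possibly True for both
```

The slide's own false positive survives in this sketch: `classify("Wish T H were still alive", "T H")` still returns "alive", since pattern matching alone cannot see the counterfactual.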
VIStology, Inc - Fusion 2011 18
Recap: Algorithm
• Identify set of Tweets by Search API on name
• Classify into Dead/Alive content
• Calculate TunkRank on Users
– Discount false retweeters
• Calculate Source Independence
– Group same media URLs; retweets, implicit retweets
– Calculate distance between sources for joint network two hops out for each source.
• @NYTImesPhoto: An attack in Misurata, Libya today killed the photographer Tim Hetherington. 4/20/2011 7:11:15 PM
– TunkRank: 99th percentile; > 5 independent sources assert T H died; 0 alive
– <A: Completely Reliable, 1: Confirmed by Other Sources>
• @Cmovila: Sad news Tim Hetherington died in Misrata now when covering the front line. 4/20/2011 4:39:57 PM
– TunkRank: 0th percentile; > 5 independent sources assert T H died; 0 alive
– <E: Unreliable, 1: Confirmed by Other Sources>
• T H Alive: 5: Improbable
VIStology, Inc - Fusion 2011 19
Notional Architecture
[Figure: architecture diagram with the following components]
• Twitter Search API
• Tweet to RDF Conversion
• Message Classifier
• Twitter API
• Distance Calculator
• BaseVISor Inference Engine
• Tweets Augmented with STANAG 2022 Assessments
• TunkRank API
VIStology, Inc - Fusion 2011 20
Conclusions
• Treating all Tweets as equally legitimate is OK in non-adversarial, high-volume situations.
• As OSINT, Tweets need to be evaluated according to the STANAG 2022 rubric.
• We have outlined tractable ways to calculate reliability (TunkRank), credibility (sameness of content) and source (in)dependence.
• By converting Tweets to RDF, we can reason about them formally with an inference engine (BaseVISor).
• Future work: a large-scale demonstration of efficacy in distinguishing low-confidence death rumors from high-confidence death notices on Twitter.
VIStology, Inc - Fusion 2011 21
Questions?