Where is my tweet?
Henok Mengistu
Insight Data Engineering Fellow
Silicon Valley, Summer 2016
Motivation
But this number doesn't show how the tweet spreads.
A re-tweet graph could show that.
The data
● An original tweet
{ ….
  "text": "McAllen, Texas- 8 miles from U.S. - Mexico border ",
  "id": 743485792585146368,
  "user": { "id": 25073877, "screen_name": "realDonaldTrump" },
  "created_at": "Thu Jun 16 16:50:00 +0000 2016"
…. }
● A re-tweet
{ …
  "text": "McAllen, Texas- 8 miles from U.S. - Mexico border",
  "screen_name": "trinnitythomps1",
  "created_at": "Thu Jun 16 16:50:13 +0000 2016",
  "user": { "id": 2894078186 },
  "retweeted_status": { "id": 743485792585146368 }
… }
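As a rough sketch of how the two records above relate: the re-tweet carries the original tweet's id under "retweeted_status", so the two can be joined on that id and the time gap computed from "created_at". The helper names (parse_time, link_retweet) are hypothetical, and only the fields shown on the slide are used.

```python
from datetime import datetime

# Timestamp layout used in the "created_at" fields shown above.
TWITTER_TIME = "%a %b %d %H:%M:%S %z %Y"

original = {
    "text": "McAllen, Texas- 8 miles from U.S. - Mexico border ",
    "id": 743485792585146368,
    "user": {"id": 25073877, "screen_name": "realDonaldTrump"},
    "created_at": "Thu Jun 16 16:50:00 +0000 2016",
}

retweet = {
    "text": "McAllen, Texas- 8 miles from U.S. - Mexico border",
    "screen_name": "trinnitythomps1",
    "created_at": "Thu Jun 16 16:50:13 +0000 2016",
    "user": {"id": 2894078186},
    "retweeted_status": {"id": 743485792585146368},
}


def parse_time(value):
    """Parse a "created_at" string into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_TIME)


def link_retweet(rt, originals_by_id):
    """Return (original author id, retweeter id, delay in seconds), or None.

    Hypothetical helper: joins a re-tweet to its original on
    retweeted_status.id.
    """
    source = originals_by_id.get(rt.get("retweeted_status", {}).get("id"))
    if source is None:
        return None
    delay = parse_time(rt["created_at"]) - parse_time(source["created_at"])
    return source["user"]["id"], rt["user"]["id"], delay.total_seconds()


edge = link_retweet(retweet, {original["id"]: original})
print(edge)
```

For the two records on the slide, the re-tweet follows the original by 13 seconds.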
Data Pipeline
Watches out for a tweet by a specific user
Filters and sorts re-tweets and puts them in Redis
Builds graph
Creates a topic with tweets as messages
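The filter-and-sort stage can be illustrated with a minimal sketch. It is not the pipeline's actual code: a plain dict stands in for Redis, the key name and the two extra messages (user ids 111, 222, 333) are made up for illustration, and sorting by "created_at" is an assumption about what "sort" means here. In Redis itself, a sorted set scored by timestamp (ZADD) would be one plausible structure, but the slides do not say which was used.

```python
from datetime import datetime

TWITTER_TIME = "%a %b %d %H:%M:%S %z %Y"
WATCHED_TWEET_ID = 743485792585146368  # id of the original tweet on the slide


def is_retweet_of(message, tweet_id):
    """True if the message is a re-tweet of the watched tweet."""
    return message.get("retweeted_status", {}).get("id") == tweet_id


def sorted_retweets(messages, tweet_id):
    """Keep only re-tweets of tweet_id, ordered by creation time."""
    hits = [m for m in messages if is_retweet_of(m, tweet_id)]
    return sorted(
        hits, key=lambda m: datetime.strptime(m["created_at"], TWITTER_TIME)
    )


# Messages as they might arrive on the topic. The second record is from the
# slide; the others are hypothetical.
messages = [
    {"user": {"id": 222}, "created_at": "Thu Jun 16 16:50:20 +0000 2016",
     "retweeted_status": {"id": WATCHED_TWEET_ID}},
    {"user": {"id": 2894078186}, "created_at": "Thu Jun 16 16:50:13 +0000 2016",
     "retweeted_status": {"id": WATCHED_TWEET_ID}},
    {"user": {"id": 333}, "created_at": "Thu Jun 16 16:50:01 +0000 2016",
     "retweeted_status": {"id": 1}},  # re-tweet of some other tweet
    {"user": {"id": 111}, "created_at": "Thu Jun 16 16:50:05 +0000 2016",
     "retweeted_status": {"id": WATCHED_TWEET_ID}},
]

store = {}  # in-memory stand-in for Redis
key = f"retweets:{WATCHED_TWEET_ID}"  # hypothetical key name
store[key] = sorted_retweets(messages, WATCHED_TWEET_ID)

print([m["user"]["id"] for m in store[key]])
```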
● I am Henok
– Originally from Ethiopia
– Currently a PhD student at the University of Wyoming
● Working on Evolutionary Computation
– I like playing and watching soccer
– But skiing, not so much
Thank you!
Queries
● On the re-tweet graph
– Who is my audience?
● Geographically, social groups
– Betweenness centrality
● Who is relevant to spreading my tweet?
● Identify influential followers
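To make the betweenness-centrality query concrete, here is a small sketch on a toy re-tweet graph. The users and edges are hypothetical, the graph is treated as undirected and unweighted (an assumption; the slides do not say how edges are defined), and the scores come from Brandes' algorithm written with the standard library only.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm) for an
    unweighted, undirected graph given as {node: [neighbors]}."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                        # nodes in order of BFS discovery
        preds = {v: [] for v in adj}      # predecessors on shortest paths
        sigma = {v: 0 for v in adj}       # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:           # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}     # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical re-tweet edges: (user whose post was re-tweeted, re-tweeter).
edges = [("author", "a"), ("author", "b"), ("a", "c"), ("a", "d")]

adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

scores = betweenness(adj)
most_influential = max(scores, key=scores.get)
print(scores, most_influential)
```

In this toy graph the follower "a" scores higher than the author, because every path to "c" and "d" passes through "a". That is the sense in which betweenness can point to influential followers.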