Language of Politics on Twitter - 02 Twitter

Language of Politics on Twitter Summer School in AI American University Beirut June 16, 2015 Yelena Mejova @yelenamm Social Computing Group Qatar Computing Research Institute, HBKU

Upload: yelena-mejova

Post on 06-Aug-2015

Category: Data & Analytics


TRANSCRIPT

Page 1: Language of Politics on Twitter - 02 Twitter

Language of Politics on Twitter
Summer School in AI

American University Beirut
June 16, 2015

Yelena Mejova
@yelenamm
Social Computing Group
Qatar Computing Research Institute, HBKU

Page 2: Language of Politics on Twitter - 02 Twitter

political

twitter

analysis

Page 3: Language of Politics on Twitter - 02 Twitter

Users
individuals
news
organizations
bots
…

Page 4: Language of Politics on Twitter - 02 Twitter

#hashtags
word or phrase preceded by a hash mark (#), used within a message to identify a keyword or topic of interest and facilitate a search for it
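The definition above can be illustrated with a minimal sketch that pulls hashtags out of a tweet's text. The pattern is a simplification (a hash mark followed by word characters) and an assumption of this example; Twitter's own tokenization rules are more involved, and the helper name `extract_hashtags` is hypothetical.

```python
import re

# Simplified assumption: a hashtag is '#' followed by one or more word
# characters. In Python 3, \w is Unicode-aware, so non-Latin tags match too.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the hashtags in a tweet's text, lowercased, without the '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


print(extract_hashtags("Much respect to you! #Doha #qatar"))
```

Lowercasing is a design choice here: it lets #Doha and #doha count as the same topic when hashtags are aggregated.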

Page 5: Language of Politics on Twitter - 02 Twitter

links
all links are shortened by Twitter to the form t.co/…

shorter
control for spam, malware, phishing
collect clickthrough information
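Because every link appears as a t.co address in the tweet text, the original destination has to be recovered from the tweet's metadata. The sketch below assumes a tweet object whose `entities` carry `url` and `expanded_url` fields, as in the Twitter API's JSON; the helper name `expand_links` is hypothetical.

```python
def expand_links(tweet):
    """Map each shortened t.co URL in a tweet to its expanded form."""
    entities = tweet.get("entities", {})
    mapping = {}
    # Both ordinary links ("urls") and attached photos ("media") are shortened.
    for kind in ("urls", "media"):
        for item in entities.get(kind, []):
            mapping[item["url"]] = item.get("expanded_url", item["url"])
    return mapping


tweet = {
    "entities": {
        "urls": [],
        "media": [{
            "url": "http://t.co/wf8nc0C527",
            "expanded_url": "http://twitter.com/mohsin/status/598453736839598080/photo/1",
        }],
    }
}
print(expand_links(tweet))
```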

Page 6: Language of Politics on Twitter - 02 Twitter

MEME
an idea, behavior, or style that spreads from person to person within a culture

Richard Dawkins

MEME

Page 7: Language of Politics on Twitter - 02 Twitter

Monthly active users: 302 million (4/28/2015)

Total number of Twitter registered users: “about a billion” (9/16/13)

Unique monthly visitors to Twitter.com (desktop): 36 million (10/3/13)

Daily active Twitter users: 100 million (10/3/13)

Number of Twitter accounts that have ever sent a tweet: 550 million (4/14/14)

Page 8: Language of Politics on Twitter - 02 Twitter
Page 9: Language of Politics on Twitter - 02 Twitter

TWITTER

TWITTER RESEARCH

Google Trends

Page 10: Language of Politics on Twitter - 02 Twitter

users
tweets

relationships

Page 11: Language of Politics on Twitter - 02 Twitter

Twitter API

Page 12: Language of Politics on Twitter - 02 Twitter
Page 13: Language of Politics on Twitter - 02 Twitter

https://dev.twitter.com/overview/documentation

Page 14: Language of Politics on Twitter - 02 Twitter

users

Page 15: Language of Politics on Twitter - 02 Twitter
Page 16: Language of Politics on Twitter - 02 Twitter
Page 17: Language of Politics on Twitter - 02 Twitter

try it yourself

• go to https://apigee.com/console/twitter
• select OAuth1 from Authentication and log in using your Twitter account

Page 18: Language of Politics on Twitter - 02 Twitter

Select api.twitter.com/1.1 from Service

Click on the […] on the left to see a list of API methods

Page 19: Language of Politics on Twitter - 02 Twitter

• select […]
• enter your Twitter handle into screen_name and click […]

Page 20: Language of Politics on Twitter - 02 Twitter
Page 21: Language of Politics on Twitter - 02 Twitter

http://jsonviewer.stack.hu/

Page 22: Language of Politics on Twitter - 02 Twitter

http://www.faceplusplus.com/demo-detect/

More info from picture

Page 23: Language of Politics on Twitter - 02 Twitter

questions

where are you from?
are you male or female?
what job do you have?

when did you join?
how active are you?

what do you look like?
are you a bot?
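Two of these questions, when a user joined and how active they are, can be answered directly from the user object. The sketch below assumes the `created_at` and `statuses_count` fields of the API's user JSON and an English locale for parsing day and month names; the function name `account_activity` is hypothetical.

```python
from datetime import datetime

# Format of the "created_at" strings in the API's JSON,
# e.g. "Thu Feb 22 11:11:01 +0000 2007" (timestamps are in UTC).
TWITTER_TIME = "%a %b %d %H:%M:%S +0000 %Y"


def account_activity(user, now):
    """Return (join date, average tweets per day) for a user object."""
    joined = datetime.strptime(user["created_at"], TWITTER_TIME)
    days = max((now - joined).days, 1)  # avoid dividing by zero for new accounts
    return joined, user["statuses_count"] / days


user = {"created_at": "Thu Feb 22 11:11:01 +0000 2007", "statuses_count": 10756}
joined, per_day = account_activity(user, now=datetime(2015, 5, 13, 11, 44, 24))
print(joined.date(), round(per_day, 2))
```

A lifetime average hides bursts and quiet periods, so it is only a rough indicator of activity.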

Page 24: Language of Politics on Twitter - 02 Twitter

tweets

Page 25: Language of Politics on Twitter - 02 Twitter
Page 26: Language of Politics on Twitter - 02 Twitter
Page 27: Language of Politics on Twitter - 02 Twitter
Page 28: Language of Politics on Twitter - 02 Twitter

#!/usr/bin/env python
# -*- coding: utf-8 -*-

# Requires tweepy < 4.0 (StreamListener was removed in tweepy 4.0).
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

# Go to http://dev.twitter.com and create an app.
# The consumer key and secret will be generated for you afterwards.
consumer_key = 'YOUR_CONSUMER_KEY'
consumer_secret = 'YOUR_CONSUMER_SECRET'

# After the step above, you will be redirected to your app's page.
# Create an access token under the "Your access token" section.
access_token = 'YOUR_ACCESS_TOKEN'
access_token_secret = 'YOUR_ACCESS_TOKEN_SECRET'


class StdOutListener(StreamListener):
    """A listener handles tweets that are received from the stream.

    This is a basic listener that just prints received tweets to stdout.
    """

    def on_data(self, data):
        # Drop the final character of the raw message before printing.
        print(data[:-1])
        return True

    def on_error(self, status):
        print(status)

Querying public stream using Python (1)
https://tinyurl.com/aiss15-gettweets

Page 29: Language of Politics on Twitter - 02 Twitter

def auto_restart_stream(auth, listener, l_keywords):
    # Reconnect whenever the stream drops.
    while True:
        try:
            sapi = Stream(auth, listener)
            sapi.filter(track=l_keywords)
        except Exception:
            # print('Restarting ;)')
            continue


if __name__ == '__main__':
    # "Qatar" in several languages and scripts.
    keywords = [u'Cátar', u'Catar', u'Katar', u'Katara', u'Kataras',
                u'Katari', u'Kataro', u'Qadar', u'Qatar', u'कतर', u'ਕਤਰ',
                u'卡塔尔', u'قطر', u'卡塔爾', u'카타르', u'קטאר', u'कतार',
                u'કતાર', u'కతర్', u'ກາຕາ', u'カタール', u'Κατάρ', u'Катар',
                u'Қатар', u'Կատար', u'কাতার', u'ಕತಾರ್', u'ഖത്തർ', u'කටාර්',
                u'กาตาร์', u'קַאטַאר', u'கத்தார்', u'ប្រទេសកាតា',
                u'ကာတာနိုင်ငံ']
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    auto_restart_stream(auth, l, keywords)

Querying public stream using Python (2)
https://tinyurl.com/aiss15-gettweets
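The restart loop above reconnects immediately after every failure. A common refinement, sketched here as an assumption rather than as part of the original script, is to wait longer after each failed attempt (exponential backoff) and to give up after a few tries. The names `run_with_backoff`, `connect`, and the delay parameters are hypothetical; `connect` stands for any callable that opens the stream.

```python
import time


def run_with_backoff(connect, max_retries=5, base_delay=1.0,
                     max_delay=60.0, sleep=time.sleep):
    """Call connect(), doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(max_retries):
        try:
            return connect()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: let the caller see the error
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Demonstration with a fake connection that fails twice, then succeeds.
# A list's append method stands in for time.sleep so nothing actually waits.
attempts = {"count": 0}


def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise IOError("connection dropped")
    return "connected"


waits = []
print(run_with_backoff(flaky_connect, sleep=waits.append), waits)
```

With the streaming script, `connect` could be a small function that creates the `Stream` and calls `filter(track=...)`.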

Page 30: Language of Politics on Twitter - 02 Twitter

{"created_at":"Wed May 13 11:44:24 +0000 2015","id":598453736839598080,"id_str":"598453736839598080","text":"Don't get star struck often but I like this guy @Mo_Farah you the man boss! Much respect to you! #Doha #qatar http:\/\/t.co\/wf8nc0C527","source":"\u003ca href=\"http:\/\/twitter.com\/download\/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":788413,"id_str":"788413","name":"Mohsin Ali","screen_name":"mohsin","location":"Doha, Qatar","url":"http:\/\/mohsinali.com","description":"Digital story telling, infogrpahics, interactives, R&D, Emerging Technologies, Future Trends, Innovation @ajlabs, Global Nomad, Likes Maps. LBA, DHA, BHA, DOH","protected":false,"verified":false,"followers_count":2422,"friends_count":645,"listed_count":69,"favourites_count":889,"statuses_count":10756,"created_at":"Thu Feb 22 11:11:01 +0000 2007","utc_offset":10800,"time_zone":"Riyadh","geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/pbs.twimg.com\/profile_background_images\/462946198211407873\/xWaKYtpF.jpeg","profile_background_image_url_https":"https:\/\/pbs.twimg.com\/profile_background_images\/462946198211407873\/xWaKYtpF.jpeg","profile_background_tile":true,"profile_link_color":"0084B4","profile_sidebar_border_color":"FFFFFF","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/1249217364\/n504379828_3076_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/1249217364\/n504379828_3076_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/788413\/1399210132","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":{"type":"Point","coordinates":[25.316197,51.498302]},"coordinates":{"type":"Point","coordinates":[51.498302,25.316197]},"place":{"id":"0181f32937df0de8","url":"https:\/\/api.twitter.com\/1.1\/geo\/id\/0181f32937df0de8.json","place_type":"admin","name":"Doha","full_name":"Doha, Qatar","country_code":"QA","country":"\u062f\u0648\u0644\u0629 \u0642\u0637\u0631","bounding_box":{"type":"Polygon","coordinates":[[[51.4477039,25.2216],[51.4477039,25.4263938],[51.630581,25.4263938],[51.630581,25.2216]]]},"attributes":{}},"contributors":null,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"Doha","indices":[97,102]},{"text":"qatar","indices":[103,109]}],"trends":[],"urls":[],"user_mentions":[{"screen_name":"Mo_Farah","name":"Mo Farah","id":83855918,"id_str":"83855918","indices":[48,57]}],"symbols":[],"media":[{"id":598453717596119040,"id_str":"598453717596119040","indices":[110,132],"media_url":"http:\/\/pbs.twimg.com\/media\/CE4ifEPUIAAhCsG.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/CE4ifEPUIAAhCsG.jpg","url":"http:\/\/t.co\/wf8nc0C527","display_url":"pic.twitter.com\/wf8nc0C527","expanded_url":"http:\/\/twitter.com\/mohsin\/status\/598453736839598080\/photo\/1","type":"photo","sizes":{"small":{"w":340,"h":453,"resize":"fit"},"medium":{"w":600,"h":800,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"},"large":{"w":768,"h":1024,"resize":"fit"}}}]},"extended_entities":{"media":[{"id":598453717596119040,"id_str":"598453717596119040","indices":[110,132],"media_url":"http:\/\/pbs.twimg.com\/media\/CE4ifEPUIAAhCsG.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/CE4ifEPUIAAhCsG.jpg","url":"http:\/\/t.co\/wf8nc0C527","display_url":"pic.twitter.com\/wf8nc0C527","expanded_url":"http:\/\/twitter.com\/mohsin\/status\/598453736839598080\/photo\/1","type":"photo","sizes":{"small":{"w":340,"h":453,"resize":"fit"},"medium":{"w":600,"h":800,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"},"large":{"w":768,"h":1024,"resize":"fit"}}}]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en","timestamp_ms":"1431517464252"}

https://tinyurl.com/aiss15-tweetjson
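One detail of the tweet object above is easy to miss: `geo` lists the point as [latitude, longitude], while `coordinates` uses [longitude, latitude]. The sketch below loads a trimmed copy of the sample tweet (only a few fields are kept, which is a simplification of this example) and reads the location, hashtags, and mentions.

```python
import json

# Trimmed copy of the sample tweet: only the fields used below are kept.
raw = """{
  "geo": {"type": "Point", "coordinates": [25.316197, 51.498302]},
  "coordinates": {"type": "Point", "coordinates": [51.498302, 25.316197]},
  "entities": {
    "hashtags": [{"text": "Doha"}, {"text": "qatar"}],
    "user_mentions": [{"screen_name": "Mo_Farah"}]
  }
}"""

tweet = json.loads(raw)

# "coordinates" is longitude first; "geo" is latitude first.
lon, lat = tweet["coordinates"]["coordinates"]
geo_lat, geo_lon = tweet["geo"]["coordinates"]

hashtags = [h["text"] for h in tweet["entities"]["hashtags"]]
mentions = [m["screen_name"] for m in tweet["entities"]["user_mentions"]]

print(lat, lon, hashtags, mentions)
```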

Page 31: Language of Politics on Twitter - 02 Twitter

http://jsonviewer.stack.hu/

JSON Tweet Object

Page 32: Language of Politics on Twitter - 02 Twitter

JSON Tweet Object
http://jsonviewer.stack.hu/

Page 33: Language of Politics on Twitter - 02 Twitter

JSON Tweet Object
http://jsonviewer.stack.hu/

Page 34: Language of Politics on Twitter - 02 Twitter

JSON Tweet Object
http://jsonviewer.stack.hu/

Page 35: Language of Politics on Twitter - 02 Twitter

import json

# One raw tweet (JSON) per line in, one tab-separated record per line out.
with open("rawTweets.txt", encoding="utf-8") as fin, \
     open("parsedTweets.txt", "w", encoding="utf-8") as fout:
    for line in fin:
        line = line.strip()
        if not line:
            continue
        jdict = json.loads(line)

        # Keep only tweets that carry some location information.
        if jdict.get('coordinates') is None and jdict.get('place') is None:
            continue

        # Coordinates are stored as [longitude, latitude]. Both fields stay
        # empty when only 'place' is set, so every row has seven columns.
        longitude = latitude = ''
        if jdict.get('coordinates') is not None:
            longitude, latitude = jdict['coordinates']['coordinates']

        # Collapse line breaks and tabs so each tweet stays on one line.
        text = (jdict['text'].replace('\r\n', ' ')
                             .replace('\n', ' ')
                             .replace('\t', ' '))

        fields = [
            str(longitude),
            str(latitude),
            str(jdict['id']),              # Tweet id
            jdict['user']['screen_name'],  # User screen name
            str(jdict['timestamp_ms']),    # Timestamp
            jdict['user']['lang'],         # User's language
            text,                          # Text
        ]
        fout.write('\t'.join(fields) + '\n')

Extracting individual fields from JSON
https://tinyurl.com/aiss15-cleanjson

Page 36: Language of Politics on Twitter - 02 Twitter

Tab Separated Value (TSV) format
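A TSV file like the one written by the parsing script can be read back with Python's standard csv module. This is a minimal sketch: the column names in `COLUMNS` and the function name `read_tweets_tsv` are assumptions chosen to match the order in which the script writes its fields.

```python
import csv
import io

# Hypothetical column names, in the order the parsing script writes them.
COLUMNS = ["longitude", "latitude", "tweet_id", "screen_name",
           "timestamp_ms", "user_lang", "text"]


def read_tweets_tsv(handle):
    """Read tab-separated tweet records into a list of dictionaries."""
    # QUOTE_NONE keeps quotation marks inside tweet text as ordinary characters.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(zip(COLUMNS, row)) for row in reader if len(row) == len(COLUMNS)]


sample = io.StringIO(
    "51.498302\t25.316197\t598453736839598080\tmohsin\t1431517464252\ten\t"
    "Don't get star struck often #Doha #qatar\n"
)
rows = read_tweets_tsv(sample)
print(rows[0]["screen_name"], rows[0]["text"])
```

Rows with an unexpected number of fields are skipped rather than raising an error, which is convenient for exploration but can hide malformed input.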

Page 37: Language of Politics on Twitter - 02 Twitter

Language Model

http://tweetcloud.icodeforlove.com/

workshop 25
twitter 20
religion 17
interaction 12
online 12
dyad 9
research 9
accepted 7
…
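Word counts like these are the raw material of a simple unigram language model. The sketch below counts words across a list of tweet texts and turns the counts into relative frequencies. The tokenizer is a deliberately naive assumption (lowercase runs of letters and apostrophes), and the function names are hypothetical.

```python
import re
from collections import Counter


def word_counts(texts):
    """Count lowercase word tokens across a collection of tweet texts."""
    counts = Counter()
    for text in texts:
        # Naive tokenizer: URLs, mentions, and hashtags are not treated specially.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def unigram_model(counts):
    """Turn raw counts into relative frequencies."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


counts = word_counts(["Twitter workshop", "workshop on religion"])
print(counts.most_common(2))
print(unigram_model(counts)["workshop"])
```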

Page 38: Language of Politics on Twitter - 02 Twitter

Activity

http://www.tweetails.com/
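An activity profile of this kind can be approximated from the `timestamp_ms` field of each tweet. The sketch below buckets tweets by hour of day in UTC; converting to the user's local time zone is left out, and the function name `tweets_per_hour` is hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def tweets_per_hour(timestamps_ms):
    """Count tweets per hour of day (UTC) from millisecond timestamps."""
    hours = Counter()
    for ts in timestamps_ms:
        moment = datetime.fromtimestamp(int(ts) / 1000, tz=timezone.utc)
        hours[moment.hour] += 1
    return hours


# The first value is the timestamp_ms of the sample tweet (11:44:24 UTC).
profile = tweets_per_hour(["1431517464252", "1431517500000", "1431530000000"])
print(sorted(profile.items()))
```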

Page 39: Language of Politics on Twitter - 02 Twitter

Mentions

http://www.tweetails.com/

Page 40: Language of Politics on Twitter - 02 Twitter

questions

what are you interested in?
how do you eat/sleep/work/hang out?

how happy are you?
what political opinions do you have?
what outside sources do you link to?

what new emerging topics are you mentioning?
how do you behave?

are you a bot?

Page 41: Language of Politics on Twitter - 02 Twitter

network

Page 42: Language of Politics on Twitter - 02 Twitter

network
nodes
edges

Page 43: Language of Politics on Twitter - 02 Twitter

User Network

Page 44: Language of Politics on Twitter - 02 Twitter

User Network

Page 45: Language of Politics on Twitter - 02 Twitter

Follower Network

Page 46: Language of Politics on Twitter - 02 Twitter

Mention Network
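A mention network can be assembled from tweet objects by drawing a directed edge from the author of each tweet to every user it mentions. The sketch below assumes the `user.screen_name` and `entities.user_mentions` fields of the API's tweet JSON; the function names are hypothetical, and in-degree is used only as a rough first indicator of who receives attention.

```python
from collections import Counter


def mention_edges(tweets):
    """Count directed edges (author -> mentioned user) over a set of tweets."""
    edges = Counter()
    for tweet in tweets:
        source = tweet["user"]["screen_name"]
        for mention in tweet.get("entities", {}).get("user_mentions", []):
            edges[(source, mention["screen_name"])] += 1
    return edges


def in_degree(edges):
    """Total number of mentions each user receives."""
    degree = Counter()
    for (_, target), weight in edges.items():
        degree[target] += weight
    return degree


tweets = [
    {"user": {"screen_name": "mohsin"},
     "entities": {"user_mentions": [{"screen_name": "Mo_Farah"}]}},
    {"user": {"screen_name": "yelenamm"},
     "entities": {"user_mentions": [{"screen_name": "Mo_Farah"},
                                    {"screen_name": "mohsin"}]}},
]
edges = mention_edges(tweets)
print(in_degree(edges).most_common())
```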

Page 47: Language of Politics on Twitter - 02 Twitter

Mention Network
for hashtags

Page 48: Language of Politics on Twitter - 02 Twitter

questions

how influential are you?
how influential are your connections?

who influences you?
what are people around you like?

do you bring together different communities?
how fast will you know about a piece of news?

are you an opinion leader?
are you a bot?

Page 49: Language of Politics on Twitter - 02 Twitter

resources

Page 50: Language of Politics on Twitter - 02 Twitter

https://dev.twitter.com/overview/documentation

Page 51: Language of Politics on Twitter - 02 Twitter

https://apigee.com

Page 52: Language of Politics on Twitter - 02 Twitter

try it in your favorite language

https://dev.twitter.com/overview/api/twitter-libraries

Page 53: Language of Politics on Twitter - 02 Twitter

next

using Twitter data for real-world political speech mining