Using Topic Models for Twitter Hashtag Recommendation

Upload: fgodin

Post on 15-Jan-2015

DESCRIPTION

Presentation given at the Making Sense of Microposts Workshop at the World Wide Web Conference 2013

TRANSCRIPT

Page 1: Using Topic Models for Twitter hashtag recommendation

ELIS – Multimedia Lab

Fréderic Godin, Viktor Slavkovikj, Wesley De Neve, Benjamin Schrauwen and Rik Van de Walle

Using Topic Models for Twitter Hashtag Recommendation

Multimedia Lab, Ghent University – iMinds, Belgium

Reservoir Lab, Ghent University, Belgium

Image and Video Systems Lab, KAIST, South Korea

Page 2: Using Topic Models for Twitter hashtag recommendation

Making Sense of Microposts Workshop @ World Wide Web Conference 2013

Introduction (1)

Indexing

Search

Linking

General Topic

Memes

Grouping

Information retrieval

Page 3: Using Topic Models for Twitter hashtag recommendation

Introduction (2)

About 10% of tweets contain a hashtag

3% of the hashtags are used more than 5 times

Indexing

Search

Linking

General Topic

Memes

Grouping

Page 4: Using Topic Models for Twitter hashtag recommendation

Goal

Suggest keywords that capture the general topic of a tweet and that could be used as hashtags

Promote hashtags for effective indexing

Allow for effective search of tweets through hashtags

Reduce the use of sparse hashtags

Page 5: Using Topic Models for Twitter hashtag recommendation

Architectural overview

Tweet

Basic filter

Language identification

Topic distribution

Hashtag suggestion

Hashtagged tweet

Page 6: Using Topic Models for Twitter hashtag recommendation

Basic filter

Clean up the tweet: remove URLs, special HTML entities, digits, punctuation, the hash character, …

During training:

Remove tweets with just one word

Remove retweets
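The cleanup and training-time filtering on this slide could be sketched as follows. This is a minimal sketch assuming regular-expression cleanup; the names `clean_tweet` and `keep_for_training`, the exact patterns, and the "starts with RT @" retweet test are illustrative assumptions and are not taken from the slides.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
ENTITY_RE = re.compile(r"&#?\w+;")      # e.g. &amp; or &#39;
DIGIT_RE = re.compile(r"\d+")
PUNCT_RE = re.compile(r"[^\w\s]|_")     # punctuation, including '#'


def clean_tweet(text):
    """Strip URLs, HTML entities, digits and punctuation from a tweet."""
    for pattern in (URL_RE, ENTITY_RE, DIGIT_RE, PUNCT_RE):
        text = pattern.sub(" ", text)
    # Collapse the whitespace left behind by the substitutions.
    return " ".join(text.split())


def keep_for_training(raw_tweet):
    """Training-time filter: drop retweets and one-word tweets."""
    # Assumption: a retweet is a tweet that starts with "RT @".
    if raw_tweet.lstrip().startswith("RT @"):
        return False
    return len(clean_tweet(raw_tweet).split()) > 1


print(clean_tweet("Please RT!! sign Bernie Sanders petition for the fiscal cliff! http://.."))
```

URLs are removed first so that their punctuation does not leave stray fragments behind once the punctuation pattern runs.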

Page 7: Using Topic Models for Twitter hashtag recommendation

Language identification

Why: We need to build a language-dependent topic model.

Goal: Build an unsupervised classifier that discriminates between English and non-English tweets.

How: Naive Bayes and the Expectation-Maximization algorithm with character n-gram features

Result: Evaluation on a test set of 1000 randomly selected tweets

            Lui & Baldwin (LangID.py)    Our algorithm
Precision   97.9%                        97.0%
Recall      91.8%                        97.8%
F1          94.8%                        97.4%
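As a quick consistency check on the reported scores, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the precision and recall values on this slide; the function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall in percent, as reported on the slide.
langid_f1 = f1_score(97.9, 91.8)
ours_f1 = f1_score(97.0, 97.8)

print(round(langid_f1, 1))  # 94.8
print(round(ours_f1, 1))    # 97.4
```

Both values match the F1 figures on the slide.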

Page 8: Using Topic Models for Twitter hashtag recommendation

Calculating the topic distribution

Idea: Find the general topic(s) of a tweet

How: Using Latent Dirichlet Allocation to find the topic distribution in an unsupervised manner

Training: 1.8 million tweets pre-filtered on 4000 keywords

200 topics, α=0.1, β=0.1

Example: “Please RT!! sign Bernie Sanders petition for the fiscal cliff! http://..”

Topic index: 0 1 2 3 … 57 … 199

Topic distribution: [0.1; 0.0; 0.0; 0.0; …; 0.8; …; 0.05]

Topic 57:

1. Fiscal

2. Political

3. President

…
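To illustrate how a per-tweet topic distribution can be obtained with LDA, here is a toy sketch using collapsed Gibbs sampling. The slides do not say which inference method or library was used, so the sampler, the function name `lda_gibbs`, and the three-tweet corpus are assumptions; only the defaults α=0.1 and β=0.1 come from the slide, and the toy run uses 2 topics rather than 200.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.1, iterations=100, seed=0):
    """Toy collapsed Gibbs sampler; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # topic counts per document
    topic_word = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed topic distribution per document.
    thetas = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return thetas, topic_word


toy_docs = [
    ["fiscal", "cliff", "petition"],
    ["school", "period", "today"],
    ["fiscal", "president", "policy"],
]
thetas, topic_word = lda_gibbs(toy_docs, num_topics=2)
print(thetas[0])
```

Each entry of `thetas` plays the role of the vector shown in the example: one probability per topic, summing to 1.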

Page 9: Using Topic Models for Twitter hashtag recommendation

Hashtag suggestion (1)

Idea: Suggest a number of hashtags based on the topic distribution of the tweet

How: Sample the topic distribution and suggest the top ranked keywords

Example:

Tweet: “Yay, we got sixth period today”
Suggested hashtags: school, business, light, time, period

Tweet: “Please RT!! Sign Bernie Sanders petition for the fiscal cliff! http://..”
Suggested hashtags: fiscal, political, traffic, president, policy

Tweet: “comfort, elegance, prettiness”
Suggested hashtags: little, good, love, relationship, god
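One possible reading of "sample the topic distribution and suggest the top ranked keywords" is sketched below: draw topics in proportion to their probability and take the next highest-ranked unused keyword of each drawn topic. The slides do not spell out the procedure, so this interpretation, the function name `suggest_hashtags`, and the two-topic example data are assumptions.

```python
import random


def suggest_hashtags(theta, topic_keywords, n=5, seed=None):
    """theta: topic probabilities; topic_keywords: ranked keywords per topic."""
    rng = random.Random(seed)
    next_rank = [0] * len(theta)   # next unused keyword rank per topic
    suggestions = []
    while len(suggestions) < n:
        # Topics that have non-zero probability and keywords left.
        candidates = [
            k for k, p in enumerate(theta)
            if p > 0 and next_rank[k] < len(topic_keywords[k])
        ]
        if not candidates:
            break
        k = rng.choices(candidates, weights=[theta[c] for c in candidates])[0]
        word = topic_keywords[k][next_rank[k]]
        next_rank[k] += 1
        if word not in suggestions:
            suggestions.append(word)
    return suggestions


# Hypothetical two-topic model; all probability mass is on topic 1.
keywords = [["school", "time"], ["fiscal", "political", "president"]]
print(suggest_hashtags([0.0, 1.0], keywords, n=2))  # ['fiscal', 'political']
```

With probability mass spread over several topics, the suggestions mix keywords from those topics, which may explain a loosely related suggestion such as "traffic" in the second example.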

Page 10: Using Topic Models for Twitter hashtag recommendation

Hashtag suggestion (2)

Figure: percentage of tweets (%, axis from 0 to 35) against the number of correctly suggested hashtags (0 to 10), with one series for 5 hashtags and one for 10 hashtags. Evaluation of 100 tweets.
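The quantity plotted on this slide, the percentage of tweets per number of correctly suggested hashtags, can be tallied as in the sketch below. The function name `correct_count_distribution` and the sample counts are hypothetical; the actual per-tweet judgments are not given in the slides.

```python
from collections import Counter


def correct_count_distribution(correct_counts, max_correct):
    """Percentage of tweets for each number of correct suggestions."""
    if not correct_counts:
        raise ValueError("need at least one evaluated tweet")
    tally = Counter(correct_counts)
    total = len(correct_counts)
    return {k: 100.0 * tally[k] / total for k in range(max_correct + 1)}


# Hypothetical judgments for four tweets with up to two correct suggestions.
print(correct_count_distribution([0, 1, 1, 2], max_correct=2))
# {0: 25.0, 1: 50.0, 2: 25.0}
```

Running it once with `max_correct=5` and once with `max_correct=10` would give the two series of the chart.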

Page 11: Using Topic Models for Twitter hashtag recommendation

Conclusions and Future Work

We built a hashtag recommendation system:

Suggests general keywords

Unsupervised

In the future:

Use more context information: semantic web, social graph, …

Adopt a hybrid approach between general and specific hashtags

Page 12: Using Topic Models for Twitter hashtag recommendation

#Questions @frederic_godin