Exploring Dynamics and Semantics of User Interests for User Modeling on Twitter for Link Recommendations

Guangyuan Piao, John G. Breslin
Unit for Social Semantics
12th International Conference on Semantic Systems
Leipzig, Germany, 12-15 September 2016



1/3 of users seek medical information and over 50% of users consume news on social networks.

Facebook and Twitter together generate more than 5 billion microblogs per day.

[SOURCE] Semantic Filtering for Social Data, Amit et al., Internet Computing’16

Background – User Modeling

[Figure: content enrichment, analysis & user modeling → interest profile (?) → personalized content recommendations]

(How) can we infer user interest profiles that support the content recommender?

[SOURCE] Analyzing User Modeling on Twitter for Personalized News Recommendations, UMAP’11

Background – User Modeling

Representation of User Interest

Bag of Words: users' interests are represented as a set of words

Topic Modeling: topics are co-occurring words; a document is a mixture of topics

• focus on words
• assumption: a single doc contains rich information
• cannot provide semantic relationships among words

Bag of Concepts: users' interests are represented as a set of concepts

• can exploit background knowledge about concepts for interest propagation

Background – User Modeling

Bag-of-Concepts

dbpedia:The_Black_Keys

dbpedia:Eagles_of_Death_Metal

dbpedia:The_Wombats

Background – User Modeling

Weighting Scheme: importance of a concept for a user

Concept Frequency (CF)

dbpedia:The_Black_Keys (3)

dbpedia:Eagles_of_Death_Metal (5)

dbpedia:The_Wombats (2)
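Concept Frequency can be illustrated with a short sketch. It assumes each tweet has already been annotated with the DBpedia concepts it mentions; the function name `concept_frequency` and the sample tweets are hypothetical and chosen only so that the counts match the slide.

```python
from collections import Counter

def concept_frequency(tweet_concepts):
    """Count how often each concept occurs across a user's annotated tweets."""
    counts = Counter()
    for concepts in tweet_concepts:
        counts.update(concepts)
    return dict(counts)

# Hypothetical annotated tweets of one user (one list of concepts per tweet).
tweets = [
    ["dbpedia:The_Black_Keys", "dbpedia:Eagles_of_Death_Metal"],
    ["dbpedia:The_Black_Keys", "dbpedia:Eagles_of_Death_Metal"],
    ["dbpedia:The_Black_Keys", "dbpedia:Eagles_of_Death_Metal", "dbpedia:The_Wombats"],
    ["dbpedia:Eagles_of_Death_Metal", "dbpedia:The_Wombats"],
    ["dbpedia:Eagles_of_Death_Metal"],
]

cf_profile = concept_frequency(tweets)
```

With these sample tweets, `cf_profile` holds the weights 3, 5 and 2 for the three bands.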

Related Work – Semantics

Semantic Interest Propagation

• different structures of DBpedia beyond category information are not fully explored

[Figure: DBpedia graph fragment with the entities dbpedia:The_Black_Keys, dbpedia:The_Wombats and dbpedia:Indie_rock, the category dbc:Rock_music_duos, and the edge labels subject and genre]

Related Work – Dynamics

Temporal Dynamics of User Interests

• assumption: user interests might change over time
• no comparative evaluation over different methods

[Figure: timeline of historical user-generated content (UGC) illustrating a long-term user profile, a short-term user profile (e.g., the last two weeks of UGC) and an interest decay function]

Definition

Concepts

• entities, categories and classes from DBpedia which can be used for representing user interests

dbpedia:The_Black_Keys (entity)
  subject → dbc:Rock_music_duos (category)
  type → yago:BluesRockMusicians (class)

CF-IDF: Concept Frequency – Inverse Document Frequency
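The slide names CF-IDF but does not spell out a formula. The sketch below assumes the usual TF-IDF analogue, in which each user plays the role of a document: weight = CF × log(M / m_c), with M the number of users and m_c the number of users whose profile contains concept c. The function name and the three sample profiles are hypothetical.

```python
import math

def cf_idf(user_cf, all_profiles):
    """Re-weight one user's CF profile by how rare each concept is among users.

    Assumed formula: cf * log(M / m_c), where M is the number of users and
    m_c is the number of users whose profile contains the concept.
    """
    total_users = len(all_profiles)
    weights = {}
    for concept, cf in user_cf.items():
        users_with = sum(1 for profile in all_profiles if concept in profile)
        users_with = max(users_with, 1)  # guard in case user_cf is not in all_profiles
        weights[concept] = cf * math.log(total_users / users_with)
    return weights

# Hypothetical CF profiles of three users.
alice = {"dbpedia:The_Black_Keys": 3,
         "dbpedia:Eagles_of_Death_Metal": 5,
         "dbpedia:The_Wombats": 2}
bob = {"dbpedia:Eagles_of_Death_Metal": 1}
carol = {"dbpedia:Eagles_of_Death_Metal": 2, "dbpedia:The_Wombats": 1}

alice_cf_idf = cf_idf(alice, [alice, bob, carol])
```

In this toy data every user mentions dbpedia:Eagles_of_Death_Metal, so its weight drops to zero even though it has the highest raw count, while the concept only Alice mentions becomes her strongest interest.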

Aim of Work

• Weighting Scheme: CF-IDF vs. CF
• Semantics: explore different structures of DBpedia
• Dynamics: comparative study on different methods

We propose and evaluate user modeling strategies using the best-performing strategies in the three dimensions.

User Modeling Framework

[Figure: entity-based user profiles are turned into user profiles P(u) through three components]

• weighting scheme
• semantic interest propagation: category, …, category & property
• temporal dynamics: Ahmed, …, Orlandi

A concept-based profile P(u):

Google    Category:Smartphones    …    iPhone
0.09      0.12                    …    0.08

Semantic Interest Propagation

Core propagation strategies

• category-based
  SP: sub-pages of the category
  SC: sub-categories of the category
• class-based
  SP’: sub-pages of the class
  SC’: sub-classes of the class

Semantic Interest Propagation

Core propagation strategies

• property-based

P: property count in DBpedia graph

Combine different semantics
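The slides list the quantities used for propagation (SP, SC, P) but not how they are combined. The sketch below is one plausible reading, not the authors' formula: a propagated concept inherits the source entity's weight, discounted more strongly the more sub-pages, sub-categories or property occurrences it has. The discount shape, the `alpha` parameter, the `e +` smoothing inside the logarithm, and all counts in the example are assumptions made for illustration.

```python
import math

def log_discount(count):
    """Assumed discount: shrinks as the count grows; equals 1 when count is 0."""
    return 1.0 / math.log(math.e + count)

def category_discount(sub_pages, sub_categories, alpha=2.0):
    # Broad categories (many sub-pages SP and sub-categories SC) are discounted more.
    return (1.0 / alpha) * log_discount(sub_pages) * log_discount(sub_categories)

def property_discount(property_count, alpha=2.0):
    # Properties that occur often in the graph (large P) are discounted more.
    return (1.0 / alpha) * log_discount(property_count)

def propagate(profile, links, discounts):
    """Extend an entity-based profile with linked concepts at a discounted weight."""
    extended = dict(profile)
    for source, weight in profile.items():
        for target in links.get(source, []):
            extended[target] = extended.get(target, 0.0) + weight * discounts[target]
    return extended

# Hypothetical links and counts, combining category and property information.
profile = {"dbpedia:The_Black_Keys": 3.0}
links = {"dbpedia:The_Black_Keys": ["dbc:Rock_music_duos", "dbpedia:Indie_rock"]}
discounts = {
    "dbc:Rock_music_duos": category_discount(sub_pages=100, sub_categories=0),
    "dbpedia:Indie_rock": property_discount(property_count=5000),
}

extended_profile = propagate(profile, links, discounts)
```

Merging the category and property discounts into one `discounts` table is how this sketch expresses the "category & property" combination; the original entity weights are left untouched.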


Dynamics of User Interests

Interest decay functions

• Long-term(Orlandi) [SEMANTiCS]
• Long-term(Ahmed) [SIGKDD]
  $\mu_{week} = \mu = e^{-1}$, $\mu_{month} = \mu^{2}$, $\mu_{all} = \mu^{3}$
  Long-term(Ahmedα): $\mu_{2week}$, $\mu_{2month}$, $\mu_{all}$
• Long-term(Abel) [WebSci]
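To make the three decay weights concrete, the sketch below assumes they are applied to concept counts taken over the last week, the last month and the whole history, and then summed. The window lengths (7 and 30 days), the cumulative windows, and the function name `decayed_profile` are assumptions; only the values of the μ weights come from the slide.

```python
import math
from collections import Counter
from datetime import datetime, timedelta

MU = math.exp(-1)  # mu_week = mu, mu_month = mu^2, mu_all = mu^3

def decayed_profile(mentions, now):
    """Weight concepts by recency.

    mentions: iterable of (concept, timestamp) pairs.
    Assumed combination: mu * count_week + mu^2 * count_month + mu^3 * count_all.
    """
    week, month, all_time = Counter(), Counter(), Counter()
    for concept, timestamp in mentions:
        age = now - timestamp
        all_time[concept] += 1
        if age <= timedelta(days=30):
            month[concept] += 1
        if age <= timedelta(days=7):
            week[concept] += 1
    return {
        concept: MU * week[concept] + MU**2 * month[concept] + MU**3 * all_time[concept]
        for concept in all_time
    }

# Hypothetical mentions: one recent, one old.
now = datetime(2016, 9, 12)
mentions = [
    ("dbpedia:The_Wombats", now - timedelta(days=2)),
    ("dbpedia:The_Black_Keys", now - timedelta(days=200)),
]

temporal_profile = decayed_profile(mentions, now)
```

Both concepts are mentioned once, but the recent mention counts in all three windows and therefore receives the larger weight.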

Experiment Setup

Dataset

• 322 users, each of whom shared at least one link in the last two weeks
• 247,676 tweets in total

Experiment

• task: recommending 10 links (URLs)
• recommendation algorithm: cosine similarity(P(u), P(i)), where P(i) is the item (link) profile built using the same modeling strategy as P(u)
• ground truth links: links shared in the last two weeks
• candidate links: 15,440 links

[Figure: timeline split at the recommendation time; content before it is used for user modeling, links (URLs) shared after it are the ground truth]
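The recommendation step, ranking candidate links by the cosine similarity between P(u) and P(i), can be sketched as follows. Profiles are assumed to be sparse dictionaries from concept to weight; the function names, example URLs and link weights are hypothetical.

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse concept-weight profiles."""
    dot = sum(weight * q.get(concept, 0.0) for concept, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)

def recommend(user_profile, link_profiles, n=10):
    """Return the n candidate links whose profiles are most similar to the user's."""
    ranked = sorted(
        link_profiles,
        key=lambda url: cosine(user_profile, link_profiles[url]),
        reverse=True,
    )
    return ranked[:n]

# User profile from the slide's example; link profiles are made up.
user = {"Google": 0.09, "Category:Smartphones": 0.12, "iPhone": 0.08}
candidates = {
    "http://example.org/a": {"iPhone": 1.0},
    "http://example.org/b": {"dbpedia:Indie_rock": 1.0},
    "http://example.org/c": {"Category:Smartphones": 0.5, "Google": 0.5},
}

top_links = recommend(user, candidates, n=2)
```

Link c shares two concepts with the user and ranks first, link a shares one, and link b shares none and is left out of the top two.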

Results

Study of Weighting Scheme

using CF-IDF improves the performance significantly (p < .05)

Semantic Interest Propagation

# of concepts after propagation

• on average, 224 concepts before extension
• 1,865, 1,317 and 1,152 respectively after extension

Semantic Interest Propagation

Recommendation Results

combining different structures of information improves the performance in the context of link recommendations

Results

A Comparative Study of Dynamics

• Ahmed’s and Orlandi’s methods provide competitive performance
• in line with previous studies, using interest decay functions improves the performance

Our User Modeling Strategies

um(weighting scheme, temporal dynamics, propagation strategy)

[Figure: entity-based user profiles are turned into user profiles P(u) through three components]

• weighting scheme: CF-IDF
• extension strategy using DBpedia: category & property
• temporal dynamics: Ahmedα

A concept-based profile P(u):

Google    Category:Smartphones    …    iPhone
0.09      0.12                    …    0.08

Results

Compare to State-of-Art

• outperform baselines
• best: um(CF-IDF, none, category & property)

Conclusions & Future Work


• CF-IDF & combining different structures of DBpedia are beneficial

- um(CF-IDF, none, category & property) performs best

• Ahmed’s and Orlandi’s methods provide competitive performance

for capturing dynamics of user interests

• investigation of combining different dimensions of user modeling

• richer interest representation beyond concepts for users


Thank you for your attention!

Guangyuan Piao

homepage: http://parklize.github.io

e-mail: [email protected]

twitter: https://twitter.com/parklize

slideshare: http://www.slideshare.net/parklize