semantics2016 - exploring dynamics and semantics of user interests for user modeling on twitter for...
Guangyuan Piao, John G. Breslin
Unit for Social Semantics
12th International Conference on Semantic Systems
Leipzig, Germany, 12-15 September 2016
Exploring Dynamics and Semantics of
User Interests for User Modeling on Twitter for
Link Recommendations
1/3 users seek medical information
and over 50% users consume news
on Social Networks
Facebook and Twitter together generate
more than 5 billion microblogs / day
[SOURCE] Semantic Filtering for Social Data, Amit et al., Internet Computing’16
Background – User Modeling
content enrichment
analysis &
user modeling
interest profile
?
personalized content
recommendations
(How) can we infer
user interest profiles
that support the
content recommender?
[SOURCE] Analyzing User Modeling on Twitter for Personalized News Recommendations, UMAP’11
Background – User Modeling
Representation of User Interest
Bag of
Words
Topic
Modeling
Bag of
Concepts
users' interests are
represented as a
set of words
topics: co-occurring words
document: mixture of topics
users' interests are
represented as a
set of concepts
• can exploit background
knowledge about concepts
for interest propagation
• focus on words
• assumption: a single doc contains rich information
• cannot provide semantic relationships among words
Background – User Modeling
Bag-of-Concepts
dbpedia:The_Black_Keys
dbpedia:Eagles_of_Death_Metal
dbpedia:The_Wombats
Background – User Modeling
Weighting Scheme: importance of a concept for a user
Concept Frequency (CF)
dbpedia:The_Black_Keys (3)
dbpedia:Eagles_of_Death_Metal (5)
dbpedia:The_Wombats (2)
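The Concept Frequency weighting above can be sketched in a few lines. This is a minimal illustration, assuming the tweets have already been linked to DBpedia entities; the function name `concept_frequency_profile` and the input list are hypothetical, and only the three example counts come from the slide.

```python
from collections import Counter


def concept_frequency_profile(concept_mentions):
    """Count how often each concept is mentioned in a user's tweets."""
    return dict(Counter(concept_mentions))


# Hypothetical input: one entry per entity mention found in the user's tweets.
mentions = (
    ["dbpedia:The_Black_Keys"] * 3
    + ["dbpedia:Eagles_of_Death_Metal"] * 5
    + ["dbpedia:The_Wombats"] * 2
)
profile = concept_frequency_profile(mentions)
```

Each concept's weight here is simply its mention count, so frequently mentioned concepts dominate the profile.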
Semantic Interest Propagation
• different structures of DBpedia beyond category information are
not fully explored
Related Work – Semantics
[Figure: DBpedia graph with nodes dbpedia:The_Black_Keys, dbpedia:The_Wombats, dbc:Rock_music_duos and dbpedia:Indie_rock; edge labels: subject, genre]
Temporal Dynamics of User Interests
• assumption: user interests might change over time
• no comparative evaluation over different methods
Related Work – Dynamics
long-term user profile
short-term user profile
interest decay function
historical user-generated content (UGC)
(e.g., the last two weeks UGC)
Concepts
• entities, categories and classes from DBpedia which can be used
for representing user interests
Definition
entity: dbpedia:The_Black_Keys
category: dbc:Rock_music_duos (linked via subject)
class: yago:BluesRockMusicians (linked via type)
CF-IDF: Concept Frequency – Inverse Document Frequency
• Weighting Scheme: CF-IDF vs. CF
• Semantics: explore different structures of DBpedia
• Dynamics: comparative study on different methods
Aim of Work
We propose and evaluate user modeling strategies
using best-performing strategies in the three dimensions
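To make the CF-IDF vs. CF comparison concrete, here is a hedged sketch of one common way to compute it. The slides do not give the exact formula, so the IDF term is an assumption: log(N / n_c), with N the number of users and n_c the number of users whose profile contains concept c. The function name `cf_idf_profiles` and the sample users are hypothetical.

```python
import math


def cf_idf_profiles(cf_profiles):
    """Convert per-user CF profiles into CF-IDF profiles.

    cf_profiles: {user_id: {concept: count}}
    Assumed IDF: log(N / n_c), where N is the number of users and n_c is
    the number of users whose profile contains concept c.
    """
    n_users = len(cf_profiles)
    users_with = {}
    for profile in cf_profiles.values():
        for concept in profile:
            users_with[concept] = users_with.get(concept, 0) + 1
    return {
        user: {
            concept: cf * math.log(n_users / users_with[concept])
            for concept, cf in profile.items()
        }
        for user, profile in cf_profiles.items()
    }


# Hypothetical CF profiles for two users.
cf_profiles = {
    "alice": {"dbpedia:The_Black_Keys": 3, "dbpedia:Twitter": 4},
    "bob": {"dbpedia:The_Wombats": 2, "dbpedia:Twitter": 1},
}
weighted = cf_idf_profiles(cf_profiles)
```

Under this assumption, a concept that every user mentions (here dbpedia:Twitter) gets weight zero, while concepts specific to one user keep a positive weight.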
User Modeling Framework
semantic interest
propagation
temporal dynamics
• category
• …
• category & property
• Ahmed
• …
• Orlandi
User Profiles
P(u)
Google  Category:Smartphones  …  iPhone
0.09    0.12                  …  0.08
a concept-based profile P(u)
weighting
scheme
entity-based
user profiles
Core propagation strategies
• category-based
SP: sub-pages of the category
SC: sub-categories of the category
• class-based
SP’: sub-pages of the class
SC’: sub-classes of the class
Semantic Interest Propagation
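The category-based strategy can be illustrated as follows. The slides name the inputs (SP and SC) but not the formula, so the discount used here, (1/α) · 1/log(SP) · 1/log(SC) with α = 2, is an assumption, as are all function names and the SP/SC counts in the example. The idea sketched is only that broad categories with many sub-pages and sub-categories should receive less of an entity's weight.

```python
import math


def inv_log(count):
    """Return 1 / log(count); counts below 2 are clamped to avoid dividing by zero."""
    return 1.0 / math.log(max(count, 2))


def category_discount(sub_pages, sub_categories, alpha=2.0):
    """Assumed discount: broader categories (large SP, SC) pass on less weight."""
    return (1.0 / alpha) * inv_log(sub_pages) * inv_log(sub_categories)


def propagate_to_categories(profile, entity_categories, category_stats):
    """Extend an entity-based profile with discounted category weights.

    entity_categories: {entity: [category, ...]}
    category_stats:    {category: (SP, SC)}
    """
    extended = dict(profile)
    for entity, weight in profile.items():
        for category in entity_categories.get(entity, []):
            sp, sc = category_stats[category]
            gain = weight * category_discount(sp, sc)
            extended[category] = extended.get(category, 0.0) + gain
    return extended


# Hypothetical data; the SP and SC counts are made up for illustration.
profile = {"dbpedia:The_Black_Keys": 3.0}
entity_categories = {"dbpedia:The_Black_Keys": ["dbc:Rock_music_duos"]}
category_stats = {"dbc:Rock_music_duos": (500, 4)}
extended = propagate_to_categories(profile, entity_categories, category_stats)
```

A class-based variant would follow the same shape, with SP’ and SC’ taken from the class hierarchy instead of the category hierarchy.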
Semantic Interest Propagation
Core propagation strategies
• property-based
P: property count in DBpedia graph
Combine different semantics
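A comparable sketch for the property-based strategy: weight flows from an entity in the profile to entities it is connected to, discounted by how common the connecting property is. The discount (1/α) · 1/log(P) is an assumption, since the slides only state that P is the property count in the DBpedia graph; the function names, the `dbo:genre` triples and the count of 100,000 are hypothetical.

```python
import math


def property_discount(property_count, alpha=2.0):
    """Assumed discount: frequently used properties pass on less weight."""
    return (1.0 / alpha) / math.log(max(property_count, 2))


def propagate_via_properties(profile, triples, property_counts):
    """Extend a profile along (subject, property, object) triples.

    Only triples whose subject is already in the profile contribute.
    """
    extended = dict(profile)
    for subject, prop, obj in triples:
        if subject in profile:
            gain = profile[subject] * property_discount(property_counts[prop])
            extended[obj] = extended.get(obj, 0.0) + gain
    return extended


# Hypothetical triples and property count, for illustration only.
profile = {"dbpedia:The_Black_Keys": 3.0, "dbpedia:The_Wombats": 2.0}
triples = [
    ("dbpedia:The_Black_Keys", "dbo:genre", "dbpedia:Indie_rock"),
    ("dbpedia:The_Wombats", "dbo:genre", "dbpedia:Indie_rock"),
]
property_counts = {"dbo:genre": 100000}
extended = propagate_via_properties(profile, triples, property_counts)
```

Combining semantics could then mean running a category-based extension and this property-based extension on the same entity profile and summing the resulting weights per concept; that merge rule is also an assumption.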
Dynamics of User Interests
Interest decay functions
• Long-term(Orlandi) [SEMANTiCS]
• Long-term(Ahmed) [SIGKDD]
$\mu_{week} = \mu = e^{-1}$, $\mu_{month} = \mu^{2}$, $\mu_{all} = \mu^{3}$
Long-term(Ahmedα): $\mu_{2week}$, $\mu_{2month}$, $\mu_{all}$
• Long-term(Abel) [WebSci]
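The three μ coefficients suggest a weighted sum over three time windows. The sketch below assumes that a concept's weight is μ times its mentions in the last week, plus μ² times its mentions in the last month, plus μ³ times all its mentions; the slides list only the coefficients, so this combination rule, the cumulative windows, and the names `decayed_weight`, `short_days` and `mid_days` are assumptions.

```python
import math
from datetime import datetime, timedelta

MU = math.exp(-1)  # mu = e^-1, as given on the slide


def decayed_weight(timestamps, now, short_days=7, mid_days=30):
    """Combine mention counts from three time windows with mu, mu^2, mu^3.

    timestamps: datetimes at which the user mentioned one concept.
    The windows are cumulative (the month window includes the week window).
    """
    short = sum(1 for t in timestamps if now - t <= timedelta(days=short_days))
    mid = sum(1 for t in timestamps if now - t <= timedelta(days=mid_days))
    total = len(timestamps)
    return MU * short + MU**2 * mid + MU**3 * total


# Hypothetical mention times relative to a chosen recommendation time.
now = datetime(2016, 9, 12)
mention_times = [
    now - timedelta(days=1),
    now - timedelta(days=20),
    now - timedelta(days=200),
]
weight = decayed_weight(mention_times, now)
```

The two-week and two-month windows of the Ahmedα variant would correspond to calling `decayed_weight(mention_times, now, short_days=14, mid_days=60)` under the same assumptions.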
Dataset
• 322 users: shared at least one link in the last two weeks
• 247,676 tweets in total
Experiment
• task: recommending 10 links (URLs)
• recommendation algorithm: cosine similarity(P(u), P(i))
P(i): item (link) profile using the same modeling strategy for P(u)
• ground truth links: links shared in the last two weeks
• candidate links: 15,440 links
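The recommendation step above can be sketched as ranking candidate links by cosine similarity between P(u) and each P(i) and keeping the top 10. Profiles are represented here as sparse dictionaries from concept to weight; that representation, the function names and the example URLs are assumptions made for illustration.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two sparse concept-weight profiles."""
    dot = sum(weight * q[concept] for concept, weight in p.items() if concept in q)
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def recommend_links(user_profile, link_profiles, n=10):
    """Return the n candidate links most similar to the user profile."""
    ranked = sorted(
        link_profiles.items(),
        key=lambda item: cosine_similarity(user_profile, item[1]),
        reverse=True,
    )
    return [url for url, _ in ranked[:n]]


# Hypothetical user and link profiles.
user = {"dbpedia:Indie_rock": 0.8, "dbpedia:IPhone": 0.2}
links = {
    "http://example.org/a": {"dbpedia:Indie_rock": 1.0},
    "http://example.org/b": {"dbpedia:IPhone": 1.0},
    "http://example.org/c": {"dbpedia:Politics": 1.0},
}
top = recommend_links(user, links, n=2)
```

Because P(i) is built with the same modeling strategy as P(u), both profiles share one concept space and the dot product is meaningful.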
Experiment Setup
used for user modeling
ground truth
links (URLs)
recommendation time
# of concepts after propagation
Semantic Interest Propagation
• on average, 224 concepts before extension
• 1,865, 1,317 and 1,152 respectively after extension
Recommendation Results
Semantic Interest Propagation
combining different structures of information improves the
performance in the context of link recommendations
Results
A Comparative Study of Dynamics
Ahmed’s and Orlandi’s methods
provide competitive performance
in line with previous studies,
using interest decay functions
improves the performance
Our User Modeling Strategies
extension strategy
using DBpedia
temporal dynamics
• category & property
• Ahmedα
Google  Category:Smartphones  …  iPhone
0.09    0.12                  …  0.08
a concept-based profile P(u)
weighting
scheme
(CF-IDF)
entity-based
user profiles
um(weighting scheme, temporal dynamics, propagation strategy)
User Profiles
P(u)
Conclusions & Future Work
• CF-IDF & combining different structures of DBpedia are beneficial
- um(CF-IDF, none, category & property) performs best
• Ahmed’s and Orlandi’s methods provide competitive performance
for capturing dynamics of user interests
• investigation of combining different dimensions of user modeling
• richer interest representation beyond concepts for users
Thank you for your attention!
Guangyuan Piao
homepage: http://parklize.github.io
twitter: https://twitter.com/parklize
slideshare: http://www.slideshare.net/parklize