leveraging user modeling on the social web with linked data
DESCRIPTION
Talk give by Ke Tao at #ICWE2012 titled Leveraging User Modeling on the Social Web with Linked DataTRANSCRIPT
Delft University of Technology
Leveraging User Modeling on the Social Web with Linked Data
#ICWE2012 Berlin, Germany, July 27th 2012
Fabian Abel, Claudia Hauff, Geert-Jan Houben, Ke Tao
Web Information Systems, TU Delft, the Netherlands
2 Leveraging User Modeling on the Social Web with Linked Data
Personalized Recommendations
Personalized Search Adaptive Systems
What we do: Science and Engineering for the Personal Web
Social Web
Analysis and User Modeling
user/usage data
Semantic Enrichment, Linkage and Alignment
domains: news social media cultural heritage public data e-learning
recommending points of interest
3 Leveraging User Modeling on the Social Web with Linked Data
Kyle
Hometown: South Park, Colorado
4 Leveraging User Modeling on the Social Web with Linked Data
Kyle recently uploaded photos to Flickr
During his trip to the Netherlands, he uploaded pictures to Flickr.
tags: delft, vermeer geo: The Hague
tags: girl with pearl earring geo: The Hague
5 Leveraging User Modeling on the Social Web with Linked Data
Kyle tweets about his upcoming trip
Looking forward to visit Paris next week!
6 Leveraging User Modeling on the Social Web with Linked Data
Interests of Kyle?
tags: delft, vermeer geo: The Hague
tags: girl with pearl earring geo: The Hague
Looking forward to visit Paris next week!
• Given Kyle’s Flickr and Twitter activities, can we infer Kyle’s interests?
• Knowing that Kyle will visit Paris, France, can we recommend him places that might be interesting for him?
7 Leveraging User Modeling on the Social Web with Linked Data
Challenges • How to create a meaningful profile that supports the given
application? à how to bridge between the Social Web chatter of a user and the candidate items of a recommender system?
User Profile concept weight
? Applica0on
that demands user interest profile regarding
-‐concepts
c1 c5
c6
c9
Social Web d5
c1 c3
c2 d1 d2 d3 d4
d4 d3
Candidate Items Recommendations
8 Leveraging User Modeling on the Social Web with Linked Data
Challenge of Recommending Points of Interests (POIs) to Kyle
Johannes Vermeer
dbpedia:Louvre Looking forward to visit Paris next week!
dbpedia:Paris
The lacemaker
The astronomer
9 Leveraging User Modeling on the Social Web with Linked Data
c1
c4
c5
c6
LOD-based User Modeling
weigh0ng strategies
Applica0on that demands user
interest profile regarding -‐concepts
c2
c3 cx
cy
c9
User Profile concept weight
0.4
0.1
0.2
c1
c2
c3 … …
concepts that can be extracted from the user data
user data
Social Web
background knowledge (graph structures)
Linked Data
10 Leveraging User Modeling on the Social Web with Linked Data
User Modeling Building Blocks
11 Leveraging User Modeling on the Social Web with Linked Data
tags: girl with pearl earring geo: The Hague
dbpedia:Girl_with_pearl_earring
A
Artifact
B
The lacemaker
C
The astronomer
…
rdf:type
Johannes Vermeer foaf:maker
foaf:maker
Strategies for exploiting the RDF-based background knowledge graph
Direct Mention Indirect Mention @RDF_statement Indirect Mention
@RDF_graph
dbpedia:The_Hague
dbpedia:Louvre dbpprop:location
12 Leveraging User Modeling on the Social Web with Linked Data
Weighting Scheme
Weighting scheme: count the number of occurrences of a given graph pattern.
User Profile concept weight
?
?
?
c1
c2
c3 … …
User Profile concept weight
374
152
73
c1
c2
c3 … …
13 Leveraging User Modeling on the Social Web with Linked Data
Source of User Data
• Twitter • Flickr
14 Leveraging User Modeling on the Social Web with Linked Data
Mining the Geographic Origins of User Data
• Tweets or Flickr images posted by a GPS-enabled device;
• Images geo-tagged manually on Flickr world map
• Otherwise : exploit the title and the tags the users assign to their images
15 Leveraging User Modeling on the Social Web with Linked Data
Evaluation
16 Leveraging User Modeling on the Social Web with Linked Data
Research Questions
1. How does the source of user data influence the quality in deducing user preferences for POIs?
2. How does the consideration of background knowledge from the Linked Open Data Cloud impact the quality of the user modeling?
3. What (combination of) user modeling strategies allows for the best quality?
17 Leveraging User Modeling on the Social Web with Linked Data
Dataset
users 394
tweets 2,489,088 location 11% pictures 833,441 location 70.6%(within 10km)
duration 11 months
18 Leveraging User Modeling on the Social Web with Linked Data
Dataset Characteristics: Profile Sizes
0% 20% 40% 60% 80% 100%user percentiles
0
0.001
0.01
0.1
1
10
100ra
tio b
etw
een
Flic
kr-b
ased
pro
file
size
a
nd T
witt
er-b
ased
pro
file
size size of Flickr-based profile is greater than
size of Twitter-based profile
size of Twitter-based profile is greater than size of Flickr-based profile Flickr-based
profile is empty
81.4% of the users, Twitter-based profiles are
bigger than Flickr-based profiles
19 Leveraging User Modeling on the Social Web with Linked Data
Experimental Setup • Task:
= Recommending POIs = Predicting POIs which a user will visit
• Ground truth: • split data into training data (= first nine months) and test data (= last
two months) • POIs that the user visited in the last two months are considered as
relevant
• Metrics: • Precision@k, Recall@k and F-Measure@k: precision, recall and f-
measure within the top k of the ranking of recommended items
20 Leveraging User Modeling on the Social Web with Linked Data
Results: Impact of User Data Source Selection
!"!#$"!#%"!#&"!#'"!#("!#)"!#*"!#+"
,-./01" 23.451" ,-./01"6"23.451"
!"#$%&%'()*+#$,--*,(.
*/01
#,&2"#*
78$!"
98$!"
,8$!"
Combination of Twitter and Flickr user data
allows for best performance
Twitter seems to be more valuable for the
given application
21 Leveraging User Modeling on the Social Web with Linked Data
Results: Impact of Strategies for Exploiting RDF-based Background Knowledge
!"!#$"!#%"!#&"!#'"!#("!#)"!#*"!#+"!#,"
-./012"304564"
.4-./012"304564"7"
.4-./012"304564"77"
!"#$%&%'()*+#$,--*,(.
*/01
#,&2"#*
89$!"
:9$!"
;9$!"
The more background information the better
the user modeling performance
22 Leveraging User Modeling on the Social Web with Linked Data
Results: Combining different Strategies Leveraging User Modeling on the Social Web with Linked Data 13
Table 2. Overview on the di�erent strategies for integrating background knowledge.Twitter and Flickr are exploited as user data source.
strategy P@10 R@10 F@10 P@20 R@20 F@20
core strategies:
direct mentions 0.715 0.260 0.412 0.580 0.179 0.298
indirect mentions I 0.699 0.268 0.426 0.566 0.185 0.308
indirect mentions II 0.820 0.312 0.475 0.727 0.436 0.569
combined strategies:
direct & indirect mentions I 0.733 0.216 0.360 0.608 0.287 0.416
direct & indirect mentions II 0.836 0.333 0.4975 0.747 0.466 0.596
indirect mentions I + II 0.830 0.325 0.489 0.739 0.456 0.587
direct & indirect mentions I + II 0.839 0.337 0.502 0.751 0.473 0.603
which does not exploit RDF statements from the Linked Open Data Cloud,the F@20 performance is more than doubled. The consideration of backgroundknowledge obtained from the Linked Open Data Cloud therefore clearly improvesthe performance of user modeling on the Social Web which answers our mainresearch question.
Furthermore, we can answer the specific research questions raised at thebeginning of this section as follows. For the task of recommending points ofinterests, it turns out that (1) the aggregation of Twitter and Flickr user dataallows for the best user modeling performance and that (2) the user modelingquality increases the more background information we consider from the LinkedOpen Data Cloud. Finally, (3) the best performance is achieved by combiningthe di�erent graph patterns for acquiring background information and inferringuser preferences.
6 ConclusionsIn this paper, we proposed a framework for leveraging user modeling on the So-cial Web with information from the Linked Open Data Cloud. Our frameworkmonitors user activities on Social Web platforms such as Twitter and Flickr, in-fers the semantic meaning of user activities and provides strategies for gatheringbackground information from the Web of Data to generate semantically mean-ingful user profiles that support a given application. We showcase and evaluateour framework in the context of a geospatial recommender system where the corechallenge relies in deducing user preferences for points of interests. Therefore, wealso present a method that allows for the semantic enrichment of Flickr picturesby (i) estimating the geographical location where a picture was taken and (ii)exploiting GeoNames in order to identify related DBpedia concepts.
Our evaluation proves the e�ectiveness of our user modeling framework.Based on a large Twitter and Flickr dataset of more than 2.4 million tweets
Leveraging User Modeling on the Social Web with Linked Data 13
Table 2. Overview on the di�erent strategies for integrating background knowledge.Twitter and Flickr are exploited as user data source.
strategy P@10 R@10 F@10 P@20 R@20 F@20
core strategies:
direct mentions 0.715 0.260 0.412 0.580 0.179 0.298
indirect mentions I 0.699 0.268 0.426 0.566 0.185 0.308
indirect mentions II 0.820 0.312 0.475 0.727 0.436 0.569
combined strategies:
direct & indirect mentions I 0.733 0.216 0.360 0.608 0.287 0.416
direct & indirect mentions II 0.836 0.333 0.4975 0.747 0.466 0.596
indirect mentions I + II 0.830 0.325 0.489 0.739 0.456 0.587
direct & indirect mentions I + II 0.839 0.337 0.502 0.751 0.473 0.603
which does not exploit RDF statements from the Linked Open Data Cloud,the F@20 performance is more than doubled. The consideration of backgroundknowledge obtained from the Linked Open Data Cloud therefore clearly improvesthe performance of user modeling on the Social Web which answers our mainresearch question.
Furthermore, we can answer the specific research questions raised at thebeginning of this section as follows. For the task of recommending points ofinterests, it turns out that (1) the aggregation of Twitter and Flickr user dataallows for the best user modeling performance and that (2) the user modelingquality increases the more background information we consider from the LinkedOpen Data Cloud. Finally, (3) the best performance is achieved by combiningthe di�erent graph patterns for acquiring background information and inferringuser preferences.
6 ConclusionsIn this paper, we proposed a framework for leveraging user modeling on the So-cial Web with information from the Linked Open Data Cloud. Our frameworkmonitors user activities on Social Web platforms such as Twitter and Flickr, in-fers the semantic meaning of user activities and provides strategies for gatheringbackground information from the Web of Data to generate semantically mean-ingful user profiles that support a given application. We showcase and evaluateour framework in the context of a geospatial recommender system where the corechallenge relies in deducing user preferences for points of interests. Therefore, wealso present a method that allows for the semantic enrichment of Flickr picturesby (i) estimating the geographical location where a picture was taken and (ii)exploiting GeoNames in order to identify related DBpedia concepts.
Our evaluation proves the e�ectiveness of our user modeling framework.Based on a large Twitter and Flickr dataset of more than 2.4 million tweets
Leveraging User Modeling on the Social Web with Linked Data 13
Table 2. Overview on the di�erent strategies for integrating background knowledge.Twitter and Flickr are exploited as user data source.
strategy P@10 R@10 F@10 P@20 R@20 F@20
core strategies:
direct mentions 0.715 0.260 0.412 0.580 0.179 0.298
indirect mentions I 0.699 0.268 0.426 0.566 0.185 0.308
indirect mentions II 0.820 0.312 0.475 0.727 0.436 0.569
combined strategies:
direct & indirect mentions I 0.733 0.216 0.360 0.608 0.287 0.416
direct & indirect mentions II 0.836 0.333 0.4975 0.747 0.466 0.596
indirect mentions I + II 0.830 0.325 0.489 0.739 0.456 0.587
direct & indirect mentions I + II 0.839 0.337 0.502 0.751 0.473 0.603
which does not exploit RDF statements from the Linked Open Data Cloud,the F@20 performance is more than doubled. The consideration of backgroundknowledge obtained from the Linked Open Data Cloud therefore clearly improvesthe performance of user modeling on the Social Web which answers our mainresearch question.
Furthermore, we can answer the specific research questions raised at thebeginning of this section as follows. For the task of recommending points ofinterests, it turns out that (1) the aggregation of Twitter and Flickr user dataallows for the best user modeling performance and that (2) the user modelingquality increases the more background information we consider from the LinkedOpen Data Cloud. Finally, (3) the best performance is achieved by combiningthe di�erent graph patterns for acquiring background information and inferringuser preferences.
6 ConclusionsIn this paper, we proposed a framework for leveraging user modeling on the So-cial Web with information from the Linked Open Data Cloud. Our frameworkmonitors user activities on Social Web platforms such as Twitter and Flickr, in-fers the semantic meaning of user activities and provides strategies for gatheringbackground information from the Web of Data to generate semantically mean-ingful user profiles that support a given application. We showcase and evaluateour framework in the context of a geospatial recommender system where the corechallenge relies in deducing user preferences for points of interests. Therefore, wealso present a method that allows for the semantic enrichment of Flickr picturesby (i) estimating the geographical location where a picture was taken and (ii)exploiting GeoNames in order to identify related DBpedia concepts.
Our evaluation proves the e�ectiveness of our user modeling framework.Based on a large Twitter and Flickr dataset of more than 2.4 million tweets
Combining all background exploitation strategies improves the user modeling performance clearly
23 Leveraging User Modeling on the Social Web with Linked Data
Conclusions
What we did: • LOD-based User Modeling on the Social Web • Different strategies for exploiting RDF-based background
knowledge Findings: 1. Combination of different user data sources (Flickr & Twitter) is
beneficial for the user modeling performance 2. User modeling quality increases the more background
knowledge one considers 3. Combination of strategies achieves the best performance
Future work: • Investigate weighting schemes that weight the different RDF
graph patterns for acquiring background knowledge differently
24 Leveraging User Modeling on the Social Web with Linked Data
Thank you!
Email: [email protected] Twitter: @wisdelft @taubau http://persweb.org