Personalised Access to Linked Data

Milan Dojchinovski and Tomas Vitvar
Web Intelligence Research Group, Czech Technical University in Prague

The 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2014), November 24-28, 2014, Linköping, Sweden

Milan Dojchinovski [email protected] - @m1ci - http://dojchinovski.mk

Except where otherwise noted, the content of this presentation is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported.

Page 2: Personalised Access to Linked Data

Personalised Access to Linked Data - @m1ci - http://dojchinovski.mk

Outline

• Introduction
• Personalised Resource Recommendations
• Experiments and Results
• Conclusion and Future Work

Page 3: Personalised Access to Linked Data

Introduction

• Finding relevant information in the LOD cloud is not easy
  - SPARQL, manual dereferencing of URIs, …
• … or ask other people and get personalised recommendations of resources
• Linked Data based recommenders can help

LOD cloud stats [1]:
• 294 datasets in Sep 2011
• 1,091 datasets in Apr 2014
• 271% growth

[1] M. Schmachtenberg et al., Adoption of linked data best practices in different topical domains, ISWC 2014.

Page 4: Personalised Access to Linked Data

Related Work

• dbRec (Passant, 2010): semantic distance measure
  - function of direct and indirect links
• Content-based LD recommender (Di Noia et al., 2012)
  - movies domain, max resource distance: 2
• Lookup Explore Discovery (Mirizzi et al., 2010)
  - user input required
  - recommendations related to the entities occurring in the query
• Discovery Hub (Marie et al., 2013)
  - based on spreading activation
  - utilizes only a small portion of the information in DBpedia
• Aemoo (Musetti et al., 2012)
  - Encyclopedic Knowledge Patterns over DBpedia

Page 5: Personalised Access to Linked Data

Introduction

• Method for personalised Linked Data recommendations
  - apply the collaborative filtering technique to Linked Data
  - recommendations from users with similar resource interests
• Two novel metrics:
  - resource similarity and resource relevance
• Considered aspects:
  - Resource Commonalities: how much information two resources share
  - Resource Informativeness: how informative the resources are
  - Resource Connectivity: how well the resources are connected

Page 6: Personalised Access to Linked Data

Outline

• Introduction
• Personalised Resource Recommendations
  - Resource Similarity
  - Resource Relevance
• Experiments and Results
• Conclusion and Future Work

Page 7: Personalised Access to Linked Data

Resource Recommendation in a Nutshell

• Input: RDF graph (including user profiles)
• Step 1: evaluate user similarities
  - i.e. similarity between resources representing users
  - instances of the foaf:Person class
• Step 2: recommend resources from similar users
  - compute relevance for each resource candidate
  - incorporate the resource (user) similarities

[Figure: example RDF graph. Node labels: #Alfredo, #mlachwani, #FriendLynx, #Hashtagram, #Instagram, #MTV-Billboard-charts, #Mobile-Weather-Search, #Twitter-API, #Facebook-API, #Microsoft-Bing-API, #411Sync-API, #search, #music, #social, #microblogging. Edge labels: dc:creator, ls:usedAPI, ls:tag, ls:category.]
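The two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the triples are invented (only the node and property labels are borrowed from the slide), and the Jaccard overlap of used APIs is a simple stand-in for the resource similarity metric introduced on the following slides.

```python
from collections import defaultdict

# Hypothetical triples: labels come from the slide, the edges are invented.
TRIPLES = [
    ("Alfredo", "rdf:type", "foaf:Person"),
    ("mlachwani", "rdf:type", "foaf:Person"),
    ("FriendLynx", "dc:creator", "Alfredo"),
    ("Hashtagram", "dc:creator", "Alfredo"),
    ("Mobile-Weather-Search", "dc:creator", "mlachwani"),
    ("FriendLynx", "ls:usedAPI", "Twitter-API"),
    ("FriendLynx", "ls:usedAPI", "Facebook-API"),
    ("Hashtagram", "ls:usedAPI", "Twitter-API"),
    ("Mobile-Weather-Search", "ls:usedAPI", "Twitter-API"),
    ("Mobile-Weather-Search", "ls:usedAPI", "Microsoft-Bing-API"),
]


def users(triples):
    """Resources typed as foaf:Person."""
    return {s for s, p, o in triples if p == "rdf:type" and o == "foaf:Person"}


def apis_by_user(triples):
    """APIs used by the mashups that each user created."""
    creator = {s: o for s, p, o in triples if p == "dc:creator"}
    used = defaultdict(set)
    for s, p, o in triples:
        if p == "ls:usedAPI" and s in creator:
            used[creator[s]].add(o)
    return used


def jaccard(a, b):
    """Stand-in similarity (an assumption, not the metric from the talk)."""
    return len(a & b) / len(a | b) if a | b else 0.0


def recommend(target, triples):
    """Rank APIs the target has not used, weighted by user similarity."""
    used = apis_by_user(triples)
    scores = defaultdict(float)
    for other in users(triples) - {target}:
        sim = jaccard(used[target], used[other])      # step 1: user similarity
        for api in used[other] - used[target]:        # step 2: candidates
            scores[api] += sim
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


recommendations = recommend("Alfredo", TRIPLES)
```

In this toy graph the only candidate for Alfredo is the API that mlachwani used and Alfredo did not.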

Page 9: Personalised Access to Linked Data

Resource Similarity Computation

• Assumption 1: the more information two resources share, the more similar they are
• 6 resources in the shared context graph

[Figure: example RDF graph with the shared context of two resources. Node labels include #Alfredo, #mlachwani, #Twitter-API, #Microsoft-Bing-API; edge labels: dc:creator, ls:usedAPI, ls:tag, ls:category.]
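Assumption 1 can be illustrated by intersecting the neighbourhoods ("contexts") of two resources. The sketch below assumes that a resource's context is everything reachable within a fixed number of hops in the undirected graph. The `max_dist` parameter and the edge list are hypothetical, so the resulting counts are not meant to reproduce the 6 shared resources from the slide.

```python
from collections import deque

# Hypothetical undirected edges (labels from the slide, structure invented).
EDGES = [
    ("Alfredo", "FriendLynx"), ("Alfredo", "Hashtagram"),
    ("FriendLynx", "Twitter-API"), ("Hashtagram", "Twitter-API"),
    ("Hashtagram", "Instagram"), ("Twitter-API", "social"),
    ("mlachwani", "Mobile-Weather-Search"),
    ("Mobile-Weather-Search", "Twitter-API"),
    ("Mobile-Weather-Search", "Microsoft-Bing-API"),
    ("Microsoft-Bing-API", "search"),
]


def adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def context(adj, start, max_dist):
    """Resources within max_dist hops of start (start itself excluded)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue  # do not expand beyond the context boundary
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return set(dist) - {start}


def shared_context(adj, r1, r2, max_dist=2):
    """Resources that appear in the contexts of both r1 and r2."""
    return context(adj, r1, max_dist) & context(adj, r2, max_dist)


ADJ = adjacency(EDGES)
shared_2 = shared_context(ADJ, "Alfredo", "mlachwani", max_dist=2)
shared_3 = shared_context(ADJ, "Alfredo", "mlachwani", max_dist=3)
```

Widening the context distance enlarges the shared context, which is why the choice of distance matters for the similarity score.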

Page 10: Personalised Access to Linked Data

Resource Similarity Computation (cont.)

• Assumption 2: less probable shared resources carry more similarity information than more common ones
• Evaluated by computing the node degree value
  - Microsoft-Bing-API (deg. 40) carries more information than Twitter-API (deg. 799)

[Figure: Resource Information Content (IC) illustrated on the example RDF graph. Node labels include #Alfredo, #mlachwani, #Twitter-API, #Microsoft-Bing-API; edge labels: dc:creator, ls:usedAPI, ls:tag, ls:category.]
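The transcript only states that informativeness is derived from node degree; the exact formula is not recoverable here. As an assumption, the sketch below uses standard self-information, -log2(p), with p taken as the resource's degree divided by a total degree. The two degrees are the ones quoted on the slide, while `TOTAL_DEGREE` is a made-up normaliser.

```python
import math

# Degrees quoted on the slide; TOTAL_DEGREE is a hypothetical normaliser.
DEGREES = {"Microsoft-Bing-API": 40, "Twitter-API": 799}
TOTAL_DEGREE = 10_000


def information_content(degree, total_degree=TOTAL_DEGREE):
    """Assumed IC: -log2 of the degree-based probability of a resource."""
    return -math.log2(degree / total_degree)


ic = {name: information_content(deg) for name, deg in DEGREES.items()}
```

Under any such monotone definition the rarely used Microsoft-Bing-API ends up more informative than the very common Twitter-API, which is the point of Assumption 2.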

Page 11: Personalised Access to Linked Data

Resource Similarity Computation (cont.)

• Assumption 3: better connected shared resources carry more similarity information
• Measured by the number of simple paths between the resources
  - 2 simple paths between #Alfredo and #Twitter-API

[Figure: example RDF graph highlighting the simple paths between #Alfredo and #Twitter-API. Edge labels: dc:creator, ls:usedAPI, ls:tag, ls:category.]
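Counting simple paths (paths that visit no node twice) can be done with a bounded depth-first search. The sketch below is a generic implementation, and the adjacency map is hypothetical: its edges were chosen so that two simple paths connect Alfredo and Twitter-API, as on the slide, but they are not read off the original figure.

```python
# Hypothetical undirected adjacency map (labels from the slide, edges invented).
ADJ = {
    "Alfredo": {"FriendLynx", "Hashtagram"},
    "FriendLynx": {"Alfredo", "Twitter-API"},
    "Hashtagram": {"Alfredo", "Twitter-API", "Instagram"},
    "Instagram": {"Hashtagram"},
    "Twitter-API": {"FriendLynx", "Hashtagram"},
}


def count_simple_paths(adj, source, target, max_len):
    """Count simple paths (no repeated node) with at most max_len edges."""

    def dfs(node, visited, remaining):
        if node == target:
            return 1
        if remaining == 0:
            return 0
        total = 0
        for nxt in adj.get(node, ()):
            if nxt not in visited:
                total += dfs(nxt, visited | {nxt}, remaining - 1)
        return total

    return dfs(source, {source}, max_len)


paths = count_simple_paths(ADJ, "Alfredo", "Twitter-API", max_len=3)
```

The length bound keeps the search tractable, since the number of simple paths can grow exponentially with the graph size.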

Page 13: Personalised Access to Linked Data

Resource Relevance Computation

• Recommending resources of type Web API for a user
• Recommendations from similar users
  - connectivity between the similar user and the resource candidate
    - number of simple paths
    - informativeness of each resource in these paths

[Figure: example RDF graph with the similar users marked. Node labels include #Alfredo, #mlachwani, #Twitter-API, #Microsoft-Bing-API; edge labels: dc:creator, ls:usedAPI, ls:tag, ls:category.]
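One way to combine these ingredients is sketched below. The aggregation is an assumption made for illustration, not the formula from the talk: every simple path from a similar user to the candidate contributes the mean informativeness of its resources, weighted by that user's similarity to the target user. The graph, the similarity value and the degree-based informativeness are all hypothetical.

```python
import math

# Hypothetical undirected graph; labels from the slide, edges invented.
ADJ = {
    "mlachwani": {"Mobile-Weather-Search"},
    "Mobile-Weather-Search": {"mlachwani", "Microsoft-Bing-API",
                              "Twitter-API", "search"},
    "Microsoft-Bing-API": {"Mobile-Weather-Search", "search"},
    "Twitter-API": {"Mobile-Weather-Search"},
    "search": {"Mobile-Weather-Search", "Microsoft-Bing-API"},
}


def simple_paths(adj, source, target, max_len):
    """Yield simple paths from source to target with at most max_len edges."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        if len(path) - 1 == max_len:
            continue
        for nxt in adj.get(node, ()):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))


def informativeness(adj, node):
    """Assumed IC: lower-degree (rarer) resources score higher."""
    total = sum(len(neighbours) for neighbours in adj.values())
    return -math.log2(len(adj[node]) / total)


def relevance(adj, similar_users, candidate, max_len=3):
    """Sum similarity-weighted path scores over all similar users."""
    score = 0.0
    for user, sim in similar_users.items():
        for path in simple_paths(adj, user, candidate, max_len):
            mean_ic = sum(informativeness(adj, n) for n in path) / len(path)
            score += sim * mean_ic
    return score


similar = {"mlachwani": 0.5}  # hypothetical similarity to the target user
bing = relevance(ADJ, similar, "Microsoft-Bing-API")
twitter = relevance(ADJ, similar, "Twitter-API")
```

In this toy graph Microsoft-Bing-API is reachable from the similar user over two simple paths and Twitter-API over one, so the better connected candidate receives the higher score.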

Page 15: Personalised Access to Linked Data

Experimental Setup

• Linked Web APIs dataset
  - RDF representation of ProgrammableWeb.com
  - largest service and mashup repository
• Evaluated accuracy and usefulness of recommendations
• Accuracy:
  - precision/recall, AUC, NDCG, MAP, MRR
• Usefulness:
  - serendipity: how surprising the recommendations are
  - diversity: how diverse the recommendations are
• Evaluated methods:
  - User-KNN, Item-KNN, Most popular, Random
  - LD with RIC, LD without RIC
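For reference, several of the listed accuracy metrics can be computed per user as follows; MAP and MRR are then the means of average precision and reciprocal rank over all users. These are the standard textbook definitions, and the ranked list and relevant set in the example are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items found in the top-k recommendations."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is recommended."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked, relevant):
    """Mean of precision@rank taken at the rank of each relevant item."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical recommendation list and ground truth for one user.
ranked = ["Facebook-API", "Twitter-API", "411Sync-API", "Microsoft-Bing-API"]
relevant = {"Twitter-API", "Microsoft-Bing-API"}
```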

Page 16: Personalised Access to Linked Data

Accuracy Evaluation

• Taking into account resource informativeness makes sense
• Item-KNN and User-KNN do not work well
  - … at least in the Web services domain

[Figure: precision (0.00-0.20) vs. recall (0.0-1.0) curves for Linked Data based with RIC, Linked Data based without RIC, User-KNN, Item-KNN, Most popular and Random.]

Page 17: Personalised Access to Linked Data

Serendipity and Diversity Evaluation

• Serendipity score = avg. user-resource distance
• Diversity score = avg. dissimilarity between all resource pairs

Serendipity
@top-N    Random    Most Popular   User-KNN   Item-KNN   LD without RIC   LD with RIC
@top-5    2.97752   2.66810        2.59197    2.68006    3.18881          3.03271
@top-10   2.98455   2.67465        2.65514    2.70402    3.54821          3.26700
@top-15   2.98364   2.65816        2.68101    2.71267    3.73117          3.36509
@top-20   2.98455   2.65184        2.69780    2.70968    3.84142          3.42444

Diversity
@top-N    Random    Most Popular   User-KNN   Item-KNN   LD without RIC   LD with RIC
@top-5    0.65339   0.58347        0.62092    0.63349    0.83417          0.81949
@top-10   0.65317   0.61354        0.62411    0.64392    0.86044          0.82912
@top-15   0.65370   0.60374        0.63159    0.64558    0.87511          0.82884
@top-20   0.65347   0.60719        0.63276    0.64287    0.88435          0.83114
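The two usefulness scores can be sketched as below. The slide only defines them informally, so the details are assumptions: distance is taken as shortest-path length in the undirected graph, and dissimilarity as one minus the Jaccard overlap of the neighbour sets. The graph is hypothetical.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected graph; labels from the slides, edges invented.
ADJ = {
    "Alfredo": {"FriendLynx"},
    "FriendLynx": {"Alfredo", "Twitter-API"},
    "Twitter-API": {"FriendLynx", "social", "Mobile-Weather-Search"},
    "social": {"Twitter-API"},
    "Mobile-Weather-Search": {"Twitter-API", "Microsoft-Bing-API"},
    "Microsoft-Bing-API": {"Mobile-Weather-Search"},
}


def distance(adj, source, target):
    """Shortest-path length in edges (BFS); inf if target is unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return float("inf")


def serendipity(adj, user, recommended):
    """Average distance between the user and the recommended resources."""
    return sum(distance(adj, user, r) for r in recommended) / len(recommended)


def neighbour_jaccard(adj, a, b):
    """Assumed similarity: Jaccard overlap of the two neighbour sets."""
    na, nb = adj.get(a, set()), adj.get(b, set())
    return len(na & nb) / len(na | nb) if na | nb else 0.0


def diversity(adj, recommended):
    """Average dissimilarity over all pairs (needs at least two items)."""
    pairs = list(combinations(recommended, 2))
    return sum(1.0 - neighbour_jaccard(adj, a, b) for a, b in pairs) / len(pairs)


recs = ["Twitter-API", "Microsoft-Bing-API"]
ser = serendipity(ADJ, "Alfredo", recs)
div = diversity(ADJ, recs)
```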

Page 18: Personalised Access to Linked Data

Trade-off: Serendipity, Diversity and Accuracy

• higher serendipity leads to lower precision and higher recall
• optimal results @top 5-10

[Figure: precision/recall vs. serendipity at @5, @10, @15, @20. Axes: precision (0.00-0.30), recall (0.73-0.82), serendipity (3.00-3.50).]

• higher diversity leads to lower precision and higher recall
• optimal results @top 5-10

[Figure: precision/recall vs. diversity at @5, @10, @15, @20. Axes: precision (0.00-0.30), recall (0.73-0.82), diversity (0.818-0.834).]

Page 20: Personalised Access to Linked Data

Conclusion

• Method for personalised access to Linked Data
  - recommendations based on the collaborative filtering technique
• Considered aspects:
  - resources' commonalities
  - resources' informativeness
  - resources' connectivity
• Validated on a dataset from the Web services domain
  - Linked Web APIs dataset
• Future work:
  - consider other multi-domain datasets
  - automatic determination of optimal resource context distances
  - publish the Linked Web APIs dataset to the LOD cloud

Page 21: Personalised Access to Linked Data

Feedback

Thank you! Questions, comments, ideas?

Milan Dojchinovski [email protected]
@m1ci http://dojchinovski.mk
