Automatic Selection of Linked Open Data features in Graph-based Recommender Systems

Cataldo Musto, Pierpaolo Basile, Marco de Gemmis, Pasquale Lops, Giovanni Semeraro, Simone Rutigliano (Università degli Studi di Bari ‘Aldo Moro’, Italy - SWAP Research Group)

CBRecSys 2015 Workshop on New Trends in Content-based Recommender Systems, Vienna (Austria), September 20, 2015



Outline

• Basics
  • Linked Open Data
  • Graph-based Recommendations
  • PageRank with Priors
• Methodology
  • Introducing LOD-based features
  • Selecting LOD-based features
• Experiments
  • Impact of LOD-based features
  • Comparison to baselines
  • Trade-off F1/Diversity
• Conclusions

Cataldo Musto, Pierpaolo Basile, Marco de Gemmis, Pasquale Lops, Giovanni Semeraro, Simone Rutigliano. Automatic Selection of Linked Open Data features in Graph-based Recommender Systems. CBRecSys 2015 Workshop, Vienna, 20.09.2015

Linked Open Data

What are you talking about? (*)

(*) basic Italian gesture

Definition: a methodology to publish, share, and link structured data on the Web.

Linked Open Data (cloud): What is it?

A (large) set of interconnected semantic datasets.

Linked Open Data (cloud): What kind of datasets?

[Figure: LOD cloud diagram] Each bubble is a dataset!

Linked Open Data (cloud): How much data?

• 1,048 datasets and 58 billion triples (slide from the CBRecSys 2014 presentation; source: http://stats.lod2.eu)
• 3,426 datasets and 86 billion triples today! (source: http://stats.lod2.eu)

Linked Open Data (cloud): DBpedia

DBpedia (http://dbpedia.org) is the structured mapping of Wikipedia. It is the core of the LOD cloud.

Linked Open Data (cloud): Example of unstructured content from Wikipedia

[Figure: example Wikipedia page]

Linked Open Data (cloud): How are these data represented?

[Figure: RDF graph for the resource "The Matrix". Each edge is a property that links the film to another resource.]

• dcterms:subject → Films shot in Australia (http://dbpedia.org/resource/Category:Films_shot_in_Australia)
• dcterms:subject → Dystopian Films (http://dbpedia.org/resource/Dystopian_FIlms)
• dcterms:subject → 1999 Films (http://dbpedia.org/resource/1999_FIlms)
• dcterms:subject → American Action Thriller Films (http://dbpedia.org/resource/American_Action_Thriller_FIlms)
• dcterms:subject → Cyberpunk Films (http://dbpedia.org/resource/Cyberpunk_Films)
• dcterms:subject → Films about Rebellions (http://dbpedia.org/resource/Films_About_Rebellions)
• dbpedia-owl:musicComposer → Don Davis (http://dbpedia.org/resource/Don_Davis_(composer))
• dbpedia-owl:director → The Wachowskis (http://dbpedia.org/resource/The_Wachowskis)

Information from the LOD cloud is represented in RDF.

[Figure: Semantic Web cake]

[Figure: the RDF graph for "The Matrix" again, now also showing a literal-valued property: dbo:runtime → 136]

Several interesting (non-trivial) features come into play!
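As a rough illustration, the statements shown in the graph for "The Matrix" can be written as subject-property-object triples and grouped by property. This is only a sketch: the triples are read off the slide, labels are used instead of full URIs, and the helper name `features_by_property` is an assumption, not something from the talk.

```python
from collections import defaultdict

# Triples read off the slide's graph for "The Matrix".
# Prefixed names and labels are kept as plain strings; no RDF library is used.
TRIPLES = [
    ("The Matrix", "dcterms:subject", "Films shot in Australia"),
    ("The Matrix", "dcterms:subject", "Dystopian Films"),
    ("The Matrix", "dcterms:subject", "Cyberpunk Films"),
    ("The Matrix", "dcterms:subject", "American Action Thriller Films"),
    ("The Matrix", "dcterms:subject", "Films about Rebellions"),
    ("The Matrix", "dbpedia-owl:musicComposer", "Don Davis"),
    ("The Matrix", "dbpedia-owl:director", "The Wachowskis"),
    ("The Matrix", "dbo:runtime", "136"),
]


def features_by_property(triples):
    """Group the objects of each property for every subject (hypothetical helper)."""
    grouped = defaultdict(lambda: defaultdict(list))
    for subject, prop, obj in triples:
        grouped[subject][prop].append(obj)
    return grouped


features = features_by_property(TRIPLES)
for prop in sorted(features["The Matrix"]):
    print(prop, "->", features["The Matrix"][prop])
```

Each property becomes a candidate feature of the item, which is what makes the later question of selecting properties meaningful.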

Research Question

Can we use Linked Open Data for Recommender Systems?

Contribution

We investigate the impact of injecting exogenous knowledge from the LOD cloud into graph-based recommender systems.

Why graphs?

First, because the LOD cloud is a graph: graphs are the most straightforward representation for LOD-based data points.

Second, graphs can easily model the recommendation task.

Graph-based RecSys

[Figure: bipartite graph connecting users u1-u4 to items i1-i4]

• Users = nodes
• Items = nodes
• Preferences = edges

A very intuitive representation!

Recommendations are obtained by identifying the most relevant (item) nodes for a target user, according to the graph topology.

How can we obtain such information?
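The bipartite representation described above can be sketched as an adjacency map. The edges below are invented for illustration, since the exact preferences drawn on the slide cannot be recovered from the transcript; `build_bipartite_graph` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical preferences: the figure shows users u1-u4 and items i1-i4,
# but the actual edges are not recoverable, so these pairs are invented.
PREFERENCES = [
    ("u1", "i1"), ("u1", "i4"),
    ("u2", "i2"), ("u2", "i4"),
    ("u3", "i3"),
    ("u4", "i3"), ("u4", "i4"),
]


def build_bipartite_graph(preferences):
    """Return an undirected adjacency map: users and items are nodes,
    and each preference is an edge between a user and an item."""
    graph = defaultdict(set)
    for user, item in preferences:
        graph[user].add(item)
        graph[item].add(user)
    return graph


graph = build_bipartite_graph(PREFERENCES)
degree = {node: len(neighbors) for node, neighbors in graph.items()}
print(degree)
```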

Graph-based Recommendations: PageRank

PageRank calculates the ‘importance’ of a node according to the quality and the number of its connections.
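A minimal sketch of classic PageRank by power iteration may help make this concrete. The graph, the damping factor of 0.85, and the fixed number of iterations are assumptions for illustration and are not taken from the talk; the sketch also assumes every node has at least one neighbor.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Classic PageRank by power iteration with a uniform teleport vector.

    Assumes every node has at least one neighbor (no dangling nodes).
    """
    nodes = sorted(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # evenly distributed start
    for _ in range(iterations):
        # Teleport step: every node receives the same share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        # Walk step: each node passes its rank on to its neighbors.
        for node in nodes:
            share = damping * rank[node] / len(graph[node])
            for neighbor in graph[node]:
                new_rank[neighbor] += share
        rank = new_rank
    return rank


# Hypothetical undirected user-item graph (edges listed in both directions).
GRAPH = {
    "u1": ["i1", "i4"],
    "u2": ["i2", "i4"],
    "u3": ["i3"],
    "u4": ["i3", "i4"],
    "i1": ["u1"],
    "i2": ["u2"],
    "i3": ["u3", "u4"],
    "i4": ["u1", "u2", "u4"],
}

scores = pagerank(GRAPH)
items = sorted((n for n in GRAPH if n.startswith("i")), key=scores.get, reverse=True)
print(items)
```

Note that `pagerank` takes no user as input, so the resulting item ranking is the same for everyone.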

Graph-based RecSys

[Figure: the bipartite user-item graph, with each of the 8 nodes labeled with probability 1/8]

It is likely that i4 is suggested to u3, since it is more (and better) connected in the graph.

Issue: classic PageRank is not personalized!

When PageRank is run, all the nodes are assigned an evenly distributed probability (1/8 each in the example). As a result, all users are provided with the same ranking, which is independent of their preferences.

Reference: Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: bringing order to the Web.

Solution: PageRank with Priors

Graph-based Recommendations: PageRank with Priors

Rationale: random walks have to be influenced by previous user behavior (preferences!). The probability cannot be evenly distributed.

Reference: T. H. Haveliwala. Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search. IEEE Trans. Knowl. Data Eng., 15(4):784–796, 2003.

PageRank with Priors

A large probability (e.g. 80%) is assigned a priori to specific items, e.g., the items a user liked!

[Figure: the same graph, where two nodes have prior probability 3/8 each and the other six nodes have 0.33/8 each]

PageRank is then run, and its calculations are influenced by this different distribution of a priori probabilities.
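The idea can be sketched as follows, under stated assumptions: 80% of the prior mass is split among the items the target user liked, the remaining 20% is spread evenly over all other nodes, and the priors replace the uniform teleport vector. The graph, the damping factor, and the function names are illustrative and not taken from the authors' implementation.

```python
def build_priors(nodes, liked_items, liked_mass=0.8):
    """Give `liked_mass` of the prior probability to the liked items and
    spread the remainder evenly over all other nodes.

    Assumes `liked_items` is non-empty and does not cover every node.
    """
    liked = set(liked_items)
    others = [node for node in nodes if node not in liked]
    priors = {node: liked_mass / len(liked) for node in liked}
    priors.update({node: (1.0 - liked_mass) / len(others) for node in others})
    return priors


def pagerank_with_priors(graph, priors, damping=0.85, iterations=100):
    """Power iteration where the teleport step follows `priors`
    instead of a uniform distribution."""
    rank = dict(priors)
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) * priors[node] for node in graph}
        for node, neighbors in graph.items():
            share = damping * rank[node] / len(neighbors)
            for neighbor in neighbors:
                new_rank[neighbor] += share
        rank = new_rank
    return rank


# Hypothetical undirected user-item graph (edges listed in both directions).
GRAPH = {
    "u1": ["i1", "i4"], "u2": ["i2", "i4"], "u3": ["i3"], "u4": ["i3", "i4"],
    "i1": ["u1"], "i2": ["u2"], "i3": ["u3", "u4"], "i4": ["u1", "u2", "u4"],
}

priors_u3 = build_priors(GRAPH, liked_items=["i3"])
priors_u1 = build_priors(GRAPH, liked_items=["i1", "i4"])
scores_u3 = pagerank_with_priors(GRAPH, priors_u3)
scores_u1 = pagerank_with_priors(GRAPH, priors_u1)
print(round(scores_u3["i3"], 3), round(scores_u1["i3"], 3))
```

Unlike the classic version, the two users now obtain different scores for the same item, because each run starts from that user's own preferences.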

PageRank with Priors is a very good recommendation model. We want to exploit Linked Open Data to further improve it.

Reference: Musto, C., Basile, P., Lops, P., de Gemmis, M., & Semeraro, G. (2014). Linked Open Data-enabled Strategies for Top-N Recommendations. CBRecSys 2014.

Methodology

Graphs provide a uniform and solid representation to model collaborative data points (user preferences) as well as LOD-based ones.

If we are able to map the items in the dataset to the entities in the LOD cloud, our representation can be extended with new data points.

Methodology: example, original representation

[Figure: graph with user nodes u1-u4 and item node i4]

Methodology: example, LOD-boosted representation

[Figure: the same graph, where item nodes are additionally linked to LOD resources through properties]

• dcterms:subject → Films about Rebellions (http://dbpedia.org/resource/Films_About_Rebellions)
• dcterms:subject → 1999 films (http://dbpedia.org/resource/1999_films)
• dbprop:director → Quentin Tarantino

A lot of new information can be injected into the graph, and PageRank with Priors can be run again on this novel representation.

• How do the recommendations change with such a new topology?
• Is there any significant increase in terms of computational load?
• Is it necessary to inject all of the properties available in the LOD cloud?

[Figure: the LOD-boosted graph with some of the property edges crossed out]

Is it possible to automatically select the most promising properties?
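One way to picture the LOD-boosted representation is to start from the user-item edges and add one item-resource edge per triple, optionally restricted to a subset of properties. The data and the names `build_lod_graph` and `size` below are hypothetical; this is a sketch of the idea, not the authors' code.

```python
from collections import defaultdict

# Hypothetical inputs: both the preferences and the item-to-resource triples
# are illustrative and are not taken from the authors' dataset.
PREFERENCES = [("u1", "i1"), ("u2", "i1"), ("u3", "i4"), ("u4", "i4")]
LOD_TRIPLES = [
    ("i1", "dcterms:subject", "Films about Rebellions"),
    ("i4", "dcterms:subject", "Films about Rebellions"),
    ("i4", "dcterms:subject", "1999 films"),
    ("i1", "dbprop:director", "Quentin Tarantino"),
    ("i4", "dbprop:runtime", "136"),
]


def build_lod_graph(preferences, triples, allowed_properties=None):
    """Build an undirected graph from user-item preferences, then add one
    item-resource edge per triple whose property is allowed.
    With allowed_properties=None every property is injected."""
    graph = defaultdict(set)
    for user, item in preferences:
        graph[user].add(item)
        graph[item].add(user)
    for item, prop, resource in triples:
        if allowed_properties is None or prop in allowed_properties:
            graph[item].add(resource)
            graph[resource].add(item)
    return graph


def size(graph):
    """Return (number of nodes, number of undirected edges)."""
    edges = sum(len(neighbors) for neighbors in graph.values()) // 2
    return len(graph), edges


collaborative = size(build_lod_graph(PREFERENCES, []))
all_properties = size(build_lod_graph(PREFERENCES, LOD_TRIPLES))
subjects_only = size(build_lod_graph(PREFERENCES, LOD_TRIPLES, {"dcterms:subject"}))
print(collaborative, all_properties, subjects_only)
```

Even on this toy input the graph grows with every injected property, which is why pruning properties is attractive when the full LOD cloud is involved.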

Experiments

Research Questions

1. Do graph-based recommender systems benefit from the introduction of LOD-based features?
2. Do graph-based recommender systems exploiting LOD benefit from the adoption of feature selection techniques?
3. Is there any correlation between the choice of the FS technique and the behavior of the algorithm (e.g., better diversity or better F1)?
4. How does our methodology perform with respect to state-of-the-art algorithms?

Experimental Evaluation: Description of the dataset

MovieLens 100k
• 943 users
• 1,682 movies
• 100,000 ratings
• 55.17% positive ratings
• 84.43 ratings/user (avg.)
• 48.48 ratings/item (avg.)
• 93.7% sparsity
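The sparsity figure can be checked with a short calculation. The sketch assumes 943 users, the standard size of MovieLens 100k, which also matches the 2,625 nodes (943 + 1,682) reported for graph G later in the talk; the numbers of movies and ratings are those given on the slide.

```python
def sparsity(num_users, num_items, num_ratings):
    """Fraction of the user-item matrix that holds no rating."""
    return 1.0 - num_ratings / (num_users * num_items)


# 943 users is an assumption; 1,682 movies and 100,000 ratings are from the slide.
value = sparsity(num_users=943, num_items=1682, num_ratings=100_000)
print(f"{value:.1%}")  # prints 93.7%
```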

Experimental Evaluation: DBpedia mapping

1,600 movies (95%) were successfully mapped to DBpedia by querying a SPARQL endpoint with the title of the movie.

60 properties were extracted from the LOD cloud.
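The slides do not show the query that was used, so the following is only a guess at what a title lookup could look like. It builds the query text without contacting any endpoint; matching on `rdfs:label` and restricting to `dbo:Film` are assumptions, and `build_title_query` is a hypothetical helper.

```python
def build_title_query(title, language="en"):
    """Return a SPARQL query that looks up films whose label equals `title`.

    The matching strategy (exact rdfs:label, type dbo:Film) is an assumption.
    """
    # Escape backslashes and double quotes so the title is a valid literal.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT DISTINCT ?film WHERE {\n"
        "  ?film a dbo:Film ;\n"
        f'        rdfs:label "{escaped}"@{language} .\n'
        "}\n"
        "LIMIT 1"
    )


query = build_title_query("The Matrix")
print(query)
```

Exact label matching is brittle (year suffixes, articles, alternative titles), which may be one reason why a small share of the movies could not be mapped.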

Experimental Evaluation: Most Popular LOD properties

Top-10 properties in the Movie domain:
• dcterms:subject
• dbprop:starring
• dbprop:producer
• dbprop:title
• dbprop:writer
• dbprop:country
• dbprop:distributor
• dbprop:music
• dbprop:director
• dbprop:runtime

Experimental Evaluation: Experimental Protocol

• Algorithm: PageRank with Priors
• Data Split: 5-fold Cross Validation
• Graph Representation: G, G_LOD, G_LOD+FS
• Feature Selection Techniques: PageRank, Chi-Square, Information Gain, Gain Ratio, mRMR, PCA, SVM
• #Selected Features: 10, 30, 50 (out of 60 overall features)
• Evaluation Metrics: F1, Intra-List Diversity, Run Time
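For reference, F1 at a cutoff k can be computed as below. This uses the textbook definitions of precision and recall on the top-k list; the slides do not spell out the exact evaluation details, and the ranked list and test items are invented.

```python
def f1_at_k(recommended, relevant, k):
    """F1 of the top-k recommendations against the set of relevant items."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(top_k)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ranked list and held-out positive items for one user.
ranked = ["i4", "i2", "i7", "i1", "i9"]
liked_in_test = {"i4", "i1", "i3"}
score = f1_at_k(ranked, liked_in_test, k=5)
print(round(score, 3))
```

Here 2 of the 5 recommended items are relevant (precision 0.4) and 2 of the 3 relevant items are retrieved (recall about 0.67), which gives an F1 of 0.5.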

Experimental Evaluation: Graph Representations (Recap)

• G: basic graph with collaborative data points
• G_LOD: graph extended with all the properties gathered from the LOD cloud
• G_LOD+FS: graph encoding only the most relevant properties selected by a feature selection technique FS
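The step from G_LOD to G_LOD+FS can be sketched as "score every property, keep the top k, drop the triples of all the others". The scores below are invented placeholders standing in for the output of any of the feature selection techniques; the helper names and the triples are hypothetical as well.

```python
def select_properties(scores, k):
    """Return the k properties with the highest scores (ties broken by name)."""
    ranked = sorted(scores, key=lambda prop: (-scores[prop], prop))
    return set(ranked[:k])


def filter_triples(triples, selected):
    """Keep only the (item, property, resource) triples of selected properties."""
    return [triple for triple in triples if triple[1] in selected]


# Placeholder scores: in practice they would come from a feature selection
# technique such as Information Gain; these numbers are invented.
SCORES = {
    "dcterms:subject": 0.9,
    "dbprop:director": 0.6,
    "dbprop:starring": 0.5,
    "dbprop:runtime": 0.1,
}
TRIPLES = [
    ("i1", "dcterms:subject", "1999 films"),
    ("i1", "dbprop:director", "Quentin Tarantino"),
    ("i1", "dbprop:runtime", "136"),
    ("i4", "dcterms:subject", "Films about Rebellions"),
]

selected = select_properties(SCORES, k=2)
kept = filter_triples(TRIPLES, selected)
print(sorted(selected), len(kept))
```

In the experiments k takes the values 10, 30, and 50 out of 60 properties; the toy example uses k=2 out of 4 only to keep the output readable.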

Experiment 1: Impact of LOD-based features

Impact of LOD-based features :: F1-measure

[Figure: bar chart comparing G and G_LOD on F1@5, F1@10, and F1@15; axis from 53 to 61. Bar values as extracted from the chart: 59.63, 60.83, 54.24, 59.41, 60.23, 53.89]

Significant improvement in all the metrics (Wilcoxon test).

Impact of LOD-based features :: Run Time

[Figure: bar chart of run time in minutes; G = 72, G_LOD = 880]

Tremendous increase in the run time, as well.

Impact of LOD-based features :: Graph Representation

         Nodes     Edges
G        2,625     100,000
G_LOD    53,794    178,020

The tremendous increase in the run time depends on the amount of information encoded in the graph.
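To put these figures side by side, the growth factors can be computed directly. The node and edge counts are from the table; the run times (72 and 880 minutes) are read from the run-time chart, taking the larger value as G_LOD, which is an assumption consistent with the reported increase.

```python
# Node and edge counts are from the slide's table; run times are read from
# the chart, assuming the larger value belongs to G_LOD.
G = {"nodes": 2_625, "edges": 100_000, "run_time_min": 72}
G_LOD = {"nodes": 53_794, "edges": 178_020, "run_time_min": 880}

ratios = {key: G_LOD[key] / G[key] for key in G}
for key, ratio in ratios.items():
    print(f"{key}: x{ratio:.1f}")
```

Under these assumptions the run time grows roughly twelvefold, while the number of nodes grows about twentyfold and the number of edges less than twofold.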

Impact of LOD-based features :: LESSONS LEARNED

1. F1: LOD features can have a good impact on the overall F1.
2. Run Time: tremendous increase in the run time.

Experiment 2

67

Impact of Feature Selection techniques

Cataldo Musto, Pierpaolo Basile, Marco de Gemmis, Pasquale Lops, Giovanni Semeraro, Simone Rutigliano. Automatic Selection of Linked Open features in Graph-based Recommender Systems. CBRecSys 2015 Workshop, Vienna, 20.09.2015

PageRank

mRMR

Chi-Square

SVM

Gain Ratio

Inf. Gain

PCA

103050103050103050103050103050103050103050

53 53,5 54 54,5 5554,31

54,12

54,06

54,21

54,2

54,21

54,12

54,13

53,96

53,98

54,13

54,19

54,29

54,29

54,06

53,97

53,72

53,82

54,14

53,97

54,18

Experiment 2

68

Impact of Feature Selection techniques :: F1@5

Typically, the larger the number of features, the better the F1


Best-performing number of features per technique: #50 for five of the seven techniques, #30 for the other two


VERY IMPORTANT: we noted a dataset-dependent behavior


How does it perform with respect to the baseline?


GLOD (baseline) = 54.24


Only three out of seven techniques (and only with 50 features) outperform the baseline


Overall, PCA is the best-performing feature selection technique

Impact of Feature Selection techniques :: Run Time

[Bar chart: Run Time (min.), axis from 50 to 900: GLOD = 880, GLOD+PCA = 581]

33.9% decrease

Graph      Nodes    Edges
GLOD       53,794   178,020
GLOD+PCA   49,158   169,405

-8.6% nodes and -4.8% edges
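The reductions can be re-derived from the reported figures with a few lines of Python. The function name `relative_change` is only illustrative; the inputs are the node, edge and run time values given for GLOD and GLOD+PCA.

```python
def relative_change(before, after):
    """Percentage change from `before` to `after` (negative means a reduction)."""
    return (after - before) / before * 100


nodes = relative_change(53_794, 49_158)
edges = relative_change(178_020, 169_405)
run_time = relative_change(880, 581)  # minutes, GLOD vs. GLOD+PCA

print(f"nodes {nodes:.1f}%, edges {edges:.1f}%, run time {run_time:.1f}%")
```

A reduction of less than 9% in nodes and 5% in edges therefore comes with a run time reduction of roughly 34% (shown as 33.9% on the run time slide).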


Recap of the Experiment

                  G        GLOD     GLOD+PCA
F1@5              0.5406   0.5424   0.5431
F1@10             0.6068   0.6083   0.6088
F1@15             0.5956   0.5963   0.5970
Run Time (min.)   72       880      581
LOD Properties    0        60       50

The adoption of Feature Selection Techniques improves F1 and decreases run time
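For readers less familiar with the metric, the following Python sketch shows the standard definition of F1@k as the harmonic mean of precision@k and recall@k. The slides do not spell out the evaluation protocol, so the function and the toy data are assumptions for illustration only.

```python
def f1_at_k(ranked_items, relevant_items, k):
    """Harmonic mean of precision@k and recall@k for one user."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    if hits == 0:
        return 0.0
    precision = hits / len(top_k)
    recall = hits / len(relevant_items)
    return 2 * precision * recall / (precision + recall)


# Toy example (assumption): 2 of the 3 relevant items appear in the top 5.
ranked = ["a", "b", "c", "d", "e", "f"]
relevant = {"a", "c", "x"}
score = f1_at_k(ranked, relevant, k=5)
print(round(score, 4))
```

Here precision@5 is 2/5 and recall@5 is 2/3, which gives F1@5 = 0.5. System-level figures such as those in the recap table would typically be averages of this value over all users.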


Impact of Feature Selection techniques :: LESSONS LEARNED

1. Feature Selection techniques are useful, but they do not always improve F1
2. Computational load drops, as expected
3. The optimal number of features is dataset-dependent


Experiment 3


Trade-off between F1 and diversity


Can the choice of the feature selection technique endogenously induce a higher diversity (or, respectively, a higher F1) of the recommendations?
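The slides do not state how diversity is measured. One common choice, used here purely as an assumption, is intra-list diversity: the average pairwise distance between the recommended items, with Jaccard distance computed on the items' feature sets. The function names and the toy features are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two feature sets."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


def intra_list_diversity(recommended, item_features):
    """Average pairwise Jaccard distance over a recommendation list."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0
    total = sum(jaccard_distance(item_features[x], item_features[y]) for x, y in pairs)
    return total / len(pairs)


# Toy features (assumption): two near-identical items and one different item.
item_features = {
    "A": {"sci-fi", "action"},
    "B": {"sci-fi", "action"},
    "C": {"drama"},
}
homogeneous = intra_list_diversity(["A", "B"], item_features)
mixed = intra_list_diversity(["A", "B", "C"], item_features)
print(homogeneous, round(mixed, 3))
```

A list made of similar items scores 0, while mixing in a dissimilar item raises the score, which is the kind of effect the experiment checks for each feature selection technique.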


Trade-off between F1 and diversity :: F1@5

Feature Selection techniques can be split into four classes


• Low Diversity, Low Accuracy
• Low Diversity, High Accuracy
• High Diversity, Low Accuracy
• High Diversity, High Accuracy


mRMR, Information Gain and Chi-Square are not useful


PCA maximizes F1, at the expense of a little diversity


Gain Ratio and SVM sacrifice F1 to induce a higher diversity


PageRank obtains a good compromise between F1 and Diversity


Trade-off between F1 and Diversity :: LESSONS LEARNED

1. Feature Selection techniques can maximize a specific evaluation metric, so a graph-based recommender system can be tuned according to the requirements of a scenario
2. This behavior needs to be generalized by analyzing different datasets

Experiment 4


Comparison to State of the Art

• BPRMF (Bayesian Personalized Ranking) [+]
• U2U-KNN (User to User CF)
• I2I-KNN (Item to Item CF)
• POPULAR (Popularity-based baseline)

[+] S. Rendle, C. Freudenthaler, Z. Gantner, L. Schmidt-Thieme. BPR: Bayesian Personalized Ranking from Implicit Feedback. UAI 2009.

[Bar chart: F1@5 and F1@10 (axis from 50 to 66) for I2I-KNN, U2U-KNN, BPRMF, POPULAR and PR (GLOD+PCA). PR (GLOD+PCA) reaches 54.31 (F1@5) and 60.88 (F1@10); the four baselines range from 50.22 to 52.2 (F1@5) and from 58.35 to 59.7 (F1@10)]


PageRank with Priors boosted with LOD is the best-performing approach


Even state-of-the-art approaches based on Matrix Factorization are outperformed by our methodology

Conclusions


Recap


Methodology

1. PageRank with Priors as base algorithm
2. Mapping of the items to nodes in the Linked Open Data cloud
3. Expansion of the data points and injection of new nodes and edges
4. Use of feature selection to automatically select the most promising properties
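The base algorithm of the methodology can be sketched in plain Python as a power iteration in which the random surfer teleports according to a user-specific prior instead of a uniform distribution. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is treated as undirected, the whole prior mass is placed on the single item the target user liked, and the damping factor of 0.85, the function name and the toy graph are all hypothetical.

```python
def pagerank_with_priors(edges, priors, damping=0.85, iterations=50):
    """Power iteration for PageRank with a non-uniform teleport (prior) vector."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    total = sum(priors.get(n, 0.0) for n in neighbors)
    if total <= 0:
        raise ValueError("at least one graph node needs a positive prior")
    prior = {n: priors.get(n, 0.0) / total for n in neighbors}
    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {}
        for n in neighbors:
            # Each neighbor spreads its rank evenly over its own neighbors.
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            new_rank[n] = (1 - damping) * prior[n] + damping * incoming
        rank = new_rank
    return rank


# Toy graph (assumption): u1 liked Matrix; a LOD node links Matrix to Cloud_Atlas.
edges = [
    ("u1", "Matrix"),
    ("Matrix", "The_Wachowskis"),
    ("Cloud_Atlas", "The_Wachowskis"),
    ("u2", "Cloud_Atlas"),
    ("u2", "Titanic"),
]
rank = pagerank_with_priors(edges, priors={"Matrix": 1.0})
print(rank["Cloud_Atlas"] > rank["Titanic"])
```

In this toy graph the LOD node `The_Wachowskis` is the only path from the liked item to `Cloud_Atlas`, so the injected edge is what lets that item rank above `Titanic` for user u1.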

INVESTIGATION ABOUT THE EFFECTIVENESS OF LINKED OPEN DATA IN GRAPH-BASED RECOMMENDER SYSTEMS

Lessons Learned


Evaluation

1. PageRank with Priors benefits from the injection of data points coming from the LOD cloud
2. Feature Selection techniques improve the results but need to be properly tuned, since their usage is not always useful
3. A significant connection exists between the choice of the feature selection technique and the maximization of a specific evaluation metric
4. PageRank with Priors boosted with LOD significantly outperforms state-of-the-art approaches


Future Research


Evaluation against different datasets and stronger baselines;

Further expansion of the graph, by introducing more LOD-based data points

Evaluation of Novelty and Serendipity on LOD-based Recommendations;


Questions?

Cataldo Musto, Ph.D.

[email protected]