EC-Web 2014 – The 15th International Conference on Electronic Commerce and Web Technologies, September 1-4, 2014, Munich, Germany
A Linked Data Recommender System using a Neighborhood-based Graph Kernel
Vito Claudio Ostuni, Tommaso Di Noia, Roberto Mirizzi*, Eugenio Di Sciascio
{vitoclaudio.ostuni, tommaso.dinoia, eugenio.disciascio}@poliba.it, [email protected]
Polytechnic University of Bari - Bari (ITALY)
(*) Yahoo! Sunnyvale, CA (US)
Outline
Introduction and motivation
The proposed approach
Experimental Evaluation
Contributions and Conclusion
Recommender Systems
Help users in dealing with Information/Choice Overload
Content-based RSs
A definition: CB-RSs try to recommend items similar* to those a given user has liked in the past [P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the Art and Trends. Recommender Systems Handbook.]
(*) similar from a content-based perspective
Model-based approach:
• Feature vector about item content description
• Learn a predictive user model from past user preferences
[Figure: example movies (Heat, Argo, The Godfather, Righteous Kill) with genre labels (drama, action).]
Motivation
Traditional Content-based Recommender Systems:
• are based on keyword/attribute-based item representations
• rely on the quality of the content analyzer to extract expressive item features
• lack knowledge about the items
Proposal: use Linked Open Data to obtain knowledge about the items and richer item representations
Linked Open Data
• Initiative for publishing and connecting data on the Web using Semantic Web technologies;
• More than 30 billion RDF triples from hundreds of data sources;
• Semantic Web done right [ http://www.w3.org/2008/Talks/0617-lod-tbl/#(3) ]
Graph-based Item Representation
[Figure: DBpedia graph fragment for the movies The Godfather and American Gangster. "subject" edges link the movies to category nodes (Mafia_films, Gangster_films, Films_about_organized_crime_in_the_United_States, Best_Picture_Academy_Award_winners, Best_Thriller_Empire_Award_winners, Films_shot_in_New_York_City); "broader" edges link categories to more general ones (Films_about_organized_crime, Films_about_organized_crime_by_country, Awards_for_best_film).]
Exploit entity descriptions
h-hop Item Neighborhood Graph
[Figure: neighborhood graph of the item The Godfather. "subject" edges lead to category nodes (Mafia_films, Gangster_films, Best_Picture_Academy_Award_winners, Films_shot_in_New_York_City); "broader" edges lead one hop further to Films_about_organized_crime and Awards_for_best_film.]
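A minimal sketch of how an h-hop item neighborhood graph could be extracted, assuming the RDF data is available as (subject, predicate, object) triples. The function name `h_hop_neighborhood`, the toy triples (including the made-up `Crime_films` node), and the choice to follow only outgoing edges are assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict

# Toy triples; the exact edges are illustrative, not taken from DBpedia.
TRIPLES = [
    ("The_Godfather", "subject", "Mafia_films"),
    ("The_Godfather", "subject", "Gangster_films"),
    ("The_Godfather", "subject", "Best_Picture_Academy_Award_winners"),
    ("Mafia_films", "broader", "Films_about_organized_crime"),
    ("Gangster_films", "broader", "Films_about_organized_crime"),
    ("Best_Picture_Academy_Award_winners", "broader", "Awards_for_best_film"),
    ("Films_about_organized_crime", "broader", "Crime_films"),  # 3 hops away
]

def h_hop_neighborhood(triples, item, h):
    """Return {hop: [edges]} for hops 1..h, walking outgoing edges from `item`.

    Hop l holds the edges leaving the nodes first reached at distance l-1.
    """
    outgoing = defaultdict(list)
    for s, p, o in triples:
        outgoing[s].append((s, p, o))

    edges_by_hop = {}
    visited = {item}
    frontier = [item]
    for hop in range(1, h + 1):
        hop_edges, next_frontier = [], []
        for node in frontier:
            for s, p, o in outgoing.get(node, []):
                hop_edges.append((s, p, o))
                if o not in visited:  # expand each entity only once
                    visited.add(o)
                    next_frontier.append(o)
        edges_by_hop[hop] = hop_edges
        frontier = next_frontier
    return edges_by_hop

graph = h_hop_neighborhood(TRIPLES, "The_Godfather", h=2)
```

With h=2 the edge towards `Crime_films` is left out, because it starts from a node that is already two hops away from the item.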
Challenges
• learn the user model starting from semantic graph-based item representations (h-hop Item Neighborhood Graph)
• exploit the knowledge associated with the items
Proposed Approach
• define an appropriate kernel on graph-based item representations
• use kernel methods for learning the user model
Kernel Methods
Work by embedding data in a vector space and looking for linear patterns in that space:
$x \rightarrow \phi(x)$
[Figure: the map $\phi$ from the input space to the feature space.]
We can work in the new space $F$ by specifying an inner product function between points in it:
$k(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$
[Kernel Methods for General Pattern Analysis. Nello Cristianini. http://www.kernel-methods.net/tutorials/KMtalk.pdf]
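As a small illustration of the inner-product identity, the sketch below compares an explicit feature map with the corresponding kernel function. The degree-2 polynomial kernel on 2-dimensional inputs is a textbook example chosen only for illustration; it is not the kernel used in this work, and the names `phi` and `k` simply mirror the symbols on the slide.

```python
import math

def dot(u, v):
    """Plain inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map of the homogeneous degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def k(x, y):
    """The same kernel evaluated in the input space: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2

x_i, x_j = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x_i), phi(x_j))  # inner product in the feature space
implicit = k(x_i, x_j)              # no feature vectors are built
```

Both values agree up to floating-point rounding, which is what allows a learner to work in the feature space through `k` alone.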
h-hop Item Neighborhood Graph Kernel
Explicit computation of the feature map
[Formulas: feature map of the item neighborhood graph and entity weights, with the following annotations.]
• entity weight: entity importance in the item neigh. graph
• frequency of the entity in the item neigh. graph: # edges involving em at l hop from i
• proportional factor taking into account at which hop the entity appears
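One way to read these annotations is as a weighted count: for each entity, count the edges that involve it at each hop and scale each count by a hop-dependent factor. The sketch below follows that reading. The summation over hops, the names `alpha`, `entity_weights` and `neighborhood_kernel`, the exclusion of the item itself from the features, and the toy graphs are all assumptions made for illustration; the formula on the original slide may differ.

```python
from collections import Counter

def entity_weights(edges_by_hop, alpha, item):
    """Assumed weight of entity e: sum over hops l of alpha[l] * (#edges at hop l involving e)."""
    weights = Counter()
    for hop, edges in edges_by_hop.items():
        for s, _p, o in edges:
            for entity in {s, o}:      # an edge involves both of its endpoints
                if entity != item:     # assumption: the item itself is not a feature
                    weights[entity] += alpha[hop]
    return weights

def neighborhood_kernel(w_i, w_j):
    """Inner product of two explicit (sparse) feature vectors."""
    return sum(w * w_j[e] for e, w in w_i.items() if e in w_j)

# Toy 2-hop neighborhood graphs, keyed by hop (hypothetical edges).
graph_i = {1: [("i", "p2", "e1"), ("i", "p3", "e2")],
           2: [("e1", "p3", "e4"), ("e2", "p3", "e4")]}
graph_j = {1: [("j", "p2", "e1")],
           2: [("e1", "p3", "e4"), ("e1", "p3", "e5")]}
alpha = {1: 1.0, 2: 0.5}  # hypothetical hop factors

w_i = entity_weights(graph_i, alpha, item="i")
w_j = entity_weights(graph_j, alpha, item="j")
similarity = neighborhood_kernel(w_i, w_j)
```

In this toy case `e4` is never directly linked to item `i`, yet it receives a non-zero weight because two second-hop edges involve it.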
Weights computation example
[Figure: example neighborhood graph with h=2 for item i, with entities e1, e2, e4, e5 and edges labeled p2 and p3.]
An entity can be informative about the item even if it is not directly related to it
Experimental Settings
• Trained an SVM regression model for each user
• Accuracy evaluation: Precision, Recall, MRR (Rated Test Items protocol)
• Novelty evaluation: Entropy-based Novelty (All Items protocol) [the lower the better]
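The accuracy metrics can be sketched as follows for ranked recommendation lists. The function names, the binary notion of relevance, and the toy data are assumptions for illustration; the Rated Test Items and All Items protocols decide which candidate items get ranked and are not modeled here.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)

def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def mean_reciprocal_rank(per_user):
    """Average reciprocal rank over (ranked, relevant) pairs, one pair per user."""
    return sum(reciprocal_rank(r, rel) for r, rel in per_user) / len(per_user)

# Toy data for two hypothetical users.
user_a = (["Heat", "Argo", "The Godfather", "Righteous Kill"], {"Argo", "The Godfather"})
user_b = (["Argo", "Heat"], {"Argo"})

p_at_2 = precision_at_k(*user_a, k=2)
r_at_2 = recall_at_k(*user_a, k=2)
mrr = mean_reciprocal_rank([user_a, user_b])
```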
Dataset
• Subset of MovieLens mapped to DBpedia: 6,040 users, 3,148 movies
• Mappings of various recsys datasets to DBpedia: http://sisinflab.poliba.it/semanticweb/lod/recsys/datasets/
• Three different train/test splits: 20/80, 40/60, 80/20
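A per-user split with a given training fraction could look like the sketch below. How the original splits were drawn is not stated here, so the random per-user split, the function name `split_user_ratings`, the fixed seed, and the toy ratings are assumptions.

```python
import random

def split_user_ratings(ratings, train_fraction, seed=0):
    """Randomly split one user's (item, rating) pairs into train and test lists."""
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Ten toy ratings for one hypothetical user.
ratings = [(f"movie_{n}", 1 + n % 5) for n in range(10)]

splits = {
    name: split_user_ratings(ratings, fraction)
    for name, fraction in [("20/80", 0.2), ("40/60", 0.4), ("80/20", 0.8)]
}
```

The 20/80 split leaves each user with very few training ratings, so it probes behavior when little is known about the user.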
Kernel calibration – impact of alpha params. (i)
[Figure: Prec@10 for the 20/80, 40/60 and 80/20 splits; x-axis values 0.25, 1, 2, 5, 10, 20; y-axis from 0.5 to 0.75.]
Kernel calibration – impact of alpha params. (ii)
[Figure: EBN@10 for the 20/80, 40/60 and 80/20 splits; x-axis values 0.25, 1, 2, 5, 10, 20; y-axis from 0 to 1.2.]
Comparative approaches
• NB: 1-hop item neighborhood + Naive Bayes classifier
• VSM: 1-hop item neighborhood, Vector Space Model (tf-idf) + SVM regression
• WK: 2-hop item neighborhood, walk-based kernel + SVM regression
Comparison with other approaches (i)
[Figure: Prec@10 on the 20/80, 40/60 and 80/20 splits for NK-bestPrec, NK-bestEntr, NB, VSM and WK; y-axis from 0 to 0.8.]
Comparison with other approaches (ii)
[Figure: EBN@10 on the 20/80, 40/60 and 80/20 splits for NK-bestPrec, NK-bestEntr, NB, VSM and WK; y-axis from 0 to 1.8.]
Contributions
A linked data RS based on kernel methods
Exploitation of semantic graph-based item descriptions from the Web of Data
Effective Item Neighborhood Graph Kernel
Combination of kernel methods and LOD-based item descriptions for model-based content-based recommendations
Future Work:
Evaluation of further kernel functions on graphs
Evaluation of different kernel methods
Q & A