Recommender Systems in the Linked Data Era
DESCRIPTION
The ultimate goal of a recommender system is to suggest interesting, non-obvious items (e.g., products to buy, people to connect with, movies to watch) to users, based on their preferences. The advent of the Linked Open Data (LOD) initiative in the Semantic Web gave birth to a variety of open knowledge bases freely accessible on the Web. If properly exploited, they provide a valuable source of information that can improve conventional recommender systems. Here I present several approaches to recommender systems that leverage Linked Data knowledge bases such as DBpedia. In particular, content-based and hybrid recommendation algorithms are discussed. For full details about the presented approaches, please refer to the full papers mentioned in this presentation.
TRANSCRIPT
Recommender Systems in the Linked Data Era
ROBERTO MIRIZZI
Outline
What is a Recommender System?
◦ A definition
◦ Types
What is Linked Data?
◦ LOD
◦ DBpedia
Some Recommender Systems (RS):
◦ A content-based RS (memory-based)
◦ A mobile content-based RS (memory-based)
◦ A content-based RS (model-based)
◦ A hybrid RS (model-based)
What is a Recommender System?
Recommender Systems in the Linked Data Era – HP Labs, Palo Alto, CA, 7/12/2013
What is a Recommender System?
Recommender Systems (RSs) are software tools and techniques providing suggestions for items to be of use to a user.
[F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors. Recommender Systems Handbook. Springer, 2011.]
Input Data:
A set of users $U = \{u_1, \dots, u_M\}$
A set of items $I = \{i_1, \dots, i_N\}$
The preference matrix $R = [r_{u,i}]$
Problem Definition:
Given user $u$ and target item $i$
Predict the preference $r_{u,i}$
[Figure: preference matrix with unknown entries marked "?"]
Recommender Systems: types
Content-based (CB): recommendations are based on the assumption that if a user liked a set of items with particular features in the past, they will likely go for items with similar characteristics
[Figure: example item features: animation, fairytale, ogre, castle]
Collaborative-filtering (CF): recommendations are based on the assumption that users with a similar history are more likely to have similar tastes/needs
Hybrid: it’s not too hard to guess what they are (a combination of CB and CF)
What is Linked Data?
What is Linked Data?
A collection of interrelated datasets on the Web
Principles:
1. Use HTTP URIs to identify things
2. Leverage standards such as RDF and SPARQL to provide information about things
3. Link related things by relationships
[http://linkeddata.org/]
DBpedia: a Nucleus for a Web of Open Data
http://dbpedia.org
DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web.
DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data.
[Auer et al., DBpedia: A Nucleus for a Web of Open Data. ISWC+ASWC 2007]
[Bizer et al., A crystallization point for the Web of Data. Journal of Web Semantics, 2009]
Recommender Systems in the Linked Data Era – HP Labs, Palo Alto, CA7/12/2013
Querying DBpedia: SPARQL
DBpedia exposes a SPARQL endpoint (http://dbpedia.org/sparql) to query the dataset.
Results can be provided in several formats (e.g., JSON, XML, N-Triples).
SPARQL is an RDF query language. Its queries consist of triple patterns, conjunctions, disjunctions and optional patterns.
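As a rough illustration of querying the endpoint, the sketch below builds an HTTP GET request carrying a SPARQL query and asks for JSON results through the Accept header. The resource IRI and the dbo:starring property are assumptions about how the movie is modelled in DBpedia, and build_request is a hypothetical helper, not part of any library.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named on the slide

# Assumed resource and property names; check them against the live dataset.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?actor WHERE {
  <http://dbpedia.org/resource/Ocean's_Eleven> dbo:starring ?actor .
}
"""


def build_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build a GET request that asks the endpoint for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(QUERY)
# To run it (needs network access):
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       rows = json.load(response)["results"]["bindings"]
```

The request is only built here, not sent, so the sketch can be read and tested offline.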
A graph of knowledge
[Figure: knowledge graph with the nodes Ocean’s Eleven, Ocean’s Twelve, George Clooney, Brad Pitt, Steven Soderbergh, Catherine Zeta-Jones, 2000s crime films, American criminal comedy films, Crime films, Crime]
Why don’t we use all this information to foster recommender systems?
[Figure: the same graph of knowledge, now with two "likes" edges added]
A content-based RS (memory-based)
The good old Vector Space Model
[http://en.wikipedia.org/wiki/File:Vector_space_model.jpg]
The Vector Space Model is an algebraic model for representing both text documents and queries as vectors of index-term weights $w_{t,d}$ that are positive and non-binary.
$$v_d = [w_{1,d}, w_{2,d}, \dots, w_{N,d}]^T$$
$$w_{t,d} = tf_{t,d} \cdot idf_t$$
$$tf_{t,d} = \frac{n_{t,d}}{\sum_k n_{k,d}}$$
$$idf_t = \log \frac{|D|}{|\{d' \in D : t \in d'\}|}$$
$$sim(d_j, q) = \frac{d_j \cdot q}{\|d_j\| \, \|q\|} = \frac{\sum_{i=1}^{N} w_{i,j} \, w_{i,q}}{\sqrt{\sum_{i=1}^{N} w_{i,j}^2} \, \sqrt{\sum_{i=1}^{N} w_{i,q}^2}}$$
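The TF-IDF weighting and cosine similarity of the Vector Space Model can be sketched in a few lines. This is a minimal illustration on a toy corpus; the documents and the function names tf_idf_vectors and cosine are made up for the example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document, with w = tf * idf."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())
        vectors.append({
            t: (n / total) * math.log(n_docs / df[t])
            for t, n in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus, for illustration only.
docs = [
    ["crime", "heist", "casino"],
    ["crime", "heist", "europe"],
    ["fairytale", "ogre", "castle"],
]
vectors = tf_idf_vectors(docs)
```

Documents that share terms get a positive similarity, while documents with no term in common get 0.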
Semantic Vector Space Model (i)
[Figure: graph around Ocean’s Eleven and Ocean’s Twelve. Nodes: George Clooney, Brad Pitt, Catherine Zeta-Jones, Steven Soderbergh, 2000s crime films, Crime films, American criminal…, Crime. Properties: starring, director, subject/broader, genre.]
Each item is expressed as a tensor in a multi-dimensional space where each dimension corresponds to a specific property of the considered datasets (e.g., starring, subject/broader, director, genre, …)
Semantic Vector Space Model (ii)
[Figure: the STARRING dimension. George Clooney [gc] (38 movies), Catherine Z. Jones [czj] (22 movies), Brad Pitt [bp] (35 movies); Ocean’s Eleven [o11] (13 actors), Ocean’s Twelve [o12] (15 actors)]
starring | George Clooney [gc] | Catherine Z. Jones [czj] | Brad Pitt [bp]
Ocean’s Eleven [o11] | $w_{gc,o11}$ | $w_{czj,o11}$ | $w_{bp,o11}$
Ocean’s Twelve [o12] | $w_{gc,o12}$ | $w_{czj,o12}$ | $w_{bp,o12}$
$$w_{actor_x, movie_y} = tf_{actor_x, movie_y} \cdot idf_{actor_x}$$
We can now compute the scalar product between the two vectors to get their similarity…
Semantic Vector Space Model (iii)
$$sim^{starring}(o12, o11) = \frac{w_{gc,o12} \, w_{gc,o11} + w_{czj,o12} \, w_{czj,o11} + w_{bp,o12} \, w_{bp,o11}}{\sqrt{w_{gc,o12}^2 + w_{czj,o12}^2 + w_{bp,o12}^2} \cdot \sqrt{w_{gc,o11}^2 + w_{czj,o11}^2 + w_{bp,o11}^2}}$$
…and then combine all the similarities for each property:
$$sim(o12, o11) = \alpha_{starring} \, sim^{starring}(o12, o11) + \alpha_{director} \, sim^{director}(o12, o11) + \alpha_{subject} \, sim^{subject}(o12, o11)$$
Soon we will see how to compute the coefficients $\alpha_p$ (one weight per property $p$).
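The per-property similarity and its weighted combination can be sketched as follows. The weights inside each property and the per-property coefficients are placeholder values chosen for illustration, and combined_similarity is a hypothetical helper name.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(item_a, item_b, alpha):
    """Weighted sum of the per-property cosine similarities."""
    return sum(
        weight * cosine(item_a.get(prop, {}), item_b.get(prop, {}))
        for prop, weight in alpha.items()
    )


# Hypothetical TF-IDF weights, grouped by property.
o11 = {"starring": {"gc": 0.4, "bp": 0.5}, "director": {"ss": 0.7}}
o12 = {"starring": {"gc": 0.4, "bp": 0.5, "czj": 0.6}, "director": {"ss": 0.7}}
alpha = {"starring": 0.5, "director": 0.5}  # placeholder coefficients
score = combined_similarity(o12, o11, alpha)
```

Keeping one sparse vector per property means each property contributes its own local similarity before the weights are applied.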
Ready for our first Content-based RS
Given a user profile, defined as:
$$profile(u) = \{\langle m_j, r_j \rangle \mid r_j = 1 \text{ if } u \text{ likes } m_j,\ r_j = -1 \text{ otherwise}\}$$
or as:
$$profile(u) = \{\langle m_j, r_j \rangle \mid r_j \dots\}$$
We predict the rating using a Nearest Neighbor Classifier (memory-based) where the similarity measure is a linear combination of local similarities:
$$\tilde{r}(u, m_i) = \frac{\sum_{m_j \in profile(u)} r_j \cdot \frac{\sum_p \alpha_p \, sim^p(m_j, m_i)}{P}}{|profile(u)|}$$
[Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, Davide Romito, Markus Zanker. Linked Open Data to support Content-based Recommender Systems. 8th International Conference on Semantic Systems (I-SEMANTICS 2012) – best paper]
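A minimal sketch of the memory-based prediction step, assuming the combined similarity between each profile movie and the target movie has already been computed. The profile, the similarity values and the name predict_rating are illustrative assumptions.

```python
def predict_rating(profile, sim_to_target):
    """Average of the profile ratings r_j, each weighted by sim(m_j, m_i).

    profile: {movie: rating}, with rating +1 (liked) or -1 (not liked).
    sim_to_target: {movie: similarity between that movie and the target}.
    """
    if not profile:
        return 0.0
    total = sum(r * sim_to_target.get(m, 0.0) for m, r in profile.items())
    return total / len(profile)


# Hypothetical profile and similarities, for illustration only.
profile = {"Ocean's Eleven": 1, "Shrek": -1}
sim_to_ocean12 = {"Ocean's Eleven": 0.8, "Shrek": 0.1}
prediction = predict_rating(profile, sim_to_ocean12)
```

A positive prediction means the target movie is closer to what the user liked than to what they disliked.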
How do we compute the $\alpha_p$ coefficients?
We need to identify the best possible values for the coefficients $\alpha_p$, that is, the weights associated with each property. There are plenty of ways to do that.
Depending on the nature of the user ratings (Likert or binary), we can treat rating prediction as a regression problem (linear regression) or as a classification problem (logistic regression), and minimize a loss function $J$.
In the former case we can minimize the least-squares loss function, and in the latter case the cross-entropy loss function. In both cases we can use gradient descent:
$$\alpha_p \leftarrow \alpha_p - \eta \, \frac{\partial J}{\partial \alpha_p}$$
where $\eta$ is the learning rate.
Another possible approach is to use a genetic algorithm to minimize a non-smooth loss function, such as the number of misclassification errors.
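For the regression case, the gradient-descent update can be sketched as below. This is a toy batch implementation on made-up data, with J taken as half the mean squared error. The learning rate, the number of epochs and the name fit_alpha are assumptions.

```python
def fit_alpha(features, targets, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on J = 1/(2n) * sum of squared errors.

    features: list of per-property similarity vectors, one per example.
    targets: the known preference for each example.
    """
    n_props = len(features[0])
    n = len(features)
    alpha = [0.0] * n_props
    for _ in range(epochs):
        gradient = [0.0] * n_props
        for x, y in zip(features, targets):
            error = sum(a * v for a, v in zip(alpha, x)) - y
            for p in range(n_props):
                gradient[p] += error * x[p] / n
        # Update rule: alpha_p <- alpha_p - eta * dJ/dalpha_p
        alpha = [a - learning_rate * g for a, g in zip(alpha, gradient)]
    return alpha


# Toy data generated from alpha = [0.7, 0.3]; values are illustrative only.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
targets = [0.7, 0.3, 0.5, 0.38]
alpha = fit_alpha(features, targets)
```

On this noise-free toy data the learned weights approach the generating values. Real rating data would also need a held-out set to judge the fit.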
A mobile content-based RS (memory-based)
Let’s go Mobile (e.g., recommend movies in theaters)
[Vito Claudio Ostuni, Giosia Gentile, Tommaso Di Noia, Roberto Mirizzi, Davide Romito, Eugenio Di Sciascio. Mobile Movie Recommendations with Linked Data. Human-Computer Interaction & Knowledge Discovery @ CD-ARES’13 (HCI-KDD 2013)]
This time the user profile is context-dependent and is defined as:
$$profile(u, cmp) = \{\langle m_j, r_j \rangle \mid r_j = 1 \text{ if } u \text{ likes } m_j \text{ with companion } cmp,\ r_j = -1 \text{ otherwise}\}$$
And the prediction is made of two parts, contextual pre-filtering and contextual post-filtering:
$$\tilde{r}(u, m_i, cmp) = \beta_{preFilter} \, \tilde{r}_{preFilter}(u, m_i, cmp) + \beta_{postFilter} \, \tilde{r}_{postFilter}(u)$$
$$\tilde{r}_{preFilter}(u, m_i, cmp) = \frac{\sum_{m_j \in profile(u, cmp)} r_j \cdot sim(m_j, m_i)}{|profile(u, cmp)|}$$
$$\tilde{r}_{postFilter}(u) = \frac{h + c + cl + ar + ap}{5}$$
h (hierarchy): 1 if the theater is in the same city, 0 otherwise
c (cluster): 1 if the theater is a multiplex, 0 otherwise
cl (co-location): 1 if the theater is close to other POIs, 0 otherwise
ar (association-rule): 1 if the ticket price is known, 0 otherwise
ap (anchor-point proximity): 1 if the theater is close to the user’s home or office, 0 otherwise
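The two parts of the contextual prediction can be sketched as follows. The post-filter score is the average of the five binary criteria listed above. The weights used to combine the two parts are placeholders, since their values are not given here, and both function names are hypothetical.

```python
def post_filter_score(h, c, cl, ar, ap):
    """Average of the five binary location criteria."""
    return (h + c + cl + ar + ap) / 5


def contextual_prediction(pre_score, post_score, w_pre=0.5, w_post=0.5):
    """Weighted combination of the two parts; the weights are placeholders."""
    return w_pre * pre_score + w_post * post_score


# A multiplex in the same city, close to the user's home, nothing else known.
post = post_filter_score(h=1, c=1, cl=0, ar=0, ap=1)
final = contextual_prediction(pre_score=0.35, post_score=post)
```

The pre-filter score passed in here stands for the nearest-neighbor prediction computed on the companion-specific profile.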
A content-based RS (model-based)
Time for a Model-based CB-RS
This time each item is represented by a feature vector, where each feature corresponds to a property value.
Property | Feature | Ocean’s Eleven [o11] | Ocean’s Twelve [o12]
starring | George Clooney [gc] | $w_{gc,o11}$ | $w_{gc,o12}$
starring | Catherine Z. Jones [czj] | $w_{czj,o11}$ | $w_{czj,o12}$
starring | Brad Pitt [bp] | $w_{bp,o11}$ | $w_{bp,o12}$
director | Steven Soderbergh [ss] | $w_{ss,o11}$ | $w_{ss,o12}$
subject | 2000s crime films [2cf] | $w_{2cf,o11}$ | $w_{2cf,o12}$
subject | Crime films [cf] | $w_{cf,o11}$ | $w_{cf,o12}$
subject | American criminal comedy [acc] | $w_{acc,o11}$ | $w_{acc,o12}$
The user profile is defined as:
$$profile(u) = \{\langle m_j, r_j \rangle \mid r_j = 1 \text{ if } u \text{ likes } m_j,\ r_j = -1 \text{ otherwise}\}$$
[Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, Davide Romito. Exploiting the Web of Data in Model-based Recommender Systems. 6th ACM Conference on Recommender Systems (RecSys 2012)]
Training the system with an SVM classifier
[https://en.wikipedia.org/wiki/File:Svm_max_sep_hyperplane_with_margin.png]
Support Vector Machines (SVMs) are known to work well for text classification. Our problem of learning the user profile has a lot in common with it, such as the sparsity of the feature vectors and the high dimensionality of the input space.
Main advantages:
1. Feature selection is often not needed (SVMs are robust to over-fitting and scale up pretty well)
2. No need to tune parameters as in the previous approach
We then fit a logistic model to SVM output to obtain a ranked list of items.
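One common way to fit a logistic model to SVM output is a sigmoid of the form 1 / (1 + exp(a * f + b)) applied to the decision value f. The sketch below assumes the parameters a and b have already been fitted elsewhere and only shows how the resulting probabilities give a ranked list. The decision values, the default parameters and the function names are made up for illustration.

```python
import math


def platt_probability(score, a, b):
    """Map an SVM decision value to a probability: 1 / (1 + exp(a*score + b))."""
    return 1.0 / (1.0 + math.exp(a * score + b))


def rank_items(decision_values, a=-1.0, b=0.0):
    """Sort items by estimated probability of being liked, best first."""
    probs = {
        item: platt_probability(score, a, b)
        for item, score in decision_values.items()
    }
    return sorted(probs.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical signed distances from the SVM hyperplane.
decision_values = {"Ocean's Twelve": 1.3, "Shrek": -0.8, "Traffic": 0.2}
ranking = rank_items(decision_values)
```

With a negative a, larger decision values map to higher probabilities, so the ranking follows the distance from the hyperplane.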
A hybrid RS (model-based)
Let’s continue with a Hybrid RS
[Vito Claudio Ostuni, Tommaso Di Noia, Eugenio Di Sciascio, Roberto Mirizzi. Top-N Recommendations from Implicit Feedback leveraging Linked Open Data. 7th ACM Conference on Recommender Systems (RecSys 2013)]
We want to recommend items i to user u, exploiting both the LOD knowledge base and other users’ interactions.
The ultimate goal of this recommender system is to rank, in the top-N positions, the items most likely to be relevant for the user, in the presence of implicit feedback.
Given the nature of the problem, the user profile is defined as:
$$profile(u) = \{ i \mid i \text{ is relevant for } u \}$$
Path-based features
We define $x_{ui} \in \mathbb{R}^D$ as the feature vector encoding all the interactions between user $u$ and item $i$. Each component of this vector represents the relevance score between $u$ and $i$ with respect to a particular feature, and is defined as:
$$x_{ui}(j) = \frac{\#path_{ui}(j)}{\sum_{d=1}^{D} \#path_{ui}(d)}$$
The paths can be content-based, collaborative or hybrid.
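The normalization of path counts into a feature vector can be sketched as below. The counts are hypothetical, and extracting the paths from the graph is assumed to have happened already.

```python
def path_features(path_counts):
    """Normalize path counts so that the components sum to 1.

    path_counts[j] is the number of paths of type j that connect
    user u and item i in the graph.
    """
    total = sum(path_counts)
    if total == 0:
        return [0.0] * len(path_counts)
    return [count / total for count in path_counts]


# Hypothetical counts for D = 4 path types between one user and one item.
x_ui = path_features([3, 0, 1, 4])
```

Returning an all-zero vector when no path connects the pair avoids a division by zero; that choice is a convention of this sketch.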
Learning the ranking function
In order to predict the ranking and form the top-N recommendation lists, we treat the task as a learning-to-rank problem and adopt a point-wise approach. In particular, we use a combination of Random Forests and Gradient Boosted Regression Trees (GBRT).
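In a point-wise approach every user-item pair is scored independently and the list is obtained by sorting the scores. The sketch below uses two stand-in scoring functions in place of trained tree ensembles and averages their outputs. The averaging rule is an assumption, since the exact way the two models are combined is not specified here.

```python
def top_n(candidates, scorers, n):
    """Point-wise ranking: score every item independently, then sort."""

    def score(features):
        # Assumption: the models are combined by averaging their predictions.
        return sum(model(features) for model in scorers) / len(scorers)

    ranked = sorted(candidates, key=lambda item: score(candidates[item]), reverse=True)
    return ranked[:n]


# Stand-ins for trained regressors (e.g. a random forest and a GBRT model).
def model_a(x):
    return 0.6 * x[0] + 0.4 * x[1]


def model_b(x):
    return max(x)


# Hypothetical path-based feature vectors for three candidate items.
candidates = {"i1": [0.9, 0.1], "i2": [0.2, 0.3], "i3": [0.5, 0.5]}
recommendations = top_n(candidates, [model_a, model_b], n=2)
```

Any object that maps a feature vector to a relevance score can be passed in as a scorer, so trained models can replace the stand-ins.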
Thank you!