DCAI 2010 presentation


DCAI 2010, September 7-10 2010, Valencia

A Recommendation System for the Semantic Web

Victor Codina and Luigi Ceccaroni

vcodina@lsi.upc.edu

Departament de Llenguatges i Sistemes Informàtics (LSI)

Universitat Politècnica de Catalunya (UPC)

Introduction Our semantic approach Evaluation Conclusions

Outline


Introduction & motivations

Our semantic approach

Evaluation

Conclusions & future work


The general personalization process


[Figure: diagram of the personalization process with two stages, USER MODELING and CONTENT ADAPTATION. Labels: USERS, user behavior, implicit feedback, explicit feedback, user satisfaction, learning algorithm, user profile, ITEMS, item representation, recommendation strategy, personalized recommendation.]


Potential benefits of using semantics


Using semantics helps reduce some limitations of current recommenders:

o Cold-start problem

• By inferring missing information from the relationships defined in domain ontologies

o Domain-dependency

• By employing standard ontology-based languages to uniformly represent information


Service-oriented architecture design


User’s interests and item representation

Ontology-based representation (weighted overlay)

[Figure: concept taxonomies overlaid with weighted user’s interests and weighted item annotations]
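
One way to picture a weighted overlay is as a mapping from taxonomy concepts to weights, with the same shape used for user interests and for item annotations. The sketch below is only an illustration under that assumption: the `PARENT` table, the example weights, the weight ranges, and the `ancestors` helper are hypothetical and not taken from the system presented here.

```python
# Hypothetical fragment of a concept taxonomy: child -> parent.
PARENT = {
    "Romance": "Genre",
    "Steamy Romance": "Romance",
}

# User profile: concept -> interest weight (assumed to lie in [-1, 1]).
user_interests = {"Romance": 0.8}

# Item representation: concept -> annotation weight (assumed to lie in [0, 1]).
item_annotations = {"Steamy Romance": 1.0}


def ancestors(concept, parent=PARENT):
    """Return the ancestors of a concept, nearest first."""
    chain = []
    while concept in parent:
        concept = parent[concept]
        chain.append(concept)
    return chain
```

Keeping both overlays on the same taxonomy lets a recommender relate an item annotation to a user interest in one of its ancestors.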


How do we take advantage of semantics?


We incorporate semantics in both stages of the personalization process to reduce the cold-start problem

o The user-profile learning algorithm employs a domain-based inference method

• It expands and enriches the user profiles with interests that cannot be directly inferred from the user feedback

o The content-based recommendation algorithm employs a taxonomy-based similarity method

• It uses the user’s interests in more general concepts related to the item’s annotations in order to refine the matching calculation


Semantically-enhanced learning algorithm

START. The user provides some feedback about an item (e.g. a purchase or rating of an item)

Step 1. Interest weights of the concepts related to the item are calculated/updated

Step 2. A domain-based inference method infers new interests from the families of concepts with updated interests

[Figure: diagram with labels User, Item, Updated, Learnt, Inferred]


The domain-based inference method

Propagation is triggered when the percentage of direct subconcepts with known interest weights exceeds a minimum threshold

Two types of propagation

o Upward-based (propagation to the parent concept)

o Sideward-based (propagation to the siblings)

[Figure: concept Sport with direct subconcepts Baseball, Basketball, Football, Tennis, Golf. Weights shown: [1.0] [1.0] [-0.5] [0.5], plus [0.5] and [ ? ]]

Upward-based threshold (UIT) = 0.6

Sideward-based threshold (SIT) = 0.9

Upward-based? Pct(subconcepts) = 4/5 = 0.8; 0.8 > UIT = 0.6 => Propagation

Sideward-based? Pct(subconcepts) = 4/5 = 0.8; 0.8 < SIT = 0.9 => No propagation
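
The threshold checks in the Sport example can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `infer_interests` is hypothetical, and the choice of the mean of the known weights as the propagated value is an assumption (it happens to give 0.5 for the four weights in the example). Which sport carries which weight is also assumed for illustration.

```python
def infer_interests(children, weights, uit=0.6, sit=0.9):
    """Infer new interests for one family of sibling concepts.

    children: names of the direct subconcepts of a parent concept.
    weights:  known interest weights, concept -> weight.
    Returns (parent_weight or None, dict of inferred sibling weights).
    """
    known = [weights[c] for c in children if c in weights]
    if not known:
        return None, {}
    pct = len(known) / len(children)
    mean = sum(known) / len(known)  # assumed aggregation function

    # Upward propagation: the parent concept receives the aggregated weight.
    parent_weight = mean if pct > uit else None

    # Sideward propagation: siblings without a weight receive it as well.
    siblings = {}
    if pct > sit:
        siblings = {c: mean for c in children if c not in weights}
    return parent_weight, siblings


sports = ["Baseball", "Basketball", "Football", "Tennis", "Golf"]
# Hypothetical assignment of the four known weights to concepts.
known_weights = {"Baseball": 1.0, "Basketball": 1.0, "Football": -0.5, "Golf": 0.5}

# pct = 4/5 = 0.8: above UIT (0.6), below SIT (0.9).
parent, inferred = infer_interests(sports, known_weights)  # (0.5, {})
```

With these thresholds only the parent receives a weight; lowering `sit` below 0.8 would also fill in the missing sibling.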


Semantically-enhanced content-based filtering


START. The system has to predict if the user will like/dislike an item

FOR EACH item’s annotation DO:

STEP 1. The conceptScore is calculated based on:

• The interest degree of the user’s interests that match the item’s annotation

• The semantic similarity of the matchings (perfect or partial match)

END FOR

STEP 2. The itemScore is calculated using the weighted average of conceptScore values according to their relevance
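
The two steps above can be sketched as follows. The slides state only that the itemScore is a relevance-weighted average of conceptScore values; how interest degree and semantic similarity are combined inside a conceptScore is not given, so the similarity-weighted mean used here is an assumption, as are the function names and the input layout.

```python
def concept_score(matches):
    """Score one item annotation.

    matches: list of (interest_weight, similarity) pairs, one per user
    interest that matches the annotation (similarity 1.0 = perfect match).
    Returns the similarity-weighted mean interest (an assumed formula),
    or None when nothing matches.
    """
    total_sim = sum(sim for _, sim in matches)
    if total_sim == 0:
        return None
    return sum(weight * sim for weight, sim in matches) / total_sim


def item_score(annotations):
    """Relevance-weighted average of the concept scores of an item.

    annotations: list of (relevance, matches) pairs, one per item annotation.
    Annotations without any matching user interest are skipped.
    """
    numerator = denominator = 0.0
    for relevance, matches in annotations:
        score = concept_score(matches)
        if score is None:
            continue
        numerator += relevance * score
        denominator += relevance
    return numerator / denominator if denominator else None


# Hypothetical item with two annotations: one perfect match and one
# annotation reached through two partial matches.
example = [
    (1.0, [(0.8, 1.0)]),
    (0.5, [(0.4, 0.6), (1.0, 0.7)]),
]
score = item_score(example)
```

Returning `None` when no interest matches leaves it to the caller to decide on a fallback prediction.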

[Figure: a user and an item linked through concepts C1 and C2, with matches labeled Perfect, Partial, Partial]


The taxonomy-based similarity method


Based on the distance in terms of taxonomy levels between

o The item’s annotation

o The user’s interest (an ancestor of the item’s annotation)

Weighted semantic distance among levels using K factor

[Figure: two taxonomy examples with levels 1 to 4 ("Source" at level 1): Genre > Romance > Steamy Romance (level 3) and Sport > Extreme > Climbing (level 4). In each example the user interest and the item annotation are at distance = 1. Values shown: K3 = 0.4, K4 = 0.3, SIM = 0.6, SIM = 0.7.]
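
The values in the figure are consistent with SIM = 1 - K for a distance of one level (1 - 0.4 = 0.6 and 1 - 0.3 = 0.7), reading K3 and K4 as the factors of the level of the item annotation. The sketch below rests on that reading and further assumes that, for longer distances, the K factors of all crossed levels are summed; neither the function name nor the summation rule comes from the slides.

```python
# K factors per taxonomy level, taken from the figure (K3 and K4 only).
K = {3: 0.4, 4: 0.3}


def taxonomy_similarity(annotation_level, interest_level, k=K):
    """Similarity between an item annotation and an ancestor user interest.

    Levels are counted from the top of the taxonomy (level 1).
    A perfect match (same level, same concept) yields 1.0.
    """
    if interest_level > annotation_level:
        raise ValueError("the user interest must be an ancestor or the same concept")
    # Assumed rule: add up the K factor of every level crossed on the way down.
    penalty = sum(k[level] for level in range(interest_level + 1, annotation_level + 1))
    return max(0.0, 1.0 - penalty)


sim_climbing = taxonomy_similarity(annotation_level=4, interest_level=3)
sim_steamy_romance = taxonomy_similarity(annotation_level=3, interest_level=2)
```

Under this rule, an interest two levels above a level-4 annotation would score 1 - (0.4 + 0.3), so more distant ancestors contribute less to the match.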


Experimental dataset


Netflix-prize movie dataset

o 480,000 users

o 17,700 movies

o 100M user ratings ranging between 1 and 5

Movie taxonomy used by Netflix for annotating movies

o 1 global hierarchy of concepts describing the movies

o 3 levels of depth

o 550 nodes (item’s annotations)

RMSE metric

o Measures the error on rating prediction for a set of users
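
For reference, RMSE is the square root of the mean squared difference between predicted and actual ratings. A minimal sketch follows; the function name and the example ratings are illustrative and not taken from the experiments.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equally long rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predictions on the 1-5 rating scale.
error = rmse([1.0, 5.0], [3.0, 3.0])  # both predictions are off by 2
```

Lower values are better, and large individual errors are penalized more heavily than small ones.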


Experimental evaluation


Exp. 1: Traditional vs semantic approach

o GOAL. To evaluate the improvement in accuracy when the semantics-based methods are employed

• Is the cold-start problem reduced?

Exp. 2: Semantic approach on two different taxonomies

o GOAL. To analyze whether the hierarchical structure of the taxonomy affects the effectiveness of the semantics-based methods

• How does the taxonomy structure affect their performance?


Exp.1: Traditional vs Semantic approach


Experiment setup

o The error of two algorithm configurations is compared

• CB configuration (traditional CB approach)

• SEM-CB configuration (semantically-enhanced CB approach)

Config.   User profile representation   Interest-prediction method        Item-user matching
CB        Keyword-based profile         Rating-based                      Perfect matches
SEM-CB    Ontology-based profile        Rating-based + domain inference   Perfect + partial matches (semantic similarity)


Exp.1: Traditional vs Semantic approach


Overall prediction results:

[Figure: chart of RMSE for the CB and SEM-CB configurations; RMSE axis from 1.025 to 1.065]


Exp.1: Traditional vs Semantic approach


Prediction results grouped by user-profile size (number of ratings)

Each interval contains roughly 2% of the predictions in the Netflix test set
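
Intervals that each hold about 2% of the predictions correspond to an equal-frequency split into roughly 50 groups. The sketch below shows one way such a grouping could be computed; it is an assumption about the procedure, and the function names, the input layout, and the sample data are hypothetical.

```python
import math


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def rmse_by_profile_size(predictions, n_intervals=50):
    """Split predictions into equal-frequency intervals by profile size.

    predictions: list of (profile_size, predicted, actual) tuples.
    Returns a list of (min_size, max_size, rmse), one per non-empty interval.
    Predictions with the same profile size may land on both sides of a boundary.
    """
    ordered = sorted(predictions, key=lambda t: t[0])
    n = len(ordered)
    results = []
    for i in range(n_intervals):
        chunk = ordered[i * n // n_intervals:(i + 1) * n // n_intervals]
        if not chunk:
            continue
        pairs = [(p, a) for _, p, a in chunk]
        results.append((chunk[0][0], chunk[-1][0], rmse(pairs)))
    return results


# Hypothetical test-set predictions: (number of ratings in profile, predicted, actual).
sample = [(1, 3.0, 5.0), (2, 4.0, 4.0), (10, 2.0, 2.0), (20, 5.0, 4.0)]
groups = rmse_by_profile_size(sample, n_intervals=2)
```

Comparing the per-interval RMSE of two configurations then shows where along the profile-size range one of them is more accurate.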


Exp.1: Traditional vs Semantic approach


Comparison of RMSE based on user-profile size

The improvement is larger for users with small profiles (the cold-start users)

Exp.2: Semantic approach on different taxonomies


Experiment setup

o Two semantics-based configurations are compared on different versions of the movie taxonomy:

• SEM-CB configuration (uses the original taxonomy)

• SEM-CB+ configuration (uses an alternative version)

Taxonomy properties

Config.    No. of nodes   No. of levels   No. of hierarchies   Avg. nodes per family
SEM-CB     550            3               1                    14
SEM-CB+    550            4               4                    7


Exp.2: Semantic approach on different taxonomies


Results:

[Figure: results for the parameter settings of the semantics-based algorithms, with annotations "Optimal execution" and "Same accuracy"]


Conclusions and Future work


Main conclusions

o The cold-start problem is reduced by exploiting semantics

o Incorporating semantics into a traditional CB approach improves prediction accuracy

o The recommender achieves domain independence by combining

• A service-oriented architecture design

• Standard ontology-based languages (FOAF, OWL)

Future work

o Further experimentation

• In richer domains and with other semantic methods

o The incorporation of semantics into other approaches

• e.g. Collaborative Filtering and Hybrid systems


Exp.1: Traditional vs Semantically-enhanced


Comparison of overall accuracy results:

[Figure: chart of RMSE; RMSE axis from 0.88 to 1.08]
