politecnico di milano top-n recommendations on unpopular items with contextual knowledge paolo...


Post on 27-Mar-2015


Politecnico di Milano

Top-N recommendations on Unpopular Items

with Contextual Knowledge

Paolo Cremonesi, Antonio Tripodi, Roberto Turrin

Politecnico di Milano, ContentWise


Today's recommendations, based on your personal taste, are:

From this…

To this

iTV with personalization


Personalization: how it works

[Diagram: user data (user's taste, fruitions and ratings) and content metadata feed the recommender system, which outputs content recommendations.]


IPTV architecture

[Diagram. Roles: Customers, Service Provider, Network Provider, Content Provider. Components: head end, VOD, set-top box (decoder).]


user u

item i


Single domain


• Two ideas
– One aII from Ricci
– Closure (on UU and II)


• (Insert IPTV diagram here)
• Recommender systems can be divided into two families
– Content-Based Filtering (CB)
– Collaborative Filtering (CF)
• CF algorithms are preferred
– They do not rely on metadata (very difficult to obtain in the TV domain)
– Their quality has been shown to be better in terms of accuracy and serendipity, provided the system has been trained with enough data


• CF algorithms can be classified into
– Model-based (able to deal with new users)
– Non-model-based (not able to deal with new users)


Top-N recommendations on Unpopular Items

with Contextual Knowledge

Paolo Cremonesi, Paolo Garza, Elisa Quintarelli, Roberto Turrin

Politecnico di Milano, ContentWise

Short version


Thanks for your attention

Q&A

For any further information, please contact

Paolo [email protected]


Top-N recommendations on Unpopular Items

with Contextual Knowledge

Paolo Cremonesi, Paolo Garza, Elisa Quintarelli, Roberto Turrin

Politecnico di Milano, ContentWise

Long version


Research objectives

• Focus
– Top-N recommendation task
• Goal
– Improving accuracy
– Providing explanation
• Requirements
– Modularity (algorithm-independent)
– Fast on-line recommendations


Algorithms

[Diagram: algorithms positioned by accuracy]

• Non-personalized: TopPop, MovieAvg
• Collaborative
– Neighborhood: CorNgbr, NNCosNgbr
– Latent factors: AsySVD, PureSVD


TopPop and MovieAvg

TopPop recommends the N most popular items (i.e., the most rated items), regardless of the user's preferences and taste
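
The two non-personalized baselines can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes ratings arrive as (user, item, rating) triples, takes MovieAvg to rank items by their mean rating, and uses hypothetical function names and toy data.

```python
from collections import defaultdict

def top_pop(ratings, n):
    """TopPop sketch: rank items by how many ratings they received."""
    counts = defaultdict(int)
    for _user, item, _rating in ratings:
        counts[item] += 1
    # Descending count; ties broken by item id to keep the output deterministic.
    return sorted(counts, key=lambda i: (-counts[i], i))[:n]

def movie_avg(ratings, n):
    """MovieAvg sketch (assumed definition): rank items by mean rating."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in ratings:
        totals[item] += rating
        counts[item] += 1
    means = {i: totals[i] / counts[i] for i in counts}
    return sorted(means, key=lambda i: (-means[i], i))[:n]

# Hypothetical toy data: (user, item, rating) triples.
ratings = [
    ("u1", "A", 3), ("u2", "A", 4), ("u3", "A", 2),
    ("u1", "B", 5), ("u2", "B", 5),
    ("u3", "C", 4),
]
print(top_pop(ratings, 2))
print(movie_avg(ratings, 2))
```

Neither function looks at the target user, which is why every user receives the same list.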


TopPop

• Pirates of the Caribbean: The Curse of the Black Pearl
• Forrest Gump
• The Lord of the Rings: The Two Towers
• The Lord of the Rings: The Fellowship of the Ring
• The Sixth Sense


Collaborative - Neighborhood

They recommend items following the Amazon-like approach: “who bought this also bought that”
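
A generic item-based neighborhood recommender in this spirit might look like the sketch below. It is an assumption-laden illustration (plain cosine similarity over co-rating users, similarity-weighted scoring, hypothetical names and toy data), not the exact CorNgbr or NNCosNgbr formulation.

```python
import math
from collections import defaultdict

def item_cosine(ratings):
    """Cosine similarity between item rating vectors (missing ratings count as 0)."""
    by_item = defaultdict(dict)
    for user, item, r in ratings:
        by_item[item][user] = r
    norms = {i: math.sqrt(sum(v * v for v in vec.values()))
             for i, vec in by_item.items()}
    sims = defaultdict(dict)
    for i in by_item:
        for j in by_item:
            if i == j:
                continue
            # Dot product over users who rated both items.
            dot = sum(r * by_item[j][u] for u, r in by_item[i].items()
                      if u in by_item[j])
            if dot > 0:
                sims[i][j] = dot / (norms[i] * norms[j])
    return sims

def recommend(user_profile, sims, n):
    """Score unseen items by rating-weighted similarity to the user's rated items."""
    scores = defaultdict(float)
    for item, r in user_profile.items():
        for other, s in sims.get(item, {}).items():
            if other not in user_profile:
                scores[other] += s * r
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]

# Hypothetical toy data: (user, item, rating) triples.
ratings = [
    ("u1", "A", 5), ("u1", "B", 5),
    ("u2", "A", 4), ("u2", "B", 4), ("u2", "C", 1),
    ("u3", "C", 5), ("u3", "D", 5),
]
sims = item_cosine(ratings)
print(recommend({"A": 5}, sims, 2))
```

Item D never shares a user with item A here, so it cannot be reached from a profile containing only A.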


Collaborative – Latent factors


They recommend items on the basis of an advanced representation of users and items in a low-dimensional feature space
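
As one concrete instance of such a low-dimensional representation, PureSVD is commonly formulated as a truncated singular value decomposition of the user rating matrix. The notation below is not taken from the slides and should be read as an assumption: $R$ is the user rating matrix with missing ratings set to zero, $f$ is the number of retained factors, $\mathbf{r}_u$ is the row of $R$ for user $u$, and $\mathbf{q}_i$ is the row of $Q$ for item $i$.

```latex
\hat{R} = U \,\Sigma\, Q^{\top},
\qquad
\hat{r}_{ui} = \mathbf{r}_u \, Q \, \mathbf{q}_i^{\top}
```

Here $U$ and $Q$ hold the first $f$ left and right singular vectors and $\Sigma$ the corresponding singular values. The top-N list for user $u$ is obtained by ranking the unseen items by $\hat{r}_{ui}$.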


Contextual recommendations

• Pre-filtering
– L. Baltrunas, F. Ricci, RecSys'09
• Post-filtering
– U. Panniello, A. Tuzhilin, M. Gorgoglione, ..., RecSys'09
• Contextual modeling
– M. Domingue, A. Jorge, C. Soares, RecSys'09
– C. Palmisano, A. Tuzhilin, M. Gorgoglione, IEEE Trans. Knowl. Data Eng., 2008


Association rules

• Data mining technique

• Uses a “frequency-based” approach to estimate the conditional probability of events

• Forrest Gump and Nikita → Avatar


Association rules

• X → Y
• X = previously watched movie(s)
• Y = movie(s) the user will likely appreciate
• Quality of association rules:
– Support: frequency of the rule
– Confidence: conditional probability of Y given X
• Benefits (by definition)
– best recommendations in terms of accuracy
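
Support and confidence of a rule X → Y can be computed by simple counting. The sketch below assumes each transaction is the set of movies one user has watched and measures support as an absolute count; the function names and the toy histories are hypothetical.

```python
def support(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))

def confidence(transactions, x, y):
    """Estimate P(Y | X) as support(X and Y together) / support(X)."""
    sx = support(transactions, x)
    if sx == 0:
        return 0.0
    return support(transactions, set(x) | set(y)) / sx

# Hypothetical viewing histories, one set of movies per user.
histories = [
    {"Forrest Gump", "Nikita", "Avatar"},
    {"Forrest Gump", "Nikita", "Avatar"},
    {"Forrest Gump", "Nikita"},
    {"Forrest Gump", "Avatar"},
    {"Nikita"},
]
x = {"Forrest Gump", "Nikita"}
y = {"Avatar"}
print(support(histories, x | y), confidence(histories, x, y))
```

In this toy data the rule is supported by two histories, and two of the three users who watched both antecedent movies also watched the consequent.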


Association rules and RS

• Sarwar et al., Analysis of recommendation algorithms for e-commerce, EC 2000
• Computational requirements
– theoretically we should test all the possible combinations of items in X and Y
• Portfolio effect
– most rules find the same small set of consequents
– recommendations are biased toward obvious items


Portfolio effect


• What is the simplest and yet most effective association-rule-based recommender system?

• TopPop


Recall on Netflix

Algorithm    Recall at 10
PureSVD50    0.48
NNCosNgbr    0.45
AsySVD       0.30
TopPop       0.28
CorNgbr      0.15
MovieAvg     0.12


Recall on Netflix

Algorithm    Recall at 10    Recall at 10 (long tail only)
PureSVD50    0.48            0.30
NNCosNgbr    0.45            0.28
AsySVD       0.30            0.30
TopPop       0.28            0.02
CorNgbr      0.15            0.35
MovieAvg     0.12            0.12

Long tail: removed the most popular items, accounting for 33% of ratings
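
One way to obtain such a long-tail split is sketched below. The exact cut-off rule is an assumption (the head is grown until its ratings reach the requested share of all ratings), and the function name and toy data are hypothetical.

```python
from collections import Counter

def split_head_tail(rated_items, head_share=0.33):
    """Split items into a popular head and a long tail.

    rated_items: iterable with one item id per rating.
    The head is the smallest set of most popular items whose ratings
    reach `head_share` of all ratings (assumed cut-off rule).
    """
    counts = Counter(rated_items)
    total = sum(counts.values())
    head, covered = [], 0
    # Visit items from most to least rated; ties broken by item id.
    for item, c in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if covered >= head_share * total:
            break
        head.append(item)
        covered += c
    tail = sorted(set(counts) - set(head))
    return head, tail

# Hypothetical toy data: 12 ratings over 5 items.
rated = ["A"] * 4 + ["B"] * 3 + ["C"] * 2 + ["D"] * 1 + ["E"] * 2
head, tail = split_head_tail(rated)
print(head, tail)
```

Recall on the long tail would then be measured only on test ratings whose item falls in `tail`.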


Measured perceived quality

• Users’ judgments on Accuracy and Novelty

• Participants: 30 users for each of the 7 experimental conditions → 210 users overall

• Profile: 20-50 years old; male: 54%, female: 46%


Perceived relevance


Perceived novelty


Context recommender system

[Diagram: data flow]

• Users - Items → Traditional Recommender System → Recommendations
• Users’ contexts - Items’ characteristics → Contextual Rule Mining → Contextual rules
• Recommendations + Contextual rules → Contextual Post-filtering → Contextual Recommendations


Experiments: rules mining

• MovieLens: 1 M ratings
– 1000 users
– 1700 items
• Context
– # age ranges = 7
– # gender = M/F
• Movie features
– # genres = 18
• Rules mined with FP-growth
• Min support = 1000


Goal

• Identify correlations between user’s context and item characteristics

• Filter predictions performed by a traditional recommender


Inputs to the system

• Input to the recommender system
– URM (user rating matrix)
• Input to the contextual rule miner
– CFM: User context × Item features
– each entry is the number of ratings that users in context c gave to items with feature f
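
The CFM counts described above can be accumulated in a single pass over the ratings. The sketch below is illustrative only: it assumes each context and each feature is an (attribute, value) pair counted independently, and all names and toy data are hypothetical.

```python
from collections import Counter

def build_cfm(ratings, user_context, item_features):
    """Count, for each (context, feature) pair, the ratings that users in
    that context gave to items with that feature."""
    cfm = Counter()
    for user, item, _rating in ratings:
        for c in user_context[user]:
            for f in item_features[item]:
                cfm[(c, f)] += 1
    return cfm

# Hypothetical toy data.
user_context = {
    "u1": [("gender", "M"), ("age", "20-25")],
    "u2": [("gender", "F"), ("age", "35-44")],
}
item_features = {
    "m1": [("genre", "fantasy")],
    "m2": [("genre", "drama"), ("genre", "romance")],
}
ratings = [("u1", "m1", 4), ("u1", "m2", 3), ("u2", "m2", 5)]

cfm = build_cfm(ratings, user_context, item_features)
print(cfm[(("gender", "M"), ("genre", "fantasy"))])
```

The rating values themselves are ignored here because the CFM, as described, only counts ratings.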


Creation of the transactional dataset

• UCM → Transactional dataset
• Example: a rating given by a male aged [20-25] to a fantasy movie is included in the transactional dataset as

(gender = M)(age = [20-25])(genre = fantasy)
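
The conversion of ratings into transactions can be sketched as below. Emitting one transaction per genre for multi-genre movies is an assumption, as are the function name and the toy data.

```python
def to_transactions(ratings, user_context, item_genres):
    """Turn each rating into transactions of (attribute, value) pairs."""
    transactions = []
    for user, item, _rating in ratings:
        ctx = user_context[user]
        # Assumption: a movie with several genres yields one transaction per genre.
        for genre in item_genres[item]:
            transactions.append((
                ("gender", ctx["gender"]),
                ("age", ctx["age"]),
                ("genre", genre),
            ))
    return transactions

# Hypothetical toy data mirroring the example in the text.
user_context = {"u1": {"gender": "M", "age": "[20-25]"}}
item_genres = {"m1": ["fantasy"]}
ratings = [("u1", "m1", 4)]

transactions = to_transactions(ratings, user_context, item_genres)
print(transactions)
```

A frequent-itemset miner such as FP-growth would then be run over `transactions`.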


Rule mining: example

• The following two rules are extracted for the context (gender = M):

(gender = M) → (genre = horror)
(gender = M) → (genre = action)

• It follows that only horror and action movies can be recommended to male users
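
Applied as a post-filter, such rules prune the ranked list produced by a traditional recommender. The sketch below assumes rules are (context, genre) pairs and that an item survives if at least one of its genres is allowed; the names and toy data are hypothetical.

```python
def post_filter(ranked_items, item_genres, user_context, rules, n):
    """Keep the first n ranked items that have a genre allowed by a rule
    whose context matches the user."""
    allowed = {genre for ctx, genre in rules if ctx in user_context}
    kept = [i for i in ranked_items if allowed & set(item_genres[i])]
    return kept[:n]

# The two example rules for the context (gender = M).
rules = [(("gender", "M"), "horror"), (("gender", "M"), "action")]

# Hypothetical toy data: a ranked list from any underlying recommender.
item_genres = {
    "m1": ["comedy"],
    "m2": ["action", "comedy"],
    "m3": ["horror"],
    "m4": ["drama"],
}
ranked = ["m1", "m2", "m3", "m4"]
user_context = {("gender", "M"), ("age", "20-25")}

print(post_filter(ranked, item_genres, user_context, rules, 2))
```

Because the filter only consumes a ranked list, it works with any underlying algorithm. Note that in this sketch a user whose context matches no rule receives an empty list.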


Example of rules …

Gender  Age    Genre       Prob.  Support
F       35-44  Drama       35%    17000
               Comedy      32%    16000
               Romance     18%    9000
               Children’s  8%     4000
               Musical     5%     2500
               Animation   4%     2000
               Mystery     4%     2000
               Fantasy     3%     1500


Example of rules …

Gender  Age    Genre      Prob.  Support
M       35-44  Action     23%    34000
               Thriller   16%    24000
               Sci-Fi     14%    21000
               Adventure  12%    17000
               War        7%     10000
               Horror     6%     9000
               Mystery    3%     5000
               Western    2%     3000
               Noir       2%     3000


Two options

• Keep all of the rules

• Keep only rules with a “large” confidence (15%)

• In any case, we keep only rules with a large support (>1000 ratings)
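
The two options can be expressed as one filter with an optional confidence threshold. The sketch below uses the female, 35-44 rules from the example table. Treating the 15% threshold as inclusive is an assumption, and the function name and tuple layout are hypothetical.

```python
def select_rules(rules, min_support=1000, min_confidence=0.0):
    """Keep rules with support above min_support and confidence at or
    above min_confidence (inclusive threshold is an assumption)."""
    return [r for r in rules
            if r[3] > min_support and r[2] >= min_confidence]

# (context, genre, confidence, support) taken from the example table.
rules = [
    ("F 35-44", "Drama", 0.35, 17000),
    ("F 35-44", "Comedy", 0.32, 16000),
    ("F 35-44", "Romance", 0.18, 9000),
    ("F 35-44", "Children's", 0.08, 4000),
    ("F 35-44", "Musical", 0.05, 2500),
    ("F 35-44", "Animation", 0.04, 2000),
    ("F 35-44", "Mystery", 0.04, 2000),
    ("F 35-44", "Fantasy", 0.03, 1500),
]

all_rules = select_rules(rules)                          # option 1
confident = select_rules(rules, min_confidence=0.15)     # option 2
print([genre for _ctx, genre, _conf, _sup in confident])
```

With the confidence threshold, only Drama, Comedy and Romance remain for this context; without it, all eight rules pass the support requirement.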


Results


Recall at 5 on long-tail items

Long tail    Without context    With context
PureSVD      21%                30%
AsySVD       7%                 12%
NNCosNgbr    10%                24%
CorNgbr      4%                 9%
TopPop       0.1%               8%
MovieAvg     1%                 5%

Removed 5% of the most popular items, accounting for 33% of ratings


Recall with non-personal methods


Recall with neighborhood methods


Recall with latent-factors methods


Thanks for your attention

Q&A

For any further information, please contact

Paolo [email protected]