Politecnico di Milano
Top-N recommendations on Unpopular Items
with Contextual Knowledge
Paolo Cremonesi, Antonio Tripodi, Roberto Turrin
Politecnico di Milano, ContentWise
Today's recommendations, based on your personal taste, are:
From this…
To this
iTV with personalization
Personalization: how it works
USER DATA
USER’S TASTE
FRUITIONS AND RATINGS
CONTENT METADATA
RECOMMENDER SYSTEM
CONTENT RECOMMENDATIONS
Customers, Service Provider, Network Provider, Content Provider
IPTV architecture
Head end
VOD
Set-top box (decoder)
user u
item i
Single domain
• Two ideas
– One aII from Ricci
– Closure (on UU and II)
• Insert IPTV diagram
• Recommender systems can be divided into two families
– Content-Based Filtering (CB)
– Collaborative Filtering (CF)
• CF algorithms are preferred
– They do not rely on metadata (very difficult to obtain in the TV domain)
– Their quality has been shown to be better in terms of accuracy and serendipity, if the system has been trained with enough data
• CF algorithms can be classified into
– Model-based (able to deal with new users)
– Non-model-based (not able to deal with new users)
Top-N recommendations on Unpopular Items
with Contextual Knowledge
Paolo Cremonesi, Paolo Garza, Elisa Quintarelli, Roberto Turrin
Politecnico di Milano, ContentWise
Short version
Thanks for your attention
Q&A
For any further information, please contact
Paolo [email protected]
Long version
Research objectives
• Focus
– Top-N recommendation task
• Goal
– Improving accuracy
– Providing explanation
• Requirements
– Modularity (algorithm-independent)
– Fast on-line recommendations
Algorithms
• Non-personalized: TopPop, MovieAvg
• Collaborative
– Neighborhood: CorNgbr, NNCosNgbr
– Latent factors: AsySVD, PureSVD
Axis label: Accuracy
TopPop and MovieAvg
TopPop recommends the top-N most popular items (i.e., the most-rated items), regardless of the user's preferences and taste
TopPop
• Pirates of the Caribbean: The Curse of the Black Pearl
• Forrest Gump
• The Lord of the Rings: The Two Towers
• The Lord of the Rings: The Fellowship of the Ring
• The Sixth Sense
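A minimal sketch of a TopPop-style recommender, assuming ratings arrive as (user, item, rating) triples. The function name and the toy data are illustrative and not taken from the slides.

```python
from collections import Counter

def top_pop(ratings, n=5):
    """Return the n most-rated items, ignoring who the target user is."""
    counts = Counter(item for _user, item, _rating in ratings)
    return [item for item, _count in counts.most_common(n)]

# Toy data (hypothetical): popularity is the number of ratings, not their value.
ratings = [
    ("u1", "Forrest Gump", 5), ("u2", "Forrest Gump", 4), ("u3", "Forrest Gump", 3),
    ("u1", "The Sixth Sense", 4), ("u2", "The Sixth Sense", 5),
    ("u3", "Nikita", 5),
]
top2 = top_pop(ratings, n=2)
```

Every user receives the same list, which is why the method is non-personalized.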
Collaborative - Neighborhood
They recommend items according to the approach “who bought this also bought that…” (Amazon-like)
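The "also bought" idea can be sketched with a plain item-to-item cosine similarity over binary feedback. This is a generic illustration, not the exact CorNgbr or NNCosNgbr formulation; the function name and data are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def recommend_by_neighbors(user_items, target_user):
    """Rank unseen items by summed cosine similarity to the items the user has."""
    # Invert the data: item -> set of users who have it (binary feedback assumed).
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)

    seen = user_items[target_user]
    scores = {}
    for candidate, cand_users in item_users.items():
        if candidate in seen:
            continue
        score = 0.0
        for item in seen:
            overlap = len(cand_users & item_users[item])
            score += overlap / sqrt(len(cand_users) * len(item_users[item]))
        scores[candidate] = score
    return sorted(scores, key=scores.get, reverse=True)

user_items = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A", "C"},
    "u4": {"D"},
}
ranked = recommend_by_neighbors(user_items, "u1")
```

Item C co-occurs with both A and B, so it ranks above D, which shares no users with u1's items.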
Collaborative – Latent factors
They recommend items on the basis of an advanced representation of users and items in a low-dimensional feature space
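A small sketch in the spirit of PureSVD: missing ratings are treated as zeros, the top-k item factors are extracted, and a user's ratings are projected onto them to score items. Power iteration with deflation stands in for a real SVD routine so the example needs no external library; all names and the toy matrix are assumptions.

```python
from math import sqrt

def top_item_factors(R, k, iters=200):
    """Approximate the top-k right singular vectors of R by power iteration."""
    n = len(R[0])
    # Gram matrix G = R^T R (items x items); its eigenvectors are R's right singular vectors.
    G = [[sum(row[i] * row[j] for row in R) for j in range(n)] for i in range(n)]
    factors = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(n)]  # fixed start keeps the sketch deterministic
        for _ in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * G[i][j] * v[j] for i in range(n) for j in range(n))
        factors.append(v)
        # Deflation: remove the component just found before extracting the next one.
        G = [[G[i][j] - eigenvalue * v[i] * v[j] for j in range(n)] for i in range(n)]
    return factors

def score_items(user_row, factors):
    """Project the user's ratings onto the factors and back into item space."""
    projection = [sum(r * q for r, q in zip(user_row, f)) for f in factors]
    return [sum(p * f[i] for p, f in zip(projection, factors))
            for i in range(len(user_row))]

# Toy rating matrix (hypothetical): items 0-1 and items 2-3 form two taste groups.
R = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 0, 4, 5],
    [5, 0, 0, 0],  # target user has rated only item 0
]
factors = top_item_factors(R, k=2)
scores = score_items(R[4], factors)
unseen = [i for i, r in enumerate(R[4]) if r == 0]
best = max(unseen, key=lambda i: scores[i])
```

The target user has rated only item 0, and the low-dimensional representation places item 1 closest to it, so item 1 is the top unseen recommendation.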
Contextual recommendations
• Pre-filtering
– L. Baltrunas, F. Ricci, RecSys'09
• Post-filtering
– U. Panniello, A. Tuzhilin, M. Gorgoglione, ..., RecSys'09
• Contextual modeling
– M. Domingues, A. Jorge, C. Soares, RecSys'09
– C. Palmisano, A. Tuzhilin, M. Gorgoglione, IEEE Trans. Knowl. Data Eng., 2008
Association rules
• Data mining technique
• Uses a frequency-based approach to estimate the conditional probability of events
• Forrest Gump and Nikita → Avatar
Association rules
• X → Y
• X = previously watched movie(s)
• Y = movie(s) the user will likely appreciate
• Quality of association rules:
– Support: frequency of the rule
– Confidence: conditional probability of Y given X
• Benefits (by definition)
– best recommendations in terms of accuracy
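Support and confidence of a rule X → Y can be computed by direct counting. The sketch below assumes each transaction is the set of movies one user watched and reports support as a relative frequency; the data is made up for illustration.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Support = share of transactions containing X and Y; confidence = P(Y | X)."""
    x, y = set(antecedent), set(consequent)
    n_x = sum(1 for t in transactions if x <= t)
    n_xy = sum(1 for t in transactions if (x | y) <= t)
    support = n_xy / len(transactions)
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence

# Hypothetical viewing histories, one set per user.
transactions = [
    {"Forrest Gump", "Nikita", "Avatar"},
    {"Forrest Gump", "Nikita", "Avatar"},
    {"Forrest Gump", "Nikita"},
    {"Forrest Gump"},
    {"Avatar"},
]
support, confidence = support_and_confidence(
    transactions, {"Forrest Gump", "Nikita"}, {"Avatar"}
)
```

Here three users watched both Forrest Gump and Nikita, and two of them also watched Avatar, so the rule has support 2/5 and confidence 2/3.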
Association rules and RS
• Sarwar et al., Analysis of recommendation algorithms for e-commerce, EC 2000
• Computational requirements
– theoretically we should test all the possible combinations of items in X and Y
• Portfolio effect
– most rules find the same small set of consequents
– recommendations are biased toward obvious items
Portfolio effect
• What is the simplest and yet most effective association-rule-based recommender system?
• TopPop
Recall on Netflix
Algorithm    Recall at 10    Recall at 10 (long tail only)
PureSVD50    0.48            0.30
NNCosNgbr    0.45            0.28
AsySVD       0.30            0.30
TopPop       0.28            0.02
CorNgbr      0.15            0.35
MovieAvg     0.12            0.12
Long tail: removed the most popular items, accounting for 33% of ratings
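The slides do not spell out the evaluation protocol, so the following is only a simplified sketch: recall at N is taken as the share of held-out items that appear in the top-N list, and the long-tail variant ignores test cases whose held-out item belongs to the short head (the most popular items covering roughly 33% of ratings). Function names and data are hypothetical.

```python
from collections import Counter

def short_head(rated_items, share=0.33):
    """Most popular items that together account for about `share` of all ratings."""
    counts = Counter(rated_items)
    total = sum(counts.values())
    head, covered = set(), 0
    for item, count in counts.most_common():
        if covered >= share * total:
            break
        head.add(item)
        covered += count
    return head

def recall_at_n(test_cases, n, excluded=frozenset()):
    """test_cases: (held_out_item, ranked_recommendations) pairs."""
    kept = [(item, recs) for item, recs in test_cases if item not in excluded]
    hits = sum(1 for item, recs in kept if item in recs[:n])
    return hits / len(kept) if kept else 0.0

# One entry per rating; item "a" alone covers more than 33% of them.
rated_items = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
head = short_head(rated_items)

test_cases = [
    ("a", ["a", "b", "c"]),
    ("b", ["a", "c", "b"]),
    ("c", ["a", "c", "d"]),
    ("d", ["a", "b", "c"]),
]
overall = recall_at_n(test_cases, n=2)
long_tail = recall_at_n(test_cases, n=2, excluded=head)
```

In this toy case, dropping the popular item "a" from the test set lowers recall from 1/2 to 1/3, the same kind of drop the table shows for TopPop.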
Measured perceived quality
• Users’ judgments on Accuracy and Novelty
• Participants: 30 users for each of 7 experimental conditions → 210 users overall
• Profile: 20-50 years old; male: 54%, female: 46%
[Charts: perceived relevance; perceived novelty]
Context recommender system
• Users - Items → Traditional Recommender System → Recommendations
• Users’ contexts - Items’ characteristics → Contextual Rule Mining → Contextual rules
• Recommendations + Contextual rules → Contextual Post-filtering → Contextual Recommendations
Experiments: rules mining
• MovieLens: 1 M ratings
– 1000 users
– 1700 items
• Context
– # age ranges = 7
– # gender = M/F
• Movie features
– # genres = 18
• Rules mined with FP-growth
• Min support = 1000
Goal
• Identify correlations between user’s context and item characteristics
• Filter the predictions produced by a traditional recommender
Inputs to the system
• Input to the recommender system
– URM
• Input to the contextual rule miner
– CFM: User context × Item features
– number of ratings that users in context c gave to items with feature f
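A possible way to build the CFM counts, assuming a context is the (gender, age range) pair of the rating user and a feature is a genre of the rated movie. The dictionary layout and identifiers are illustrative assumptions.

```python
from collections import Counter

def build_cfm(ratings, user_context, item_features):
    """Count ratings per (context, feature) pair."""
    cfm = Counter()
    for user, item in ratings:
        context = user_context[user]
        for feature in item_features[item]:
            cfm[(context, feature)] += 1
    return cfm

# Hypothetical users, items and ratings.
user_context = {"u1": ("M", "20-25"), "u2": ("F", "35-44")}
item_features = {"i1": ["fantasy"], "i2": ["drama", "romance"], "i3": ["drama"]}
ratings = [("u1", "i1"), ("u2", "i2"), ("u2", "i3"), ("u2", "i1")]

cfm = build_cfm(ratings, user_context, item_features)
```

A movie with several genres contributes one count to each of its genres in this sketch.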
Creation of the transactional dataset
• UCM → Transactional dataset
• Example: a rating given by a male user aged [20-25] to a fantasy movie is included in the transactional dataset as
(gender = M) (age = [20-25]) (genre = fantasy)
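The conversion of ratings into transactions could look like the sketch below. It assumes one transaction per rating and genre, so a multi-genre movie yields several transactions; that choice, like the names used, is an assumption rather than something stated on the slide.

```python
def to_transactions(ratings, users, items):
    """Turn each rating into a set of (attribute, value) pairs."""
    transactions = []
    for user, item in ratings:
        profile = users[user]
        for genre in items[item]:
            transactions.append(frozenset({
                ("gender", profile["gender"]),
                ("age", profile["age"]),
                ("genre", genre),
            }))
    return transactions

# Hypothetical data mirroring the example: a male user aged [20-25] rates a fantasy movie.
users = {"u1": {"gender": "M", "age": "[20-25]"}}
items = {"m1": ["fantasy"], "m2": ["action", "sci-fi"]}
ratings = [("u1", "m1"), ("u1", "m2")]

transactions = to_transactions(ratings, users, items)
```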
Rule mining: example
• The following two rules are extracted for the context (gender = M):
(gender = M) → (genre = horror)
(gender = M) → (genre = action)
• It follows that only horror and action movies can be recommended to male users
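The mining and post-filtering steps can be sketched as follows. Plain counting is used here in place of FP-growth, which is enough when every rule has the form context → genre; the function names, the toy transactions and the movie titles are illustrative.

```python
from collections import Counter

def mine_rules(transactions, min_support):
    """transactions: (context, genre) pairs. Returns {context: {genre: (confidence, support)}}."""
    pair_counts = Counter(transactions)
    context_counts = Counter(context for context, _genre in transactions)
    rules = {}
    for (context, genre), support in pair_counts.items():
        if support >= min_support:
            confidence = support / context_counts[context]
            rules.setdefault(context, {})[genre] = (confidence, support)
    return rules

def post_filter(recommendations, item_genres, allowed_genres):
    """Keep only recommended items that have at least one allowed genre."""
    return [item for item in recommendations if item_genres[item] & allowed_genres]

# Hypothetical transactions: the romance rule for M falls below the minimum support.
transactions = ([("M", "horror")] * 3 + [("M", "action")] * 3
                + [("M", "romance")] + [("F", "drama")] * 3)
rules = mine_rules(transactions, min_support=2)

# Output of a traditional recommender for a male user (made-up list).
recommendations = ["Alien", "Titanic", "Die Hard"]
item_genres = {
    "Alien": {"horror", "sci-fi"},
    "Titanic": {"romance", "drama"},
    "Die Hard": {"action"},
}
filtered = post_filter(recommendations, item_genres, set(rules["M"]))
```

The order produced by the underlying recommender is preserved; the rules only remove items whose genres are not supported by the user's context.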
Example of rules …
Gender  Age    Genre       Prob.  Support
F       35-44  Drama       35%    17000
               Comedy      32%    16000
               Romance     18%     9000
               Children’s   8%     4000
               Musical      5%     2500
               Animation    4%     2000
               Mystery      4%     2000
               Fantasy      3%     1500
M       35-44  Action      23%    34000
               Thriller    16%    24000
               Sci-Fi      14%    21000
               Adventure   12%    17000
               War          7%    10000
               Horror       6%     9000
               Mystery      3%     5000
               Western      2%     3000
               Noir         2%     3000
Two options
• Keep all of the rules
• Keep only rules with a “large” confidence (15%)
• In any case, we keep only rules with a large support (>1000 ratings)
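The two options can be compared on the rules listed for female users aged 35-44. The sketch assumes the 15% figure is a lower bound on confidence and that the support filter is strict (more than 1000 ratings); both readings are assumptions.

```python
def select_rules(rules, min_support=1000, min_confidence=0.0):
    """rules: (genre, confidence, support) triples. Returns the genres that pass both filters."""
    return [genre for genre, confidence, support in rules
            if support > min_support and confidence >= min_confidence]

# Rules for (gender = F, age = 35-44), copied from the example table.
rules_f_35_44 = [
    ("Drama", 0.35, 17000), ("Comedy", 0.32, 16000), ("Romance", 0.18, 9000),
    ("Children's", 0.08, 4000), ("Musical", 0.05, 2500), ("Animation", 0.04, 2000),
    ("Mystery", 0.04, 2000), ("Fantasy", 0.03, 1500),
]

keep_all = select_rules(rules_f_35_44)                             # option 1
keep_confident = select_rules(rules_f_35_44, min_confidence=0.15)  # option 2
```

With the confidence threshold only Drama, Comedy and Romance survive, so the post-filter becomes much more selective.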
Results
Recall at 5 on long-tail items
Algorithm    Without context    With context
PureSVD      21%                30%
AsySVD       7%                 12%
NNCosNgbr    10%                24%
CorNgbr      4%                 9%
TopPop       0.1%               8%
MovieAvg     1%                 5%
Removed 5% of the most popular items, accounting for 33% of ratings
Recall with non-personal methods
Recall with neighborhood methods
Recall with latent-factors methods
Thanks for your attention
Q&A
For any further information, please contact
Paolo [email protected]