Personalizing & Recommender Systems


Post on 05-Jan-2016










Personalizing & Recommender Systems
Bamshad Mobasher
Center for Web Intelligence
DePaul University, Chicago, Illinois, USA

Personalization

The Problem: dynamically serve customized content (books, movies, pages, products, tags, etc.) to users based on their profiles, preferences, or expected interests.

Why do we need it?
- Information spaces are becoming much more complex for users to navigate (huge online repositories, social networks, mobile applications, blogs, ...)
- For businesses: the need to grow customer loyalty and increase sales
- Industry research: successful online retailers are generating as much as 35% of their business from recommendations

Recommender Systems
The most common type of personalization: recommender systems.

[Diagram: a user profile is fed into a recommendation algorithm, which produces recommendations]

Common Approaches
- Collaborative Filtering: give recommendations to a user based on the preferences of similar users. Preferences on items may be explicit or implicit. Includes recommendation based on social / collaborative content.
- Content-Based Filtering: give recommendations to a user based on items with similar content in the user's profile.
- Rule-Based (Knowledge-Based) Filtering: provide recommendations to users based on predefined (or learned) rules, e.g., age(x, 25-35) and income(x, 70-100K) and children(x, >=3) => recommend(x, Minivan)
- Hybrid Approaches
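The rule-based approach above can be sketched as a simple predicate check. The rule itself mirrors the slide's minivan example; the dict-based profile representation is an assumption for illustration.

```python
# A minimal sketch of rule-based (knowledge-based) filtering.
# The rule mirrors the slide's example:
#   age(x, 25-35) and income(x, 70-100K) and children(x, >=3) => recommend(x, Minivan)
# The plain-dict profile format is a hypothetical representation.

def recommend(profile):
    recommendations = []
    if (25 <= profile["age"] <= 35
            and 70_000 <= profile["income"] <= 100_000
            and profile["children"] >= 3):
        recommendations.append("Minivan")
    return recommendations

print(recommend({"age": 30, "income": 85_000, "children": 3}))  # ['Minivan']
print(recommend({"age": 42, "income": 85_000, "children": 3}))  # []
```

In practice the rule set would be learned or authored separately and evaluated against each profile, but the matching logic stays this simple.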

Content-Based Recommender Systems

Content-Based Recommenders: Personalized Search Agents
How can the search engine determine the user's context?

Query: "Madonna and Child"?
Need to learn the user profile:
- Is the user an art historian?
- Is the user a pop music fan?

Content-Based Recommenders: More Examples
- Music recommendations
- Playlist generation

Example: Pandora

Collaborative Recommender Systems



Social / Collaborative Tags


Example: Tags describe the Resource

Tags can describe:
- The resource (genre, actors, etc.)
- Organizational purpose (toRead)
- Subjective opinion (awesome)
- Ownership (abc)
- etc.

Tag Recommendation
- These systems are collaborative.
- Recommendation / analytics based on the "wisdom of crowds."
- Tags describe the user.

Example: Rai Aren's profile, co-author of "Secret of the Sands"

Social Recommendation
- A form of collaborative filtering using social network data
- Users' profiles are represented as sets of links to other nodes (users or items) in the network
- Prediction problem: infer a currently non-existent link in the network
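The slide frames social recommendation as link prediction. A minimal sketch, using the common-neighbors heuristic as one standard baseline (the slide does not name a specific method), over a toy graph stored as adjacency sets:

```python
# Link prediction via common-neighbor counting -- one simple baseline
# for "infer a currently non-existent link in the network".
# The toy graph below is hypothetical; nodes may be users or items.

def common_neighbors_score(graph, a, b):
    """Score a candidate link (a, b) by the number of neighbors a and b share."""
    return len(graph.get(a, set()) & graph.get(b, set()))

graph = {
    "u1": {"u2", "u3", "i1"},
    "u2": {"u1", "u3", "i1"},
    "u3": {"u1", "u2"},
    "u4": {"i1"},
    "i1": {"u1", "u2", "u4"},
}

# u3 and i1 are not currently linked; score the candidate link:
print(common_neighbors_score(graph, "u3", "i1"))  # 2 (shared neighbors u1 and u2)
```

Higher-scoring candidate links are recommended first; richer scores (Adamic-Adar, embeddings, etc.) plug into the same loop.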


Example: Using Tags for Recommendation

Aggregation & Personalization across social, collaborative, and content channels


Possible Interesting Project Ideas
- Build a content-based recommender for:
  - News stories (requires basic text processing and indexing of documents)
  - Blog posts, tweets
  - Music (based on features such as genre, artist, etc.)
- Build a collaborative or social recommender for:
  - Movies (using movie ratings), e.g., movielens.org
  - Music, e.g., last.fm: recommend songs or albums based on collaborative ratings, tags, etc., or recommend whole playlists based on playlists from other users
  - Users (other raters, friends, followers, etc.), based on similar interests

The Recommendation Task
Basic formulation as a prediction problem:

Given a profile Pu for a user u and a target item it, predict the preference score of user u on item it.
Typically, the profile Pu contains preference scores by u on some other items {i1, ..., ik} different from it. Preference scores on i1, ..., ik may have been obtained explicitly (e.g., movie ratings) or implicitly (e.g., time spent on a product page or a news article).

Collaborative Recommender Systems
- Collaborative filtering recommenders: predictions for unseen (target) items are computed based on the other users with similar interest scores on items in user u's profile, i.e., users with similar tastes (aka nearest neighbors).
- Requires computing correlations between user u and other users according to interest scores or ratings.
- k-nearest-neighbor (kNN) strategy

Can we predict Karen's rating on the unseen item "Independence Day"?

Basic Collaborative Filtering Process
Two phases: neighborhood formation and recommendation.

[Diagram: the current user record and historical user records (user, item, rating) feed into neighborhood formation, which feeds the recommendation engine]

[Diagram: the nearest neighbors are combined via a combination function in the recommendation phase]

Collaborative Filtering: Measuring Similarities
Pearson correlation: weight neighbors by their degree of correlation with the target user, i.e., between user U and user J.

A correlation of 1 means very similar, 0 means no correlation, and -1 means dissimilar.

- Works well with user ratings (where there is at least a range, e.g., 1-5)
- Not always possible (in some situations we may only have implicit binary values, e.g., whether a user did or did not select a document)
- Alternatively, a variety of distance or similarity measures can be used

Over the items i rated by both users:

    sim(U, J) = Σi (rU,i − r̄U)(rJ,i − r̄J) / ( √Σi (rU,i − r̄U)² × √Σi (rJ,i − r̄J)² )

where r̄J is the average rating of user J on all items.
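The Pearson similarity described above can be sketched directly; the two toy rating dictionaries below are hypothetical.

```python
import math

def pearson(ratings_u, ratings_j):
    """Pearson correlation between two users over their co-rated items.

    ratings_u / ratings_j: dicts mapping item -> rating.
    Returns a value in [-1, 1]; 0.0 if there is no usable overlap.
    """
    common = set(ratings_u) & set(ratings_j)
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_j = sum(ratings_j[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mean_u) * (ratings_j[i] - mean_j) for i in common)
    den_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_j = math.sqrt(sum((ratings_j[i] - mean_j) ** 2 for i in common))
    if den_u == 0 or den_j == 0:
        return 0.0
    return num / (den_u * den_j)

# Hypothetical users: same taste shifted by a constant -> correlation 1.0
u = {"A": 5, "B": 3, "C": 4}
j = {"A": 4, "B": 2, "C": 3}
print(round(pearson(u, j), 2))  # 1.0
```

Note the mean-centering: it makes the measure insensitive to users who rate systematically high or low, which is exactly why Pearson works well when a rating range (e.g., 1-5) is available.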

Collaborative filtering recommenders: predictions for unseen (target) items are computed based on the other users with similar interest scores on items in user u's profile, i.e., users with similar tastes (aka nearest neighbors); this requires computing correlations between user u and other users according to interest scores or ratings.

[Figure: predictions for Karen on "Indep. Day" based on the k nearest neighbors, each weighted by its correlation to Karen]

Collaborative Filtering: Making Predictions
When generating predictions from the nearest neighbors, neighbors can be weighted based on their distance to the target user.
To generate a prediction for a target user a on an item i:

    pred(a, i) = r̄a + Σu sim(a, u) × (ru,i − r̄u) / Σu |sim(a, u)|

where:
- r̄a = mean rating for user a
- u1, ..., uk are the k nearest neighbors of a (the sums run over these neighbors)
- ru,i = rating of user u on item i
- sim(a, u) = Pearson correlation between a and u

This is a weighted average of deviations from the neighbors' mean ratings (and closer neighbors count more).
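The weighted-deviation prediction described above can be sketched as follows; the neighbor similarities, ratings, and means in the example call are hypothetical.

```python
def predict(mean_a, neighbors):
    """Weighted-deviation prediction for a target user a on one item.

    mean_a:    mean rating of the target user a
    neighbors: list of (sim_au, rating_ui, mean_u) triples for the
               k nearest neighbors who rated the target item

    Implements: pred = mean_a + sum(sim * (r - mean_u)) / sum(|sim|)
    """
    num = sum(sim * (r - mean_u) for sim, r, mean_u in neighbors)
    den = sum(abs(sim) for sim, _, _ in neighbors)
    return mean_a if den == 0 else mean_a + num / den

# Hypothetical: two neighbors with correlations 0.9 and 0.5 to the target user
print(round(predict(3.0, [(0.9, 5, 4.0), (0.5, 2, 3.0)]), 4))  # 3.2857
```

Falling back to the user's own mean when no weighted neighbors are available is a common convention, not part of the formula itself.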

Example Collaborative System (k-nearest neighbor with k = 1)

Ratings across Item1-Item6 (unrated items are blank; "?" is the target prediction):

Alice:  5, 2, 3, 3, ?
User 1: 2, 4, 4, 1        correlation with Alice: -1.00
User 2: 2, 1, 3, 1, 2     correlation with Alice:  0.33
User 3: 4, 2, 3, 2, 1     correlation with Alice:  0.90  (best match)
User 4: 3, 3, 2, 3, 1     correlation with Alice:  0.19
User 5: 3, 2, 2, 2        correlation with Alice: -1.00
User 6: 5, 3, 1, 3, 2     correlation with Alice:  0.65
User 7: 5, 1, 5, 1        correlation with Alice: -1.00

With k = 1, the prediction for Alice comes from the best match, User 3.

Collaborative Recommenders: Problems of Scale

Item-Based Collaborative Filtering
- Find similarities among the items based on ratings across users
- Often measured based on a variation of the cosine measure
- The prediction of item i for user a is based on the past ratings of user a on items similar to i


The predicted rating for Karen on "Indep. Day" will be 7, because she rated "Star Wars" 7 — that is, if we use only the single most similar item. Otherwise, we can use the k most similar items and again use a weighted average.

sim(Star Wars, Indep. Day) > sim(Jur. Park, Indep. Day) > sim(Termin., Indep. Day)
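The item-based scheme can be sketched as cosine similarity between item rating vectors plus a weighted average of the user's own ratings on the k most similar items. The movie names mirror the slide's example; the numeric ratings and similarity values are hypothetical.

```python
import math

def cosine_sim(item_a, item_b):
    """Cosine similarity between two items' rating vectors (dict: user -> rating)."""
    common = set(item_a) & set(item_b)
    num = sum(item_a[u] * item_b[u] for u in common)
    den = (math.sqrt(sum(v * v for v in item_a.values()))
           * math.sqrt(sum(v * v for v in item_b.values())))
    return num / den if den else 0.0

def item_based_predict(user_ratings, target_sims, k=1):
    """Weighted average of the user's ratings on the k items most similar to the target."""
    top = sorted(target_sims.items(), key=lambda kv: kv[1], reverse=True)[:k]
    den = sum(abs(s) for _, s in top)
    return sum(s * user_ratings[i] for i, s in top) / den if den else 0.0

# Karen's ratings and hypothetical similarities to "Indep. Day",
# ordered as on the slide: Star Wars > Jur. Park > Termin.
karen = {"Star Wars": 7, "Jur. Park": 6, "Termin.": 3}
sims = {"Star Wars": 0.8, "Jur. Park": 0.5, "Termin.": 0.4}
print(item_based_predict(karen, sims, k=1))  # 7.0 -- only the most similar item is used
```

With k = 1 the prediction is simply Karen's rating on "Star Wars" (7), matching the slide; larger k blends in the other items.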

Item-Based Collaborative Filtering

Alice:  5, 2, 3, 3, ?
User 1: 2, 4, 4, 1
User 2: 2, 1, 3, 1, 2
User 3: 4, 2, 3, 2, 1
User 4: 3, 3, 2, 3, 1
User 5: 3, 2, 2, 2
User 6: 5, 3, 1, 3, 2
User 7: 5, 1, 5, 1

Item similarities to the target item: 0.76, 0.79, 0.60, 0.71, 0.75. The best-matching item (similarity 0.79) drives the prediction.

Collaborative Filtering: Evaluation
- Split users into train/test sets
- For each user a in the test set:
  - Split a's votes into observed (I) and to-predict (P)
  - Measure the average absolute deviation between predicted and actual votes in P (MAE = mean absolute error)
- Average over all test users
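The MAE computation described in the evaluation procedure, sketched for a single test user with hypothetical predicted and actual votes:

```python
def mae(predicted, actual):
    """Mean absolute error between predicted and actual votes in the set P."""
    assert len(predicted) == len(actual) and predicted
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical votes in P for one test user:
print(mae([3.5, 4.0, 2.0], [4, 4, 3]))  # 0.5
```

The full protocol averages this per-user MAE over all users in the test set.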

Data Sparsity Problems
- Cold start problem: how to recommend new items? What to recommend to new users?
- Straightforward approaches:
  - Ask/force users to rate a set of items
  - Use another method (e.g., content-based, demographic, or simply non-personalized) in the initial phase
- Alternatives:
  - Use better algorithms (beyond nearest-neighbor approaches)
  - In nearest-neighbor approaches, the set of sufficiently similar neighbors might be too small to make good predictions
  - Use model-based approaches (clustering, dimensionality reduction, etc.)

Example Algorithms for Sparse Datasets: Recursive CF
- Assume there is a very close neighbor n of u who has not yet rated the target item i.
- Apply the CF method recursively and predict a rating for item i for that neighbor.
- Use this predicted rating instead of the rating of a more distant direct neighbor.

         Item1  Item2  Item3  Item4  Item5
Alice      5      3      4      4      ?
User 1     3      1      2      3      ?
User 2     4      3      4      3      5
User 3     3      3      1      5      4
User 4     1      5      5      2      1

sim(Alice, User 1) = 0.85; first predict a rating for User 1 on Item5, then use it for Alice's prediction.

More Model-Based Approaches
- Many approaches:
  - Matrix factorization techniques, statistics: singular value decomposition, principal component analysis
  - Approaches based on clustering
  - Association rule mining (compare: shopping basket analysis)
  - Probabilistic models: clustering models, Bayesian networks, probabilistic latent semantic analysis
  - Various other machine learning approaches
- Costs of pre-processing are usually not discussed
- Are incremental updates possible?

Dimensionality Reduction
- Basic idea: trade more complex offline model building for faster online prediction generation
- Singular value decomposition (SVD) for dimensionality reduction of rating matrices
- Captures important factors/aspects and their weights in the data; factors can be genre or actors, but also non-interpretable ones
- Assumption: k dimensions capture the signals and filter out noise (k = 20 to 100)
- Constant time to make recommendations
- The approach is also popular in IR (latent semantic indexing), data compression, ...

A picture says ...
[Figure: users Bob, Mary, Alice, and Sue plotted in the reduced two-dimensional factor space]

Matrix Factorization

Uk =
          Dim1    Dim2
  Alice   0.47   -0.30
  Bob    -0.44    0.23
  Mary    0.70   -0.06
  Sue     0.31    0.93

Sk =
          Dim1   Dim2
  Dim1    5.63   0
  Dim2    0      3.23

Vk^T =
  Dim1:  -0.44  -0.57   0.06   0.38   0.57
  Dim2:   0.58  -0.66   0.26   0.18  -0.36

SVD: R ≈ Uk × Sk × Vk^T
Prediction: 3 + 0.84 = 3.84
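The SVD-based reduction above can be sketched with NumPy: mean-center the ratings per user, factor the matrix, keep only the k strongest dimensions, and read predictions out of the low-rank reconstruction. The 4x5 rating matrix below is hypothetical.

```python
import numpy as np

# A minimal sketch of SVD-based dimensionality reduction for a rating
# matrix. Rows are users, columns are items; the values are hypothetical.
R = np.array([
    [5.0, 3.0, 4.0, 4.0, 3.0],
    [3.0, 1.0, 2.0, 3.0, 3.0],
    [4.0, 3.0, 4.0, 3.0, 5.0],
    [1.0, 5.0, 5.0, 2.0, 1.0],
])

# Mean-center per user so the factors model deviations from each user's mean.
user_means = R.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(R - user_means, full_matrices=False)

# Keep the k strongest singular values (the "signal" dimensions).
k = 2
R_hat = user_means + U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Each entry of R_hat is a prediction: user mean plus the rank-k deviation.
print(np.round(R_hat, 2))
```

The expensive factorization happens offline; at prediction time, looking up an entry of the reconstructed matrix (or the dot product of one user row with one item column) is constant time, as the slide notes.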

Content-Based Recommendation
- Collaborative filtering does NOT require any information about the items.
- However, it might be reasonable to exploit such information, e.g., recommend fantasy novels to people who liked fantasy novels in the past.
- What do we need: some information about the available items such as the