recommender systems - king's college london common approach to designing recommender systems is...

Recommender Systems

6CCS3WSN-7CCSMWAL

http://insidebigdata.com/wp-content/uploads/2014/06/Humor-recommender.jpg

Some basic methods of recommendation

• Recommend popular items

• Collaborative Filtering
  Item-to-Item: People who buy X also buy Y
  Amazon (Items), Facebook (Friends), YouTube (Movies)

• Content Based Filtering
  User Profile plus Description of Items
  If a User watched a lot of Spy movies, recommend items classified as Spy movies.
  Used by Netflix (among other things)

• Anything else (graph clustering, ...)

Amazon.com: Item-to-Item Collaborative Filtering

User Personal Profile

Various types of Item recommendation

Netflix: Content based filtering

One Netflix personalization is the collection of genre rows (aimed at the user's tastes). These range from familiar high-level categories like Comedies and Dramas to highly tailored slices such as Imaginative Time Travel Movies.

Each row has 3 layers of personalization (for the user): the choice of genre itself, the subset of titles selected within that genre, and the ranking of those titles.

(Experimentally) we measured an increase in member retention by placing the most (user) tailored rows higher on the page instead of lower. (Example of A/B testing)

The Netflix Prize

The Netflix Prize and the Recommendation Problem

In 2006 we announced the Netflix Prize, a machine learning and data mining competition for movie rating prediction. We offered $1 million to whoever improved the accuracy of our existing system called Cinematch by 10%. We conducted this competition to find new ways to improve the recommendations we provide to our members, which is a key part of our business. However, we had to come up with a proxy question that was easier to evaluate and quantify: the root mean squared error (RMSE) of the predicted rating. The race was on to beat our RMSE of 0.9525 with the finish line of reducing it to 0.8572 or less. A year into the competition, the Korbell team won the first Progress Prize with an 8.43% improvement. They reported more than 2000 hours of work in order to come up with the final combination of 107 algorithms that gave them this prize. (…………) To put these algorithms to use, we had to work to overcome some limitations, for instance that they were built to handle 100 million ratings, instead of the more than 5 billion that we have, and that they were not built to adapt as members added more ratings. But once we overcame those challenges, we put the two algorithms into production, where they are still used as part of our recommendation engine.

http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html
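The proxy measure named in the quote can be sketched in a few lines. This is a minimal illustration, not Netflix code: the `rmse` helper and the sample ratings are assumptions made up for the example. The last line only checks the arithmetic of the quoted target, a 10% reduction of 0.9525.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predictions and true ratings, for illustration only.
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4, 4, 1, 5]
error = rmse(predicted, actual)

# A 10% improvement on the Cinematch RMSE of 0.9525 quoted above.
target = 0.9525 * (1 - 0.10)
print(round(error, 4), round(target, 5))
```

A lower RMSE means the predicted ratings are, on average, closer to the ratings users actually gave.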

Collaborative filtering

• Item based

• User based

Basic idea: Exploit User-Item relationships

Item based: Path melon-A-grapes.
People who like Item melon also like Item grapes.

User based: Path C-melon-A. Users A and C are similar.
Application?
Recommend User A's items to C (shopping).
Recommend Users A and C to each other (online dating).

Example: Item based Collaborative Filtering

Cosine Similarity

$$S(a, b) = \cos(a, b) = \frac{a \cdot b}{|a|\,|b|}$$

where $a = (a_1, \ldots, a_n)$ is a vector, $a \cdot b = \sum_{i=1}^{n} a_i b_i$, and $|a|^2 = \sum_{i=1}^{n} a_i^2$.

Where did we see this before?
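As a minimal sketch, the definition translates directly into code. The function name `cosine_similarity` and the two test vectors are illustrative assumptions.

```python
import math

def cosine_similarity(a, b):
    """cos(a, b) = (a . b) / (|a| |b|) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])  # parallel vectors
orthogonal = cosine_similarity([1, 0], [0, 1])            # perpendicular vectors
print(same_direction, orthogonal)
```

Only the direction of the vectors matters, not their length, which is why scaling a vector leaves the similarity unchanged.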

We will consider the following sample data of the preferences of four users for three items:

ID   user  item  rating
241  u1    m1    2
222  u1    m3    3
276  u2    m1    5
273  u2    m2    2
200  u3    m1    3
229  u3    m2    3
231  u3    m3    1
239  u4    m2    2
286  u4    m3    2

Step 1: Write the user-item ratings data in a matrix form.

     m1  m2  m3
u1   2   ?   3
u2   5   2   ?
u3   3   3   1
u4   ?   2   2
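Step 1 can be sketched as follows. The nested-dictionary layout and the use of `None` for the "?" entries are assumptions chosen for readability. A real system would more likely use a sparse matrix.

```python
# (ID, user, item, rating) rows copied from the sample data.
ratings = [
    (241, "u1", "m1", 2), (222, "u1", "m3", 3),
    (276, "u2", "m1", 5), (273, "u2", "m2", 2),
    (200, "u3", "m1", 3), (229, "u3", "m2", 3), (231, "u3", "m3", 1),
    (239, "u4", "m2", 2), (286, "u4", "m3", 2),
]

users = sorted({user for _, user, _, _ in ratings})
items = sorted({item for _, _, item, _ in ratings})

# matrix[user][item] is the rating, or None where the table shows "?".
matrix = {user: {item: None for item in items} for user in users}
for _, user, item, rating in ratings:
    matrix[user][item] = rating

# Print the matrix row by row, with "?" for missing ratings.
for user in users:
    cells = ["?" if matrix[user][item] is None else str(matrix[user][item])
             for item in items]
    print(user, *cells)
```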

Calculate similarity

Step 2. Calculate similarity between items, e.g. m1 and m2.

     m1  m2               m1  m2
u1   2   ?
u2   5   2    ------>     5   2
u3   3   3                3   3
u4   ?   2

Fortunately, both m1 and m2 have been rated by users u2 and u3. We create two item-vectors, v1 for item m1 and v2 for item m2, and find the cosine similarity between them. There are several possible approaches at this point. We use the one where the similarity is based only on the users who rated both items, ignoring all other ratings. Thus, the two item-vectors are:

v1 = 5u2 + 3u3

v2 = 2u2 + 3u3

The cosine similarity between the two vectors, v1 and v2, would then be:

$$\cos(v_1, v_2) = \frac{5 \cdot 2 + 3 \cdot 3}{\sqrt{25 + 9}\,\sqrt{4 + 9}} = 0.904$$

Item-item similarity

The complete item-to-item similarity matrix is as follows:

      m1    m2    m3
m1    1     0.90  0.79
m2          1     0.87
m3                1

This table can be pre-computed.
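A minimal sketch of how this table could be pre-computed, using the co-rated-users variant of cosine similarity described in Step 2. The helper name `item_similarity` and the choice to return 0.0 when no user rated both items are assumptions.

```python
import math

# User-item matrix from Step 1; None marks a missing rating.
matrix = {
    "u1": {"m1": 2, "m2": None, "m3": 3},
    "u2": {"m1": 5, "m2": 2, "m3": None},
    "u3": {"m1": 3, "m2": 3, "m3": 1},
    "u4": {"m1": None, "m2": 2, "m3": 2},
}
items = ["m1", "m2", "m3"]

def item_similarity(matrix, item_a, item_b):
    """Cosine similarity over the users who rated both items."""
    pairs = [(row[item_a], row[item_b]) for row in matrix.values()
             if row[item_a] is not None and row[item_b] is not None]
    if not pairs:
        return 0.0  # assumption: no co-rating users means no evidence of similarity
    dot = sum(a * b for a, b in pairs)
    norm_a = math.sqrt(sum(a * a for a, _ in pairs))
    norm_b = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_a * norm_b)

similarity = {(a, b): item_similarity(matrix, a, b) for a in items for b in items}

# Print the full (symmetric) table, rounded to two decimals.
for a in items:
    print(a, *(f"{similarity[(a, b)]:.2f}" for b in items))
```

The table is symmetric, so only the upper triangle needs to be stored.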

Step 3. Use the table to estimate user ratings for missing items.

u1 rated m1 and m3:

     m1  m2  m3
u1   2   ?   3

$$R(u_1, m_2) = \frac{2\,S(m_1, m_2) + 3\,S(m_2, m_3)}{S(m_1, m_2) + S(m_2, m_3)} = \frac{2 \cdot 0.90 + 3 \cdot 0.87}{0.90 + 0.87} = 2.5$$

$$R(u_2, m_3) = \frac{5\,S(m_1, m_3) + 2\,S(m_2, m_3)}{S(m_1, m_3) + S(m_2, m_3)} = 3.4$$

$$R(u_4, m_1) = \frac{2\,S(m_1, m_2) + 2\,S(m_1, m_3)}{S(m_1, m_2) + S(m_1, m_3)} = 2$$

Fill in missing values

Before

     m1  m2  m3
u1   2   ?   3
u2   5   2   ?
u3   3   3   1
u4   ?   2   2

After

     m1  m2   m3
u1   2   2.5  3
u2   5   2    3.4
u3   3   3    1
u4   2   2    2
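The similarity-weighted average of Step 3 can be sketched as below. The helper names `sim` and `predict` are assumptions, and the similarities are the rounded values from the item-item table. The sketch assumes every user has rated at least one other item, otherwise the denominator would be zero.

```python
# Item-item similarities from Step 2, rounded as in the table.
similarity = {("m1", "m2"): 0.90, ("m1", "m3"): 0.79, ("m2", "m3"): 0.87}

def sim(a, b):
    """Look up the symmetric similarity of two distinct items."""
    return similarity[(a, b)] if (a, b) in similarity else similarity[(b, a)]

# User-item matrix from Step 1; None marks a missing rating.
matrix = {
    "u1": {"m1": 2, "m2": None, "m3": 3},
    "u2": {"m1": 5, "m2": 2, "m3": None},
    "u3": {"m1": 3, "m2": 3, "m3": 1},
    "u4": {"m1": None, "m2": 2, "m3": 2},
}

def predict(matrix, user, target):
    """Similarity-weighted average of the ratings the user has already given."""
    rated = [(item, r) for item, r in matrix[user].items()
             if r is not None and item != target]
    total_weight = sum(sim(item, target) for item, _ in rated)
    return sum(r * sim(item, target) for item, r in rated) / total_weight

# Fill every missing cell with its prediction, rounded to one decimal.
filled = {user: {item: (r if r is not None else round(predict(matrix, user, item), 1))
                 for item, r in row.items()}
          for user, row in matrix.items()}

for user, row in filled.items():
    print(user, *row.values())
```

Because each prediction is a weighted average, it always lies between the lowest and highest rating the user has already given.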

Content based filtering

J's favorite cake is Choco Cream. J went to a cake shop for it, but Choco Cream cakes were sold out.

J asked the shopkeeper to recommend something similar and was recommended Choco Fudge, a cake that has the same ingredients. J bought it.

Content-based (CB) filtering systems recommend items similar to items a user liked in the past.

These systems focus on algorithms that assemble users' preferences into user profiles and all item information into item profiles. They then recommend the items whose profiles are most similar to the user's profile.

A user profile is a set of assigned keywords (terms, features) collected from items previously found relevant (or interesting) by the user.

An item profile is a set of assigned keywords (terms, features) of the item itself.
See http://recommender.no/info/content-based-filtering-recommender-systems/

J liked Choco Cream cakes; their ingredients (along with other things J likes) form J's user profile.

The system reviewed the other available item profiles and found that the Choco Fudge cake had the most similar item profile.

The similarity is high because both cakes have many of the same ingredients (chocolate, sugar, sponge cake).

This was the reason for the recommendation.
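The cake example can be sketched with keyword sets as profiles. Only chocolate, sugar and sponge cake come from the text. The other ingredients, the other cakes and the helper `keyword_cosine` are hypothetical, chosen just to make the comparison concrete.

```python
import math

# Item profiles as keyword sets. Only chocolate, sugar and sponge cake come
# from the text; every other ingredient and cake is a made-up example.
item_profiles = {
    "Choco Fudge": {"chocolate", "sugar", "sponge cake", "fudge"},
    "Lemon Tart": {"lemon", "sugar", "pastry", "butter"},
    "Carrot Cake": {"carrot", "sugar", "flour", "walnut"},
}

# J's user profile: keywords of the item J liked (Choco Cream).
user_profile = {"chocolate", "sugar", "sponge cake", "cream"}

def keyword_cosine(a, b):
    """Cosine similarity of two keyword sets viewed as 0/1 vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

# Rank the available items by similarity to the user profile.
ranked = sorted(item_profiles,
                key=lambda name: keyword_cosine(user_profile, item_profiles[name]),
                reverse=True)
recommendation = ranked[0]
print(recommendation)
```

For 0/1 vectors the dot product is the number of shared keywords, so the cosine reduces to the overlap divided by the geometric mean of the two set sizes.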

Q: Where have we seen this sort of thing before?

Where have we seen this before?

Item-1 = (property 1, property 2, ..., property n)
Item-2 = (property 1, property 2, ..., property n)
Item-k = (property 1, property 2, ..., property n)

User-Tastes = (property 1, property 2, ..., property n)

An Item is a vector of properties.
A User's Tastes is a vector of properties.
Retrieve the Items most appropriate to the User's Tastes.

The quality of the system depends on finding good descriptive properties.

Classic Information Retrieval

A document is a vector of terms.
A user query is a vector of terms.
Retrieve the documents most appropriate to the user query.

An Item is a vector of properties.
A User's Tastes is a vector of properties.
Retrieve the Items most appropriate to the User's Tastes.

Classic Information Retrieval

Summary

A common approach to designing recommender systems is content-based filtering. Content-based filtering methods are based on a description of the item and a profile of the user's preferences.

In a content-based recommender system, keywords are used to describe the items and a user profile is built to indicate the type of item this user likes. In other words, the algorithm tries to recommend items that are similar to those that a user liked in the past (or is examining in the present). In particular, various candidate items are compared with items previously rated by the user and the best-matching items are recommended. This approach has its roots in information retrieval and information filtering research.

To abstract the features of the items in the system, an item presentation algorithm is applied. A widely used algorithm is the tf-idf representation (also called vector space representation).

To create a user profile, the system mostly focuses on two types of information:
1. A model of the user's preference.
2. A history of the user's interaction with the recommender system.

(Wikipedia)
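The tf-idf representation mentioned in the summary can be sketched as follows. There are several tf-idf variants. This one assumes tf = count / description length and idf = log(N / document frequency), and the item descriptions are hypothetical.

```python
import math
from collections import Counter

# Hypothetical item descriptions, already split into keywords.
descriptions = {
    "item1": ["spy", "action", "thriller", "spy"],
    "item2": ["romance", "comedy", "action"],
    "item3": ["spy", "comedy", "action"],
}

def tf_idf(descriptions):
    """Return {item: {term: weight}} with tf = count / length and
    idf = log(N / document frequency), one common variant of tf-idf."""
    n_items = len(descriptions)
    # Number of items whose description contains each term.
    doc_freq = Counter(term for terms in descriptions.values() for term in set(terms))
    weights = {}
    for item, terms in descriptions.items():
        counts = Counter(terms)
        weights[item] = {
            term: (count / len(terms)) * math.log(n_items / doc_freq[term])
            for term, count in counts.items()
        }
    return weights

weights = tf_idf(descriptions)
for term, weight in sorted(weights["item1"].items()):
    print(term, round(weight, 3))
```

A keyword that occurs in every item, such as "action" here, gets weight zero because it does not help to tell items apart.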

Compare some methods for MovieLens data

• MovieLens. https://movielens.org/
  Non-commercial, personalized movie recommendations.

• Data (user, movie, rating, ...)

• Compare Popular, Random, UBCF (user-based collaborative filtering) and IBCF (item-based collaborative filtering) recommendations

• Create an evaluation scheme for the data set

• Take 90% of the data for training (to build the data matrix), predict the top n recommendations for each user with each method, and then check the answers against the 10% of the data we kept back

• Get n = 1, 3, 5, 10, 15, 20 recommendations for users
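The hold-out idea can be sketched as below. This is a simplification under stated assumptions: it splits individual ratings at random rather than reproducing any particular library's scheme, the data is synthetic rather than MovieLens, and `split_ratings` and `popular_top_n` are hypothetical helper names.

```python
import random
from collections import Counter

def split_ratings(ratings, train_fraction=0.9, seed=0):
    """Shuffle (user, item, rating) triples and split them into train/test."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def popular_top_n(train, user, n):
    """POPULAR baseline: the n most-rated items the user has not rated yet."""
    seen = {item for u, item, _ in train if u == user}
    counts = Counter(item for _, item, _ in train if item not in seen)
    return [item for item, _ in counts.most_common(n)]

# Synthetic ratings for illustration only (not MovieLens data).
rng = random.Random(1)
ratings = [(f"u{u}", f"m{m}", rng.randint(1, 5))
           for u in range(20) for m in range(10) if rng.random() < 0.5]

train, test = split_ratings(ratings)
top3 = popular_top_n(train, "u0", 3)

# Check the recommendations against the ratings that were held back.
held_out = {item for u, item, _ in test if u == "u0"}
hits = [item for item in top3 if item in held_out]
print(len(train), len(test), top3, hits)
```

Repeating the last step for every user and every n gives the counts from which ROC and precision-recall curves are drawn.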

Results. ROC and precision-recall

The meaning of plots. See IR lectures

True and false positives

True positive rate = True positives / Relevant docs
False positive rate = False positives / Non-relevant docs

The simplest case: we know the true answer (which documents are relevant), and we look at how well the classifier worked.


Evaluating an IR system

• Precision: fraction of retrieved docs that are relevant

• Recall: fraction of relevant docs that are retrieved

• False negatives: relevant docs judged as non-relevant by IR system

• Consider the first row sum and first column sum

                Relevant              Non-relevant
Retrieved       tp (true positive)    fp (false positive)
Not Retrieved   fn (false negative)   tn (true negative)

Precision P = tp / (tp + fp)
Recall R = tp / (tp + fn)
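These definitions can be sketched directly from the contingency table. The function name `evaluate_retrieval` and the ten-document example are assumptions. The sketch also assumes that the retrieved, relevant and non-relevant sets are all non-empty, so no denominator is zero.

```python
def evaluate_retrieval(retrieved, relevant, collection):
    """Count tp/fp/fn/tn for one query and derive the rates from them."""
    retrieved, relevant, collection = set(retrieved), set(relevant), set(collection)
    tp = len(retrieved & relevant)               # retrieved and relevant
    fp = len(retrieved - relevant)               # retrieved but not relevant
    fn = len(relevant - retrieved)               # relevant but missed
    tn = len(collection - retrieved - relevant)  # correctly left out
    return {
        "precision": tp / (tp + fp),            # tp / first row sum
        "recall": tp / (tp + fn),               # tp / first column sum
        "true_positive_rate": tp / (tp + fn),   # same quantity as recall
        "false_positive_rate": fp / (fp + tn),  # fp / all non-relevant docs
    }

# Hypothetical example: 10 documents, 4 relevant, 5 retrieved.
collection = [f"d{i}" for i in range(10)]
relevant = ["d0", "d1", "d2", "d3"]
retrieved = ["d0", "d1", "d4", "d5", "d6"]
scores = evaluate_retrieval(retrieved, relevant, collection)
print(scores)
```

An ROC curve plots the true positive rate against the false positive rate, while a precision-recall curve plots precision against recall, each point corresponding to one value of n.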

Various comments

• It seems like UBCF did better than IBCF.

• Then why would we use IBCF?

• The answer lies in when and how we are generating recommendations.

• UBCF saves the whole matrix of data and generates the recommendation at predict time by finding the closest user.

• IBCF saves only the k closest items in the matrix and doesn't have to generate everything.

• It is pre-calculated, and predict simply reads off the closest items.

• Understandably, RANDOM is the worst.

• But perhaps surprisingly, it's hard to beat POPULAR. I guess we are not so different, you and I.

Quoted from https://www.r-bloggers.com/testing-recommender-systems-in-r/
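The point about IBCF storing only the k closest items can be sketched as below. This is an illustration of the idea, not how any particular library implements it. The helper names are hypothetical and the similarity values are taken from the worked example earlier in these notes.

```python
# Full item-item similarity table (values from the worked example).
similarity = {
    "m1": {"m2": 0.90, "m3": 0.79},
    "m2": {"m1": 0.90, "m3": 0.87},
    "m3": {"m1": 0.79, "m2": 0.87},
}

def build_ibcf_model(similarity, k):
    """Offline step: keep only the k most similar neighbours of each item."""
    return {item: sorted(neighbours.items(), key=lambda pair: pair[1], reverse=True)[:k]
            for item, neighbours in similarity.items()}

def recommend(model, liked_item, n):
    """Prediction step: read off the stored neighbours of a liked item."""
    return [item for item, _ in model[liked_item][:n]]

model = build_ibcf_model(similarity, k=1)
suggestions = recommend(model, "m3", n=1)
print(model, suggestions)
```

The stored model grows with the number of items times k, not with the number of users, which is why prediction is cheap even though building the table is not.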

R for experiment

# https://www.r-bloggers.com/testing-recommender-systems-in-r/
# Load required library
library(recommenderlab)

data(MovieLense)
# 943 x 1664 rating matrix of class realRatingMatrix with 99392 ratings.

# Let's check some algorithms against each other
# scheme: read up on the details; 90% of the data is used for training (to fill in the matrix)
scheme <- evaluationScheme(MovieLense, method = "split", train = .9,
                           k = 1, given = 10, goodRating = 4)

algorithms <- list(
  "random items"  = list(name = "RANDOM",  param = list(normalize = "Z-score")),
  "popular items" = list(name = "POPULAR", param = list(normalize = "Z-score")),
  "user-based CF" = list(name = "UBCF",    param = list(normalize = "Z-score",
                                                        method = "Cosine",
                                                        nn = 50, minRating = 3)),
  "item-based CF" = list(name = "IBCF",    param = list(normalize = "Z-score"))
)

# Run algorithms, predict next n movies
results <- evaluate(scheme, algorithms, n = c(1, 3, 5, 10, 15, 20))

# Draw ROC curve
plot(results, annotate = 1:4, legend = "topleft")

# See precision / recall
plot(results, "prec/rec", annotate = 3)

The notes used material from:

• The Netflix Prize: http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html

• Amazon.com Recommendations: Item-to-Item Collaborative Filtering. https://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf

• Chapter 9 of Mining of Massive Datasets, Jure Leskovec, Anand Rajaraman, Jeff Ullman. http://www.mmds.org/#book

• https://ashokharnal.wordpress.com/2014/12/18/worked-out-example-item-based-collaborative-filtering-for-recommenmder-engine/
  Example: Item based Collaborative Filtering. But the working is wrong on that page.

• http://recommender.no/ All sorts of stuff, e.g. http://recommender.no/info/content-based-filtering-recommender-systems/

• And not forgetting Wikipedia.

• https://www.r-bloggers.com/testing-recommender-systems-in-r/
  https://sanealytics.com/2012/06/10/testing-recommender-systems-in-r/
