how to build recommender system

41
ФАКУЛТЕТ ЗА ИНФОРМАТИЧКИ НАУКИ И КОМПЈУТЕРСКО ИНЖЕНЕРСТВО СИСТЕМИ ЗА ПРЕПОРАКИ. Системи на знаење

Upload: mitko-gurbanski

Post on 22-Mar-2017

93 views

Category:

Science


3 download

TRANSCRIPT

Page 1: How to build recommender system

ФАКУЛТЕТ ЗА ИНФОРМАТИЧКИ НАУКИ И КОМПЈУТЕРСКО ИНЖЕНЕРСТВО

СИСТЕМИ ЗА ПРЕПОРАКИ.

Системи на знаење

Page 2: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Препораки

База

Пребарување Препораки

Производи, веб страни, блогови, новинарски текстови, …

Items

Search Recommendations

Products, web sites, blogs,  news  items,  …

1/29/2013 4 Jure Leskovec, Stanford C246: Mining Massive Datasets

Examples:

Page 3: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Пример на систем за препораки

Customer X Buys Metallica CD Buys Megadeth CD

Customer Y Does search on Metallica Recommender system

suggests Megadeth from data collected about customer X

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 3 3

Customer X Buys Metallica CD Buys Megadeth CD

Customer Y Does search on Metallica Recommender system

suggests Megadeth from data collected about customer X

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 3

Page 4: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Од оскудност до изобилство (From scarcity to abundance)

n Shelf space is a scarce commodity for traditional retailers ¨ Also: TV networks, movie theaters,…

n The web enables near-zero-cost dissemination of information about products ¨ From scarcity to abundance

n More choice necessitates better filters ¨ Recommendation engines ¨ How Into Thin Air made Touching the Void a

bestseller: http://www.wired.com/wired/archive/12.10/tail.html

Page 5: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Долгата опашка (The Long Tail)

Source: Chris Anderson (2004) Source: Chris Anderson (2004)

1/29/2013 6 Jure Leskovec, Stanford C246: Mining Massive Datasets

Page 6: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Физички наспроти Интернет продавници

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 7

Read http://www.wired.com/wired/archive/12.10/tail.html to learn more! 6

Page 7: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Recommendation Types

n Editorial and hand curated ¨ List of favorites

¨ Lists of “essential” items

n Simple aggregates ¨ Top 10, Most Popular, Recent Uploads

n Tailored to individual users ¨ Amazon, Netflix, …

Page 8: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Formal Model

n X = set of Customers

n S = set of Items n Utility function u: X x S àR

¨ R = set of ratings

¨ R is a totally ordered set

¨ e.g., 0-5 stars, real number in [0,1]

Page 9: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Utility Matrix

0.410.2

0.30.50.21

King Kong LOTR Matrix Nacho Libre

Alice

Bob

Carol

David

LOTR = Lord of the Rings

Page 10: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Key Problems (1) Gathering  “known”  ratings  for  matrix

How to collect the data in the utility matrix

(2) Extrapolate unknown ratings from the known ones Mainly interested in high unknown ratings

We  are  not  interested  in  knowing  what  you  don’t  like   but what you like

(3) Evaluating extrapolation methods How to measure success/performance of

recommendation methods

1/29/2013 11 Jure Leskovec, Stanford C246: Mining Massive Datasets

Page 11: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Gathering Ratings

n Explicit ¨ Ask people to rate items ¨ Doesn’t work well in practice – people can’t

be bothered n  Implicit

¨ Learn ratings from user actions ¨ e.g., purchase implies high rating ¨ What about low ratings?

Page 12: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Extrapolating Utilities n Key problem: matrix U is sparse

¨ most people have not rated most items ¨ Cold start:

n  New items have no ratings n  New users have no history

n Three approaches to recommender systems ¨ 1) Content-based ¨ 2) Collaborative ¨ 3) Hybrid (Latent factor based)

Page 13: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Content-based Recommender Systems 13

Page 14: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Content-based recommendations

n Main idea: recommend items to customer x similar to previous items rated highly by x

Examples: n Movie recommendations

¨ recommend movies with same actor(s), director, genre, …

n Websites, blogs, news ¨ recommend other sites with “similar” content

Page 15: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Plan of action

likes

Item profiles

Red Circles

Triangles User profile

match

recommend build

Page 16: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Item Profiles n For each item, create an item profile

n Profile is a set (vector) of features ¨ movies: author, title, actor, director,…

¨ text: set of “important” words in document

n How to pick important features (words)? ¨ Usual heuristic from text mining is TF-IDF

(Term Frequency * Inverse Doc Frequency) n  Term ... Feature

n  Document ... Item

Page 17: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

TF-IDF fij = frequency of term (feature) i in doc (item) j

ni = number of docs that mention term i N = total number of docs TF-IDF score: wij = TFij × IDFi

Doc profile = set of words with highest TF-IDF scores, together with their scores

1/29/2013 18 Jure Leskovec, Stanford C246: Mining Massive Datasets

Note: we normalize TF to  discount  for  “longer”   documents

Page 18: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

User profiles and prediction

User profile possibilities: Weighted average of rated item profiles Variation: weight by difference from average

rating for item …

Prediction heuristic: Given user profile x and item profile i, estimate 𝑢(𝒙, 𝒊)  =  cos  (𝒙, 𝒊)  =   𝒙·𝒊

| 𝒙 |⋅| 𝒊 |

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 19

Page 19: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Model-based approaches

n For each user, learn a classifier that classifies items into rating classes ¨ liked by user and not liked by user ¨ e.g., Bayesian, regression, SVM

n Apply classifier to each item to find recommendation candidates

n Problem: scalability ¨ Won’t investigate further in this class

Page 20: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Advantages of Content-based approach (Pros)

20

+: No need for data on other users No cold-start or sparsity problems

+: Able to recommend to users with unique tastes

+: Able to recommend new & unpopular items No first-rater problem

+: Able to provide explanations Can provide explanations of recommended items by

listing content-features that caused an item to be recommended

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 20

Page 21: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Limitations of content-based approach (Cons)

–: Finding the appropriate features is hard E.g., images, movies, music

–: Overspecialization Never  recommends  items  outside  user’s  

content profile People might have multiple interests Unable to exploit quality judgments of other users

–: Recommendations for new users How to build a user profile?

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 21

Page 22: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Collaborative Filtering

22

Page 23: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Collaborative Filtering Consider user x

Find set N of other users whose ratings are  “similar”  to   x’s  ratings

Estimate x’s  ratings   based on ratings of users in N

1/29/2013 23 Jure Leskovec, Stanford C246: Mining Massive Datasets

x

N

Page 24: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Similar users Let rx be the vector of user x’s  ratings Jaccard similarity measure

Problem: Ignores the value of the rating Cosine similarity measure

sim(x, y) = cos(rx, ry) = ⋅

|| ||⋅|| ||

Problem: Treats  missing  ratings  as  “negative” Pearson correlation coefficient

Sxy = items rated by both users x and y

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 24

rx = [*, _, _, *, ***] ry = [*, _, **, **, _]

rx, ry as sets: rx = {1, 4, 5} ry = {1, 3, 4}

rx, ry as points: rx = {1, 0, 0, 1, 3} ry = {1, 0, 2, 2, 0}

rx, ry …  avg. rating of x, y

𝒔𝒊𝒎 𝒙, 𝒚 =∑ 𝒓𝒙𝒔 − 𝒓𝒙 𝒓𝒚𝒔 − 𝒓𝒚𝒔∈𝑺𝒙𝒚

∑ 𝒓𝒙𝒔 − 𝒓𝒙 𝟐𝒔∈𝑺𝒙𝒚 ∑ 𝒓𝒚𝒔 − 𝒓𝒚

𝟐𝒔∈𝑺𝒙𝒚

Let rx be the vector of user x’s  ratings Jaccard similarity measure

Problem: Ignores the value of the rating Cosine similarity measure

sim(x, y) = cos(rx, ry) = ⋅

|| ||⋅|| ||

Problem: Treats  missing  ratings  as  “negative” Pearson correlation coefficient

Sxy = items rated by both users x and y

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 24

rx = [*, _, _, *, ***] ry = [*, _, **, **, _]

rx, ry as sets: rx = {1, 4, 5} ry = {1, 3, 4}

rx, ry as points: rx = {1, 0, 0, 1, 3} ry = {1, 0, 2, 2, 0}

rx, ry …  avg. rating of x, y

𝒔𝒊𝒎 𝒙, 𝒚 =∑ 𝒓𝒙𝒔 − 𝒓𝒙 𝒓𝒚𝒔 − 𝒓𝒚𝒔∈𝑺𝒙𝒚

∑ 𝒓𝒙𝒔 − 𝒓𝒙 𝟐𝒔∈𝑺𝒙𝒚 ∑ 𝒓𝒚𝒔 − 𝒓𝒚

𝟐𝒔∈𝑺𝒙𝒚

Page 25: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Rating predictions

25

Let rx be the vector of user x’s  ratings Let N be the set of k users most similar to x

who have rated item i Prediction for item s of user x:

𝑟 =  ∑ 𝑟∈

𝑟 =∑ ⋅∈∑ ∈

Other options? Many  other  tricks  possible…

1/29/2013 26 Jure Leskovec, Stanford C246: Mining Massive Datasets

Shorthand: 𝒔𝒙𝒚 = 𝒔𝒊𝒎 𝒙, 𝒚

- тежнее да ги усредни рејтинзите

Page 26: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Complexity

n Expensive step is finding k most similar customers ¨ O(|U|)

n Too expensive to do at runtime ¨ Need to pre-compute

n Naïve precomputation takes time O(N|U|) ¨ Simple trick gives some speedup

n Can use clustering, partitioning as alternatives, but quality degrades

Page 27: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Item-Item Collaborative Filtering So far: User-user collaborative filtering Another view: Item-item

For item i, find other similar items Estimate rating for item i based

on ratings for similar items Can use same similarity metrics and

prediction functions as in user-user model

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 27

);(

);(

xiNj ij

xiNj xjijxi s

rsr

sij… similarity of items i and j rxj…rating of user u on item j N(i;x)… set items rated by x similar to i

Page 28: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ 28

12 11 10 9 8 7 6 5 4 3 2 1

4 5 5 3 1 1

3 1 2 4 4 5 2

5 3 4 3 2 1 4 2 3

2 4 5 4 2 4

5 2 2 4 3 4 5

4 2 3 3 1 6

users m

ovie

s

- unknown rating - rating between 1 to 5

1/29/2013 28 Jure Leskovec, Stanford C246: Mining Massive Datasets

k=2

Page 29: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ 29

12 11 10 9 8 7 6 5 4 3 2 1

4 5 5  ? 3 1 1

3 1 2 4 4 5 2

5 3 4 3 2 1 4 2 3

2 4 5 4 2 4

5 2 2 4 3 4 5

4 2 3 3 1 6

users

- estimate rating of movie 1 by user 5

1/29/2013 29 Jure Leskovec, Stanford C246: Mining Massive Datasets

mov

ies

Page 30: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ 30

12 11 10 9 8 7 6 5 4 3 2 1

4 5 5  ? 3 1 1

3 1 2 4 4 5 2

5 3 4 3 2 1 4 2 3

2 4 5 4 2 4

5 2 2 4 3 4 5

4 2 3 3 1 6

users

Neighbor selection: Identify movies similar to movie 1, rated by user 5

1/29/2013 30 Jure Leskovec, Stanford C246: Mining Massive Datasets

mov

ies

1.00

-0.18

0.41

-0.10

-0.31

0.59

sim(1,m)

Here we use Pearson correlation as similarity: 1) Subtract mean rating mi from each movie i m1 = (1+3+5+5+4)/5 = 3.6 row 1: [-2.6, 0, -0.6, 0, 0, 1.4, 0, 0, 1.4, 0, 0.4, 0] 2) Compute cosine similarities between rows

Page 31: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ 31

12 11 10 9 8 7 6 5 4 3 2 1

4 5 5  ? 3 1 1

3 1 2 4 4 5 2

5 3 4 3 2 1 4 2 3

2 4 5 4 2 4

5 2 2 4 3 4 5

4 2 3 3 1 6

users

Compute similarity weights: s13=0.41, s16=0.59

1/29/2013 31 Jure Leskovec, Stanford C246: Mining Massive Datasets

mov

ies

1.00

-0.18

0.41

-0.10

-0.31

0.59

sim(1,m)

Се земаат предвид само двата најслични корисници

Page 32: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ 32

12 11 10 9 8 7 6 5 4 3 2 1

4 5 5 2.6 3 1 1

3 1 2 4 4 5 2

5 3 4 3 2 1 4 2 3

2 4 5 4 2 4

5 2 2 4 3 4 5

4 2 3 3 1 6

users

Predict by taking weighted average:

r15 = (0.41*2 + 0.59*3) / (0.41+0.59) = 2.6 1/29/2013 32 Jure Leskovec, Stanford C246: Mining Massive Datasets

mov

ies

𝒓𝒊𝒙 =∑ 𝒔𝒊𝒋 ⋅ 𝒓𝒋𝒙𝒋∈𝑵(𝒊;𝒙)

∑𝒔𝒊𝒋

Page 33: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Item-Item vs User-User

33

0.418.010.9

0.30.50.81

Avatar LOTR Matrix Pirates

Alice

Bob

Carol

David

1/29/2013 34 Jure Leskovec, Stanford C246: Mining Massive Datasets

In practice, it has been observed that item-item often works better than user-user

Why? Items are simpler, users have multiple tastes

Page 34: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Pros and cons of collaborative filtering + Works for any kind of item

No feature selection needed - Cold Start:

Need enough users in the system to find a match - Sparsity:

The user/ratings matrix is sparse Hard to find users that have rated the same items

- First rater: Cannot recommend an item that has not been

previously rated New items, Esoteric items

- Popularity bias: Cannot recommend items to someone with

unique taste Tends to recommend popular items

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 35

Page 35: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Hybrid Methods

Implement two or more different recommenders and combine predictions Perhaps using a linear model

Add content-based methods to collaborative filtering Item profiles for new item problem Demographics to deal with new user problem

1/29/2013 36 Jure Leskovec, Stanford C246: Mining Massive Datasets

Page 36: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Evaluation

1 3 4

3 5 5

4 5 5

3

3

2 2 2

5

2 1 1

3 3

1

movies

users

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 38

36

Page 37: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Evaluation

37

1 3 4

3 5 5

4 5 5

3

3

2 ? ?

?

2 1 ?

3 ?

1

Test Data Set

users

movies

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 39

Page 38: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Evaluating Predictions Compare predictions with known ratings

Root-mean-square error (RMSE)

∑ 𝑟 − 𝑟∗ where 𝒓𝒙𝒊 is predicted, 𝒓𝒙𝒊∗ is the true rating of x on i

Precision at top 10: % of those in top 10

Rank Correlation: Spearman’s  correlation between  system’s  and  user’s  complete  rankings

Another approach: 0/1 model

Coverage: Number of items/users for which system can make predictions

Precision: Accuracy of predictions

Receiver operating characteristic (ROC) Tradeoff curve between false positives and false negatives

1/29/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 40

Page 39: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Tip: Add data

n Leverage all the data available ¨ Don’t try to reduce data size in an effort

to make fancy algorithms work ¨ Simple methods on large data do best

n Add more data ¨ e.g., add IMDB data on genres

n More Data Beats Better Algorithms http://anand.typepad.com/datawocky/2008/03/more-data-usual.html

Page 40: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Finding similar vectors n Common problem that comes up

in many settings n Given a large number N of

vectors in some high-dimensional space (M dimensions), find pairs of vectors that have high cosine-similarity ¨ e.g., user profiles, item profiles

Page 41: How to build recommender system

СИСТЕМИ НА ЗНАЕЊЕ

Прашања?

41