Caching Strategies for In-Memory Neighborhood-based Recommender Systems Simon Dooms @sidooms

Post on 10-May-2015


Page 1: Caching strategies for in memory neighborhood-based recommender systems

Caching Strategies for In-Memory Neighborhood-based Recommender Systems

Simon Dooms

@sidooms

Page 2: Caching strategies for in memory neighborhood-based recommender systems

Introduction

05/09/2013 Simon Dooms - Ghent University – WEBIST 2013 2

• Neighborhood-based recommender systems

• User-based Collaborative Filtering (UBCF)

Outline: Intro | About Similarities | Caching | Results | Conclusions

Item 1 Item 2 Item 3 Item 4 Item 5

User 1 2 1 4 5

User 2 1 5 5 4

User 3 3 5 1

User 4 5

User 5 1 2

For every user u:
    For every item i:
        calculate(u,i)

$$\text{prediction}(u_1, i) = \text{similarity}(u_1, u_2) \cdot \text{rating}(u_2, i) + \text{similarity}(u_1, u_3) \cdot \text{rating}(u_3, i)$$

?
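As a minimal sketch of the weighted sum above, assuming ratings are held in a nested dict and similarities come from a lookup function (the data layout and the names `predict`, `ratings`, and `lookup` are illustrative assumptions, not taken from the slides):

```python
def predict(target, item, ratings, similarity):
    """Sum similarity * rating over all other users who rated the item.

    ratings: dict user -> dict item -> rating (assumed layout)
    similarity: callable (user_a, user_b) -> float (assumed interface)
    """
    total = 0.0
    for neighbour, rated in ratings.items():
        if neighbour != target and item in rated:
            total += similarity(target, neighbour) * rated[item]
    return total


# Made-up ratings and similarity values, for illustration only.
ratings = {
    "u1": {"i1": 2},
    "u2": {"i2": 4},
    "u3": {"i2": 5},
}
sims = {frozenset(("u1", "u2")): 0.2, frozenset(("u1", "u3")): 0.5}


def lookup(a, b):
    # Similarity is symmetric, so the pair is stored once as a frozenset.
    return sims[frozenset((a, b))]


score = predict("u1", "i2", ratings, lookup)  # 0.2 * 4 + 0.5 * 5
```

Like the slide formula, this sketch omits the normalisation by the sum of similarities that practical UBCF implementations usually apply.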

Page 3: Caching strategies for in memory neighborhood-based recommender systems


$$\text{prediction}(u_1, i) = \text{similarity}(u_1, u_2) \cdot \text{rating}(u_2, i) + \text{similarity}(u_1, u_3) \cdot \text{rating}(u_3, i)$$

      u1    u2    u3    u4    u5
u1    1     0.2   0.5   0.7   0.5
u2          1     0.4   0     0.7
u3                1     0.7   0.6
u4                      1     0.8
u5                            1

• User similarities needed

• Sometimes precalculated

Users      Similarities
5          10
50         1,225
500        124,750
5,000      12,497,500
50,000     1,249,975,000

Number of similarities: $\frac{\#users \cdot (\#users - 1)}{2}$
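The growth shown in the table can be reproduced with a few lines of Python. This is only a sketch of the pair-count formula; the function name `pair_count` is an illustrative choice:

```python
def pair_count(n_users):
    """Number of distinct user pairs, i.e. n * (n - 1) / 2."""
    return n_users * (n_users - 1) // 2


for n in (5, 50, 500, 5_000, 50_000):
    print(f"{n:>6,} users -> {pair_count(n):>13,} similarities")
```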

Page 4: Caching strategies for in memory neighborhood-based recommender systems


Hypothesis:

Some similarities are used more than others

• Full recommendation calculation, MovieLens 100K

• Similarity frequency:



Page 7: Caching strategies for in memory neighborhood-based recommender systems

• $\text{similarity}(u_x, u_y)$

• Used in the prediction for $(u_x, i)$ with $i$ not rated by $u_x$

– > needs similarities of users who rated $i$

– > $\text{similarity}(u_x, u_y)$ needed = #items rated by $u_y$ but not by $u_x$

• Same for $\text{similarity}(u_y, u_x)$


• Similarities usage frequency differs

• Predict similarity usage frequency?

$$\text{prediction}(u_1, i) = \text{similarity}(u_1, u_2) \cdot \text{rating}(u_2, i) + \text{similarity}(u_1, u_3) \cdot \text{rating}(u_3, i)$$

Items rated by $u_x$

Items rated by $u_y$

Usage frequency of $\text{similarity}(u_x, u_y)$ = cardinality of the inverse intersection (items rated by exactly one of the two users)
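Read this way, the predicted usage frequency is the size of the symmetric difference of the two users' rated-item sets. A minimal sketch, assuming each user's rated items are available as a Python set (`predicted_usage` and the sample sets are illustrative, not from the slides):

```python
def predicted_usage(items_x, items_y):
    """Count items rated by exactly one of the two users."""
    return len(set(items_x) ^ set(items_y))


# Made-up item sets for two users.
items_ux = {"i1", "i2", "i3"}
items_uy = {"i2", "i3", "i4", "i5"}

freq = predicted_usage(items_ux, items_uy)

# Same value, split by direction: items only u_y rated are needed when
# predicting for u_x, and items only u_x rated when predicting for u_y.
by_direction = len(items_uy - items_ux) + len(items_ux - items_uy)
```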

Page 8: Caching strategies for in memory neighborhood-based recommender systems


• Usage frequency of similarities is known

• Now what?

Use information for caching

SMART Cache:

If cache full, replace entry with lowest predicted usage frequency

LRU Cache:

If cache full, replace entry least recently used

No Cache (baseline):

No caching, all similarities recalculated
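The two replacement policies can be sketched as follows. This is an illustrative implementation under stated assumptions, not the author's code: the class names, the `get(key, compute)` interface, and the `predicted_usage` callable are hypothetical, and capacity is assumed to be at least 1. In the recommender setting the keys would be user pairs.

```python
from collections import OrderedDict


class LRUCache:
    """When full, evicts the least recently used entry (capacity >= 1)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key, compute):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        value = compute(key)  # cache miss: recalculate
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # drop the oldest entry
        self.entries[key] = value
        return value


class SmartCache:
    """When full, evicts the entry with the lowest predicted usage frequency."""

    def __init__(self, capacity, predicted_usage):
        self.capacity = capacity
        self.predicted_usage = predicted_usage  # callable: key -> number
        self.entries = {}

    def get(self, key, compute):
        if key in self.entries:
            return self.entries[key]
        value = compute(key)  # cache miss: recalculate
        if len(self.entries) >= self.capacity:
            victim = min(self.entries, key=self.predicted_usage)
            del self.entries[victim]
        self.entries[key] = value
        return value


# Small demonstration with string keys and a stand-in computation.
lru = LRUCache(2)
for k in ("a", "b", "a", "c"):
    lru.get(k, len)  # "b" is evicted: it was used least recently

usage = {"a": 1, "b": 5, "c": 3}  # made-up predicted frequencies
smart = SmartCache(2, usage.get)
for k in ("a", "b", "c"):
    smart.get(k, len)  # "a" is evicted: lowest predicted usage
```

The linear scan in `SmartCache` keeps the sketch short; a priority queue would be a natural alternative for large caches.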

Page 9: Caching strategies for in memory neighborhood-based recommender systems


• Full recommendation calculation

Page 10: Caching strategies for in memory neighborhood-based recommender systems


• Lowest needed cache size for LRU?

For every user u:
    For every item i:
        calculate(u,i)

0.21% of all similarities = 942 similarities

High temporal locality:

Good for LRU
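One plausible reading of the 942 figure is that, with the user loop outermost, only the similarities between the current user and every other user are needed at any one time. MovieLens 100K has 943 users, so the arithmetic can be checked with a short sketch (variable names are illustrative):

```python
n_users = 943  # number of users in MovieLens 100K

all_pairs = n_users * (n_users - 1) // 2  # every distinct user pair
per_user = n_users - 1                    # similarities involving one user
share = 100 * per_user / all_pairs        # percentage of all similarities

print(f"{per_user} of {all_pairs:,} similarities = {share:.2f}%")
```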

Page 11: Caching strategies for in memory neighborhood-based recommender systems


• What if order changed?

For every item i:
    For every user u:
        calculate(u,i)

For every user u:
    For every item i:
        calculate(u,i)
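The effect of the calculation order on an LRU cache can be explored with a small simulation. This is a sketch under assumptions: the function `lru_misses`, the toy ratings, and the choice to count every cache miss as one recalculated similarity are all illustrative and not taken from the slides. Capacity is assumed to be at least 1.

```python
from collections import OrderedDict


def lru_misses(order, ratings, capacity):
    """Count similarity recalculations (cache misses) for one calculation order."""
    cache = OrderedDict()
    misses = 0
    for user, item in order:
        if item in ratings[user]:
            continue  # already rated, nothing to predict
        for other, rated in ratings.items():
            if other == user or item not in rated:
                continue
            key = frozenset((user, other))  # sim(a, b) equals sim(b, a)
            if key in cache:
                cache.move_to_end(key)  # hit: mark as most recently used
            else:
                misses += 1  # miss: similarity would be recalculated
                if len(cache) >= capacity:
                    cache.popitem(last=False)  # evict least recently used
                cache[key] = True
    return misses


# Toy data, made up for illustration: user -> set of rated items.
ratings = {"a": {0, 1}, "b": {1, 2}, "c": {2, 3}, "d": {3, 0}}
users, items = list(ratings), range(4)

user_outer = [(u, i) for u in users for i in items]
item_outer = [(u, i) for i in items for u in users]

for capacity in (3, 6):
    print(capacity,
          lru_misses(user_outer, ratings, capacity),
          lru_misses(item_outer, ratings, capacity))
```

Data this small only shows the mechanics; the order effect reported in the slides was measured on MovieLens 100K.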

Page 12: Caching strategies for in memory neighborhood-based recommender systems


• ORDER matters

• SMART: more stable results

Page 13: Caching strategies for in memory neighborhood-based recommender systems

Conclusions

• Similarity values not equally important

• SMART caching:

– Better for random-like ordering

– Most stable (predictable) results

• LRU caching:

– Best when the user loop is the outer loop (high temporal locality)

– Smaller cache size needed (0.21% vs 60%)

• Calculation order of (user, item) pairs is important

• Caching needs to be carefully considered

