Recommender System with Distributed Representation


Page 1: Recommender System with Distributed Representation

Building and Evaluating a Product Recommender System Using Distributed Representations
Recommender System with Distributed Representation

Thuy PhiVan 1,2, Chen Liu 2, and Yu Hirate 2

1. Computational Linguistics Laboratory, NAIST

2. Rakuten Institute of Technology, Rakuten, Inc.

{ar-thuy.phivan, chen.liu, yu.hirate}@rakuten.com

Page 2: Recommender System with Distributed Representation

1. Distributed Representation for Words, Documents, and Categories

Page 3: Recommender System with Distributed Representation

Distributed Representations for Words

• Similar words are projected onto similar vectors.

• Relationships between words can be expressed as simple vector arithmetic. [T. Mikolov et al., NIPS 2013]

• Analogy: v(“woman”) – v(“man”) + v(“king”) ≈ v(“queen”)

Page 4: Recommender System with Distributed Representation

Two Models in word2vec

[Diagram: CBoW and Skip-gram architectures, each with input, projection, and output layers over the context words v(t-2), v(t-1), v(t+1), v(t+2) and the target word v(t)]

• CBoW: given the context words, predict the probability of the target word.

• Skip-gram: given the target word, predict the probability of the context words.
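The slides do not show training code, but as a reference point here is a minimal sketch of how the two architectures are typically selected with the gensim library (an assumption; the corpus below is placeholder data):

```python
# Minimal sketch: choosing CBoW vs. Skip-gram when training word vectors with gensim.
from gensim.models import Word2Vec

# Placeholder corpus: each "sentence" is a list of tokens.
sentences = [
    ["coffee", "espresso", "beans", "coffee"],
    ["nagoya", "osaka", "tokyo", "yokohama"],
]

# sg=0 selects CBoW (predict the target word from its context);
# sg=1 selects Skip-gram (predict the context words from the target word).
cbow_model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=0)
skipgram_model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)
```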

Page 5: Recommender System with Distributed Representation

Sample Results of word2vec Trained on Wikipedia Data

query: nagoya

• osaka 0.799002

• chiba 0.762829

• fukuoka 0.755166

• sendai 0.731760

• yokohama 0.729205

• kobe 0.726732

• shiga 0.705707

• niigata 0.699777

• aichi 0.692371

• hyogo 0.687128

• saitama 0.685672

• tokyo 0.671428

• sapporo 0.670466

• kumamoto 0.660786

• japan 0.658769

• kitakyushu 0.654265

• wakayama 0.652783

• shizuoka 0.624380

query: coffee

• cocoa 0.603515

• robusta 0.565269

• beans 0.565232

• bananas 0.565207

• cinnamon 0.556771

• citrus 0.547495

• espresso 0.542120

• caff 0.542082

• infusions 0.538069

• tea 0.532565

• cassava 0.524657

• pineapples 0.523557

• coffea 0.512420

• tapioca 0.510727

• sugarcane 0.508203

• yams 0.507347

• avocados 0.507072

• arabica 0.506231
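Similarity lists like the two above can be reproduced with a nearest-neighbour query over the trained vectors. A minimal sketch with gensim, assuming a word2vec model trained on tokenized Wikipedia text (the model path is a placeholder):

```python
# Minimal sketch: nearest-neighbour and analogy queries on trained word vectors.
from gensim.models import Word2Vec

# Placeholder path for a model trained on tokenized Wikipedia text.
model = Word2Vec.load("wiki_word2vec.model")

# Most similar words by cosine similarity, e.g. query: nagoya.
print(model.wv.most_similar("nagoya", topn=18))

# The analogy from the earlier slide: v("woman") - v("man") + v("king") ~ v("queen").
print(model.wv.most_similar(positive=["woman", "king"], negative=["man"], topn=1))
```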

Page 6: Recommender System with Distributed Representation

Doc2Vec (Paragraph2Vec) [Q. Le et al., ICML 2014]

[Diagram: PV-DM and PV-DBoW architectures, each with input, projection, and output layers over the document vector v(doc), the context words v(t-2), v(t-1), v(t+1), and the target word v(t)]

• Assigns a “document vector” to each document.

• The document vector can be used as a feature of the document and for computing document similarity.

Page 7: Recommender System with Distributed Representation

Category2Vec [Marui et al., NLP 2015]

https://github.com/rakuten-nlp/category2vec

• Assigns a “category vector” to each category.

• Each document carries its own category information.

[Diagram: CV-DM and CV-DBoW architectures, which add a category vector v(cat) alongside v(doc), the context words v(t-2), v(t-1), v(t+1), and the target word v(t)]

Page 8: Recommender System with Distributed Representation

2. Applying Doc2Vec to an Item Recommender

Page 9: Recommender System with Distributed Representation

Recommender Systems in an EC Service

• Item2Item recommender: given an item, show items relevant to that item.

• User2Item recommender: given a user, show items relevant to that user.

Page 10: Recommender System with Distributed Representation

Distributed Representation for Users and Items

• Document: a sequence of words with context.

• User: a sequence of item views with the user’s intention.

Text domain: a set of documents yields vectors for words and vectors for documents, supporting sim{word, word}, sim{doc, word}, and sim{doc, doc}.

EC domain: a set of user behaviors yields vectors for items and vectors for users, supporting sim{item, item}, sim{user, item}, and sim{user, user}.
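A minimal sketch of this mapping with gensim's Doc2Vec, treating each user as a "document" whose "words" are the item IDs the user viewed (the library choice, data layout, and IDs are assumptions for illustration, not the production setup):

```python
# Minimal sketch: users as "documents", viewed item IDs as "words".
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Placeholder behavior data: user ID -> ordered list of viewed item IDs.
user_histories = {
    "user_1": ["item_a", "item_b", "item_c"],
    "user_2": ["item_b", "item_d", "item_a"],
}

# One TaggedDocument per user: the tag identifies the user, the words are item IDs.
corpus = [
    TaggedDocument(words=items, tags=[user_id])
    for user_id, items in user_histories.items()
]

# Training yields item vectors (model.wv) and user vectors (model.dv).
model = Doc2Vec(corpus, vector_size=100, window=5, min_count=1, epochs=20)
```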

Page 11: Recommender System with Distributed Representation

Dataset Preparation

• Service
  • Rakuten Singapore www.rakuten.com.sg
  • Rakuten’s EC service in Singapore, launched in 2014.

• Data sources
  • Purchase History Data
  • Click Through Data

• Period
  • Jan. 2015 – Oct. 2015

Page 12: Recommender System with Distributed Representation

Dataset Preparation (Purchase History Data)

• A set of items purchased by the same user.

User ID | Set of purchased items
user #1 | {item_{1,1}, item_{1,2}}
user #2 | {item_{2,1}, item_{2,2}, item_{2,3}}
...     | ...
user #N | {item_{N,1}}

Page 13: Recommender System with Distributed Representation

Dataset Preparation (Click Through Data)

• A set of each user’s sessions.

• Session:
  • A sequence of page views sharing the same cookie.
  • A sequence is split wherever the time interval between consecutive page views exceeds 2 hours.

User ID | Set of sessions
user #1 | {{item_{1,1,1}, item_{1,1,2}, ..., item_{1,1,n}}, {item_{1,2,1}, ...}}
user #2 | {{item_{2,1,1}, item_{2,1,2}}}
...     | ...
user #N | {{item_{N,1,1}, item_{N,1,2}, ..., item_{N,1,n}}, {item_{N,2,1}, ...}}

[Diagram: a user’s page views on a timeline, split into Session A and Session B at a gap longer than 2 hours]
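As an illustration of the 2-hour rule above, here is a minimal sketch (an assumption, not the authors' pipeline) that splits one user's timestamped page views into sessions; the input format is a placeholder:

```python
# Minimal sketch: split one user's page views into sessions at gaps longer than 2 hours.
from datetime import timedelta

SESSION_GAP = timedelta(hours=2)

def split_into_sessions(page_views):
    """page_views: list of (timestamp, item_id) tuples sorted by timestamp."""
    sessions = []
    current = []
    last_ts = None
    for ts, item_id in page_views:
        # Start a new session when the gap since the previous page view exceeds 2 hours.
        if last_ts is not None and ts - last_ts > SESSION_GAP:
            sessions.append(current)
            current = []
        current.append(item_id)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions
```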

Page 14: Recommender System with Distributed Representation

Dataset Property

• More than 60% of sessions end after a single page request.

• More than X% of users visited rakuten.com.sg only once.

[Charts: Distribution of Session Length; Distribution of Session Count]

Page 15: Recommender System with Distributed Representation

Item2Item Recommender (Example)

[Screenshots: example item2item recommendations, one panel labeled Click Through Data and one labeled Purchase History Data]
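Given a Doc2Vec model trained on user behavior sequences as sketched earlier, item2item and user2item lookups could be expressed roughly as follows (a sketch under those assumptions; the model path and IDs are placeholders):

```python
# Minimal sketch: item2item and user2item queries against a trained Doc2Vec model.
from gensim.models.doc2vec import Doc2Vec

model = Doc2Vec.load("doc2vec_user_behavior.model")  # placeholder path

def item2item(item_id, topn=20):
    # Items whose vectors are closest to the given item's vector.
    return model.wv.most_similar(item_id, topn=topn)

def user2item(user_id, topn=20):
    # Items whose vectors are closest to the user's document vector.
    return model.wv.similar_by_vector(model.dv[user_id], topn=topn)
```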

Page 16: Recommender System with Distributed Representation


3. Evaluation

Page 17: Recommender System with Distributed Representation

Evaluation Metrics

• Training data: 2015/01/01 – 2015/08/31

• Test data: 2015/09/01 – 2015/10/31

• N is the total number of users common to the training and test data.

• For each user, the recommender system (RS) predicts the top-20 items.

• “Hit”: at least one of the items recommended to a particular user appears in that user’s test data.

• Hit-rate of the RS: hit-rate = number of hits / N
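A minimal sketch of this metric over assumed data structures (not the authors' evaluation code):

```python
# Minimal sketch: hit-rate over the users common to training and test data.
def hit_rate(recommendations, test_items):
    """
    recommendations: dict user_id -> list of top-20 recommended item IDs
    test_items:      dict user_id -> set of item IDs seen for that user in the test period
    """
    common_users = set(recommendations) & set(test_items)
    hits = sum(
        1 for user in common_users
        if any(item in test_items[user] for item in recommendations[user])
    )
    return hits / len(common_users) if common_users else 0.0
```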

Page 18: Recommender System with Distributed Representation

Evaluations

1. Parameter Optimization
  • Find an optimal parameter set.
  • Find the parameters that matter most for building a good model.

2. Performance Comparison with Conventional Recommender Algorithms
  • Item Similarity
  • Matrix Factorization

Page 19: Recommender System with Distributed Representation

1. Parameter Optimization

Parameter | Values | Explanation
Size | [50, 100, 200, 300, 400, 500] | Dimensionality of the vectors
Window | [1, 3, 5, 8, 10, 15] | Maximum number of context items the training algorithm takes into account
Negative | [0, 5, 10, 15, 20, 25] | Number of “noise words” to draw (usually between 5 and 20)
Sample | [0, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8] | Sub-sampling threshold for frequent words
Min-count | [1, ..., 20] | Items that appear fewer than min-count times are ignored
Iteration | [10, 15, 20, 25, 30] | Number of iterations for building the model

• Best parameter setting:

Size | Window | Negative | Sample | min_count | Iteration | hit-rate
300 | 8 | 10 | 1e-5 | 3 | 20 | 0.1821
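For reference, the best setting above maps onto gensim Doc2Vec keyword arguments roughly as sketched below; the name mapping (Size → vector_size, Iteration → epochs, and so on) is an assumption, and the corpus is a placeholder standing in for the TaggedDocument sequences built from user behavior:

```python
# Minimal sketch: a Doc2Vec model configured with the best parameter setting.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Placeholder corpus; in practice, one TaggedDocument per user as in the earlier sketch.
corpus = [
    TaggedDocument(words=["item_a", "item_b", "item_a", "item_c"], tags=[f"user_{i}"])
    for i in range(5)
]

model = Doc2Vec(
    corpus,
    vector_size=300,  # Size
    window=8,         # Window
    negative=10,      # Negative
    sample=1e-5,      # Sample
    min_count=3,      # Min-count
    epochs=20,        # Iteration
)
```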

Page 20: Recommender System with Distributed Representation

1. Parameter Optimization

[Charts: hit-rate (%) as a function of each parameter]

• Size (50, 100, 200, 300, 400, 500): 13.7, 15.5, 17.7, 18.2, 17.8, 17.2

• Window (1, 3, 5, 8, 10, 15): 15.4, 16.9, 17.8, 18.2, 18, 18

• Negative (0, 5, 10, 15, 20, 25): 15.9, 17.9, 18.2, 17.6, 17.4, 17.3

• Sample (0, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8): 16.2, 16.5, 16.4, 16.7, 18.2, 15.1, 2, 0.3

• Min_count (1, 3, 5, 7, 9, 11, 13, 15, 17, 19): 16.8, 18.2, 18.9, 18.8, 18.9, 19, 18.8, 18.7, 18.9, 18.9

• Iteration (10, 15, 20, 25, 30): 16.8, 17.8, 18.2, 18.2, 18.2

Page 21: Recommender System with Distributed Representation

2. Performance Comparison with Conventional Recommender Algorithms

• Item Similarity: Jaccard similarity of the user sets associated with each item.

• Matrix Factorization: factorize the user–item matrix into user and item factors (dim = 32, max iterations = 25).
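A minimal sketch of the Item Similarity baseline as described above (assumed data layout, not the authors' implementation):

```python
# Minimal sketch: item2item similarity as the Jaccard similarity of two items' user sets.
def jaccard(set_a, set_b):
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def most_similar_items(item_users, query_item, topn=20):
    """item_users: dict item_id -> set of user IDs who interacted with that item."""
    query_users = item_users[query_item]
    scores = [
        (other, jaccard(query_users, users))
        for other, users in item_users.items()
        if other != query_item
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]
```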

Page 22: Recommender System with Distributed Representation

2. Performance Comparison with Conventional Algorithms

[Chart: hit-rate (%) of Item Similarity, Matrix Factorization, and Doc2Vec]

The Doc2Vec-based algorithm performed the best.

Page 23: Recommender System with Distributed Representation

Conclusion and Future Work

• Conclusion
  • Developed a distributed-representation-based recommender system (RS).
  • Applied it to a dataset generated from Rakuten Singapore click through data.
  • Confirmed that the distributed-representation-based RS performed better than conventional RS algorithms.

• Future Work
  • Apply the distributed-representation-based RS to other datasets:
    • Rakuten Singapore product data
    • Rakuten Ichiba (Japan) click through data
  • Hybrid model (content-based RS × user-behavior-based RS)
  • Test it in the real service.

Page 24: Recommender System with Distributed Representation


Thank you