Journal club: Meta-Prod2Vec
![Page 1: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/1.jpg)
Meta-Prod2Vec - Product Embeddings Using Side-Information for Recommendation
Yuya Kanemoto
Vasile F et al. RecSys 2016
![Page 2: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/2.jpg)
Neural embedding: Word2Vec (Skip-gram)
• A method for learning distributed vector representations that capture a large number of syntactic and semantic word relationships
• Example: Tokyo - Japan + Germany = Berlin
• Word2Vec is essentially a two-layer neural network
• Objective function:
Mikolov T et al. 2013
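The objective function itself did not survive extraction. As a reference point, the Skip-gram objective is usually written as below, following the notation of Mikolov et al. 2013 (T training words, context radius c, input vectors v, output vectors v', vocabulary size W). The slide's original rendering may have differed.

```latex
\frac{1}{T}\sum_{t=1}^{T}\;\sum_{-c \le j \le c,\; j \ne 0} \log p(w_{t+j} \mid w_t),
\qquad
p(w_O \mid w_I) = \frac{\exp\!\left({v'_{w_O}}^{\top} v_{w_I}\right)}{\sum_{w=1}^{W}\exp\!\left({v'_{w}}^{\top} v_{w_I}\right)}
```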
![Page 3: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/3.jpg)
Skip-gram with negative sampling
• The full softmax is often too expensive for SGD on large data sets, because the denominator of the conditional probability sums over the whole vocabulary at every update
• Instead, the task can be recast as distinguishing observed target-context co-occurrences from k negative samples
Mikolov T et al. 2013
Objective function
Objective function with negative sampling
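The negative-sampling equation was an image and is missing here. In the notation of Mikolov et al. 2013, each log-probability term of the Skip-gram objective is replaced by the expression below, where σ is the logistic sigmoid, k the number of negative samples, and P_n(w) the noise distribution. This is a reconstruction from the cited paper, not a copy of the slide.

```latex
\log \sigma\!\left({v'_{w_O}}^{\top} v_{w_I}\right)
+ \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}
\left[\log \sigma\!\left(-{v'_{w_i}}^{\top} v_{w_I}\right)\right]
```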
![Page 4: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/4.jpg)
Embedding and Matrix Factorisation
• The objective of the embedding is closely related to matrix factorisation
• Skip-gram with negative sampling can be viewed as an implicit factorisation of the SPMI (shifted pointwise mutual information) matrix, i.e. PMI shifted by log k
Levy O et al. 2014
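To make the link to matrix factorisation concrete, here is a minimal sketch that builds a shifted PMI matrix from co-occurrence counts and factorises it with a truncated SVD. The toy `counts` matrix, the function names, the clipping at zero (the "positive" variant) and the symmetric split of singular values are all illustrative assumptions, not details taken from the slides.

```python
import numpy as np

def shifted_pmi(counts: np.ndarray, k: int = 1) -> np.ndarray:
    """Shifted positive PMI: max(PMI(i, j) - log k, 0)."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    # Zero counts give log(0) = -inf; these are clipped to 0 below.
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (row * col))
    sppmi = np.maximum(pmi - np.log(k), 0.0)
    sppmi[~np.isfinite(sppmi)] = 0.0  # guard against 0/0 entries
    return sppmi

def svd_embeddings(counts: np.ndarray, dim: int, k: int = 1):
    """Factorise the shifted PMI matrix into target and context embeddings."""
    m = shifted_pmi(counts, k)
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    root = np.sqrt(s[:dim])
    # Split the singular values evenly between both factors.
    return u[:, :dim] * root, vt[:dim, :].T * root

# Hypothetical symmetric co-occurrence counts for four items.
counts = np.array([
    [0, 8, 1, 0],
    [8, 0, 1, 0],
    [1, 1, 0, 6],
    [0, 0, 6, 0],
], dtype=float)

targets, contexts = svd_embeddings(counts, dim=2)
```

With `dim` equal to the full rank, `targets @ contexts.T` reproduces the shifted PMI matrix; smaller values of `dim` give a low-rank approximation.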
![Page 5: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/5.jpg)
Neural embedding: Prod2Vec
• A method that applies the Skip-gram model to product recommendation, treating products as words and purchase sequences as sentences
• When a user buys a product, products with similar vector representations are recommended
Grbovic M et al. 2015
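The recommendation step described above can be sketched as a nearest-neighbour lookup. This assumes cosine similarity as the similarity measure and uses made-up two-dimensional embeddings; the product ids and the `recommend` helper are hypothetical.

```python
import numpy as np

def recommend(query_id, embeddings, top_k=2):
    """Return the top_k products closest to query_id by cosine similarity."""
    ids = list(embeddings)
    mat = np.array([embeddings[i] for i in ids], dtype=float)
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)  # unit-normalise rows
    scores = mat @ mat[ids.index(query_id)]
    order = np.argsort(-scores)  # highest similarity first
    return [ids[i] for i in order if ids[i] != query_id][:top_k]

# Hypothetical learned embeddings for four products.
embeddings = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}

recommendations = recommend("a", embeddings, top_k=2)
```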
![Page 6: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/6.jpg)
Prod2Vec for popular songs
“Shake It Off” “All About That Bass”
Vasile F et al. 2016
![Page 7: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/7.jpg)
Prod2Vec in cold start case
“You’re Not Sorry” “Du Hast”
Vasile F et al. 2016
![Page 8: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/8.jpg)
Meta-Prod2Vec constraints
• Meta-Prod2Vec = Prod2Vec + product meta-data
• The aim is to mitigate the cold-start problem, where products with little or no interaction history get poor embeddings
Vasile F et al. 2016
![Page 9: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/9.jpg)
Loss function of Prod2Vec
Vasile F et al. 2016
![Page 10: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/10.jpg)
Negative sampling for Meta-Prod2Vec
Vasile F et al. 2016
![Page 11: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/11.jpg)
Loss function of Meta-Prod2Vec
Vasile F et al. 2016
I: input J: output M: meta-data
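The equation on this slide was an image. Assuming the form given in Vasile et al. 2016, the combined loss adds four meta-data terms to the Prod2Vec loss, weighted by the regularisation parameter λ. The reading of each term given after the equation is a summary and should be checked against the paper.

```latex
L_{MP2V} = L_{J|I} + \lambda \left( L_{M|I} + L_{J|M} + L_{M|M} + L_{I|M} \right)
```

Here L_{J|I} is the Prod2Vec term (output product given input product), L_{M|I} predicts the context meta-data from the input product, L_{J|M} predicts the output product from the input meta-data, L_{M|M} predicts the context meta-data from the input meta-data, and L_{I|M} ties each product to its own meta-data.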
![Page 12: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/12.jpg)
Evaluation of experiments
Vasile F et al. 2016
• Hit ratio at K (HR@K): whether the test product appears in the top-K list of recommended products; its rank within the list is ignored
• Normalised discounted cumulative gain (NDCG@K): a measure of ranking quality based on the graded relevance of the recommended entities, discounted by position. It ranges from 0 to 1, with 1 representing the ideal ranking.
IDCG: the maximum possible (ideal) DCG for a given set of queries
rel_i: graded relevance of the result at position i
k: maximum number of entities that can be recommended
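Both metrics can be sketched in a few lines. The DCG formula on the slide did not survive extraction, so this assumes the common form DCG@K = Σ rel_i / log2(i + 1) over positions i = 1..K, and computes the ideal ordering from the relevance values of the recommended list itself. Function names are illustrative.

```python
import math

def hit_ratio_at_k(ranked, relevant, k):
    """1.0 if any relevant item is in the top k, else 0.0 (rank is ignored)."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0

def dcg_at_k(rels, k):
    """Discounted cumulative gain; enumerate is 0-based, hence log2(i + 2)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))

def ndcg_at_k(rels, k):
    """DCG divided by the DCG of the ideal (descending relevance) ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

# The test product "z" is ranked third in a hypothetical recommendation list.
ranked = ["x", "y", "z"]
hr_at_2 = hit_ratio_at_k(ranked, {"z"}, k=2)
hr_at_3 = hit_ratio_at_k(ranked, {"z"}, k=3)
ndcg_at_3 = ndcg_at_k([0, 0, 1], k=3)
```

HR@K only asks whether the test product made the cut, whereas NDCG@K also rewards placing it nearer the top.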
![Page 13: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/13.jpg)
Methods for comparison
Vasile F et al. 2016
• BestOf: based on popularity
• CoCounts: based on cosine similarity (basic collaborative filtering)
• Prod2Vec
• Meta-Prod2Vec
• Mix(Prod2Vec,CoCounts):
• Mix(Meta-Prod2Vec,CoCounts):
Parameters
• Number of songs: 433k
• Number of artists: 67k
• Embedding dimension: 50
• Context window size: 3
• λ: 1
• α: 0.15
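The definitions of the two Mix methods were equations that are missing here. A plausible reading, given α = 0.15 and the "15% Meta-Prod2Vec" remark in the Discussion, is a linear blend of the two score lists. The sketch below assumes that form, with α weighting the embedding model and 1 - α weighting CoCounts; the function name and the toy scores are hypothetical.

```python
def mix_scores(model_scores, cocount_scores, alpha=0.15):
    """Blend two score dicts: alpha * embedding model + (1 - alpha) * CoCounts.

    Items missing from one source are treated as having score 0.0 there.
    """
    items = set(model_scores) | set(cocount_scores)
    return {
        item: alpha * model_scores.get(item, 0.0)
        + (1 - alpha) * cocount_scores.get(item, 0.0)
        for item in items
    }

# Hypothetical candidate scores for one query product.
meta_prod2vec = {"song_a": 1.0, "song_c": 0.4}
cocounts = {"song_a": 0.0, "song_b": 1.0}

blended = mix_scores(meta_prod2vec, cocounts)
ranking = sorted(blended, key=blended.get, reverse=True)
```

In this toy case "song_c" has no co-occurrence evidence at all, so only the embedding model contributes to its score, which is the cold-start situation the blend is meant to help with.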
![Page 14: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/14.jpg)
Relative importance of meta data
Vasile F et al. 2016
![Page 15: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/15.jpg)
Improvement in cold start
Vasile F et al. 2016
Cold start
![Page 16: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/16.jpg)
Improvement in cold start
Vasile F et al. 2016
![Page 17: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/17.jpg)
Better performance in ensemble model
Vasile F et al. 2016
![Page 18: Journal club: Meta-Prod2Vec](https://reader036.vdocuments.net/reader036/viewer/2022062503/587284c81a28abc7068b6e81/html5/thumbnails/18.jpg)
Discussion
• Meta-data was informative, especially in the cold-start case
• Ensemble method (with 15% Meta-Prod2Vec) worked well
• No comparison with matrix factorisation methods or with other Word2Vec variants that utilise meta-data