dynamic generation of personalized hybrid recommender systems


Upload: simon-dooms

Post on 02-Jul-2015


DESCRIPTION

Poster about my PhD as presented during the ACM RecSys 2013 conference in Hong Kong, Oct 12, 2013 by Simon Dooms.

TRANSCRIPT

Page 1: Dynamic generation of personalized hybrid recommender systems

Dynamic Generation of Personalized Hybrid Recommender Systems

WiCa, Wireless & Cable, www.wica.intec.ugent.be Gaston Crommenlaan 8 box 201, 9050 Ghent, Belgium

Ghent University, Department of Information Technology

Simon Dooms [email protected]

Luc Martens [email protected]

There is too much content available, too much to watch, listen to, or read in full. We need automated, intelligent content filtering, also known as recommendation. The recommender systems research domain has spent the last 20 years actively developing and researching new recommendation algorithms and strategies, leading to a new problem …

MyMediaLite

Which one should I use?

Collaborative Filtering · Content-based Filtering · Knowledge-based Filtering · MatrixFactorization · FactorWiseMatrixFactorization · BiasedMatrixFactorization · Popular Items · Random Items · ItemKNN · ItemAttributeKNN · CoClustering · TimeAwareBaselineWithFrequencies · SVDPlusPlus · ItemAverage · GlobalAverage · SigmoidCombinedAsymmetricFactorModel · BiPolarSlopeOne · UserKNN · UserItemBaseline · SlopeOne · SigmoidSVDPlusPlus · TimeAwareBaseline · NaiveBayes · LatentFeatureLogLinearModel · SVD · PCA · Probability-based · ?

[Figure labels: Information Overload · Recommendation Algorithm Overload]

Goal

Hybrid recommender systems combine the merits of multiple recommendation algorithms, but they are hard to configure and usually include only a few algorithms. What if we could throw all algorithms together and have an intelligent system automatically compose hybrids personalized for each user? Well, … that’s the goal.

Hybrid Framework

Dataset

For our experiments we use the MovieLens (1M) dataset. But since it lacks new and recent items (the most recent movie is from 2000), we merge it with a constantly growing ratings dataset gathered from social media: MovieTweetings (https://github.com/sidooms/MovieTweetings).
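The poster does not say how the two datasets are merged, so the following is only a minimal sketch of one possible approach. It assumes both files use `user::item::rating::timestamp` lines, that MovieTweetings ratings are on a 10-point scale and can be halved to match the MovieLens 5-star scale, and that a mapping from IMDb IDs to MovieLens IDs is available. The function `parse_ratings`, the `ml`/`mt` prefixes, the sample lines, and the `imdb_to_ml` mapping are all hypothetical and for illustration only.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class Rating(NamedTuple):
    user: str
    item: str
    value: float
    timestamp: int


def parse_ratings(lines: Iterable[str], prefix: str, scale: float = 1.0,
                  item_map: Optional[Dict[str, str]] = None) -> List[Rating]:
    """Parse 'user::item::rating::timestamp' lines into Rating tuples.

    User IDs are namespaced with `prefix` so the two sources cannot collide.
    Items found in `item_map` are translated; all others are namespaced too.
    """
    item_map = item_map or {}
    ratings = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user, item, value, timestamp = line.split("::")
        ratings.append(Rating(
            user=f"{prefix}:{user}",
            item=item_map.get(item, f"{prefix}:{item}"),
            value=float(value) * scale,
            timestamp=int(timestamp),
        ))
    return ratings


# Illustrative sample lines, not taken from the poster.
movielens_lines = ["1::1193::5::978300760", "1::661::3::978302109"]
movietweetings_lines = ["1::0114508::8::1381006850", "2::0499549::9::1376753198"]

# Made-up ID mapping (assumption): links one IMDb ID to a MovieLens item.
imdb_to_ml = {"0114508": "ml:1193"}

merged = sorted(
    parse_ratings(movielens_lines, prefix="ml")
    + parse_ratings(movietweetings_lines, prefix="mt", scale=0.5, item_map=imdb_to_ml),
    key=lambda r: r.timestamp,
)
```

Namespacing the user IDs keeps user 1 of MovieLens distinct from user 1 of MovieTweetings, and sorting by timestamp lets the newer social media ratings extend the older data chronologically.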

Algorithms

Algorithms are considered black boxes to facilitate the integration of new and various types of recommendation algorithms. Currently we have integrated over 20 algorithms, which include all of the rating predictors from MyMediaLite and a few custom algorithms. We plan on integrating other recommendation libraries as well.

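Learning = Optimization Problem

Recommendation algorithms $a_1, a_2, \ldots, a_n$

Algorithm weights vector $\gamma^u$, e.g. $\gamma^u = (1, 0, 0, 0.5, 1, 1, 0, 1, 0)$

User-centric dynamic ensemble recommender:

$$g(u, i) = \gamma^u_{a_1} \cdot g_{a_1}(u, i) + \gamma^u_{a_2} \cdot g_{a_2}(u, i) + \ldots + \gamma^u_{a_n} \cdot g_{a_n}(u, i)$$

Optimize the weights vector $\gamma^u = (\gamma^u_{a_1}, \gamma^u_{a_2}, \ldots, \gamma^u_{a_n})$ such that $f(\gamma^u)$ is minimized.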
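The weighted sum and its optimization can be sketched in a few lines of Python. This is not the poster's implementation: the objective f is left unspecified there, so the sketch assumes f is the RMSE of the ensemble on one user's known ratings, and it uses a simple coordinate search as a stand-in optimizer. The two toy predictors and all function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Rating = Tuple[str, str, float]          # (user, item, true rating)
Predictor = Callable[[str, str], float]  # black-box g_a(u, i)


def ensemble_predict(weights: Sequence[float], predictors: Sequence[Predictor],
                     user: str, item: str) -> float:
    """g(u, i): weighted sum of the individual algorithm predictions."""
    return sum(w * g(user, item) for w, g in zip(weights, predictors))


def rmse(weights: Sequence[float], predictors: Sequence[Predictor],
         ratings: Sequence[Rating]) -> float:
    """Assumed objective f: RMSE of the ensemble on the given ratings."""
    squared = sum((ensemble_predict(weights, predictors, u, i) - r) ** 2
                  for u, i, r in ratings)
    return math.sqrt(squared / len(ratings))


def optimize_weights(predictors: Sequence[Predictor], ratings: Sequence[Rating],
                     step: float = 0.5, min_step: float = 1e-3,
                     max_sweeps: int = 10_000) -> Tuple[List[float], float]:
    """Coordinate search: nudge one weight at a time, halve the step when stuck."""
    weights = [0.0] * len(predictors)
    best = rmse(weights, predictors, ratings)
    for _ in range(max_sweeps):
        if step < min_step:
            break
        improved = False
        for k in range(len(weights)):
            for delta in (step, -step):
                trial = list(weights)
                trial[k] += delta
                score = rmse(trial, predictors, ratings)
                if score < best - 1e-12:
                    weights, best, improved = trial, score, True
        if not improved:
            step /= 2
    return weights, best


# Toy data (hypothetical): one predictor overshoots by 0.5, the other undershoots.
truth = {"i1": 4.0, "i2": 2.0, "i3": 5.0}
user_ratings = [("u1", item, value) for item, value in truth.items()]
predictors = [lambda u, i: truth[i] + 0.5, lambda u, i: truth[i] - 0.5]

learned_weights, best_rmse = optimize_weights(predictors, user_ratings)
```

On this toy data the two errors cancel, so the search settles on equal weights for both predictors. Because the predictors are only ever called as functions of (user, item), any black-box algorithm can be plugged in without changes to the optimizer.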

[Figure labels: Optimize · Fast (seconds) · Slow (hours) · Fold datasets · All data]

Responsive online recommender: real-time integration of new user ratings by adding the ratings to the fold test sets.
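One way to read the idea of adding new ratings to the fold test sets is sketched below. It assumes that each fold holds predictors already trained on that fold's training data, and that a new rating is appended to every fold's test set so the user's weights can be re-evaluated without retraining any algorithm. The `Fold` class, both functions, and the constant toy predictors are hypothetical and not taken from the poster.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Predictor = Callable[[str, str], float]  # pre-trained black box for one fold


class Fold:
    """One fold: predictors trained on its training set, plus its test set."""

    def __init__(self, predictors: List[Predictor]):
        self.predictors = predictors
        self.test_set: Dict[str, List[Tuple[str, float]]] = {}  # user -> [(item, rating)]


def add_new_rating(folds: Sequence[Fold], user: str, item: str, rating: float) -> None:
    """Append a fresh rating to every fold's test set; no model is retrained."""
    for fold in folds:
        fold.test_set.setdefault(user, []).append((item, rating))


def user_error(folds: Sequence[Fold], user: str, weights: Sequence[float]) -> float:
    """RMSE of the weighted ensemble for one user across all fold test sets."""
    squared, count = 0.0, 0
    for fold in folds:
        for item, rating in fold.test_set.get(user, []):
            prediction = sum(w * g(user, item) for w, g in zip(weights, fold.predictors))
            squared += (prediction - rating) ** 2
            count += 1
    return math.sqrt(squared / count) if count else float("nan")


# Toy folds with constant predictors (assumption, for illustration only).
folds = [
    Fold([lambda u, i: 4.0, lambda u, i: 2.0]),
    Fold([lambda u, i: 4.5, lambda u, i: 2.5]),
]

add_new_rating(folds, "u1", "i9", 4.0)
error_first_only = user_error(folds, "u1", [1.0, 0.0])
error_second_only = user_error(folds, "u1", [0.0, 1.0])
```

Under these assumptions only the cheap weight evaluation reacts to the new rating right away, while retraining the algorithms on all data can be left to a slower background process.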

The Filter Bubble

Control and transparency are provided by allowing users to view and modify their system-calculated algorithm weights.

Offline Results

Early tests with RMSE evaluation on the MovieLens (100K) dataset with 10 MyMediaLite algorithms show statistically interesting results. Future work: online evaluation.