Recommender System Introduction
Posted on 15-Jan-2015
1. Recommender System Introduction
   xiangliang@hulu.com
2. What is a good recommender system?

3. Outline
   - What is a recommender system? Mission / History / Problems
   - What is a good recommender system? Experiment Methods / Evaluation Metrics

4. Information Overload

5. How to solve information overload
   - Catalog: Yahoo, DMOZ
   - Search engine: Google, Bing

6. Mission
   - Help users find items of interest.
   - Help item providers deliver their items to the right users.
   - Help websites improve user engagement.

7. Recommender System

8. Search Engine vs. Recommender System
   - Users try a search engine when they have specific needs and can describe those needs with keywords.
   - Users try a recommender system when they do not yet know what they want and cannot describe their needs with keywords.

9. History: Before 1992 (Content Filtering)
   - "An architecture for large scale information systems" (Gifford, D.K.)
   - "MAFIA: An active mail-filter agent for an intelligent document processing support" (Lutz, E.)
   - "A rule-based message filtering system" (Pollock, S.)

10. History: 1992-1998
   - Tapestry, by Xerox Palo Alto: first system built around collaborative filtering
   - GroupLens: first recommender system to use rating data
   - MovieLens: first movie recommender system; provides a well-known dataset for researchers

11. History: 1992-1998
   - Fab: content-based, collaborative recommendation; first unified recommender system
   - "Empirical Analysis of Predictive Algorithms for Collaborative Filtering" (John S. Breese): systematically evaluated user-based collaborative filtering

12. History: 1999-2005
   - Amazon proposed item-based collaborative filtering (patent filed in 1998, issued in 2001) [link]
   - Thomas Hofmann proposed pLSA and applied a similar method to collaborative filtering
   - Pandora began the Music Genome Project

13. History: 1999-2005
   - Last.fm uses Audioscrobbler to build user taste profiles for music
   - "Evaluating Collaborative Filtering Recommender Systems" (Jonathan L. Herlocker)

14. History: 2005-2009
   - "Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions"
     (Alexander Tuzhilin)
   - Netflix Prize [link]
     - Latent factor models (SVD, RSVD, NSVD, SVD++)
     - Temporal dynamic collaborative filtering
     - Yehuda Koren's team [link] won the prize

15. History: 2005-2009
   - ACM Conference on Recommender Systems (Minneapolis, Minnesota, USA)
   - Digg and YouTube try recommender systems

16. History: 2010-now
   - Context-aware recommender systems
   - Music recommendation and discovery
   - Recommender systems and the social web
   - Information heterogeneity and fusion in recommender systems
   - Human decision making in recommender systems
   - Personalization in mobile applications
   - Novelty and diversity in recommender systems
   - User-centric evaluation

17. History: 2010-now
   - Facebook launches instant personalization: Clicker, Bing, TripAdvisor, Rotten Tomatoes, Pandora

18. Problems
   Main problems:
   - Top-N recommendation
   - Rating prediction

19. Problems: Top-N Recommendation
   Input: (user, item) pairs, e.g.
     user  item
     A     a
     B     a
     B     b
   Output: a Top-N item list for each user

20. Problems: Rating Prediction
   Input: (user, item, rating) triples
   Output: predicted ratings for unseen (user, item) pairs

21. What is a good recommender system?

22. Experiment Methods
   - Offline experiment
   - User survey
   - Online experiment (A/B testing)

23. Experiment Methods: Offline Experiment
   - Split the dataset into a training set and a test set.
   - Advantage: relies only on a dataset.
   - Disadvantage: offline metrics cannot reflect the business goal.

24. Experiment Methods: User Survey
   - Advantages:
     - Can obtain subjective metrics
     - Lower risk than online testing
   - Disadvantages:
     - Higher cost than offline experiments
     - Some results may not be statistically significant
     - Users may behave differently in a testing environment than in the real environment
     - It is difficult to design double-blind experiments

25. Experiment Methods: Online Experiment (A/B Testing)
   - Advantage: can measure metrics tied to the business goal
   - Disadvantages:
     - High risk and cost
     - Needs a large user set to obtain statistically significant results

26. Experiment Metrics
   - User satisfaction
   - Prediction accuracy
   - Coverage
   - Diversity
   - Novelty
   - Serendipity
   - Trust
   - Robustness
   - Real-time performance

27. Experiment Metrics: User Satisfaction
   - A subjective metric
   - Measured by user surveys or online experiments
28. Experiment Metrics: Prediction Accuracy
   - Measured by offline experiments
   - Top-N recommendation: precision / recall
   - Rating prediction: MAE, RMSE

29. Experiment Metrics: Coverage
   - Measures the ability of a recommender system to recommend long-tail items:
     Coverage = |∪_{u∈U} R(u, N)| / |I|
   - Alternatives: entropy, Gini index

30. Experiment Metrics: Diversity
   - Measures the ability of a recommender system to cover a user's different interests.
   - Different similarity metrics yield different diversity metrics.

31. Experiment Metrics: Diversity (Example)
   [Figure: a user's watch history alongside its related items]

32. Experiment Metrics: Novelty
   - Measures the ability of a recommender system to introduce long-tail items to users.
   - International Workshop on Novelty and Diversity in Recommender Systems [link]
   - "Music Recommendation and Discovery in the Long Tail" (Oscar Celma)

33. Experiment Metrics: Serendipity
   A recommendation is serendipitous if:
   - it is not related to the user's historical interests,
   - it is novel to the user, and
   - the user finds it interesting after viewing it.

34. Experiment Metrics: Trust
   - If users trust the recommender system, they will interact with it.
   - Ways to improve trust:
     - Transparency
     - Social trust systems (Epinions)

35. Experiment Metrics: Robustness
   - The ability of a recommender system to withstand attacks.
   - Neil Hurley, "Tutorial on Robustness of Recommender Systems," ACM RecSys 2011.

36. Experiment Metrics: Real-time
   - Generate new recommendations immediately when a user exhibits new behavior.

37. Too many metrics! Which is the most important?

38. How to make trade-offs
   - Business goal
   - Our beliefs
   - Develop new algorithms through three stages of experiments:
     1. Offline testing
     2. User survey
     3. Online testing

39. Thanks!
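Supplementary note: slide 25 warns that A/B tests need a large user set to reach statistical significance. A minimal sketch of why, using a standard two-proportion z-test on click-through rates; the function name and all counts below are illustrative, not from the deck:

```python
import math

def ab_test_z(clicks_a, users_a, clicks_b, users_b):
    """Two-proportion z-test comparing CTR of arms A and B.
    Returns (z statistic, two-sided p-value under a normal approximation)."""
    p_a, p_b = clicks_a / users_a, clicks_b / users_b
    p_pool = (clicks_a + clicks_b) / (users_a + users_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / users_a + 1 / users_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal CDF
    return z, p_value

# The same 5% -> 6% CTR lift:
print(ab_test_z(50, 1000, 60, 1000)[1])        # p ≈ 0.33: not significant
print(ab_test_z(5000, 100000, 6000, 100000)[1])  # p < 1e-6: significant
```

With 1,000 users per arm the observed lift is indistinguishable from noise; with 100,000 per arm the identical lift is decisive, which is exactly the slide's point.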
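Slide 28's offline accuracy metrics are simple to compute. A minimal Python sketch; the function names and toy data are my own, not from the deck:

```python
import math

def precision_recall(recommended, relevant):
    """Precision and recall of one user's Top-N recommendation list."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def mae_rmse(predicted, actual):
    """Mean absolute error and root-mean-square error for rating prediction."""
    errors = [p - a for p, a in zip(predicted, actual)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse

# Toy data: 2 of 4 recommended items are among the 5 relevant ones.
print(precision_recall(["a", "b", "c", "d"], ["a", "b", "x", "y", "z"]))  # (0.5, 0.4)
print(mae_rmse([4.0, 3.0, 5.0], [3.5, 3.0, 4.0]))  # MAE 0.5, RMSE ≈ 0.645
```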
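Slide 29's coverage formula, and the Gini index it lists as an alternative, can be sketched as follows, assuming recommendations are stored as a dict of per-user Top-N lists (all names and data here are illustrative):

```python
from collections import Counter

def coverage(rec_lists, catalog):
    """Coverage = |union over users of R(u, N)| / |I|: the share of the
    catalog that appears in at least one user's recommendation list."""
    recommended = set()
    for items in rec_lists.values():
        recommended.update(items)
    return len(recommended) / len(catalog)

def gini_index(rec_lists):
    """Gini index of the recommendation-count distribution over items:
    0 = every recommended item is shown equally often, 1 = maximally skewed."""
    counts = sorted(Counter(i for items in rec_lists.values() for i in items).values())
    n, total = len(counts), sum(counts)
    # Standard Gini formula over counts sorted in ascending order.
    return sum((2 * (j + 1) - n - 1) * c for j, c in enumerate(counts)) / (n * total)

recs = {"u1": ["a", "b"], "u2": ["a", "c"], "u3": ["a", "b"]}
print(coverage(recs, {"a", "b", "c", "d", "e"}))  # 0.6
print(round(gini_index(recs), 3))  # 0.222
```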
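Slide 30 notes that each choice of similarity metric induces a different diversity metric. One common instantiation is intra-list diversity: 1 minus the average pairwise similarity within a user's list. The sketch below assumes Jaccard similarity over item tag sets, which is my choice of similarity, not the deck's:

```python
from itertools import combinations

def jaccard(tags_a, tags_b):
    """One possible item-item similarity: Jaccard overlap of tag sets."""
    if not tags_a and not tags_b:
        return 0.0
    return len(tags_a & tags_b) / len(tags_a | tags_b)

def intra_list_diversity(rec_list, item_tags):
    """Diversity of one user's list: 1 minus mean pairwise similarity.
    Swapping in a different similarity yields a different diversity metric."""
    pairs = list(combinations(rec_list, 2))
    if not pairs:
        return 0.0
    mean_sim = sum(jaccard(item_tags[a], item_tags[b]) for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim

tags = {"m1": {"action", "scifi"}, "m2": {"action"}, "m3": {"drama"}}
print(intra_list_diversity(["m1", "m2", "m3"], tags))  # ≈ 0.833
```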