recommender lecture


Post on 21-Oct-2015







Recommender system lecture notes


  • Recommender Systems

  • Customization
    Customization is one of the more attractive features of electronic commerce.
    It means creating a different product for every user, suited to his or her tastes.
    Once thought to be a novelty, it is now essential.
    It gives online providers a way to compete with brick-and-mortar competitors.
    It makes it possible to serve niche markets.
    Bezos: "If I have two million customers on the Web, then I should have two million stores on the Web." (How dated is that?)

  • How can personalization help?
    Turn browsers into buyers
      People may go to Amazon without a specific purchase in mind.
      Showing them something they want can spur a purchase.
    Cross-sales
      Customers who have bought a product are shown related products.
    Encourages loyalty
      Amazon is interested in becoming an e-commerce portal, meaning they would like to meet all your online purchasing needs.

  • Examples: Amazon
    Featured Recommendations: tailored to past views/purchases.
    "People who bought this": compares customers.
    Alerts: sends you email when stuff you like is on sale.
    Customer reviews / ListMania
      Allows users to add their own reviews of products.
      Customers can find other reviews by a given user.

  • Examples
    Netflix
      You rate movies, and others are suggested based on these ratings.
      You are compared to other users.
    Reel.com
      Movie Matches: you enter a movie, and it suggests similar movies.
      Compares movies to movies.

  • Examples
    Citeseer
      Recommends papers based on citations, similar text, and "cited by".
    Launch
      Lets you customize your own radio station.
      You get a customized mp3 stream.

  • Types of recommendations: population-based
    For example, the most popular news articles, searches, or downloads.
    Useful for sites that frequently add content.
    No user tracking needed.
    Netflix: "Movers on the top 100"
      Reflects movies that have been popular overall.
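
A population-based list needs nothing more than a tally of events. The sketch below is a minimal illustration; the view log and the helper name `most_popular` are made up for the example.

```python
from collections import Counter

def most_popular(events, n=3):
    """Return the n most frequently seen item ids, most popular first."""
    counts = Counter(events)
    return [item for item, _ in counts.most_common(n)]

# Hypothetical view log: one entry per article view, no user ids recorded.
views = ["a1", "a2", "a1", "a3", "a2", "a1", "a4"]
top = most_popular(views, n=2)
print(top)
```

Because the log carries no user ids, every visitor sees the same list, which is why no user tracking is needed.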

  • Types of recommendations: item-to-item
    Content-based
    One item is recommended based on the user's indication that they like another item.
      "If you like Lord of the Rings, you'll like Legend."
    Netflix: 1-5 star rating.
      Estimates how much you'll like a movie based on your past ratings.

  • Types of recommendations
    Challenges with item-to-item:
      Getting users to tell you what they like.
        There are both financial and time reasons not to.
      Getting enough data to make novel predictions.
        What users really want are recommendations for things they're not aware of.

  • Types of recommendations: item-to-item
    Most effective when you have metadata that lets you automatically relate items.
      Genre, actors, director, etc.
    Also best when decoupled from payment.
      Users should have an incentive to rate items truthfully.
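
One simple way to relate items through metadata is to measure how much their tag sets overlap. The sketch below uses Jaccard similarity, which is an assumption of this example and not something the slides prescribe; the tags and function names are invented for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two metadata sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def similar_items(liked, catalog, n=1):
    """Rank the other catalog items by metadata overlap with `liked`."""
    scores = {
        title: jaccard(catalog[liked], tags)
        for title, tags in catalog.items()
        if title != liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical metadata tags, invented for this example.
catalog = {
    "Lord of the Rings": {"fantasy", "adventure", "quest"},
    "Legend": {"fantasy", "adventure", "romance"},
    "Police Academy": {"comedy", "police"},
}
print(similar_items("Lord of the Rings", catalog))
```

No user ratings are involved here, so this works for a brand-new item as soon as its metadata is known.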

  • Types of recommendations: user-based
    "Users who bought X like Y."
    Each user is represented by a vector indicating his or her ratings for each product.
    Users with a small distance between each other are similar.
    Find a similar user and recommend things they like that you haven't rated.
    Netflix: "Users who liked …"

  • Types of recommendations: user-based
    Advantages:
      Users don't need to rate much.
      No info about products needed.
      Easy to implement.
    Disadvantages:
      Pushes users toward the middle: products with more ratings carry more weight.
      How to deal with new products?
      Many products and few users -> lots of things don't get recommended.

  • Types of recommendations: manual/free-form
    Users write reviews for a product, which are attached to the product.
    Advantages: natural language, explanations for pros/cons, users get to participate.
    Disadvantages: few neutral recommendations, difficult to automate.
    Netflix: Member Reviews, Critic Reviews

  • Potential applications
    Placing a product in space
      "The product you're looking at is like …"
    Configuring display
      Choosing what to show or emphasize based on preferences.
    Personalized discounts/coupons
      Grocery stores do this.
    Clustering users
      Determining the tastes of your consumers.

  • Details: how RS work
    Content-based (user-based) systems try to learn a model of a user's preferences.
    This is a function that, for each user, maps an item to an indication of how much the user likes it.
    The output might be yes/no or probabilistic.

  • How RS work
    A common model-learner is a naive Bayes classifier.
    An item is represented as a feature vector.
      Web pages: list/bag of possible words
      Movies: list of possible actors, directors, etc.
    This vector is large, so common features are filtered out ("the", "an", etc.).
    Useful for unstructured data such as text.

  • Naive Bayes classifier
    Maps from an input vector to a probability of liking.
    Naive: assumes inputs are independent of each other.
    Probability that an item belongs to class i, given a set of attributes:
      $P(C_i \mid A_1 = v_1, A_2 = v_2, \ldots, A_n = v_n)$
    If all A_j are independent given the class, we can use:
      $P(C_i \mid A_1 = v_1, \ldots, A_n = v_n) \propto P(C_i) \prod_{j=1}^{n} P(A_j = v_j \mid C_i)$
      (this is easy to compute)
    Pick the C_i with the highest probability.
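
The product form used by the classifier follows from Bayes' rule plus the independence assumption. A short derivation, written with j indexing the attributes and n their number (notation chosen here for clarity):

```latex
\begin{align*}
P(C_i \mid A_1 = v_1, \ldots, A_n = v_n)
  &= \frac{P(C_i)\, P(A_1 = v_1, \ldots, A_n = v_n \mid C_i)}
          {P(A_1 = v_1, \ldots, A_n = v_n)} \\
  &= \frac{P(C_i) \prod_{j=1}^{n} P(A_j = v_j \mid C_i)}
          {P(A_1 = v_1, \ldots, A_n = v_n)} \\
  &\propto P(C_i) \prod_{j=1}^{n} P(A_j = v_j \mid C_i)
\end{align*}
```

The second line is where the naive assumption enters: the attributes are treated as independent given the class. The denominator is the same for every class, so it can be dropped when all we need is the class with the highest probability.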

  • Training a naive Bayes classifier
    How do we know $P(A_j = v_j \mid C_i)$?
    The user labels data for us (says what she likes).
    For each class, we compute the fraction of times that $A_j = v_j$.

  • Example
    Two classes (yes, no)
    Three documents, each of which has four words.
      D1: {cat, dog, fly, cow} -> yes
      D2: {crow, straw, fly, zebra} -> no
      D3: {cat, dog, zoom, flex} -> yes
    Number of unique words in yes: 6
    Number of unique words in no: 4
    Total number of unique words: 9

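  • Example
    P(cat | yes): 2/6
    P(cat | no): 0/4
    P(yes | {cat, zoom, fly, dog}) = 2/6 * 1/6 * 1/6 * 2/6 ≈ 0.003
    P(no | {cat, zoom, fly, dog}) = ε * ε * 1/4 * ε ≈ 0.00025 (with ε = 0.1)
    (ε stands in for zero counts, which helps us deal with sparse data; the class priors P(yes) and P(no) are left out of these products)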
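
The arithmetic in this example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: it follows the slide's estimate (occurrences of a word in a class divided by the number of unique words in that class), it leaves out the class priors as the slide does, and it assumes ε = 0.1, the value that makes the "no" score come out to 0.00025. The function names are made up for the example.

```python
from math import prod

EPSILON = 0.1  # assumed value; it is consistent with the 0.00025 in the example

docs = [
    ({"cat", "dog", "fly", "cow"}, "yes"),
    ({"crow", "straw", "fly", "zebra"}, "no"),
    ({"cat", "dog", "zoom", "flex"}, "yes"),
]

def train(docs):
    """For each class, count how many of its documents contain each word."""
    counts = {}
    for words, label in docs:
        per_class = counts.setdefault(label, {})
        for w in words:
            per_class[w] = per_class.get(w, 0) + 1
    return counts

def word_prob(counts, word, label):
    """Estimate P(word | label) as in the example; unseen words get EPSILON."""
    per_class = counts[label]
    if word not in per_class:
        return EPSILON
    # Denominator is the number of unique words seen in this class.
    return per_class[word] / len(per_class)

def score(counts, words, label):
    """Product of per-word likelihoods (class prior omitted, as in the example)."""
    return prod(word_prob(counts, w, label) for w in words)

counts = train(docs)
query = ["cat", "zoom", "fly", "dog"]
scores = {label: score(counts, query, label) for label in counts}
best = max(scores, key=scores.get)
print(scores, best)
```

A fuller implementation would multiply each score by the class prior and would usually use add-one smoothing and log-probabilities instead of a fixed ε.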

  • Rule-learning algorithms
    If data is structured, rules can be learned for classification.
      Director=kubrick && star=mcdowell -> like
      Title=police academy* -> not like
    These rules can be stored efficiently as a decision tree.
      Tests at each node.
    Fast, easy to learn, can handle noise.

  • Decision trees
    Title=Police Academy?
      yes -> not like
      no  -> Director=kubrick?
               yes -> Star=mcdowell?
                        yes -> like
                        no  -> (leaf not shown)
               no  -> (leaf not shown)
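
Read as code, a decision tree is a chain of nested tests. The sketch below hand-codes the two rules from the rule-learning slide; a real system would learn the tests from labeled data. The dictionary keys and the `None` fallback are assumptions of this example, since the slides do not say what happens when no rule fires.

```python
def classify(movie):
    """Apply the two example rules in decision-tree order.

    Returns "like", "not like", or None when no rule fires.
    """
    # Title=police academy* -> not like
    if movie.get("title", "").lower().startswith("police academy"):
        return "not like"
    # Director=kubrick && star=mcdowell -> like
    if movie.get("director") == "kubrick" and movie.get("star") == "mcdowell":
        return "like"
    return None  # outcome not specified by the rules

print(classify({"title": "Police Academy 3"}))
print(classify({"title": "A Clockwork Orange",
                "director": "kubrick", "star": "mcdowell"}))
```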

  • Other model-learning approaches
    TFIDF
      Produces similar results to naive Bayes.
    Neural net
      Learns a nonlinear function mapping features to classes.
      More powerful, but results can be hard to interpret.
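
For readers who have not met TFIDF (term frequency, inverse document frequency), the sketch below computes one common variant of the weighting. The exact formula differs between systems, so treat the tf and idf definitions in the docstring as assumptions of this example; the documents are borrowed from the naive Bayes example.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {word: weight} dict per document.

    tf  = count of the word in the document / document length
    idf = log(number of documents / number of documents containing the word)
    """
    n = len(docs)
    doc_freq = Counter(w for doc in docs for w in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            w: (c / len(doc)) * math.log(n / doc_freq[w])
            for w, c in tf.items()
        })
    return weights

docs = [
    ["cat", "dog", "fly", "cow"],
    ["crow", "straw", "fly", "zebra"],
    ["cat", "dog", "zoom", "flex"],
]
w = tfidf(docs)
print(w[0])
```

Words that appear in only one document ("cow") get a higher weight than words shared across documents ("cat", "fly"), which is the property that makes the weights useful as features.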

  • Comparing users to users
    Often, it's easier to compare users to other users.
      Less data needed.
      No knowledge of items required.
    Typical approach involves nearest-neighbor classification.

  • Nearest-neighbor classification
    We create a feature vector for each user containing an element for each ratable item.
    To compare two users, we compute the Euclidean distance between the filled-in elements of their feature vectors.
      $d(u_j, u_k) = \sqrt{\sum_i (u_{ji} - u_{ki})^2}$
    To recommend, find a similar user, then find things that user rated highly.

  • Example
    Say our domain consists of four movies:
      Police Academy
      Clockwork Orange
      Lord of the Rings
      Titanic
    We represent each user's ratings as a four-tuple, one element per movie in the order above.

  • Example
    We currently have three users in the system: u1, u2, and u3.
    A new user, u4, comes in.
    u4 is most similar to u1, so we would recommend they see Lord of the Rings and avoid Clockwork Orange.
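
The nearest-neighbor procedure can be sketched end to end in Python. The rating values below are hypothetical: they are invented so that u4 comes out closest to u1, matching the conclusion of the example. The 1-5 scale, the like/dislike thresholds, and the function names are likewise assumptions of this sketch.

```python
import math

MOVIES = ["Police Academy", "Clockwork Orange", "Lord of the Rings", "Titanic"]

# Hypothetical 1-5 star ratings (None = not rated), invented for this sketch.
users = {
    "u1": (5, 1, 5, 4),
    "u2": (1, 5, 4, 2),
    "u3": (2, 4, None, 5),
}
u4 = (5, None, None, 4)

def distance(a, b):
    """Euclidean distance over the items both users have rated."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # nothing in common, treat as maximally distant
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))

def nearest(target, others):
    """Name of the user whose ratings are closest to `target`."""
    return min(others, key=lambda name: distance(target, others[name]))

def recommend(target, neighbor, like_at=4, dislike_at=2):
    """Split the neighbor's ratings for items `target` has not rated into see/avoid."""
    see, avoid = [], []
    for title, mine, theirs in zip(MOVIES, target, neighbor):
        if mine is None and theirs is not None:
            if theirs >= like_at:
                see.append(title)
            elif theirs <= dislike_at:
                avoid.append(title)
    return see, avoid

best = nearest(u4, users)
see, avoid = recommend(u4, users[best])
print(best, see, avoid)
```

Comparing only the filled-in elements keeps unrated items from counting as zeros, but it also means two users with a single shared rating can look identical, so practical systems often require a minimum overlap.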

  • Personal and ethical issues
    How to get users to reveal their preferences?
    How to get users to rate all products equally (not just ones they love or hate)?
    Users may be reluctant to give away personal data.
    Users may be upset by preferential treatment.

  • Summary
    Recommender systems allow online retailers to customize their sites to meet consumer tastes.
      Aid browsing, suggest related items.
    Personalization is one of e-commerce's advantages compared to brick-and-mortar stores.
    Challenges: obtaining and mining data, making intelligent and novel recommendations, ethics.
    Can perform comparisons across users or across items.
      Trade off data needed versus detail of recommendation.