
Recommender Systems

Outline

• Limitations of Recommender Systems

• SMARTMUSEUM Case Study

The Netflix Prize (2006 – 2009)

• Open competition to build a collaborative filtering algorithm

• $1,000,000 cash prize

• The winner improved on the accuracy of Netflix’s existing algorithm by 10.06%

How many approaches were used?

“Over 500” [1]

However, the winning model wasn’t implemented...

Why wasn’t it implemented?

1. The competition used 100 million ratings while Netflix had around 5 billion at the time

2. They didn’t predict changing user requirements from DVDs to streaming

SMARTMUSEUM: A Mobile Recommender System for the Web of Data

Smart Museum

• A mobile recommender system

• Recommends sites to visit and, once on site, works of art to see

• Provides descriptions and associated multimedia content

• Ontology-based system (item-based)

Limitations of mobile recommender systems

• Heterogeneous content

  • Different structures

  • Different vocabularies

  • Semantic differences

• Over-specialisation

  • Objects too similar to past preferences

  • Content vs context

How Recommendations Are Made

• The system recommends objects on the basis of a user profile and context information such as the physical location and motivation of the user

User Profile

User Profile (Cont.)

System Overview: Scenarios

1. The desktop scenario:

• Can create a user profile specifying preferences and abilities

2. The mobile outdoor scenario:

• Uses GPS or cell identifier

• Combines with user profile and visit time

System Overview: Scenarios (Cont.)

3. The mobile indoor scenario:

• Switch to indoor mode manually or via an RFID sensor

• Users can use their current profile or a pre-defined one (avoids cold-start)

• Object descriptions come from actively maintained collections

• Objects are annotated with contexts and with weightings

• User can retrieve related content of each recommendation

• Liking or disliking an object updates the user profile based on the triples occurring in the object’s annotations

Example Triple

System Components

• Metadata service

• Context service

• User profile service

• Filtering service

Metadata Service

• Responsible for storing the object annotations obtained from crawling the web.

• Checks designated URLs that point to a data dump of triples and stores this in the database for further access.

• The data is then sent to the filtering service for indexing

Context Service

• Maps the RFID identifiers and GPS coordinates to URIs in the ontologies.

• Maps manual and sensor-based contextual data to the concepts defined in the ontologies.

• Spatial search constraints corresponding to the user’s context are used to limit the matching of possible objects to be recommended.

• The context information is used to retrieve the parts of the user’s profile that are seen as relevant given their previous behaviour.
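The spatial constraint described above can be pictured as a simple distance filter. The following is only a minimal sketch, not the SMARTMUSEUM implementation: the function names, the 2 km radius, and the approximate coordinates are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(user_pos, objects, radius_m):
    """Keep only the objects whose coordinates lie within radius_m of the user."""
    lat, lon = user_pos
    return [name for name, (olat, olon) in objects.items()
            if haversine_m(lat, lon, olat, olon) <= radius_m]


# Hypothetical objects with approximate coordinates (illustrative values only).
objects = {
    "DuomoDiMilano": (45.4642, 9.1916),
    "MuseoGalileo": (43.7677, 11.2559),
}
user_in_milan = (45.4654, 9.1866)
nearby = within_radius(user_in_milan, objects, radius_m=2_000)
```

Only the candidates that survive this filter would then be passed on for matching against the user profile.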

User Profile Service

• Stores user profiles and maintains context information alongside the profiles.

• Adapts to explicit relevance feedback determined through indication of specific objects as relevant or non-relevant.

• The user’s interest is modelled as a conditional probability.

Filtering Service

• Indexes the content from the metadata service.

• Filters recommendations upon the mobile client’s request based on the user profile and context.

The Recommender System

• User profiling

• Data indexing (won’t cover)

• Result ranking

• Query expansion

• Feature balancing (count-based normalisation of content vs context triples)

• Result clustering

User Profiling

• Context-aware user profiling requires flexible models that can represent different context variables. The authors therefore adopted a probabilistic user profiling model.

• A user profile consists of a set of profile entries e where e = <triple, contextTriple, count, tone>

• The triples are assumed to be independent, which is a simplification but keeps the computation simple and has been shown to perform well in practice. [2]

User Profiling (Example Entry)

Duomo di Milano

Assume the user ‘likes’ the Duomo di Milano. We would insert the following into the user profile:

triple = <DuomoDiMilano, rdf:type, aat:church>

contextTriple = <rdf:Resource, sm:userLocation, place:Milan>

tone = positive
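As a rough illustration of how such entries might be stored, here is a minimal sketch. The class and method names are hypothetical, and incrementing count by one per feedback event is an assumption, since the slides do not say how count is maintained.

```python
from collections import Counter
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class UserProfile:
    """Counts how often each (triple, contextTriple, tone) combination was observed."""

    def __init__(self):
        self.counts = Counter()

    def add_feedback(self, triples, context, tone):
        """Record one like/dislike for every triple annotating the object."""
        if tone not in ("positive", "negative"):
            raise ValueError("tone must be 'positive' or 'negative'")
        for t in triples:
            self.counts[(t, context, tone)] += 1  # assumed update rule

    def entries(self):
        """Return profile entries as (triple, contextTriple, count, tone) tuples."""
        return [(t, ct, n, tone) for (t, ct, tone), n in self.counts.items()]


profile = UserProfile()
duomo = Triple("DuomoDiMilano", "rdf:type", "aat:church")
in_milan = Triple("rdf:Resource", "sm:userLocation", "place:Milan")

# The user likes the Duomo twice while located in Milan.
profile.add_feedback([duomo], in_milan, "positive")
profile.add_feedback([duomo], in_milan, "positive")
```

After the two calls the profile holds a single entry for the Duomo triple in the Milan context, with a count of 2 and a positive tone.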

Computing Likelihoods

• To compute the weight of each triple given a context, we take the likelihood of that context generating the triple:

$$P(t \mid ct) = \frac{\mathrm{count}(t \mid ct)}{\mathrm{count}(ct)}$$

• Compute this separately for positive and negative feedback
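A small sketch of this estimate follows. It assumes count(t|ct) is the number of feedback events in which triple t was seen under context ct, and count(ct) is the total number of feedback events under ct. The function name and the sample observations are hypothetical.

```python
from collections import Counter


def triple_likelihoods(observations, tone):
    """Estimate P(t | ct) from (triple, context, tone) feedback observations."""
    pair_counts = Counter()     # count(t | ct)
    context_counts = Counter()  # count(ct)
    for triple, context, obs_tone in observations:
        if obs_tone != tone:
            continue  # positive and negative feedback are handled separately
        pair_counts[(triple, context)] += 1
        context_counts[context] += 1
    return {(t, ct): n / context_counts[ct] for (t, ct), n in pair_counts.items()}


# Hypothetical feedback events, all recorded while the user was in Milan.
observations = [
    ("type:church", "location:Milan", "positive"),
    ("type:church", "location:Milan", "positive"),
    ("type:painting", "location:Milan", "positive"),
    ("type:church", "location:Milan", "negative"),
]
positive = triple_likelihoods(observations, "positive")
negative = triple_likelihoods(observations, "negative")
```

Here the positive likelihood of the church triple in the Milan context is 2/3, because two of the three positive observations in that context mention it.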

Information Filtering

$$w_{i,j} = tf_{i,j} \times idf_i$$

$$tf_{i,j} = \frac{N_{i,j}}{\sum_k N_{k,j}}$$

• Where $N_{i,j}$ is the number of times triple i is mentioned in object j

$$idf_i = \log \frac{N}{n_i}$$

• Where N is the total number of objects and $n_i$ is the number of objects in which triple i appears.

(Note: This is the weighting for a feature, not an object)

Why use this weighting?

• NZ is likely to have more occurrences than Wellington

• The “deductive closure” of Wellington includes NZ (subsumption)

• The weighting counters this by giving preference to the more specific Wellington

• Allows specific triples to be matched to more general ones
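The effect can be checked with a small tf-idf sketch over triples. This is an illustration under assumed data: the object names and place triples are hypothetical, and each object annotated with a city is also annotated with NZ to mimic the deductive closure.

```python
import math
from collections import Counter


def tf_idf(objects):
    """objects maps an object id to its list of triples (repeats allowed).

    Returns a dict mapping each object id to {triple: weight}.
    """
    n_objects = len(objects)
    doc_freq = Counter()  # n_i: number of objects in which triple i appears
    for triples in objects.values():
        doc_freq.update(set(triples))

    weights = {}
    for obj_id, triples in objects.items():
        counts = Counter(triples)      # N_{i,j}
        total = sum(counts.values())   # sum over k of N_{k,j}
        weights[obj_id] = {
            t: (n / total) * math.log(n_objects / doc_freq[t])
            for t, n in counts.items()
        }
    return weights


# Hypothetical annotations: every object carries the general NZ triple.
objects = {
    "te_papa": ["place:NZ", "place:Wellington"],
    "beehive": ["place:NZ", "place:Wellington"],
    "sky_tower": ["place:NZ", "place:Auckland"],
    "hobbiton": ["place:NZ", "place:Matamata"],
}
w = tf_idf(objects)
```

Because NZ appears in every object its idf is log(4/4) = 0, so it contributes nothing, while Wellington keeps a positive weight of 0.5 × log(2).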

Result Ranking

• Item Based

• Cosine similarity between each object and the profile of the user

• Previously calculated weightings and probabilities included in ranking
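A minimal sketch of cosine-based ranking is shown below. It treats the user profile and each object as sparse vectors keyed by triple; the weights are made-up numbers standing in for the combined weightings and probabilities, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {feature: weight}."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(profile, objects):
    """Order object names by decreasing similarity to the profile (ties by name)."""
    scored = [(cosine_similarity(profile, vec), name) for name, vec in objects.items()]
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Illustrative weights only.
profile = {"type:church": 0.8, "place:Milan": 0.6}
objects = {
    "DuomoDiMilano": {"type:church": 0.5, "place:Milan": 0.5},
    "MuseoGalileo": {"type:museum": 0.7, "place:Florence": 0.3},
    "SantaMariaNovella": {"type:church": 0.6, "place:Florence": 0.4},
}
ranking = rank(profile, objects)
```

The Duomo ranks first because it shares both triples with the profile, the second church shares one, and the museum shares none.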

Query Expansion (ontology based)

• Users interested in the Duomo di Milano may also be interested in Milan or Florence (nearby).

• Wu-Palmer similarity measure [3]

• Concepts whose similarity is above a specified threshold are selected for expansion
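The Wu-Palmer measure scores two concepts by the depth of their deepest shared ancestor: sim(a, b) = 2 · depth(lcs) / (depth(a) + depth(b)). The sketch below applies it to a tiny hand-made place taxonomy. The taxonomy, the convention that the root has depth 1, and the 0.7 threshold are assumptions for illustration, since the slides do not give the threshold value.

```python
def path_to_root(concept, parent):
    """Return the list [concept, its parent, ..., root]."""
    path = [concept]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def wu_palmer(a, b, parent):
    """Wu-Palmer similarity in a single-rooted taxonomy (root has depth 1)."""
    path_a = path_to_root(a, parent)
    ancestors_b = set(path_to_root(b, parent))
    lcs = next(c for c in path_a if c in ancestors_b)  # deepest shared ancestor

    def depth(c):
        return len(path_to_root(c, parent))

    return 2 * depth(lcs) / (depth(a) + depth(b))


def expand(concept, parent, threshold):
    """Other concepts whose similarity to `concept` is above the threshold."""
    return sorted(c for c in parent
                  if c != concept and wu_palmer(concept, c, parent) > threshold)


# Hypothetical taxonomy, child -> parent.
parent = {
    "place:World": None,
    "place:Europe": "place:World",
    "place:Italy": "place:Europe",
    "place:Milan": "place:Italy",
    "place:Florence": "place:Italy",
    "place:NZ": "place:World",
    "place:Wellington": "place:NZ",
}
expanded = expand("place:Milan", parent, threshold=0.7)
```

With this taxonomy Milan and Florence score 2 · 3 / (4 + 4) = 0.75, so a query about Milan would be expanded to include Florence and Italy but not Wellington.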


Result Clustering

• After result ranking we have the similarity between objects and user profile

• Users may want results based on different preferences

• Prevents over-specialisation in recommendations

• FastICA Algorithm [4]

Experimental Evaluation

• Recommendation experiment and linking experiment

• Tested on objects indexed with Getty Vocabularies to build concepts, terms, and descriptions of the objects.

• Museum professionals provided relevance assessments for the dataset (500 objects, 28 user profiles)

• The accuracy of the methods was measured in terms of recall, precision, and mean average precision
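For reference, the three metrics can be computed as in the sketch below. These are the standard textbook definitions, not the authors' evaluation code, and the ranked lists and relevance sets are made up.

```python
def precision_recall(ranked, relevant, k):
    """Precision and recall over the top-k items of a ranked list."""
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Average of average_precision over (ranked, relevant) pairs, one per profile."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Made-up example: two user profiles with their ranked results and relevant sets.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
p, r = precision_recall(ranked, relevant, k=2)
ap = average_precision(ranked, relevant)
map_score = mean_average_precision([(ranked, relevant), (["x", "y"], {"y"})])
```

In this example the relevant items sit at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2 = 5/6.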

Results


• When used together, query expansion, feature balancing, and clustering improve filtering accuracy.

• Clustering improved the mean average precision (MAP) of the filtering process by 11%, but only when combined with the other techniques.

• Clustering is effective for reducing over-specialisation, and increasing the diversity of the results.

User Trials: Outline

• Conducted at the Museum of Fine Arts in Malta and at Museo Galileo.

• 24 participants were recruited (11 and 13 respectively).

• Participants were given a 30-minute presentation about the system.

• Users filled out a questionnaire based on a modified version of the System Usability Scale.

User Trials: Questionnaire

User Trials: Results

Users were asked for ideas for improvements:

• Indoor map support.

• Explanations behind how other objects are related to the one being examined and relation with the user profile.

• Support for planning a tour beforehand

Improvements

• Testing generalisation in other domains

• Utilising sensors that don’t require users to read RFID tags

• Indoor map functionality

• Incorporating other sensor technologies (e.g. camera, microphones, accelerometers, compasses)

• Collaborative filtering based on other users’ profiles and recommendations

Author Conclusions

• Using ontologies to represent objects and enhance information retrieval leads to substantial improvement in recommendation accuracy. (strong evidence)

• Post-retrieval clustering increased the diversity of recommendations and improved the general retrieval performance.

Additional Resources

• SMARTMUSEUM demonstrations: https://vimeo.com/7571279 and https://vimeo.com/11101366

• Onboarding New users in Recommender Systems: http://grouplens.org/onboarding-new-users-in-recommender-systems/#more-5563

• Introduction to Recommender Systems MOOC by the University of Minnesota: https://www.coursera.org/learn/recommender-systems


References

[1] Chen, E. (2011). Winning the Netflix Prize: A Summary. Retrieved from http://blog.echen.me/2011/10/24/winning-the-netflix-prize-a-summary/

[2] Manning, C.D., Schuetze, H. (1999). Foundations of Statistical Natural Language Processing. 1st ed. The MIT Press.

[3] Wu, Z., Palmer, M. (1994). Verbs semantics and lexical selection. In: Proceedings of the 32nd annual meeting on Association for Computational Linguistics. Morristown, NJ, USA: Association for Computational Linguistics; p. 133 – 138.

[4] Hyvarinen, A., Oja, E. (1997). A fast fixed-point algorithm for independent component analysis; p. 1483 – 1492.
