POLITECNICO DI MILANO
V Facoltà di Ingegneria
Dipartimento di Elettronica e Informazione
Corso di Laurea Specialistica in Ingegneria Informatica
(Master of Science in Computing Systems Engineering)
Contextual Recommendation System
Supervisor: Prof.ssa Letizia Tanca
Master's Tesina Submitted By: Rahman Mohammed Mahmudur
Matricola: 737152
Academic Year 2011

3 A Logical Model for Context and User Preferences
3.1 Modeling Context
3.4 Context Parameters
3.6 Dynamic and Static Context Parameters
3.7 Hierarchies for Attributes
3.8 Contextual Preferences
3.8.1 Basic preferences
3.8.2 Aggregate preferences
3.9 Inheriting Preferences
4.1 Explicitly
4.2 Implicitly
4.3 Inferring
5.3 Storing the Value Functions
5.4 Storing Aggregate Preferences
6. Recommender systems
7. Modeling Contextual Information in Recommender Systems
8. Paradigms/Concepts for Incorporating Context in Recommender Systems
9.1 Design paradigms for Contextual Recommendation Systems
9.2.3 Offline Services
9.3.2 Aggregating Neighbor Ratings
9.3.3 Pearson-r Correlation Algorithm
9.3.4 Instance-based methods

Fig. 3.9.b The hierarchy tree for parameter L.
Fig. 3.9.c The hierarchy tree of location
Fig. 5.a Data cubes for each context parameter.
Fig. 5.1.a The two fact tables of our schema (one for each context parameter) and the dimension tables for Users and Restaurants.
Fig. 5.2.a A typical (left) and an extended dimension table (right)
Fig. 8.a General components of the traditional recommendation process
Fig. 8.2.a Final phase of the contextual post-filtering approach: recommendation list adjustment
Fig. 9.a Proposed architecture for a context-aware Recommender System.
Fig. 9.1.a Use Case Diagram for the Contextual Recommender System
Fig. 9.1.2.a The Sequence Diagram for Prepare Profile Acquisition use case
Fig. 9.1.2.b The Sequence Diagram for Prepare Contextual Profile use case
Fig. 9.1.2.c The Sequence Diagram for Prepare Recommendation List use case
Fig. 9.2.3 Log file
Fig. 9.2.4.a A sample of user history H
Fig. 9.3.2.a Pearson-r correlation
Abstract
Traditional recommendation systems usually ignore contextual information and focus only on
customers' past preferences. In reality, however, various factors influence a customer's
decision about which product to buy and which information to rely on. A customer's demand
may also change with context (e.g., time, location and weather). This work therefore proposes
a contextual recommendation framework that addresses this problem and provides
recommendation results that are more consistent with customers' requirements.
Recommender systems are efficient tools that overcome the information overload problem by
providing users with the most relevant content. This is generally done using the user's
preferences/ratings acquired from the log files of his former sessions. Beyond these
preferences, taking into account the user's interaction context improves the relevance of the
recommendation process. In this work, we propose a contextual recommender system based
on both user profile and context. The approach we present builds on previous work that
provides a set of personalization services. We show how these services can be deployed to
provide advanced contextual recommender systems.
Acknowledgment
First and foremost I offer my sincere gratitude to my supervisor, Prof.ssa Letizia Tanca, who
has supported me throughout my thesis with her patience and knowledge. I attribute the level
of my Master's degree to her encouragement and her dedication in advising me.
Special thanks to my supervisor, who consistently and tirelessly dedicated her time to
encourage, guide and help me throughout my stay at Politecnico di Milano. Thanks also to all
the dear friends nearby for helping and encouraging me during my stay.
Finally, not only for the thesis but for the whole opportunity to pursue this Master's program
to its end, I would like to thank Politecnico di Milano and the DSU scholarship program,
which provided me with all the necessary fees, education and other helpful support.
Chapter One
1. Introduction
Recommender systems are used to provide users with a richer experience and to make the
selection process easier for them. Recommender systems are extensively used in the
e-commerce world to recommend products to customers, such as recommending movies to Web
site visitors or books to customers. However, in many applications, such as recommending a
vacation package, personalized content on a Web site, various products in an online store, or
a movie, it may not be sufficient to consider only users and items; it is also important to
incorporate contextual information into the recommendation process. For example, in the case
of personalized content delivery on a Web site, it is important to determine what content needs
to be delivered (recommended) to a customer and when. More specifically, on weekdays a user
might prefer to read world news when she logs on in the morning and the stock market report
in the evening, and on weekends to read movie reviews and do shopping. In particular, the
same consumer may use different decision-making strategies and prefer different products or
brands under different contexts [Lussier & Olshavsky 1979, Klein & Yadav 1989, Bettman et
al. 1991]. Therefore, accurate prediction of consumer preferences undoubtedly depends on the
degree to which we have incorporated the relevant contextual information into a
recommendation method.
A better Recommendation System (RS) is one that delivers recommendations that best match
users' preferences, needs and hopes at the right moment, in the right place and on the right
media. This cannot be achieved without designing an RS that takes into account all the
information and parameters that influence users' ratings. This information may concern
demographic data, preferences about the user's domain of interest, quality and delivery
requirements, as well as the time of the interaction, the location, the media, the cognitive
status of the user and his availability. This knowledge is organized into the two concepts of
user profile and context. The user profile groups information that characterizes the user
himself, while the context encompasses a set of features that describe the environment within
which the user interacts with the system.
1.1 Objective
Our main objectives include the following: (i) We present a general architecture for a
Contextual Recommendation System based on a set of personalization services. (ii) We extend
the traditional recommendation model with a new context learning service that enables the
concrete construction of users' contexts from their log files. (iii) We improve the previously
proposed contextualization service by combining both the support and the confidence of
ratings within a given context instead of considering their frequencies only. This work reports
on the general conceptualization effort for integrating context into recommendation systems.
Taking into account both profiles and contexts in a recommendation process benefits any RS
for several reasons: (i) Users' preferences/ratings change according to their contexts. (ii) The
additive nature of traditional RS does not consider multiple ratings of the same content. (iii)
An RS may fail to provide some valuable recommendations because its similarity distance is
applied uniformly to user preferences without analyzing the discrepancies introduced by the
context.
We argue that relevant contextual information does matter in recommender systems and that
it is important to take this contextual information into account when providing
recommendations. We also explain that the contextual information can be utilized at various
stages of the recommendation process. We show various techniques for using the contextual
information, and we present a case study describing one possible way of generalizing
contextual recommendation work.
1.2 Outline
This thesis begins by introducing recommender systems and the techniques associated with
them, establishing the setting for this work. It then investigates some approaches to
exploiting context in recommender systems. It provides a general architecture of Contextual
Recommender Systems and analyzes the separate components of this model. The main focus is
to investigate new approaches based on both user profile and context. In this work I also
describe the procedure for item selection and item weighting in context-dependent
Collaborative Filtering (CF). The approach we present is based on previous work on data
personalization, which led to the definition of a Personalized Access Model that provides a
set of personalization services which can be deployed to provide advanced contextual
recommender systems.
Chapter 2 and Chapter 3, Context and A Logical Model for Context and User Preferences,
discuss the general notion of context as well as how it can be modeled in recommender
systems.
Chapter 4 and Chapter 5, Obtaining Contextual Information and The Storage Model, discuss
the types of contextual information and the procedures for obtaining them, and the way to
store our context and preference information in the database: first the storage of preferences
and then the storage of attribute hierarchies.
Chapter 6 and Chapter 7, Recommender systems and Modeling Contextual Information in
Recommender Systems, review recommender systems in detail by examining the current
recommender systems work in the literature. With a short example I show how contextual
information is modeled for a recommender system. Chapter 8 then presents concepts for
incorporating context in recommender systems.
Chapter 9, Proposed Architecture for a Contextual Recommender System, proposes a general
architecture for a contextual recommendation system and illustrates it using a use case
diagram and sequence diagrams. It also discusses which methodology works with the proposed
architecture to realize the contextual recommendation system. I also focus on a
recommendation system that combines content-based techniques and Collaborative Filtering.
The content-based approach makes it possible to learn users' profiles by analyzing content
descriptors. The resulting profiles are then matched and compared to determine similar users
in order to make collaborative recommendations. The CF approach exploits the ratings given
by the Top K neighbors of the active user in order to derive the missing rating through an
aggregation function.
Chapter 10, Conclusion, summarizes how these services can be combined to form a Contextual
Recommendation System. Further research on this topic will concentrate on the evaluation of
this approach by comparing traditional RS with the Contextual Recommendation System.
Chapter Two
2. Context
Context is any information that can be used to characterize the situation of an entity. An
entity is a person, place or object that is considered relevant to the interaction between a user
and an application, including the user and the application themselves. Examples of context
are: location, identity and state of people, companions, time, activities of the current user, the
devices being used, etc. A system is context-aware if it uses context to provide relevant
information and/or services to the user, where relevancy depends on the user’s task. In
particular, users express their preferences on specific attributes of a relation. Such preferences
depend on context, that is, they may have different values depending on context.
Context is a general term. There are various types of context that are related to data
engineering tasks. One is User Context, which includes the user’s profile, location, people
nearby, and even the current social situation. Context parameters of this type directly affect
the type of information relevant to a user. Thus the user context affects the results of query
processing. Another is Time Context, which refers to the typical characterizations of time
such as the time of day, week, month and season of the year. Time may affect the result of a
query, since relevance may also depend on the time. There are dependencies among the
various types of context parameters. For instance, time context may affect the computing
context: network traffic at weekends is lower than during weekdays. Beyond this notion,
context is also used in information modeling as a higher-level conceptual entity that describes
a group of conceptual entities from a particular standpoint.
2.1 The Characteristics of Context
The characteristics of context parameters make handling them an intricate task. Such
characteristics include the following:
Context exhibits a range of temporal characteristics. Some types of context (such as profiles)
are relatively static whereas other types of context (such as location) are dynamic.
Furthermore, storing the context history is important for predicting future values of context
and developing appropriate models for context evolution.
Context information is imperfect. There are various reasons for that. One reason stems from
the fact that context information is often dynamic, thus it quickly becomes outdated. Also,
some forms of context information are produced by crude sensor inputs, which may result in
faulty information. Furthermore, due to disconnections or failures the available information
may be imprecise. Finally, there is often the need for time-consuming transformations for
producing usable values of context. Often the overhead of such transformations may be
avoided or reduced at the cost of producing rougher estimations.
Finally, context parameters are highly interrelated. There are complex dependencies among
them that are sometimes difficult to deduce. Such dependencies may lead to conflicting and
sometimes inconsistent results. Context information often involves real-world entities. The
characteristics of context information can vary, perhaps substantially, depending on the
context source and the type of context.
Chapter Three
3. A Logical Model for Context and User Preferences
Our model is based on relating context and database relations through preferences (σ–
preference). First, we present the fundamental concepts related to context modeling. Then, we
proceed to define user preferences.
3.1 Modeling Context
The modeling of context relies on several fundamental concepts. As usual, domains represent
the available types and collections of values of the system. Context parameters refer to the
available set of attributes that the database designer will choose to represent context. At any
point in time, a context state refers to an instantiation of the context parameters at this point.
Context parameters are extended with OLAP-like hierarchies, in order to enable a richer set of
query operations to be applied over them.
3.2 Domains
A domain is a countably infinite set of values. All domains are enriched with a special
value ∗ representing NULL, whose semantics is our lack of knowledge.
3.3 Attributes and Relations
As usual, we assume a countable collection of attribute names. Each attribute $A_i$ is
characterized by a name and a domain $dom(A_i)$. A relation schema is a finite set of attributes
and a relation instance is a finite subset of the Cartesian product of the domains of the relation
schema.
3.4 Context Parameters
Context is modeled through a finite set of special-purpose attributes, called context parameters
($c_i$). For a given application $X$, we can define its context environment $C_X$ as a set of $n$
context parameters $\{c_1, c_2, \dots, c_n\}$.
3.5 Context State
In general, a context state is an assignment of values to context parameters. The context state
at time instant $t$ is a tuple with the values of the context parameters at time instant $t$,
$CS_X(t) = \{c_1(t), c_2(t), \dots, c_n(t)\}$, where $c_i(t)$ is the value of the context parameter
$c_i$ at time point $t$. For instance, assuming location and weather as context parameters, a
context state can be: $CS(current) = \{Acropolis, sunshine\}$.
3.6 Dynamic and Static Context Parameters
We distinguish between two kinds of context parameters: (a) static and (b) dynamic context
parameters. Static context parameters take a simple value from their domain. Dynamic
context parameters, on the other hand, are instantiated by the application of a function, the
result of which is an instance of the domain of the context parameter.
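The definitions of Sections 3.4 to 3.6 can be illustrated with a minimal Python sketch. The names `context_state` and `read_weather_sensor` are hypothetical and introduced only for illustration; the sketch assumes that a dynamic parameter can be represented as a zero-argument function.

```python
from typing import Callable, Dict, Union

# A context parameter is either a static value or a zero-argument function
# that computes the value on demand (dynamic). All names are hypothetical.
ParameterSource = Union[str, Callable[[], str]]


def context_state(environment: Dict[str, ParameterSource]) -> Dict[str, str]:
    """Instantiate every context parameter of the environment at this moment."""
    state = {}
    for name, source in environment.items():
        # Dynamic parameters are functions; static ones are used as-is.
        state[name] = source() if callable(source) else source
    return state


def read_weather_sensor() -> str:
    """Stand-in for a real sensor or web service call (assumption)."""
    return "sunshine"


# Context environment with the two parameters of the running example.
environment = {
    "location": "Acropolis",         # static: a simple value from the domain
    "weather": read_weather_sensor,  # dynamic: result of applying a function
}

current = context_state(environment)
print(current)
```

With these assumed inputs the resulting state corresponds to the example state {Acropolis, sunshine} given in Section 3.5.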
3.7 Hierarchies for Attributes
It is possible for an attribute to participate in an associated hierarchy of levels of aggregated
data, i.e., it can be viewed at different levels of detail. Formally, an attribute hierarchy is a
lattice of attributes, called levels for the purpose of the hierarchy: $L = (L_1, \dots, L_n, ALL)$.
We require that the upper bound of the lattice is always the level ALL, so that we can group all
the values into the single value “all”. The lower bound of the lattice is called the detailed level
of the parameter. For instance, let us consider the hierarchy location of Fig. 3.7.a. The levels of
location are Region, City, Country, and ALL. Region is the most detailed level. Level ALL is the
coarsest level of a hierarchy. Aggregating to the level ALL of a hierarchy ignores the respective
parameter in the grouping (i.e., it practically groups the data with respect to all the other
parameters, except for this particular one).
Fig 3.7.a. Hierarchies on Location
The relationship between the values of the context levels is achieved through the use of the set
of $anc_{L_1}^{L_2}$ functions. A function $anc_{L_1}^{L_2}$ assigns a value of the domain of
$L_2$ to a value of the domain of $L_1$. For instance, $anc_{Region}^{City}(Acropolis) = Athens$.
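As a small illustration of the ancestor functions, the following Python sketch stores the parent links between adjacent levels in dictionaries and composes them. It assumes a simple linear chain Region, City, Country, ALL rather than a general lattice, and the names `LEVELS`, `PARENT` and `anc` are hypothetical.

```python
# Parent links between adjacent levels; the values come from the running
# example, while the dictionary layout is an assumption for illustration.
LEVELS = ["Region", "City", "Country", "ALL"]

PARENT = {
    "Region": {"Acropolis": "Athens", "Kifisia": "Athens"},
    "City": {"Athens": "Greece", "Ioannina": "Greece"},
    "Country": {"Greece": "all", "Cyprus": "all"},
}


def anc(lower: str, upper: str, value: str) -> str:
    """Return the ancestor of `value` (a member of level `lower`) at level `upper`."""
    start, end = LEVELS.index(lower), LEVELS.index(upper)
    if start > end:
        raise ValueError("upper must not be more detailed than lower")
    # Follow one parent link per level between `lower` and `upper`.
    for level in LEVELS[start:end]:
        value = PARENT[level][value]
    return value


print(anc("Region", "City", "Acropolis"))     # Athens
print(anc("Region", "Country", "Acropolis"))  # Greece
```

Composing the per-level links in this way means that only the links between adjacent levels have to be stored.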
3.8 Contextual Preferences
In this section, we define how a context state affects the results of a query. In our model, each
user expresses his/her preference by providing a numeric score between 0 and 1. This score
expresses a degree of interest, which is a real number. Value 1 indicates extreme interest.
Conversely, value 0 indicates no interest for a preference. A special value for a preference
means that there is a user’s veto for the preference. Furthermore, the value ∅ means that
any value is acceptable. More specifically, we divide preferences into basic (concerning a
single context parameter) and aggregate (concerning a combination of context parameters):
3.8.1 Basic preferences
Each basic preference is described by (a) a context parameter $c_i$, (b) a set of non-context
parameters $A_i$, and (c) a degree of interest, i.e., a real number between 0 and 1. So, for the
context parameter $c_i$, we have:
$preference_{basic_i}(c_i, A_{k+1}, \dots, A_n) = interest\text{-}score_i$
3.8.2 Aggregate preferences
Each aggregate preference is derived from a combination of basic preferences. The aggregate
preference is expressed by a set of context parameters $c_i$ and a set of non-context parameters
$A_i$, and has a degree of interest:
$preference(c_1, \dots, c_k, A_{k+1}, \dots, A_n) = interest\text{-}score$
The interest score of the aggregate preference is a value function of the individual scores (the
degrees of the basic preferences). The value function prescribes how to combine basic
preferences to produce the aggregate score, according to the user’s profile. Users define in
their profile how the basic scores contribute to the aggregate, giving a weight to each context
parameter. So, if the weight for a context parameter $c_i$ is $w_i$, the interest score will be:
$interest\text{-}score = w_1 \cdot interest\text{-}score_1 + \dots + w_k \cdot interest\text{-}score_k$
In our motivating example, there are two context parameters, location and weather. The
non-context parameters are attributes about restaurants and users (in this case the user is
Mary) that are stored in the database. From Mary's profile we know that when she is at
Acropolis she gives the restaurant BeauBrummel the score 0.8, and when the weather is
cloudy the same restaurant has score 0.9. To explain Mary's high scores for these
preferences, we note that the restaurant BeauBrummel is located in Athens, near Acropolis,
and that Mary likes to eat French cuisine when the weather is cloudy (BeauBrummel serves
French cuisine). So, the basic preferences are:
$preference_{basic_1}(Acropolis, BeauBrummel, Mary) = 0.8$ and
$preference_{basic_2}(cloudy, BeauBrummel, Mary) = 0.9$
In this way, if the weight of location is 0.6 and the weight of weather is 0.4, the value function
above gives the preference the score $0.6 \cdot 0.8 + 0.4 \cdot 0.9 = 0.84$. Thus, we have:
$preference(Acropolis, cloudy, BeauBrummel, Mary) = 0.84$.
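The weighted-sum value function of this example can be sketched in a few lines of Python. The function name `aggregate_score` and the dictionary keys are hypothetical; the scores and weights are the ones quoted for Mary and BeauBrummel.

```python
from typing import Dict


def aggregate_score(basic_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of basic interest scores, one per context parameter."""
    if set(basic_scores) != set(weights):
        raise ValueError("every context parameter needs both a score and a weight")
    return sum(weights[p] * basic_scores[p] for p in basic_scores)


# Mary's basic preferences for BeauBrummel, as given in the example.
basic = {"location": 0.8, "weather": 0.9}    # Acropolis -> 0.8, cloudy -> 0.9
weights = {"location": 0.6, "weather": 0.4}  # weights from Mary's profile

score = aggregate_score(basic, weights)
print(round(score, 2))  # 0.84
```

The result is rounded only for display, since floating-point arithmetic may differ from 0.84 in the last digits.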
3.9 Inheriting Preferences
When the context parameter of a basic preference participates in different levels of a
hierarchy, users can express their preference at any level, as well as at more than one level.
For example, Mary can state that the restaurant BeauBrummel has interest score 0.8 when she
is in Kifisia and 0.6 when she is in Athens. Note that in the hierarchy of location the city of
Athens is one level above the region of Kifisia.
TABLE 3.9.a NOTATIONS
Name        Notation
Attribute   $A_i$
Fig. 3.9.b The hierarchy tree for parameter L (node labels: All, L1, L2).
The tree of Fig. 3.9.b represents the different levels of the hierarchy for a context parameter.
For the parameter $L$, let $L_1, L_2, \dots, L_m, ALL$ be the different levels of the hierarchy,
each of which can take various different values. There is a hierarchy tree for each combination
of non-context parameters.
Fig. 3.9.c The hierarchy tree of location (node labels recovered from the figure: all, Greece,
Cyprus, Athens; score 0.9).
In our reference example (Fig. 3.9.c), there is a hierarchy tree for each user profile and each
specific restaurant, which represents the interest scores of the user for the restaurant
according to the context parameter’s hierarchy. The root of the tree corresponds to level ALL
with the single value all. The values of a certain dimension level $L$ are found at the same level
of the tree (e.g., Athens and Ioannina, being both members of the dimension level City, are
found at the same level of the tree in Fig. 3.9.c). The ancestor relationships $anc_{L_1}^{L_2}$ are
translated to parent-child relationships in the tree (e.g., the node Greece is the parent of the
node Athens). Each node is characterized by a score value for the preference concerning the
combination of the non-context attributes with the context value of the node. If the query
conditions refer to a level of the tree for which the user has given no explicit score, we
propose three ways to find the appropriate score for a preference. In the first approach, we
traverse the tree upwards until we find the first predecessor for which a score is specified. In
this case, we assume that a user who defines a score for a specific level implicitly defines the
same score for all the lower levels. In the second approach, we compute the average score of
all the successors at the immediately lower level. Finally, following a hybrid approach, we can
compute a weighted average score combining the scores from both the predecessor and the
successors. In any of the above cases, if no score is defined at any level of the hierarchy, there
is a default score of 0.5 for value all.
As an example, Fig. 3.9.c depicts a hierarchy for a user (Mary) and a restaurant
(BeauBrummel). The restaurant BeauBrummel has score 0.8 when Mary is near Acropolis, 0.7
when she is in Kifisia, and 0.9 when she is in Ioannina. The root of the hierarchy has the
default score 0.5. These degrees of interest, except the last one, have been explicitly defined
by the user in her profile. If the query conditions refer to Athens, for which there is no score,
the first approach gives score 0.5, because this is the first available predecessor’s score
(Greece has no score either, so the traversal reaches the root). If we choose the second
approach, this leads to score (0.8 + 0.7)/2 = 0.75, while the third one produces a weighted
combination of the above scores.
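The three ways of deriving a missing score can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Node` class and the function names are hypothetical, the second approach averages only children that carry an explicit score, and the mixing weight `alpha` of the hybrid approach is an assumed parameter, since the text does not fix the weights.

```python
from typing import List, Optional


class Node:
    """One value of the context hierarchy, with an optional explicit score."""

    def __init__(self, value: str, score: Optional[float] = None) -> None:
        self.value = value
        self.score = score  # None means the user gave no score
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = []

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def from_predecessor(node: Node) -> float:
    """First approach: walk upwards to the first ancestor that has a score."""
    current = node.parent
    while current is not None:
        if current.score is not None:
            return current.score
        current = current.parent
    raise LookupError("the root must carry a default score")


def from_successors(node: Node) -> Optional[float]:
    """Second approach: average the explicit scores one level below."""
    scores = [c.score for c in node.children if c.score is not None]
    return sum(scores) / len(scores) if scores else None


def hybrid(node: Node, alpha: float = 0.5) -> float:
    """Third approach: weighted mix of both; alpha is an assumed weight."""
    up, down = from_predecessor(node), from_successors(node)
    return up if down is None else alpha * up + (1 - alpha) * down


# Mary's tree for BeauBrummel, with the scores quoted in the text.
root = Node("all", 0.5)  # default score at the root
greece = root.add(Node("Greece"))
athens = greece.add(Node("Athens"))
athens.add(Node("Acropolis", 0.8))
athens.add(Node("Kifisia", 0.7))
greece.add(Node("Ioannina", 0.9))

print(from_predecessor(athens))  # 0.5
print(from_successors(athens))   # 0.75
```

For the query on Athens the first two functions reproduce the scores 0.5 and 0.75 of the example; the value returned by `hybrid` depends entirely on the assumed `alpha`.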
Chapter Four
4. Obtaining Contextual Information
The contextual information can be obtained in a number of ways, including:
4.1 Explicitly i.e., by directly approaching relevant people and other sources of contextual
information and explicitly gathering this information either by asking direct questions or
eliciting this information through other means. For example, a website may obtain contextual
information by asking a person to fill out a web form or to answer some specific questions
before providing access to certain web pages.
4.2 Implicitly from the data or the environment, such as a change in location of the user
detected by a mobile telephone company. Alternatively, temporal contextual information can
be implicitly obtained from the timestamp of a transaction. Nothing needs to be done in these
cases in terms of interacting with the user or other sources of contextual information – the
source of the implicit contextual information is accessed directly and the data is extracted
from it.
4.3 Inferring the context using statistical or data mining methods. For example, the
household identity of a person flipping the TV channels (husband, wife, son, daughter, etc.)
may not be explicitly known to a cable TV company, but it can be inferred with reasonable
accuracy by observing the TV programs watched and the channels visited, using various data
mining methods. In order to infer this contextual information, it is necessary to build a
predictive model (i.e., a classifier) and train it on the appropriate data. The success of inferring
this contextual information depends very significantly on the quality of such a classifier, and it
also varies considerably across different applications. For example, it has been demonstrated
that various types of contextual information can be inferred with a reasonably high degree of
accuracy in certain applications and using certain data mining methods, such as Naive Bayes
classifiers and Bayesian Networks.
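To make the idea of inferring context with a classifier concrete, the following is a minimal sketch of a categorical Naive Bayes classifier with add-one smoothing, written in plain Python. The feature names (`genre`, `slot`) and the tiny training set are invented purely for illustration and do not come from any real viewing data; a real system would train on actual logs and would probably use an existing library.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Observation = Dict[str, str]  # feature name -> categorical value


class NaiveBayes:
    """Categorical Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, data: List[Tuple[Observation, str]]) -> "NaiveBayes":
        self.class_counts = Counter(label for _, label in data)
        self.value_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.feature_values = defaultdict(set)    # feature -> values seen in training
        for features, label in data:
            for name, value in features.items():
                self.value_counts[(label, name)][value] += 1
                self.feature_values[name].add(value)
        return self

    def predict(self, features: Observation) -> str:
        total = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, count in self.class_counts.items():
            logp = math.log(count / total)  # log prior of the class
            for name, value in features.items():
                seen = self.value_counts[(label, name)][value]
                vocab = len(self.feature_values[name])
                logp += math.log((seen + 1) / (count + vocab))  # smoothed likelihood
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Invented toy observations: (program genre, time slot) -> household member.
training = [
    ({"genre": "sports", "slot": "evening"}, "husband"),
    ({"genre": "sports", "slot": "night"}, "husband"),
    ({"genre": "news", "slot": "evening"}, "wife"),
    ({"genre": "drama", "slot": "night"}, "wife"),
    ({"genre": "cartoon", "slot": "afternoon"}, "son"),
    ({"genre": "cartoon", "slot": "evening"}, "son"),
]

model = NaiveBayes().fit(training)
print(model.predict({"genre": "cartoon", "slot": "afternoon"}))  # son
```

The predicted household member can then be used as the value of a context parameter, in the same way as an explicitly supplied one.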
Finally, the contextual information can be “hidden” in the data in some latent form, and we
can use it implicitly to better estimate the unknown ratings without explicitly knowing this
contextual information. For instance, in the previous example, we may want to estimate how
much a person likes a particular TV program by modeling the member of the household
(husband, wife, etc.) watching the TV program as a latent variable. It has also been shown
that this deployment of latent variables, such as the intent of purchasing a product (e.g., for
yourself vs. as a gift, work-related vs. pleasure, etc.), whose true values were unknown but
that were explicitly modeled as a part of a Bayesian Network (BN), indeed improved the
predictive performance of that BN classifier. Therefore, even without any explicit knowledge
of the contextual information (e.g., which member of the household is watching the program),
recommendation accuracy can still be improved by modeling and inferring this contextual
information implicitly using carefully chosen learning techniques (e.g., by using latent
variables inside well-designed recommendation models). A similar approach of using latent
variables has also been presented in the literature.
Assume that the context is defined with a predefined set of contextual attributes, the structure
of which does not change over time. The implication of this assumption is that we need to
identify and acquire contextual information before actual recommendations are made. If the
acquisition process of this contextual information is done explicitly or even implicitly, it
should be conducted as a part of the overall data collection process. All this implies that the
decisions about which contextual information is relevant and should be collected for an
application should be made at the application design stage, well in advance of the time when
actual recommendations are provided.
In particular, Adomavicius et al. propose that a wide range of contextual attributes should be
initially selected by the domain experts as possible candidates for the contextual attributes for
the application. For example, in a movie recommendation application described in Example 1,
we can initially consider such contextual attributes as Time, Theater, Companion, Weather, as
well as a broad set of other contextual attributes that can possibly affect the movie watching
experiences, as initially identified by the domain experts for the application. Then, after
collecting the data, including the rating data and the contextual information, we may apply
various types of statistical tests identifying which of the chosen contextual attributes are truly
significant in the sense that they indeed affect movie watching experiences, as manifested by
significant deviations in ratings across different values of a contextual attribute. For example,
we may apply pairwise t-tests to see if good weather vs. bad weather or seeing a movie alone
vs. with a companion significantly affect the movie watching experiences (as indicated by
statistically significant changes in rating distributions). This procedure provides an example of
screening all the initially considered contextual attributes and filtering out those that do not
matter for a particular recommendation application. For example, we may conclude that the
Time, Theater and Companion contexts matter, while the Weather context does not in the
considered movie recommendation application.
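The screening step can be sketched as follows. The sketch computes Welch's two-sample t statistic (a variant of the t-test that does not assume equal variances) with the Python standard library. The ratings are made-up toy numbers used only to show the mechanics, and the cut-off `CRITICAL_VALUE` is an assumed rough threshold; a real analysis would compare the statistic with the t distribution for the computed degrees of freedom, for example through a statistics package.

```python
import math
from statistics import mean, variance
from typing import Dict, Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's two-sample t statistic (unequal variances allowed)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical 1-5 ratings, split by the value of one contextual attribute.
ratings_by_context: Dict[str, Tuple[Sequence[float], Sequence[float]]] = {
    "Companion": ([3, 2, 3, 2, 3, 2], [5, 4, 5, 4, 4, 5]),  # alone vs. accompanied
    "Weather": ([4, 3, 4, 3, 4, 3], [3, 4, 3, 4, 4, 3]),    # good vs. bad weather
}

CRITICAL_VALUE = 2.0  # assumed rough cut-off, not a proper critical value

significant = {
    attribute: abs(welch_t(a, b)) > CRITICAL_VALUE
    for attribute, (a, b) in ratings_by_context.items()
}
print(significant)
```

Attributes flagged as not significant would be dropped from the context model, as described for the Weather context above.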
Chapter Five
5. The Storage Model
There is a straightforward way to store our context and preference information in the database.
We organize preferences as data cubes, following the OLAP paradigm, and implement our
context model in relational DBMS structures. First, we discuss the storage of preferences and
then the storage of attribute hierarchies.
Fig. 5.a Data cubes for each context parameter.
5.1 Storing Basic Preferences
In this model, we store basic user preferences in hypercubes, or simply, cubes. The number of
data cubes is equal with the number of context parameters, i.e., we have one cube for each
parameter, as shown in Fig. 3. In each cube, there is a dimension for restaurants, a dimension
for users and a dimension for the context parameter. In each cell of the cube, we store the
degree of interest for a specific preference. So, we can have the knowledge of score for a user,
a restaurant and a context parameter. Formally, a cube is defined as a finite set of attributes C
= (AC, A1, . . ., An, M), where AC is a context parameter, A1, . . . , An are non-context
attributes and M is the interest score. The values of a cube are the values of the corresponding
preference rules. A relational table implements such a cube in a straightforward fashion. The
primary key of the table is AC, A1, . . ., An. If dimension tables representing hierarchies exist,
we employ foreign keys for the attributes corresponding to these dimensions. Our schema
which is a modification of the classical star schema is depicted in Fig. 4.a. As we can see,
there are two fact tables, FactLocation and FactWeather. The dimension tables are Users
and Restaurants. These are dimension tables for both fact tables.
Fig. 5.1.a The two fact tables of our schema (one for each context parameter) and the
dimension tables for Users and Restaurants.
User (uid, name, phone, address, email)
Restaurants (rid, name, phone, region, cuisine)
Location (lid, country, city, region)
FactWeather (rid, uid, weather, score)
FactLocation (rid, uid, lid, score)
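As an illustration only, the schema of Fig. 5.1.a could be created and queried roughly as follows. This is a minimal sketch using Python's built-in sqlite3 module; the column names follow the figure, while the SQL types, the choice of primary keys, the sample rows and the helper basic_score are assumptions made for the example.

```python
import sqlite3

# Column names follow Fig. 5.1.a; the SQL types and the sample rows
# below are illustrative assumptions.
SCHEMA = """
CREATE TABLE User        (uid INTEGER PRIMARY KEY, name TEXT, phone TEXT, address TEXT, email TEXT);
CREATE TABLE Restaurants (rid INTEGER PRIMARY KEY, name TEXT, phone TEXT, region TEXT, cuisine TEXT);
CREATE TABLE Location    (lid INTEGER PRIMARY KEY, country TEXT, city TEXT, region TEXT);
CREATE TABLE FactWeather (
    rid INTEGER REFERENCES Restaurants(rid),
    uid INTEGER REFERENCES User(uid),
    weather TEXT,
    score REAL,
    PRIMARY KEY (weather, rid, uid)      -- (AC, A1, ..., An)
);
CREATE TABLE FactLocation (
    rid INTEGER REFERENCES Restaurants(rid),
    uid INTEGER REFERENCES User(uid),
    lid INTEGER REFERENCES Location(lid),
    score REAL,
    PRIMARY KEY (lid, rid, uid)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO User (uid, name) VALUES (1, 'Mary')")
conn.execute("INSERT INTO Restaurants (rid, name) VALUES (10, 'Beau Brummel')")
conn.execute("INSERT INTO Location (lid, country, city, region) VALUES (100, 'Greece', 'Athens', 'Plaka')")
conn.execute("INSERT INTO FactLocation VALUES (10, 1, 100, 0.8)")
conn.execute("INSERT INTO FactWeather VALUES (10, 1, 'warm', 0.9)")


def basic_score(fact_table, context_column, context_value, uid, rid):
    """Look up one cube cell: the interest score for (context value, user, restaurant)."""
    # fact_table and context_column are fixed by the caller's code, not user input.
    row = conn.execute(
        f"SELECT score FROM {fact_table} WHERE {context_column} = ? AND uid = ? AND rid = ?",
        (context_value, uid, rid),
    ).fetchone()
    return None if row is None else row[0]
```

Each fact table holds one cube, so a basic preference lookup is a single keyed SELECT, and a missing cell simply yields no row.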
5.2 Storing Context Hierarchies
An advantage of using cubes to store user preferences is that they allow hierarchies to
introduce different levels of abstraction of the captured context data. In this way, we can
define a hierarchy on a given context dimension. Context dimension hierarchies allow the
application to combine data from the fact and the dimension tables on one of the context
parameters. The typical way to store such a dimension in a database is shown in
Fig. 5.2.a (left).
Fig. 5.2.a A typical (left) and an extended dimension table (right)
In this modeling, we assign an attribute to each level in the hierarchy. We also assign an
artificial key to efficiently implement references to the dimension table. The contents of the
table are the values of the $anc_{L_1}^{L_2}$ functions of the hierarchy. Denormalized tables of
this kind, which participate in a database schema often called a star schema, suffer from the
fact that there exists exactly one row for each value of the lowest level of the hierarchy.
Therefore, if we want to express preferences at a higher level of the hierarchy, we need to
extend this modeling (assume, for example, that we wish to express the preferences of Mary
when she is in Cyprus, independently of the specific region or city of Cyprus she is in).
To this end, our model uses an extension of this approach, as shown on the right of Fig.
5.2.a. In this kind of dimension table, we introduce an extra tuple for each value at any level
of the hierarchy and populate the attributes of lower levels with NULLs. To indicate the
particular level at which a value participates, we also introduce a level indicator attribute.
Dimension levels are assigned attribute numbers through a topological sort of the lattice.
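The extension described above can be sketched as follows. This is a hedged illustration, not the implementation used in this work: the hierarchy Region → City → Country, the sample values, the function name extend_dimension and the key name lid are all assumptions. Python's None stands in for NULL.

```python
# Hypothetical location hierarchy, lowest level first: Region -> City -> Country.
LEVELS = ["region", "city", "country"]

# One entry per lowest-level value, as in the typical dimension table.
typical_rows = [
    {"region": "Plaka", "city": "Athens", "country": "Greece"},
    {"region": "Kifisia", "city": "Athens", "country": "Greece"},
    {"region": "Old Town", "city": "Nicosia", "country": "Cyprus"},
]


def extend_dimension(rows, levels):
    """Return rows with one extra tuple per value at every hierarchy level.

    Attributes of levels below the tuple's own level are None (NULL), and
    'level' records the level the value belongs to (1 = lowest).
    """
    seen = set()
    extended = []
    for level_no, _ in enumerate(levels, start=1):
        for row in rows:
            # Keep this level and its ancestors, blank out everything below it.
            values = tuple(
                row[name] if idx >= level_no else None
                for idx, name in enumerate(levels, start=1)
            )
            if values not in seen:
                seen.add(values)
                extended.append(dict(zip(levels, values), level=level_no))
    # Artificial key, assigned in insertion order.
    return [dict(r, lid=key) for key, r in enumerate(extended, start=1)]


extended_rows = extend_dimension(typical_rows, LEVELS)
```

With this data the extended table gains a tuple (NULL, NULL, Cyprus) at level 3, which is the row a fact table would reference to store a preference of Mary for Cyprus as a whole.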
5.3 Storing the Value Functions
The computation of aggregate preferences refers to the composition of simple basic
preferences in order to compute the aggregate one. The technique used for this involves a
weight for each of the parameters. Each aggregate preference involves (a) a set of k context
parameters, i.e., cubes, and (b) a set of non-context parameters Ak+1, . . . , An, common to all
context cubes:
preference(c1, . . . , ck, Ak+1, . . . , An) = interest_score
The non-context parameters pin the values of the aggregate scores to specific numbers and
then the individual scores for each context parameter are collected from each context table.
Recall that the formula for computing an aggregate preference is:
interest_score = w1 ∗ interest_score1 + . . . + wk ∗ interest_scorek.
Therefore, the only extra information that needs to be stored concerns the weights employed
for the computation of the formula. To this end, we employ a special purpose table
AggScores(wC1, . . . , wCk, Ak+1, . . . , An). The value for each context parameter wCi is the
weight for the respective interest score and the value for each non-context attribute Aj is the
specific value uniquely determining the aggregate preference. For instance, in our running
example, the table AggScores has the attributes Location_weight, Weather_weight, User and
Restaurant. A record in this table can be (0.6, 0.4, Mary, Beau Brummel). Assume that from
Mary's profile we know that Beau Brummel has interest score 0.8 at the current location and
0.9 at the current weather. Then the aggregate score is: 0.6 ∗ 0.8 + 0.4 ∗ 0.9 = 0.84
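The weighted sum can be checked with a few lines of code. The sketch below reuses the numbers of the running example; the function name aggregate_score and the dictionary layout of the AggScores record are assumptions made for illustration.

```python
def aggregate_score(weights, scores):
    """Weighted sum w1*s1 + ... + wk*sk over the k context parameters."""
    if weights.keys() != scores.keys():
        raise ValueError("weights and scores must cover the same context parameters")
    return sum(weights[c] * scores[c] for c in weights)


# One AggScores record: (Location_weight, Weather_weight, User, Restaurant).
agg_record = {"Location": 0.6, "Weather": 0.4, "User": "Mary", "Restaurant": "Beau Brummel"}
weights = {c: agg_record[c] for c in ("Location", "Weather")}

# Basic scores looked up from the two context cubes for Mary and Beau Brummel.
basic_scores = {"Location": 0.8, "Weather": 0.9}

score = aggregate_score(weights, basic_scores)  # approximately 0.84
```

Because the result is a floating-point number, it should be compared with a tolerance rather than tested for exact equality with 0.84.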
5.4 Storing Aggregate Preferences
The full set of aggregate preferences is not explicitly stored in our system. The main reason is
space and time efficiency, since this would require maintaining a context cube for each context
state and for each combination of non-context attributes. Assume that the context environment
CX has n context parameters {c1, c2, . . . , cn} and that the cardinality of the domain dom(ci) of
each parameter ci is (for simplicity) m. This means that there are $m^n$ potential context states,
leading to a very large number of context cubes and prohibitively high maintenance costs.
Note that some of the $m^n$ context states may not be useful, since they may correspond to
combinations of values of context parameters that are not valid or have a very small
probability of being queried. Furthermore, some context parameters or context states may be
more popular for some non-context parameters (e.g., users) than for others, which makes
storing all states for all non-context parameters unjustifiable. Finally, retrieving specific
entries of such cubes is not very efficient, since it would require building and maintaining
indexes on various combinations of the context parameters. For these reasons, we choose to
store only previously computed aggregate scores. We also propose using an auxiliary data
structure, which we call the context tree, to index them.
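To make the trade-off concrete, the following sketch counts the potential context states for a small, hypothetical context environment and stores aggregate scores only once they have been computed. A plain dictionary stands in for the context tree, whose structure is not described here; the domains and the function name cached_aggregate are assumptions.

```python
from itertools import product

# Hypothetical context environment: n = 2 parameters, m = 3 values each.
domains = {
    "location": ["Plaka", "Kifisia", "Old Town"],
    "weather": ["warm", "cold", "rainy"],
}

# With m values per parameter there are m**n potential context states.
all_states = list(product(*domains.values()))

# Only aggregate scores that were actually requested are stored.
computed_scores = {}


def cached_aggregate(state, user, item, compute):
    """Return the aggregate score for (state, user, item), computing it at most once."""
    key = (state, user, item)
    if key not in computed_scores:
        computed_scores[key] = compute(state, user, item)
    return computed_scores[key]
```

Even in this toy setting only the queried states occupy space, whereas materializing everything would need one entry per state for every user and item combination.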
6. Recommender systems
6.1. Introduction
Current information systems deal with a huge amount of content and consequently deliver a
large number of results in response to user queries. As a result, users struggle to distinguish
relevant content from secondary content. Recommender systems (RS) are tools designed to
overcome this information overload problem by providing users with the most relevant
content. Recommendations are computed by predicting a user's ratings for some content.
Rating predictions are usually based on a user profiling model that summarizes the user's
past behavior.
Recommender systems use the past experiences and preferences of the target users as a basis
for personalized recommendations and, at the same time, address the information overload
problem. Recommender systems are not limited to e-commerce; they are also applicable to
finding the most appropriate results in various search systems. In the inference process, the
effects of contextual information should also be considered and used as a criterion for
recommendation in order to provide appropriate results. The multidimensional
recommendation model (MD recommendation model) proposed by Adomavicius and Tuzhilin
(2001) serves as the foundation for a recommendation structure with multidimensional data
collection and analysis abilities, and addresses movie recommendation problems using
hierarchy processing and aggregate calculation capabilities. Context, as the dynamic
information that describes the situation of items and users and affects the user's decision
process, is essential for recommender systems to use.
6.2. Basics of Recommender Systems
Traditionally, recommender systems deal with applications that have two types of entities,
users and items. The recommender system first acquires the ratings that users have given to
items they have already experienced in order to analyze their interests and preferences, and
then recommends items of the same classification that the users have not yet rated. In
other words, a traditional recommender system can be represented as a two-dimensional
“Users×Items” matrix of ratings, and its task is to compute the rating function R(u, i) of all
users for the items. It can be written as R: Users × Items → Ratings.
According to Balabanovic and Shoham [1997], the approaches to recommender systems are
usually classified as Content-based, Collaborative, and Hybrid.
6.2.1 Content-Based Recommender Systems:
In content-based recommendation methods, the rating R(u, i) of item i for user u is typically
estimated based on the ratings R(u, i’) assigned by the same user u to other items i’ ∈ Items
that are “similar” to item i in terms of their content. More formally, let Content(i) be the set of
attributes characterizing item i. It is usually computed by extracting a set of features from item
i (its content) and is used to determine the appropriateness of the item for recommendation
purposes. Since many content-based systems are designed for recommending text-based
items, including Web pages and Usenet news messages, the content in these systems is
usually described with keywords. The user has to rate a sufficient number of items before a
content-based recommender system can really understand her preferences and present reliable
recommendations. This is often referred to as the new user problem, since a new user, having
very few ratings, often is not able to get accurate recommendations. Some of the problems of
content-based methods can be remedied using collaborative methods.
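A minimal sketch of this idea is given below. It assumes that Content(i) is a set of keywords, that similarity is the Jaccard coefficient of two keyword sets, and that the estimate is a similarity-weighted mean of the user's own ratings; none of these choices is prescribed by the text, and the item names and ratings are invented.

```python
def jaccard(a, b):
    """Keyword-set similarity in [0, 1]; an assumed choice of content similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_content_based(user_ratings, content, target_item):
    """Estimate R(u, i) as a similarity-weighted mean of the user's own ratings R(u, i')."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other_item, rating in user_ratings.items():
        if other_item == target_item:
            continue
        weight = jaccard(content[target_item], content[other_item])
        weighted_sum += weight * rating
        total_weight += weight
    # No similar rated item: no prediction (the new user problem in miniature).
    return weighted_sum / total_weight if total_weight > 0 else None


# Content(i): keywords describing each item (hypothetical data).
content = {
    "page_a": {"python", "tutorial", "lists"},
    "page_b": {"python", "tutorial", "dicts"},
    "page_c": {"gardening", "roses"},
    "page_d": {"python", "lists", "sorting"},
}
ratings_u = {"page_a": 5, "page_c": 1}
```

For a user with no ratings the function returns None, which mirrors the new user problem described above.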
6.2.2 Collaborative Recommender Systems:
Traditionally, many collaborative recommender systems try to predict the rating of an item for
a particular user based on how other users previously rated the same item. More formally, the
rating R(u, i) of item i for user u is estimated based on the ratings R(u’, i) assigned to the
same item i by those users u’ who are “similar” to user u. According to Breese et al. [1998],
algorithms for collaborative recommendations can be grouped into two general classes:
memory-based (or heuristic-based) and model-based.
Memory-based algorithms are heuristics that make rating predictions based on the entire
collection of items previously rated by the users. That is, the value of the unknown rating
$r_{u,i}$ for user u and item i is usually computed as an aggregate of the ratings of some other
(e.g., the N most similar) users for the same item i:
$$r_{u,i} = \operatorname*{aggr}_{u' \in \hat{U}} r_{u',i}$$
where $\hat{U}$ denotes the set of N users that are the most similar to user u and who have rated
item i (N can range anywhere from 1 to the number of all users). The similarity measure
between the users u and u’, sim(u, u’), determines the “distance” between users u and u’ and
is used as a weight for the ratings $r_{u',i}$: the more similar users u and u’ are, the more weight
rating $r_{u',i}$ will carry in the prediction of $r_{u,i}$.
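One possible instance of this scheme is sketched below. The text leaves both sim(u, u’) and the aggregation function open; here cosine similarity over co-rated items and a similarity-weighted average are assumed, and the function names are hypothetical.

```python
import math


def cosine_sim(ratings_a, ratings_b):
    """Cosine similarity over co-rated items; one common choice for sim(u, u')."""
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_memory_based(ratings, user, item, n_neighbors=2):
    """Aggregate r_{u',i} over the N most similar users u' who rated the item."""
    candidates = [
        (cosine_sim(ratings[user], their), their[item])
        for other, their in ratings.items()
        if other != user and item in their
    ]
    # U-hat: the N most similar raters of the item.
    neighbors = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:n_neighbors]
    total_sim = sum(sim for sim, _ in neighbors)
    if total_sim == 0:
        return None
    # aggr = similarity-weighted average of the neighbors' ratings.
    return sum(sim * r for sim, r in neighbors) / total_sim
```

Nothing is learned in advance: every call recomputes the similarities from the stored ratings, which is what makes the method memory-based.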
In contrast to memory-based methods, model-based algorithms use the collection of ratings to
learn a model, which is then used to make rating predictions. In comparison to model-based
methods, memory-based algorithms can therefore be thought of as “lazy learning” methods, in
the sense that they do not build a model but instead perform the heuristic computations at the
time recommendations are sought. One example of a model-based recommendation technique
is the probabilistic approach to collaborative filtering, in which the unknown ratings are
calculated as:
$$r_{u,i} = E(r_{u,i}) = \sum_{x=0}^{n} x \cdot \Pr(r_{u,i} = x \mid r_{u,i'},\ i' \in I_u)$$
where it is assumed that rating values are integers between 0 and n, $I_u$ denotes the set of
items previously rated by user u, and the probability expression is the probability that user u
will give a particular rating to item i given the previous ratings of items rated by user u.
Although pure collaborative recommender systems do not have some of the shortcomings of
content-based systems, such as limited content analysis or over-specialization, they do have
other limitations. In addition to the new user problem (the same issue as in content-based
systems), collaborative recommender systems also tend to suffer from the new item problem,
since they rely solely on rating data to make recommendations. Therefore, the recommender
system is not able to recommend a new item until it has been rated by a substantial number of
users. The sparsity of ratings is another important problem that collaborative recommender
systems frequently face, since the number of user-specified ratings is usually very small
compared to the number of ratings that need to be predicted.
6.2.3 Hybrid Recommender Systems:
Content-based and collaborative methods can be combined into a hybrid approach in several
different ways. One way to build a hybrid recommender system is to implement separate
collaborative and content-based recommender systems. Then, we can have two different
scenarios. First, we can combine the outputs (ratings) obtained from the individual recommender
systems into one final recommendation using either a linear combination of ratings or a voting
scheme. Alternatively, we can use one of the individual recommender systems, at any given
moment choosing the one that is “better” than the others based on some recommendation
quality metric. Hybrid methods can provide more accurate recommendations than pure
collaborative and content-based approaches. In addition, various factors affect the performance
of recommender systems, including the product domain, user characteristics, user search mode
and the number of users.
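The two scenarios can be sketched as follows. Both functions are illustrative assumptions: the weight alpha, the recommender names and the quality scores are hypothetical, and the text does not fix how the quality metric is obtained.

```python
def linear_hybrid(cf_rating, cb_rating, alpha=0.5):
    """Combine two predicted ratings; alpha is the weight of the collaborative one."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * cf_rating + (1.0 - alpha) * cb_rating


def switching_hybrid(predictions, quality):
    """Use the prediction of whichever recommender currently scores best on a quality metric.

    predictions: recommender name -> predicted rating
    quality:     recommender name -> quality score (higher is better)
    """
    best = max(predictions, key=lambda name: quality[name])
    return predictions[best]
```

For example, linear_hybrid(4.0, 2.0, alpha=0.5) averages the two predictions, while switching_hybrid returns the content-based prediction whenever the content-based recommender has the higher quality score.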
All of the approaches described in this section focus on recommending items to users or users
to items and do not take into consideration additional contextual information, such as time,
place, the company of other people, and other factors affecting recommendation experiences.
To address these issues, Adomavicius and Tuzhilin proposed a multidimensional approach
to recommendations in which the traditional two-dimensional user/item paradigm is extended
to support additional dimensions capturing the context in which recommendations are made.
This multidimensional approach is based on the multidimensional data model used for data
warehousing and On-Line Analytical Processing (OLAP) applications in databases, on
hierarchical aggregation capabilities, and on user, item and other profiles defined for each of
these dimensions. They also describe how the standard multidimensional OLAP model is
adjusted when applied to recommender systems. Finally, to provide more extensive and
flexible types of recommendations that can be requested by the user on demand, Adomavicius
and Tuzhilin present a Recommendation Query Language (RQL) that allows users to express
complex recommendations that can take into account multiple dimensions, aggregation
hierarchies, and extensive profiling information.
6.3 Challenges of Recommender Systems
Quality of a recommender system: Can the recommender system be trusted to produce
accurate recommendations? The recommender system must minimize the number of
recommendations for items that it predicts the user will like but that the user actually does
not like. These are known as false positives.
Sparsity: Since users may not rate some items, the user-item matrix may have many missing
ratings and be very sparse. Therefore, finding correlations between users and items becomes
quite difficult and can lead to weak recommendations.
Synonymy: Recommender systems are usually not able to discover associations between
similar items that may just have different names.
First Rater Problem: An item cannot be recommended unless it has been rated before. The
problem usually occurs when new items are added to the system or when there are items that
are rarely viewed and therefore may not have been rated.
New user problem: The new user problem is common to all recommender systems. New users
have no profile, so no items can be recommended to them. Moreover, a content-based
recommender system requires users to rate a sufficient number of items before it can
understand their preferences and present reliable recommendations. Therefore, a new user,
who has very few ratings, might not get accurate recommendations.
New item problem: In addition to the new user problem, collaborative recommenders suffer
from a peculiar issue that affects new items. Since collaborative systems recommend the
items most preferred by the user’s neighborhood, a new item cannot be recommended to
anyone because nobody has rated it. Therefore, until the new item is rated by a substantial
number of users, the recommender system is not able to recommend it. Content-based
recommenders do not suffer from this problem because new items have their own features
to be matched with user profiles.
7. Modeling Contextual Information in Recommender Systems
The recommendation problem is reduced to the problem of estimating ratings for the items
that have not been seen by a user. This estimation is usually based on the ratings given by this
user to other items, ratings given to this item by other users, and possibly on some other
information as well (e.g., user demographics, item characteristics). Note that, while a
substantial amount of research has been performed in the area of recommender systems, the
vast majority of the existing approaches focus on recommending items to users or users to
items and do not take into consideration any additional contextual information, such as
time, place, or the company of other people (e.g., for watching movies). Motivated by this, we
explore the area of contextual recommendation systems, which deal with modeling and
predicting user tastes and preferences by incorporating available contextual information into
the recommendation process as explicit additional categories of data. These long-term
preferences and tastes are usually expressed as ratings and are modeled as a function of not
only items and users, but also of the context. In other words, ratings are defined with the
rating function
R : User × Item × Context → Rating,
where User and Item are the domains of users and items respectively, Rating is the domain of
ratings, and Context specifies the contextual information associated with the application. To
illustrate these concepts, consider the following example.
Example 7.1. Consider an application for recommending movies to users, where users and
movies are described as relations having the following attributes:
• Movie: the set of all the movies that can be recommended; it is defined as
Movie (MovieID, Title, Length, ReleaseYear, Director, Genre)
• User: the people to whom movies are recommended; it is defined as
User (UserID, Name, Address, Age, Gender, Profession)
Further, the contextual information consists of the following three types, which are also
defined as relations having the following attributes:
• Theater: the movie theaters showing the movies; it is defined as
Theater (TheaterID, Name, Address, Capacity, City, State, Country)
• Time: the time when the movie can be or has been seen; it is defined as
Time (Date, DayOfWeek, TimeOfWeek, Month, Quarter, Year)
Here, attribute DayOfWeek has values Mon, Tue, Wed, Thu, Fri, Sat, Sun, and
attribute TimeOfWeek has values “Weekday” and “Weekend”.
• Companion: represents a person or a group of persons with whom one can see a
movie. It is defined as
Companion (companionType)
where attribute companionType has values “alone”, “friends”, “girlfriend/boyfriend”,
“family”, “co-workers”, and “others”.
Then the rating assigned to a movie by a person also depends on where and how the movie has
been seen, with whom, and at what time. For example, the type of movie to recommend to
college student Jane Doe can differ significantly depending on whether she is planning to see
it on a Saturday night with her boyfriend or on a weekday with her parents.
As we can see from this example and other cases, the contextual information Context can be
of different types, each type defining a certain aspect of context, such as time, location (e.g.,
Theater), companion (e.g., for seeing a movie), purpose of a purchase, etc. Further, each
contextual type can have a complicated structure reflecting the complex nature of the
contextual information. Although this complexity can take many different forms, one popular
defining characteristic is the hierarchical structure of contextual information, which can be
represented as trees, as is done in most context-aware recommender and profiling systems.
For instance, the contexts from Example 7.1 can have the following hierarchies associated
with them:
Theater: TheaterID → City → State → Country
Time: Date → DayOfWeek → TimeOfWeek, Date → Month → Quarter → Year.
(For the sake of completeness, we would like to point out that not only the contextual
dimensions, but also the traditional User and Item dimensions can have their attributes form
hierarchical relationships. For example, the two main dimensions from Example 7.1 can have
the following hierarchies associated with them: Movie: MovieID → Genre; User:
UserID → Age, UserID → Gender, UserID → Profession.)
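The two Time hierarchies can be made concrete with a small roll-up function. This is only a sketch: the function names are assumptions, and the quarter is taken to be the calendar quarter.

```python
from datetime import date


def day_of_week(d):
    """Date -> DayOfWeek (Mon..Sun)."""
    return ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"][d.weekday()]


def time_of_week(d):
    """Date -> DayOfWeek -> TimeOfWeek."""
    return "Weekend" if day_of_week(d) in ("Sat", "Sun") else "Weekday"


def quarter(d):
    """Date -> Month -> Quarter (calendar quarter, an assumption)."""
    return (d.month - 1) // 3 + 1


def roll_up(d):
    """All ancestors of a date along both Time hierarchies."""
    return {
        "Date": d.isoformat(),
        "DayOfWeek": day_of_week(d),
        "TimeOfWeek": time_of_week(d),
        "Month": d.month,
        "Quarter": quarter(d),
        "Year": d.year,
    }
```

Calling roll_up(date(2011, 6, 11)) maps a Saturday in June 2011 to the values Sat, Weekend, month 6, quarter 2 and year 2011, one per level of the two hierarchies.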
Furthermore, we follow the representational view of Dourish and assume that the context is
defined with a predefined set of observable attributes, the structure of which does not change
significantly over time. Although some papers in the literature take the interactional approach
to modeling contextual recommendations, for example by modeling context through a
short-term memory (STM) approach borrowed from psychology, most of the work on
context-aware recommender systems follows the representational view. As stated before, we
also adopt this representational view and assume that there is a predefined finite set of
contextual types in a given application and that each of these types has a well-defined
structure.
Contextual information has also been defined in the following way. In addition to the classical
User and Item dimensions, additional contextual dimensions, such as Time, Location, etc.,
were introduced using the OLAP-based multidimensional data (MD) model widely used in
data warehousing applications in databases. Formally, let D1, D2, …, Dn be dimensions, two of
these dimensions being User and Item, and the rest being contextual. Each dimension Di is a
subset of a Cartesian product of some attributes (or fields) Aij (j = 1, …, ki), i.e.,
Di ⊆ Ai1 × Ai2 × … × Aiki, where each attribute defines a domain (or a set) of values.
Moreover, one or several attributes form a key, i.e., they uniquely define the rest of the
attributes. In some cases, a dimension can be defined by a single attribute, and ki = 1 in such
cases. For example, consider the three-dimensional recommendation space User × Item × Time,
where the User dimension is defined as User ⊆ UName × Address × Income × Age and consists
of a set of users having certain names, addresses, incomes, and ages. Similarly, the Item
dimension is defined as Item ⊆ IName × Type × Price and consists of a set of items defined by
their names, types and prices. Finally, the Time dimension can be defined as
Time ⊆ Year × Month × Day and consists of a list of days from the starting to the ending date
(e.g., from January 1, 2011 to December 31, 2011).
Given dimensions D1, D2, …, Dn, we define the recommendation space for these
dimensions as a Cartesian product S = D1 × D2 × … × Dn. Moreover, let Rating be a rating
domain, representing the ordered set of all possible rating values. Then the rating function is
defined over the space D1 × D2 × … × Dn as R : D1 × … × Dn → Rating. For instance,
continuing the User × Item × Time example considered above, we can define a rating
function R on the recommendation space User × Item × Time specifying how much user
u ∈ User liked item i ∈ Item at time t ∈ Time, R(u, i, t).
The rating function R introduced above is usually defined as a partial function, where the
initial set of ratings is known. Then, as usual in recommender systems, the goal is to estimate
the unknown ratings, i.e., to make the rating function R total. The main difference between the
multidimensional (MD) contextual model described above and the previously described
contextual model is that contextual information in the MD model is defined using
classical OLAP hierarchies, whereas the contextual information in the previous case is defined
with more general hierarchical taxonomies, which can be represented as trees (both balanced
and unbalanced), directed acyclic graphs (DAGs), or various other types of taxonomies.
Further, the ratings in the MD model are stored in multidimensional cubes, whereas the
ratings in the other contextual model are stored in more general hierarchical structures. We
would also like to point out that not all contextual information might be relevant or useful for
recommendation purposes.
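The notions of recommendation space and partial rating function can be illustrated with a toy example. The sketch below identifies each dimension by a single key attribute and uses invented users, items and ratings; the function names and the constant estimator in the usage note are assumptions.

```python
from itertools import product

# Tiny hypothetical dimensions, each identified by its key attribute only.
users = ["u1", "u2"]
items = ["i1", "i2", "i3"]
times = ["Weekday", "Weekend"]

# Recommendation space S = User x Item x Time.
space = list(product(users, items, times))

# R as a partial function: only the initially known ratings are present.
known_ratings = {
    ("u1", "i1", "Weekend"): 5,
    ("u1", "i2", "Weekday"): 2,
    ("u2", "i1", "Weekend"): 4,
}


def unknown_points(space, known):
    """Points of S where R is undefined, i.e. the ratings left to estimate."""
    return [point for point in space if point not in known]


def make_total(space, known, estimate):
    """Extend R to all of S using a supplied estimator for the missing points."""
    total = dict(known)
    for point in unknown_points(space, known):
        total[point] = estimate(*point)
    return total
```

Here S has 2 × 3 × 2 = 12 points, of which 3 are known; any estimator, even a constant such as lambda u, i, t: 3, turns R into a total function while leaving the known ratings unchanged.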
8. Paradigms/Concepts for Incorporating Context in Recommender Systems
Different approaches to using contextual information in the recommendation process can be
broadly categorized into two groups:
(1) recommendation via context-driven querying and search, and
(2) recommendation via contextual preference elicitation and estimation.
The context-driven querying and search approach has been used by a wide variety of mobile
and tourist recommender systems. Systems using this approach typically use contextual
information (obtained either directly from the user, e.g., by specifying current mood or
interest, or from the environment, e.g., obtaining local time, weather, or current location) to
query or search a certain repository of resources (e.g., restaurants) and present the best
matching resources (e.g., nearby restaurants that are currently open) to the user.
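A context-driven query of this kind might look like the sketch below. It assumes a flat local coordinate grid, integer opening hours within one day, and a hypothetical Restaurant record; a real system would use geographic distance and richer opening times.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Restaurant:
    name: str
    x_km: float      # position on a flat local grid (simplifying assumption)
    y_km: float
    opens: int       # opening hour, 0-23
    closes: int      # closing hour, 1-24, assumed later than opens


def context_query(repository, here, hour, max_km=2.0):
    """Return restaurants that are open at `hour` within max_km of `here`, nearest first."""
    def distance(r):
        return hypot(r.x_km - here[0], r.y_km - here[1])

    matches = [
        r for r in repository
        if r.opens <= hour < r.closes and distance(r) <= max_km
    ]
    return sorted(matches, key=distance)
```

Note that no preferences are learned here: the current context (location and local time) acts purely as a query against the repository.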
The other general approach to using contextual information in the recommendation process,
i.e., via contextual preference elicitation and estimation, represents a more recent trend in
context-aware recommender systems literature. In contrast to the previously discussed
context-driven querying and search approach (where the recommender systems use the current
context information and specified current user’s interest as queries to search for the most
appropriate content), techniques that follow this second approach attempt to model and learn
user preferences, e.g., by observing the interactions of this and other users with the systems or
by obtaining preference feedback from the user on various previously recommended items. To
model users’ context-sensitive preferences and generate recommendations, these techniques
typically either adapt existing collaborative filtering, content-based, or hybrid
recommendation methods to context-aware recommendation settings or apply various
intelligent data analysis techniques from data mining or machine learning (such as Bayesian
classifiers or support vector machines).
While both general approaches offer a number of research challenges, in the remainder of this
chapter we will focus on the second, more recent trend of contextual preference elicitation
and estimation in recommender systems. We do want to mention that it is possible to design
applications that combine the techniques from both general approaches (i.e., both
context-driven querying and search as well as contextual preference elicitation and
estimation) into a single system.
To start the discussion of the contextual preference elicitation and estimation techniques, note
that, in its general form, a traditional 2-dimensional (2D) (User× Item) recommender system
can be described as a function, which takes partial user preference data as its input and
produces a list of recommendations for each user as an output. Accordingly, Figure 8.a
presents a general overview of the traditional 2D recommendation process, which includes
three components: data (input), 2D recommender system (function), and recommendation list
(output). Note that, as indicated in Figure 8.a, after the recommendation function is defined
(or constructed) based on the available data, recommendation list for any given user u is
typically generated by using the recommendation function on user u and all candidate items to
obtain a predicted rating for each of the items and then by ranking all items according to their
predicted rating value. Later in this section, we will discuss how the use of contextual
information in each of those three components gives rise to three different paradigms for
context-aware recommender systems.
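The last step of this process, turning a recommendation function into a ranked list, can be sketched as follows. The function name recommend and its parameters are assumptions; predict stands for whatever 2D recommendation function has been constructed from the data.

```python
def recommend(predict, user, candidate_items, already_rated, top_k=3):
    """Rank unseen candidate items for one user by predicted rating.

    predict: the 2D recommendation function, predict(user, item) -> rating
    """
    unseen = [i for i in candidate_items if i not in already_rated]
    scored = [(predict(user, i), i) for i in unseen]
    # Highest predicted rating first; ties broken by item name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:top_k]]
```

The three components of the process appear directly in the signature: the data is hidden inside predict, predict is the recommender, and the returned list is the output.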
Fig. 8.a General components of the traditional recommendation process: Data (U × I × R) →
2D Recommender → Recommendations for user u.
Traditional recommender systems are built based on the knowledge of partial user
preferences, i.e., user preferences for some (often limited) set of items, and the input data for
traditional recommender systems is typically based on records of the form <user; item;
rating>. In contrast, context-aware recommender systems are built based on the knowledge
of partial contextual user preferences and typically deal with data records of the form <user;
item; context; rating>, where each specific record includes not only how much a given user
liked a specific item, but also the contextual information in which the item was consumed by
this user (e.g., Context = Saturday). Also, in addition to the descriptive information about
users (e.g., demographics), items (e.g., item features), and ratings (e.g., multi-criteria rating
information), context-aware recommender systems may also make use of additional context
attributes, such as context hierarchies (e.g., Saturday → Weekend). Based on the presence of
this additional contextual data, several important questions arise: How should contextual
information be reflected when modeling user preferences? Can we reuse the wealth of
knowledge in traditional (non-contextual) recommender systems to generate context-aware
recommendations?
In the presence of available contextual information, following the diagrams in Figure 8.b, we
start with data of the form U × I × C × R, where C is an additional contextual dimension,
and end up with a list of contextual recommendations i1, i2, i3, … for each user. However,
unlike the process in Figure 8.a, which does not take the contextual information into account,
we can apply the information about the current (or desired) context c at various stages of the
recommendation process. More specifically, the context-aware recommendation process that
is based on contextual user preference elicitation and estimation can take one of three
forms, depending on which of the three components the context is used in, as shown in
Figure 8.b:
Contextual pre-filtering (or contextualization of recommendation input): In this
recommendation paradigm (presented in Figure 8.b.a), contextual information drives data
selection or data construction for that specific context. In other words, information about the
current context c is used for selecting or constructing the relevant set of data records (i.e.,
ratings). Then, ratings can be predicted using any traditional 2D recommender system on the
selected data.
Contextual post-filtering (or contextualization of recommendation output): In this
recommendation paradigm (presented in Figure 8.b.b), contextual information is initially
ignored, and the ratings are predicted using any traditional 2D recommender system on the
entire data. Then, the resulting set of recommendations is adjusted (contextualized) for each
user using the contextual information.
Contextual modeling (or contextualization of recommendation function): In this
recommendation paradigm (presented in Figure 8.b.c), contextual information is used directly
in the modeling technique as part of rating estimation.
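Of the three paradigms, adjusting the output of a 2D recommender is perhaps the easiest to sketch in isolation. The example below is one hypothetical way to contextualize a recommendation list: it assumes that a separately estimated affinity between items and contexts is available, drops items with low affinity and re-ranks the rest. The names contextual_post_filter, context_affinity and min_affinity are invented for this illustration.

```python
def contextual_post_filter(ranked, context, context_affinity, min_affinity=0.5):
    """Adjust a 2D recommendation list for one context.

    ranked:           [(item, predicted_rating), ...] from any 2D recommender
    context_affinity: (item, context) -> value in [0, 1]; how well the item
                      fits the context (a hypothetical, separately estimated input)
    Items below min_affinity are dropped; the rest are re-ranked by
    predicted_rating * affinity.
    """
    adjusted = []
    for item, rating in ranked:
        affinity = context_affinity.get((item, context), 0.0)
        if affinity >= min_affinity:
            adjusted.append((item, rating * affinity))
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)
```

The 2D recommender itself is untouched, which is why this paradigm, like pre-filtering, can reuse any traditional recommendation technique.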
Fig. 8.b Paradigms for incorporating context in recommender systems: (a) contextual
pre-filtering, (b) contextual post-filtering, (c) contextual modeling. In each case the input is
the data U × I × C × R and the output is a list of contextual recommendations i1, i2, i3, …
for user u in context c.
8.1 Contextual Pre-Filtering
As shown in Figure 8.b, the contextual pre-filtering approach uses contextual information to
select or construct the most relevant 2D (User × Item) data for generating recommendations.
One major advantage of this approach is that it allows deployment of any of the numerous
traditional recommendation techniques previously proposed in the literature. In particular, in
one possible use of this approach, context c essentially serves as a query for selecting
(filtering) relevant ratings data. An example of a contextual data filter for a movie
recommender system would be: if a person wants to see a movie on Saturday, only the
Saturday rating data is used to recommend movies. Note that this example represents an exact
pre-filter. In other words, the data filtering query has been constructed using exactly the
specified context.
For example, following the contextual pre-filtering paradigm, Adomavicius et al. proposed a
reduction-based approach, which reduces the problem of multidimensional (MD) contextual
recommendations to the standard 2D User × Item recommendation space. Therefore, as with
any contextual pre-filtering approach, one important benefit of the reduction-based approach
is that all the previous research on 2D recommender systems is directly applicable in the MD
case after the reduction is done.
In particular, let $R^{D}_{User \times Item}: U \times I \rightarrow Rating$ be any 2D rating estimation function that,
given existing ratings $D$ (i.e., $D$ contains records <user; item; rating> for each of the known,
user-specified ratings), can calculate a prediction for any rating, e.g., $R^{D}_{User \times Item}(John, StarWars)$.
Then, a 3-dimensional rating prediction function supporting the context of
time can be defined similarly as $R^{D}_{User \times Item \times Time}: U \times I \times T \rightarrow Rating$, where $D$ contains
records <user; item; time; rating> for the user-specified ratings. Then the 3-dimensional
prediction function can be expressed through a 2D prediction function in several ways,
including:
$$\forall (u,i,t) \in U \times I \times T, \quad R^{D}_{User \times Item \times Time}(u,i,t) = R^{D[Time=t](User,\,Item,\,Rating)}_{User \times Item}(u,i)$$
Here [Time = t] denotes a simple contextual pre-filter, and D[Time = t](User, Item, Rating)
denotes a rating dataset obtained from D by selecting only the records where Time dimension
has value t and keeping only the values for User and Item dimensions, as well as the value of
the rating itself. That is, if we treat a dataset of 3-dimensional ratings D as a relation, then
D[Time = t](User, Item, Rating) is simply another relation obtained from D by performing two
relational operations: selection and, subsequently, projection.
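The exact pre-filter described above can be sketched in a few lines of Python. This is only an illustration: the sample ratings, the function names, and the stand-in 2D estimator (a plain item average over other users) are assumptions, and any traditional 2D technique could take the estimator's place.

```python
# Hypothetical 3-dimensional ratings D: (user, item, time, rating) records.
D = [
    ("John", "StarWars", "Saturday", 5),
    ("John", "Titanic", "Wednesday", 2),
    ("Mary", "StarWars", "Saturday", 3),
    ("Mary", "Titanic", "Saturday", 4),
    ("Anna", "StarWars", "Wednesday", 1),
]


def prefilter(ratings, t):
    """D[Time = t](User, Item, Rating): select rows with Time == t,
    then project onto the (user, item, rating) columns."""
    return [(u, i, r) for (u, i, time, r) in ratings if time == t]


def predict_2d(ratings_2d, user, item):
    """Stand-in 2D estimator (an assumption): mean rating that other
    users gave to the item. Returns None when no rating is available."""
    others = [r for (u, i, r) in ratings_2d if i == item and u != user]
    return sum(others) / len(others) if others else None


def predict_3d(ratings, user, item, t):
    """3D prediction expressed through the 2D estimator on the
    pre-filtered data."""
    return predict_2d(prefilter(ratings, t), user, item)
```

Calling `predict_3d(D, "John", "StarWars", "Saturday")` uses only the Saturday ratings, so the Wednesday records never reach the 2D estimator.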
However, the exact context can sometimes be too narrow. Consider, for example, the context
of watching a movie with a girlfriend in a movie theater on Saturday, i.e., c = (Girlfriend,
Theater, Saturday). Using this exact context as a data filtering query may be problematic for
several reasons. First, certain aspects of the overly specific context may not be significant. For
example, a user's movie watching preferences with a girlfriend in a theater on Saturday may be
exactly the same as on Sunday, but different from Wednesday's. Therefore, it may be more
appropriate to use a more general context specification, i.e., Weekend instead of Saturday.
Second, the exact context may not have enough data for accurate rating prediction, which is
known as the “sparsity” problem in the recommender systems literature. In other words, the
recommender system may not have enough data points about the past movie watching
preferences of a given user with a girlfriend in a theater on Saturday.
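One way to picture the generalization from Saturday to Weekend is a pre-filter that falls back to the parent context when the exact context is too sparse. The sketch below is an assumption-laden illustration: the one-level hierarchy, the `min_ratings` threshold, and the sample data are all hypothetical and not taken from the text.

```python
# Hypothetical one-level hierarchy for the Time dimension.
PARENT = {
    "Saturday": "Weekend", "Sunday": "Weekend",
    "Monday": "Weekday", "Tuesday": "Weekday", "Wednesday": "Weekday",
    "Thursday": "Weekday", "Friday": "Weekday",
}


def generalize(day):
    """Map a day to its parent context (unknown values map to themselves)."""
    return PARENT.get(day, day)


def generalized_prefilter(ratings, t, min_ratings=3):
    """Use the exact context when it has at least min_ratings records
    (threshold is an assumption); otherwise widen to the parent context."""
    exact = [(u, i, r) for (u, i, time, r) in ratings if time == t]
    if len(exact) >= min_ratings:
        return exact
    general = generalize(t)
    return [(u, i, r) for (u, i, time, r) in ratings
            if generalize(time) == general]


RATINGS = [
    ("Jane", "Amelie", "Saturday", 5),
    ("Jane", "Alien", "Sunday", 2),
    ("Tom", "Amelie", "Sunday", 4),
    ("Tom", "Alien", "Wednesday", 5),
]
```

With `min_ratings=2`, a query for Saturday finds only one exact record and therefore returns the three Weekend records instead.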
8.2 Contextual Post-Filtering
As shown in Figure 8.b.b, the contextual post-filtering approach ignores context information
in the input data when generating recommendations, i.e., when generating the ranked list of all
candidate items from which any number of top-N recommendations can be made, depending
on specific values of N. Then, the contextual post-filtering approach adjusts the obtained
recommendation list for each user using contextual information. The recommendation list
adjustments can be made by:
• Filtering out recommendations that are irrelevant (in a given context), or
• Adjusting the ranking of recommendations on the list (based on a given context).
For example, in a movie recommendation application, if a person wants to see a movie on a
weekend, and on weekends she only watches comedies, the system can filter out all non-
comedies from the recommended movie list. More generally, the basic idea for contextual
post-filtering approaches is to analyze the contextual preference data for a given user in a
given context to find specific item usage patterns (e.g., user Jane Doe watches only comedies
on weekends) and then use these patterns to adjust the item list, resulting in more “contextual”
recommendations, as depicted in Figure 8.2.a.
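Fig. 8.2.a Final phase of the contextual post-filtering approach: recommendation list
adjustment (inputs: Data U × I × C × R, context C, user U; intermediate result: item usage
patterns; output: contextual recommendations i1, i2, i3, ...)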
Contextual post-filtering approaches can be
classified into heuristic and model-based techniques. Heuristic post-filtering approaches focus
on finding common item characteristics (attributes) for a given user in a given context (e.g.,
preferred actors to watch in a given context), and then use these attributes to adjust the
recommendations, including:
• Filtering out recommended items that do not have a significant number of these
characteristics (e.g., to be recommended, the movies must have at least two of the preferred
actors in a given context), or
• Ranking recommended items based on how many of these relevant characteristics they have
(e.g., the movies that star more of the user’s preferred actors in a given context will be ranked
higher).
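The two heuristic adjustments above can be sketched as follows. The data layout (a dictionary from item to its set of actors) and the `min_matches` default of two, which mirrors the "at least two of the preferred actors" example, are assumptions made for illustration.

```python
def count_matches(item_attributes, preferred):
    """Number of preferred attributes (e.g., actors) an item has."""
    return len(set(item_attributes) & set(preferred))


def heuristic_filter(recommendations, attributes_by_item, preferred, min_matches=2):
    """Keep only items with at least min_matches preferred attributes."""
    return [
        item for item in recommendations
        if count_matches(attributes_by_item.get(item, ()), preferred) >= min_matches
    ]


def heuristic_rank(recommendations, attributes_by_item, preferred):
    """Rank items by how many preferred attributes they have.
    sorted() is stable, so ties keep their original order."""
    return sorted(
        recommendations,
        key=lambda item: -count_matches(attributes_by_item.get(item, ()), preferred),
    )


# Hypothetical example data.
recs = ["M1", "M2", "M3"]
actors = {"M1": {"A", "B"}, "M2": {"A"}, "M3": {"A", "B", "C"}}
preferred_actors = {"A", "B", "C"}
```

Here `preferred_actors` stands for the actors a user prefers in the given context; how those are mined from usage data is outside the sketch.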
In contrast, model-based post-filtering approaches can build predictive models that calculate
the probability with which the user chooses a certain type of item in a given context, i.e.,
probability of relevance (e.g., likelihood of choosing movies of a certain genre in a given
context), and then use this probability to adjust the recommendations, including:
• Filtering out recommended items that have the probability of relevance smaller than a
pre-defined minimal threshold (e.g., remove movies of genres that have a low likelihood of being
picked), or
• Ranking recommended items by weighting the predicted rating with the probability of
relevance.
Panniello et al. provide an experimental comparison of the exact pre-filtering method versus
two different post-filtering methods, Weight and Filter, using several real-world
e-commerce datasets. The Weight post-filtering method reorders the recommended items by
weighting the predicted rating with the probability of relevance in that specific context, and
the Filter post-filtering method filters out recommended items that have a small probability of
relevance in the specific context. Interestingly, the empirical results show that the Weight
post-filtering method dominates the exact pre-filtering, which in turn dominates the Filter
post-filtering method, thus indicating that the best approach to use (pre- or post-filtering) really
depends on a given application.
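A minimal sketch of the two model-based adjustments, under the assumption that the probability of relevance per item is already available as a dictionary. The function names, the default threshold of 0.5, and the sample numbers are hypothetical; this is not the implementation evaluated by Panniello et al.

```python
def weight_postfilter(predictions, relevance):
    """Weight: reorder items by predicted rating x probability of relevance.
    predictions: {item: predicted rating}
    relevance:   {item: probability of relevance in the current context}"""
    scored = {
        item: rating * relevance.get(item, 0.0)
        for item, rating in predictions.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


def filter_postfilter(predictions, relevance, threshold=0.5):
    """Filter: drop items whose probability of relevance is below the
    threshold (value is an assumption), then rank by predicted rating."""
    kept = {
        item: rating for item, rating in predictions.items()
        if relevance.get(item, 0.0) >= threshold
    }
    return sorted(kept, key=kept.get, reverse=True)


# Hypothetical context-free predictions and contextual relevance.
predictions = {"Comedy1": 4.0, "Drama1": 5.0, "Action1": 3.0}
relevance = {"Comedy1": 0.9, "Drama1": 0.2, "Action1": 0.5}
```

In this example the drama has the highest context-free rating but ends up last under Weight and is removed entirely under Filter.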
As was the case with the contextual pre-filtering approach, a major advantage of the
contextual post-filtering approach is that it allows using any of the numerous traditional
recommendation techniques previously proposed in the literature. Also, similarly to the
contextual pre-filtering approaches, incorporating context generalization techniques into
post-filtering techniques constitutes an interesting issue for future research.
8.3 Contextual Modeling
As shown in Figure 8.b.c, the contextual modeling approach uses contextual information
directly in the recommendation function as an explicit predictor of a user’s rating for an item.
While contextual pre-filtering and post-filtering approaches can use traditional 2D
recommendation functions, the contextual modeling approach gives rise to truly
multidimensional recommendation functions, which essentially represent predictive models
(built using decision tree, regression, probabilistic model, or other technique) or heuristic
calculations that incorporate contextual information in addition to the user and item data, i.e.,
Rating = R(User, Item, Context). A significant number of recommendation algorithms –
based on a variety of heuristics as well as predictive modeling techniques – have been
developed over the last 10-15 years, and some of these techniques can be extended from the
2D to the multidimensional recommendation settings. We present a few examples of
multidimensional heuristic-based and model-based approaches for contextual modeling.
8.3.1 Heuristic-Based Approaches
The traditional two-dimensional (2D) neighborhood-based approach can be extended to the
multidimensional case, which includes the contextual information, in a straightforward
manner by using an n-dimensional distance metric instead of the user-user or item-item
similarity metrics traditionally used in such techniques. Consider an example of the
User×Item×Time recommendation space. Following the traditional nearest neighbor
heuristic that is based on the weighted sum of relevant ratings, the prediction of a specific
rating $r_{u,i,t}$ in this example can be expressed as:
$$r_{u,i,t} = k \sum_{(u',i',t') \neq (u,i,t)} W\big((u,i,t),(u',i',t')\big) \times r_{u',i',t'}$$
where $W((u,i,t),(u',i',t'))$ describes the “weight” that rating $r_{u',i',t'}$ carries in the prediction of
$r_{u,i,t}$, and $k$ is a normalizing factor. Weight $W((u,i,t),(u',i',t'))$ is typically inversely related
to the distance between points $(u,i,t)$ and $(u',i',t')$ in multidimensional space, i.e.,
$dist[(u,i,t),(u',i',t')]$. In other words, the closer the two points are (i.e., the smaller the
distance between them), the more weight $r_{u',i',t'}$ carries in the weighted sum. One example of
such a relationship would be $W((u,i,t),(u',i',t')) = 1/dist[(u,i,t),(u',i',t')]$, but many
alternative specifications are also possible. As before, the choice of the distance metric dist is
likely to depend on a specific application. One of the simplest ways to define a
multidimensional dist function is by using the reduction-like approach by taking into account
only the points with the same contextual information, i.e.,
$$dist[(u,i,t),(u',i',t')] = \begin{cases} dist[(u,i),(u',i')], & \text{if } t = t' \\ +\infty, & \text{otherwise} \end{cases}$$
This distance function makes $r_{u,i,t}$ depend only on the ratings from the segment of points
having the same values of time $t$. Therefore, this case is reduced to the standard 2-dimensional
rating estimation on the segment of ratings having the same context $t$ as point $(u,i,t)$.
Furthermore, if we further refine function $dist[(u,i),(u',i')]$ so that it depends only on
the distance between users when $i = i'$, then we would obtain a method that is similar to the
pre-filtering approach described earlier. Moreover, this approach easily extends to an arbitrary
n-dimensional case by setting the distance between two rating points to $dist[(u,i),(u',i')]$
if and only if the contexts of these two points are the same.
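The weighted-sum prediction with the reduction-like distance can be sketched as below. Several choices are assumptions for illustration only: the 2D distance simply counts mismatching coordinates, the weight is 1/dist, and the normalizing factor k is taken as one over the sum of the weights (the text only says that k is a normalizing factor).

```python
import math


def dist_2d(p, q):
    """Hypothetical 2D distance between (user, item) pairs:
    the number of coordinates that differ (0, 1, or 2)."""
    return (p[0] != q[0]) + (p[1] != q[1])


def dist(p, q):
    """Reduction-like distance: the 2D distance when the contexts match,
    +infinity otherwise."""
    (u, i, t), (u2, i2, t2) = p, q
    return dist_2d((u, i), (u2, i2)) if t == t2 else math.inf


def predict(ratings, target):
    """Weighted sum of known ratings with W = 1/dist and k = 1/sum(W).
    ratings: {(user, item, time): rating}. Returns None if no neighbor
    carries any weight."""
    numerator = 0.0
    total_weight = 0.0
    for point, rating in ratings.items():
        if point == target:
            continue  # the sum runs over points other than the target
        d = dist(target, point)
        weight = 1.0 / d  # 1/inf evaluates to 0.0, so other contexts drop out
        numerator += weight * rating
        total_weight += weight
    return numerator / total_weight if total_weight > 0 else None


# Hypothetical known ratings.
ratings = {
    ("John", "StarWars", "Sat"): 5,
    ("Mary", "StarWars", "Sat"): 4,
    ("Mary", "Titanic", "Sat"): 2,
    ("Mary", "StarWars", "Wed"): 1,
}
```

For the target `("John", "Titanic", "Sat")`, the Wednesday rating gets zero weight, while the three Saturday ratings contribute with weights 1, 0.5, and 1.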
Another way to define the distance function would be to use the weighted Manhattan distance
metric, i.e., $dist[(u,i,t),(u',i',t')] = w_1 d_1(u,u') + w_2 d_2(i,i') + w_3 d_3(t,t')$, where $d_1$, $d_2$ and $d_3$
are distance functions defined for dimensions User, Item, and Time respectively, and $w_1$, $w_2$
and $w_3$ are the weights assigned for each of these dimensions (e.g., according to their
importance). In summary, distance function $dist[(u,i,t),(u',i',t')]$ can be defined in many
different ways and, while in many systems it is typically computed between ratings of the
same user or of the same item, it constitutes an interesting research problem to identify
various more general ways to define this distance and compare these different ways in terms
of predictive performance.
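As a small illustration of the weighted Manhattan distance, the sketch below combines one distance function per dimension. The per-dimension distances (a 0/1 mismatch for users and items, an absolute difference of day indices for time) and the weights are assumptions chosen only to make the example concrete.

```python
def mismatch(a, b):
    """Hypothetical distance for User and Item: 0 if equal, 1 otherwise."""
    return 0 if a == b else 1


def day_gap(a, b):
    """Hypothetical distance for Time: absolute difference of day indices."""
    return abs(a - b)


def weighted_manhattan(p, q, distances, weights):
    """Sum of w_k * d_k(p_k, q_k) over all dimensions k."""
    return sum(
        w * d(a, b)
        for w, d, a, b in zip(weights, distances, p, q)
    )


# Points are (user, item, day index); weights are illustrative.
p = ("John", "StarWars", 5)
q = ("Mary", "StarWars", 6)
distance = weighted_manhattan(p, q, (mismatch, mismatch, day_gap), (1.0, 1.0, 0.5))
```

Because the function iterates over the dimensions generically, adding a fourth contextual dimension only requires one more distance function and one more weight.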
8.3.2 Model-Based Approaches
Several model-based recommendation techniques have been proposed in the recommender
systems literature for the traditional two-dimensional recommendation model. Some of these
methods can be directly extended to the multidimensional case, such as the method proposed
in, who show that their 2D technique outperforms some of the previously known collaborative
filtering methods.
In addition to possible extensions of existing 2D recommendation techniques to multiple
dimensions, there have also been some new techniques developed specifically for context-
aware recommender systems based on the context modeling paradigm. For example,
following the general contextual modeling paradigm, Oku et al. propose to incorporate
additional contextual dimensions (such as time, companion, and weather) directly into
the recommendation space and use a machine learning technique to provide recommendations in a
restaurant recommender system. In particular, they use the support vector machine (SVM)
classification method, which views the set of liked items and the set of disliked items of a user
in various contexts as two sets of vectors in an n-dimensional space, and constructs a
separating hyperplane in this space, which maximizes the separation between the two data
sets. Finally, another model-based approach is presented by Abbar et al., where a Personalized
Access Model (PAM) provides a set of personalized context-based services, including
context discovery, contextualization, binding and matching services. Abbar et al. then
describe how these services can be combined to form Context-Aware Recommender Systems
(CARS) and deployed in order to provide superior context-aware recommendations.
Figure 9.a: Proposed architecture for a context-aware Recommender System.
[Figure components: contextual inputs (Weather, Temperature, Location, Companion), Model Adapter, Prediction Engine, Recommender, Prediction List, User]
In Figure 9.a the proposed framework is shown. Here, the rectangular boxes represent system
components and each arrow indicates information flow. For example, the context provider is
a module that tracks changes of contextual variables. It contacts appropriate context services
and stores all context states in a local database. When requested by the context manager, it
provides information about known contextual variables for a user or an item at a specific time
point. The context manager performs all the reasoning related to the context. It determines if a
contextual variable is important for a prediction, removes noisy context data and makes
predictions for missing contextual variables. The User and Item components represent user
and item information in the system. The user is modelled by his or her preferences; in
collaborative filtering (CF) this is a vector of item ratings. The item model captures relevant
knowledge in the application domain. The model adapter is responsible for integrating
contextual data into the prediction algorithm. It takes information provided by the contextual
reasoner and enhances the representation of the user/item model with context. For instance, it
adds relevant contextual variables to the user model. The prediction engine takes the enhanced data model
and generates a list of rating predictions. Note that we plan to have several model adapters and
prediction engines, which generate different lists of predictions. The recommender takes all
the produced recommendation lists and combines them into a final recommendation list. It can
use information obtained from the context manager to filter out items from, or change the
ratings in, the final list. The explanation engine takes the final list of recommendations and provides the
explanations for each of them. It closely collaborates with the recommendation engine to
identify the best recommendations. In fact, a recommendation may be suggested just because
it could be easy to explain and not only because it has a high predicted rating. The
explanation engine could also use the contextual reasoner to find out needed information to
motivate a recommendation because of the particular contextual conditions. As output, the user
gets a list of recommendations with possible explanations. The explicit feedback of the user is
recorded and is used to influence the model adapter. Note that there is also a loop between the
recommender and the model adapter, which could be used to integrate different
recommendation techniques via co-training.
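The information flow between the components described above can be sketched roughly as follows. This is a simplified illustration under several assumptions: all function names and data shapes are hypothetical, the context manager here only drops irrelevant or missing variables (it does not predict missing values), and the recommender combines the prediction lists by plain averaging, which the text does not prescribe.

```python
from typing import Callable, Dict, List, Optional

Context = Dict[str, str]
Predictions = Dict[str, float]  # item -> predicted rating


def context_manager(raw: Dict[str, Optional[str]], relevant: List[str]) -> Context:
    """Keep only the variables judged relevant and drop missing values."""
    return {name: raw[name] for name in relevant if raw.get(name) is not None}


def model_adapter(user_ratings: Dict[str, float], context: Context) -> dict:
    """Enhance the user model (a vector of item ratings) with context."""
    return {"ratings": dict(user_ratings), "context": dict(context)}


def recommender(model: dict,
                engines: List[Callable[[dict], Predictions]],
                top_n: int = 3) -> List[str]:
    """Run every prediction engine on the enhanced model and combine the
    resulting lists by averaging the scores per item (an assumption)."""
    totals: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for engine in engines:
        for item, score in engine(model).items():
            totals[item] = totals.get(item, 0.0) + score
            counts[item] = counts.get(item, 0) + 1
    averages = {item: totals[item] / counts[item] for item in totals}
    return sorted(averages, key=averages.get, reverse=True)[:top_n]


# Two hypothetical prediction engines with fixed outputs.
def engine_a(model: dict) -> Predictions:
    return {"A": 4.0, "B": 2.0}


def engine_b(model: dict) -> Predictions:
    return {"A": 2.0, "B": 5.0, "C": 1.0}
```

A call such as `recommender(model_adapter({"A": 3.0}, context_manager({"weather": "rainy"}, ["weather"])), [engine_a, engine_b])` walks through the chain from raw context to a final list; the explanation engine and the feedback loop are left out of the sketch.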
9.1 Design paradigms for Contextual Recommendation Systems
Building on our understanding of the various types of recommender systems and contextual
parameters, a major portion of this work is to build a system that can be put into a real-life
context. The contextual recommender system was designed to use both the user-based and
item-based recommender algorithms.
Figure 9.1.a : Use Case Diagram for the Contextual Recommender System
9.1.1 Use Cases
At this stage, it is important to outline the scope of the work and the major user flows that
are associated with this system. In Figure 9.1.a, we outline the system, its major use cases,
and its interaction with the users. We can see how the User's use cases link with the
Context Manager's use cases.
From Figure 9.1.a, we can determine the various use cases associated with this
system:
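1. Register a New User
2. User Login
3. User profile
4. Active Context using Context Provider
5. Contextualization
6. Contextualize Profiles
7. Content Acquisition
8. Recommendation Engine
9. Create Prediction list through Recommendation
10. View the Contextual Recommended Prediction list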
Use Case 1 Register a New User
Preconditions List of programs already exist
Description: Create a new account/Register.
User enters their name, username and password for login
purposes.
System confirms login information.
Use Case 3 User profile
Preconditions User has logged into the system
Description: User selects to edit their profile from their main page.
System displays the profile page.
User selects or edits their profile information.
User submits profile.
System verifies the profile and returns the User back to the
main page.
Provides an extensive definition of the preferences that a user
has in a given domain of interest. It is a set of (attribute,
value) couples rated by the user.
Use Case 4 Active Context using Context Provider
Preconditions Context provider is a module that tracks changes of
contextual variables. It contacts appropriate context services
and stores all context states in a local database.
Description Set of features characterizing the environment within which
users interact with system. It is the context within which the
active user interacts with Contextual Recommendation
System. It is also called current context.
Use Case 5 Contextualization
Preconditions Contextualization process takes as input a user profile and
the user history corresponding to the user feedback in
contexts in which these ratings were defined. The main idea
of the contextualization is to check whether there are
correlations between the user profile elements and the user
feedback within a given context.
Use Case 6 Contextualize Profiles
Preconditions The importance of taking into account the Rating when the
user interacts from the context.
Description Context acquisition from log files. It is a discovery of
relationships between these contexts and user profiles
elements. It's a set of mappings that relate a subset of profile
preferences to the context in which they are defined.
Use Case 7 Content Acquisition
Preconditions It is based on the analysis of the content of the items. That
means that items have a set of features which model their
content.
by analyzing content descriptors. Resulting profiles are, then,
matched and compared to determine similar users in order to
make collaborative recommendations.
Use Case 8 Recommendation Engine
Preconditions User has already entered their information and has created a
profile. The goal is generated based on the input of
previous users within a user-item matrix. This matrix is used
to correlate users/items to the current user/item and generate
a recommendation.
Description The output of a recommender system can either be a
prediction or a recommendation. A prediction is the expected
rating score the active user will give the current item. A
recommendation is a single item or a list of items that have
the highest prediction. The recommender
takes all the produced recommendation lists and combines
them into a final recommendation list. It can use information
obtained from the context manager to filter out items from,
or change the ratings in, the final list.
Use Case 9 Create Prediction list through Recommendation
Preconditions Takes the final list of recommendations.
Description It provides the explanations for each of them.