representation, similarity measures and aggregation methods using fuzzy sets for content-based...

19
Fuzzy Sets and Systems 160 (2009) 76 – 94 www.elsevier.com/locate/fss Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems Azene Zenebe a , , Anthony F. Norcio b a Management Information Systems Department, Bowie State University, 14000 Jericho Park Rd., Bowie, MD 20715-9465, USA b Information Systems Department, University of Maryland at Baltimore County (UMBC), 1000 Hilltop Circle, MD 21250, USA Received 17 July 2007; received in revised form 8 March 2008; accepted 10 March 2008 Available online 26 March 2008 Abstract Representation of features of items and user feedback, and reasoning about their relationships are major problems in recommender systems. This is because item features and user feedback are subjective, imprecise and vague. The paper presents a fuzzy set theoretic method (FTM) for recommender systems that handles the non-stochastic uncertainty induced from subjectivity, vagueness and imprecision in the data, and the domain knowledge and the task under consideration. The research further advances the application of fuzzy modeling for content-based recommender systems initially presented by Ronald Yager. The paper defines a representation method, similarity measures and aggregation methods as well as empirically evaluates the methods’ performance through simulation using a benchmark movie data. FTM consist of a representation method for items’ features and user feedback using fuzzy sets, and a content-based algorithm based on various fuzzy set theoretic similarity measures (the fuzzy set extensions of the Jaccard index, cosine, proximity or correlation similarity measures), and aggregation methods for computing recommendation confidence scores (the maximum–minimum or Weighted-sum fuzzy set theoretic aggregation methods). Compared to the baseline crisp set based method (CSM) presented, the empirical evaluation of the FTM using the movie data and simulation shows an improvement in precision without loss of recall. Moreover, the paper provides a guideline for recommender systems designers that will help in choosing from a combination of one of the fuzzy set theoretic aggregation methods and similarity measures. Published by Elsevier B.V. Keywords: Fuzzy sets; Fuzzy connectives and aggregation operators; Recommender systems; Learning; Empirical evaluation 1. Introduction Recommender systems are systems that provide users with an ordered list of items and information that help them to decide which items to consider or look at based on the individual user preferences [25]. Recommendation systems use background data such as historical data consisting of ratings from users before the recommendation begins, input data such as features of items or users’ ratings in order to initiate a recommendation, and models and algorithms to combine the former two and generate a recommendation [8]. There have been many advances in recommender systems research. An extensive review of the different approaches used in recommender systems is presented in Burke [8]. Recently, Adomavicius and Tuzhilin [1] have identified Corresponding author. Tel.: +1 301 860 3641; fax: +1 301 860 3593. E-mail address: [email protected] (A. Zenebe). 0165-0114/$ - see front matter Published by Elsevier B.V. doi:10.1016/j.fss.2008.03.017

Upload: azene-zenebe

Post on 21-Jun-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

Fuzzy Sets and Systems 160 (2009) 76–94www.elsevier.com/locate/fss

Representation, similarity measures and aggregation methods usingfuzzy sets for content-based recommender systems

Azene Zenebea,∗, Anthony F. Norciob

aManagement Information Systems Department, Bowie State University, 14000 Jericho Park Rd., Bowie, MD 20715-9465, USAbInformation Systems Department, University of Maryland at Baltimore County (UMBC), 1000 Hilltop Circle, MD 21250, USA

Received 17 July 2007; received in revised form 8 March 2008; accepted 10 March 2008Available online 26 March 2008

Abstract

Representation of features of items and user feedback, and reasoning about their relationships are major problems in recommendersystems. This is because item features and user feedback are subjective, imprecise and vague. The paper presents a fuzzy set theoreticmethod (FTM) for recommender systems that handles the non-stochastic uncertainty induced from subjectivity, vagueness andimprecision in the data, and the domain knowledge and the task under consideration. The research further advances the applicationof fuzzy modeling for content-based recommender systems initially presented by Ronald Yager. The paper defines a representationmethod, similarity measures and aggregation methods as well as empirically evaluates the methods’ performance through simulationusing a benchmark movie data. FTM consist of a representation method for items’ features and user feedback using fuzzy sets,and a content-based algorithm based on various fuzzy set theoretic similarity measures (the fuzzy set extensions of the Jaccardindex, cosine, proximity or correlation similarity measures), and aggregation methods for computing recommendation confidencescores (the maximum–minimum or Weighted-sum fuzzy set theoretic aggregation methods). Compared to the baseline crisp setbased method (CSM) presented, the empirical evaluation of the FTM using the movie data and simulation shows an improvementin precision without loss of recall. Moreover, the paper provides a guideline for recommender systems designers that will help inchoosing from a combination of one of the fuzzy set theoretic aggregation methods and similarity measures.Published by Elsevier B.V.

Keywords: Fuzzy sets; Fuzzy connectives and aggregation operators; Recommender systems; Learning; Empirical evaluation

1. Introduction

Recommender systems are systems that provide users with an ordered list of items and information that help themto decide which items to consider or look at based on the individual user preferences [25]. Recommendation systemsuse background data such as historical data consisting of ratings from users before the recommendation begins, inputdata such as features of items or users’ ratings in order to initiate a recommendation, and models and algorithms tocombine the former two and generate a recommendation [8].

There have been many advances in recommender systems research. An extensive review of the different approachesused in recommender systems is presented in Burke [8]. Recently, Adomavicius and Tuzhilin [1] have identified

∗ Corresponding author. Tel.: +1 301 860 3641; fax: +1 301 860 3593.E-mail address: [email protected] (A. Zenebe).

0165-0114/$ - see front matter Published by Elsevier B.V.doi:10.1016/j.fss.2008.03.017

Page 2: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 77

various areas of improvements for current recommender systems. They are: (i) better methods for representing userbehavior and information about items; (ii) more advanced recommendation modeling methods; (iii) incorporationof contextual information into recommendation process; (iv) utilization of multi-criteria ratings; (v) development ofless intrusive and more flexible recommendation methods; and (vi) development of recommender systems effective-ness measures. This paper attempts to address the improvement need stated in (i) and (ii) using the fuzzy modelingtechnique.

1.1. The problem

A content-based recommendation requires data on the behavior of users and features of items. Its performancedepends on the data and how this data is used, i.e. represented and inferred. Representation of and reasoning aboutthe behavior of users and features of items raised a number of challenging issues. Features of items and users’ be-havior are subjective, vague and imprecise. These, in turn, induce uncertainty on representation of and reasoningabout the items’ features, users’ behavior, and their relationship. Such uncertainty is non-stochastic or non-randomand is induced from subjectivity, vagueness and imprecision in the data, the domain knowledge and the task underconsideration [20].

In relation to items, the uncertainty is associated to the extent (e.g. low to high) in which the items have some features.For instance, to what extent does a movie have drama content or is it highly drama related? In relation to the users’behavior such as interest, the uncertainty is associated to methods employed to measure and represent users’ interest asprecisely as possible. In relation to the recommendation task, uncertainty is associated to types of relationship that existbetween first, user behavior and item features, second, among users in terms of their behavior, and third, among itemsin terms of their features. Such non-stochastic uncertainty is not studied well and modeled in previous recommendersystems research such as [25,12].

1.2. The proposed method

In fuzzy modeling membership functions in fuzzy set theory are deliberately designed to treat the vagueness andimprecision in the context of the application [22]. This capability is used to provide the framework to address therepresentation and inference challenges associated to non-stochastic uncertainty [20] in recommender systems.

The study is motivated by the “reclusive methods” proposed by Yager [30]. Yager [30] discusses the potential offuzzy modeling and also presents a methodology consisting of collection of justifications and heuristic rules for therecommendations based on fuzzy set and fuzzy logic. He assumes the availability of a representation scheme for eachobject that allows the development of an appropriate tool to calculate the similarity relationship over the set of objects.He also assumes the availability of similarity index between two objects. However, he did not conduct an empiricalstudy to support or refute the effectiveness of using fuzzy modeling. This research is an attempt to further develop andempirically evaluate fuzzy set theoretic method (FTM) for content-based recommender systems using the problem ofmovie recommendation as its domain. Particularly, the proposed approach uses features of items as background dataand user’s feedback such as ratings of items as input.

The study defines a representational method, aggregation methods, and similarity measures for content-based rec-ommender systems. It also develops algorithms and carries out an empirical assessment of the effect of the fuzzyset theoretic method (FTM) on the performance of a movie recommender system by comparing its results to the re-sults of the baseline crisp set based method (CSM). An empirical assessment of the effect of the various inferencemechanisms—combinations of the aggregation methods and similarity measures is also performed. The study alsoconsiders analyzing the effect of the model size (also called training size—number of rated items initially needed froma user), and recommendation size (number of items recommended at one time to a user) on the performance of a movierecommender system.

1.3. Results, conclusion and contributions

Using actual data on movies, the results of the simulation study provide empirical evidence that supports the ef-fectiveness of FTM. The results show a modest increase in precision without loss of recall. It also requires a modesttraining size and recommendation size. Moreover, the results of analysis of variance show that there are significant

Page 3: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

78 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94

differences among the different alternative combination of fuzzy set theoretic similarity measures and aggregationmethods in their recommendation accuracy. The paper has the following main contributions:

1. Compared to using crisp set theory, the paper shows using fuzzy set theory slightly improves precision without lossof recall for content-based movies recommendation application.

2. The paper provides a representation framework for features of items and users feedback using fuzzy sets as well asnew algorithms for content-based item recommender systems.

3. The paper presents a practical and detailed description of how to apply fuzzy set theory in a new domain as well ashow to conduct extensive experimental study to validate the theory.

4. The paper provides a guideline for recommender systems designers that will help them to choose a combination ofone of the fuzzy set theoretic aggregation methods and the fuzzy set theoretic similarity measures.

The remainder of the paper is organized as follows. Section 2 presents a review of related literature. Section 3presents the representation method, inference methods, similarity measures and algorithms. Section 4 describes thedataset, evaluation settings, and evaluation metrics. Section 5 presents the results of the evaluation followed by thediscussion in Section 6. Finally, our conclusions and future research directions are presented in Section 7.

2. Related literature

2.1. Recommender systems

There are various classifications of recommendation methods. Based on the sources of data and how these dataare used for recommendation, Burke [8] has classified recommendation methods into: collaborative, content-based,demographic, utility-based, and knowledge-based. There are also various variants of hybrid methods that combine thesemethods. These hybrid methods are discussed in detail in [8]. Moreover, a recent survey of state-of-the-art recommendersystems along with suggestions for improvements is found in [1].

The two most widely used methods of recommendation are content-based and collaborative filtering (CF). In collab-orative filtering, an item is recommended to a user based on other similar users’ actions like interests, preferences andratings [24]. Because of the availability of ratings data, CF is the most fully explored and several numbers of studies arereported. Deshpande and Karypis [12] developed item-based Top-N recommendation algorithms that are collaborativetype and faster than traditional user–user collaborative algorithms with comparable recommendation hit-rate. More-over, results of evaluation of these CF algorithms for recommender systems using the MovieLens dataset are reportedin terms of precision and F1-measure in [21].

In content-based recommendation, an item is recommended to a user mainly based on the characteristics of theitem and the user’s past actions like purchases, queries, and ratings. Moreover, in content-based recommendation,standard machine learning techniques such as clustering, Bayesian networks and induction learning are applied informing attribute-based models [8]. Alspector et al. [2] used a set of seven movie features—category, MAAP rating,academy award, origin, length, director and Maltin rating, in addition to the rating. They showed that the pure CFmethod produces significantly better results (in terms of correlation measure between predicted and actual rating) thanthe ones obtained with the content-based method.

Basu et al. [6] applied inductive learning approaches that use Ripper for recommendation of movies. They showed thatcontent-based approaches result in loss of precision with modest increase in recall; collaborative approaches improveprecision with modest loss of recall; and hybrid approaches increase both precision and recall. Weng and George [27]have also reported similar result for precision. These studies indicated that the mere introduction of movie featuresalone does not improve precision. The present study attempts to show that a proper introduction of movie features doesimprove precision without loss of recall.

2.2. Fuzzy modeling

Fuzzy set theory offers a rich spectrum of methods for the management of non-stochastic uncertainty. It is well suitedto handle imprecise information, the un-sharpness of classes of objects or situations, and the gradualness of preferenceprofiles [31].

Page 4: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 79

A fuzzy set A in X is characterized by its membership function, which is defined as [31]: �A(x) : x ∈ X → [0, 1],where X is a domain space. Alternatively, set A can be characterized by a set of pairs:A = {(x, �A(x)), x ∈ X}.According to the context in which X is used and the concept presented, the fuzzy membership function, �A(x), canhave different interpretations [7]. As a degree of similarity, it represents the proximity between different pieces ofinformation. For example, the membership of movie x in the fuzzy set of “drama movies” can be estimated by thedegree of similarity. As a degree of preference, it represents the intensity of preference in favor of x, or the feasibilityof selecting x as a value of X. For instance, a movie with rating of four out of five indicates the degree of a user’ssatisfaction or liking with x based on certain criteria, like movie attributes such as content-intensity of action, drama,and humor. These two interpretations are used in this research.

The relationship between fuzzy set theory and probability has been the most controversial in uncertainty modelingmainly due to the possible confusion and differences between membership function and probability measure. It isclear now that they are two complementary theories. Dubois and Prade [16] presented some of the mathematicaldifference between possibility and the quantitative, explicit Kolmogoroff’s probability. Zimmerman [34] also indicatedthat comparing fuzzy set theory with probability theory is not easy because of the various aspects in which thecomparison can be made, because of the many different ways of mathematically defining and operating fuzzy sets,and because of the various kinds of fuzziness that can be modeled. One difference between fuzziness and probabilityis probability theory measures the chance that a proposition is correct where as fuzzy set theory measures degree ofcorrectness to which a proposition is correct [15].

Possibility theory, introduced in 1978 by Zadeh [32], made it possible to deal with uncertainties of impreciseknowledge. It allows the quantification of uncertainty as a pair of numbers, possibility and necessity. In a propositionsuch as ‘X is A’, where X is a variable and A is a fuzzy set, if all we know about the value of X is that X is A,then this corresponds to a situation where information is incomplete. Then one can associate a possibility distributionto X (denoted by �x) where the values of X are ordered according to their degree of plausibility or possibility. Thepossibility distribution function �x(u) is defined to be numerically equal to the membership function �A(u) [32]. Thatis, a membership function has a possiblistic interpretation, which assumes the presence.

Forecasting, bidding and auctions, negotiation, targeting, recommender systems and profiling are pointed out asapplication areas that can benefit from soft computing paradigm and data mining [5,29,30]. Research efforts thataddress non-stochastic uncertainty using fuzzy modeling for recommendation systems have emerged recently, e.g.[19,30]. These works use fuzzy sets to define linguistic categorizations of products in an e-commerce, and formoverlapping clusters using fuzzy clustering.

FTM is case-based as “ “similar” cases are recalled, and based on them each possible decision is evaluated.” [17, p.609]. Dubois et al. [14] presented a fuzzy-set based counterpart to the case-based decision. As a case-based decisiontheory (CBDT) driven approach, FTM does have the following features [17]: each recommendation decision is evaluatedover a different set of cases; to make recommendation decision there is no need to consider all possible states and thecorresponding outcomes; the system is not assumed to know anything about the outside world, apart from past cases;and the system learns by including new cases and updated its similarity judgment and recommendation decisionscontinuously.

FTM differs from the previous recommendation systems in its representation and inference methods. Furthermore,this study has done extensive empirical evaluation to identify the significance of the use of fuzzy representation andinference methods and other factors on the performance.

3. The representation and inference methods

The proposed fuzzy set theoretic content-based approach is based on a user’s previous feedback, and features of thenew items and features of the set of items for which the user has provided feedback. The rationale of this method isthat users are more likely to have interest in items, like movies, that are similar to the items they have experienced andliked. The representation method, inference engine consisting of aggregation methods and similarity measures, and thealgorithms are presented in this section.

3.1. Item representation using fuzzy set

A membership function in fuzzy set theory is deliberately designed to handle the vagueness and imprecision in thecontext of the application [22]. The type of function that is suitable depends on the application context, and in certain

Page 5: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

80 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94

cases the meaning captured by fuzzy sets is not too sensitive to the variations in the shape [22]. In practice, triangular,trapezoid, Gaussian, S-function, and exponential-like functions are the most commonly used membership functions.Moreover, in practice, a suitable membership function’s shape is assumed a priori and its parameters are determined bydomain experts or using machine learning techniques [22]. The former approach, i.e. domain analysis and expertise, isused in this research.

For an item described with multiple attributes, more than one attribute can be used for recommendation. Someattributes can also be multi-valued involving overlapping or non-mutually exclusive possible values. One should notethat, movies are multi-genres and multi-actors [3]. The values of multi-valued attributes in an item can be representedmore accurately within a fuzzy set framework than within a crisp set framework. Items of this type are considered inthis research. Moreover, the representation scheme presented for a movie can be generalized and applied to any itemwith similar characteristics to a movie. A few examples are music, TV shows, restaurants and books.

Let an item Ij (j = 1, . . . , M) be defined in the space of an attribute X = {x1, x2, x3, . . . , xL}, then Ij can takemultiple values such as x1, x2, . . . , and xL. The membership function of item Ij to value xk(k = 1, . . . , L), denoted by�xk

(Ij ), needs to be determined. Hence, a vector Xj = {(xk, �xk(Ij )), k = 1, . . . , L} is formed for Ij , where �xk

(Ij )

can be interpreted as the degree of similarity of Ij to a hypothetical (or prototype) pure xk type of the item; or as thedegree of presence of value xk in item Ij .

In a movie marketing application, most movies are selected for pleasure and expenditure of time. Users choosemovies they like and enjoy. Furthermore, users use subjective features of movies such as “funny”, “romantic”, and“scary” (all are kinds of movie genres) to select movies more than objective features such as the director, theatre locationand price, which are useful but are less important [10]. We use the movie as an item and movie genre as the attributeto make the method operational and develop the heuristic.

Analysis of descriptions of main film genres shows that genre g1 of movies (e.g. action) and genre g2 of movies (e.g.adventure) are overlapping in terms of their subject matter and other movie attributes [3]. Based on the result of thefindings in [10], movies highly liked by users can be grouped into similar categories by subjective features of moviessuch as genre and MPAA rating. This assertion is also verified in Section 5 using the movie dataset.

In determining the genre content of a movie, we use a heuristic based on the genres’ rank orders, available in IMdb.comand provided by the movie producers, instead of the crisp representation—the genre is presence (1) or absence (0)data value. An ideal technique is automatic content analysis of a movie for automatic identification of its genres. Theautomatic content analysis of a movie for automatic identification of its genres will provide the extent or degree ofpresence or amount of a genre in a movie. For example, amount of dramatic feature in a movie A, amount of actionfeatures in a movie A, etc. However, automatic content analysis technologies are not yet well developed and available.For example, a preliminary research on the automatic identification of movie genres by exploiting audio-visual cues ina movie is reported in [26].

In the absence of this technique, we conduct the domain analysis based on the literature in movies [3,26] and theavailable data provided by movies producers in the movie database. Given the definition of a movie in the space ofgenre (G), a movie can have one major genre denoted by x1 and one or more minor genres x2, x3, and so on, in thedecreasing order of their degrees of presence in a movie. The degree of membership of movie Ij (j = 1, . . . , M) togenre xk (k = 1, . . . , N) is denoted by �xk

(Ij ). Hence, for Ij , a vector Gj = {(xk , �xk(Ij )), k = 1, . . . , N} can be

formed. The following steps are taken in the development of the heuristic for the determination of the membershipfunction �xk

(Ij ).Step 1: Sort xk in descending order of their degree of presence in Ij . In IMDB (www.imdb.com 1 ) the genres of

movie Ij are presented in the order of their significance. For example, movie ‘King Kong (2005)’ has Action as a majorgenre, and Adventure as the first minor, Drama as the second minor, Fantasy as the third minor, and Thriller as thefourth minor genres.

Step 2: Assign higher degrees of membership value to more important genres of a movie. For instance,

If Ij has only one genre, then �xk(Ij ) = 1 and �xk

(Ij ) = 0 for all k = 2 to N .If Ij has two genres, then �xk

(Ij ) = 0.9, �xk(Ij ) = 0.4 and �xk

(Ij ) = 0 for all k = 3 to N .If Ij has three genres, then �xk

(Ij ) = 0.7 and �xk(Ij ) = 0.5, �xk

(Ij ) = 0.2 and �xk(Ij ) = 0 for all k = 4 to

N ; and so on.

1 “IMDb History,” vol. 2004: Internet Movie Database Inc., n.d. 〈http://www.imdb.com/〉.

Page 6: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 81

Table 1Membership degree of movie to genres for User 7

Movie (Ij ) Gj (vector for j th movie)Rating xj1 xj2 xj3 … xj15 xj16

56 4 0.683 1.000 0.438 … 0.000 0.00079 5 1.000 0.000 0.000 … 0.467 0.00089 3 0.683 0.000 0.000 … 0.000 1.000… … … … … … … …254 2 0.438 0.000 1.000 … 0.000 0.000

Based on the heuristics illustrated above, the possibility for item Ij to take different values of X varies and themembership function should meet the following four criteria: (1) assigning higher degree of membership to majorvalues than minor values; (2) assigning 0 to values that are not associated with the item; (3) degrees of membershipshould be normalized to the range of [0, 1]; and (4) the same value of X at similar rank positions between differentitems should have varying degrees of membership values if the number of values of X associated with the items aredifferent. We represent this type of heuristic with a Gaussian-like membership function, as shown in (1).

�xk(Ij ) = rk/2

√�∗|Lj|(rk−1), (1)

where N = |Lj | is the number of values of X associated with Ij and rk (1�rk � |Lj |) is the rank position of value xk ,and � > 1 is a parameter used as a threshold to control the difference between consecutive values of X in Ij . Moreover,� is the only parameter that needs to be determined.

For example, using Eq. (1) with � set to 1.2 (after various experimental trials), the movie ‘King Kong (2005)’ isrepresented in terms of genres: for |Lj | = 5 and xj = {(Action, 1), (Adventure, 0.366), (Drama, 0.272), (Fantasy,0.211), (Thriller, 0.168)}. Furthermore, for User 7, Table 1 shows the representation of some of the rated movies.

Ross et al. [23] stated that a membership function (MF) should match “the intuitively plausible semantic descriptionof imprecise properties of the objects in X.” The reason we make the first important genre completely satisfied is moviesare advertised exclusively as having a specific genre such as drama and comedy; as well users have similar perceptionto movies. The soundness of the representation and inference method are further investigated in Section 5 using themovie dataset.

The reciprocal function defined as 1/k for k = 1 to N is also investigated. The problem with the reciprocal functionis it does not consider the total number of different genres that exist in movies leading to the same degree of membershipvalue for the same genre at same rank position with different number of genres. A uniform distribution membership ofgenres in a movie, assuming a movie with multiple genres has equal degree of genre presence for all occurring genresrepresented by 1 or 0, is the baseline crisp set representation.

The heuristic that leads to the membership function in (1) is developed based on the analysis of the movie dataset,literature on movies [10] and preliminary experimental trials conducted on �. This heuristic, for instance, assumes thattwo genres will not have equal degree of presence in a movie. That is, two or more genres cannot exist in “equal degreeof presence” or “in equal amount” in a movie. For example, let us say a movie X has drama and comedy. The moviecannot have dramatic and comedy nature with exact equal ‘amount’, i.e. with exact degree of membership 0.5 each.This is also true in the movies database used in this research. This assumption is logical because a movie cannot haveexactly the same “content” of two or more genres. If the amount of content is close then � needs to be tuned.

Moreover, in future research, studying various membership functions is needed in order to get an optimal fuzzyset based recommender system. The basis of our study is, provided that a membership function that satisfies the fourcriteria identified above. A is used, the recommendations by FTM will be better than crisp-set based approach. Therepresentation scheme can be extended to recommender systems based on a combination of multiple attributes. Forexample, one can use movie actresses/actors as the second attribute. The actors in a movie can be represented in a vectorA = {a1, a2, . . . , ak} for K actors. The role or importance of an actor or actress ak in a movie mi can be represented bydegree of membership associated with the fuzzy variable ‘degree of role or importance’. That is, Aj = {(ak, �ak

(Ij )),for k = 1 to K}, where �ak

(Ij ) can be determined heuristically or using machine learning.

Page 7: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

82 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94

3.2. User feedback representation using fuzzy set

User rating is the most widely used feedback in recommender systems. It is a proxy variable used for measuringuser degrees of interest in an item. User ratings are represented and interpreted as binary values-those liked or disliked.In five-scale ratings, those above 3 are considered as liked. However, user rating is intrinsically imprecise as a usermay give different ratings to the same item at different times and situations due to the difficulty to make a distinctionbetween rating 4 and 5, and 1 and 2. Moreover, the same rating, say 4 on a scale of 5, given by two users does notnecessarily imply equal degrees of interest in an item.

For pessimistic users, a rating of 4 may mean strongly liked but for optimistic raters it may mean somewhat liked. Isthe difference between ratings 3 and 4 the same as the difference between 4 and 5? These all contribute to fuzziness thatarises from the human thinking processes instead of randomness associated with the ratings. Therefore, user interestbased on user rating is treated as a fuzzy variable and its uncertainty is represented using a possibility distributionfunction. Let the fuzzy variable degrees of interest in an item (DI) consisting of strongly liked (SL), liked (L), indifferent(I ), disliked (D), and strongly disliked (SD) fuzzy values, and associated with user rating (R) expressed in continuumfrom minimum value (Min) to maximum value (Max). Then, the proposition ‘a user has strongly liked an item I’ hasthe possibility distribution function �R(I) = �SL(R = r), for r between Min and Max.

Under this interpretation or semantic of the fuzzy variable DI, the user rating is represented and inferred usinga possibility distribution function by treating the rating as a fuzzy number [22]. For instance a rating 4 on 5 scalewhich refers to strongly liked is represented in terms of its possibility distribution values = {�SL(R = 4)/5, �L(R =4)/4, . . . , �SD(R = 4)/1}. For instance, for User 7 in Table 1, the possibility distribution of User 7’s rating of Highon movie 56 can be expressed as: {�SL(R = 4) = 0.50, �L(R = 4) = 1, �I(R = 4) = 0.50, �D(R = 4) =0.25, �SD(R = 4) = 0.0}; and possibility distribution of User 7’s rating of Very High on movie 79 can be expressedas: {�SL(R = 5) = 1, �L(R = 5) = 0.55, �I(R = 5) = 0.20, �D(R = 5) = 0.0, �SD(R = 5) = 0.0}.

Without losing generality, a half triangular fuzzy number, which is the simplest model of uncertain quantity, isused to represent the degree of positive experience a user has in relation to an item. The half triangular fuzzy numbermembership function, for user rating r on Ii ∈ [Min, Max] and for a fuzzy set A on DI, is defined as:

�A(Ii) = (r − Min)/(Max − Min). (2)

As a result, a set of items liked by a user, denoted by E, is defined as: {Ii : �A(Ii) > 0.5, i.e., Ii : r > (Min+Max)/2}.

3.3. Inference engine and algorithm

Based on the representation scheme defined for items and user feedback, the inference engine consisting of therecommendation score aggregation methods and the similarity measures is defined below.

3.3.1. Fuzzy set theoretic similarity measuresOne of the most important issues in recommender systems research is computing similarity between users, and

between items (products, events, services, etc.). This in turns highly depends on the appropriateness and reliability ofthe methods of representation. The set-theoretic, proximity-based and logic-based are the three classes of measures ofsimilarity [11]. In fuzzy set and possibility framework, similarity of users or items is computed based on the membershipfunctions of the fuzzy sets associated to the users or item features. Based on the work of Cross and Sudkamp [11],those similarity measures that are relevant for item recommendation application are adapted.

For items Ij and Ik that are defined as {(xi, �xi(Ij )), i = 1, . . . , N} and {(xi, �xi

(Ik)), i = 1, . . . , N}, a similaritymeasure between Ij and Ik is denoted by S(Ik, Ij ), and the different similarity measures are defined as

S1(Ik, Ij ) =∑

i min(�xi(Ik), �xi

(Ij ))∑i max(�xi

(Ik), �xi(Ij ))

, (3)

S2(Ik, Ij ) =∑

i �xi(Ik) ∗ �xi

(Ij )√((

∑i (�xi

(Ik))2))

√((

∑i (�xi

(Ij )2))

, (4)

Page 8: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 83

S3(Ik, Ij ) = 1 −

⎛⎜⎜⎝ d2(Ik, Ij )

Max︸︷︷︸i

{�xi(Ik), �xi

(Ij )}

⎞⎟⎟⎠ , (5)

S4(Ik, Ij ) = 1 − 2

ZIk+ ZIj

d2(Ik, Ij )2, (6)

where

d2(Ik, Ij ) =√∑

i(�xi

(Ik) − �xi(Ij ))2 and ZIa =

∑i(2 ∗ �xi

(I a) − 1)2 for a = k or j.

These fuzzy set based similarity measures are: fuzzy set theoretic in (3), cosine in (4), proximity (Minkowski’sdistance based) in (5), and correlation-like in (6). The similarity measure for the baseline method, where item feature(e.g. genre) and user feedback (e.g. rating) are represented as crisp sets is called the crisp set based similarity measure(CSM) and defined as

S5(Ik, Ij ) =∑

i min(�xi(Ik), �xi

(Ij ))∑i max(�xi

(Ik), �xi(Ij ))

. (7)

In (7) �xi(Ik) is 1 if the feature value xi exists in Ii , and assigned to 0 otherwise. Eqs. (3) and (7) are similar in all

aspects, except for the difference in the representation schema used.For example, Table 2 shows movies I1 = {Copycat 1995: Crime/Mystery/Thriller/Drama}, and I2 = {Grudge 2004:

Horror/Thriller/Mystery} with their crisp (I1 and I2) and fuzzy (G1 and G2) representations. Using crisp set theoretic(7), crisp cosine (similar to (4) except that 1 or 0 are used instead of membership degree) and crisp distance measures(similar to (5) except that 1 or 0 are used instead of membership degree) the similarity between I1 and I2 are 0.40,0.58, and 0.73, respectively. Finally, using the corresponding fuzzy set theoretic similarity measures (3), (4), and (5) thesimilarity coefficients are 0.24, 0.25, and 0.55. The difference in the representation leads to the variation in similaritymeasures. The effects of these differences in similarity measures on recommender systems’ accuracy are assessed inSection 5.

3.3.2. Aggregation methodsThe recommendation decision-making methods used in FTM are case-based decision. There are different recom-

mendation score aggregation methods for computing and combining the various contributions to recommendationconfidence scores in order to make recommendation decision within the framework of fuzzy and possibility theory.The two alternatives that consider both user feedback and similarity between previously liked items and a new item areweighted-sum and max–minimum. They are defined as follows.

Weighted-sum: For each targeted item Ij , calculate the weighted sum (weighted-sum) recommendation confidencescore as

R1(Ij ) =∑

k

�E(I k)S(Ik, Ij ), (8)

where E is a set of positively experienced (liked) items by users, and �E(Ik) is the membership of the item Ik tothe set E. Furthermore, S(Ik, Ij ) is the similarity between Ij and Ik computed using the similarity measures definedabove. A normalized evaluation score for each Ij ’s recommendation confidence score is obtained using NR1(Ij ) =R1(Ij )/ max︸︷︷︸

k

[R1(Ik)]Maximum–minimum: For each targeted item Ij , calculate the maximum–minimum (max–min) recommendation

confidence score as

R2(Ij ) = max︸︷︷︸k

{min(S(Ij , Ik), �E(Ik))}, (9)

Page 9: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

84 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94

Table 2Movies representation in space of genres

Crime Horror Mystery Thriller Drama

I1 1 0 1 1 1G1 1 0 0.44 0.35 0.29I2 0 1 1 1 0G2 0 1 0.41 0.47 0

where E is a set of positively experienced (liked) items by users, and �E(Ik) is the membership of the item Ik to the setE. Furthermore, S(Ik, Ij ) is the similarity between Ij and Ik computed using one of the similarity measures defined inEqs. (3)–(7).

The resulting recommendation confidence score Rk(Ij ), either from (8) or (9), is used to determine the Top-n itemsfor recommendation and can be consider as a degree of support to recommend Ij . A normalized evaluation score isobtained by dividing each Ij ’s recommendation confidence score by the maximum of recommendation confidencescores.

R1 takes into account all participating elements hence it has a compensatory effects between highly liked and justliked items in E as well as most similar and least similar items. R2 is high as soon as there is at least one good rateditem similar to the new item. R2 is similar to Eq. (13) in Calmès et al. [9] and Eq. (4) in Dubois et al. [14]. There areseveral variants of aggregator operators, and two common operators are:

R3(Ij ) = min︸︷︷︸k

{max((1 − S(Ij , Ik), �E(Ik))},

R4(Ij ) = max︸︷︷︸k

{∏(S(Ij , Ik), �E(Ik))

}.

R3 is defined in Dubois et al. [13], and is high when all items are similar to the new one. Hence, R2 is considered tobe the optimistic counterpart to R3. R4 is just a variant of R2 where product operator is used instead of the min for the‘and’ operator, and it is high in the same circumstances as R2. In future works, the aggregation operators R3, R4 andother suitable operators will be identified and considered. Furthermore, future works will consider linguistic quantifiersas presented in Yager [28].

3.3.3. Implementation of algorithmsFirst, the algorithm uses items for which the user has expressed a degree of interest or preference; and then the algo-

rithm finds similar items to those that interest the user. Using the defined representation schemes, the various similaritymeasures and aggregation strategies, various algorithms are designed and implemented. Simulation implementationsof the algorithms and simulation runs are performed on an Intel Pentium (R) 4, 2.66 GHz of speed, 512 MB of memory,and running with Windows operating systems. The Java 2 programming language and Microsoft Access 2002 are usedas software tools.

4. Evaluations

The performance of the proposed method and its algorithms are evaluated using movies as items. The movie dataset,experimental setup, simulation runs and evaluation metrics are presented below.

4.1. Dataset

The benchmark dataset from MovieLens at the University of Minnesota, 2 which has been widely used in recom-mendation research, is used in this study. The dataset includes movie attributes, user ratings, and user demographic

2 〈http://movielens.umn.edu〉

Page 10: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 85

information. The dataset consists of 100,000 ratings (1–5) from 943 users on 1682 movies; and each user has ratedat least 20 and at most 737 movies. In the dataset, movies are described with: movie id, movie title, release date,video release date, IMDb URL, and 20 genres including action, adventure, animation, children’s, comedy, crime,documentary, drama, fantasy, film-noir, horror, musical, mystery, romance, sci-fi, thriller, war, western, family, andunknown.

Genres in the MovieLens dataset are represented with binary values, which do not reflect the true content of moviesin the genre space. Therefore, we use the proposed representation scheme by incorporating information about moviegenres retrieved from the Internet Movie Database (imdb.com), which is a large database consisting of comprehensiveinformation about past, present and upcoming movies. In IMDb, the producers of a movie present genres in the moviein order of their degree of presence, and this information is used in Eq. (1).

4.2. Evaluation metrics

Accuracy is a commonly used metric for a recommender system based on user tasks or goals [18]. The accuracymetrics includes predictive and recommendation accuracy measures. Predictive accuracy measures such as meanabsolute error, mean squared error and percentage of correct predictions are found to be less appropriate when the usertask is to find ‘good’ items and when the granularity of true value is small because predicting a 4 as 5 or a 3 as 2 makesno difference to the user [18]. Instead, the recommendation accuracy metrics of precision, recall and F1-measure aremore appropriate [18]. The precision and recall are computed using movies for which ratings are provided and heldfor testing; and movies with ratings 4 and 5 are considered as movies liked by users. This approach of measuringperformance is widely used in recommender systems research, e.g. [12,18,21].

Precision measures the ratio of correct recommendations being made. Recall reflects the coverage or hit rate ofrecommendations. F1-measure = (2∗precision∗ recall)/(precision+ recall) is a single metric that combines precisionand recall.

4.3. Simulation run and experimental design

Simulation recommender systems for movies are designed and developed to assess the effectiveness of the proposedmethod. The simulation system works as follows:

(1) For each user, it randomly splits the movie ratings dataset into a training set and a test set.(2) Using the training set, it computes recommendation confidence score for each item in the test set using the different

similarity measures and aggregation strategies.(3) For each user, using the movies in the testing set, it generates Top-N recommendations and computes the precision,

recall, and F1 measure.(4) Using different random selection of the movies into testing and training sets, 10 different runs are executed to avoid

sensitivity to sampling bias, and the mean results are reported.

The simulation system runs for 100 randomly selected users from 943 users available in the dataset. Each user hasrated between 20 and 737 movies. During the simulation runs training size (model size) of 5, 10, 15, 20, 25, and 30along with the corresponding test size of total number of rated movies minus training size are used. Using differentrandom partition of the movies rated by a user into testing and training sets, 10 different runs are executed to avoidsensitivity to sampling bias. Moreover, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50 are used as values of the variablenumber of item to be recommended (recommendation size).

Precision, recall, and F1-measure are the dependent variables. Table 3 presents the independent variables along withthe possible values. On each of the dependent variables, a 2 × 5 × 6 × 12 factorial analysis [4] is conducted. In thisdesign the fixed factors are aggregation method with two categories, similarity measure with five categories, trainingsize with six categories and recommendation size with 12 categories. The effect of testing size is also studied as acovariate variable. All statistical tests are performed at 5% level.

Moreover, (i) using different random selection of the movies into testing and training sets 10 different runs areexecuted to avoid sensitivity to sampling bias, and average results are reported; (ii) 100 users and related records areselected randomly; (iii) normalization is applied on the recommendation scores in all cases.

Page 11: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

86 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94

Table 3Factors

Independent factor Description Possible values

Aggregation method Fuzzy reasoning as aggregation meth-ods

Weighted-sum or Maximum–minimum (max–min)

Similarity measure Similarity between two movies Crisp set-theoretica , fuzzy set-theoretic, cosine, proximity-based,correlation-like

Model size Movies rated by a user—the K mostliked movies

5, 10, 15, 20, 25, 30

Recommendation size (Top-N) Total number of items to be recom-mended

3, 4, 5, 10, 15, 20, 25, 30, . . . , 50

Number of testing cases Number of items not seen by a userand available for recommendation

Total number of rated movies minustraining size

aThis is used as the baseline to assess the effect of fuzzy representation and inference.

5. Results

Following the assessment of the soundness of FSM, first the mean performance of the proposed method is presentedalong with the results of comparative analysis between a fuzzy set theoretic method (FTM) that includes fuzzy set,cosine, proximity-based, and correlation-like, and the baseline crisp set similarity-based method (CSM). Second, theresults from the analysis of the effect of the different factors—similarity measure, aggregation method, training size,and recommendation size—on the recommendation accuracy are presented.

5.1. The soundness of the FTM

The FTM uses recently liked movies by the user. It depends on the degree of similarity between the targeted movie,and those rated movies as well as the aggregation method. It assumes users prefer movies with specific genres. For a userthe rated movies are grouped into three groups: group 1 consisting of movies with rating 1 and 2 (labeled as disliked),group 2 consisting of movies with rating 4 and 5 (labeled as liked), and group 3 consisting of movies with rating 3(labeled as indifferent). For instance, Fig. 1 visually presents the distribution of the genres by the three preferencecategories for user 7 (Table 1). It roughly shows that user 7 likes drama, romance and adventure; dislikes thriller; andis indifferent to comedy, action and crime.

Based on the representation of movie genres using (1), we check the soundness of the representation by verifying thefollowing assertion using statistical methods: users have preference to movies of specific genres (e.g. Drama, Comedyand Family) with some degree of memberships over others (e.g. Action, Horror and Adventure).

First, one-way ANOVA shows that there are significant differences in the mean degree of membership of moviesfor the different genres among those movies which are liked, disliked, and indifferent by users with the exception ofAnimation, Documentary, Musical and Science Fiction. Had it been a random occurrence of genres with respect ofusers’ ratings, the results would have been non-significant for all genres. The reason for non-significant differences inthe four genres is due to their rare occurrence in movies data. Second, since recommendation decisions are made usingrecommendation scores, another way of verifying the soundness of the representation and inference procedure is to testcorrelation between user ratings and recommendation confidence scores. Significant bi-variant relationships are foundwhich is in agreement with the result reported in the movie literature [10]. Therefore, the results support the assertionand hence validate the soundness of the representation and inference methods.

5.2. Overall performance

The means of recommendation accuracy by aggregation method and similarity measure are presented in Figs. 2and 3. The highest mean precision, recall and F1-measure are 55%, 38%, and 38% for FTM compared to the 49%,39%, and 38% for the baseline CSM (crisp set based) method, respectively. Overall, FTM improves the precision withcomparable recall, and the 6% increase is statistically significant at 5%.

Page 12: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 87

0.50

0.40

0.30

0.20

0.10

0.00disliked liked

Preferenceindifferent

DramaComdeyActionThrillerRomanceAdventureAnimationCrime

Mea

n

Fig. 1. Genre preferences distribution for user 7: 〈age = 57, Male, Administrator〉.

0.00

0.10

0.20

0.30

0.40

0.50

0.60

Precision

Crisp SetFuzzy SetCosineProximityCorrelation

Recall F1-Measure

Fig. 2. Mean recommendation accuracy by similarity measures for weighted-sum.

0.00

0.10

0.20

0.30

0.40

0.50

0.60

Precision

Crisp Set CosineFuzzy Set ProximityCorrelation

Recall F1-Measure

Fig. 3. Mean recommendation accuracy by similarity measures for max–min.

Page 13: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

88 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94

0.70

0.60

0.50

0.40

0.30

Mea

n Pr

ecis

ion

0.0 to0.2999

0.3 to0.3999

0.5 to0.5999

0.6 to0.6999

0.7 to0.7999

0.8 to0.8999

0.9 to1.0

Recall

similarityidCrsip set

Proximity

CosineFuzzy Set

Corrleation

Fig. 4. Mean precision by recall using the weighted-sum.

Fig. 4 shows the precision for different recall levels. Overall, unsteady decline in precision is observed as recallincreases, and the decline is consistent across all similarity measures. For fixed recall, all the mean precisions of FTMare greater than that of the baseline CSM.

5.3. Effect of aggregation method

Fig. 5 shows the effect of aggregation strategy on performance. The precision is not affected by the aggregationmethod (not significant at 5%), but the max–min aggregation method yields greater recall than the weighted-summethod (significant at 5%). Therefore, depending on a task, recommender systems designers can choose one of the twoaggregation methods. For instance, max–min aggregation strategy would be a choice for the task of finding all “good”items because it results in higher recall across all crisp-based and fuzzy-based similarity measures.

5.4. Comparison between fuzzy-theoretic method and the baseline crisp set-theoretic method

Pair-wise multiple comparisons test using Bonferroni 3 test, using the results in Figs. 2 and 3, is conducted betweenmean performance of crisp set theoretic approach (CSM) and the varieties of fuzzy set theoretic approaches (FTM).The results are summarized as follows.

With weighted-sum aggregation method, all the mean precisions and F1-measures of the FTM (with the exceptionof the case using proximity similarity measure where its mean F1-measure is not significantly different) are greaterthan that of the CSM. Moreover, there is no statistically significant difference between the two groups in their meanrecall except in the case of the proximity-based similarity where its mean recall is lower for FTM.

With maximum–minimum aggregation method (at 5% level of significance), all the mean precisions of the FTMare greater than that of the CSM. Furthermore, all the mean recalls of the FTM are less than that of the CSM. Finally,except cosine similarity where there is no significant difference, all the mean F1-measures of the FTM are less thanthat of the CSM. The result on recall is unexpected, and is further analyzed and discussed in subsection 5.6 andSection 6.

5.5. Comparison among the fuzzy-theoretic methods

The performance differences among the fuzzy set theoretic similarity based approaches are analyzed using Bonfer-roni test, and the analysis and results are reported in detail in [33]. The results show the different alternative combination

3 The Bonferroni test, based on Student’s t statistic, adjusts the observed significance level for the fact that multiple comparisons are made. It iscommonly used multiple comparison tests, and more powerful for a small number of pairs [SPSS PC User Manual].

Page 14: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 89

0.65

0.63

0.60

0.57

0.54

0.51

0.48

0.45

Mea

n

Crisp Set Fuzzy Set CorrelationProximityCosineSimilarity Measure

Recommendation StrategyWeighted-Sum

Maximum-Sum

Similarity Measure

Mea

n

Recommendation StrategyWeighted-Sum

Maximum-Sum

0.45

0.40

0.35

0.30

0.25

Crisp Set Fuzzy Set CorrelationProximityCosine

Fig. 5. Mean precision by aggregation strategy.

of fuzzy set theoretic similarity measures and aggregation methods have significant and different effects on recom-mendation accuracy. For precision, cosine and fuzzy set are better choices. For recall, fuzzy set and correlation arebetter choices when the aggregation method is weighted-sum, and cosine is the better choice when the aggregationmethod is max–min. For F1 measure, correlation and fuzzy set are better choices when the aggregation method isweighted-sum, and cosine is the better choice when the aggregation method is max–min. In all cases, distance basedsimilarity performs poorly.

Page 15: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

90 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94

0.70

0.65

0.60

0.55

0.50

0.45

0.40

0.35

0.30

Mea

n

5 10 15 20 3025training size

Similarity MeasureCrisp Set

Cosine

Fuzzy Set

Proximity

Correlation

Fig. 6. Mean precision by training size.

5.6. Training size and recommendation size sensitivity analysis

Since similar results are obtained in both aggregation methods, results from Weighted-sum aggregation method arepresented. Overall, as shown in Figs. 6 and 7, precision and recall do not increase as the model size (training size)increases. The reason is once sufficient number of items that suggest the preference of a user are gathered, additionalfeedback on items do not improve the preference of the user as well as the recommendation accuracy. Pair-wise meancomparison test shows no significant difference in both precision and recall due to variation in training size. A similarpattern is also reported in the literature, e.g. [12]. The results are also consistent across all the similarity measures andthe two aggregation methods. Overall, a training size of 5–15 produces mean precision between 0.47 and 0.59.

As shown in Fig. 8, when the size of the recommendation list increases, the precision is not significantly increasing.Hence an FTM based recommender system does not need to recommend a large number of movies to achieve a betterprecision.

As shown in Fig. 9, recall increases as the number of recommended items increases. The results are consistent acrossall the similarity measures and the two aggregation methods. Furthermore, overall, a recommendation of 3–10 itemsproduces mean precision between 0.45 and 0.55.

6. Discussion

Using the baseline CSM (7) based recommender system that use genres and ratings represented in crisp set resultsin lower precision compared to the FTM based recommender systems that use genres and ratings represented usingfuzzy set. For FTM, the mean precision increases approximately by 6%, which is significant at 5%. Precision is im-proved because handling of the non-stochastic uncertainty using the fuzzy theoretic-based representation and inference

Page 16: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 91

training size

Mea

n

5 10 15 20 3025

0.42

0.41

0.39

0.38

0.36

0.35

0.33

0.32

0.30

Similarity MeasureCrisp Set

Cosine

Fuzzy Set

Proximity

Correlation

Fig. 7. Mean recall by training size.

mechanism reduce the noises in the recommender system processes. However, recall is not improved. Further inves-tigation of the other factors reveals that compared to precision, recall is highly affected by recommendation size andtraining size. For fixed recall, compared with CSM, precision is improved by the FTM (Fig. 4). The improvement isvery important for recommender systems where precision is more important than recall. Recommender systems areexpected to recommend only a limited number of items, usually 5–20.

For precision, there is no significance difference between the two aggregation methods. However, compared toweighted-sum strategy, the max–min strategy produces better performance in terms of recall. In weighted-sum strategyas in (8), both the similarity between the targeted movie and previously rated movies, and the degree of interestto the previously rated movies contribute to recommendation confidence score. In max–min strategy as in (9), therecommendation confidence score is dominated by either the similarity or the degree of interest to the previously ratedmovies. This variation could contribute to the difference in recall between the two strategies.

The variations in performance by similarity measures are attributed to the inherent nature of item feature. For movies,the movie by genre matrix that is used to compute similarity is sparse. Nearly 50%, 34%, 13%, 3%, 0.7%, or 0.3% ofthe movies in the data set have 1, 2, 3, 4, 5, or 6 genres, respectively. For a movie with vector of size 20 genres mostof the values are zeros. The sparsity in the movies by genres matrix has different effects on the performance.

The sparsity in the movie by genres matrix has less effects on the cosine and fuzzy set based similarity measures. Thecosine measure is based on magnitude and direction of the vectors. Furthermore, the fuzzy set based similarity measureis based on the set-theoretic operations on fuzzy sets, fuzzy set cardinality and commonality of attributes. These twosimilarity measures capture a scale invariant understanding of similarity, and it ignores the effect of the proportionalsize differences.

In case of proximity measure, the similarity coefficient can be nonzero even when there is no common elementbetween two objects. This means that fuzzy sets with non-empty intersection can have the same similarity as fuzzy sets

Page 17: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

92 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94

Mea

n

0.70

0.65

0.60

0.55

0.50

0.45

0.40

0.35

0.30

5 10 15 20 30 40 45 5025 3543top-n

Similarity MeasureCrisp Set

CosineFuzzy Set

ProximityCorrelation

Fig. 8. Mean precision for different similarity measures by recommendation size.

with empty intersection. Correlation produces the next poor performance because it is a weighted squared Euclideandistance, and it has similar characteristics as the proximity measure.

The sensitivity analysis shows the effect of the model size and recommendation size factors on the recommendationaccuracy. The FTM was found to be highly insensitive to the training size, and the results are consistent across the twoaggregation methods and all the similarity measures. The stability in FTM can be attributed to its inherent representationand inference techniques. As a result, to gain a good precision, FTM requires fewer training cases and needs to providea few recommendations. Hence the FTM based algorithms are scalable and appropriate for online recommendation.

Previous studies [2,6,27] compared content-based approach and collaborative filtering, and they found that the mereintroduction of movie features alone does not improve precision. One possible reason for the poor performance ofthe content-based method could be the representation scheme used for the various multi-valued and non-numericalfeatures. Unlike our fuzzy set based representation of movies’ content, the genres have crisp representation; and alsogenres and other non-numerical features are approximated by mean rating of those movies rated with the features’values.

Creating an equivalent base for a fair comparison between the results of the proposed approach and the results ofreported studies for movies recommendation using the same MovieLens dataset is not straightforward. For example,comparison with collaborative filtering algorithms sounds unfair because our approach has additional item features.Moreover, the experimental set-ups are different. Notwithstanding these limitations, we present the results of ourmethod and the results of equivalent CF from the literature. We are attempting to show that the proper introduction ofcontent information in recommender systems improves precision.

Using the same dataset as our study, user-based CF using belief distribution and nearest-neighbor algorithms resultedin an optimal top-15 mean precision of 23% [21]; top-15 mean precision for FTM is in the range of 50–55% as indicatedin Fig. 9. Similarly, user-based CF resulted in an optimal top-10 mean F1-measure of 23% [24]; Top-15 mean F1-

Page 18: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 93

Mea

n0.60

0.65

0.70

0.55

0.50

0.45

0.40

0.35

0.30

0.25

0.20

0.15

0.10

5 10 15 30 35 40 45 5020 2543top-n

Similarity Measure

Crisp Set

Cosine

Fuzzy Set

Proximity

Correlation

Fig. 9. Means plot of recall for different similarity measures by recommendation size.

measure for FTM is in the range of 25–32%. Hence, the present study shows proper introduction of movie featuresdoes improve precision without loss of recall.

Some of the other potential applications of the FTM are for information filtering, tutoring systems, and web mining.The successful application of the proposed method for the movie recommender system makes its extension to recom-mendation of other similar items such as music, jokes, and books evident. The application of the proposed method forinformation filtering such as news filtering is also possible. For instance, representation of user news interest, contentof news, and their relationship as well as computing similarity and recommendation scores can be done in similar waysas for the movie recommender system.

7. Conclusion and future research directions

The study presents a fuzzy set theoretic method (FTM) for item recommender systems. Using data on movies, theresults of the simulation study provide empirical evidence that supports the effectiveness of FTM in terms of betterprecision, and lower model size and recommendation size. Furthermore, the study shows that the different alternativecombination of fuzzy set theoretic similarity measures and aggregation methods have significant and different effectson recommendation accuracy.

Compared to the baseline CSM, FTM improves the precision without loss in recall. Therefore, the study shows theuse of fuzzy modeling as the foundation for representing and reasoning on non-stochastic uncertainty can improve theeffectiveness of content-based recommender systems.

Further studies are planned to extend the FTM in several directions. First, inclusion of additional attributes, e.g.actors/actresses, and directors for movies, are expected to improve the performance of the system. Second, differ-ent membership functions, different aggregator operators and linguistic quantifiers need to be considered, and then

Page 19: Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems

94 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94

empirically tested in order to get an optimal fuzzy set based recommender system. Third, there is a need for test-ing the FTM approach with additional datasets from other domain applications before making further generalization.Fourth, extending the FTM method to support collaborative filtering is essential. Fifth, the use of membership degreesof genres in movies obtained using the membership function for the application of the Bayesian’s method (formingFuzzy–Bayesian approach) for modeling stochastic uncertainty is an exciting study that demands a separate research.

References

[1] G. Adomavicius, A. Tuzhilin, Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions,IEEE Trans. Knowledge Data Eng. 17 (6) (2005) 734–749.

[2] J. Alspector, A. Kolcz, N. Karunanithi, Comparing feature-based and clique-based user models for movie selection, in: Proc. Third ACM Conf.on Digital Libraries, 1998, pp. 11–18.

[3] R. Altman, Film/Genre, British Film Institute, London, 1999.[4] S. Anderson, A. Auquier, W.W. Hauck, D. Oakes, W. Vandaele, H.I. Weisberg, Statistical Methods for Comparative Studies: Techniques for

Bias Reduction, Wiley, New York, 1980.[5] J. Basak, M. Kumar, E-commerce and soft computing: scopes of research, IETE Tech. Rev. 18 (4) (2001) 283–293.[6] C. Basu, H. Hirsh, H. Cohen, Recommendation as classification: using social and content-based information in recommendation, in: Proc. 15th

Nat. Conf. on Artificial Intelligence, 1998, pp. 714–720.[7] T. Bilgiç, I.B. Turksen, Measurement of membership functions: theoretical and empirical work, in: D. Dubois, H. Prade (Eds.), Fundamentals

of Fuzzy Sets: Handbook of Fuzzy Sets and Systems, Kluwer, Boston, 2000, pp. 195–232.[8] R. Burke, Hybrid recommender systems: survey and experiments, User Model. User-Adapted Interaction 12 (4) (2002) 331–370.[9] M.D. Calmès, D. Dubois, E. Hüllermeier, H. Prade, F. Sédes, Flexibility and fuzzy case-based evaluation in querying: an illustration in an

experimental setting, Internat. Journal of Uncertainty, Fuzziness Knowledge-Based Syst. 11 (1) (2003) 43–66.[10] E. Cooper-Martin, Consumers and movies: some findings on experiential products, Adv. Consumer Res. 18 (1991) 372–378.[11] V.V. Cross, T.A. Sudkamp, Similarity and Compatibility in Fuzzy Set Theory: assessment and Applications, Physica-Verlag, New York, 2002.[12] M. Deshpande, G. Karypis, Item-based top-N recommendation algorithms, ACM Trans. Inform. Syst. 22 (1) (2004) 143–177.[13] D. Dubois, L. Godo, H. Prade, A. Zapico, On the possiblistic decision model: from decision under uncertainty to case-based decision, Int. J.

Uncertainty Fuzziness Knowledge 7 (1999) 631–670.[14] D. Dubois, E. Hüllermeier, H. Prade, Fuzzy methods for case-based recommendation and decision support, J. Intelligent Inform. Syst. 27 (2006)

95–115.[15] D. Dubois, H.T. Nguyen, H. Prade, Possibility theory, probability and fuzzy sets, in: D. Dubois, H. Prade (Eds.), Fundamentals of Fuzzy Sets:

The Handbooks of Fuzzy Sets, Kluwer, Boston, 2000, pp. 343–438.[16] D. Dubois, H. Prade, Fuzzy Sets and Systems, Theory and Applications, Academic Press, New York, 1980.[17] I. Gilboa, D. Schmeidler, Case-based decision theory, Quart. J. Econom. 110 (3) (1995) 605–639.[18] J.L. Herlocker, J.A. Konstan, L.G. Terveen, J.T. Riedl, Evaluating collaborative filtering recommender systems, ACM Trans. Inform. Syst. 22

(1) (2004) 5–53.[19] U. Kaymak, Fuzzy target selection using RFM variables, in: Proc. Joint 9th IFSA World Congress and 20th NAFIPS Internat. Conf., 2001, pp.

1038–1043.[20] G.J. Klir, T.A. Folger, Fuzzy Sets, Uncertainty, and Information, Prentice-Hall, NJ, 1988.[21] M.R. McLaughlin, J.L. Herlocker, A collaborative filtering algorithm and evaluation metric that accurately model the user experience, in: Proc.

of the 27th Annual ACM Conf. on Research and Development in Information Retrieval, 2004, pp. 329–336.[22] W. Pedrycz, F. Gomide, An Introduction to Fuzzy Sets, The MIT Press, Cambridge, MA, 1998.[23] T.J. Ross, J.M. Booker, W.J. Parkinson, Considerations for Using Fuzzy Set Theory and Probability Theory, in: T.J. Ross, J.M. Booker, W.J.

Parkinson (Eds.), Fuzzy Logic and Probability Applications: Bridging the Gap, 1999.[24] B.M. Sarwar, G. Karypis, J.A. Konstan, J. Riedl, Analysis of recommender algorithms for E-commerce, in: Proc. ACM E-Commerce Conf.,

2000, pp. 158–167.[25] J.B. Schafer, K.J.A. Konstan, J. Riedl, Electronic commerce recommender applications, J. Data Mining Knowledge Discovery 5 (1, 2) (2001)

115–152.[26] M. Sugano, R. Isaksson, Y. Nakajima, H. Yanagihara, Shot genre classification using compressed audio-visual features, in: Proc. IEEE Internat.

Conf. on Image Processing, 2003, pp. 17–20.[27] S.-S. Weng, G. Meghabghab, Personalized product recommendation in E-Commerce, in: Proc. 2004 IEEE Internat. Conf. on e-Technology,

e-Commerce and e-Service, 2004, pp. 413–420.[28] R.R. Yager, Aggregating evidence using quantified statements, Inform. Sci. 36 (1, 2) (1985) 179–206.[29] R.R. Yager, Targeted E-commerce marketing using fuzzy intelligent agents, IEEE Intelligent Syst. 15 (6) (2000) 42–45.[30] R.R. Yager, Fuzzy logic methods in recommender systems, Fuzzy Sets and Systems 136 (2003) 133–149.[31] L.A. Zadeh, Fuzzy Sets, Inform. Control 8 (1965) 338–353.[32] L.A. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets System 1 (1) (1978) 3–28.[33] A. Zenebe, D. Anyiwo, A. Khatri, Effects of fuzzy theoretic inference strategy and similarity measure on a personalized recommender system,

in: Proc. 2006 World Congress in Computer Science, Computer Engineering, and Applied Computing, 2006, pp. 1–7.[34] H.-J. Zimmermann, Fuzzy Set Theory and Its Applications, Kluwer Academic Publishers, Boston, MA, 1991.