
Latency of neighborhood based recommender systems

Szymon Chojnacki and Mieczysław Kłopotek

Institute of Computer Science PAS, J.K. Ordona 21, 01-237 Warsaw, Poland
{sch,klopotek}@ipipan.waw.pl

Abstract. Latency of user-based and item-based recommenders is evaluated. The two algorithms can deliver high quality predictions in dynamically changing environments. However, their response time depends not only on the size, but also on the structure of the underlying datasets. This constitutes a major drawback when compared to two other competitive approaches, i.e. content-based and model-based systems. Therefore, we believe that there exists a need for a comprehensive evaluation of the latency of the two algorithms.

During a typical worst case scenario analysis of collaborative filtering algorithms two assumptions are made. The first assumption says that data are stored in dense collections. The second assumption states that a large amount of computation can be performed in advance during the training phase. As a result it is advised to deploy a user-based system when the number of users is relatively small. Item-based algorithms are believed to have better technical properties when the number of items is small.

We consider a situation in which the two assumptions are not necessarily met. We show that even though the latency of the two methods depends heavily on the proportion of users to items, this factor does not differentiate the two methods. We evaluate the algorithms with several real-life datasets. We augment the analysis with both graph-theoretical and experimental techniques.

1 Introduction

Recommender systems (RS) are an important component of the Intelligent Web. The systems make information retrieval easier and push users from typing queries towards clicking on suggested links. At a conceptual level recommender systems are similar to data-mining classification and regression predictive algorithms [12] and deal with analogous tasks. For example, we would use a data-mining model to predict which genre of music is interesting for a user. We would use an RS to foresee which particular song matches the user’s preferences. Hence, recommender algorithms are used to make predictions on a more fine-grained level than traditional data-mining tools, which are also utilized in the pre-processing step of RS [2]. Herlocker et al. [14] delineate eleven functions that RS can have.


There exist two basic types of recommender algorithms, i.e. content-based methods and collaborative filtering (CF) techniques. Content-based methods require access to information describing users. It can be the age, income or address of a person. CF techniques rely only on feedback extracted from users’ interactions with items, represented by ratings or preferences. If the predictions are made by analyzing the neighborhood of users or items then we deal with neighborhood-based CF. A latent factor model such as matrix factorization (e.g. Singular Value Decomposition [17] of the user-item matrix) is an example of a model-based algorithm.

The purpose of this article is to analyze the factors that impinge on the response time of neighborhood based collaborative filtering algorithms. Latency is defined as the time required for making a recommendation for a randomly selected user. The ability to deliver results within a limited time constraint is an important condition for deploying the algorithms in real-life online settings. Low latency is also essential for high throughput of a system, which equals the number of recommendations that can be made during a given period of time.
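The definition of latency above can be turned into a small measurement harness. The sketch below is only an illustration under stated assumptions: `measure_latency_ms`, `throughput_per_second` and the callable `recommend(user, top_n)` are hypothetical names, not part of any framework used in this article, and the defaults (300 sampled users, 10 items) mirror the setup reported for the real-life datasets.

```python
import random
import time
from statistics import mean


def measure_latency_ms(recommend, users, sample_size=300, top_n=10, seed=0):
    """Average wall-clock time (in ms) of one recommendation request.

    `recommend` is any callable taking (user, top_n); it is a placeholder
    for the recommender under test (hypothetical interface).
    """
    rng = random.Random(seed)
    pool = list(users)
    sample = rng.sample(pool, min(sample_size, len(pool)))
    timings = []
    for user in sample:
        start = time.perf_counter()
        recommend(user, top_n)
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


def throughput_per_second(latency_ms):
    """Recommendations per second for a single sequential worker."""
    return 1000.0 / latency_ms
```

With a single sequential worker, an average latency of 2 ms corresponds to a throughput of 500 recommendations per second.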

We focus our attention on user-based [13] and item-based algorithms [24]. The motivation for the first method is the observation that an item should be recommended if it is highly rated by users similar to the analyzed user. The second method asserts that the analyzed user would exhibit high preference for items that are similar to the items she has already rated highly. Two users are argued to be similar if the distance between two vectors containing quantified users’ preferences for items is small. The distance can be defined by means of various intuitive measures (e.g. Euclidean metric, cosine similarity or Pearson correlation) [21]. The similarity between two items is defined in an analogous way. Because of the ease of implementation and straightforward training phase, the algorithms belong to the most popular recommender systems. They are usually deployed as a first choice solution by web developers and deliver recommendations until more advanced methods are tuned, larger datasets are collected or the necessary experience in data modeling is gained.
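Two of the similarity measures mentioned above can be sketched for sparse preference vectors as follows. This is a minimal illustration, assuming each user (or item) is stored as a dictionary mapping identifiers to ratings; the function names are hypothetical and the Pearson variant is computed over co-rated coordinates only, which is one common convention rather than the only one.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as {key: rating}."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_similarity(a, b):
    """Pearson correlation restricted to the coordinates both vectors share."""
    common = sorted(a.keys() & b.keys())
    n = len(common)
    if n < 2:
        return 0.0  # correlation is undefined for fewer than two points
    xs = [a[k] for k in common]
    ys = [b[k] for k in common]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / sqrt(var_x * var_y)
```

Both functions touch only the stored (non-empty) coordinates, so their cost grows with the number of ratings rather than with the total number of items or users.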

Our contribution is as follows:

– The influence of the users to items proportion on the latency has been extensively studied

– Two flavors of implementing user-based and item-based algorithms are described

– The relationship between the asymptotic Newman’s formula and the technical performance of neighborhood algorithms is examined

– The problems of an intuitive synthetic data generation are outlined

– The advantages of using bipartite graph generators are delineated

– It is shown that two features of social networks, i.e. heavy-tail distribution and growing density, are, besides the numbers of users, items and ratings, responsible for the difference in latency between user-based and item-based methods

The article consists of eight sections. In the second section we review the results in the field of evaluating recommender systems. In sections 3–7 we describe four approaches dedicated to analyzing the latency of user-based and item-based methods. We start with classic worst case scenario analysis. Then we compare the latency of the two algorithms obtained with real-life datasets. In section 5 we try to enrich the analysis with the asymptotic Newman’s formula. In section 6 synthetic datasets are generated and in section 7 the artificial datasets are used. The last, eighth section contains the conclusions.

2 Related work

We can distinguish two classes of criteria used when evaluating the performance of RS. The first class consists of correctness related characteristics. The second class comprises technical indicators. Both groups of measures play a complementary role. The first group lets us find out how attractive the produced predictions (or recommendations) are. The second group of indicators informs us if we can meet imposed technical constraints with the resources we possess. Among the most important characteristics in the first group [25] are: accuracy, precision, recall, rank correlation, confidence, trust, novelty, diversity and serendipity. In the second group [8] we focus on: memory consumption, time required to train a model from scratch, latency and time required to update a model.

This section is only a brief foray into the problems of evaluating RS. This field of research has developed dynamically in recent years. One of the events that was responsible for this phenomenon was the Netflix Prize challenge¹. The competition was organized by a large DVD retailer in the US. The prize of 1 million dollars was awarded to the team that managed to improve the RMSE (root mean squared error) of the retailer’s Cinematch algorithm by more than 10%. It turned out that classic collaborative filtering techniques relying on the notion of neighborhood do not perform as well as SVD-based approaches [17] or a technique derived from the artificial neural networks domain, i.e. Restricted Boltzmann Machines [23]. Although neighborhood algorithms output predictions orthogonal to those of the two advanced techniques and became a part of the blended winning solution, the importance of user-based and item-based algorithms diminished after the challenge.

We claim that even though SVD and RBM were reported to outperform neighborhood-based algorithms, the quality of the latter should not be underestimated and they are still competitive in real-life settings. The organizers of the Netflix evaluation made much effort to deliver realistic and huge data. However, the setting of the competition did not envision the problems that we need to face in most realistic deployments, i.e. instant creation of new items, users or ratings and access to real-time feedback from users.

These drawbacks were overcome during the Online Task of the Discovery Challenge [16] organized as a part of ECML 2009 (European Conference on Machine Learning). The owners of the BibSonomy² bookmarking portal opened its interfaces to recommender systems taking part in the evaluation. Whenever

¹ http://www.netflixprize.com
² http://www.bibsonomy.org


a user of BibSonomy was bookmarking a digital resource (a publication or a website) a query was sent to all the systems. The tag recommendation of a random one was displayed to the user. After the action, feedback with the user’s actions was sent to all systems. The systems could be maintained during the challenge, because they were configured as web services. Among the most interesting lessons we learned during all three parts of the evaluation are:

– The winning solution is a hybrid system blending content-based predictions with neighborhood related statistics from three bipartite graphs (user-post, user-tag, post-tag) [18]

– The best matrix factorization algorithm [22] was able to compete with other methods only in the second task containing pruned data

– It turned out in task three that none of the systems was able to meet the latency constraint and deliver all of its recommendations within the 1 000 milliseconds time limit.

Moreover, during the aftermath of the challenge two more facts were observed. Firstly, it has been shown in [19] that the winning solution can be directly applied to datasets from Delicious and StackOverflow APIs. In both cases embedding an online adoption mechanism leads to a statistically significant improvement of the recall measure. Secondly, it has been shown in [6] that the system that managed to minimize the latency and deliver the majority of predictions around 200 milliseconds faster than the winning team outperformed all the systems in terms of the percentage of correct tags that were clicked by users. Our research was greatly motivated by the above observations.

3 Worst case analysis

The variables and functions used to analyze the latency of user-based and item-based algorithms are listed in Table 1. We recall [21] the pseudo-code of both algorithms in Algorithm frames 1 and 2. The results of the standard approach [15, 10] used to assess technical properties of neighborhood-based RS are given in Table 2.

Algorithm 1: Item-based recommender (Item-LN)

  input : an active user u
  output: a list of top ranked items for u

* foreach item v′ ∈ I, v′ ∉ I(u) do
      foreach v ∈ I(u) do
          compute sim(v, v′)
          pref(u, v′) ← sim(v, v′) · pref(u, v)


Symbol        Meaning
U             a set of users
I             a set of items
E             a set of ratings given to items by users
I(U′)         a set of items ranked by users from U′ ⊂ U
U(I′)         a set of users who ranked at least one item from I′ ⊂ I
N2(u)         a set of users who ranked at least one common item with u
N3(u)         a set of items ranked by users from N2(u)
sim(u, u′)    similarity between users u and u′
sim(v, v′)    similarity between items v and v′
pref(u, v)    a rating given to item v by user u

Table 1. Variables and functions used in the article.

Algorithm 2: User-based recommender (User-LN)

  input : an active user u
  output: a list of top ranked items for u

* foreach other user u′ ∈ U do
      compute sim(u, u′)
  retain the most similar users - the neighborhood UN
  foreach item v ∈ I(UN) do
      foreach user u′ ∈ U(v) do
          pref(u, v) ← sim(u, u′) · pref(u′, v)


In the case of the item-based recommender, we iterate over all items v′ ∈ I that were not rated by the active user u. The preference of u for v′ is calculated as a product of a vector of similarities between v′ and v ∈ I(u) and a vector of preferences given by u to v ∈ I(u). In the case of user-based recommending, similarities between items are replaced by similarities between users. Moreover, in order to tune the accuracy we can limit computations of the preferences only to the most similar users (the neighborhood)³.
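The two frames can be sketched in a few lines of Python. This is a minimal illustration, not the Mahout implementation used later in the article: the data layout (nested dictionaries `by_user` and `by_item`), the helper names and the use of cosine similarity are assumptions made for the example. Preferences are accumulated as plain sums of similarity times rating, and skipping items the active user has already rated in the user-based variant is likewise an assumption.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0


def invert(by_user):
    """Build the item -> {user: rating} view from user -> {item: rating}."""
    by_item = {}
    for u, prefs in by_user.items():
        for v, r in prefs.items():
            by_item.setdefault(v, {})[u] = r
    return by_item


def item_based(u, by_user, by_item, top_n=10, sim=cosine):
    """Frame 1 (Item-LN): score every item v' in I that u has not rated."""
    rated = by_user[u]
    scores = {}
    for v_new, raters in by_item.items():          # starred step: v' in I
        if v_new in rated:
            continue
        scores[v_new] = sum(sim(by_item[v], raters) * r for v, r in rated.items())
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


def user_based(u, by_user, by_item, top_n=10, sim=cosine):
    """Frame 2 (User-LN): the neighborhood is the set of all other users."""
    sims = {o: sim(by_user[u], by_user[o]) for o in by_user if o != u}  # starred step
    scores = {}
    for v, raters in by_item.items():              # v in I(UN)
        if v in by_user[u]:
            continue                               # assumption: skip rated items
        scores[v] = sum(sims[o] * r for o, r in raters.items())
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

For instance, with `by_user = {"u1": {"a": 5, "b": 4}, "u2": {"a": 5, "b": 4, "c": 5}, "u3": {"a": 1, "d": 5}}` and `by_item = invert(by_user)`, both functions rank item `c` ahead of item `d` for user `u1`.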

It is assumed in the standard complexity analysis that similarities can be precomputed. As a result, it is asserted that the user-based technique is faster when the number of users is relatively small (compare Fig. 1). The item-based algorithm is believed to be faster when the number of items is relatively small [21].

Recommender   Memory consumption   Training time   Latency
UserBased     O(|U|^2)             O(|U|^2 |I|)    O(|U| log(|U|))
ItemBased     O(|I|^2)             O(|I|^2 |U|)    O(|I| log(|I|))

Table 2. Worst case scenario asymptotic properties.

In practical applications precomputing similarity matrices is a questionable step. It can be justified in situations where the collections of users or items are stable (e.g. a bookstore). However, in dynamic settings with constant refreshing and churn it might be difficult to maintain updated similarities. Such scenarios are typical for news websites [9], content management [1] or web advertisements [4]. Another problem with the asymptotic analysis is the assumption that vectors with preferences are stored in dense structures (e.g. double[]). This assumption is not met, because real-life datasets are sparse. Consequently, sparse data structures are utilized (e.g. Set<Double>). Hence, we believe that there is a need for a thorough analysis of the expected value of the latency in real-life settings.

In the following sections we use the Mahout framework⁴. Mahout contains highly efficient open-source implementations of RS. We use Mahout’s data model and its generic implementations of user-based and item-based algorithms. We also developed two modified implementations.

User-LN is Mahout’s GenericUserBasedRecommender with the neighborhood size set to all users

User-SN is GenericUserBasedRecommender with a modified starred step in frame 2. Instead of u′ ∈ U we implemented u′ ∈ N2(u)

³ We show the influence of the size of the neighborhood on the latency in the experimental part. However, in the main analysis we wish to obtain comparable results between the two algorithms and deactivate this step.

⁴ http://www.mahout.apache.org


[Figure 1: asymptotic complexity of the two training costs, |U|²|I| and |U||I|², plotted against p (proportion of users in all nodes), from U=10%, I=90% to U=90%, I=10%. The region with few users is marked "User-based preferred" and the region with few items is marked "Item-based preferred".]

Fig. 1. Users/items proportion vs complexity.

Item-LN is a modified GenericItemBasedRecommender implementation as described in frame 1

Item-SN is Mahout’s GenericItemBasedRecommender implementation. It differs from frame 1 in the starred step: instead of v′ ∈ I Mahout runs over v′ ∈ N3(u)

The suffix -LN stands for large neighborhood and -SN stands for small neighborhood.
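The candidate sets that the -SN variants iterate over, N2(u) and N3(u) from Table 1, can be sketched as follows. The adjacency layout (`by_user` mapping a user to the items she rated, `by_item` mapping an item to its raters) and the function names are assumptions made for this illustration, and the sketch excludes u itself from N2(u).

```python
def n2(u, by_user, by_item):
    """N2(u): users who rated at least one item in common with u."""
    neighbours = set()
    for v in by_user[u]:
        neighbours.update(by_item[v])
    neighbours.discard(u)  # assumption: the active user is not her own neighbour
    return neighbours


def n3(u, by_user, by_item):
    """N3(u): items rated by users from N2(u)."""
    items = set()
    for other in n2(u, by_user, by_item):
        items.update(by_user[other])
    return items
```

In a sparse graph both sets are typically much smaller than U and I, which is the motivation for replacing the starred loops with loops over N2(u) or N3(u).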

4 Real-life datasets

The datasets used in the evaluation and subsequent latency statistics are summarized in Table 3. Latency was measured over a sample of 300 randomly drawn users, extracting the 10 top ranked items. The datasets were downloaded from the Clear-Bits repository⁵, which contains dumps of logs extracted from 11 question answering forums. The dumps are maintained by StackExchange Data Explorer⁶. For instance, the following question was asked on 2010-07-29 00:48 and tagged with two tags, <distribution> and <poisson>, in a forum dedicated to statistical analysis:

What is the relationship between a Nonhomogeneous Poisson process and a process that has heavy tail distribution for its inter arrival times?

⁵ http://www.clearbits.net/torrents/1487-nov-2010
⁶ http://data.stackexchange.com/


                          Graph statistics                            Latency (ms)
Dataset                   Users    Items    Nodes    Edges     p      Item-LN  Item-SN  User-LN  User-SN
cooking-c                 731      3 706    4 437    8 700     0.16   2.4      1.6      1.0      0.8
cooking-t                 1 909    692      2 601    4 789     0.73   0.3      0.1      0.2      0.1
game-development-c        658      2 207    2 865    5 557     0.23   1.0      0.4      0.2      0.1
game-development-t        1 029    480      1 509    2 738     0.68   0.2      0.1      0.1      0.0
gaming-c                  1 081    4 653    5 734    12 233    0.19   2.2      1.1      0.5      0.8
gaming-t                  3 024    1 374    4 398    7 074     0.69   0.60     0.2      0.3      0.1
photography-c             443      1 991    2 434    4 866     0.18   1.0      0.7      0.4      0.3
photography-t             936      615      1 551    2 525     0.6    0.20     0.1      0.0      0.1
server-fault-c            14 771   72 976   87 747   156 497   0.17   33.4     3.9      5.4      9.6
server-fault-t            54 278   5 118    59 396   160 561   0.91   35.2     16.2     6.8      1.2
stack-apps-c              292      985      1 277    4 050     0.23   0.7      0.5      0.5      0.3
stack-apps-t              642      244      886      1 695     0.72   0.2      0.1      0.1      0.1
statistical-analysis-c    439      1 991    2 430    5 478     0.18   1.2      0.6      0.4      0.3
statistical-analysis-t    1 006    448      1 454    2 408     0.69   0.1      0.0      0.1      0.1
super-user-c              13 942   79 403   93 345   176 326   0.15   37.7     8.9      22.4     8.1
super-user-t              56 975   5 246    62 221   172 858   0.92   48.5     20.0     9.4      2.4
ubuntu-c                  1 173    4 596    5 769    10 013    0.20   1.5      0.5      0.2      0.2
ubuntu-t                  2 984    1 021    4 005    8 360     0.75   0.8      0.4      0.3      0.1
web-applications-c        1 173    3 275    4 448    6 612     0.26   0.8      0.4      0.1      0.0
web-applications-t        2 547    958      3 505    5 851     0.73   0.6      0.2      0.2      0.0
webmasters-c              630      2 021    2 651    4 553     0.24   0.7      0.3      0.1      0.1
webmasters-t              1 223    517      1 740    2 903     0.70   0.2      0.1      0.1      0.0

Table 3. Statistics describing real-life datasets used in the evaluation and corresponding latency. Parameter p stands for the proportion of the number of users to the number of all nodes (users and items).


We built two datasets for each forum. The first dataset is suffixed with -comments, the second is suffixed with -tags (abbreviated to -c and -t in Table 3). In the case of comments, posts are treated as items and if a user sent a comment to a post then we created an edge between the two. In the case of tags, posts are perceived as users and tags attached to posts are items. If a post was tagged with a certain word then we created an edge between the two. We augmented each edge with a random number of stars from {1, 2, 3, 4, 5}. In general, there are many ways of implicitly inferring the preferences. We stayed with random values for simplicity and because the complexity of the implementation of the similarity measure that we used (i.e. Pearson correlation) depends only on the number of non-empty coordinates.

[Figure 2: latency in ms (log scale, 0.01 to 100) plotted against p = (number of users / number of all nodes), from 0 to 1, for Item-LN, Item-SN, User-LN and User-SN.]

Fig. 2. Latency vs users/items proportion.

The results are drawn in Fig. 2. We see that the latency of each of the four implementations depends on $p = \frac{|U|}{|U| + |I|}$. In general, latency decreases until p ≈ 0.65 and then grows. However, we cannot find evidence in favor of the statement that p differentiates the order of user-based and item-based algorithms. We see that in most cases User-SN is the fastest and Item-LN is the slowest. We can only presume that in the interval p ∈ [0.3, 0.6] the latency matches the overall U-shaped curve, but the datasets we evaluated do not cover this range.
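As a worked instance of the definition of p, the values reported in Table 3 for the cooking-c and server-fault-t datasets can be reproduced from their user and item counts:

```latex
p_{\text{cooking-c}} = \frac{731}{731 + 3706} = \frac{731}{4437} \approx 0.16,
\qquad
p_{\text{server-fault-t}} = \frac{54278}{54278 + 5118} = \frac{54278}{59396} \approx 0.91
```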

5 The Newman’s formula

The asymptotic formula derived by Newman in [20] is based on a notion of the expected degree of a neighboring node in a random bigraph. We can use it to build an intuition on the size of the world in which we search for good recommendations (Fig. 3). It cannot be applied directly to assess the latency as it is based on an assumption of a local tree-like structure [5]. The expected number of users having rated at least one item in common with a random user u′ is given by:

$$N_2(u') \approx \langle u \rangle \left( \frac{\langle v^2 \rangle}{\langle v \rangle} - 1 \right), \qquad (1)$$

where $\langle u \rangle = \frac{|E|}{|U|}$ is the first moment of the user degree distribution, $\langle v^2 \rangle = \sum_{v \in I} \frac{|v| \cdot |v|}{|I|}$ is the second moment of the item degree distribution and $\langle v \rangle = \frac{|E|}{|I|}$ is the first moment of the item degree distribution. Degree |u| of user u is the number of ratings she has made. Degree |v| of an item v is the number of users that have rated the item.
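Formula (1) is straightforward to evaluate from the two degree sequences. The sketch below is an illustration only; the function name and the list-based input format are assumptions, and the result is the asymptotic estimate, not an exact count of neighbours in a concrete graph.

```python
def newman_n2_estimate(user_degrees, item_degrees):
    """Evaluate <u> * (<v^2> / <v> - 1) from user and item degree sequences."""
    edges = sum(user_degrees)
    if edges != sum(item_degrees):
        raise ValueError("both degree sequences must sum to the number of edges")
    mean_u = edges / len(user_degrees)                              # <u> = |E| / |U|
    mean_v = edges / len(item_degrees)                              # <v> = |E| / |I|
    second_v = sum(d * d for d in item_degrees) / len(item_degrees) # <v^2>
    return mean_u * (second_v / mean_v - 1.0)
```

For four users of degree 2 and four items of degree 2 the estimate is 2.0; keeping the same number of edges but using item degrees 3, 3, 1, 1 raises it to 3.0, which illustrates how a more uneven item degree distribution enlarges the expected set of potentially similar users.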

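[Figure 3: a bigraph of users and items spreading out from an active user u, with the branching factors $\frac{\langle v^2 \rangle}{\langle v \rangle} - 1$ and $\frac{\langle u^2 \rangle}{\langle u \rangle} - 1$ marked on successive levels.]

Fig. 3. For an active user u, the expected number of her ratings is $\langle u \rangle$. However, the expected number of potentially similar users N2(u) is larger than $\langle u \rangle \langle v \rangle$. It can be approximated by the asymptotic Newman’s formula with $\langle u \rangle \left( \frac{\langle v^2 \rangle}{\langle v \rangle} - 1 \right)$. The number of items of potentially similar users N3(u) can be approximated by $\langle u \rangle \left( \frac{\langle v^2 \rangle}{\langle v \rangle} - 1 \right) \left( \frac{\langle u^2 \rangle}{\langle u \rangle} - 1 \right)$.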

The Newman’s formula suggests that the latency of the neighborhood-based CF algorithms depends on the shapes of the distributions of users’ and items’ degrees. It can be proven by means of the Cauchy-Schwarz inequality [7] that $\langle u \rangle \frac{\langle v^2 \rangle}{\langle v \rangle} \geq \langle u \rangle \langle v \rangle$. Moreover, by virtue of the variance decomposition equation $\mathrm{Var}(x) = \langle x^2 \rangle - \langle x \rangle^2$ we see that the higher the variances of the user and item degree distributions, the bigger the world of potentially similar users and their items that may be expected.
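The role of the variance can be made explicit with one line of algebra. Using the standard identity $\mathrm{Var}(v) = \langle v^2 \rangle - \langle v \rangle^2$, the bracket in formula (1) splits into a mean term and a variance term:

```latex
\frac{\langle v^2 \rangle}{\langle v \rangle} - 1
  = \frac{\mathrm{Var}(v) + \langle v \rangle^2}{\langle v \rangle} - 1
  = \langle v \rangle - 1 + \frac{\mathrm{Var}(v)}{\langle v \rangle},
\qquad\text{so}\qquad
N_2(u') \approx \langle u \rangle \left( \langle v \rangle - 1 + \frac{\mathrm{Var}(v)}{\langle v \rangle} \right).
```

For fixed |U|, |I| and |E| the means are fixed, so the estimate grows linearly with the variance of the item degree distribution.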

6 Synthetic data

[Figure 4: a user-item matrix with USERS (U) on one axis and ITEMS (I) on the other, filled randomly with RATINGS (E).]

Fig. 4. Filling randomly user-item matrix with ratings leads to symmetric degree distributions.

In this section we describe the procedure we followed in order to generate random datasets for latency evaluation. Firstly, let us point at the limitations of a straightforward intuitive approach, in which we fill the user-item matrix randomly (Fig. 4). By doing so we obtain neither correlations among users or items nor skewed distributions. We would get a symmetric distribution, as the probability $P^u_k$ of rating exactly k items by a random user is given by:

$$P^u_k = \binom{|I|}{k} \left( \frac{|E|}{|U|\,|I|} \right)^{k} \left( 1 - \frac{|E|}{|U|\,|I|} \right)^{|I|-k}. \qquad (2)$$

$P^u_k$ describes a binomial distribution. This limitation is conceptually related to the criticism of the Erdős random graphs [11] and the introduction of the preferential attachment mechanism in random unipartite graphs [3].
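Formula (2) can be evaluated directly to see how concentrated the resulting user degrees are. The snippet below is a small sketch; the function name and the example sizes (50 users, 40 items, 400 ratings) are arbitrary choices made for illustration.

```python
from math import comb


def degree_pmf(k, n_users, n_items, n_ratings):
    """Probability that a random user rated exactly k items, formula (2)."""
    q = n_ratings / (n_users * n_items)  # fill probability of a single cell
    return comb(n_items, k) * q ** k * (1 - q) ** (n_items - k)


# Distribution of user degrees for a hypothetical 50 x 40 matrix with 400 ratings.
pmf = [degree_pmf(k, 50, 40, 400) for k in range(41)]
mean_degree = sum(k * prob for k, prob in enumerate(pmf))
```

The probabilities sum to one and the mean degree equals |E|/|U| (here 8), so the mass sits in a narrow band around the mean and no heavy tail appears.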

6.1 Generative procedure

The generative procedure consists of three steps: (1) new node creation, (2) edge attachment type selection and (3) running the bouncing mechanism. The steps are run after an initialization of the bigraph. The procedure requires specifying the seven parameters listed in Table 4. It is a simplified version of the procedure described in [7].

[Figure 5: one iteration of the generator on a bigraph of users and items, after initialization (m = 4). 1) A new node is created (here a user); the probability that a new user is created is p, and (1-p) for a new item. 2) An attachment type is drawn for each of the e edges: e·α preferential and e·(1-α) random. 3) The number of bounced nodes is set: e·α·b. 4) Bouncing is performed.]

Fig. 5. The procedure of generating random bipartite graphs.

Parameter   Meaning
m           the number of initial loose edges
T           the number of iterations
p           the probability that a new node is a user
1 − p       the probability that a new node is an item
e           the number of edges created by each new node
α           the probability that an item is selected as edge’s ending with preferential attachment
1 − α       the probability that an item is selected as edge’s ending uniformly
β           the probability that a user is selected as edge’s ending with preferential attachment
1 − β       the probability that a user is selected as edge’s ending uniformly
b           the fraction of preferentially attached edges that were created via a bouncing mechanism

Table 4. Parameters used in bigraph generator

In the preferential attachment mechanism the probability that a node is drawn is linearly proportional to its degree. The opposite of preferential attachment is random attachment, in which the probability of selection is equal for all nodes. The model is based on an iterative repetition of three steps.

Step 1: Create a new node with e loose edges. If a random number drawn uniformly from [0, 1] is smaller than p, the created node is a user; otherwise it is an item.

Step 2: For each edge decide whether to join it to a node of the second modality randomly or with preferential attachment. The probability of selecting preferential attachment is α for a new user and β for a new item.

Step 3: For each edge that is supposed to be created with preferential attachment decide if it should also be generated via a bouncing mechanism.

Bouncing is performed in three micro steps: (1) a random node is drawn from the nodes that are already joined with the new node, (2) a random neighbor of the drawn node is chosen, (3) a random neighbor of the neighbor is selected for joining with the new node. The bouncing mechanism was injected into the model in order to parametrize the level of transitivity in a graph. Transitivity is a feature of real datasets and in terms of recommender systems represents the correlations between items ranked by different users. In unipartite graphs transitivity is measured by the local clustering coefficient, which is calculated for each node as the number of edges among direct neighbors of the node divided by the number of all possible pairs of the direct neighbors.
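The three steps and the bouncing mechanism can be sketched as below. This is a simplified reading of the description, not the generator of [7]: the function name, the initialization (m users paired one-to-one with m items), the silent skipping of duplicate edges, and the fall-back to plain preferential attachment when the new node has no neighbor to bounce from are all assumptions of this sketch. It requires m ≥ 1.

```python
import random


def generate_bigraph(m, T, p, e, alpha, beta, b, seed=0):
    """Return (users, items): adjacency sets of a random bipartite graph."""
    rng = random.Random(seed)
    # Assumed initialization: m users and m items joined by m edges.
    users = {("u", i): {("i", i)} for i in range(m)}
    items = {("i", i): {("u", i)} for i in range(m)}
    # One entry per edge endpoint, so a uniform draw is degree-proportional.
    user_ends, item_ends = list(users), list(items)
    for _ in range(T):
        new_is_user = rng.random() < p                 # Step 1
        own, other = (users, items) if new_is_user else (items, users)
        own_ends, other_ends = (user_ends, item_ends) if new_is_user else (item_ends, user_ends)
        pref_prob = alpha if new_is_user else beta
        candidates = list(other)                       # nodes of the second modality
        node = ("u" if new_is_user else "i", len(own))
        own[node] = set()
        for _ in range(e):
            if rng.random() < pref_prob:               # Step 2: preferential edge
                if own[node] and rng.random() < b:     # Step 3: bouncing
                    first = rng.choice(sorted(own[node]))
                    second = rng.choice(sorted(other[first]))
                    target = rng.choice(sorted(own[second]))
                else:
                    target = rng.choice(other_ends)
            else:                                      # uniform (random) attachment
                target = rng.choice(candidates)
            if target not in own[node]:                # assumption: skip duplicates
                own[node].add(target)
                other[target].add(node)
                own_ends.append(node)
                other_ends.append(target)
    return users, items
```

Because duplicate edges are skipped, the number of edges produced by this sketch is at most m + T·e rather than exactly that value.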

6.2 Properties

One can see that after t iterations the bigraph consists of |U(t)| = m + pt users, |I(t)| = m + (1 − p)t items, and |E(t)| = m + t · e edges. After relatively many iterations (t ≫ m) we can neglect m. In the presented model, the average user degree is time invariant:

$$\frac{|E(t)|}{|U(t)|} = \frac{m + t \cdot e}{m + pt} \approx \frac{e}{p}.$$

A formal analysis of the generator is contained in [7]. It enables us to produce a wide range of bigraphs (or equivalently user-item matrices). From the point of view of neighborhood-based recommenders it is important that by changing α and β we can gradually change the distributions from exponential to power-law. Moreover, even if we keep |U|, |I| and |E| constant we can obtain graphs with significantly different average levels of potentially similar users and their items (Fig. 6).


[Figure 6, first panel: average number of SIMILAR USERS as a function of alpha and beta (both from 0.0 to 1.0); scale from 200 to 340.]

7 Experimental results

The procedure described in section 6.1 was applied to create 84 synthetic user-item matrices (bigraphs with randomly labeled edges), divided into two groups. In both groups we set m = 100 and b = 0.15. The first group of 48 bigraphs was created by setting the number of iterations T = 2 000, skewness parameters α, β ∈ {0.1, 0.9}, an average number of added edges e ∈ {3, 5, 10, 30} and the probability that a new node is a user p ∈ {0.25, 0.5, 0.75}. In this group the number of users varies between 560 and 1 652, the number of items is between 548 and 1 640, and the number of edges is between 7 952 and 63 834 (Table 5). The second group of 36 graphs was generated by setting T ∈ {4 000, 8 000, 16 000}, α = β = 0.1, e ∈ {3, 5, 10, 30} and the probability that a new node is a user p ∈ {0.25, 0.5, 0.75}. In this group the maximum number of users is 12 086, the maximum number of items is 12 158 and the maximum number of edges (ratings) is 502 206.
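The two groups of configurations can be enumerated mechanically, which also confirms the counts of 48 and 36 graphs. The sketch below only lists parameter combinations; the function name and the dictionary layout are assumptions made for illustration.

```python
from itertools import product


def experiment_grid():
    """Return the parameter settings of the first and the second group."""
    common = {"m": 100, "b": 0.15}
    first = [
        dict(common, T=2000, alpha=alpha, beta=beta, e=e, p=p)
        for alpha, beta, e, p in product(
            (0.1, 0.9), (0.1, 0.9), (3, 5, 10, 30), (0.25, 0.5, 0.75)
        )
    ]
    second = [
        dict(common, T=T, alpha=0.1, beta=0.1, e=e, p=p)
        for T, e, p in product(
            (4000, 8000, 16000), (3, 5, 10, 30), (0.25, 0.5, 0.75)
        )
    ]
    return first, second
```

The first list has 2 · 2 · 4 · 3 = 48 entries and the second has 3 · 4 · 3 = 36 entries, 84 in total.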

In this article we focus our attention on measuring the latency of neighborhood-based RS. It has been shown in [8] that random bipartite graphs can also be applied to evaluating memory consumption of various algorithms, as well as the time required to update or train a model.

7.1 Density

We used 48 graphs from the first group to evaluate the relationship between the density of generated graphs and the latency. Density is controlled by parameter e. Graphs generated with e = 3, 5, 10, 30 were subsequently labeled as very sparse, sparse, dense and very dense (Fig. 7).

All systems perform faster in sparser graphs. In very sparse graphs lower latency is observed in user-based algorithms than in item-based ones. In this sparse setting -SN implementations outperform their -LN counterparts. As the data densify, item-based approaches improve their results in relation to user-based ones. In the last chart with very dense graphs both item-based algorithms are the fastest. In a dense setting -LN implementations outperform their -SN counterparts. The reason for this phenomenon is the fact that when the average number of potentially similar users |N2(u)| becomes close to |U| it is faster to iterate over all users than to reach them by exploiting the user’s neighborhood in a graph.

[Figure 6, second panel: average number of ITEMS of SIMILAR USERS as a function of alpha and beta (both from 0.0 to 1.0); scale from 1600 to 2100.]

Fig. 6. In the above figures we use average number of similar users to denote an average value of N2(u) over all u ∈ U and an average number of items of similar users to denote an average value of N3(u) over all u ∈ U.

7.2 Degree distribution

Let us recall that parameters α and β control in the generator the proportion of edges connected via the preferential attachment mechanism or via a random selection. It can be shown [7] that as α → 1 the distribution of item degrees becomes power-law and as α → 0 the distribution of item degrees becomes exponential. Parameter β controls the distribution of user degrees. The variance of a power-law distribution is higher than that of an exponential distribution. Hence, by virtue of the Newman’s formula we can expect lower latency in graphs with exponential distribution. This deduction is confirmed by the results depicted in Fig. 8.

We have presented in Figure 8 the results obtained for dense graphs (e = 10) with α = β = 0.1 and α = β = 0.9. We can see that the long-tail feature plays a similar role as the density. The Item-LN algorithm becomes the leader for all three levels of p = 0.25, 0.5, 0.75 when run on highly skewed datasets. On the other hand, Item-SN was the slowest when α = β = 0.1, but it gets close to both user-based implementations when α = β = 0.9.

[Figure 7: four panels (very sparse graph, sparse graph, dense graph, very dense graph), each plotting latency in ms against p = 0.25, 0.5, 0.75 for Item-LN, Item-SN, User-LN and User-SN.]

Fig. 7. Latency of four implementations obtained on synthetic datasets, grouped by density.

Generator’s parameters                     Generated graphs
m     T        α     β     p      e        Users     Items     Edges
100   2 000    0.1   0.1   0.25   3        625       1 575     7 952
100   2 000    0.1   0.9   0.25   3        605       1 595     8 224
100   2 000    0.9   0.1   0.25   3        586       1 614     8 055
100   2 000    0.9   0.9   0.25   3        640       1 560     8 037
···   ···      ···   ···   ···    ···      ···       ···       ···
100   2 000    0.9   0.1   0.75   30       1 586     614       62 610
100   2 000    0.9   0.9   0.75   30       1 617     583       63 834
100   4 000    0.1   0.1   0.25   3        1 090     3 110     16 047
100   4 000    0.1   0.1   0.5    3        2 096     2 104     16 149
100   4 000    0.1   0.1   0.75   3        3 081     1 119     16 111
100   4 000    0.1   0.1   0.25   5        1 066     3 134     24 359
100   4 000    0.1   0.1   0.5    5        2 098     2 102     24 332
100   4 000    0.1   0.1   0.75   5        3 073     1 127     24 275
100   4 000    0.1   0.1   0.25   10       1 112     3 088     43 987
···   ···      ···   ···   ···    ···      ···       ···       ···
100   16 000   0.1   0.1   0.5    30       8 118     8 082     502 206
100   16 000   0.1   0.1   0.75   30       12 073    4 127     498 679

Table 5. A sample of generated synthetic datasets.

Results from this and the previous section show that item-based methods have lower latency than user-based ones when datasets are densely populated with ratings or node degrees have high variance. The two features of datasets also differentiate the two flavors of implementing the neighborhood based algorithms, i.e. -LN and -SN. The results are consistent with the results obtained for real-life datasets in section 4. These datasets are relatively sparse and the lowest latency is observed for user-based methods.

7.3 Size

In this subsection we check if the results we obtained for graphs containing ≈ 2 000 nodes are valid for larger graphs. The latency presented in Figure 9 was calculated for 12 graphs from the second group and 3 graphs from the first group having α = β = 0.1 and e = 10. We can see that preserving a constant average node degree while the number of nodes in a graph grows leads to behavior characteristic of sparser graphs. The distance between user-based and item-based algorithms increases. The difference between -SN and -LN also becomes visible. This suggests that an average degree is a relative measure of density and should only be analyzed within a specific size context. Another observation that we can read from Fig. 9 is a slight change of the shape of each curve. The curve for the User-LN algorithm becomes more U-shaped as the data sparsify. Finally, as the number of nodes grows (while the average node degree is constant, e = 10) the latency also grows. We believe that it can be attributed to the growth of the variances in the distributions and explained by the Newman’s formula.

[Figure 8: two panels, dense graph (alpha=0.1 beta=0.1) and dense graph (alpha=0.9 beta=0.9), each plotting latency in ms against p = 0.25, 0.5, 0.75 for Item-LN, Item-SN, User-LN and User-SN.]

Fig. 8. Latency of four implementations obtained on synthetic datasets, grouped by skewness.


1

10

100

1000

0,25 0,5 0,75 0,25 0,5 0,75 0,25 0,5 0,75 0,25 0,5 0,75

2000 4000 8000 16000

La

ten

cy in

ms (

log

sca

le)

p (proportion of users) divided by the number of all nodes

Growing dense graphs

Item - LN

Item - SN

User - LN

User - SN

Fig. 9. Latency of four implementations obtained on synthetic datasets, grouped bysize.

7.4 Neighborhood

In order to keep the results comparable between user-based and item-based approaches, we switched off the neighborhood size parameter in the user-based implementations. The parameter is primarily used to tune the accuracy of an RS, but it has a significant influence on the latency. In this subsection we show what decrease in latency we can expect when the size of the neighborhood used to weight the items is smaller than |U|.

The results of running the User-LN algorithm with a limited neighborhood are shown in Fig. 10. The experiments were run for graphs from the first group (T = 2 000) with e = 10. Note that when the neighborhood is set to 500 (N = 500), the number of considered users equals the number of all users for p = 0.25 (p · T = 500), so switching on the neighborhood parameter does not influence the latency. The same algorithm outperforms the Item-LN implementation for p = 0.75. This is because, when p = 0.75, the number of users is ≈ 1 500, and limiting it to the 500 most similar users gives a relatively stronger advantage than in the case of |U| ≈ 1 000. As the neighborhood decreases to 400, 300, 200 and 100, we observe a gradual improvement in latency.
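The effect of the neighborhood parameter can be illustrated with a minimal user-based scorer. This is a generic sketch and not the authors' Algorithm 2: cosine similarity, the nested-dictionary layout of the ratings, and the names `cosine` and `score_items` are all assumptions made for the example.

```python
import heapq
from math import sqrt


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse rating vectors."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_items(ratings: dict, user, n_neighbors=None) -> dict:
    """Score items unseen by `user` using the ratings of similar users.

    ratings: {user_id: {item_id: rating}}
    n_neighbors: keep only the N most similar users; None switches
    the neighborhood limit off.
    """
    sims = [(cosine(ratings[user], vec), other)
            for other, vec in ratings.items() if other != user]
    sims = [(s, o) for s, o in sims if s > 0.0]
    if n_neighbors is not None:
        # Only the N most similar users contribute to the weighting step.
        sims = heapq.nlargest(n_neighbors, sims, key=lambda t: t[0])
    scores = {}
    for sim, other in sims:
        for item, rating in ratings[other].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return scores


ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 5, "b": 3, "c": 4},
    "u3": {"b": 1, "d": 2},
    "u4": {"a": 4, "e": 5},
}
print(score_items(ratings, "u1"))                 # all neighbors used
print(score_items(ratings, "u1", n_neighbors=1))  # only the closest user
```

In this sketch the final loop visits the ratings of at most N users, which is where a smaller neighborhood saves work. When N is at least the number of other users, the limit changes nothing, mirroring the N = 500, p = 0.25 case described above.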

[Figure 10: "Very dense graph". Horizontal axis: 0.25, 0.5, 0.75; vertical axis: latency in ms. Series: Item-LN, User-LN, and User-LN with limited neighborhood (N = 500, N = 400, N = 300, N = 200, N = 100).]

Fig. 10. Latency of four implementations obtained on synthetic datasets (neighborhood step in Algorithm 2 activated).

8 Conclusions

The purpose of this article was to verify the role of the proportion of the number of users to the number of items in determining which neighborhood-based algorithm is faster. We used four types of tools to analyze this problem: (1) complexity analysis, (2) evaluation on real-life datasets, (3) Newman's asymptotic formula and (4) generation of synthetic datasets. None of the tools can describe the problem on its own. We have shown that the studied proportion indeed affects the latency. However, it does not determine the order of user-based and item-based methods. We have shown that the density of the user-item matrix and the variance of node degrees are the two factors that set the order between the two approaches.

References

[1] D. Agarwal, B. C. Chen, P. Elango, N. Motgi, S. T. Park, R. Ramakrishnan, S. Roy, and J. Zachariah. Online Models for Content Optimization. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, NIPS, pages 17–24. MIT Press, 2008.

[2] X. Amatriain, A. Jaimes, N. Oliver, and J. M. Pujol. Data mining methods for recommender systems. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 39–71. Springer US, 2011.

[3] A. Barabasi and R. Albert. Emergence of scaling in random networks. Science (New York, N.Y.), 286(5439):509–512, 1999.

[4] A. Z. Broder. Computational advertising and recommender systems. In Proceedings of the 2008 ACM conference on Recommender systems, RecSys '08, pages 1–2. ACM, 2008.

[5] S. Chojnacki, K. Ciesielski, and M. Kłopotek. Node degree distribution in affiliation graphs for social network density modeling. In L. Bolc, M. Makowski, and A. Wierzbicki, editors, Social Informatics, volume 6430 of Lecture Notes in Computer Science, pages 51–61. Springer Berlin / Heidelberg, 2010.

[6] S. Chojnacki, D. Czerski, and M. Kłopotek. Optimization of tag recommender systems in a real life setting. In 3rd Conference on Human System Interaction, pages 107–112, 2010.

[7] S. Chojnacki and M. Klopotek. Random graphs for bipartite networks modeling. Journal of Control and Cybernetics, 40(3), 2011. in print.

[8] S. Chojnacki and M. A. Klopotek. Random graphs for performance evaluation of recommender systems. Journal of Control and Cybernetics, 40(2), 2011. in print.

[9] A. S. Das, M. Datar, A. Garg, and S. Rajaram. Google news personalization: scalable online collaborative filtering. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 271–280. ACM, 2007.

[10] C. Desrosiers and G. Karypis. A comprehensive survey of neighborhood-based recommendation methods. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 107–144. Springer US, 2011.

[11] P. Erdos and A. Renyi. On the evolution of random graphs. In Publication of the Mathematical Institute of the Hungarian Academy of Sciences, pages 17–61, 1960.

[12] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten. The WEKA data mining software: an update. SIGKDD Explorations, (1):10–18.

[13] J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl. An algorithmic framework for performing collaborative filtering. In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 230–237. ACM Press, 1999.

[14] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl. Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst., 22:5–53, January 2004.

[15] M. Jahrer, A. Toscher, and R. Legenstein. Combining predictions for accurate recommender systems. In KDD '10, pages 693–702. ACM, 2010.

[16] R. Jaschke, F. Eisterlehner, A. Hotho, and G. Stumme. Testing and evaluating tag recommenders in a live system. In D. Benz and F. Janssen, editors, Workshop on Knowledge Discovery, Data Mining, and Machine Learning, pages 44–51, 2009.

[17] Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proc. Int. Conf. on Knowledge Discovery and Data Mining, pages 426–434, 2008.

[18] M. Lipczak, Y. Hu, Y. Kollet, and E. Milios. Tag sources for recommendation in collaborative tagging systems. In Proceedings of the ECML/PKDD 2009 Discovery Challenge Workshop, 2009.

[19] M. Lipczak and E. Milios. Learning in efficient tag recommendation. In RecSys '10: Proc. the 4th ACM Conference on Recommender Systems, pages 167–174. ACM, 2010.

[20] M. Newman, S. Strogatz, and D. J. Watts. Random graphs with arbitrary degree distributions and their applications. 64(026118), July 2001.

[21] S. Owen, R. Anil, T. Dunning, and E. Friedman. Mahout in action (MEAP). Manning, 2011.

[22] S. Rendle and L. Schmidt-thieme. Factor models for tag recommendation in bibsonomy. In Proceedings of the ECML/PKDD 2009 Discovery Challenge Workshop, 2009.

[23] R. Salakhutdinov, A. Mnih, and G. Hinton. Restricted boltzmann machines for collaborative filtering. In Proceedings of the 24th International Conference on Machine Learning. International Conference on Machine Learning, 2007.

[24] B. M. Sarwar, G. Karypis, J. A. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In WWW, pages 285–295, 2001.

[25] G. Shani and A. Gunawardana. Evaluating recommendation systems. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 257–297. Springer US, 2011.
