Cluster Comput (2010) 13: 213–239
DOI 10.1007/s10586-009-0113-z

A context-aware personalized resource recommendation for pervasive learning

Junzhou Luo · Fang Dong · Jiuxin Cao · Aibo Song

Received: 26 May 2009 / Accepted: 5 November 2009 / Published online: 21 November 2009
© Springer Science+Business Media, LLC 2009

Abstract As it is difficult for learners to discover and obtain the most appropriate resources from massive education resources with the traditional keyword searching method, context-aware resource recommendation services have become a significant part of pervasive learning environments. At present, recommendation mechanisms are widely used in the e-commerce field, where content-based or collaborative-based filtering strategies are usually considered separately. However, these existing recommendation mechanisms neglect the dynamic interests and preferences of learners, the access patterns and the other attributes of pervasive learning environments (such as multi-mode connecting and resource distribution). Thus, these mechanisms cannot effectively reflect learners’ actual preferences and cannot adapt well to pervasive learning environments. To address these problems, a context-aware resource recommendation model and a relevant recommendation algorithm for pervasive learning environments are proposed. Therein, taking into account the relevant contextual information, the calculation of the relevant degree between learners and resources is divided into two main parts: logic-based RRD (resource relevant degree) and situation-based RRD. In the first part, content-based and collaborative-based recommendation mechanisms are combined, and the individual preference tree (IPT) is introduced to take into account the multi-dimensional attributes of resources, learners’ rating matrix and the energy of access preference. Meanwhile, the learners’ historical sequential patterns of resource accessing are also considered to further improve the accuracy of recommendation. In the second part, in order to enhance the validity of recommendation, the connecting type relevance and time satisfaction degree are calculated according to other relevant contexts. Then, the candidate resources can be filtered and sorted by combining these two parts to generate Top-N recommendation results. The simulations show that our newly proposed method outperforms other state-of-the-art algorithms on traditional and newly presented metrics, and it may also be more suitable for pervasive learning environments. Finally, a prototype system is implemented based on SEU-ESP to further demonstrate the relevant recommendation process.

J. Luo · F. Dong (✉) · J. Cao · A. Song
School of Computer Science and Engineering, Southeast University, Nanjing, P.R. China
e-mail: fdong@seu.edu.cn

J. Luo
e-mail: jluo@seu.edu.cn

J. Cao
e-mail: jx.cao@seu.edu.cn

A. Song
e-mail: absong@seu.edu.cn

Keywords Pervasive learning · Recommendation · Dynamic preference weight · Individual preference tree · Logic-based RRD · Situation-based RRD

1 Introduction

Nowadays, in the light of pervasive computing technology [1], the traditional desktop-computer-supported learning model has been changing conspicuously. Integrated with context-aware computing technology [2–7], a learner can acquire interesting education resources seamlessly from a massive education resource pool not only by desktop computers at designated places, but also by various wireless handheld devices such as PDAs and mobile phones, in the right place and at the right time, based on the learner’s surrounding context, such as when and where the learners are (time and space), what learning resources are available for learners, what learners’ access preferences are, and who the suitable learning collaborators are. This next-generation e-learning mode is called Pervasive Learning (P-Learning) [8].

In the traditional e-learning model, learners usually obtain certain education resources via a simple keyword-based searching technique. Nevertheless, as the massive education resources are distributed over a wide area network, it is very difficult for learners to discover and acquire the most interesting resources with this simple method in P-Learning environments. Thus, in order to deliver the right resources at the right place in the right way without the learner’s explicit intervention, it is important in P-Learning environments to provide a context-aware resource recommendation service to learners, in which certain education resources can be filtered from the massive education resource pool and recommended to target learners automatically in terms of dynamic implicit contextual information.

As P-Learning is a relatively new concept, only a few works [8–17] have concentrated on this field in recent years. They mainly address the modeling of context-aware P-Learning systems from an overall point of view and the implementation of several real-world projects, such as TANGO [9], PERKAM [9], ULE [11] and JAPELAS [12]. However, most of these works only consider providing resources to learners according to their explicit access requests and do not take into account a resource recommendation mechanism based on learners’ contextual information. Thus, it is very difficult to provide the most proper resources to each learner at the right place in the right way, and the learning efficiency will be greatly restricted.

On the other hand, with the rapid development of the Internet, resource recommendation mechanisms are widely used in e-commerce systems [18], such as Amazon and eBay. According to the recommendation technique used, traditional recommendation algorithms can be mainly divided into two types: content-filtering-based recommendation [19–22] and collaborative-filtering-based recommendation [23–29]. Nevertheless, as the amount of education resources is extremely large and the learners’ preferences and relevant contextual information change dynamically, there are several drawbacks when applying existing recommendation algorithms to P-Learning environments directly:

(1) In existing recommendation algorithms, because learners’ individual preference information is considered alone, only resources which are similar to learners’ historical preferences can be recommended by content-filtering-based recommendation algorithms. (For convenience of discussion, we use the terms ‘learner’ and ‘user’ interchangeably, as the latter appears in the literature.) Besides, because they only compare the similarity between learners’ rating information while neglecting the content-based relativity between resources, collaborative-based recommendation algorithms also lead to a less accurate similarity result. Therefore, they are not suitable for situations in which the number of learners is quite small.

(2) Learners’ interests change with the lapse of time. However, as existing recommendation strategies do not take into consideration the access time of historical records, they cannot respond in time when learners’ current interests change (for example, when a learner has recently accessed some education resources). Thus, there will be a great difference between the recommended resources and learners’ actual interests.

(3) Owing to the repeatability and periodicity of the learning process, there are dependence relationships among learners’ historical access records which can reflect resource access patterns and learners’ latent preference information. Unfortunately, existing content-filtering-based and collaborative-filtering-based recommendation systems always neglect this useful information.

(4) As P-Learning environments can support multi-mode connecting (such as via PC or PDA), the education resources can also be divided into corresponding categories (PC version, PDA version and so on) to suit different kinds of terminal equipment. Therefore, the recommender system needs to choose the proper kind of resources according to learners’ current contexts (such as connecting mode). Besides, as the massive resources are stored in distributed P-Learning environments, accessing different resources may cost each learner different response times. To avoid excessive access response time (even beyond learners’ tolerance range), the recommender system needs to take this characteristic into account when making recommendations. However, current strategies only take into account the logic-based relevant degree between a resource’s contents or attributes and each learner while ignoring the situation-based relevant degree. Therefore, these strategies cannot adapt well to P-Learning environments.

In this paper, to address the drawbacks of present P-Learning systems and existing resource recommendation algorithms mentioned above, a Context-aware Resource Recommendation model and a relevant recommendation algorithm for P-Learning environments, called PL-CR2, are proposed. Therein, based on the relevant contextual information (such as target learners’ historical access records, location information and multi-dimensional attributes of education resources), the calculation of the relevant degree between learners and resources is divided into two main parts: logic-based Resource Relevant Degree (RRD) and situation-based RRD. In the first part, the content-based and collaborative-based recommendation mechanisms are combined to benefit from their respective advantages. In order to reflect learners’ complete spectrum of interests, the individual preference tree (IPT) is introduced to take into account resources’ multi-dimensional attributes, learners’ rating information and time-related preference energy. Meanwhile, learners’ historical access sequential patterns are also considered so as to further improve the accuracy of recommendation. In the second part, in order to enhance the validity of recommendation, based on the characteristics of P-Learning, the connecting type relevance and time satisfaction degree of a certain resource are calculated in terms of learners’ other dynamic context information (such as connecting mode and bandwidth). Finally, the massive resources are filtered and ranked by combining the two kinds of RRD, and the Top-N recommendation results are provided to the target learner. Based on these mechanisms, the learner’s real learning demands can be satisfied accurately and effectively according to the real-time updated contextual information. The simulation results show that our algorithm surpasses traditional recommendation algorithms significantly in precision, integrated utility and validity metrics and could be much more suitable for P-Learning environments. Moreover, a prototype system of our recommendation model and algorithm based on SEU-ESP (SouthEast University digital Education public Services supporting Platform) has been developed to demonstrate that our proposal could help learners obtain the education resources they are concerned with according to the relevant contextual information.

The remainder of this paper is organized as follows. In the next section, we review previous work on P-Learning environments and resource recommendation mechanisms. The context-aware resource recommendation model is presented in Sect. 3, and Sect. 4 elaborates and analyzes our resource recommendation algorithm in detail. Section 5 presents the simulation results and performance analysis. In Sect. 6, we introduce the implementation of our recommendation model and algorithm based on SEU-ESP, and then describe a use case to demonstrate how our proposal can help learners in the learning process. The summary of our research and the direction of future work are given in Sect. 7.

2 Related works

According to the objective of P-Learning, self-adaptability is one of the most important features, which means that learners may have the capability to carry their own learning environment and the right information at the right place in the right way. To achieve this goal, context-aware technology needs to be taken into account. The context-aware feature of P-Learning environments allows the system to understand the learner’s behavior and environmental parameters in the real world in a timely manner, such as the locations and behavior of learners and the network bandwidth of the learning environment [8].

As an important aspect of the pervasive computing field, context-aware technology has received wide attention for several years. The typical definition of context is as follows: context refers to any information that can be used to characterize the situation of an entity, where an entity can be a person, a place, or a physical or computational object [2]. In order to specify this concept, Schilit [2] divides context into three categories, Computing context, User context and Physical context, while Ryan et al. [7] refer to context as the user’s location, environment, identity and time. Based on them, there have been many research efforts on the development of context-aware toolkits, such as Hewlett-Packard’s Cooltown project [3], Dey’s Context Toolkit [4], Mostefaoui et al.’s CB-SeC framework [5] and Roman et al.’s Gaia middleware [6]. However, due to the characteristics of the learning process, the context-aware mechanism in P-Learning environments should place much more emphasis on offering adaptive support to learners by taking into account their learning behaviors and providing personalized instructions or hints. Thus, to meet these new requirements, the categories of context should be redefined. Stephen Yang et al. [10] define context from two perspectives: the learner context, which includes the surrounding environment affecting learners’ Web content discovery and access, and the content context, which includes the surrounding environment affecting Web content delivery and presentation. Tzu-Chi Yang et al. [13] define five kinds of context: Personal context, Environmental context, Feedback, Personal information and learning portfolio, and Environmental information. Based on them, several context-aware P-Learning projects have been developed in recent years. For example, Ogata et al. [12] presented a context-aware language-learning support system for learning Japanese polite expressions, called JAPELAS. The system can provide learners with appropriate expressions according to different contexts (e.g., occasions or locations) via mobile devices (e.g., PDAs). Rogers et al. [16] integrated the learning experiences of indoor and outdoor activities by observation in the working scene. Therein, learners are not only capable of getting data, voice and images from the scene by observation, but also of gathering related information from learning activities via wireless networks. Joiner et al. [17] presented their studies of applying context-aware devices to education by providing students with timely vocal statements related to specific activities in real conditions. Recently, El-Bishouty et al. introduced a P-Learning environment based on knowledge awareness mapping, PERKAM [9]. It utilizes RFID technology to detect the learner’s environmental objects and location, and then education resources and helpers are provided to the active learner in accordance with the detected objects and the current location.

However, from the above introduction we can see that current P-Learning works mainly pay attention to learners’ contextual information. With the development of SOA technology, P-Learning systems are usually based on an SOA architecture. As education resources are transparent to high-level learners, any learner operation (such as resource recommendation) should be achieved by calling a variety of relevant support services (such as a resource recommendation service). Thus, more relevant contextual information, such as the context of services and education resources, should also be taken into account. Besides, from the existing context definitions mentioned above, we can see that these works usually neglect learners’ historical learning (resource access) process, which can provide important personal information. Meanwhile, the previous systems are mainly based on simple information matching and resource selection according to certain contexts while neglecting a context-aware resource recommendation mechanism that uses learners’ historical learning information. Therefore, learners cannot obtain the most interesting education resources from the massive resource pool effectively, and the quality and efficiency of the learning process will be seriously affected.

On the other hand, the explosive growth of the WWW and the emergence of e-commerce have led to the rapid development of recommender systems, which use personalized information filtering technology either to predict whether a particular user will like a particular resource (the prediction problem) or to identify a set of N resources that will be of interest to a certain user (the Top-N recommendation problem) [18]. In recent years, various resource recommendation algorithms have been developed, which can be mainly divided into two categories: content-filtering-based recommendation and collaborative-filtering-based recommendation.

Content-based recommendation algorithms usually rank and recommend resources by calculating the relevant degree between a resource and the target user’s preference, and only the resources that have a high degree of similarity to the user’s preferences are recommended. For example, Mostafa et al. proposed a content-based filtering system called SIFTER [19]. This system can filter information based on content and a user’s specific interests with the vector-space model. Liren Chen et al. present an agent called WebMate that helps users to effectively browse and search the Web [21]. It uses multiple TF-IDF vectors to keep track of user interests in different domains. These domains are automatically learned by WebMate. During search, the user can be provided with multiple pages as similarity/relevance guidance for the search in terms of relevant keyword vectors. However, content-based recommendation algorithms cannot recommend new types of resources to users, and cannot reflect the dynamic changing process of a user’s preference either. Therefore, the precision and diversity of recommendation are not guaranteed.
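The vector-space idea behind content-based ranking can be sketched in a few lines. This is an illustration only and not the method of SIFTER or WebMate: the function names `cosine` and `rank_by_content` are hypothetical, and raw term counts stand in for TF-IDF weights to keep the example short.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_by_content(profile_terms, resources):
    """Rank resources by similarity to a user profile.

    profile_terms: list of terms describing the user's interests (assumed input).
    resources: dict mapping a resource name to its list of terms (assumed input).
    """
    profile = Counter(profile_terms)
    scored = [(cosine(profile, Counter(terms)), name) for name, terms in resources.items()]
    # Highest similarity first; ties are broken by name for a stable order.
    return [name for _, name in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

A resource that shares no terms with the profile always scores zero, which reflects the limitation noted above: such a ranker cannot surface new types of resources.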

The collaborative-based recommendation technique is achieving widespread success in the resource recommendation field. Its main concept is to find the k neighbors of the active user with the most similar interests by searching the user-item (resource) rating matrix, then to collect the candidate resources which have already been accessed by the k neighbors but not by the target user, and to predict the ratings of the candidate resources for the target user to obtain Top-N recommendation results. With the increase of resources, a pure recommendation mechanism will face the rating sparsity problem [18], and several improved methods have been proposed. In Wang’s study [23], the collaborative-filtering-based recommendation problem is reformulated in a generative probabilistic framework. In order to be more robust against data sparsity and give better recommendations, the final rating is estimated by fusing predictions from three sources: predictions based on ratings of the same item by other users, predictions based on different item ratings made by the same user, and ratings predicted based on data from other but similar users who have rated other but similar items. Ma et al. [24] use an enhanced Pearson Correlation Coefficient (PCC) based algorithm when computing the similarity of users or items. They propose an effective missing data prediction algorithm, in which information on both users and items is taken into account. The missing data of the user-item matrix is then predicted if and only if it will have a positive influence on the recommendation for active users. Although unifying the item-based and user-based similarity calculation may alleviate the rating sparsity problem to some extent, this problem cannot be solved thoroughly because these improved algorithms bring extra inaccuracy and add complexity when calculating similarity and predicting ratings. Besides, these collaborative-based algorithms only take into consideration the user-item matrix when computing similarity, while the users’ preferences for a resource’s content are always neglected. For this reason, in order to combine their individual advantages and obtain a better recommendation result, a few works combine content and collaborative methods into a hybrid approach. For example, Claypool et al. [30] combine the outputs (ratings) obtained from the individual recommendation techniques (content and collaborative) into one final recommendation using a linear combination of ratings. Billsus et al. [31] use one of the individual recommender systems at any given moment, choosing the one that is “better” than the others based on some recommendation quality metric. However, because they only consider a simple combination of content-based and collaborative-based strategies, these hybrid strategies cannot make use of information on resource attributes, the user’s resource access time, ratings and historical access patterns, and thus they cannot obtain an accurate recommendation result either.
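The neighbor-based procedure described above (find k similar users, collect what they accessed and the target user did not, predict ratings, return the Top-N) can be sketched as follows. This is a generic illustration and not the algorithm of any cited work: the nested-dictionary rating format, the function names `pearson` and `top_n`, and the similarity-weighted average used for prediction are all assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the items rated by both users."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def top_n(ratings, target, k=2, n=2):
    """Recommend up to n items for `target` from its k most similar users.

    ratings: dict user -> dict item -> rating (assumed input format).
    """
    sims = sorted(
        ((pearson(ratings[target], r), u) for u, r in ratings.items() if u != target),
        reverse=True,
    )
    # Keep only positively correlated neighbors among the k most similar users.
    neighbours = [(s, u) for s, u in sims[:k] if s > 0]
    scores, weights = {}, {}
    for s, u in neighbours:
        for item, rating in ratings[u].items():
            if item in ratings[target]:
                continue  # the target user has already accessed this item
            scores[item] = scores.get(item, 0.0) + s * rating
            weights[item] = weights.get(item, 0.0) + s
    predicted = {i: scores[i] / weights[i] for i in scores}
    return sorted(predicted, key=lambda i: (-predicted[i], i))[:n]
```

With very few users or very few co-rated items, `pearson` returns 0 for most pairs and no neighbors survive the filter, which is the sparsity problem the text refers to.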

In addition, the resource recommendation results could be affected by learners’ dynamic contextual information in P-Learning environments, such as the learner’s connecting mode and the response time of resource access. But existing recommendation algorithms only take into account the logic relationship between resources and learners; therefore, they cannot adapt to P-Learning environments very well.

In summary, in order to improve learning efficiency, a context-aware resource recommendation model for P-Learning environments is needed. At the same time, as the requirements of resource recommendation in P-Learning environments cannot be satisfied by existing algorithms, there is a strong need for a recommendation algorithm which can combine the characteristics of a large number of resources, dynamic learner preferences and relevant context information to support highly efficient resource access and learning.

3 Context-aware resource recommendation model in P-Learning environments

3.1 Context modeling in P-Learning

In P-Learning environments, in order to fully reflect the context-aware resource access and recommendation process, three types of contexts are defined in this paper to describe the learner, the support service and the education resource.

Learner context (L-Context) includes six main aspects of information: Connect Device, Learner Environment, Learner Profile, Learner Preference, Learning Process and Access Request. Therein, Connect Device represents the type of connection device. Learner Environment contains the learner’s location information. Learner Profile contains personal information such as learner name, learner ID and affiliation. Learner Preference reflects the resources which a certain learner is interested in. Learning Process represents the learner’s historical learning process (resource access) records. Access Request contains the learner’s request for massive education resources.

Service context (S-Context) includes three main aspects of information: Service Environment, Service Profile and Service QoS. Therein, Service Environment indicates the service’s location. Service Profile contains the name, input parameters, output parameters and function description of a certain support service. Service QoS represents the QoS parameters provided by support services, including workload, reputation, availability, security and so on.

The main information of Resource context (R-Context) is based on the China E-Learning Technology Standard (CELTS) [32], which is defined by the China E-Learning Technology Standardization Committee, and on several dynamic features of these education resources. The detailed description of the three contexts is given in Fig. 1.

Fig. 1 The description of three kinds of contexts
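As a rough illustration of how the learner and service contexts described above might be held in memory, the sketch below mirrors the aspects named in the text as dataclass fields. The class names, field names and all field types are assumptions made for this example; the text only names the aspects, and R-Context is omitted because its fields are given in Fig. 1 rather than in the prose.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LearnerContext:
    """L-Context: the six aspects listed in the text (types are assumed)."""
    connect_device: str                 # type of connection device, e.g. "PC" or "PDA"
    learner_environment: str            # learner's location information
    learner_profile: Dict[str, str]     # e.g. name, ID, affiliation
    learner_preference: List[str] = field(default_factory=list)  # resources of interest
    learning_process: List[str] = field(default_factory=list)    # historical access records
    access_request: str = ""            # current request for education resources


@dataclass
class ServiceContext:
    """S-Context: the three aspects listed in the text (types are assumed)."""
    service_environment: str            # service location
    service_profile: Dict[str, str]     # name, inputs, outputs, function description
    service_qos: Dict[str, float] = field(default_factory=dict)  # e.g. workload, availability
```

Using `default_factory` gives every learner an independent access history, so appending a record for one learner does not leak into another learner's context.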

3.2 Context-aware based resource recommendation model

Based on the definition of the three contexts mentioned above, a novel context-aware resource recommendation model for P-Learning environments is introduced, which consists of five main modules: the context sensor module, context management module, context warehouse module, support service selection module and resource recommendation service module, as shown in Fig. 2.

The context sensor module has three interfaces which are used to collect dynamic activity data from learners, support services and education resources according to the predefined context structure. In a real system, in order to achieve this goal, this module must be deployed at the learner’s connection devices, the support service container and the local resource storage pool respectively. Additionally, because of the variety of types of sensory data, this module must convert the sensory data to normalized forms in order to support the integration of high-level contexts.

The context management module consists of three sub-modules: basic context retrieval (CR), context aggregation (CA) and context filtering (CF). Therein, CR collects the preliminary context information for a given time (ΔT) from a set of appropriate context sensors connected to the context management module. CA is in charge of integrating the relevant sensed contexts into a standardized form based on the predefined context structure. CF eliminates redundant or useless information received from CA and chooses the essential part of a certain context to be updated in the context warehouse module, which is used to store the contexts.

Fig. 2 The architecture of multi-dimensional context-aware resource recommendation model

The context warehouse module is responsible for storing different kinds of contexts. This module consists of three sub-modules, L-Context storage, S-Context storage and R-Context storage, which provide contextual information to the resource recommendation module.

The support service selection module is in charge of choosing the most proper support service (such as the resource recommendation service) from the service container according to learners’ requests and dynamic contextual information.

The resource recommendation service module, which is deployed in the support service container, is the core component of our context-aware resource recommendation model. The main part of this module is the context-aware resource recommendation algorithm, which is responsible for calculating and offering a ranked list of relevant education resources to the target learner according to the context information of the particular learner and the education resources.

In the resource recommendation process, the system first obtains the target learner’s relevant contextual information, such as the support service request, historical resource access records and location. Then, the support service selection module runs the service matching process according to the relevant contexts and picks the most proper support service (assumed to be the resource recommendation service in this paper). Next, the resource recommendation service calls the recommendation algorithm, taking into consideration the relevant contexts of learners and education resources. Finally, the recommendation results are provided to the target learner automatically. Therein, when learners access some resources or other contextual information changes, the new contexts can be updated in time to guarantee the availability of the system.

Due to space limitations, this paper focuses only on the context-aware resource recommendation algorithm, which belongs to the resource recommendation service module.

4 Context-aware based resource recommendation algorithm in P-Learning environments

Resource recommendation is the core component of the resource access process in P-Learning environments. Therefore, in this paper, to address the relevant drawbacks of existing recommendation algorithms, a Context-aware Resource Recommendation algorithm for Pervasive Learning, called PL-CR2, is proposed based on the recommendation model presented in Sect. 3.

In order to adapt to the P-Learning environment, several kinds of contextual information of the learner and the education resource are taken into consideration. Therein, the PL-CR2 algorithm can be mainly divided into two steps: logic-based RRD calculation and situation-based RRD calculation. In logic-based RRD calculation, the content-based and collaborative-based recommendation mechanisms are combined by integrating the resources’ multi-dimensional attributes, learners’ rating information, time-related preference energy and historical access sequential patterns. In situation-based RRD calculation, based on the characteristics of P-Learning, the connecting type relevance and time satisfaction degree of a certain resource are calculated in terms of the learner’s environment context (such as connecting mode and bandwidth) to improve the validity of recommendation. Finally, the massive resources are filtered and ranked by combining these two kinds of RRD, and the Top-N recommendation result is obtained.

4.1 Logic-based Resource Relevant Degree Calculation

Logic-based RRD calculation is mainly based on analyzing the learner’s historical resource access records in the learner context, and includes individual preference based RRD calculation, collaborative preference based RRD calculation and sequential pattern based RRD calculation.

4.1.1 Individual preference based RRD calculation

The critical task of individual preference RRD calculation is how to model the learner’s preferences and compute the RRD between massive candidate resources and the active learner. At present, most of the existing learner preference description methods are based on the vector-space model. Therein, resources are divided into several categories according to the domain to which a resource belongs, such as literature, mathematics and computer science. The weight value of a certain domain for a particular learner, which denotes the importance of this domain to this learner, is then computed according to the number of the learner’s accessed resources which belong to the corresponding domain. However, based on the R-Context definition, education resources usually have several kinds of attributes (not only subject but also secondary subject, education type, institution and so on). Therefore, in order to reflect the learner’s preference accurately, the multi-dimensional attributes of education resources should be taken into account.

Definition 1 According to the attention-degree of learners to each dimensional attribute of a resource (suppose that all learners have the same attention-degree to one attribute), the resource attribute description model can be defined as a multi-dimensional attribute vector $C = \langle (CK_1, CW_1), (CK_2, CW_2), \ldots, (CK_m, CW_m) \rangle$, where $CK_t$ denotes the name of the $t$-th dimensional attribute of a resource, $CW_t$ denotes the relevant weight value, $CW_1 \ge CW_2 \ge \cdots \ge CW_m$ and $\sum_{t=1}^{m} CW_t = 1$. Based on this description model, the attributes of a certain resource $R_j$ can be defined as $RA_j = \langle ck_1, ck_2, \ldots, ck_m \rangle$, where $ck_t$ denotes the keyword of the $t$-th dimensional attribute of resource $R_j$. For example, $C = \langle (\text{Subject}, 0.4), (\text{Secondary Subject}, 0.3), (\text{Education Type}, 0.2), (\text{Institution}, 0.1) \rangle$ and $RA_j = \langle \text{Computer Science}, \text{Data Structure}, \text{Master Degree (M.D)}, \text{Southeast University (SEU)} \rangle$.
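The attribute description model in Definition 1 maps naturally onto a list of (name, weight) pairs plus one keyword tuple per resource. The sketch below encodes the worked example and checks the two constraints on the weights. The function `weighted_overlap` is a hypothetical illustration of how such weights could be used to compare two resources; it is not the RRD formula of the paper.

```python
import math

# Attribute vector C from the worked example: (attribute name CK_t, weight CW_t).
SCHEMA = [
    ("Subject", 0.4),
    ("Secondary Subject", 0.3),
    ("Education Type", 0.2),
    ("Institution", 0.1),
]

# RA_j for the example resource: one keyword ck_t per attribute dimension.
RA_J = ("Computer Science", "Data Structure", "M.D", "SEU")


def validate_schema(schema):
    """Check CW_1 >= CW_2 >= ... >= CW_m and that the weights sum to 1."""
    weights = [w for _, w in schema]
    non_increasing = all(a >= b for a, b in zip(weights, weights[1:]))
    return non_increasing and math.isclose(sum(weights), 1.0)


def weighted_overlap(schema, ra_a, ra_b):
    """Hypothetical helper: total weight of the dimensions whose keywords match."""
    if not (len(schema) == len(ra_a) == len(ra_b)):
        raise ValueError("attribute tuples must match the schema length")
    return sum(w for (_, w), x, y in zip(schema, ra_a, ra_b) if x == y)
```

For instance, a second resource that differs from `RA_J` only in its secondary subject would match on Subject, Education Type and Institution, giving an overlap of about 0.7 under these example weights.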

In addition, in P-Learning environments, a learner’s recent resource access preference usually plays an important role in future interests. However, current vector-space-based preference modeling methods always treat all accessed resources equally, and the dynamic changes of the learner’s preference with the passage of time are always neglected. Thus, the learner’s interests and preference cannot be measured accurately. Herein, the concept of preference energy (PE) is introduced in order to reflect the learner’s dynamic interests and preference more accurately.

Definition 2 Preference Energy (PE): for a resource $R_j$ accessed by learner $U_i$, the preference energy of $R_j$, denoted $PE_{i,j}$, is the preference degree of $U_i$ to $R_j$ at the current time. As the effect of $R_j$ on $U_i$'s future interest becomes smaller as the resource access process goes on, $PE_{i,j}$ should be attenuated gradually.

Rule 1 The design rule of the preference energy attenuation function can be defined as follows:

(1) The resources a learner will be interested in are likely to be similar to recently accessed resources, thus recently accessed resources should be endowed with a larger PE;

(2) In order to avoid the effect of an occasional resource access on the long-term access pattern, the PE of a recently accessed resource should not stay at a large value for a long time;

(3) The previous long-term preference of the learner (continuous access to resources of the same type) should still affect the resource recommendation.

According to Rule 1, a negative exponential function is used as the preference energy attenuation function:

$$PE_{attenuation}(x) = e^{-\lambda (x-1)}, \quad x \ge 1 \tag{1}$$

Therein, $x$ is the access order of the resource in learner $U_i$'s access history, $\lambda \in (0,1)$ is the attenuation parameter, and the attenuation rate of the PE value can be adjusted by changing $\lambda$: the larger the $\lambda$, the faster the PE is attenuated. $PE_{attenuation}$ with $\lambda = 0.1$ is shown in Fig. 3. In region a, the effect of occasional accesses is excluded rapidly; meanwhile, in region b, the long-term preference can be further emphasized when the relevant access order is large.

Based on formula (1), the PE value of the latest accessed resource is equal to 1 ($x = 1$), and as further accesses occur, the PE values of the previously accessed resources are updated.
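Formula (1) can be illustrated with a minimal Python sketch. The function names, the oldest-first history list and the handling of re-accessed resources (the PE of the most recent access is kept) are assumptions made for illustration and are not specified in the paper.

```python
import math

def pe_attenuation(x: int, lam: float = 0.1) -> float:
    """Formula (1): PE of the resource at access order x (x = 1 is the latest access)."""
    if x < 1:
        raise ValueError("access order x must be >= 1")
    if not 0.0 < lam < 1.0:
        raise ValueError("lambda must lie in (0, 1)")
    return math.exp(-lam * (x - 1))

def preference_energies(access_history, lam=0.1):
    """Map each resource ID to its current PE; access_history is ordered oldest first.
    Assumption: a re-accessed resource keeps the PE of its most recent access."""
    pe = {}
    for order, rid in enumerate(reversed(access_history), start=1):
        pe.setdefault(rid, pe_attenuation(order, lam))
    return pe
```

With the history `["R1", "R2", "R3"]`, the latest resource `R3` receives PE 1, and `R2` and `R1` receive progressively smaller values.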

According to Definitions 1 and 2, the individual preference tree (IPT) is introduced. On the basis of the multi-dimensional attributes of accessed resources, the learner's rating information and the PE values of the accessed resources are combined into a three-dimensional information model of the learner's preference.


Fig. 3 Preference energy attenuation function PEattenuation (λ = 0.1)

Fig. 4 Individual preference tree based on a four-dimensional attributes description model

Definition 3 Individual Preference Tree (IPT): According to the resource attributes description model $C$, learner $U_i$'s IPT can be defined as a tree of depth $m + 1$ (levels start from 0), in which $m$ denotes the size of $C$. The root node is denoted $IPT_{root}(U_i)$. A leaf node, which represents a resource accessed by $U_i$, can be defined as a five-tuple $IPT_{leaf} = \{RID, PE, PW, RG, level\}$, where $RID$ denotes the resource ID, $PE$ denotes the current preference energy of the accessed resource, $PW$ denotes the preference weight of the resource (the normalization of $PE$), $RG$ denotes the rating of $U_i$ to the resource (the scope is 1–5 in this paper), and $level$ denotes the layer number of this node. A non-leaf node can be defined as a four-tuple $IPT_{normal} = \{ck, PW, RG, level\}$, where $ck$ is the keyword of the $level$-th attribute of $C$ (i.e., of $CK_{level}$). An individual preference tree based on a four-dimensional attributes description model is shown in Fig. 4. Additional definitions are as follows:

(1) The PW of a non-leaf node $k$ is defined as the ratio of the sum of the PE values of all leaf nodes in $k$'s sub-tree to the sum of the PE values of all leaf nodes in $IPT(U_i)$. The calculation formula is as follows, where $\sigma(k,p)$ is a 0–1 function which denotes whether $k$ is a predecessor of leaf node $p$, and $PE_p$ denotes the preference energy of resource $R_p$. When $k$ is a leaf node, the numerator of this formula simplifies to $PE_k$.

$$PW_k = \frac{\sum_{p} \left( PE_p \cdot \sigma(k,p) \right)}{\sum_{p} PE_p} \tag{2}$$

(2) The RG of a non-leaf node $k$ is defined as the mean of the RG values of all leaf nodes in $k$'s sub-tree. The calculation formula is as follows, where $RG_p$ denotes the rating of $U_i$ to $R_p$.

$$RG_k = \frac{\sum_{p} \left( RG_p \cdot \sigma(k,p) \right)}{\sum_{p} \sigma(k,p)} \tag{3}$$

In a learner's IPT, each accessed resource corresponds to a unique path from the root to the relevant leaf node, and the keywords of all nodes on this path correspond to the attribute keywords of that resource. Moreover:

Property 1 The PW value of an $h$-th level node ($0 < h < m + 1$) is the sum of the PW values of all its immediate successors, which are located at the $(h + 1)$-th level.

Property 2 The keywords of the immediate successors of an $h$-th level node ($0 \le h < m + 1$) follow lexicographical order from left to right.

Based on Definitions 2 and 3, the update strategy of the IPT can be described as follows:

(1) Search for the attribute keywords of the latest accessed resource $R_j$ in $IPT(U_i)$ from top to bottom. If the keyword of the $s$-th attribute cannot be matched, $m - s + 2$ new nodes ($m - s + 1$ non-leaf nodes and 1 leaf node) holding the last $m - s + 1$ attributes of the resource are created at the proper position;

(2) Update the PE and PW of all leaf nodes according to formulas (1) and (2);

(3) Calculate the PW of all nodes from bottom to top according to Property 1;

(4) Calculate the RG of all predecessors of the new node according to formula (3).
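The update strategy above can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: the `Node` class and function names are hypothetical, PE values are passed in already attenuated (formula (1) is not re-applied here), children are kept in a dictionary and visited in sorted order to mimic Property 2, and PW and RG are recomputed for the whole tree instead of only for the predecessors of the new node.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    key: str                      # ck for non-leaf nodes, RID for leaves
    level: int
    is_leaf: bool = False
    pe: float = 0.0               # preference energy (leaves only)
    pw: float = 0.0               # preference weight
    rg: float = 0.0               # rating
    children: dict = field(default_factory=dict)

def insert_resource(root, keywords, rid, pe, rating):
    """Update step (1): follow the keywords from the root, creating missing nodes."""
    node = root
    for ck in keywords:
        if ck not in node.children:
            node.children[ck] = Node(ck, node.level + 1)
        node = node.children[ck]
    node.children[rid] = Node(rid, node.level + 1, is_leaf=True, pe=pe, rg=rating)

def leaf_pe_sum(node):
    """Sum of the PE values of all leaves below (or at) this node."""
    if node.is_leaf:
        return node.pe
    return sum(leaf_pe_sum(child) for child in node.children.values())

def refresh(node, total_pe):
    """Bottom-up pass returning (PE sum, RG sum, leaf count) of the sub-tree."""
    if node.is_leaf:
        node.pw = node.pe / total_pe
        return node.pe, node.rg, 1
    pe_sum, rg_sum, leaves = 0.0, 0.0, 0
    for key in sorted(node.children):              # Property 2: lexicographic order
        p, r, c = refresh(node.children[key], total_pe)
        pe_sum, rg_sum, leaves = pe_sum + p, rg_sum + r, leaves + c
    node.pw = pe_sum / total_pe                    # formula (2)
    node.rg = rg_sum / leaves if leaves else 0.0   # formula (3)
    return pe_sum, rg_sum, leaves
```

A tree is built by calling `insert_resource` on a root `Node("root", 0)` for each accessed resource and then `refresh(root, leaf_pe_sum(root))`; by construction the PW of every non-leaf node equals the sum of its children's PW, which is Property 1.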

On the basis of the above definitions, the individual preference based RRD (called IPR) between resource $R_j$ and learner $U_i$ can be reflected by the matching degree between $R_j$ and $IPT(U_i)$. The relevant calculation rule is defined as follows:

Rule 2 The rule of individual preference based RRD calculation can be described as:


(1) Following the attribute order of the resource attributes description model $C$, the more deeply the attributes of the resource match the node keywords in the IPT, the larger the IPR will be;

(2) The larger the preference weight of the matching nodes, the larger the IPR will be;

(3) The larger the rating of the matching nodes, the larger the IPR will be.

According to Rule 2, the matching path between a resource and the IPT should be found. The main process is as follows: a resource $R_j$ is matched against the $ck$ values of each level of the IPT from top to bottom, according to the keyword of each dimensional attribute of $R_j$. The longest path on which the keywords of all nodes are identical to the corresponding attribute keywords of the resource is taken as the matching path of $R_j$ and $IPT(U_i)$, denoted $MR(U_i, R_j)$. Let $t$ be the bottom node of $MR(U_i, R_j)$; the individual preference based RRD (IPR) between resource $R_j$ and learner $U_i$ can then be calculated by the following formula:

$$IPR(U_i, R_j) = \alpha \cdot Nor\left( \sum_{s \in MR(U_i, R_j)} MW_{s.level} \cdot s.PW \right) + (1 - \alpha) \cdot Nor(t.RG) \tag{4}$$

Therein, $\alpha$ denotes the weight between the preference weight information and the learner rating information; $Nor(x)$ is the normalization function; $MW_k$ is the matching weight of the $k$-th level of the IPT. As the PW value of nodes may decrease as the node depth increases, $MW_k$ should increase with depth in order to meet the first item of Rule 2. In this paper, it is defined as $MW_k = CW_k^{-1}$, and $MW_{m+1} = MW_m$ when the current matching node is a leaf node.

In order to improve the efficiency of IPR calculation, the resources can be preliminarily filtered by setting a matching depth threshold $\varphi$. Here, the candidate resource set $CS_i(U_i)$ of IPR is defined as the resources whose matching depth with $IPT(U_i)$ is at least $\varphi$.
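The matching-path computation of formula (4) can be sketched as below. Several points are assumptions for illustration only: the IPT is represented as nested dictionaries, matching is restricted to attribute keywords (the leaf level is omitted), and the two `Nor` functions, which the paper leaves unspecified, are taken to be division by the sum of all matching weights and a linear map of the 1–5 rating scale onto [0, 1].

```python
def ipr(tree, keywords, cw, alpha=0.5):
    """Return (IPR, matching depth) for one resource.

    tree:     {"pw": float, "rg": float, "children": {ck: node}} (hypothetical layout)
    keywords: attribute keywords ck_1..ck_m of the resource
    cw:       attribute weights CW_1..CW_m of the description model C
    """
    mw = [1.0 / w for w in cw]                 # MW_k = 1 / CW_k
    node, weighted_pw, depth = tree, 0.0, 0
    for level, ck in enumerate(keywords, start=1):
        child = node["children"].get(ck)
        if child is None:                      # longest matching path ends here
            break
        weighted_pw += mw[level - 1] * child["pw"]
        node, depth = child, level
    if depth == 0:
        return 0.0, 0
    nor_pw = weighted_pw / sum(mw)             # assumed normalization of the PW term
    nor_rg = (node["rg"] - 1.0) / 4.0          # assumed normalization of a 1-5 rating
    return alpha * nor_pw + (1 - alpha) * nor_rg, depth

def candidate_set(tree, resources, cw, phi):
    """Keep the resources whose matching depth is at least the threshold phi."""
    return [rid for rid, kws in resources.items() if ipr(tree, kws, cw)[1] >= phi]
```

Because the matching weights grow with depth, a resource that matches the tree more deeply accumulates a larger weighted PW term, in line with the first item of Rule 2.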

4.1.2 Collaborative preference based RRD calculation

The core process of collaborative based resource recommendation is to find the $k$ neighbors whose interests and preferences are the same as or similar to those of the target learner. Most existing collaborative based recommendation algorithms represent the similarity between learners by computing the similarity between two rating vectors in the learner-resource rating matrix. However, as there is a large number of education resources in P-Learning environments, the learner's rating data become very sparse (no more than 1% of the number of resources), thus the traditional rating data based similarity calculation method has several drawbacks:

(1) Although several algorithms can alleviate the rating sparsity problem by unifying item-based and user-based similarity calculation methods, the problem cannot be solved thoroughly because of the extra inaccuracy and added complexity mentioned in Sect. 2;

(2) The previous algorithms only take into account the learner-resource rating matrix and cannot make use of resource attribute information and resource access timestamp information, which leads to inaccurate similarity calculation results.

To address these drawbacks, our proposal is to integrate the learner's rating information for resources, the attributes of the accessed resources and the dynamic preference energy information, all based on the learner's IPT, to solve the rating sparsity problem.

Rule 3 In order to enhance the flexibility and accuracy of similarity calculation, if there is some similarity between the attributes of the resources accessed by two learners, these two learners can still be judged as similar neighbors even if they have no identical accessed resources. Based on this rule, the similarity between the target learner and any other learner can be calculated effectively and accurately even when the learner's ratings are sparse or the intersection between the two learners' rating vectors is small. Therefore, the IPT based similarity calculation rule between two learners can be described as follows:

(1) The more similar the attributes of the resources accessed by learner $U_i$ and learner $U_h$, the larger the similarity between these two learners will be;

(2) The more similar the dynamic preference weights of learner $U_i$ and learner $U_h$ in the same domain, the larger the similarity between these two learners will be;

(3) The more similar the rating data of learner $U_i$ and learner $U_h$ in the same domain, the larger the similarity between these two learners will be.

As described in Rule 3, the similarity between two learners can be calculated based on the attributes intersection sub-tree of the two relevant IPTs, which is defined as follows:

Definition 4 Attributes Intersection Sub-tree (AIS): for learners $U_i$ and $U_h$, $AIS(U_i, U_h)$ denotes the maximum connected intersection between $IPT(U_i)$ and $IPT(U_h)$ with the same node keywords. Therein, for an arbitrary node $u$ located on a path of the AIS, suppose that the projection nodes of $u$ on $IPT(U_i)$ and $IPT(U_h)$ are $PN_u(U_i)$ and $PN_u(U_h)$ respectively; then $PN_u(U_i).ck = PN_u(U_h).ck$ (or $PN_u(U_i).RID = PN_u(U_h).RID$ when they are leaf nodes). As the $ck$ and $RID$ values of each node's immediate successors follow an ascending order, the generating strategy of the AIS can be described as follows, and an example process is shown in Fig. 5 (the serial numbers denote the relevant matching order):

(1) The traversal starts at the leftmost nodes of the second layer ($level = 1$) of $IPT(U_i)$ and $IPT(U_h)$; the current traversing nodes are denoted $s$ and $t$ respectively;

(2) If $s.ck = t.ck$ (or $s.RID = t.RID$ when $s$ and $t$ are leaf nodes), then $s$ and $t$ are matching nodes of each other. A new node $u$ is then added at the proper position in the AIS with $u.ck = s.ck = t.ck$ (or $u.RID = s.RID = t.RID$ when $s$ and $t$ are leaf nodes) and $u.level = s.level = t.level$, and $s$ and $t$ are pointed to the leftmost nodes of their respective immediate successors;

(3) If $s.ck < t.ck$ (or $s.RID < t.RID$ when $s$ and $t$ are leaf nodes), then $t$ is pointed to its right sibling;

(4) If $s.ck > t.ck$ (or $s.RID > t.RID$ when $s$ and $t$ are leaf nodes) or $t = \emptyset$ ($t$ does not exist), then backtrack to the last matching nodes of $IPT(U_i)$ and $IPT(U_h)$, point $s$ and $t$ to their respective siblings, and go to (2);

(5) The recursion finishes when $IPT(U_i)$ or $IPT(U_h)$ has been traversed completely.

Fig. 5 The generation process of Attributes Intersection Sub-tree
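The result described by Definition 4 can be illustrated with a short Python sketch. Note that it does not reproduce the sibling-by-sibling traversal listed above: as a simplification, it intersects the child keys of two matching nodes with a set operation, which yields the same connected intersection when each IPT is stored as nested dictionaries. The dictionary layout and the function name are assumptions for illustration.

```python
def ais(node_i, node_h):
    """Intersection sub-tree below two matching IPT nodes (start with the two roots).

    Each node is {"pw": float, "children": {key: node}}, where key is ck for
    non-leaf nodes and RID for leaves. The result maps every shared key to the
    PW values of its two projection nodes and to the shared sub-tree below it.
    """
    shared = sorted(node_i["children"].keys() & node_h["children"].keys())
    return {
        key: {
            "pw": (node_i["children"][key]["pw"], node_h["children"][key]["pw"]),
            "children": ais(node_i["children"][key], node_h["children"][key]),
        }
        for key in shared
    }
```

Keeping the PW values of both projection nodes on each AIS node makes them directly available for the similarity calculation that follows.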

Based on Rule 3 and the definition of the AIS, the calculation of the similarity between two learners can be divided into two aspects: preference weight based similarity and learner rating based similarity.

Definition 5 The preference weight based similarity $Sim_{PW}$ reflects the similarity between the dynamic interests and preferences of learner $U_i$ and learner $U_h$. According to the PW values of the respective projection nodes on $IPT(U_i)$ and $IPT(U_h)$ that correspond to each node of $AIS(U_i, U_h)$, $Sim_{PW}(U_i, U_h)$ is defined as formula (5).

$$Sim_{PW}(U_i, U_h) = \frac{\sum_{u \in AIS(U_i, U_h)} MW_{u.level} \cdot PN_u(U_i).PW \cdot PN_u(U_h).PW}{\sqrt{\sum_{s \in IPT(U_i)} MW_{s.level} \cdot s.PW^2} \cdot \sqrt{\sum_{t \in IPT(U_h)} MW_{t.level} \cdot t.PW^2}} \tag{5}$$

Definition 6 The learner rating based similarity $Sim_{RG}$ reflects the similarity between the rating vectors of learner $U_i$ and learner $U_h$. Different from traditional learner rating based similarity calculation methods, in order to overcome the rating sparsity problem, the main concept of $Sim_{RG}$ calculation is to compute the similarity between the RG values of the respective projection nodes on $IPT(U_i)$ and $IPT(U_h)$ that correspond to each leaf node of $AIS(U_i, U_h)$, which does not require the two learners to have identical accessed resources. Additionally, in order to reduce the deviation caused by learners' different rating scales, each learner's rating data are adjusted by the learner's mean rating. Therefore, $Sim_{RG}(U_i, U_h)$ is defined as formula (6), where $L$ denotes the leaf node set of $AIS(U_i, U_h)$, and $\overline{RG}_i$ and $\overline{RG}_h$ denote the mean values of $U_i$'s and $U_h$'s rating data respectively.

$$Sim_{RG}(U_i, U_h) = \frac{\sum_{l \in L} \left| (PN_l(U_i).RG - \overline{RG}_i) \cdot (PN_l(U_h).RG - \overline{RG}_h) \right|}{\sqrt{\sum_{l \in L} (PN_l(U_i).RG - \overline{RG}_i)^2} \cdot \sqrt{\sum_{l \in L} (PN_l(U_h).RG - \overline{RG}_h)^2}} \tag{6}$$

$$CPR(U_i, R_j) = \frac{\sum_{U_h \in NU_k(U_i)} Sim(U_i, U_h) \cdot IPR(U_h, R_j) \cdot \chi(U_h, R_j)}{\sum_{U_h \in NU_k(U_i)} Sim(U_i, U_h)} \tag{7}$$


Moreover, the similarity between $U_i$ and $U_h$ can be calculated as follows, where $\beta$ denotes the weight between $Sim_{PW}$ and $Sim_{RG}$.

$$Sim(U_i, U_h) = \beta \cdot Nor(Sim_{PW}(U_i, U_h)) + (1 - \beta) \cdot Nor(Sim_{RG}(U_i, U_h)) \tag{8}$$

However, if a learner does not have enough similar learners, traditional algorithms will select many dissimilar learners as neighbors, which decreases the prediction accuracy for the active learner. Thus, in order to enhance the efficiency of calculation, the learner set should be preliminarily filtered by setting a similarity matching threshold $\tau$. Two learners are effective similar neighbors only if the similarity between them is at least $\tau$. Then, the top $k$ learners that have the largest similarity to $U_i$ and meanwhile satisfy this requirement are defined as $U_i$'s nearest $k$ neighbors, denoted $NU_k(U_i)$. Let the accessed resource sets of the learners in $NU_k(U_i)$ be $RS_1, RS_2, \ldots, RS_k$ respectively; then the collaborative based candidate resource set can be defined as $CS_c(U_i) = RS_1 \cup RS_2 \cup \cdots \cup RS_k$. For each resource $R_j$ in $CS_c$, the collaborative preference based RRD (called CPR) between $U_i$ and $R_j$ can be defined as formula (7), where $IPR(U_h, R_j)$ denotes the individual preference based RRD between $R_j$ and $U_i$'s similar neighbor $U_h$, and $\chi(U_h, R_j)$ is a 0–1 function which denotes whether $R_j$ belongs to $U_h$'s accessed resource set.
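Neighbor selection and the CPR of formula (7) can be sketched as follows. The sketch assumes that the similarities $Sim(U_i, U_h)$ and the values $IPR(U_h, R_j)$ have already been computed and are supplied as dictionaries; the function names and data layout are hypothetical.

```python
def nearest_neighbors(sims, k, tau):
    """sims maps learner -> Sim(U_i, learner).
    Keep learners with similarity >= tau, then the k most similar of them."""
    eligible = [(u, s) for u, s in sims.items() if s >= tau]
    eligible.sort(key=lambda pair: pair[1], reverse=True)
    return dict(eligible[:k])

def candidate_resources(neighbors, accessed):
    """CS_c(U_i): union of the accessed resource sets of the nearest neighbors."""
    result = set()
    for u in neighbors:
        result |= accessed[u]
    return result

def cpr(neighbors, ipr, accessed, rid):
    """Formula (7). ipr[(u, rid)] holds IPR(U_h, R_j); accessed[u] is U_h's resource set,
    so the membership test plays the role of the 0-1 function chi."""
    denom = sum(neighbors.values())
    if denom == 0:
        return 0.0
    num = sum(sim * ipr.get((u, rid), 0.0)
              for u, sim in neighbors.items() if rid in accessed[u])
    return num / denom
```

A neighbor that has not accessed a resource contributes nothing to the numerator but still counts in the denominator, so resources accessed by many similar neighbors score higher.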

4.1.3 Sequential pattern based RRD calculation

In P-Learning environments, the learning processes (resource access processes) of learners usually exhibit some temporal dependency. For example, the resource access sequence of a certain learning process is: {Lecture notes on higher mathematics, Reference on higher mathematics, Exercise book on higher mathematics}. Under this circumstance, the temporal dependency among the resources in this learning process can reflect the learner's latent resource access pattern and preference. For example, when a learner has accessed the lecture notes, it is highly likely that the learner will access the reference in the near future. Thus, in order to further improve the performance of resource recommendation and to solve the new user problem (a new user, having very few ratings, often cannot get accurate recommendations) [17], the relevant resource access sequential patterns need to be mined from learners' historical access records. Then, based on these sequential patterns, the resources most likely to be accessed in the near future can be predicted according to the learner's recently accessed resources.

Definition 7 In P-Learning environments, the resource access operations of a single login of a learner can be regarded as an access sequence, defined as $s = \{R_1, R_2, \ldots, R_n\}$. From the access sequences of all learners, the access sequence set $S$ can be obtained. The support degree of a sequence $s$, denoted $SD(s)$, is the number of sequences in the access sequence set that contain $s$. Given a positive integer $\mu$ as the min-support threshold, an access sequence $s_\alpha$ is called a sequential pattern if it is contained in at least $\mu$ sequences of the access sequence set.

Thus, based on these definitions, the problem of access sequence mining in P-Learning environments can be stated as follows: given an access sequence set $S$ and a min-support threshold $\mu$, find the complete set of sequential patterns in the access sequence set.

The main idea of our resource access sequence mining strategy comes from a pattern-growth based sequential pattern mining method called PrefixSpan, which has been demonstrated to have higher efficiency than other methods [33]. The main process of PrefixSpan can be described in four steps:

(1) Find all sequential patterns of length 1;

(2) Divide the search space according to the prefix;

(3) Find the subsets of sequential patterns;

(4) Execute step 3 recursively until no more sequential patterns can be discovered.

The details of this algorithm are described in [33].
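The four steps above can be sketched in Python for the simple case in which every element of an access sequence is a single resource ID. This is a compact illustration of the prefix-projection idea under that assumption, not the full PrefixSpan algorithm of [33]; the function name and data layout are hypothetical.

```python
from collections import defaultdict

def prefixspan(sequences, min_support):
    """Return {pattern (tuple of resource IDs): support degree} for every
    sequential pattern contained in at least min_support sequences."""
    patterns = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start position) pairs, i.e. the
        # suffixes that remain after the first occurrence of the prefix.
        support = defaultdict(set)
        for seq_id, start in projected:
            for item in sequences[seq_id][start:]:
                support[item].add(seq_id)
        for item in sorted(support):
            seq_ids = support[item]
            if len(seq_ids) < min_support:
                continue
            pattern = prefix + (item,)
            patterns[pattern] = len(seq_ids)
            next_projected = []
            for seq_id, start in projected:
                seq = sequences[seq_id]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        next_projected.append((seq_id, pos + 1))
                        break
            grow(pattern, next_projected)      # recurse on the projected database

    grow((), [(i, 0) for i in range(len(sequences))])
    return patterns
```

Each recursive call only scans the suffixes that follow the current prefix, which is the search-space division by prefix referred to in step (2).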

However, in real P-Learning environments, some access sequential patterns are only embodied in the behaviors of specific learners. Thus, if $\mu$ is large, these specific sequential patterns cannot be found. On the other hand, if $\mu$ is small, many ineffective access sequences may be mined. Therefore, clustering analysis of learners is introduced, and the access sequential pattern mining process is limited to a certain cluster in order to improve the mining accuracy.

In order to reduce time overhead, only static clustering according to the attributes of L-Context is taken into consideration. First of all, the attributes should be normalized as follows.

An integer parameter, such as grade or age, can be normalized by formula (9):

$$D_{int}(i, j) = \frac{|x_i - x_j|}{x_{max} - x_{min}}, \quad 1 \le i \le j \le n, \tag{9}$$

where $x_{max} = \max\{x_1, x_2, \ldots, x_n\}$ and $x_{min} = \min\{x_1, x_2, \ldots, x_n\}$.


Fig. 6 The clustering based access sequential pattern mining strategy

Fig. 7 The example of clustering based access sequential pattern mining (a) access sequence set (b) sequential pattern tree

A string parameter, such as major or education type, can be normalized by formula (10):

$$D_{str}(i, j) = \begin{cases} 1 & \text{if } x_i \text{ is equal to } x_j \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

Then, the static similarity of two learners with respect to L-Context can be expressed as formula (11):

$$Sim_{Learner}(i, j) = \frac{\sum_{f=1}^{n} \xi^{(f)} D(i, j)^{(f)}}{\sum_{f=1}^{n} \xi^{(f)}} \tag{11}$$

Based on these formulas, all learners can be divided into $k$ categories by a classical clustering algorithm, in which $k$ is an adjustable parameter. Moreover, the access sequence set $S$ can also be divided into $k$ subsets $S_1, S_2, \ldots, S_k$ according to these categories, where each element of $S_h$ is an access sequence of a learner who belongs to the $h$-th learner category. Then, the PrefixSpan algorithm can be applied to each access sequence subset separately. The details of our strategy are shown in Fig. 6.

For example, an access sequence set $S_1$ and the relevant sequential pattern tree are shown in Fig. 7. Therein, suppose that there is only one cluster and the min-support threshold is 2; the subscript of a sequential pattern in the sequential pattern tree denotes the corresponding support degree.

Based on the sequential pattern tree of a subset, the RRD of a candidate resource can be calculated according to the learner's recent access records in order to predict the resources most likely to be accessed in the future. In general, the learner's current access sequence cannot be matched completely with an existing sequential pattern. Thus, in order to reflect the learner's recent access situation and increase the number of effective recommendations, the sequential pattern based RRD is divided into two parts: matching with the latest accessed resource of the target learner ($LR_1$) as prefix, and matching with the latest $w$ accessed resources of the target learner ($LR_w$) as prefix sequence.

Suppose that the postfix resource sets of a sequential pattern tree with $LR_1$ and $LR_w$ as prefix, i.e., the resources that appear in the successor nodes of $LR_1$ and $LR_w$ in the sequential pattern tree, are $RS_{\text{pos-1}}$ and $RS_{\text{pos-W}}$ respectively. Meanwhile, let the sequential pattern sets with $LR_1$ and $LR_w$ as prefix (the successor nodes of $LR_1$ and $LR_w$ in the sequential pattern tree) be $PS_{\text{pos-1}}$ and $PS_{\text{pos-W}}$ respectively. For example, as in Fig. 7, let $LR_1 = \{R_2\}$ and $LR_w = \{R_1, R_2\}$; then $RS_{\text{pos-1}} = \{R_3, R_4\}$, $RS_{\text{pos-W}} = \{R_3, R_4\}$, $PS_{\text{pos-1}} = \{\langle R_2, R_3 \rangle_2, \langle R_2, R_3, R_4 \rangle_2, \langle R_2, R_4 \rangle_2\}$ and $PS_{\text{pos-W}} = \{\langle R_1, R_2, R_3 \rangle_2, \langle R_1, R_2, R_3, R_4 \rangle_2, \langle R_1, R_2, R_4 \rangle_2\}$.

Rule 4 According to the properties of the sequential pattern tree, for a given prefix, the sequential pattern based RRD calculation rule can be described as follows.

(1) The more times a resource appears in $PS_{\text{pos-1}}$ and $PS_{\text{pos-W}}$, the larger the sequential pattern based RRD of this resource will be. For example, in Fig. 7, when the prefix is $R_1$, the resource $R_2$, which appears in sequential patterns four times, has the largest possibility of being accessed in the near future;

(2) The earlier a resource appears in a sequential pattern, the larger the sequential pattern based RRD of this resource will be. For example, in Fig. 7, for the sequential pattern $\langle R_1, R_2, R_3 \rangle_2$, $R_2$ is much more likely to be accessed in the near future than $R_3$;

(3) The larger the ratio between the support degree of the sequential pattern $s$ in which resource $R_j$ is located and the support degree of its predecessor sequential pattern, the larger the sequential pattern based RRD of this resource will be. For example, in Fig. 7, for $R_3$ in $\langle R_1, R_3 \rangle_2$, the ratio between the support degree of $\langle R_1, R_3 \rangle_2$ and the support degree of its predecessor $\langle R_1 \rangle_2$ is 1; meanwhile, for $R_4$ in $\langle R_3, R_4 \rangle_2$, the ratio between the support degree of $\langle R_3, R_4 \rangle_2$ and the support degree of its predecessor $\langle R_3 \rangle_3$ is 2/3.

Based on Rule 4, the access sequence based recommendation candidate resource set can be defined as $CS_s = RS_{\text{pos-1}} \cup RS_{\text{pos-W}}$. Then the sequential pattern based RRD (called SPR) between learner $U_i$ and each resource in $CS_s$ can be defined as formula (12).

$$SPR(U_i, R_j) = \lambda \cdot \frac{\sum_{s \in PS_{\text{pos-1}}} \psi(FindRes(s - LR_1, R_j)) \cdot \frac{SD(s)}{SD(LR_1)}}{\sum_{s \in PS_{\text{pos-1}}} \psi(FindRes(s - LR_1, R_j))} + (1 - \lambda) \cdot \frac{\sum_{t \in PS_{\text{pos-W}}} \psi(FindRes(t - LR_w, R_j)) \cdot \frac{SD(t)}{SD(LR_w)}}{\sum_{t \in PS_{\text{pos-W}}} \psi(FindRes(t - LR_w, R_j))} \tag{12}$$

Therein, $\lambda$ is the weight between the two prefix matching modes; the $SD$ function denotes the support degree of a sequential pattern; the $FindRes$ function returns the position of resource $R_j$ in a sequential pattern with the corresponding prefix removed, and returns $-1$ when $R_j$ is not found; $\psi(x)$ denotes the order weight of a resource in a sequential pattern, where $x$ denotes the resource's order in the sequential pattern, with $\psi(-1) = 0$ and $\psi(1) > \psi(2) > \cdots > \psi(n)$ (we suppose that $\psi(x) = 1/(x + 1)$ in this paper).

As sequential pattern mining is an off-line process, SPR can be calculated within $O(|CS_s| \cdot (|PS_{\text{pos-1}}| + |PS_{\text{pos-W}}|))$.

$$\begin{cases} LRRD_{IC}(U_i, R_j) = \omega_{IC} \cdot Nor(IPR(U_i, R_j)) + (1 - \omega_{IC}) \cdot Nor(CPR(U_i, R_j)) \\ LRRD_{ICS}(U_i, R_j) = \omega^{s}_{ICS} \cdot Nor(SPR(U_i, R_j)) + \omega^{i}_{ICS} \cdot Nor(IPR(U_i, R_j)) + (1 - \omega^{s}_{ICS} - \omega^{i}_{ICS}) \cdot Nor(CPR(U_i, R_j)) \end{cases} \tag{13}$$
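The SPR of formula (12) can be sketched in Python as follows. The mined patterns are assumed to be available as a dictionary from pattern tuples to support degrees, with the prefixes themselves included so that $SD(LR_1)$ and $SD(LR_w)$ can be looked up; the function names and this layout are assumptions for illustration, and a prefix that is not a mined pattern is given a contribution of 0.

```python
def find_res(pattern, prefix, rid):
    """1-based position of rid in the pattern after removing the prefix, or -1."""
    suffix = pattern[len(prefix):]
    return suffix.index(rid) + 1 if rid in suffix else -1

def psi(x):
    """Order weight: psi(-1) = 0, otherwise 1 / (x + 1) as supposed in the paper."""
    return 0.0 if x == -1 else 1.0 / (x + 1)

def prefix_term(patterns, prefix, rid):
    """One of the two weighted ratios of formula (12) for a given prefix."""
    if prefix not in patterns:
        return 0.0
    n = len(prefix)
    extended = {p: sd for p, sd in patterns.items() if len(p) > n and p[:n] == prefix}
    weights = {p: psi(find_res(p, prefix, rid)) for p in extended}
    denom = sum(weights.values())
    if denom == 0:
        return 0.0
    num = sum(w * extended[p] / patterns[prefix] for p, w in weights.items())
    return num / denom

def spr(patterns, lr1, lrw, rid, lam=0.5):
    """Formula (12): lam weighs the LR_1 prefix term against the LR_w prefix term."""
    return lam * prefix_term(patterns, lr1, rid) + (1 - lam) * prefix_term(patterns, lrw, rid)
```

Each term is a weighted average of support ratios, so a resource that never follows the prefix in any pattern receives an SPR of 0.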

4.1.4 Logic-based Resource Relevant Degree Generation

Traditional content-based and collaborative-based recommendation algorithms have several drawbacks: content-based recommendation algorithms cannot recommend resources accessed by other neighbors; collaborative-based recommendation algorithms cannot recommend resources which are similar to the learner's own interests but have not been accessed by any neighbor. Thus, the IPR and CPR should be combined. On the other hand, although SPR can bring more useful information into RRD calculation, the calculation of SPR also leads to higher time complexity. Therefore, in order to trade off algorithm running time against recommendation accuracy and to further improve the accuracy of RRD, SPR is taken into consideration as an optional part of our proposal, and two logic-based RRD calculation strategies, IC-LR and ICS-LR, are proposed accordingly. The former only considers IPR and CPR when computing the logic-based RRD; the latter considers IPR, CPR and SPR simultaneously.

On the other hand, for the purpose of avoiding the over-specialization problem (resources should not be recommended if they have already been accessed by the learner) [9], based on the definitions of IPR, CPR and SPR, the candidate resource recommendation sets of learner $U_i$ for the IC-LR and ICS-LR strategies can be described as follows, where $RS_i$ denotes $U_i$'s accessed resource set.

$$\begin{cases} CS_{IC}(U_i) = CS_i(U_i) \cup CS_c(U_i) - RS_i \\ CS_{ICS}(U_i) = CS_i(U_i) \cup CS_c(U_i) \cup CS_s(U_i) - RS_i \end{cases} \tag{14}$$

Based on them, the logic-based RRD (LRRD) between learner $U_i$ and a candidate resource $R_j$ for the IC-LR and ICS-LR strategies can be defined respectively as formula (13), where $\omega_{IC}$ denotes the weight between IPR and CPR for IC-LR, and $\omega^{s}_{ICS}$ and $\omega^{i}_{ICS}$ denote the weights of SPR and IPR respectively for ICS-LR, with the remaining weight assigned to CPR.

4.2 Situation Based Resource Relevant Degree Calculation

Different from traditional recommendation algorithms, in P-Learning environments the recommendation results can also be greatly affected, besides the logic-based RRD between resources and learners, by other relevant contextual information of resources and learners, such as the learner's connecting device and the resource access response time (download time). Therefore, there are several drawbacks when applying existing recommendation algorithms to P-Learning environments directly:

(1) As P-Learning environments support several kinds of connecting modes, existing recommendation algorithms may recommend resources whose type does not match the target learner's connecting mode, even if the logic-based RRD between the target learner and the recommended resources is large, which leads to ineffective recommendation;

(2) As education resources are distributed in P-Learning environments, existing recommendation algorithms may recommend resources whose access response time is very long, even if the logic-based RRD between the target learner and the recommended resources is large, which also leads to ineffective recommendation.

Consequently, to address these problems, the situation based RRD should be calculated in the recommendation process according to the relevant contextual information. Therein, only the learner's connecting mode and the resource access response time are taken into consideration in this paper.

In P-Learning environments, the education resources can be divided into several types according to the learner's connecting mode (such as PC/PDA/Cell Phone); therefore, in the recommendation process only the resources whose type accords with the learner's connecting mode can satisfy the requirement. According to the definitions of R-Context and L-Context, suppose that the learner's connecting device is $L_i.connect.type$ and the resource type of $R_j$ is $R_j.pro.type$; then the learner's connecting type based RRD (called CTR) can be defined as follows, where the $Equal$ function returns 1 if and only if the two types match each other, and returns 0 otherwise.

$$CTR(U_i, R_j) = Equal(L_i.connect.type, R_j.pro.type) \tag{15}$$

On the other hand, since education resources are distributed in P-Learning environments, whether the resource access response time can satisfy the learner's requirement should be taken into account in the recommendation process. Suppose that the resource storage nodes in the current P-Learning environment are $SN_1, SN_2, \ldots, SN_w$ respectively, and the available bandwidths between learner $U_i$ and each storage node are $NB_1(U_i), NB_2(U_i), \ldots, NB_w(U_i)$ respectively; according to R-Context, for a resource $R_j$, the corresponding storage node is $R_j.Location$ and the size of $R_j$ is $R_j.Profile.Size$. Then the resource access response time for $U_i$ can be defined as:

$$RT(U_i, R_j) = \frac{R_j.Profile.Size}{NB_{R_j.Location}(U_i)} \tag{16}$$

Fig. 8 The example of time satisfaction function

Definition 8 Time Satisfaction Degree (TSD) and Time Satisfaction Function (TSF): $TSD(U_i, R_j)$ denotes the benefit value corresponding to the access response time $RT(U_i, R_j)$, and its value range is $[0, 1]$. Therein, 1 denotes that the learner obtains the maximum benefit; 0 denotes that the learner cannot obtain any benefit from $R_j$. $TSF_{i,j}$ denotes the mapping between RT and TSD, so that $TSD(U_i, R_j) = TSF_{i,j}(RT(U_i, R_j))$.

The time satisfaction function takes different forms according to the learner's requirement. As shown in Fig. 8(a), TSF can be defined as a curve between $req_{min}$ and $req_{max}$ when the learner's requirement on response time is described as an interval $[req_{min}, req_{max}]$. As shown in Fig. 8(b), TSF can be defined as a unit step function when the learner's requirement on response time is described as a point value $req$.
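Formula (16) and the two forms of the time satisfaction function can be sketched as follows. The paper only states that the interval form is a curve between $req_{min}$ and $req_{max}$; the linear decay used below is an assumption made for illustration, as are the function names and the treatment of a non-positive bandwidth as an unreachable storage node.

```python
def response_time(size, bandwidth):
    """Formula (16): resource size divided by the available bandwidth between the
    learner and the storage node (both in consistent units)."""
    if bandwidth <= 0:
        return float("inf")        # assumed: unreachable node, never satisfies the learner
    return size / bandwidth

def tsf_interval(rt, req_min, req_max):
    """Interval requirement [req_min, req_max] as in Fig. 8(a); linear decay is assumed."""
    if rt <= req_min:
        return 1.0
    if rt >= req_max:
        return 0.0
    return (req_max - rt) / (req_max - req_min)

def tsf_step(rt, req):
    """Point requirement req as in Fig. 8(b): unit step function."""
    return 1.0 if rt <= req else 0.0
```

For instance, a resource of size 100 over a bandwidth of 20 has a response time of 5, which maps to a TSD of 0.5 for the interval requirement [2, 8] and to 1 for the point requirement 5.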

Based on the calculation of CTR and TSD, the relevant characteristics of the resource access process in P-Learning environments can be reflected effectively.

4.3 Context-aware based resource recommendation algorithm PL-CR2

In summary, in order to adapt to P-Learning environments, the logic-based RRD and the situation-based RRD should be combined to make resource recommendations. Therein, a resource is an effective candidate resource if and only if the resource's type matches the learner's connecting mode and the resource access response time satisfies the learner's lowest requirement. Then the integration-based RRD (IRRD) between candidate resource $R_j$ and target learner $U_i$ can be defined as follows, where $\gamma$ denotes the weight between LRRD and TSD:

$$IRRD(U_i, R_j) = \gamma \cdot Nor(LRRD(U_i, R_j)) + (1 - \gamma) \cdot Nor(TSD(U_i, R_j)), \quad CTR(U_i, R_j) \ne 0 \text{ and } TSD(U_i, R_j) \ne 0 \tag{17}$$

Consequently, a context-aware based resource recommendation algorithm for P-Learning environments called PL-CR2 is proposed. Firstly, the CTR and TSD between $U_i$ and each resource in the candidate resource set CS are calculated according to formulas (15) and (16), and the resources which cannot match $U_i$'s connecting mode or cannot satisfy the lowest requirement on access response time are filtered out. Then, according to the chosen logic-based RRD calculation strategy, the IRRD between each candidate resource $R_j$ and the target learner $U_i$ is computed based on formula (17). Finally, the Top-N resources with the largest IRRD are regarded as the Top-N resource recommendation set. The pseudocode of PL-CR2 is described in Fig. 9.

Fig. 9 The pseudocode of PL-CR2 algorithm
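The final filtering and ranking stage described above can be sketched in Python as follows. This is not the pseudocode of Fig. 9, which is not reproduced in this text; it is a minimal sketch assuming that LRRD, CTR and TSD have already been computed for each candidate and that LRRD and TSD are already normalized to [0, 1]. The function name and the dictionary-based inputs are hypothetical.

```python
import heapq

def pl_cr2_rank(candidates, lrrd, ctr, tsd, gamma=0.7, top_n=3):
    """Return the Top-N (resource ID, IRRD) pairs, largest IRRD first.

    candidates: iterable of resource IDs (CS_IC or CS_ICS)
    lrrd, ctr, tsd: dictionaries keyed by resource ID
    """
    # Keep only effective candidates: matching connecting mode and acceptable response time.
    effective = [r for r in candidates if ctr[r] != 0 and tsd[r] != 0]
    # Formula (17): weighted combination of logic-based RRD and time satisfaction degree.
    irrd = {r: gamma * lrrd[r] + (1 - gamma) * tsd[r] for r in effective}
    return heapq.nlargest(top_n, irrd.items(), key=lambda item: item[1])
```

Filtering before ranking means that a resource with a very large logic-based RRD is still excluded when it cannot be displayed on the learner's device or cannot be delivered within the required time.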

Analysis of time complexity In IPR calculation, according to formula (4), the IPR-based candidate resource recommendation set for $U_i$, called $CS_i(U_i)$, is defined as the resources whose matching depth is at least $\varphi$. Therefore, IPR can be calculated within $O(|CS_i| \cdot m)$, where $m$ is the size of the resource attributes description model $C$. In CPR calculation, according to formulas (5)–(8) and the relevant definitions, as item-based similarity calculation does not need to be considered (as mentioned in [24]), the similarity between $U_i$ and any other learner can be computed within $O(p \cdot |AIS|)$, where $p$ is the size of the learner set. Meanwhile, as the IPR between all learners and the relevant resources has already been computed, the CPR can be calculated within $O(p \cdot |AIS| + k \cdot |CS_c|)$. In SPR calculation, as the sequential pattern mining process can be finished offline and does not need to be updated frequently, the SPR can be calculated within $O(|CS_s| \cdot (|PS_{\text{pos-1}}| + |PS_{\text{pos-W}}|))$ according to formula (12) and the relevant definitions. In situation-based RRD calculation, since there are two different logic-based RRD calculation strategies, IC-LR and ICS-LR, whose candidate resource recommendation sets are $CS_{IC}$ and $CS_{ICS}$ respectively, according to formulas (15) and (16), the CTR and TSD can be calculated within $O(|CS_i \cup CS_c|)$ and $O(|CS_i \cup CS_c \cup CS_s|)$ respectively. In summary, the time complexity of the PL-CR2 algorithm is $O(|CS_i| \cdot m) + O(p \cdot |AIS| + k \cdot |CS_c|) + O(|CS_i \cup CS_c|)$ for the IC-LR strategy and $O(|CS_i| \cdot m) + O(p \cdot |AIS| + k \cdot |CS_c|) + O(|CS_s| \cdot (|PS_{\text{pos-1}}| + |PS_{\text{pos-W}}|)) + O(|CS_i \cup CS_c \cup CS_s|)$ for the ICS-LR strategy. As $m$ can be regarded as a constant, $O(|CS_i| \cdot m) + O(|CS_i \cup CS_c \cup CS_s|) \approx O(n)$.

On the other hand, for traditional recommendation algorithms, the time complexity of content-based recommendation algorithms can usually be described as O(n · |U|) ≈ O(n) [21]. The time complexity of improved collaborative-based recommendation algorithms can usually be described as O(p · |Union Set| + k · |CS_c| + k′ · |⋃_{Rt∈RSi} SRS(Rt) − RSi|) [24], where |Union Set| denotes the size of the union of two learners’ rating vectors, k and k′ denote the size of Ui’s neighbor set and of a certain resource’s similar resource set respectively, and SRS(Rt) denotes Rt’s similar resource set. For the combined content and collaborative recommendation mechanism mentioned in [30], in order to improve the original hybrid algorithm, we use the improved collaborative-based recommendation algorithm [24] as the collaborative-based part. Therefore, the time complexity of the improved hybrid recommendation algorithm can be described as O(n) + O(p · |Union Set| + k · |CS_c| + k′ · |⋃_{Rt∈RSi} SRS(Rt) − RSi|). As |Union Set| and |AIS| usually have the same order, the PL-CR2 algorithm has the same time complexity as the hybrid recommendation algorithm, which is only a little higher than that of the content-based or collaborative-based algorithms.

In conclusion, the improvements of PL-CR2 over existing recommendation algorithms are as follows:

(1) The resource access preference energy is introduced to overcome the drawback that traditional recommendation algorithms cannot reflect a learner’s dynamic preference information in time;

(2) The individual preference tree is introduced to represent a learner’s own interests and access preference, so that the relationship between resource attributes and the learner’s resource access records is reflected comprehensively and the new resource problem [9] (a new resource cannot be recommended until it has been rated by a substantial number of learners) is overcome;

(3) The multi-dimensional attributes of resources, the dynamic preference information and the learner’s rating information are taken into account when computing the RRD between Ui and a candidate resource Rj, which overcomes the rating sparsity problem (as strict matching is not needed);

(4) The individual preference based RRD and the collaborative preference based RRD are considered together to overcome the drawbacks of purely content-based or purely collaborative-based algorithms and to improve the adaptability of recommendation;

(5) The sequential pattern of a learner’s access records is considered to reflect the repetition and periodicity of the learning process and to capture the learner’s access rules. Meanwhile, this mechanism can overcome the new user problem [9];

(6) According to the situation-based RRD, the candidate resource set can be filtered and ranked by using the learner’s relevant contextual information. This mechanism reflects the characteristics of P-Learning environments more faithfully and leads to better integrated utility.

5 Simulation and performance evaluation

In this section, we conduct several simulations to evaluate the recommendation performance of our approach (since there are two different logic-based RRD calculation strategies, the proposed algorithm is denoted PL-CR2_IC and PL-CR2_ICS respectively). As existing recommendation algorithms only consider logic-based RRD, and there is no publicly available dataset that provides all of the relevant contextual information (learner’s connecting mode, storage location of resources and corresponding bandwidths), the simulations of PL-CR2 are divided into two parts to keep the evaluation valid: the performance evaluation of the recommendation algorithm considering only logic-based RRD (γ = 1), which corresponds to PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1); and the performance evaluation of the recommendation algorithm considering integration-based RRD (0 < γ < 1), which corresponds to PL-CR2_IC (0 < γ < 1) and PL-CR2_ICS (0 < γ < 1).

5.1 Simulation environment and data set

All simulations in this paper were performed on a DELL 5150 PC with an Intel Pentium 4 3.00 GHz CPU, 1.5 GB RAM and the Windows XP operating system. The algorithm is implemented in Matlab 7 and VC++ 6.0.

To evaluate the PL-CR2 (γ = 1) algorithm, two real-world datasets are used in our simulations: MovieLens [34] and D-Lib records. MovieLens is a web-based research recommender system that debuted in fall 1997. The open part of the MovieLens dataset contains 100,000 ratings (on a 1–5 scale) with corresponding access timestamps from 943 users on 1682 movies, where each user has rated at least 20 items, together with the movies’ type (19 categories), release date and users’ basic information (such as age, gender, occupation and zip code). The D-Lib records dataset comes from the database of the Southeast University library and covers August 2000 to February 2002. After preprocessing, the effective part of this dataset contains 585516 lending records from 23836 users on 375657 books, where each record contains a timestamp (the lending date, accurate to seconds) and rating information (the ratio of a given lending time segment to the maximum lending time segment); it also contains the books’ type information and the users’ basic information. The characteristics of these datasets are summarized in Fig. 10. In the simulations, the dataset is ordered by users’ access timestamp and then divided into a training set and a test set. For this purpose, we introduce a variable x that determines what percentage of the data is used as the training set, with the remainder used as the test set. In order to make the test set as large as possible and so reduce the effect of accidental factors, x is set to 0.6. This means that the first 60% of each user’s access records in the ordered dataset are used as the training set and the remaining 40% are used as the test set.
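The per-user chronological split described above can be sketched as follows. This is a minimal sketch under stated assumptions: the record layout (user, item, rating, timestamp) and the function name are hypothetical, and the cut position is rounded down, which the text does not specify.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical record layout: (user id, item id, rating, access timestamp).
Record = Tuple[str, str, float, int]


def split_by_time(records: List[Record], x: float = 0.6) -> Tuple[List[Record], List[Record]]:
    """Put the earliest fraction x of each user's records in the training set."""
    by_user: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        by_user[record[0]].append(record)

    train: List[Record] = []
    test: List[Record] = []
    for user_records in by_user.values():
        # Order each user's accesses by timestamp before cutting.
        ordered = sorted(user_records, key=lambda rec: rec[3])
        # Round down (assumption); the small epsilon guards against
        # floating-point products such as 2.9999999 for an exact 3.
        cut = math.floor(len(ordered) * x + 1e-9)
        train.extend(ordered[:cut])
        test.extend(ordered[cut:])
    return train, test
```

Splitting per user rather than over the whole dataset keeps every user represented in both sets, which matches the statement that the first 60% of each user's records are used for training.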

To evaluate the PL-CR2 (0 < γ < 1) algorithm, as there is no publicly available dataset that provides all of the relevant contextual information (such as learner’s connecting mode and network bandwidths), we build two new datasets based on MovieLens and D-Lib records respectively to reflect the characteristics of P-Learning environments. Each item is given a type, a size and a corresponding storage node, and each user is given bandwidth information to all storage nodes and a response time requirement for resource access.

Fig. 10 The characteristics of two datasets used in simulations

In this paper, due to space limitations, we only report simulation results for the MovieLens dataset. Similar results can be observed on the D-Lib records dataset.

5.2 Evaluation metrics

In this paper, the evaluation metrics of the recommendation algorithm are divided into two categories: logic-based evaluation metrics and integration-based evaluation metrics. The former is defined as precision [22], shown in formula (18), where p is the size of the user set, RRS(Ui) denotes the recommendation set of user Ui, and |TestSet ∩ RRS(Ui)| denotes the number of hits in Ui’s recommendation set.

$$\mathrm{Precision} = \frac{\sum_{i=1}^{p} |\mathrm{TestSet} \cap \mathrm{RRS}(U_i)|}{p \cdot |\mathrm{RRS}(U_i)|} \tag{18}$$
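As an illustration of formula (18), a minimal Python sketch is given below. The function and argument names are hypothetical, and the test set is assumed to be kept per user. The sketch divides by the total number of recommended items, which equals p · |RRS(Ui)| when every user receives a list of the same length N.

```python
from typing import Dict, List, Set


def precision(recommended: Dict[str, List[str]], test_items: Dict[str, Set[str]]) -> float:
    """Share of recommended items that appear in the users' test sets."""
    hits = 0
    total = 0
    for user, items in recommended.items():
        # |TestSet ∩ RRS(Ui)|: recommended items the user actually accessed later.
        hits += len(set(items) & test_items.get(user, set()))
        total += len(items)
    return hits / total if total else 0.0
```

For example, two users with three recommendations each and three hits in total give a precision of 0.5.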

The latter is defined as resource access integration utility and validity, where integration utility is reflected jointly by mean time satisfaction (MTS), mean logic-based RRD (MLR) and mean integration-based RRD (MIR), and validity denotes the ratio of the number of resources that match the user’s connecting mode and satisfy the lowest response time requirement of resource access to the number of all resources. The details are defined as follows, where |Hits| denotes the number of recommended resources that match the user’s connecting mode and satisfy the lowest response time requirement of resource access.

$$
\begin{cases}
\mathrm{MTS} = \dfrac{\sum_{i=1}^{p} \sum_{j=1}^{|\text{Top-}N|} \mathrm{TSD}(U_i,R_j)}{p \cdot |\text{Top-}N(U_i)|} \\[2ex]
\mathrm{MLR} = \dfrac{\sum_{i=1}^{p} \sum_{j=1}^{|\text{Top-}N|} \mathrm{LRRD}(U_i,R_j)}{p \cdot |\text{Top-}N(U_i)|} \\[2ex]
\mathrm{MIR} = \dfrac{\sum_{i=1}^{p} \sum_{j=1}^{|\text{Top-}N|} \mathrm{IRRD}(U_i,R_j)}{p \cdot |\text{Top-}N(U_i)|} \\[2ex]
\mathrm{Validity} = \dfrac{|\mathrm{Hits}|}{p \cdot |\text{Top-}N(U_i)|}
\end{cases}
\tag{19}
$$
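The four metrics of formula (19) can be computed together, as in the following sketch. The `Scored` record and the function name are hypothetical, and each recommended resource is assumed to carry its precomputed LRRD, TSD and IRRD values plus a flag saying whether it counts as a hit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Scored:
    """Hypothetical per-recommendation record with precomputed scores."""
    lrrd: float
    tsd: float
    irrd: float
    effective: bool  # matches the connecting mode and the response time requirement


def utility_and_validity(top_n: Dict[str, List[Scored]]) -> Dict[str, float]:
    """Return MTS, MLR, MIR and Validity averaged over all Top-N lists."""
    items = [item for recs in top_n.values() for item in recs]
    count = len(items)  # equals p * |Top-N(Ui)| for equal-length lists
    if count == 0:
        return {"MTS": 0.0, "MLR": 0.0, "MIR": 0.0, "Validity": 0.0}
    return {
        "MTS": sum(item.tsd for item in items) / count,
        "MLR": sum(item.lrrd for item in items) / count,
        "MIR": sum(item.irrd for item in items) / count,
        "Validity": sum(1 for item in items if item.effective) / count,
    }
```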

Fig. 11 Relevant input parameters of simulation

5.3 Performance evaluation of PL-CR2 based on LRRD

In this section, PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1) are compared with the vector space model-based content-based recommendation algorithm [21], the user and item combined collaborative-based recommendation algorithm [24] and the improved hybrid recommendation algorithm [30] (as mentioned before) to measure performance in a traditional recommendation environment and dataset. Among the input parameters, N denotes the number of recommended resources and p denotes the number of participating users, who are selected from the MovieLens dataset to build the simulation dataset. The value ranges of N, p and K are given in Fig. 11. Meanwhile, in order to simplify the SPR calculation, each user’s access records are clustered by timestamp to build shorter sequences, and the min-support threshold μ of PrefixSpan is set to 5%, where support is the proportion of learners whose sequences satisfy a pattern to the overall number of learners.

5.3.1 The impact of parameters on PL-CR2

First, we analyze how α, β, λ, ω_IC, ω^s_ICS and ω^i_ICS affect the recommendation performance of PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1) in order to determine the values of these parameters (as different datasets may correspond to different optimal values, this analysis is based on the MovieLens dataset only). Figure 12(a) shows the impact of α on the precision of PL-CR2_IC (γ = 1) while N = 30, p = 250, K = 20 and ω_IC = 1 (only IPR is considered). It indicates that jointly considering the preference weight and the user’s rating information when calculating IPR plays a positive role in the recommendation process, but α does not follow a ‘the larger the better’ rule: the best precision is obtained with α = 0.6.

Figure 12(b) shows the impact of β on the precision of PL-CR2_IC (γ = 1) while N = 30, p = 250, K = 20, α = 0.6 and ω_IC = 0 (only CPR is considered). It indicates that combining the preference weight and the user’s rating information to calculate the similarity between two users leads to better recommendation performance, and the best precision is obtained with β = 0.8.

Figure 12(c) shows the impact of λ on the precision of PL-CR2_IC (γ = 1) while N = 30, p = 250, K = 20 and ω^i_ICS = ω^c_ICS = 0 (only SPR is considered). It indicates that taking LR_1 and LR_w as the prefix when calculating SPR adds effective recommended resources; the best precision is obtained with λ = 0.3.

Fig. 12 The impact of parameters about PL-CR2 based on traditional recommendation environment


Figure 12(d) shows the impact of ω_IC on the precision of PL-CR2_IC (γ = 1) with different p while N = 30, K = 20, α = 0.6 and β = 0.8. It indicates that combining the content-based and collaborative-based recommendation strategies overcomes the drawbacks of each strategy used alone and leads to better recommendation results. In general, the optimal value of ω_IC changes with the situation. When p is small, similar users cannot be found effectively, so the best performance is obtained when ω_IC is set to about 0.5. When p is large, the similarity between two users can be calculated much more accurately, so a smaller ω_IC value leads to better results. However, since the calculation of CPR does not consider the individual preference RRD between the target user and the candidate resources, better results cannot be obtained when ω_IC is too small either, as IPR is then neglected (the best overall precision is obtained with ω_IC = 0.3).

Figure 12(f) shows the impact of ω^s_ICS and ω^i_ICS on the precision of PL-CR2_ICS (γ = 1) with different p while N = 30, K = 20, α = 0.6, β = 0.8 and λ = 0.3. As there may be too many alternative combinations of p, ω^s_ICS and ω^i_ICS, in order to simplify the simulation process, the proportion between IPR and CPR is kept as close as possible to the optimal ω_IC value shown in Fig. 12(d), so that ω^s_ICS becomes the main variable, which indirectly determines the ω^i_ICS value. On this basis, the available alternative combinations of weight values among IPR, CPR and SPR used in this simulation are listed in Fig. 12(e), where the minimum granularity of weight change is 0.1. The simulation result indicates that introducing sequential pattern-based RRD into the recommendation process can predict the learner’s behavior more accurately and may obtain better results to some extent. However, although the performance of SPR improves as learners’ access sequences grow, a large ω^s_ICS value usually leads to worse results, because the collaborative-based recommendation mechanism can usually make better use of other users’ information. Consequently, the best precision is obtained empirically with ω^s_ICS = ω^i_ICS = 0.2.

Based on the analysis above, for the MovieLens dataset the values of these weights are set to α = 0.6, β = 0.8, λ = 0.3, ω_IC = 0.3 and ω^s_ICS = ω^i_ICS = 0.2 for the following simulations.
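To make the role of these weights concrete, the sketch below combines IPR, CPR and SPR into a logic-based RRD. It is an assumption rather than the paper's exact formula: the combination is taken to be a convex weighted sum, which is consistent with ω_IC = 1 meaning "only IPR" and ω_IC = 0 meaning "only CPR", and the CPR weight of the ICS strategy is taken to be 1 − ω^i_ICS − ω^s_ICS. The function names are hypothetical.

```python
def lrrd_ic(ipr: float, cpr: float, w_ic: float = 0.3) -> float:
    """IC strategy (assumed form): w_ic weights IPR, the remainder weights CPR."""
    return w_ic * ipr + (1.0 - w_ic) * cpr


def lrrd_ics(ipr: float, cpr: float, spr: float, w_i: float = 0.2, w_s: float = 0.2) -> float:
    """ICS strategy (assumed form): weights for IPR, CPR and SPR sum to one."""
    w_c = 1.0 - w_i - w_s  # assumed CPR weight
    if w_c < 0:
        raise ValueError("w_i + w_s must not exceed 1")
    return w_i * ipr + w_c * cpr + w_s * spr
```

Under this reading, the selected values ω^s_ICS = ω^i_ICS = 0.2 leave a weight of 0.6 for CPR, so the IPR to CPR proportion (0.2 : 0.6) stays close to the one implied by ω_IC = 0.3 (0.3 : 0.7).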

5.3.2 The comparison of five algorithms

The first comparison simulation compares the precision of the five recommendation algorithms with respect to p (the number of participating users) while N = 30 and K = 20. As shown in Fig. 13(a), as p increases, the precision of each algorithm increases except for the content-based algorithm, and PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1) always perform better than the other algorithms. This is because when p is small, users’ information cannot be utilized efficiently, so the collaborative-based recommendation mechanism cannot detect effective similar users and produces poor results. As p increases, users’ information can be utilized much more efficiently, and the performance of the collaborative-based mechanism improves gradually. Meanwhile, because it combines the content-based and collaborative-based recommendation mechanisms, the improved hybrid recommendation algorithm is somewhat better than either mechanism alone. On the other hand, as PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1) not only combine the content-based and collaborative-based mechanisms but also jointly consider the time-related dynamic preference, the multi-dimensional attributes of resources and users’ rating information, better results are obtained whether p is small or large. Besides, when p is small, since there are not enough access sequences, PL-CR2_ICS (γ = 1) performs worse than PL-CR2_IC (γ = 1). As p increases, the PL-CR2_ICS (γ = 1) algorithm obtains many more effective sequential patterns, so it achieves better results than PL-CR2_IC (γ = 1) to some extent.

The second comparison simulation compares the precision of the five recommendation algorithms with respect to K (the number of similar neighbors) while N = 30 and p = 250. As shown in Fig. 13(b), while K stays within a certain range, the precision of each algorithm increases with K except for the content-based algorithm. Once K exceeds a certain value, the precision of each algorithm decreases as K increases, especially for the collaborative-based algorithm. Moreover, throughout the range of K, PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1) always perform better than the other algorithms. The reason is that when K becomes large, several dissimilar users may be treated as similar users by the collaborative-based algorithm, so the corresponding recommendation accuracy decreases. The improved hybrid algorithm also considers the content-based mechanism, so it degrades less than the collaborative-based algorithm. In contrast, PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1) set a threshold in the similar user calculation to guarantee the quality of the selected neighbors. Meanwhile, the users’ dynamic preference and the resources’ attributes are taken into account on top of the traditional collaborative-based mechanism. Therefore they can find effective similar users more accurately, and their performance degrades only slightly. Besides, for the same reasons mentioned before, the PL-CR2_ICS (γ = 1) algorithm obtains better results than PL-CR2_IC (γ = 1) to some extent.
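The effect of a similarity threshold on neighbor selection can be illustrated with the short sketch below. The function name and the threshold value are hypothetical; the section states only that a threshold is used, not how it is chosen.

```python
from typing import Dict, List


def select_neighbors(similarities: Dict[str, float], k: int, threshold: float) -> List[str]:
    """Return at most k users, most similar first, whose similarity reaches the threshold."""
    # Users below the threshold are never selected, even if fewer than k remain.
    eligible = [(user, sim) for user, sim in similarities.items() if sim >= threshold]
    # Sort by descending similarity; ties are broken by user id for determinism.
    eligible.sort(key=lambda pair: (-pair[1], pair[0]))
    return [user for user, _ in eligible[:k]]
```

With a threshold in place, raising K beyond the number of sufficiently similar users adds no further neighbors, which is one way to read the smaller degradation reported for large K.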

The third comparison simulation compares the precision of the five recommendation algorithms with respect to N (the number of recommended resources) while p = 250 and K = 20. As shown in Fig. 13(c), as N increases, the precision of each algorithm decreases. Moreover, PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1) always perform better than the other algorithms, especially when N is small. This is because, according to formula (18), the numerator and denominator of precision both increase as N grows, but the denominator increases faster. In addition, PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1) exploit the advantages of the content-based and collaborative-based recommendation mechanisms while integrating three kinds of information: the multi-dimensional attributes of resources, users’ ratings and users’ access preference weights. Hence the actual preferences and interests of users are reflected accurately. On this basis, PL-CR2_IC (γ = 1) and PL-CR2_ICS (γ = 1) can filter and rank the candidate resources much more efficiently, so the largest improvement is obtained when N is small. Furthermore, as PL-CR2_ICS (γ = 1) can utilize additional information, namely the dependence relationships among users’ access sequences, it achieves some further improvement over PL-CR2_IC (γ = 1).

Fig. 13 The comparison of five algorithms’ performance based on traditional recommendation environment

The fourth comparison simulation measures the mean running time of all algorithms for a single learner with respect to p (the number of participating users) while N = 30 and K = 20. As shown in Fig. 13(d), in most cases the content-based recommendation algorithm is faster than the other algorithms. The running times of PL-CR2_IC (γ = 1), PL-CR2_ICS (γ = 1) and the improved hybrid recommendation algorithm are slightly larger than that of the collaborative-based algorithm. In summary, based on the simulations above, although PL-CR2_ICS (γ = 1) obtains better results in most cases, it has the largest running time because it considers the sequential pattern based RRD. Therefore, there is a trade-off between running time and recommendation precision when choosing the PL-CR2_ICS (γ = 1) algorithm.

Fig. 14 The comparison of PL-CR2_IC (γ = 0.5) and PL-CR2_IC (γ = 1) on integration utility (a) and validity (b)

5.4 Performance evaluation of PL-CR2 based on IRRD

The purpose of this section is to validate that recommendation algorithms considering integration-based RRD adapt much better to P-Learning environments. As most existing recommendation algorithms are based only on logic-based RRD, without loss of generality we take the PL-CR2_IC (γ = 1) algorithm as the representative of logic-RRD based algorithms. The better performance of the integration-RRD based algorithm on the resource access integration utility and validity metrics is then validated by comparing PL-CR2_IC (0 < γ < 1) with PL-CR2_IC (γ = 1). In order to simplify the simulation process, only PL-CR2_IC (γ = 0.5) is considered as the integration-RRD based algorithm.

As there is no publicly available dataset that provides the relevant information, such as learners’ connecting mode and network bandwidths, a new dataset that matches the characteristics of P-Learning environments is built from the MovieLens dataset. The generation strategy is as follows:

(1) Each movie is randomly assigned either the 3GP format or the AVI format, where the former targets cell phone users and the latter targets PC users;

(2) S storage nodes SN_1, SN_2, ..., SN_S are set up in the P-Learning environment, and each movie belongs to a given storage node with probability 1/S;

(3) Each user is randomly assigned a connecting mode, cell phone or PC; meanwhile, the bandwidth between each user and each storage node and each user’s resource access response time requirement (as [req_min, req_max]) are set according to a certain probability distribution function.
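The generation strategy above can be sketched as follows. Everything beyond the three listed steps is an assumption made for illustration: the function names, the dictionary layout, the uniform distributions and their numeric ranges are hypothetical, since the text does not name the probability distribution function.

```python
import random
from typing import Dict, List, Tuple


def build_context_dataset(
    movie_ids: List[str], user_ids: List[str], num_nodes: int, seed: int = 0
) -> Tuple[Dict[str, dict], Dict[str, dict]]:
    """Attach synthetic P-Learning context to existing movie and user ids."""
    rng = random.Random(seed)  # fixed seed keeps repeated runs reproducible
    movies = {
        movie: {
            "format": rng.choice(["3GP", "AVI"]),  # step (1): random format
            "node": rng.randrange(num_nodes),      # step (2): node chosen with probability 1/S
        }
        for movie in movie_ids
    }
    users = {}
    for user in user_ids:
        req_min = rng.uniform(1.0, 5.0)  # hypothetical range
        users[user] = {
            "mode": rng.choice(["cell phone", "PC"]),  # step (3): random connecting mode
            # One bandwidth value per storage node (hypothetical range).
            "bandwidth": [rng.uniform(0.5, 10.0) for _ in range(num_nodes)],
            # Response time requirement as (req_min, req_max).
            "response_time": (req_min, req_min + rng.uniform(1.0, 10.0)),
        }
    return movies, users


def type_matches(mode: str, movie_format: str) -> bool:
    """3GP targets cell phone users and AVI targets PC users, as in step (1)."""
    return (mode, movie_format) in {("cell phone", "3GP"), ("PC", "AVI")}
```

Fixing the seed per run and changing it between the repetitions is one simple way to obtain independent but reproducible datasets.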

On this basis, PL-CR2_IC (γ = 0.5) and PL-CR2_IC (γ = 1) are compared on the newly generated dataset with N = 30, p = 250, K = 20, α = 0.6, β = 0.7, λ = 0.3 and ω_IC = 0.3. The simulations are repeated 10 times.

The fifth comparison simulation compares the resource access integration utility (represented jointly by MTS, MLR and MIR) of PL-CR2_IC (γ = 0.5) and PL-CR2_IC (γ = 1) to show that combining the logic-based RRD with the response time satisfaction degree makes recommendation more adaptable to P-Learning environments. As shown in Fig. 14(a), because the candidate resources are sorted by LRRD in PL-CR2_IC (γ = 1), it yields the higher MLR (γ = 1). However, because the response time satisfaction degree is neglected during resource ranking, PL-CR2_IC (γ = 1) leads to longer access response times (even beyond users’ tolerance range), which harms the user’s resource access performance (a lower MTS (γ = 1)). By comparison, in the PL-CR2_IC (γ = 0.5) algorithm the candidate resources are sorted by IRRD. In this case, although MLR (γ = 0.5) decreases to some extent, the corresponding MTS (γ = 0.5) improves greatly. Therefore, from the users’ point of view, it leads to better integrated performance (a higher MIR (γ = 0.5)) and adapts much better to the characteristics of P-Learning environments (distributed storage and multi-mode connecting).

The sixth comparison simulation compares the resource access validity of PL-CR2_IC (γ = 0.5) and PL-CR2_IC (γ = 1). As shown in Fig. 14(b), the Validity of PL-CR2_IC (γ = 0.5) is always 1, but the Validity of PL-CR2_IC (γ = 1) is much lower. This is because the PL-CR2_IC (γ = 1) algorithm neglects resource type matching and access time satisfaction degree matching during candidate resource ranking, so the recommendation set contains several ineffective resources whose type does not match the user’s connecting mode or whose access response time exceeds the user’s tolerance. In contrast, the PL-CR2_IC (γ = 0.5) algorithm considers CTR and TSD when building the candidate resource set, so the ineffective resources are filtered out and the validity of the recommended resources is guaranteed.

Fig. 15 The integration architecture of our model into SEU-ESP platform

6 Prototype implementation and demonstration

In this section, we implement the context-aware resource recommendation model and the corresponding algorithm proposed in this paper on top of SEU-ESP (SouthEast University digital Education public Services supporting Platform), and describe a use case that demonstrates how our proposal can help learners in the learning process.

6.1 SEU-ESP

SEU-ESP is an SOA-based platform that supports the integration and sharing of massive, distributed and heterogeneous education resources. The platform provides reliable registration, organization, storage and management services to eliminate ‘resource islands’. Learners can access and download these education resources from the platform transparently at any time and anywhere via multi-mode connecting. This project is part of the national key technologies R&D program of China during the 11th five-year plan period, supported by the Ministry of Science and Technology.

SEU-ESP mainly consists of three parts:

(1) Resources Acquisition Module, which achieves the encapsulation, registration, organization and management of massive education resources;

(2) Resources Integration and Sharing Module, which realizes the integration and sharing of massive, distributed and heterogeneous education resources according to a certain storage scheduling strategy based on the Grid concept;

(3) Access and Operation Module, which supports multiple connecting modes (such as desktop PC, notebook and PDA), the operation of the system portal and the management of learners.

Based on the SOA concept, we have developed several support services that provide these functions to satisfy the system’s requirements. All support services fall into two categories, External Service (ES) and Internal Service (IS): ES provides the service interface for calls from the upper layer, and IS provides the interfaces and functions that support ES.

The OS of SEU-ESP is Windows 2003 Server, the service container is based on IBM WebSphere Application Server (WAS), the directory service is Tivoli Directory Server (LDAP) of WebSphere Toolkit and the database is DB2. The development platform is Rational Application Developer v7.0 of WebSphere Toolkit based on J2SE and J2ME versions.


Fig. 16 The class graph of resource recommendation prototype system

6.2 The implementation of context-aware resource recommendation module and algorithm

In order to verify our proposal, we have implemented a prototype system based on the SEU-ESP platform, as shown in Fig. 15. The development platform is based on Rational Application Developer v7.0 of WebSphere Toolkit and DB2. The support service selection module is developed in centralized mode, and the resource recommendation module is embedded into each study center in distributed mode to support large-scale access requests. In addition, the implemented classes and relevant functions are described in Fig. 16.

The execution sequence of the recommendation process is as follows. A recommendation request entered by the learner through one of the connecting devices is submitted to the service selection module, which uses the learner’s information and the dynamic information of the support services to select the most suitable study center and support service for the learner. Then, in the resource recommendation process, context management collects the preferences of the learner and of the learner’s neighbors, the historical resource access records and the relevant dynamic information about resources. Based on this information, the resource recommendation module calculates the relevance of resources to the learner’s preference and finds the k neighbors most similar to the learner to collaborate in the recommendation; meanwhile, the sequential patterns are also considered to predict resource accesses in the near future. The ranked candidate resource list is then provided to the learner as the recommendation result. Finally, after the learner accesses a resource, the relevant context information, such as the learner’s IPT, is updated.

Fig. 17 The recommendation page of the prototype system

6.3 Usage demonstration

We have deployed this prototype system on our two campuses, which are connected via high speed optical fiber. Each campus hosts a study center, with IP addresses 172.18.12.249 and 10.3.6.124 respectively. The hardware consists of IBM X335 servers with a 3.0 GHz CPU and 2.5 GB of memory. Furthermore, we have registered several learners’ information and preferences, as well as several resources belonging to a variety of subjects, in the database. Here, we suppose that a graduate student, Tommy, wants to get several interesting coursewares from this recommendation system.

He connects to the system from the desktop PC in his lab and submits a recommendation request, together with his resource access response time requirement, to the service selection module. As Tommy’s IP address is 172.18.12.141, based on each support service’s S-context and the learner’s L-context, the system selects the recommendation service deployed on Study Center A (172.18.12.249). This selection process is transparent to the learner. In the resource recommendation process, all candidate resources are filtered and ranked by calculating the IPR, CPR, SPR, CTR and TSD between Tommy and each resource Rj according to the L-context and R-context. The recommendation list is then presented on the recommendation page, as shown in Fig. 17. Finally, after clicking the first resource in the list, the learner can view the detailed information of this education resource on the resource information page, and can then download and rate it, as shown in Fig. 18.

7 Conclusion and future works

In order to enhance learning efficiency, it is important to provide a resource recommendation mechanism based on context information in P-Learning environments. However, as the amount of education resources is very large and learners’ preferences and relevant contextual information change dynamically, there are several drawbacks when existing recommendation algorithms are applied to P-Learning environments directly. To address these drawbacks, a context-aware resource recommendation model and a corresponding algorithm for P-Learning environments, called PL-CR2, are presented in this paper, taking the relevant contextual information into consideration. In the PL-CR2 algorithm, in order to increase recommendation accuracy, the content-based and collaborative-based recommendation concepts are combined while considering the multi-dimensional attributes of resources, learners’ dynamic preference weights, relevant rating information and access sequential patterns. On the other hand, in order to improve the recommendation adaptability for P-Learning environments, the connecting type relevance and time satisfaction degree of a resource are calculated according to the learner’s dynamic context (such as connecting mode and network bandwidth). The simulation results show that our algorithm can surpass traditional recommendation algorithms significantly in precision, recall, integrated utility and validity metrics and could be more suitable for P-Learning environments. Moreover, a prototype system for our recommendation model and algorithm is implemented on SEU-ESP to demonstrate the detailed recommendation process and results.

Fig. 18 The resource information page of the prototype system

In the future, we plan to continue our work on how to make use of more access rules of learners to further improve recommendation performance. Learning processes usually have several periodicities and a specific schedule; for example, ‘operating system lecture notes’ will be accessed with higher probability by learners whose learning scheme includes that subject or a secondary subject. Thus, we may consider tracking and analyzing each learner’s learning schedule, so that learners’ preferences and behaviors can be predicted piecewise to personalize the recommendation and improve performance. Another interesting direction is to investigate how to automatically set and adjust the personalized weight parameters of PL-CR2 for different learners to make a further improvement. Last but not least, we will focus on the scalability optimization of our algorithm.

Acknowledgements This work is supported by the National Natural Science Foundation of China under Grant Nos. 60773103 and 90412014, the China Specialized Research Fund for the Doctoral Program of Higher Education under Grant No. 200802860031, the Jiangsu Provincial Natural Science Foundation of China under Grant Nos. BK2007708 and BK2008030, the Jiangsu Provincial Key Laboratory of Network and Information Security under Grant No. BM2003201, and the Key Laboratory of Computer Network and Information Integration, Ministry of Education of China, under Grant No. 93K-9.

References

1. Weiser, M.: The computer of the 21th century. Sci. Am. 265(3),66–75 (1991)

2. Schilit, B., Adams, N., Want, R.: Context-aware computing appli-cations. In: Proceedings of IEEE Workshop on Mobile ComputingSystems and Applications, pp. 85–90. Santa Cruz, California, De-cember 1994

3. Packard, H.: Cooltown Project. http://www.cooltown.com/cooltown/index.asp (2004)

4. Dey, A., Salber, D., Abowd, G.: A conceptual framework and atoolkit for supporting the rapid prototyping of context-aware ap-plications. Hum.-Comput. Interact. 16 (2001)

5. Mostefaoui, S.K., Bouzid, A.T., Hirsbrunner, B.: Using contextinformation for service discovery and composition. In: Proc. of5th International Conference on Information Integration and Web-based Applications and Services, pp. 129–138 (2003)

238 Cluster Comput (2010) 13: 213–239

6. Roman, M., et al.: A middleware infrastructure to enable activespaces. IEEE Pervas. Comput. 1(4), 74–83 (2002)

7. Ryan, N., Pascoe, J., Morse, D.: Enhanced reality fieldwork: Thecontext-aware archaeological assistant. Comput. Appl. Archaeol.(2007)

8. Yang, S.J.H.: Context aware ubiquitous learning environmentsfor peer-to-peer collaborative learning. Educ. Technol. Soc. 9(1),188–201 (2006)

9. El-Bishouty, M.M., Ogata, H., Yano, Y.: PERKAM: Personal-ized knowledge awareness map for computer supported ubiquitouslearning. Educ. Technol. Soc. 10(3), 122–134 (2007)

10. Yang, S.J.H., Huang, A.F.M.: Context model and context acqui-sition for ubiquitous content access in U-Learning environments.In: Proceedings of the IEEE International Conference on SensorNetworks, Ubiquitous, and Trustworthy Computing (2006)

11. Li, L., Zheng, Y., Ogata, H., Yano, Y.: A framework of ubiquitouslearning environments. In: Proceedings of the 4th InternationalConference on Computer and Information Technology (2004)

12. Ogata, H., Yano, Y.: Context-aware support for computer-supported ubiquitous learning. In: Proceedings of the 2nd IEEEInternational Workshop on Wireless and Mobile Technologies inEducation (2004)

13. Yang, T.-C., Kuo, F.-R., Hwang, G.-J., Chu, H.-C., , A.: Computer-assisted approach for designing context-aware ubiquitous learningactivities. In: IEEE International Conference on Sensor Networks,Ubiquitous, and Trustworthy Computing, pp. 524–530 (2008)

14. Syukur, E., Loke, S.W.: MHS learning services for pervasive cam-pus environment. In: Proceedings of the 4th IEEE InternationalConference on Pervasive Computing and Communications Work-shops (2006)

15. Siobhan, T.: Pervasive, persuasive eLearning: modeling the perva-sive learning space. In: Proceedings of the 3rd IEEE InternationalConference on Pervasive Computing and Communications Work-shops (2005)

16. Rogers, Y., Price, S., Randell, C., Fraser, D.S., et al.: Ubi-learningintegrating indoor and outdoor learning experiences. Commun.ACM 48(1), 55–59 (2005)

17. Joiner, R., Nethercott, J., Hull, R., Reid, J.: Designing educationalexperiences using ubiquitous technology. Comput. Human Behav.22(1), 67–76 (2006)

18. Adomavicius, G., Tuzhilin, A.: Toward the next generation of rec-ommender systems a survey of the state-of-the-art and possibleextensions. IEEE Trans. Knowl. Data Eng. 17(6) (2005)

19. Mostafa, J., Mukhopadhyay, S., Lam, W., Palakal, M.: A multi-level approach to intelligent information filtering: model, system,and evaluation. ACM Trans. Inf. Syst. 15(4), 368–399 (1997)

20. Pazzani, M., Billsus, D.: Learning and revising user profiles: theidentification of interesting web sites. Mach. Learn. 27, 313–331(1997)

21. Chen, L., Sycara, K.: WebMate: a personal agent for browsing andsearching. In: Proceedings of the Second International Conferenceon Autonomous Agents (1998)

22. Chen, C.C., Chen, M.C.: PVA: a self-adaptive personal viewagent. J. Intell. Inf. Syst. 18(2/3), 173–194 (2002)

23. Wang, J., de Vries, A.P., Reinders, M.J.T.: Unifying user-basedand item-based collaborative filtering approaches by similarity fu-sion. In: Proceedings of the 29th International ACM SIGIR Con-ference on Information Retrieval (2006)

24. Ma, H., King, I., Lyu, M.R.: Effective missing data prediction forcollaborative filtering. In: Proceedings of the 30th InternationalACM SIGIR Conference on Information Retrieval (2007)

25. Sarwar, B., Karypis, G.: Item-based collaborative filtering recom-mendation algorithms. In: Proceedings of the 10th InternationalConference on World Wide Web (2001)

26. Ziegler, C.N., McNee, S.M., Konstan, J.A.: Improving recommen-dation lists through topic diversification. In: Proceedings of the14th International Conference on World Wide Web (2005)

27. Adomavicius, G., Sankaranarayanan, R.: Incorporating contextualinformation in recommender systems using a multidimensionalapproach. ACM Trans. Inf. Syst. 23(1) (2005)

28. Deshpande, M., Karypis, G.: Item-based Top-N recommendationalgorithms. ACM Trans. Inf. Syst. 22(1), 143–177 (2004)

29. Linden, G., Smith, B., York, J.: Amazon.com recommendations:Item-to-item collaborative filtering. In: IEEE Internet Computing,pp. 76–80, January–February 2003

30. Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D.,Sartin, M.: Combining content-based and collaborative filters in anonline newspaper. In: Proceedings of the ACM SIGIR’99 Work-shop on Recommender Systems: Algorithms and Evaluation

31. Billsus, D., Pazzani, M.: User modeling for adaptive news access.User Model. User-Adapted Interact. 10(2–3), 147–180 (2000)

32. http://www.celtsc.edu.cn/. Accessed 20 January 200933. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H.: PrefixSpan: Mining

sequential patterns efficiently by prefix-projected pattern growth.In: Proceedings of the IEEE 17th International conference on DataEngineering, pp. 215–226 (2001)

34. http://movielens.umn.edu/. Accessed 20 January 2009

Junzhou Luo is currently a professor and the dean of the School of Computer Science and Engineering, Southeast University, China. He received his M.S. and Ph.D. degrees in Computer Science from Southeast University, China, in 1992 and 2000, respectively. His current research interests include pervasive computing, network security, service computing and protocol engineering.

Fang Dong is currently a Ph.D. student in the School of Computer Science and Engineering, Southeast University, China. He received his B.S. and M.S. degrees in Computer Science from Nanjing University of Science & Technology, China, in 2004 and 2006, respectively. His current research interests include pervasive computing, service computing and task scheduling.


Jiuxin Cao is currently an associate professor in the School of Computer Science and Engineering, Southeast University, China. He received his M.S. degree in Computer Application from Henan University of Science and Technology, China, in 1999 and his Ph.D. degree in Computer Science from Xi'an Jiaotong University, China, in 2003. His current research interests include pervasive computing, service computing and E-learning.

Aibo Song is currently an associate professor in the School of Computer Science and Engineering, Southeast University, China. He received his M.S. and Ph.D. degrees in Computer Science from Southeast University, Nanjing, China, in 1996 and 2003, respectively. His current research interests include pervasive computing and grid computing.
