recommender system based on workflow



Decision Support Systems 48 (2009) 237–245

Contents lists available at ScienceDirect

Decision Support Systems

journal homepage: www.elsevier.com/locate/dss

Recommender system based on workflow

Lu Zhen a,⁎, George Q. Huang b, Zuhua Jiang c

a Department of Industrial and Systems Engineering, National University of Singapore, Singapore
b Department of Industrial and Manufacturing Systems Engineering, The University of Hong Kong, PR China
c Department of Industrial Engineering and Management, Shanghai Jiao Tong University, Shanghai, PR China

⁎ Corresponding author. Tel.: +65 6516 4608. E-mail address: [email protected] (L. Zhen).

0167-9236/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.dss.2009.08.002

Article info

Article history: Received 14 December 2008; received in revised form 29 July 2009; accepted 23 August 2009; available online 29 August 2009

Keywords: Recommender system; Workflow; Collaborative filtering; Knowledge management

Abstract

This paper proposes a workflow-based recommender system model for supplying the proper knowledge to the proper members in collaborative team contexts, rather than in daily life scenarios such as recommending commodities, films, or news. Within collaborative team contexts, recommender systems can utilize more information than in ordinary daily life contexts. The workflow of a collaborative team contains information about the relationships among members, roles and tasks, which can be combined with collaborative filtering to obtain members' demands for knowledge. In addition, the work schedule information contained in the workflow can be employed to determine the proper volume of knowledge that should be recommended to each member. In this paper, we investigate the mechanism of the workflow-based recommender system, and conduct a series of experiments on several real-world collaborative teams to validate the effectiveness and efficiency of the proposed methods.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

This study concerns knowledge recommender systems for collaborative team contexts, rather than general situations in daily life, e.g., recommending commodities, news, or films to customers. In a collaborative team, members usually come from diverse disciplines, each with particular expertise and contributions from their relevant areas. Thus, their demands for knowledge also differ from each other. A recommender system provides a platform to deliver the right knowledge in the right context to the right person in the right volume [27,34,36].

This paper proposes a workflow-based recommender system model, which is oriented to the collaborative team environment. Within this context, recommender systems can utilize more information than in ordinary daily life situations. Workflow is one type of collaborative process and it exists behind virtually every collaborative team [37,38]. The workflow in the collaborative team environment contains members-roles-tasks reference information that describes which member plays which roles or fulfills which tasks. This reference information can be combined with collaborative filtering to obtain members' demands for knowledge. It ensures that knowledge resources in the proper domains will be recommended to the proper members of the collaborative team. Moreover, the volume of the recommended knowledge resources should also be proper for each member. Otherwise, too much knowledge may be recommended to some busy members, which will cause information overload and interrupt their work. In our study, the work schedule information contained in the workflow is utilized to determine the proper volume of recommended knowledge for each member.

This paper investigates the mechanism of the workflow-based recommender system, and conducts a series of experiments on several real-world collaborative teams so as to validate the effectiveness and efficiency of the proposed methods. The rest of this paper is organized as follows. Some related work by other scholars is briefly introduced in the next section. In Section 3, we introduce the application background, the collaborative environment, which is the basis for our proposed method. Then, Section 4 addresses the general framework of the workflow-based recommender system, and analyzes two key technical issues. Sections 5 and 6 investigate those issues in detail respectively: workflow-based collaborative filtering, and recommendation volume control using the schedule information in the workflow. For performance evaluation, several experiments are conducted in Section 7 to validate the proposed model and methods. Closing remarks and a summary are then outlined in the last section.

2. Related works

Recommendation technology has become a promising and active area in both academia and industry, and numerous recommender systems (RSs) have been developed [2]. Tapestry [7] is one of the earliest RSs. Based on this work, several automated RSs were designed and implemented. An RS for news and movie recommendations was developed by Konstan et al. [13]. For book recommendation, Mooney and Roy proposed a content-based RS [23]. McNee et al. designed an RS to help recommend citations for research papers [19]. Citeseer [3], Webpersonalizer [22], GroupLens [13], and SiteSeer [25] filter and recommend web information according to the similarities between web resources and users' interests. For improving sales on e-commerce websites, a taxonomy RS was developed by Schafer et al. [26]. Ontology technologies were also brought into RS research. Middleton et al. explored a novel ontological approach to user profiling within RSs, which could recommend on-line academic research papers [20,21]. Li and Zhong presented an abstract Web mining model for extracting approximate concepts hidden in user profiles, which could make recommendation much more efficient [14,15]. Godoy and Amandi designed a document clustering algorithm that carried out incremental, unsupervised concept learning over Web documents for acquiring user profiles to support the recommendation of web information [6]. Yu et al. suggested a hybrid collaborative filtering method for multiple-interests and multiple-content recommendation in e-commerce [31]. By analyzing customer behaviors (navigational patterns), Yong et al. designed a collaborative filtering based RS for e-commerce sites [30]. Yeong et al. proposed a new methodology in which customer purchase sequences were used to improve the quality of collaborative filtering based recommendation [29]. Liang et al. developed a knowledge recommender that allows customized content to be suggested based on the user's browsing profile. The method adopts a semantic-expansion approach to build the user profile by analyzing documents previously read by the person [16]. Malinowski et al. developed a relational recommendation approach for providing an automated pre-selection of candidates that fit best with future team members [18]. Garfinkel et al. designed a recommender system which extends the one-product-at-a-time search approach used in current ‘Shopbot’ implementations to consider purchasing plans for a bundle of items [4]. They also developed ‘Shopbot 2.0’ to integrate recommendations and promotions with comparison shopping [5]. Jiang et al. studied how to maximize customer satisfaction through an online recommender system, and proposed a novel associative classification model. Products could be recommended to the potential buyer if the model predicts that his/her satisfaction level will be high [9]. Kagie et al. designed a graphical shopping interface based on product attributes. It represents the mutual similarities of the recommended products in a two-dimensional map, where similar products are located close to each other and dissimilar products far apart [11]. Wang et al. developed a navigation graph-based recommender system, in which the navigation patterns of previous website visitors are utilized to provide recommendations for newcomers [28].

To adapt recommendation technologies to large-scale peer-to-peer (P2P) environments, Han et al. suggested a distributed collaborative filtering algorithm to construct a scalable distributed RS [8]. Kim et al. implemented an image content recommender in a P2P architecture [12]. Olsson developed a headline recommender system in a P2P environment without centralized control [24]. As to multi-dimensional RSs [1], a workflow space based collaborative filtering method was proposed in [35]. However, those systems mainly concern personalized recommendations in ordinary daily life situations, e.g., recommending commodities, news, or films to customers. They have not considered specific applications of knowledge recommendation in collaborative team contexts, or in specific business processes. Jung proposed a blog context overlay network architecture for context matching between blogs, so as to realize knowledge recommendation and distribution among members of a community [10]. Liu and Wu developed a novel task-based knowledge recommender. A modified relevance feedback technique, integrated with a task-relevance assessment method, provides knowledge workers with task-relevant information based on task profiles [17]. Based on the knowledge grid environment, some conceptual models of a proactive knowledge recommender system and a reactive knowledge query platform were proposed in [32,33].

3. Workflow-centric collaborative environment

The recommender system model proposed in this paper is mainly oriented to the workflow-centric collaborative environment, and one example is illustrated in Fig. 1. This collaborative environment consists of three key concepts: members, roles and tasks. Members are the core factor in the collaborative environment. All knowledge is produced and also used by members while performing their tasks. Each member in a collaborative organization may have one or more roles, e.g., product engineer, design engineer, test engineer, mechanical designer, etc. Moreover, there are superior-subordinate relationships between those various roles, all of which constitute a role hierarchy, denoted by ‘role tree’. The role tree reflects the organizational architecture of a collaborative team. Besides its mapping to the member list, the role tree also has a mapping to the task model. As shown in Fig. 1, Mapping_1 deploys the roles onto tasks in the workflow; Mapping_2 deploys the roles onto members in the member list.

The workflow-centric collaborative environment is the application context for our proposed recommender model, while traditional recommender systems are oriented to general situations in daily life. The collaborative environment contains some potentially useful relationships among the collaborative members, which could be utilized to support collaborative filtering in recommender systems.

4. The framework of recommender system based on workflow

As for the implementation of recommender systems in a collaborative environment, there are two core issues. Firstly, which domains of recommended knowledge are suitable for each member? Secondly, what is the suitable volume of recommended knowledge for each member? Once these two questions are answered, the proper knowledge can be delivered to the proper persons in the proper volume. Fig. 2 illustrates the general framework of the recommender system based on workflow. There are two core modules in the model, which are marked in the figure.

For the first issue, a novel collaborative filtering method based on task (or role) relationships from the workflow is proposed to obtain the ‘members-to-knowledge domains’ relation table, which reflects the members' demands for suitable domain knowledge.

For the second issue, a statistical analysis method is proposed to obtain the occupancy scale from the schedule in the workflow, so as to determine the suitable volume of the recommended knowledge.

The following two sections give details of these two key issues respectively.

5. Collaborative filtering based on workflow

This section investigates the first key issue: for the knowledge recommended to each member, how to determine the suitable domains that may be potentially useful for him (or her). We propose a novel collaborative filtering method based on workflow to reach this target.

Workflow realizes work cooperation between team members through a definite logical process, and it is used to integrate distributed and heterogeneous tasks (activities) into a unified process. Abundant information is contained in the workflow, e.g., the logical dependence order relationships between team members' tasks (activities), and the member-roles-tasks reference information that describes which member plays which roles or fulfills which tasks. This information can be combined with collaborative filtering so as to obtain members' demands for knowledge from their correlative colleagues. It ensures that knowledge resources in the proper domains can be recommended to the proper members of the collaborative team.

Fig. 1. An example of the workflow-centric collaborative environment.

5.1. Collaborative filtering based on the same task or role

Collaborative methods try to predict a member's preferences based on the preferences of other similar members. More formally, the weight of a knowledge category c for member m is estimated according to the weights assigned to category c by those members who are ‘similar’ to member m.

Various approaches have been used to quantify the similarities between members, such as the Pearson correlation coefficient, the cosine-based approach, etc. In this paper, we make use of the information of task and role. There is an assumption here: members in the same task or role can be regarded as ‘similar’ members. In this way, we avoid calculating members' similarities, so the process of determining ‘similar’ members based on the information of task or role can be much faster and more rational. As shown in Fig. 3, ‘Member-1’ has three colleagues in the same task (T_3). The knowledge demands of ‘Member-1’ will be influenced by the other three members (‘Member-2’, ‘Member-3’, ‘Member-4’). In addition, those three colleagues may differ in working experience; veteran colleagues will have more influence on the knowledge demands of ‘Member-1’ than novice colleagues.

Fig. 2. The framework of the recommender system based on workflow.

As to the member's knowledge demands, the weight of category $c_i$ for member $m$ (denoted by $w_{m,i}$) is calculated as follows:

$$w_{m,i} = \bar{w}_m + k \sum_{m' \in \overset{\frown}{M}} w(m') \times \left(w_{m',i} - \bar{w}_{m'}\right) \qquad (1)$$

where $\overset{\frown}{M}$ denotes the set of N members that belong to the same task or role, and $w(m')$ is the weight of the member $m'$ in the task or role. In this paper, it is derived from $\mathrm{Year}(m')$, the difference between the current year and the year the member took up the task or role. Veteran members (with longer working experience) will have more influence on others' knowledge demands than novice members. It is assumed that a member with $hl$ years of experience has half the influence of an expert, who has the highest influence value. Here, $hl$, abbreviated from ‘half-life span’, is a parameter that determines the degree of impact from experts to novices. For example, if the highest influence value is set to 1.0 and $hl$ to 5, a member with 5 years of working experience has a weight $w(m)$ equal to 0.5. Members' weights approach 1.0 as their working experience grows. In formula (1), the weight of members and the other coefficients are defined as:

$$w(m') = \frac{2}{\pi} \arctan\left(\mathrm{Year}(m') / hl\right) \qquad (2)$$

$$k = 1 \Big/ \sum_{m' \in \overset{\frown}{M}} w(m') \qquad (3)$$

$$\bar{w}_m = \left(1 / |S_c|\right) \times \sum_{c \in S_c} w_{m,c} \qquad (4)$$

$$S_c = \{c \in C \mid w_{c,m} \neq \phi\}. \qquad (5)$$
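Formulas (1) to (5) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the dictionary-based profile layout, and the example members and categories are all hypothetical, and the sketch assumes that every colleague in the task already has a weight for the category being predicted.

```python
import math

def experience_weight(years, hl=5.0):
    """Formula (2): w(m') = (2/pi) * arctan(Year(m') / hl)."""
    return (2.0 / math.pi) * math.atan(years / hl)

def mean_weight(profile):
    """Formulas (4)-(5): average over the categories the member has a weight for."""
    return sum(profile.values()) / len(profile)

def predict_weight(target, category, peers, profiles, years, hl=5.0):
    """Formula (1): estimate the weight of `category` for `target`.

    `peers` are the other members of the same task or role.
    Assumption: every peer has a weight for `category`.
    """
    k = 1.0 / sum(experience_weight(years[p], hl) for p in peers)  # formula (3)
    adjustment = sum(
        experience_weight(years[p], hl)
        * (profiles[p][category] - mean_weight(profiles[p]))
        for p in peers
    )
    return mean_weight(profiles[target]) + k * adjustment

# Hypothetical data: category weights per member and years in the task.
profiles = {
    "Member-1": {"cad": 0.4, "fea": 0.6},
    "Member-2": {"cad": 0.4, "thermo": 0.8},
    "Member-3": {"cad": 0.5, "thermo": 0.3},
}
years = {"Member-2": 5, "Member-3": 5}

estimate = predict_weight("Member-1", "thermo", ["Member-2", "Member-3"], profiles, years)
print(round(estimate, 3))  # approximately 0.55
```

With hl = 5, both colleagues have an experience weight of 0.5, so k = 1 and the two deviations (+0.2 and -0.1) shift the target's mean weight of 0.5 by 0.05.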

5.2. Multi-layer workflow and relationship types

Considering the complex architecture of collaborative teams' workflow, members who belong to two different tasks should also be regarded as having some relationship with and influence on each other. For example, suppose members belong to two different tasks: ‘diesel design’ and ‘engine design’. The former is actually a subtype of the latter, so the two tasks are in fact correlated. Therefore, the members who belong to the two tasks should also have influence on each other. Besides the above-mentioned ‘subtype’ relationship between tasks, there exist other relationship types, as shown in Fig. 4.

Fig. 3. The collaborative filtering based on the same task.

The relationships among tasks can be defined as T1 -α-> T2, where α is a type of semantic relationship between two tasks T1 and T2. Some types of semantics are defined as follows:

Part of: denoted as T1 -par-> T2, where T1 is a part of T2. As shown in Fig. 4, ‘T_1’, ‘T_2’, ‘T_3’, ‘T_4’ are each part of ‘T_b’.
Sequential: denoted as T1 -seq-> T2, which defines that T1 should be fulfilled before T2. For example, the task ‘building mathematical model for compressor’ -seq-> ‘the simulation of compressor’.
Subtype: denoted as T1 -sub-> T2, which describes the relationship between a general task T1 and a specific task T2. For example, the task ‘engine design’ -sub-> ‘diesel design’.
Supplement: denoted as T1 -sup-> T2, which means that T2 serves as the supplementary or additional task to T1.
Corequisite: denoted as T1 -cor-> T2, which denotes that T1 and T2 should be fulfilled in parallel. The corequisite relationship is symmetric. As shown in Fig. 4, ‘T_2’ and ‘T_3’ are corequisite with each other.

Fig. 4. The architecture of multi-layer workflow.

Besides the five types of relationships described above, other types may also be involved in the workflow model. For each type of relationship, a Relationship Similarity Coefficient (RSC), which is between 0 and 1, is assigned by experts or knowledge engineers. The purpose of the RSC is to calculate the influence between two members, which is defined by the Influence Coefficient (InfC).

Rule 1: if there are two members M1 and M2 and two tasks T1 and T2 such that M1 ∈ T1, M2 ∈ T2, and ∃α: T1 -α-> T2 (α is a type of relationship among tasks), then InfC(M1, M2) = RSC(α).

Rule 2: if there are two members M1 and M2 and two tasks T1 and T2 such that M1 ∈ T1, M2 ∈ T2, and ∃ $path_1 = \alpha_1^1 \alpha_2^1 \ldots \alpha_{N_1}^1$, $path_2 = \alpha_1^2 \alpha_2^2 \ldots \alpha_{N_2}^2$, …, $path_m = \alpha_1^m \alpha_2^m \ldots \alpha_{N_m}^m$ with T1 -path_1-> T2, T1 -path_2-> T2, …, T1 -path_m-> T2 (α is a type of relationship among tasks, and $N_m$ is the length of the mth path), then $\mathrm{InfC}(M1, M2) = \max\left(\prod_i \mathrm{RSC}(\alpha_i^1), \prod_i \mathrm{RSC}(\alpha_i^2), \ldots, \prod_i \mathrm{RSC}(\alpha_i^m)\right)$.

In the situation of ‘members belong to the same task or role’, as mentioned in the previous subsection, the RSC and InfC(M1, M2) are both 1.

Here, there may be more than one path or relationship chain between two tasks in the above multi-layer workflow. As shown in Fig. 4, given two tasks T_1 and T_2 with T_1 -seq-> T_2, and two members M1 ∈ T_1 and M2 ∈ T_2, we have InfC(M1, M2) = RSC(seq). However, T_1 and T_2 also belong to the same task (T_b) in the upper layer. That is: T_1 <-par- T_b -par-> T_2. According to the above rules, InfC(M1, M2) could also be defined as RSC(par) × RSC(par). Based on these two paths between tasks T_1 and T_2, the maximum of RSC(seq) and RSC(par) × RSC(par) is assigned to InfC(M1, M2).
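Rules 1 and 2 amount to a search for the path with the largest product of RSC values. The sketch below is one possible reading, not the authors' algorithm: the RSC values are hypothetical placeholders, the edge-list representation is an assumption, and relationships are assumed to be traversable in either direction so that the path through the shared parent task T_b is found. Returning 0.0 for unconnected tasks is likewise an assumption.

```python
# Hypothetical RSC values; the paper leaves them to experts or knowledge engineers.
RSC = {"par": 0.8, "seq": 0.5, "sub": 0.7, "sup": 0.4, "cor": 0.9}

def influence_coefficient(edges, t1, t2, rsc=RSC):
    """InfC between members of tasks t1 and t2: the maximum, over all simple
    paths between the tasks, of the product of the RSC values on the path."""
    if t1 == t2:
        return 1.0  # same task or role: InfC is 1

    # Assumption: each relationship can be walked in both directions.
    graph = {}
    for source, relation, target in edges:
        graph.setdefault(source, []).append((target, relation))
        graph.setdefault(target, []).append((source, relation))

    best = 0.0  # assumption: unconnected tasks have no influence
    stack = [(t1, 1.0, frozenset([t1]))]
    while stack:
        node, score, visited = stack.pop()
        for neighbour, relation in graph.get(node, []):
            if neighbour in visited:
                continue  # simple paths only
            new_score = score * rsc[relation]
            if neighbour == t2:
                best = max(best, new_score)
            else:
                stack.append((neighbour, new_score, visited | {neighbour}))
    return best

# The example from the text: T_1 precedes T_2, and both are part of T_b.
edges = [("T_1", "seq", "T_2"), ("T_1", "par", "T_b"), ("T_2", "par", "T_b")]
print(influence_coefficient(edges, "T_1", "T_2"))
```

With these placeholder values the direct path scores RSC(seq) = 0.5 and the path through T_b scores 0.8 × 0.8 = 0.64, so the larger value is chosen. Exhaustive path enumeration is adequate for small workflows; a larger task graph would call for a more efficient search.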

It should be mentioned that the predefined values of the above relationship coefficients (e.g., RSC(par), RSC(seq), etc.) are very important, because they affect the influence value between members (InfC(M1, M2)). To determine rational values for those relationship coefficients, we conduct some experiments in Section 7.

5.3. Collaborative filtering based on the correlative tasks or roles

In the previous subsection, we analyzed the relationships among tasks from the viewpoint of the multi-layer workflow architecture, and defined influence coefficients (InfC) between members. The goal of that analysis is to calculate members' knowledge demands more precisely.

Based on the influence coefficient between two members, we can calculate the members' knowledge demands through collaborative filtering. We revise formula (1) in Section 5.1 by adding the influence coefficient (InfC). The revised formula is as follows:

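$$w_{m,i} = \bar{w}_m + k \sum_{m' \in \overset{\frown}{M}} w(m') \times \mathrm{InfC}(m, m') \times \left(w_{m',i} - \bar{w}_{m'}\right) \qquad (6)$$

where:

$$w(m') = \frac{2}{\pi} \arctan\left(\mathrm{Year}(m') / hl\right) \qquad (7)$$

$$k = 1 \Big/ \sum_{m' \in \overset{\frown}{M}} w(m') \qquad (8)$$

$$\bar{w}_m = \left(1 / |S_c|\right) \times \sum_{c \in S_c} w_{m,c} \qquad (9)$$

$$S_c = \{c \in C \mid w_{c,m} \neq \phi\}. \qquad (10)$$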

The new formulas consider the influence between members in more aspects and take into account the relationships among tasks. All members are involved in the collaborative team's workflow, which is a complex architecture of tasks, roles and members. The relationships among those tasks or roles affect the correlations between collaborative team members, and hence influence their demands for knowledge.
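As a small worked instance of formula (6), with purely hypothetical numbers: suppose $\bar{w}_m = 0.5$ and member $m$ has two colleagues, $m_1$ and $m_2$, each with exactly $hl$ years of experience, so that $w(m_1) = w(m_2) = 0.5$ and $k = 1/(0.5 + 0.5) = 1$. Let $m_1$ share a task with $m$ (InfC = 1) and rate category $c_i$ 0.2 above his or her own mean, while $m_2$ works in a related task with InfC = 0.5 and rates $c_i$ 0.4 above his or her own mean. Then

$$w_{m,i} = 0.5 + 1 \times \left(0.5 \times 1 \times 0.2 + 0.5 \times 0.5 \times 0.4\right) = 0.5 + 0.2 = 0.7.$$

In this illustration the colleague in the related task shows twice the deviation but contributes the same adjustment (0.1) as the colleague in the same task, because the influence coefficient halves it.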

6. Recommendation volume control based on workflow

Besides providing knowledge in suitable domains, the volume of the recommended knowledge should also be appropriate. This section gives details of the second issue mentioned in Section 4. We propose a statistical analysis method to obtain the occupancy scale from the schedule in the workflow, so as to determine the suitable volume for knowledge recommendation.

6.1. Control the volume by threshold

For busy members, only those knowledge resources that have higher correlations with them will be recommended. On the contrary, more knowledge resources are delivered to free members. We use threshold values to control the suitable volume of recommended knowledge. For busy members (with a high occupancy scale), only the knowledge resources whose correlation degree exceeds a high threshold value are recommended, while more knowledge resources are recommended to free members with a lower occupancy scale. Here, the correlation degree is the similarity between a member's demands and the knowledge resources' descriptions. For example, for busy members who have a high occupancy scale, we could set a higher threshold value, e.g., 0.9. This means that only the knowledge resources with a correlation degree higher than 0.9 will be recommended to him (or her). On the contrary, for members with a low occupancy scale, the threshold value could be set lower, e.g., 0.6.

There are two main problems: (1) how to identify busy members and free members, in other words, how to quantify the busy degree of each member; and (2) how to connect the busy degree to the setting of threshold values.

As to the first problem, we use the occupancy scale to measure the busy degree of members. As to the second problem, we build a mapping table between the occupancy scale and a certain threshold value, in which the difference in members' experience (novices versus experts) is taken into account. These two problems are addressed in the following subsections respectively.

6.2. Occupancy scale to measure members' busy degree

The flowchart for determining the occupancy scale is illustrated in Fig. 5. The occupancy scale is obtained by analyzing the total busy time (Tbusy) and the variance of the free time (δidle) of each member. Busy members, whose Tbusy is high, have less time to browse the knowledge provided by recommender systems. In addition, if the free time is scattered over different periods, the member has less flexibility to take advantage of the recommended knowledge. A mapping table (Tbusy × δidle) is established to obtain the occupancy scale of the specified member. Then, the suitable volume for knowledge recommendation can be determined for each member.

Firstly, the two indices ($T_{busy}$ and $\delta_{idle}$) are calculated as follows:

$$T_{busy} = \sum_{i=1}^{N} B_i = \sum_{i=1}^{N} \left(BE_i - BS_i\right) \qquad (11)$$

$$\delta_{idle} = \frac{\sum_{j=1}^{N-1} \left(FI_j - \mathrm{Avg}(FI)\right)^2}{(N-1)-1} = \frac{\sum_{j=1}^{N-1} \left(FI_j - \mathrm{Avg}(FI)\right)^2}{N-2} \qquad (12)$$

$$\mathrm{Avg}(FI) = \frac{\sum_{j=1}^{N-1} FI_j}{N-1} \qquad (13)$$

where $B_i$ denotes the ith busy time, and $BS_i$ and $BE_i$ denote the starting and ending times of the ith busy period. $FI_j$ denotes the free time interval between the jth and (j+1)th busy periods. If there are N busy time periods, there are N−1 free time periods, so i runs from 1 to N, while j runs from 1 to N−1.
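The two indices can be computed directly from a list of busy periods. The sketch below assumes a hypothetical representation in which each busy period is a (start, end) pair of numbers, for example hours, and the periods do not overlap; the function name is illustrative only.

```python
def schedule_indices(busy_periods):
    """Return (T_busy, delta_idle) for a list of (start, end) busy periods,
    following formulas (11)-(13). Periods are assumed not to overlap."""
    periods = sorted(busy_periods)
    n = len(periods)
    if n < 3:
        # Formula (12) divides by N - 2, so at least three busy periods are needed.
        raise ValueError("at least three busy periods are required")

    t_busy = sum(end - start for start, end in periods)              # formula (11)
    # FI_j: gap between the end of period j and the start of period j + 1.
    free = [periods[j + 1][0] - periods[j][1] for j in range(n - 1)]
    avg_free = sum(free) / (n - 1)                                   # formula (13)
    delta_idle = sum((fi - avg_free) ** 2 for fi in free) / (n - 2)  # formula (12)
    return t_busy, delta_idle

# Hypothetical working day, in hours: busy 9-11, 12-13 and 15-18.
print(schedule_indices([(9, 11), (12, 13), (15, 18)]))  # (6, 0.5)
```

Here the busy time is 2 + 1 + 3 = 6 hours; the two free intervals are 1 and 2 hours, with mean 1.5 and sample variance 0.5.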

Assume that the total busy time and the variance of free time of all members follow normal distributions, that is, $T_{busy} \sim N(\mu_1, \sigma_1^2)$ and $\delta_{idle} \sim N(\mu_2, \sigma_2^2)$. In order to classify the members into categories based on their workload, the lower and upper bounds of each category should be derived. Based on total busy time, members can be categorized into k categories with equal probabilities, and the boundary $BounT_i$ can be determined by:

$$P\left(\frac{BounT_i - \mu_1}{\sigma_1} \le Z\right) = \frac{i}{k} \times 100\%, \quad i = 1, 2, \ldots, k-1. \qquad (14)$$

The target member can be assigned a corresponding category based on the total busy time $T_{busy}$ and the category boundaries $BounT_i$.

Fig. 5. Statistical analysis of occupancy scale from schedules in workflow.

Similarly, based on the variance of free time, the members can be categorized into h categories with equal probabilities, and the boundary $Boun\delta_j$ can be determined by:

$$P\left(\frac{Boun\delta_j - \mu_2}{\sigma_2} \le Z\right) = \frac{j}{h} \times 100\%, \quad j = 1, 2, \ldots, h-1. \qquad (15)$$

The target member can be assigned a corresponding category based on the variance of free time $\delta_{idle}$ and the category boundaries $Boun\delta_j$.

After determining the above boundaries, a mapping table for the member category (or occupancy scale) can be established (as shown in Fig. 5). Members in category C1,k are regarded as the busiest ones, so only the knowledge resources that have the highest correlation with them will be recommended. On the contrary, the members in category Ch,1 have more free time to browse the recommended knowledge. In all, based on statistical analysis of members' busy time and free time variance, their occupancy scales can be determined. According to a member's occupancy scale, recommender systems can intelligently provide an appropriate volume of knowledge to him (or her) so as to avoid information overload.
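One way to obtain equal-probability boundaries is through the inverse cumulative distribution function of the fitted normal distribution. The sketch below takes that reading of formulas (14) and (15), placing the ith boundary at the i/k quantile; this interpretation, the function names, and the example parameters are assumptions. It returns plain category indices and does not reproduce how Fig. 5 arranges them into the labels C1,k to Ch,1.

```python
from bisect import bisect_right
from statistics import NormalDist

def equal_probability_bounds(mu, sigma, n_categories):
    """Boundaries that split N(mu, sigma^2) into n_categories equally likely
    intervals; the i-th boundary sits at the i / n_categories quantile."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf(i / n_categories) for i in range(1, n_categories)]

def category_index(value, bounds):
    """1-based index of the interval that contains `value`."""
    return bisect_right(bounds, value) + 1

def occupancy_categories(t_busy, delta_idle, busy_bounds, idle_bounds):
    """Pair of (busy-time category, free-time-variance category)."""
    return category_index(t_busy, busy_bounds), category_index(delta_idle, idle_bounds)

# Hypothetical team statistics: weekly busy hours ~ N(40, 8^2), k = 4 categories;
# free-time variance ~ N(2.0, 0.5^2), h = 3 categories.
busy_bounds = equal_probability_bounds(40.0, 8.0, 4)
idle_bounds = equal_probability_bounds(2.0, 0.5, 3)
print(occupancy_categories(47.0, 1.5, busy_bounds, idle_bounds))
```

The resulting pair can then be looked up in the mapping table to obtain the member's occupancy scale.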

6.3. Mapping occupancy scale to threshold

Wedefine amapping table between occupancy scales and thresholdvalues. For example, the busiest occupancy scale (C1,k) maps thresholdvalue 0.95; the freest occupancy scale (Ch,1) maps threshold value 0.15.In thismapping table, the specific threshold values could be determinedby knowledge engineers or experts in advance.

In determining a suitable volume of recommended knowledge, the different experience levels of members (e.g., expert or novice) should also be taken into account. Some experts, although they are not busy, do not really need many knowledge documents recommended from the knowledge repository. Some novices may be busy, but they still need more related knowledge documents for browsing and learning. Therefore, it is not rational to use a uniform mapping table for both experts and novices; the mapping table should reflect the different experience levels of team members. More specifically, the threshold values in the mapping table should be adjusted according to members' experience levels, which could be measured by their working years. For experts, a higher factor (α) could be multiplied with the previous threshold, while for novices a lower factor would be used. For example, consider two members (m1, m2): one has 10 years of working experience (year(m1)=10), and the other has just 1 year (year(m2)=1). Their occupancy scales are both C3,2, which corresponds to a threshold value of 0.8. The final threshold values for the two members differ: the expert's may be 0.72 (0.8×α(m1), α(m1)=0.9), while the novice's may be 0.56 (0.8×α(m2), α(m2)=0.7). Different years of working experience, year(m), yield different factors α(m).

We could employ many possible functions to map year(m) to α(m). In this paper, it is defined as follows:

$$\alpha(m)=\frac{2}{\pi}\arctan\left(\frac{\mathrm{year}(m)}{hl}\right). \tag{16}$$

Here, year(m) is the difference in years between the current year and the year the member took up the task or role. It is assumed that a member with hl years of experience will have a middle factor (0.5). As the working period grows longer, the factor gradually approaches 1.0.
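The experience adjustment of Eq. (16) can be illustrated with a short sketch. The factor function follows the formula directly; the remaining pieces are assumptions made for illustration: the function names are hypothetical, and the threshold is assumed to be applied as a lower bound on a member-to-resource correlation score.

```python
import math


def experience_factor(years: float, hl: float) -> float:
    """Eq. (16): alpha(m) = (2 / pi) * arctan(year(m) / hl)."""
    return (2.0 / math.pi) * math.atan(years / hl)


def final_threshold(base_threshold: float, years: float, hl: float) -> float:
    """Scale the occupancy-scale threshold by the member's experience factor."""
    return base_threshold * experience_factor(years, hl)


def select_resources(correlations: dict[str, float], threshold: float) -> list[str]:
    """Keep resources whose correlation meets the threshold, strongest first.

    Assumption: the threshold acts as a cut-off on correlation scores.
    """
    kept = [name for name, score in correlations.items() if score >= threshold]
    return sorted(kept, key=lambda name: correlations[name], reverse=True)


if __name__ == "__main__":
    # A member whose experience equals hl gets the middle factor.
    print(round(experience_factor(5, 5), 2))
    print(round(experience_factor(2, 5), 2))
    print(select_resources({"doc_a": 0.9, "doc_b": 0.3, "doc_c": 0.5},
                           final_threshold(0.8, 5, 5)))
```

Because arctan saturates, the factor rises quickly for the first few years and then flattens, which matches the statement that it approaches 1.0 only gradually.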

7. Experimental evaluations

7.1. Experiment design

The experiments are set in the environment of a manufacturing enterprise. Three types of collaborative teams are involved in the experiments:

(1) R&D collaborative team: 30 engineers from the research and design department form a collaborative team, which mainly does research for new product development.

(2) Manufacturing control team: 30 engineers from the manufacturing department form a collaborative team, which mainly manages and controls the manufacturing process.

Fig. 6. The performances of different CF modes.

Fig. 7. The performances of workflow-based CF under different RSCs' settings.

(3) Routine office team: 30 clerks from the administration department form a collaborative team, which mainly does routine document handling work.

Our experiments used the data records from the above departments. The data sets record 90 participants' knowledge querying and browsing tracks from the knowledge repository over 80 days. Every participant queried and browsed many knowledge documents each day, and those knowledge documents belong to various domains or categories. For each day, a frequency statistic for each domain could be obtained. Thus, for every participant, we have 80 such domain frequency statistic tables, which are aggregated with weights: the more recent the date, the higher the weight. From this process we obtain one final ‘Member-Domain’ rating table, which forms the basis for further collaborative filtering (CF).
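The recency-weighted aggregation described above might look like the following sketch. The text does not state the weighting scheme, so the exponential decay, the `decay` parameter, the normalization by the weight sum, and the function name are all assumptions chosen only to show the idea that more recent days count more.

```python
def aggregate_ratings(daily_counts: list[dict[str, int]], decay: float = 0.95) -> dict[str, float]:
    """Combine per-day domain frequency tables into one weighted rating per domain.

    daily_counts[0] is the oldest day and daily_counts[-1] the most recent.
    Assumption: day weights decay exponentially with age and are normalized to sum to 1.
    """
    n = len(daily_counts)
    weights = [decay ** (n - 1 - day) for day in range(n)]  # most recent day has weight 1
    total = sum(weights)
    ratings: dict[str, float] = {}
    for weight, counts in zip(weights, daily_counts):
        for domain, freq in counts.items():
            ratings[domain] = ratings.get(domain, 0.0) + weight * freq / total
    return ratings


if __name__ == "__main__":
    # Two days of hypothetical browsing counts for one member.
    history = [{"welding": 2}, {"casting": 2}]
    print(aggregate_ratings(history, decay=0.5))
```

Running this once per participant yields one row of a ‘Member-Domain’ rating table; equal raw counts on a more recent day produce a higher rating than on an older day.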

Based on the above source data set, we could generate a new ‘Member-Domain’ rating table by using the CF based on workflow. For a given member, knowledge documents from the higher-rated domains could be recommended to him (or her) from the knowledge repository in the company. The suitable volume of the recommendation is determined by the member's time occupancy scale, which is obtained according to the member's schedule information in the workflow.

For the benchmark (or ground truth) ranking data, we asked each engineer in the team to rank and sort the items of knowledge categories in a predefined list, which is based on the company's industrial domains. This benchmark data reflects the members' interests and requirements for the categories of knowledge. By comparing the recommendation results with the benchmark data, precision values could be obtained for all participants. The average of these results is used as the final evaluation metric for the recommender system.
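The evaluation step can be sketched as follows. The text does not give the exact precision formula, so this example assumes a top-k overlap: the share of the top-k recommended categories that also appear in the member's top-k benchmark categories. The function names and the cut-off `k` are hypothetical.

```python
def precision_at_k(recommended: list[str], benchmark: list[str], k: int) -> float:
    """Fraction of the top-k recommended categories found in the top-k benchmark categories.

    Assumption: both lists are ordered from most to least relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(benchmark[:k])
    hits = sum(1 for category in recommended[:k] if category in relevant)
    return hits / k


def mean_precision(per_member: list[tuple[list[str], list[str]]], k: int) -> float:
    """Average precision_at_k over (recommended, benchmark) pairs, one pair per participant."""
    return sum(precision_at_k(rec, bench, k) for rec, bench in per_member) / len(per_member)


if __name__ == "__main__":
    pairs = [
        (["a", "b", "c"], ["b", "a", "d"]),  # both top-2 recommendations are relevant
        (["a", "c"], ["b", "a"]),            # one of the top-2 recommendations is relevant
    ]
    print(mean_precision(pairs, k=2))
```

Averaging per-member precision, rather than pooling all hits, gives every participant equal weight in the final metric.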

In this paper, our experiments mainly address the following six issues:

(1) We analyze the recommendation precisions with T changing. The parameter T denotes the length of the interval involved in the calculation.

(2) We make the comparison between the workflow-based CF and ordinary CF (without consideration of workflow).

(3) We study the precisions under different RSC coefficient parameter settings. The RSC coefficients denote the weights for different relationships among tasks in the workflow.

(4) We compare the recommender system's performances among three different collaborative teams: R&D, manufacturing control, and routine office teams.

(5) We take into account the differences in members' experience (expert or novice) when using occupancy scale to determine suitable recommendation volume. We test the method's effects on final performance.

(6) We study the precisions with hl changing. Parameter hl is key to determining the impact degree from experts to novices.

7.2. Results and analysis

7.2.1. Experiment 1: Different CF modes, with T changing (as to the above issues (1) and (2))

In the first experiment, we compare the performances of different CF modes: (1) CF based on workflow with consideration of relationships among tasks (Sections 5.2 and 5.3), (2) CF based on the same task in workflow (Section 5.1), and (3) ordinary CF.

Fig. 6 shows the precisions of the above three CF modes with T changing. Parameter T denotes the length of the interval used in the calculation. Because the length of the source data sets is 80, T could be an integer from 1 to 80. If T is too small, the results vary widely and are not meaningful for supporting the proposed methods, so T ranges from 20 to 80 in this experiment. From Fig. 6, we could see that the precisions increase as T grows and hold at a steady value when T exceeds 60.

The CF based on workflow, proposed in this paper, has two variants: one is based only on the same task (introduced in Section 5.1), while the other takes into account the relationships among tasks (introduced in Sections 5.2 and 5.3). Fig. 6 shows three curves: one for the variant considering only the same tasks (denoted by ‘in Same Task’ in Fig. 6); one for the variant considering relationships among tasks (denoted by ‘with RSC’ in Fig. 6); and one for the ordinary CF. From this experiment, we could see that the proposed CF based on workflow outperforms the ordinary CF, especially the variant considering relationships among tasks. These results suggest that the collaborative environment could supply more information about the implicit relationships among members so as to improve the CF efficiency.

7.2.2. Experiment 2: Different RSCs settings, with T changing (as to the above issue (3))

Considering the relationships among tasks will improve the CF's performance in collaborative environments. The assignment of the RSC values is essential in the above steps. The RSC (Relationship Similarity Coefficient) parameters denote the weights for different relationships among tasks in the workflow. We perform experiments to find out which sets of RSCs are proper. Fig. 7 shows three groups of experiments. The assignments of the RSC values are listed as follows.

RSC 1: RSC(seq)=0.9, RSC(par)=0.5, RSC(sub)=0.7;
RSC 2: RSC(seq)=0.7, RSC(par)=0.5, RSC(sub)=0.9;
RSC 3: RSC(seq)=0.7, RSC(par)=0.9, RSC(sub)=0.5.

From the results shown in Fig. 7, we could see that the three curves are close to each other, which means that changes in the assignment of RSC do not have a significant influence on the final results. However, the curve of RSC 1 is better than those of RSC 2 and RSC 3. The results suggest that RSC(seq) should be set higher, while RSC(par) should be set lower. This means the ‘sequential’ relationship has a higher influence on the corresponding members than the ‘subtype’ and ‘part of’ relationships.
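For readers who want to experiment with the three settings, they can be written down as plain configuration. How the coefficients enter the similarity computation is defined in Section 5 and is not reproduced here, so the multiplicative scaling below, the `"same"` relation label, and the function name are assumptions for illustration only; the keys `seq`, `par` and `sub` are read as the sequential, part-of and subtype relationships.

```python
# The three coefficient assignments compared in Experiment 2.
RSC_SETTINGS = {
    "RSC 1": {"seq": 0.9, "par": 0.5, "sub": 0.7},
    "RSC 2": {"seq": 0.7, "par": 0.5, "sub": 0.9},
    "RSC 3": {"seq": 0.7, "par": 0.9, "sub": 0.5},
}


def weighted_similarity(base_similarity: float, relation: str, rsc: dict[str, float]) -> float:
    """Scale a member-to-member similarity by the coefficient of the relation linking their tasks.

    Assumption: members working on the same task keep full weight, and members on
    related tasks are discounted by the matching coefficient.
    """
    if relation == "same":
        return base_similarity
    return base_similarity * rsc[relation]


if __name__ == "__main__":
    for name, coefficients in RSC_SETTINGS.items():
        print(name, weighted_similarity(0.8, "seq", coefficients))
```

Keeping the settings in a dictionary makes it straightforward to loop over them when repeating the comparison on another team's data.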

Fig. 9. The performances with considering members' differences (expert–novice).

7.2.3. Experiment 3: Different collaborative teams, with T changing (as to the above issue (4))

In order to examine the proposed methods' performances under different collaborative environments, we conduct experiments for three types of collaborative teams: (1) R&D collaborative team: 30 engineers from the research and design department form a collaborative team, which mainly does research for new product development; (2) Manufacturing control team: 30 engineers from the manufacturing department form a collaborative team, which mainly manages and controls the manufacturing process; and (3) Routine office team: 30 clerks from the administration department form a collaborative team, which mainly does routine document handling work.

Fig. 8 shows the results under the above three different collaborative environments. We could see that the workflow-based CF has evident effects on the R&D team and the manufacturing control team, while on the routine office team the workflow-based method does not seem effective. The reason may be that the collaborative characteristics of the routine office team are not as distinct as those of the two other teams. The R&D team deals with high-level, knowledge-intensive work, so it contains much stronger collaborative relationships among members. Therefore, the effect of workflow-based CF is much more evident than for the two other teams. This experiment shows that the proposed workflow-based CF may be much more suitable for teams or organizations with more potential and implicit collaborations among members.

7.2.4. Experiment 4: Considering difference between expert and novice (as to the above issues (5) and (6))

The last experiment takes into account the differences in members' experience (expert or novice). We perform the experiment by using occupancy scale to determine suitable recommendation volume, and study hl's impact on precision. Here, hl, abbreviated from ‘half-life span’, is a parameter that determines the impact degree from experts to novices: a member with hl years of experience will have half the influence of an expert, who may have the highest influence value.

Fig. 9 shows recommendation precisions under two cases: (a) considering expert–novice differences when calculating the occupancy scale so as to determine suitable recommendation volume; (b) ignoring the expert–novice differences. From Fig. 9, we could find that the former performs better than the latter except at some peak points. When determining the suitable volume of recommendation, different strategies can be adopted to improve the recommender's performance under different environments.

As to the trend of precision with hl changing, we could see that the precision increases when hl is below five, and decreases when hl exceeds five. Thus the ‘half-life span’ hl should be set to five for that collaborative team. Here, ‘hl=5’ means that a member (e.g., ME) with 5 years of working experience will have a middle factor (0.5) according to formula (16) in Section 6.3, and a member (e.g., MN) with only 2 years of working experience has a factor of 0.24. If the two members (ME and MN) have the same busy level (occupancy scale value) that maps to a threshold value (e.g., 0.8) in recommendation volume control, the final threshold value for ME is actually 0.4 (0.5×0.8) and for MN is about 0.2 (0.24×0.8). That is the impact of working experience on the recommendation volume control for different members.

Fig. 8. The performances of the workflow-based CF in different collaborative teams.

8. Summary

This paper introduces a workflow-based recommender system model for collaborative team environments. Two workflow-centric approaches for mining team members' knowledge demands and determining proper recommendation volume are proposed. This study paves the way for implementing a platform which would ensure that a proper volume of proper knowledge resources could be recommended to the proper members among the collaborative team.

However, the current model and methods have some limitations, which need further study in the future:

(1) The proposed methods consider the expert–novice influence on members' demands for knowledge. However, the current study has not addressed how ‘expert’ a new member is considered to be. In Section 4.3, the manner in which hl is used has not taken into account how much expertise an individual brings into the collaborative team. This is actually a new-user cold-start problem. In future studies, agency theory could be applied to how knowledge workers address their knowledge needs. In this way, it may improve on or replace the current methods of using hl.

(2) In the experiments, we are not able to use all possible combinations of parameters. Currently, those settings are determined according to experience. For different collaborative teams, the ‘optimal’ settings actually differ from each other. There is no universal setting that could adapt to all contexts with the best performance. A sensitivity analysis of those parameter settings for different application environments should be conducted in future studies.

References

[1] G. Adomavicius, R. Sankaranarayanan, S. Sen, A. Tuzhilin, Incorporating contextualinformation in recommender systems using a multidimensional approach, ACMTransactions on Information Systems 23 (2005) 103–145.

[2] G. Adomavicius, A. Tuzhilin, Toward the next generation of recommendersystems: a survey of the state of the art and possible extensions, IEEE Transactionson Knowledge & Data Engineering 17 (2005) 734–749.

[3] K.D. Bollacker, S. Lawrence, C.L. Giles, Discovering relevant scientific literature onthe Web, IEEE Intelligent Systems 15 (2) (2000) 42–47.

[4] R. Garfinkel, R. Gopal, A. Tripathi, F. Yin, Design of a shopbot and recommendersystem for bundle purchases, Decision Support Systems 42 (3) (2006) 1974–1986.

[5] R. Garfinkel, R. Gopal, B. Pathak, F. Yin, Shopbot 2.0: integrating recommendationsand promotionswith comparison shopping, Decision Support Systems 46 (1) (2008)61–69.

[6] D. Godoy, A. Amandi, Modeling user interests by conceptual clustering, InformationSystems 31 (4–5) (2006) 247–265.

Page 9: Recommender system based on workflow

245L. Zhen et al. / Decision Support Systems 48 (2009) 237–245

[7] D. Goldberg, D. Nichols, B.M. Oki, D. Terry, Using collaborative filtering to weave aninformation tapestry, Communications of the ACM 35 (1992) 12–13.

[8] P. Han, B. Xie, F. Yang, R. Shen, A scalable P2P recommender system based ondistributed collaborative filtering, Expert Systems with Applications 27 (2004)203–210.

[9] Y.C. Jiang, J. Shang, and Y.Z. Liu, Maximizing customer satisfaction through anonline recommendation system: A novel associative classification model, DecisionSupport Systems, (2009) In Press, doi:10.1016/j.dss.2009.06.006.

[10] J.J. Jung, Knowledge distribution via shared context between blog-basedknowledge management systems: a case study of collaborative tagging, ExpertSystems with Applications 36 (7) (2009) 10627–10633.

[11] M. Kagie, M. van Wezel, P.J.F. Groenen, A graphical shopping interface based onproduct attributes, Decision Support Systems 46 (1) (2008) 265–276.

[12] J.K. Kim, H.K. Kim, Y.H. Cho, A user-oriented contents recommendation system unpeer-to-peer architecture, Expert Systemswith Applications 34 (1) (2008) 300–312.

[13] J. Konstan, B. Miller, D. Maltz, J. Herlocker, L. Gordon, J. Riedl, GroupLens: applyingcollaborative filtering to usenet news, Communications of the ACM 40 (3) (1997)77–87.

[14] Y. Li, N. Zhong, Web mining model and its applications for information gathering,Knowledge-Based Systems 17 (2004) 207–217.

[15] Y. Li, N. Zhong, Mining ontology for automatically acquiring web user informationneeds, IEEE Transactions on Knowledge and Data Engineering 18 (2006) 554–568.

[16] T.P. Liang, Y.F. Yang, D.N. Chen, Y.C. Ku, A semantic-expansion approach topersonalized knowledge recommendation, Decision Support Systems 45 (3) (2008)401–412.

[17] D.R. Liu, I.C. Wu, Collaborative relevance assessment for task-based knowledgesupport, Decision Support Systems 44 (2) (2008) 524–543.

[18] J. Malinowski, T. Weitzel, T. Keim, Decision support for team staffing: an automatedrelational recommendation approach, Decision Support Systems 45 (3) (2008)429–447.

[19] S.M. McNee, I. Albert, D. Cosley, P. Gopalkrishnan, S.K. Lam, A.M. Rashid, On therecommending of citations for research papers, Proceedings of CSCW 2002,New Orleans, USA, 2002, pp. 116–125.

[20] S. E. Middleton, Capturing Knowledge of User Preferences with RecommenderSystems, Doctoral thesis, Univ. Southampton, (2003).

[21] S.E. Middleton, N.R. Shadbolt, D.C. De Roure, Ontological user profiling inrecommender systems, ACM Transactions on Information Systems 22 (1) (2004)54–88.

[22] B. Mobasher, R. Cooley, J. Srivastava, Automatic personalization based on webusage mining, Communications of the ACM 43 (8) (2000) 142–151.

[23] R.J. Mooney, L. Roy, Content-based book recommending using learning for textcategorization, In Proceedings of the ACM international conference on digitallibraries, San Antonia, Texas, USA, 2000 195–204.

[24] T. Olsson, Bootstrapping and decentralizing recommender system. Ph.D. Thesis,Dept. of Information Technology, Uppsala Univ., 2003.

[25] J. Rucker, M.J. Polanco, Siteseer: personalized navigation for the web, Commu-nications of the ACM 40 (3) (1997) 73–75.

[26] J.B. Schafer, J. Konstan, J. Riedl, Recommender systems in e-commerce, Proceedings of1stACMconference onelectronic commerce, Denver, Colorado, USA, 1999, pp. 158–166.

[27] A. Smirnov, M. Pashkin, N. Chilov, T. Levashova, Knowledge logistics in informationgrid environment, Future Generation Computer Systems 20 (2004) 61–79.

[28] Y.W. Wang, W.H. Dai, Y.F. Yuan, Website browsing aid: a navigation graph-basedrecommendation system, Decision Support Systems 45 (3) (2008) 387–400.

[29] B.C. Yeong, H.C. Yoon, H.K. Soung, Mining changes in customer buying behavior forcollaborative recommendations, Expert Systems with Applications 28 (2005)359–369.

[30] S.K. Yong, B.J. Yum, J. Song, M.K. Su, Development of a recommender system basedon navigational and behavioral patterns of customers in e-commerce sites, ExpertSystems with Applications 28 (2005) 381–393.

[31] L. Yu, L. Liu, X. Li, A hybrid collaborative filteringmethod formultiple-interests andmultiple-content recommendation in e-commerce, Expert Systems with Applica-tions 28 (2005) 67–77.

[32] L. Zhen, Z. Jiang, Knowledge grid based knowledge supply model, IEICETransactions on Information and Systems E91-D(4 (2008) 1082–1090.

[33] L. Zhen, Z. Jiang, Innovation-oriented knowledge query in knowledge grid, Journalof Information Science and Engineering 24 (2) (2008) 601–613.

[34] L. Zhen, Z. Jiang, H. Song, C. Liu, J. Liang, Information supply: an approach based ondemand modeling and information filtering, Proc.IMechE Part B: Journal ofEngineering Manufacture 222 (4) (2008) 541–557.

[35] L. Zhen, G.Q. Huang, Z. Jiang, Collaborative filtering based on workflow space,Expert Systems with Applications 36 (4) (2009) 7873–7881.

[36] L. Zhen, G.Q. Huang, and Z. Jiang, An inner-enterprise knowledge recommendersystem, Expert Systems with Applications, (2009) In Press, doi:10.1016/j.eswa.2009.06.057.

[37] H. Zhuge, Knowledge flow management for distributed team software develop-ment, Knowledge Based Systems 15 (8) (2002) 465–471.

[38] H. Zhuge, A knowledge flow model for peer-to-peer team knowledge sharing andmanagement, Expert systems with applications 23 (1) (2002) 23–30.

Lu Zhen received the B.E. and Ph.D. degrees in industrial engineering from Shanghai Jiao Tong University (P.R. China). He is currently a postdoctoral research fellow at the department of industrial and system engineering, National University of Singapore. His current research interests include knowledge management, information systems, and operations research. He has published 8 papers (as first author) in refereed international journals; another 8 papers (as first author) have been published in international conference proceedings and domestic journals.

George Q. Huang is a Professor at the department of industrial and manufacturing system engineering, the University of Hong Kong. He received the Ph.D. degree in mechanical engineering from Cardiff University, Cardiff, U.K. His main research interests include collaborative product development, mass customization, and supply chain management. He has published extensively on these topics, including over 200 technical papers, half of which have appeared in refereed journals, two monographs and an edited reference book. Prof. Huang received the Outstanding Young Researcher Award from The University of Hong Kong (2001) and the Overseas Outstanding Young Scholar award from the Natural Science Foundation of China (2007).

Zuhua Jiang is a Professor at the department of industrial engineering and management, Shanghai Jiao Tong University, P.R. China. He received the Ph.D. degree in mechanical engineering from Shanghai Jiao Tong University, P.R. China. His current research interests include knowledge management, cooperative design, and concurrent engineering. He has published 40 papers in refereed international conferences and journals.