ontology-based context synchronization for ad hoc social collaborations



Knowledge-Based Systems 21 (2008) 573–580

Contents lists available at ScienceDirect

Knowledge-Based Systems

journal homepage: www.elsevier.com/locate/knosys

Ontology-based context synchronization for ad hoc social collaborations

Jason J. Jung *

Department of Computer Engineering, Yeungnam University, Dae-Dong, Gyeongsan 712-749, Republic of Korea

Article info

Article history:
Received 25 January 2007
Received in revised form 20 March 2008
Accepted 21 March 2008
Available online 29 March 2008

Keywords:
Semantic context
Social collaboration
Synchronization

0950-7051/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.knosys.2008.03.015

* Tel.: +82 53 810 3534; fax: +82 53 810 4630.
E-mail address: [email protected]

1 Friend of a friend (FOAF). http://www.foaf-project.org/.

Abstract

To efficiently support collaborations between people (agents) in real time, we propose an ontology-based platform for acquainting the most relevant users (e.g., colleagues and classmates) according to their context. To this end, we model two kinds of contexts with semantic information derived from ontologies: (i) personal context, and (ii) consensual context, integrated from several personal contexts. More importantly, we formulate measurement criteria to compare them. Consequently, groups can be dynamically organized with respect to the similarities among several aspects of personal context. In particular, users can engage in complex collaborations related to multiple semantics. For experimentation, we implemented a social browsing system based on context synchronization.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Knowledge sharing among people (e.g., co-workers) is an important procedure for efficient online collaborations. A social network is built by aggregating relationships between people. A variety of social co-occurrence patterns can be applied to establish social networks, which are composed of a set of people. Here, the essential assumption is that, for collaborations within the same network, the contexts of its members must be closer to each other than to those of others. When two people are connected, there exists the possibility of cooperating on their tasks and sharing their knowledge. In this sense, such a network is referred to as a collaborative network. Such networks can be established both by explicit assertions (e.g., co-authoring, affiliation, and RDF schema such as FOAF¹) and by implicit information derived from user activities (e.g., preferences and opinions) [1]. Moreover, recent communication systems have been applied to efficiently exchange useful information with remote people in an online environment. Such systems are simply based on message passing facilities (e.g., instant messaging and e-mail client software) among neighbors who are explicitly linked.

In fact, computer-supported collaborative work (CSCW) communities have been investigating groupware systems [2]. Several collaborative systems consider the "context" of participants [3,4]. Examples are Doc2U [5] in document authoring and CoLLeGE [6] in e-learning domains.

In this study, we focus on multi-agent model-based collaborative browsing systems in a peer-to-peer network, which collaboratively search for relevant information (or resources) on the web [7,8]. While people are browsing the web to search for relevant information, their corresponding personal agents are aware of the contexts, i.e., which information people are looking for, and exchange useful information on behalf of their neighbors' contexts.

However, there are some constraints on engaging in contextual collaboration between people in the same collaborative network (CN). Tasks grow more complicated, i.e., dependent upon multiple contexts related to various combined domain knowledge. This requires people to be involved in more than one CN to find more relevant users. A multi-disciplinary community of practice (CoP) [9] is a similar example, in the context of collaborating over an extended period to share ideas, find solutions, and build innovations.

More seriously, neighbors' contexts could change over time [10]. This implies that people must be informed of the contextual transitions of their neighbors.

We focus on the context of a user task that is decomposable into several sub-contexts. Hence, in this paper, we consider the following problems of contextual interactions:

• Local search with a limited number of people. Interactions between co-workers within a single CN are too limited to efficiently accomplish complicated tasks, i.e., tasks intermixed with multiple contexts. This implies that their multiple contexts can be separated into a set of sub-contexts, and they must be aware of their neighbors' neighbors through merging of the CNs related to each single sub-context [11]. Consequently, the most relevant people, i.e., those whose context is closest, can be reached. For example, as shown in Fig. 1, suppose that user C in network CN1 is working on a "Semantic sensor network," and the rest of the neighbors in CN1 are in different contexts. CN1 must therefore be merged with the other two networks, CN2 and CN3. Then, he can cooperate with C′ on "Ontologies" and C″ on "Ubiquitous computing," respectively.

[Figure 1: three collaborative networks, CN1 ("Semantic Sensor Network"), CN2 ("Ontologies"), and CN3 ("Ubiquitous Computing"), with users A, B, C, A′, B′, C′, and C″.]

Fig. 1. Merging three collaborative networks.

2 We assume that the standard ontologies are ontologies that include common objects and are applicable across a wide range of domains. Examples of such ontologies are Cyc (http://www.opencyc.org/) and SUMO (http://www.ontologyportal.org/).

• Temporal transitions. While people are conducting a multi-contextual task, the up-to-date context of their working state can dynamically change from one sub-context to another. Furthermore, people can conduct more than one multi-contextual task at once. For contextual interactions between people, when transitions occur, collaborations between the corresponding people must be re-organized. For example, if C′ in CN2 is working on "Ontology mapping" and, at a certain moment, his context changes from "Ontologies" to "Graph theory," then C in CN1 must search for alternative people whose context is related to "Semantic Web."

In order to deal with these problems, in this paper, we investigate a means of representing: (i) a user context by using his personal ontologies, and (ii) a group context by integrating a set of user contexts involved in the same group. Efficient coordination has been supported to enable collaborators' contexts to be shared [12] and integrated with the so-called organization context [13]. In particular, user contexts can be integrated by considering social features of the corresponding user in a CN (e.g., centrality measures). More importantly, people's context transitions are detected for synchronized collaborations. At the moment a context transition is recognized, the corresponding user is automatically shifted into another CN, in which people are working in the most relevant context. Thereby, we propose a social mediation-based approach to support ad hoc collaborations between people in different networks by context mapping methods. Another important issue is merging with open CNs [14,15]. We anticipate that this openness problem can be dealt with by "consensus" ontology alignment methods, because the group context may be semantically biased.

The outline of this paper is as follows. In Section 2, we explain the definitions for ontology-based context representation, and context mapping algorithms for building group context. Section 3 addresses a contextual synchronization scheme to detect significant differences from the rest of the group members. In Sections 3 and 4, we show an example of collaborative browsing, and experimental results, respectively. Section 5 discusses the proposed approach with respect to the existing contextual collaboration systems. Finally, in Section 6, we draw the conclusion and mention future work.

2. Ontology-based context representation

In this paper, we define a novel means of representing a user context as semantics derived from ontologies. Thus, a software agent can automatically recognize and understand other users' contexts by communicating with their agents.

Definition 1 (Ontology). An ontology $\mathcal{O}$ is represented as

$$\mathcal{O} := (\mathcal{C}, \mathcal{R}, \mathcal{E}_R, \mathcal{I}_C) \tag{1}$$

where $\mathcal{C}$ and $\mathcal{R}$ are a set of classes (or concepts) and a set of relations (e.g., equivalence, subsumption, and disjunction), respectively. $\mathcal{E}_R \subseteq \mathcal{C} \times \mathcal{C}$ is a set of relationships between classes, represented as a set of triples $\{\langle c_i, r, c_j \rangle \mid c_i, c_j \in \mathcal{C},\ r \in \mathcal{R}\}$, and $\mathcal{I}_C$ is a power set of instance sets of a class $c_i \in \mathcal{C}$.

An important characteristic is that the ontology can import other ontologies and a set of partitioned ontology fragments. For example, some standard (upper-level) ontologies² can be employed with OWL vocabularies like owl:imports.

People annotate resources in their information repositories by using the ontologies, which provide semantics to users.

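Definition 2 (Personal ontology). A personal ontology is extended with subjective descriptions about contents in personal information repositories. For a given set of annotations $\mathcal{A}_k$ from resources $RES_k$ of user $u_k$, his personal ontology $\mathcal{O}^P_k$ is represented as

$$\mathcal{O}^P_k = (\mathcal{C}, \mathcal{R}, \mathcal{E}_R, \mathcal{I}_C, \mathcal{A}_k) \tag{2}$$

where $\mathcal{A}_k = \{\langle c_i, r, res_j \rangle, \langle ins_a, r, res_j \rangle \mid c_i \in \mathcal{C},\ ins_a \in \mathcal{I}_C,\ r \in \mathcal{R},\ res_j \in RES_k\}$.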
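Definitions 1 and 2 can be pictured as plain data structures. The following is only a minimal sketch under stated assumptions: the class names `Ontology` and `PersonalOntology`, the field names, and the `annotate` helper are hypothetical illustrations and are not part of the paper or of any ontology library.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Ontology:
    """O := (C, R, E_R, I_C), following Definition 1 (field names are assumptions)."""
    classes: Set[str]                       # C: classes (concepts)
    relations: Set[str]                     # R: relation names
    triples: Set[Triple]                    # E_R: (c_i, r, c_j) relationships
    instances: Dict[str, Set[str]] = field(default_factory=dict)  # I_C: class -> instances


@dataclass
class PersonalOntology(Ontology):
    """O^P_k: an ontology extended with the annotations A_k of one user (Definition 2)."""
    annotations: Set[Triple] = field(default_factory=set)  # A_k: (c_i or ins_a, r, res_j)

    def annotate(self, subject: str, relation: str, resource: str) -> None:
        """Add an annotation whose subject is a known class or instance."""
        is_class = subject in self.classes
        is_instance = any(subject in members for members in self.instances.values())
        if not (is_class or is_instance):
            raise ValueError(f"unknown class or instance: {subject}")
        if relation not in self.relations:
            raise ValueError(f"unknown relation: {relation}")
        self.annotations.add((subject, relation, resource))
```

In this sketch the annotation set grows as the user annotates resources, which mirrors the incremental construction of personal ontologies described in the text.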

As people research their resources, their own personal ontologies can be incrementally built by creating new classes, appending ontology fragments, and defining additional correspondences between classes. In terms of description logics, the set of annotations $\mathcal{A}_k$ plays the role of a contextual A-Box reflecting a user's cognitive responses.

In this paper, the context of a user is represented with his own personal ontology while he is researching (searching for) a certain resource. Because the semantic metaphor of the resource might be ambiguous, the problem is how to select the most relevant context among all possible candidates and to maximize understandability for neighbors' agents. Hence, we must conceptualize the corresponding user's viewpoint about a certain resource by contextualizing his personal ontology. In [16,17], this contextualization process of ontologies is called interpretation of the personal ontology to resources. Thus, we define the context as a set of concepts obtained by a mapping function between the personal ontology and working resources, at a certain moment.

Definition 3 (Context). A context $ctx^{(t)}_k$ of user $u_k$ obtained by a mapping function $\mathcal{M}$ at time $t$ is given by

$$ctx^{(t)}_k = \{c_i \mid c_i \in \mathcal{M}(\mathcal{O}^P_k, res^{(t)})\} \tag{3}$$

where $res^{(t)}$ is the resource being worked on at the moment.

As shown in Fig. 2, between two mapping functions $\mathcal{M}$ and $\mathcal{M}'$ that interpret the semantics of the working resource $res^{(t)}$ into different contexts, we must choose the most suitable one by their respective comparison functions $\mathcal{S}$ and $\mathcal{S}'$ of the annotations $\mathcal{A}_k$. Here, a comparison function $\mathcal{S}$ must be formulated according to the types of resources (e.g., text documents and images). This function must include content-based analysis (e.g., TF-IDF), so that a set of features can be extracted. We omit a detailed explanation of the feature extraction methods.

Definition 4 (Mapping). A mapping function $\mathcal{M}$ is given by

$$\mathcal{M}(\mathcal{O}^P_k, res^{(t)}) = \arg_{\{c_i \mid \langle c_i, r, a \rangle \in \mathcal{O}^P_k\}} \max \frac{\sum_{f \in \mathcal{S}(res^{(t)}),\, a \in \mathcal{A}} \mathrm{Sim}_L(f, a)}{|\mathcal{S}(res^{(t)})|} \tag{4}$$

where $\mathrm{Sim}_L = 1 - \frac{\mathrm{Distance}(f, a)}{\max(|f|, |a|)}$ measures the similarity between two given terms $f$ and $a$ based on string matching algorithms (e.g., Edit, Levenshtein, and Substring distances) (it will be explained in detail in Section 3.1). This function returns a set of concepts $c_i$ applied to annotation $a$.

Fig. 2. Context representation for semantic mapping.

Thus, the context of a user is discovered from the resources he accesses, because the mapping function can discover the best alignment between the personal ontology and the derived features, i.e., the one maximizing the summation of similarities. In this case, only their labels are considered to measure these similarities.

For example, suppose that a user is accessing a web page to seek information, as shown in Fig. 3. The web page can be regarded as a resource on the web.

Example 1. For a given web page, the comparison function $\mathcal{S}$ should be based on hypertext analysis (e.g., a weighted TF-IDF scheme [18]), so that we can find a set of features {"agents", "information", "java", "intelligent"}. The best mapping, which maximizes the summation of similarities between them, is as follows:

$$\mathrm{Sim}_L(\text{"agents"}, \text{"Software agent"}) = 1 - \frac{10}{14} = 0.286 \tag{5}$$

$$\mathrm{Sim}_L(\text{"information"}, \text{"Artificial Intelligence"}) = 1 - \frac{18}{23} = 0.217 \tag{6}$$

$$\mathrm{Sim}_L(\text{"java"}, \text{"Java"}) = 1 - \frac{0}{4} = 1.0 \tag{7}$$

$$\mathrm{Sim}_L(\text{"intelligent"}, \text{"Artificial Intelligence"}) = 1 - \frac{13}{23} = 0.435 \tag{8}$$

Thus, his context is represented as {Artificial_Intelligence, Software_Agent, Java}, where $\mathcal{M} = 1.938$.
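The arithmetic of Example 1 can be checked with a short script. This is a sketch under two assumptions that the paper does not state explicitly: that Distance is the unit-cost Levenshtein distance, and that labels are compared case-insensitively (suggested by the zero distance between "java" and "Java"). The function names `levenshtein` and `sim_l` are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def sim_l(feature: str, label: str) -> float:
    """Sim_L = 1 - Distance(f, a) / max(|f|, |a|), compared case-insensitively (assumption)."""
    feature, label = feature.lower(), label.lower()
    longest = max(len(feature), len(label))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(feature, label) / longest


# Feature/label pairs chosen in Example 1.
pairs = [
    ("agents", "Software agent"),
    ("information", "Artificial Intelligence"),
    ("java", "Java"),
    ("intelligent", "Artificial Intelligence"),
]
scores = [sim_l(feature, label) for feature, label in pairs]
for (feature, label), score in zip(pairs, scores):
    print(f"{feature!r} vs {label!r}: {score:.3f}")
print(f"sum of similarities: {sum(scores):.3f}")
```

Under these assumptions the four similarities round to 0.286, 0.217, 1.0, and 0.435, and their sum rounds to 1.938, matching Eqs. (5)-(8).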

In order to improve the precision of context representation at time $t$, we note two important properties of the context represented by the proposed method: (i) continuity, and (ii) decomposability.

Property 1 (Continuity). A context can be sustained during a certain time interval $T$. The resources accessed during that interval are semantically similar to each other. Over time, the context similarity decreases. If $t_a$ is closer to $t$ than $t_b$, this can be represented as

$$\mathrm{Sim}_{CTX}\left(ctx^{(t)}_k, ctx^{(t_a)}_k\right) > \mathrm{Sim}_{CTX}\left(ctx^{(t)}_k, ctx^{(t_b)}_k\right) \tag{9}$$

where $\mathrm{Sim}_{CTX}$ is a similarity measurement between two contexts (it will be explained in detail in Section 3.1). We assume that there is more than one contextual boundary between $t_a$ and $t_b$.

[Figure 3: a resource (web page) with the features "agents", "information", "java", and "intelligent", mapped to a personal ontology containing the concepts Artificial_Intelligence, Software_Agent, Computer_Science, Programming, and Java.]

Fig. 3. Example for context representation.

Thus, by discovering contextual boundaries, we can handle both a resource at a certain moment and an ordered sequence of resources $res^{(t)}, \ldots, res^{(t-T+1)}$ during a time interval $T$, which are regarded as more semantically interrelated with each other than with either $res^{(t+1)}$ or $res^{(t-T)}$. Consequently, if features cannot be derived from a given resource $res^{(t)}$, the features of other resources within the time interval $T$ (e.g., $res^{(t-1)}$ and $res^{(t-2)}$) can be used in their place.
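The fallback implied by the continuity property can be sketched as follows. This is an illustration only: the helper name `features_with_fallback`, the `extract` callback, and the choice to prefer the most recent resource that yields features are assumptions, not details given in the paper.

```python
from typing import Callable, Sequence, Set


def features_with_fallback(
    resources: Sequence[str],
    extract: Callable[[str], Set[str]],
    window: int,
) -> Set[str]:
    """Return features of res(t), or of the nearest earlier resource within the window T.

    `resources` is ordered from oldest to newest, so the last item is res(t).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = resources[-window:]           # res(t-T+1), ..., res(t)
    for resource in reversed(recent):      # try res(t) first, then res(t-1), ...
        features = extract(resource)
        if features:
            return features
    return set()                           # nothing usable inside the window
```

For instance, with a whitespace tokenizer as `extract`, an empty page at time t falls back to the features of the page visited just before it, provided that page lies within the window.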

Property 2 (Decomposability). A context can be decomposed into several sub-contexts:

$$ctx^{(t)}_{u_i} \equiv \bigsqcup_{k=1}^{K} ctx^{k\bot}_{u_i} \tag{10}$$

$$= ctx^{1\bot}_{u_i} \sqcup ctx^{2\bot}_{u_i} \sqcup \cdots \sqcup ctx^{K\bot}_{u_i} \tag{11}$$

where $ctx^{k\bot}_{u_i} \sqsubseteq ctx^{(t)}_{u_i}$ for every sub-context $ctx^{k\bot}_{u_i}$. Unlike the general set partitioning problem, the intersections between sub-contexts are not necessarily empty, i.e., $ctx^{a\bot}_{u_i} \sqcap ctx^{b\bot}_{u_i} \neq \emptyset$.

This property is useful when people are researching a problem with multiple contexts, like the "semantic sensor network" shown in Fig. 1. In order to improve the performance of these kinds of tasks, the work can be separated into several sub-problems, each with its corresponding sub-context.

3. Contextual synchronization

In this study, we assume that people in the same CN must be researching semantically similar tasks. A centralized agent (more practically, super-peer agents) must be able to coordinate people by: (i) discovering consensual contexts from their activities, (ii) detecting contextual transitions, and (iii) re-organizing CNs. We refer to this procedure of the centralized agent as contextual synchronization.

3.1. Mapping contexts to build a consensual context

As the first step, we must build a consensual context by merging all of the contexts in a CN. A similarity-based alignment scheme is employed to discover the best formation by collecting similarities measured between two contexts. Euzenat and Valtchev [19] define all possible similarities (e.g., $\mathrm{Sim}_C$, $\mathrm{Sim}_R$, $\mathrm{Sim}_A$, and $\mathrm{Sim}_I$) between concepts, relations, attributes, and instances. Given a pair of concepts from two different contexts, the similarity measure $\mathrm{Sim}_C$ takes a value in [0,1]. The similarity ($\mathrm{Sim}_C$) between $c$ in $ctx$ and $c'$ in $ctx'$ is defined as

$$\mathrm{Sim}_C(c, c') = \sum_{E \in \mathcal{N}(C)} \pi^C_E\, \mathrm{MSim}_Y(E(c), E(c')) \tag{12}$$

where $\mathcal{N}(C) \subseteq \{E^1 \ldots E^n\}$ is the set of all relationships in which concepts participate (for instance, sub-concept, instances, or attributes). The weights $\pi^C_E$ are normalized (i.e., $\sum_{E \in \mathcal{N}(C)} \pi^C_E = 1$). If we consider concept labels ($L$) and three relationships in $\mathcal{N}(C)$, which are the super-concept ($E^{sup}$), the sub-concept ($E^{sub}$), and the sibling concept ($E^{sib}$), Eq. (12) is reformulated as:

$$\begin{aligned} \mathrm{Sim}_C(c, c') ={} & \pi^C_L\, \mathrm{Sim}_L(L(c), L(c')) \\ & + \pi^C_{sup}\, \mathrm{MSim}_C(E^{sup}(c), E^{sup}(c')) \\ & + \pi^C_{sub}\, \mathrm{MSim}_C(E^{sub}(c), E^{sub}(c')) \\ & + \pi^C_{sib}\, \mathrm{MSim}_C(E^{sib}(c), E^{sib}(c')) \end{aligned} \tag{13}$$

where the set functions $\mathrm{MSim}_C$ calculate the similarity of two entity collections, and can be replaced with $\mathrm{Sim}_{CTX}$.

Fig. 4. Aggregation of contexts from user activities (T = 3).


In fact, a context matching between two sets of concepts can be established by finding a maximal matching that maximizes the summed similarity between the concepts:

$$\mathrm{MSim}_C(ctx, ctx') = \frac{\max\left(\sum_{\langle c, c' \rangle \in \mathrm{Pairing}(ctx, ctx')} \mathrm{Sim}_C(c, c')\right)}{\max(|ctx|, |ctx'|)} \tag{14}$$

in which $\mathrm{Pairing}$ provides a matching of the two sets of concepts. Methods like the Hungarian method enable us to directly find the pairing that maximizes similarity. The ontology alignment algorithm is an iterative algorithm that calculates this similarity [19]. This measure is normalized because, if $\mathrm{Sim}_C$ is normalized, the divisor is always greater than or equal to the dividend. It is very similar to the mapping function $\mathcal{M}$ in Eq. (4).
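Eq. (14) can be illustrated with a small brute-force search over pairings. This sketch deliberately does not use the Hungarian method mentioned in the text; it enumerates all one-to-one pairings, which is only practical for very small contexts. The function name `msim_c` and the assumption that the concept similarity is symmetric are illustrative choices.

```python
from itertools import permutations
from typing import Callable, Set


def msim_c(ctx_a: Set[str], ctx_b: Set[str], sim: Callable[[str, str], float]) -> float:
    """Best one-to-one pairing score divided by the size of the larger context (Eq. 14).

    Brute force over pairings; `sim` is assumed symmetric with values in [0, 1].
    """
    if not ctx_a or not ctx_b:
        return 0.0
    small, large = sorted(ctx_a), sorted(ctx_b)
    if len(small) > len(large):
        small, large = large, small
    best = 0.0
    # Each ordered choice of len(small) concepts from `large` is one candidate pairing.
    for chosen in permutations(large, len(small)):
        total = sum(sim(x, y) for x, y in zip(small, chosen))
        best = max(best, total)
    return best / len(large)


# Toy concept similarity: 1.0 for identical labels, 0.0 otherwise (assumption).
exact = lambda x, y: 1.0 if x == y else 0.0
print(msim_c({"Java", "Ontology"}, {"Java", "Agent", "Ontology"}, exact))
```

Because the divisor is the size of the larger context, two contexts that share two of three concepts score 2/3 under the exact-match toy similarity. For larger contexts an assignment solver would replace the enumeration.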

Thus, a collaborative network $CN_i$ is defined as $CN_i = \langle U_i, V_i \rangle$, where $U_i$ is a set of users $\{u_1, \ldots, u_{|CN_i|}\}$ and $V_i \subseteq U_i \times U_i$. A link $v_{ab} \in V_i$ between two users $u_a$ and $u_b$ is labeled with $\mathrm{MSim}_C(ctx_a, ctx_b)$. If we can assume that all of them are contextually cohesive (this testing process is explained subsequently), the consensual context of $CN_i$ is represented as

$$ctx^{\top}_{CN_i} = \bigcup_{k=1}^{|U_i|} ctx_{u_k} \tag{15}$$

where $u_k \in CN_i$. Consequently, we can easily calculate the centrality $CTR(u_k)$ in collaborative network $CN_i$ by

$$CTR(u_k) = \frac{\sum_{j=1, j \neq k}^{|U_i|} \mathrm{MSim}_C(ctx_k, ctx_j)}{|U_i| - 1}. \tag{16}$$

Then, the most centralized user (denoted as $u^{\diamondsuit}_{CN_i}$), whose $CTR$ is the maximum, is regarded as capable of playing the role of a contextual representative of the corresponding CN.
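Eqs. (15) and (16) can be sketched in a few lines. As a loudly stated assumption, the Jaccard coefficient stands in for $\mathrm{MSim}_C$ here so that the example stays self-contained; the function names and the toy contexts are hypothetical.

```python
from typing import Callable, Dict, Set

Context = Set[str]


def jaccard(a: Context, b: Context) -> float:
    """Stand-in context similarity (an assumption, not the paper's MSim_C)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def consensual_context(contexts: Dict[str, Context]) -> Context:
    """Union of all member contexts in a CN (Eq. 15)."""
    merged: Context = set()
    for context in contexts.values():
        merged |= context
    return merged


def centrality(user: str, contexts: Dict[str, Context],
               sim: Callable[[Context, Context], float] = jaccard) -> float:
    """Average similarity between one user and every other member (Eq. 16)."""
    others = [other for other in contexts if other != user]
    if not others:
        return 0.0
    return sum(sim(contexts[user], contexts[other]) for other in others) / len(others)


def representative(contexts: Dict[str, Context],
                   sim: Callable[[Context, Context], float] = jaccard) -> str:
    """Member with the highest centrality, i.e., the contextual representative."""
    return max(contexts, key=lambda user: centrality(user, contexts, sim))


members = {
    "A": {"Ontology", "Java"},
    "B": {"Ontology", "Agent"},
    "C": {"Ontology", "Java", "Agent"},
}
print(sorted(consensual_context(members)), representative(members))
```

In this toy network, user C overlaps most with both other members and is therefore selected as the representative.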

Unlike the incremental building of consensus ontologies over time [20], this consensual context must be updated whenever people access new topics.

3.2. Detection of context transition

In this section, we consider that the context of a user can change while he is researching a certain resource. Thus, if any contextual transition is detected, the CN must be re-organized. Context transition detection is simply based on a comparison between a context $ctx^{(t)}$ and previous ones, e.g., $ctx^{(t-1)}$. If their similarity is no greater than a threshold $\tau_{CTX}$, i.e., if they differ sufficiently, we assume that the corresponding user (or users) is researching distinct resources. For instance, the test for user $u_k$ may be formulated as

$$\mathrm{Sim}_{CTX}\left(ctx^{(t)}_{u_k}, ctx^{(t-1)}_{u_k}\right) \le \tau_{CTX}. \tag{17}$$

Instead of the contexts of every individual user, the consensual contexts of CNs are applied to make this test more efficient to compute.
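The test in Eq. (17) amounts to a threshold check on consecutive contexts. The sketch below again assumes the Jaccard coefficient as a stand-in for $\mathrm{Sim}_{CTX}$; the function name `transition_points` and the sample concept labels are hypothetical.

```python
from typing import Callable, List, Sequence, Set

Context = Set[str]


def jaccard(a: Context, b: Context) -> float:
    """Stand-in for Sim_CTX (an assumption made for this sketch)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def transition_points(sequence: Sequence[Context], tau: float,
                      sim: Callable[[Context, Context], float] = jaccard) -> List[int]:
    """Indices t where Sim_CTX(ctx(t), ctx(t-1)) <= tau, as in Eq. (17)."""
    return [t for t in range(1, len(sequence))
            if sim(sequence[t], sequence[t - 1]) <= tau]


browsing = [
    {"Ontology", "OWL"},
    {"Ontology", "OWL", "RDF"},
    {"Graph", "Tree"},
    {"Graph", "Tree"},
]
print(transition_points(browsing, tau=0.3))
```

Here only the step from the ontology-related context to the graph-related one falls below the threshold, so a single transition is reported at index 2.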

More importantly, several semantic factors are defined to measure the various temporal patterns indicating relationships between the contexts of users in a CN and consensual contexts. We mainly focus on a sequence of contexts $[ctx^{(t-T+1)}, ctx^{(t)}]$ by using the sliding window method, where $T$ is the size of the time interval. Assume that a sequence of contexts $[ctx^{(t-T+1)}_{u_k}, ctx^{(t)}_{u_k}]$ in a time interval $T$ is given.

Definition 5 (Semantic distance matrix $D^{\diamondsuit}$). A semantic distance matrix $D^{\diamondsuit}$ is represented by

$$D^{\diamondsuit}\left[ctx^{(t-T+1)}_{u_k}, ctx^{(t)}_{u_k}\right] = \begin{bmatrix} \cdots & \cdots & \cdots \\ \cdots & 1 - \mathrm{Sim}_{CTX}\left(ctx^{t_i}_{u_k}, ctx^{t_j}_{u_k}\right) & \cdots \\ \cdots & \cdots & \cdots \end{bmatrix} \tag{18}$$

where $t_i, t_j \in [t - T + 1, t]$. The size of this matrix is $T \times T$, and the diagonal elements are equal to zero. Also, by the commutative law, it is a symmetric matrix.

Definition 6 (Semantic distance mean $\mu^{\diamondsuit}$). The semantic distance mean is the average value of the upper (or lower) triangular elements in $D^{\diamondsuit}$, excluding the diagonal components, and it is given by

$$\mu^{\diamondsuit} = \frac{2}{T(T-1)} \sum_{i=1}^{T-1} \sum_{j=i+1}^{T} D^{\diamondsuit}(i, j) \tag{19}$$

where $T$ is the time interval, indicating the size of $D^{\diamondsuit}$.

The semantic distance mean can measure the semantic consistency of the given context sequence (including $ctx_{u_k}$ and $ctx^{\top}_{CN_i}$) derived from the corresponding resources. A smaller $\mu^{\diamondsuit}$ implies higher cohesion of the given context sequence.

Definition 7 (Semantic distance deviation $\sigma^{\diamondsuit}$). A semantic standard deviation is calculated from $\mu^{\diamondsuit}$ and the components of $D^{\diamondsuit}$. It is formulated by

$$\sigma^{\diamondsuit} = \sqrt{\frac{2}{T(T-1)} \sum_{i=1}^{T-1} \sum_{j=i+1}^{T} \left(D^{\diamondsuit}(i, j) - \mu^{\diamondsuit}\right)^2} \tag{20}$$

It is simply a statistical value measuring the degree of dispersion of the semantic distance values from a given set of concepts from resources.
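Definitions 5 to 7 translate directly into code, since the upper triangle of a $T \times T$ matrix has $T(T-1)/2$ elements and the factor $2/(T(T-1))$ is therefore an ordinary average. The sketch below assumes the Jaccard coefficient as a stand-in for $\mathrm{Sim}_{CTX}$; all function names are illustrative.

```python
import math
from typing import Callable, List, Sequence, Set

Context = Set[str]


def jaccard(a: Context, b: Context) -> float:
    """Stand-in for Sim_CTX (an assumption made for this sketch)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def distance_matrix(sequence: Sequence[Context],
                    sim: Callable[[Context, Context], float] = jaccard) -> List[List[float]]:
    """T x T matrix of 1 - Sim_CTX values (Definition 5)."""
    return [[1.0 - sim(a, b) for b in sequence] for a in sequence]


def upper_triangle(matrix: List[List[float]]) -> List[float]:
    """Elements above the diagonal; there are T(T-1)/2 of them."""
    size = len(matrix)
    return [matrix[i][j] for i in range(size) for j in range(i + 1, size)]


def distance_mean(matrix: List[List[float]]) -> float:
    """Semantic distance mean (Definition 6); requires T >= 2."""
    values = upper_triangle(matrix)
    return sum(values) / len(values)


def distance_deviation(matrix: List[List[float]]) -> float:
    """Semantic distance deviation (Definition 7); requires T >= 2."""
    values = upper_triangle(matrix)
    mean = sum(values) / len(values)
    return math.sqrt(sum((value - mean) ** 2 for value in values) / len(values))


window = [{"a", "b"}, {"a", "b"}, {"c"}]  # toy context sequence with T = 3
matrix = distance_matrix(window)
print(distance_mean(matrix), distance_deviation(matrix))
```

For this toy window the three pairwise distances are 0, 1, and 1, so the mean is 2/3 and the deviation is the square root of 2/9.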

As the calculation of semantic factors in a given time interval is repeated over time, we can establish the distribution of contextual transitions. Based on the temporal dynamics of semantic factors, we expect to identify semantically significant transition moments of the contexts (and the consensual context) during research about a certain task. Hence, particular triggering patterns from these signals are regarded as important evidence for specific moments of contextual synchronization.

More practically, activities of users in $CN_i$ are aggregated during a given time interval $T$. A matrix $W(CN_i)$ contains the contexts represented as sets of concepts from personal ontologies, as shown in Fig. 4. From this, we obtain sequences of concept sets which are aggregated by the contextual dynamics of both: (i) a particular user's activities ($ctx_{u_k}$ is indicated by the $k$th row components in $W(CN_i)$), and (ii) a consensual context $ctx^{\top}_{CN_i}$ obtained from merging the column components at each moment. For example, assume that for all pairs of concepts $(c_i, c_j)$, the similarities $\mathrm{Sim}_C(c_i, c_j)$ are identical.

The contextual synchronization process is organized in two steps:

(1) Alerting step. The semantic distance deviation $\sigma^{\diamondsuit}$ of the column components in $W(CN_i)$ is applied to derive significant contextual transitions of a particular user $u_k$ in $CN_i$. This is a certain moment of change $t_p$ when the context $ctx_{u_k}$ is different from the corresponding consensual context $ctx^{\top}_{CN_i}$. Basically, this step is very similar to outlier detection from a given streaming dataset in various application domains (e.g., financial transactions, environmental and scientific data sources) [21–23]. In this paper, a set of time points $t_{Alert}$ for generating alerts can be characterized as

$$t_{Alert} = \{t_p \mid t_p \in [t - T + 1, t],\ F_{ALRM}(CN_i, t_p) \ge \lambda_{Alert}\} \tag{21}$$

where $\lambda_{Alert}$ is the threshold value for alerting that there exist some users $u_{Alert}$ whose context is semantically different from the other members in the collaborative network ($ctx^{\top}_{CN_i}$ transitions). Function $F_{ALRM}$ is given by

$$F_{ALRM}(CN_i, t_p) = \left| \sigma^{\diamondsuit}\left(ctx^{\top(t_p)}_{CN_i}\right) - \sigma^{\diamondsuit}\left(ctx^{\top(t_p - 1)}_{CN_i}\right) \right| \tag{22}$$

Then, at a given time point $t_p \in t_{Alert}$, the users $u^{(t_p)}_{Alert}$ are simply detected by

$$u^{(t_p)}_{Alert} = \left\{ u_j \,\middle|\, u_j = \arg_i \max \sum_{h=1}^{H} D^{\diamondsuit}(j, h) \right\} \tag{23}$$

where $H$ is the size of the semantic distance matrix $D^{\diamondsuit}$. This discovers and removes the $i$ most dissimilar users (those with the maximal summation of semantic distances $d^{\diamondsuit}$) within $CN_i$. To remove $u^{(t_p)}_{Alert}$ from $D^{\diamondsuit}$, we must iterate until the temporal difference of the consensual context in Eq. (21) is less than $\lambda_{Alert}$ (formulated as $\left| \sigma^{\diamondsuit}\left(ctx^{\top(t_p)}_{CN_i}\right) - \sigma^{\diamondsuit}\left(ctx^{\top(t_p - 1)}_{CN_i}\right) \right| \le \lambda_{Alert}$).

(2) Confirming step. In the previous step, we discovered a set of users $u^{(t_p)}_{Alert}$ whose contexts have probably changed at time $t_p$. The confirming step involves:

(a) Confirming whether or not users exhibit contextual transitions (we discuss further details about associations between the contexts of people and the consensual context of the corresponding CN in Section 5), and

(b) If so, discovering the specific transition moments of the contexts of the confirmed users by comparison with the previous context.

Thereby, the semantic distance mean $\mu^{\diamondsuit}$ is measured to determine whether or not the alerted users' contexts $ctx_{u_i}$, where $u_i \in u_{Alert}$, have changed. By the continuity property of ontology-based contexts (Property 1), this can be discovered by constructing the semantic distance matrix $D^{\diamondsuit}$ from the subsequence derived from each row component of $W(CN_i)$. If users $u_{Alert}$ have exhibited any contextual transitions, we can detect more specific time points $t_S$ of contextual transitions. Similar to the previous "alerting" step, the confirming step for the alerted users can be characterized as

$$u_{Confirm} = \{\langle u_j, t_{s_q} \rangle \mid u_j \in u_{Alert},\ F_{CFRM}(u_j, t_{s_q}) \ge \lambda_{Confirm}\} \tag{24}$$

where the threshold $\lambda_{Confirm}$ must be pre-defined by users for confirming contextual transitions of context $ctx^{(t_{s_q})}_{u_j}$. Function $F_{CFRM}$ is given by

$$F_{CFRM}(u_j, t_{s_q}) = \left\| \mu^{\diamondsuit}\left(ctx^{(t^-_{s_q})}_{u_j}\right) - \mu^{\diamondsuit}\left(ctx^{(t^+_{s_q})}_{u_j}\right) \right\| \tag{25}$$

where $t^-_{s_q}$ and $t^+_{s_q}$ denote the bisected time intervals $[t_{s_0}, t_{s_q} - 1]$ and $[t_{s_q}, t_T]$, respectively. Hence, a set of time points $t^j_S$ is likely to be the moments when the personal context of the corresponding user changes.

3 Borland Delphi. http://www.borland.com/.
4 JXTA API. http://www.jxta.org/.
5 W3C OWL. http://www.w3.org/TR/owl-features/.
6 This image collection is available at http://www.intelligent.pe.kr/AnnotGrid/.
7 Open Directory Project. http://www.dmoz.org/.

Each time streaming activities within a collaborative network $CN_i$ are stored in $W(CN_i)$, the two-step procedure for detecting contextual transitions must be carried out. We employ the semantic distance deviation $\sigma^{\diamondsuit}$ to establish the dispersion of members in $CN_i$, rather than the consensual context itself. Subsequently, if some users are detected in this step, the confirming step can establish whether or not their transitions are valid, because the semantic distance mean $\mu^{\diamondsuit}$ is useful for measuring the semantic cohesion within a certain time interval.
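The alerting step can be sketched as follows. This is one possible reading of Eqs. (21)-(23), not the author's implementation: the dispersion at each time step is taken as the deviation of pairwise distances among the members' contexts in that column of the activity matrix, the Jaccard coefficient stands in for $\mathrm{Sim}_{CTX}$, and all names and sample data are hypothetical.

```python
import math
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Set

Context = Set[str]


def jaccard(a: Context, b: Context) -> float:
    """Stand-in for Sim_CTX (an assumption made for this sketch)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dispersion(contexts: Sequence[Context],
               sim: Callable[[Context, Context], float] = jaccard) -> float:
    """Deviation of pairwise distances among member contexts at one time step."""
    distances = [1.0 - sim(a, b) for a, b in combinations(contexts, 2)]
    if not distances:
        return 0.0
    mean = sum(distances) / len(distances)
    return math.sqrt(sum((d - mean) ** 2 for d in distances) / len(distances))


def alert_times(activity: Dict[str, List[Context]], threshold: float,
                sim: Callable[[Context, Context], float] = jaccard) -> List[int]:
    """Time steps whose change in dispersion reaches the threshold (Eqs. 21-22)."""
    users = list(activity)
    steps = len(activity[users[0]])
    sigma = [dispersion([activity[u][t] for u in users], sim) for t in range(steps)]
    return [t for t in range(1, steps) if abs(sigma[t] - sigma[t - 1]) >= threshold]


def most_dissimilar_user(activity: Dict[str, List[Context]], t: int,
                         sim: Callable[[Context, Context], float] = jaccard) -> str:
    """User with the largest summed distance to all members at time t (Eq. 23)."""
    users = list(activity)
    return max(users, key=lambda u: sum(1.0 - sim(activity[u][t], activity[v][t])
                                        for v in users))


activity = {
    "u1": [{"Ontology"}, {"Ontology"}, {"Ontology"}],
    "u2": [{"Ontology"}, {"Ontology"}, {"Ontology"}],
    "u3": [{"Ontology"}, {"Ontology"}, {"Graph"}],
}
for t in alert_times(activity, threshold=0.4):
    print(t, most_dissimilar_user(activity, t))
```

In this toy activity matrix the dispersion jumps only at the last time step, when u3 leaves the shared context, so that step is alerted and u3 is identified as the most dissimilar member. The confirming step would then examine u3's own context sequence.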

3.3. Re-organization

Confirmed users ui 2 uConfirm must be provided with the CN forwhich the consensual context is more relevant to ctxui

. The re-orga-nization process must be conducted by the following objectivefunction

maxX

CNi ;CNj2X1� SimCTX ctx>ðtpþ1Þ

CNi; ctx>ðtpþ1Þ

CNj

� �0@

1A ð26Þ

where $X$ is the complete set of CNs $\{CN_i \mid CN_i = \langle U_i, V_i \rangle\}$ which people can access. This implies that the summation of semantic distances between consensual contexts of two CNs must be maximized. As a result, topology patterns of these CNs are dynamic. Here, the decomposability property (in Property 2) of the context is considered. We want to discover which CNs are most relevant to sub-contexts $ctx_{u_i}^{k\perp} \sqsubseteq ctx_{u_i}$ by using

$$\max \sum_{k=1}^{K} Sim_{CTX}\left( ctx_{u_i}^{k\perp},\ ctx_{CN_j} \right) \tag{27}$$

where $K$ is the total number of sub-contexts, based on the number of concepts shared with the consensual contexts of CNs. This implies that user $u_i$ is contextually associated with $K$ CNs. Moreover, as many users are commonly participating in different CNs, the CNs can eventually be merged into a single CN.
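The re-organization objective and the sub-context assignment of Eqs. (26) and (27) can be sketched as follows. The sketch is only illustrative: contexts are reduced to plain sets of concept labels, Jaccard overlap stands in for $Sim_{CTX}$ (an assumption, not the similarity measure defined in this paper), and the sum in Eq. (26) is taken over unordered pairs of CNs. All function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in for Sim_CTX: overlap of two concept sets (an assumption)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def separation(consensual):
    """Eq. (26)-style objective: summed distance over unordered CN pairs."""
    return sum(
        1.0 - jaccard(consensual[i], consensual[j])
        for i, j in combinations(sorted(consensual), 2)
    )


def assign_sub_contexts(sub_contexts, consensual):
    """Eq. (27)-style step: map each sub-context to its most similar CN."""
    return {
        name: max(sorted(consensual),
                  key=lambda cn: jaccard(concepts, consensual[cn]))
        for name, concepts in sub_contexts.items()
    }


if __name__ == "__main__":
    consensual = {
        "CN1": {"jazz", "music", "concert"},
        "CN2": {"python", "code", "software"},
    }
    sub_contexts = {
        "k1": {"music", "jazz"},
        "k2": {"software", "python", "testing"},
    }
    print(separation(consensual))
    print(assign_sub_contexts(sub_contexts, consensual))
```

A user whose context decomposes into the two sub-contexts above would be associated with both CNs, which mirrors the statement that a user with $K$ sub-contexts is contextually associated with $K$ CNs.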

4. Experimental results

In order to evaluate the proposed approach, we have extended the collaborative web browsing system on a peer-to-peer platform, presented in [24]. The development specification is mainly separated into two components: (i) peer modules (e.g., the Graphic User Interface (GUI) module and web browser), for which we employed Borland Delphi³ and a personal ontology editor, and (ii) super-peer modules (e.g., peer manager and alarm manager), for which the JXTA API libraries⁴ were employed.

Thirty users were organized into three groups of ten users each, $G_A$, $G_B$, and $G_C$, and they were asked to build their own personal ontologies with the Web Ontology Language (OWL).⁵ In addition, these ontologies were enriched by annotating a given set of images.⁶

Then, we collected the 30 sequences of weblogs by enabling these users to browse the test bed with their own contexts, which are assumed to be fixed. After cleansing the collected dataset by the preprocessing scheme proposed in [25], we prepared the testing dataset, which is composed of 5622 web pages classified according to 28 categories from ODP.⁷

578 J.J. Jung / Knowledge-Based Systems 21 (2008) 573–580

Fig. 5. Detection of contextual transitions in the alarming and confirming steps as $\lambda_{Alert}$ and $\lambda_{Confirm}$ change. [Two panels: recall, precision, and F1 (0 to 1) plotted against the threshold for alarming and the threshold for confirming (0 to 1).]

Fig. 7. Performance of information searching from three CNs with precision. [Plot: precision (0 to 1) over days (0 to 25) for Simple Co-browsing ($G_A$), Bayesian Synchronization ($G_B$), and Ontology-based Synchronization ($G_C$).]

Table 1
Evaluation of the performance of communications

Communications          G_A     G_B      G_C
Group communications    –       13441    12325
Web accesses            6232    3662     3285
Ratio (%)               –       58.76    52.71

The first issue is to evaluate the precision of detecting contextual transitions. We generated 100 synthesized sequences, including 794 contextual transitions, by intermixing fragments randomly segmented from the 30 weblog sequences (i.e., fragments from different weblog sequences are contextually distinct). We examined the degree to which the transitions could be detected with respect to two measurements, precision and recall. Additionally, the F1-value is calculated as $F_1 = \frac{2 \cdot recall \cdot precision}{recall + precision}$ to combine these two measures. As shown in Fig. 5, the alerting step obtained an average $F_1 = 0.52$, with the best performance $F_1 = 0.544$ at $\lambda_{Alert} = 0.4$. The average performance of the confirming step was $F_1 = 0.72$, with the best result ($F_1 = 0.759$) at $\lambda_{Confirm} = 0.6$. We thus empirically established the best threshold values $\lambda_{Alert} = 0.4$ and $\lambda_{Confirm} = 0.6$. The threshold level of the confirming step seems slightly more critical, because it is for the personal context. Overall, the average performance of the confirming process is approximately 37% higher than that of the alerting step.
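The F1-value used in the detection experiment is the harmonic mean of precision and recall. A minimal sketch of that calculation, together with a relative comparison of two F1 averages, is given below; the helper names `f1_score` and `relative_gain` are hypothetical, and the precision and recall inputs in the example are made-up values, not measurements from the experiment.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Illustrative inputs only.
    print(round(f1_score(precision=0.8, recall=0.5), 3))
    # Relative gain between the two rounded F1 averages quoted in the text.
    print(round(relative_gain(0.72, 0.52), 1))
```

Applied to the rounded averages 0.72 and 0.52, the relative gain comes out at roughly 38%, in the same range as the approximate figure stated in the text.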

The second issue is the performance of information sharing by re-organizing CNs. This demonstrates the efficiency of online collaboration during browsing, compared with individual browsing or basic co-browsing systems (i.e., without contextual synchronization). While users in $G_A$ browsed with simple collaboration (assuming that contexts are fixed), users in $G_B$ and $G_C$ browsed with online collaboration in which context transitions were detected dynamically. Specifically, we asked users in $G_B$ to use a context mapping method based on Bayesian influence propagation [24], and only users in $G_C$ were provided with the proposed ontology mapping algorithms. We monitored the performance of information searching tasks in the three groups over time, by comparing the topics derived from the retrieved information with the topics the users selected as their interests prior to the experiments. As shown in Fig. 6, $G_C$, co-browsing with ontology mapping-based synchronization, exhibited the best recall results (on the 9th day, approximately four times higher than $G_A$, and on the 7th day, 79% higher than $G_B$). This implies that ontology mapping-based synchronization for co-browsing can support users, particularly during the early stage.

With respect to precision, Fig. 7 shows that the co-browsing systems eventually converged above the 80% precision level, even though individual browsing exhibited the best performance during the initial stage. In the case of individual browsing, users put their preferences into corresponding personal crawlers.
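The recall and precision comparisons above rest on matching the topics derived from retrieved information against the topics each user selected as interests. A minimal sketch of that comparison follows, assuming topics are represented as plain category labels and matched exactly; the function name and the sample labels are hypothetical.

```python
def precision_recall(retrieved_topics, interest_topics):
    """Compare retrieved topics with a user's pre-selected interest topics."""
    retrieved = set(retrieved_topics)
    interests = set(interest_topics)
    hits = retrieved & interests
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(interests) if interests else 0.0
    return precision, recall


if __name__ == "__main__":
    retrieved = ["Arts/Music", "Science/Biology", "Sports/Soccer", "Computers/AI"]
    interests = ["Arts/Music", "Computers/AI", "Health/Fitness"]
    p, r = precision_recall(retrieved, interests)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Exact label matching is the simplest choice; a hierarchy-aware comparison of category paths would give partial credit to near misses.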

Fig. 6. Performance of information searching from three CNs with recall. [Plot: recall (0 to 1) over days (0 to 25) for Simple Co-browsing ($G_A$), Bayesian Synchronization ($G_B$), and Ontology-based Synchronization ($G_C$).]

Overall, Table 1 shows the final results of the three groups' browsing over four weeks. With context-aware cooperation, $G_C$ in online co-browsing needed only about 53% of the web accesses of $G_A$. Compared with $G_B$, our proposed method improved slightly, by 11.5%.
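The ratios in Table 1 can be reproduced from the web access counts. The sketch below recomputes them with $G_A$ as the baseline. The 11.5% figure is matched when the difference between $G_B$ and $G_C$ is divided by the $G_C$ count; that choice of denominator is inferred from the numbers and is not stated in the text.

```python
# Web access counts as listed in Table 1.
WEB_ACCESSES = {"G_A": 6232, "G_B": 3662, "G_C": 3285}


def ratio_to_baseline(accesses, baseline="G_A"):
    """Web accesses of each group as a percentage of the baseline group."""
    base = accesses[baseline]
    return {
        group: round(100.0 * count / base, 2)
        for group, count in accesses.items()
        if group != baseline
    }


def gap_percent(accesses, worse="G_B", better="G_C"):
    """Extra accesses of `worse` relative to `better`, in percent."""
    return 100.0 * (accesses[worse] - accesses[better]) / accesses[better]


if __name__ == "__main__":
    print(ratio_to_baseline(WEB_ACCESSES))
    print(round(gap_percent(WEB_ACCESSES), 1))
```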

5. Discussion

In this paper, we proposed collaborative personal agents for a heterogeneous web information environment. Focused crawler systems, also known as topical (topic-driven) crawlers, can systematically access most web spaces. We first compare a single ontology-based platform with a multiple ontology-based platform. Second, and more importantly, we discuss the dynamics of personal and group contexts on a super-peer network.

5.1. Comparison between single and multiple ontologies

In this work, we employ a single centralized ontology, i.e., a web directory. This enables the automatic resolution of the semantic heterogeneity problem, because every topic is annotated (or labeled) by referring to the single ontology.

However, we must consider a platform which provides multiple ontologies. Each information source (or system) can build its own ontology. These types of ontologies might be domain-specific and cause semantic heterogeneity problems. In our case, the users' agents can be embedded with their personal interests. Also, users might edit their own personal ontologies. For this reason, ontology alignment (mapping) methods have been proposed. Recently, Shvaiko and Euzenat explained the classification method of ontology alignment (matching) [26] and ontology mapping algorithms [27]. Several alignment methodologies have been introduced. Dieng and Hug proposed an algorithm for matching conceptual graphs using terminological linguistic techniques and comparing superclasses and subclasses [28], and Euzenat developed T-tree to infer the dependencies between classes (bridges) of different ontologies sharing the same set of instances based only on the "extensions of classes" [29]. Additionally, FCA-merge uses formal concept analysis techniques to merge two ontologies sharing the same set of instances, while properties of classes are ignored [30]. Meanwhile, Cupid is a first approach combining many of the other techniques. It aligns acyclic structures, taking terminology and data types (internal structure) into account and giving more importance to leaves [31]. In particular, we studied the ontological mediator framework for sharing semantic information between personal crawlers [20].
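Several of the methods above relate classes of different ontologies through the instances they share. The following sketch illustrates only that general idea with a plain Jaccard overlap of class extensions. It is not an implementation of T-tree, FCA-merge, or Cupid, and the function name, threshold, and sample data are hypothetical.

```python
def match_by_extension(onto_a, onto_b, threshold=0.5):
    """Pair classes whose instance sets overlap strongly.

    onto_a, onto_b: dicts mapping a class name to its set of instance ids.
    Returns (class_a, class_b, score) triples, best match first.
    """
    matches = []
    for class_a, ext_a in onto_a.items():
        for class_b, ext_b in onto_b.items():
            union = ext_a | ext_b
            if not union:
                continue  # two empty extensions carry no evidence
            score = len(ext_a & ext_b) / len(union)
            if score >= threshold:
                matches.append((class_a, class_b, score))
    return sorted(matches, key=lambda m: -m[2])


if __name__ == "__main__":
    # Two personal ontologies annotating the same image collection.
    onto_a = {"Jazz": {"img1", "img2", "img3"}, "Sport": {"img7"}}
    onto_b = {"JazzMusic": {"img1", "img2"}, "Football": {"img7", "img8"}}
    for class_a, class_b, score in match_by_extension(onto_a, onto_b):
        print(class_a, class_b, round(score, 2))
```

An extension-only comparison ignores class names and properties, so it works only when both ontologies annotate a common set of instances.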

5.2. Dynamics of personal contexts and group context

We assumed that the context can be changed during web browsing. In most collaboration systems, the context is coupled and related (i) between people, (ii) between groups, and (iii) between a person and a group. For supporting online collaborations between users, we realized that deriving and comparing the dynamics of contexts is more important than representing and comparing the contexts themselves.

There were two well-known P2P networks, Napster⁸ and Gnutella.⁹ In particular, we employ a super-peer network scheme similar to Napster. A pure P2P network is a "degenerate" super-peer network where the cluster size is one. While this implies that every node is a super-peer with no clients, super-peer networks such as KaZaA¹⁰ use the heterogeneity of peers to their advantage. Also, regarding the computational complexity, the overhead of maintaining an index in the super-peer networks is small in comparison to the savings in query cost that this centralized index enables [32]. Furthermore, Xiao et al. proposed a dynamic super-peer network based on dynamic layer management [33].
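The role of the centralized index in a super-peer cluster can be sketched with a small in-memory structure. This is a toy model under stated assumptions: the class and method names are hypothetical, no JXTA API is used, and a cluster is counted as the super-peer plus its clients, so that a super-peer without clients has cluster size one as in the pure P2P case.

```python
from collections import defaultdict


class SuperPeer:
    """Toy super-peer that indexes the topics offered by its client peers."""

    def __init__(self, name):
        self.name = name
        self.clients = set()
        self.index = defaultdict(set)  # topic -> names of client peers

    def register(self, peer, topics):
        """Add a client and index the topics it can serve."""
        self.clients.add(peer)
        for topic in topics:
            self.index[topic].add(peer)

    def unregister(self, peer):
        """Remove a client and drop it from every index entry."""
        self.clients.discard(peer)
        for peers in self.index.values():
            peers.discard(peer)

    def query(self, topic):
        """Answer from the local index instead of flooding the cluster."""
        return sorted(self.index.get(topic, set()))

    def cluster_size(self):
        """Super-peer plus clients; 1 corresponds to a pure P2P node."""
        return len(self.clients) + 1


if __name__ == "__main__":
    sp = SuperPeer("sp1")
    sp.register("alice", ["jazz", "python"])
    sp.register("bob", ["jazz"])
    print(sp.query("jazz"), sp.cluster_size())
```

The index costs one update per registration, while each query is resolved with a single lookup at the super-peer.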

6. Conclusion and future work

Social browsing emerged from various collaborative systems. In this paper, we proposed an online co-browsing system with spatially remote and temporally synchronous characteristics. It is capable of detecting contextual transitions of users in a group, so that they are efficiently shifted into the relevant group communications. As the main contribution of this paper, we propose tracking the contextual dynamics of groups while co-browsing, rather than modeling the consensual context of the groups.

However, many problems remain with this system that must be dealt with in future work. We modified the Levenshtein edit distance [34] to measure the distance between hierarchical path-labeled web pages. In order to support more general types of users, we will consider various semantic annotation methods [35] to compare their relationships. Another issue is the topology of the P2P network. As mentioned in [36], a hyperlinked environment has various topological features such as authorities and hubs, thus we must consider the selection process of super-peers. Finally, more globally, by using the grid computing paradigm [37], we will also consider the social grid environment, providing k-redundant super-peer networks [32].
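One plausible way to apply an edit distance to hierarchical path labels is to treat each path segment, rather than each character, as the unit of editing. The sketch below shows the standard Levenshtein recurrence over path segments. It is an assumption about the general idea, not the modified distance used in this work, and the function name and category paths are hypothetical.

```python
def path_edit_distance(path_a, path_b, sep="/"):
    """Levenshtein distance computed over path segments, not characters."""
    a = [segment for segment in path_a.split(sep) if segment]
    b = [segment for segment in path_b.split(sep) if segment]
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        curr = [i]
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            curr.append(min(prev[j] + 1,          # delete seg_a
                            curr[j - 1] + 1,      # insert seg_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(path_edit_distance("Arts/Music/Jazz", "Arts/Music/Rock"))
    print(path_edit_distance("Arts/Music/Jazz", "Science/Biology"))
```

Pages filed under sibling categories end up one edit apart, while pages under unrelated top-level categories are separated by the full path length.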

Acknowledgement

This work was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (KRF-2005-037-D00347).

References

[1] A. Schmidt, Management of dynamic and imperfect user context information, in: R.M. et al. (Eds.), Proceedings of the International on the Move Federated Conferences (OTM), vol. 3292 of Lecture Notes in Computer Science, Springer-Verlag, 2004, pp. 779–786.

[2] C.A. Ellis, S.J. Gibbs, G. Rein, Groupware: some issues and experiences, Communications of the ACM 34 (1) (1991) 39–58.

[3] V. Anupam, C.L. Bajaj, Shastra: multimedia collaborative design environment, IEEE Multimedia 1 (2) (1994) 39–49.

8 Napster. http://www.napster.com/.
9 Gnutella. http://www.gnutella.com/.
10 KaZaA. http://www.kazaa.com/.

[4] J. Kilker, Conflict on collaborative design teams: understanding the role of social identities, IEEE Technology and Society Magazine 18 (3) (1999) 12–21.

[5] A.L. Morán, J. Favela, A.M.M. Enríquez, D. Decouchant, On the design of potential collaboration spaces, International Journal of Computer Applications in Technology 19 (3–4) (2004) 184–194.

[6] A. Ravenscroft, M. Matheson, Developing and evaluating dialogue games for collaborative e-learning, Journal of Computer Assisted Learning 18 (1) (2002) 93–101.

[7] V. Hassler, Online collaboration products, IEEE Computer 37 (11) (2004) 106–109.

[8] J.J. Jung, Collaborative web browsing based on semantic extraction of user interests with bookmarks, Journal of Universal Computer Science 11 (2) (2005) 213–228.

[9] J. Swan, H. Scarbrough, M. Robertson, The construction of ‘communities of practice’ in the management of innovation, Management Learning 33 (4) (2002) 477–496.

[10] B. Kokinov, A dynamic theory of implicit context, in: Proceedings of the 2nd European Conference on Cognitive Science, 1997.

[11] L. Terveen, D.W. McDonald, Social matching: a framework and research agenda, ACM Transactions on Computer–Human Interaction 12 (3) (2005) 401–434.

[12] T. Gross, W. Prinz, Modelling shared contexts in cooperative environments: concept, implementation, and evaluation, Computer Supported Cooperative Work 13 (3) (2004) 283–303.

[13] C. Simone, M. Divitini, Integrating contexts to support coordination: the chaos project, Computer Supported Cooperative Work 8 (3) (1999) 239–283.

[14] T. Helmy, T. Mine, G. Zhong, M. Amamiya, Open distributed autonomous multi-agent coordination on the web, in: Proceedings of the 7th International Conference on Parallel and Distributed Systems (ICPADS), IEEE Computer Society, Washington, DC, USA, 2000, p. 461.

[15] W. Treese, Open systems for collaboration, Networker 8 (1) (2004) 13–16.

[16] P. Bouquet, F. Giunchiglia, F. van Harmelen, L. Serafini, H. Stuckenschmidt, Contextualizing ontologies, Journal of Web Semantics 1 (3) (2004) 325–343.

[17] A. Zimmermann, J. Euzenat, Three semantics for distributed systems and their relations with alignment composition, in: I.F. Cruz, S. Decker, D. Allemang, C. Preist, D. Schwabe, P. Mika, M. Uschold, L. Aroyo (Eds.), Proceedings of the 5th International Semantic Web Conference (ISWC 2006), vol. 4273 of Lecture Notes in Computer Science, Springer, 2006, pp. 16–29.

[18] P. Soucy, G.W. Mineau, Beyond tfidf weighting for text categorization in the vector space model, in: L.P. Kaelbling, A. Saffiotti (Eds.), Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI-05), Edinburgh, Scotland, UK, July 30–August 5, 2005, Professional Book Center, 2005, pp. 1130–1135.

[19] J. Euzenat, P. Valtchev, Similarity-based ontology alignment in OWL-Lite, in: Proceedings of the 16th European Conference on Artificial Intelligence, 2004, pp. 333–337.

[20] J.J. Jung, Ontological framework based on contextual mediation for collaborative information retrieval, Information Retrieval 10 (1) (2007) 85–109.

[21] V. Puttagunta, K. Kalpakis, Adaptive methods for activity monitoring of streaming data, in: M.A. Wani, H.R. Arabnia, K.J. Cios, K. Hafeez, G. Kendall (Eds.), Proceedings of the 2002 International Conference on Machine Learning and Applications (ICMLA 2002), CSREA Press, 2002, pp. 197–203.

[22] S. Papadimitriou, A. Brockwell, C. Faloutsos, Adaptive, unsupervised stream mining, The VLDB Journal 13 (3) (2004) 222–239.

[23] S. Papadimitriou, J. Sun, C. Faloutsos, Streaming pattern discovery in multiple time-series, in: K. Böhm, C.S. Jensen, L.M. Haas, M.L. Kersten, P.-Å. Larson, B.C. Ooi (Eds.), Proceedings of the 31st International Conference on Very Large Data Bases (VLDB), ACM, 2005, pp. 697–708.

[24] J.J. Jung, Semantic co-browsing system based on contextual synchronization on peer-to-peer environment, Computing and Informatics 26 (5) (2007) 469–488.

[25] J.J. Jung, Semantic preprocessing of web request streams for web usage mining, Journal of Universal Computer Science 11 (8) (2005) 1383–1396.

[26] Y. Kalfoglou, M. Schorlemmer, Ontology mapping: the state of the art, Knowledge Engineering Review 18 (1) (2003) 1–31.

[27] P. Shvaiko, J. Euzenat, A survey of schema-based matching approaches, Journal on Data Semantics IV 3730 (2005) 146–171.

[28] R. Dieng, S. Hug, Comparison of "personal ontologies" represented through conceptual graphs, in: Proceedings of the 13th European Conference on Artificial Intelligence, 1998, pp. 341–345.

[29] J. Euzenat, Brief overview of t-tree: the tropes taxonomy building tool, in: Proceedings of the 4th ASIS SIG/CR workshop on classification research, Columbus, USA, 1994, pp. 69–87.

[30] G. Stumme, A. Maedche, FCA-merge: bottom-up merging of ontologies, in: Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI’01), Seattle, USA, 2001, pp. 225–230.

[31] J. Madhavan, P. Bernstein, E. Rahm, Generic schema matching using Cupid, in: Proceedings of the 27th International Conference on Very Large Data Bases (VLDB’01), Roma, Italy, 2001, pp. 48–58.

[32] B. Yang, H. Garcia-Molina, Designing a super-peer network, in: U. Dayal, K. Ramamritham, T.M. Vijayaraman (Eds.), Proceedings of the 19th International Conference on Data Engineering (ICDE 2003), IEEE Computer Society, 2003, pp. 49–60.

[33] L. Xiao, Z. Zhuang, Y. Liu, Dynamic layer management in superpeer architectures, IEEE Transactions on Parallel and Distributed Systems 16 (11) (2005) 1078–1091.


[34] V.I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Cybernetics and Control Theory 10 (8) (1966) 707–710.

[35] J.J. Jung, Exploiting semantic annotation to supporting user browsing on the web, Knowledge-Based Systems 20 (4) (2007) 373–381.

[36] J.M. Kleinberg, Authoritative sources in a hyperlinked environment, Journal of the ACM 46 (5) (1999) 604–632.

[37] H. Zhuge, Semantics, resource and grid, Future Generation Computer Systems 21 (1) (2004) 1–5.