Expert Systems with Applications 37 (2010) 8401–8405
Ontology mapping composition for query transformation on distributed environments
Jason J. Jung
Department of Computer Engineering, Yeungnam University, Dae-Dong, Gyeungsan, Republic of Korea
Article info
Keywords: Query transformation; Ontology mapping; Mapping composition
0957-4174/$ - see front matter. doi:10.1016/j.eswa.2010.05.041
E-mail address: j2jung@gmail.com
Abstract
Semantic heterogeneity must be dealt with to support automated information sharing between information systems in distributed environments. Traditional approaches have relied on explicit mappings between ontologies, obtained from human experts of the particular domain. However, such manual tasks are very expensive, so it is difficult to obtain ontology mappings between all possible pairs of information systems. In this paper, we therefore propose a semantic distributed system that makes existing mapping information sharable and exchangeable: the proposed system collects the existing mappings and aggregates them. Consequently, we can estimate ontology mappings in an indirect manner. In particular, this paper focuses on query propagation on distributed networks. Once an indirect mapping between two systems has been found, queries can be efficiently transformed to automatically exchange knowledge between heterogeneous information systems.
© 2010 Elsevier Ltd. All rights reserved.
1. Introduction
Information systems in various domains, e.g., e-learning (Shih, Yang, & Tseng, 2009) and telecommunication (Jung, Lee, & Choi, 2009), have been trying to build their own domain ontologies for efficiently managing local resources and information.
For many purposes (Morbach, Yang, & Marquardt, 2007; Payne, Mendonça, Johnson, & Starren, 2007; Shih et al., 2009; Villanueva-Rosales & Dumontier, 2008), such systems have been interlinked with each other. In this distributed environment, the information systems have to be able to interact with each other automatically. In particular, we focus on ontology-based information systems, where such interactions can be based on an ontology mapping process.
Most local ontologies, however, are constructed by domain experts and from local database schemas, so the semantics of each information system is distinct. According to our previous studies (Jung, 2007, 2010a), each information system tends to include resources that are:
• related to specific and unique topics,
• represented as consistent linguistic terminologies,
• organized by a local database schema, and
• annotated with local metadata.
This means that the resources in an information system are semantically annotated with metadata from the corresponding local ontology.
Due to this semantic heterogeneity between local ontologies, it is difficult for information systems in a distributed environment to be semantically interoperable with each other. In other words, the systems are supposed to interact with each other automatically by sharing their resources and knowledge. So far, finding mappings between ontologies has been regarded as an efficient solution.
More seriously, manual ontology mapping by human users is costly, because of the lack of domain expertise as well as the complex internal structure of the ontologies (e.g., a large number of concepts and properties). Thus, many ontology mapping algorithms for automatically discovering correspondences between ontologies have been proposed (Euzenat & Shvaiko, 2007).
However, most of these mapping algorithms have a scalability problem. Given two source ontologies, they can obtain explicit and direct mapping results, but they face serious difficulties as the number of ontologies in a general distributed environment increases.
To deal with this problem, we compose an indirect mapping by reusing mapping information that has already been produced. That is, given two ontologies $O_A$ and $O_B$, we can make two information systems $S_A$ and $S_B$ interoperable by composing two existing mappings $M(O_A, O_C)$ and $M(O_C, O_B)$, instead of finding a direct mapping $M(O_A, O_B)$.
In this paper, the ontologies and mapping information are therefore formalized with additional definitions for mapping composition. A novel measurement for ontology mapping performance is introduced. In particular, to evaluate the performance of sharing and composing mapping results, a multi-agent platform has been employed. Any two heterogeneous agents on the multi-agent platform can communicate with each other through a query-answering process, so that we can measure how precisely the proposed mapping composition is conducted.
The outline of this paper is as follows. Section 2 presents the major components for building ontology-based information systems and describes a similarity-based ontology mapping algorithm. Section 3 shows how to share and compose the existing mapping information. Section 4 presents the experimental results collected in the evaluation. Finally, Section 5 draws a conclusion of this work.
2. Ontology-based information systems
Ontology-based information systems are supposed to be machine processable. In this study, such a system is mainly composed of two parts: (i) an ontology $O$, and (ii) a set of mappings with its neighbors.
Definition 1 (Ontology). An ontology $O$ is represented as:
$$O := (C, R, E_R, I_C), \tag{1}$$
where $C$ and $R$ are a set of classes (or concepts) and a set of relations (e.g., equivalence, subsumption, disjunction, etc.), respectively. $E_R \subseteq C \times C$ is a set of relationships between classes, represented as a set of triples $\{\langle c_i, r, c_j \rangle \mid c_i, c_j \in C,\ r \in R\}$. $I_C$ is a power set of instance sets of a class $c_i \in C$.
These ontologies are grounded with a set of instances. In terms of description logic, $I_C$ can be replaced with an A-Box.
Whatever the mapping algorithm, each mapping result can be represented as a set of correspondences between ontology entities, each with a confidence value.
Definition 2 (Correspondences). Given two ontologies $O$ and $O'$, a set of correspondences is given by:
$$M(O, O') = \{\langle e, e', r_M, CF \rangle \mid e \in O,\ e' \in O',\ r_M,\ CF \in [0,1]\}, \tag{2}$$
where $e$ and $e'$ are a pair of matched entities. The mapping relationship is $r_M \in \{\equiv, \sqsubseteq, \sqsupseteq, \perp\}$, and $CF$ indicates the confidence value of the pair.
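Definitions 1 and 2 can be read as two small data structures. The following is a minimal sketch under stated assumptions, not the representation used in the reported system: the class names `Ontology` and `Correspondence`, the string encoding of the four mapping relationships, and the single subsumption triple are all hypothetical. The five correspondences are the ones listed in Table 1.

```python
from dataclasses import dataclass, field

# Hypothetical string names for the four mapping relationships r_M.
RELATIONS = {"equiv", "sub", "super", "disjoint"}


@dataclass(frozen=True)
class Correspondence:
    """One element <e, e', r_M, CF> of a mapping M(O, O')."""
    source: str
    target: str
    relation: str
    confidence: float

    def __post_init__(self):
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


@dataclass
class Ontology:
    """O := (C, R, E_R, I_C) as in Definition 1."""
    classes: set
    relations: set
    triples: set = field(default_factory=set)      # E_R: triples (c_i, r, c_j)
    instances: dict = field(default_factory=dict)  # I_C: class -> set of instances

    def add_triple(self, ci, r, cj):
        if ci not in self.classes or cj not in self.classes:
            raise ValueError("both ends of a triple must be classes of the ontology")
        if r not in self.relations:
            raise ValueError(f"unknown relation: {r}")
        self.triples.add((ci, r, cj))


# Illustrative fragment only; the subsumption link is an assumption.
onto = Ontology(classes={"Person", "Prof", "Full_Prof"}, relations={"subsumption"})
onto.add_triple("Full_Prof", "subsumption", "Prof")

# The correspondences of Table 1, all with the equivalence relationship.
mapping_ab = [
    Correspondence("Person", "People", "equiv", 0.33),
    Correspondence("Faculty", "Assistant prof", "equiv", 0.05),
    Correspondence("Secretary", "Researcher", "equiv", 0.3),
    Correspondence("Prof", "Professor", "equiv", 0.48),
    Correspondence("Full_Prof", "Full Prof", "equiv", 1.0),
]
```

Keeping the confidence check inside the correspondence itself means that a malformed mapping fails at construction time rather than during composition.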
[Figure 1: class hierarchies of the three ontologies A, B, and C, with similarity values on the mapping arrows between them.]
Fig. 1. An example of similarity-based ontology mapping and three ontology-based information systems on distributed environment.
Definition 3 (Confidence). A confidence value expresses the precision of the match between two ontology entities. It can be computed in several different ways. For example, a confidence value can be measured with a string matching function $Dist$:
$$CF_{\langle e, e' \rangle} = 1 - \frac{Dist(L(e), L(e'))}{\max(L(e), L(e'))}, \tag{3}$$
where $L$ is a function returning the label of an ontology entity.
There are several well-known ontology mapping tools (Dhamankar, Lee, Doan, Halevy, & Domingos, 2004; Ehrig & Sure, 2005; Maedche, Motik, Silva, & Volz, 2002; Noy & Musen, 2000). In this paper, we choose the similarity-based ontology mapping approach (Euzenat & Valtchev, 2004). It defines similarities (e.g., $Sim_C$, $Sim_R$, $Sim_A$) between classes, relationships, attributes, and instances. It is based on the principle that the more similar the features of two entities are, the more similar the entities are. Given a pair of classes from two different ontologies, the similarity measure $Sim_C$ takes a value in [0,1]. The similarity $Sim_C$ between $c$ and $c'$ is defined as:
$$Sim_C(c, c') = \sum_{E \in N(C)} \pi^C_E \, MSim_Y(E(c), E(c')), \tag{4}$$
where $N(C) \subseteq \{E_1, \ldots, E_n\}$ is the set of all relationships in which classes participate (for instance, subclass, instances, or attributes). The weights $\pi^C_E$ are normalized (i.e., $\sum_{E \in N(C)} \pi^C_E = 1$).
If we restrict ourselves to class labels ($L$) and three relationships in $N(C)$, namely the superclass ($E^{sup}$), the subclass ($E^{sub}$), and the sibling class ($E^{sib}$), Eq. (4) is rewritten as:
$$\begin{aligned} Sim_C(c, c') = {} & \pi^C_L \, sim_L(L(c), L(c')) + \pi^C_{sup} \, MSim_C(E^{sup}(c), E^{sup}(c')) \\ & + \pi^C_{sub} \, MSim_C(E^{sub}(c), E^{sub}(c')) + \pi^C_{sib} \, MSim_C(E^{sib}(c), E^{sib}(c')), \end{aligned} \tag{5}$$
where the set functions $MSim_C$ compute the similarity of two entity collections.
In fact, a distance between two sets of classes can be established by finding a maximal matching that maximizes the summed similarity between the classes:
$$MSim_C(S, S') = \frac{\max\left(\sum_{\langle c, c' \rangle \in Pairing(S, S')} Sim_C(c, c')\right)}{\max(|S|, |S'|)}, \tag{6}$$
in which $Pairing$ provides a matching of the two sets of classes. Methods such as the Hungarian method make it possible to find directly the pairing that maximizes similarity. The OLA algorithm is an iterative algorithm that computes this similarity (Euzenat & Valtchev, 2004). This measure is normalized because, if $Sim_C$ is normalized, the divisor is always greater than or equal to the dividend.
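Eqs. (3) and (6) can be illustrated together with a short sketch. It rests on several assumptions that are not stated in the text: $Dist$ is taken to be the Levenshtein edit distance, the $\max$ in Eq. (3) is taken over label lengths, and the optimal pairing is found by brute-force enumeration rather than by the Hungarian method, which is only practical for very small sets. The function names are hypothetical, and this is not the OLA implementation.

```python
from itertools import permutations


def levenshtein(a, b):
    """Edit distance between two strings (assumed choice for Dist)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def label_confidence(label_a, label_b):
    """Eq. (3), normalizing by the longer label length (an assumption)."""
    longest = max(len(label_a), len(label_b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(label_a, label_b) / longest


def msim(s, s_prime, sim):
    """Eq. (6): best one-to-one pairing, divided by the larger set size."""
    s, s_prime = list(s), list(s_prime)
    if not s or not s_prime:
        return 0.0
    best = 0.0
    if len(s) <= len(s_prime):
        # Try every injective assignment of s into s_prime.
        for perm in permutations(s_prime, len(s)):
            best = max(best, sum(sim(c, d) for c, d in zip(s, perm)))
    else:
        for perm in permutations(s, len(s_prime)):
            best = max(best, sum(sim(c, d) for c, d in zip(perm, s_prime)))
    return best / max(len(s), len(s_prime))


score = msim(["Full Prof", "Professor"],
             ["Professor", "Full Prof", "Researcher"],
             label_confidence)
```

In this toy call both classes of the smaller set find an identical label in the larger set, so the summed similarity is 2 and the result is 2/3. For realistic ontology sizes an assignment solver would replace the enumeration.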
A normalized similarity measure can be turned into a distance measure by taking its complement to 1, $E^{dist}_C(x, y) = 1 - Sim_C(x, y)$. Such a distance introduces a new relation $E^{dist}_C$ in the concept network $C$.
As a simple example, in Fig. 1 (from Jung, 2010b), once two ontologies $O_A$ and $O_B$ are mapped, we obtain the mapping results (indicated as blue arrows) listed in Table 1.
Finally, given a number of ontology-based systems, we can formulate a distributed ontology-based information system as follows.
Definition 4 (Distributed ontology-based information system). A distributed ontology-based information system $G$ consists of $N_G$ ontology-based information systems $\{S_1, \ldots, S_{N_G}\}$. Some of the information systems are interlinked with each other. A linkage between $S_i$ and $S_j$ means that mapping information $M(O_i, O_j)$ exists between the corresponding ontologies. Thus, a distributed ontology-based information system $G$ is represented as:
$$G = \{M(O_i, O_j) \mid T_G(S_i, S_j)\}, \tag{7}$$
where the function $T_G$ returns the topological feature indicating whether $S_i$ and $S_j$ are linked or not.

Table 1. Mapping results between ontologies $O_A$ and $O_B$.

| $e$ | $e'$ | $R$ | $CF$ |
|---|---|---|---|
| Person | People | ≡ | 0.33 |
| Faculty | Assistant prof | ≡ | 0.05 |
| Secretary | Researcher | ≡ | 0.3 |
| Prof | Professor | ≡ | 0.48 |
| Full_Prof | Full Prof | ≡ | 1.0 |

Fig. 2. Mapping composition with semantic coverage; $S_A = \{c_1, \ldots, c_8\}$, $S_B = \{c_9, \ldots, c_{12}\}$, and $S_C = \{c_{13}, \ldots, c_{18}\}$. Two sets of query-activated classes $C_Q(q_1)$ = {c10 = 'Secretary', c11 = 'Full_Prof'} and $C_Q(q_2)$ = {c3 = 'Researcher', c7 = 'Full Prof'}.

Table 2. Example on query transformation.

| Step | Query | Mapping | Query' |
|---|---|---|---|
| 1st | c10 ∧ c11 | ⟨c10, c3, ≡, 0.45⟩, ⟨c11, c7, ≡, 1.0⟩ | c3 ∧ c7 |
| 2nd | c3 ∧ c7 | ⟨c7, c13, ≡, 0.5⟩, ⟨c6, c16, ≡, 0.33⟩ | c13 |

Example 1. Suppose that a distributed ontology-based information system $G$ is constructed as shown in Fig. 1. There are only two links, which represent mapping results, between $S_A$ and $S_B$ and between $S_B$ and $S_C$.
$$G = \left\{ M(O_A, O_B),\ M(O_B, O_C) \;\middle|\; T_G = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} \right\}. \tag{8}$$
Queries can be sent for information sharing between $S_A$ and $S_B$. By referring to the direct mapping $M(O_A, O_B)$, a query q = 'Secretary', which is not understandable in $S_B$, can be rewritten into 'Researcher'.
However, it is still difficult for $S_A$ and $S_C$ to be interoperable, because there is no direct mapping between them.
3. Mapping composition for query transformation
In this work, we want to estimate an indirect mapping between information systems for which no ontology mapping result exists. To do so, we focus on reusing the existing mapping results and properly composing them along a path from the source information system to the destination. For example, even though there is no direct mapping between $S_A$ and $S_C$ (i.e., a query from system $S_A$ is not understandable in $S_C$), we can compose the two mapping results $M(O_A, O_B)$ and $M(O_B, O_C)$.
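Composing two mapping results amounts to joining them on the shared middle entity. The sketch below is an illustration under assumptions: correspondences are reduced to `(source, target, confidence)` triples with the relationship fixed to equivalence, and the confidence of a composed correspondence is taken as the product of the two confidences, a rule the text does not specify. The function name `compose` is hypothetical, and the sample data are the correspondences of Table 2.

```python
def compose(m_first, m_second):
    """Join two mappings on their shared middle entity.

    Each mapping is a list of (source, target, confidence) triples.
    Multiplying confidences is an assumed combination rule.
    """
    best = {}
    for src, mid, cf1 in m_first:
        for mid2, dst, cf2 in m_second:
            if mid != mid2:
                continue
            cf = cf1 * cf2
            key = (src, dst)
            # Keep the strongest chain when several lead to the same pair.
            if key not in best or cf > best[key]:
                best[key] = cf
    return [(src, dst, cf) for (src, dst), cf in sorted(best.items())]


# Correspondences taken from Table 2 (first and second step).
step1 = [("c10", "c3", 0.45), ("c11", "c7", 1.0)]
step2 = [("c7", "c13", 0.5), ("c6", "c16", 0.33)]

indirect = compose(step1, step2)
```

Only `c11` survives the composition, because `c3` has no correspondence in the second mapping; this is the kind of information loss that accumulates along longer composition chains.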
Definition 5 (Indirect mapping). An indirect mapping $\widetilde{M}$ in a distributed ontology-based information system $G$ is represented as:
$$\widetilde{M}(S_{Src}, S_{Dest}) = \Sigma_{S_{Src}}^{S_{Dest}}(M(O_i, O_j)), \tag{9}$$
where $S_{Src}$ and $S_{Dest}$ are the source and destination information systems in $G$. Here, we can find out whether there is a path between them by repeatedly multiplying the topology matrix $T_G$.
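The path test by repeated multiplication of $T_G$ can be sketched as follows. This is a minimal illustration, assuming a Boolean matrix product and plain nested lists; the function names are hypothetical. The example matrix is the one printed in Eq. (8), and the indices 0, 1, 2 simply follow its row order.

```python
def bool_matmul(x, y):
    """Boolean product of two square 0/1 matrices."""
    n = len(x)
    return [[int(any(x[i][k] and y[k][j] for k in range(n)))
             for j in range(n)]
            for i in range(n)]


def min_hops(topology, src, dest):
    """Smallest number of mappings to compose from src to dest.

    Returns None when no path of fewer than n hops exists.
    """
    n = len(topology)
    power = [row[:] for row in topology]
    for hops in range(1, n):
        if power[src][dest]:
            return hops
        power = bool_matmul(power, topology)
    return None


# Topology matrix as printed in Eq. (8).
t_g = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]

direct = min_hops(t_g, 0, 1)    # linked directly
indirect = min_hops(t_g, 1, 2)  # reachable through the first system
```

A result of 1 means a direct mapping exists, while a result of 2 or more gives the number of mapping results that have to be composed.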
Now we show a query transformation that uses the indirect mapping between ontologies in a distributed environment.
3.1. Semantic query transformation
A query from a source system can be transformed, by referring to the composed mapping results, so that it becomes understandable to the destination system. To do so, we have to identify the set of query-activated classes $C_Q$.
Definition 6 (Query). A query from an ontology-based information system $S_A$ is simply represented as:
$$q ::= c \mid \neg q \mid q \wedge q' \mid q \vee q', \tag{10}$$
where $c \in C_A$.
1 SparQL, http://www.w3.org/TR/rdf-sparql-query/
Definition 7 (Query-activated class [Jung, 2010b]). Given a query traveling to an ontology-based information system $S_k$, the set of query-activated classes $C_Q(q)$ can be extracted as:
$$C_Q(q) = \{c \mid c \in q,\ c \in C_A\}. \tag{11}$$
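The grammar of Eq. (10) and the extraction of Eq. (11) can be sketched with a small syntax tree. The node classes `Cls`, `Not`, `And`, and `Or`, the function `activated_classes`, and the example query are hypothetical names and data chosen for illustration; the local class set reuses the $O_A$ entities of Table 1.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Cls:
    name: str


@dataclass(frozen=True)
class Not:
    operand: "Query"


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:
    left: "Query"
    right: "Query"


Query = Union[Cls, Not, And, Or]


def activated_classes(q, local_classes):
    """C_Q(q): the classes mentioned in q that belong to the local ontology."""
    if isinstance(q, Cls):
        return {q.name} & set(local_classes)
    if isinstance(q, Not):
        return activated_classes(q.operand, local_classes)
    # And / Or: union of both branches.
    return (activated_classes(q.left, local_classes)
            | activated_classes(q.right, local_classes))


c_a = {"Person", "Faculty", "Secretary", "Prof", "Full_Prof"}
q = And(Cls("Secretary"), Or(Cls("Full_Prof"), Not(Cls("Course"))))
activated = activated_classes(q, c_a)
```

Here 'Course' is dropped because it is not a class of the local ontology, so only 'Secretary' and 'Full_Prof' are activated.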
For example, suppose that the following query $q_1$, written in SPARQL (see footnote 1), is sent from $S_A$ to $S_C$ in Fig. 1.
    PREFIX abc: <http://intelligent.pe.kr/TestOntology#>
    SELECT ?Secretary ?Full_Prof
    WHERE {
      ?Full_Prof abc:Teach abc:Course .
      ?Secretary abc:Assist abc:Prof .
    }
Thus, as shown in Fig. 2 (from Jung, 2010b), we can extract the set of query-activated classes $C_Q(q_1)$ = {c10 = 'Secretary', c11 = 'Full_Prof'}, whereas 'Teach' and 'Course' are not included. Consequently, the query $q = c_{10} \wedge c_{11}$ can be transformed in two steps, as shown in Table 2.
Here, we can see information loss caused by mismatching during the second step (i.e., $c_3$). Minimizing such error propagation is important for discovering the optimal state, and it will be discussed later.
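The two steps of Table 2 can be traced with a short sketch. It assumes a purely conjunctive query given as a list of class identifiers and keeps, for each class, the correspondence with the highest confidence; classes without a correspondence are reported as lost. The function name `transform` is hypothetical.

```python
def transform(query_classes, mapping):
    """Rewrite a conjunctive query through one mapping.

    mapping is a list of (source, target, confidence) triples.
    Returns the rewritten classes and the classes that could not be mapped.
    """
    best = {}
    for src, tgt, cf in mapping:
        if src not in best or cf > best[src][1]:
            best[src] = (tgt, cf)
    rewritten, lost = [], []
    for c in query_classes:
        if c in best:
            rewritten.append(best[c][0])
        else:
            lost.append(c)
    return rewritten, lost


# Correspondences of Table 2.
step1 = [("c10", "c3", 0.45), ("c11", "c7", 1.0)]
step2 = [("c7", "c13", 0.5), ("c6", "c16", 0.33)]

after_first, lost_first = transform(["c10", "c11"], step1)
after_second, lost_second = transform(after_first, step2)
```

After the first step the query is `c3 ∧ c7` with nothing lost; after the second step only `c13` remains and `c3` is reported as lost, which matches the last column of Table 2.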
3.2. Semantic coverage
More importantly, we have to take more general cases into account. As shown in Fig. 3, when there is a large number of ontology-based information systems, the network of the distributed ontology-based information system can be more complex. This means that there can be more than one path between two arbitrary information systems (i.e., source and destination). The shortest path is not always the best choice; the choice also depends on the ontology mapping condition and the semantics of the given query. For example, if a query should travel from A to E, we have to decide the best composition path $\widetilde{M}(A, E)$ out of the following candidates:
• $\Sigma(M(A, C), M(C, E))$;
• $\Sigma(M(A, D), M(D, F), M(F, E))$;
• and more.
Thus, we need to decide which path for composing mapping results is better than the others. In this paper, a heuristic approach is exploited, and we want to empirically justify these heuristics. We introduce a semantic coverage ratio for representing two different heuristics (H1 and H2).

Fig. 3. A general case with a large number of information systems (A to G).

Table 3. Measuring semantic coverage ratio by two heuristics.

| Heuristics | Path1 ($S_A \to S_B$) | Path2 ($S_A \to S_C$) |
|---|---|---|
| H1 | $\tau^{H1}_{q_2} = \frac{2}{2} = 1$ | $\tau^{H1}_{q_2} = \frac{0.45 + 1}{2} = 0.725$ |
| H2 | $\tau^{H2}_{q_2} = \frac{1}{2} = 0.5$ | $\tau^{H2}_{q_2} = \frac{0.5}{1} = 0.5$ |

Definition 8 (Semantic coverage ratio). A semantic coverage ratio $\tau_Q$ is the matching ratio between the size of a correspondence set and a given set of query-activated classes. It can be defined by the following two heuristics:
• H1: The more correspondences are mapped with query-activated classes, the higher the semantic coverage ratio:
$$\tau^{H1}_Q(S_{Src}, S_{Dest}) = \frac{|\{c \mid c \in e_{Src},\ M(O_{Src}, O_{Dest})\}|}{|C_Q|}. \tag{12}$$
• H2: The higher the confidence values of the correspondences, the higher the semantic coverage ratio:
$$\tau^{H2}_Q(S_{Src}, S_{Dest}) = \frac{\sum_{e_k \in C_Q} CF_k}{|\{e_k \in C_Q\}|}. \tag{13}$$

Table 4. Recall and precision of mapping composition. $M(O_A, O_B)$ is simply written as M_AB.

| Direct mapping (M) | Indirect mapping | Recall R | Precision P |
|---|---|---|---|
| M_AB | M_AC ∘ M_AB | 0.76 | 0.65 |
| | M_AD ∘ M_DC ∘ M_CB | 0.73 | 0.62 |
| | M_AE ∘ M_ED ∘ M_DC ∘ M_CB | 0.66 | 0.6 |
| | M_AF ∘ M_FE ∘ M_ED ∘ M_DC ∘ M_CB | 0.53 | 0.57 |
| | M_AG ∘ M_GF ∘ M_FE ∘ M_ED ∘ M_DC ∘ M_CB | 0.51 | 0.56 |
| M_BC | M_BD ∘ M_DC | 0.86 | 0.74 |
| | M_BE ∘ M_ED ∘ M_DC | 0.74 | 0.72 |
| | M_BF ∘ M_FE ∘ M_ED ∘ M_DC | 0.72 | 0.65 |
| | M_BG ∘ M_GF ∘ M_FE ∘ M_ED ∘ M_DC | 0.69 | 0.64 |
| | M_BA ∘ M_AG ∘ M_GF ∘ M_FE ∘ M_ED ∘ M_DC | 0.66 | 0.62 |
| M_CD | M_CE ∘ M_ED | 0.73 | 0.66 |
| | M_CF ∘ M_FE ∘ M_ED | 0.67 | 0.63 |
| | M_CG ∘ M_GF ∘ M_FE ∘ M_ED | 0.66 | 0.56 |
| | M_CA ∘ M_AG ∘ M_GF ∘ M_FE ∘ M_ED | 0.54 | 0.52 |
| | M_CB ∘ M_BA ∘ M_AG ∘ M_GF ∘ M_FE ∘ M_ED | 0.51 | 0.52 |
| M_DE | M_DF ∘ M_FE | 0.64 | 0.72 |
| | M_DG ∘ M_GF ∘ M_FE | 0.63 | 0.73 |
| | M_DA ∘ M_AG ∘ M_GF ∘ M_FE | 0.58 | 0.63 |
| | M_DB ∘ M_BA ∘ M_AG ∘ M_GF ∘ M_FE | 0.52 | 0.55 |

As an example in Fig. 2, let a query $q_2$ be sent from $S_A$ through either $S_B$ or $S_C$. To decide the better mapping path, we can measure the semantic coverage ratios with the two heuristics H1 and H2, as shown in Table 3. The result suggests that the query should be transformed via $S_B$.
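The two heuristics of Definition 8 can be sketched as two small functions. The reading of the formulas is an assumption: H1 is taken as the share of query-activated classes that have at least one correspondence, and H2 as the mean confidence over the correspondences whose source is a query-activated class. The function names are hypothetical, and the sample data are the two mapping steps of Table 2.

```python
def coverage_h1(query_classes, mapping):
    """Share of query-activated classes that have a correspondence."""
    classes = set(query_classes)
    if not classes:
        return 0.0
    mapped = {src for src, _, _ in mapping}
    return len(classes & mapped) / len(classes)


def coverage_h2(query_classes, mapping):
    """Mean confidence of the correspondences that start at a query class."""
    classes = set(query_classes)
    cfs = [cf for src, _, cf in mapping if src in classes]
    return sum(cfs) / len(cfs) if cfs else 0.0


# Correspondences of Table 2, as (source, target, confidence) triples.
step1 = [("c10", "c3", 0.45), ("c11", "c7", 1.0)]
step2 = [("c7", "c13", 0.5), ("c6", "c16", 0.33)]

h1_first = coverage_h1(["c10", "c11"], step1)   # 2 of 2 classes mapped
h2_first = coverage_h2(["c10", "c11"], step1)   # (0.45 + 1.0) / 2
h1_second = coverage_h1(["c3", "c7"], step2)    # 1 of 2 classes mapped
h2_second = coverage_h2(["c3", "c7"], step2)    # 0.5 / 1
```

H1 only counts whether a class is covered, whereas H2 rewards strong correspondences even when few classes are covered, so the two heuristics can rank the same pair of paths differently.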
3.3. Transformation path selection
The best path for transforming a query is selected by serial aggregation of the semantic coverage ratio. Given two information systems $S_{Src}$ and $S_{Dest}$, the aggregated semantic coverage ratio is computed by:
$$\tau_Q(S_{Src}, S_{Dest}) = \max_{Path_k} \prod_{S_{Src}}^{S_{Dest}} \tau_Q(S_i, S_j), \tag{14}$$
where $Path_k$ is the set of all possible paths from $S_{Src}$ to $S_{Dest}$.
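Eq. (14) can be sketched as an exhaustive search over simple paths, multiplying the per-link coverage ratios along each path and keeping the maximum. The per-link values below are hypothetical numbers chosen only to illustrate the two candidate paths from A to E mentioned in Section 3.2; they are not measurements from the experiments, and the function name `best_path` is an assumption.

```python
def best_path(coverage, src, dest):
    """Return (score, path) maximizing the product of link coverage ratios.

    coverage maps a directed link (S_i, S_j) to its coverage ratio.
    Paths whose product is zero are ignored.
    """
    neighbours = {}
    for a, b in coverage:
        neighbours.setdefault(a, []).append(b)

    best = (0.0, None)

    def walk(node, visited, score):
        nonlocal best
        if node == dest:
            if score > best[0]:
                best = (score, list(visited))
            return
        for nxt in neighbours.get(node, []):
            if nxt not in visited:  # simple paths only
                visited.append(nxt)
                walk(nxt, visited, score * coverage[(node, nxt)])
                visited.pop()

    walk(src, [src], 1.0)
    return best


# Hypothetical coverage ratios for the links of two candidate paths.
links = {
    ("A", "C"): 0.5, ("C", "E"): 0.5,
    ("A", "D"): 0.9, ("D", "F"): 0.8, ("F", "E"): 0.5,
}

score, path = best_path(links, "A", "E")
```

With these illustrative values the three-hop path through D and F scores about 0.36 and beats the two-hop path through C at 0.25, which shows why the shortest path is not always selected. Enumerating all simple paths is exponential in general, so this form only suits small topologies.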
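Table 4 (continued).

| Direct mapping (M) | Indirect mapping | Recall R | Precision P |
|---|---|---|---|
| M_DE | M_DC ∘ M_CB ∘ M_BA ∘ M_AG ∘ M_GF ∘ M_FE | 0.45 | 0.48 |
| M_EF | M_EG ∘ M_GF | 0.79 | 0.75 |
| | M_EA ∘ M_AG ∘ M_GF | 0.75 | 0.72 |
| | M_EB ∘ M_BA ∘ M_AG ∘ M_GF | 0.74 | 0.71 |
| | M_EC ∘ M_CB ∘ M_BA ∘ M_AG ∘ M_GF | 0.69 | 0.7 |
| | M_ED ∘ M_DC ∘ M_CB ∘ M_BA ∘ M_AG ∘ M_GF | 0.68 | 0.66 |
| M_FG | M_FA ∘ M_AG | 0.77 | 0.75 |
| | M_FB ∘ M_BA ∘ M_AG | 0.72 | 0.7 |
| | M_FC ∘ M_CB ∘ M_BA ∘ M_AG | 0.68 | 0.62 |
| | M_FD ∘ M_DC ∘ M_CB ∘ M_BA ∘ M_AG | 0.67 | 0.58 |
| | M_FE ∘ M_ED ∘ M_DC ∘ M_CB ∘ M_BA ∘ M_AG | 0.63 | 0.52 |
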
4. Experimental results and discussion
In order to evaluate the proposed distributed ontology-based information system, we built seven ontology-based information systems (i.e., $S_A$ to $S_G$) with linkages, as shown in Fig. 3. All of the mapping results have been collected by human experts.
We focused on two evaluation issues (i.e., mapping composition and transformation path selection) and collected experimental results.
4.1. Evaluation on mapping composition
By using the OLA API (Euzenat, 2004), we automatically collected the direct mapping results (i.e., $M$). The mapping results were then composed in all possible cases (i.e., $\widetilde{M}$). The performance of mapping composition was tested by precision and recall:
$$Precision = \frac{|M \cap \widetilde{M}|}{|\widetilde{M}|} \quad \text{and} \quad Recall = \frac{|M \cap \widetilde{M}|}{|M|}. \tag{15}$$
The results of three cases (i.e., $M_{AB}$, $M_{BC}$, and $M_{CD}$) are shown in Table 4. On average, we obtained relatively good results (73% recall and 79% precision). We note that, in all cases, recall and precision naturally decrease as more mapping results are composed (i.e., as the number of mapping compositions increases). This is the information loss caused by the mismatching problem of ontology mapping algorithms.
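Eq. (15) compares a composed mapping with the direct mapping as two sets of correspondences. The sketch below assumes that a correspondence is identified by its pair of entities only, ignoring relationship and confidence; the function name and the entity pairs are hypothetical and serve purely as an illustration.

```python
def precision_recall(direct, composed):
    """Precision and recall of a composed mapping against the direct one.

    Both arguments are collections of (source, target) entity pairs.
    """
    direct, composed = set(direct), set(composed)
    common = direct & composed
    precision = len(common) / len(composed) if composed else 0.0
    recall = len(common) / len(direct) if direct else 0.0
    return precision, recall


# Hypothetical entity pairs, not taken from the experiments.
direct_pairs = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
composed_pairs = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}

precision, recall = precision_recall(direct_pairs, composed_pairs)
```

In this toy case two of the three composed correspondences are correct (precision 2/3) and two of the four direct correspondences are recovered (recall 1/2).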
4.2. Evaluation on transformation path selection
For the second issue, we tested the performance of the transformation path selection resulting from the two heuristics (i.e., H1 and H2) by inviting real users. The link topology of the distributed ontology-based information system was built in a simple manner. Thirty users were asked to generate 10 queries in SPARQL to search for certain information. These queries could be sent only to three systems, $S_A$, $S_D$, and $S_G$, so that multiple paths along the linkages could be considered.
Table 5 shows the performance of transformation path selection for two users. On average over all the invited users, heuristic H1 showed 65.3% recall and 74.2% precision, while H2 showed 59.5% recall and 68.3% precision. Hence, we found that H1 outperforms H2 by about 12.4%.
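Table 5. Performance of transformation path selection for two users.

| Heuristics | Users | Information systems | Recall | Precision |
|---|---|---|---|---|
| H1 | U1 | B | 0.78 | 0.67 |
| | | C | 0.68 | 0.63 |
| | | E | 0.72 | 0.65 |
| | | F | 0.63 | 0.73 |
| | U2 | B | 0.67 | 0.57 |
| | | C | 0.69 | 0.67 |
| | | E | 0.73 | 0.75 |
| | | F | 0.79 | 0.82 |
| H2 | U1 | B | 0.67 | 0.64 |
| | | C | 0.74 | 0.37 |
| | | E | 0.83 | 0.7 |
| | | F | 0.73 | 0.62 |
| | U2 | B | 0.63 | 0.63 |
| | | C | 0.75 | 0.77 |
| | | E | 0.67 | 0.48 |
| | | F | 0.47 | 0.52 |
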
5. Concluding remarks
As a large number of ontology-based information systems are becoming involved in a global network, we have to establish an efficient interoperability platform to semantically understand the resources from remote and heterogeneous systems. More importantly, such a system depends on the scalability of the ontology mapping process. In this paper, we have proposed a query transformation application for such an ontology-based distributed environment.
Acknowledgements
This work was supported by the Korean Science and Engineering Foundation (KOSEF) grant funded by the Korean government (MEST) (2009-0066751).
References
Dhamankar, Robin, Lee, Yoonkyong, Doan, AnHai, Halevy, Alon, & Domingos, Pedro (2004). Imap: Discovering complex semantic matches between database schemas. In Gerhard Weikum, Arnd Christian König, & Stefan Deßloch (Eds.), Proceedings of the ACM SIGMOD international conference on management of data (pp. 383–394). Paris, France: ACM.
Ehrig, Marc, & Sure, York (2005). Foam – Framework for ontology alignment and mapping – Results of the ontology alignment evaluation initiative. In Benjamin Ashpole, Marc Ehrig, Jérôme Euzenat, & Heiner Stuckenschmidt (Eds.), Proceedings of the K-CAP 2005 workshop on integrating ontologies, Banff, Canada. CEUR workshop proceedings (Vol. 156). CEUR-WS.org.
Euzenat, Jérôme (2004). An API for ontology alignment. In Sheila A. McIlraith, Dimitris Plexousakis, & Frank van Harmelen (Eds.), Proceedings of the third international semantic web conference. Lecture notes in computer science (Vol. 3298, pp. 698–712). Springer.
Euzenat, Jérôme, & Shvaiko, Pavel (2007). Ontology matching. Heidelberg, Germany: Springer.
Euzenat, Jérôme, & Valtchev, Petko (2004). Similarity-based ontology alignment in OWL-Lite. In Ramon López de Mántaras & Lorenza Saitta (Eds.), Proceedings of the 16th european conference on artificial intelligence (ECAI'2004) (pp. 333–337). Valencia, Spain: IOS Press.
Jung, Jason J. (2007). Ontological framework based on contextual mediation for collaborative information retrieval. Information Retrieval, 10(1), 85–109.
Jung, Jason J. (2010a). An empirical study on optimizing query transformation on semantic peer-to-peer networks. Journal of Intelligent & Fuzzy Systems, 21(3), 187–195.
Jung, Jason J. (2010b). Reusing ontology mappings for query routing in semantic peer-to-peer environment. Information Sciences. doi:10.1016/j.ins.2010.04.018.
Jung, Jason J., Lee, Hojin, & Choi, Kwang Sun (2009). Contextualized recommendation based on reality mining from mobile subscribers. Cybernetics and Systems, 40(2), 160–175.
Maedche, Alexander, Motik, Boris, Silva, Nuno, & Volz, Raphael (2002). Mafra – A mapping framework for distributed ontologies in the semantic web. In Asunción Gómez-Pérez & V. Richard Benjamins (Eds.), Proceedings of the 13th international conference on knowledge engineering and knowledge management (EKAW 2002). Lecture notes in computer science (Vol. 2473, pp. 235–250). Siguenza, Spain: Springer.
Morbach, Jan, Yang, Aidong, & Marquardt, Wolfgang (2007). Ontocape – A large-scale ontology for chemical process engineering. Engineering Applications of Artificial Intelligence, 20(2), 147–161.
Noy, Natalya Fridman, & Musen, Mark A. (2000). Prompt: Algorithm and tool for automated ontology merging and alignment. In Proceedings of the 17th national conference on artificial intelligence and twelfth conference on innovative applications of artificial intelligence (pp. 450–455). Austin, Texas, USA: AAAI Press/The MIT Press.
Payne, Philip R. O., Mendonça, Eneida A., Johnson, Stephen B., & Starren, Justin B. (2007). Conceptual knowledge acquisition in biomedicine: A methodological review. Journal of Biomedical Informatics, 40(5), 582–602.
Shih, Wen-Chung, Yang, Chao-Tung, & Tseng, Shian-Shyong (2009). Ontology-based content organization and retrieval for scorm-compliant teaching materials in data grids. Future Generation Computer Systems, 25(6), 687–694.
Villanueva-Rosales, Natalia, & Dumontier, Michel (2008). yOWL: An ontology-driven knowledge base for yeast biologists. Journal of Biomedical Informatics, 41(5), 779–789.