

  • PHD PROGRAM IN COMPUTER SCIENCE (DOTTORATO DI RICERCA IN INFORMATICA), XIX CYCLE

    SCIENTIFIC DISCIPLINARY SECTOR INF/01 (COMPUTER SCIENCE)

    Emergent Communities for Semantic Collaboration in Multi-Knowledge Environments: Methods and Techniques

    PhD thesis by: Stefano Montanelli

    Advisor: Prof. Silvana Castano

    PhD Program Coordinator: Prof. Vincenzo Piuri

    Academic Year 2006/2007


  • et dormiat et exsurgat nocte et die, et semen germinet et increscat, dum nescit ille. [Mc 4,27]


  • Acknowledgements

    The first acknowledgment is dedicated to my advisor, Prof. Silvana Castano. Working under her supervision has been a great opportunity for my professional and scientific growth. I would also like to thank the referees, Prof. Steffen Staab and Prof. Paolo Tiberio, for their comments and attention. The methods and techniques presented in this thesis were developed within the following research projects.

    FIRB WEB MINDS

    FP6 INTEROP NoE

    MIUR PRIN ESTEEM

    The staff involved in these projects contributed considerably to improving this thesis with their invaluable work. Special recognition goes to the Balsi staff, namely Alfio Ferrara and Gianpaolo Racca, for the helpful discussions, the support, the friendship, and the fun. Finally, a kind thought is dedicated to the most important people in my life: Manuela, my family, and my friends. Thank you for everything.

    Bergamo, November 12th, 2006


  • Abstract

    The need for methods and techniques to foster semantic collaboration is a challenging issue at the current stage of development of P2P networks and open distributed systems in general. In this respect, the emergence of collaboration among peers requires the capability to share data and resources by dynamically selecting the most appropriate partners within the context of a given task. These scenarios are multi-knowledge, in that no centralized authority is defined to manage a comprehensive view of the resources shared by all the nodes in the system, due to the high dynamism and variability of collaboration and sharing requirements.

    With respect to this scenario, the thesis work has focused on investigating the peculiar aspects and requirements of knowledge sharing in open distributed systems, and in P2P systems in particular. In this context, ontologies and ontology matching techniques have been identified as a key solution for enabling peers with similar contents to gradually emerge and for allowing the network to self-configure by linking them as semantic neighbors. The main contribution of the thesis work is the definition of methods and techniques for enforcing semantic collaboration among autonomously emergent semantic neighbors. In particular, two main goals have been achieved. The first is the development of a matching-driven semantic routing mechanism, called H-Link, for scalable distribution of knowledge requests on a semantic basis and for effective sharing of distributed resources. The second is the definition of consensus-driven techniques which exploit ontological resource descriptions and ontology matching in order to form, maintain, and disband semantic communities of peers, with application to the Helios knowledge-sharing P2P system.


  • Contents

    1 Introduction
        1.1 Thesis contribution and outline
        1.2 Research projects involved

    2 P2P systems: the state of the art
        2.1 Classification of P2P systems
            2.1.1 Architectural classification
                2.1.1.1 Hybrid systems
                2.1.1.2 Pure systems
                2.1.1.3 SuperPeer systems
            2.1.2 Structural classification
                2.1.2.1 Structured systems
                2.1.2.2 Adaptive systems
                2.1.2.3 Non-adaptive systems
        2.2 P2P semantic routing techniques
            2.2.1 Query routing through good peers
            2.2.2 The REMINDIN query routing algorithm
            2.2.3 Socialized.Net and the Seers search protocol
            2.2.4 P2P semantic link networks
            2.2.5 The intelligent search mechanism
            2.2.6 Hierarchical semantic routing for Grid resource discovery
            2.2.7 Query routing through semantic mappings
            2.2.8 Other interesting approaches
        2.3 Peer community formation and consensus negotiation techniques
            2.3.1 Semantic overlay networks for P2P systems
            2.3.2 Formation of P2P communities through escalation
            2.3.3 Communities of peers by trust and reputation
            2.3.4 Peer selection in P2P networks with semantic topologies
            2.3.5 Other interesting approaches

    3 Towards emergent semantics in P2P systems: critical review of the state of the art
        3.1 Schema-based P2P networks
            3.1.1 Building blocks of schema-based P2P networks
            3.1.2 Open issues in schema-based P2P networks
        3.2 Knowledge-sharing P2P systems
        3.3 Emergent semantics issues in knowledge-sharing P2P systems
        3.4 Emergent semantics requirements
            3.4.1 Knowledge representation
            3.4.2 Matching techniques
            3.4.3 Query routing
            3.4.4 Community support
        3.5 Critical review of the state of the art
            3.5.1 Comparison on architectural and structural properties
            3.5.2 Comparison on emergent semantics requirements

    4 The H-Link semantic routing mechanism
        4.1 Main features of H-Link
            4.1.1 Motivating and running example
        4.2 Peer ontology definition
            4.2.1 The content knowledge layer
            4.2.2 The network knowledge layer
            4.2.3 Building the network knowledge
            4.2.4 Example
            4.2.5 Considerations on the computation of confidence values
        4.3 The H-Match semantic matchmaker
            4.3.1 The H-Match matching process
            4.3.2 Linguistic features
            4.3.3 Contextual features
            4.3.4 Basic matching functions of H-Match
            4.3.5 Property and semantic relation closeness function
            4.3.6 The H-Match matching models
            4.3.7 Matching policy
            4.3.8 Example
            4.3.9 Considerations on H-Match
        4.4 The H-Link routing mechanism
            4.4.1 H-Link invocat