Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes Christophe Debruyne and Cristian Vasquez Presented @ SWQD 2013, January 2013

Upload: christophe-debruyne

Post on 05-Jul-2015

DESCRIPTION

Debruyne, C. and Vasquez, C. (2013) Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes. In Proc. of Software Quality. Increasing Value in Software and Systems Development 2013 (SWQD 2013), LNBIP, Springer.

In IT, building ontologies to enable semantic interoperability is only one of the branches in which agreements between a heterogeneous group of stakeholders are of vital importance. As agreements are the result of interactions, appropriate methods should take into account the natural language used by the community. In this paper, we extend a method for reaching a consensus on a conceptualization within a community of stakeholders, exploiting the natural language communication between the stakeholders. We describe how agreements on informal and formal descriptions are complementary and how they interplay. To this end, we introduce, describe and motivate the nature of some of the agreements and the two distinct levels of commitment. We furthermore show how these commitments can be exploited to steer the agreement processes. Concepts introduced in this paper have been implemented in a tool for collaborative ontology engineering, called GOSPL, which can also be adopted for other purposes, e.g., the construction of a lexicon for larger software projects.

TRANSCRIPT

Page 1: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes Christophe Debruyne and Cristian Vasquez

Presented @ SWQD 2013, January 2013

Page 2: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Introduction

• Agreements within a heterogeneous group of stakeholders are vital for many domains in IT

• Contribution

• Presentation of a framework and method

• Formal representations grounded in natural language

• Informal representations

• “There is no entity without identity” (Quine) – Reference structures

• Proposal of a layered architecture for such agreements

• Nature of different agreements

• Exploitation of the layered approach

• Exploitation of the natural language aspect for retrieving information

• Presentation of the tools implementing proposed ideas

• Applied in Ontology Engineering, but fear not …

Page 3: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Introduction

• Ontologies

• Shared, formal specifications of a domain

• Key for semantic interoperability between autonomously developed information systems

• Constitutes a community

• The result of social interactions within a community leading to agreements

• Ontology-engineering

• Sets of guidelines and activities constituting a method for building such ontologies

Page 4: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Hybrid Ontology Engineering

• DOGMA Hybrid Ontology Descriptions <Ω, ci, K, G>

• Ω a lexon base, a finite set of plausible binary fact types called lexons, e.g., <Vendor Community, Offer, has, is of, Title>

• ci a function mapping community-identifiers and terms to concepts

• K a finite set of ontological commitments containing

• A selection of lexons

• A mapping from application symbols to ontology terms

• Predicates over those terms and roles to express constraints

• G is a glossary, a triple with components

• Gloss, a set of linguistic, human-interpretable glosses

• g1, mapping community-term pairs to glosses

• g2, mapping lexons to glosses

• Example: g1(⟨VCard Community, Email Address⟩) = “The address of an email, a system of world-wide electronic communication in which a user can compose a message at one terminal that can be regenerated at the recipient’s terminal when the recipient logs in”
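The components of a hybrid ontology description listed above can be sketched as plain data structures. This is only an illustrative sketch in Python, not the representation used by the authors or their tooling: the class names, the field names, the gloss text for Offer and the helper `terms_without_gloss` are all assumptions, and the commitments K are kept as an opaque list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lexon:
    """A plausible binary fact type: <community, head term, role, co-role, tail term>."""
    community: str
    head: str
    role: str
    co_role: str
    tail: str


@dataclass
class Glossary:
    """Glossary G: glosses for community-term pairs (g1) and for lexons (g2)."""
    g1: dict = field(default_factory=dict)  # (community, term) -> gloss
    g2: dict = field(default_factory=dict)  # Lexon -> gloss


@dataclass
class HybridOntology:
    """Sketch of <Omega, ci, K, G>; names are hypothetical."""
    lexons: set = field(default_factory=set)         # Omega, the lexon base
    ci: dict = field(default_factory=dict)           # (community, term) -> concept
    commitments: list = field(default_factory=list)  # K, left opaque in this sketch
    glossary: Glossary = field(default_factory=Glossary)

    def terms_without_gloss(self):
        """Return (community, term) pairs that occur in a lexon but have no gloss."""
        used = set()
        for lexon in self.lexons:
            used.add((lexon.community, lexon.head))
            used.add((lexon.community, lexon.tail))
        return sorted(used - set(self.glossary.g1))


onto = HybridOntology()
onto.lexons.add(Lexon("Vendor Community", "Offer", "has", "is of", "Title"))
# The gloss text below is made up for illustration.
onto.glossary.g1[("Vendor Community", "Offer")] = "A hypothetical gloss for Offer."
missing = onto.terms_without_gloss()
```

Here `missing` lists the terms of the example lexon that have not been articulated yet, which is the kind of gap the method tries to close.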

Page 5: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Hybrid Ontology Engineering

• Example of an application-commitment

• Ω-RIDL: Verheyden et al. (SWDB 2004), Trog et al. (RuleML 2007)

Page 6: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Hybrid Ontology Engineering

• Grounding ontologies with social processes & NL

• Hybrid Ontology Engineering Method

Page 7: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

(1) Nature of Agreements

• Gloss-equivalence:

• Two communities c1 and c2 consider that the glosses they used to describe their terms – t1 and t2 respectively – refer to the same concept: EQG(g1(c1,t1), g1(c2,t2))

• Synonymy:

• Two communities c1 and c2 consider that the labels they used in the formal descriptions (lexons) refer to the same concept: ci(c1,t1) ≣C ci(c2,t2)

• Gloss-equivalence and synonymy are equivalence relations only within one agreement process!

Page 8: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

(1) Nature of Agreements

• Why this distinction?

• Glossary-consistency principle: for every two community-term pairs: if the glosses of those terms were deemed to refer to the same concept (gloss-equivalence), then so should the term-labels (synonymy).

• Motivation 1: Separate processes for each type of agreement

• Synonymy requires terms already to be present in a lexon

• Motivation 2: Glossary-consistency principle used as a means for driving agreements

• Revalidation by the community (/communities)
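The glossary-consistency principle can be read as a simple check over the agreements recorded so far. The sketch below is an assumption about how such a check might look, not the tool's implementation: agreements are stored as unordered pairs of (community, term), the function name is hypothetical, and no transitive closure is computed because the slides state that both relations are equivalence relations only within one agreement process.

```python
def consistency_violations(gloss_equivalent, synonymous):
    """Return pairs agreed to be gloss-equivalent but not (yet) agreed to be synonymous.

    Both arguments are sets of frozensets, each holding two (community, term) pairs.
    """
    pending = gloss_equivalent - synonymous
    return sorted(tuple(sorted(pair)) for pair in pending)


# Hypothetical agreements between two communities.
artist_cd = ("Cultural Domain", "Artist")
artist_org = ("MyOrganization", "Artist")
work_cd = ("Cultural Domain", "Work Of Art")
work_org = ("MyOrganization", "Work Of Art")

gloss_eq = {frozenset({artist_cd, artist_org}), frozenset({work_cd, work_org})}
synonyms = {frozenset({artist_cd, artist_org})}

violations = consistency_violations(gloss_eq, synonyms)
```

Each entry in `violations` is a candidate for a new synonymy discussion, which is how the principle can drive further agreements.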

Page 9: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

(2) Layered Commitments

• Distinction between community-commitment and application-commitments

• Community-commitment: engagement by the community to comply with this set of fact types and knowledge

• Application-commitment: a selection of community-commitment + additional fact types and constraints for annotating data sources

Page 10: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

(3) Exploiting commitments

• Hybrid ontology easily translated into other formalisms

• E.g. OWL, UML, etc.

• Services set up with translation

• Natural language interface for annotated data via lexons

• LIST Artist NOT with Gender with Code = ‘M’

• SELECT DISTINCT ?a WHERE { ?a a myOnto0:Artist. OPTIONAL { ?g myOnto0:Gender_of_Artist ?a. ?g myOnto0:Gender_with_Code ?c. } FILTER(?c != "M" || !bound(?c)) }

• Object Role Modeling “like” subtype definitions
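To make the meaning of the example query concrete, the sketch below evaluates the same condition over a small in-memory list of triples in Python. The property names are taken from the SPARQL query on this slide; the data, the function name and the use of plain tuples instead of an RDF store are assumptions made for illustration.

```python
def list_artists_not_gender_code(triples, code):
    """Artists with no gender code at all, or with some gender code other than `code`.

    Mirrors OPTIONAL { ... } FILTER(?c != code || !bound(?c)) with DISTINCT on the artist.
    """
    artists = sorted(s for s, p, o in triples if p == "a" and o == "Artist")
    result = []
    for artist in artists:
        # Gender instances linked to this artist via Gender_of_Artist.
        genders = {s for s, p, o in triples if p == "Gender_of_Artist" and o == artist}
        # Codes of those gender instances via Gender_with_Code.
        codes = [o for s, p, o in triples if p == "Gender_with_Code" and s in genders]
        if not codes or any(c != code for c in codes):
            result.append(artist)
    return result


# Hypothetical data: a1 has code M, a2 has code F, a3 has no gender recorded.
triples = [
    ("a1", "a", "Artist"),
    ("a2", "a", "Artist"),
    ("a3", "a", "Artist"),
    ("g1", "Gender_of_Artist", "a1"),
    ("g1", "Gender_with_Code", "M"),
    ("g2", "Gender_of_Artist", "a2"),
    ("g2", "Gender_with_Code", "F"),
]

selected = list_artists_not_gender_code(triples, "M")
```

The artist without any recorded gender is selected as well, which corresponds to the `!bound(?c)` branch of the filter.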

Page 11: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Tool

Page 12: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Experiment

• Experiment in the cultural domain

• Within the context of a linked data project in Brussels (http://www.oscb.be/)

• Selection of terms (at the time of writing)

• Non-lexical

• At least four interactions involving this term

• Appearing in a lexon

• Terms were more likely to change in their formal description if the natural language definitions were not provided first

• Indeed, freedom was given to the users concerning this aspect

Page 13: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Experiment

• We noticed that terms used for attributes were less likely to be fully articulated

• Either the process of teaching the method needs to stress the importance of such alignment (e.g., encoding)

• Or the tool should encourage users to articulate all concepts

Page 14: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Conclusions

• Importance of agreements

• Extended a framework for hybrid ontology engineering

• (1) Describing the nature of agreements

• (2) Proposing a layered architecture

• (3) Exploitation of commitments

• Ideas were integrated in a tool

• Experiment

• Future work

• Encouraging users to fully follow the method

• Reasoning on the commitments

Page 15: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Thank you! Any questions?

Page 17: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Example of commitments

• Community-commitment

• A relational DB

Page 18: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

BEGIN SELECTION

['Cultural Domain']

<'MyOrganization', Work Of Art, with, of, WID>

END SELECTION

BEGIN CONSTRAINTS

LINK('Cultural Domain', Artist, 'MyOrganization', Artist).

LINK('Cultural Domain', Work Of Art, 'MyOrganization', Work Of Art).

EACH Artist with AT MOST 1 AID.

EACH Artist with AT LEAST 1 AID.

EACH AID of AT MOST 1 Artist.

EACH Work Of Art with AT MOST 1 WID.

EACH Work Of Art with AT LEAST 1 WID.

EACH WID of AT MOST 1 Work Of Art.

END CONSTRAINTS

BEGIN MAPPINGS

MAP 'Artist'.'name' ON Name of Artist.

MAP 'Artist'.'birthyear' ON Year of birth of Artist.

MAP 'Artist'.'id' ON AID of Artist.

MAP 'piece'.'name' ON Title of Work Of Art.

MAP 'piece'.'year' ON Year of Work Of Art.

MAP 'piece'.'id' ON WID of Work Of Art.

MAP 'artistpiece'.'a_id' ON AID of Artist contributed to Work Of Art.

MAP 'artistpiece'.'p_id' ON WID of Work Of Art with contributor Artist.

END MAPPINGS
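The three constraints on AID in the commitment above (each Artist has at least one and at most one AID, and each AID belongs to at most one Artist) amount to a mandatory, unique identifier. A minimal sketch of checking this against rows of the mapped 'Artist' table follows; it is written in Python under the assumption that rows are available as dictionaries, and the function name and sample rows are hypothetical. It is not how the commitment language itself is evaluated.

```python
from collections import Counter


def check_identifier(rows, column):
    """Report violations of a mandatory, unique identifier stored in `column`."""
    problems = []
    values = [row.get(column) for row in rows]
    # AT LEAST 1: every row needs a value for the identifier.
    if any(value is None for value in values):
        problems.append("AT LEAST 1 violated: a row has no identifier")
    # Identifier of AT MOST 1 entity: no value may be shared by several rows.
    counts = Counter(value for value in values if value is not None)
    shared = sorted(value for value, n in counts.items() if n > 1)
    if shared:
        problems.append(f"AT MOST 1 violated: identifiers shared by several rows: {shared}")
    return problems


# Hypothetical rows of the 'Artist' table; 'id' is mapped on AID of Artist.
bad_rows = [
    {"id": 1, "name": "A", "birthyear": 1900},
    {"id": 1, "name": "B", "birthyear": 1910},
    {"id": None, "name": "C", "birthyear": 1920},
]
good_rows = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]

bad_report = check_identifier(bad_rows, "id")
good_report = check_identifier(good_rows, "id")
```

The constraint that each Artist has at most one AID holds trivially here, since a row stores a single value in the 'id' column.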

Page 19: Exploiting Natural Language Definitions and (Legacy) Data for Facilitating Agreement Processes

Tool: Example of a “scenario”