Guide to a Repeatable Process for Ontology Creation (v 0.1) Draft Copy FOUO

Upload: rolf-flynn

Post on 31-Dec-2015



Point of Contact: Bill Mandrick, Ph.D.

MBO Partners, (585) 721-7599

[email protected]

Purpose

The purpose of this guide is to provide a standardized process for creating an accurate and consistent domain representation, also known as an ontology. An accurate and consistent ontology, aimed at the representation of some portion of reality, necessarily contains accurate and consistent semantics.

So what is an ontology used for? Most uses relate in some way to the problem of coping with the large amount of information being generated in a given field (e.g. Civil Information Management, Stability Operations, Logistics, Position Reporting, Contact Reporting, etc.). Another reason to create an ontology is to achieve a better understanding of some domain (e.g. Command and Control, Intelligence Preparation of the Operational Environment, Enemy Situation Reporting, etc.).

There are a number of ways that the use of a common set of ontologies, maintained by domain experts committed to the acceptance of tested best practices and vetted and maintained by a community of authorities in a well-documented governance process, can contribute to understanding and coping with prolific information. They include:

• Improved understanding of the domain itself — The use of ontologies to represent the types of entities and events in a domain greatly improves understanding for the IT development community.

• Improved reusability – A perspective-neutral approach, as contrasted with an application-centric approach, means that the ontologies are designed to be reusable by a large and varied community of users. Perspective-centric approaches to creating ontologies or models result in data silos (e.g. to a Targeting Officer everything in the operational environment is a “Target”).

• Improved Discoverability – The use of a repeatable process for ontology creation makes it possible for groups to more easily discover and understand the data assets of other groups, thereby reducing the number of redundant efforts and increasing the collaborative use, and thus the value, of data and software tools.

• Semantic Interoperability (Semantic Consistency) — The use of a repeatable process for ontology creation, along with an effective governance process, can bring about a network effect where the value of each ontology exponentially increases as more people use it to describe their respective data. A standard process for ontology creation is absolutely necessary for consistent semantics. Fortuitous interoperability (i.e. interoperability by way of good fortune or chance) is the best we can hope for when disparate communities employ their own idiosyncratic techniques for creating domain semantics and representation.

Defining Ontology

Ontologies are often mischaracterized as a type of Conceptual Data Model, when in fact they are intended to represent (i.e. model) portions of reality, not concepts or data about reality. Although the proposed process relies upon subject matter expertise in a given domain (e.g. Logistics, Operations, Tactics, Intelligence, Forensics, etc.), it does not result in an idiosyncratic (perspective-centric) product. Instead, the role of the Subject Matter Expert (SME) is to provide accurate (true) statements about the domain at hand, statements which are perspective-neutral.

Who practices ontology? Everyone practices ontology, especially when faced with a new and unfamiliar situation where a response is necessary. As we observe an unfolding situation, and try to make sense of it (i.e. orient ourselves to it), we naturally look for the relations that exist between the entities and events that make up that situation. In some cases we have to create a new lexicon (e.g. the Improvised Explosive Device Lexicon). In short, the practice of ontology is a form of sense-making, and is a natural human activity.

Ontology Defined

Good ontology and good modeling...can be advanced by the cultivation of a discipline that is devoted precisely to the representation of entities as they exist in reality...[1]

An ontology is a representation of some part of reality (e.g. medicine, social reality, physics, etc.). Smith states that: “Ontology is the science of what is, of the kinds and structures of objects, properties, events, processes and relations in every area of reality...Ontology seeks to provide a definitive and exhaustive classification of entities in all spheres of being.”[2]

Ontologies enable the formulation of robust and shareable descriptions of a given domain by providing a common controlled vocabulary for doctrine writers, IT Developers, and war-fighters alike, thereby allowing these disparate communities to communicate with each other. An ontology should be a shared resource between communities, and its continued collaborative development should support the integration of information and facilitate knowledge discovery.[3] These two goals are realized by ensuring wide dissemination of the ontology, so that it will be used by many stakeholders, and its terms will be correspondingly familiar and readily used for search.

[1] Barry Smith, “Beyond Concepts: Ontology as Reality Representation”, forthcoming in Achille Varzi and Laure Vieu (eds.), Proceedings of FOIS 2004: International Conference on Formal Ontology and Information Systems, Turin, 4-6 November 2004.

[2] Barry Smith, “Ontology”, preprint version of chapter in L. Floridi (ed.), Blackwell Guide to the Philosophy of Computing and Information, Oxford: Blackwell, 2003, 155–166. http://ontology.buffalo.edu/smith/articles/ontology_pic.pdf

[3] Judith Blake, “Bio-Ontologies—Fast and Furious”, Nature Biotechnology, volume 22, number 6, June 2004. http://www.nature.com/nbt/index.html

One challenge that faces many ontology development projects is that there is little guidance on how best to develop ontologies. In this guide, we sketch out a repeatable ontology modeling process that is designed to encapsulate ontology best practices and design patterns in order to improve the quality of ontology development efforts and transfer ontology development knowledge and skills to a broader base of modelers. The process is broken down into five major activities:

1) Scope the domain
2) Create initial lexicon
3) Create initial ontology
4) Verify and revise ontology
5) Publish ontology to potential users

The result of this process is a reality-centric Ontology developed using the Web Ontology Language (OWL)[4] that extends from a Common Upper Ontology such as BFO or UCore-SL.

In what follows, these activities will be broken down and explained so that the ontology developer(s) can proceed with confidence, even into a domain that may be unfamiliar to them. An accurate ontology not only provides superior semantics and an accurate representation of a given domain—it will also instill confidence in the developers and users, which results from truly understanding a domain.

[4] http://www.w3.org/TR/owl2-overview/

Perspective Neutrality & Realism

The Repeatable Process for Ontology Creation begins with a “Perspective Neutral” view of the domain being represented or modeled. The intent is to avoid an idiosyncratic (i.e. perspective-laden) approach to representing a domain. Different perspectives describe different portions of the same reality, and often end up creating stove-pipe ontologies that are semantically incompatible with other ontologies. For example, an infantry perspective may describe an armored infantry vehicle very differently from a logistics perspective or a targeting perspective: the logistics perspective may describe an armored vehicle as “cargo” while the targeting officer may describe the same armored vehicle as a “target”.

This happens because community-specific ontologies are often built bottom up with little thought to how others outside of their communities might reuse or understand their ontologies. In order to make community-specific ontologies interoperable with one another it is practical to use a perspective-neutral common upper level from which the community-specific ontologies extend. The upper level ontology provides the basis for a shared understanding across multiple communities and makes it possible to identify inconsistencies.

For example, a Targeting Officer maintains a targeting perspective, which results in the categorization of buildings, vehicles, and people as all being “targets”. Likewise, a logistics planner will maintain a logistics perspective, which results in the categorization of buildings as “facilities”, vehicles as “cargo”, and people as “passengers”. It is precisely the perspective-centric approach to creating data models and taxonomies that results in data silos. The repeatable process for ontology creation employs a number of ontological distinctions from the outset that are designed to overcome many of the limitations of perspective-centric approaches. For example, the proposed process fastidiously distinguishes between roles and types. “Building,” “vehicle” and “person” are types. “Target,” “cargo” and “passenger” are roles. Roles are context and time sensitive. In one context, a vehicle may be a target and in another context the same vehicle may be cargo. The way to represent these sorts of facts is to say that a vehicle is in the target role for some temporal period (see Figure 1 for a graphic representation which distinguishes between types and roles).

Types and Roles

[Diagram: the Type Person linked by has_role to each of five Roles: Civilian, Combatant, Key Leader, Insurgent, and Commander.]

Figure 1: It is important to distinguish between Types and Roles. Person is a Type and Key Leader is a Role that some person can be in. A Person can be in a variety of Roles, and in some cases a Person can be in several Roles at the same time.
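The type/role distinction can be sketched in code. The following is a minimal illustration, not part of any published ontology: the class names `Entity` and `RoleAssignment`, the `roles_at` helper, and all dates and entity names are assumptions made for this sketch. The type is fixed for the life of the entity, while each role holds only for some temporal period.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class RoleAssignment:
    """A role borne during a temporal period (hypothetical structure)."""
    role: str
    start: date
    end: Optional[date] = None  # None means the role is still held

    def holds_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


@dataclass
class Entity:
    """An entity with one fixed type and any number of time-bound roles."""
    name: str
    type_: str  # e.g. "Vehicle" or "Person"
    roles: List[RoleAssignment] = field(default_factory=list)

    def roles_at(self, day: date) -> List[str]:
        """Names of the roles this entity is in on the given day."""
        return sorted(r.role for r in self.roles if r.holds_on(day))


# Illustrative data only: the same vehicle is cargo in one period
# and a target in another, but it is a Vehicle throughout.
vehicle = Entity("Vehicle-1", "Vehicle")
vehicle.roles.append(RoleAssignment("Cargo", date(2010, 1, 1), date(2010, 1, 31)))
vehicle.roles.append(RoleAssignment("Target", date(2010, 3, 1), date(2010, 3, 2)))

# A person can be in several roles at the same time.
person = Entity("Person-1", "Person")
person.roles.append(RoleAssignment("Key Leader", date(2010, 1, 1)))
person.roles.append(RoleAssignment("Commander", date(2010, 2, 1)))
```

Keeping the type as a plain attribute and the roles as dated records means a targeting system and a logistics system can describe the same vehicle without disagreeing about what kind of thing it is.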

The Repeatable Process for Ontology Creation

[Figure: process diagram with three columns: Inputs, Activities, and Outputs.]

Inputs named in the figure: SME Update, DoD Directives, SME Guidance, User Requirements, Doctrinal Descriptions, Doctrinal Definitions, SME Feedback, Authoritative Descriptions, Doctrinal Models, IERs.

Outputs named in the figure: Domain Definition, Domain Description, Initial List of Domain Terms, Statement of Metrics, Iterative List of Terms, List of Relations, Domain Lexicon, Revised/Versioned OWL Files, Revised Final Briefings, Semantic Conformance Testing Summary, Revised Domain Lexicon, Domain Repository, Lexicon, Lessons Learned, Finalized OWL Files, Final Briefings, Change Request Process, OWL File, Relations Schematics.


1. Scope the Domain

Scoping a specific domain is essentially a defining and boundary setting process. The scoping activity encloses the domain within boundaries to determine which entities and events should be included. This is done by addressing very basic questions about the domain at hand. Ten example questions include:

• What is the baseline description or definition of this domain?
• What entities make up the domain?
• What properties do they have?
• What are the baseline definitions for these entities?
• What events do they participate in?
• What are the baseline definitions for these events?
• What outcomes are there in this domain?
• What are the relations between entities?
• What are the relations between events?
• What are the relations between entities and events?

This activity will result in a Domain Questionnaire that can be presented to the Subject Matter Experts (SME’s) for the domain. The answers to these questions will result in a baseline document as part of the Domain Scope.

1.1 SME Interaction

Subject Matter Experts (SME’s) are critical in the Scoping, Verification, and Revisions activities in the Repeatable Process for Ontology Creation.

In order to properly scope a domain it is important to interact with the Subject Matter Experts. For example, to properly scope out a Disease Domain the developer would need to consult with an epidemiologist or some other expert in that domain. Another example would be consulting a Command and Control (C2) doctrine writer for the C2 Domain. SMEs can answer the baseline questions pertaining to the domain at hand, making them an indispensable asset for the Scoping Activity.

Later in the Repeatable Process the SME’s will play an important role in the Verification and Revisions activities.

1.2 Identify Authoritative References (Doctrine)

It is also important to refer to authoritative references (doctrine) in the Scoping activity. The authoritative references and doctrine are the written expression of subject matter expertise and serve as the primary source for the ontology’s content. Therefore, it is a best practice to refer to the SME descriptions of the domain at hand and start to compile the source materials in a repository. Below is an example of a collection of authoritative references used in the creation of a Joint Operations Ontology.

Merriam-Webster’s Collegiate Dictionary

Joint Publication 1-02 DoD Dictionary of Military and Related Terms

Joint Publication 3-0 Joint Operations

Joint Publication 3-13 Joint Command and Control

Joint Publication 3-24 Counterinsurgency

Joint Publication 3-57 Civil-Military Operations

Joint Publication 3-10 Joint Security Operations in Theater

Joint Publication 3-16 Multinational Operations

Joint Publication 5-0 Joint Operations Planning

1.3 Survey Authoritative References (Doctrine)

Because Subject Matter Expertise is expressed in various authoritative references and doctrine, it is important to conduct a thorough survey of these documents. The Repeatable Process for Ontology Creation does not require the developer to become a Subject Matter Expert. However, the developer must have a good understanding of the composition of the domain, which is gained by SME interaction and doctrinal investigation.

1.4 Create or Identify Domain Definition(s)

This activity starts by identifying the most basic entities and events for the domain at hand. For example, the Joint Operations Planning Domain would start with the doctrinal definition for Joint Operation Planning, which is defined as:

Planning activities associated with joint military operations by combatant commanders and their subordinate joint force commanders in response to contingencies and crises. Joint operation planning includes planning for the mobilization, deployment, employment, sustainment, redeployment, and demobilization of joint forces. (Joint Publication 3-0 Joint Operations)

This definition then needs to be decomposed into its constituent elements, which results in a rapidly expanding Joint Operation Planning Lexicon (see the decomposition below).

The above definition for Joint Operation Planning is decomposed into the following 15 elements:

Combatant Commander    Employment                  Planning Activity
Contingency            Joint Force                 Redeployment
Crisis                 Joint Military Operation    Response
Demobilization         Mobilization                Subordinate Joint Force Commander
Deployment             Planning                    Sustainment

Each of these new elements is added to the Joint Operations Planning Domain Lexicon, which will eventually contain all of the content for the ontology. Furthermore, each of these new elements must also be defined and decomposed in the same way.
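The repeated define-and-decompose step can be viewed as a worklist procedure. The sketch below is one possible way to track it, under stated assumptions: the function `build_lexicon` and the table `DECOMPOSITIONS` are hypothetical names, the decomposition itself is still done by a human analyst, and only the single entry taken from this guide is filled in. Terms with no recorded decomposition are reported as pending work.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Analyst-supplied decompositions: term -> constituent terms found in its definition.
# Only this first entry comes from the guide; further entries would be added
# as each new term is defined.
DECOMPOSITIONS: Dict[str, List[str]] = {
    "Joint Operation Planning": [
        "Combatant Commander", "Contingency", "Crisis", "Demobilization",
        "Deployment", "Employment", "Joint Force", "Joint Military Operation",
        "Mobilization", "Planning", "Planning Activity", "Redeployment",
        "Response", "Subordinate Joint Force Commander", "Sustainment",
    ],
}


def build_lexicon(
    root: str, decompositions: Dict[str, List[str]]
) -> Tuple[List[str], List[str]]:
    """Return (all lexicon terms, terms still awaiting definition and decomposition)."""
    seen: Set[str] = {root}
    pending: List[str] = []
    queue = deque([root])
    while queue:
        term = queue.popleft()
        if term not in decompositions:
            pending.append(term)  # not yet defined and decomposed by the analyst
            continue
        for part in decompositions[term]:
            if part not in seen:  # a term that recurs in several definitions is added once
                seen.add(part)
                queue.append(part)
    return sorted(seen), pending


lexicon, pending = build_lexicon("Joint Operation Planning", DECOMPOSITIONS)
```

With only the root decomposed, the lexicon holds the root plus its 15 elements, and all 15 appear in `pending`, which mirrors how a handful of baseline terms grows into a large lexicon.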

1.5 Domain Description

The Domain Description activity follows SME Interaction, Survey of Authoritative References (Doctrine), Creation of Domain Definitions, Decomposition of Domain Definitions, and the Creation of a Domain Lexicon. At this point the developer should be able to compose a Domain Description Document, which describes the domain as an ontology or representation. The Domain Description should capture, at a minimum, the high-level entities and events for the domain, as well as their relations. It should also contain any SME descriptions as well as doctrinal definitions.

1.6 Devise Metrics

Devising Metrics for the Domain Ontology is done by way of SME input. It is a best practice to devise a list of questions that the ontology must be able to answer. The ontology’s ability to answer these questions is called Coverage of the domain. An inability to answer a question identifies a gap in the ontology, which must be filled with additional content. The following list of questions was used to determine coverage of the C2 Ontology:

What is the baseline definition/description for this domain?

What are the primary activities involved in this domain?

What are the subordinate activities in this domain?

Who participates in these activities?

What environment do these activities take place in?

What are the intended outcomes of these activities?

What are the intended products of these activities?

What information is consumed in these activities?

Who consumes this information?

What information is produced by these activities?

Where is this information found?

Where is this information stored?

What organizations are involved in this domain?

How are these organizations related?

What do these outputs contribute to?

What is the relation between agents and organizations in this domain?

What are the ultimate goals for the domain?

What are the subordinate goals for the domain?

What larger enterprise/objective does this domain contribute to?

What happens if these activities fail to produce their intended outcomes?

Metrics: 20 Questions for C2-Related Domains
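Coverage as described above can be expressed as a simple ratio of answered questions to total questions. This is a minimal sketch, assuming that each question is marked answered or unanswered during SME review; the function name `coverage` and the sample review results are illustrative and do not come from the guide.

```python
from typing import Dict, List, Tuple


def coverage(answered: Dict[str, bool]) -> Tuple[float, List[str]]:
    """Return (fraction of questions the ontology can answer, list of gap questions)."""
    if not answered:
        raise ValueError("at least one metric question is required")
    gaps = [question for question, ok in answered.items() if not ok]
    return (len(answered) - len(gaps)) / len(answered), gaps


# Illustrative review results only; real values come from SME review of the ontology.
review = {
    "What are the primary activities involved in this domain?": True,
    "Who participates in these activities?": True,
    "What information is consumed in these activities?": False,
    "What organizations are involved in this domain?": True,
}

score, gaps = coverage(review)
```

Each entry in `gaps` points at content that still has to be added to the ontology before it covers the domain.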

2. Create Iterative Lexicon

2.1 Decompose Terms and Definitions

This is a continuation of the activity described earlier, where each of the domain terms is decomposed, defined, and added to the Domain Lexicon. A handful of baseline terms can quickly grow into a rather large lexicon consisting of hundreds of terms.

2.2 Create Ontological Definitions

Ontological definitions consist of two parts. The first part of the definition refers to the parent class of the thing being defined (e.g. Dog is an Animal, Tsunami is a Natural Event, Car is a Vehicle, etc.). The second part of the definition describes the differentia for the thing being defined, i.e. that which makes this thing different from every other thing in its class. So a Dog is defined as:

Dog: An Animal [parent class] which is a member of the genus Canis, probably descended from the common wolf, that has been domesticated by man since prehistoric times; occurs in many breeds [differentia from all other animals] (Merriam-Webster’s Collegiate Dictionary)

Definitions should always be written in this format:

Parent Class…Differentia
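One way to enforce the two-part format is to store the parent class and the differentia as separate fields and refuse entries that lack either. The sketch below is illustrative only: `OntologicalDefinition` and its `render` method are hypothetical names, and the differentia is shortened from the Dog example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologicalDefinition:
    """A definition in the form: parent class, then differentia."""
    term: str
    parent_class: str
    differentia: str

    def __post_init__(self) -> None:
        # Both parts are mandatory; a definition missing either is rejected.
        if not self.parent_class.strip():
            raise ValueError(f"{self.term}: parent class is required")
        if not self.differentia.strip():
            raise ValueError(f"{self.term}: differentia is required")

    def render(self) -> str:
        """Format the definition as 'Term: A/An Parent differentia'."""
        article = "An" if self.parent_class[0].lower() in "aeiou" else "A"
        return f"{self.term}: {article} {self.parent_class} {self.differentia}"


dog = OntologicalDefinition(
    term="Dog",
    parent_class="Animal",
    differentia="which is a member of the genus Canis",  # shortened for the example
)
```

Because the parent class is its own field, the same record can later supply the is_a link when the term is placed in the taxonomy.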

2.3 Create List of Relations

This activity results in the compilation of relations to be used in the domain ontology. The relations should be listed in the Lexicon in their own section.

2.4 Create Evolving Lexicon

This is a continuation of the activity described earlier, but the focus turns towards consistent (ontological) definitions, organization of the terms, and graphic depictions of the relations between entities and events in the domain (see the Graphic Depiction of Relations below for an example).

The Domain Lexicon can be organized alphabetically or by some other criterion that makes better sense of the content (e.g. by categories or subjects within the domain).

The Lexicon may include a section for graphic depictions of the relations between entities and events. These graphic depictions are intended to answer the questions identified in the Scoping/Metrics activity.

Graphic Depiction of Relations

[Figure: graph of an IED Detonation event and its location. Node labels: Event, Hazardous Explosion Event, IED Detonation, Geospatial Location, Latitude, Longitude, Altitude, Elevation, Latitude Measurement, Longitude Measurement, Military Symbol, Symbol Code. Relation labels: is_a, instance_of, occurs_at, has_property, denotes.]


3. Create Initial Ontology

This activity uses content from the Domain Lexicon to create a hierarchical taxonomy, as well as a more robust domain representation that includes relations between Entities and Events.

3.1 Extend from a Common Upper Ontology (CUO)

The Basic Formal Ontology (BFO) and UCore-Semantic Layer (UCore-SL) are examples of Common Upper Ontologies, which consist of the most general categories of reality. These categories enable the developer to quickly organize the terms in the Domain Lexicon. Developers should become familiar with both Common Upper Ontologies in order to choose the one that is more appropriate for their work. They are available for download at:

http://www.ifomis.org/bfo/home

https://www.milsuite.mil/wiki/UCore-SL_Implementation_Guidance_August_2010

3.1.1 Extend to Domain Continuants (these are “Entities” in UCore-SL)

A Continuant is defined as: an entity that exists in full at any time in which it exists at all, persists through time while maintaining its identity and has no temporal parts. These are referred to as “Entities” in the UCore-SL Ontology.

Examples include: a heart, a person, the color of a tomato, the mass of a cloud, a symphony orchestra, the disposition of blood to coagulate, the lawn and atmosphere in front of our building, the capability of some military organization.

3.1.2 Extend to Domain Occurrents (these are “Events” in UCore-SL)

In BFO an Occurrent is defined as: an entity that has temporal parts and that happens, unfolds or develops through time. Occurrents are sometimes also called perdurants. These are referred to as “Events” in the UCore-SL Ontology.

Examples of occurrents include: the life of an organism, a surgical operation, the maneuvering of a Brigade Combat Team, the most interesting part of Van Gogh's life, the flight of an artillery round, etc.

The next two figures depict samples of BFO and UCore-SL.

BFO Continuants & Occurrents

Domain Ontologies extend from BFO top-level categories. If BFO is chosen as the CUO, developers need to become familiar with its content.

UCore-SL Entities & Events

• Entities
  – Information Content Entity
    • Analysis
    • Objective
    • Opinion
    • Plan
  – Physical Entity
    • Agent
    • Artifact
    • Environment
    • Geographic Feature
    • Geospatial Boundary
    • Geospatial Region
    • Information Bearing Entity
    • Organization
    • Physical Object
  – Property
    • Capability
    • Physical Property
    • Role
• Event
  – Act
    • Act of Communication
    • Act of Observation
    • Criminal Act
    • Terrorist Act
  – Cyberspace Event
  – Economic Event
  – Hazardous Event
  – Military Event
  – Natural Event
  – Planned Event
  – Political Event
  – Social Event
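Extending from a common upper ontology amounts to placing each domain term beneath one of these categories, so that every term can be traced back to a top-level category. The sketch below shows the idea with a plain child-to-parent map. It is an illustration under assumptions: only part of the outline above is transcribed, the top level is written in the singular as "Entity", the helper names `extend` and `ancestors` are hypothetical, and "Infantry Battalion" is a sample domain term whose placement under Organization is assumed for the example.

```python
from typing import Dict, List

TOP_LEVEL = {"Entity", "Event"}

# Child -> parent links transcribed from part of the UCore-SL outline above.
PARENT: Dict[str, str] = {
    "Information Content Entity": "Entity",
    "Physical Entity": "Entity",
    "Property": "Entity",
    "Plan": "Information Content Entity",
    "Organization": "Physical Entity",
    "Role": "Property",
    "Act": "Event",
    "Hazardous Event": "Event",
    "Military Event": "Event",
    "Act of Communication": "Act",
}


def extend(parent: Dict[str, str], new_class: str, under: str) -> None:
    """Place a domain class beneath an existing class of the upper ontology."""
    if under not in parent and under not in TOP_LEVEL:
        raise KeyError(f"unknown parent class: {under}")
    if new_class in parent:
        raise ValueError(f"class already placed: {new_class}")
    parent[new_class] = under


def ancestors(parent: Dict[str, str], cls: str) -> List[str]:
    """Walk is_a links from cls up to a top-level category."""
    chain: List[str] = []
    while cls in parent:
        cls = parent[cls]
        chain.append(cls)
    return chain


# Hypothetical domain term, placed for illustration only.
extend(PARENT, "Infantry Battalion", "Organization")
```

Calling `ancestors(PARENT, "Infantry Battalion")` walks Organization, Physical Entity, Entity, which is the path a reader from another community would follow to understand the term.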

3.2 Relate Continuants and Occurrents

Relations are how we make sense of the world around us. Ontological relations allow us to think and say things such as “A house is_a type of building” or “An Infantry Company is part_of an Infantry Battalion”. Without relations data is meaningless. Consider the elements of this simple message: 3rd Platoon is located_at grid coordinates AV 3479 8477. It is the relation “located_at” that gives meaning to the elements “3rd Platoon” and “AV 3479 8477”.

3.2.1 Relate Continuants to Continuants (Example):

Infantry Company is part_of a Battalion

3.2.2 Relate Continuants to Occurrents (Example):

Civil Affairs Team participates_in a Civil Reconnaissance

3.2.3 Relate Occurrents to Occurrents

Military Engagement is part_of a Battle

3.2.4 Relate Universals to Universals

House is_a Building

3.2.5 Relate Instances to Universals

3rd Platoon, Alpha Company participates_in Combat Operations

3.2.6 Relate Instances to Instances

3rd Platoon, Alpha Company is located_at Forward Operating Base Warhorse
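The example statements in section 3.2 all share a subject-relation-object shape, so they can be held as simple triples. The sketch below is illustrative and is not an OWL encoding: the `Triple` type, the `INSTANCES` set, and the helper functions are hypothetical names, and which terms count as instances is asserted by hand rather than inferred.

```python
from typing import List, NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


# The example statements from section 3.2, one triple each.
TRIPLES: List[Triple] = [
    Triple("Infantry Company", "part_of", "Battalion"),
    Triple("Civil Affairs Team", "participates_in", "Civil Reconnaissance"),
    Triple("Military Engagement", "part_of", "Battle"),
    Triple("House", "is_a", "Building"),
    Triple("3rd Platoon, Alpha Company", "participates_in", "Combat Operations"),
    Triple("3rd Platoon, Alpha Company", "located_at", "Forward Operating Base Warhorse"),
]

# Terms treated as instances (particular things) rather than universals (kinds).
INSTANCES: Set[str] = {"3rd Platoon, Alpha Company", "Forward Operating Base Warhorse"}


def kind_of_statement(t: Triple) -> str:
    """Classify a statement by whether each end is an instance or a universal."""
    left = "Instance" if t.subject in INSTANCES else "Universal"
    right = "Instance" if t.obj in INSTANCES else "Universal"
    return f"{left} to {right}"


def related(subject: str, relation: str) -> List[str]:
    """All objects linked to the subject by the given relation."""
    return [t.obj for t in TRIPLES if t.subject == subject and t.relation == relation]
```

A query such as `related("3rd Platoon, Alpha Company", "located_at")` shows how the relation, not the bare terms, carries the meaning of the message.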


4. Revisions Process

The revisions process results in a complete and accurate ontology, which contains all of the entities, events, and relations needed to represent a given domain. The intent of the revisions process is to improve, as much as possible, the ontology’s ability to answer questions about the domain at hand.

4.1 SME Feedback

Domain Subject Matter Experts will verify that the ontology is accurate and that it covers the domain sufficiently. SMEs work closely with the Ontologist, making revisions to the content; a change to the Lexicon will result in a change to the OWL file and vice versa. This activity focuses upon element names, their definitions, and the relations between entities and events.

Although SME input is an essential part of the ontology development process, it is still important to review the ontology terms and definitions with SMEs to ensure that the content is accurate. Some SMEs may find it difficult to review ontologies, as these artifacts can be hard to understand for the uninitiated. There are a number of ways to alleviate this issue, one of which is to generate spreadsheets with the relevant information for SMEs to review. If SMEs are interested in reviewing the ontologies directly, there are a number of free ontology editors that they can use, including Protégé OWL, TopBraid Composer and Knoodl.

4.2 Domain Coverage

Domain coverage is determined by the ontology’s ability to answer questions about the domain at hand.

http://protege.stanford.edu/

http://www.topquadrant.com/

http://knoodl.com/

4.3 Semantic Conformance Testing

The repeatable ontology development process also involves a number of quality control measures referred to here as semantic conformance tests. These tests are intended to ensure that best practices are being adhered to. Examples include:

• Run OWL Reasoner to identify inconsistencies

• Identify cases of multiple inheritance

• Identify classes that do not extend from the common upper level ontology

• Verify that every class has a preferred name and definition

• Verify that every relation has a domain and range

Some of these tests flag hard violations (i.e. the violation must be corrected prior to publication) and some flag soft violations (i.e. it is left to the discretion of the ontologist to determine whether the violation should be corrected).
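Several of the tests listed above are structural and can be checked mechanically. The following is a minimal sketch over a small in-memory model, not an OWL tool: it does not run a reasoner, the names `OntClass`, `Relation`, and `conformance_report` are hypothetical, the sample classes and definitions are invented for illustration, and the hard/soft label attached to each check is an assumption, since the guide does not say which tests fall into which group.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class OntClass:
    name: str
    parents: List[str] = field(default_factory=list)
    definition: str = ""


@dataclass
class Relation:
    name: str
    domain: Optional[str] = None
    range_: Optional[str] = None


UPPER_ROOTS = {"Entity", "Event"}  # top-level categories of the chosen upper ontology


def reaches_upper(name: str, classes: Dict[str, OntClass],
                  seen: Optional[Set[str]] = None) -> bool:
    """True if some chain of parents leads from name to an upper-level root."""
    if name in UPPER_ROOTS:
        return True
    if seen is None:
        seen = set()
    if name in seen or name not in classes:
        return False  # cycle, or a parent that is not declared anywhere
    seen.add(name)
    return any(reaches_upper(p, classes, seen) for p in classes[name].parents)


def conformance_report(classes: Dict[str, OntClass],
                       relations: List[Relation]) -> List[Tuple[str, str, str]]:
    """Return (severity, subject, message) findings; severity labels are assumptions."""
    findings: List[Tuple[str, str, str]] = []
    for c in classes.values():
        if len(c.parents) > 1:
            findings.append(("soft", c.name, "multiple inheritance"))
        if not reaches_upper(c.name, classes):
            findings.append(("hard", c.name, "does not extend from upper ontology"))
        if not c.definition.strip():
            findings.append(("hard", c.name, "missing definition"))
    for r in relations:
        if r.domain is None or r.range_ is None:
            findings.append(("hard", r.name, "missing domain or range"))
    return findings


# Illustrative content only.
classes = {
    "Building": OntClass("Building", ["Entity"], "An Entity that is a permanent structure"),
    "House": OntClass("House", ["Building"], "A Building used as a dwelling"),
    "Vehicle": OntClass("Vehicle", ["Entity"], "An Entity designed to transport people or cargo"),
    "Mobile Command Post": OntClass("Mobile Command Post", ["Vehicle", "Building"],
                                    "A Vehicle fitted out as a headquarters"),
    "Target": OntClass("Target"),
}
relations = [Relation("part_of", "Entity", "Entity"), Relation("located_at", "Entity")]

report = conformance_report(classes, relations)
```

In this sample, "Target" is flagged because it has no parent and no definition, which is also what would be expected of a role that was mistakenly modeled as a free-standing type.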

5. Publish and Share Ontology

Publish and Post to Repository