ontology 101 - new york semantic technology conference

+ Ontology 101: An Introduction to Knowledge Representation, the Web Ontology Language (OWL) & Ontology Development Elisa Kendall, Thematix Partners LLC & Deborah McGuinness, RPI / McGuinness Associates 15 October 2012

Upload: robertk

Post on 11-May-2015


TRANSCRIPT

Page 1: Ontology 101 - New York Semantic Technology Conference


Ontology 101: An Introduction to Knowledge Representation, the Web Ontology Language (OWL) & Ontology Development

Elisa Kendall, Thematix Partners LLC & Deborah McGuinness, RPI / McGuinness Associates

15 October 2012

Page 2: Ontology 101 - New York Semantic Technology Conference

Content management

Mars was photographed by the Hubble Space Telescope in August 2003 as the planet passed closer to Earth than it had in nearly 60,000 years. Image credit: NASA, J. Bell (Cornell U.) and M. Wolff (SSI)

A sunset on Mars creates a glow due to the presence of tiny dust particles in the atmosphere. This photo is a combination of four images taken by Mars Pathfinder, which landed on Mars in 1997. Image credit: NASA/JPL

Recent images from instruments on board the Mars Reconnaissance Orbiter take much more detailed, narrower views of specific features of the Martian surface. Image credit: NASA/JPL

The Planetary Data System (PDS) is a distributed repository of 40+ years’ imagery & data taken by a range of instruments on many diverse missions, available for scientific research.

Copyright © 2012 Thematix Partners LLC & McGuinness Associates


Page 3: Ontology 101 - New York Semantic Technology Conference

Smart search

Provenance/sources for tracking family members in the 19th century include early census data (often error prone), military records, passenger & immigration lists, online documents (e.g., county histories, church histories, etc.)

• Historical/forensic research requires cross-domain search of a wide variety of resources within a given geo-spatial/temporal context

• Similar capabilities are essential for business intelligence, law enforcement, government applications – all require terminology reconciliation

Page 4: Ontology 101 - New York Semantic Technology Conference

Merchandising

Flight attributes shown in the diagram: Attribute 1, Attribute 2, Leg Room, Attribute 4, Attribute 5, Attribute 5, On-Off Access, Attribute 7, Attribute 8, Attribute 9, Media Options, Attribute 11, Attribute 12, Meal Options, Attribute 14, Attribute 15, Time of Day, Attribute 17, Attribute 18, Distance to Gate, Attribute 20, Attribute 21, Flight Characteristics, Attribute N…

“Comfort-Oriented” Flight

Page 5: Ontology 101 - New York Semantic Technology Conference

Search Engine Optimization

Use search engines as a front door to transactions

Dramatically increase relevance by inclusion of more meaningful, synonymic tags

Enrich search results by adding ratings, video, prices, availability, amenities, locality, etc.

Page 6: Ontology 101 - New York Semantic Technology Conference

Tutorial Outline

Introduction to Knowledge Representation

OWL Basics

Tools & Applications

Page 7: Ontology 101 - New York Semantic Technology Conference

Part 1: Introduction to Knowledge Representation

A little background and a few definitions

Layers of abstraction & conceptual modeling

Classifying ontologies

A little methodology


Page 8: Ontology 101 - New York Semantic Technology Conference

Definitions

An ontology is a specification of a conceptualization. – Tom Gruber

Knowledge engineering is the application of logic and ontology to the task of building computable models of some domain for some purpose. – John Sowa

Artificial Intelligence can be viewed as the study of intelligent behavior achieved through computational means. Knowledge Representation then is the part of AI that is concerned with how an agent uses what it knows in deciding what to do. – Brachman and Levesque, KR&R

Knowledge representation means that knowledge is formalized in a symbolic form, that is, to find a symbolic expression that can be interpreted. – Klein and Methlie

The task of classifying all the words of language, or what's the same thing, all the ideas that seek expression, is the most stupendous of logical tasks. Anybody but the most accomplished logician must break down in it utterly; and even for the strongest man, it is the severest possible tax on the logical equipment and faculty. – Charles Sanders Peirce, letter to editor B. E. Smith of the Century Dictionary

Page 9: Ontology 101 - New York Semantic Technology Conference

What is an ontology?

An ontology specifies a rich description of the
• Terminology, concepts, nomenclature
• Properties explicitly defining concepts
• Relations among concepts (hierarchical and lattice)
• Rules distinguishing concepts, refining definitions and relations (constraints, restrictions, regular expressions)
relevant to a particular domain or area of interest.

Page 10: Ontology 101 - New York Semantic Technology Conference

Logic and ontological commitment

Predicate logic is harder to read than the original English, but is more precise:

Every semi-trailer truck has at least 3 axles.

(∀x)((SemiTrailerTruck(x) → (∃y)(SemiTrailer(y) ∧ hasPart(x,y))) ∧
(SemiTrailerTruck(x) → (∃z)(TractorUnit(z) ∧ hasPart(x,z))) ∧
(∃s)(set(s) ∧ count(s, ≥3) ∧
(∀w)(member(w,s) → (Axle(w) ∧ hasPart(x,w))))).

Logic is a simple language with few basic symbols.

The level of detail depends on the choice of predicates – these predicates represent an ontology of the relevant concepts in the domain.

Different choices of predicates represent different ontological commitments.

* Derived from Knowledge Representation: Logical, Philosophical, and Computational Foundations, John F. Sowa, Brooks/Cole, Pacific Grove, CA, 2000.
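To make the truck statement concrete, the following minimal Python sketch checks its intended reading (a semi-trailer truck has a semi-trailer, a tractor unit, and at least 3 axles as parts) against a toy model. The `Thing` class, `make_truck`, and `satisfies_truck_axiom` are hypothetical names chosen for illustration; they do not come from the tutorial or from Sowa's text.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """A hypothetical individual with type labels and a list of parts."""
    name: str
    kinds: set = field(default_factory=set)
    parts: list = field(default_factory=list)


def satisfies_truck_axiom(x: Thing) -> bool:
    """Check the semi-trailer truck statement for one individual."""
    if "SemiTrailerTruck" not in x.kinds:
        return True  # the implication holds vacuously for non-trucks
    has_trailer = any("SemiTrailer" in p.kinds for p in x.parts)
    has_tractor = any("TractorUnit" in p.kinds for p in x.parts)
    axle_count = sum(1 for p in x.parts if "Axle" in p.kinds)
    return has_trailer and has_tractor and axle_count >= 3


def make_truck(n_axles: int) -> Thing:
    """Build a toy truck with a trailer, a tractor unit, and n_axles axles."""
    parts = [Thing("trailer", {"SemiTrailer"}), Thing("tractor", {"TractorUnit"})]
    parts += [Thing(f"axle{i}", {"Axle"}) for i in range(n_axles)]
    return Thing("truck", {"SemiTrailerTruck"}, parts)
```

Choosing `hasPart` as the only relation is itself an ontological commitment: a model that distinguished, say, front and rear axles would need different predicates.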


Page 11: Ontology 101 - New York Semantic Technology Conference

Ontology-based technologies

Ontologies provide a common vocabulary for use by independently developed resources, processes, services

Agreements among organizations sharing common services can be made with regard to their usage; the meaning of relevant concepts can be expressed unambiguously

By composing / mapping ontologies and mediating terminology across participating events, resources and services, independently developed services can work together to share information and processes consistently, accurately, and completely

Ontologies also ensure
• Valid conversations among agents to collect, process, fuse, and exchange information
• Accurate searching by ensuring context using concept definitions and relations instead of/in addition to statistical relevance of keywords

Page 12: Ontology 101 - New York Semantic Technology Conference

KR language features

Vocabulary – a collection of symbols
• Domain-independent logical symbols (e.g., the quantifier ∀ or the connective ∧)
• Domain-dependent constants, identifying individuals, properties, or relations in the application domain or universe of discourse
• Variables, whose range is governed by quantifiers
• Punctuation that separates or groups other symbols

Syntax – formation rules that determine how symbols can be combined in well-formed expressions
• rules may be stated in a linear grammar, graph grammar, or independent abstract syntax

Semantics
• a theory of reference that determines how the constants and variables are associated with things in the universe of discourse
• a theory of truth that distinguishes true statements from false statements

Rules of Inference – rules that determine how one pattern can be inferred from another
• if the logic is sound, the rules of inference must preserve truth as determined by the semantics
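As an illustration of a rule of inference, the sketch below repeatedly applies modus ponens to ground "if premises then conclusion" rules until nothing new can be derived (forward chaining). The rule format, the `RULES` list, and the fact strings are assumptions made for this example only; real KR systems work over far richer syntax.

```python
def forward_chain(facts, rules):
    """Derive everything that follows from facts using modus ponens.

    rules is a list of (premises, conclusion) pairs, where premises is a
    tuple of facts that must all hold before the conclusion is added.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


# Hypothetical ground rules about one individual, car1.
RULES = [
    (("Sedan(car1)",), "Automobile(car1)"),
    (("Automobile(car1)",), "Vehicle(car1)"),
]
```

Because each step only adds conclusions whose premises already hold, the procedure preserves truth: if the starting facts and rules are true in a model, so is every derived fact.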

Page 13: Ontology 101 - New York Semantic Technology Conference

Classifying logics

Logics vary from classical FOL along six dimensions:

Syntax

Subsets limit permissible operators or combinations, e.g., propositional logic (without quantifiers), Horn-clause logic (excludes disjunctions in conclusions, as in Prolog), terminological or definitional logics (containing additional restrictions, e.g., description logics)

Proof Theory restricts or extends permissible proofs:
• Intuitionistic logic and relevance logic rule out certain extraneous information
• Non-monotonic logics allow introduction of default assumptions
• Access-limited logic restricts the number of times a proposition can be used in a proof; linear logic allows a proposition to be used only once
• Modal logic incorporates modal auxiliaries (◇p means p is possibly true; □p means p is necessarily true); temporal logic extends modal logic to include always, sometimes
• Intensional logics express concepts such as need, ought, hope, fear, wish, believe, know, expect, and intend

Model Theory modifies the truth value of statements in terms of some model of the world: classical FOL is two-valued; a three-valued logic introduces unknowns; fuzzy logic uses the same notation as FOL but with an infinite range of certainty factors (1.0 to 0.0)

Ontology – frameworks may include support for built-in components, such as set theory or time

Meta-language – language for encoding information about objects
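To show how a three-valued logic "introduces unknowns", here is a small sketch of the connectives in Kleene's strong three-valued logic, using Python's `None` for the unknown value. Kleene's system is one specific choice among several three-valued logics and is assumed here purely for illustration; the function names are hypothetical.

```python
def k_not(a):
    """Negation: unknown stays unknown."""
    return None if a is None else (not a)


def k_and(a, b):
    """Conjunction: a single False decides the result, otherwise unknown wins."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def k_or(a, b):
    """Disjunction: a single True decides the result, otherwise unknown wins."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False
```

When both arguments are `True` or `False`, the functions agree with classical two-valued logic; the third value only matters when information is missing.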

Page 14: Ontology 101 - New York Semantic Technology Conference

Description logic

A family of logic-based Knowledge Representation formalisms
• Descendants of semantic networks and KL-ONE
• Describe domain in terms of concepts (classes), roles (relationships), and individuals (instances)

Distinguished by
• Formal semantics
  • Decidable fragments of FOL
  • Closely related to Propositional, Modal, and Dynamic Logics
• Provision of inference services
  • Sound and complete decision procedures for key problems
  • Implemented systems (highly optimized)

Applications include
• Configuration – product configurators, consistency checking, constraint propagation, first significant industrial application (e.g., CLASSIC)
• Ontologies – ontology engineering (design, maintenance, integration), reasoning with ontology-based mark-up, service description and discovery
• Databases – consistency of conceptual schemata, schema integration, query subsumption (w.r.t. conceptual schemata)


Page 15: Ontology 101 - New York Semantic Technology Conference

Knowledge bases, databases, and ontology

An ontology is a conceptual model of some aspect of a particular universe of discourse (or of a domain of discourse)

Typically, ontologies contain only “rarified” or “special” individuals, metadata, representing elemental concepts critical to the domain

A knowledge base is a persistent repository for
• Ontology & metadata representing individuals, facts, & rules about how they can be combined or relate to one another
• Metadata, facts & rules only – in some applications and frameworks the ontology is separately maintained

Most inference engines require in-memory deductive databases for efficient reasoning (including commercially available reasoners)

A knowledge base may be implemented in a physical, external database, such as a relational database, but reasoning is typically done on a subset (partition) of that knowledge base in memory


Page 16: Ontology 101 - New York Semantic Technology Conference

Reasoning and truth maintenance

Reasoning is the mechanism by which the assertions made in an ontology and related knowledge base are evaluated by an inference engine.

In classical logic, the validity of a particular conclusion is retained even if new information is received.

This may change if some of the preconditions are actually hypothetical assumptions invalidated by the new information.

The same idea applies for arbitrary actions – new information can make preconditions invalid.

Generally, there are two issues that a reasoner must address:
• If some conclusion is invalid, which other conclusions are also invalid?
• If some action cannot be performed, which others are at risk?

The “housekeeping” associated with tracking the threads that support answering these questions is called truth maintenance.
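The housekeeping idea can be sketched with a very small justification-based bookkeeping class: each conclusion records the beliefs that support it, so withdrawing an assumption shows which conclusions lose their support. This is a simplified illustration under assumed names (`TruthMaintenance`, `assume`, `justify`, `retract`, `believed`); it recomputes the belief set from scratch and is not the algorithm of any particular reasoner.

```python
class TruthMaintenance:
    """Toy justification-based truth maintenance (illustrative only)."""

    def __init__(self):
        self.assumptions = set()
        self.justifications = []  # (conclusion, frozenset of supporting beliefs)

    def assume(self, belief):
        self.assumptions.add(belief)

    def retract(self, belief):
        self.assumptions.discard(belief)

    def justify(self, conclusion, *support):
        self.justifications.append((conclusion, frozenset(support)))

    def believed(self):
        """Return assumptions plus every conclusion whose support is held."""
        held = set(self.assumptions)
        changed = True
        while changed:
            changed = False
            for conclusion, support in self.justifications:
                if conclusion not in held and support <= held:
                    held.add(conclusion)
                    changed = True
        return held
```

For example, if `can_rent` is justified by `car_available` and `can_invoice` by `can_rent`, retracting `car_available` removes both conclusions from `believed()`, which answers the question of which other conclusions are also invalid.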


Page 17: Ontology 101 - New York Semantic Technology Conference

Negation

If all new information is “positive”, then all prior conclusions should remain valid.

Problems arise if new information negates a prior assumption, causing it to be withdrawn.

What does it mean to negate (withdraw) an assumption?
• Conclusive information is not available?
• The assumption cannot be proven?
• The assumption is not provable using certain methods?
• The assumption is not provable given a fixed quantity of time?

The answers to these questions can result in different definitions of negation and differing interpretations by non-monotonic reasoners.

Solutions include chronological and “intelligent” backtracking algorithms, heuristics, circumscription algorithms, justification or assumption based retraction, and so forth, depending on the reasoner and methods used for truth maintenance.

Reasoning efficiency is dependent, in part, on the algorithms applied for truth maintenance.
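Two of the readings of negation listed above can be contrasted in a few lines. The sketch assumes a knowledge base represented as plain sets of fact strings; `closed_world` treats "cannot be proven" as false (negation as failure), while `open_world` reports `None` when conclusive information is not available. Both function names and the set-based representation are illustrative assumptions.

```python
def closed_world(fact, known_true):
    """Negation as failure: a fact that is not recorded is treated as false."""
    return fact in known_true


def open_world(fact, known_true, known_false):
    """Return True or False only when recorded; None means 'unknown'."""
    if fact in known_true:
        return True
    if fact in known_false:
        return False
    return None
```

The same missing fact therefore yields `False` under the first reading and `None` under the second, which is one way differing definitions of negation lead to differing conclusions.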

Page 18: Ontology 101 - New York Semantic Technology Conference

Explanations and proofs

When a reasoner draws a particular conclusion, many users and applications want to understand why

Primary motivations include interoperability, reuse, and trust

Understanding the provenance of the information and results is crucial, especially when web-based information is involved
• What information sources were used (source)
• How recently they were updated (currency)
• How reliable these sources are (authoritativeness)
• Was the information directly available or derived, and if derived, how (method of reasoning)

Methods used to explain why a reasoner reached a particular conclusion include explanation generation and proof specification


Page 19: Ontology 101 - New York Semantic Technology Conference

Abstraction layers

Knowledge Representation / Management for Large Scale Applications
• Provide broad metadata, process, service & asset management facilities (including feedback/lessons learned…)
• Enable rich cross-domain, cross-process, cross-organizational modeling supported by mapping & transformation services to provide maximum flexibility, interoperability
• Leverage standards and best practices in information architecture, metadata modeling, management, registration, and governance, and asset management & registration
• Provide incremental reasoning capabilities for model validation, transformation services

Repeatable, reusable, interoperable

Layers shown in the layering diagram:
• Contextual: Identify subject areas
• Conceptual: Define the meaning of things in the organization
• Logical: Describe the logical representation of properties
• Physical: Describe the physical means by which data is stored
• Definition: Represent the coding language on a specific development platform
• Instance: Hold the values of the properties applied to the data in a schema

Artifacts shown alongside the layers:
• Ontology
• Ontology, Conceptual ER, Conceptual Business Process …
• ER, Relational, XML Schema
• XML, source code, scripting languages, stored procedures…
• Physical KBs, databases, asset repositories…

*Layering diagram courtesy Kenn Hussey

Page 20: Ontology 101 - New York Semantic Technology Conference

Hypothetical conceptual model “EU-Rent”

Produced using Embarcadero EA/Studio Business Modeler Edition, courtesy Kenn Hussey

Page 21: Ontology 101 - New York Semantic Technology Conference

Hypothetical logical model “EU-Rent”

[Figure: logical data model of EU-Rent; entities, attributes, and relationship labels from the diagram are listed below.]

Entities and attributes
• Rental: non-local currency, booking mode, scheduled pick-up date-time, actual pick-up date-time, scheduled return date-time, actual return date-time, credit card, valid credit card, valid driver license, estimated rental charge, actual rental charge, rental state
• Rental Car: VIN, acquisition date, odometer reading, service reading, rental car state
• Rental Movement
• Rental Organization Unit: rental organization unit id, ISO country code, site address, organization function
• Service Depot
• Advance Rental: booking date-time, payment type, advance rental state
• Agency
• Walk-in Rental
• Agent: agent id, agent name
• Airport Branch
• Branch: branch name, hours of operation, car storage capacity, branch type
• Car Group: group name, passenger capacity
• Car Model: model id, manufacturer name, model name, engine capacity, car body style
• Car Movement: movement id, movement type, movement direction
• Car Transfer: transfer date, transfer pick-up date-time, transfer drop-off date-time
• City Branch
• Local Area: company name
• Driver: driver id, ISO country code, name, current contact details, driver license, driver type, loyalty club id, points balance, driver state

Other labels in the diagram: branch type, organization function, booking mode, movement type

Relationship labels (paired by direction): has / is of; includes / is included in; is specified by / specifies; requests / is requested by; stores / is stored at; operates / is operated by; owns / is owned by; is upgrade for / is upgraded to; is assigned to / has assigned; is responsible for / is responsibility of

Produced using Embarcadero ER/Studio, courtesy Kenn Hussey


Page 22: Ontology 101 - New York Semantic Technology Conference

Hypothetical physical model “EU-Rent”

[Figure: physical data model of EU-Rent; tables, columns, and relationship labels from the diagram are listed below.]

Tables and columns
• Local_Area: local_area_id (PK)(FK), company_name
• Rental: rental_id (PK)(FK), renter_id (FK), pick_up_branch (FK), drop_off_branch (FK), non_local_currency, booking_mode, scheduled_pick_up_date_time, actual_pick_up_date_time, scheduled_return_date_time, actual_return_date_time, credit_card, valid_credit_card, valid_driver_license, estimated_rental_charge, actual_rental_charge, rental_state
• Rental_Car: VIN (PK), stored_at_branch (FK), owner_local_area (FK), model_id (FK), acquisition_date, odometer_reading, service_reading, rental_car_state
• Rental_Movement: rental_id (PK)(FK)
• Rental_Organization_Unit: rental_organization_unit_id (PK), ISO_country_code, site_address, organization_function
• Service_Depot: service_depot_id (PK)(FK), local_area (FK)
• Advance_Rental: rental_id (PK)(FK), requested_car_model (FK), booking_date_time, payment_type, advance_rental_state
• Walk_in_Rental: rental_id (PK)(FK)
• Agency: branch_id (PK)(FK), agent (FK)
• Agent: agent_id (PK), agent_name
• Airport_Branch: branch_id (PK)(FK)
• Branch: branch_id (PK)(FK), local_area (FK), branch_name, hours_of_operation, car_storage_capacity, branch_type
• Car_Group: group_name (PK), upgrade_group (FK), passenger_capacity
• Car_Model: model_id (PK), group_name (FK), manfacturer_name, model_name, engine_capacity, car_body_style
• Car_Movement: movement_id (PK), rental_car (FK), group_name (FK), receiving_branch (FK), sending_branch (FK), movement_type, movement_direction
• Car_Transfer: transfer_id (PK)(FK), transfer_date, transfer_pick_up_date_time, transfer_drop_off_date_time
• City_Branch: branch_id (PK)(FK)
• Driver: driver_id (PK), ISO_country_code, name, current_contact_details, driver_license, driver_type, loyalty_club_id, points_balance, driver_state

Relationship labels (paired by direction): includes / is included in; is assigned to / has assigned; has / is of; is specified by / specifies; requests / is requested by; stores / is stored at; operates / is operated by; owns / is owned by; is upgrade for / is upgraded to; is responsible for / is responsibility of

Produced using Embarcadero ER/Studio, courtesy Kenn Hussey
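To show what the step from a physical model to a concrete database can look like, the sketch below creates two of the EU-Rent tables, Car_Group and Car_Model, in an in-memory SQLite database using Python's standard `sqlite3` module. The column types, the `NOT NULL` constraint, the omission of some columns, and the `create_schema` function name are all assumptions made for the example; the slide gives only column names and key markers.

```python
import sqlite3


def create_schema(conn):
    """Create a two-table subset of the EU-Rent physical model (types assumed)."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(
        """
        CREATE TABLE Car_Group (
            group_name TEXT PRIMARY KEY,
            upgrade_group TEXT REFERENCES Car_Group(group_name),
            passenger_capacity INTEGER
        );
        CREATE TABLE Car_Model (
            model_id TEXT PRIMARY KEY,
            group_name TEXT NOT NULL REFERENCES Car_Group(group_name),
            model_name TEXT,
            engine_capacity REAL,
            car_body_style TEXT
        );
        """
    )
```

With foreign keys enabled, inserting a Car_Model row whose group_name has no matching Car_Group row raises `sqlite3.IntegrityError`, which is the database-level counterpart of the "is included in" relationship in the diagram.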

Page 23: Ontology 101 - New York Semantic Technology Conference

Classifying ontologies

[Figure: chart with axes Level of Complexity and Level of Expressivity, placing: Simple Taxonomy, Glossary, Topic Map, Concept Map, Hierarchical Taxonomy, Entity – Relationship Model, Database Schema, OO Software Model, KR System, XML Schema]

Classification techniques are as diverse as conceptual models, and generally include understanding
• Level of Expressivity
• Level of Complexity / Structure
• Granularity
• Target Usage, Relevance
• Amount of Automation, Reasoning Requirements
• Prescriptive vs. Descriptive / Reliability / Level of Authoritativeness
• Design Methodology
• Governance
• Vocabulary Management, Metrics

Page 24: Ontology 101 - New York Semantic Technology Conference

Framework of dimensions

Semantic Dimensions
• Expressiveness: represents how well a KR language addresses increasingly complex semantics
• Structure: represents how well an ontology encodes semantics, with the same or less expressivity than the KR language
• Granularity: represents the level of detail specified in an ontology

Pragmatic Dimensions
• Intended use: the original use case(s), or purpose for developing a particular ontology
• Automated reasoning: the extent to which the ontology is designed to be used for automated reasoning
• Prescriptive vs. Descriptive: the extent to which an ontology was intended to be used for descriptive purposes vs. normative, prescriptive use (i.e., with a high degree of concern for correctness)

Reference: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2007

Page 25: Ontology 101 - New York Semantic Technology Conference

Considerations

Intended use of ontologies, including domain requirements (e.g., scientific and engineering apps require formulas, units of measure, computations that may be challenging to represent)

Intended use of KRSs that implement them, including reasoning requirements, questions to be answered

For distributed environments, the number and kinds of resources, processes, services requiring ontologies – how distributed, how unique, developed collaboratively or independently, dynamic community participation or static

What kinds of transformations are required among processes, resources, services to support semantic mediation

Ontology and KRS alignment / de-confliction / ambiguity resolution requirements

Ontology and KRS composition requirements, dynamic vs. static composition, in what environment and under what constraints

Performance, sizing, timing requirements of target environment


Page 26: Ontology 101 - New York Semantic Technology Conference

A little methodology …

Requirements, domain & use case analysis are critical
• Develop initial source/reference material
• Focus on system or application requirements
• Iterative development starting with a “thread” that covers basic capabilities can ground the work and prioritize decisions

Need to understand and communicate
• Architectural trade-offs, cost & technical benefits
• The nature of the information & kinds of questions that need to be answered drive the architecture, approach, and ontology scoping and design

Reuse standards and well-tested, available ontologies whenever possible


Page 27: Ontology 101 - New York Semantic Technology Conference

What to look for

A controlled vocabulary
• FAA & IATA airport codes, ACRISS car codes, …

A hierarchical or taxonomic structure (for query expansion)
• Vehicle, Ground-based Vehicle, Wheeled Vehicle, Powered Vehicle, Automobile, Sedan…

Knowledge supporting structured queries
• Find all available hybrid SUVs or hybrid sedans that can seat four adults within a reasonable taxi ride of Planet Hollywood, Las Vegas for 3-4 days the week of June 20th

Efficient inference (i.e., limited expressive power) vs. increased expressivity (potentially expensive or resource bounded computation)

Custom reasoning for temporal relations, geospatial, dynamic algorithm / equation evaluation, process-specific, conditional operations

Computational tractability
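Query expansion over a taxonomy can be sketched in a few lines: a search for a broad term is widened to include every narrower term beneath it. The `PARENT` table below encodes the vehicle chain from the slide as child-to-parent links; the "SUV" entry and the `expand_query` function are assumptions added for the example.

```python
# Child -> parent links for the vehicle hierarchy on the slide.
PARENT = {
    "Ground-based Vehicle": "Vehicle",
    "Wheeled Vehicle": "Ground-based Vehicle",
    "Powered Vehicle": "Wheeled Vehicle",
    "Automobile": "Powered Vehicle",
    "Sedan": "Automobile",
    "SUV": "Automobile",  # assumption: not part of the slide's chain
}


def expand_query(term, parent_of=PARENT):
    """Return the term together with all of its narrower terms."""
    expanded = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in parent_of.items():
            if parent in expanded and child not in expanded:
                expanded.add(child)
                changed = True
    return expanded
```

A search for "Automobile" would then also match items tagged "Sedan" or "SUV", while a search for "Sedan" stays narrow.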


Page 28: Ontology 101 - New York Semantic Technology Conference

Start with canonical definitions

General concepts as well as domain-specific knowledge

Basic starting point – cross-domain definitions
• Namespace definitions, metadata, naming conventions, governance policies
• Commonly used structures & vocabularies, such as domain-specific vocabulary & messaging standards, international country & language codes (ISO), national postal addressing or other government standards, industry best practices
• Common metadata for ontology & schema management (e.g., Dublin Core for documents & models, ISO 1087 for synonyms & similar relations, MIME media types for images & multimedia, SKOS for annotations, etc.)

Domain vocabularies must be prioritized, selected based on business requirements, clear ROI

Common early targets include
• Smart search (pull); customer experience & cross-sell / upsell (push) – RDFa, extensions to schema.org
• Richer interoperability among trading partners – extensions to messaging agreements
• Service registration, description, discovery & management
• Augmented decision support and business rule applications
• Asset/artifact repository search & retrieval
• Automated verification


Page 29: Ontology 101 - New York Semantic Technology Conference

Lay out a high-level architecture for ontology elements

Identify the relationships among elements – roles, domain, interface, process, utility

For each ontology element
• Describe its domain and scope, how it will be used
• Identify example questions and anticipated/sample answers for the application(s) it will support
• Identify key stakeholders, ownership, maintenance, resources for instance knowledge
• Describe anticipated reuse/evolution path
• Identify critical standards, resources that it must interoperate with, dependencies

Resources http://www.idef.com/IDEF5/html

http://www.kbsi.com/technology/methods/sbont.htm

IDEF5 (Integrated Definition Methods) Ontology Capture Method Analysis

Page 30: Ontology 101 - New York Semantic Technology Conference

Principles & Methods for Terminology Work (ISO 704)

Methodology for describing concepts & terms
• Uses ISO 1087 for terminology
• Uses ISO 860 for terminology “harmonization” (alignment) methods
• Basis for typical methods used for taxonomy development today

Describes how to flesh out definitions
• Recommends strategies for relating terms to one another using standard vocabulary
• ISO 1087 – great resource for language to describe kinds of relationships, acronyms & other designations, preferred vs. deprecated terms, etc.
• ISO 860 augments this with recommendations for vocabulary comparison

Page 31: Ontology 101 - New York Semantic Technology Conference

Key methodology issues

Naming conventions and versioning policies are critical for –every– organization
• Namespace definitions for ontologies are critical, along with policies for their management that are well understood
• http://<authority>/<subdomain>/<topic>/<date, in YYYYMMDD form>/filename.extension is common practice in some organizations, with content negotiation to the latest version
• Levels of hierarchy may be added for large organizations or modularized ontologies
• Use of “id.name.ext” or “ontology.name.ext” vs. “www.name.ext” for the authority component of the URL is increasing
• Namespace prefixes (abbreviations) for individual modules, especially where there are multiple modules, can be important due to longevity
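The versioned URL pattern above is easy to apply consistently if it is generated rather than typed by hand. The following sketch builds such a URL from its components; the function name `ontology_iri` and the example authority, subdomain, topic, and filename are hypothetical.

```python
from datetime import date


def ontology_iri(authority, subdomain, topic, version, filename):
    """Build http://<authority>/<subdomain>/<topic>/<YYYYMMDD>/<filename>."""
    return f"http://{authority}/{subdomain}/{topic}/{version:%Y%m%d}/{filename}"


# Example with made-up components:
# ontology_iri("ontology.example.com", "travel", "vehicles",
#              date(2012, 10, 15), "vehicles.rdf")
```

Content negotiation to the latest version is a server-side concern and is not covered by this sketch.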

Page 32: Ontology 101 - New York Semantic Technology Conference

More on naming

For naming entities in an ontology – there are some rules of thumb that vary by community of practice
• data modelers often use underscores at word boundaries, spaces in names – which semantic web tools may not handle well (ok in labels)
• some name properties <domain><predicate><range>, others <predicate><range>, & others <predicate>, but not necessarily consistently
• semantic web practitioners typically use camel case; property naming varies by domain & community of practice
• for very large ontologies, some use unique identifiers to name concepts and properties, with human readable names in labels only – depends on tooling that helps people interpret the content

The key is to establish & consistently use guidelines tailored to your organization

Guidelines for classes/concepts (e.g., OWL classes, CL names, SBVR concepts), properties (e.g., OWL properties, CL relations, functions, SBVR roles), and individuals (objects in UML) are essential so that conceptual models developed in one tool can be exported consistently to other tools & formats

Page 33: Ontology 101 - New York Semantic Technology Conference

Change management

Change management and traceability are not well defined at the element level for ontologies; still a research topic
• Current approach to element-level versioning to support reasoning about change is to track two parallel streams: one for additions to the ontology, one for retractions; often managed as two separate files
• UML/MOF versioning suggests linking to a workspace; use of a default workspace would get the latest version, and tools such as MagicDraw© & repositories such as Adaptive can provide an indication of the differences
• UML-based versioning does not support reasoning about the differences so that users can determine the impact of applying the changes
• addition or deletion of axioms can change downstream reasoning results, especially where there are complex dependencies

Mechanisms that preserve the versioning detail in artifacts that are generated from such models, such as RDF/XML serialized OWL, and are compatible with the above approaches, are an ongoing topic of discussion
• Consider the use cases/justification for whatever level of versioning detail is needed on a project by project basis
• Balance with usability/performance
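The "two parallel streams" approach can be illustrated by treating an ontology version as a set of axioms and computing additions and retractions as set differences. The axiom strings and the function names `diff_versions` and `apply_changes` are assumptions for this sketch; real tooling would work on parsed axioms rather than strings.

```python
def diff_versions(old, new):
    """Return (additions, retractions) needed to turn old into new."""
    old, new = frozenset(old), frozenset(new)
    return new - old, old - new


def apply_changes(axioms, additions, retractions):
    """Apply a retraction stream and an addition stream to a set of axioms."""
    return (frozenset(axioms) - frozenset(retractions)) | frozenset(additions)
```

Keeping the two streams separate makes it possible to ask which downstream conclusions depended on a retracted axiom before the change is applied.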

Page 34: Ontology 101 - New York Semantic Technology Conference

Modularity

Guidance on modularity is limited
• When is any model “too big”?
• Can individual parts evolve independently, and if so, shouldn’t that dictate module boundaries?
• Can individual parts be used independently, and if so, shouldn’t that also dictate module boundaries?
• For ontologies, particularly OWL ontologies, individuals should be managed separately from “schema” to facilitate reasoning during development (i.e., if you add an individual that creates a logical inconsistency, most reasoners won’t load the ontology at all)

Current practices in the semantic web community address modularity either statically or dynamically
• Static approaches include adherence to “DL-Safe” rules and suggestions in papers by Alan Rector (University of Manchester)
• Dynamic approaches include use of tools that can assist in determining whether or not an ontology is appropriately modularized, using reasoning to perform rewriting with respect to soundness and completeness for a given ontology

34

Copyright © 2012 Thematix Partners LLC & McGuinness Associates

Page 35: Ontology 101 - New York Semantic Technology Conference

Metadata for ontologies and elements

Metadata should be standardized at both the model level and the element level for every model (not just ontologies)
• Model level metadata can reuse properties from the Dublin Core Metadata Terms, from the Simple Knowledge Organization System (SKOS), the ISO 11179 Metadata Registry standard with ISO 1087 Terminology support, the Ontology Metadata Vocabulary (OMV), emerging W3C work on provenance, and others

Most annotations should be optional at the element level, but a minimal set, including names, labels, and formal text definitions, is important for reusability & collaboration

Consistent use of the same annotations (properties, tags) improves readability, facilitates automated documentation generation, and enables better search over ontology repositories

Model level metadata may reuse organization-specific taxonomies to enable better search through RDFa tagging, for example
• GoodRelations & schema.org are both well known & in use in industry today

Page 36: Ontology 101 - New York Semantic Technology Conference

Part 2: OWL Basics

A quick intro to RDF and RDF Schema

Basic constructs: descriptions & classes

Class expressions

Properties & restricting property values

Individuals & data ranges

Miscellaneous class axioms

New features of OWL 2

A few rules of thumb
• Depth vs. breadth, naming, synonymous terms
• Class vs. property value, class vs. individual
• Limiting scope

Page 37: Ontology 101 - New York Semantic Technology Conference

Semantic Web

"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."

-- Tim Berners-Lee

Page 38: Ontology 101 - New York Semantic Technology Conference

Guiding Principles

Historically, knowledge representation and reasoning systems have operated under closed world assumptions

Uncertainty is magnified under open-world, “wild, wild web” conditions, making reasoning much more difficult

Semantic web languages are designed to support less certainty, to provide “better” search results, informed answers to questions, not absolutes

Because they are based on XML, such languages can assist businesses in leveraging existing investment in mark-up, content, and data
• To augment business intelligence/analysis and knowledge mining
• To support knowledge sharing and collaboration, augment enterprise information integration
• To enrich web services and other applications
• To support policy-based applications and ensure compliance with policy

… at a lower cost with higher potential ROI than traditional computing methods
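The difference between the closed-world and open-world assumptions mentioned on this slide can be made concrete with a small Python sketch. It is only an illustration: the fact store, the `negated` set for explicitly denied statements, and the hotel name `MarriottMarquis` are assumptions invented for the example.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Statements we know to be true.
facts: Set[Triple] = {("Conference", "heldAt", "HiltonSFUnionSquare")}
# Statements explicitly asserted to be false (hypothetical representation).
negated: Set[Triple] = set()


def ask_closed_world(triple: Triple) -> bool:
    """Closed world: anything that is not stated is taken to be false."""
    return triple in facts


def ask_open_world(triple: Triple) -> Optional[bool]:
    """Open world: True if stated, False only if explicitly denied, otherwise None (unknown)."""
    if triple in facts:
        return True
    if triple in negated:
        return False
    return None


query = ("Conference", "heldAt", "MarriottMarquis")
closed_answer = ask_closed_world(query)  # False: absence is treated as denial
open_answer = ask_open_world(query)      # None: absence only means "not known"
```

A database-style system gives the first kind of answer; semantic web languages are designed around the second, which is why they return informed answers rather than absolutes.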

Page 39: Ontology 101 - New York Semantic Technology Conference

Resource Description Framework (RDF)

Describes relationships

Uses URIs for naming

Language has a graph-based model
• RDF/XML serialization (exchange syntax)
• Other presentation syntaxes (N3, Turtle, …)

Specifications, W3C presentations, and tools are available at
• Semantic Web: http://www.w3.org/standards/semanticweb/
• Linked Data: http://www.w3.org/standards/semanticweb/data
• RDF: http://www.w3.org/standards/techs/rdf#w3c_all

New RDF “next steps” revision working group under way to make specific enhancements, “fixes” to the standard
• RDF Working Group: http://www.w3.org/2011/rdf-wg/wiki/Main_Page

Page 40: Ontology 101 - New York Semantic Technology Conference

RDF Notation Options

Graph:
• Subject: http://semtech2012.semanticweb.com/travel.owl#Conference
• Predicate: http://semtech2012.semanticweb.com/travel.owl#heldAt
• Object: http://semtech2012.semanticweb.com/travel.owl#HiltonSFUnionSquare

RDF/XML:
<rdf:Description rdf:ID="Conference">
  <semtech:heldAt rdf:resource="#HiltonSFUnionSquare"/>
</rdf:Description>

N3: semtech:Conference semtech:heldAt semtech:HiltonSFUnionSquare .
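To show how the N3 line and the three full URIs on this slide describe the same statement, here is a minimal Python sketch that expands prefixed names into URIs. It is not a real N3 parser: it handles only a single "subject predicate object ." line of prefixed names, and the helper names `expand` and `parse_statement` are assumptions for illustration. The `semtech` namespace is the one used on the slide.

```python
from typing import Tuple

# Prefix table; the semtech namespace is taken from the slide.
PREFIXES = {"semtech": "http://semtech2012.semanticweb.com/travel.owl#"}


def expand(qname: str) -> str:
    """Expand a prefixed name such as 'semtech:heldAt' into a full URI."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local


def parse_statement(line: str) -> Tuple[str, str, str]:
    """Parse one simple 'subject predicate object .' statement made of prefixed names."""
    subject, predicate, obj = line.rstrip(" .").split()
    return expand(subject), expand(predicate), expand(obj)


triple = parse_statement("semtech:Conference semtech:heldAt semtech:HiltonSFUnionSquare .")
```

Whatever the notation, the result is the same subject, predicate, object triple, which is the only thing the RDF graph model records.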

Page 41: Ontology 101 - New York Semantic Technology Conference

RDF Schema (RDFS)

An RDF vocabulary that provides for identifying:
• classes,
• subsumption (inheritance) relations for classes,
• subsumption (inheritance) relations for properties,
• domain and range for properties

Page 42: Ontology 101 - New York Semantic Technology Conference

RDF Schema (RDFS)

Graph:
• semtech:Hotel rdf:type rdfs:Class
• semtech:ConferenceHotel rdf:type rdfs:Class
• semtech:ConferenceHotel rdfs:subClassOf semtech:Hotel
• semtech:HiltonSFUnionSquare rdf:type semtech:ConferenceHotel

RDF/XML:
<rdf:Description rdf:ID="Hotel">
  <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
</rdf:Description>
<rdfs:Class rdf:ID="ConferenceHotel">
  <rdfs:subClassOf rdf:resource="#Hotel"/>
</rdfs:Class>

<semtech:ConferenceHotel rdf:ID="HiltonSFUnionSquare"/>

Page 43: Ontology 101 - New York Semantic Technology Conference

The Web Ontology Language (OWL)

Two languages emerged in parallel to address semantic web requirements
• DAML-ONT, developed through the DARPA/DAML program
• OIL (Ontology Inference Layer), developed by EU & US researchers

The merged DAML+OIL was submitted to the W3C in 2002 and formed the basis for the WebOnt Working Group

OWL extends RDF Schema
• Has an RDFS-based syntax and reuses some RDFS vocabulary (e.g., subClassOf, domain, range)
• Adds rich primitives and redefines others (transitivity, inverse, cardinality constraints, complex class definitions)

Describes the structure of a domain in terms of classes and properties

Uses RDFS for class/property membership assertions (ground facts) and XML Schema Datatypes

OWL specifications became W3C recommendations in February 2004

OWL 2 specifications were adopted in October 2009

Page 44: Ontology 101 - New York Semantic Technology Conference

General nature of descriptions

a WINE

General categories: a LIQUID, a POTABLE

Structured components:
• grape: chardonnay, ... [>= 1]
• sugar-content: dry, sweet, off-dry
• color: red, white, rose
• price: a PRICE
• winery: a WINERY

Interconnections between parts:
• grape dictates color (modulo skin)
• harvest time and sugar are related

Page 45: Ontology 101 - New York Semantic Technology Conference

General nature of descriptions

Class: a WINE

Superclass (general categories): a LIQUID, a POTABLE

Roles / properties (structured components), with value restrictions and number / cardinality restrictions:
• grape: chardonnay, ... [>= 1]
• sugar-content: dry, sweet, off-dry
• color: red, white, rose
• price: a PRICE
• winery: a WINERY

Interconnections between parts:
• grape dictates color (modulo skin)
• harvest time and sugar are related

Page 46: Ontology 101 - New York Semantic Technology Conference

Describing classes

A class is a concept in the domain
• Vintage – a wine made from grapes grown in a specified year
• A set of characteristics (flavor, body, color, sugar…)

A class is a collection of elements with similar properties
• White wine – wines made from white grapes
• White table wine – wines made from white grapes that are not appellations or regional (not “quality wine” in the EU)

A class expression defines necessary conditions for membership (specific varietal, field, micro-climate, date picked, date bottled)

Instances of classes
• Daniel Gehrs 2009 Merlot
• Robert M. Parker, Jr.’s Wine Advocate review, dated February 28, 2002
• Los Olivos, Santa Barbara County, California, USA

Page 47: Ontology 101 - New York Semantic Technology Conference

Class inheritance

Classes are organized into subclass-superclass (or generalization-specialization) hierarchies

True subclass relationships are the basis of a formal is-a hierarchy
• Classes are “is-a” related if an instance of the subclass is an instance of the superclass
• is-a hierarchies are essential for classification
• Violation of true is-a hierarchical relationships can have unintended consequences and cause errors in reasoning

Class expressions should be viewed as sets, and subclasses as subsets of the superclass, in contrast with a collection of attributes, as classes are sometimes specified in object oriented programming

Examples
• RedWine is a subclass of Wine: every red wine is a wine; every instance of a red wine (e.g., Foxen Cabernet Franc) is an instance of wine
• NapaValleyWine is a subclass of CaliforniaWine: every wine from Napa Valley is a wine from California

Page 48: Ontology 101 - New York Semantic Technology Conference

Class expressions

6 primary kinds of class expressions in OWL, 5 of them anonymous

Definitions can be nested using set-theoretic constructs

Page 49: Ontology 101 - New York Semantic Technology Conference

Example class hierarchy with other expressions

Example from ISO 1087 combining subclass-superclass and union expressions

Provides a more accurate representation of the relationships defined in the ISO standard than a strict is-a hierarchy would

Page 50: Ontology 101 - New York Semantic Technology Conference

Example class hierarchy with other expressions

Page 51: Ontology 101 - New York Semantic Technology Conference

Modeling considerations for class hierarchy development

Practitioners tend to start
• Top-down - define the most general concepts first and then specialize them
• Bottom-up - define the most specific concepts and then organize them into more general classes
• Combination (typical – breadth at the top level and depth along a few branches to test design)

Class inheritance is transitive: if A is a subclass of B and B is a subclass of C, then A is a subclass of C
• white wine and dessert wine are subclasses of wine
• sauvignon blanc is a subclass of white wine; late harvest wine is a subclass of dessert wine
• therefore late harvest sauvignon blanc, as a subclass of both sauvignon blanc and late harvest wine, is a subclass of white wine, dessert wine, & wine

Start with a more formal is-a approach, tease out whether other expressions are more appropriate based on use cases, formal definitions when available, etc.
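The transitivity argument on this slide amounts to following subclass links until no new ancestors appear. A minimal Python sketch, assuming an acyclic hierarchy; the CamelCase class names are illustrative spellings of the slide's wine examples and the function name `all_superclasses` is an assumption.

```python
from typing import Dict, Set

# Direct (asserted) subclass links, following the wine example on the slide.
DIRECT_SUPERCLASSES: Dict[str, Set[str]] = {
    "WhiteWine": {"Wine"},
    "DessertWine": {"Wine"},
    "SauvignonBlanc": {"WhiteWine"},
    "LateHarvestWine": {"DessertWine"},
    "LateHarvestSauvignonBlanc": {"SauvignonBlanc", "LateHarvestWine"},
}


def all_superclasses(cls: str) -> Set[str]:
    """Collect every ancestor of cls by following subclass links transitively.

    Assumes the hierarchy has no cycles; a cyclic hierarchy would recurse forever.
    """
    ancestors: Set[str] = set()
    for parent in DIRECT_SUPERCLASSES.get(cls, set()):
        ancestors.add(parent)
        ancestors |= all_superclasses(parent)
    return ancestors


inferred = all_superclasses("LateHarvestSauvignonBlanc")
```

Only two links are asserted for LateHarvestSauvignonBlanc; WhiteWine, DessertWine, and Wine follow from transitivity alone.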

Page 52: Ontology 101 - New York Semantic Technology Conference

Properties

Properties describe characteristics, features, or attributes of the members of a class
• Every flowering plant has a bloom color, color pattern, flower form, petal form, etc.

Classes of properties
• intrinsic: properties of the plant itself such as the optimal sunlight, soil conditions, expected height, and so forth
• extrinsic: properties imposed externally such as the grower and price
• whole-part relations
• geospatial, mereological relations: which microclimates this plant is well suited to, where in a particular historic garden you will find it

Data and object properties
• simple attributes typically have primitive data values (e.g., strings, numbers)
• complex properties refer to other entities (e.g., an individual grower, such as Monrovia, or an organization such as the American Orchid Society)

OWL DL reasoning requires strict separation of data and object properties

Page 53: Ontology 101 - New York Semantic Technology Conference

Properties in OWL 2

Page 54: Ontology 101 - New York Semantic Technology Conference

Domain & range properties

In OWL and many other KR languages, relations (properties) are strictly binary

The domain & range represent the source & target arguments, respectively, for the property

Domain – the class (or classes) that may have the property
• Wine is the domain of the property hasWineColor

Range – the class (or classes) defining valid property values
• everything that fulfills the hasWineColor property is an instance of the enumerated class {red, white, rose}

Some KR languages that inherently support n-ary relations, such as Common Logic (CL), do not make this distinction
• More flexible, intuitively more like mathematics, where functions have ranges (or return types) but not all relations are functions
• Requires additional relations to specify argument order, which can be critical for ontology alignment
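In RDFS and OWL, domain and range axioms are used to infer class membership rather than to reject data. The Python sketch below illustrates that reading under stated assumptions: the class name `WineColor` (standing in for the enumerated class {red, white, rose}), the individual `FoxenCabernetFranc`, and the function `infer_types` are all hypothetical names chosen for the example.

```python
from typing import Dict, List, Set, Tuple

# Domain and range declarations for the slide's hasWineColor property.
DOMAIN = {"hasWineColor": "Wine"}
RANGE = {"hasWineColor": "WineColor"}  # hypothetical name for the class {red, white, rose}


def infer_types(triples: List[Tuple[str, str, str]]) -> Dict[str, Set[str]]:
    """Domain/range entailment: each subject gets the domain class, each object the range class."""
    types: Dict[str, Set[str]] = {}
    for subject, prop, obj in triples:
        if prop in DOMAIN:
            types.setdefault(subject, set()).add(DOMAIN[prop])
        if prop in RANGE:
            types.setdefault(obj, set()).add(RANGE[prop])
    return types


types = infer_types([("FoxenCabernetFranc", "hasWineColor", "red")])
```

Nothing stated that FoxenCabernetFranc is a Wine; the domain axiom alone licenses that conclusion. This is why an overly narrow domain or range tends to produce surprising inferences rather than validation errors.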

Page 55: Ontology 101 - New York Semantic Technology Conference

Class & property inheritance

A subclass inherits all the properties from its super class
• If a wine has a name and flavor, a red wine also has a name and flavor

If a class has multiple super classes, it inherits properties and restrictions from all of them
• Latin is both an individual language and an ancient language. It inherits “has attested literature: true” from ancient language and “has indigenous name: lingua latina” from individual language

Page 56: Ontology 101 - New York Semantic Technology Conference

Class expressions that restrict property values

Number restrictions describe or limit the number of possible values a particular property can have
• A language must be associated with at least one English name and at least one French name
• A language may be associated with zero or more Indigenous names

Cardinality – similar meaning to classical set theory, measures the number of elements in the set (restriction class)
• Cardinality – cardinality N means the class defined by the property restriction must have exactly N values (individual or literal values)
• Minimum cardinality - 1 means that there must be at least one value (required), 0 means that the value is optional
• Maximum cardinality - 1 means that there can be at most one value (single-valued), N means that there can be up to N values (N > 1, multi-valued)
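The three kinds of number restriction above reduce to comparing a value count against a lower and an upper bound. The Python sketch below shows that arithmetic only. It deliberately takes a closed-world view (all values are listed, and different names denote different values), which an OWL reasoner does not assume; the function name `satisfies_cardinality` and the sample data are assumptions for illustration.

```python
from typing import Optional, Set


def satisfies_cardinality(values: Set[str], minimum: int = 0, maximum: Optional[int] = None) -> bool:
    """Check a value count against min/max bounds; exact cardinality N is minimum == maximum == N."""
    count = len(values)
    return count >= minimum and (maximum is None or count <= maximum)


english_names = {"Latin"}
indigenous_names: Set[str] = set()

has_english_name = satisfies_cardinality(english_names, minimum=1)             # at least one
indigenous_ok = satisfies_cardinality(indigenous_names)                        # zero or more
single_valued = satisfies_cardinality({"red", "white"}, minimum=1, maximum=1)  # exactly one
```

Under the open-world assumption a reasoner would not report a missing English name as a violation; it would conclude that one exists but is not yet known.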

Page 57: Ontology 101 - New York Semantic Technology Conference

Simple number restriction examples

Creating a class of individuals that have exactly one color & a class of individuals that have more than one color

Note that the relationship between SingleColoredThing and the restriction class is modeled as an equivalence relationship, meaning that membership in the restriction class is a necessary and sufficient condition for being a SingleColoredThing

<owl:ObjectProperty rdf:about="&re;hasColor">
  <rdfs:range rdf:resource="&re;Color"/>
  <rdfs:domain rdf:resource="&re;ColoredThing"/>
</owl:ObjectProperty>

<owl:Class rdf:about="&re;Color"/>
<owl:Class rdf:about="&re;ColoredThing"/>

<!-- more than one color: at least 2 values for hasColor -->
<owl:Class rdf:about="&re;MultiColoredThing">
  <owl:equivalentClass>
    <owl:Restriction>
      <owl:onProperty rdf:resource="&re;hasColor"/>
      <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">2</owl:minCardinality>
    </owl:Restriction>
  </owl:equivalentClass>
</owl:Class>

<!-- exactly one color -->
<owl:Class rdf:about="&re;SingleColoredThing">
  <owl:equivalentClass>
    <owl:Restriction>
      <owl:onProperty rdf:resource="&re;hasColor"/>
      <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:cardinality>
    </owl:Restriction>
  </owl:equivalentClass>
</owl:Class>

Page 58: Ontology 101 - New York Semantic Technology Conference

Qualified cardinality number restriction example

OWL 2 provides the capability to further qualify cardinality by specific classes and data ranges

Patterns that reuse a smaller number of significant properties, building up more complex class expressions that limit the set of valid values, are common and useful in complex classification systems

Page 59: Ontology 101 - New York Semantic Technology Conference

Qualified cardinality number restriction example

<owl:ObjectProperty rdf:about="&re;hasCharacteristic">
  <rdfs:range rdf:resource="&re;Characteristic"/>
</owl:ObjectProperty>

<owl:Class rdf:about="&re;BloomColor">
  <rdfs:subClassOf rdf:resource="&re;Characteristic"/>
  <rdfs:subClassOf rdf:resource="&re;Color"/>
</owl:Class>

<owl:Class rdf:about="&re;Characteristic"/>

<owl:Class rdf:about="&re;Color"/>

<!-- at least one hasCharacteristic value must be a BloomColor -->
<owl:Class rdf:about="&re;FloweringPlantType">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="&re;hasCharacteristic"/>
      <owl:onClass rdf:resource="&re;BloomColor"/>
      <owl:minQualifiedCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minQualifiedCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>

Page 60: Ontology 101 - New York Semantic Technology Conference

Class expressions that restrict possible values by type

Universal (allValuesFrom) and existential (someValuesFrom) quantification
• allValuesFrom is the OWL equivalent of (∀x), and restricts the possible values for a property to members of a particular class or data range
• someValuesFrom is the OWL equivalent of (∃x), and requires that at least one value for a property be a member of a particular class or data range

Specific value (hasValue) restrictions
• a single data value (e.g., the color property for a RedWine must be filled with the value “red”)
• an individual member of a class (e.g., Winery is the value restriction on the hasMaker property on the class Wine)

Enumerations – lists of allowable individuals or data elements

Combinations of the above that include both restrictions and boolean class expressions (union – logical or, intersection – logical and, complement – logical not)
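The difference between universal and existential restrictions maps directly onto Python's `all()` and `any()`. The sketch below is illustrative only and checks explicitly listed values, closed-world style; the color names in `RHS_COLORS` and `ASA_COLORS` are invented placeholders, not actual Royal Horticultural Society or Azalea Society of America codes.

```python
from typing import Set

RHS_COLORS = {"RHS-Red", "RHS-White"}  # hypothetical members of RHSColor
ASA_COLORS = {"ASA-Pink"}              # hypothetical members of ASAColor


def all_values_from(values: Set[str], allowed: Set[str]) -> bool:
    """Universal restriction: every value present must be in the class (vacuously true if none)."""
    return all(v in allowed for v in values)


def some_values_from(values: Set[str], allowed: Set[str]) -> bool:
    """Existential restriction: at least one value must be in the class."""
    return any(v in allowed for v in values)


azalea_colors = {"RHS-Red", "ASA-Pink"}

only_rhs_or_asa = all_values_from(azalea_colors, RHS_COLORS | ASA_COLORS)  # union of two classes
only_rhs = all_values_from(azalea_colors, RHS_COLORS)
some_rhs = some_values_from(azalea_colors, RHS_COLORS)
```

Note the vacuous case: an individual with no values at all satisfies allValuesFrom but not someValuesFrom, which is why the two are often combined in one class expression.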

Page 61: Ontology 101 - New York Semantic Technology Conference

Class expression restricting all values for a given property

Azaleas are members of the set of flowering plants whose bloom color must be either an RHSColor (Royal Horticultural Society) or ASAColor (Azalea Society of America)

* courtesy Azalea Society of America

Page 62: Ontology 101 - New York Semantic Technology Conference

And the OWL for that …

<owl:Class rdf:about="&re;Azalea">
  <rdfs:subClassOf rdf:resource="&re;FloweringPlantType"/>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="&re;hasBloomColor"/>
      <owl:allValuesFrom>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <rdf:Description rdf:about="&re;ASAColor"/>
            <rdf:Description rdf:about="&re;RHSColor"/>
          </owl:unionOf>
        </owl:Class>
      </owl:allValuesFrom>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>

* courtesy Azalea Society of America

Page 63: Ontology 101 - New York Semantic Technology Conference

Another typical pattern combining restrictions & boolean connectives

Use of equivalence expressions like this, describing RHSFloweringPlantType as a flowering plant and something whose bloom colors must be from the set of colors defined by the Royal Horticultural Society, is a common pattern

Page 64: Ontology 101 - New York Semantic Technology Conference

And the corresponding OWL for that …

<owl:Class rdf:about="&re;RHSFloweringPlantType">
  <owl:equivalentClass>
    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <rdf:Description rdf:about="&re;FloweringPlantType"/>
        <owl:Restriction>
          <owl:onProperty rdf:resource="&re;hasBloomColor"/>
          <owl:allValuesFrom rdf:resource="&re;RHSColor"/>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
  </owl:equivalentClass>
</owl:Class>

Page 65: Ontology 101 - New York Semantic Technology Conference

Class expression restricting some & exact values for a given property

Page 66: Ontology 101 - New York Semantic Technology Conference

And the OWL for that …

<owl:Class rdf:about="&re;Azalea">
  <rdfs:subClassOf rdf:resource="&re;FloweringPlantType"/>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="&re;hasBloomColor"/>
      <owl:allValuesFrom>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <rdf:Description rdf:about="&re;ASAColor"/>
            <rdf:Description rdf:about="&re;RHSColor"/>
          </owl:unionOf>
        </owl:Class>
      </owl:allValuesFrom>
    </owl:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="&re;hasBloomColorPattern"/>
      <owl:someValuesFrom rdf:resource="&re;ASAColorPattern"/>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>

Page 67: Ontology 101 - New York Semantic Technology Conference

...
<owl:Class rdf:about="&re;SingleColoredAzalea">
  <owl:equivalentClass>
    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <owl:Restriction>
          <owl:onProperty rdf:resource="&re;hasBloomColorPattern"/>
          <owl:hasValue rdf:resource="&re;Solid"/>
        </owl:Restriction>
        <owl:Restriction>
          <owl:onProperty rdf:resource="&re;hasBloomColor"/>
          <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:cardinality>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
  </owl:equivalentClass>
  <rdfs:subClassOf rdf:resource="&re;Azalea"/>
</owl:Class>

<owl:NamedIndividual rdf:about="&re;Solid">
  <rdf:type rdf:resource="&re;ColorPattern"/>
</owl:NamedIndividual>

Page 68: Ontology 101 - New York Semantic Technology Conference

Individuals and data ranges

An individual (instance, object in other paradigms)
• Any class that an individual is a member of, or is an individual of, is a type of the individual
• Any superclass of a class is an ancestor of (or type of) the individual

Specify property values for the individual
• Property values should conform to the constraints such as range, value type, cardinality restrictions, etc.

Enumerated classes & data ranges are commonly used to specify the complete set of valid values for a particular property

Page 69: Ontology 101 - New York Semantic Technology Conference

Example enumerated class of individuals

Page 70: Ontology 101 - New York Semantic Technology Conference

Class axioms

Subsumption (necessary): A ⊆ B, where B is a class description
• A is a partial or primitive class

Definition (necessary and sufficient): C ≡ D, where D is a class description
• C is a complete or defined class

Courtesy Evan Wallace, NIST

Page 71: Ontology 101 - New York Semantic Technology Conference

Meta-properties – global cardinality restrictions

Functional

Inverse Functional

[Figure: Domain-to-Range mapping diagrams for functional and inverse functional properties]

Courtesy Evan Wallace, NIST

Page 72: Ontology 101 - New York Semantic Technology Conference

Disjoint classes

Classes are disjoint if they cannot have common instances

Disjoint classes cannot have any common subclasses

If winery and wine are disjoint, then there is no instance that is both a winery and a wine; there is no class that is both a subclass of winery and a subclass of wine

Disjointness is often used to aid consistency checking

Disjointness is also helpful in teasing out subtle distinctions among classes across multiple ontologies

Equivalence is also often used to identify the same concepts across ontologies that may be named differently, or to name classes defined through class axioms
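How disjointness aids consistency checking can be sketched as a simple scan for individuals typed with two disjoint classes. This Python example is a simplified stand-in for what a reasoner does: the function name `disjointness_violations` and the deliberately inconsistent individual `Foxen` are assumptions for illustration, and subclass inference is left out to keep the sketch short.

```python
from typing import Dict, List, Set, Tuple

# Declared disjointness axioms, following the slide's winery/wine example.
DISJOINT: List[Tuple[str, str]] = [("Winery", "Wine")]


def disjointness_violations(types: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Report (individual, class_a, class_b) wherever an individual belongs to two disjoint classes."""
    violations = []
    for individual, classes in sorted(types.items()):
        for a, b in DISJOINT:
            if a in classes and b in classes:
                violations.append((individual, a, b))
    return violations


asserted = {
    "MariettaOldVinesRed": {"Wine"},
    "Foxen": {"Winery", "Wine"},  # deliberately inconsistent for the example
}

violations = disjointness_violations(asserted)
```

Without the disjointness axiom, typing Foxen as both a winery and a wine would be perfectly consistent, and the modeling error would go unnoticed.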

Page 73: Ontology 101 - New York Semantic Technology Conference

New features in OWL 2

New property constructs
• Reflexive, irreflexive, & asymmetric properties
• Property chains
• Disjoint properties
• Keys
• Self-reflexive properties
• Qualified cardinality restrictions
• Negative property assertions

New class axioms
• Disjoint unions
• Disjoint classes

Page 74: Ontology 101 - New York Semantic Technology Conference

More new features

Extended datatype handling
• Better support for XML Schema Datatypes
• New datatypes for OWL specifically (owl:real, owl:rational, rdf:PlainLiteral)
• Custom datatype definition & use of xsd facets / datatype restrictions

New data range restrictions
• Intersections, unions, complements

Support for punning
• Limited support for a particular element to be defined as both a class & individual, for example

Improved support for annotations (e.g., metadata about ontologies, provenance, transformation detail, etc.)

See http://www.w3.org/TR/owl2-new-features/ & http://www.w3.org/TR/owl2-quick-reference/ for detail

Current work specifically focused on provenance: http://www.w3.org/2011/prov/wiki/Main_Page

Page 75: Ontology 101 - New York Semantic Technology Conference

Rules of Thumb

Cycles are common in many KR systems, though rarely “a good thing”

Cycles are disallowed by some tools because they prohibit “code generation”, export -- including RDF/OWL

If classes A, B, and C form a subclass cycle, they have equivalent sets of instances
• By many definitions, A, B, and C are equivalent
• Use owl:equivalentClass instead of creating cycles
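A subclass cycle can be detected by checking which classes are both ancestors and descendants of one another; those are the classes that should be declared equivalent instead. The Python sketch below is one simple way to do that, assuming a small hierarchy where A, B, and C form a cycle and a hypothetical class D sits below A. The function names are illustrative.

```python
from typing import Dict, Set

# A -> B -> C -> A is a subclass cycle; D is an ordinary subclass of A (hypothetical data).
SUBCLASS_OF: Dict[str, Set[str]] = {"A": {"B"}, "B": {"C"}, "C": {"A"}, "D": {"A"}}


def reachable(start: str) -> Set[str]:
    """All classes reachable from start by following subclass links (safe on cyclic graphs)."""
    seen: Set[str] = set()
    pending = [start]
    while pending:
        for parent in SUBCLASS_OF.get(pending.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                pending.append(parent)
    return seen


def equivalent_by_cycle(cls: str) -> Set[str]:
    """Classes that are both superclasses and subclasses of cls, so must share its instances."""
    return {other for other in reachable(cls) if cls in reachable(other)} | {cls}


cycle_members = equivalent_by_cycle("A")
not_in_cycle = equivalent_by_cycle("D")
```

Any group larger than one class is a candidate for replacing the cycle with explicit owl:equivalentClass axioms, which states the intent directly and keeps the hierarchy exportable.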

Page 76: Ontology 101 - New York Semantic Technology Conference

Class hierarchy analysis

Siblings should be specified at roughly the same level of generality -- compare to section and subsections in a book

Page 77: Ontology 101 - New York Semantic Technology Conference

More on class definition analysis

Classes that have single subclasses may be incompletely defined, unless additional children are specified in subordinate modules

Subclasses of a class typically have more detailed definitions
• Additional properties
• Constraints/restrictions on inherited properties
• Participate in further qualified relations
• Compare to bullets in a bulleted list

If a class has a large number of subclasses, it may be useful to define intermediate levels
• For wine, for example, there are natural groupings around wine color
• If no natural classification exists, the long list may be appropriate

Page 78: Ontology 101 - New York Semantic Technology Conference

Inheritance, naming, synonyms

A “wine” is not a subclass of “wines”

A particular vintage should be classified as an instance of the class Wines

Class names should be either all singular or all plural

Synonymous names for the same concept should be modeled as alternate designations, not as unique classes (in the same ontology)

Many systems & standards facilitate synonymous term representation (e.g., ISO 1087)

OWL allows defining necessary and sufficient conditions, thereby allowing synonym definitions to be “first class” terms (as appropriate, through equivalence / same as relations)

[Figure: a Class and an Instance linked by instance-of; the instance shown is MariettaOldVinesRed]

Page 79: Ontology 101 - New York Semantic Technology Conference

Class vs. property value

Are concepts with distinct property values used in restrictions on other properties?

How important is the distinction for the domain?

Class definitions for most domains should be fairly stable – i.e., they should not change frequently once the definitions are established and individuals created

Page 80: Ontology 101 - New York Semantic Technology Conference

Class vs. individual

Individuals are the most specific entities in an ontology

If concepts form a natural hierarchy, represent them as classes

If they might have individuals, represent them as classes

Page 81: Ontology 101 - New York Semantic Technology Conference

Limiting scope

An ontology should not contain all the possible information about the domain
• No need to specialize or generalize more than the application requires
• No need to include all possible properties of a class
  • Only the most salient properties
  • Only the properties that the application requires

Ontologies of wine, food, and their pairings probably will not include details such as:
• Bottle size (half bottle, full bottle, magnum, …)
• Label color
• Wine bottle color (green, amber, …)

Page 82: Ontology 101 - New York Semantic Technology Conference

Part 3: Putting It All Together

Ontology-powered applications (both lightweight and richer applications) Wine Agent Semantically-enabled Environmental Monitoring Population Science Health Data Exploration Scientific Observations Virtual Observatories Cognitive Assistant, Learning, and Explanation

Technical Underpinnings Understanding, Trust, & Proof: InferenceWeb Semantic Methodology Checking: Syntax, Consistency, Evaluation Services Linked Data

Questions

Page 83: Ontology 101 - New York Semantic Technology Conference

Ontology-enhanced applications

Simple ‘ontologies’, taxonomy driven applications
• Controlled, shared vocabularies (content management, faceted search) – e.g., PopSciGrid
• Web site organization & navigation, expectation setting
• Browsing & search engine optimization via mapping to tagged structures such as the new schema.org from Yahoo!, Google & Bing, GoodRelations ontology – see http://purl.org/goodrelations/
• “Smarter” search support – Wine Agent, Environmental portal

More expressive ontologies provide more options for
• Configuration, e.g., AT&T PROSE/Questar
• Data Integration, e.g., Virtual Observatories (VSTO, …)

Page 84: Ontology 101 - New York Semantic Technology Conference

Semantic Sommelier… ontology-driven, context-aware, mobile app

Semantically-enabled advisors utilize:
• Ontologies
• Reasoning
• Social
• Mobile
• Provenance
• Context

Patton, McGuinness, et al.

tw.rpi.edu/web/project/Wineagent

Page 85: Ontology 101 - New York Semantic Technology Conference

85

+TW Wine Agent – semantic interoperability

RDF & OWL for encoding wine/food listings and pairing recommendations

Semantic MediaWiki for publishing user-contributed recommendations*

Twitter and Facebook for posting recommendations

Pellet for deriving knowledge from wine ontology and recommendations

SPARQL for querying wine/food listings with recommendations

InferenceWeb for explaining Wine Agent’s recommendations

Wine Agent receives a meal description and retrieves a selection of matching wines available on the Web, using an ensemble of standards and tools

Page 86: Ontology 101 - New York Semantic Technology Conference

Social Semantic Web data publishing

Collaborative wine recommendations

Using Semantic MediaWiki to manage users and user-contributed food/wine pairing recommendations

Using a semantic form to add OWL instance data

Page 87: Ontology 101 - New York Semantic Technology Conference

Semantic Query: food hierarchy & recommendation

Page 88: Ontology 101 - New York Semantic Technology Conference

Semantic Sommelier

Previous versions used ontologies to infer descriptions of wines for meals and to query for wines

New version uses
• Context: GPS location, local restaurants and wine lists, user preferences
• Social input: Twitter, Facebook, Wiki, mobile, …

Sources vary in quality, contradictions exist, and maintenance is an issue… however, new models are emerging

Page 89: Ontology 101 - New York Semantic Technology Conference

Extensibility

Wine Agent Ontology is extensible

“Menu file” in RDF

Restaurants can introduce new classes and instances to provide better matches for users

Searches can be restricted to imported menus and/or imported wine lists

In process:

Group recommendations using multiple iPhones/iPod touches

Recommendations based on personal dietary needs and preferences

‘iPellet’, an OWL Reasoner designed for iPhone, for offline reasoning

Page 90: Ontology 101 - New York Semantic Technology Conference

Observations from the Wine Agent

Background knowledge is reasonably simple and built in OWL (includes foods and wine and pairing information similar to the original Living with CLASSIC paper, CLASSIC tutorial, OWL Guide, Ontology Engineering 101, …)

Background knowledge can be used for simple query expansion over wine sources to retrieve for example documents about red wines (including zinfandel, syrah, …)

Background knowledge used to interact with structured queries such as those possible on wine sites like wine.com

Constraints allow a reasoner like Pellet to infer consequences of the premises and query.

Explanation system (Inference Web) can provide provenance information such as information on the knowledge source (McGuinness’ wine ontology) and data sources (such as wine reviews or particular restaurants or wine stores)

Services work could allow automatic “matchmaking” instead of hand coded linkages with web resources
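The query-expansion observation above can be illustrated with a small sketch. This is not the Wine Agent's code: the class hierarchy, the function names, and the sample documents are all hypothetical, and plain substring matching stands in for whatever retrieval the real system uses. The idea shown is only that a query for a class is widened to include its subclasses before searching.

```python
# Hypothetical toy hierarchy; class names are illustrative and not taken
# from the actual wine ontology.
SUBCLASSES = {
    "Wine": ["RedWine", "WhiteWine"],
    "RedWine": ["Zinfandel", "Syrah"],
    "WhiteWine": ["Chardonnay"],
}


def expand_query(term, subclasses=SUBCLASSES):
    """Return the term plus all of its transitive subclasses, in discovery order."""
    expanded, stack = [], [term]
    while stack:
        current = stack.pop()
        if current in expanded:
            continue
        expanded.append(current)
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(subclasses.get(current, [])))
    return expanded


def search(documents, term):
    """Return documents that mention the term or any of its subclasses."""
    keywords = [k.lower() for k in expand_query(term)]
    return [d for d in documents if any(k in d.lower() for k in keywords)]


docs = [
    "A peppery Syrah from the Rhone",
    "Crisp Chardonnay with citrus notes",
    "Old-vine Zinfandel tasting notes",
]
red_docs = search(docs, "RedWine")
```

Here a search for "RedWine" retrieves the Syrah and Zinfandel documents even though neither mentions red wine by name.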

Page 91: Ontology 101 - New York Semantic Technology Conference

SemantEco/SemantAqua

Enable/Empower citizens & scientists to explore pollution sites, facilities, regulations, and health impacts along with provenance.

Demonstrates semantic monitoring possibilities.

Map presentation of analysis

Explanations and Provenance available

http://was.tw.rpi.edu/swqp/map.html and http://aquarius.tw.rpi.edu/projects/semantaqua

[Screenshot callouts]
1. Map view of analyzed results
2. Explanation of pollution
3. Possible health effect of contaminant (from EPA)
4. Filtering by facet to select type of data
5. Link for reporting problems

Page 92: Ontology 101 - New York Semantic Technology Conference

System Architecture

[Figure: system architecture diagram; recoverable labels: access, Virtuoso]

Page 93: Ontology 101 - New York Semantic Technology Conference

Ontology

Extends existing best practice ontologies, e.g. SWEET, OWL-Time.

Includes terms for relevant pollution concepts

Can be used to conclude: “any water source that has a measurement outside of its allowable range” is a polluted water source.

Portion of the SemantEco and SemantAqua ontologies.

Page 94: Ontology 101 - New York Semantic Technology Conference

Ontology

Regulation Ontology models the federal and state water quality regulations for drinking water sources

We can recognize pollution, e.g. “any water source that contains 0.01 mg/L of Arsenic or more is a polluted water source.”

Portion of Cal. Regulation Ontology.
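The arsenic example above amounts to a threshold rule. A minimal sketch of that rule in plain Python follows; it is not how the regulation ontology encodes it (the slides use OWL), and the `Measurement` type, the site names, and the readings are hypothetical. Only the 0.01 mg/L arsenic limit comes from the slide.

```python
from dataclasses import dataclass

# Limits in mg/L. The arsenic value is the one quoted on the slide; a real
# regulation ontology would supply limits per jurisdiction.
LIMITS_MG_PER_L = {"Arsenic": 0.01}


@dataclass
class Measurement:
    site: str
    contaminant: str
    value_mg_per_l: float


def polluted_sites(measurements, limits=LIMITS_MG_PER_L):
    """Return the set of sites with at least one measurement at or above its limit."""
    return {
        m.site
        for m in measurements
        if m.contaminant in limits and m.value_mg_per_l >= limits[m.contaminant]
    }


# Made-up readings for illustration only.
readings = [
    Measurement("well-A", "Arsenic", 0.02),
    Measurement("well-B", "Arsenic", 0.004),
    Measurement("well-C", "Arsenic", 0.01),
]
flagged = polluted_sites(readings)
```

Because the rule reads "0.01 mg/L or more", the comparison is inclusive, so a reading exactly at the limit is flagged.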

Page 95: Ontology 101 - New York Semantic Technology Conference

Querying

With OWL reasoning during query:

SELECT DISTINCT ?site ?lat ?lng
WHERE {
  ?site a pol:PollutedSite ;
        geo:lat ?lat ;
        geo:long ?lng .
}

OWL reasoning hides the complexity of the relationships allowing the developer (or user) to ask simple questions without requiring deep knowledge of environmental regulations
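To make the point about reasoning concrete, here is a small sketch under loud assumptions: the triples and the `ex:` names are invented, and simple rdfs:subClassOf propagation stands in for the OWL reasoning the portal uses. The same simple question returns nothing over the raw data and returns the site once inferred types are added.

```python
# Hypothetical triples; all ex: identifiers and coordinates are illustrative.
TRIPLES = {
    ("ex:site1", "rdf:type", "ex:ArsenicViolationSite"),
    ("ex:site1", "geo:lat", "41.7"),
    ("ex:site1", "geo:long", "-71.5"),
    ("ex:site2", "rdf:type", "ex:CleanSite"),
    ("ex:site2", "geo:lat", "42.7"),
    ("ex:site2", "geo:long", "-73.7"),
    ("ex:ArsenicViolationSite", "rdfs:subClassOf", "pol:PollutedSite"),
}


def materialize_types(triples):
    """Add inferred rdf:type triples by following rdfs:subClassOf to a fixpoint."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_of = [(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"]
        for s, p, o in list(inferred):
            if p != "rdf:type":
                continue
            for sub, sup in subclass_of:
                if o == sub and (s, "rdf:type", sup) not in inferred:
                    inferred.add((s, "rdf:type", sup))
                    changed = True
    return inferred


def polluted_site_locations(triples):
    """Mimic the slide's query: sites typed pol:PollutedSite that have lat and long."""
    def value(subject, predicate):
        return next((o for s, p, o in triples if s == subject and p == predicate), None)

    rows = []
    for site in sorted(s for s, p, o in triples if p == "rdf:type" and o == "pol:PollutedSite"):
        lat, lng = value(site, "geo:lat"), value(site, "geo:long")
        if lat is not None and lng is not None:
            rows.append((site, lat, lng))
    return rows


without_reasoning = polluted_site_locations(TRIPLES)
with_reasoning = polluted_site_locations(materialize_types(TRIPLES))
```

The caller never mentions arsenic or any regulation; the class hierarchy carries that knowledge, which is the design benefit the slide describes.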

Page 96: Ontology 101 - New York Semantic Technology Conference

Reflections

Uses semantic technologies along with simple ontologies and provenance-encoding tools

Was generated by students in a regular term class, then extended to become the basis of a master's degree (Wang) and the foundation of 3 other class projects

Small background extensible ontology

Now being expanded to include endangered species, migration, health effects, etc.

Page 97: Ontology 101 - New York Semantic Technology Conference

SONet: Scientific Observations Network

A Community-Driven Scientific Observations Network to achieve Semantic Interoperability of Environmental and Ecological Data

• Build a community-sanctioned, evolving data model for observational data
• Build a network of practitioners
• Build repository of motivating use cases

All to enable interoperability among existing data resources, supporting cross-disciplinary synthetic research in the environmental sciences

https://sonet.ecoinformatics.org

Schildhauer (1), Bowers (2), Gries (3), McGuinness (4), Dibner (5), Madin (6), Jones (1), Bermudez (7)

(1) NCEAS, UC Santa Barbara; (2) Gonzaga University; (3) NTL/LTER and Univ. of Wisconsin; (4) McGuinness & Associates; (5) OGC Interoperability Institute; (6) Macquarie University; (7) Southeastern Universities Research Association

Page 98: Ontology 101 - New York Semantic Technology Conference

Population Sciences Grid Goals

Convey complex health-related information to consumer and public health decision makers for community health impact

Leverage the growing evidence base for communicating health information on the Internet

Inform the development of future research opportunities effectively utilizing cyberinfrastructure for cancer prevention and control

Page 99: Ontology 101 - New York Semantic Technology Conference

Computer Science Perspective on PopSciGrid Goals

How can semantic technologies be used to integrate, present, and analyze data for a wide range of users?

Can tools allow lay people to build their own demos and support public usage and accurate interpretation?

How do we facilitate collaboration and “viral” applications?

Within PopSciGrid:
• Which policies (taxation, smoking bans, etc.) impact health and health care costs?
• What data should be displayed to help scientists and lay people evaluate related questions?
• What data might be presented so that people choose to make (positive) behavior changes?
• What does the data show? Why should someone believe that? What are appropriate follow-ups?

Page 100: Ontology 101 - New York Semantic Technology Conference

PopSciGrid Workflow

[Figure: workflow diagram; recoverable labels: CSV2RDF4LOD Direct, CSV2RDF4LOD Enhance, SemDiff, Archive, Visualize, Publish, Ban coverage; edge labels: derive, integrate, archive]

Page 101: Ontology 101 - New York Semantic Technology Conference

PopSciGrid: Conn

Extensible Mashups via Linked Data
• Diverse datasets from NIH
• Potentially linking to other content (e.g. “unemployment rate”)

Accountable Mashups via Provenance
• Annotate datasets used in demos
• Feedback users’ comment to gov contact (e.g. %)
• Annotation capabilities in process

http://logd.tw.rpi.edu/demo/trends_in_smoking_prevalence_tobacco_policy_coverage_and_tobacco_prices

Page 102: Ontology 101 - New York Semantic Technology Conference

PopSciGrid II

Page 103: Ontology 101 - New York Semantic Technology Conference

PopSciGrid II

Page 104: Ontology 101 - New York Semantic Technology Conference

Reflections

Successful, but…

What if we could allow data experts to build their own demos?

What if we could allow non-subject matter experts to function as subject-literate staff?

What if team members could interchange roles (and thus make contributions in other areas)?

What technological infrastructure is required?

Claim: all of this is being done now – but not at scale

Page 105: Ontology 101 - New York Semantic Technology Conference

Updates and Motivations from a Computer Science Perspective

Old:

Raw conversions

Per-dataset vocabularies

Custom queries

Custom data management code

Limited use because of Google Visualization licenses

State-level data

New:

Enhanced conversions

Vocabulary reuse

Generic queries

Re-usable data management code

Unlimited use of new open source visualization toolkit

State and county-level data

Page 106: Ontology 101 - New York Semantic Technology Conference

RDF Data Cube Vocabulary

For publishing multi-dimensional data, such as statistics, on the web in such a way that it can be linked to related data sets and concepts using RDF.

Compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange).

Also compatible with: SKOS, SCOVO, VoID, FOAF, Dublin Core Terms

Integrated with the LOGD data conversion infrastructure

Integrated with other tooling like Stats2RDF
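As a rough illustration of what "multi-dimensional data as linked data" means, the sketch below builds one observation per data point as a handful of triples. `qb:Observation` and `qb:dataSet` are terms from the RDF Data Cube vocabulary; every `ex:` name, the helper functions, and the figures are made up for this example and are not PopSciGrid data.

```python
def observation_triples(obs_id, dataset, dimensions, measures):
    """Build (subject, predicate, object) triples for one data cube observation."""
    triples = [
        (obs_id, "rdf:type", "qb:Observation"),
        (obs_id, "qb:dataSet", dataset),
    ]
    # Dimensions locate the observation (where, when); measures carry the value.
    triples += [(obs_id, prop, value) for prop, value in dimensions.items()]
    triples += [(obs_id, prop, value) for prop, value in measures.items()]
    return triples


def select(triples, predicate, obj):
    """Return the subjects that have the given predicate/object pair."""
    return sorted({s for s, p, o in triples if p == predicate and o == obj})


# Placeholder areas and values, not real statistics.
cube = []
cube += observation_triples(
    "ex:obs1", "ex:dataset1",
    {"ex:refArea": "ex:StateA", "ex:refPeriod": "2008"},
    {"ex:prevalencePercent": 12.5},
)
cube += observation_triples(
    "ex:obs2", "ex:dataset1",
    {"ex:refArea": "ex:StateB", "ex:refPeriod": "2008"},
    {"ex:prevalencePercent": 20.0},
)

state_a_observations = select(cube, "ex:refArea", "ex:StateA")
```

Because each dimension value is an identifier rather than a bare string, it can be linked to other datasets that describe the same area or period.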

Page 107: Ontology 101 - New York Semantic Technology Conference

County average life expectancy (Summary Measures of Health)

Page 108: Ontology 101 - New York Semantic Technology Conference

Reflections

Uses semantic technologies along with simple ontologies and provenance-encoding tools

Provides an operational specification for other policy exploration tools – initially tobacco data and related policies, now also physical education and nutrition policies in elementary, junior and senior high school (CLASS - http://class.cancer.gov/About.aspx ).

Similar model being used in the Tetherless SemantEco / SemantAqua / SemantAir Portal family

Page 109: Ontology 101 - New York Semantic Technology Conference

Why did I present wine, water, and linked data applications?

Wine advisor shows semantic technology in action making actionable recommendations

Water application shows a “typical” semantic integration web 3.0 application

Both of these styles are needed for many semantic applications and these features are realizable today

Next – they depend on
• Understandable and Actionable applications
• Semantic web stack
• Data availability – linked data cloud is growing
• Ontologies
• Semantic methodologies

Page 110: Ontology 101 - New York Semantic Technology Conference

Foundations: Trust & Understanding

If users (humans and agents) are to use, reuse, and integrate system answers, they must trust them.

System transparency supports understanding and trust.

Even simple “lookup” systems benefit from providing information about their sources.

Systems that manipulate information (with sound deduction or potentially unsound heuristics) benefit from providing information about their manipulations.

Goal: Provide interoperable infrastructure that supports explanations of sources, assumptions, and answers as an enabler for trust.

Page 111: Ontology 101 - New York Semantic Technology Conference

Foundations: Explanations, Proof Analysis

• Framework for explaining question answering tasks
• Stores and manages meta-information about proofs and explanations through a distributed repository
• Uses the Proof Markup Language (PML) for proof interchange (OWL-based)
• Services include proof presentation, strategy-based views, filtering, trust, combination, expansion, checking, search, and other capabilities
• Integrated browsing and display of PML documents from diverse sources
• Rewriting capabilities for improved understanding
• Multi-modal dialogue options including alternative strategies for presenting explanations, visualizations, and summaries

Page 112: Ontology 101 - New York Semantic Technology Conference

Foundations: Inference Web (IW)

Inference Web is a semantic web-based knowledge provenance management infrastructure:

• PML for encoding and interchanging provenance metadata in a distributed environment
• Interactive explanation services for end users
• Data access and analysis services for enriching the value of knowledge provenance

[Figure: Inference Web overview; recoverable labels: End Users; End-User Interaction Services (Explanation via Graph, Explanation via Summary, Explanation via Annotation); Data Access & Data Analysis Services (Validate published PML data, Access published PML data); Distributed PML data]

Page 113: Ontology 101 - New York Semantic Technology Conference

Foundations: Proof Markup Language (PML)

A new kind of linked data on the Web

Modularized & extensible
• Provenance: annotate provenance properties
• Justification: encodes provenance relations
• Trust: add trust annotation

Semantic Web based

[Figure: PML data attached to data items (D), distributed across Enterprise Webs and the World Wide Web]

Page 114: Ontology 101 - New York Semantic Technology Conference

Technical Architecture

[Figure: Inference Web technical architecture; recoverable component labels: Data Access, PML (provenance, justification, trust), IW Registrar, PML database, Proofs (N3, KIF), PML Translator, Web Service Interface, Edit Interface, PML API, PML Web Documents, IW Search, Computation Tools (validation, conflict detection, abstraction), Presentation Tools (IW Browser), PML Java Objects, domain experts, machine agents; edge labels: edit, call, generate, translate, is-stored-as, parse, harvests & indexes, subscribes, reads, uses, searches]

Page 115: Ontology 101 - New York Semantic Technology Conference

Making Systems Actionable with Knowledge Provenance

• Mobile Wine Agent
• GILA
• Combining Proofs in TPTP
• CALO
• Knowledge Provenance in Virtual Observatories
• Intelligence Analyst Tools

NOW including Data-gov

Page 116: Ontology 101 - New York Semantic Technology Conference

Example Proof Tree Browser

Page 117: Ontology 101 - New York Semantic Technology Conference

Explaining Extracted Entities

[Screenshot callouts]
• Source: fbi_01.txt (in NL)
• Source Usage: span from 01 to 78
• This extractor decided that Person_fbi-01.txt_46 is a Person and not Occupation (in logical format – e.g. KIF)
• Same conclusion from multiple extractors
• Conflicting conclusion from one extractor (with annotations)

Page 118: Ontology 101 - New York Semantic Technology Conference

Trustworthiness of Ramazi report

[Screenshot callouts: From CIA; From FBI; From intercepts]

Page 119: Ontology 101 - New York Semantic Technology Conference

Foundations: Web Layer Cake

[Figure: Semantic Web layer cake annotated with related work]
• SPARQL to XQuery translator
• RDFS materialization (Billion Triple winner)
• Govt metadata search
• Linked Open Govt Data
• SPARQL WG, earlier QL – OWL-QL, Classic’ QL, …
• OWL 1 & 2 WG: edited main OWL docs, quick reference, OWL profiles (OWL RL)
• Earlier languages: DAML, DAML+OIL, Classic
• RIF WG
• AIR accountability tool
• DL, KIF, CL, N3Logic
• Inference Web, Proof Markup Language, W3C Provenance Working Group formal model, W3C incubator group, …
• Inference Web IW Trust, AIR + Trust
• Visualization APIs
• S2S
• Govt Data
• Ontology repositories (Ontolingua), Ontology Evolution env: Chimaera, Semantic eScience Ontologies, MANY other ontologies
• Transparent Accountable Datamining Initiative (TAMI)

Page 120: Ontology 101 - New York Semantic Technology Conference

Foundations: Linked Data Cloud

Page 121: Ontology 101 - New York Semantic Technology Conference

Foundations: The Tetherless World Constellation Linked Open Government Data Portal

[Figure: TWC LOGD portal workflow; recoverable labels: Create, Convert, Enhance, Query/Access; TWC LOGD; LOGD SPARQL Endpoint; output formats RDF, RSS, JSON, XML, HTML, CSV, …; Community Portal; Data.gov deployment]

Page 122: Ontology 101 - New York Semantic Technology Conference

Semantic Web Methodology

Originally developed for VSTO, now in SSIII, SESDI, SESF, OOI …

McGuinness, Fox, West, Garcia, Cinquini, Benedict, Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. Proc. 19th Conf. on Innovative Applications of Artificial Intelligence (IAAI-07),

http://www.vsto.org

Page 123: Ontology 101 - New York Semantic Technology Conference

Virtual Solar Terrestrial Observatory – Setting for initial semantic methodology design

A distributed, scalable education and research environment for searching, integrating, and analyzing observational, experimental, and model databases – aimed at PhD researchers

Subject matter covers the fields of solar, solar-terrestrial and space physics

Provides virtual access to specific data, model, tool and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use

3-year NSF-funded project; initial deployment in year 1, multiple deployments by year 2; year 3 outreach and broadening

While aimed at one interdisciplinary area, it also serves as a replicable prototype for interdisciplinary virtual observatories

NSF follow-on for provenance extension (Semantic Provenance Capture in Data Ingest Systems) and generalization – Semantic eScience Framework

Page 124: Ontology 101 - New York Semantic Technology Conference


November 9, 2006

Virtual Observatory (VSTO)

General: Find data subject to certain constraints and plot appropriately

Specific: Plot the observed/measured Neutral Temperature as recorded by the Millstone Hill Fabry-Perot interferometer while looking in the vertical direction at any time of high geomagnetic activity in a way that makes sense for the data.

from vsto.org

Page 125: Ontology 101 - New York Semantic Technology Conference

Page 126: Ontology 101 - New York Semantic Technology Conference

Page 127: Ontology 101 - New York Semantic Technology Conference

Partial exposure of Instrument class hierarchy

Semantic filtering by domain or instrument hierarchy

Page 128: Ontology 101 - New York Semantic Technology Conference

Page 129: Ontology 101 - New York Semantic Technology Conference

Provenance aware faceted search

Page 130: Ontology 101 - New York Semantic Technology Conference

Additional work on provenance is being incorporated

Page 131: Ontology 101 - New York Semantic Technology Conference

Cognitive Assistant that Learns & Organizes

DARPA IPTO funded program

Personal office assistant, tasked with:
• Noticing things in the cyber and physical environments
• Aggregating what it notices, thinks, and does
• Executing, adding/deleting, suspending/resuming tasks
• Planning to achieve abstract objectives
• Anticipating things it may be called upon to do or respond to
• Interacting with the user
• Adapting its behavior in response to past experience, user guidance

22 participating organizations

Led to other efforts including Siri (later acquired by Apple and in the new iPhone)

Page 132: Ontology 101 - New York Semantic Technology Conference

Working with a Cognitive Assistant

CALO users need to
• Understand system behavior and responses
• Trust system reasoning and actions

To believe and act on recommendations from CALO, users need ways of exploring how and why the system acted, responded, recommended, and reasoned the way it did.

Additional wrinkle: CALO knowledge, behavior, and assumptions are constantly changing through several forms of machine learning.

A unified framework for explaining behavior and reasoning is essential for users to trust and adopt cognitive assistants.

Page 133: Ontology 101 - New York Semantic Technology Conference

ICEE Architecture

[Figure: ICEE architecture; recoverable component labels: Collaboration Agent, Justification Generator, Task Manager (TM), TM Wrapper, Explanation Dispatcher, Task State Database, TM Explainer, KM Explainer, Knowledge Manager (KM), Constraint Explainer, Constraint Reasoner]

Page 134: Ontology 101 - New York Semantic Technology Conference

The Use-Ask-Understand-Update Cycle

[Figure: cycle with stages Use, Ask, Understand / Accept, Update]

CALO explanation joint with Glass, Wolverton, Chang, Ding

Page 135: Ontology 101 - New York Semantic Technology Conference

Foundations: Consistency checking

Requirements are typically application specific

RDF vocabularies should be checked by an RDF validator or rule engine
• Jena Semantic Web Framework – http://jena.sourceforge.net/
• Pychinko, MindSwap’s Rete-based RDF-friendly rule engine (CWM clone) – http://www.mindswap.org/~katz/pychinko/

OWL ontologies should be run through a consistency checking reasoner
• Pellet (open source, originally from Mindswap, supported by Clark & Parsia, LLC) – http://clarkparsia.com/pellet/
• RacerPro – http://www.racer-systems.com/
• FaCT++ – http://owl.man.ac.uk/factplusplus/
• HermiT OWL Reasoner – http://www.hermit-reasoner.com/
• KAON2 infrastructure for managing OWL-DL, SWRL, and F-Logic ontologies – http://kaon2.semanticweb.org/
• VIStology's ConsVISor OWL Consistency checker – http://68.162.250.6:8080/consvisor/

OWL instance data may also be run through checking tools
• TW Instance data evaluation – http://onto.rpi.edu/demo/oie/

Page 136: Ontology 101 - New York Semantic Technology Conference

Example process

Ontologies should be checked to ensure consistency, limit potential for invalid conclusions

[Figure: example checking pipeline; recoverable labels: Valid RDF/XML Documents (Valid OWL Syntax Representation); Valid OWL (FOL consistency checked, deductive closure); Pellet OWL DL Reasoner; Valid OWL (DL consistency checked, concept satisfiability checked); TW Instance Data Evaluation; Ontology Registry & Library; Application Environment]

Page 137: Ontology 101 - New York Semantic Technology Conference

Inconsistencies caused by distributed OWL data

[Figure, reconstructed from the surviving labels: in the wine ontology, wine:EarlyHarvest and wine:LateHarvest are each rdfs:subClassOf wine:Wine and are owl:disjointWith each other. A new ontology adds wine:BadWineClass as rdfs:subClassOf both; a new instance data file adds wi:BadWineInstance with rdf:type of both.]

Case 1: semantic inconsistency caused by a new class.

Case 2: semantic inconsistency caused by a new instance.
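The two cases on this slide can be sketched as a simple check for memberships that entail both sides of a disjointness axiom. This is only a toy stand-in for what a DL reasoner such as Pellet does; the dictionaries below are a hand-written reading of the slide's diagram, and the function names are hypothetical. Strictly speaking, a reasoner would report the class in case 1 as unsatisfiable, and the ontology becomes inconsistent once such a class has an instance.

```python
# Hand-written reading of the slide's diagram (an assumption, not parsed OWL).
SUBCLASS_OF = {
    "wine:EarlyHarvest": {"wine:Wine"},
    "wine:LateHarvest": {"wine:Wine"},
    # Case 1: a class added by another ontology under both disjoint classes.
    "wine:BadWineClass": {"wine:EarlyHarvest", "wine:LateHarvest"},
}
DISJOINT = {frozenset({"wine:EarlyHarvest", "wine:LateHarvest"})}
TYPES = {
    # Case 2: an instance asserted to belong to both disjoint classes.
    "wi:BadWineInstance": {"wine:EarlyHarvest", "wine:LateHarvest"},
}


def ancestors(cls, subclass_of):
    """Return cls together with all of its transitive superclasses."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def clashes(classes, subclass_of, disjoint):
    """Return the disjoint pairs entailed together by the given class memberships."""
    entailed = set()
    for cls in classes:
        entailed |= ancestors(cls, subclass_of)
    return [pair for pair in disjoint if pair <= entailed]


unsatisfiable_classes = sorted(
    c for c in SUBCLASS_OF if clashes({c}, SUBCLASS_OF, DISJOINT)
)
inconsistent_instances = sorted(
    i for i, classes in TYPES.items() if clashes(classes, SUBCLASS_OF, DISJOINT)
)
```

Neither the wine ontology nor the added file is wrong on its own; the problem only appears when the distributed pieces are combined, which is why checking has to run over the merged data.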

Page 138: Ontology 101 - New York Semantic Technology Conference

TW OWL instance-data evaluation

Motivation
• Objective metrics are needed to evaluate if semantic web instance data is ready for practical use
• Next generation Chimaera for more diverse and more instance-oriented applications

Challenges
• Criteria for “ready for use” vary across applications
• Existing syntax and logical-consistency checking is not enough for applications using some standard practical assumptions (e.g. closed world, unique names, …)

Solution
• Extensible & customizable service-oriented architecture
• Integrated environment for evaluating issues beyond standard syntactic correctness and logical consistency

See http://onto.rpi.edu/demo/oie/

Page 139: Ontology 101 - New York Semantic Technology Conference

Some issues in OWL instance data

Missing property value (MPV)

Excessive property value (EPV)

[Figure, reconstructed from the surviving labels. MPV: W has rdf:type Wine; Wine is rdfs:subClassOf an owl:Restriction with owl:onProperty hasMaker and owl:cardinality "1". EPV: W has rdf:type Wine; Wine is rdfs:subClassOf an owl:Restriction with owl:onProperty hasColor and owl:cardinality "1"; W hasColor Red and hasColor White.]
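The MPV and EPV issues can be sketched as a cardinality check that counts only the values actually asserted, that is, under the closed-world and unique-names assumptions mentioned on the previous slide. This is not the TW evaluation service's code; the `CARDINALITY` table, the function name, and the sample instance are hypothetical.

```python
# Hypothetical exact-cardinality restrictions per class: property -> required count.
CARDINALITY = {"Wine": {"hasMaker": 1, "hasColor": 1}}


def evaluate_instance(instance_type, property_values, cardinality=CARDINALITY):
    """Report MPV/EPV issues, counting only asserted values.

    Assumes a closed world (unstated values are missing) and unique names
    (differently named values are different individuals).
    """
    issues = []
    for prop, required in sorted(cardinality.get(instance_type, {}).items()):
        count = len(set(property_values.get(prop, [])))
        if count < required:
            issues.append(("MPV", prop))
        elif count > required:
            issues.append(("EPV", prop))
    return issues


# A wine with two colors and no maker, as in the slide's examples.
issues = evaluate_instance("Wine", {"hasColor": ["Red", "White"]})
```

A standard OWL reasoner would not flag the missing maker, because under the open-world assumption the maker may simply be unstated; that gap is what this style of instance-data evaluation is meant to cover.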

Page 140: Ontology 101 - New York Semantic Technology Conference

Foundations: Linked Data Cloud

Page 141: Ontology 101 - New York Semantic Technology Conference

Foundations: The Tetherless World Constellation Linked Open Government Data Portal

[Figure: TWC LOGD portal workflow; recoverable labels: Create, Convert, Enhance, Query/Access; TWC LOGD; LOGD SPARQL Endpoint; output formats RDF, RSS, JSON, XML, HTML, CSV, …; Community Portal; Data.gov deployment]

Page 142: Ontology 101 - New York Semantic Technology Conference

Summary

• Introduced Semantic Technology Foundations
• Provided numerous examples of functioning systems with value propositions, many users, etc.
• Foundational technology is available from many sources

Questions

Page 143: Ontology 101 - New York Semantic Technology Conference

Questions?

Elisa Kendall
Partner, Thematix Partners LLC
(650) 960-2456
ekendall @ thematix.com
http://thematix.com

Deborah McGuinness
Tetherless World Constellation Chair, Professor of Computer Science and Cognitive Science
Rensselaer Polytechnic Institute and CEO, McGuinness Associates
(518) 276-4404
dlm @ cs.rpi.edu
http://tw.rpi.edu

Page 144: Ontology 101 - New York Semantic Technology Conference

Acronym Soup

CL – ISO 24707 Common Logic: a family of first order logic languages, including Conceptual Graphs & Common Logic Interchange Format – a successor to the Knowledge Interchange Format (KIF), http://cl.tamu.edu/

DAML – DARPA Agent Mark-up Language, one of the primary languages leading to the development of OWL, http://www.daml.org/

DARPA – Defense Advanced Research Projects Agency, http://www.darpa.mil/

DL – Description Logics: a subset of first order logic, for which tractable & complete reasoning systems are available

Linked Data and RDFa, http://www.w3.org/standards/techs/rdfa#w3c_all

MDA – Model-Driven Architecture, http://www.omg.org/mda/

ODM – Ontology Definition Metamodel, http://www.omg.org/spec/ODM/1.0/

OWL – W3C Web Ontology Language, OWL 2 specifications dated 27 October 2009, http://www.w3.org/standards/techs/owl#w3c_all

OWL-S – a set of OWL ontology components that extend the W3C OWL specifications to support Semantic Web Services, http://www.daml.org/services/owl-s/1.1/

Page 145: Ontology 101 - New York Semantic Technology Conference

More Acronym Soup

PML – Proof Markup Language – Provenance Interlingua, www.inference-web.org

PRR – Production Rules Representation, http://www.omg.org/spec/PRR/1.0/

RIF – Rule Interchange Format, http://www.w3.org/standards/techs/rif#w3c_all

RDF – Resource Description Framework, http://www.w3.org/TR/rdf-concepts/

Semantic Annotations for WSDL and XML (SAWSDL), http://www.w3.org/standards/techs/sawsdl#w3c_all

Simple Knowledge Organization System (SKOS), http://www.w3.org/standards/techs/skos#w3c_all

SPARQL – Semantic Web Query Language, http://www.w3.org/standards/techs/sparql#w3c_all

Semantic Web Rule Language (SWRL), http://www.w3.org/Submission/SWRL/

WSDL – Web Services Description Language

Page 146: Ontology 101 - New York Semantic Technology Conference

Extra

Page 147: Ontology 101 - New York Semantic Technology Conference

Foundations: Web Layer Cake

Page 148: Ontology 101 - New York Semantic Technology Conference

Semantic Web Methodology

Page 149: Ontology 101 - New York Semantic Technology Conference

Inference Web: Making Data Transparent and Actionable Using Semantic Technologies

How and when does it make sense to use smart system results & how do we interact with them?

• Knowledge Provenance in Virtual Observatories
• Hypothesis Investigation / Policy Advisors
• (Mobile) Intelligent Agents
• Intelligence Analyst Tools
• NSF Interops: SONET, SSIII – Sea Ice

Page 150: Ontology 101 - New York Semantic Technology Conference

The Semantic Web enables…

New models of intelligent services

E-commerce solutions

M-commerce

Web assistants

New forms of web assistants/agents that act on a human’s behalf, requiring less from humans and their communication devices…

• Semantic Technologies: ready for use
• Tools & tutorials available; deep apps future planning may benefit from consultants
• Context-aware, semantic apps are the future

More info: dlm @ cs.rpi.edu

Page 151: Ontology 101 - New York Semantic Technology Conference

Semantic Water Quality Portal / SemantAqua