COSMO and the Defining Vocabulary: Next Steps The Foundation Ontology as a Conceptual Defining Vocabulary Patrick Cassidy Ontology Summit 2007 April 23, 2007


COSMO and the Defining Vocabulary: Next Steps

The Foundation Ontology as a Conceptual Defining Vocabulary

Patrick Cassidy

Ontology Summit 2007

April 23, 2007

2

The Linguistic Defining Vocabulary

• Several dictionaries employ a restricted “defining vocabulary” which is used to define all of the words in the dictionary.

• For the Longman Dictionary of Contemporary English, the defining vocabulary is about 2000 base words

• An experiment was performed using the Longman vocabulary (expanded with its morphological variants and some foundation ontology concepts) to define 500 word senses not already in the base vocabulary

• It was necessary to add only 2 new words to the base vocabulary in order to recursively define the new terms with respect to the base vocabulary

• Since the words are labels for concepts, this suggests that a comparable and parallel conceptual defining vocabulary, in the form of a foundation ontology, could be used to specify the meanings of all specialized concepts, by combinations of the base concepts

3

Recursive Definition Example: ‘atomic nucleus’

Definition: the central part of an atom, containing most of the atom's mass, and containing all of the atom's protons[1] and neutrons[1].

Trace to base: [nucleus[1] -> [proton -> [subatomic -> base] [particle -> base] [charge[1] -> [attribute -> base] [molecule -> base] [subatomic -> base] [particle -> base]]] [neutron -> [subatomic -> base] [particle -> base] [charge[1] -> [attribute -> base] [molecule -> base] [subatomic -> base] [particle -> base]] [proton -> [subatomic -> base] [particle -> base] [charge[1] -> [attribute -> base] [molecule -> base] [subatomic -> base] [particle -> base]]]]]

(See http://colab.cim3.net/file/work/SICoP/ontac/reference/ControlledVocabulary/SupplementalVocabularyDefinitions.xls)
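
A trace like the one above can be produced mechanically. The following is a minimal Python sketch, not the procedure actually used in the experiment: the `BASE` set and `DEFINITIONS` table are hypothetical toy data chosen to mirror this slide, and the function names `trace_to_base` and `missing_base_words` are assumptions introduced for illustration.

```python
# Hypothetical toy data mirroring the slide; NOT the real Longman-based vocabulary.
BASE = {"subatomic", "particle", "attribute", "molecule"}

# Each non-base term maps to the content words used in its definition (assumed).
DEFINITIONS = {
    "nucleus[1]": ["proton", "neutron"],
    "proton": ["subatomic", "particle", "charge[1]"],
    "neutron": ["subatomic", "particle", "charge[1]", "proton"],
    "charge[1]": ["attribute", "molecule", "subatomic", "particle"],
}


def trace_to_base(term, definitions, base, _path=()):
    """Return a bracketed trace showing how `term` reduces to base words."""
    if term in base:
        return f"[{term} -> base]"
    if term in _path:
        # A term defined (directly or indirectly) by itself never reaches the base.
        raise ValueError(f"circular definition involving {term!r}")
    if term not in definitions:
        raise KeyError(f"{term!r} is neither a base word nor defined")
    parts = [
        trace_to_base(word, definitions, base, _path + (term,))
        for word in definitions[term]
    ]
    return f"[{term} -> " + " ".join(parts) + "]"


def missing_base_words(terms, definitions, base):
    """Collect words that are neither base nor defined.

    These are the candidates that would have to be added to the base vocabulary.
    """
    missing, seen, stack = set(), set(), list(terms)
    while stack:
        word = stack.pop()
        if word in seen or word in base:
            continue
        seen.add(word)
        if word in definitions:
            stack.extend(definitions[word])
        else:
            missing.add(word)
    return missing


if __name__ == "__main__":
    print(trace_to_base("nucleus[1]", DEFINITIONS, BASE))
    print(missing_base_words(["nucleus[1]"], DEFINITIONS, BASE))
```

Under these assumptions, `missing_base_words` plays the role of the count reported earlier: it lists the words that would have to be added to the base vocabulary before every new term can be defined recursively.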

4

How Large Will the Conceptual Defining Vocabulary Be?

• The Longman defining vocabulary has about 2000 words, many used in more than one sense. It is therefore likely that over 4000 concepts will be needed to form a conceptual vocabulary (a foundation ontology) comparable to the linguistic one.

• Development of the COSMO ontology can serve as an experiment to determine the minimum “conceptual defining vocabulary”

5

COSMO Ontology Development Plan

To develop a rules-based (FOL or quasi-second-order logic) ontology, with an OWL-Full version, focusing on the concepts needed to represent the concepts used in an English-language controlled defining vocabulary

Version 0.43 uses parts of:

• OpenCyc OWL version 0.78

• SUMO + MILO

• DOLCE

• BFO

– And is supplemented as necessary with the ultimate goal of allowing automatic conversion of linguistic definitions to the FOL form

– The FOL form will be definitive; the OWL form is maintained for applications that can use it

6

COSMO Supplementation

[Figure: plot of Number of Total Classes Added (start = 2509; y-axis, 0 to 550) against Number of Base Classes Added (x-axis, 0 to 275). Annotations on the plot: + 295, + 224, 128/256.]

7

COSMO Status

• Current version on ONTACWG site:

– See http://colab.cim3.net/cgi-bin/wiki.pl?CosmoWG

– About 200 classes

• Supplemented version in OWL (coming soon to COSMO site)

– Ca. 3100 classes

– 300 relations

• Supplementation will continue by representing words from the Linguistic Defining Vocabulary

• Conversion to FOL form (and/or adding rules to the OWL form) will be a high priority

8

COSMO Design Considerations

• Intended as a Common foundation ontology – must accommodate any desired concept, and relate it to the remainder

• Intended to have a convenient natural-language interface – common words should be represented in their most common sense where possible

9

Resulting Differences From Cyc: Example of Color

• Attributes in Cyc are now represented as objects *with* that attribute:

<owl:Class rdf:ID="RedColor">
  <rdf:type rdf:resource="#Color"/>
  <rdfs:subClassOf rdf:resource="#ColoredThing"/>
  <rdfs:subClassOf rdf:resource="#Reddish"/>
</owl:Class>

• Attributes in COSMO are represented as attributes that are related to Objects by any of several ‘hasAttribute’ relations (next slide)

10

COSMO Color Attributes: distinguished from substances and objects having that color

<owl:Class rdf:ID="Red">
  <rdfs:comment>The attribute of having a color sufficiently close to a pure red to be called 'red' rather than some variant such as 'maroon' or 'purple' or 'rose'. Still provides a wide range of actual colors.</rdfs:comment>
  <rdf:type rdf:resource="#ColorAttributeType"/>
  <rdfs:subClassOf rdf:resource="#Reddish"/>
</owl:Class>

<owl:Class rdf:ID="RedSubstance">
  <rdfs:comment>A PhysicalSubstance that is red.</rdfs:comment>
  <rdf:type rdf:resource="#SubstanceType"/>
  <rdfs:subClassOf rdf:resource="#PhysicalSubstance"/>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty>
        <owl:ObjectProperty rdf:about="#hasColor"/>
      </owl:onProperty>
      <owl:hasValue rdf:resource="#Red"/>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>

In KIF: (Every RedSubstance hasColor Red)
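
To make the link between the OWL restriction and the one-line statement concrete, here is a minimal Python sketch that reads the `RedSubstance` class and extracts the (class, property, value) triple behind "Every RedSubstance hasColor Red". It uses only the standard library's `xml.etree.ElementTree`. It is not a general RDF parser and handles only the nested form shown on this slide. The `rdf:RDF` wrapper, the `OWL_SNIPPET` constant, and the function name `has_value_restrictions` are assumptions added so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# Standard RDF, RDFS and OWL namespace URIs.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
RDF = "{" + NS["rdf"] + "}"

# The RedSubstance class from the slide, wrapped in an rdf:RDF root
# (the wrapper is an assumption so the fragment is well-formed by itself).
OWL_SNIPPET = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="RedSubstance">
    <rdfs:subClassOf rdf:resource="#PhysicalSubstance"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty>
          <owl:ObjectProperty rdf:about="#hasColor"/>
        </owl:onProperty>
        <owl:hasValue rdf:resource="#Red"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""


def has_value_restrictions(xml_text):
    """Return (class_id, property, value) for each owl:hasValue restriction."""
    root = ET.fromstring(xml_text)
    triples = []
    for cls in root.findall("owl:Class", NS):
        class_id = cls.get(RDF + "ID")
        for restriction in cls.findall("rdfs:subClassOf/owl:Restriction", NS):
            prop = restriction.find("owl:onProperty/owl:ObjectProperty", NS)
            value = restriction.find("owl:hasValue", NS)
            if prop is None or value is None:
                continue  # Not the nested hasValue form used on the slide.
            triples.append((
                class_id,
                prop.get(RDF + "about", "").lstrip("#"),
                value.get(RDF + "resource", "").lstrip("#"),
            ))
    return triples


if __name__ == "__main__":
    for cls, prop, val in has_value_restrictions(OWL_SNIPPET):
        print(f"(Every {cls} {prop} {val})")
```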

11

Other Differences with Cyc

• Representing Substances as abstract, rather than as the collection of all objects with that composition

– An object is ‘composed of’ a substance.

• Representing Events and Processes as disjoint classes (with deference to linguistic usage)

12

Differences with SUMO

• Use of reified Roles as well as role-based relations (e.g. Mother vs. relation ‘mother’)

• Use of Abstract Texts in addition to concrete textual objects

– Shakespeare wrote ‘Hamlet’ (the abstract conceptual work, not every physical copy)
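
The two distinctions above can be illustrated with a small data model. This is a hypothetical Python sketch, not COSMO's actual class hierarchy: the class names `Person`, `AbstractText`, `TextCopy`, and `MotherRole`, the function `mother_of`, and the sample individuals are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class AbstractText:
    """The conceptual work; authorship attaches here, not to the copies."""
    title: str
    author: Person


@dataclass(frozen=True)
class TextCopy:
    """A concrete textual object that instantiates an abstract text."""
    work: AbstractText
    location: str


@dataclass(frozen=True)
class MotherRole:
    """A reified role: an entity in its own right, so it can carry assertions."""
    player: Person
    child: Person


def mother_of(roles: Iterable[MotherRole], child: Person) -> Optional[Person]:
    """Derive the plain role-based relation 'mother' from reified roles."""
    for role in roles:
        if role.child == child:
            return role.player
    return None


if __name__ == "__main__":
    shakespeare = Person("William Shakespeare")
    hamlet = AbstractText("Hamlet", shakespeare)
    copies = [TextCopy(hamlet, "library shelf"), TextCopy(hamlet, "bookshop")]
    # Every physical copy points back to the single abstract work and its author.
    print({copy.work.author.name for copy in copies})

    ann, bob = Person("Ann"), Person("Bob")  # hypothetical individuals
    print(mother_of([MotherRole(player=ann, child=bob)], bob))
```

In this sketch, keeping both the `MotherRole` object and the derived `mother_of` lookup corresponds to using reified Roles alongside role-based relations.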

13

In General:

• In order to simplify the task of creating an integrated alignment of the linguistic defining vocabulary and the conceptual defining vocabulary, the COSMO ontology tries to use terms in their most common intuitive senses (where possible) and will focus on representing all of the terms that are used in (English) linguistic definitions, as suggested by the controlled defining vocabularies.

COSMO project:

• http://colab.cim3.net/cgi-bin/wiki.pl?CosmoWG