formal ontology meets industry: best practices


Page 1: Formal Ontology Meets Industry: Best Practices

On some best practices in large-scale ontology development:

The Chronious Ontology Suite as a Case Study

Luc Schneider[1], Mathias Brochhausen[1], David Koepsell [2]

[1] Institute for Formal Ontology and Medical Information Science, Saarland University, Saarbrücken, Germany

[2] Technical University Delft, Delft, Netherlands

FOMI, TU Delft, Delft, July 7-8, 2011

Page 2: Formal Ontology Meets Industry: Best Practices

The importance of best practices in ontology development

• Methodological pluralism in ontology development reflects differing challenges and facilitates solutions adapted to the respective business requirements.

• However, such pluralism may lead to ad hoc selections of categorical criteria or design choices and to the adoption of kludges, as opposed to consistent adherence to foundational principles. Ad hoc solutions may impede both the development and the use of products.

• Applied ontologies based on sound design principles and theoretical foundations provide advantages compared to ontologies constructed in an ad hoc manner, namely consistency, re-usability and harmonisation.

Page 3: Formal Ontology Meets Industry: Best Practices

The CHRONIOUS Ontology Suite

• The CHRONIOUS Ontology Suite has been developed in the context of a European project aiming at the development of an integrated telemedical platform for monitoring the general health status of patients with chronic health conditions and providing decision support for clinicians.

• The CHRONIOUS Ontology Suite consists of three components:
– the Middle Layer Ontology for Clinical Care (MLOCC)
– the COPD (Chronic Obstructive Pulmonary Disease) ontology
– the CKD (Chronic Kidney Disease) ontology.

• MLOCC contains 476 classes, the COPD ontology 964 and the CKD ontology 972.

Page 4: Formal Ontology Meets Industry: Best Practices

Outline of the talk

• We illustrate a number of best practices in large-scale ontology construction by using examples drawn from the CHRONIOUS Ontology Suite.

• We will focus on three topics:
– realist upper level ontologies as a means for ex-ante harmonisation
– modularization of ontologies and reuse of pre-existing resources
– general design principles to ensure the consistency of taxonomies.

Page 5: Formal Ontology Meets Industry: Best Practices

Realist Upper Level Ontologies for Ex-Ante Harmonisation

Page 6: Formal Ontology Meets Industry: Best Practices

The rationale for realist top-level ontologies (1)

• To guarantee data exchange across heterogeneous sources, we must resort to ontologies that unify the different ways in which the domain is represented by the various end-users and information systems designers.

• Realist ontologies, i.e. ontologies that depict reality independently of how end users and knowledge engineers represent it mentally or digitally, provide a unified way of representing the domain from the start, without the need for an ex-post integration of heterogeneous perspectives.

Page 7: Formal Ontology Meets Industry: Best Practices

The rationale for realist top-level ontologies (2)

• Upper level ontologies help to adjust a system/service to an ever growing user-base and to access a growing number of heterogeneous data repositories.

• Upper level ontology development profits from the adoption of a realist view, since an ontology that represents reality itself, rather than one particular view of it, can accommodate all possible perspectives.

Page 8: Formal Ontology Meets Industry: Best Practices

Basic Formal Ontology

• Basic Formal Ontology (BFO) grows out of a philosophical orientation which overlaps with that of DOLCE and SUMO. Unlike these, however, it is narrowly focused on the task of providing a genuine upper ontology which can be used in support of domain ontologies developed for scientific research, as for example in biomedicine within the framework of the OBO Foundry.

• Thus BFO does not contain physical, chemical, biological or other terms which would properly fall within the special sciences domains. BFO is the upper level ontology upon which OBO Foundry ontologies are built.

Page 9: Formal Ontology Meets Industry: Best Practices

Basic Formal Ontology

Page 10: Formal Ontology Meets Industry: Best Practices

MLOCC as an extension of BFO

• The Middle Layer Ontology for Clinical Care (MLOCC) and thus the COPD and CKD ontologies have been built on top of BFO by appending the upper-level classes of MLOCC onto leaves of BFO. By linking MLOCC classes to BFO classes, the MLOCC classes are given a reality-driven semantics.
– The BFO-class Disposition subsumes the MLOCC-classes Disease and Malfunction.
– The BFO-class Object subsumes MLOCC-classes such as Organism, Chemical Substance, Institution and TechnicalObject (including devices and instruments).
– The BFO-class Process subsumes the MLOCC-classes Intentional Process and Natural Process; the first subsumes classes related to human and social activities, in particular medical (diagnostic and therapeutic) processes, while the second subsumes Chemical Process and Organismal Process.
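
The subsumption links listed on this slide can be sketched as a simple parent map. This is a minimal illustration in plain Python, not the OWL encoding used in the CHRONIOUS project; the names `PARENT`, `ancestors` and `is_subsumed_by` are hypothetical, and only the links named on the slide are included.

```python
# Parent map: each class points to the class that directly subsumes it.
# Only the links named on the slide are listed; the real suite has far more.
PARENT = {
    "Disease": "Disposition",
    "Malfunction": "Disposition",
    "Organism": "Object",
    "Chemical Substance": "Object",
    "Institution": "Object",
    "TechnicalObject": "Object",
    "Intentional Process": "Process",
    "Natural Process": "Process",
    "Chemical Process": "Natural Process",
    "Organismal Process": "Natural Process",
}


def ancestors(cls):
    """Return the chain of classes subsuming `cls`, nearest first."""
    chain = []
    while cls in PARENT:
        cls = PARENT[cls]
        chain.append(cls)
    return chain


def is_subsumed_by(cls, candidate):
    """True if `candidate` is `cls` itself or one of its ancestors."""
    return cls == candidate or candidate in ancestors(cls)


print(ancestors("Chemical Process"))
print(is_subsumed_by("Disease", "Disposition"))
```

Because every class has exactly one parent, each MLOCC class inherits its place in the BFO hierarchy by following a single chain upwards.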

Page 11: Formal Ontology Meets Industry: Best Practices

Excerpt from COPD: The branch Realizable Entity

Page 12: Formal Ontology Meets Industry: Best Practices

Avoiding design mistakes using upper-level ontologies

• A realist upper-level ontology provides constraints on the classification of domain entities. These restrictions prevent ontological mistakes that can lead to costly re-designs of domain ontologies. Some examples from BFO/MLOCC are:
– Occurrents do not participate in other occurrents. A process of heart beating does not undergo an increase, but a heart rate, which is the quality of an organism resulting from a heart beating, does.
– Dispositions, functions or roles do not participate in occurrents, but are realised by processes. The renal filtration function does not change, but is realised by the process of renal filtration which has a glomerular filtration rate as an outcome.
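
The two constraints above can be sketched as a check on which categories may stand on either side of a relation. This is a hypothetical illustration in plain Python: the category labels, the relation names `participates_in` and `realized_by`, and the helper `check` are simplifications introduced here, and allowing a quality as a participant follows the heart-rate example on the slide rather than any particular BFO release.

```python
# Hypothetical category assignment for the entities in the slide's examples.
CATEGORY = {
    "heart beating": "occurrent",
    "increase of heart rate": "occurrent",
    "heart rate": "quality",
    "renal filtration function": "realizable",
    "renal filtration": "occurrent",
}

# relation -> (allowed subject categories, allowed object categories)
RULES = {
    "participates_in": ({"object", "quality"}, {"occurrent"}),
    "realized_by": ({"realizable"}, {"occurrent"}),
}


def check(subject, relation, obj):
    """Return True if the assertion respects the category constraints."""
    subject_ok, object_ok = RULES[relation]
    return CATEGORY[subject] in subject_ok and CATEGORY[obj] in object_ok


# An occurrent may not participate in another occurrent.
print(check("heart beating", "participates_in", "increase of heart rate"))
# A function is realised by a process rather than participating in it.
print(check("renal filtration function", "realized_by", "renal filtration"))
```

Rejecting such assertions at authoring time is what spares the costly re-design mentioned on the slide.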

Page 13: Formal Ontology Meets Industry: Best Practices

Modularity and Re-use

Page 14: Formal Ontology Meets Industry: Best Practices

Perspectivalism, modularity and reusability

• Perspectivalism is the recognition that there are many representations of reality that are equally adequate because they capture different salient aspects of the same world.

• Reality can be assayed
– in terms of substances and their qualities or powers as well as in terms of processes;
– at various levels of granularity, ranging from the atomic and molecular levels to those of cells, tissues and organisms.

• Hence, reality cannot be accounted for in terms of a single monolithic ontology, but only in terms of a multitude of modular ontologies that are orthogonal to each other and thus are also re-usable.

Page 15: Formal Ontology Meets Industry: Best Practices

Modularity of the CHRONIOUS Ontology Suite (1)

• The CHRONIOUS ontologies are built on top of an established upper-level ontology, namely BFO, as far as classes are concerned.

• As to relations or object properties, we have used the Relation Ontology (RO) of the OBO Foundry, a set of formal relations that are used in biomedical applications. Aside from minor modifications, the object properties of the CHRONIOUS ontologies represent an extension of RO.

• The branch below the class Organismal Independent Continuant mirrors the structure and content of the Foundational Model of Anatomy (FMA), a reference ontology for the domain of anatomy.

Page 16: Formal Ontology Meets Industry: Best Practices

The Relation Ontology

Page 17: Formal Ontology Meets Industry: Best Practices

FMA classes in MLOCC

Page 18: Formal Ontology Meets Industry: Best Practices

Modularity of the CHRONIOUS Ontology Suite (2)

• The core of the Middle Layer Ontology for Clinical Care has been extracted from the ACGT Master Ontology.

• Moreover, the COPD Ontology and the CKD Ontology import MLOCC, which represents the common core of the chronic disease ontologies; each of them expands on it in domain-specific ways.

Modular Structure of the CHRONIOUS Ontology Suite
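
The modular structure described on the two modularity slides can be sketched as an import graph. This is a minimal illustration in plain Python under stated assumptions: the names `IMPORTS` and `import_closure` are hypothetical, and treating "built on top of BFO" and the extension of RO as imports by MLOCC is a simplification made here, not a statement about the actual OWL files.

```python
# Which module draws on which. COPD and CKD importing MLOCC is from the slide;
# modelling MLOCC's use of BFO and RO as imports is an assumption.
IMPORTS = {
    "COPD": ["MLOCC"],
    "CKD": ["MLOCC"],
    "MLOCC": ["BFO", "RO"],
    "BFO": [],
    "RO": [],
}


def import_closure(name):
    """Return every module reachable from `name` via imports, sorted."""
    seen = set()
    stack = list(IMPORTS[name])
    while stack:
        module = stack.pop()
        if module not in seen:
            seen.add(module)
            stack.extend(IMPORTS[module])
    return sorted(seen)


print(import_closure("COPD"))
print(import_closure("CKD"))
```

The two disease ontologies share the same closure and never reach each other, which is the sense in which MLOCC serves as their common core.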

Page 19: Formal Ontology Meets Industry: Best Practices

Some general principles for designing class taxonomies

Page 20: Formal Ontology Meets Industry: Best Practices

Formal constraints on taxonomies (1)

• Taxonomies should contain only types, not instances or tokens.
– However, information objects like questionnaires are ontologically tricky in this respect. Is Chronic Respiratory Disease Questionnaire a type or a token?

• Taxonomies are based on formal subsumption, i.e. subsumption ties have to be rigid and context-independent.
– "Tiotropium bromide is a bronchodilator drug" has to be interpreted as "a certain amount of tiotropium bromide is used as a bronchodilator". Being a bronchodilator is a role, i.e. an extrinsic feature of tiotropium bromide. Hence Tiotropium Bromide is not a subclass of Bronchodilator (and hence not of Drug), but a Chemical Substance that has the role of a Bronchodilator.
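
The distinction between rigid subsumption and role attribution can be sketched by keeping the two in separate fields. This is a hypothetical illustration in plain Python; the `OntologyClass` dataclass and the `can_play` helper are names introduced here, not part of MLOCC.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    name: str
    parent: str                               # rigid, context-independent subsumption
    roles: set = field(default_factory=set)   # extrinsic roles the class can bear


# Tiotropium Bromide is a Chemical Substance that has the role of a Bronchodilator.
TIOTROPIUM = OntologyClass(
    name="Tiotropium Bromide",
    parent="Chemical Substance",
    roles={"Bronchodilator"},
)


def can_play(cls, role):
    """True if `role` is attributed to `cls` (not a subclass claim)."""
    return role in cls.roles


print(TIOTROPIUM.parent)
print(can_play(TIOTROPIUM, "Bronchodilator"))
```

Keeping the role out of the `parent` field means the taxonomy stays valid even for amounts of the substance that are never used as a drug.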

Page 21: Formal Ontology Meets Industry: Best Practices

Formal constraints on taxonomies (2)

• Multiple inheritance of primitive classes should be avoided.
– TiotropiumBromide cannot be subsumed both by Bronchodilator and ChemicalSubstance. Instead of multiple inheritance, one should privilege intrinsic or formal subsumption on the one hand, and role attribution on the other.

• Sibling classes should be disjoint.

• UnknownX and other catch-all classes for remaining cases should be avoided. Such classes do not cut at a joint of reality, i.e. they represent not an ontological but an epistemological distinction.
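
Two of these constraints lend themselves to an automated check. The following is a minimal sketch in plain Python, assuming a draft taxonomy given as a map from each class to its asserted parents; the function `lint_taxonomy`, the class `UnknownDisease` and the "Unknown" name prefix test are hypothetical stand-ins for the UnknownX pattern on the slide.

```python
def lint_taxonomy(parents):
    """Return a sorted list of (class, problem) pairs for a draft taxonomy."""
    problems = []
    for cls, supers in parents.items():
        if len(supers) > 1:
            problems.append((cls, "multiple inheritance"))
        if cls.startswith("Unknown"):
            problems.append((cls, "catch-all class"))
    return sorted(problems)


# Hypothetical draft containing both kinds of mistake.
draft = {
    "TiotropiumBromide": ["Bronchodilator", "ChemicalSubstance"],
    "UnknownDisease": ["Disease"],
    "Organism": ["Object"],
}

for cls, problem in lint_taxonomy(draft):
    print(f"{cls}: {problem}")
```

A check of this kind only detects the symptoms; the repair for the first finding is the role attribution described on the previous slide.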

Page 22: Formal Ontology Meets Industry: Best Practices

Conclusion

• We have illustrated some best practices related to
– ex-ante harmonisation through realist upper-level ontologies,
– modularisation and re-use,
– general guidelines pertaining to the consistency of taxonomies.

• Our source of examples is the CHRONIOUS ontology suite, a large-scale bio-medical ontology development project.

• From our experience we can vouch for the effectiveness of the strategies and principles described above in streamlining ontology construction and ensuring the quality and adequacy of domain ontologies.

Page 23: Formal Ontology Meets Industry: Best Practices

Acknowledgements

Research leading up to the present article has been supported by the ICT-2007-1-216461 grant within the Seventh Framework Programme of the EU, as well as by a post-doc grant from the National Research Fund, Luxembourg (cofunded under the Marie Curie Actions of the European Commission [FP7-COFUND]), and has been carried out under subcontract to the Fraunhofer Institute for Biomedical Engineering, St. Ingbert (Germany).