Biological Ontologies Neocles Leontis April 20, 2005


Page 1: Biological Ontologies

Biological Ontologies

Neocles Leontis, April 20, 2005

Page 2: Biological Ontologies

What Is An Ontology?

• An ontology is an explicit description of a domain of knowledge:
– Concepts -- Entities and Relations
– Properties and attributes of Entities and Relations
– Constraints on properties and attributes
– Individuals (“Instances”)

• An ontology defines:
– a common vocabulary
– a shared understanding of the domain of knowledge
– commitments on how to use the vocabulary

Page 3: Biological Ontologies

What Is “Ontology Engineering”?

Ontology Engineering: Defining terms in the domain and relations among them
– Defining concepts in the domain (Classes)
– Arranging the concepts in a hierarchy (Subclass-Superclass hierarchy)
– Defining which attributes and properties classes can have (slots) and constraints on their values (facets)
– Defining individuals and filling in slot values (instantiation)
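The four activities above (classes, hierarchy, slots with facets, instantiation) can be sketched in a few lines of Python. This is only an illustrative frame-style model, not how Protégé or any other tool implements them. The names `Slot`, `OntClass`, `Instance`, and the example molecule classes and values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Slot:
    """An attribute a class can have, with facets constraining its values."""
    name: str
    value_type: type          # facet: allowed type of the slot value
    required: bool = False    # facet: must every instance fill this slot?


@dataclass
class OntClass:
    """A concept in the domain, placed in a subclass-superclass hierarchy."""
    name: str
    parent: Optional["OntClass"] = None
    slots: Dict[str, Slot] = field(default_factory=dict)

    def all_slots(self) -> Dict[str, Slot]:
        """Collect slots inherited from superclasses plus the class's own."""
        inherited = self.parent.all_slots() if self.parent else {}
        return {**inherited, **self.slots}


@dataclass
class Instance:
    """An individual: a class plus filled-in slot values (instantiation)."""
    of: OntClass
    values: Dict[str, Any]

    def __post_init__(self) -> None:
        # Check every supplied value against the facets of its slot.
        slots = self.of.all_slots()
        for name, value in self.values.items():
            if name not in slots:
                raise ValueError(f"{self.of.name} has no slot {name!r}")
            if not isinstance(value, slots[name].value_type):
                raise TypeError(f"slot {name!r} expects {slots[name].value_type.__name__}")
        missing = [s.name for s in slots.values()
                   if s.required and s.name not in self.values]
        if missing:
            raise ValueError(f"missing required slots: {missing}")


# Hypothetical classes, slots, and values, chosen only for illustration.
molecule = OntClass("Molecule", slots={"name": Slot("name", str, required=True)})
rna = OntClass("RNA", parent=molecule, slots={"length": Slot("length", int)})
trna = Instance(rna, {"name": "tRNA-Phe", "length": 76})
```

Here `RNA` inherits the `name` slot from `Molecule`, and creating an `Instance` fails if a value violates a facet (wrong type, unknown slot, or a missing required slot).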

Page 4: Biological Ontologies

Why Develop an Ontology?

• To share common understanding of the structure of information
– among people
– among software agents

• To enable reuse of domain knowledge
– to avoid “re-inventing the wheel”
– to introduce standards to allow interoperability between ontologies

Page 5: Biological Ontologies

More Reasons…

• To make domain assumptions explicit
– easier to change domain assumptions
– easier to understand and update legacy data

• To separate domain knowledge from the operational knowledge
– re-use domain and operational knowledge separately

Page 6: Biological Ontologies

An Ontology Is Often Just the Beginning

[Diagram: Ontologies linked to Databases, Knowledge bases, and Problem-solving methods, with arrows labeled “Declare structure” and “Provide domain description”]

Page 7: Biological Ontologies

Ontology-Development Process

In logical order:
determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances

In reality, an iterative process:
determine scope → consider reuse → enumerate terms → define classes → consider reuse → enumerate terms → define classes → define properties → create instances → define classes → define properties → define constraints → create instances → define classes → consider reuse → define properties → define constraints → create instances

Page 8: Biological Ontologies

Protégé

• Graphical ontology-development tool
• Supports a rich knowledge model
• Open-source and freely available (http://protege.stanford.edu)

Page 9: Biological Ontologies

Authoring Program (Protégé 2000)

• Enforces the implementation of foundational principles and definitional desiderata
• Frame-based architecture compatible with the OKBC (Open Knowledge Base Connectivity) protocol
• Frames are used to represent anatomical concepts
• Frames allow for distinguishing between class and instance
• Protégé allows for selective inheritance of attributes
• Protégé enhances specificity and expressivity of attributes by assigning them their own attributes.

Page 10: Biological Ontologies

Determine Domain and Scope

• What is the domain that the ontology will cover?

• Who is going to use the ontology?
• For what are they (we) going to use the ontology?
• For what types of questions should the information in the ontology provide answers (competency questions)?

Answers to these questions may change during the lifecycle

[Process diagram: determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances]

Page 11: Biological Ontologies

RNA Ontology Scope: DOMAIN

– RNA Sequences (1D) -- Coding and Non-Coding

– RNA 2D structures
– RNA 3D structures
– Alignments of homologous RNA sequences
– Relationships between alignments and 3D structures

Page 12: Biological Ontologies

RNA Ontology Scope: WHO?

– Molecular biologists & biochemists
– Structural biologists
– Evolutionary biologists
– Nanotechnologists

Page 13: Biological Ontologies

RNA Ontology Scope: WHAT?

– How to improve prediction of RNA 3D structure

– How to improve sequence alignments of homologous RNAs

– How to identify and annotate RNA genes in genomes

– How are RNA 3D structure and evolution coupled?

– How is RNA evolution coupled to biological evolution?

Page 14: Biological Ontologies

Consider Reuse

• Why reuse other ontologies?
– to save the effort
– to interact with the tools that use other ontologies
– to use ontologies that have been validated through use in applications


Page 15: Biological Ontologies

Enumerate Important Terms

• What are the terms (entities) we need to talk about?

• What are the properties and attributes of these entities?

• What are the relationships between entities?


Page 16: Biological Ontologies

Define Classes and the Class Hierarchy

• A class is a concept in the domain
– a class of wines
– a class of wineries
– a class of red wines

• A class is a collection of elements with similar properties

• Instances of classes
– a glass of California wine you’ll have for lunch


Page 17: Biological Ontologies

Class Hierarchy

Page 18: Biological Ontologies

Class Inheritance

• Classes usually constitute a taxonomic hierarchy (a subclass-superclass hierarchy)
• A class hierarchy is usually an IS-A hierarchy: an instance of a subclass is an instance of a superclass

• If you think of a class as a set of elements, a subclass is a subset
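The set reading of IS-A can be checked directly with a small Python sketch. It reuses the wine classes mentioned earlier in the slides; the `WhiteWine` class, the individual names, and the helper functions are assumptions made up for illustration.

```python
from typing import Dict, Optional, Set

# Hypothetical subclass -> superclass links (None marks the root class).
SUPERCLASS: Dict[str, Optional[str]] = {
    "Wine": None,
    "RedWine": "Wine",
    "WhiteWine": "Wine",
}

# Direct class of each individual; the individual names are made up.
DIRECT_CLASS: Dict[str, str] = {
    "lunch_glass_1": "RedWine",
    "lunch_glass_2": "WhiteWine",
}


def ancestors(cls: str) -> Set[str]:
    """Return cls together with all of its superclasses."""
    result: Set[str] = set()
    current: Optional[str] = cls
    while current is not None:
        result.add(current)
        current = SUPERCLASS[current]
    return result


def is_a(individual: str, cls: str) -> bool:
    """IS-A: an instance of a subclass is also an instance of each superclass."""
    return cls in ancestors(DIRECT_CLASS[individual])


def extension(cls: str) -> Set[str]:
    """The set of individuals that are instances of cls."""
    return {ind for ind in DIRECT_CLASS if is_a(ind, cls)}
```

With these definitions, `extension("RedWine")` is a subset of `extension("Wine")`, which is the "subclass is a subset" statement expressed in code.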

Page 19: Biological Ontologies

FMA -- High Level Scheme

• FMA = (AT, ASA, ATA, Mk)
– AT = Anatomy taxonomy (assigns anatomical entities as class concepts)
– ASA = Anatomical Structural Abstraction -- includes structural relationships among entities of the AT
– ATA = Anatomical Transformation Abstraction -- relationships that describe morphological & physical transformations of anatomical entities
– Mk = Metaknowledge -- principles and sets of rules

Page 20: Biological Ontologies

ASA -- High Level Scheme

• ASA = (Dt, PPt, Bn, Pn, SAn)
– Dt = Dimensional taxonomy
– PPt = Physical Properties taxonomy
– Bn = Boundary network
– Pn = Partonomy network
– SAn = Spatial Association network

Page 21: Biological Ontologies

Boundary Network (Bn)

• Specification of boundaries is critical for segmentation of images and volumetric datasets

• Definition: Boundary = Non-material physical anatomical entity of two or fewer dimensions that delimits anatomical entities that are of one higher dimension than the bounding entity

Page 22: Biological Ontologies

Boundary Network (Bn)

Inverse Relationships:
-bounded by-
-bounds-

Real vs. Virtual Boundaries:
Real boundaries of an anatomical entity correspond to its surface and designate discontinuities between constitutional parts of anatomical entities
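One way to keep a pair of inverse relationships consistent is to store both directions whenever either one is asserted. The sketch below is an illustration only, not how the FMA is implemented. The triple store, the `add_with_inverse` helper, and the specific pairing of entities are assumptions; the entity names are borrowed from the stomach example used elsewhere in these slides.

```python
from typing import Dict, Set, Tuple

# Each relation name mapped to its inverse, as listed on the slide.
INVERSE: Dict[str, str] = {"bounds": "bounded by", "bounded by": "bounds"}

Triple = Tuple[str, str, str]  # (subject, relation, object)


def add_with_inverse(store: Set[Triple], subj: str, rel: str, obj: str) -> None:
    """Assert a relation and, automatically, its inverse."""
    store.add((subj, rel, obj))
    store.add((obj, INVERSE[rel], subj))


facts: Set[Triple] = set()
# Illustrative assertion: a 2D surface bounding a 3D cavity (pairing assumed).
add_with_inverse(facts, "Internal surface of stomach", "bounds", "Cavity of stomach")
```

After the call, `facts` holds both the -bounds- triple and the matching -bounded by- triple, so a query in either direction finds the relationship.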

Page 23: Biological Ontologies

Partonomy Network (Pn)

Inverse Relationships:

-has part-

Page 24: Biological Ontologies

Rule of Dimensional Consistency

Distinguishes between boundary and partonomy relationships.

Parthood relations -- only allowed for entities of the same dimension

Ex: Cavity of stomach (3D) -has part- Cavity of pyloric antrum (3D)

Ex: Internal surface of stomach (2D) -has part- Internal surface of pyloric antrum (2D)
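The rule can be expressed as two small checks, sketched here in Python under the assumption that each entity carries a known dimension. The `DIMENSION` table and function names are hypothetical. The parthood check follows the rule on this slide, and the boundary check follows the earlier definition of a boundary as delimiting entities one dimension higher than itself.

```python
from typing import Dict

# Dimensions of the entities named in the examples on the slide.
DIMENSION: Dict[str, int] = {
    "Cavity of stomach": 3,
    "Cavity of pyloric antrum": 3,
    "Internal surface of stomach": 2,
    "Internal surface of pyloric antrum": 2,
}


def valid_has_part(whole: str, part: str) -> bool:
    """Parthood is only allowed between entities of the same dimension."""
    return DIMENSION[whole] == DIMENSION[part]


def valid_bounds(boundary: str, bounded: str) -> bool:
    """A boundary delimits entities exactly one dimension higher than itself."""
    return DIMENSION[bounded] == DIMENSION[boundary] + 1
```

Both slide examples pass `valid_has_part`, while a mixed pair such as a 3D cavity and a 2D surface is rejected as a parthood relation and can only qualify as a boundary relation.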

Page 25: Biological Ontologies

What to Reuse?

• Ontology libraries
– DAML ontology library (www.daml.org/ontologies)
– Ontolingua ontology library (www.ksl.stanford.edu/software/ontolingua/)
– Protégé ontology library (protege.stanford.edu/plugins.html)

• Upper ontologies
– IEEE Standard Upper Ontology (suo.ieee.org)
– Cyc (www.cyc.com)

Page 26: Biological Ontologies

RNA Ontology Consortium

• To share common understanding of the structure of information
– among people
– among software agents

• To enable reuse of domain knowledge
– to avoid “re-inventing the wheel”
– to introduce standards to allow interoperability

Page 27: Biological Ontologies

What to Reuse? (II)

• General ontologies
– DMOZ (www.dmoz.org)
– WordNet (www.cogsci.princeton.edu/~wn/)

• Domain-specific ontologies
– UMLS Semantic Net
– GO (Gene Ontology) (www.geneontology.org)
– FMA (Foundational Model of Anatomy)

Page 28: Biological Ontologies

Foundational Model of Anatomy


http://sig.biostr.washington.edu/projects/fm/index.html

• Reference ontology for biomedical informatics

• Representation of Anatomical Entities and Relationships

• Symbolic modeling of the structure of the human body at the highest level of granularity

• Evolving Resource for knowledge-based applications requiring anatomical information

Page 29: Biological Ontologies

FMA: Modeling Challenges

• Representing complex structural relations
• Representing different levels of granularity
• Developing a model that is scalable to a very large number of concepts
• Using consistent organizational principles