Ontology and the Future of Biomedical Research Barry Smith http://ifomis.org

Post on 19-Dec-2015

Page 1: Ontology and the Future of Biomedical Research Barry Smith

Ontology and the Future of Biomedical Research

Barry Smith

http://ifomis.org

Page 2: Ontology and the Future of Biomedical Research Barry Smith

Institute for Formal Ontology and Medical Information Science

Saarland University

Page 3: Ontology and the Future of Biomedical Research Barry Smith

From chromosome

to disease

Page 4: Ontology and the Future of Biomedical Research Barry Smith

Problem: how to reason with data deriving from different sources, each of which uses its own system of classification?

Page 5: Ontology and the Future of Biomedical Research Barry Smith

Solution: Ontology!

Page 6: Ontology and the Future of Biomedical Research Barry Smith

Examples of current needs for ontologies in biomedicine

– to enforce semantic consistency within a database

– to enable data sharing and re-use

– to enable data integration (bridging across data at multiple granularities)

– to allow querying

Page 7: Ontology and the Future of Biomedical Research Barry Smith

What is needed

– strong general-purpose classification hierarchies created by domain specialists

– clear, rigorous definitions

– thoroughly tested in real use cases

– updated in light of scientific advance

Page 8: Ontology and the Future of Biomedical Research Barry Smith

The actuality (too often)

myriad special purpose ‘light’ ontologies, prepared by ontology engineers and deposited in internet ‘repositories’ or ‘registries’

Page 9: Ontology and the Future of Biomedical Research Barry Smith

ontologies for ‘agent’

Page 10: Ontology and the Future of Biomedical Research Barry Smith
Page 11: Ontology and the Future of Biomedical Research Barry Smith

General trend

on the part of NIH, FDA and other bodies to consolidate ontology-based standards for the communication and processing of biomedical data.

Page 12: Ontology and the Future of Biomedical Research Barry Smith

Responses to this trend

Old: UMLS (Unified Medical Language System) – rooted in faithfulness to the ways language is used by different medical communities

Page 13: Ontology and the Future of Biomedical Research Barry Smith

SNOMED

DEMONS

U M L S

Page 14: Ontology and the Future of Biomedical Research Barry Smith

– congenital absent nipple is_a nipple

– cancer documentation is_a cancer

– disease prevention is_a disease

– repair and maintenance of wheelchair is_a disease

– water is_a nursing phenomenon

– part-whole =def. a nursing phenomenon with topology part-whole

U M L S

Page 15: Ontology and the Future of Biomedical Research Barry Smith

MeSH

MeSH Descriptors
  Index Medicus Descriptor
    Anthropology, Education, Sociology and Social Phenomena (MeSH Category)
      Social Sciences
        Political Systems
          National Socialism

Page 16: Ontology and the Future of Biomedical Research Barry Smith

MeSH

National Socialism is_a Political Systems

National Socialism is_a Anthropology ...

National Socialism is_a Social Sciences

National Socialism is_a MeSH Descriptors

Page 17: Ontology and the Future of Biomedical Research Barry Smith

New: Semantic Web deposits

Pet Profile Ontology

Review Vocabulary

Band Description Vocabulary

Musical Baton Vocabulary

MusicBrainz Metadata Vocabulary

Kissology

Page 18: Ontology and the Future of Biomedical Research Barry Smith

http://www.w3.org/

Beer Ontology

all instances of hops that have ever existed are necessarily ingredients of beer.

Page 19: Ontology and the Future of Biomedical Research Barry Smith

some nice computational resources, but low expressivity and few genuinely scientific demonstration cases

OWL-based ontologies …

Page 20: Ontology and the Future of Biomedical Research Barry Smith

OWL’s syntactic regimentation is not enough to ensure high-quality ontologies

– the use of a common syntax and logical machinery and the careful separating out of ontologies into namespaces does not solve the problem of ontology integration

Page 21: Ontology and the Future of Biomedical Research Barry Smith

Both UMLS- and OWL-type responses involve ad hoc creation of new terminologies by each community

Many of these terminologies remain as torsos, gather dust, poison the wells, ...

Page 22: Ontology and the Future of Biomedical Research Barry Smith

How to do better?

How to create the conditions for a step-by-step evolution towards high-quality ontologies in the biomedical domain which will serve as stable attractors for clinical and biomedical researchers in the future?

Page 23: Ontology and the Future of Biomedical Research Barry Smith

A basic distinction

type vs. instance

science text vs. clinical document

dog vs. Fido

Page 24: Ontology and the Future of Biomedical Research Barry Smith

Instances are not represented in an ontology built for scientific purposes

It is the generalizations that are important

(but instances must still be taken into account)

Page 25: Ontology and the Future of Biomedical Research Barry Smith

A 515287 DC3300 Dust Collector Fan

B 521683 Gilmer Belt

C 521682 Motor Drive Belt

Catalog vs. inventory

Page 26: Ontology and the Future of Biomedical Research Barry Smith

Ontology Types Instances

Page 27: Ontology and the Future of Biomedical Research Barry Smith

Ontology = A Representation of Types

Page 28: Ontology and the Future of Biomedical Research Barry Smith

Ontology = A Representation of Types

Each node of an ontology consists of:

• preferred term (aka term)

• term identifier (TUI, aka CUI)

• synonyms

• definition, glosses, comments
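The parts of a node listed above could be sketched as a small record type. This is an illustration only: the class name `OntologyNode`, its field names, and the example identifier `EX:0000001` are assumptions made here and do not come from any OBO file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyNode:
    """One node of an ontology; it stands for exactly one type."""
    identifier: str        # term identifier (hypothetical format)
    preferred_term: str    # a singular noun, e.g. "cell"
    synonyms: tuple = ()   # alternative labels for the same type
    definition: str = ""   # textual definition
    comments: tuple = ()   # glosses and curator comments


# Hypothetical example node; identifier and synonym are made up for illustration.
cell = OntologyNode(
    identifier="EX:0000001",
    preferred_term="cell",
    synonyms=("biological cell",),
    definition="(textual definition goes here)",
)
```

Keeping the identifier separate from the preferred term means a label can be corrected without breaking references to the node.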

Page 29: Ontology and the Future of Biomedical Research Barry Smith

Each term in an ontology represents exactly one type

hence ontology terms should be singular nouns

National Socialism is_a Political Systems

Page 30: Ontology and the Future of Biomedical Research Barry Smith

An ontology is a representation of types

We learn about types in reality from looking at the results of scientific experiments in the form of scientific theories – which describe not what is particular in reality but rather what is general

Ontologies need to exploit the evolutionary path to convergence created by science

Page 31: Ontology and the Future of Biomedical Research Barry Smith

High quality shared ontologies build communities

NIH, FDA trend to consolidate ontology-based standards for the communication and processing of biomedical data.

caBIG / NECTAR / BIRN / BRIDG ...

Page 32: Ontology and the Future of Biomedical Research Barry Smith

http://obo.sourceforge.net

Page 33: Ontology and the Future of Biomedical Research Barry Smith

http://www.geneontology.org/

Page 34: Ontology and the Future of Biomedical Research Barry Smith
Page 35: Ontology and the Future of Biomedical Research Barry Smith
Page 36: Ontology and the Future of Biomedical Research Barry Smith

The Methodology of Annotations

GO employs scientific curators, who use experimental observations reported in the biomedical literature to link gene products with GO terms in annotations.

This gene product exercises this function, in this part of the cell, leading to these biological processes
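The annotation pattern just described (a gene product, its function, its location in the cell, the processes it leads to, and the literature that supports each link) might be recorded roughly as follows. This is a minimal sketch and not GO's actual annotation format: the class `Annotation`, the helper `terms_for`, and every identifier and value are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Links one gene product to one ontology term, citing its evidence."""
    gene_product: str  # hypothetical gene product identifier
    aspect: str        # "molecular function", "cellular component" or "biological process"
    term: str          # hypothetical ontology term identifier
    reference: str     # publication reporting the experimental observation


# One gene product annotated under each aspect (placeholder values only).
annotations = [
    Annotation("geneproduct_X", "molecular function", "EX:function_1", "ref_1"),
    Annotation("geneproduct_X", "cellular component", "EX:component_1", "ref_1"),
    Annotation("geneproduct_X", "biological process", "EX:process_1", "ref_2"),
]


def terms_for(gene_product, aspect):
    """Return the terms a gene product is annotated to under one aspect."""
    return [a.term for a in annotations
            if a.gene_product == gene_product and a.aspect == aspect]
```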

Page 37: Ontology and the Future of Biomedical Research Barry Smith

The Methodology of Annotations

This process of annotating literature leads to improvements and extensions of the ontology, which in turn leads to better annotations

This institutes a virtuous cycle of improvement in the quality and reach of both future annotations and the ontology itself.

Annotations + ontology taken together yield a slowly growing computer-interpretable map of biological reality.

Page 38: Ontology and the Future of Biomedical Research Barry Smith

The OBO Foundry

Page 39: Ontology and the Future of Biomedical Research Barry Smith

A subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles designed to ensure

– intelligibility to biologists (curators, annotators, users)

– formal robustness

– stability

– compatibility

– interoperability

– support for logic-based reasoning

The OBO Foundry

Page 40: Ontology and the Future of Biomedical Research Barry Smith

Custodians

• Michael Ashburner (Cambridge)

• Suzanna Lewis (Berkeley)

• Barry Smith (Buffalo/Saarbrücken)

The OBO Foundry

Page 41: Ontology and the Future of Biomedical Research Barry Smith

A collaborative experiment

participants have agreed in advance to a growing set of principles specifying best practices in ontology development, designed to guarantee interoperability of ontologies from the very start

The OBO Foundry

Page 42: Ontology and the Future of Biomedical Research Barry Smith

The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement. They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single reference ontology.

The OBO Foundry

Page 43: Ontology and the Future of Biomedical Research Barry Smith

Initial Candidate Members of the OBO Foundry

– GO Gene Ontology

– CL Cell Ontology

– SO Sequence Ontology

– ChEBI Chemical Ontology

– PATO Phenotype Ontology

– FuGO Functional Genomics Investigation Ontology

– FMA Foundational Model of Anatomy

– RO Relation Ontology

The OBO Foundry

Page 44: Ontology and the Future of Biomedical Research Barry Smith

Under development

– Disease Ontology

– NCI Thesaurus

– Mammalian Phenotype Ontology

– OBO-UBO / Ontology of Biomedical Reality

– Organism (Species) Ontology

– Plant Trait Ontology

– Protein Ontology

– RnaO RNA Ontology

The OBO Foundry

Page 45: Ontology and the Future of Biomedical Research Barry Smith

Considered for development

– Environment Ontology

– Behavior Ontology

– Biomedical Image Ontology

– Clinical Trial Ontology

The OBO Foundry

Page 46: Ontology and the Future of Biomedical Research Barry Smith

CRITERIA

The OBO Foundry

The ontology is open and available to be used by all.

The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.

The ontology is in, or can be instantiated in, a common formal language.

Page 47: Ontology and the Future of Biomedical Research Barry Smith

The ontology possesses a unique identifier space within OBO.

The ontology provider has procedures for identifying distinct successive versions.

The ontology includes textual definitions for all terms.

CRITERIA

The OBO Foundry

Page 48: Ontology and the Future of Biomedical Research Barry Smith

The ontology has a clearly specified and clearly delineated content.

The ontology is well-documented.

The ontology has a plurality of independent users.

CRITERIA

The OBO Foundry

Page 49: Ontology and the Future of Biomedical Research Barry Smith

The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.*

*Genome Biology 2005, 6:R46

CRITERIA

The OBO Foundry

Page 50: Ontology and the Future of Biomedical Research Barry Smith

CRITERIA

Further criteria will be added over time in order to bring about a gradual improvement in the quality of the ontologies in the Foundry

The OBO Foundry

Page 51: Ontology and the Future of Biomedical Research Barry Smith

A reference ontology

is analogous to a scientific theory; it seeks to optimize representational adequacy to its subject matter to the maximal degree that is compatible with the constraints of computational usefulness.

Page 52: Ontology and the Future of Biomedical Research Barry Smith

An application ontology

is comparable to an engineering artifact such as a software tool. It is constructed for a specific practical purpose.

Examples:

– National Cancer Institute Thesaurus

– FuGO Functional Genomics Investigation Ontology

Page 53: Ontology and the Future of Biomedical Research Barry Smith

Reference Ontology vs. Application Ontology

Currently, application ontologies are often built afresh for each new task, commonly introducing not only idiosyncrasies of format or logic, but also simplifications or distortions of their subject matter. To solve this problem, application ontology development should always take place against the background of a formally robust reference ontology framework.

Page 54: Ontology and the Future of Biomedical Research Barry Smith

Advantages of the methodology of shared coherently defined ontologies

• promotes quality assurance (better coding)

• guarantees automatic reasoning across ontologies and across data at different granularities

• yields direct connection to temporally indexed instance data

Page 55: Ontology and the Future of Biomedical Research Barry Smith

Advantages of the methodology of shared coherently defined ontologies

We know that high-quality ontologies can help in creating better mappings e.g. between human and model organism phenotypes

S Zhang, O Bodenreider, “Alignment of Multiple Ontologies of Anatomy: Deriving Indirect Mappings from Direct Mappings to a Reference Ontology”, AMIA 2005

Page 56: Ontology and the Future of Biomedical Research Barry Smith

Advantages of the methodology of shared coherently defined ontologies

once the interoperable gold standard reference ontologies are there, it will make sense to reformulate parts of existing incompatible terminologies (e.g. in UMLS) in terms of the standard ontologies in order to achieve greater domain coverage and alignment of different but veridical views. Thus not everything that was done in the past turns out to be a waste.

Page 57: Ontology and the Future of Biomedical Research Barry Smith

Goal: to create a family of gold standard reference ontologies upon which terminologies developed for specific applications can draw

The OBO Foundry

Page 58: Ontology and the Future of Biomedical Research Barry Smith

Goal: to introduce the scientific method into ontology development:

– all Foundry ontologies must be constantly updated in light of scientific advance

– all Foundry ontology developers must work with all other Foundry ontology developers in a spirit of scientific collaboration

The OBO Foundry

Page 59: Ontology and the Future of Biomedical Research Barry Smith

Goal: to replace the current policy of ad hoc creation of new database schemas by each clinical research group by providing reference ontologies in terms of which database schemas can be defined

The OBO Foundry

Page 60: Ontology and the Future of Biomedical Research Barry Smith

Goal: to introduce some of the features of scientific peer review into biomedical ontology development

The OBO Foundry

Page 61: Ontology and the Future of Biomedical Research Barry Smith

Goal: to create controlled vocabularies for use by clinical trial banks, clinical guidelines bodies, scientific journals, ...

The OBO Foundry

Page 63: Ontology and the Future of Biomedical Research Barry Smith

Goal: to create an evolving map-like representation of the entire domain of biological reality

The OBO Foundry

Page 64: Ontology and the Future of Biomedical Research Barry Smith

GO’s three ontologies

molecular function

cellular component

biological process

Page 65: Ontology and the Future of Biomedical Research Barry Smith

[Diagram labels: species; anatomy (fly, fish, human ...); organism-level physiology; cell (types); cellular anatomy; cellular physiology; ChEBI, Sequence, RNA ...; molecular function (GO); molecular process]

Page 66: Ontology and the Future of Biomedical Research Barry Smith

[Diagram labels: species; anatomy (fly, fish, human ...); organism-level physiology; cell (types); cellular anatomy; cellular physiology; ChEBI, Sequence, RNA ...; molecular function (GO); molecular process; normal (functionings)]

Page 67: Ontology and the Future of Biomedical Research Barry Smith

[Diagram labels: pathophysiology (disease); pathoanatomy (fly, fish, human ...); pathological (malfunctionings)]

Page 68: Ontology and the Future of Biomedical Research Barry Smith

[Diagram labels: species; anatomy (fly, fish, human ...); organism-level physiology; cell (types); cellular anatomy (GO); cellular physiology; ChEBI, Sequence, RNA ...; molecular function (GO); molecular process; pathophysiology (disease); pathoanatomy (fly, fish, human ...)]

Page 69: Ontology and the Future of Biomedical Research Barry Smith

[Diagram labels: species; anatomy (fly, fish, human ...); organism-level physiology; cell (types); cellular anatomy; cellular physiology; ChEBI, Sequence, RNA ...; molecular function (GO); molecular process; pathophysiology (disease); pathoanatomy (fly, fish, human ...); phenotype]

Page 70: Ontology and the Future of Biomedical Research Barry Smith

[Diagram labels: species; anatomy (fly, fish, human ...); organism-level physiology; cell (types); cellular anatomy; cellular physiology; ChEBI, Sequence, RNA ...; molecular function (GO); molecular process; pathophysiology (disease); pathoanatomy (fly, fish, human ...); phenotype; investigation (FuGO)]

Page 71: Ontology and the Future of Biomedical Research Barry Smith

End

Page 72: Ontology and the Future of Biomedical Research Barry Smith

First step

Alignment of OBO Foundry ontologies through a common system of formally defined relations in the OBO Relation Ontology

See “Relations in Biomedical Ontologies”, Genome Biology Apr. 2005

Page 73: Ontology and the Future of Biomedical Research Barry Smith

Judith Blake:

“The use of bio-ontologies … ensures consistency of data curation, supports extensive data integration, and enables robust exchange of information between heterogeneous informatics systems. .. ontologies … formally define relationships between the concepts.”

Page 74: Ontology and the Future of Biomedical Research Barry Smith

"Gene Ontology: Tool for the Unification of Biology"

an ontology "comprises a set of well-defined terms with well-defined relationships" (Ashburner et al., 2000, p. 27)

Page 75: Ontology and the Future of Biomedical Research Barry Smith

is_a (sensu UMLS)

A is_a B =def ‘A’ is narrower in meaning than ‘B’

grows out of the heritage of dictionaries (which ignore the basic distinction between types and instances)

Page 76: Ontology and the Future of Biomedical Research Barry Smith

is_a

congenital absent nipple is_a nipple

cancer documentation is_a cancer

disease prevention is_a disease

Nazism is_a social science

Page 77: Ontology and the Future of Biomedical Research Barry Smith

is_a (sensu logic)

A is_a B =def for all x, if x instance_of A then x instance_of B

cell division is_a biological process

adult is_a child ???
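The untimed definition above can be tried out on toy data. The sketch below is an illustration under stated assumptions: the instance names are invented, and `instances_of` records every individual that instantiates a type at some time or other. On that reading the subset test accepts "adult is_a child", which is one way to read the question marks on the slide.

```python
def is_a(instances_of, a, b):
    """Untimed reading: every instance of A is also an instance of B."""
    return instances_of[a] <= instances_of[b]  # subset test on sets


# Toy data (hypothetical individuals). "mary" was a child and is now an
# adult, so without time indices she appears under both types.
instances_of = {
    "cell division": {"division_1", "division_2"},
    "biological process": {"division_1", "division_2", "ovulation_1"},
    "adult": {"mary"},
    "child": {"mary", "tom"},
}

ok = is_a(instances_of, "cell division", "biological process")
suspicious = is_a(instances_of, "adult", "child")  # comes out True, which is unwanted
```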

Page 78: Ontology and the Future of Biomedical Research Barry Smith

Two kinds of entities

occurrents (processes, events, happenings): cell division, ovulation, death

continuants (objects, qualities, ...): cell, ovum, organism, temperature of organism, ...

Page 79: Ontology and the Future of Biomedical Research Barry Smith

is_a (for occurrents)

A is_a B =def for all x, if x instance_of A then x instance_of B

cell division is_a biological process

Page 80: Ontology and the Future of Biomedical Research Barry Smith

is_a (for continuants)

A is_a B =def for all x, t, if x instance_of A at t then x instance_of B at t

abnormal cell is_a cell

adult human is_a human

but not: adult is_a child
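A time-indexed check of this definition might look as follows. It is a sketch on invented data: instantiation is stored as (individual, time) pairs, and the individuals and years are hypothetical.

```python
def is_a_continuant(instance_of_at, a, b):
    """For all x, t: if x instance_of A at t then x instance_of B at t."""
    return all(pair in instance_of_at[b] for pair in instance_of_at[a])


# Toy data: each entry is an (individual, time) pair; all names are made up.
instance_of_at = {
    "child": {("mary", 1990), ("tom", 2020)},
    "adult": {("mary", 2020)},
    "adult human": {("mary", 2020)},
    "human": {("mary", 1990), ("mary", 2020), ("tom", 2020)},
}

holds = is_a_continuant(instance_of_at, "adult human", "human")
rejected = is_a_continuant(instance_of_at, "adult", "child")
```

Because mary is an adult in 2020 but not a child in 2020, the time-indexed test rejects "adult is_a child" while still accepting "adult human is_a human".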

Page 81: Ontology and the Future of Biomedical Research Barry Smith

Part_of as a relation between types is more problematic than is standardly supposed

heart part_of human being ?

human heart part_of human being ?

human being has_part human testis ?

human testis part_of human being ?

Page 82: Ontology and the Future of Biomedical Research Barry Smith

two kinds of parthood

1. between instances:

Mary’s heart part_of Mary

this nucleus part_of this cell

2. between types:

human heart part_of human

cell nucleus part_of cell

Page 83: Ontology and the Future of Biomedical Research Barry Smith

Definition of part_of as a relation between types

A part_of B =Def all instances of A are instance-level parts of some instance of B

ALL–SOME STRUCTURE
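The all-some pattern can be made concrete with a small check over instance data. This is a sketch under assumptions: the instance names are invented, `instance_part_of` is a hypothetical set of instance-level (part, whole) pairs, and `type_has_part` is added here only to show that the converse direction needs its own all-some test.

```python
def type_part_of(instances_of, instance_part_of, a, b):
    """A part_of B: every instance of A is part of some instance of B."""
    return all(
        any((x, y) in instance_part_of for y in instances_of[b])
        for x in instances_of[a]
    )


def type_has_part(instances_of, instance_part_of, b, a):
    """B has_part A: every instance of B has some instance of A as part."""
    return all(
        any((x, y) in instance_part_of for x in instances_of[a])
        for y in instances_of[b]
    )


# Toy data; cell_3 stands for a cell without a nucleus.
instances_of = {
    "cell nucleus": {"nucleus_1", "nucleus_2"},
    "cell": {"cell_1", "cell_2", "cell_3"},
}
instance_part_of = {("nucleus_1", "cell_1"), ("nucleus_2", "cell_2")}

nucleus_part_of_cell = type_part_of(instances_of, instance_part_of, "cell nucleus", "cell")
cell_has_part_nucleus = type_has_part(instances_of, instance_part_of, "cell", "cell nucleus")
```

On this data "cell nucleus part_of cell" holds, but "cell has_part cell nucleus" does not, since one cell has no nucleus.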

Page 84: Ontology and the Future of Biomedical Research Barry Smith

part_of (for occurrents)

A part_of B =Def for all x, if x instance_of A then there is some y, y instance_of B and x part_of y

where ‘part_of’ is the instance-level part relation

Page 85: Ontology and the Future of Biomedical Research Barry Smith

part_of (for continuants)

A part_of B =def. for all x, t, if x instance_of A at t then there is some y, y instance_of B at t and x part_of y

where ‘part_of’ is the instance-level part relation

ALL-SOME STRUCTURE

Page 86: Ontology and the Future of Biomedical Research Barry Smith

How to use the OBO Relation Ontology

Ontologies are representations of types and of the relations between types

The definitions of these relations involve reference to times and instances, but these references are washed out when we get to the assertions (edges) in the ontology

But curators should still be aware of the underlying definitions when formulating such assertions

Page 87: Ontology and the Future of Biomedical Research Barry Smith

part_of (for occurrents)

A part_of B =Def for all x, if x instance_of A then there is some y, y instance_of B and x part_of y

where ‘part_of’ is the instance-level part relation

Page 88: Ontology and the Future of Biomedical Research Barry Smith

A part_of B, B part_of C ...

The all-some structure of such definitions allows cascading of inferences (true path rule)

(i) within ontologies

(ii) between ontologies

(iii) between ontologies and repositories of instance-data
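Cascading can be illustrated by chaining type-level part_of edges. The sketch below assumes that the instance-level part relation is transitive, so that "A part_of B" and "B part_of C" together license "A part_of C"; the function name and the placeholder types A, B, C are illustrative only.

```python
def part_of_closure(edges):
    """Return every (X, Z) pair derivable by chaining part_of edges."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # Chain a -> b and b -> d into a -> d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Asserted edges, possibly drawn from two different ontologies.
edges = {("A", "B"), ("B", "C")}
inferred = part_of_closure(edges)
```

The same chaining works whether the two asserted edges sit in one ontology or in two, provided both use part_of in the same all-some sense.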

Page 89: Ontology and the Future of Biomedical Research Barry Smith

Strengthened true path rule

Whichever A you choose, the instance of B of which it is a part will be included in some C, which will include as part also the A with which you began

The same principle applies to the other relations in the OBO-RO:

located_in, transformation_of, derives_from, adjacent_to, etc.

Page 90: Ontology and the Future of Biomedical Research Barry Smith

Kinds of relations

Between types:

– is_a, part_of, ...

Between an instance and a type:

– this explosion instance_of the type explosion

Between instances:

– Mary’s heart part_of Mary

Page 91: Ontology and the Future of Biomedical Research Barry Smith

In every ontology some terms and some relations are primitive = they cannot be defined (on pain of infinite regress)

Examples of primitive relations:

– identity

– instantiation

– (instance-level) part_of

– (instance-level) continuous_with

Page 92: Ontology and the Future of Biomedical Research Barry Smith

Fiat and bona fide boundaries

Page 93: Ontology and the Future of Biomedical Research Barry Smith

Continuity

Attachment

Adjacency

Page 94: Ontology and the Future of Biomedical Research Barry Smith

everything here is an independent continuant

Page 95: Ontology and the Future of Biomedical Research Barry Smith

structures vs. formations = bona fide vs. fiat boundaries

Page 96: Ontology and the Future of Biomedical Research Barry Smith

Modes of Connection

The body is a highly connected entity.

Exceptions: cells floating free in blood.

Page 97: Ontology and the Future of Biomedical Research Barry Smith

Modes of Connection

Modes of connection:

attached_to (muscle to bone)

synapsed_with (nerve to nerve, nerve to muscle)

continuous_with (= share a fiat boundary)

Page 98: Ontology and the Future of Biomedical Research Barry Smith

[Figure labels: articular eminence; articular (glenoid) fossa; anterior]

Attachment, location, containment

Page 99: Ontology and the Future of Biomedical Research Barry Smith

Containment involves relation to a hole or cavity

1: cavity

2: tunnel, conduit (artery)

3: mouth; a snail’s shell

Page 100: Ontology and the Future of Biomedical Research Barry Smith

Fiat vs. Bona Fide Boundaries

Fiat boundary Physical boundary

Page 101: Ontology and the Future of Biomedical Research Barry Smith

Double Hole Structure

Medium (filling the environing hole)

Tenant (occupying the central hole)

Retainer (a boundary of some surrounding structure)

Page 102: Ontology and the Future of Biomedical Research Barry Smith

head of condyle

neck of condyle

fossa

fiat boundary

THE TEMPOROMANDIBULAR JOINT

Page 103: Ontology and the Future of Biomedical Research Barry Smith

continuous_with (a relation between instances which share a fiat boundary)

is always symmetric:

if x continuous_with y, then y continuous_with x

Page 104: Ontology and the Future of Biomedical Research Barry Smith

continuous_with (relation between types)

A continuous_with B =Def.

for all x, if x instance_of A then there is some y such that y instance_of B and x continuous_with y

Page 105: Ontology and the Future of Biomedical Research Barry Smith

continuous_with as a relation between types is not always symmetric

Consider lymph node and lymphatic vessel:

Each lymph node is continuous with some lymphatic vessel, but there are lymphatic vessels (e.g. lymphs and lymphatic trunks) which are not continuous with any lymph nodes
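The contrast between the two levels can be checked on toy data. The sketch below is an illustration with invented instances: instance-level continuity is stored as unordered pairs (so it is symmetric by construction), and the type-level relation is the all-some definition applied on top of it.

```python
def continuous(pairs, x, y):
    """Instance-level continuous_with; unordered pairs make it symmetric."""
    return frozenset((x, y)) in pairs


def type_continuous_with(instances_of, pairs, a, b):
    """Every instance of A is continuous with some instance of B."""
    return all(
        any(continuous(pairs, x, y) for y in instances_of[b])
        for x in instances_of[a]
    )


# Toy data: trunk_1 is a lymphatic vessel not continuous with any lymph node.
instances_of = {
    "lymph node": {"node_1"},
    "lymphatic vessel": {"vessel_1", "trunk_1"},
}
pairs = {frozenset(("node_1", "vessel_1"))}

node_to_vessel = type_continuous_with(instances_of, pairs, "lymph node", "lymphatic vessel")
vessel_to_node = type_continuous_with(instances_of, pairs, "lymphatic vessel", "lymph node")
```

The instance-level relation holds in both directions for node_1 and vessel_1, yet only "lymph node continuous_with lymphatic vessel" holds between the types.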

Page 106: Ontology and the Future of Biomedical Research Barry Smith

adjacent_to as a relation between types is not symmetric

Consider: seminal vesicle adjacent_to urinary bladder

Not: urinary bladder adjacent_to seminal vesicle

Page 107: Ontology and the Future of Biomedical Research Barry Smith

instance level:

this nucleus is adjacent to this cytoplasm

implies:

this cytoplasm is adjacent to this nucleus

type level:

nucleus adjacent_to cytoplasm

Not: cytoplasm adjacent_to nucleus

Page 108: Ontology and the Future of Biomedical Research Barry Smith

Applications

Expectations of symmetry, e.g. for protein-protein interactions, may hold only at the instance level

if A interacts with B, it does not follow that B interacts with A

if A is expressed simultaneously with B, it does not follow that B is expressed simultaneously with A

Page 109: Ontology and the Future of Biomedical Research Barry Smith

[Diagram: transformation_of. Labels: types C and C1; the same instance c at t and at t1; time axis. Examples: pre-RNA and mature RNA; child and adult]

Page 110: Ontology and the Future of Biomedical Research Barry Smith

transformation_of

A transformation_of B =Def. Every instance of A was at some earlier time an instance of B

adult transformation_of child
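The definition can be tested against a record of which types each individual instantiates over time. The following is a sketch on invented data: `history` maps hypothetical individuals to {time: set of types}, and the function simply looks for an earlier time at which the individual instantiated B.

```python
def transformation_of(history, a, b):
    """Every instance of A was at some earlier time an instance of B."""
    for timeline in history.values():
        for t, types in timeline.items():
            if a not in types:
                continue
            # Look for any strictly earlier time at which this individual was a B.
            if not any(b in earlier for t0, earlier in timeline.items() if t0 < t):
                return False
    return True


# Toy histories (hypothetical individuals and years).
history = {
    "mary": {1990: {"child"}, 2020: {"adult"}},
    "tom": {2020: {"child"}},
}

adult_from_child = transformation_of(history, "adult", "child")
child_from_adult = transformation_of(history, "child", "adult")
```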

Page 111: Ontology and the Future of Biomedical Research Barry Smith

[Diagram: tumor development. Labels: types C and C1; instance c at t and c at t1]

Page 112: Ontology and the Future of Biomedical Research Barry Smith

[Diagram: derives_from. Labels: types C, C1 and C'; instances c at t, c' at t and c1 at t1; time axis. Example: zygote derives_from ovum, sperm]

Page 113: Ontology and the Future of Biomedical Research Barry Smith

two continuants fuse to form a new continuant

[Diagram: fusion. Labels: types C, C' and C1; instances c at t, c' at t and c1 at t1]

Page 114: Ontology and the Future of Biomedical Research Barry Smith

one initial continuant is replaced by two successor continuants

[Diagram: fission. Labels: types C, C1 and C2; instances c at t and c1 at t1 (the label "c1 at t1" appears under both C1 and C2)]

Page 115: Ontology and the Future of Biomedical Research Barry Smith

one continuant detaches itself from an initial continuant, which itself continues to exist

[Diagram: budding. Labels: types C and C1; instances c at t, c at t1 and c1 at t]

Page 116: Ontology and the Future of Biomedical Research Barry Smith

one continuant absorbs a second continuant while itself continuing to exist

[Diagram: capture. Labels: types C and C'; instances c at t, c at t1 and c' at t]

Page 117: Ontology and the Future of Biomedical Research Barry Smith

A suite of defined relations between types

Foundational: is_a, part_of

Spatial: located_in, contained_in, adjacent_to

Temporal: transformation_of, derives_from, preceded_by

Participation: has_participant, has_agent

Page 118: Ontology and the Future of Biomedical Research Barry Smith

To be added to the Relation Ontology

lacks (between an instance and a type, e.g. this fly lacks wings)

dependent_on (between a dependent entity and its carrier or bearer)

quality_of (between a dependent and an independent continuant)

functioning_of (between a process and an independent continuant)

Page 119: Ontology and the Future of Biomedical Research Barry Smith

Low Hanging Fruit

Ontologies should include only those relational assertions which hold universally (= have the ALL-SOME form)

Often, order will matter here:

We can include

adult transformation_of child

but not

child transforms_into adult

Page 120: Ontology and the Future of Biomedical Research Barry Smith

The Gene Ontology

Page 121: Ontology and the Future of Biomedical Research Barry Smith

GO’s three ontologies

molecular functions

cellular components

biological processes

Page 122: Ontology and the Future of Biomedical Research Barry Smith

When a gene is identified

three types of questions need to be addressed:

1. Where is it located in the cell?

2. What functions does it have on the molecular level?

3. To what biological processes do these functions contribute?

Page 123: Ontology and the Future of Biomedical Research Barry Smith

Three granularities:

Cellular (for components)

Molecular (for functions)

Organ + organism (for processes)

Page 124: Ontology and the Future of Biomedical Research Barry Smith

GO has cells

but it does not include terms for molecules or organisms within any of its three ontologies

except e.g. GO:0018995 host =Def. Any organism in which another organism spends part or all of its life cycle

Page 125: Ontology and the Future of Biomedical Research Barry Smith

Are the relations between functions and processes a matter of granularity?

Molecular activities are the ‘building blocks’ of biological processes ?

But they are not allowed to be represented in GO as parts of biological processes

Page 126: Ontology and the Future of Biomedical Research Barry Smith

GO’s three ontologies

molecular functions

cellular components

biological processes

Page 127: Ontology and the Future of Biomedical Research Barry Smith

What does “function” mean?

an entity has a biological function if and only if it is part of an organism and has a disposition to act reliably in such a way as to contribute to the organism’s survival

the function is this disposition

Page 128: Ontology and the Future of Biomedical Research Barry Smith

Improved version

an entity has a biological function if and only if it is part of an organism and has a disposition to act reliably in such a way as to contribute to the organism’s realization of the canonical life plan for an organism of that type

Page 129: Ontology and the Future of Biomedical Research Barry Smith

This canonical life plan might include

canonical embryological development

canonical growth

canonical reproduction

canonical aging

canonical death

Page 130: Ontology and the Future of Biomedical Research Barry Smith

The function of the heart is to pump blood

Not every activity (process) in an organism is the exercise of a function – there are

– malfunctionings

– side-effects (heart beating)

– accidents (external interference)

– background stochastic activity

Page 131: Ontology and the Future of Biomedical Research Barry Smith

Kidney

Page 132: Ontology and the Future of Biomedical Research Barry Smith

Nephron

Page 133: Ontology and the Future of Biomedical Research Barry Smith

Functional Segments

Page 134: Ontology and the Future of Biomedical Research Barry Smith

Functions

Page 135: Ontology and the Future of Biomedical Research Barry Smith

Functions

This is a screwdriver

This is a good screwdriver

This is a broken screwdriver

This is a heart

This is a healthy heart

This is an unhealthy heart

Page 136: Ontology and the Future of Biomedical Research Barry Smith

Functions are associated with certain characteristic process shapes

Screwdriver: rotates and moves forward, simultaneously transferring torque from hand and arm to screw

Heart: performs a contracting movement inwards and an expanding movement outwards

Page 137: Ontology and the Future of Biomedical Research Barry Smith

Not functioning at all

leads to death, modulo

internal factors:
– plasticity
– redundancy (2 kidneys)
– criticality of the system involved

external factors:
– prosthesis (dialysis machines, oxygen tent)
– special environments
– assistance from other organisms

Page 138: Ontology and the Future of Biomedical Research Barry Smith

What clinical medicine is for

to eliminate malfunctioning by fixing broken body parts (or to prevent the appearance of malfunctioning by intervening, e.g., at the molecular level)

Page 139: Ontology and the Future of Biomedical Research Barry Smith

Hypothesis: there are no ‘bad’ functions

It is not the function of an oncogene to cause cancer.

Oncogenes were in every case proto-oncogenes with functions of their own.

They become oncogenes because of bad (non-prototypical) environments.

Page 140: Ontology and the Future of Biomedical Research Barry Smith

Is there an exception for molecular functions?

Does this apply only to functions on biological levels of granularity (= levels of granularity coarser than the molecule)?

If pathology is the deviation from (normal) functioning, does it make sense to talk of a pathological molecule?

(Pathologically functioning molecule vs. pathologically structured molecule)

Page 141: Ontology and the Future of Biomedical Research Barry Smith

Is there an exception for molecular functions?

A molecular function is a propensity of a gene product instance to perform actions on the molecular level of granularity.

Hypothesis 1: these actions must be reliably such as to contribute to biological processes.

Hypothesis 2: these actions must be reliably such as to contribute to the organism’s realization of the canonical life plan for an organism of that type.

Page 142: Ontology and the Future of Biomedical Research Barry Smith

The Gene Ontology

is a canonical ontology – it represents only what is normal in the realm of molecular functioning

Page 143: Ontology and the Future of Biomedical Research Barry Smith

The GO is a canonical representation

“The Gene Ontology is a computational representation of the ways in which gene products normally function in the biological realm”

Nucl. Acids Res. 2006: 34.

Page 144: Ontology and the Future of Biomedical Research Barry Smith

The FMA is a canonical representation

It is a computational representation of types and relations between types deduced from the qualitative observations of the normal human body, which have been refined and sanctioned by successive generations of anatomists and presented in textbooks and atlases of structural anatomy.

Page 145: Ontology and the Future of Biomedical Research Barry Smith

The importance of pathways (successive causality)

Each stage in the history of a disease presupposes the earlier stages

Therefore need to reason across time, tracking the order of events in time, using relations such as derives_from, transformation_of ...

Need pathway ontologies on every level of granularity
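As a minimal sketch of what "tracking the order of events in time" could mean computationally, the stages of a disease history can be treated as a dependency graph in which each stage presupposes earlier ones. The stage names below are hypothetical placeholders, and the use of a topological sort is an illustrative assumption, not a method taken from the slides.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stages of a disease history; each stage maps to the
# set of stages it presupposes.
presupposes = {
    "stage_2": {"stage_1"},
    "stage_3": {"stage_2"},
    "stage_4": {"stage_3"},
}

# A topological order lists every stage after all stages it presupposes.
order = list(TopologicalSorter(presupposes).static_order())
print(order)
```

A cycle in the graph (a stage that indirectly presupposes itself) would make `static_order()` raise `graphlib.CycleError`, which is one way such a representation can catch an incoherent history.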

Page 146: Ontology and the Future of Biomedical Research Barry Smith

The importance of granularity (simultaneous causality)

Networks are continuants

At any given time there are networks existing in the organism at different levels of granularity

Changes in one cause simultaneous changes in all the others

(Compare Gay-Lussac’s law: in a gas held at fixed volume, a rise in temperature causes a simultaneous increase in pressure)

Page 147: Ontology and the Future of Biomedical Research Barry Smith

The Granularity Gulf

most existing data-sources are of fixed, single granularity

many (all?) clinical phenomena cross granularities

Therefore need to reason across time, tracking the order of events in time

Page 148: Ontology and the Future of Biomedical Research Barry Smith

Good ontologies require:

consistent use of terms, supported by logically coherent (non-circular) definitions, in equivalent human-readable and computable formats

coherent shared treatment of relations to allow cascading inference both within and between ontologies
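One way to picture "cascading inference" is as the transitive closure of a relation: if the relation is used in the same way everywhere, asserted links can be chained into new ones. The sketch below assumes that is_a and part_of are each transitive; the assertions are toy examples invented for illustration and are not drawn from any real ontology file.

```python
from collections import defaultdict

# Toy assertions (hypothetical): (subject, relation, object)
ASSERTIONS = [
    ("nephron", "part_of", "kidney"),
    ("kidney", "part_of", "urinary system"),
    ("kidney", "is_a", "organ"),
    ("organ", "is_a", "anatomical structure"),
]

def transitive_closure(assertions, relation):
    """Return every (x, y) pair implied by chaining `relation` edges."""
    graph = defaultdict(set)
    for subj, rel, obj in assertions:
        if rel == relation:
            graph[subj].add(obj)
    inferred = set()
    for start in list(graph):
        stack = list(graph[start])
        while stack:
            node = stack.pop()
            if (start, node) not in inferred:
                inferred.add((start, node))
                stack.extend(graph.get(node, ()))
    return inferred

part_of = transitive_closure(ASSERTIONS, "part_of")
is_a = transitive_closure(ASSERTIONS, "is_a")
```

Here `part_of` contains the unasserted pair `("nephron", "urinary system")`. The inference is only sound if every ontology contributing assertions means the same thing by part_of.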

Page 149: Ontology and the Future of Biomedical Research Barry Smith

Three fundamental dichotomies

• continuants vs. occurrents
• dependent vs. independent
• types vs. instances

Page 150: Ontology and the Future of Biomedical Research Barry Smith

ONTOLOGIES ARE REPRESENTATIONS OF TYPES

aka kinds, universals, categories, species, genera, ...

Page 151: Ontology and the Future of Biomedical Research Barry Smith

Continuants (aka endurants)
– have continuous existence in time
– preserve their identity through change
– exist in toto whenever they exist at all

Occurrents (aka processes)
– have temporal parts
– unfold themselves in successive phases
– exist only in their phases

Page 152: Ontology and the Future of Biomedical Research Barry Smith

You are a continuant

Your life is an occurrent

You are 3-dimensional

Your life is 4-dimensional

Page 153: Ontology and the Future of Biomedical Research Barry Smith

Dependent entities

require independent continuants as their bearers

There is no run without a runner

There is no grin without a cat

Page 154: Ontology and the Future of Biomedical Research Barry Smith

Dependent vs. independent continuants

Independent continuants (organisms, cells, molecules, environments)

Dependent continuants (qualities, shapes, roles, propensities, functions)

Page 155: Ontology and the Future of Biomedical Research Barry Smith

All occurrents are dependent entities

They are dependent on those independent continuants which are their participants (agents, patients, media ...)
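The dependence relations described on the last few slides can be sketched as constructor checks: a dependent continuant cannot be created without a bearer, and an occurrent cannot be created without participants. The class names follow the slides; the constructor signatures and error messages are assumptions made for this illustration.

```python
class Continuant:
    def __init__(self, label):
        self.label = label


class IndependentContinuant(Continuant):
    """Organisms, cells, molecules, environments."""


class DependentContinuant(Continuant):
    """Qualities, shapes, roles, propensities, functions."""

    def __init__(self, label, bearer):
        # A dependent continuant requires an independent continuant as bearer.
        if not isinstance(bearer, IndependentContinuant):
            raise TypeError("bearer must be an independent continuant")
        super().__init__(label)
        self.bearer = bearer


class Occurrent:
    """Processes; always dependent on their participants."""

    def __init__(self, label, participants):
        participants = list(participants)
        # An occurrent requires one or more independent continuants.
        if not participants or not all(
            isinstance(p, IndependentContinuant) for p in participants
        ):
            raise TypeError("participants must be independent continuants")
        self.label = label
        self.participants = participants


cat = IndependentContinuant("cat")
grin = DependentContinuant("grin", bearer=cat)      # no grin without a cat
runner = IndependentContinuant("runner")
run = Occurrent("run", participants=[runner])       # no run without a runner
```

Calling `Occurrent("run", [])` raises `TypeError`, which mirrors the claim that all occurrents are dependent entities.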

Page 156: Ontology and the Future of Biomedical Research Barry Smith

Top-Level Ontology

Continuant
    Independent Continuant
    Dependent Continuant

Occurrent (always dependent on one or more independent continuants)

Page 157: Ontology and the Future of Biomedical Research Barry Smith

= A representation of top-level types

Continuant
    Independent Continuant: cell component
    Dependent Continuant: molecular function

Occurrent: biological process

Page 158: Ontology and the Future of Biomedical Research Barry Smith

Top-Level Ontology

Continuant
    Independent Continuant
    Dependent Continuant
        Function

Occurrent
    Functioning
    Side-Effect, Stochastic Process, ...

Page 159: Ontology and the Future of Biomedical Research Barry Smith

Top-Level Ontology

Continuant
    Independent Continuant
    Dependent Continuant
        Function

Occurrent
    Functioning
    Side-Effect, Stochastic Process, ...

Page 160: Ontology and the Future of Biomedical Research Barry Smith

Top-Level Ontology

Continuant Occurrent

Independent Continuant

Dependent Continuant

Quality Function Spatial Region

Functioning Side-Effect, Stochastic Process, ...

instances (in space and time)

Page 161: Ontology and the Future of Biomedical Research Barry Smith

Smith B, Ceusters W, Kumar A, Rosse C. On Carcinomas and Other Pathological Entities, Comp Functional Genomics, Apr. 2006

Page 162: Ontology and the Future of Biomedical Research Barry Smith

everything here is an independent continuant

Page 163: Ontology and the Future of Biomedical Research Barry Smith

Functions, etc.

Some dependent continuants are realizable

expression of a gene
application of a therapy
course of a disease
execution of an algorithm
realization of a protocol

Page 164: Ontology and the Future of Biomedical Research Barry Smith

Functions vs Functionings

the function of your heart = to pump blood in your body

this function is realized in processes of pumping blood

not all functions are realized (consider the function of this sperm ...)
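The distinction between a function (a dependent continuant) and its functionings (the processes that realize it) can be sketched with a small data structure. The class and attribute names here are hypothetical, chosen only to make the distinction concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    """A function borne by some entity; it may or may not ever be realized."""
    bearer: str
    description: str
    realizations: list = field(default_factory=list)

    def realize(self, process_label):
        # Record a process (a functioning) in which the function is realized.
        self.realizations.append(process_label)

    @property
    def is_realized(self):
        return bool(self.realizations)


heart_function = Function("your heart", "to pump blood in your body")
heart_function.realize("process of pumping blood")

sperm_function = Function("this sperm", "to fertilize an egg")
# The sperm's function exists even though no realizing process is recorded.
```

The function object exists independently of any entry in `realizations`, which is the point of the slide: a function need not be realized in order to exist.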

Page 165: Ontology and the Future of Biomedical Research Barry Smith

Concepts

Biomedical ontology integration will never be achieved through integration of meanings or concepts

The problem is precisely that different user communities use different concepts

Concepts are in your head and will change as your understanding changes

Page 166: Ontology and the Future of Biomedical Research Barry Smith

Concepts

Ontologies represent types: not concepts, meanings, ideas ...

Types exist, with their instances, in objective reality
– including types of image, of imaging process, of brain region, of clinical procedure, etc.

Page 167: Ontology and the Future of Biomedical Research Barry Smith

Rules on types

Don’t confuse types with words
Don’t confuse types with concepts
Don’t confuse types with ways of getting to know types
Don’t confuse types with ways of talking about types
Don’t confuse types with data about types

Page 168: Ontology and the Future of Biomedical Research Barry Smith

Some other simple rules for high quality ontologies

Page 169: Ontology and the Future of Biomedical Research Barry Smith

Univocity

Terms should have the same meanings on every occasion of use.

They should refer to the same kinds of entities in reality.

Basic ontological relations such as is_a and part_of should be used in the same way by all ontologies.

Page 170: Ontology and the Future of Biomedical Research Barry Smith

Positivity

Complements of types are not themselves types. Hence terms such as

non-mammal
non-membrane
other metalworker in New Zealand

do not designate types in reality
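A rule like positivity lends itself to a simple automated check on term labels. The sketch below flags labels that look like complements of types; the pattern (a leading "non-" or "other") is a heuristic assumed for illustration and would miss complements expressed in other ways.

```python
import re

# Heuristic only: labels starting with "non-" or the word "other"
# are treated as probable complements of types.
COMPLEMENT_PATTERN = re.compile(r"^(non-|other\b)", re.IGNORECASE)


def looks_like_complement(label: str) -> bool:
    """Return True if the label appears to name the complement of a type."""
    return bool(COMPLEMENT_PATTERN.match(label.strip()))


labels = ["non-mammal", "non-membrane", "other metalworker in New Zealand", "mammal"]
flagged = [label for label in labels if looks_like_complement(label)]
```

A curator would still need to review each flagged label by hand, since a string pattern cannot decide whether a term designates a type in reality.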

Page 171: Ontology and the Future of Biomedical Research Barry Smith

Ontology of types ≠ logic of terms

There are no conjunctive and disjunctive types:

anatomic structure, system, or substance

musculoskeletal and connective tissue disorder

rheumatism, excluding the back

Page 172: Ontology and the Future of Biomedical Research Barry Smith

Objectivity

Which types exist in reality is not a function of our knowledge.

Terms such as

unknown
unclassified
unlocalized
arthropathies not otherwise specified

do not designate types in reality.
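Terms that encode the state of our knowledge, as opposed to a type in reality, can also be searched for mechanically. The marker list below is taken from the examples on this slide; treating a substring match as a warning is an assumption of this sketch, not an established validation rule.

```python
# Markers taken from the slide's examples of epistemic terms.
EPISTEMIC_MARKERS = (
    "unknown",
    "unclassified",
    "unlocalized",
    "not otherwise specified",
)


def epistemic_markers(label):
    """Return the epistemic markers found in a term label (case-insensitive)."""
    lowered = label.lower()
    return [marker for marker in EPISTEMIC_MARKERS if marker in lowered]


found = epistemic_markers("arthropathies not otherwise specified")
clean = epistemic_markers("arthropathy")
```

A non-empty result suggests that the information belongs in a statement about what is known, not in a new class.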

Page 173: Ontology and the Future of Biomedical Research Barry Smith

Keep Epistemology Separate from Ontology

If you want to say that

We do not know where A’s are located

do not invent a new class of A’s with unknown locations

(A well-constructed ontology should grow linearly; it should not need to delete classes or relations because of increases in knowledge)

Page 174: Ontology and the Future of Biomedical Research Barry Smith

Syntactic Separateness

Do not confuse sentences with terms

If you want to say

I surmise that this is a case of pneumonia

do not invent a new class of surmised pneumonias

Page 175: Ontology and the Future of Biomedical Research Barry Smith

Single Inheritance

No kind in a classificatory hierarchy should have more than one is_a parent on the immediate higher level
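The single inheritance rule is easy to check once the is_a links are available as pairs. The following sketch uses the "blue car" example from the Multiple Inheritance slides; the edge-list representation and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def multiple_parents(is_a_edges):
    """Map each term with more than one is_a parent to its sorted parents."""
    parents = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)
    return {
        child: sorted(found)
        for child, found in parents.items()
        if len(found) > 1
    }


# (child, parent) pairs for the blue car example
edges = [
    ("car", "thing"),
    ("blue thing", "thing"),
    ("blue car", "car"),
    ("blue car", "blue thing"),
]
violations = multiple_parents(edges)
```

Here `violations` reports that "blue car" has the two parents "blue thing" and "car", which is exactly the pattern the rule forbids.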

Page 176: Ontology and the Future of Biomedical Research Barry Smith

Multiple Inheritance

thing

car

blue thing

blue car

is_a is_a

Page 177: Ontology and the Future of Biomedical Research Barry Smith

Multiple Inheritance

is a source of errors
encourages laziness
serves as obstacle to integration with neighboring ontologies
hampers use of Aristotelian methodology for defining terms

Page 178: Ontology and the Future of Biomedical Research Barry Smith

Multiple Inheritance

thing

car

blue thing

blue car

is_a1 is_a2

Page 179: Ontology and the Future of Biomedical Research Barry Smith

is_a Overloading

The success of ontology alignment demands that ontological relations (is_a, part_of, ...) have the same meanings in the different ontologies to be aligned.

Page 180: Ontology and the Future of Biomedical Research Barry Smith

Example: is_a is pressed into service by the GO to express location

is-located-at and similar relations are expressed by creating special compound terms using:

site of …
… within …
… in …
extrinsic to …

yielding associated errors

Page 181: Ontology and the Future of Biomedical Research Barry Smith

e.g. errors with ‘within’

lytic vacuole within a protein storage vacuole

lytic vacuole within a protein storage vacuole is-a protein storage vacuole

Compare: embryo within a uterus is-a uterus
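The error pattern on this slide can be detected from the term labels alone: an is-a assertion is suspect when the child is a compound "X within Y" term and the parent is the container Y. The regular expression below is a rough heuristic assumed for this sketch; it is not part of GO or of any GO tooling.

```python
import re

# Matches labels of the form "<inner> within [a|an] <outer>".
WITHIN = re.compile(r"^(?P<inner>.+?) within (?:an? )?(?P<outer>.+)$")


def suspicious_is_a(child, parent):
    """Return True if `child` is 'X within Y' and `parent` is the container Y."""
    match = WITHIN.match(child)
    return bool(match) and match.group("outer") == parent


print(suspicious_is_a(
    "lytic vacuole within a protein storage vacuole",
    "protein storage vacuole",
))
print(suspicious_is_a("embryo within a uterus", "uterus"))
print(suspicious_is_a("lytic vacuole", "vacuole"))
```

The first two calls are flagged because the thing located within the container is being classified as a kind of container; the third is an ordinary is-a link and passes.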

Page 182: Ontology and the Future of Biomedical Research Barry Smith

similar problems with part_of

extrinsic to membrane part_of membrane

Page 183: Ontology and the Future of Biomedical Research Barry Smith

Compositionality

The meanings of compound terms should be determined
1. by the meanings of component terms
together with
2. the rules governing syntax

Page 184: Ontology and the Future of Biomedical Research Barry Smith

Why do we need rules/standards for good ontology?

Ontologies must be intelligible both to humans (for annotation and curation) and to machines (for reasoning and error-checking): the lack of rules for classification leads to human error and blocks automatic reasoning and error-checking

Intuitive rules facilitate training of curators and annotators

Common rules allow alignment with other ontologies

Page 185: Ontology and the Future of Biomedical Research Barry Smith

When we annotate the record of an experiment

we use terms representing types to capture what we learn about:
– this experiment (instance), performed here and now, in this laboratory
– the instances experimented upon

These instances are typical = they are representatives of types
– of experiment (described in FuGO)
– of gene product molecules, molecular functions, cellular components, biological processes (described in GO)

Page 186: Ontology and the Future of Biomedical Research Barry Smith

Experimental records

document a variety of instances (particular real-world examples or cases), ranging from instances of gene products (including individual molecules) to instances of biochemical processes, molecular functions, and cellular locations

Page 187: Ontology and the Future of Biomedical Research Barry Smith

Experimental records

provide evidence that gene products of given types have molecular functions of given types by documenting occurrences in the real world that involve corresponding instances of functioning.

They document the existence of real-world molecules that have the potential to execute (carry out, realize, perform) the types of molecular functions that are involved in these occurrences.

Page 188: Ontology and the Future of Biomedical Research Barry Smith

Motivation: To capture reality

Inferences and decisions we make are based upon what we know of reality.

An ontology is a computable representation of biological reality, which is designed to enable a computer to reason over the data we collect about this reality in (some of) the ways that we do.