collaborative ontology development by scientists

Collaborative ontology development by scientists Melissa Haendel


Post on 24-Feb-2016


Page 1: Collaborative ontology development by scientists

Collaborative ontology development by scientists

Melissa Haendel

Page 2: Collaborative ontology development by scientists

Setting the stage

1. Who we are and what we need
2. What our bottlenecks are:
   - Getting info from the domain experts
   - Ontology tools
   - Synchronizing ontologies
3. Modularizing anatomy ontologies
4. Ideas for collaborative ontology editing

Page 3: Collaborative ontology development by scientists

Who are we?

Domain experts: anatomists, comparative morphologists, developmental biologists, immunologists, neuroscientists, etc.

Ontologists: biologists-gone-informatics, computer scientists, and logicians

Engineers: our tool builders

What do we want?

Ontologies and tools to develop them

Domain experts: want to query for gene expression and phenotypes across species

Ontologists: have to be able to interpret and represent domain knowledge computationally

Engineers: have to build tools that can consume ontologies and give the domain experts the right results

Page 4: Collaborative ontology development by scientists

Anatomy and phenotype ontologies have to work hard for us

Ontologies must be intelligible to:
- Humans
- Machines

Ontologies enable:
- Comparison of structures across different organisms
- Standardization of vocabulary among communities
- Integration across databases
- Queries across large amounts of data
- Automatic reasoning to infer related classes
- Error checking
- Annotation consistency

Page 5: Collaborative ontology development by scientists

Term needed for annotation

Ontology development workflow and bottlenecks

reconcile

Page 6: Collaborative ontology development by scientists

Term requested

Ontology development workflow and bottlenecks

reconcile

Page 7: Collaborative ontology development by scientists

Term discussed by community

Ontology development workflow and bottlenecks

reconcile

Page 8: Collaborative ontology development by scientists

Ontology development workflow and bottlenecks

reconcile

Page 9: Collaborative ontology development by scientists

Ontology development workflow and bottlenecks

[Workflow diagram labels: GO, CL, CARO, TAO, AAO, XAO, ZFA, MA, MP, UBERON; reconcile]

Synchronize?

Page 10: Collaborative ontology development by scientists

Three bottlenecks

1) Extracting domain knowledge into an ontology efficiently

2) Multiple ontology editing tools, each with pros and cons, none easily used by domain experts

3) Synchronization across interoperable ontologies

Page 11: Collaborative ontology development by scientists

How can we increase the efficiency of extracting knowledge from domain experts?

An example of what has worked well so far:

1862 Christian Schussele

Familiar tooling: Google Docs, Phenote, Excel
Visualization: Cmap, Vue, GraphViz

Need to merge different sources of information
Need a way to get this information into a computable form

Page 12: Collaborative ontology development by scientists

Two ontology editors (and viewers) commonly used by the biomedical community

OBOEdit: OBO ontology editor and viewer
http://oboedit.org/

Protégé: OWL ontology editor and viewer
http://protege.stanford.edu/

Both tools are non-trivial to learn to use.
Neither offers many bulk operations, imports or exports different formats easily, or deals with synchronization readily.

There is a barrier to domain experts contributing knowledge, and a bottleneck for editors in getting this knowledge into ontologies efficiently

More biologist-friendly (thank you John!)

Tool used by broader community

Page 13: Collaborative ontology development by scientists

How to synchronize ontologies

Three approaches:
- Mapping (bioportal set, ..)
- Direct reconciliation (TAO and ZFA)
- Synchronization using imports

Page 14: Collaborative ontology development by scientists

Ontology mappings are often not useful

Examples of mapped term pairs:
- FMA (human) tibia <-> FBbt (fruit fly) tibia
- FMA extensor retinaculum of wrist <-> MA retina
- GAZ (geography) Colon <-> FMA (human) Colon
- ZFA (zebrafish) aortic arch <-> MA (mouse) arch of aorta
- GAZ (geography) Serpentine <-> CHEBI (chemistry) serpentine
- Dictyostelium giant cell <-> FMA giant cell
- ZFA (zebrafish) blastoderm <-> FBbt blastoderm stage
- PATO (quality) male <-> CHEBI (chemistry) maleate 2(-)

(For anatomy, you may want to remove the mappings that NCBO BioPortal creates for your ontology and/or ask that mapping not be allowed.)
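One way to see why such mappings go wrong is to sketch a purely lexical matcher. The snippet below is only an illustration under stated assumptions: the three mini-ontologies, their `domain` tags, and the function name `lexical_mappings` are hypothetical, and this is not a description of how NCBO BioPortal actually computes its mappings.

```python
from itertools import product

# Hypothetical mini-ontologies: a domain tag plus a handful of term labels.
GAZ = {"domain": "geography", "labels": ["Colon", "Serpentine"]}
FMA = {"domain": "human anatomy", "labels": ["Colon", "Tibia"]}
CHEBI = {"domain": "chemistry", "labels": ["serpentine"]}


def lexical_mappings(a, b):
    """Pair labels that match case-insensitively, ignoring what the terms mean."""
    return [
        (x, y)
        for x, y in product(a["labels"], b["labels"])
        if x.lower() == y.lower()
    ]


def suspicious(a, b):
    """Flag lexical matches between ontologies that cover different domains."""
    if a["domain"] == b["domain"]:
        return []
    return lexical_mappings(a, b)


if __name__ == "__main__":
    print(lexical_mappings(GAZ, FMA))   # a place name paired with an organ
    print(suspicious(GAZ, CHEBI))       # a place name paired with a chemical
```

Matching on strings alone pairs a place with an organ, which is the kind of result the slide warns about; even a crude domain check like `suspicious` would at least surface those pairs for review.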

Page 15: Collaborative ontology development by scientists

Reconciliation and linking between TAO and ZFA

Zebrafish terms are is_a subtypes of teleost terms

[Diagram: Zebrafish Anatomy terms linked by is_a to Teleost Anatomy Ontology terms]

Logic implemented via xrefs: difficult to keep synchronized
Xref logic can be less clear and more difficult to use

Page 16: Collaborative ontology development by scientists

Synchronization by import across ontologies

One can import a whole ontology or just portions of another ontology
MIREOT: Minimum Information to Reference an External Ontology Term

This strategy requires better facilities while editing

[Diagram labels: CARO, VAO, present TAO, modularized ontology]
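A minimal sketch of the MIREOT idea, assuming the usual summary of the guideline: keep only the source ontology URI, the external term URI, and the superclass under which the term is placed in the importing ontology (plus a label for readability). The dictionary layout, the `example.org` URIs, and the names `ExternalTermRef` and `mireot_reference` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalTermRef:
    """The small record kept in the importing ontology for one external term."""
    source_ontology: str
    term_uri: str
    target_superclass: str
    label: str


# Hypothetical external ontology with more detail than we want to import.
EXTERNAL = {
    "uri": "http://example.org/upper-anatomy.owl",
    "terms": {
        "http://example.org/UA_0000001": {
            "label": "bone",
            "definition": "An illustrative definition.",
            "synonyms": ["osseous element"],
            "parents": ["http://example.org/UA_0000000"],
        },
    },
}


def mireot_reference(external, term_uri, target_superclass):
    """Reference one external term without importing the whole ontology."""
    term = external["terms"][term_uri]  # raises KeyError for unknown terms
    return ExternalTermRef(
        source_ontology=external["uri"],
        term_uri=term_uri,
        target_superclass=target_superclass,
        label=term["label"],
    )


if __name__ == "__main__":
    ref = mireot_reference(
        EXTERNAL,
        "http://example.org/UA_0000001",
        "http://example.org/local/skeletal_element",  # hypothetical local class
    )
    print(ref)
```

The definition, synonyms, and external parents are deliberately left behind; the importing ontology decides where the term sits locally, and the source remains the authority for everything else.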

Page 17: Collaborative ontology development by scientists

OntoFox: a Web Server for MIREOTing
http://ontofox.hegroup.org

Good things:
- Based on MIREOT principle
- Web-based data input and output
- Output OWL file can be directly imported in your ontology
- No programming needed
- Programmatically accessible

Improvements:
- Integration into ontology editing tools
- More customizable

Page 18: Collaborative ontology development by scientists

We need synchronization solutions that are integrated within ontology editing tools

Page 19: Collaborative ontology development by scientists

What IS the anatomy ontology landscape?
How can we efficiently build our anatomy ontologies to be most interoperable?

We could have built:
A single ontology for ontology editors and consumers
Different editors have editing rights to different ontology partitions
- by taxon
- by domain (e.g. neuroscience, skeletal anatomy)
No taxon-specific subtypes: use structure, function, etc. as differentia
Dynamic views according to user needs

Page 20: Collaborative ontology development by scientists

Ontology landscape model view

[Figure: sample terms (cell, tissue, muscle tissue, mesonephros, limb, antenna, weberian ossicle, mammary gland, nervous system, mollusc foot, tentacle, mantle, pupal DN3 period neuron, mushroom body, brachial lobe, pons, metencephalon, ventral nerve cord, vertebra, vertebral column, circulatory system, appendage, mesoderm, gut, tibia, tibiafibula, gland, bone, skeletal tissue, parietal bone, fin, gonad, trachea, respiratory airway, larva) joined by links (small sample), with user/editor views: neuro view, skeletal view, mammalian view, mollusc view]

Page 21: Collaborative ontology development by scientists

Proposed model moving forward

- Maintain a series of ontologies at different taxonomic levels: euk, plant, metazoan, vertebrate, mollusc, arthropod, insect, mammal, human, drosophila
- Each ontology imports/MIREOTs the relevant subset of the ontology “above” it; this is recursive
- Subtypes are only introduced as needed
- Work together on commonalities at the appropriate level above your ontology
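The recursive import in the proposed model can be sketched as follows. This is a toy illustration under stated assumptions: the level names follow the slides, but the assignment of terms to levels, the `wanted` subsets, and the function `resolve_terms` are hypothetical, and real imports would operate on OWL or OBO files rather than Python sets.

```python
# Each ontology lists its own terms, the ontology "above" it, and the subset
# of that upper ontology's visible terms it chooses to import.
ONTOLOGIES = {
    "caro": {
        "imports_from": None,
        "terms": {"cell", "tissue"},
        "wanted": set(),
    },
    "metazoa": {
        "imports_from": "caro",
        "terms": {"muscle tissue"},
        "wanted": {"cell", "tissue"},
    },
    "vertebrata": {
        "imports_from": "metazoa",
        "terms": {"mesonephros", "limb"},
        "wanted": {"tissue", "muscle tissue"},
    },
    "teleost": {
        "imports_from": "vertebrata",
        "terms": {"weberian ossicle"},
        "wanted": {"tissue", "mesonephros"},
    },
}


def resolve_terms(name):
    """Return an ontology's own terms plus the subset imported from above.

    The upper ontology is resolved first, so a term can travel down through
    several levels as long as each level in between imports it.
    """
    onto = ONTOLOGIES[name]
    terms = set(onto["terms"])
    upper = onto["imports_from"]
    if upper is not None:
        terms |= resolve_terms(upper) & onto["wanted"]
    return terms


if __name__ == "__main__":
    print(sorted(resolve_terms("teleost")))
```

In this toy data, "tissue" is defined once at the top level and reaches the teleost ontology only because every intermediate level imports it, while "limb" stays at the vertebrate level because the teleost ontology does not ask for it.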

Page 22: Collaborative ontology development by scientists

Model view

[Figure: ontologies at different taxonomic levels connected by import: caro / uberon / all, metazoa, vertebrata, arthropoda, mollusca, cephalopod, teleost, amphibia, mammalia, zebrafish, drosophila, mouse, human. Sample terms shown: cell, tissue, muscle tissue, mesonephros, limb, antenna, weberian ossicle, mammary gland, nervous system, foot, tentacle, mantle, neuron types XYZ, mushroom body, brachial lobe, NO pons, vertebra, vertebral column, circulatory system, appendage, mesoderm, gut, tibia, tibiafibula, gland, bone, skeletal tissue, parietal bone, fin, gonad, trachea, respiratory airway, larva, shell, cuticle, skeleton. Cross-ontology links (sample) connect terms.]

Page 23: Collaborative ontology development by scientists

Idealized protocol for new AOs

1. Collect draft list of terms
2. Subdivide roughly into applicability at taxonomic levels
3. Request new terms from existing AOs above you
4. Is a new mid-level AO required?
   - yes: collaborate and create, go to 1.
5. Import pre-reasoned subset from next AO above
6. Build your ontology (David will take it from here in his talk later today)

Page 24: Collaborative ontology development by scientists

Modularizing ontologies: positive reinforcement

- Identify key points of integration between ontologies
- Modularize based on domain or taxon
- Import and reuse rather than cross-referencing or “aligning”
- Let the reasoner help do the work
- Work together to distribute work

Page 25: Collaborative ontology development by scientists

Modularizing ontologies – We need:

• To get the imports working well
• To have distributed social responsibility assigned
• Design patterns to ensure we are all doing the same thing
• To check for consistency and errors across multiple ontologies using reasoners to get correct results for all users
  - These ontologies are supposed to be orthogonal but aren’t always
• Visualization tools that can aid non-ontology experts in identifying errors across multiple ontologies

Page 26: Collaborative ontology development by scientists

Returning to the bottlenecks in our process… Looking for solutions

Need easy-to-use tools for information capture
- Ideally based on existing familiar tools
- Auto-populated from/to ontologies
- Social management: who is responsible for what

Need better import/export functionality:
- into/out of ontology editors from simple collection tools
- from a myriad of ontology sources

Need better interoperability between editors/formats
Need enhanced bulk operations

Need to know specific requirements for building tools and user feedback

Need money and opportunities to interact (like this one!)

Page 27: Collaborative ontology development by scientists

Existing tools for collaborative ontology editing don’t quite get us there

Google Refine has nice features for manipulating data, including RDF exports, but isn’t collaborative

Mapping Master for Protégé enables generation of OWL from spreadsheets, but is not collaborative and requires ontology knowledge

Web Protégé isn’t fully-fledged and is not useful for non-technical contribution

Page 28: Collaborative ontology development by scientists

Ideas for collaborative ontology editing

Example: file extracted from the ontology for this meeting:
- Extracted from the ontology with a Perl script
- Needs to be edited by domain experts and then converted back to OWL
- Needs to be merged with the existing OWL file

There is a better way…

Page 29: Collaborative ontology development by scientists

Ideas for using Google Docs

Enable creation of Google spreadsheets that curators and domain experts can edit, with the following features:
- Tell the Google spreadsheet which columns are which from the ontology input file: labels, parents, URIs, xref, class, etc.
- Live-updated with latest external ontology versions using SPARQL
- Export OBO/RDF/OWL serialization
- Enable search on external ontologies via autocomplete
- Track changes

This will solve some of the sync problems because the queries are executed whenever the doc is opened or updated
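The column-mapping and export ideas can be sketched with a plain CSV export of such a spreadsheet. Everything specific here is an assumption made for illustration: the column headers, the `EX:` identifiers, and the function name `rows_to_obo` are hypothetical, and only the basic OBO stanza tags `id`, `name`, and `is_a` are emitted.

```python
import csv
import io

# Hypothetical mapping from OBO fields to the spreadsheet's column headers,
# standing in for "telling the spreadsheet which columns are which".
COLUMNS = {
    "id": "Term ID",
    "name": "Label",
    "parent_id": "Parent ID",
    "parent_name": "Parent label",
}

# Hypothetical sheet content as it might look after a CSV export.
SHEET = """Term ID,Label,Parent ID,Parent label
EX:0000001,skeletal tissue,,
EX:0000002,bone tissue,EX:0000001,skeletal tissue
"""


def rows_to_obo(csv_text, columns):
    """Turn spreadsheet rows into OBO [Term] stanzas."""
    stanzas = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines = [
            "[Term]",
            f"id: {row[columns['id']]}",
            f"name: {row[columns['name']]}",
        ]
        parent = row[columns["parent_id"]]
        if parent:  # root terms have an empty parent cell and get no is_a line
            lines.append(f"is_a: {parent} ! {row[columns['parent_name']]}")
        stanzas.append("\n".join(lines))
    return "\n\n".join(stanzas) + "\n"


if __name__ == "__main__":
    print(rows_to_obo(SHEET, COLUMNS))
```

Keeping the column mapping separate from the conversion means domain experts could name their columns however they like, as long as someone records which column plays which role.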

Page 30: Collaborative ontology development by scientists

Ideas for using Google Docs

Enable creation of Google Drawings that curators and domain experts can edit, with the following features:
- Import of external ontologies
- Have relations and classes exported out from the Google Drawing
- Export OBO/RDF/OWL serialization
- Linked to Google Spreadsheet
- Track changes

Page 31: Collaborative ontology development by scientists

Ontology editor dreams

A truly collaborative web-based editing platform (a la Web Protégé) compatible with OWL and OBO

Supporting:
- Import and export of customizable spreadsheets from Google Docs
- Creation of “live templates” (spreadsheets in sync with SPARQL endpoints)
- MIREOT import
- User roles and permissions
- Web-based versioning
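A "live template" could, for instance, refresh the labels in a sheet from a SPARQL endpoint each time it is opened. The sketch below only builds the query text and applies the answer; the `example.org` URIs, the row layout, and the functions `label_query` and `refresh_labels` are hypothetical, and the network call is deliberately left to a caller-supplied `fetch_labels` function so that no particular endpoint or client library is assumed.

```python
def label_query(term_uris):
    """Build a SPARQL 1.1 query that fetches rdfs:label for the given terms."""
    values = " ".join(f"<{uri}>" for uri in term_uris)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?term ?label WHERE {\n"
        f"  VALUES ?term {{ {values} }}\n"
        "  ?term rdfs:label ?label .\n"
        "}"
    )


def refresh_labels(rows, fetch_labels):
    """Update row labels in place and return the URIs whose label changed.

    rows: list of dicts with "uri" and "label" keys (one per sheet row).
    fetch_labels: callable taking the query text and returning {uri: label};
                  a real version would send the query to a SPARQL endpoint.
    """
    latest = fetch_labels(label_query([row["uri"] for row in rows]))
    changed = []
    for row in rows:
        new_label = latest.get(row["uri"])
        if new_label is not None and new_label != row["label"]:
            row["label"] = new_label
            changed.append(row["uri"])
    return changed


if __name__ == "__main__":
    sheet = [
        {"uri": "http://example.org/T1", "label": "old label"},
        {"uri": "http://example.org/T2", "label": "fin"},
    ]

    def fake_endpoint(query):
        # Stand-in for a real endpoint: returns canned labels.
        return {"http://example.org/T1": "new label", "http://example.org/T2": "fin"}

    print(label_query([row["uri"] for row in sheet]))
    print(refresh_labels(sheet, fake_endpoint))
```

Returning the list of changed URIs gives the sheet something to highlight, which ties the live update to the "track changes" wish on the earlier slides.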