Collaborative Ontology building: So much more than authoring an Ontology

Robert Stevens, BioHealth Informatics Group, The University of Manchester, Manchester, United Kingdom

Upload: robertstevens65

Post on 12-Nov-2014


DESCRIPTION

Keynote talk at Workshop on Collaborative Construction, Management and Linking of Structured Knowledge (CK 2009), Washington, 2009

TRANSCRIPT

Page 1: Collaborative Ontology building: So much more than authoring an Ontology

Collaborative Ontology building: So much more than authoring an Ontology

Robert Stevens
BioHealth Informatics Group
The University of Manchester
Manchester
United Kingdom

Page 2: Collaborative Ontology building: So much more than authoring an Ontology

Overview

• An experiment in collaborative authoring
• Issues raised
• Observations made
• The process and the artefact
• Bits of technology

Page 3: Collaborative Ontology building: So much more than authoring an Ontology

Ontologists: What’s their Problem?

David Randall
Manchester Metropolitan University

Page 4: Collaborative Ontology building: So much more than authoring an Ontology

What do I Know about Collaborative Ontology Authoring?

• “you’ve never built a real ontology”
• Advisor in projects
• Experiments in collaborative authoring
• Doing it for real in a Kidney and Urinary Pathway Ontology
• Informal observational studies with collaborative Protégé

Page 5: Collaborative Ontology building: So much more than authoring an Ontology

The Software Engineering Life-Cycle

Ontology

Page 6: Collaborative Ontology building: So much more than authoring an Ontology

Issues in Ontology Authoring

SCOPE

COMPLEXITY

COST

AUTHORING

EVALUATION

http://ontogenesis.ontonet.org/ppt/Issues_mindmapSB.pdf

Page 7: Collaborative Ontology building: So much more than authoring an Ontology

The NCL Study

• A small group met to normalise the OBO Cell Ontology (CL)
• Transform an axiomatically lean hand-crafted “tangled” ontology to:
• An axiomatically rich ontology where the structure is computationally maintained
• Study the process and deliver the artefact
• http://www.gong.manchester.ac.uk/CTON.html
• Two two-day meetings; videoed and observed by an ethnographer
• Part of the OntoGenesis network

Page 8: Collaborative Ontology building: So much more than authoring an Ontology

Contractile cell CL

Page 9: Collaborative Ontology building: So much more than authoring an Ontology

What is Ontology Normalisation?

• Hand-crafted ontologies with multiple inheritance are “tangled”
• Usually axiomatically lean
• We classify along one axis and use “restrictions” to other modules to capture other axes
• Then re-build the multiple inheritance using the axiomatically rich ontology

Page 10: Collaborative Ontology building: So much more than authoring an Ontology

Tangled Ontology of Cars

Tangled Untangled Inferred

Page 11: Collaborative Ontology building: So much more than authoring an Ontology

Contractile cell nCL

Page 12: Collaborative Ontology building: So much more than authoring an Ontology

The People

• Ten people “friends and family”
• All some sort of biologists
• All familiar with OWL and normalisation
• All “singing from the same hymn sheet”

Page 13: Collaborative Ontology building: So much more than authoring an Ontology

The Overall Process

• Analyse issues in current OBO CL
• Determine primary axis of classification
• Identify supporting ontologies
• Identify properties and design patterns; determine representation
• Gather knowledge
• Generate OWL encoding
• Evaluate, iterate
• Two face-to-face meetings; separate work; email and Skype

Page 14: Collaborative Ontology building: So much more than authoring an Ontology

Questions Raised

• When do we work as a larger group, in smaller groups, and singly?
• What resources do we use?
• Who knows what?
• What strategies do we use?
• What expertise do we need?
• What are the vested interests?

Page 15: Collaborative Ontology building: So much more than authoring an Ontology

Producing the “schema”

• What is it we want to say about cells?
• How do we want to say it?
• Most time was spent on these questions (one day)
• Best face to face as the whole group
• Perhaps a fait accompli in the large
• Lots of modifications through debate
• Strong chair and process (“benevolent dictatorship”)

Ethnographer’s Observations!

“what about sea urchins?”

“I don’t know about plants”

Page 16: Collaborative Ontology building: So much more than authoring an Ontology

NCL Schema Captured in a Spreadsheet

Term Name | CTO id | ploidy | morphology | Cellular component | size | germ line | nucleation | process
slow muscle cell | CL:0000189 | | PATO:0001873 | GO:0030017 ; GO:0005739 | Large | n/a | PATO:0001908 | GO:0031444
blue sensitive photoreceptor cell | CL:0000495 | PATO:0001394 | PATO:0001154 ; PATO:0001873 | | Large | Somatic | PATO:0001407 | GO:0050908 ; GO:0007603
green sensitive photoreceptor cell | CL:0000496 | PATO:0001394 | PATO:0001154 ; PATO:0001873 | | Large | Somatic | PATO:0001407 | GO:0050908 ; GO:0007603
R1 photoreceptor cell | CL:0000687 | PATO:0001394 | ?? | | Variable | Somatic | PATO:0001407 | GO:0050908 ; GO:0007603

Page 17: Collaborative Ontology building: So much more than authoring an Ontology

CL normalisation Workflow

Ontology API

Page 18: Collaborative Ontology building: So much more than authoring an Ontology

CL Spreadsheet

Page 19: Collaborative Ontology building: So much more than authoring an Ontology

The Ontology Preprocessor Language

• Adding “select”, “add” and “remove” keywords to MOS (Manchester OWL Syntax)
• A “scripting” language for OWL
• We generate a list of instructions to build an ontology
• We can embed patterns into this generation
• Saves “mouse clicks”
• Rapid production of large amounts of ontology
• Easy to apply changes; acts as a macro language

Page 20: Collaborative Ontology building: So much more than authoring an Ontology

OPPL sample

ADD Class: CL_0000811;
REMOVE subClassOf owl:Thing;
ADD label ``CD8-positive, alpha-beta immature T cell'';
ADD subClassOf cto:Cell;
ADD subClassOf cto:has_ploidy some pato:PATO_0001394;
ADD comment ``MORPHOLOGY: pleiomorphic'';
ADD comment ``CELULAR COMPONENT: '';
ADD subClassOf cto:has_size some cto:Small;
ADD comment ``GERM LINE: n/a'';
ADD subClassOf cto:has_nucleation some pato:PATO_0001407;
ADD subClassOf cto:participates_in some go:GO_2456;
ADD subClassOf cto:participates_in some go:GO_0021700;
ADD subClassOf cto:participates_in some go:GO_0032940;
ADD comment ``PROCESS: '';
ADD comment ``LINEAGE: mesoderm'';
ADD subClassOf cto:appears_in some cto:Animalia;
ADD comment ``ORGANISM COMMENT: '';
ADD subClassOf cto:potentiality some cto:TerminallyDifferentiated;
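A rough sketch of how one spreadsheet row might be turned into instructions of the kind shown in the OPPL sample. The statement forms are copied from that sample. The function names, the column-to-property mapping and the reading of which spreadsheet column holds which value are assumptions made for illustration, not the NCL group's actual generator.

```python
# Hypothetical mapping from spreadsheet columns to properties,
# modelled on the property names that appear in the OPPL sample.
COLUMN_TO_PROPERTY = {
    "ploidy": "cto:has_ploidy",
    "nucleation": "cto:has_nucleation",
    "process": "cto:participates_in",
}


def to_curie(term_id):
    """Turn 'PATO:0001394' into 'pato:PATO_0001394', as written in the sample."""
    prefix, local = term_id.split(":")
    return f"{prefix.lower()}:{prefix}_{local}"


def row_to_oppl(row):
    """Generate a list of OPPL-style instructions for one spreadsheet row."""
    class_name = row["id"].replace(":", "_")
    lines = [
        f"ADD Class: {class_name};",
        "REMOVE subClassOf owl:Thing;",
        f"ADD label ``{row['name']}'';",
        "ADD subClassOf cto:Cell;",
    ]
    for column, prop in COLUMN_TO_PROPERTY.items():
        cell = row.get(column, "")
        for term_id in (part.strip() for part in cell.split(";")):
            if ":" in term_id:  # skips blanks, 'n/a' and '??'
                lines.append(f"ADD subClassOf {prop} some {to_curie(term_id)};")
    return lines


# One row of the spreadsheet (column assignment is an assumed reading).
row = {
    "name": "blue sensitive photoreceptor cell",
    "id": "CL:0000495",
    "ploidy": "PATO:0001394",
    "nucleation": "PATO:0001407",
    "process": "GO:0050908 ; GO:0007603",
}
print("\n".join(row_to_oppl(row)))
```

Because the pattern lives in one function, changing how a column is represented means editing one line and regenerating, which is the "macro language" benefit the slide describes.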

Page 21: Collaborative Ontology building: So much more than authoring an Ontology

What we Generate

Class: 'CD8-positive alpha-beta immature T cell'
    SubClassOf:
        Cell,
        has_morphology some pleomorphic,
        has_nucleation some mononucleate,
        has_ploidy some diploid,
        has_potentiality some TerminallyDifferentiated,
        derives_from some 'double-positive alpha-beta immature T cell',
        located_in some 'Animalia',
        not (participates_in some gametogenesis),
        participates_in some 'T cell mediated immunity',
        participates_in some 'developmental maturation',
        participates_in some 'secretion by cell'

Page 22: Collaborative Ontology building: So much more than authoring an Ontology

A Defined Class

Class: “diploid cell”
EquivalentTo: cell that has_ploidy some diploid

• Picks up all cells that has_ploidy some diploid
• Trivial, but difficult to do by hand and be complete

Class: “germline cell”
EquivalentTo: cell that (participates_in some gametogenesis) or (directly_derived_from some gamete)
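The effect of these defined classes can be imitated with a toy, set-based stand-in for a reasoner. This is only a sketch: a real description logic reasoner also handles class and property hierarchies, negation and much else. The cell names `cell_a`, `cell_b` and `cell_c` are hypothetical placeholders, not entries in the ontology.

```python
# Asserted descriptions: class name -> set of (property, filler) restrictions.
# The class names here are hypothetical placeholders.
ASSERTED = {
    "cell_a": {("has_ploidy", "diploid")},
    "cell_b": {("has_ploidy", "diploid"), ("participates_in", "gametogenesis")},
    "cell_c": {("directly_derived_from", "gamete")},
}

# Defined classes: each definition is a list of alternatives ("or");
# each alternative is a set of restrictions that must all hold ("and").
DEFINED = {
    "diploid cell": [{("has_ploidy", "diploid")}],
    "germline cell": [
        {("participates_in", "gametogenesis")},
        {("directly_derived_from", "gamete")},
    ],
}


def inferred_parents(asserted, defined):
    """Place each asserted class under every defined class it satisfies."""
    result = {}
    for name, restrictions in asserted.items():
        result[name] = sorted(
            defined_name
            for defined_name, alternatives in defined.items()
            if any(alternative <= restrictions for alternative in alternatives)
        )
    return result


for name, parents in inferred_parents(ASSERTED, DEFINED).items():
    print(name, "->", parents)
```

The multiple inheritance is never written by hand: `cell_b` ends up under both defined classes purely from what was said about it, which is the point of normalisation.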

Page 23: Collaborative Ontology building: So much more than authoring an Ontology

The Representation

• Aligning with RO and most OBO conventions
• Red_blood_cell participates_in some Oxygen_transport
• Red_blood_cell has_disposition some (realisable_entity that is_realised_in some oxygen_transport)
• First is simple and useful, but not actually true
• Second is more ontologically formal and “right”; can easily expand the “schema” to either representation
• Do experiments with patterns
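Expanding the same fact into either representation can be sketched as two interchangeable templates. The axiom shapes are the two shown on this slide; the function names and the `PATTERNS` table are hypothetical, introduced only for illustration.

```python
def simple_pattern(cell, process):
    """The simple form: the cell participates in the process."""
    return f"{cell} participates_in some {process}"


def disposition_pattern(cell, process):
    """The more formal form: the cell bears a disposition realised in the process."""
    return (
        f"{cell} has_disposition some "
        f"(realisable_entity that is_realised_in some {process})"
    )


# Hypothetical lookup table so the pattern can be switched with one argument.
PATTERNS = {"simple": simple_pattern, "formal": disposition_pattern}


def expand(facts, pattern):
    """Render every (cell, process) fact with the chosen pattern."""
    render = PATTERNS[pattern]
    return [render(cell, process) for cell, process in facts]


facts = [("Red_blood_cell", "Oxygen_transport")]
print(expand(facts, "simple"))
print(expand(facts, "formal"))
```

Since the gathered knowledge is kept separate from the pattern, switching representation is a regeneration step rather than a re-authoring exercise.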

Page 24: Collaborative Ontology building: So much more than authoring an Ontology

Entity Quality or Entity Property Quality Pattern?

• At least two ways of representing qualities
• Need only one instance of a quality type inhering in each entity
• has_quality exactly 1 diploid
• coupled with has_quality max 1 ploidy
• Otherwise:
• has_ploidy some diploid
• has_ploidy is functional and in property hierarchy under has_quality
• Again, applying patterns is easy; do experiments; gain consistency
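The second pattern (a functional has_ploidy placed under has_quality) can be illustrated with a small closed-world check. This is a simplification, not OWL semantics: it assumes that differently named fillers are disjoint, and the helper names are hypothetical.

```python
# Assumptions for this sketch: has_ploidy is functional and sits under
# has_quality in the property hierarchy, as on the slide.
FUNCTIONAL = {"has_ploidy"}
SUPER_PROPERTY = {"has_ploidy": "has_quality"}


def functional_clashes(restrictions):
    """Return functional properties that were given more than one filler.

    Treats differently named fillers as disjoint, which is a simplification.
    """
    fillers = {}
    for prop, filler in restrictions:
        fillers.setdefault(prop, set()).add(filler)
    return sorted(
        prop for prop, values in fillers.items()
        if prop in FUNCTIONAL and len(values) > 1
    )


def implied_qualities(restrictions):
    """Lift each restriction to its super-property, where one is declared."""
    return sorted({(SUPER_PROPERTY.get(prop, prop), filler)
                   for prop, filler in restrictions})


good = [("has_ploidy", "diploid"), ("has_size", "Small")]
bad = [("has_ploidy", "diploid"), ("has_ploidy", "haploid")]
print(functional_clashes(good))
print(functional_clashes(bad))
print(implied_qualities([("has_ploidy", "diploid")]))
```

A check like this can run over every generated description, so each pattern can be tried across the whole ontology and compared.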

Page 25: Collaborative Ontology building: So much more than authoring an Ontology

Time Spent

• First two-day meeting
• One day “planning the schema”
• Half a day describing 30 cells and producing an ontology
• An hour or so evaluating and re-generating
• Quick iterations and always having an ontology to look at

Page 26: Collaborative Ontology building: So much more than authoring an Ontology

The Second Meeting

• Six months gathering material
• An hour or so of review all together
• Pairs adding more material
• A review
• More pair work
• More review
• Then dispersed activity (all “spare time”)
• Short iteration periods (in terms of work spent)

Page 27: Collaborative Ontology building: So much more than authoring an Ontology

Resources used

• Brain power
• The Web – Wikipedia is our friend
• Other ontologies
• Text books (minor use)
• Research papers
• The developing ontology and the reasoner
• Phone a friend (who is an authority in the field?)

Page 28: Collaborative Ontology building: So much more than authoring an Ontology

Identifying Issues in OBO CL

• CL generated in a few days and not really touched (not true now)

• Lots of well recognised issues: Wrong biology; missing biology; ontological defects; …

• Still observed to be very useful
• Issues gave us some “tests”

Page 29: Collaborative Ontology building: So much more than authoring an Ontology

Identifying Supporting Ontologies

[Figure: the CL Ontology and its supporting ontologies, with example terms]

• PATO Qualities: Nucleation, Morphology, Size, Ploidy
• GO Biological Process: Muscle Contraction, Secretion
• GO Cellular Component: Chloroplast, Cell Membrane
• NCBI Taxonomy: Bacillus anthracis str. Ames
• FMA Anatomy: Epithelium, Kidney

Page 30: Collaborative Ontology building: So much more than authoring an Ontology

“It lets me do the biology”

• Is what one of our biologists said
• I can see what we’ve said about a cell
• I can see where it is in the structure
• I relate the two
• The work is “turned around”: thinking about the biology and its consequences
• P1: flight muscle cell, that’s interesting ... no, a cardiac muscle cell is not a skeletal muscle cell!!
• P2: a flight muscle cell is never a cardiac muscle cell.
• “Why has it put it there?”
• Here “it” is the reasoner

Page 31: Collaborative Ontology building: So much more than authoring an Ontology

Strategies

• Pinning down the scope: Only cells in vivo
• Dealing with a representative set of cells: developing a test plan
• Collective wisdom: testing against current knowledge – “pericytes”
• Concentrating on biology and less on ontology engineering
• Using the owners and authorities

Page 32: Collaborative Ontology building: So much more than authoring an Ontology

Being “Agile”

• Software engineering has moved on from simplistic life cycles

• Agile methods are the fashion
• Embedding users
• Always have something working
• Test driven development
• Short iterations
• Deliver early

Page 33: Collaborative Ontology building: So much more than authoring an Ontology

Observations on Collaboration

• The work is not mechanical
• It involves extensive synchronous face-to-face work on deciding on scope and purpose
• It relies on a socially distributed expertise, and ‘knowing who knows’
• It involves the synchronous or rapid use of a number of different artefacts, and an understanding of how best to use them.

• It involves constant ‘testing’ and the delaying of final decisions through ambiguity resolution and error checking, and the constant recording of rationales for decision-making

Page 34: Collaborative Ontology building: So much more than authoring an Ontology

The New KUPO Process

[Figure: workflow diagram with the following stages]

• Collaborative Spreadsheet
• Individual Spreadsheet
• Semantic Wiki
• Issue Tracker
• OPPL Script Formulation
• Generate OWL
• Reasoned Ontology
• View Ontology

Page 35: Collaborative Ontology building: So much more than authoring an Ontology

Summary

• Mass direct authoring of an ontology seems bad
• In NCL we only used Protégé to “look at it” – no hand-building
• Mass knowledge gathering and commenting seems good
• Keeping “Agile” seems good
• Doing too much by hand seems bad
• Developing the schema in a team seems good
• The team should have coherent, non-clashing interests

Page 36: Collaborative Ontology building: So much more than authoring an Ontology

Acknowledgements

• Mikel Aranguren and Simon Jupp for slides
• Mikel Aranguren, Simon Jupp, Helen Parkinson, Phil Lord, David Shotton, James Malone, Jonathan Bard, Midori Harris did the work

• Dave Randall did the ethnography• The EPSRC for funding OntoGenesis