Ontology at Manchester. Robert Stevens, BioHealth Informatics Group, School of Computer Science, University of Manchester

Upload: robertstevens65

Post on 11-May-2015

TRANSCRIPT

Page 1: Ontology at Manchester

Ontology at Manchester

Robert Stevens

BioHealth Informatics Group

School of Computer Science

University of Manchester

Page 2: Ontology at Manchester

Ontology Research at Manchester

Language and Reasoning

Tools

Modelling

Page 3: Ontology at Manchester

So what is an ontology?

[Figure: the ontology spectrum, after Chris Welty et al.]

Labels on the spectrum: Catalog/ID; Thesauri; Terms/glossary; Informal is-a; Formal is-a; Formal instance; Frames (properties); General logical constraints; Value restrictions; Disjointness, inverse, part-of

Example resources placed on the spectrum: Gene Ontology; Mouse Anatomy; EcoCyc; PharmGKB; TAMBIS; Arom

Page 4: Ontology at Manchester

A Definition

• a set of logical axioms designed to account for the intended meaning of a formal vocabulary used to describe a certain (conceptualisation of) reality [Guarino 1998]

• “conceptualisation of” inserted by me

• “Logical axioms” means a formal definition of the meaning of terms in a formal language

• Formal language: something a computer can reason with

• Use symbols to make inferences

• Symbols represent things and their relationships

• Making inferences about things computationally amenable

Page 5: Ontology at Manchester

OWL

• Ontologies will form the backbone of the semantic web

• OWL is the latest standard in ontology languages from the W3C

• Layered on top of RDF and RDF Schema

• Underpinned by Description Logics

Page 6: Ontology at Manchester

OWL represents classes of instances

[Figure: classes A, B and C drawn as sets of instances]

Page 7: Ontology at Manchester

Interpretations

• Individuals are interpreted as objects

• Classes are interpreted as sets containing objects

• Properties are interpreted as binary relations on objects
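
A minimal Python sketch of this set-theoretic reading. The domain, the object names (w1, o1, h1, h2), the Atom class and the helper functions are all hypothetical and chosen only for illustration; a real DL semantics quantifies over every possible interpretation, whereas this shows just one.

```python
# One toy interpretation (all names hypothetical): a domain of objects,
# classes as sets of objects, properties as sets of ordered pairs.
domain = {"w1", "o1", "h1", "h2"}

class_ext = {
    "Molecule": {"w1"},
    "Atom": {"o1", "h1", "h2"},
    "OxygenAtom": {"o1"},
    "HydrogenAtom": {"h1", "h2"},
}

property_ext = {
    "madeOf": {("w1", "o1"), ("w1", "h1"), ("w1", "h2")},
}


def is_instance(obj, cls):
    """An individual belongs to a class if its object is in the class's set."""
    return obj in class_ext[cls]


def related(subj, prop, obj):
    """A property holds between two objects if the pair is in the relation."""
    return (subj, obj) in property_ext[prop]


def subclass_in_model(sub, sup):
    """In this single interpretation, sub lies within sup if it is a subset."""
    return class_ext[sub] <= class_ext[sup]


print(is_instance("o1", "OxygenAtom"))
print(related("w1", "madeOf", "h2"))
print(subclass_in_model("HydrogenAtom", "Atom"))
```

A subset relation in one interpretation does not by itself establish subsumption; a reasoner has to show it holds in all of them.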

Page 8: Ontology at Manchester

Logical Descriptions

Class: Water
    EquivalentTo: Molecule that
        madeOf exactly 1 OxygenAtom and
        madeOf exactly 2 HydrogenAtom and
        madeOf only (OxygenAtom or HydrogenAtom)

Class: Water
    SubClassOf: Molecule that
        hasBoilingPoint value 100 and
        hasFreezingPoint value 0 and
        hasState some Liquid

*Not beautiful modelling….!*
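
The first Water description can be read as a recognition test. The sketch below checks it in Python under loud assumptions: the individuals (w1, p1 and their atoms) are invented, "madeOf 1" and "madeOf 2" are read as exact cardinalities, and the data is treated as complete and all names as distinct. An OWL reasoner makes neither of those last two assumptions, so this is an illustration of the definition's intent and not of DL reasoning.

```python
# Closed-world toy check of the Water definition (all data hypothetical).
TYPES = {
    "w1": {"Molecule"},
    "o1": {"OxygenAtom"},
    "h1": {"HydrogenAtom"},
    "h2": {"HydrogenAtom"},
    "p1": {"Molecule"},  # a hydrogen-peroxide-like molecule
    "o2": {"OxygenAtom"},
    "o3": {"OxygenAtom"},
    "h3": {"HydrogenAtom"},
    "h4": {"HydrogenAtom"},
}

MADE_OF = {
    "w1": {"o1", "h1", "h2"},
    "p1": {"o2", "o3", "h3", "h4"},
}


def count_parts(individual, filler):
    """Number of madeOf-successors that belong to the filler class."""
    return sum(1 for part in MADE_OF.get(individual, set()) if filler in TYPES[part])


def only_parts(individual, fillers):
    """True if every madeOf-successor belongs to at least one filler class."""
    return all(TYPES[part] & fillers for part in MADE_OF.get(individual, set()))


def is_water(individual):
    """Molecule with exactly 1 oxygen, exactly 2 hydrogens, and nothing else."""
    return (
        "Molecule" in TYPES[individual]
        and count_parts(individual, "OxygenAtom") == 1
        and count_parts(individual, "HydrogenAtom") == 2
        and only_parts(individual, {"OxygenAtom", "HydrogenAtom"})
    )


print(is_water("w1"))
print(is_water("p1"))
```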

Page 9: Ontology at Manchester

Reasoning

• These OWL descriptions can be submitted to a DL reasoner

• Translated into DL

• Checked for consistency: is what we’ve said satisfiable

• Also infers subsumption hierarchy implied by statements

• Mistakes all too easy without help

• Formality is your friend
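
To give a flavour of the kind of mistake a reasoner catches, here is a deliberately tiny Python sketch. It is not a DL reasoner: it only follows asserted subclass links and flags any class that ends up under two classes declared disjoint. The class names State and Slush, and the modelling mistake itself, are hypothetical.

```python
from itertools import combinations

# Asserted ("told") subclass links; Slush is a deliberate modelling mistake.
subclass_of = {
    "Liquid": {"State"},
    "Solid": {"State"},
    "Slush": {"Liquid", "Solid"},
}

# Pairs of classes declared to share no instances.
disjoint = {frozenset({"Liquid", "Solid"})}


def ancestors(cls):
    """All superclasses reachable by following asserted subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unsatisfiable_classes():
    """Classes that sit under two disjoint classes and so can have no instances."""
    bad = set()
    for cls in subclass_of:
        supers = ancestors(cls) | {cls}
        if any(frozenset(pair) in disjoint for pair in combinations(supers, 2)):
            bad.add(cls)
    return bad


print(sorted(ancestors("Slush")))
print(sorted(unsatisfiable_classes()))
```

A real reasoner does far more, since it also works through restrictions and definitions, but the outcome is the same in kind: a class that can never have an instance is reported back to the modeller.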

Page 10: Ontology at Manchester

Language & Reasoning

• Supporting ontology engineering by automated reasoning

  – Classification

  – Consistency checking

  – Query answering

• Say the things you want to say and still reason

• Explain reasoning results

• Help debugging unexpected results

• Supporting modularity in ontologies

• Segmenting large ontologies into modules

Page 11: Ontology at Manchester

Language & Reasoning

• Inspecting ontologies to find missing knowledge

• Scalability: Larger ontologies; faster reasoning; more instances; more expressivity

• Instance Store: Query answering over vast numbers of instances

Page 12: Ontology at Manchester

Old Protégé (matrix wizard)

Page 13: Ontology at Manchester

New Protégé (matrix tab)

Page 14: Ontology at Manchester

SWOOP (crop circles)

Page 15: Ontology at Manchester

ComparaGRID

Page 16: Ontology at Manchester

Classifying Protein Phosphatases

• Annotating a genome’s proteins is a bottleneck

• Classifying proteins is a first step to annotation

• Tools for detecting features

• Need human knowledge to determine class membership

• Can we capture “how to recognise a phosphatase” in an ontology?

Page 17: Ontology at Manchester

Definition of Tyrosine Phosphatase

Class: TyrosinePhosphatase Complete
    (Protein and
    - (contains atLeast-1 ProteinTyrosinePhosphataseDomain) and
    - (contains 1 TransmembraneDomain))

Page 18: Ontology at Manchester

Definition for R2A Phosphatase

Class: R2A Complete
    (Protein and
    - (contains 2 ProteinTyrosinePhosphataseDomain) and
    - (contains 1 TransmembraneDomain) and
    - (contains 4 FibronectinDomain) and
    - (contains 1 ImmunoglobulinDomain) and
    - (contains 1 MAMDomain) and
    - (contains 1 Cadherin-LikeDomain) and
    - (contains only (ProteinTyrosinePhosphataseDomain or TransmembraneDomain or FibronectinDomain or ImmunoglobulinDomain or Cadherin-LikeDomain or MAMDomain)))
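
The two definitions can be mimicked as a recognition procedure over a protein's list of detected domains. This Python sketch is an illustration under stated assumptions: domain names are spelled consistently, "contains N" is read as exactly N, the "contains only" clause is modelled as a closure flag, the domain list is taken as complete, and SH2Domain in the usage is a hypothetical extra domain. It is not the OWL classification pipeline used in the work described.

```python
from collections import Counter

PTP = "ProteinTyrosinePhosphataseDomain"
TM = "TransmembraneDomain"

# Each definition: minimum counts, exact counts, and whether other domains are forbidden.
DEFINITIONS = {
    "TyrosinePhosphatase": {
        "at_least": {PTP: 1},
        "exactly": {TM: 1},
        "closed": False,
    },
    "R2A": {
        "at_least": {},
        "exactly": {
            PTP: 2,
            TM: 1,
            "FibronectinDomain": 4,
            "ImmunoglobulinDomain": 1,
            "MAMDomain": 1,
            "Cadherin-LikeDomain": 1,
        },
        "closed": True,
    },
}


def satisfies(domains, definition):
    """True if the domain list meets every condition of the definition."""
    counts = Counter(domains)
    if any(counts[d] < n for d, n in definition["at_least"].items()):
        return False
    if any(counts[d] != n for d, n in definition["exactly"].items()):
        return False
    if definition["closed"]:
        allowed = set(definition["exactly"]) | set(definition["at_least"])
        if set(counts) - allowed:
            return False
    return True


def classify(domains):
    """Names of all defined classes whose conditions the protein satisfies."""
    return sorted(name for name, d in DEFINITIONS.items() if satisfies(domains, d))


r2a_like = [PTP, PTP, TM] + ["FibronectinDomain"] * 4 + [
    "ImmunoglobulinDomain", "MAMDomain", "Cadherin-LikeDomain"
]
print(classify(r2a_like))
print(classify(r2a_like + ["SH2Domain"]))
```

A protein with the full R2A composition is recognised under both classes, which mirrors how the more specific class falls under the more general one; adding an unlisted domain breaks only the closed definition.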

Page 19: Ontology at Manchester

Building the Ontology

• Classifications already made by biologists – based on protein functionality;

• Protein domain composition and other details in the literature;

• Some 50 classes of phosphatase, 30 protein domains and 39 relationships;

• “Value partition” of protein domains (covering and disjoint);

• Defines range of contains property;

• Literature contains knowledge of how to recognise members of each class of phosphatase.
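
"Covering and disjoint" has a simple set-level reading: the subclasses share no members, and together they account for every member of the parent. A small Python sketch of that check, over hypothetical domain instances d1 to d3 and an invented function name; in OWL these are axioms asserted about the classes, not tests run over data.

```python
from itertools import combinations


def is_value_partition(parent, parts):
    """True if the parts are pairwise disjoint and their union equals the parent set."""
    pairwise_disjoint = all(
        a.isdisjoint(b) for a, b in combinations(parts.values(), 2)
    )
    covering = set().union(*parts.values()) == parent
    return pairwise_disjoint and covering


protein_domains = {"d1", "d2", "d3"}

good = {
    "PhosphataseCatalytic": {"d1"},
    "Transmembrane": {"d2"},
    "Fibronectin": {"d3"},
}
overlapping = {
    "PhosphataseCatalytic": {"d1", "d2"},
    "Transmembrane": {"d2"},
    "Fibronectin": {"d3"},
}
incomplete = {
    "PhosphataseCatalytic": {"d1"},
    "Transmembrane": {"d2"},
}

print(is_value_partition(protein_domains, good))
print(is_value_partition(protein_domains, overlapping))
print(is_value_partition(protein_domains, incomplete))
```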

Page 20: Ontology at Manchester

Incremental Addition of Protein Functional Domains

Phosphatase catalytic

Cadherin-like

Immunoglobulin

MAM domain

Cellular retinaldehyde

Adhesion recognition

Transmembrane

Fibronectin III

Glycosylation

Page 21: Ontology at Manchester

Classification of the Classical Tyrosine Phosphatases

Page 22: Ontology at Manchester

What is the Ontology Telling Us?

• Each class of phosphatase defined in terms of domain composition

• We know the characteristics by which an individual protein can be recognised to be a member of a particular class of phosphatase

• We have this knowledge in a computational form

• If we had protein instances described in terms of the ontology, we could classify those individual proteins

• A catalogue of phosphatases

Page 23: Ontology at Manchester

Classification of Protein Tyrosine Phosphatases

Page 24: Ontology at Manchester

Results

• Human “gold standard”: Same results plus two more

• Partially annotated A. fumigatus: Better results and two new putative phosphatases

• Easily generated and compared phosphatase profiles

• Parasites

• Whole range of unexpected results: back to bioinformatics sequence analysis

Page 25: Ontology at Manchester

myGrid Service Ontology

• myGrid services and workflow toolkit

• Web service discovery and composition

• Semantic content of provenance repository

• Wide use of service ontology

• Links with BioMOBY

• Workflows as knowledge management

Page 26: Ontology at Manchester

Informal Modelling

• OWL is formal, but ontology has a long informal stage

• Tool forms of knowledge elicitation techniques such as card sorting and laddering

• Experiments with text-to-ontology tools

• With suitable text can truncate the informal stage

• Provide useful starting points for later stages

Page 27: Ontology at Manchester

Casual Modelling

• OWL can be scary

• Need the equivalent of pseudo-code

• Work on concept maps as an elicitation tool

• Convertible to OWL

• Converting spreadsheets to OWL

• Converting thesauri to OWL
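
As a rough idea of what a spreadsheet-to-OWL conversion involves, the Python sketch below turns rows of a small CSV sheet into Manchester-syntax class frames. The column names (class, parent, property, filler), the function name and the choice to emit every restriction as an existential ("some") are assumptions made for illustration; this is not the tooling referred to on the slide.

```python
import csv
import io


def rows_to_manchester(csv_text):
    """Render each spreadsheet row as a Manchester-syntax class frame."""
    frames = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines = [f"Class: {row['class']}"]
        if row["parent"]:
            lines.append(f"    SubClassOf: {row['parent']}")
        if row["property"] and row["filler"]:
            # Assumption: every property column becomes an existential restriction.
            lines.append(f"    SubClassOf: {row['property']} some {row['filler']}")
        frames.append("\n".join(lines))
    return "\n\n".join(frames)


sheet = (
    "class,parent,property,filler\n"
    "Water,Molecule,hasState,Liquid\n"
    "Molecule,,,\n"
)
print(rows_to_manchester(sheet))
```

The output is plain text that an OWL editor could load once a prefix and ontology header are added; which restriction type a column should map to is the modelling decision such converters have to make explicit.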

Page 28: Ontology at Manchester

Community Building of Ontologies

• Collaboration with University of British Columbia, Vancouver

• No money and no centre: What can you do?

• Use your community to build, extend, check facts in your ontology

• Currently running experiments

Page 29: Ontology at Manchester

The Sealife Browser

• An EU project to build a Semantic Grid browser for the life sciences

• Uses ontology as background knowledge

• Dynamically link to terms on a page

• Link to tools, data, documents, etc.

• A semantic shopping cart

• Need to use a broad range of ontologies and many conversions

Page 30: Ontology at Manchester

Modelling Biology & Medicine

• Describing biological phenomena

• Reconciling descriptions

• Analysing biological data

• Describing and analysing healthcare records

• Guiding annotation: Creating and filling forms

• Describing medical phenomena

Page 31: Ontology at Manchester

Outside Relationships

• BioPAX

• FUGO/OBI

• Plant Ontology

• CBIO

• HL7, …

Page 32: Ontology at Manchester

Training

• Introductory OWL tutorials: Non-biological

• Advanced tutorial: Biology orientated

• Hundreds trained in UK and overseas (mainly life sciences)

• Hands-on training