Post on 11-May-2015

Ontology at Manchester

Robert Stevens

BioHealth Informatics Group

School of Computer Science

University of Manchester

Robert.stevens@manchester.ac.uk

2

Ontology Research at Manchester

Language and Reasoning

Tools

Modelling

3

So what is an ontology?

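[Figure: spectrum of ontology kinds, in order of increasing formality: Catalog/ID; Terms/glossary; Thesauri; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value restrictions; Disjointness, inverse, part-of; General logical constraints. Resources placed along the spectrum: Gene Ontology, Mouse Anatomy, EcoCyc, PharmGKB, TAMBIS, Arom. After Chris Welty et al.]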

4

A Definition

o a set of logical axioms designed to account for the intended meaning of a formal vocabulary used to describe a certain (conceptualisation of) reality [Guarino 1998]

o “conceptualisation of” inserted by me

o “Logical axioms” means a formal definition of meaning of terms in a formal language

o Formal language: something a computer can reason with

o Use symbols to make inferences

o Symbols represent things and their relationships

o Making inferences about things computationally amenable

5

OWL

• Ontologies will form the backbone of the semantic web

• OWL is the latest standard in ontology languages from the W3C

• Layered on top of RDF and RDF Schema

• Underpinned by Description Logics

6

OWL represents classes of instances

[Figure: diagram with classes labelled A, B and C]

7

Interpretations

• Individuals are interpreted as objects

• Classes are interpreted as sets containing objects

• Properties are interpreted as binary relations on objects
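The three bullets above can be made concrete with a small sketch. It assumes a tiny made-up domain: the object names (`w1`, `o1`, `h1`, `h2`) and the helper `some` are illustrative only and do not come from the slides. Classes become Python sets, and a property becomes a set of pairs.

```python
# A toy interpretation: the domain is a set of objects (individuals).
domain = {"w1", "o1", "h1", "h2"}

# Classes are interpreted as subsets of the domain.
classes = {
    "Molecule": {"w1"},
    "OxygenAtom": {"o1"},
    "HydrogenAtom": {"h1", "h2"},
}

# A property is interpreted as a binary relation:
# a set of (subject, object) pairs over the domain.
made_of = {("w1", "o1"), ("w1", "h1"), ("w1", "h2")}


def some(relation, filler):
    """Objects related by `relation` to at least one member of `filler`."""
    return {s for (s, o) in relation if o in filler}


# "madeOf some OxygenAtom" denotes the set of objects with an oxygen part.
has_oxygen = some(made_of, classes["OxygenAtom"])
print(has_oxygen & classes["Molecule"])
```

Under this reading, a class expression is just another set computed from the sets and relations that interpret its parts.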

8

Logical Descriptions

• Class: Water

• EquivalentTo: Molecule that

– madeOf 1 OxygenAtom and

– madeOf 2 HydrogenAtom and

– madeOf only (OxygenAtom or HydrogenAtom)

Class: Water

SubClassOf: Molecule that

hasBoilingPoint value 100 and

hasFreezingPoint value 0 and

hasState some Liquid

*Not beautiful modelling….!*
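To show what the equivalence definition demands of a single molecule, here is a rough sketch that checks the three conditions against a plain list of atom kinds. It assumes that "madeOf 1" and "madeOf 2" mean exact cardinalities, and the function name `is_water` is made up for illustration. It is a closed-world check on data, which is not how an OWL reasoner works.

```python
from collections import Counter


def is_water(atom_kinds):
    """Check the three conditions of the Water definition on a list of atom kinds.

    Assumption: "madeOf 1" and "madeOf 2" are read as exact cardinalities.
    """
    counts = Counter(atom_kinds)
    exactly_one_oxygen = counts["OxygenAtom"] == 1
    exactly_two_hydrogen = counts["HydrogenAtom"] == 2
    # Closure ("only"): no parts other than oxygen or hydrogen atoms.
    only_o_or_h = set(counts) <= {"OxygenAtom", "HydrogenAtom"}
    return exactly_one_oxygen and exactly_two_hydrogen and only_o_or_h


print(is_water(["OxygenAtom", "HydrogenAtom", "HydrogenAtom"]))
print(is_water(["OxygenAtom", "HydrogenAtom", "HydrogenAtom", "CarbonAtom"]))
```

The second call fails only because of the closure condition, which is the job the "only" restriction does in the definition.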

9

Reasoning

• These OWL descriptions can be submitted to a DL reasoner

• Translated into DL

• Checked for consistency: is what we’ve said satisfiable

• Also infers subsumption hierarchy implied by statements

• Mistakes all too easy without help

• Formality is your friend
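As a toy illustration of inferring a subsumption hierarchy, the sketch below treats each defined class as a plain conjunction of atomic conditions and compares the condition sets. This is a deliberate simplification: real DL reasoners handle far richer constructs, and the class names and conditions here are hypothetical.

```python
# Each defined class is a conjunction of atomic conditions (hypothetical names).
definitions = {
    "Molecule": frozenset({"Molecule"}),
    "OxygenCompound": frozenset({"Molecule", "madeOf some OxygenAtom"}),
    "Water": frozenset({
        "Molecule",
        "madeOf some OxygenAtom",
        "madeOf some HydrogenAtom",
    }),
}


def subsumes(general, specific):
    """`general` subsumes `specific` if every condition of `general`
    is also required by `specific`."""
    return definitions[general] <= definitions[specific]


def inferred_superclasses(name):
    """All other defined classes that subsume `name`."""
    return sorted(c for c in definitions if c != name and subsumes(c, name))


print(inferred_superclasses("Water"))
```

Nobody stated that Water is an OxygenCompound. The relationship follows from the definitions alone, which is the sense in which the hierarchy is inferred rather than asserted.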

10

Language & Reasoning

• Supporting ontology engineering by automated reasoning

– Classification

– Consistency checking

– Query answering

• Say the things you want to say and still reason

• Explain reasoning results

• Help debugging unexpected results

• Supporting modularity in ontologies

• Segmenting large ontologies into modules

11

Language & Reasoning

• Inspecting ontologies to find missing knowledge

• Scalability: Larger ontologies; faster reasoning; more instances; more expressivity

• Instance Store: Query answering over vast numbers of instances

Old Protégé (matrix wizard)

New Protégé (matrix tab)

SWOOP (crop circles)

15

ComparaGRID

16

Classifying Protein Phosphatases

• Annotating a genome’s proteins is a bottleneck

• Classifying proteins is a first step to annotation

• Tools for detecting features

• Need human knowledge to determine class membership

• Can we capture “how to recognise a phosphatase” in an ontology?

17

Definition of Tyrosine Phosphatase

Class: TyrosinePhosphatase Complete
(Protein and
- (contains atLeast-1 ProteinTyrosinePhosphataseDomain) and
- (contains 1 TransmembraneDomain))

18

Definition for R2A Phosphatase

Class: R2A Complete
(Protein and
- (contains 2 ProteinTyrosinePhosphataseDomain) and
- (contains 1 TransmembraneDomain) and
- (contains 4 FibronectinDomain) and
- (contains 1 ImmunoglobulinDomain) and
- (contains 1 MAMDomain) and
- (contains 1 Cadherin-LikeDomain) and
- (contains only (ProteinTyrosinePhosphataseDomain or TransmembraneDomain or FibronectinDomain or ImmunoglobulinDomain or Cadherin-LikeDomain or MAMDomain)))
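The two definitions above can be read as domain-count requirements. The sketch below checks a protein, given as a list of detected domains, against them. It assumes a bare number such as "contains 1" means exactly that many. The helper `matches`, the `("min", n)` / `("exact", n)` encoding and the example protein are all invented for illustration. It is a closed-world check on data, not the reasoner-based classification the slides describe.

```python
from collections import Counter

# Domain-count requirements transcribed from the two slide definitions.
# ("min", n) means at least n; ("exact", n) means exactly n (an assumption).
TYROSINE_PHOSPHATASE = {
    "ProteinTyrosinePhosphataseDomain": ("min", 1),
    "TransmembraneDomain": ("exact", 1),
}
R2A = {
    "ProteinTyrosinePhosphataseDomain": ("exact", 2),
    "TransmembraneDomain": ("exact", 1),
    "FibronectinDomain": ("exact", 4),
    "ImmunoglobulinDomain": ("exact", 1),
    "MAMDomain": ("exact", 1),
    "Cadherin-LikeDomain": ("exact", 1),
}


def matches(domains, requirements, closed=False):
    """True if the domain list meets every count requirement.

    With closed=True, domains not named in the requirements are forbidden,
    mirroring the "contains only (...)" closure in the R2A definition.
    """
    counts = Counter(domains)
    for domain, (kind, n) in requirements.items():
        have = counts[domain]
        if kind == "min" and have < n:
            return False
        if kind == "exact" and have != n:
            return False
    if closed and not set(counts) <= set(requirements):
        return False
    return True


# A made-up protein whose detected domains fit the R2A pattern.
example = (
    ["ProteinTyrosinePhosphataseDomain"] * 2
    + ["TransmembraneDomain"]
    + ["FibronectinDomain"] * 4
    + ["ImmunoglobulinDomain", "MAMDomain", "Cadherin-LikeDomain"]
)
print(matches(example, R2A, closed=True))
print(matches(example, TYROSINE_PHOSPHATASE))
```

The example protein satisfies both definitions, which reflects R2A being a more specific kind of tyrosine phosphatase.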

19

Building the Ontology

• Classifications already made by biologists – based on protein functionality;

• Protein domain composition and other details in the literature;

• Some 50 classes of phosphatase, 30 protein domains and 39 relationships;

• “Value partition” of protein domains (covering and disjoint);

• Defines range of contains property;

• Literature contains knowledge of how to recognise members of each class of phosphatase.
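The "covering and disjoint" condition on the value partition can be sketched as a check over explicit sets. In OWL this would be stated with axioms rather than checked against data, and the domain instances below (`d1` to `d4`) and the function name are hypothetical.

```python
def is_value_partition(parent, parts):
    """True if `parts` are pairwise disjoint and together cover `parent`."""
    seen = set()
    for members in parts.values():
        if seen & members:  # overlap: the parts are not disjoint
            return False
        seen |= members
    return seen == parent  # covering, with nothing outside the parent


# Made-up protein domain instances grouped by kind.
protein_domains = {"d1", "d2", "d3", "d4"}
kinds = {
    "TransmembraneDomain": {"d1"},
    "FibronectinDomain": {"d2", "d3"},
    "MAMDomain": {"d4"},
}
print(is_value_partition(protein_domains, kinds))
```

If any domain instance fell into two kinds, or into none, the function would return False.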

20

Incremental Addition of Protein Functional Domains

[Figure: protein functional domains added incrementally: Phosphatase catalytic, Cadherin-like, Immunoglobulin, MAM domain, Cellular retinaldehyde, Adhesion recognition, Transmembrane, Fibronectin III, Glycosylation]

21

Classification of the Classical Tyrosine Phosphatases

22

What is the Ontology Telling Us?

• Each class of phosphatase defined in terms of domain composition

• We know the characteristics by which an individual protein can be recognised to be a member of a particular class of phosphatase

• We have this knowledge in a computational form

• If we had protein instances described in terms of the ontology, we could classify those individual proteins

• A catalogue of phosphatases

23

Classification of Protein Tyrosine Phosphatases

24

Results

• Human “gold standard”: Same results plus two more

• Partially annotated A. fumigatus: Better results and two new putative phosphatases

• Easily generated and compared phosphatase profiles

• Parasites

• Whole range of unexpected results: back to bioinformatics sequence analysis

25

myGrid Service Ontology

• myGrid services and workflow toolkit

• Web service discovery and composition

• Semantic content of provenance repository

• Wide use of service ontology

• Links with BioMOBY

• Workflows as knowledge management

26

Informal Modelling

• OWL is formal, but ontology has a long informal stage

• Tool-supported forms of knowledge elicitation techniques such as card sorting and laddering

• Experiments with text-to-ontology tools

• With suitable text, can truncate the informal stage

• Provide useful starting points for later stages

27

Casual Modelling

• OWL can be scary

• Need the equivalent of pseudo-code

• Work on concept maps as an elicitation tool

• Convertible to OWL

• Converting spreadsheets to OWL

• Converting thesauri to OWL
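The spreadsheet conversion listed above can be sketched in a few lines. The example assumes a hypothetical two-column sheet (`class`, `parent`) exported as CSV and emits class frames in Manchester syntax. The column names, the sample rows and the function name are made up, and it is not the group's actual converter.

```python
import csv
import io

# Hypothetical spreadsheet export: one row per class with its parent.
SHEET = """class,parent
Phosphatase,Protein
TyrosinePhosphatase,Phosphatase
"""


def rows_to_manchester(text):
    """Turn (class, parent) rows into Manchester-syntax class frames."""
    frames = []
    for row in csv.DictReader(io.StringIO(text)):
        frames.append(f"Class: {row['class']}\n    SubClassOf: {row['parent']}")
    return "\n\n".join(frames)


print(rows_to_manchester(SHEET))
```

A real conversion would also need to deal with IRIs, labels and richer column patterns, but the principle of mapping each row to an axiom stays the same.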

28

Community Building of Ontologies

• Collaboration with University of British Columbia, Vancouver

• No money and no centre: What can you do?

• Use your community to build, extend, check facts in your ontology

• Currently running experiments

29

The Sealife Browser

• An EU project to build a Semantic Grid browser for the life sciences

• Uses ontology as background knowledge

• Dynamically link to terms on a page

• Link to tools, data, documents, etc

• A semantic shopping cart

• Need to use a broad range of ontologies and many conversions

30

Modelling Biology & Medicine

• Describing biological phenomena

• Reconciling descriptions

• Analysing biological data

• Describing and analysing healthcare records

• Guiding annotation: Creating and filling forms

• Describing medical phenomena

31

Outside Relationships

• BioPAX

• FUGO/OBI

• Plant Ontology

• CBIO

• HL7, …

32

Training

• Introductory OWL tutorials: Non-biological

• Advanced tutorial: Biology orientated

• Hundreds trained in UK and overseas (mainly life sciences)

• Hands-on training
