Ontologies in Physical Science


Upload: petermurrayrust

Post on 06-May-2015


DESCRIPTION

A 5-minute presentation at the University of Edinburgh for the UK Ontology Workshop, 2013-04-11. The animals demonstrate that ontologies can be simple and lament the lack of good ontologies in most of physical science, especially computational chemistry. Blog at http://blogs.ch.cam.ac.uk/pmr

TRANSCRIPT

Page 1: Ontologies in Physical Science

Ontologies in Physical Science

Onto Workshop, ed.ac.uk 2013-04-11

An #animalgarden production

Peter Murray-Rust, University of Cambridge & Open Knowledge Foundation

Page 2: Ontologies in Physical Science

PMR and friends want us to help build a computational chemistry ontology

Is it an important problem?

$1,000,000,000/yr for compchem

They need OWL

Problem: How to build ontologies when people are uninterested or antagonistic, even though we have the technology

Page 3: Ontologies in Physical Science

Chemists don’t use ontologies

Perhaps the chemists could use OWL-DL

Top-down schemas like AniML haven’t (yet) taken off

Page 4: Ontologies in Physical Science

Are there any ontologies in physical science that work?

Crystallographers build CIF dictionaries

The IUCr, right? Tell us about CIF

IUCr: International Union of Crystallography

Page 5: Ontologies in Physical Science

CIF Core defines 500 common concepts

CIF: http://www.iucr.org/cif

Like the wavelength of the radiation used

Or the volume of the crystal cell

Page 6: Ontologies in Physical Science

Core dictionary (coreCIF) version 2.4.3

_diffrn_ambient_temperature

Definition: The mean temperature in kelvins at which the intensities were measured.

Range: 0.0 -> infinity

Type: numb

An example

?

ID

For humans

For machines: Constraint + type

http://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Idiffrn_ambient_temperature.html
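
A dictionary entry like this one pairs a definition for humans with a constraint and type for machines. As a minimal Python sketch of how software might use it, assuming a hypothetical `DictEntry` class and `validate` function (neither is part of CIF or any IUCr tool) and treating the range as inclusive:

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DictEntry:
    """Hypothetical in-memory form of one dictionary entry."""
    name: str
    definition: str   # for humans
    minimum: float    # for machines: constraint
    maximum: float
    type: str         # for machines: type ("numb" in the entry quoted here)


# Values copied from the coreCIF entry quoted on this slide.
AMBIENT_TEMPERATURE = DictEntry(
    name="_diffrn_ambient_temperature",
    definition=("The mean temperature in kelvins at which the "
                "intensities were measured."),
    minimum=0.0,
    maximum=math.inf,
    type="numb",
)


def validate(entry: DictEntry, raw: str) -> float:
    """Parse a raw string and check it against the entry's type and range."""
    if entry.type != "numb":
        raise ValueError(f"unsupported type: {entry.type}")
    value = float(raw)  # raises ValueError for non-numeric text
    # Assumption: both ends of the range are inclusive.
    if not (entry.minimum <= value <= entry.maximum):
        raise ValueError(f"{entry.name}={value} is outside the allowed range")
    return value
```

Calling `validate(AMBIENT_TEMPERATURE, "293")` returns the parsed number, while a negative temperature raises `ValueError`.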

Page 7: Ontologies in Physical Science

Definition: The mean temperature in kelvins at which the intensities were measured.

So everyone converts temperatures to use K?

Yes! Today I swam at 273 K

But chemists want to use all sorts of different units

We MUST have a units ontology
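
To illustrate why units matter here: a dictionary that fixes kelvin as the unit needs every other temperature converted before it is stored. A minimal Python sketch, assuming a hypothetical `to_kelvin` helper and illustrative unit labels (`"K"`, `"degC"`, `"degF"`) that are not taken from any units ontology:

```python
def to_kelvin(value: float, unit: str) -> float:
    """Convert a temperature to kelvin; unit labels are illustrative only."""
    if unit == "K":
        return value
    if unit == "degC":
        return value + 273.15
    if unit == "degF":
        return (value - 32.0) * 5.0 / 9.0 + 273.15
    # Without a shared units ontology, unknown labels cannot be interpreted.
    raise ValueError(f"unknown temperature unit: {unit}")
```

The `ValueError` branch is the point of the slide: without agreed identifiers for units, a program cannot know what a label like "deg" means.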

Page 8: Ontologies in Physical Science

OWL? Is CIF a proper ontology? It’s not RDF…

…but we’ve global URIs, like cif:_diffrn_ambient_temperature

Because IUCr controls the namespace prefix: cif=http://www.iucr.org/cif
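
A prefixed name becomes a global identifier once the prefix is expanded to its namespace. A minimal Python sketch, assuming a hypothetical `expand` function and assuming the namespace and local name are joined with `/` (the slide does not say how they are joined, and the resulting URI is illustrative and need not resolve):

```python
# Prefix table; the cif entry is the namespace given on this slide.
PREFIXES = {"cif": "http://www.iucr.org/cif"}


def expand(name: str, prefixes: dict = PREFIXES, separator: str = "/") -> str:
    """Expand 'prefix:local' into a full URI string."""
    prefix, colon, local = name.partition(":")
    if not colon:
        raise ValueError(f"not a prefixed name: {name}")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    # Assumption: namespace and local name are joined by `separator`.
    return prefixes[prefix] + separator + local


print(expand("cif:_diffrn_ambient_temperature"))
```

Because one body controls the namespace, two groups that write the same prefixed name are guaranteed to mean the same concept.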

Page 9: Ontologies in Physical Science

CIF had 20 years of community involvement through IUCr

But most top-down chemistry projects don’t work

So we’ll do this bottom-up.

Page 10: Ontologies in Physical Science

Every compchem program uses basically the same scientific concepts

We think each should build its own dictionary so we understand the output

Won’t that just be a mess?

No. It’s the first step to interoperability.

Page 11: Ontologies in Physical Science

Chemical Markup Language PMR/Rzepa http://www.xml-cml.org

Hyperchem builds ITS dictionary

Each annotates their own program output

NWChem builds ITS dictionary

The programs will use CML* for chemical output

Page 12: Ontologies in Physical Science

Alpha-electrons: Hyperchem uses hchem:e_alpha

NWChem has nwchem:_alpha_elec

We agree they are the same so create compchem:alphae in a communal cml:compchem dictionary that everyone uses
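
The agreement on this slide amounts to a mapping from each program's own term to one communal term. A minimal Python sketch using the three identifiers from the slide; the `COMMON_TERM` table and the `to_common` and `same_concept` helpers are hypothetical and not part of CML:

```python
# Program-specific terms mapped to the agreed communal term.
COMMON_TERM = {
    "hchem:e_alpha": "compchem:alphae",
    "nwchem:_alpha_elec": "compchem:alphae",
}


def to_common(term: str) -> str:
    """Return the communal term, or the term itself if none is agreed yet."""
    return COMMON_TERM.get(term, term)


def same_concept(a: str, b: str) -> bool:
    """Two terms denote the same concept if they share a communal term."""
    return to_common(a) == to_common(b)


print(same_concept("hchem:e_alpha", "nwchem:_alpha_elec"))
```

Terms with no agreed mapping pass through unchanged, which matches the bottom-up approach: each program keeps its own dictionary and mappings are added as agreement is reached.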

Page 13: Ontologies in Physical Science

What if the data structure or concepts don’t map?

CML provides conventions so each group can define their data structure

Data can then be machine validated against each convention!

Page 14: Ontologies in Physical Science

But there are over 20 program codes.

We’ve prototyped with many before.

They’ll be encouraged

I think it’s going to work.

BUT TTT*

GULP, DPOLY, CASTEP, SIESTA, MOPAC …

TTT: Things Take Time (Piet Hein)

Page 15: Ontologies in Physical Science

Will it work? It depends on people

National labs CSIRO/AU and PNNL/US are committed

And we have companies like Hyperchem and Kitware

I wish we had some publishers

Page 16: Ontologies in Physical Science

We’ll need tools

We’ve got FoX* for FORTRAN output

JUMBOTemplates to parse logfiles

RDF for navigating dictionaries

FoX*: XML/FORTRAN Toby White, Andrew Walker

Page 17: Ontologies in Physical Science

Benefits of semantic dictionaries:
• FORTRAN logfile can be made semantic
• High degree of interoperability in chemistry
• Semantic publication (HTML5, CML, MathML)
• Interoperates with mainstream Web
• Easily scalable to other phys sci.

Problems:
• Closed code/minds is short-term market advantage
• Non-trivial commitment (updates, code revision)
• Getting top-down approval (e.g. IUPAC)

Page 22: Ontologies in Physical Science

Top-down schemas like AniML haven’t (yet) taken off

Chemists don’t use ANY ontologies

Perhaps the chemists could use OWL-DL