SeaDataNet Ontology Use Case (+ Lessons Learned)

Roy Lowry

British Oceanographic Data Centre

Coastal Atlas Interoperability Workshop, Corvallis, July 17-19 2007



Summary

What is SeaDataNet?

Some SeaDataNet semantic issues

What has SeaDataNet done?

What is SeaDataNet going to do?

Is SeaDataNet relevant to CAI?

What is SeaDataNet?

SeaDataNet in a Nutshell

Combine over 40 oceanographic data centres across Europe into a single interoperable data system

Approach is to adopt established standards and technologies wherever possible

Two phases:

Phase one brings 12 centres together with centralised metadata and distributed data as files. Due to be fully operational in autumn 2008 (beta next February)

Phase two introduces data virtualisation, aggregation, cutting and 30 more centres. Due in 2010

Project is well on its way up the interoperability operational implementation curve

SeaDataNet Semantic Issues

The major problem facing the project is heterogeneous legacy content

SeaDataNet inherited 3 independently developed metadatabases

Each is heavily populated (3000-30000 records)

Each had its own independently developed controlled vocabularies

These vocabularies

– Covered overlapping domains

– Said similar things in different ways

– Provided a shining example of how NOT to manage vocabularies

Brief Diversion

Vocabularies can have two types of governance

Content governance

Mechanism for making decisions on vocabulary population

– Expected deliverables include:

» Vocabulary standards and internal consistency

» Change on a timescale matching the needs of the user community

» Terms with definitions!!!

Technical governance

Vocabulary storage, maintenance and serving

– Expected deliverables include:

» Convenient access to up-to-date vocabularies

» Clear, rigorous vocabulary versioning

» Version history through audit trails

» Maintenance that doesn’t break user systems
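The technical governance deliverables above (versioning, audit trails, maintenance that does not break user systems) can be illustrated with a minimal Python sketch. It is not the NERC DataGrid Vocabulary Server design; the `Vocabulary` class, its fields, and the sample terms and definitions are all hypothetical, chosen only to show an append-only list where every change bumps a version number and deprecated terms are flagged rather than deleted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Vocabulary:
    """Hypothetical append-only term list: each change bumps the version and is logged."""
    name: str
    version: int = 0
    # term key -> (definition, deprecated flag)
    terms: Dict[str, Tuple[str, bool]] = field(default_factory=dict)
    # audit trail entries: (version, action, term key)
    audit: List[Tuple[int, str, str]] = field(default_factory=list)

    def _log(self, action: str, key: str) -> None:
        self.version += 1
        self.audit.append((self.version, action, key))

    def add(self, key: str, definition: str) -> None:
        if not definition.strip():
            raise ValueError("every term needs a definition")
        if key in self.terms:
            raise KeyError(f"{key} already exists; keys are never reused")
        self.terms[key] = (definition, False)
        self._log("add", key)

    def deprecate(self, key: str) -> None:
        definition, _ = self.terms[key]
        # Flag the term instead of deleting it, so old metadata still resolves.
        self.terms[key] = (definition, True)
        self._log("deprecate", key)

    def current(self) -> List[str]:
        """Keys of all terms that have not been deprecated."""
        return sorted(k for k, (_, dep) in self.terms.items() if not dep)


# Illustrative content only; the definitions are invented for this sketch.
vocab = Vocabulary("platform_classes")
vocab.add("helicopter", "Rotary-wing aircraft used as an observation platform")
vocab.add("ship", "Self-propelled sea-going vessel")
vocab.deprecate("ship")
```

After these three calls the list is at version 3, "ship" is still stored but flagged as deprecated, and the audit trail records which change produced each version.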

SeaDataNet Semantic Issues

Vocabulary content governance

– Done by individuals who were often inadequately qualified to do the job

– Metadata entry form with an ‘Add to Vocabulary’ button used by students

Vocabulary technical governance

– Scattered files on servers or inaccessible database tables

– Multiple data models (e.g. some with abbreviations, some without)

– No versioning

– Vocabularies updated by destructive overwrites

Harmonisation required for related vocabularies

– Within centralised metadata

– Between partner local systems and centralised metadata

What has SeaDataNet Done?

Established content governance

Within SeaDataNet (TTT e-mail list)

Further afield (SeaVoX e-mail list)

Established technical governance

Adopted the NERC DataGrid Vocabulary Server

– Heavily defended Oracle back end

– Automated version and audit trail management

– Web Service API front end plus clients e.g. http://vocab.ndg.nerc.ac.uk/client/vocabServer.jsp

– Currently serving out 75 lists

Established a Mapping Infrastructure

List entries connected by SKOS RDF triples

Operational mappings between parameter vocabularies (GCMD science keywords, CF Standard Names)
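As a rough illustration of list entries connected by SKOS triples, the Python sketch below stores mappings as plain (subject, predicate, object) tuples and prints them as N-Triples. The slides do not say which SKOS predicates or identifiers the project uses, so the choice of `skos:exactMatch`, `skos:broadMatch` and `skos:narrowMatch`, the `urn:example:...` identifiers, and the helper names are assumptions made for this example.

```python
from typing import Iterable, List, NamedTuple

SKOS = "http://www.w3.org/2004/02/skos/core#"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# In SKOS, exactMatch is symmetric and broadMatch/narrowMatch are inverses.
INVERSE = {
    SKOS + "exactMatch": SKOS + "exactMatch",
    SKOS + "broadMatch": SKOS + "narrowMatch",
    SKOS + "narrowMatch": SKOS + "broadMatch",
}


def with_inverses(triples: Iterable[Triple]) -> List[Triple]:
    """Return the mappings plus the inverse of each one, without duplicates."""
    original = list(triples)
    out = list(original)
    for t in original:
        inverse = Triple(t.obj, INVERSE[t.predicate], t.subject)
        if inverse not in out:
            out.append(inverse)
    return out


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialise the mappings as N-Triples, one statement per line."""
    return "\n".join(f"<{t.subject}> <{t.predicate}> <{t.obj}> ." for t in triples)


# Hypothetical identifiers standing in for entries of three different lists.
mappings = [
    Triple("urn:example:listA:CHL", SKOS + "exactMatch", "urn:example:listB:chlorophyll"),
    Triple("urn:example:listA:CHL", SKOS + "broadMatch", "urn:example:listC:pigments"),
]
closed = with_inverses(mappings)
print(to_ntriples(closed))
```

Storing both directions means a client can start from a term in either vocabulary and find its counterpart without knowing which side the mapping was authored on.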

What is SeaDataNet Going To Do?

Harmonise centralised metadata vocabularies or map if too hard

Map centralised vocabularies to partner system vocabularies

Build metadata crosswalks and generators (e.g. from CF) that include semantics (Use case 1)

Implement ‘Smart Discovery’ for legacy plaintext, e.g. search for pigment, find chlorophyll (Use case 2)

Establish URLs to represent vocabularies and individual entries delivering XML – probably SKOS – documents

Extend mapping efforts to other areas such as ‘devices’

Release a much improved Vocabulary Server API (mid-August)
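The ‘Smart Discovery’ idea in use case 2 (search for pigment, find chlorophyll) might be sketched in Python as query expansion over broader-to-narrower links. This is only one possible reading of the use case, not the planned SeaDataNet implementation; the `NARROWER` table, the function names and the sample records are invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical broader -> narrower links, as they might be derived from vocabulary mappings.
NARROWER: Dict[str, List[str]] = {
    "pigment": ["chlorophyll", "carotenoid"],
    "chlorophyll": ["chlorophyll-a"],
}


def expand(term: str) -> Set[str]:
    """Return the term plus every narrower term reachable from it."""
    seen: Set[str] = set()
    stack = [term.lower()]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(NARROWER.get(current, []))
    return seen


def smart_search(query: str, records: Dict[str, str]) -> List[str]:
    """Match legacy plaintext against the query and all of its narrower terms."""
    terms = expand(query)
    return sorted(
        rec_id
        for rec_id, text in records.items()
        if any(t in text.lower() for t in terms)
    )


# Invented plaintext metadata records.
records = {
    "rec1": "Chlorophyll concentration from bottle samples",
    "rec2": "Sea surface temperature time series",
    "rec3": "Carotenoid pigment survey",
}
hits = smart_search("pigment", records)
```

A literal search for "pigment" would match only `rec3`; with expansion, `rec1` is found as well because chlorophyll is listed as narrower than pigment.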

Is SeaDataNet Relevant to CAI?

This workshop is about building a coastal atlas ontology that brings together semantic resources that say similar things in different ways

The vocabulary entry semantic content may be different from oceanographic parameters, but the problem is essentially the same

If it works for SeaDataNet it will probably work for the CAI community

More important – if it didn’t work for SeaDataNet then it probably won’t work for CAI

Is SeaDataNet Relevant to CAI?

What has worked for SeaDataNet:

– The NERC DataGrid Vocabulary Server

– Content governance through a MODERATED e-mail list (also works pretty well for CF Standard Names)

– Representing vocabulary terms by URNs in metadata documents

What I believe will work in the next 12 months:

– Semantic interoperability through mappings

– The conceptual framework of RDF in general and SKOS in particular

– 21st Century tooling

Is SeaDataNet Relevant to CAI?

What hasn’t worked for SeaDataNet:

Weak content governance

Examples

– Terms without definitions

– Vocabularies without strict entity definitions populated by mixed entities, e.g.

» helicopter = class

» RRS Discovery = instance

– Vocabularies without managed deprecation

Poor technical governance

Example

– A vocabulary served by:

» Dynamic web page from database

» Static HTML page

» ASCII file as e-mail attachment

» Each having a different number of entries….
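The situation in the example, where one vocabulary is served through several channels that disagree, is easy to detect once the copies can be read as sets of entries. The Python sketch below assumes each copy has already been parsed into a set of entry keys; the function name, copy names and keys are hypothetical.

```python
from typing import Dict, Set


def find_drift(copies: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each copy, report the entries that exist in another copy but are missing here."""
    union: Set[str] = set().union(*copies.values())
    return {
        name: union - entries
        for name, entries in copies.items()
        if union - entries
    }


# Invented entry keys for three copies of the same vocabulary.
copies = {
    "database_page": {"A", "B", "C"},
    "static_html": {"A", "B"},
    "email_ascii": {"A", "C", "D"},
}
drift = find_drift(copies)
```

An empty result would mean all copies agree; serving every channel from a single managed source avoids needing the check at all.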

That’s All Folks!

Thank you for your attention

Any questions?

Morals

Always provide definitions for your terms

If you are going to use vocabularies to build an ontology make sure that they are properly governed