Page 1:

2nd Training Workshop 4 – 5 June 2007

Common Data Index - CDI

By Dick M.A. Schaap, Technical Coordinator SeaDataNet

Page 2:

Common Data Index - CDI

• The CDI gives users detailed insight into the availability and geographical distribution of marine data across the different data centres in SeaDataNet.

• The CDI provides a fine-grained index to individual data sets, managed by partners.

• The CDI paves the way to direct online data access or online requests for data access.

Page 3:

CDI set-up

[Diagram labels: Search (GO button); Metadata repository; Datacenter1, Datacenter2, Datacenter3, Datacenter4, ...; XML Stream]

Page 4:

CDI governance

• SeaDataNet data centres are responsible for setting up CDI generation and reporting systems at their own sites and, later on, for supporting other data centres in their country in joining this system.

• Data centres prepare CDI entries

• The central coordinator (MARIS) has overall CDI coordination, compiles the European directory, performs automatic quality control and makes the CDI available online to users

Page 5:

Present status

• The CDI pilot was initiated in the Sea-Search project with 10 partners, which provided subsets of their data holdings.

• In SeaDataNet the CDI Version 0 metadatabase has been expanded with wider coverage of the data sets of the 10 pilot partners, and 5 additional core partners are in the process of contributing their data sets.

• So far the CDI Version 0 metadatabase contains more than 220,000 entries of individual data sets.

Page 6:

Present status

http://www.sea-search.net/cdi

Page 7:

Targets in the coming months

• Expand the CDI Version 0 metadatabase with the CDI entries of the other 25 data centres in SeaDataNet – October 2007

• The result will be a complete index to all data sets managed by SeaDataNet partners and options for access / requesting access, provided by the local systems of partners.

• Upgrade the CDI Version 0 to Version 1 for the 11 members of the Technical Task Team, which will include a uniform data shopping, user authorisation and downloading mechanism – fully operational in February 2008.

Page 8:

Objective of this Workshop

• To give background on the Common Data Index, its format, principles, etc.

• To instruct the 25 partners on how to prepare CDI entries for their data holdings, using the CDI tools

Page 9:

What is a Data set for CDI?

The definition of a data set is left to each partner, because partners apply different levels of granularity in their archive systems for storing and accessing data sets. A number of examples are given to illustrate what is considered a data set for CDI.

Page 10:

Case: CTD measurements

CTD casts are collected at different geographical locations, e.g. during a scientific cruise. Each cast is represented by a data file / dataset, which is stored and can be reproduced by a Data centre. The CDI record reflects the metadata of a single CTD cast, covering multiple parameters, and including the information for accessing this specific CTD dataset or getting a copy of this specific CTD dataset.

The same approach can be applied for other in-situ and discrete measurements, such as sediment grabs, geological cores, water bottles, etc.

Page 11:

Case: Sea level / wave / current observations

Hydrodynamical observations are collected at different geographical locations, and might be part of a station, equipped with a number of instruments. Each instrument produces a timeseries of observation data, which is stored and can be reproduced by a Data centre.

The CDI record reflects the metadata of the resulting dataset of a single instrument at a single station, covering multiple parameters, and including the information for accessing this specific dataset or getting a copy of this specific dataset.

Page 12:

Case: Sea level / wave / current observations

This can cover a long time series. If there are large gaps in time coverage, the Data centre might have decided to split the dataset into several sub-series, each covering a consistent observation period. In that case each sub-series is represented by a separate CDI record.
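As an illustration of splitting at gaps, here is a minimal Python sketch. The function name `split_at_gaps`, the 30-day threshold and the sample timestamps are assumptions made for this example; each data centre chooses its own criterion for what counts as a large gap.

```python
from datetime import datetime, timedelta


def split_at_gaps(timestamps, max_gap):
    """Split a sorted list of observation times into sub-series.

    A new sub-series starts whenever two consecutive observations are
    more than `max_gap` apart. Each returned sub-series would be
    described by its own CDI record.
    """
    series = []
    for t in timestamps:
        if series and t - series[-1][-1] <= max_gap:
            series[-1].append(t)   # continue the current sub-series
        else:
            series.append([t])     # first value, or gap too large
    return series


# Hypothetical hourly observations with a break of several months.
obs = [datetime(2005, 1, 1, h) for h in range(3)]
obs += [datetime(2005, 9, 1, h) for h in range(2)]

sub_series = split_at_gaps(obs, max_gap=timedelta(days=30))
for s in sub_series:
    print(s[0].isoformat(), "to", s[-1].isoformat(), f"({len(s)} observations)")
```

The start and end of each sub-series then give the time coverage for the corresponding CDI record.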

Page 13:

Case: Hydrographic measurements

The bathymetry of the seabed is measured by hydrographic surveys, each of which covers a specific area. Each survey spans a consistent time period in which the area is sampled by sailing a number of tracks, during which the seabed bathymetry is recorded in single tracks or zones by specific instruments. Each instrument used during a specific survey produces a hydrographic survey dataset, which is stored and can be reproduced by a Data centre.

Page 14:

Case: Hydrographic measurements

The CDI record reflects the metadata of the resulting data set of a single instrument during a single area survey, including the information for accessing this specific dataset or getting a copy of this specific dataset. The same approach can be applied for other area measurements, such as seismic surveys, satellite images, etc.

Page 15:

CDI V0 format – ISO 19115

For purposes of standardisation, international exchange and interoperability with other systems and networks, it was decided to adopt ISO 19115, the international metadata standard for geographic information.

The ISO 19115 XML schema (or DTD) is defined and managed by Technical Committee TC211 of the International Organization for Standardization (ISO), which is responsible for making international standards on geographic information (www.isotc211.org).

Page 16:

CDI V0 format – ISO 19115

This standard defines more than 300 metadata elements, most of which are optional. Around ten elements are mandatory ‘core’ metadata. In addition, one can create profiles and add new elements.

The CDI metadata are constructed as a dedicated subset of this standard.

[Diagram labels: Comprehensive metadata components (300+ elements); Core metadata components; Extended metadata components (community profile)]

Page 17:

CDI V0 format

CDI documentation can be found at:

http://www.sea-search.net/cdi_documentation

This page gives access to:

• Document with detailed description of format and XML tags

• ISO 19115 XML schema (XSD)

• Supporting Vocabularies

• Tools

• Example

Page 18:

CDI V0 format

Entries to CDI consist of:

• Descriptions of each deliverable data set, supported by a number of controlled Vocabularies

Page 19:

Controlled Vocabularies

Controlled vocabularies or lists of standard terms have been defined and are maintained for standard and systematic filling of a number of fields.

These vocabularies cover a broad spectrum of disciplines of relevance to the oceanographic and wider community.

The vocabularies are available as Web services and as an online client application, always giving access to the latest versions and values.

There is also active content governance.

Page 20:

Controlled Vocabularies

The SeaDataNet Vocabulary service is based upon the NERC DataGrid (NDG) vocabulary Web service, developed in 2006 and operated by BODC, and a vocabulary Client Interface, developed in 2006 and operated by MARIS. The client lets users search and browse the various vocabularies and make and download export files of selected entries in csv format.
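A csv export of a vocabulary can be used to check locally that the terms entered in CDI fields are valid. The Python sketch below shows one possible way to do this. The column names (`key`, `label`) and the sample terms are invented for the example; the actual layout of the export files may differ.

```python
import csv
import io


def load_vocabulary(csv_text, key_column="key"):
    """Return the set of term keys found in a vocabulary csv export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_column].strip() for row in reader}


def unknown_terms(values, vocabulary):
    """Return the values that are not in the vocabulary, in input order."""
    return [v for v in values if v not in vocabulary]


# Made-up export content; a real export comes from the vocabulary client.
export = "key,label\nAAA,Example term A\nBBB,Example term B\n"
vocab = load_vocabulary(export)

# Terms used in a local record; any term listed here needs correction.
print(unknown_terms(["AAA", "ZZZ"], vocab))
```

Because the vocabularies are updated over time, a check like this is only as current as the export file it reads.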

Page 21:

Controlled Vocabularies

SeaVoX content governance

Content governance of these vocabularies is very important for staying up to date and in sync with ongoing developments. Therefore a combined SeaDataNet and MarineXML Vocabulary Content Governance Group (SeaVoX) has been set up, moderated by BODC, with active membership of experts from SeaDataNet, MMI, MOTIIVE, JCOMMOPS and other international groups. SeaVoX operates through a mailing list server.

Page 22:

European Directory of Marine Organisations - EDMO

Data Centres and Data Originators are maintained per country by the NODC in the EDMO directory

http://www.sea-search.net/edmo/welcome.htm

via its online Content Management System

http://www.sea-search.net/vu_organisations/welcome.asp

(access with id and password)

Relevant information for CDI editors is available at:

http://www.sea-search.net/organisations/welcome.html

Page 23:

CDI V0 format

The CDI gives answers to the following basic questions:

• Where?

• When?

• What?

• How?

• Who?

• Where to find data?

• Other relevant information?

Details are given following the CDI Documentation
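To make the questions concrete, the Python sketch below groups them into one record structure. This is an illustration only: the class `CdiSummary`, its field names and the sample values are assumptions for this example and are not the CDI XML tags, which are defined in the CDI documentation.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class CdiSummary:
    """Illustrative container; field names are not the CDI XML tags."""
    where: tuple          # (west, south, east, north) in decimal degrees
    when: tuple           # (start date, end date)
    what: list            # measured parameters
    how: str              # instrument or platform
    who: str              # originator / holding data centre
    where_to_find: str    # how to access or request the data
    other: str = ""       # any other relevant information


# Hypothetical single CTD cast.
example = CdiSummary(
    where=(3.0, 52.0, 3.0, 52.0),
    when=(date(2006, 5, 1), date(2006, 5, 1)),
    what=["temperature", "salinity"],
    how="CTD",
    who="Example data centre",
    where_to_find="request via the data centre",
)

for question, answer in asdict(example).items():
    print(f"{question}: {answer}")
```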

Page 24:

How to apply to your situation?

Steps:

• Do a mapping analysis to map your data to CDI

• Generate an export of CDI XML files for your data sets, using the defined XML tags

• Use the XML schema (XSD) to check (parse) your XML files.

• Send a number of test XML files to MARIS for evaluation.

• If ok, prepare and send full set of XML files.
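The first three steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the local field names, the mapping in `FIELD_MAP`, the root element `cdiRecord` and all tag names are placeholders invented for the example. The real tags are those defined in the CDI documentation and XSD. The check shown here only confirms that the output parses as XML; checking against the XSD itself requires a schema-aware validator, which the Python standard library does not provide.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from local database fields to placeholder tag names.
# The real tag names are defined in the CDI documentation and XSD.
FIELD_MAP = {
    "station_id": "identifier",
    "start_time": "startDate",
    "lat": "latitude",
    "lon": "longitude",
}


def record_to_xml(local_record):
    """Build one XML document (as a string) from one local record."""
    root = ET.Element("cdiRecord")  # placeholder root element
    for local_name, tag in FIELD_MAP.items():
        if local_name not in local_record:
            raise KeyError(f"local record lacks field {local_name!r}")
        ET.SubElement(root, tag).text = str(local_record[local_name])
    return ET.tostring(root, encoding="unicode")


def is_well_formed(xml_text):
    """Return True if the text parses as XML. This is not XSD validation."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Made-up local record for one data set.
local = {"station_id": "ST001", "start_time": "2006-05-01", "lat": 52.0, "lon": 3.0}
xml_text = record_to_xml(local)
print(xml_text)
print(is_well_formed(xml_text))
```

Keeping the mapping in one table makes the result of the mapping analysis explicit and easy to review before the full export is generated.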