The SAIAB Biodiversity Data Curation Platform Willem Coetzer South African Institute for Aquatic Biodiversity Grahamstown

Post on 01-Jun-2020


Page 1: Willem Coetzer South African Institute for Aquatic Biodiversity Grahamstownbiodiversityadvisor.sanbi.org/wp-content/uploads/2018/12/... · 2018-12-06 · Willem Coetzer South African

The SAIAB Biodiversity Data Curation Platform

Willem Coetzer South African Institute for Aquatic Biodiversity

Grahamstown


Overview

Introduction to SAIAB

Specimen collections

Research platforms

Case-study using BRUV data

The natural science collections community

Natural Science Collections Facility (NSCF)

The role of SAIAB in hosting museums’ biodiversity data for the NSCF

Data publication

Concluding remarks

The SAIAB Biodiversity Data Curation Platform


Introduction to SAIAB

http://www.saiab.ac.za/information-brochures.htm


Collections Platform

• National Fish Collection


Research platforms (for use by the SA scientific community)

• African Coelacanth Ecosystem Programme (ACEP): Marine Platform

• Acoustic Tracking Array Platform (ATAP)

• Marine Remote Imagery Platform (Ma-RIP) *

• Vessels and instruments (e.g. remotely operated vehicle & remote underwater camera)


Stills from remote underwater video


Baited remote underwater video (BRUV)

How would you characterise these data?

a) Marine biodiversity data

b) Fish data

c) Behavioural data

d) Ecological data

e) Camera-trap data

f) All of the above


Baited remote underwater video (BRUV)

How would you characterise these data?

a) ‘Marine biodiversity data’: A marine biodiversity scientist can model knowledge…

b) ‘Fish data’: An ichthyologist can model knowledge…

c) ‘Behavioural data’: An ethologist can model knowledge…

d) ‘Ecological data’: An ecologist can model knowledge…

e) ‘Camera-trap data’?

Fundamentally we are talking about ‘raw data records’ of events and occurrences. We describe these using metadata, which is an organised form of knowledge (i.e. a computer file).

We create a knowledge model of abstract concepts (e.g. an ontology) in a particular domain.

The knowledge model can then be applied to the raw data, and used as a ‘cookie cutter’ to extract useful or actionable biodiversity information from the knowledge/data fusion.
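The ‘cookie cutter’ idea can be sketched in a few lines of Python. This is only an illustration, under loud assumptions: the records, field names and domain definitions below are all invented for the example, and a real knowledge model (e.g. an ontology) is far richer than a list of field names.

```python
# Hypothetical raw BRUV records: one row per observation in a video.
# All values are made up for illustration.
RAW_RECORDS = [
    {"deployment": "D01", "depth_m": 22, "taxon": "Chrysoblephus laticeps",
     "count": 3, "behaviour": "feeding"},
    {"deployment": "D01", "depth_m": 22, "taxon": "Pachymetopon aeneum",
     "count": 1, "behaviour": "passing"},
    {"deployment": "D02", "depth_m": 35, "taxon": "Chrysoblephus laticeps",
     "count": 2, "behaviour": "passing"},
]

# Each "knowledge model" is reduced here to the raw fields a domain cares
# about (an assumption; an ontology would also define classes and relations).
DOMAIN_MODELS = {
    "ichthyology": ("taxon", "count"),
    "ethology": ("taxon", "behaviour"),
    "ecology": ("deployment", "depth_m", "taxon", "count"),
}


def apply_model(domain, records):
    """Cut the same raw records to the shape defined by one domain model."""
    fields = DOMAIN_MODELS[domain]
    return [{field: record[field] for field in fields} for record in records]


if __name__ == "__main__":
    for domain in DOMAIN_MODELS:
        print(domain, apply_model(domain, RAW_RECORDS)[0])
```

The point of the sketch is that the raw records are stored once, and each domain specialist extracts a different view from them.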


Metadata properties of the Occurrence class
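As a rough illustration, and assuming the slide refers to the Occurrence class of the Darwin Core standard (the transcript does not say so explicitly), a single BRUV observation could be described with a handful of that class's terms. The selection of terms and the example values are assumptions made for this sketch.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Occurrence:
    """A small subset of Darwin Core Occurrence-class terms (assumed subset)."""
    occurrenceID: str
    recordedBy: Optional[str] = None
    individualCount: Optional[int] = None
    lifeStage: Optional[str] = None
    behavior: Optional[str] = None
    occurrenceStatus: Optional[str] = None
    associatedMedia: Optional[str] = None


def to_terms(occurrence):
    """Return only the terms that have a value, keyed by term name."""
    return {k: v for k, v in asdict(occurrence).items() if v is not None}


if __name__ == "__main__":
    # Hypothetical identifier and values, for illustration only.
    occ = Occurrence(occurrenceID="example:bruv:D01:0001",
                     individualCount=3, behavior="feeding",
                     occurrenceStatus="present")
    print(to_terms(occ))
```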


Workbench: Mapping BRUV spreadsheet columns to Specify fields
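The Workbench performs this mapping interactively. Purely to illustrate the idea, the sketch below renames spreadsheet columns to target fields with a lookup table. The column names and the `Table.field` names are hypothetical and have not been checked against the Specify schema or the actual BRUV spreadsheet.

```python
# Hypothetical mapping from spreadsheet column headings to target fields.
# Neither side is taken from the presentation; both are assumptions.
COLUMN_TO_FIELD = {
    "Site": "Locality.localityName",
    "Lat": "Locality.latitude1",
    "Lon": "Locality.longitude1",
    "Date": "CollectingEvent.startDate",
    "Species": "Taxon.fullName",
}


def map_row(row, mapping=COLUMN_TO_FIELD):
    """Rename the columns of one spreadsheet row.

    Returns the mapped row and a sorted list of columns with no mapping,
    so that unmapped data is reported rather than silently dropped.
    """
    mapped = {mapping[col]: value for col, value in row.items() if col in mapping}
    unmapped = sorted(col for col in row if col not in mapping)
    return mapped, unmapped


if __name__ == "__main__":
    row = {"Site": "Reef A", "Lat": -33.98, "Lon": 25.70, "Bait": "pilchard"}
    print(map_row(row))
```

Reporting the unmapped columns is a design choice for the sketch: a data steward would normally want to know which spreadsheet columns still need a home before importing.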


Vertical integration of events and occurrences


Natural Science Collections Facility (NSCF)

• The NSCF (virtual facility) is a network of institutions/museums holding natural science collections that are accessible to external researchers (open data).

• The NSCF will ensure that natural science collections/data are used for research.

• The NSCF is funded by DST’s long-term funding programme, the Research Infrastructure Roadmap (SARIR).


Biodiversity Data Curation Platform (hosted by the SAIAB Data Centre)

Description

• Three data centres (2 replicated in real time and 1 for backup)

• Systems Administrator / Data Custodian

• Museum web servers hosted by SAIAB Data Centre

• Specify Software (Specify Collections Consortium, USA)

• Specify 7 (web version)

• Information Manager / Data Steward

Intention

• Primarily for the use of SAIAB scientists (originally used for collection data)

• Open to all natural science museums in South Africa

• Collaborative research in capacity development, specifically for biodiversity data curation in the context of South African natural science museums

• Not necessarily a long-term solution, but useful for NSCF objectives


Participating museums

| Museum | Application | Status of Vertebrates Migration |
|---|---|---|
| Ditsong Museum | Specify 7 | Complete |
| Durban Nat Sci Museum | Specify 7 | Complete |
| Port Elizabeth Museum | Specify 7 | Complete |
| Albany Museum | Specify 6/7 | |
| East London Museum | Specify 7 | Complete |
| KwaZulu-Natal Museum | Specify 7 | Complete |
| McGregor Museum | Specify 7 | Complete |


Biodiversity Data Curation Platform: Support


Biodiversity Data Curation Platform: Backup

• Data hosted by SAIAB is replicated across three independent data centres (>250 m apart).

• Two data centres replicate in real time, and the third is dedicated to storing backups.

• High-capacity tape backup will be added in the near future.

• As an additional measure, cloud storage is used to store daily extracts of Specify databases, which are retained for one year.
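The one-year retention of daily extracts amounts to a simple date rule, sketched below. This is not SAIAB's actual backup tooling, which the slides do not describe. The function name is invented, and ‘one year’ is assumed to mean 365 days.

```python
from datetime import date, timedelta

# Assumption: "retained for one year" is interpreted as 365 days.
RETENTION = timedelta(days=365)


def extracts_to_delete(extract_dates, today):
    """Return, oldest first, the extract dates that fall outside retention."""
    cutoff = today - RETENTION
    return sorted(d for d in extract_dates if d < cutoff)


if __name__ == "__main__":
    dates = [date(2017, 12, 5), date(2017, 12, 6), date(2018, 6, 1)]
    print(extracts_to_delete(dates, today=date(2018, 12, 6)))
```

An extract made exactly 365 days ago is kept in this sketch; only strictly older extracts are listed for deletion.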


Specify 7 Workbench Application to import a spreadsheet (e.g. web application used by data steward at a remote museum) *


Publication of data (Global Biodiversity Information Facility)


Concluding Remarks

• Museum / research institute dichotomy, reflected in the culture of, and attitude to, data

• Capacity development for data curation in the biodiversity community

• The quality of data, and the adherence to metadata standards

• Encourage researchers to publish data (e.g. Biodiversity Data Journal)