CSIRO National Research Collections


Upload: australiannationaldataservice

Post on 14-Apr-2017


TRANSCRIPT

Page 1: CSIRO National Research Collections

Andrew Young – Director, NRCA
Margaret Cawsey – Curator, National Wildlife Collection
John Morrissey – CSIRO IMT

National Research Collections Australia (NRCA) Specimen Identifiers – possible futures!

Page 2: CSIRO National Research Collections

Australia: a mega-diverse continent

Australia has:
• A lot of biodiversity – 8% of the Earth’s species
• Unique biodiversity – more than 70% endemic
• Valuable biodiversity – soybean, cotton, sorghum, macadamia, acacias, eucalypts

The challenge and opportunity is to:
• Manage biodiversity for conservation and ecosystem services
  – species decline, Convention on Biological Diversity
• Exploit biological assets for industry
  – food, fibre, medicines, novel compounds

Page 3: CSIRO National Research Collections

NRCA Mission

• National Research Collections Australia (NRCA) is a world-class “science-ready” collections research facility
• It discovers, documents, describes and explores Australia’s biodiversity
• NRCA delivers digital data and science to inform the conservation and use of Australia’s unique biological assets

Page 4: CSIRO National Research Collections

What is NRCA?

• Six national biological collections
• 15+ million specimens
• 200 year time-series (1805)
• Web-based digital delivery portal – Atlas of Living Australia (ALA)

Diagram labels:
– Wildlife Collection
– Insect Collection
– Herbarium (CANBR + ATH JVs)
– Algae Culture Collection
– Tree Seed Centre
– Staff: 8
– Fish Collection
– Crop germplasm Collection
– Soil Collection
– Atlas of Living Australia (ALA)

Data from Drawers 2015 | Andrew Young

Page 5: CSIRO National Research Collections

What is in NRCA?

• Physical specimens
  – whole organisms, skins, tissue samples, DNA samples
• Living collections
  – cultures, seed banks, seed orchards
• Digital specimens
  – sounds, photographs, X-ray images, DNA sequences
• Contextual data
  – location, site descriptions, species associations
• Unique $1.4+ billion research asset

Page 6: CSIRO National Research Collections

Currently each collection has its own database:
• 5/6 are bespoke
• Only one is run by IMT
• Inefficient, ineffective and vulnerable…

Data management challenge – a single system:
• 15+ million specimens x 30-40 fields ≈ 500 000 000 records
• Links to field books, living collections, nomenclature, associated samples (e.g. seeds, tissues, DNA samples, sounds)
• Loans (30 000 - 40 000 pa) and curation
• Room for future expansion (30 000+ pa)
• New data layers, e.g. genomes and phenomes
• Biologically intuitive interface
• Seamless data delivery to the ALA
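The sizing figure above is a rounded estimate. A minimal sketch of the arithmetic, assuming a flat table with one stored value per field per specimen (the function and variable names are illustrative, not from the slides):

```python
# Back-of-envelope sizing for a single collections database.
SPECIMENS = 15_000_000             # "15+ million specimens" (lower bound from the slide)
FIELDS_LOW, FIELDS_HIGH = 30, 40   # "30-40 fields" per specimen


def field_values(specimens: int, fields_per_specimen: int) -> int:
    """Total stored field values, assuming one value per field per specimen."""
    return specimens * fields_per_specimen


low = field_values(SPECIMENS, FIELDS_LOW)
high = field_values(SPECIMENS, FIELDS_HIGH)
print(f"{low:,} to {high:,} field values")  # prints "450,000,000 to 600,000,000 field values"
```

The slide's 500 000 000 sits inside this range, so it reads best as an order-of-magnitude figure rather than an exact count.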

We need integrated data management!

CollectiveAccess
– Open source
– Thin client
– Fits IMT architecture
– Good functionality

Page 7: CSIRO National Research Collections

Kinds of digital data

• High resolution scans / photographs
• X-rays
• Fluorescent images
• 3D images
• Sounds
• Micro CT scans

Page 8: CSIRO National Research Collections

You want me to re-label 15+ Million specimens?

Page 9: CSIRO National Research Collections

• Collection Management Systems
  – Loans and tissue grants
  – Provenance data about each specimen
  – Taxonomy data (which can change)
  – Geospatial data
• Data standards
  – Compliance with Biodiversity Information Standards (TDWG) is the main driver
• Data sharing and discoverability
  – Facilitated via metadata feeds to various domain-specific aggregators like ALA and GBIF

Let’s look at GUIDs from a CSIRO Natural History Collections point of view…

Page 10: CSIRO National Research Collections

Introducing TDWG

• Established 1985
• Initial data standards
  – Faunal communities: Darwin Core
  – Herbaria: HISPID 6
• GUIDs – 2006
  – Relatively simple requirements
  – The LSID: Life Science Identifier
  – URN technology
• 2010 – URI
  – Semantic web and linked data

Page 11: CSIRO National Research Collections

Discoverability – a brief history

• 2001 – GBIF
• 2007 – ALA (Darwin Core)
  – LSIDs adopted by the faunal collections
  – Each collection mints its own
• But ...
  – Often don’t resolve
  – Not used by many major collections
  – ALA and GBIF each mint their own record identifiers

urn:lsid:ozcam.taxonomy.org.au:ANWC:Birds:Specimen:B56401
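To make the structure of the identifier above concrete, here is a minimal parsing sketch. It assumes the general LSID layout `urn:lsid:<authority>:<namespace>:<object>`; because this example carries extra colon-separated parts, the sketch treats everything after the namespace as the object part. That reading is an assumption, and the `Lsid` type and `parse_lsid` function are hypothetical names, not part of any collection system.

```python
from typing import NamedTuple


class Lsid(NamedTuple):
    authority: str   # who minted the identifier
    namespace: str   # e.g. the collection code
    object_id: str   # everything after the namespace (assumption, see lead-in)


def parse_lsid(text: str) -> Lsid:
    """Split an LSID string into authority, namespace and object parts."""
    parts = text.split(":", 4)  # at most five pieces; the last keeps its colons
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    _, _, authority, namespace, object_id = parts
    if not (authority and namespace and object_id):
        raise ValueError(f"LSID has an empty component: {text!r}")
    return Lsid(authority, namespace, object_id)


lsid = parse_lsid("urn:lsid:ozcam.taxonomy.org.au:ANWC:Birds:Specimen:B56401")
print(lsid.authority)  # ozcam.taxonomy.org.au
print(lsid.namespace)  # ANWC
print(lsid.object_id)  # Birds:Specimen:B56401
```

Parsing only shows what the string encodes; it says nothing about whether the identifier resolves, which is the problem the slide raises.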

Page 12: CSIRO National Research Collections

TDWG GUID applicability statement 2010

• Lists 14 recommendations

R1. GUID technologies should be chosen from the list of recommended GUID types.
• *HTTP URI (used as a basis for some of the following options)
• LSID – *Life Science Identifier
• DOI – Digital Object Identifier
• PURL – Permanent URL
• UUID – Universally Unique Identifier
• Handle System

http://bioguid.info/urn:lsid:ozcam.taxonomy.org.au:ANWC:Birds:Specimen:B56401
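The URL above wraps an LSID in an HTTP URI by prefixing a resolver address. A minimal sketch of that pattern, assuming plain concatenation is all that is needed (the `bioguid.info` prefix comes from the slide; `lsid_to_http_uri` is a hypothetical helper name). For comparison it also mints a UUID, another type on the recommended list, using Python's standard `uuid` module.

```python
import uuid

PROXY_BASE = "http://bioguid.info/"  # resolver prefix taken from the slide's example URL


def lsid_to_http_uri(lsid: str, proxy_base: str = PROXY_BASE) -> str:
    """Prefix an LSID with an HTTP resolver address so it can be used as a URI."""
    if not lsid.lower().startswith("urn:lsid:"):
        raise ValueError(f"not an LSID: {lsid!r}")
    return proxy_base.rstrip("/") + "/" + lsid


uri = lsid_to_http_uri("urn:lsid:ozcam.taxonomy.org.au:ANWC:Birds:Specimen:B56401")
print(uri)

# A UUID is unique but carries no meaning and does not resolve on its own.
new_id = uuid.uuid4()
print(new_id.urn)  # e.g. urn:uuid:<random value>, different on every run
```

The contrast is the design question behind the recommendation list: an HTTP URI depends on someone keeping the resolver running, while a UUID needs no infrastructure but tells a reader nothing.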

Page 13: CSIRO National Research Collections

TDWG GUID applicability statement 2010

GUIDs can be applied to a variety of objects:
– Scientific names
– Taxonomic concepts
– Datasets and collections
– Specimens
– Genetic samples
– Images, videos, sound recordings
– Observations

Page 14: CSIRO National Research Collections

Current situation for the natural history collections...

• GBIF
  – Moved to using DOIs
  – Recommends the adoption of an identifier scheme that would work well with DOIs
• TDWG
  – TDWG is removing recommendations to use LSID, as decided at the TDWG Executive meeting, Sept 2016
• Conclusions:
  – No consensus reached ...
  – It is unlikely that any particular GUID technology will be successfully implemented until TDWG achieves consensus

Page 15: CSIRO National Research Collections

So what about the rest of CSIRO’s collections?

• Most are currently using bespoke in-house specimen IDs within their collection management systems.
• Poor understanding of the value proposition of adopting IGSN or any other resolvable GUID technology
• Major need to have GUIDs visible beyond the collection management system so that data can be easily linked to specimens from other systems like:
  – CSIRO Data Access Portal
  – CSIRO Publications repository
  – Digitisation and characterisation services like the Australian Synchrotron
  – ALA, TERN and GBIF…
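The linking need described above can be illustrated with a small sketch: records from different systems can only be connected to a specimen when each one carries the same identifier. Everything here is hypothetical (record layout, field names, placeholder values); only the LSID string comes from the slides.

```python
from collections import defaultdict

GUID = "urn:lsid:ozcam.taxonomy.org.au:ANWC:Birds:Specimen:B56401"  # from the slides

# Hypothetical records; field names and values are placeholders only.
records = [
    {"system": "collection management system", "guid": GUID, "local_id": "B56401"},
    {"system": "data portal", "guid": GUID, "item": "placeholder dataset"},
    {"system": "publications repository", "guid": None, "item": "placeholder paper"},
]


def link_by_guid(rows):
    """Group system names by the GUID they cite; rows without a GUID cannot be linked."""
    linked = defaultdict(list)
    for row in rows:
        if row.get("guid"):
            linked[row["guid"]].append(row["system"])
    return dict(linked)


links = link_by_guid(records)
print(links[GUID])  # ['collection management system', 'data portal']
```

The third record drops out because it holds no shared identifier, which is the situation when a specimen ID never leaves the collection management system.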

Page 16: CSIRO National Research Collections

John Morrissey

CSIRO IMT

NATIONAL FACILITIES AND COLLECTIONS, NATIONAL RESEARCH COLLECTIONS AUSTRALIA

Thank you