Biodiversity Informatics at the Natural History Museum
Ed Baker
Terrestrial Invertebrates, Department of Life Sciences & NHM Informatics Initiative
http://dx.doi.org/10.6084/m9.figshare.722897
Science as a Slow Cooker
• Only the surface visible
• Lid kept on for extended periods of time
• Uses cheap cuts of raggy meat
• Ingredients lose their nutritional value
• Children at risk due to high temperatures
http://ispiders.blogspot.co.uk/2011/11/realtime-web.html
We like data
• 70 million+ specimens collected over 400 years
• 350,000+ books
• ??? Unpublished datasets in archive, notebooks, computers
• ??? In the minds of staff
How do we provide access?
• Digitisation of specimens and associated data
• Scanning and transcribing books, journals, archives
• Providing tools for managing the data life cycle
• Changing the way we publish: data publication
Flowing Data
[Diagram: Collection, Curation, Use, Publication]
Flowing Data
[Diagram: Collection, Curation]
• Somebody retires
• Somebody dies
• Project is cancelled
Sits in desk drawer or on a hard drive until….
Flowing Data
[Diagram: Collection, Curation, Use, Data Publication, Publication, Re-use]
Flowing Data: from collection to reuse
[Diagram: Collection, Curation, Use, Data Publication, Publication, Re-use]
Collection
• Citizen Science
• Automated identification and monitoring
• Traditional taxonomic sources
Curation
Websites for communities to publish and curate:
• Taxonomy / nomenclature
• Bibliographies
• Specimen information
• Character matrices
Use: Oboe
Publication (Data)
• Datasets
• Single species descriptions
• Checklists
• Software
Publication (Research)
• Traditional research
• Systematic zoology
• Phylogeny
• Biogeography
Flowing Data: from collection to reuse
Re-use Re-use Re-use Re-use
The Problem of Scale
Data is being generated by tens of thousands of researchers, in thousands of institutions
• Hard to find what you need
• Hard to know if what you need actually exists
• Impossible to go through researcher by researcher
NHM Data Portal
• Aggregator for NHM science data
• Visualisation tools for datasets
• Allows export of NHM data for re-use
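As a rough illustration of what re-using an export might look like, here is a minimal sketch in Python. It assumes the export is a CSV file with Darwin Core style column names (catalogNumber, scientificName, country). That layout, the helper name count_by_field, and the three sample rows are all made up for illustration and are not taken from the portal itself.

```python
import csv
import io
from collections import Counter

# Made-up sample rows standing in for an exported file. The column names
# follow Darwin Core terms, but the real export layout is an assumption.
SAMPLE_EXPORT = """catalogNumber,scientificName,country
EX-0001,Gryllus campestris,United Kingdom
EX-0002,Gryllus campestris,France
EX-0003,Acheta domesticus,United Kingdom
"""


def count_by_field(csv_text, field):
    """Count exported records by the value found in one column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[field] for row in reader)


# Summarise the same export two ways: per species and per country.
species_counts = count_by_field(SAMPLE_EXPORT, "scientificName")
country_counts = count_by_field(SAMPLE_EXPORT, "country")

for name, n in species_counts.most_common():
    print(name, n)
```

The same function works on a real file by passing its contents in place of SAMPLE_EXPORT, provided the column names match.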
The Informatics Landscape
>18K specimen records (local, small-scale coverage)
>276M specimen records (worldwide coverage)
The Informatics Landscape
A webpage for every species
Aggregate specimen and observation data globally
Wikimedian in Residence
• Make NHM content available under open licenses for use on Wikimedia projects (and elsewhere)
• Reach of Wikipedia: BBC, Encyclopedia of Life
• Wikisource: Transcription and translation crowd-sourcing
Flowing Data: from collection to reuse
?
"Everybody makes mistakes. And if you don't expose your raw data, nobody will find your mistakes." Jean-Claude Bradley
http://bit.ly/146ugIv