Dimitris Koureas, PhD
Natural History Museum London
Linking layers of biodiversity data: Informatics challenges for the long tail research
RDA - Long Tail IG breakout session
Amsterdam, 23 Sep 2014
The problem – Capturing and integrating biodiversity data
How do we join up these activities? How do we use this as a tool?
Species conservation & protected areas
Impacts of human development
Biodiversity & human health
Impacts of climate change
Food, farming & biofuels
Invasive alien species
What infrastructures do we need? (technologies, tools, standards…)
What processes do we need? (Modelling, workflows…)
What data do we need? (Genes, localities…)
LinkD
Challenge 1: mobilising data at all scales
Challenge 2: linking & aggregating data at different scales
National Efforts c. 5M (e.g. NHM Data Portal)
Communities c. 50k (e.g. Scratchpads)
Global Efforts c. 500M (e.g. GBIF Data Portal)
Challenge 3: Synthesising data, e.g. modelling human pressures on biodiversity
www.predicts.org.uk
Projecting Responses of Ecological Diversity In Changing Terrestrial Systems
2M records, 19k sites, 34k spp.
Management Practices
Ecosystems Agro-systems
Small aggregated datasets
Species richness in different ecosystems
• Land-use change
• Pollution
• Invasive species
• Infrastructure
Models to predict how biodiversity responds to human pressures
The problem – integrating biodiversity research
Figure from Costello M.J et al, 2013. doi: 10.1126/science.1230318
c. 17,000 new species and subspecies described every year
The problem – integrating biodiversity research
Key problems
• Landscape is complex, fragmented & hard to navigate
• Many audiences (policy makers, scientists, amateurs, citizen scientists)
• Many scales (global solutions to local problems)
Figure adapted from Peterson et al 2010
An informatician's view of biodiversity
Investigator-focused 'small data'
Locally generated 'invisible data'
'incidental data'
dark data
20%
80%
Published and discoverable data
Dark data are more important, mainly because of their volume¹
¹ Heidorn PB. Library Trends 57:280-299
Incentives for mobilising long-tail research
Leverage effort and data impact
Increase exposure and citability of work
Provide easy to use and long-lasting VRE
Promote the culture of openness in science
Increase exposure and citability of work
Scholarly data publication
Enable easy publication of data and data descriptors
Link data journals with data sources (repositories, VREs) using common data exchange standards
Small data contributions
Leverage effort and data impact
Virtual Research Environments
Empower researchers through development and deployment of service-driven digital research environments
515 Scratchpad Communities by 6,321 active registered users, covering 176,950 taxa in 932,296 pages.
134 paper citations in 2013.
In total more than 2,500,000 visitors.
Leverage effort and data impact
Long tail data
External data & services
Leverage effort and data impact
Enable long tail researchers to do science online by processing own data together with data from cross-disciplinary sources
Provide workflows for the processing of data in major areas of biodiversity research: ecological niche modelling, ecosystem functioning, and taxonomy.
The BioVeL approach
Design and Construct – Run – Share and Discover scientific workflows
Leverage effort and data impact
A highly dynamic but fragmented landscape
Data curation
Data publishing
Data mobilisation & generation
Data analysis
Leverage effort and data impact
Seamless virtual research environments that incentivise mobilisation of long tail research
H2020 2015 VRE Proposal: LinkD
Topic: EINFRA-9-2015 Virtual Research Environments
Estimated Budget: €8-9 million
Consortium: c. 24 partners
LinkD: Linking data, services and communities for predictive modelling of the biosphere
Deliver a coherent and accessible ecosystem of federated services, and deploy a network of tools that enable research and collaboration, to support scientific excellence towards the long-term vision of predictive modelling of the biosphere
Builds upon: ViBRANT | BioVeL | pro-iBiosphere | EU-BON
Strategic links to: ESFRI projects (incl. LifeWatch, ELIXIR)