How e-infrastructure can contribute to Linked Germplasm data
Giannis Stoitsis, Agro-Know [email protected]
e-conference on Germplasm Data Interoperability
Contents
• Why we need e-infrastructure
• What e-infrastructure can provide
• The agINFRA approach
• agINFRA powered services for Germplasm data
• What is next
WHY WE NEED E-INFRASTRUCTURE
• publications, theses, reports, other grey literature
• educational material and content, courseware
• primary data, such as measurements & observations
  – structured, e.g. datasets as tables
  – digitized, e.g. images, videos
• secondary data, such as processed elaborations
  – e.g. dendrograms, pie charts, models
• provenance information, incl. authors, their organizations and projects
• experimental protocols & methods
• social data, tags, ratings, etc.
• …
agricultural data
• stats • gene banks • GIS data • blogs • journals • open archives • raw data • technologies • learning objects • …
educators’ view
• stats • gene banks • GIS data • blogs • journals • open archives • raw data • technologies • learning objects • …
researchers’ view
• stats • gene banks • GIS data • blogs • journals • open archives • raw data • technologies • learning objects • …
practitioners’ view
• stats • gene banks • GIS data • blogs • journals • open archives • raw data • technologies • learning objects • …
we still have data silos
• Many metadata standards (e.g. DC, IEEE LOM, Dw, local schemas)
• Diversity of web interfaces (e.g. REST, OAI-PMH, SOAP, SPI, SQI)
• Different exchange formats (e.g. XML, RDF, JSON)
• Fragmented use of taxonomies
LD for educational data/resource sharing: overview of approaches for LD in educational data sharing
• On-the-fly/automated integration of heterogeneous APIs and data (http://www.meducator.net)
• Dataset (transformation and) cataloging (http://linkedup-project.eu)
We are still here … … and not here …
we need ontologies published online and aligned
• stats • gene banks • blogs • journals • open archives • raw data • learning objects
we need tools to share data
we need tools to semantically annotate data
and for all this we need
• aim is: promoting data sharing and consumption related to any research activity aimed at improving productivity and quality of crops
ICT for computing, connectivity, storage, instrumentation
data infrastructure for agriculture
what researchers need in agINFRA
… only a browser and internet connection
typical problem: computing
typical problem: hosting
what can be hosted and executed on agINFRA
• Data storage & management tools
  – APIs for content dissemination in large networks
• Processing & visualisation tools
• Metadata aggregation infrastructure
• Search engines and apps for institutions or communities
• Environments for running experiments, e.g. comparing different content recommendation algorithms
http://aginfra.eu/en/our-solution/api
HOW AGINFRA CAN SOLVE DATA INTEROPERABILITY PROBLEMS
WORKFLOW FOR METADATA AGGREGATION
metadata aggregations
• concerns viewing merged collections of metadata records from different sources
• useful when access is needed to specific supersets or subsets of networked collections
  – records actually stored at the aggregator
  – or queries distributed over virtually aggregated collections
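A minimal sketch of the first variant above, where merged records are actually stored at the aggregator. The source names, record fields and sample values are invented for illustration and do not come from agINFRA:

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One merged metadata record plus the sources it was seen in."""
    identifier: str
    title: str
    sources: set = field(default_factory=set)


def aggregate(collections):
    """Merge records from several sources into one collection keyed by identifier."""
    merged = {}
    for source_name, records in collections.items():
        for rec in records:
            # The first source to provide a record supplies its title;
            # later sources only add themselves to the provenance set.
            entry = merged.setdefault(rec["id"], Record(rec["id"], rec["title"]))
            entry.sources.add(source_name)
    return merged


def subset(merged, source_name):
    """View only the records that a given source contributed to."""
    return [r for r in merged.values() if source_name in r.sources]


# Hypothetical collections from two hypothetical providers.
collections = {
    "genebank_a": [
        {"id": "acc-001", "title": "Wheat landrace"},
        {"id": "acc-002", "title": "Barley cultivar"},
    ],
    "genebank_b": [
        {"id": "acc-002", "title": "Barley cultivar"},
        {"id": "acc-003", "title": "Maize inbred line"},
    ],
}

merged = aggregate(collections)
```

Here `merged` is the superset view and `subset(merged, "genebank_b")` is a subset view. The second variant, distributing queries over virtually aggregated collections, would replace the stored dictionary with calls to each source at query time.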
typically look like this
Ternier et al., 2010
metadata aggregation tools
More than a harvester:
• Validation Service
• Repository Software
• Registry Service
• Harvester
Powered by
a metadata aggregation workflow that can be ported to agINFRA
Harvesting Validating Transforming
OAI target - XMLs
Triplification Storing and indexing
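The workflow stages named above can be sketched end to end on a tiny example. This is only an illustration under assumptions: the inline XML stands in for what an OAI target would return (no network call is made), the validation rule (identifier and title required) is invented, and none of the function names are agINFRA APIs:

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
NS = {"oai": "http://www.openarchives.org/OAI/2.0/", "dc": DC}

# Invented sample standing in for the XML served by an OAI target.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:acc-001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title> Wheat landrace accession </dc:title>
          <dc:subject>Triticum aestivum</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:acc-002</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:subject>Hordeum vulgare</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvest(xml_text):
    """Harvesting: pull raw fields out of each OAI record."""
    root = ET.fromstring(xml_text)
    for rec in root.iterfind(".//oai:record", NS):
        yield {
            "id": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
            "subject": rec.findtext(".//dc:subject", default="", namespaces=NS),
        }


def validate(record):
    """Validating: assumed rule, a record needs an identifier and a title."""
    return bool(record["id"].strip()) and bool(record["title"].strip())


def transform(record):
    """Transforming: normalise whitespace into a common internal shape."""
    return {key: value.strip() for key, value in record.items()}


def triplify(record):
    """Triplification: one (subject, predicate, object) tuple per field."""
    return [
        (record["id"], DC + "title", record["title"]),
        (record["id"], DC + "subject", record["subject"]),
    ]


def build_index(records):
    """Storing and indexing: a toy inverted index over title words."""
    index = {}
    for rec in records:
        for word in rec["title"].lower().split():
            index.setdefault(word, []).append(rec["id"])
    return index


harvested = list(harvest(SAMPLE))
valid = [transform(r) for r in harvested if validate(r)]
triples = [t for r in valid for t in triplify(r)]
index = build_index(valid)
```

The second sample record has no title, so it is harvested but dropped at the validation stage. A real deployment would replace the in-memory lists with a triple store and a search index.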
TOOLS FOR PUBLISHING AND LINKING VOCABULARIES
AGRICULTURAL DATA DISCOVERY SERVICE/PORTAL OVER THE CLOUD
agricultural data discovery modules for open source CMS
http://www.youtube.com/watch?v=OYlxWlyag04&feature=youtu.be
LINKING GERMPLASM DATABASES AND EXPOSING DESCRIPTIONS AS LINKED DATA
agINFRA contribution to germplasm data interoperability
• Define recommendations for describing germplasm data
• Define mappings between different metadata formats
• Provide APIs for transformation – triplification of germplasm descriptions
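To make the mapping and triplification points concrete, here is a hedged sketch. The slides do not say which formats are mapped. As an assumption, it maps passport-style field names (as used in the Multi-Crop Passport Descriptors) to Darwin Core term URIs and serialises the result as N-Triples. The crosswalk, the subject URI and the sample record are all illustrative, not an agreed agINFRA mapping:

```python
DWC = "http://rs.tdwg.org/dwc/terms/"

# Illustrative crosswalk only: passport-style field -> Darwin Core term URI.
CROSSWALK = {
    "ACCENUMB": DWC + "catalogNumber",
    "INSTCODE": DWC + "institutionCode",
    "GENUS": DWC + "genus",
    "SPECIES": DWC + "specificEpithet",
}


def map_record(record, crosswalk):
    """Apply the crosswalk and report fields it does not cover."""
    mapped = {crosswalk[k]: v for k, v in record.items() if k in crosswalk}
    unmapped = sorted(k for k in record if k not in crosswalk)
    return mapped, unmapped


def escape_literal(value):
    """Escape the characters that N-Triples string literals require."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(subject_uri, mapped):
    """Serialise mapped fields as one N-Triples line per property."""
    return "\n".join(
        f'<{subject_uri}> <{predicate}> "{escape_literal(str(obj))}" .'
        for predicate, obj in mapped.items()
    )


# Hypothetical accession record and hypothetical subject URI.
record = {
    "ACCENUMB": "PI 12345",
    "INSTCODE": "EXA001",
    "GENUS": "Triticum",
    "SPECIES": "aestivum",
    "REMARKS": 'local name "old red"',
}

mapped, unmapped = map_record(record, CROSSWALK)
ntriples = to_ntriples("http://example.org/accession/PI-12345", mapped)
```

Returning the unmapped fields alongside the mapped ones keeps gaps in the crosswalk visible, which is useful while the recommendations for describing germplasm data are still being defined.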
mapping between different metadata formats powered by agINFRA
publishing germplasm data as linked data in agINFRA
services
next steps in the context of agINFRA
• Develop the recommendations for publishing germplasm data
• Deploy transformers and make them available in agINFRA
• Deploy API for triplification
thank you! [email protected] www.agroknow.gr www.aginfra.eu