BioCASE web services for germplasm data sets, at FAO, Rome (2006)
DESCRIPTION
Sharing of biodiversity data with web services: demonstration of the BioCASE software. Food and Agriculture Organization of the United Nations (FAO), 2nd March 2006.
TRANSCRIPT
FAO, Rome, March 2nd 2006, Dag Endresen, NGB, IPGRI
Sharing of biodiversity data with Web Services
Demonstration of BioCASE
Sharing of biodiversity data with BioCASE, March 2, 2006, FAO, Rome
TOPICS
Biodiversity data
Data standards
Data exchange tools
The BioCASE data provider software
Decentralized data network
Biodiversity collections data
Different biodiversity collections describe very similar data objects.
Preserved reference collections, such as those in museums and herbaria.
Living collections, like botanical and zoological gardens, aquaria, seed banks, microbial strain cultures and tissue collections.
Data collections, from surveys of objects in the field, such as observations.
These collections have most of their attributes in common, although the terminology used to describe them may differ substantially. [http://www.bgbm.org/TDWG/CODATA/ABCD-Evolution.htm]
Germplasm data, seed genebanks
Germplasm genebanks are biodiversity collections.
Collection level data: metadata about genebank institutes and the germplasm collections they hold.
Unit level data: the unit level data for germplasm collections are the accessions. Genebank accessions have most of the same properties and attributes as other biodiversity specimens.
Data Standards
Crop Descriptors
The IPGRI crop descriptors (as well as those of other networks) are developed to meet the specific needs of individual crops.
The MCPD (Multi-Crop Passport Descriptors) list is designed to be compatible with the IPGRI crop-specific descriptor lists and the FAO World Information and Early Warning System (WIEWS).
The MCPD descriptor list is compatible with ABCD (2.06).
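To make the idea of MCPD/ABCD compatibility concrete, here is a minimal Python sketch that re-keys an MCPD-style passport record by ABCD concept paths. The descriptor names (INSTCODE, ACCENUMB, GENUS, ORIGCTY) are taken from the MCPD list; the ABCD paths, the function name and the sample values are illustrative assumptions, not the official mapping, and should be verified against the ABCD 2.06 schema.

```python
# Illustrative subset only: these ABCD 2.06 paths are assumptions to verify
# against the schema, not the official MCPD-to-ABCD mapping.
UNIT = "/DataSets/DataSet/Units/Unit"
MCPD_TO_ABCD = {
    "INSTCODE": UNIT + "/SourceInstitutionID",
    "ACCENUMB": UNIT + "/UnitID",
    "GENUS": UNIT + "/Identifications/Identification/Result/TaxonIdentified"
                    "/ScientificName/NameAtomised/Botanical/GenusOrMonomial",
    "ORIGCTY": UNIT + "/Gathering/Country/ISO3166Code",
}


def mcpd_to_abcd(record):
    """Return (mapped, unmapped): values keyed by ABCD path, plus leftovers."""
    mapped, unmapped = {}, {}
    for descriptor, value in record.items():
        path = MCPD_TO_ABCD.get(descriptor.upper())
        if path is None:
            unmapped[descriptor] = value  # no known ABCD concept in this subset
        else:
            mapped[path] = value
    return mapped, unmapped
```

Descriptors that have no entry in the table are returned separately rather than dropped, so gaps in a mapping stay visible.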
Taxonomic Databases Working Group (TDWG)
Standards development and maintenance
Darwin Core 2: "Element definitions designed to support the sharing and integration of primary biodiversity data". [http://darwincore.calacademy.org/]
Access to Biological Collection Data (ABCD) 2.06: "An evolving comprehensive standard for the access to and exchange of data about specimens and observations (a.k.a. primary biodiversity data)". [http://www.bgbm.org/TDWG/CODATA/Schema/]
ABCD: Access to Biological Collection Data
ABCD is a common data specification for data on biological specimens and observations (including the plant genetic resources seed banks).
The design goal is to be both comprehensive and general (about 1200 elements).
Development of the ABCD started after the 2000 meeting of the TDWG.
ABCD was developed with support from TDWG/CODATA, ENHSIN, BioCASE, and GBIF.
The MCPD descriptor list is now completely mapped to, and compatible with, ABCD 2.06.
[http://www.bgbm.org/TDWG/CODATA/Schema/]
PGR sub-unit of ABCD
Generation Challenge Program: GCP_Passport_1.03
In the context of the GCP (Generation Challenge Program), the GCP Passport data exchange schema was developed.
Similar XML schemas are under development for phenotype (trait) data and genotype data.
Demo Data Portal
A demo data portal was developed, providing live access to selected BioCASE data providers.
[http://geifir.ngb.se/abcdproto/default.jsp]
Create your own BioCASE data schema
Create an XML schema (xsd file) of your data model and copy the schema to an online location (http://...).
Create a Concept Mapping Configuration (CMF) file from the XML schema. [http://ww3.bgbm.org/biocase/utilities/process_schema.html] (or use your own BioCASE installation: .../utilities/process_schema.html)
Save the resulting XML (CMF file) into the cmf folder of your BioCASE installation to make it available for local mapping: .../biocase/configuration/templates/cmf/cmf_your-preferred-file-name.xml
Visit [http://ww3.bgbm.org/bps2/GenerateCmFiles] for more information.
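As an illustration of the first step, the following Python sketch writes a very small XML schema for a flat data model using only the standard library. The element names, the target namespace and the function name are hypothetical; a real schema for use with BioCASE would normally be richer and should be checked with an XML schema validator before it is processed into a CMF file.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def build_schema(root_name, fields, target_ns):
    """Return a minimal XSD: one root element holding optional string fields."""
    ET.register_namespace("xs", XS)  # so the output uses the usual xs: prefix
    schema = ET.Element(
        "{%s}schema" % XS,
        {"targetNamespace": target_ns, "elementFormDefault": "qualified"},
    )
    root = ET.SubElement(schema, "{%s}element" % XS, {"name": root_name})
    complex_type = ET.SubElement(root, "{%s}complexType" % XS)
    sequence = ET.SubElement(complex_type, "{%s}sequence" % XS)
    for name in fields:
        ET.SubElement(
            sequence,
            "{%s}element" % XS,
            {"name": name, "type": "xs:string", "minOccurs": "0"},
        )
    return ET.tostring(schema, encoding="unicode")


# Hypothetical data model with two fields.
xsd_text = build_schema(
    "Accession", ["AccessionNumber", "Genus"], "http://example.org/my-genebank"
)
```

The resulting text can be saved as an .xsd file and placed online, which is the input the CMF generation step expects.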
Biodiversity informatics data exchange tools
Data Provider Software
Distributed network of data providers retrieving structured data from multiple, distributed, heterogeneous databases across the Internet.
DiGIR, Distributed Generic Information Retrieval. [http://digir.net]
BioCASE, The Biological Collection Access Service for Europe.
[http://www.biocase.org/]
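The idea of one query answered by many independent providers can be sketched in Python as a simple fan-out. This is not how DiGIR or BioCASE portals are implemented; the provider callables, record layout and function name are assumptions made for illustration, and in practice each callable would wrap an HTTP request to one provider.

```python
from concurrent.futures import ThreadPoolExecutor


def query_all(providers, query):
    """Send one query to every provider and merge the answers.

    providers maps a provider name to a callable(query) -> list of dicts.
    Returns (merged_records, failures); a failing provider does not stop
    the others.
    """
    def ask(item):
        name, fetch = item
        try:
            return name, fetch(query), None
        except Exception as exc:  # one slow or broken provider is expected
            return name, [], exc

    merged, failures = [], {}
    with ThreadPoolExecutor(max_workers=max(1, len(providers))) as pool:
        for name, records, error in pool.map(ask, providers.items()):
            if error is not None:
                failures[name] = error
            # Tag each record with its origin so the source stays visible.
            merged.extend(dict(record, provider=name) for record in records)
    return merged, failures
```

Keeping the provider name on every record reflects the principle that the data remain under the control of, and attributable to, the collection holder.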
Protocol integration - TAPIR
There is a need to integrate the current protocols in use by different biodiversity informatics community networks.
During the TDWG meeting in 2004, the unified protocol was presented and named TAPIR, the TDWG Access Protocol for Information Retrieval.
New BioCASE and DiGIR software will implement the TAPIR protocol.
Will TAPIR also help us to integrate GBIF with the BioMOBY community?
[http://ww3.bgbm.org/tapir]
BioMOBY
BioMOBY is an international research project on methodologies for biological data representation, distribution, and discovery.
BioMOBY has been chosen as the web service framework for the Generation Challenge Program. [http://www.biomoby.org/]
Work is in progress to develop BioMOBY and BioCASE interoperability.
BioCASE data provider software
BioCASE: Biological Collection Access for Europe
[http://www.biocase.org/]
BioCASE: Biological Collection Access for Europe
BioCASE establishes web-based unified access to biological collections in Europe while leaving control of the information with the collection holders.
ABCD is the main data definition used by BioCASE.
The software is designed to be generic: it can handle any schema and connect to any SQL-capable database.
BioCASE provides GBIF with full access to its registry. Being a BioCASE provider thus means being a GBIF provider.
[http://www.biocase.org/]
BioCASE [http://www.biocase.org/]
BioCASE runs on MS Windows, Mac OS X, Linux, BSD, Solaris...
BioCASE works with many different databases: PostgreSQL, MySQL, Oracle, MS Access, MS SQL Server...
BioCASE works with Unicode: אבדו ضاإطقكغب ששچپچ
BioCASE is open source.
BioCASE is developed in the Python programming language.
CVS
Distributed BioCASE network
BioCASE protocol stack
(Diagram. Client domain: a client sends a query using ABCD concepts as a BioCASE protocol XML request over http across the Internet. Provider domain: the PyWrapper (XML / CGI) uses the PSF and CMF configuration XML files, queries the provider's unit data with SQL, and returns a BioCASE protocol XML response over http carrying ABCD data in XML according to the ABCD schema.)
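The translation step in the protocol stack, from a concept-based query to SQL and back to XML, can be illustrated with a small Python sketch. It is not the PyWrapper code: the concept paths, table and column names and the simplified output elements are all made up, and an in-memory SQLite database stands in for the provider's database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical concept-to-column mapping, standing in for what the CMF
# configuration records; paths and column names are invented for the sketch.
CONCEPT_TO_COLUMN = {
    "/Units/Unit/UnitID": "accession_number",
    "/Units/Unit/ScientificName": "scientific_name",
}


def answer_query(conn, concept, pattern):
    """Translate one 'like' filter on a concept into SQL; wrap rows in XML."""
    column = CONCEPT_TO_COLUMN[concept]  # KeyError for unmapped concepts
    sql = ("SELECT accession_number, scientific_name FROM accessions "
           "WHERE " + column + " LIKE ? ORDER BY accession_number")
    # The client-side wildcard * becomes the SQL wildcard %.
    rows = conn.execute(sql, (pattern.replace("*", "%"),)).fetchall()
    units = ET.Element("Units")
    for accession_number, scientific_name in rows:
        unit = ET.SubElement(units, "Unit")
        ET.SubElement(unit, "UnitID").text = accession_number
        ET.SubElement(unit, "ScientificName").text = scientific_name
    return ET.tostring(units, encoding="unicode")


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accessions (accession_number TEXT, scientific_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO accessions VALUES (?, ?)",
        [("A1", "Hordeum vulgare"), ("A2", "Pisum sativum")],
    )
    return answer_query(conn, "/Units/Unit/ScientificName", "Hordeum*")
```

Only column names taken from the mapping are placed in the SQL text, and the search pattern is passed as a bound parameter, so the client never supplies SQL directly.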
BioCASE Provider Software v 2.3.1
Required configuration:
Web server: any CGI-compliant web server (Apache, IIS, etc.).
Database: the major databases are supported, including MySQL, Oracle, SQL Server, Sybase, Access and PostgreSQL. In theory, any database with a Python library should work.
Python: BioCASE is developed in the Python programming language. Install version 2.3 or later.
[http://ww3.bgbm.org/bps2/DocumentationToc]
[http://www.biocase.org/products/provider_software/index.shtml]
BioCASE installation
Download the provider software and unzip the archive file [provider_software_2.3.1.tar.gz]. For example, uncompress it into [C:\biocase\].
Configure your web server to publish the www folder. Example: [C:\biocase\] to be accessible through [http://localhost/biocase/].
Download and install the latest Python software [http://www.python.org/download/].
Execute the [C:\biocase\setup.py] script. For a UNIX-like system:
%> cd biocase
%> python setup.py
Test your installation: [http://localhost/biocase]
[http://ww3.bgbm.org/bps2/Installation]
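The final test step is normally done in a browser. As a rough alternative, a script can check that the published address answers at all. The sketch below assumes the example address from the slide and a modern Python 3 interpreter (newer than the Python version required by the provider software itself); the function name and the injectable opener parameter are illustrative.

```python
import urllib.error
import urllib.request


def installation_responds(url="http://localhost/biocase/",
                          opener=urllib.request.urlopen, timeout=5):
    """Return True if the address answers with HTTP status 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, HTTP error status, etc.
        return False
```

A True result only shows that the web server publishes the folder; it does not confirm that the database connection or the mappings are configured.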
BioCASE
Install third party software: [http://localhost/biocase/utilities/testlibs.cgi]
Follow the links from the library test page.
The column for installed version will display the installed version after successful installation.
To update the BioCASE software:
Download the new release. Unzip it to a temporary folder. Execute setup.py and follow the instructions.
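The kind of check a library test page performs can be approximated in a few lines of Python: try to import each module and report what was found. This is a sketch of the idea, not the testlibs.cgi script; the function name is hypothetical and the module names passed in are up to the caller.

```python
import importlib


def check_libraries(module_names):
    """Map each module name to its version string, 'installed', or None."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None  # not installed
        else:
            # Not every module exposes __version__, so fall back to a flag.
            report[name] = getattr(module, "__version__", "installed")
    return report
```

Running it with the database driver for your own database system shows at a glance whether that driver still needs to be installed.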
BioCASE configuration
After successful installation you will need to configure your data provider. Follow the instructions in the BioCASE documentation to configure:
Data sources. If you provide several datasets or several databases, each is configured as an individual data source.
Database connection, so that the software can access your database.
Database structure. Define the relevant tables, their primary keys and foreign keys.
Data model. Map your database model to the standard represented by the XML schemas you choose.
[http://ww3.bgbm.org/bps2/Configuration]
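Before entering the database structure, it can help to write it down and check it for gaps. The Python sketch below uses a made-up dictionary layout, not the BioCASE configuration file format, to list tables with their primary and foreign keys and to report obvious problems.

```python
def check_structure(tables):
    """Return a list of problems found in a table/key description.

    tables maps a table name to a dict with 'primary_key' (column name)
    and optional 'foreign_keys' (column name -> referenced table name).
    """
    problems = []
    for name, spec in tables.items():
        if not spec.get("primary_key"):
            problems.append("%s: no primary key defined" % name)
        for column, target in spec.get("foreign_keys", {}).items():
            if target not in tables:
                problems.append(
                    "%s.%s: unknown target table %s" % (name, column, target)
                )
    return problems


# Hypothetical genebank database with a dangling foreign key.
example = {
    "accession": {"primary_key": "id", "foreign_keys": {"taxon_id": "taxon"}},
    "collecting_site": {"primary_key": None},
}
```

For the example above the function reports the missing taxon table and the missing primary key, the two things the configuration step asks you to define.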
Example of a service request
All exchanged data is formatted with XML tags.
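Since the request shown on the original slide is not preserved in this transcript, the following Python sketch builds a search request of roughly that shape. The protocol namespace and the element names (request, header, type, search, requestFormat, responseFormat, filter, like, count) reflect an assumed reading of the BioCASE protocol and must be checked against the protocol documentation; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces; verify both against the protocol and schema documentation.
PROTOCOL_NS = "http://www.biocase.org/schemas/protocol/1.3"
ABCD_NS = "http://www.tdwg.org/schemas/abcd/2.06"


def build_search_request(concept_path, pattern, limit=10):
    """Build a search request filtering one ABCD concept with a 'like' pattern."""
    ET.register_namespace("biocase", PROTOCOL_NS)

    def q(tag):
        return "{%s}%s" % (PROTOCOL_NS, tag)

    request = ET.Element(q("request"))
    header = ET.SubElement(request, q("header"))
    ET.SubElement(header, q("type")).text = "search"
    search = ET.SubElement(request, q("search"))
    ET.SubElement(search, q("requestFormat")).text = ABCD_NS
    response_format = ET.SubElement(
        search, q("responseFormat"), {"start": "0", "limit": str(limit)}
    )
    response_format.text = ABCD_NS
    flt = ET.SubElement(search, q("filter"))
    ET.SubElement(flt, q("like"), {"path": concept_path}).text = pattern
    ET.SubElement(search, q("count")).text = "false"
    return ET.tostring(request, encoding="unicode")
```

The filter names a concept from the shared schema rather than a database column, which is what allows the same request to be sent to providers with very different databases.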
Example of a service response
TAPIR
TAPIR will offer you more advanced request formats.
TAPIR service request
TAPIR service response
singer:/sourcename
singer:/taxonomy/genus
singer:/taxonomy/species
singer:/taxonomy/subspecies
singer:/holding/ID
singer:/holding/name
singer:/origin/collecting/countrysource
singer:/origin/collecting/countrysourceID
singer:/status/biologicalstatus
singer:/status/biologicalstatusID
...
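The concept identifiers listed above read like a prefix followed by a slash-separated path. Assuming that reading, the Python sketch below turns a flat set of concept/value pairs into a nested structure. The sample values and the function name are made up, and the interpretation of the prefix as a namespace-like label is an assumption.

```python
def nest_concepts(values):
    """Turn {'prefix:/a/b': value} pairs into nested dictionaries."""
    tree = {}
    for concept, value in values.items():
        prefix, _, path = concept.partition(":/")
        node = tree.setdefault(prefix, {})
        *parents, leaf = path.split("/")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Made-up values for three of the concepts listed on the slide.
flat = {
    "singer:/taxonomy/genus": "Oryza",
    "singer:/taxonomy/species": "sativa",
    "singer:/sourcename": "example source",
}
nested = nest_concepts(flat)
```

Here nested groups genus and species under taxonomy, which mirrors how a structured XML response would group them.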
Decentralized data network with web services
Data warehouse model (Slide by Samy Gaiji, IPGRI)
Decentralized model (Slide by Samy Gaiji, IPGRI)
Data flow from genebanks to EURISCO and ECCDBs
Decentralized model
Genebanks on BioCASE
The BioCASE data provider software was implemented at (almost) all the CGIAR germplasm centers during the autumn of 2005.
Several other genebanks have installed the GBIF web service technology: Nordic Gene Bank, IPK Gatersleben, IHAR (DiGIR), USDA GRIN and CGN, with more to follow soon.
Germplasm data indexing tools
We are building data indexing methodologies for access to germplasm data with BioCASE.
The plan is to use these to build a Germplasm Clearing House Mechanism.
Development is in cooperation with GBIF, which itself indexes basic biodiversity data using a similar approach.
[http://chm.grinfo.net/index.php]
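One simple form of indexing is to record, for each value of a key field, which providers hold matching records, so that a portal can route a search to only those providers. The Python sketch below shows that idea with made-up provider names and records; it is not the indexing method used by the Clearing House Mechanism or by GBIF.

```python
from collections import defaultdict


def build_index(records_by_provider, field="genus"):
    """Map each normalised field value to the sorted providers that hold it."""
    index = defaultdict(set)
    for provider, records in records_by_provider.items():
        for record in records:
            value = record.get(field)
            if value:
                # Normalise case and whitespace so spellings group together.
                index[value.strip().lower()].add(provider)
    return {key: sorted(providers) for key, providers in index.items()}


# Hypothetical harvested records from two providers.
harvested = {
    "provider_a": [{"genus": "Hordeum"}, {"genus": "Pisum"}],
    "provider_b": [{"genus": "hordeum "}, {"genus": None}],
}
index = build_index(harvested)
```

The index holds only a pointer to the provider, so the full records stay with the genebank and are fetched live when a user asks for them.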
BioCASE and germplasm data [http://chm.grinfo.net/index.php?app=data_providers]
Works in progress
Globally Unique Identifiers, GUID (LSID, Life Science Identifiers) [http://lsid.sourceforge.net/]
Biodiversity informatics workflow tools (BioMOBY and Taverna, Kepler and SEEK...)
Germplasm Clearing House Mechanism [http://chm.grinfo.net/]
TAPIR
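For the LSID item, a short Python sketch shows the structure of such an identifier: a URN of the form urn:lsid:authority:namespace:object, with an optional revision. The example identifier is invented, and the function only splits the text; it does not resolve the identifier or validate the authority.

```python
from collections import namedtuple

LSID = namedtuple("LSID", "authority namespace object revision")


def parse_lsid(text):
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    if (len(parts) not in (5, 6)
            or parts[0].lower() != "urn"
            or parts[1].lower() != "lsid"
            or not all(parts[2:5])):
        raise ValueError("not an LSID: %r" % text)
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Invented identifier for a genebank accession.
example_lsid = parse_lsid("urn:lsid:example.org:accessions:1234")
```

The authority part identifies who issued the identifier, which fits the decentralized model in which each genebank remains the source of its own accession data.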
Thank you for listening!