European Internet Search Catalogue of ex situ PGR accessions
EURISCO Uploading mechanism
• Principles
• Framework
• Data format
• Responsibilities
• Mechanism
• Protocols
• Import file data format / technology
• Uploading web site
• EURISCO demo version
Principles (Framework)
National Level: Data processing, Data packaging, Data sources
Central Level: Data importing, Data checking, Data access
Principles (Data format)
• Data from National Inventory are uploaded to EURISCO in a fixed format
  • to allow automation
  • to allow validation
• Descriptor list is only for uploading
  • National Inventory descriptors can be different
  • Scripts for conversion can be necessary
• The EURISCO descriptor list is an extension of the FAO/IPGRI Multi Crop Passport Descriptor List (MCPD) version 2 (December 2001)
  • the MCPD list v2 is used without changes but with six additions
• Download the MCPD:
  • http://www.ipgri.cgiar.org/publications/pubfile.asp?id_pub=124
  • http://www.ipgri.cgiar.org/publications/pdf/124.pdf
Mechanism (Responsibilities)
National:
1. Responsible for data sources and merging
2. Ensure data quality and accuracy
3. Decide on data availability
4. Commit to provide data in the EURISCO-MCPD data format
Central:
1. Responsible for checking the National Inventory (NI) data
2. Provide feedback to National Partners
3. Import NI data into EURISCO
4. Develop and maintain the front end
Protocols (Data flow)
ASCII tab-delimited format
The first line contains the MCPD descriptor names as column headers. There are no order restrictions, but each name has to match the descriptor name exactly (e.g. INSTCODE).
The following lines contain the respective MCPD descriptor values for each accession submitted.
Essential fields required for each line:
• NICODE = NI identifier = ISO code = From?
• INSTCODE = INST identifier = Who?
• ACCENUMB = ACC identifier = What?
• GENUS = Taxa identifier = Of?
National Inventory data file…
Protocols(Import file data format)
• TAB delimited text file (for the time being)
• TAB=ASCII(9) as field separator
• One accession = one line
• Line Break=ASCII(10) or ASCII(10)+ASCII(13) as line separator

Descr1 -> Descr2 -> Descr3 -> Descr4 …. -> DescrX
Data1 -> Data2 -> Data3 -> Data4 …. -> DataX
Data1 -> Data2 -> Data3 -> Data4 …. -> DataX
Data1 -> Data2 -> Data3 -> Data4 …. -> DataX
Data1 -> Data2 -> Data3 -> Data4 …. -> DataX
Data1 -> Data2 -> Data3 -> Data4 …. -> DataX
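
As an illustration of the import file layout, the following is a minimal Python sketch that reads such a tab-delimited text into one record per accession. It is not the EURISCO importer (which the slides describe as PHP-based); the function name `parse_import_file` and all sample values are invented for this example.

```python
from typing import Dict, List


def parse_import_file(text: str) -> List[Dict[str, str]]:
    """Parse tab-delimited import text into one dict per accession."""
    # splitlines() accepts LF, CR LF and other common line terminators;
    # blank lines left over from unusual line breaks are dropped.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    # First line: descriptor names, in any order.
    header = [name.strip() for name in lines[0].split("\t")]
    records = []
    for line in lines[1:]:
        values = line.split("\t")
        # Pad short lines so every descriptor has a (possibly empty) value.
        values += [""] * (len(header) - len(values))
        records.append(dict(zip(header, (v.strip() for v in values))))
    return records


# Invented sample data: two accessions, header on the first line.
sample = (
    "NICODE\tINSTCODE\tACCENUMB\tGENUS\n"
    "NLD\tNLD000\tACC-0001\tTriticum\n"
    "NLD\tNLD000\tACC-0002\tHordeum\r\n"
)
records = parse_import_file(sample)
```

Because each record is keyed by descriptor name, the column order in the file does not matter, which matches the "no order restrictions" rule for the header line.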
Protocols (Data flows)
Data flow options: FTP, ODBC, XML, Text file
Timeline: Now, Short-Medium term
Protocols (Database structure)
ACC
COD_INST
COD_TAX
COD_ISO
COD_SAMPSTAT
COD_COLLSRC
COD_XXXXXXX
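
One possible reading of this structure is a central ACC table whose coded columns refer to COD_ lookup tables. The sketch below illustrates that reading only; it uses Python's built-in sqlite3 module instead of MySQL so that it runs anywhere, and every column name, the sample rows, and the helper `try_insert` are assumptions, not the actual EURISCO schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign key references only when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE COD_SAMPSTAT (code TEXT PRIMARY KEY, label TEXT);
CREATE TABLE ACC (
    instcode TEXT NOT NULL,
    accenumb TEXT NOT NULL,
    sampstat TEXT REFERENCES COD_SAMPSTAT(code),
    PRIMARY KEY (instcode, accenumb)
);
""")
# Invented rows: one lookup code and one accession that uses it.
conn.execute("INSERT INTO COD_SAMPSTAT VALUES ('100', 'illustrative label')")
conn.execute("INSERT INTO ACC VALUES ('NLD000', 'ACC-0001', '100')")


def try_insert(sampstat):
    """Return True if an accession with this sample status code is accepted."""
    try:
        conn.execute(
            "INSERT INTO ACC VALUES ('NLD000', ?, ?)",
            ("ACC-" + sampstat, sampstat),
        )
        return True
    except sqlite3.IntegrityError:
        # The code is not present in the lookup table.
        return False
```

With this layout, an accession carrying a code that is missing from the lookup table is rejected by the database itself.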
Protocols (tools and technology)
Database software: MySQL (http://www.mysql.com)
Scripting language: PHP (http://www.php.net)
Web server: Apache (http://www.apache.org)
The objective is to develop the EURISCO central node with free software.
EURISCO should be able to run on most operating systems (Linux, Windows, Mac…).
All developments, including source code, are freely accessible.
Protocols (Preliminary checks)
• Data format (for the time being tab delimited)
• Essential descriptors
  • NICODE
  • INSTCODE
  • GENUS
  • ACCENUMB
• Line-by-line checking
• Uploading report generated
• Data provider contacted for approval….
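
The preliminary checks can be sketched roughly as follows in Python. This is an assumption-laden illustration, not the actual upload script: the function name `preliminary_check`, the report wording, and the sample values are all hypothetical.

```python
ESSENTIAL = ("NICODE", "INSTCODE", "GENUS", "ACCENUMB")


def preliminary_check(text):
    """Return a list of report messages; an empty list means no problems found."""
    report = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ["file is empty"]
    header = [h.strip() for h in lines[0].split("\t")]
    missing = [d for d in ESSENTIAL if d not in header]
    if missing:
        # Without the essential columns, checking individual lines is pointless.
        return ["missing essential descriptor column(s): " + ", ".join(missing)]
    # Line numbers count non-blank lines, with the header as line 1.
    for number, line in enumerate(lines[1:], start=2):
        row = dict(zip(header, line.split("\t")))
        empty = [d for d in ESSENTIAL if not row.get(d, "").strip()]
        if empty:
            report.append(
                f"line {number}: empty essential field(s): " + ", ".join(empty)
            )
    return report


# Invented sample files: one complete, one with an empty INSTCODE.
good = "NICODE\tINSTCODE\tGENUS\tACCENUMB\nNLD\tNLD000\tTriticum\tACC-0001\n"
bad = "NICODE\tINSTCODE\tGENUS\tACCENUMB\nNLD\t\tTriticum\tACC-0002\n"
```

The returned list plays the role of the uploading report: it could be shown to the data provider before approval is requested.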
Protocols (Line checks)
• Line-by-line checking
• Required descriptors: essential !!!
• Coded descriptors: detailed checking
  • Sample status
  • Collecting source
  • With INSTCODE (BREDCODE…)
• Semi-coded descriptors: basic checking
  • Dates
  • Coordinates
• Un-coded descriptors: no real checking (for the time being)
  • Taxonomy
  • ...
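
The three checking levels can be illustrated with a short Python sketch. It makes several assumptions: the allowed code sets are passed in by the caller (the set used in the example is an arbitrary subset, not the MCPD list), dates are assumed to follow a YYYYMMDD layout with hyphens for an unknown month or day, and the function name `check_line` is hypothetical. Coordinates could be handled with a similar pattern check.

```python
import re

# Assumed date layout: YYYYMMDD, hyphens allowed for an unknown month or day.
DATE_RE = re.compile(r"^\d{4}(\d{2}|--)(\d{2}|--)$")


def check_line(row, allowed_codes):
    """Return a list of problems for one accession given as a dict."""
    problems = []
    # Coded descriptors: detailed check against the allowed code set.
    for name, codes in allowed_codes.items():
        value = row.get(name, "")
        if value and value not in codes:
            problems.append(f"{name}: unknown code {value!r}")
    # Semi-coded descriptors: basic pattern check only.
    for name in ("ACQDATE", "COLLDATE"):
        value = row.get(name, "")
        if value and not DATE_RE.match(value):
            problems.append(f"{name}: unexpected date format {value!r}")
    # Un-coded descriptors such as taxonomy are accepted as given.
    return problems


# Invented example: an arbitrary code subset and one accession record.
codes = {"SAMPSTAT": {"100", "200", "300"}}
row = {
    "SAMPSTAT": "305",
    "ACQDATE": "1998----",
    "COLLDATE": "98-05-01",
    "GENUS": "Triticum",
}
problems = check_line(row, codes)
```

Empty values are skipped here on the assumption that only the essential descriptors are mandatory; those are covered by the preliminary checks.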
Uploading web site
www.ecpgr.org/eurisco-upload/index.php
Import File
View transfer report
View file import report
View data before uploading to EURISCO
Upload to EURISCO
View EURISCO update reports
EURISCO web site
DEMO
Demo site: http://ipgri.singer.cgiar.org/