The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud


Upload: aims-agricultural-information-management-standards

Post on 08-Jan-2017


TRANSCRIPT

Page 1: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

The Crop Ontology: harmonizing semantics for agricultural field data

www.cropontology.org

Elizabeth Arnaud (Bioversity International)

Co-authors: Leo Valette, Marie Angelique Laporte (Bioversity), Julian Pietragalla (Integrated Breeding Platform), Medha Devare (CGIAR)

And all crop curators and breeders

IGAD pre-meeting to 6th Research Data Alliance Conference, 21-22 September 2015

Page 2: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

A common structured language for multidisciplinary agricultural research

– Molecular geneticists, breeders, agronomists, physiologists, high-throughput phenotyping specialists, and crop modelers

– Enabling farmers to access information and exchange their preferences

– Calls for
• A common terminology for annotating data
• A mediation language that supports data interpretation
• An ontology for knowledge inference

Photos : courtesy of IRRI

Page 3: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Semantic Barriers to data interpretation

• No naming convention for variables and methods of measurement, which are heterogeneous
• Confusion between traits and variables
• No semantic coherence
– The same trait is given different names or abbreviations
– A trait is named the same way for various species but refers to different plant structures
• Definitions and measurements differ among farmers, breeders, agronomists and modelers
• No ontology that formally describes methods of measurement

Page 4: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

The Integrated Breeding Platform
www.integratedbreeding.net

Crop Ontology provides the most frequently measured traits and their standard variables for the Breeding fieldbook and for data annotation in the crop databases

• Crop Traits (agronomy, morphology, phenology, physiology, quality, stress)
• Experimental Design, trial management
• Environmental factors

Page 5: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Crop Ontology www.cropontology.org

• Banana • Barley • Cassava • Chickpea • Common bean • Cowpea • Groundnut • Lentil • Maize • Oat (Global Triticeae) • Pearl millet • Pigeon pea • Potato • Soybean (USDA & IITA) • Sweet potato • Rice • Sorghum • Vitis (INRA) • Wheat • Yam

Page 6: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Extracting standard variables: Trait Dictionary Template 5.0

Trait = Entity + Quality, e.g. Flower (entity) + colour (quality)

Trait ID: CO_341:0000090
Trait: Flower color
Entity: Flower
Attribute: Colour
Trait synonyms: Flower pigmentation
Trait abbreviation: FCL
Trait abbreviation synonyms: FlwCol
Trait description: Color of the flower
Trait class: Morphological traits
Trait status: Recommended
Trait Xref: TO:0000537

• A Trait can group several variables

Grain Weight
– Weight of 100 grains expressed in g
– Average weight of a grain, expressed in g
– Weight of 100 grains expressed on a categorical scale: 1=low (50-100g), 2=medium (100-150g), 3=high (150-200g)
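The Grain Weight example above can be sketched in code: one trait record that groups three variables, plus a helper that applies the categorical scale. This is only an illustrative sketch, not the Trait Dictionary format. The class and field names are hypothetical, and the slide does not say which class a boundary value such as 100 g belongs to, so the code assumes each class includes its lower bound.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variable:
    description: str  # free-text description taken from the slide
    scale: str        # unit or categorical scale


@dataclass
class Trait:
    name: str
    variables: List[Variable] = field(default_factory=list)


def grain_weight_class(weight_100_grains_g: float) -> int:
    """Map the weight of 100 grains (in g) to the 1-3 categorical scale.

    Assumption (not stated in the source): each class includes its lower
    bound and excludes its upper bound, except the last class, which
    also includes 200 g.
    """
    if 50 <= weight_100_grains_g < 100:
        return 1  # low
    if 100 <= weight_100_grains_g < 150:
        return 2  # medium
    if 150 <= weight_100_grains_g <= 200:
        return 3  # high
    raise ValueError("weight outside the 50-200 g range covered by the scale")


# One trait, three variables that all describe it
grain_weight = Trait(
    name="Grain weight",
    variables=[
        Variable("Weight of 100 grains", "g"),
        Variable("Average weight of a grain", "g"),
        Variable("Weight of 100 grains", "1=low, 2=medium, 3=high"),
    ],
)
```

Keeping the scale on the variable rather than on the trait is what allows the same trait to carry both gram values and categorical scores.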

Julian Pietragalla, IBP, Agronomist - Based in CIMMYT

Léo Valette, Bioversity, Agronomist

Page 7: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Standard Variable

Methods and scales are important information to capture for data comparison and interpretation (e.g. by crop models). Current ontologies sometimes provide brief information on methods, but only as free text in the attribute information.

A Variable is described by the assembly: Property (Trait) + Method + Scale/Unit

• Has a unique name
• Annotates the real value of the measurement (for the fieldbook, for databases)

Proposed naming convention for a standard variable: P_M_S

Method types: • Measurement • Counting • Estimation • Computation

Scales/Units: • Nominal • Ordinal • Numerical • Time • Duration • Text • Code
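A minimal sketch of the P_M_S convention, assuming the name is simply the property, method and scale abbreviations joined by underscores. The method and scale types are the ones listed on the slide; the class name, field names and the abbreviations "GW", "Meas" and "g" are hypothetical and are not taken from the Crop Ontology.

```python
from dataclasses import dataclass
from enum import Enum


class MethodType(Enum):
    MEASUREMENT = "Measurement"
    COUNTING = "Counting"
    ESTIMATION = "Estimation"
    COMPUTATION = "Computation"


class ScaleType(Enum):
    NOMINAL = "Nominal"
    ORDINAL = "Ordinal"
    NUMERICAL = "Numerical"
    TIME = "Time"
    DURATION = "Duration"
    TEXT = "Text"
    CODE = "Code"


@dataclass(frozen=True)
class StandardVariable:
    property_abbr: str  # P: abbreviation of the property (trait)
    method_abbr: str    # M: abbreviation of the method
    scale_abbr: str     # S: abbreviation of the scale or unit
    method_type: MethodType
    scale_type: ScaleType

    @property
    def name(self) -> str:
        """Build the P_M_S name; reject parts that would make it ambiguous."""
        parts = (self.property_abbr, self.method_abbr, self.scale_abbr)
        for part in parts:
            if not part or "_" in part or " " in part:
                raise ValueError(f"invalid abbreviation: {part!r}")
        return "_".join(parts)


# Hypothetical abbreviations, for illustration only
grain_weight_100 = StandardVariable(
    "GW", "Meas", "g", MethodType.MEASUREMENT, ScaleType.NUMERICAL
)
print(grain_weight_100.name)  # GW_Meas_g
```

Rejecting underscores inside an abbreviation keeps the name splittable back into its three parts, which is a design choice of this sketch rather than a documented rule.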

Page 8: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Online visualization: Traits, Methods, Scales & Standard Variables

Work in progress

Page 9: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Naming convention for standard variables

Property (Trait) + Method of measurement + Scale or Unit

Applicable to any type of measurement and to indicators for surveys and monitoring

Page 10: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Google Cloud & API

USERS

EU-SOL - Solanaceae Breeding DB, Wageningen

International cassava DB - Boyce Thompson Institute/IITA

Global Repository of Evaluation Trials (Agtrials): 1,410 agronomic variables are mapped to Crop Ontology traits for 29,633 of the 34,329 trial descriptions

Phenomics Ontology Driven DB (PODD)

Luca Matteis, Web developer

Page 11: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Breeding Management System

Annotation of breeders data

Page 12: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Agronomy Ontology

• Agronomic trial data are often collected, described and/or formatted in inconsistent ways
• An Agronomy Ontology will support the integration of pre-breeding, breeding and agronomy data
• Combining results of field management practices x crop trait measurements leads to a full understanding of how factors vary within a cropping system
• First step: aligning with the International Consortium for Agricultural Systems Applications (ICASA) master variable list (http://research.agmip.org/display/dev/ICASA+Master+Variable+List) - 600 standard variables, used by the crop models of AgMIP and the Crop Research Ontology

©Cimmyt

CROP - PLANTING • SEED TREATMENT • IRRIGATION • FERTILIZER • PESTICIDE • SOIL • BIOTIC STRESS • ABIOTIC STRESS • HARVEST-YIELD

Medha Devare, Data and Knowledge Manager, CGIAR Consortium Office

Work in progress

Page 13: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

ICASA Variables for Crop Models

Page 14: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud


Common Reference Ontologies for Plants (cROP) and Tools for Integrative Plant Genomics

Planteome pilot project

• Centralized platform for reference ontologies for plants
• Online informatics portal for ontology-based, annotated data for plant germplasm, gene expression, and non-model genomes
• Data query, analysis, visualization and community-based annotation and curation tools

• Plant Ontology (PO)
• Plant Trait Ontology (TO)
• Plant Stress Ontology (PSO)
• Plant Experimental Conditions Ontology (PECO/EO)
• Gene Ontology (plants)
• Phenotypic Qualities Ontology (PATO)
• Cell Type Ontology (CL)
• Chemicals (ChEBI)
• Protein Ontology (PRO)

Common Reference Ontologies for Plants and Tools for Integrative Plant Genomics

• Lead PI: Pankaj Jaiswal
• Sinisa Todorovic, Eugene Zhang, Oregon State University, USA
• Dennis W. Stevenson, New York Botanical Garden, NY, USA
• Elizabeth Arnaud, Bioversity International, Montpellier, France
• Christopher Mungall, Lawrence Berkeley National Laboratory, Berkeley CA, USA
• Georgios V. Gkoutos, John Doonan, University of Aberystwyth, UK
• Barry Smith, University of Buffalo, NY, USA

Page 15: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

[Figure: two panels labelled "Compound mappings" and "Inference". Nodes in both panels: PATO:0000122 Length; CO_321:0000056 Spike Length; TO:0000271 Inflorescence Length (shown with a question mark in the first panel); PO:0009049 Inflorescence (narrow synonym: spike).]

Mapping Crop Ontology terms across species and to the Reference Ontologies

• Mappings performed by Marie Angélique Laporte
• Automatic mapping generated by the AML tool developed by Catia Pesquita and Daniela Oliveira from LaSIGE - Large-Scale Informatics Systems Laboratory (https://github.com/AgreementMakerLight/AML-Project)
• This mapping tool is used for thesaurus alignment of FAO, CABI, NAL in the Global Agricultural Concept Server (GACS) project
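The compound mapping and inference steps shown on this slide can be illustrated with a toy example. This is a hand-written sketch and not the AML algorithm: the term IDs and the narrow synonym come from the slide, while the dictionaries, the function names and the assumption that TO:0000271 is defined as the pair (PO:0009049, PATO:0000122) are hypothetical.

```python
# Hand-written records for illustration; a real pipeline would load the ontologies.
PO_TERMS = {
    "PO:0009049": {"label": "inflorescence", "narrow_synonyms": {"spike"}},
}
PATO_TERMS = {
    "PATO:0000122": "length",
}
# Assumed (entity, quality) definition of the Trait Ontology term
TO_DEFINITIONS = {
    "TO:0000271": ("PO:0009049", "PATO:0000122"),
}


def compound_mapping(label):
    """Map an 'entity quality' label to a (PO id, PATO id) pair."""
    words = label.lower().split()
    if len(words) != 2:
        return (None, None)
    entity_word, quality_word = words
    entity_id = next(
        (pid for pid, term in PO_TERMS.items()
         if entity_word == term["label"] or entity_word in term["narrow_synonyms"]),
        None,
    )
    quality_id = next(
        (qid for qid, name in PATO_TERMS.items() if name == quality_word),
        None,
    )
    return (entity_id, quality_id)


def infer_to_term(label):
    """Return the TO term whose definition equals the compound mapping, if any."""
    mapping = compound_mapping(label)
    if None in mapping:
        return None
    for to_id, definition in TO_DEFINITIONS.items():
        if definition == mapping:
            return to_id
    return None


# CO_321:0000056 "Spike Length" resolves through the narrow synonym "spike"
print(infer_to_term("Spike Length"))  # TO:0000271
```

The point of the sketch is the two-step structure: the crop-specific label is first decomposed into reference entity and quality terms, and only then matched to a reference trait.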

Page 16: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

Future Activities

• Content Expansion
– Farmers' preferences of Participatory Variety Selection (PVS)
– Functional traits for Agroecology and Ecosystem Services restoration
– Hosting Agricultural and Nutrition Technology ontology (ANT) of IFPRI
– Aligning with Agrovoc, CABI, NAL thesauri for literature mining

• Community Use Expansion
– Through Planteome and Divseek initiative
– International Wheat Initiative
– Collaborative, Open Plant Omics (COPO): a community-driven bioinformatics platform for plant science (BBSRC)
– The ISA-Tools group at Oxford, testing their Statistical Method Ontology: http://www.stato-ontology.org

April 2016: Workshop on Crop Ontology for scientists to discuss their data and the definitions of traits, and to present our project results. Hands-on sessions; VoCamp

Sponsors, session conveners?

Page 17: The Crop Ontology - Harmonizing Semantics for Agricultural Field Data, by Elizabeth Arnaud

CGIAR Crop Lead Centers and partners since 2008

Community workshop in 2014, Montpellier: http://tiny.cc/rw51ax