Integrated solutions for wheat data sets at CIMMYT David Marshall Clermont-Ferrand November 2015

Page 1: Integrated solutions for wheat data sets at CIMMYT - … solutions for wheat data sets at CIMMYT ... Vision for breeding and Breeding IS ... Simple data to document pedigree / family

Integrated solutions for wheat data sets at CIMMYT

David Marshall

Clermont-Ferrand November 2015

Page 2

CIMMYT plant breeding

Broad scope of maize and wheat breeding

Land races in genebanks to elite lines

Maize or wheat growing areas in developing countries

300+ partners in germplasm testing

Annual budget US $ 150 million

Restricted project funding

Total staff 1500

9 Biometrics + consultants

8 in germplasm IT

Page 3

Vision for breeding and Breeding IS

High‐throughput and lots of data on:

Genealogy

Phenotyping

Genotyping

Environmental and sensor data

Many options for integrated analysis of data

Page 4

Simplified model of a breeding information system

Databases: Phenotype, Genealogy, Seed Inventory

Data access tools

Data collection tools: Field book, Field log, Sample tracker, LIMS

Read and write to DB

Data query and analysis tools: statistics, visualisation, datamarts

Mainly read from DB

Page 5

Simplified data model for breeding data

Experimental design

Management

Environment

Plant/plots

Genealogy

Phenotyping

Molecular markers

Seed

Page 6

Phenotypic data

Foundation for breeding decisions

Expensive to generate, and the cost of traditional phenotyping is increasing

Costly to have people walk around repeatedly to each plot

Quality issues in phenotypic data

Potential for enhanced genetic gain

More effective / precise phenotyping, better decisions

More efficient phenotyping, larger breeding populations

In cereal breeding we need high throughput (precision?) phenotyping

Page 7

Remote sensing potential

Availability of low-cost UAVs and light, high-resolution hyperspectral remote sensors is a game changer for the use of remote sensing in phenotyping

Airborne remote sensing can be used for non‐destructive screening of plant physiological properties

Enough resolution to obtain information at plot level while being able to measure several hundred plots in one take

Potential benefits of remote sensing

Reduce cost in human resources and time

Larger breeding cycles and higher selection intensity

Quicker selection process

Page 8

Traits and current challenges

Potential traits or measures include for example:

Canopy temperature

Early vigor or biomass

Grain yield estimates

Flowering date

Plant height

Challenge 1: High‐throughput automated analysis procedures must be developed to process high volumes of data

Challenge 2: Research is needed on testing trait measures in combinations of: spectral bands/indices, time of measurement, and environmental or management conditions
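
As an illustration of the kind of automated per-plot processing that Challenge 1 calls for, the sketch below computes one common spectral index, NDVI = (NIR - Red) / (NIR + Red), and averages it per plot. NDVI is chosen here only as a familiar example; the talk does not name a specific index. The plot identifiers and reflectance values are hypothetical, and a real pipeline would first extract the pixels belonging to each plot from georeferenced imagery.

```python
# Minimal sketch: per-plot mean NDVI from (NIR, red) reflectance pairs.
# Plot names and reflectance values below are made up for illustration.

def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    # Guard against division by zero for pixels with no signal.
    return (nir - red) / denom if denom else 0.0


def plot_mean_ndvi(pixels):
    """Average NDVI over all pixels assigned to one plot."""
    values = [ndvi(nir, red) for nir, red in pixels]
    return sum(values) / len(values)


# Hypothetical input: plot id -> list of (NIR, red) reflectances.
plots = {
    "plot_101": [(0.50, 0.10), (0.60, 0.20)],
    "plot_102": [(0.30, 0.30)],
}

plot_ndvi = {plot_id: plot_mean_ndvi(px) for plot_id, px in plots.items()}

for plot_id, value in plot_ndvi.items():
    print(plot_id, round(value, 3))
```

The same loop could be repeated per flight date to study how the time of measurement affects the index, which is the kind of combination Challenge 2 refers to.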

Page 9

Climate data from partners

Met services

CRU

WorldClim

NASA GPM

NASA Power

Issues: Accessibility, costs, quality, calibrations, extrapolated data, time series, daily data, coverage: only ca. 5000 stations for the whole African continent

Page 10

Climate data: On Station

Investment of between 1,000 and 15,000 USD for met stations

Mainly depending on durability, connectivity options and sensors

Maintenance (clogged pluviometers, insects, birds); stations have to be calibrated; at least one person in charge; one technician in the country who can travel to multiple sites

Connectivity:

Cable

Wifi

GSM modem

Problems with sites without connectivity

Page 11

Genealogy data

Simple data to document pedigree / family relationship among breeding material

Very cheap data to generate

Full potential requires discipline, coordination, and a central genealogy database

Main challenge is to render the information according to crop / breeding program traditions

Useful for example for

Analyzing sources of traits

Calculating Coefficient of Parentage
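
To show how little data a Coefficient of Parentage (COP) calculation needs, here is a minimal sketch that works from a parent table alone. It rests on simplifying assumptions often made for self-pollinated crops, which are not stated in the talk: every line is fully inbred (so the COP of a line with itself is 1), founders are unrelated, and each parent contributes equally. The pedigree itself is hypothetical.

```python
from functools import lru_cache

# Hypothetical pedigree: line -> (parent 1, parent 2).
# Lines without an entry are treated as unrelated founders.
PEDIGREE = {
    "C": ("A", "B"),
    "D": ("A", "C"),
    "E": ("C", "D"),
}


@lru_cache(maxsize=None)
def depth(line):
    """Number of generations between a line and its most distant founder."""
    parents = PEDIGREE.get(line)
    if parents is None:
        return 0
    return 1 + max(depth(p) for p in parents)


@lru_cache(maxsize=None)
def cop(x, y):
    """Coefficient of parentage between lines x and y (assumptions above)."""
    if x == y:
        return 1.0  # assumption: fully inbred line
    # Always expand the deeper line, so a descendant is never
    # treated as if it were unrelated to its own ancestor.
    if depth(x) < depth(y):
        x, y = y, x
    parents = PEDIGREE.get(x)
    if parents is None:
        return 0.0  # two distinct founders: assumed unrelated
    p1, p2 = parents
    return 0.5 * (cop(p1, y) + cop(p2, y))


print(cop("C", "A"))  # cross A x B against its parent A
print(cop("D", "C"))
print(cop("E", "D"))
```

Real programs would also account for selection history and incomplete inbreeding, which this sketch deliberately ignores.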

Page 12

Genotypic data

Getting cheaper, but the difference between phenotyping and genotyping data costs is not as drastic as in e.g. cattle or trees

We need to manage quality genotypic data at scale

Sample generation e.g. seed chipping or similar

Sample tracking

Ship for genotyping & get data

Quality control

Analysis of data in time for selection decisions
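
The quality control step in the list above can be sketched as a simple marker filter. This is only an illustration under stated assumptions: markers are biallelic, calls are two-letter strings such as "AG" with `None` for a missing call, and the thresholds (80% call rate, 5% minor allele frequency) are arbitrary examples rather than CIMMYT practice. Marker names and calls are hypothetical.

```python
from collections import Counter

# Hypothetical genotype calls: marker -> one call per sample (None = missing).
MARKERS = {
    "m1": ["AA", "AG", "GG", "AA"],
    "m2": ["CC", "CC", None, "CC"],
    "m3": ["TT", None, None, "TC"],
}


def call_rate(calls):
    """Fraction of samples with a non-missing call."""
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_freq(calls):
    """Frequency of the rarer allele; 0.0 for monomorphic or empty markers."""
    alleles = [allele for c in calls if c is not None for allele in c]
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / len(alleles)


def passes_qc(calls, min_call_rate=0.8, min_maf=0.05):
    """Keep a marker only if it is well called and polymorphic enough."""
    return call_rate(calls) >= min_call_rate and minor_allele_freq(calls) >= min_maf


kept = [name for name, calls in MARKERS.items() if passes_qc(calls)]
print(kept)
```

Running such a filter automatically as data return from the genotyping provider is one way to keep the analysis in time for selection decisions.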

Page 13

Particular challenges with crop research information systems

Scientists unaware of cost of technological choices

Changes in breeding cost structure not reflected in budgets

IT staff don’t understand the biology of breeding (or biometrics)

IT staff mostly trained on administrative systems with defined workflows

Scientists change approach, workflow, data etc. frequently

Breeding is a large-numbers game, and automation is valuable

Research institutions hesitant to impose institutional standards or changes on research staff

Page 14

Conceptual data integration model

Page 15

Toolbox and collaborators

Page 16

Page 17

Georeferenced Passport and Climate Data

Page 18

Flapjack components

Overview map

Trait heat map

Zoom

QTL tracks

Status info

Genotype display

Window map

Page 19

Let’s start with a simple example

Page 20

Sorting by traits – habit and row type

Page 21

Sorting by similarity to line
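
The idea behind sorting by similarity to a line can be sketched in a few lines of code. This is not Flapjack's implementation; it is a simple assumed measure (the fraction of markers with identical calls, skipping markers missing in either line), and the line names and calls are hypothetical.

```python
# Hypothetical marker calls per line, in the same marker order (None = missing).
GENOTYPES = {
    "L1": ["A", "G", "T", "C"],
    "L2": ["A", "G", "T", "T"],
    "L3": ["G", "A", "C", "T"],
    "L4": ["A", "G", None, "C"],
}


def similarity(a, b):
    """Fraction of identical calls over markers scored in both lines."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def sort_by_similarity(genotypes, reference):
    """Line names ordered from most to least similar to the reference line."""
    ref_calls = genotypes[reference]
    return sorted(
        genotypes,
        key=lambda name: similarity(genotypes[name], ref_calls),
        reverse=True,
    )


ordered = sort_by_similarity(GENOTYPES, "L1")
print(ordered)
```

Restricting the marker list to those under a QTL before calling `similarity` would give a region-specific ordering instead of a genome-wide one.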

Page 22

Select markers under QTL

Page 23

Other similarity options

Page 24

Germplasm relatedness

Helium

Page 25

Some wheat-specific issues at CIMMYT

Heterogeneity of marker platforms depending on source of funding and which partners are involved

This imposes limitations on integration of genotypic data

Possibility of using genomic contigs/scaffolds as integration substrate for wheat and for comparative cereal genomics

Need for a simple interface into the wheat genome from marker-based maps, GWAS etc.

Generic map dressed with landscape of known important wheat genes

Heterogeneity of data sources and analysis tools.

Page 26

Plant Breeding API

Page 27

Some Conclusions

Page 28

Acknowledgments

Jens Riis‐Jacobsen, Jose Crossa, Juan Burgueño, Maria Tattaris, Kai Sonder, Sarah Hearne, Kate Dreher

Iain Milne, Gordon Stephen, Paul Shaw, Sebastian Raubach

IBP Team

DArT team

Breeding API group