2015. Jason Wallace. Applying high throughput genomics to crops for the developing world

Applying High-Throughput Genomics to Crops for the Developing World Jason Wallace Cornell University

Upload: foodcrops

Post on 11-Jan-2017


TRANSCRIPT

Page 1: 2015. Jason Wallace. Applying high throughput genomics to crops for the developing world

Applying High-Throughput Genomics to Crops for the Developing World

Jason Wallace, Cornell University

Page 2

The big picture: Global food security

Photo credit: NASA

• Food security means reliable access to food of sufficient quality and quantity to lead an active and healthy life [1]

• 842 million people worldwide are food insecure [2]

• Increasing food security is one of the surest ways to improve health, educational attainment, and political stability

[1] Paraphrased from FAO, Declaration of the World Summit on Food Security, 2009

[2] FAO, The State of Food Insecurity in the World, 2013

Page 3

Major constraints on food security

Increasing population [1]

[Figure: Population (billions), 2010 to 2050, with "Today" marked; ~9 billion by 2050]

Environmental variability

[Figure: Projected surface temperature change [3]]

Negative side-effects

[Photos: Erosion; Pollution (NOAA); Deforestation (Rhett Butler)]

Changing consumption habits

[Figure: Consumption (billion tonnes/year) [2] by food group: cereals, vegetables, fruits, meat, dairy, fish, fat & oil]

[1] UN Department of Economic and Social Affairs, World Population Prospects: The 2012 Revision

[2] Kearney 2010, Phil Trans Roy Soc B 365

[3] NOAA GFDL Climate Research Highlights Image Gallery

Page 4

Reaching the goal

• Improved crops

• Government policies

• Agronomic practices

• Infrastructure development

• Technology development

• Agroecology

• Consumer habits

• Market incentives

Page 5

The golden age of crop genetics

• Modern sequencing is opening the floodgates to genetic analysis

[Figure: Sequencing trends over time, 2000 to 2015. Cost of sequencing (cost/megabase, log scale from $0.1 to $10K) [1] compared with Moore's Law; total plant genomes sequenced (0 to 60) [2]]

[1] Wetterstrand KA. DNA Sequencing Costs, available at: www.genome.gov

[2] Michael & Jackson 2013, The Plant Genome 6

Page 6

Case studies outline

• Barnyard Millet: Diversity Analysis

• Pearl Millet: Genetic Map Creation

• Maize: Trait Mapping

Photo credit: Shramajeevi Agri Films

Page 8

Case Study 1: Barnyard millet diversity

Photo: Barnyard Millet (Echinochloa spp.). Credit: Shramajeevi Agri Films

• Barnyard millet (Echinochloa spp.) is an important alternative crop in southern and eastern Asia

• Two species: E. colona (India) and E. crus-galli (Japan)

• Also grown as a forage crop in the US and Japan (“billion-dollar grass”)

• Goal: Characterize the newly created core collection at ICRISAT using genome-wide marker data

Page 9

Genotyping-by-sequencing (GBS)

• Created for high-throughput, semi-automated genotyping

Workflow: Sample plants -> Isolate DNA -> Restriction digest -> Ligate adaptors -> Pool & amplify -> Sequence

[Diagram labels: Genomic DNA; Sticky ends; Barcode; Sequencing adaptor. Images: Qiagen, Illumina, Elshire et al 2011, PLoS ONE]

• Advantages

  • One step SNP discovery + genotyping

  • Simple protocol; no reference required

  • Large numbers of SNPs found cheaply

  • Broadly applicable

• Drawbacks

  • False SNPs from sequencing errors

  • Missing data from stochastic sampling
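Because each sample carries its own barcode, reads from a pooled lane can be sorted back to their source plant in software. The following is a minimal sketch of that demultiplexing step, not the pipeline used in the talk. It assumes in-line barcodes at the start of each read followed by the cut-site remnant of ApeKI (CAGC or CTGC), the enzyme used in Elshire et al 2011; the talk does not state which enzyme was used, and the barcodes, sample names, and reads are hypothetical.

```python
# Assumed ApeKI remnant (recognition site G^CWGC); swap in the remnant
# for whichever enzyme was actually used.
CUT_SITE_REMNANTS = ("CAGC", "CTGC")


def demultiplex(reads, barcode_to_sample, remnants=CUT_SITE_REMNANTS):
    """Assign each read to a sample by its leading barcode.

    A read is kept only if the barcode is immediately followed by a
    cut-site remnant. Stored tags have the barcode trimmed off.
    """
    # Try longer barcodes first so a short barcode that happens to be a
    # prefix of a longer one does not claim the read.
    barcodes = sorted(barcode_to_sample, key=len, reverse=True)
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    for read in reads:
        for bc in barcodes:
            if read.startswith(bc) and read[len(bc):].startswith(remnants):
                by_sample[barcode_to_sample[bc]].append(read[len(bc):])
                break
        else:
            unassigned.append(read)
    return by_sample, unassigned


# Hypothetical barcodes and reads, for illustration only.
barcode_to_sample = {"ACGT": "plant_A", "TTGCA": "plant_B"}
reads = [
    "ACGTCAGCTTAGGA",  # plant_A barcode + CAGC remnant
    "TTGCACTGCGGATC",  # plant_B barcode + CTGC remnant
    "ACGTGGGGTTAGGA",  # known barcode but no cut-site remnant
    "GGGGCAGCTTAGGA",  # unknown barcode
]
by_sample, unassigned = demultiplex(reads, barcode_to_sample)
print(by_sample)
print(len(unassigned), "reads unassigned")
```

Requiring the remnant right after the barcode is one cheap guard against the sequencing-error drawback listed above, since a read whose barcode region is corrupted usually fails the check.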

Page 10

Cleaning up the data

• Have ~20,000 SNPs after basic filtering

• Problem: Both barnyard millet species are hexaploid -> false SNPs due to paralogs

• Filter by “heterozygosity”

[Figure: Site Frequency Spectrum (raw) and Site Frequency Spectrum (filtered). Relative abundance vs. minor allele frequency for the combined population, E. colona, and E. crus-galli. Annotations: Ideal; Paralogs; Differentially segregating alleles]

Wallace et al. 2015, Plant Genome (in press)
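The filtering idea can be sketched in a few lines. In a largely inbred collection a genuine SNP should rarely be called heterozygous, whereas reads from paralogous copies collapsed onto one site tend to make most lines look heterozygous. The sketch below assumes genotypes coded as 0, 1, 2 (count of the alternate allele) with None for missing calls; the 0.1 cutoff, the bin count, and the SNP names are hypothetical choices, not values from the talk.

```python
def site_stats(calls):
    """Return (minor allele frequency, heterozygote fraction) for one SNP."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return None, None
    alt_freq = sum(observed) / (2 * len(observed))
    maf = min(alt_freq, 1 - alt_freq)
    het = sum(1 for g in observed if g == 1) / len(observed)
    return maf, het


def filter_by_heterozygosity(snps, max_het=0.1):
    """Keep SNPs whose heterozygote fraction is at most max_het (assumed cutoff)."""
    kept = {}
    for name, calls in snps.items():
        _, het = site_stats(calls)
        if het is not None and het <= max_het:
            kept[name] = calls
    return kept


def site_frequency_spectrum(snps, n_bins=5):
    """Relative abundance of SNPs per minor-allele-frequency bin over [0, 0.5]."""
    counts = [0] * n_bins
    for calls in snps.values():
        maf, _ = site_stats(calls)
        if maf is None:
            continue
        counts[min(int(maf / 0.5 * n_bins), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Hypothetical genotype calls for eight lines.
snps = {
    "snp_true": [0, 0, 2, 2, 0, None, 2, 0],  # homozygous calls only
    "snp_paralog": [1, 1, 1, 1, 1, 1, 0, 1],  # nearly all heterozygous
    "snp_rare": [0, 0, 0, 0, 0, 0, 0, 2],     # rare but homozygous
}
kept = filter_by_heterozygosity(snps)
sfs = site_frequency_spectrum(kept)
print(sorted(kept), sfs)
```

Comparing `site_frequency_spectrum(snps)` with `site_frequency_spectrum(kept)` gives a toy version of the raw and filtered spectra.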

Page 11

Phylogenetics

• Phylogeny splits the two species as expected

• Population structure within species closely matches phylogeny and geography

[Figure: Phylogeny with clades labeled E. colona and E. crus-galli; potential hybrids marked]

Wallace et al. 2015, Plant Genome (in press)
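Distance-based trees and population-structure plots both start from pairwise genetic distances between samples. A minimal sketch of one such measure (an allele-sharing distance) follows, assuming genotypes coded 0, 1, 2 with None for missing data; the sample names and calls are hypothetical, and the tree-building method actually used for the figure is not stated on the slide.

```python
def allele_sharing_distance(a, b):
    """Mean per-site distance between two samples, from 0 (identical) to 1.

    Only sites called in both samples are compared.
    """
    diffs = [abs(x - y) / 2 for x, y in zip(a, b)
             if x is not None and y is not None]
    return sum(diffs) / len(diffs) if diffs else float("nan")


def distance_matrix(samples):
    """Return {(name1, name2): distance} for every ordered pair of samples."""
    names = sorted(samples)
    return {(p, q): allele_sharing_distance(samples[p], samples[q])
            for p in names for q in names}


# Hypothetical calls at six SNPs.
samples = {
    "colona_1": [0, 0, 2, 2, 0, 0],
    "colona_2": [0, 0, 2, 2, 0, 2],
    "crusgalli_1": [2, 2, 0, 0, 2, 0],
    "crusgalli_2": [2, 2, 0, None, 2, 0],
}
d = distance_matrix(samples)
print(d[("colona_1", "colona_2")], d[("colona_1", "crusgalli_1")])
```

In this toy data the within-species distances are much smaller than the between-species distance, which is the pattern a tree would show as two separate clades.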

Page 12

Outline

• Barnyard Millet: Diversity Analysis

• Pearl Millet: Genetic Map Creation

• Maize: Trait Mapping

Photo credit: Shramajeevi Agri Films

Page 13

Genetic Maps for Pearl Millet

• Staple crop for India and Sub-Saharan Africa

• Large (2.3 Gb), diverse genome

• Reference genome in progress

Photo: Pearl Millet (Pennisetum glaucum)

• Goal: Assemble genetic maps to anchor scaffolds into pseudochromosomes

Page 14

Mapping Populations

• 3 biparental populations used for genetic mapping:

  • 841 x 863 (“Patancheru”)

    • ~100 RILs from ICRISAT-Patancheru

  • Tift 99B x Tift 454 (“Tifton”)

    • ~180 RILs from Som Punnuri, Ft. Valley State University, USA

  • Wild x Domestic F2s (“Sadore”)

    • ~300 F2 plants from Boubacar Kountche, ICRISAT-Niamey

Page 15

Summary statistics

[Figure: Comparison of Genotyping Depths. Number of genotypes (log scale, 10^0 to 10^8) vs. call depth (= # reads) for Tifton, Patancheru, and Sadore. Annotation: Best read depth]

[Figure: SNP counts (axis 0 to 80k). Values shown: 48k, 75k, 76k. Annotation: Fewer SNPs = less diversity]

Pages 16-20

Making individual maps

1. Call SNPs

2. Group via hierarchical clustering

3. Merge linkage groups

4. Order markers

5. Cleanup
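The grouping and ordering steps can be illustrated with a toy example. This is a sketch under stated assumptions, not the software used for the pearl millet maps: markers are scored per RIL as parental allele "A" or "B" ("-" for missing), the distance between two markers is the fraction of lines where they disagree, grouping is single-linkage clustering with a hypothetical 0.2 cutoff, and ordering is a greedy nearest-neighbour walk from an arbitrary starting marker.

```python
def mismatch_fraction(m1, m2):
    """Fraction of lines in which two markers carry different parental alleles."""
    pairs = [(a, b) for a, b in zip(m1, m2) if a != "-" and b != "-"]
    if not pairs:
        return 0.5  # no shared data: treat the markers as unlinked
    return sum(a != b for a, b in pairs) / len(pairs)


def linkage_groups(markers, max_dist=0.2):
    """Single-linkage clustering: merge two groups whenever any marker pair
    spanning them is within max_dist (an assumed cutoff)."""
    groups = [[name] for name in sorted(markers)]
    merged = True
    while merged:
        merged = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                if any(mismatch_fraction(markers[a], markers[b]) <= max_dist
                       for a in groups[i] for b in groups[j]):
                    groups[i].extend(groups.pop(j))
                    merged = True
                    break
            if merged:
                break
    return [sorted(g) for g in groups]


def order_markers(group, markers):
    """Greedy ordering: repeatedly append the marker closest to the last one."""
    remaining = list(group)
    order = [remaining.pop(0)]
    while remaining:
        last = order[-1]
        nxt = min(remaining,
                  key=lambda m: mismatch_fraction(markers[last], markers[m]))
        remaining.remove(nxt)
        order.append(nxt)
    return order


# Hypothetical scores for five markers across eight RILs.
markers = {
    "snp_a": "AAAABBBB",
    "snp_b": "AABBBBBB",
    "snp_c": "AAABBBBB",
    "snp_d": "ABABABAB",
    "snp_e": "ABABABBB",
}
groups = linkage_groups(markers)
order = order_markers(groups[0], markers)
print(groups, order)
```

A greedy walk can start in the middle of a linkage group and give a poor order, and F2 data such as the Sadore population would also need heterozygous calls handled, so real map construction needs more careful ordering and the cleanup step listed above.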

Page 21

Merge maps for final assembly

• 4824 contigs assembled into 1.68 Gb reference

• 92.8% of sequence data

• 60% have putative orientations

• Not perfect, but pretty good

Page 22

Outline

• Barnyard Millet: Diversity Analysis

• Pearl Millet: Genetic Map Creation

• Maize: Trait Mapping

Photo credit: Shramajeevi Agri Films

Page 23

Case Study 3: Trait Mapping in the CIMMYT WEMA Populations

• WEMA = Water-Efficient Maize for Africa

• ~20 biparental families, ~200 lines each

• Goal: Use data from across families to map trait loci with high resolution

[Figure: 3D PCA plot of the WEMA families (axes PC1, PC2, PC3)]

Page 24

Trait mapping

• Two approaches to mapping traits in WEMA

[Diagram labels: Env 1, Env 2, Env 3, Env 4; Traditional Joint GWAS; Bayesian GWAS; merge; Unified Posterior Probabilities]
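One simple way to merge per-environment Bayesian evidence into a single posterior probability is sketched below. It assumes the evidence from each environment is independent, so log Bayes factors add, and it uses a hypothetical prior probability of association; the slides do not spell out the merging rule actually used, so treat this as an illustration of the arithmetic only.

```python
def cumulative_log10_bf(log10_bfs):
    """Sum per-environment log10 Bayes factors (assumes independent evidence)."""
    return sum(log10_bfs)


def posterior_probability(log10_bf, prior=0.01):
    """Posterior probability of association given a log10 Bayes factor.

    posterior odds = prior odds * Bayes factor
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 10 ** log10_bf
    return posterior_odds / (1 + posterior_odds)


# Hypothetical log10 Bayes factors for one marker in Env 1 to Env 4.
per_env = [0.5, 1.0, 0.3, 0.2]
total = cumulative_log10_bf(per_env)
posterior = posterior_probability(total, prior=0.01)
print(total, round(posterior, 3))
```

With these toy numbers no single environment is convincing on its own, but the combined Bayes factor of about 100 lifts a 1% prior to roughly even odds.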

Page 25

Both methods get similar results

[Figure panels: Traditional GWAS (-log10 p-value); Bayesian GWAS (cumulative Bayes factor)]

• Mappings in both methods are roughly equivalent

Page 26

Preliminary trait-mapping results

[Figure: Association for Days to Anthesis (well-watered) on Chromosome 8. -log10 p-value vs. position (0 to 150 Mb). Peaks labeled ZCN8, VGT1, ZmRAP2.7, and GIGZ1A?; other peaks marked "?"]

Page 27

Conclusions

Photo credit: NASA

• Genomic technology can rapidly characterize almost any crop

• These genetic tools help breed crops faster and better

• Genotyping is basically solved; the bottlenecks are now phenotyping and selection

Page 28

Future Need 1: High-throughput phenotyping

• Genotyping is frequently cheaper than dirt (field space)

• Phenotyping is now the limiting factor

[Photos: Manual recording; Rapid phenotyping; High-throughput phenotyping. Credits: CIMMYT & Michael Gore]

Page 29

Future Need 2: Data infrastructure

• Both genotyping and phenotyping threaten to drown us in data

• Data is only useful if it is usable

• Need to develop solutions so genotypes, phenotypes, and germplasm are integrated and linked

[Image: server farm. Credit: Torkild Retvedt]

Page 30

Future Need 3: Faster breeding methods

Genomic Selection scheme

[Diagram labels: Make crosses; Genotype; Phenotype; (Re)train model; Predict via model; Standard breeding cycle; Selection cycle (faster, less expensive); Training cycle (slower, expensive)]

Model: $y_i = \mu + \sum_{j=1}^{m} z_{ij} u_j \delta_j + e_i$
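The train-then-predict loop can be sketched with a small example. The model on this slide is usually read as phenotype = overall mean + sum of marker effects + residual, with an indicator that switches individual markers in or out. The sketch below swaps in plain ridge regression, which keeps every marker and shrinks all effects equally, because it fits in a few lines; the penalty value, genotypes, and phenotypes are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def train_ridge(z, y, lam=1.0):
    """Training cycle: estimate the mean and marker effects from
    genotyped and phenotyped lines."""
    n, m = len(z), len(z[0])
    z_mean = [sum(row[j] for row in z) / n for j in range(m)]
    y_mean = sum(y) / n
    zc = [[row[j] - z_mean[j] for j in range(m)] for row in z]
    yc = [v - y_mean for v in y]
    # Normal equations: (Zc'Zc + lam * I) u = Zc'yc
    lhs = [[sum(zc[i][j] * zc[i][k] for i in range(n)) + (lam if j == k else 0.0)
            for k in range(m)] for j in range(m)]
    rhs = [sum(zc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    u = solve(lhs, rhs)
    mu = y_mean - sum(z_mean[j] * u[j] for j in range(m))
    return mu, u


def predict(mu, u, genotype):
    """Selection cycle: score a genotyped but unphenotyped line."""
    return mu + sum(g * effect for g, effect in zip(genotype, u))


# Hypothetical training set: four lines, two markers.
z = [[1, 0], [0, 1], [1, 1], [0, 0]]
y = [12.0, 9.0, 11.0, 10.0]
mu, u = train_ridge(z, y, lam=1.0)
print(mu, u, predict(mu, u, [1, 0]))
```

Only `train_ridge` needs phenotypes, which is why the training cycle is the slow, expensive one, while `predict` needs genotypes alone.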

Page 31

Acknowledgements

The Buckler Lab

Collaborators

• C. Tom Hash (ICRISAT-Niamey)

• Boubacar Kountche (ICRISAT-Niamey)

• Som Punnuri (Fort Valley State University)

• Hari Upadhyaya (ICRISAT-Patancheru)

• Rajeev Varshney (ICRISAT-Patancheru)

• Xin Liu (BGI)

• Xuecai Zhang (CIMMYT-Mexico)

• The Institute for Genomic Diversity (Cornell)

• The Maize Diversity Project

• The Pearl Millet Genome Sequencing Consortium

Funding

• National Science Foundation (NSF)

  • Plant Genome Research Program

  • Basic Research to Enable Agricultural Development (BREAD)

• The International Crops Research Institute for the Semi-Arid Tropics (ICRISAT)

• The International Maize and Wheat Improvement Center (CIMMYT)

• The United States Agency for International Development (USAID)

• The United States Department of Agriculture Agricultural Research Service (USDA-ARS)