Genomic Selection to breed for resistance to bacterial spot of tomato
TBRT 2016
Deb Liabeuf
The Ohio State University
Introduction
Bacterial spot
Xanthomonas
X. euvesicatoria
X. vesicatoria
X. perforans
X. gardneri
Introduction
Tomato species | Line | Resistance to | Gene/QTL
S. lycopersicum | Ha7998 | The 4 species | Rx3 on ch.5; QTL on ch.11
S. pimpinellifolium | PI 128216 | X. perforans | Rx4 on ch.11
S. lycopersicum var. cerasiforme | PI 114490 | The 4 species | QTLs on ch. 2, 3, 10, and 11
Combine resistance
Elite cultivar with all genes/QTLs of resistance
Marker assisted selection?
Only regions with major effects are taken into account
Introduction
Genomic selection (GS) (Heffner, Sorrells et al. 2009)
Predicts the performance of an individual based on genetic information across the whole genome
GEBV = Genomic Estimated Breeding Value
Train the model: y = Xβ + ε (y and X known, β estimated)
Use the model: y = Xβ + ε (X and β known, y predicted)
y = vector of phenotypic values [n,1]
X = marker matrix [n,m]
β = vector of marker effects [m,1]: gives a value to each marker representing its effect on the trait
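A minimal sketch of the two steps above, assuming the ridge-regression form of the model that appears later in the talk. The hat on β, the shrinkage parameter λ, and the matrix X_new of genotypes for unphenotyped individuals are notation introduced here for illustration, and y and the columns of X are assumed to be centered.

```latex
% Train: estimate the m marker effects from n phenotyped individuals
\hat{\beta} = \left( X^{\top} X + \lambda I_m \right)^{-1} X^{\top} y

% Use: predict breeding values from genotypes alone
\mathrm{GEBV} = X_{\mathrm{new}} \, \hat{\beta}
```

A larger λ shrinks all marker effects toward zero, which keeps the fit stable when markers outnumber phenotyped individuals (m > n).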
Objectives
To empirically compare phenotypic and genomic selection for bacterial spot resistance
Evaluate the effect on prediction accuracy of:
Modeling methods
Marker density and selection
Population
Introgress resistance in cultivated background
Crosses between resistant lines
[Figure: crossing matrix among lines A to F]
F1 crossed together
Segregating population
Workflow
Complex population: 1,110 individuals
Phenotypic selection: 109 individuals
Self pollination: 109 families (training population)
Testing populations: 51 inbred progenies (self pollination); 7 lines and hybrids (cross pollination)
Training population: phenotyping; genotyping; develop GS models (marker effects and GEBVs); cross validation
Testing populations: phenotypic evaluation; compare phenotypic values and GEBVs; empirical validation
Phenotyping
Field inoculated with X. euvesicatoria
Plots rated with quantitative scale 0 to 12 (Horsfall and Barratt 1945)
Experimental design = RCBD
Population | # of locs | # of blocks per loc
Training pop. | 1 | 2
Inbred progeny | 2 | 4
Lines and hybrids | 1 | 2
Phenotypic value = Best Linear Unbiased Predictors (BLUPs): corrected mean from a random model
Random models
For each location: $Y_{ij} = \mu + g_i + b_j + \varepsilon_{ij}$
Across locations: $Y_{ijk} = \mu + g_i + l_k + b_j(l_k) + l_k{:}g_i + \varepsilon_{ijk}$
Y = phenotypic value
$\mu$ = grand mean
$g_i$ = genotype effect
$b_j$ = block effect
$l_k$ = location effect
$\varepsilon$ = error
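To make "corrected mean from a random model" concrete, here is a minimal sketch for the single-location model above. It assumes a balanced RCBD with no missing plots, in which case the BLUP of a genotype is its deviation from the grand mean multiplied by a shrinkage factor σ²g / (σ²g + σ²e / r), with r blocks. The function name, the toy ratings, and the use of ANOVA (method of moments) variance estimates are assumptions made for illustration; the slides do not say how the variance components were estimated.

```python
def blups_rcbd(scores):
    """Genotype BLUPs for a balanced RCBD.

    scores[i][j] is the rating of genotype i in block j (no missing plots).
    Variance components come from ANOVA mean squares, an assumption of this
    sketch rather than the method used in the talk.
    """
    n_gen, n_blk = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_gen * n_blk)
    gen_means = [sum(row) / n_blk for row in scores]
    blk_means = [sum(row[j] for row in scores) / n_gen for j in range(n_blk)]

    # Mean squares of the two-way layout (genotype x block, one plot per cell).
    ms_gen = n_blk * sum((m - grand) ** 2 for m in gen_means) / (n_gen - 1)
    ss_err = sum(
        (scores[i][j] - gen_means[i] - blk_means[j] + grand) ** 2
        for i in range(n_gen)
        for j in range(n_blk)
    )
    ms_err = ss_err / ((n_gen - 1) * (n_blk - 1))

    # E[ms_gen] = r * var_g + var_e and E[ms_err] = var_e.
    var_g = max((ms_gen - ms_err) / n_blk, 0.0)
    shrink = var_g / (var_g + ms_err / n_blk) if var_g > 0 else 0.0

    # BLUP = shrunken deviation of each genotype mean from the grand mean.
    return [shrink * (m - grand) for m in gen_means], shrink


# Hypothetical ratings on the 0 to 12 scale: 4 genotypes x 2 blocks.
scores = [[3, 4], [6, 7], [9, 9], [5, 6]]
blups, shrink = blups_rcbd(scores)
for i, value in enumerate(blups):
    print(f"genotype {i}: BLUP = {value:+.3f}")
print(f"shrinkage factor = {shrink:.3f}")
```

Because the design is balanced, block effects cancel out of each genotype's deviation from the grand mean, so only the shrinkage factor separates a BLUP from a plain mean deviation.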
Genotyping
SolCAP Infinium Array: 397 SNP markers
Series based on prior knowledge and coverage (Sim et al. 2012)
[Figure: marker positions (cM) on chromosomes ch01 to ch12]
[Figure: Manhattan plot, -log10(p-value), chromosomes 1 to 12 (Sim et al. 2015)]
Modeling
Ridge Regression, random model: all markers as random effects
Ridge Regression, fixed model: markers associated with QTLs as fixed effects, other markers as random effects
rrBLUP package in R (Endelman 2011)
Bayesian LASSO (Least Absolute Shrinkage and Selection Operator)
BLR package in R (Pérez et al. 2010)
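One standard way to write the fixed-effect variant is through Henderson's mixed model equations, sketched below. The split of the marker matrix into QTL-associated columns X_Q (fixed effects β_Q) and remaining columns X_R (random effects β_R), the design matrix W, and the variance ratio λ are notation introduced here; the slides do not show how the model was actually specified in rrBLUP.

```latex
y = \mu \mathbf{1} + X_Q \beta_Q + X_R \beta_R + \varepsilon,
\qquad \beta_R \sim N(0, \sigma^2_{\beta} I),
\quad \varepsilon \sim N(0, \sigma^2_{\varepsilon} I)

% W = [ 1  X_Q ] collects the unpenalized (fixed) columns, b = (\mu, \beta_Q)
\begin{bmatrix}
  W^{\top} W   & W^{\top} X_R \\
  X_R^{\top} W & X_R^{\top} X_R + \lambda I
\end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{\beta}_R \end{bmatrix}
=
\begin{bmatrix} W^{\top} y \\ X_R^{\top} y \end{bmatrix},
\qquad \lambda = \sigma^2_{\varepsilon} / \sigma^2_{\beta}
```

Only the random block carries the penalty λ, so QTL-associated markers keep their full estimated effect while the remaining markers are shrunk toward zero.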
Modeling
• Leave-one-out cross validation
Training population
Train the model on 108 families
Obtain GEBV with only genotypic data for 1 family
Repeat 109 times
Marker effect: average across repetitions
Validation
Evaluation of the model accuracy: prediction accuracy
Prediction accuracy = correlation coefficient between phenotypic values and GEBVs
[Figure: two scatter plots of GEBVs against phenotypic values, one for a model with high prediction accuracy and one for a model with low prediction accuracy]
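The leave-one-out procedure and the accuracy measure can be sketched together as follows. This is an illustration under stated assumptions, not the rrBLUP workflow used in the talk: the ridge penalty `lam` is fixed by hand (packages such as rrBLUP estimate it from the data), the marker matrix and phenotypes are toy values, and all function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Return (phenotype mean, marker effects) from (X'X + lam*I) beta = X'(y - mean)."""
    m = len(X[0])
    mu = sum(y) / len(y)
    XtX = [[sum(row[a] * row[b] for row in X) + (lam if a == b else 0.0)
            for b in range(m)] for a in range(m)]
    Xty = [sum(row[a] * (yi - mu) for row, yi in zip(X, y)) for a in range(m)]
    return mu, solve(XtX, Xty)


def loocv_gebvs(X, y, lam):
    """Predict each family from a model trained on all the other families."""
    gebvs, effects = [], []
    for i in range(len(y)):
        mu, beta = ridge_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        # GEBV of the left-out family uses its genotype only.
        gebvs.append(mu + sum(x * b for x, b in zip(X[i], beta)))
        effects.append(beta)
    # Marker effect = average across repetitions.
    mean_effects = [sum(col) / len(col) for col in zip(*effects)]
    return gebvs, mean_effects


def pearson(a, b):
    """Correlation coefficient, used here as prediction accuracy."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((z - mb) ** 2 for z in b)
    return cov / (va * vb) ** 0.5


# Toy data: 8 families x 3 markers coded -1/0/1, hypothetical phenotypes.
X = [[1, 0, -1], [1, 1, 0], [0, -1, 1], [-1, 1, 1],
     [-1, -1, 0], [0, 1, -1], [1, -1, 1], [-1, 0, -1]]
y = [5.85, 5.40, 5.80, 3.70, 4.50, 4.35, 6.65, 3.75]

gebvs, mean_effects = loocv_gebvs(X, y, lam=1.0)
print("prediction accuracy:", round(pearson(y, gebvs), 3))
print("mean marker effects:", [round(e, 3) for e in mean_effects])
```

With 109 families the loop would run 109 times, each time training on the other 108, as on the slide.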
Cross validation
Prediction accuracy of GS models from the leave-one-out cross validation
Ridge regression, all markers as random effect
[Figure: prediction accuracy (-0.2 to 0.4) versus number of markers (0 to 400)]
Validations
[Figure: prediction accuracy (0 to 1) for cross-validation, progeny, and hybrid]
Selection made with Ridge Regression models with fixed effect compared to phenotypic selection
Inbred progeny
[Figure: phenotypic BLUPs of parents and inbred progeny for phenotypic selection, genomic selection, and the training pop.]
Conclusion
Even with a small training population and low marker density, GS accurately predicted resistance in progeny
In our population, predictions of resistance to bacterial spot of tomato are better when markers associated with QTLs are taken into account
Benefit of doing an association analysis on the training population before developing GS models!
References
Endelman, J. B. (2011). "Ridge regression and other kernels for genomic selection with R package rrBLUP." The Plant Genome 4(3): 250-255.
Heffner, E. L., et al. (2009). "Genomic Selection for Crop Improvement." Crop Sci. 49(1): 1-12.
Horsfall, J. G. and R. W. Barratt (1945). "An improved grading system for measuring plant diseases." Phytopathology 35: 656.
Pérez, P., et al. (2010). "Genomic-enabled prediction based on molecular markers and pedigree using the Bayesian linear regression package in R." The Plant Genome 3(2): 106.
Sim, S.-C., et al. (2012). "Development of a Large SNP Genotyping Array and Generation of High-Density Genetic Maps in Tomato." PLoS ONE 7(7): e40563.
Sim, S.-C., et al. (2015). "Association Analysis for Bacterial Spot Resistance in a Directionally Selected Complex Breeding Population of Tomato." Phytopathology 105(11): 1437-1445.
Thanks to…
The Francis lab: Dr David Francis, Bernard Eriku, Nicolas Lara, Eduardo Bernal, Eka Sari, Regis Carvalho, Troy Aldrich, JiHeun Cho
Others: Dr Antonio Cabrera, Ashley Markazi
Funding
the Ohio Department of Agriculture, specialty crop research program
The Ohio State University Research Enhancement Competitive Grant Program
Mid-America Food Processors Association
under award number 2014-67013-22410
Phenotyping
Inoculation and evaluation
Spray inoculation: X. euvesicatoria
Disease score: Horsfall-Barratt scale, examples at 0, 3, and 12 (R to S)
80% of fruit maturity, ready for harvest: maximal disease pressure
Phenotyping training pop.
Phenotypic value = Best Linear Unbiased Predictors (BLUPs): corrected mean from a random model
Random model used: $Y_{ij} = \mu + g_i + b_j + \varepsilon_{ij}$
$Y_{ij}$ = phenotypic value, $g_i$ = genotype effect, $b_j$ = block effect, $\varepsilon_{ij}$ = error
Genomic model
Comparison between p-value from association analysis (AA) and breeding value for each marker
Testing population
Prediction accuracy of GS models from test on the hybrids and lines derived from the training population
[Figure: correlation coefficient (-0.25 to 0.50) versus number of markers (0 to 400)]
[Figure: two scatter plots of GEBVs and phenotypic BLUPs for the lines and hybrids]
397 markers: y = 0.4553x - 0.0737, R² = 0.1601 (points labelled FG10_504, FG10_507, FG10_529, FG10_530, Fla8233, OH7663, OH8245, OH88119)
Markers associated with resistance: y = 3.3907x - 0.6337, R² = 0.4442 (points labelled FG10_504, FG14_507, FG10_529, FG10_530, Fla.8233, OH7663, OH8245, OH88119)
Testing population
Prediction accuracy of Ridge Regression with markers associated with QTLs as fixed effect
Inbred progeny
[Figure: prediction accuracy (0 to 1) at loc1, loc2, and across loc, phenotypic versus genotypic]
Selection made with Ridge Regression models with fixed effect compared to phenotypic selection
Inbred progeny
[Figure: phenotypic BLUPs at Across_locs, Loc1, and Loc2 for phenotypic selection and for genomic selection, with the training pop.]
rrBLUP random effect
Inbred progeny
Phenotypic prediction accuracy (rp): adj. R² between F3 and F4 phenotypic values
Genomic prediction accuracy (rg): adj. R² between GEBVs and F4 phenotypic values
Accuracy | Loc1 | Loc2 | Across Loc
rp | 0.54*** | 0.36** | 0.51***
rg full set | 0.44*** | 0.13 NS | 0.30*
rg QTL | 0.62*** | 0.41*** | 0.58***
rg/rp: prediction ability of genomic selection compared to phenotypic selection
Inbred progeny
[Figure: GS accuracy relative to phenotypic accuracy (-0.2 to 1.2) at Loc1, Loc2, and Across Loc]
Marker sets: 24 markers, 119 markers, 217 markers, 397 markers (full set)
1st set = associated with QTL
1st set = random
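The rg/rp ratio can be recomputed directly from the rp and rg values reported for the inbred progeny. The short sketch below does only that arithmetic; the dictionary layout and the function name are choices made here for illustration.

```python
# Accuracies reported for the inbred progeny (values from the slide table).
rp = {"Loc1": 0.54, "Loc2": 0.36, "Across Loc": 0.51}
rg = {
    "full set": {"Loc1": 0.44, "Loc2": 0.13, "Across Loc": 0.30},
    "QTL": {"Loc1": 0.62, "Loc2": 0.41, "Across Loc": 0.58},
}


def relative_accuracy(rg_by_loc, rp_by_loc):
    """Genomic accuracy divided by phenotypic accuracy, per location."""
    return {loc: rg_by_loc[loc] / rp_by_loc[loc] for loc in rp_by_loc}


ratios = {name: relative_accuracy(values, rp) for name, values in rg.items()}
for name, by_loc in ratios.items():
    for loc, ratio in by_loc.items():
        print(f"rg {name:8s} {loc:10s} rg/rp = {ratio:.2f}")
```

With these values the QTL-informed model gives ratios above 1 everywhere (about 1.15, 1.14, and 1.14), while the full marker set gives about 0.81, 0.36, and 0.59.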
Directional selection
[Figure: distribution of # of individuals by disease level, with the mean µ and σ on either side marked, and S individuals and R individuals in the tails]
12 resistant lines
15 susceptible lines