High-throughput SNP genotyping
for rice improvement
Dr. Michael J. Thomson
Senior Scientist, Molecular Genetics and Marker Applications,
Head, Genotyping Services Lab (GSL)
Plant Breeding, Genetics and Biotechnology Division
International Rice Research Institute, Philippines
5th International Conference on Next Generation Genomics and
Integrated Breeding for Crop Improvement
ICRISAT, India 18 February 2015
Genetic Resources
Plant Genomics
Plant Breeding
Marker development in rice
Diverse rice germplasm
“allele pool”
Beneficial alleles
Fine-mapping, candidate gene analysis, cloning
Association mapping
Gene and QTL mapping
Flanking and gene-based markers
for molecular breeding
Genes controlling traits of interest
IRRI BREEDING PIPELINES
IRRIGATED RICE SOUTH ASIA
JAPONICA RICE
RAINFED RICE SOUTH-EAST ASIA
RAINFED RICE SOUTH ASIA
RAINFED / IRRIGATED RICE AFRICA
HYBRID RICE
IRRIGATED RICE SOUTH-EAST ASIA
GREEN SUPER RICE
HEAT TOLERANCE
GRAIN QUALITY
SALINITY TOLERANCE
P/N USE EFFICIENCY
DROUGHT TOLERANCE
ANAEROBIC GERMINATION
YIELD POTENTIAL
COLD TOLERANCE
PHOTO-INSENSITIVITY
FLOOD TOLERANCE
DISEASE RESISTANCE
BIOFORTIFICATION (FE, ZN)
TRAIT DEVELOPMENT PIPELINES
GENE DISCOVERY
MARKER DEVELOPMENT
MOLECULAR BREEDING
QTL FINE-MAPPING
IRRI’s Plant Breeding and Trait Development
High-throughput SNP genotyping
Single nucleotide polymorphism (SNP) - a site in the DNA where individuals differ at a single base
SNPs
Haplotypes
Tag SNPs
Individual 1
Individual 2
Individual 3
Individual 4
SNP technology can accelerate genetic
mapping and allele mining
• Millions of SNP loci across the genome
• Most SNP markers are bi-allelic
• SNP data can be easily merged in a database
• Rapid high-throughput SNP genotyping systems are available
• SNP haplotypes can track specific alleles
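As a minimal sketch of the ideas on this slide, the following Python snippet finds the SNP sites in a set of aligned sequences and reads off each individual's haplotype across those sites. The sequences are made-up toy data, and the function names `find_snps` and `haplotypes` are hypothetical helpers for illustration, not part of any tool mentioned in the talk.

```python
def find_snps(sequences):
    """Return 0-based positions where the aligned sequences differ."""
    seqs = list(sequences.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def haplotypes(sequences, snp_positions):
    """Concatenate each individual's alleles at the SNP positions."""
    return {
        name: "".join(seq[i] for i in snp_positions)
        for name, seq in sequences.items()
    }


# Toy aligned sequences (invented for illustration only).
individuals = {
    "Individual 1": "ACGTTAGCCA",
    "Individual 2": "ACGTTAGCCA",
    "Individual 3": "ACATTAGTCA",
    "Individual 4": "ACATTCGTCA",
}

snps = find_snps(individuals)
haps = haplotypes(individuals, snps)
for name, hap in haps.items():
    print(name, hap)
```

In this toy data the first and third SNP sites always change together, so genotyping one of them would be enough to predict the other, which is the idea behind a tag SNP.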
Single nucleotide polymorphism
(SNP) marker – Infinium platform
Resources for SNP development in rice
SNP discovery pools:
• OryzaSNP resequencing (McNally et al. 2009; 160k SNPs)
• Genebank sequencing (3,000 lines: IRRI/CAAS/BGI)
• Rice SNP Consortium (NGS on 125 lines: 16 million SNPs)
High-resolution genome-wide genotyping:
• 44k SNP chip (Zhao et al. 2011; Cornell University)
• 700K SNP chip (Cornell University and IRRI)
High sample throughput genotyping:
• 6K SNP chips (Infinium, several developed)
• 2,015 KASPar assays (GCP/KBiosciences)
• Custom 384-SNP sets (IRRI, Cornell, others)
• Uses: QTL mapping, genetic diversity analysis, DNA fingerprinting
Trait-based diagnostic SNP markers for breeding:
• Published genes, QTLs, and SNP info
• Use: MAS
3,000 rice genomes
14x coverage
17 Tb of sequence reads
18.9M SNPs
IRRI CAAS
Raw data is available from GigaDb and SRA
N. Alexandrov, K. McNally, R. Mauleon, R. Hamilton, S. Tai, W. Wang, G. Zhang, Z. Li, and others
SNP-Seek database on the IRIC website:
http://oryzasnp.org/iric-portal/
International Rice Informatics Consortium
Querying SNP-Seek for sd1 LOC_Os01g66100
77 SNPs in 2235 lines
K. McNally
Initiative under the Global Rice
Science Partnership (GRiSP)
Product 2.1.3 “High-throughput SNP genotyping platform for breeding applications”
• Set up facilities for high-throughput SNP genotyping
• Develop trait-based SNP markers for breeding
• Implement a SNP fingerprinting platform
http://gsl.irri.org
• Core facility at IRRI providing high-throughput SNP genotyping services
• Serving IRRI and our regional partners
• >20,000 samples processed in 2014 providing 32 million SNP data points
Optimizing the DNA extraction
and genotyping workflow
• Leaf sampling: PlantTrak sampling into 96-well format
• DNA extraction: automated magnetic bead system
• QC check: check DNA quality and concentration
• SNP genotyping: custom Fluidigm and Infinium sets of 24 to 4,600 SNPs (Fluidigm: 24 or 96 SNPs; Infinium 6K chip)
• Data storage & analysis: SNP database/tools
Supported by the “Transforming Rice Breeding” project
Automated leaf sampling with PlantTrak
Brooks Automation
• Tissue sampling in the field using the PlantTrak Hx unit
• Barcode scanner/leaf puncher
• 96-sample plastic magazines
Automated DNA extractions on oKtopure
• Low-cost minipreps and high-quality DNA
• Processes 8 x 96-well plates in 1.5 hrs
Rapid plant growth for molecular breeding
J. Chin and M. Thomson
Improved donor development
• Set of RILs, NILs, designed populations, and improved varieties with QTL/gene profiles
• MABC and QTL pyramiding: multiple-trait packaging lines with 3-, 4-, and 5-QTL combinations in progress
Genotyping platforms for breeding applications
Platforms:
• GoldenGate: 384 SNPs
• Fluidigm: 24 and 96 SNPs
• Infinium 6K SNP chip: 4,600 SNPs
• GBS: 10k-40k SNPs
• Fragment Analyzer: SSRs/indels
Genetics and breeding applications: trait-based MAS, fine-mapping, rapid QC scans, background MABC, diversity analysis, QTL mapping, SNP fingerprinting, high-resolution mapping, GEBV selection
[Figure: SNP marker positions along Chr. 5, Chr. 8, Chr. 9, and Chr. 11 around the Xa5, Xa13, Sub1, and Xa21 regions, respectively]
Gene-based and flanking SNPs across Sub1 and Xa regions for
precise MABC selection
C. Vera Cruz, B. Collard, J. Chin
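One way to picture how gene-based and flanking SNPs support precise MABC selection is the small Python sketch below. It keeps backcross plants that carry the donor allele at the gene-based SNP and ranks them by how many flanking SNPs have already returned to the recurrent parent genotype. The genotype codes ("A", "H", "B"), the marker names, the plant names, and the function `rank_backcross_plants` are all assumptions made for illustration; they are not taken from the talk or from any GSL tool.

```python
# Assumed genotype codes: "A" = homozygous recurrent parent,
# "H" = heterozygous, "B" = homozygous donor.

def rank_backcross_plants(calls, target_snp, flanking_snps):
    """Keep plants carrying the donor allele at the target SNP and rank
    them by the number of flanking SNPs fixed for the recurrent parent."""
    ranked = []
    for plant, genotype in calls.items():
        if genotype[target_snp] not in ("H", "B"):
            continue  # donor allele absent at the gene-based SNP
        clean_flanks = sum(genotype[snp] == "A" for snp in flanking_snps)
        ranked.append((plant, clean_flanks))
    # Most recurrent-parent flanks first; ties broken by plant name.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Toy genotype calls (invented for illustration only).
calls = {
    "plant_01": {"flank_left": "H", "gene_snp": "H", "flank_right": "H"},
    "plant_02": {"flank_left": "A", "gene_snp": "H", "flank_right": "H"},
    "plant_03": {"flank_left": "A", "gene_snp": "A", "flank_right": "A"},
    "plant_04": {"flank_left": "A", "gene_snp": "H", "flank_right": "A"},
}

ranked = rank_backcross_plants(calls, "gene_snp", ["flank_left", "flank_right"])
for plant, clean_flanks in ranked:
    print(plant, clean_flanks)
```

Here plant_04 ranks first because it keeps the donor allele at the gene while both flanking SNPs are already recurrent-parent type, which is the kind of recombinant that limits linkage drag.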
Validating trait-based SNPs for breeding
Validate trait-specific SNPs predictive for desired alleles needed for breeding programs
• Verify published functional SNPs
• Test and optimize functional SNPs on Fluidigm system
• Develop “breeders’ chips” with custom SNP marker packages for target traits
[Figure: xa13 deletion by base position in Adday sel, Makassane, NSIC Rc 222, Swarna-Sub1, Tong 88-7, NSIC Rc 238, MS 11, KHO, Dasanbyeo, Milyang 23, Tongil, TR 22183, and N 22]
Genetic variation within xa13 region
M. Dwiyanti, unpublished data
Trait-specific SNP markers for breeding
• Fluidigm Dynamic Arrays use nanoliter reactions for flexible SNP genotyping at a low cost per sample: 24 SNPs x 192 samples (4,608 reactions)
• Validated functional and gene-based SNPs for trait-specific markers for breeding
[Figure labels: Xa21 resistant allele, Xa21 susceptible allele]
[Figure: dendrogram of rice varieties and breeding lines (scale bar 0.1), grouped as aus, indica, temperate japonica, tropical japonica, and aromatic]
Infinium 6K chip for SNP fingerprinting and GEBVs
• 6K developed at Cornell Univ. (S. McCouch)
• 5,274 SNP loci on 6K chip
• ~4,500 high-quality SNPs
• 4,400 samples run to date
• 6K data will be used for genome-wide prediction
Over 6,000 samples run on the 6K SNP chip in 2014 providing 27 million marker data points
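The 27 million figure is consistent with the chip statistics on this slide. The short Python check below multiplies the two numbers as a rough sanity check, treating "over 6,000 samples" as a round 6,000 and "~4,500 high-quality SNPs" as exactly 4,500 per sample. Both roundings are assumptions, so the result is an approximation rather than an exact count.

```python
samples_2014 = 6_000        # "over 6,000 samples", rounded down (assumption)
high_quality_snps = 4_500   # "~4,500 high-quality SNPs" per sample (assumption)

data_points = samples_2014 * high_quality_snps
print(f"{data_points:,} marker data points")
```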
Genotyping by sequencing (GBS) for GWAS on rice bacterial blight
• GBS can provide low-cost high-resolution SNP scans by multiplex sequencing
• 96-plex GBS using ApeKI per HiSeq lane (Cornell University; Elshire et al. 2011)
• Testing GBS for GWAS for resistance to BLB disease
• 285 diverse rice accessions with 9 Xoo isolates
40K SNPs from GBS used for GWAS
• Known Xa genes (Xa4, Xa5, Xa21) clearly detected in diverse panel
• Several potentially novel loci also found
• GBS works well, but requires significant data analysis
C. Dilla-Ermita et al. (unpublished)
[Figure labels: Xa4, xa5]
Integrating markers into breeding programs
MABC and QTL pyramiding
• MABC lines: IR64-Sub1, IR64-Pup1, IR64-AG1, IR64-heat, IR64-Saltol, IR64-DTY
• QTL pyramid lines: IR64-Pup1-DTY-heat, IR64-Sub1-DTY-heat, IR64-Sub1-AG-Saltol
MAS in the pedigree breeding programs
• Varieties for different target regions
• Inputs: genes/QTLs, causal variants
• F2, F3, F4, F5 generations with MAS, followed by yield trials and new released varieties
Genome-wide prediction tools
• Training population: genotyping/phenotyping, train GS model
• Breeding population (F2 to F5): genotyping, GEBV, selections
• Testing and release, new released varieties
Thomson, 2014 (Plant Breeding Biotech. 2:195)
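The genome-wide prediction track (train a GS model on a genotyped and phenotyped training population, then compute GEBVs for genotyped breeding lines) can be sketched as follows. This is a minimal illustration using ridge regression on marker effects, a common choice for genomic selection, but the talk does not say which model IRRI uses. The marker coding (-1, 0, 1), the shrinkage value `lam`, the toy data, and all function names are assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def train_gs_model(genotypes, phenotypes, lam=1.0):
    """Estimate marker effects by ridge regression on centered phenotypes."""
    n_markers = len(genotypes[0])
    mean = sum(phenotypes) / len(phenotypes)
    y = [p - mean for p in phenotypes]
    # Normal equations: (Z'Z + lam * I) beta = Z'y
    ztz = [[sum(row[i] * row[j] for row in genotypes) + (lam if i == j else 0.0)
            for j in range(n_markers)] for i in range(n_markers)]
    zty = [sum(row[i] * yi for row, yi in zip(genotypes, y))
           for i in range(n_markers)]
    return mean, solve(ztz, zty)


def gebv(genotypes, mean, effects):
    """Predicted breeding value = mean + sum of marker effects."""
    return [mean + sum(g * e for g, e in zip(row, effects)) for row in genotypes]


# Toy training population: 6 lines x 3 SNPs, coded -1/0/1 (invented data).
train_geno = [[1, 1, -1], [1, -1, 1], [1, 0, 0],
              [-1, 1, 1], [-1, -1, -1], [-1, 0, 0]]
train_pheno = [6.1, 5.9, 6.0, 4.0, 4.1, 3.9]

mean, effects = train_gs_model(train_geno, train_pheno, lam=1.0)

# Breeding population: genotyped only, ranked by GEBV for selection.
candidates = {"line_A": [1, 1, 1], "line_B": [-1, 0, 1], "line_C": [1, -1, -1]}
scores = dict(zip(candidates, gebv(list(candidates.values()), mean, effects)))
selected = sorted(scores, key=scores.get, reverse=True)
print(selected)
```

With real chip data the number of markers far exceeds the number of lines, which is why a shrinkage term such as `lam` is needed; in practice it would be tuned or estimated rather than fixed at 1.0.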
Breeding data management system
Breeding4Rice
• Web-based Information Management System
• Facilitates Study Management
• Use of Data for Decision Making
• Interoperable with other systems such as Analytical Services
Breeding Information Management, IRRI. 2014
Genomic & Open-source Breeding Informatics Initiative (GOBII)
3 CG Centers:
Ithaca hub:
5 crop species: Maize, wheat, sorghum, chickpea, rice
• Susan McCouch
• Ed Buckler
• Jean-Luc Jannink
• Mark Sorrells
• Qi Sun
• Michael Olsen
• Rajeev Varshney
• Michael Thomson
• Lukas Mueller
GOBII APIs and user platforms
• GOBII high-density genomic analysis: indexing, imputation, IBD calculation, prediction, selection, mating design, QTL discovery, segment tracking
• APIs link GOBII to the user platforms
• IRRI-B4R, KDDArT, and IBP-BMS: pedigree records, seed inventory, experimental design, data collection, user interface, data/analysis, selection, mating design, MAS, breeding scheme
• DivSeek: pedigree records, seed inventory, experimental design, data collection, user interface, data/analysis, selection; genetic, trait, and geographic info
Acknowledgments
Genotyping Services Lab (GSL) team at IRRI
• Workflow optimization: Ma. Ymber Reveche, Socorro Carandang, Annalhea Jarana, Grace Cariño
• Marker validation: Maria S. Dwiyanti, C. Jade Dilla-Ermita, Erwin Tandayu, Crisostomo Dizon
• GSL services: Nadia Vieira Castañeda, Geraldine Malitic-Layaoen, Geisha Sanchez, Venice Juanillas, Krizzel Llantada
IRRI Scientists: J.H. Chin, R. Mauleon, E. Septiningsih, B. Collard, B. Zhou, C. Vera Cruz, K. McNally, E. Nissila
Cornell University: Susan McCouch, Mark Wright
Funding: Japan Breeding project, GRiSP program, Syngenta SKEP