Marker-Assisted Whole Genome Selection in Crop Improvement
DESCRIPTION
Mapping and tagging of agriculturally important genes have been greatly facilitated by an array of molecular markers in crop plants. Marker-assisted selection (MAS) is gaining considerable importance as it would improve the efficiency of plant breeding through precise transfer of genomic regions of interest (foreground selection) and accelerating the recovery of the recurrent parent genome (background selection). MAS has been more widely employed for simply inherited traits than for polygenic traits, although there are a few success stories in improving quantitative traits through MAS.
TRANSCRIPT
Subramanian. S.
Prehistoric selection for visible phenotypes that facilitated harvest and increased productivity led to the domestication of the first crop varieties
(Harlan, 1992)
Conventional plant breeding is primarily based on phenotypic selection of superior individuals among segregating progenies resulting from hybridization
Difficulties are often encountered during this process, primarily due to genotype – environment interactions
Testing procedures are often difficult, unreliable or expensive due to the nature of the target traits (e.g. abiotic stresses) or the target environment
(Babu, 2004)
A process whereby a marker (morphological, biochemical or one based on DNA/RNA variation) is used for indirect selection of a genetic determinant
Used in plant and animal breeding
Exploits the genetic linkage between markers and important crop traits (Edwards et al., 1987; Paterson et al., 1988)
MAS can be useful for traits that are difficult to measure, exhibit low heritability, and/or are expressed late in development
Sax (1923) reported association of a simply inherited genetic marker with a quantitative trait in plants
Marker Assisted Selection (MAS)
Fig. Segregation of seed size associated with segregation for a seed coat colour marker
Morphological - Presence or absence of awn, leaf sheath coloration, height, grain colour, aroma of rice etc.
Biochemical- A gene that encodes a protein that can be extracted and observed; for example, isozymes and storage proteins.
Cytological - The chromosomal banding produced by different stains; for example, G banding.
Biological - Different pathogen races or insect biotypes based on host pathogen or host parasite interaction can be used as a marker
Genetic or molecular - A unique (DNA sequence), occurring in proximity to the gene or locus of interest, can be identified by a range of molecular techniques
Easy recognition of all possible phenotypes (homo- and heterozygotes) from all different alleles
Demonstrates measurable differences in expression between trait types and/or gene of interest alleles, early in the development of the organism
Has no effect on the trait of interest that varies depending on the allele at the marker loci
Low or null interaction among the markers allowing the use of many at the same time in a segregating population
Abundant in number and polymorphic
IMPORTANT PROPERTIES OF IDEAL MARKERS FOR MAS
(Babu et al., 2004)
Highly polymorphic and simple inheritance (often co-dominant)
Abundantly occur throughout the genome
Easy and fast to detect, minimum pleiotropic effect
Not environmentally regulated and are unaffected by the conditions in which the plants are grown and are detectable in all stages of plant growth (Francia et al., 2005)
Used in diversity analysis, parentage detection, DNA fingerprinting, and prediction of hybrid performance.
Molecular markers are useful in indirect selection processes, enabling manual selection of individuals for further propagation.
RFLP is the most widely used hybridization-based molecular marker
Technique is based on restriction enzymes that reveal a pattern difference between DNA fragment sizes in individual organisms
(Semagn et al., 2006)
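The idea behind RFLP can be illustrated with a small in-silico digest. This is only a sketch: the two allele sequences are made up for illustration, and the enzyme is assumed to be EcoRI (recognition site GAATTC, cutting after the first G). A single base change that destroys one site changes the fragment-size pattern.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths (largest first) after cutting at every site.

    cut_offset=1 mimics EcoRI, which cuts between G and AATTC.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return sorted(fragments, reverse=True)


# Hypothetical alleles: allele_2 has lost the second site (GAATTC -> GAGTTC).
allele_1 = "A" * 10 + "GAATTC" + "A" * 20 + "GAATTC" + "A" * 8
allele_2 = "A" * 10 + "GAATTC" + "A" * 20 + "GAGTTC" + "A" * 8

print(digest(allele_1))  # three fragments
print(digest(allele_2))  # two fragments: one site is missing
```

In a real RFLP assay the fragments are separated on a gel and detected with a hybridization probe; the sketch only shows why the fragment sizes differ.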
RAPD (Random Amplified Polymorphic DNA)
Single arbitrary oligonucleotide primer (10 bp)
Low annealing temperature (generally 34-37 °C)
Polymorphisms (band presence or absence) result from changes in DNA sequence
AFLP (Amplified Fragment Length Polymorphism) combines the power of RFLP with the flexibility of PCR-based technology by ligating primer recognition sequences (adaptors) to the restricted DNA (Lynch and Walsh, 1998).
1. Digest genomic DNA with restriction enzymes
2. Ligate commercial adaptors (defined sequences) to both ends of the fragments
3. Carry out PCR on the adaptor-ligated mixture, using primers that target the adaptor, but that vary in the base(s) at the 3’ end of the primer
(Semagn et al., 2006)
SSR (simple sequence repeat, or microsatellite): PCR-based marker with 18-25 bp primers
SSR polymorphisms are based on number of repeat units, and are hypervariable (have many alleles)
SSRs have stable amplification and good repeatability
SSRs are easy to run and automate
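Because SSR alleles differ in the number of repeat units, two alleles amplified with the same primers give products of different length. A minimal sketch, assuming a GA dinucleotide motif and invented flanking sequences (none of these come from the source):

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of consecutive copies of motif in sequence."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical amplicons: same flanks, different repeat number.
flank_left, flank_right = "ACGTTGCA", "TGCATCGA"
allele_a = flank_left + "GA" * 12 + flank_right
allele_b = flank_left + "GA" * 15 + flank_right

print(longest_repeat_run(allele_a, "GA"), len(allele_a))
print(longest_repeat_run(allele_b, "GA"), len(allele_b))
```

The three extra repeat units make allele_b 6 bp longer, which is the size difference that would be scored on a gel or capillary system.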
Amplification of DNA segments present at an amplifiable distance in between two identical microsatellite repeat regions in opposite direction
The technique uses microsatellites as primers in a single primer PCR reaction targeting multiple genomic loci to amplify mainly inter simple sequence repeats of different sizes
ISSRs use longer primers (15–30 mers)
Simpler method compared to phenotypic screening
Especially for traits with laborious screening
May save time and resources
Selection at seedling stage
Important for traits such as grain quality
Can select before transplanting in rice
Increased reliability
No environmental effects
Can discriminate between homozygotes and heterozygotes and select single plants
More accurate and efficient selection of specific genotypes
May lead to accelerated variety development
More efficient use of resources
Especially field trials
The first step is to map the gene or quantitative trait locus (QTL) of interest using different techniques; this information is then used for marker-assisted selection.
• Ideally markers should be <5 cM from a gene or QTL
RELIABILITY FOR SELECTION
[Figure: Marker A lies 5 cM from the QTL; with flanking markers, Marker A and Marker B each lie 5 cM from the QTL]
Using marker A only: $1 - r_A \approx 95\%$
Using markers A and B: $1 - 2 r_A r_B \approx 99.5\%$
• Using a pair of flanking markers can greatly improve reliability but increases time and cost
(Valentine and Howarth, 2000)
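The reliability figures for single and flanking markers can be checked with a few lines of arithmetic. The sketch below assumes that a 5 cM distance corresponds to a recombination fraction of about 0.05 (a reasonable approximation only for short distances); the function names are illustrative.

```python
def reliability_single(r):
    """Reliability of selecting with one marker at recombination fraction r."""
    return 1.0 - r


def reliability_flanking(r_a, r_b):
    """Reliability of selecting with two flanking markers (1 - 2 * rA * rB)."""
    return 1.0 - 2.0 * r_a * r_b


r = 0.05  # assumption: 5 cM treated as a recombination fraction of 0.05
print(f"Marker A only:   {reliability_single(r):.1%}")
print(f"Markers A and B: {reliability_flanking(r, r):.1%}")
```

Moving the markers closer to the QTL raises both values, which is why tightly linked markers are preferred.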
Marker Assisted Backcrossing
Foreground selection
Background selection
• MAB has several advantages over conventional backcrossing:
– Effective selection of target loci
– Minimize linkage drag
– Accelerated recovery of recurrent parent
P1 x P2
P1: elite cultivar
• High yielding
• Susceptible for 1 trait
• Called recurrent parent (RP)
P2: donor
• Desirable trait, e.g. disease resistance
P1 x F1
P1 x BC1
P1 x BC2
P1 x BC3
P1 x BC4
P1 x BC5
P1 x BC6
BC6F2
Visually select BC1 progeny that resemble RP
Discard ~50% BC1
Repeat process until BC6
Recurrent parent genome recovered
Additional backcrosses may be required due to linkage drag
Selection for target gene or QTL
Useful for traits that are difficult to evaluate
Also useful for recessive genes
[Figure: lanes 1, 2, 3 and 4 scored at the target locus (target locus selection)]
FOREGROUND SELECTION
(Melchinger, 1990)
[Figure: chromosome carrying the target locus in Donor/F1, BC1, BC3 and BC10; labels: target locus, recurrent parent chromosome, donor chromosome, linked donor genes]
Concept of ‘linkage drag’
• Large amounts of donor chromosome remain even after many backcrosses
• Undesirable due to other donor genes that negatively affect agronomic performance
[Figure: donor chromosome segment around the target gene. Conventional backcrossing: F1, BC1, BC2, BC3, BC10, BC20. Marker-assisted backcrossing: F1, BC1, BC2]
MARKERS CAN BE USED TO GREATLY MINIMIZE THE AMOUNT OF DONOR CHROMOSOME
(Ribaut and Hoisington, 1998)
Use flanking markers to select recombinants between the target locus and flanking marker
Linkage drag is minimized
Requires large population sizes
Depends on distance of flanking markers from target locus
RECOMBINANT SELECTION
Use unlinked markers to select against donor
Accelerates the recovery of the recurrent parent genome
Savings of 2, 3 or even 4 backcross generations may be possible
BACKGROUND SELECTION
Percentage of RP genome after backcrossing
The theoretical proportion of the recurrent parent genome is given by the formula:
$$\frac{2^{n+1} - 1}{2^{n+1}}$$
where n = number of backcrosses, assuming large population sizes
Although the average percentage of the recurrent parent is 75% for BC1, some individual plants possess more or less RP than others
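The expected recurrent parent (RP) proportion per generation follows directly from the formula (2^(n+1) - 1) / 2^(n+1). The sketch below tabulates it and counts how many backcrosses are needed to pass a chosen threshold; the 99% threshold and the function names are illustrative assumptions, and the values are population averages, not what any single plant carries.

```python
def expected_rp_fraction(n):
    """Expected RP genome fraction after n backcrosses (n = 0 is the F1)."""
    return (2 ** (n + 1) - 1) / 2 ** (n + 1)


def backcrosses_needed(target):
    """Smallest n whose expected RP fraction reaches target (0 < target < 1)."""
    n = 0
    while expected_rp_fraction(n) < target:
        n += 1
    return n


for n in range(1, 7):
    print(f"BC{n}: {expected_rp_fraction(n):.2%}")

print("Backcrosses to reach 99%:", backcrosses_needed(0.99))
```

Without markers, an average of 99% RP genome is only expected at BC6. Background selection aims to find the individuals that exceed the average in earlier generations.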
Improvement of qualitative traits
• Resistance to soybean cyst nematode
• Development of QPM genotypes
• Marker-aided pyramiding of rice genes for bacterial blight and blast resistance
Quantitative trait improvement
• Improvement of heterotic performance in maize
• Germplasm enhancement in tomato
• Submergence tolerance in rice cultivars
(Babu et al., 2004)
EXAMPLES OF GENE–MARKER ASSOCIATIONS FOR IMPORTANT TRAITS IN MAJOR CROPS
(Babu et al.,2004)
Flash floods or short-term submergence regularly affect around 15 million hectares of rice (Oryza sativa L.) growing areas in South and Southeast Asia
An economic loss of up to one billion US dollars annually has been estimated
Submergence tolerant varieties have been developed but have not been widely adopted
• Poor agronomic and quality characteristics
Many popular and widely grown rice varieties: “mega varieties”
(Mackill et al., 1996)
BR11: Bangladesh
CR1009: India
IR64: all Asia
KDML105: Thailand
Mahsuri: India
MTU1010: India
RD6: Thailand
Samba Mahsuri: India
Swarna: India, Bangladesh
A major QTL (Sub1) for submergence tolerance identified and fine mapped on chromosome 9 in the submergence tolerant cultivar FR13A
(Xu and Mackill, 1996)
Three related ethylene response factor (ERF)-like genes were identified at this locus: Sub1A, Sub1B and Sub1C.
Sub1A and Sub1C were up-regulated by submergence and ethylene (Fukao et al., 2006)
Sub1A was strongly induced in the tolerant cultivars in response to submergence, whereas intolerant cultivars had weak or no induction of the gene.
Overexpression of Sub1A in an intolerant japonica cultivar conferred submergence tolerance and down-regulated Sub1C
IR49830-7-1-2-2 (IR49830-7), one of the FR13A-derived submergence-tolerant breeding lines (Mackill et al., 1993), was used as the donor of Sub1
The recipient variety was Swarna, a widely grown cultivar in India and also in Bangladesh.
Swarna (popular variety) x IR49830 (Sub1 donor)
F1 x Swarna
BC1F1
In the BC1F1, individual plants heterozygous at the Sub1 locus were identified, reducing the population size for further screening (foreground selection)
Among these, plants homozygous for the recipient allele at one marker locus (RM219) distally flanking the Sub1 locus (i.e. recombinants) were identified (“recombinant selection”) (Collard and Mackill, 2006)
From these recombinant plants, individuals with the fewest number of markers from the donor genome were selected (background selection)
In the second BC generation the same strategy was followed for selection of individual plants
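The three selection steps (foreground, recombinant, background) amount to successive filters on marker genotypes. The sketch below uses the marker names RM464A and RM219 from this study, but the plants, the background markers bg1 to bg4 and all genotype calls are invented for illustration. Genotypes are coded "A" (homozygous for the recipient allele) and "H" (heterozygous, carrying a donor allele).

```python
def select_plants(plants, target="RM464A", flank="RM219", keep=1):
    """Apply foreground, recombinant and background selection in turn."""
    # Foreground: keep plants carrying the donor allele at the target marker.
    carriers = [p for p in plants if p["genotype"][target] == "H"]
    # Recombinant: keep carriers already fixed for the recipient allele
    # at the flanking marker.
    recombinants = [p for p in carriers if p["genotype"][flank] == "A"]

    # Background: rank by the number of other markers still carrying donor alleles.
    def donor_count(plant):
        return sum(
            1
            for marker, call in plant["genotype"].items()
            if marker not in (target, flank) and call != "A"
        )

    return sorted(recombinants, key=donor_count)[:keep]


# Hypothetical BC1F1 plants.
plants = [
    {"id": "P1", "genotype": {"RM464A": "H", "RM219": "H", "bg1": "A", "bg2": "A", "bg3": "A", "bg4": "A"}},
    {"id": "P2", "genotype": {"RM464A": "A", "RM219": "A", "bg1": "A", "bg2": "A", "bg3": "A", "bg4": "A"}},
    {"id": "P3", "genotype": {"RM464A": "H", "RM219": "A", "bg1": "H", "bg2": "H", "bg3": "A", "bg4": "H"}},
    {"id": "P4", "genotype": {"RM464A": "H", "RM219": "A", "bg1": "A", "bg2": "H", "bg3": "A", "bg4": "A"}},
]

print([p["id"] for p in select_plants(plants)])
```

P1 carries Sub1 but is not recombinant at RM219, P2 lacks the donor allele, and P4 is preferred over P3 because fewer of its background markers carry donor alleles.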
The mega variety Swarna was converted to a submergence-tolerant variety within a two-year time span for the BC2 and a 2.5-year time span for the BC3
Using rice genome sequence, polymorphic microsatellite markers were designed from the same BAC clone (AP005907) harbouring the Sub1 genes (Xu et al., 2006).
Initially the Sub1 locus was monitored by markers shown to be closely linked with the gene
Using tightly linked (RM464A, 0.7 cM) and flanking (RM219, 3.4 cM; RM316) markers ensured efficient foreground and recombinant selection
For flanking markers used for recombinant selection, about 5 Mb region on each side of the Sub1 region was targeted.
In advanced backcrosses and selfed generations, newly developed markers from the Sub1 region were used for the target loci
Fourteen-day-old seedlings were submerged for 14 days (BC1F2, BC2F2 and BC3F2). The survival of plants was scored 14 days after de-submergence (calculated as a percentage) for confirmation of the presence of the Sub1 locus.
(Neeraja et al., 2007)
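The survival score used to confirm Sub1 is a simple percentage of plants alive after de-submergence. A minimal sketch with made-up counts (the numbers below are not from the study):

```python
def survival_percent(n_submerged, n_alive):
    """Percentage of submerged seedlings still alive at scoring."""
    if n_submerged <= 0:
        raise ValueError("n_submerged must be positive")
    return 100.0 * n_alive / n_submerged


# Hypothetical counts for one line, scored 14 days after de-submergence.
print(survival_percent(40, 34))
```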
[Figure: BC3F2 plant (No. 227-9-407); selected BC2F2 plant (No. 246-237)]
(Neeraja et al., 2007)
The grain quality parameters in the Sub1 lines were on par with the non-introgressed Swarna
There was inhibition of brown furrows in the seed coat of Sub1 introgressed Swarna, yielding plants with straw colored hulls instead of the golden hull color of Swarna.
The submergence-tolerant version of Swarna can therefore be easily distinguished from the original
Normal maize is deficient in two essential amino acids (lysine and tryptophan) and has a high leucine–isoleucine ratio.
The breakthrough came in the 1960s with the discovery of the maize mutant opaque2 (Mertz et al., 1964)
opaque2 encodes a transcription factor that regulates the expression of zein genes and a gene encoding a ribosome-inactivating protein (Schmidt et al., 1990)
It reduces the level of 22-kD alpha-zeins while increasing the content of non-zein proteins, particularly EF-1 alpha, which is positively correlated with lysine content in the endosperm (Habben et al., 1995).
The protein quality of opaque2 maize is 43% higher than that of common maize (Mertz, 1992).
opaque2 maize was not popular with farmers because of reduced grain yield, soft endosperm, chalky and dull kernel appearance, and susceptibility to ear rots and stored grain pests
QPM (quality protein maize) is a genetically improved, hard-endosperm genotype in which the opaque2 gene has been incorporated along with associated modifiers.
Contains twice the amount of lysine and tryptophan as compared to normal maize endosperm
Two additive modifier genes significantly influence the endosperm modification in two populations, viz. W64Ao2 × pool 33 and pool 33 × W22o2
(Lopes et al., 1995)
One modifier locus was tightly linked with the gamma-zein coding sequences near the centromere of chromosome 7, while the other was near the telomere of 7L.
Advantages of QPM: higher yield potential, assured seed purity, more uniform and stable endosperm modification, and less monitoring for ensuring protein quality in seed production.
The opaque2 gene is recessive and the modifiers are polygenic
Each conventional backcross generation needs to be selfed to identify the opaque2 recessive gene
A minimum of six backcross generations are required to recover satisfactory levels of recurrent parent genome
In addition to maintaining the homozygous opaque2 gene, multiple modifiers must be selected
Rigorous biochemical tests to ensure enhanced lysine and tryptophan levels in the selected materials in each breeding generation require enormous labor, time and material resources
A set of nine normal maize and five QPM inbred lines were analyzed for polymorphism with opaque2 specific SSR markers.
Based on the parental polymorphism analysis, three normal inbred lines, viz. V25, CM212 and CM145, and two QPM donors, viz. CML173 and CML176, were chosen for line conversion
In this study, V25 was converted using CML176 as the QPM donor
Late maturing lines had tryptophan content ranging from 0.80 to 1.05% of total protein, whereas normal inbred lines possessed tryptophan content from 0.38 to 0.49% of endosperm protein
PARENTAL POLYMORPHISM ANALYSIS USING OPAQUE2 SPECIFIC SSR MARKER, UMC1066, BETWEEN NORMAL AND QPM INBREDS
Lane M1: 1 kb marker, Lane M2: 100 bp marker, Lanes 1: CM212, 2: CM145, 3: V25, 4: V335, 5: V338, 6: V340, 7: V345, 8: V346, 9: V348, 10: CML173, 11: CML176, 12: CML180, 13: CML184 and 14: CML189
Three SSR markers, viz., phi057, phi112 and umc1066 located as internal repetitive elements within the opaque2 gene, were used in initial polymorphism analysis (suitability for foreground selection)
A total of 200 SSR markers spanning all the bin locations in a maize SSR consensus map (http://www.maizegdb.org) were selected for background selection
Of the 200 markers, 77 were found to be polymorphic between V25 and CML176
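A parental polymorphism screen keeps only the markers whose alleles differ between the two parents, since monomorphic markers cannot distinguish donor from recurrent parent segments. The sketch below uses invented marker names and band sizes; only the 77 of 200 count comes from the study.

```python
def polymorphic_markers(parent_a, parent_b):
    """Return markers scored in both parents whose band sizes differ."""
    return [m for m in parent_a if m in parent_b and parent_a[m] != parent_b[m]]


# Hypothetical band sizes (bp) for four SSR markers in the two parents.
v25 = {"ssr_1": 120, "ssr_2": 150, "ssr_3": 98, "ssr_4": 210}
cml176 = {"ssr_1": 120, "ssr_2": 162, "ssr_3": 104, "ssr_4": 210}

usable = polymorphic_markers(v25, cml176)
print(usable)

# Polymorphism rate reported for the actual screen: 77 of 200 markers.
print(f"{100 * 77 / 200:.1f}% polymorphic")
```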
P1: V25, P2: CML176, Lanes 1–14: BC2F2 individuals
a V25, b CML176, c 50% or more opaque kernels, d less than 25% opaque kernels, e completely modified kernels