Lecture: Linkage Mapping and QTL Analysis in Experimental Populations

Upload: sameer-khanal

Post on 16-Apr-2017


Page 1: Linkage mapping and QTL analysis_Lecture

Lecture: Linkage Mapping and QTL Analysis in Experimental Populations

Page 2: Linkage mapping and QTL analysis_Lecture

Concepts: Linkage and Linkage Mapping

Linkage (of genes): “the association of genes that results from their being on the same chromosome (i.e., physically associated)”. For example, genes A and B on chromosome Chr1 (Fig. 1a).

Linkage group: “all genes in one chromosome form one linkage group”. For example, Chr1 and Chr2 are two different linkage groups (Fig. 1a).

Linked (genes): “a pair of linked genes (specifically, their alleles) tend to be transmitted together during meiotic cycle and progenies deviate from Mendelian ratios depending upon recombination fraction (r) between the two genes”. For example, genes A and B in Fig. 1b.

Fig. 1a. A and B linked (Chr1); C unlinked to A and B (Chr2)

Fig. 1b. Test cross frequencies

Parents: AABB X aabb; test cross: AaBb X aabb (tester gamete: ab)

F1 gamete   Progeny genotype   Frequency (unlinked)   Frequency (linked)
AB          AaBb               1/4                    (1-r)/2
Ab          Aabb               1/4                    r/2
aB          aaBb               1/4                    r/2
ab          aabb               1/4                    (1-r)/2

Source: R.H.J. Schlegel, Encyclopedic Dictionary of Plant Breeding

Page 3: Linkage mapping and QTL analysis_Lecture

Concepts: Linkage and Linkage Mapping

Linkage map:
- “is a map of the frequencies of recombination that occur between markers on homologous chromosomes during meiosis.”
- distance is measured in centimorgans (cM).

Physical map:
- “shows the physical locations of genes and other DNA sequences of interest.”
- distance is measured in base pairs.

Comparative map:
- a map that compares linkage maps or physical maps of related species based on shared markers or sequences, respectively (Fig. 2).

Fig. 2. Comparative map

Source: Fig. 2 - www.pnas.org/content/102/37/13206/F3.expansion.html

Page 4: Linkage mapping and QTL analysis_Lecture

Concepts: QTL Analysis

Qualitative traits
1. Monogenic or oligogenic
2. Discrete phenotypic classes (nominal scale)
3. Typically, environmental effect on trait expression is absent or low
4. Discontinuous variation (Fig. 3)
5. Genes have large effects
6. Mapped as visible markers (i.e., linkage mapping)

Quantitative traits
1. Polygenic (quantitative trait loci)
2. Continuum of measures (interval scale)
3. Trait expression may show profound environmental effect
4. Continuous variation (Fig. 4)
5. Genes have smaller effects
6. Mapping requires QTL analysis

Fig. 3. Discrete trait (source: cubocube.com)

Fig. 4. Fruit shape: a quantitative trait (source: www.nature.com)

Page 5: Linkage mapping and QTL analysis_Lecture

Lecture Outline: Linkage Mapping

1. A peek into the history of linkage mapping
1.1. Mendel’s work: rediscovery, validation and exceptions
1.2. Early genetic linkage maps
- natural mutants as genetic markers
- two-point and three-point linkage analysis
1.3. Mapping functions

2. Molecular era and revolution in genetic linkage mapping
2.1. Molecular markers - isozymes, RFLPs, SSRs and SNPs
2.2. Mapping populations in plants - F2, RILs, BC
2.3. Methods and tools for linkage mapping in plants
- maximum likelihood, LOD support, multipoint linkage mapping
2.4. Mapping polyploid genomes and outcrossing species

Page 6: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

1.1. Mendel’s work: rediscovery, validation and exceptions

- Experiments in Plant Hybridization (1865). Crosses between natural mutants (Fig. 5)

- Rediscovered in 1900

- Laws of segregation (Fig. 6) and independent assortment (Fig. 7)

- Wide validity in diverse organisms for unlinked qualitative traits

Source: monohybrid cross - www.desktopclass.com

Fig. 6. Monohybrid Cross

Fig. 5. Mendel’s traits

Source: Mendel’s traits -www.nature.com

Fig. 7

Source: Punnett square - sites.saschina.org

Page 7: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

1.1. Mendel’s work: rediscovery, validation and exceptions

- Bateson and Punnett (1904)
- Deviation from Mendelian inheritance (Fig. 8)

Timeline:
1865  Gregor Mendel: proposed basic laws of inheritance
1900  H. de Vries, E. von Tschermak, C. Correns: rediscovered Mendel’s work
1902  Boveri and Sutton: chromosome theory of inheritance
1904  Bateson and Punnett: linkage

Fig. 8 (source: www.cas.miamioh.edu)

Page 8: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

1.2. Early genetic linkage maps

- 1900 – 1910: concepts of gene, allele, genotype, phenotype, homozygote, heterozygote

Thomas Hunt Morgan:
i. studied Drosophila genetics
ii. genes responsible for discrete phenotypic differences are located on chromosomes
iii. the likelihood of co-transmission and reshuffling (due to recombination) depends on the linkage between genes (Fig. 9)
iv. linkages can be quantified (i.e., linkage mapping is a possibility)

Fig. 9. An illustration of Morgan’s study in Drosophila

Source: Fig. 9. - http://bio.vtn2.com/bio-home/harvey/lect/images/morgan15.4.gif

Page 9: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

1.2. Early genetic linkage maps

Quantifying genetic linkages:
- mostly dihybrid test crosses and F2 populations (Fig. 10)
- segregating for wild-type (+) and mutant (-) alleles
- sex-linked genes (X-linked)

First genetic linkage map of Sturtevant (Morgan’s student):
- Series of dihybrid crosses. Example, Fig. 10
- Map distance between body color and eye color genes = recombination frequency, RF (%)
- RF (%) = (recombinant types)*100/total = [(0+2)/373]*100 ≈ 0.5

Fig. 10. An illustration of a dihybrid cross, based on Sturtevant (1913)

Source: Fig 10 - http://www.esp.org/foundations/genetics/classical/holdings/s/ahs-13.pdf
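The RF formula on this slide is a one-line calculation. The sketch below (function name `recombination_frequency` is an assumption for illustration) reproduces the body color and eye color value from the counts quoted above.

```python
def recombination_frequency(recombinants, total):
    """RF (%) = (recombinant types) * 100 / total."""
    return 100.0 * recombinants / total

# Counts quoted on the slide: 0 + 2 recombinants among 373 progeny.
rf_body_eye = recombination_frequency(0 + 2, 373)
print(round(rf_body_eye, 1))
```

The unrounded value is slightly above 0.5, which is why the slide's figure is best read as an approximation.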

Page 10: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

1.2. Early genetic linkage maps

First genetic linkage map of Sturtevant (Morgan’s student) (Fig. 11):

- a series of two-point recombination frequencies (%) between 6 genes (Fig. 12). Here, 19 different populations

- started marker order from closest linkages and manually added other loci

Fig. 11. First genetic linkage map. Sturtevant (1913)

Fig. 12. Sturtevant table of RF (%)

Factors concerned   Proportion of crossovers   % of crossovers
BCO                 193 / 16278                1.2
BO                  2 / 373                    0.5
BP                  1464 / 4551                32.2
BR                  115 / 324                  35.5
BM                  260 / 693                  37.5
COP                 224 / 748                  29.9
COR                 1643 / 4749                34.6
COM                 76 / 161                   47.2
OP                  247 / 836                  29.5
OR                  183 / 538                  34.0
OM                  218 / 404                  54.0
CR                  236 / 829                  28.5
CM                  112 / 333                  33.6
B(C,O)              214 / 21736                1.0
(C,O)P              471 / 1584                 29.7
(C,O)R              2062 / 6116                33.7
(C,O)M              406 / 898                  45.2
PR                  17 / 573                   3.0
PM                  109 / 405                  26.9

Source: Fig.11, Fig. 12 - www.nature.com/scitable/content/The-linear-arrangement-of-six-sex-linked-16655

Page 11: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

1.2. Early genetic linkage maps

Limitations of two-point linkage analysis
- Consider 2 genes far enough apart that 2 crossovers (XOs) occasionally occur between them, involving:
i. the same two nonsister chromatids for both XOs (Fig. 13)
ii. different nonsister chromatids for both XOs (Fig. 14)
- Result: either underestimation or overestimation of RF

Fig. 13. Double crossover (same). Gametes: AB, AB, ab, ab (all parental types, so RF is underestimated)

Fig. 14. Double crossover (different). Gametes: Ab, Ab, aB, aB (all recombinant types, so RF is overestimated)

Page 12: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

1.2. Early genetic linkage maps

The three point test cross
- uses trihybrid crosses
- more efficient; includes 2 XOs
- allows calculation of XO interference

Example (Fig. 15):
i. First, test linkage. Here, they are linked
ii. Most frequent are parental types
iii. Recombinant types: four single crossovers (SCOs) and two double crossovers (DCOs)

Cross: triple heterozygote (X- Z+ Y+ / X+ Z- Y-) X triple homozygote (X- Z- Y- / X- Z- Y-)

Offspring phenotype   No. of individuals   Parental/Recombinant   XO type
X+Y-Z+                1                    Recombinant            DCO
X-Y+Z+                440                  Parental
X-Y-Z+                26                   Recombinant            SCO #1
X-Y-Z-                61                   Recombinant            SCO #2
X+Y+Z-                32                   Recombinant            SCO #1
X+Y-Z-                442                  Parental
X+Y+Z+                58                   Recombinant            SCO #2
X-Y+Z-                2                    Recombinant            DCO
Total                 1062

Fig. 15. Three point test cross freq.

Page 13: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

1.2. Early genetic linkage maps

Example (Fig. 16) continued:
iv. Compare either parental type to double XO types
v. Conclusion: gene Z is in the center
vi. Map distance (X-Z) = [SCO (X-Z) + DCOs]*100/total
vii. Coefficient of coincidence (C) = observed DCO freq./expected DCO freq., where expected DCO freq. = (X-Z recombination freq. * Z-Y recombination freq.), each including DCOs
viii. Interference = 1 - C

Parental (P) vs. double crossover (DCO) types (S = same allele, D = different allele):

P     X- Y+ Z+     X+ Y- Z-
DCO   X+ Y- Z+     X+ Y- Z+
      D  D  S      S  S  D

Fig. 16. Three point test cross freq. (same cross and offspring counts as Fig. 15: triple heterozygote X- Z+ Y+ / X+ Z- Y- crossed to the triple homozygote; total 1062)
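Steps iv to viii can be traced numerically with the offspring counts from the three point test cross table. The following is a sketch only: the helper names (`alleles`, `rf`) are assumptions, and the expected DCO frequency is taken as the product of the two interval recombination frequencies (each including DCOs), which is the usual convention.

```python
# Offspring counts from the three point test cross (Fig. 15 / Fig. 16).
counts = {
    "X+Y-Z+": 1, "X-Y+Z+": 440, "X-Y-Z+": 26, "X-Y-Z-": 61,
    "X+Y+Z-": 32, "X+Y-Z-": 442, "X+Y+Z+": 58, "X-Y+Z-": 2,
}
total = sum(counts.values())

def alleles(label):
    # "X+Y-Z+" -> {"X": "+", "Y": "-", "Z": "+"}
    return {label[i]: label[i + 1] for i in range(0, len(label), 2)}

# Most frequent classes are parental, least frequent are double crossovers.
ranked = sorted(counts, key=counts.get, reverse=True)
parental, dco = ranked[:2], ranked[-2:]

# The middle gene is the one that switches relative to the two outer genes.
p, d = alleles(parental[0]), alleles(dco[0])
diff = [g for g in p if p[g] != d[g]]
middle = diff[0] if len(diff) == 1 else next(g for g in p if g not in diff)

def rf(g1, g2):
    """Recombination frequency between two loci, counting every class
    whose allele pair at (g1, g2) is not a parental combination."""
    parental_pairs = {(alleles(c)[g1], alleles(c)[g2]) for c in parental}
    rec = sum(n for c, n in counts.items()
              if (alleles(c)[g1], alleles(c)[g2]) not in parental_pairs)
    return rec / total

rf_xz, rf_zy = rf("X", "Z"), rf("Z", "Y")
observed_dco = sum(counts[c] for c in dco) / total
coincidence = observed_dco / (rf_xz * rf_zy)
interference = 1 - coincidence

print(middle, round(100 * rf_xz, 2), round(100 * rf_zy, 2),
      round(coincidence, 3), round(interference, 3))
```

With these counts the X-Z interval has 61 + 58 + 1 + 2 = 122 recombinants and the Z-Y interval has 26 + 32 + 1 + 2 = 61, so the observed number of DCOs (3) is well below the roughly 7 expected without interference.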

Page 14: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

1.3. Mapping functions

- “for more than three loci, relationship among possible recombination fractions is complex”

- “RFs between loci flanking a region are not simple sum of recombination fractions for adjacent loci within the region”

- “conversion of recombination fractions to additive map distances requires mapping functions” (Fig. 17):
i. Haldane
ii. Kosambi

Fig. 17. Table: Haldane and Kosambi mapping functions. Chart: comparison of mapping functions. “r” is recombination fraction and “d” is map distance.

Source: Ben Hui Liu, Statistical Genomics; Roling Wu et al. , Statistical Genetics of Quantitative Traits
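Since the table in Fig. 17 did not survive in this transcript, here is a sketch of the two mapping functions in their standard textbook forms (distances in cM; function names are assumptions). It also illustrates the point made above: recombination fractions are not additive, while Haldane map distances are when crossovers occur independently.

```python
import math

def haldane_cm(r):
    """Haldane: d = -1/2 ln(1 - 2r) Morgans (no interference), returned in cM."""
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cm(r):
    """Kosambi: d = 1/4 ln((1 + 2r) / (1 - 2r)) Morgans, returned in cM."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def haldane_r(d_cm):
    """Inverse Haldane: r = 1/2 (1 - exp(-2d)), with d given in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))

def kosambi_r(d_cm):
    """Inverse Kosambi: r = 1/2 tanh(2d), with d given in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)

# Two adjacent intervals with r = 0.10 each and no interference:
r12 = r23 = 0.10
r13 = r12 + r23 - 2 * r12 * r23          # 0.18, not 0.20
print(r13, haldane_cm(r12) + haldane_cm(r23), haldane_cm(r13))
```

For r below 0.5, the Kosambi distance is always smaller than the Haldane distance for the same r, because Kosambi assumes some crossover interference.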

Page 15: Linkage mapping and QTL analysis_Lecture

1. A peek into the history of linkage mapping

Summary:

- Paucity of visible natural markers (phenotypic mutants)

- Radiation mutants offered additional traits, but lethality and sterility were problems

- Nevertheless, two-point and three-point linkage maps persisted for several decades (~70 years)

- Example:
i. tomato: 258 morphological and physiological markers (Rick 1975)

Fig. 18. An illustration of a tomato linkage map made in 1952

Source: Fig. 18 – An introduction to Genetic Analysis, 5th edition.

Page 16: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.1. Molecular markers

- gel electrophoresis brought isozyme markers into the picture

- restriction endonucleases and Southern blot techniques brought RFLPs

- DNA sequencing and PCR brought SSRs and SNPs

- virtually unlimited number of “visible markers”

- gaps in genetic linkage maps could be filled

- comparative mapping, gene cloning, QTL analysis and MAS could be done

Fig. 19. Classes of molecular markers (panels: RFLP, SSR)

Source: Fig.19 - nature.berkeley.edu/brunslab/tour/tour2.html

Page 17: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.2. Mapping populations in plants - considerations:

1st: marker polymorphism
- adequate polymorphic markers between parents
- contrasting traits of interest

2nd: reproductive mode
- If inbreeding is a possibility: F2, recombinant inbred lines (RIL), backcross (BC)
- Mostly outcrossing (or self-incompatible), long generation time: pseudo-testcross, backcross

Fig. 20a. F2 population

Source: Fig.20 –K. Meksem and G. Kahl, The Handbook of Plant Genome Mapping

Fig. 20b. RIL population

Fig. 20c. BC population

Fig. 20d. pseudo- testcross population

Page 18: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants

Steps:

i. Data generation: genotype the mapping population and prepare the input format for mapping

ii. Calculating recombination fractions (RFs): maximum likelihood estimates of pair-wise RFs

iii. Locus grouping: grouping of markers into prospective linkage groups based on linkage (maximum recombination fraction) and LOD (minimum limit of support) thresholds

iv. Locus ordering: finding the best possible order based on the highest multipoint likelihood (LOD) among different probable orders

v. Multilocus distance estimation

Page 19: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants

Detailed procedural discourse on MapMaker

i. Data generation: MapMaker input file format (Fig. 21)

Type of cross: F2 intercross, F2 backcross, F3 self, RI self, RI sib

Genotype scores (default symbols):
A : homozygous for parent A
H : heterozygous
B : homozygous for parent B
C : not homozygous for parent A
D : not homozygous for parent B
- : missing

Fig. 21. MapMaker input format (annotated fields: type of cross, defaults, population size, number of markers, marker names, scores)

Page 20: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants

Detailed procedural discourse on MapMaker

ii. Calculating recombination fractions (RFs): in backcross mating design (BC1)

- progenies can be distinctly categorized into parental or recombinant (Fig. 22a)
- recombination fraction is simply the frequency of recombinant types (Fig. 22b)

Fig. 22a. Freq. of gametes in BC mating

Fig. 22b. RF estimation is plain and simple for a backcross mating design

Page 21: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants

Detailed procedural discourse on MapMaker

ii. Calculating recombination fractions (RFs): in F2 mating design (Fig. 23a)

- progenies cannot be distinctly categorized. For illustration, the four possible genotypes shown in Fig. 23b belong to the same genotype class A1A2B1B2, but may come from parental gametes (without XO) or recombinant gametes (with XO) in both parents

Fig. 23a. F2 mating design and F2 genotypes

Fig. 23b. The counts (in parenthesis) and frequencies of the 16 possible genotypes in an F2 family

Page 22: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants Detailed procedural discourse on MapMaker

ii. Calculating recombination fractions (RFs): in F2 mating design

- 16 possible genotypes coalesce into 9 observable genotypic classes

Fig. 24. Frequencies of the nine observed genotypes in an F2 population

Page 23: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants Detailed procedural discourse on MapMaker

ii. Calculating recombination fractions (RFs): in F2 mating design

- likelihood function for estimating RF (r)

- “Maximum likelihood for r is obtained by setting S(r) = 0 and solving for r”, where S(r) is the score function (the first derivative of the log-likelihood with respect to r)

- “however, there is no explicit solution for r”

- different ways to invoke an iterative algorithm to solve for r:
a. Grid search
b. Newton-Raphson method

Fig. 25. Likelihood function of r
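The grid search option can be sketched as follows for two codominant markers in an F2 (coupling phase). Everything here is illustrative rather than MapMaker's implementation: the class frequencies are the standard nine-class F2 expectations, the function names are assumptions, and the counts are hypothetical (they are the exact expectations for r = 0.2 and 200 plants).

```python
import math

def f2_class_probs(r):
    """Expected frequencies of the nine observable F2 genotype classes."""
    both_par = (1 - r) ** 2      # both gametes non-recombinant
    one_rec = r * (1 - r)        # exactly one recombinant gamete
    both_rec = r ** 2            # both gametes recombinant
    return {
        "AABB": both_par / 4, "AABb": one_rec / 2, "AAbb": both_rec / 4,
        "AaBB": one_rec / 2, "AaBb": (both_par + both_rec) / 2,
        "Aabb": one_rec / 2,
        "aaBB": both_rec / 4, "aaBb": one_rec / 2, "aabb": both_par / 4,
    }

def log_likelihood(r, counts):
    probs = f2_class_probs(r)
    return sum(n * math.log(probs[g]) for g, n in counts.items() if n > 0)

def grid_search_mle(counts, n_steps=500):
    """Evaluate the log-likelihood on a grid over (0, 0.5] and keep the best r."""
    grid = [0.5 * i / n_steps for i in range(1, n_steps + 1)]
    return max(grid, key=lambda r: log_likelihood(r, counts))

def lod_score(r_hat, counts):
    """LOD = log10 of L(r_hat) / L(r = 0.5)."""
    return (log_likelihood(r_hat, counts) - log_likelihood(0.5, counts)) / math.log(10)

# Hypothetical counts for 200 F2 plants.
counts = {"AABB": 32, "AABb": 16, "AAbb": 2, "AaBB": 16, "AaBb": 68,
          "Aabb": 16, "aaBB": 2, "aaBb": 16, "aabb": 32}
r_hat = grid_search_mle(counts)
print(r_hat, round(lod_score(r_hat, counts), 1))
```

A grid search is slow but cannot diverge; Newton-Raphson reaches the same maximum in far fewer likelihood evaluations when started from a reasonable value.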

Page 24: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants Detailed procedural discourse on MapMaker

iii. Locus grouping:

- MapMaker’s “GROUP” command builds preliminary linkage groups based on maximum-likelihood estimates of RF and corresponding LOD score between marker pairs

- maximum allowable RF and minimum LOD score thresholds can be manually updated to track changes in grouping structure with corresponding changes in thresholds

- finally, linkage groups are formed by marker associations. For example, if A is linked to B, and B is linked to C, all three belong to a group (remember, RF and LOD thresholds are there for minimizing spurious linkages)
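The transitive grouping rule (A linked to B, B linked to C, so all three share a group) is a single-linkage clustering. Below is a minimal union-find sketch of that idea; it is not MapMaker's code, and the marker names, pairwise values, and default thresholds are hypothetical.

```python
def group_markers(markers, pairwise, max_rf=0.4, min_lod=3.0):
    """Group markers transitively: a pair is 'linked' when its RF is at or
    below max_rf and its LOD is at or above min_lod."""
    parent = {m: m for m in markers}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]   # path compression
            m = parent[m]
        return m

    for (a, b), (rf, lod) in pairwise.items():
        if rf <= max_rf and lod >= min_lod:
            parent[find(a)] = find(b)       # merge the two groups

    groups = {}
    for m in markers:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical two-point results: (RF, LOD) per marker pair.
pairwise = {
    ("A", "B"): (0.10, 8.2), ("B", "C"): (0.15, 5.1), ("A", "C"): (0.24, 2.1),
    ("C", "D"): (0.48, 0.1), ("D", "E"): (0.05, 12.0),
}
markers = ["A", "B", "C", "D", "E"]
groups_default = group_markers(markers, pairwise)
groups_strict = group_markers(markers, pairwise, min_lod=6.0)
print(groups_default, groups_strict)
```

Re-running with a stricter LOD threshold shows how the grouping structure changes with the thresholds: marker C drops out of the first group because its only strong link (to B) no longer passes.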

Page 25: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants Detailed procedural discourse on MapMaker

iv. Locus ordering:

-“ ordering is the central problem in linkage mapping, and also the most interesting in the sense that for groups of even modest size there is no sure way to find the best (N! / 2) possible order”

-MapMaker’s “COMPARE” command is exhaustive - computes maximum likelihood score for all possible orders and reports a subset of most likely ones

- however, ordering more than 5-7 markers with “COMPARE” is not practical (time issue!)

Source: Meksem and Kahl, The Handbook of Plant Genome Mapping
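The N!/2 growth explains why exhaustive comparison is limited to a handful of markers. The sketch below counts the orders and runs an exhaustive search on four hypothetical markers. Note the swapped-in criterion: it scores each order by the sum of adjacent recombination fractions (SARF), a simpler stand-in for the multipoint likelihood that “COMPARE” computes.

```python
import math
from itertools import permutations

def n_orders(n):
    """Number of distinct marker orders for n >= 2 (an order and its mirror are the same map)."""
    return math.factorial(n) // 2

def best_order_by_sarf(markers, rf):
    """Exhaustive search for the order with the smallest sum of adjacent RFs."""
    best = None
    for perm in permutations(markers):
        if perm[0] > perm[-1]:
            continue                      # skip the mirror image of an order
        score = sum(rf[frozenset(pair)] for pair in zip(perm, perm[1:]))
        if best is None or score < best[0]:
            best = (score, perm)
    return best

# Hypothetical pairwise RFs consistent with the order A-B-C-D.
rf = {
    frozenset("AB"): 0.10, frozenset("BC"): 0.10, frozenset("CD"): 0.10,
    frozenset("AC"): 0.18, frozenset("BD"): 0.18, frozenset("AD"): 0.244,
}
score, order = best_order_by_sarf(["A", "B", "C", "D"], rf)
print(order, [n_orders(n) for n in (5, 7, 10)])
```

Five markers give 60 orders and seven give 2,520, but ten already give 1,814,400, each of which would need its own multipoint likelihood maximization.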

Page 26: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants Detailed procedural discourse on MapMaker

iv. Locus ordering:

- therefore, one has to resort to faster algorithms. For example, MapMaker’s “ORDER” command:

a. identifies the most informative subset of markers (default 5 markers)
b. performs an exhaustive order search (akin to COMPARE) and finds one
c. tries to add the remaining markers individually (at default RF = 0.5 and LOD = 3.0)
d. drops the LOD threshold to 2.0 and tries the remaining ones
e. in case markers still cannot be assigned a particular position, reports them as such
f. such markers can be tried manually with the “TRY” command and dropped if that fails

Source: Meksem and Kahl, The Handbook of Plant Genome Mapping

Page 27: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.3. Methods and tools for linkage mapping in plants

Detailed procedural discourse on MapMaker

v. Multipoint distance estimation:

- MapMaker uses the MAP command for multipoint estimates (not two-point estimates)

- it employs the EM (expectation-maximization) algorithm, where mutually dependent unknown parameters are alternately updated to converge to a maximum

- E step: an initial (two-point) estimate of r (θold = θ1, θ2, … θl-1, where l is the number of loci) is used to compute the expected number of recombinant types for each interval

- M step: using the new expected values, the MLE θnew is computed

- the E and M steps are iterated until θnew ≈ θold (the likelihood converges to a maximum)

- map distances are calculated using different mapping functions (default Haldane)

Source: Ben Hui Liu, Statistical Genomics
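The E and M steps are easiest to see in the simplest case. The sketch below is a two-locus F2 illustration only, not MapMaker's multipoint routine: the only ambiguous class is the double heterozygote AaBb, which carries either zero or two recombinant gametes. The function name, starting value, and counts are assumptions (the counts are the exact expectations for r = 0.2 and 200 plants).

```python
def em_recombination_fraction(counts, r=0.25, tol=1e-10, max_iter=1000):
    """Two-point EM estimate of r for codominant markers in an F2 (coupling phase)."""
    n_total = sum(counts.values())
    # Recombinant gametes carried by each unambiguous genotype class.
    known = {"AABB": 0, "aabb": 0,
             "AABb": 1, "AaBB": 1, "Aabb": 1, "aaBb": 1,
             "AAbb": 2, "aaBB": 2}
    for _ in range(max_iter):
        # E step: expected recombinant gametes given the current r.
        p_double = r * r / ((1 - r) ** 2 + r * r)   # P(two recombinants | AaBb)
        expected = sum(counts.get(g, 0) * k for g, k in known.items())
        expected += counts.get("AaBb", 0) * 2 * p_double
        # M step: r is the expected share of recombinant gametes (2 per plant).
        r_new = expected / (2 * n_total)
        if abs(r_new - r) < tol:
            return r_new
        r = r_new
    return r

# Hypothetical counts for 200 F2 plants.
counts = {"AABB": 32, "AABb": 16, "AAbb": 2, "AaBB": 16, "AaBb": 68,
          "Aabb": 16, "aaBB": 2, "aaBb": 16, "aabb": 32}
r_em = em_recombination_fraction(counts)
print(round(r_em, 6))
```

In the multipoint case the same alternation is applied to all intervals at once, with the expected recombinant counts computed from the full set of flanking marker genotypes.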

Page 28: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

Revisiting tomato genetic linkage maps:

- Example: tomato (Sim et al. 2012), Fig. 26a and 26b

- 7,666 SNPs

Fig. 26a. SNP distribution

Fig. 26b. Two tomato linkage maps compared to draft genome assembly

Source: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0040563

Page 29: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.4. Mapping polyploid genomes

- Allopolyploids show disomic segregation. Hence, linkage mapping in allopolyploids is similar to diploid linkage mapping

- Autopolyploids (e.g., potato, sugarcane, etc.) show polysomic segregation (Fig. 27a). Hence, linkage mapping in autopolyploids employs different mapping techniques

- For example, single dose markers (SDMs) segregating in a 1:1 ratio (Fig. 27b) are used in the pseudo-testcross mapping strategy

- Also, biparental and double-dose markers can be integrated using TetraploidMap software

Fig. 27a. Single locus segregation (autotetraploid)

Fig. 27b. Segregation of a SDM: Aaaa X aaaa gives 1/2 Aaaa : 1/2 aaaa

Page 30: Linkage mapping and QTL analysis_Lecture

2. Molecular era and revolution in genetic linkage mapping

2.4. Mapping polyploid genomes

- Example TetraploidMap: four homologous chromosomes and a consensus map

Source: TetraploidMap manual

Page 31: Linkage mapping and QTL analysis_Lecture

Linkage Mapping

Summary

i. Genetic linkage maps were originally built to map phenotypic mutants
ii. Modern linkage maps use molecular markers (predominantly, DNA markers)
iii. Different types of mapping populations are used
iv. Mapping studies in diploids and allopolyploids use similar tools and techniques
v. Linkage maps in autopolyploids necessitate different mapping strategies
vi. Linkage maps are useful for
- tagging markers along chromosomes
- identifying markers linked to genes and cloning genes
- identifying quantitative trait loci for traits of interest
- marker assisted selection
- comparative mapping and evolutionary studies

Page 32: Linkage mapping and QTL analysis_Lecture

Lecture Outline: QTL Analysis

3. QTL mapping: models and methods

3.1. Single QTL model
3.1.1. Single marker analysis (SMA)
- t-tests, ANOVA, linear regression
3.1.2. Simple interval mapping (SIM)

3.2. Multiple QTL model
3.2.1. Multiple regression
3.2.2. Composite interval mapping (CIM)

3.3. QTL mapping in polyploid genomes

Page 33: Linkage mapping and QTL analysis_Lecture

3. QTL Mapping: Models and Methods

3.1. Single QTL model

- Assessing marker-trait associations at an individual marker locus

- gene effects for the single QTL model:

Backcross: g = 0.5 (µ1 - µ2), where
µ1 = mean for the homozygous class
µ2 = mean for the heterozygous class

F2: additive (α) = 0.5 (µ1 - µ3) and dominance (d) = 0.5 (2µ2 - µ1 - µ3), where
µ3 = mean for the class homozygous for parent B alleles

- Employs single marker analysis (SMA) techniques

Source: Ben Hui Liu, Statistical Genomics
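As a quick worked example of these gene-effect formulas, the sketch below plugs in hypothetical genotype-class means (12.0, 11.0 and 8.0 trait units); the function names are assumptions.

```python
def backcross_effect(mu1, mu2):
    """g = 0.5 (mu1 - mu2): homozygous vs. heterozygous class means."""
    return 0.5 * (mu1 - mu2)

def f2_effects(mu1, mu2, mu3):
    """Additive and dominance effects from the three F2 class means."""
    additive = 0.5 * (mu1 - mu3)
    dominance = 0.5 * (2 * mu2 - mu1 - mu3)
    return additive, dominance

g = backcross_effect(12.0, 11.0)
additive, dominance = f2_effects(12.0, 11.0, 8.0)
print(g, additive, dominance)
```

Here the heterozygote mean (11.0) sits above the midpoint of the two homozygotes (10.0), so the dominance effect is positive.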

Page 34: Linkage mapping and QTL analysis_Lecture

3.1.1. Single marker analysis (SMA)

- based on the linear model: yj = µ + f (markerj) + ɛj, where
yj is the trait value of the jth individual in the population
µ is the population mean
f (markerj) is a function of marker genotype
ɛj is the residual associated with the jth individual

Different methods:

a. marker genotypes treated as a classification variable
- for a backcross (2 genotypes): use t-test
- for F2 population (up to 3 genotypes): use ANOVA

b. marker genotypes treated as dummy variables
- use marker-trait regression

c. likelihood ratio test and maximum likelihood estimation

Source: Ben Hui Liu, Statistical Genomics

Page 35: Linkage mapping and QTL analysis_Lecture

3.1.1. SMA

- t-test and ANOVA

Steps (given alleles A and a at a marker locus):
a. sort marker genotype classes into groups: “AA” and “Aa” in backcross; “AA”, “Aa”, and “aa” in F2
b. test for a significant difference in means: t statistic (in backcross), F statistic (in F2)

- Linear regression approach

BC: yj = β0 + β1xj + ɛj, where yj is the trait value for the jth individual in the population, xj is the dummy variable taking 1 if the individual is AA and -1 for Aa. β0 is the intercept for the regression, which is the overall mean for the trait. β1 is the slope for the regression line and ɛj is the random error.

F2: yj = β0 + β1x1j + β2x2j + ɛj, where yj is the trait value for the jth individual in the population, x1j is the dummy variable for the marker additive effect taking 1, 0, and -1 for marker genotypes AA, Aa and aa, respectively. x2j is the dummy variable for the marker dominance effect taking 0, 1, and 0 for marker genotypes AA, Aa and aa, respectively. β0 is the intercept for the regression. β1 and β2 are the slopes for the additive and dominance regression lines, respectively. ɛj is the random error.

Fig. 27. One way analysis

Source: Ben Hui Liu, Statistical Genomics
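For a backcross, the t-test and the regression on a +1/-1 dummy variable are two views of the same comparison. The sketch below computes both with the standard library only; the trait values are made up for illustration and the function names are assumptions.

```python
import math
from statistics import mean, variance

def backcross_t(values_AA, values_Aa):
    """Pooled-variance two-sample t statistic for the two marker classes."""
    n1, n2 = len(values_AA), len(values_Aa)
    pooled = ((n1 - 1) * variance(values_AA)
              + (n2 - 1) * variance(values_Aa)) / (n1 + n2 - 2)
    return (mean(values_AA) - mean(values_Aa)) / math.sqrt(pooled * (1 / n1 + 1 / n2))

def backcross_regression(values_AA, values_Aa):
    """Least squares fit of y = b0 + b1 x with x = +1 for AA and -1 for Aa."""
    x = [1] * len(values_AA) + [-1] * len(values_Aa)
    y = list(values_AA) + list(values_Aa)
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical trait values grouped by marker genotype.
trait_AA = [10.1, 11.3, 9.8, 10.9, 10.4]
trait_Aa = [9.0, 9.4, 8.7, 9.9, 9.5]
t_stat = backcross_t(trait_AA, trait_Aa)
b0, b1 = backcross_regression(trait_AA, trait_Aa)
print(round(t_stat, 2), round(b0, 2), round(b1, 2))
```

With this coding the slope b1 equals half the difference between the two class means, which is the backcross gene effect g = 0.5 (µ1 - µ2) from the single QTL model.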

Page 36: Linkage mapping and QTL analysis_Lecture

3.1.1. SMA

Advantages
1. Conceptually and computationally simple
2. Genetic linkage map information not needed
3. Easily incorporates covariates
4. Informative when markers sufficiently cover the genome
5. Can be extended to multiple regression for a multiple QTL model

Limitations
1. Location and effects of detected QTLs are confounded: a larger apparent effect could arise because the marker is close to a QTL, or because the marker is farther from a QTL that contributes much more to the trait
2. QTL position cannot be precisely detected
3. Power to detect QTL is low when marker density is low
4. Multiple comparisons increase false positives
5. Missing genotypes are totally excluded from analysis
6. Limited ability to separate linked QTLs and no ability to assess interacting QTLs

Page 37: Linkage mapping and QTL analysis_Lecture

3.1.1. SMA

Software tools

Basic statistical analysis platforms: Excel, JMP, SAS, R, etc.

QTL mapping platforms: WinQTLCartographer, R/QTL, JoinMap, MapMaker/QTL, etc.

Windows QTL Cartographer

- SMA fits the data to the simple linear regression model y = b0 + b1 x + e

- Results reported include b0, b1 and the F statistic for each marker

- The F statistic tests the hypotheses H0: b1 = 0 versus H1: b1 ≠ 0

- pr(F) is a measure of how much support there is for H0

- A smaller pr(F) indicates less support for H0 and thus more support for H1

- The likelihood ratio test statistic compares two nested hypotheses H0 and H1 with likelihoods L0 and L1. The “Likelihood Ratio Test Statistic” is: -2ln(L0/L1)

Page 38: Linkage mapping and QTL analysis_Lecture

3.1.2. Simple interval mapping (SIM)

- “Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps” (Lander and Botstein 1989)

- Concept: based on joint segregation of a pair of adjacent markers and a putative QTL within the interval flanked by the marker pair (Fig. 28)

Methods:
a. Likelihood approach (preferred over regression)
b. Regression approach (faster computation than ML)

Source: Ben Hui Liu, Statistical Genomics

Fig. 28. Linkage relationship of a QTL and two flanking markers

Page 39: Linkage mapping and QTL analysis_Lecture

3.1.2. SIM

Likelihood approach (employed in WinQTLCart). The annotated terms are:

- the density function for the normal distribution with mean μQk and variance σ2; there are k = 1 to N QTL genotypes

- the probability of the QTL genotype, given the jth genotype of the flanking markers

- the likelihood of phenotypic value z, given the jth genotype of the flanking markers

- the MLE under the reduced model of no QTL: μQQ = μQq = μqq

- the MLE under the full model including a QTL

- LOD score (log10 of the odds ratio), or equivalently LR = 4.6 LOD

Source: Course notes, QTL mapping and Discovery

Page 40: Linkage mapping and QTL analysis_Lecture

3.1.2. SIM

Advantages
1. Conceptually and computationally simple
2. Genetic linkage map information not needed
3. Easily incorporates covariates
4. Informative when markers sufficiently cover the genome
5. Can be extended to multiple regression for a multiple QTL model

Limitations
1. Location and effects of detected QTLs are confounded: a larger apparent effect could arise because the marker is close to a QTL, or because the marker is farther from a QTL that contributes much more to the trait
2. QTL positions cannot be precisely detected
3. Power to detect QTL is low when marker density is low
4. Multiple comparisons increase false positives
5. Missing genotypes are totally excluded from analysis

Page 41: Linkage mapping and QTL analysis_Lecture

3.2. Composite interval mapping (CIM)

CIM is a combination of IM and multiple regression (multiple QTL model)

- Fits both the effects of a QTL and the effects of covariates (a subset of selected genetic markers)

- CIM adds background loci to simple interval mapping (IM).

- It fits parameters for a target QTL in one interval while simultaneously fitting partial regression coefficients for "background markers" to account for variance caused by non-target QTL.

- Background markers are usually 20-40 cM apart

Figure labels: Test Interval; Left Marker; Right Marker; Blocked Region (Cofactors)

Source: Course notes, QTL mapping and Discovery

Page 42: Linkage mapping and QTL analysis_Lecture

3.2. CIM

General CIM statistical model can be written with the following terms:

- phenotypic trait value of subject i
- overall mean
- row vector of predictor variables corresponding to the effects of the putative QTL
- row vector of predictor variables corresponding to the rth cofactor marker
- column vector with the coefficient of the rth cofactor marker
- residual distributed as N(0,δ2)

Zi1α: additive effect
Zi1d: dominance effect

Page 43: Linkage mapping and QTL analysis_Lecture

3.2. CIM

Set of statistical models evaluated in the CIM analysis (WinQTLCartographer):

- For backcross, recombinant inbred lines, and double haploids, only Model 0 and Model 1 are generated and tested

- For F2 design, all four models are generated and tested

Page 44: Linkage mapping and QTL analysis_Lecture

Comparison of SMA, SIM and CIM

Figure annotation: much more precise location

Source: http://solcap.msu.edu/pdf%20files/5PAA_Douches_2_Mapping_Populations.pdf

Page 45: Linkage mapping and QTL analysis_Lecture

3.3. QTL mapping in polyploid genomes

- Generally, QTL mapping in allopolyploid genomes is the same as in diploids

- However, QTL mapping in autopolyploid genomes requires different strategies

- Example: QTL mapping in autotetraploids using TetraploidMap

Page 46: Linkage mapping and QTL analysis_Lecture

QTL Analysis

Summary

- Single marker analysis (SMA) involves a t-test, ANOVA, or linear regression approach

- Interval mapping is based on joint segregation of a pair of adjacent markers

- CIM is a combination of IM and multiple regression and is the most desirable of the three

- QTL mapping in autopolyploids requires different analytical strategies

Page 47: Linkage mapping and QTL analysis_Lecture

Thanks