Software for Incorporating Marker Data in Genetic Evaluations
Kathy Hanford
U.S. Meat Animal Research Center
Agricultural Research Service
U.S. Department of Agriculture

Outline
• Introduction
• Mixed Models Incorporating Random QTL Effects
• Current/Future Modifications to MTDFREMLQ
• Practical Limitations of MTDFREMLQ
• Applications

Introduction
• Genetic evaluation: genetic improvement of quantitative traits through selection
• Currently uses a polygenic model
  – genes at many loci, each with a small effect
  – measures the cumulative effect
  – analysis with mixed models (software available)
• Add genomic information

Introduction
Two phases in the application of genomic data to livestock improvement:
1) Statistical analysis of genomic information to determine the potential importance of that information (i.e., use of genetic markers to quantify the effects of QTL on traits of economic importance)
2) Inclusion of marker information in the genetic evaluation of potential parents to determine which will have the best progeny (Marker Assisted Selection)

Introduction: QTL Identification
• Methods needed for outbred populations
  – Daughter and granddaughter designs: many half-sib families, with QTL effects estimated for each half-sib family
  – Fernando and Grossman: works with the outbred population as a whole, using both pedigree and marker information; needs complete marker data
  – Other methods, such as MCMC: have primarily been used only in simulations
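
Mixed Model Incorporating Random QTL Effects
$$ y = X\beta + Z_u u + Z_v v + e $$
$$ \mathrm{Var}(u) = A \otimes G, \quad \mathrm{Var}(v) = M \otimes V, \quad \mathrm{Var}(e) = R $$
$v$: $2n \times 1$ vector of QTL allelic effects
($a_i = v_i^p + v_i^m + u_i$, $v_i = [v_i^p, v_i^m]'$)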
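
BLUP equations for Fernando and Grossman model
$$
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z_u & X'R^{-1}Z_v \\
Z_u'R^{-1}X & Z_u'R^{-1}Z_u + A^{-1} \otimes G^{-1} & Z_u'R^{-1}Z_v \\
Z_v'R^{-1}X & Z_v'R^{-1}Z_u & Z_v'R^{-1}Z_v + M^{-1} \otimes V^{-1}
\end{bmatrix}
\begin{bmatrix} \hat{\beta} \\ \hat{u} \\ \hat{v} \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}y \\ Z_u'R^{-1}y \\ Z_v'R^{-1}y \end{bmatrix}
$$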
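To illustrate how the BLUP equations simplify, consider the single-trait case. This special case is not on the slide; it assumes $R = I\sigma_e^2$, $\mathrm{Var}(u) = A\sigma_u^2$ and $\mathrm{Var}(v) = M\sigma_v^2$, where $X$, $Z_u$ and $Z_v$ are the incidence matrices for fixed, polygenic and QTL allelic effects. Multiplying both sides by $\sigma_e^2$ leaves equations that depend only on the variance ratios $\lambda_u = \sigma_e^2/\sigma_u^2$ and $\lambda_v = \sigma_e^2/\sigma_v^2$:

$$
\begin{bmatrix}
X'X & X'Z_u & X'Z_v \\
Z_u'X & Z_u'Z_u + A^{-1}\lambda_u & Z_u'Z_v \\
Z_v'X & Z_v'Z_u & Z_v'Z_v + M^{-1}\lambda_v
\end{bmatrix}
\begin{bmatrix} \hat{\beta} \\ \hat{u} \\ \hat{v} \end{bmatrix}
=
\begin{bmatrix} X'y \\ Z_u'y \\ Z_v'y \end{bmatrix}
$$

Only the inverses $A^{-1}$ and $M^{-1}$ enter the equations, so neither $A$ nor $M$ has to be formed and inverted explicitly.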

Numerator Relationship Matrix (A)
• The probability that alleles are IBD (identical by descent)
  – probability between two half sibs is 0.25
• Need the inverse of A
  – depends on pedigree information
  – computed directly (Henderson, 1976)
  – relatively few nonzero elements (sparse)
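A minimal Python sketch of the direct construction may help here. It applies Henderson's rules for $A^{-1}$ in the non-inbred case and stores only the nonzero elements. A tabular construction of $A$ itself is included purely to check the result. The function names and the pedigree layout (a list of `(sire, dam)` index pairs, parents listed before progeny, `None` for an unknown parent) are assumptions made for this illustration and are not taken from MTDFREML.

```python
def relationship_matrix(pedigree):
    """Tabular method for A; parents must precede their progeny."""
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (s, d) in enumerate(pedigree):
        both_known = s is not None and d is not None
        # Diagonal: 1 + inbreeding (half the relationship between the parents).
        A[i][i] = 1.0 + (0.5 * A[s][d] if both_known else 0.0)
        for j in range(i):
            # Off-diagonal: average relationship of j with the parents of i.
            a = 0.0
            if s is not None:
                a += 0.5 * A[j][s]
            if d is not None:
                a += 0.5 * A[j][d]
            A[i][j] = A[j][i] = a
    return A


def a_inverse(pedigree):
    """Henderson's rules for A^-1, ignoring inbreeding.

    Returns a dict {(row, col): value} holding only the nonzero elements.
    """
    ainv = {}
    for i, (s, d) in enumerate(pedigree):
        parents = [p for p in (s, d) if p is not None]
        # alpha = 1 / (Mendelian sampling variance) for 0, 1 or 2 known parents.
        alpha = {0: 1.0, 1: 4.0 / 3.0, 2: 2.0}[len(parents)]
        ainv[(i, i)] = ainv.get((i, i), 0.0) + alpha
        for p in parents:
            ainv[(i, p)] = ainv.get((i, p), 0.0) - alpha / 2
            ainv[(p, i)] = ainv.get((p, i), 0.0) - alpha / 2
            for q in parents:
                ainv[(p, q)] = ainv.get((p, q), 0.0) + alpha / 4
    return ainv


# Sire 0 mated to dams 1 and 2; animals 3 and 4 are paternal half sibs.
pedigree = [(None, None), (None, None), (None, None), (0, 1), (0, 2)]
A = relationship_matrix(pedigree)
ainv = a_inverse(pedigree)
print(A[3][4])      # relationship between the two half sibs
print(len(ainv))    # number of stored nonzero elements
```

Each animal contributes at most nine elements to $A^{-1}$, which is why the inverse stays sparse however large the pedigree becomes.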

Gametic (QTL) Correlation Matrix (M)
• The probability that alleles are IBD
• Need the inverse of M
  – depends on pedigree information
  – depends on the probabilities that QTL alleles are IBD
  – computed directly if marker information is complete (Abdel-Azim and Freeman, 2001)
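The dependence of M on pedigree and IBD probabilities can be sketched with a tabular recursion. The sketch below builds M itself, not its inverse, and it is not the direct-inverse algorithm of Abdel-Azim and Freeman. Each animal has a paternal and a maternal QTL allele. A non-founder's allele is related to any earlier allele through a weighted average over the two alleles of the parent that transmitted it. The weights `p_sire` and `p_dam` (the probability that the transmitted allele is the parent's paternal allele, given the markers) are hypothetical inputs. In practice they would be derived from the output of an IBD program, whose file format is not assumed here.

```python
def gametic_matrix(pedigree):
    """Gametic relationship matrix M for one QTL (2 alleles per animal).

    pedigree: list of (sire, dam, p_sire, p_dam), parents before progeny,
    None for an unknown parent. Allele 2*i is the paternal allele of
    animal i and allele 2*i + 1 is its maternal allele.
    """
    n = len(pedigree)
    M = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i, (sire, dam, p_sire, p_dam) in enumerate(pedigree):
        for g, parent, p in ((2 * i, sire, p_sire), (2 * i + 1, dam, p_dam)):
            M[g][g] = 1.0
            if parent is None:
                continue  # founder allele: unrelated to all earlier alleles
            pat, mat = 2 * parent, 2 * parent + 1
            for h in range(g):
                # Weighted average over the two alleles of the parent.
                M[g][h] = M[h][g] = p * M[pat][h] + (1.0 - p) * M[mat][h]
    return M


def additive_relationship(M, i, j):
    """Numerator relationship implied by M: half the sum over allele pairs."""
    gi, gj = (2 * i, 2 * i + 1), (2 * j, 2 * j + 1)
    return 0.5 * sum(M[a][b] for a in gi for b in gj)


# Paternal half sibs (animals 3 and 4) with uninformative markers (p = 0.5).
pedigree = [(None, None, 0.5, 0.5)] * 3 + [(0, 1, 0.5, 0.5), (0, 2, 0.5, 0.5)]
M = gametic_matrix(pedigree)
print(M[6][8], additive_relationship(M, 3, 4))
```

With uninformative markers the implied additive relationship between the half sibs reduces to the pedigree value of 0.25. Informative markers move the paternal-allele entry toward 0 or 1.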

Practical Issues in Calculating the QTL Correlation Matrix
• Outbred population
• Sparse marker information
• Individuals with missing or incomplete marker data, some of which will be incorrect
• Large, complex pedigrees (inbreeding and loops)

Complex Pedigree
[Figure: pedigree diagram of individuals A, B, C, D, and E, with marker genotypes labeled A1A2 or unknown (??)]
Software
• MCMC
  – LOKI
• DET
  – Pong-Wong et al.
• Allelic Peeling
  – GenoProb

Size Considerations
• Each additional QTL increases the number of equations by 2 times the number of animals in the pedigree
• Sparse matrix storage
  – only nonzero elements are stored
  – polygenic ($A^{-1}$) grows by 4 times the number of animals
  – gametic ($M^{-1}$) grows by 15 times the number of animals
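The arithmetic behind these size rules can be sketched as follows. The sketch rests on two assumptions: the "4 times" and "15 times" figures are read as approximate counts of stored nonzero elements per matrix, and the number of equations is taken to scale linearly with the number of traits. Fixed-effect equations are left out. The function name `problem_size` is hypothetical.

```python
def problem_size(n_animals, n_qtl, n_traits=1):
    """Rough size of the random-effect part of the mixed model equations."""
    polygenic_equations = n_animals * n_traits
    # Two allelic effects per animal for every QTL in the model.
    qtl_equations = 2 * n_animals * n_qtl * n_traits
    return {
        "equations": polygenic_equations + qtl_equations,
        "a_inverse_nonzeros": 4 * n_animals,            # assumed reading of "4 times"
        "m_inverse_nonzeros": 15 * n_animals * n_qtl,   # one M^-1 per QTL
    }


print(problem_size(50_000, n_qtl=2))
```

For 50,000 animals, one trait and two QTL, this gives 250,000 random-effect equations, of which 200,000 come from the QTL.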

MTDFREML
• Multiple Trait Derivative-Free Restricted (or Residual) Maximum Likelihood
• A set of programs to obtain estimates of variances and covariances
• USDA/ARS: Dale Van Vleck
  – Keith Boldman, Lisa Kriese, Curt Van Tassell, Steve Kachman, Joerg Dodenhoff

MTDFREML
• MTDFNRM: calculates and outputs the inverse of the numerator relationship matrix
• MTDFPREP: sets up the model for the analysis
• MTDFRUN: runs the analysis using the files produced by MTDFNRM and MTDFPREP to obtain (co)variance estimates and breeding values

Current Modifications to MTDFREML to Incorporate QTL Effects (MTDFREMLQ)
• MTDFNRMQ: modified to calculate the inverse of the QTL correlation matrix ($M^{-1}$) from an IBD probability file (produced by GenoProb, Loki, etc.)
  – non-inbred pedigree when marker data are incomplete
  – inbred pedigree when marker data are complete
  – genetic groups arising from different populations with different prior selection

Current Modifications to MTDFREMLQ (cont.)
• MTDFPRPQ: modified to include multiple QTL in the model (validated for single QTL)
  – multiple trait
  – gametic imprinting (coded, not validated)

Current Modifications to MTDFREMLQ (cont.)
• MTDFRUNQ: modified to include $M^{-1}$ and the associated between-trait (co)variances for each QTL ($V^{-1}$)
  – assumes independence between two QTLs

Further Modifications to MTDFREMLQ
• Include inbred pedigree when marker data are incomplete (approximate $M^{-1}$)
• Calculate standard errors for the parameters using the delta method
  – currently in MTDFREML
  – in the testing/debugging stage: appears to work for single-trait, single-QTL and two-trait, single-QTL cases; still needs to be tested for multiple-QTL cases
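As a generic illustration of the delta method, the sketch below approximates the standard error of a ratio of variance components, such as a heritability-type parameter $\sigma_a^2 / (\sigma_a^2 + \sigma_e^2)$. The choice of parameter, the function name `ratio_and_se`, and the input values are assumptions made for this example. They are not taken from MTDFREML or MTDFREMLQ.

```python
import math


def ratio_and_se(var_a, var_e, cov):
    """Delta-method standard error of var_a / (var_a + var_e).

    cov is the 2x2 sampling covariance matrix of the estimates
    (var_a, var_e), for example from an inverted information matrix.
    """
    total = var_a + var_e
    ratio = var_a / total
    # Gradient of the ratio with respect to (var_a, var_e).
    grad = (var_e / total**2, -var_a / total**2)
    # First-order approximation: Var(f) ~ grad' * cov * grad.
    variance = sum(
        grad[i] * cov[i][j] * grad[j] for i in range(2) for j in range(2)
    )
    return ratio, math.sqrt(variance)


# Hypothetical estimates and sampling (co)variances.
ratio, se = ratio_and_se(1.0, 1.0, [[0.04, 0.0], [0.0, 0.04]])
print(ratio, se)
```

The approximation is first order, so it is most reliable when the sampling variances of the components are small relative to the estimates.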

Practical Limitations of MTDFREMLQ
Memory limitations by number of animals, traits, and QTL

50,000 animals    1 Trait   2 Traits   3 Traits   4 Traits
1 QTL             <268M     324M       778M       1.6G
2 QTLs            <268M     552M       1.5G
3 QTLs            <268M     919M
4 QTLs            314M      1.4G
5 QTLs            430M

20,000 animals    3 Traits  4 Traits   5 Traits
1 QTL             362M      698M       1.2G
2 QTLs            642M      1.3G
3 QTLs            1.1G

100,000 animals   1 Trait   2 Traits
1 QTL             <268M     562M
2 QTLs            <268M     1.0G
3 QTLs            376M
4 QTLs            542M
5 QTLs            775M

Time limitations

Applications
• QTL detection
  – find and utilize QTL in a breed and include that information in national genetic evaluation
• Marker Assisted Selection
• Experimental herds
  – the twinning herd at MARC
    – currently producing about 50% twin calving compared to a normal range of 1-3%
    – ~6000 in genetic evaluation, marker data from 1994 on over 3000 animals in regions of 3 QTLs