Post on 15-Aug-2015
Penalized maximum likelihood estimates of genetic covariance matrices with shrinkage towards phenotypic dispersion
Karin Meyer (1), Mark Kirkpatrick (2), Daniel Gianola (3)
(1) Animal Genetics and Breeding Unit, University of New England, Armidale
(2) Section of Integrative Biology, University of Texas, Austin
(3) University of Wisconsin-Madison, Madison
AAABG 2011
Penalized REML | Introduction
Motivation
Multivariate genetic analyses: more than 2-4 traits
→ desirable!
→ technically increasingly feasible
→ inherently problematic
SAMPLING VARIANCE (S.V.) ↑↑ with no. of traits
Measures to alleviate S.V.
  large → gigantic data sets
  parsimonious models → fewer parameters than covariances
  estimation → use additional information
    Bayesian: prior
    REML: impose penalty P on likelihood
    → P = f(parameters)
    → designed to reduce S.V.
K. M. / M. K. / D. G. | | AAABG 2011 2 / 13
Penalized REML
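Maximize:
$\log \mathcal{L}_P = \log \mathcal{L} - \tfrac{1}{2}\,\psi\,P$
  $\log \mathcal{L}$: standard, unpenalized REML log likelihood
  $\psi$: tuning factor
  $P$: penalty on parameters
Objectives:
  Introduce new type of P → prior: $\Sigma \sim IW$ (inverse Wishart)
  Compare efficacy of different P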
Penalized REML | REML and penalties
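Penalties to improve estimates of $\Sigma_G$
$\hat{\Sigma}_P$ estimated much more accurately than $\hat{\Sigma}_G$
Idea: ‘Borrow strength’ from $\hat{\Sigma}_P$
1 Shrink canonical eigenvalues towards their mean → ‘bending’ (Hayes & Hill 1981)
  $P_\lambda^\ell \propto \sum_i (\log\lambda_i - \bar{\lambda})^2$, with $\lambda_i$: eigenvalues of $\hat{\Sigma}_P^{-1}\hat{\Sigma}_G$
2 a) Shrink $\hat{\Sigma}_G$ towards $\hat{\Sigma}_P$
  → assume $\Sigma_G \sim IW(\Sigma_P^{-1}, \psi)$
  → obtain penalty as minus log density of IW
  $P_\Sigma \propto C \log|\hat{\Sigma}_G| + \mathrm{tr}(\hat{\Sigma}_G^{-1}\hat{\Sigma}_P^{0})$
  b) Shrink $\hat{R}_G$ towards $\hat{R}_P$
  $P_R \propto C \log|\hat{R}_G| + \mathrm{tr}(\hat{R}_G^{-1}\hat{R}_P^{0})$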
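The three penalties on this slide can be evaluated numerically along the following lines. This is only a sketch under stated assumptions, not the authors' implementation: the function names are made up for illustration, the argument `c` stands in for the constant C (whose value the slides do not give), and the eigenvalue penalty takes deviations from the mean of the log eigenvalues, following the phrase "shrink canonical eigenvalues towards their mean".

```python
import numpy as np


def canonical_eigenvalues(sigma_g, sigma_p):
    """Eigenvalues of inv(sigma_p) @ sigma_g, largest first.

    Computed from the symmetric matrix inv(L) @ sigma_g @ inv(L).T,
    where sigma_p = L @ L.T, which has the same eigenvalues.
    """
    chol = np.linalg.cholesky(sigma_p)
    half = np.linalg.solve(chol, sigma_g)   # inv(L) @ sigma_g
    sym = np.linalg.solve(chol, half.T)     # inv(L) @ sigma_g @ inv(L).T
    return np.linalg.eigvalsh(sym)[::-1]


def penalty_eigenvalues(sigma_g, sigma_p):
    """Sum of squared deviations of log canonical eigenvalues from their mean
    (assumed reading of the shrinkage target)."""
    log_lam = np.log(canonical_eigenvalues(sigma_g, sigma_p))
    return float(np.sum((log_lam - log_lam.mean()) ** 2))


def penalty_shrink_towards(estimate, target, c):
    """c * log|estimate| + tr(inv(estimate) @ target); c is a placeholder for C."""
    sign, logdet = np.linalg.slogdet(estimate)
    if sign <= 0:
        raise ValueError("estimate must be positive definite")
    return float(c * logdet + np.trace(np.linalg.solve(estimate, target)))


def cov_to_corr(sigma):
    """Correlation matrix belonging to a covariance matrix."""
    sd = np.sqrt(np.diag(sigma))
    return sigma / np.outer(sd, sd)


# Illustrative values only: identity phenotypic matrix, diagonal genetic matrix.
sigma_p = np.eye(5)
sigma_g = np.diag([0.5, 0.2, 0.15, 0.1, 0.05])
p_lambda = penalty_eigenvalues(sigma_g, sigma_p)
p_sigma = penalty_shrink_towards(sigma_g, sigma_p, c=2.0)
p_r = penalty_shrink_towards(cov_to_corr(sigma_g), cov_to_corr(sigma_p), c=2.0)
```

The same function serves for both shrinkage penalties, since the two expressions differ only in whether covariance or correlation matrices are passed in.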
Penalized REML | Simulation study
Simulation
Paternal half-sib design: s = 100, n = 10
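5 traits, $h_1^2 \ge \dots \ge h_5^2$, MVN
60 sets of population values
→ vary mean & spread of $\lambda_i$
1000 replicates per case
3 penalties
  $P_\lambda^\ell$: regress $\log(\hat{\lambda}_i)$ towards their mean
  $P_\Sigma$: shrink $\hat{\Sigma}_G$ towards $\hat{\Sigma}_P$
  $P_R$: shrink $\hat{R}_G$ towards $\hat{R}_P$
Obtain $\hat{\Sigma}_G^{\psi}$ and $\hat{\Sigma}_E^{\psi}$ for values of $\psi$, range 0 to 1000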
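One replicate of the design described above could be generated roughly as follows. This is a sketch under assumptions: the population matrices are illustrative, the function names are hypothetical, and simple ANOVA-type estimates from the between- and within-sire mean squares stand in for the REML estimates used in the slides.

```python
import numpy as np


def simulate_half_sib(sigma_g, sigma_p, s=100, n=10, rng=None):
    """Multivariate normal records for s sires with n progeny each.

    Returns an array of shape (s, n, q). Sire effects have covariance
    sigma_g / 4; the within-sire covariance is sigma_p - sigma_g / 4.
    """
    rng = np.random.default_rng(rng)
    q = sigma_g.shape[0]
    sigma_s = sigma_g / 4.0
    sigma_w = sigma_p - sigma_s
    sires = rng.multivariate_normal(np.zeros(q), sigma_s, size=s)
    within = rng.multivariate_normal(np.zeros(q), sigma_w, size=(s, n))
    return sires[:, None, :] + within


def mean_squares(y):
    """Between-sire (MSB) and within-sire (MSW) mean square matrices."""
    s, n, q = y.shape
    fam_means = y.mean(axis=1)
    dev_b = fam_means - fam_means.mean(axis=0)
    msb = n * dev_b.T @ dev_b / (s - 1)
    dev_w = (y - fam_means[:, None, :]).reshape(-1, q)
    msw = dev_w.T @ dev_w / (s * (n - 1))
    return msb, msw


def anova_estimates(msb, msw, n):
    """ANOVA-type estimates of the genetic and phenotypic covariance matrices."""
    sigma_s = (msb - msw) / n
    return 4.0 * sigma_s, sigma_s + msw


# Illustrative population values: unit phenotypic variances, uncorrelated traits.
sigma_p = np.eye(5)
sigma_g = np.diag([0.5, 0.2, 0.15, 0.1, 0.05])
y = simulate_half_sib(sigma_g, sigma_p, s=100, n=10, rng=2011)
msb, msw = mean_squares(y)
sigma_g_hat, sigma_p_hat = anova_estimates(msb, msw, n=10)
```

Note that the ANOVA-type estimate of the genetic matrix is not guaranteed to be positive definite, which is one reason a constrained or penalized estimator is of interest in the first place.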
Simulation - cont.
Estimate ψ using population values (V∞)
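→ construct MSB & MSW
→ validation
→ $\hat{\psi}$: maximize log L in valid. data
Evaluate effect of penalty
→ Loss: deviation of $\hat{\Sigma}_X$ from $\Sigma_X$
  $L_1(\Sigma_X, \hat{\Sigma}_X) = \mathrm{tr}(\Sigma_X^{-1}\hat{\Sigma}_X) - \log|\Sigma_X^{-1}\hat{\Sigma}_X| - q$
→ Percentage Reduction In Average Loss
  $\mathrm{PRIAL} = 100\left[1 - \dfrac{\bar{L}_1(\Sigma_X, \hat{\Sigma}_X^{\hat{\psi}})}{\bar{L}_1(\Sigma_X, \hat{\Sigma}_X^{0})}\right]$
  with $\hat{\Sigma}_X^{\hat{\psi}}$ the penalized and $\hat{\Sigma}_X^{0}$ the unpenalized estimate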
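The loss and the PRIAL defined on this slide translate directly into code. The sketch below assumes each set of estimates is supplied as a list of matrices, one per replicate; the function names are illustrative.

```python
import numpy as np


def loss_l1(sigma_true, sigma_hat):
    """L1 = tr(inv(S) @ S_hat) - log|inv(S) @ S_hat| - q."""
    q = sigma_true.shape[0]
    ratio = np.linalg.solve(sigma_true, sigma_hat)   # inv(S) @ S_hat
    sign, logdet = np.linalg.slogdet(ratio)
    if sign <= 0:
        raise ValueError("loss is undefined unless |inv(S) @ S_hat| > 0")
    return float(np.trace(ratio) - logdet - q)


def prial(sigma_true, penalized, unpenalized):
    """Percentage reduction in average loss of penalized vs. unpenalized estimates."""
    loss_pen = np.mean([loss_l1(sigma_true, est) for est in penalized])
    loss_unpen = np.mean([loss_l1(sigma_true, est) for est in unpenalized])
    return 100.0 * (1.0 - loss_pen / loss_unpen)
```

A PRIAL of 100 means the penalized estimates reproduce the population matrix exactly, 0 means no change in average loss, and negative values mean the penalty made the estimates worse on average.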
Penalized REML | Results | PRIAL
PRIAL in estimated covariance matrices
[Figure: PRIAL (%) by penalty, one panel each for the genetic, residual and phenotypic covariance matrices. Values printed in the panels:]
              $P_\lambda^\ell$   $P_\Sigma$   $P_R$
Genetic       71.4               70.6         72.3
Residual      43.4               13.3         37.1
Phenotypic     1.2                1.2          2.2
PRIAL for $\hat{\Sigma}_G$: 60 individual cases, in order of PRIAL for $P_\lambda^\ell$
[Figure: PRIAL (%) for each of the 60 cases (labelled 1A to 5L), one series per penalty: $P_\lambda^\ell$, $P_\Sigma$, $P_R$.]
Extension: PRIAL “double” penalties
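P so far: improve $\hat{\Sigma}_G$
“Double”
→ $P_\Sigma$ on $\hat{\Sigma}_G$ & $\hat{\Sigma}_E$
  $P_R$ on $\hat{R}_G$ & $\hat{R}_E$
  $P_\lambda^\ell$ on $\log\lambda_i$ & $\log(1-\lambda_i)$
→ 1: joint $\psi$
  2: separate $\psi$
PRIAL (%):
Matrix               Penalty on   $\psi$   $P_\lambda^\ell$   $P_\Sigma$   $P_R$
$\hat{\Sigma}_G$     G            -        71                 71           72
$\hat{\Sigma}_G$     G+E          1        73                 70           73
$\hat{\Sigma}_G$     G+E          2        74                 73           74
$\hat{\Sigma}_E$     G            -        43                 13           37
$\hat{\Sigma}_E$     G+E          1        62                 54           60
$\hat{\Sigma}_E$     G+E          2        65                 65           63
Little impact on PRIAL for $\hat{\Sigma}_G$
Can ↑↑ PRIAL for $\hat{\Sigma}_E$ w/out ↓ PRIAL for $\hat{\Sigma}_G$
Separate $\hat{\psi}$: extra effort, limited add. improvement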
Summary: PRIAL
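Substantial improvements in $\hat{\Sigma}_G$ & $\hat{\Sigma}_E$ feasible
→ Results shown ‘optimistic’
→ $\hat{\psi}$ using pop. values
PRIAL heavily influenced by spread of can. eigenvalues
  $P_\lambda^\ell$ best if $\lambda_i \approx \bar{\lambda}$; overshrink if not
  $P_\Sigma$ & $P_R$ worst if $\lambda_i \approx \bar{\lambda}$; more robust than $P_\lambda^\ell$ if $\lambda_i \not\approx \bar{\lambda}$
  similar canonical eigenvalues unlikely in practice?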
New type of P useful
Penalized REML | Results | Bias
Mean estimates of can. eigenvalues
Penalties act differently – illustrated for single case
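$\lambda_i = 0.5, 0.2, 0.15, 0.1, 0.05$; $\bar{\lambda} = 0.2$
[Figure: mean estimates of the canonical eigenvalues $\lambda_1$ to $\lambda_5$ under no penalty (None), $P_\lambda^\ell$, $P_\Sigma$ and $P_R$, shown against the population values.]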
Mean relative bias (in %)
                        1     2     3     4     5
Can. eigenvalues
  None                  9    27    17   -20   -79
  $P_\lambda^\ell$     -4    16    29    58   101
  $P_\Sigma$            8    25    25    39    75
  $P_R$                 1    16    21    37    57
Heritabilities
  None                 -1     4     5     7    12
  $P_\lambda^\ell$     -7     5    12    23    45
  $P_\Sigma$            1    10    15    26    44
  $P_R$                -2     2     5     9    17
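The slides do not spell out how relative bias was computed. The sketch below assumes the usual definition, 100 × (mean estimate − population value) / population value, evaluated per parameter over replicates; the function name and the numbers in the example are illustrative only.

```python
import numpy as np


def mean_relative_bias(estimates, true_values):
    """Relative bias in percent for each parameter.

    estimates: array of shape (replicates, parameters)
    true_values: array of shape (parameters,)
    """
    estimates = np.asarray(estimates, dtype=float)
    true_values = np.asarray(true_values, dtype=float)
    return 100.0 * (estimates.mean(axis=0) - true_values) / true_values


# Two made-up replicates for two parameters with population values 0.5 and 0.2.
bias = mean_relative_bias([[0.55, 0.10], [0.45, 0.10]], [0.5, 0.2])
```

Under this definition a positive entry means the parameter is overestimated on average and a negative entry that it is underestimated.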
Penalized REML | Finale
Conclusions
Can obtain ‘better’ estimates of genetic cov. matrices
→ borrow strength from phenotypic covariance matrix
→ easy to implement by penalty on log L
Comparable improvement (PRIAL) from different P
None best overall
$P_R$: Shrinking $\hat{R}_G$ towards $\hat{R}_P$
  less variable
  less bias in estimates of genetic parameters
  easier to implement than $P_\lambda^\ell$
Penalized REML recommended!
Small to moderate data sets, ≥ 4 traits
P-REML: part of everyday toolkit