TRANSCRIPT
Forecasting with Weakly Identified Linear State-Space Models
Sebastien Blais
Bank of Canada
4 November 2009
Outline
1 Motivation
2 Weakly Identified LSSMs
3 Simulation Results
4 The Identification Principle
5 Conclusion
1 Motivation
Likelihood-based inference
Consider a model with $E[y_{t+1} \mid y_{1:t}, \theta] = f(y_{1:t}, \theta)$ and a prior distribution $p(\theta)$.
The predictive density is
$$p(y_{T+1} \mid y_{1:T}) = \int p(y_{T+1} \mid y_{1:T}, \theta)\, p(\theta \mid y_{1:T})\, d\theta.$$
Minimizing expected squared error yields the point forecast
$$E[y_{T+1} \mid y_{1:T}] = \arg\min_{\delta} \int (y_{T+1} - \delta)^2\, p(y_{T+1} \mid y_{1:T})\, dy_{T+1} \;\neq\; f(y_{1:T}, \theta_{MLE}) \text{ in finite samples.}$$
When does it matter?
Posterior averaging is beneficial when a linear state-space model is weakly identified.
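As a concrete illustration of the integral above, here is a minimal sketch of how the posterior-averaged point forecast is usually computed from MCMC output, compared with the classical plug-in forecast. The names `conditional_mean` and `theta_draws` are hypothetical placeholders (a model-specific mean function and a set of posterior draws), not objects from the paper.

```python
import numpy as np

def posterior_averaged_forecast(conditional_mean, y, theta_draws):
    """Monte Carlo version of E[y_{T+1} | y_{1:T}]: average the conditional
    mean f(y_{1:T}, theta) over draws from p(theta | y_{1:T})."""
    return np.mean([conditional_mean(y, theta) for theta in theta_draws], axis=0)

def plug_in_forecast(conditional_mean, y, theta_mle):
    """Classical plug-in forecast f(y_{1:T}, theta_MLE); in finite samples this
    generally differs from the posterior-averaged forecast above."""
    return conditional_mean(y, theta_mle)
```

The two coincide only in special cases (for example when f is linear in θ and the posterior mean equals the MLE); the simulation results later in the talk measure how much the difference matters.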
When is a LSSM weakly identified?
When the “true” parameter values are “close” to the region where the model is locally unidentified.
Weak identification (empirical underidentification, local almost nonidentification) is a finite-sample problem.
Examples:
Multicollinearity - correlation is “close” to being equal to 1
Weak instruments - correlation is “close” to being equal to 0
ARMA processes - MA and AR roots are “close” to canceling out
Consequences: irregular finite-sample distributions, far from normal (e.g. multimodal)
Parameter point estimators are bad summaries of uncertainty (e.g. strong bias)
Unreliable asymptotics
Confidence regions can be disjoint (challenges for communication)
Related literature
1 Weak identification and “irregular” distributions.
The likelihood function of a finite mixture distribution is invariant with respect to permutations of its component distributions (“Label switching”): Dick and Bowden (1973), Redner and Walker (1984), Stephens (1997, 2000), Celeux, Hurn and Robert (2000), Fruhwirth-Schnatter (2001), Geweke (2007).
The likelihood function of latent factor models is invariant with respect to factor permutations: Jennrich (1978).
The likelihood function of latent factor models is invariant with respect to factor reflections (“Sign switching”): Blais (this paper), Box and Jenkins (1976), Stoffer and Wall (1991), Kleibergen and Hoek (2000), Fruhwirth-Schnatter and Wagner (2008).
No valid bounded confidence interval for a parameter exists if this parameter is not identifiable on a subset of the parameter space: Gleser and Hwang (1987), Dufour (1997).
2 How best to normalize?
Normalization in structural equation models affects the finite-sample distribution of OLS and 2SLS estimators (Hillier, 1990).
Normalization becomes critical when weak identification issues arise (Hamilton, Waggoner and Zha, 2007): “A poor normalization can lead to multimodal distributions, disjoint confidence intervals, and very misleading characterizations of the true statistical uncertainty.” They propose an “identification principle”.
2 Weakly Identified LSSMs
The likelihood function of Gaussian Linear State-Space Models (LSSMs)
$$y_t \;(N \times 1) = B + H'\xi_t + w_t, \qquad w_t \sim N(0, R)$$
$$\xi_{t+1} \;(K \times 1) = F\xi_t + v_t, \qquad v_t \sim N(0, Q)$$
is invariant with respect to linear transformations: for any invertible matrix M,
$$l(B, H, R, F, Q \mid y) = l(B, M^{-1\prime}H, R, MFM^{-1}, MQM' \mid y) \qquad \forall y \in \mathcal{Y},$$
since
$$y_t = B + H'M^{-1}M\xi_t + w_t, \qquad w_t \sim N(0, R)$$
$$M\xi_{t+1} = MFM^{-1}\,M\xi_t + Mv_t, \qquad Mv_t \sim N(0, MQM').$$
Elementary linear transformations:
M = D : diagonal scale matrix
M = O : rotation matrix
M = P : permutation matrix
M = S : diagonal reflection matrix
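As a quick numerical check of this invariance (an illustrative sketch, not code from the paper), the snippet below evaluates the Gaussian prediction-error-decomposition log-likelihood with a textbook Kalman filter, then re-evaluates it after transforming (H, F, Q) by an arbitrary invertible M; the two values agree up to rounding. The filter is initialized at the stationary distribution of ξ, which transforms consistently under M.

```python
import numpy as np

def kalman_loglik(y, B, H, R, F, Q):
    """Log-likelihood of y_t = B + H' xi_t + w_t, xi_{t+1} = F xi_t + v_t,
    via the prediction-error decomposition. y is (T, N); H is (K, N).
    The filter starts from the stationary distribution of xi."""
    T, N = y.shape
    K = F.shape[0]
    # stationary state covariance: vec(P) = (I - F kron F)^{-1} vec(Q)
    P = np.linalg.solve(np.eye(K * K) - np.kron(F, F), Q.reshape(-1)).reshape(K, K)
    a = np.zeros(K)
    ll = 0.0
    for t in range(T):
        v = y[t] - B - H.T @ a                 # one-step-ahead forecast error
        S = H.T @ P @ H + R                    # its covariance
        Sinv = np.linalg.inv(S)
        _, logdet = np.linalg.slogdet(S)
        ll += -0.5 * (N * np.log(2 * np.pi) + logdet + v @ Sinv @ v)
        G = F @ P @ H @ Sinv                   # Kalman gain (prediction form)
        a = F @ a + G @ v
        P = F @ P @ F.T + Q - G @ S @ G.T
    return ll

# simulate a small bivariate LSSM and check invariance under an arbitrary M
rng = np.random.default_rng(0)
N, K, T = 2, 2, 200
B, R = np.zeros(N), np.eye(N)
H = rng.normal(size=(K, N))
F, Q = np.diag([0.9, 0.5]), np.eye(K)
xi, ys = np.zeros(K), []
for t in range(T):
    xi = F @ xi + rng.multivariate_normal(np.zeros(K), Q)
    ys.append(B + H.T @ xi + rng.multivariate_normal(np.zeros(N), R))
y = np.asarray(ys)

M = rng.normal(size=(K, K))                    # any invertible matrix
Minv = np.linalg.inv(M)
print(kalman_loglik(y, B, H, R, F, Q))
print(kalman_loglik(y, B, Minv.T @ H, R, M @ F @ Minv, M @ Q @ M.T))  # same value
```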
Example
LSSMs are invariant with respect to a (finite) set of $2^K$ reflections. With K = 2, these transformations are
$$S \in \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \right\}.$$
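A one-liner (illustrative, not from the paper) that enumerates this finite reflection group for any K:

```python
import itertools
import numpy as np

K = 2
reflections = [np.diag(signs) for signs in itertools.product([1.0, -1.0], repeat=K)]
for S in reflections:          # the four 2x2 sign matrices listed above
    print(S)
```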
Definition
A normalization is a parameter subspace $\Theta^N \subseteq \Theta$.
Example
$$y_t = B + H'\xi_t + w_t, \qquad w_t \sim N(0, R)$$
$$\xi_{t+1} = F\xi_t + v_t, \qquad v_t \sim N(0, Q)$$
The normalization
$$\Theta^{Q1} = \left\{ \theta \in \Theta \mid Q_{kk} = 1,\; k = 1, \ldots, K \right\}$$
breaks scale invariance.
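As an illustration (a sketch, not the paper's code), imposing $\Theta^{Q1}$ amounts to applying the diagonal transformation $M = D$ with $D_{kk} = Q_{kk}^{-1/2}$ to any parameter point:

```python
import numpy as np

def impose_unit_state_variances(H, F, Q):
    """Map (H, F, Q) to the observationally equivalent point with Q_kk = 1,
    using M = diag(Q_kk^{-1/2}):  H -> M^{-1}' H,  F -> M F M^{-1},  Q -> M Q M'."""
    M = np.diag(1.0 / np.sqrt(np.diag(Q)))
    Minv = np.linalg.inv(M)
    return Minv.T @ H, M @ F @ Minv, M @ Q @ M.T
```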
What can be done?
Taking parameter uncertainty into account will improve forecasts → Simulation results.
Taking parameter uncertainty into account becomes more beneficial the weaker the identification of some parameters.
A normalization ensuring global identification may help communication → An identification principle.
Normalizations satisfying the identification principle are more likely to yield unimodal distributions. Many normalizations satisfy this principle: it may be useful to try several of them.
3 Simulation Results
Gibbs sampler
Every parameter except γ admits a conditionally conjugate prior.
I use a random-walk Metropolis-Hastings step to draw γ with the latent factors as a single block:
$$q(\gamma', \xi' \mid y, \gamma, \Phi, \Sigma_\gamma) = p(\xi' \mid y, \gamma', \Phi)\, \varphi(\gamma' \mid \gamma, \Sigma_\gamma),$$
where $p(\xi' \mid y, \gamma', \Phi)$ is available in closed form and $\Phi \equiv \{B, Q, R, F, \xi_1\}$.
Note: the joint proposal does not depend on ξ.
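The slide does not spell out the acceptance step, but with this proposal the state densities cancel from the Metropolis-Hastings ratio, so only the states-integrated-out (Kalman filter) likelihood of γ is needed. The sketch below shows that logic under the assumption of a symmetric Gaussian random walk for γ; all function names (`marginal_loglik`, `log_prior`, `draw_states`) are placeholders for model-specific routines, not the paper's code.

```python
import numpy as np

def rw_mh_gamma_step(gamma, xi, y, Phi, Sigma_gamma,
                     marginal_loglik, log_prior, draw_states, rng):
    """One block update of (gamma, xi).  The proposal is
    q(gamma', xi' | ...) = p(xi' | y, gamma', Phi) * N(gamma' | gamma, Sigma_gamma);
    because xi' is proposed from its exact conditional and the gamma step is
    symmetric, the acceptance ratio reduces to the ratio of p(y | gamma, Phi) p(gamma)."""
    L = np.linalg.cholesky(Sigma_gamma)
    gamma_prop = gamma + L @ rng.standard_normal(gamma.shape)
    log_ratio = (marginal_loglik(y, gamma_prop, Phi) + log_prior(gamma_prop)
                 - marginal_loglik(y, gamma, Phi) - log_prior(gamma))
    if np.log(rng.uniform()) < log_ratio:
        # draw the latent factors only when the move is accepted
        return gamma_prop, draw_states(y, gamma_prop, Phi, rng), True
    return gamma, xi, False
```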
Posterior averaging is beneficial for weakly identified LSSMs.

H        All             ARMA(1,1)       AR(1)
0.005    0.795 (0.020)   0.817 (0.022)   0.655 (0.048)
         M = 1000        M = 898         M = 102
0.010    0.852 (0.024)   0.880 (0.027)   0.737 (0.050)
         M = 1000        M = 884         M = 116
0.050    0.883 (0.024)   0.919 (0.028)   0.748 (0.051)
         M = 1000        M = 871         M = 129
0.100    0.961 (0.023)   0.968 (0.026)   0.908 (0.059)
         M = 1000        M = 876         M = 124

(H increases from weaker reflection identification, top row, to stronger reflection identification, bottom row.)

The data-generating process (an ARMA(1,1)) is
$$\xi_t = F\xi_{t-1} + v_t, \qquad y_t = B + H'\xi_t + w_t,$$
with B = 0, R = 1, Q = 1, F = 0.95.
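For reference, a minimal simulator of this data-generating process (an illustrative sketch; the sample size, seed, and function name are not from the paper):

```python
import numpy as np

def simulate_arma11_dgp(T, H=0.005, F=0.95, B=0.0, Q=1.0, R=1.0, seed=0):
    """xi_t = F xi_{t-1} + v_t,  y_t = B + H xi_t + w_t.
    A small |H| makes the sign (reflection) of the factor weakly identified."""
    rng = np.random.default_rng(seed)
    xi = rng.normal(0.0, np.sqrt(Q / (1.0 - F ** 2)))   # stationary initial state
    y = np.empty(T)
    for t in range(T):
        xi = F * xi + rng.normal(0.0, np.sqrt(Q))
        y[t] = B + H * xi + rng.normal(0.0, np.sqrt(R))
    return y

y = simulate_arma11_dgp(T=200, H=0.005)
```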
4 The Identification Principle
Definition
Let $\Theta^l$ denote the nonidentification subset. A normalization $\Theta^N \subseteq \Theta$ satisfies the identification principle if it
a) satisfies $\mathrm{int}(\Theta^N) \cap \Theta^l = \emptyset$ and $\Theta^l \subseteq \mathrm{fr}(\Theta^N)$;
b) is connected;
c) provides global identification.
Note: intersections of hyperplanes and half-spaces are connected.
Hamilton, Waggoner and Zha (2007): “Our proposal is that the boundaries of [a normalization set] A should correspond to the loci along which the structure is locally unidentified or the log likelihood is −∞.” “One easy way to check whether a proposed normalization set A conforms to this identification principle is to make sure that the model is locally identified at all interior points of A.”
Example (Harvey, 1989, K = 2)
$$y_t = B + H'\xi_t + w_t, \qquad w_t \sim N(0, R)$$
$$\xi_{t+1} = F\xi_t + v_t, \qquad v_t \sim N(0, Q)$$
The normalization $\{\theta \in \Theta \mid H_{12} = 0\}$ does not provide global identification, because it does not break permutation invariance if $H^0_{22} = 0$ under the data-generating process:
$$H = \begin{bmatrix} H_{11} & 0 \\ H_{21} & H_{22} \end{bmatrix} \;\xrightarrow{\,H_{22} = 0\,}\; \begin{bmatrix} H_{11} & 0 \\ H_{21} & 0 \end{bmatrix} \;\xrightarrow{\,\text{permute factors}\,}\; \begin{bmatrix} H_{21} & 0 \\ H_{11} & 0 \end{bmatrix}.$$
The sampling distribution of $H_{11}$ will be bimodal for sufficiently large samples if $H_{22}$ is close enough to 0.
A normalization of LSSMs that ensures global identification.
I propose
$$\Theta^{HO} = \left\{ \theta \in \Theta \mid HH' = I \right\},$$
which breaks scale and rotation invariance, but preserves permutation and reflection invariance.
I parameterize the $K \times N$ row-orthogonal matrix $H$ as $H' = B_1 B_2 \cdots B_K U$, where
$$B_k = \rho_{k,k+1}\,\rho_{k,k+2} \cdots \rho_{k,N}, \qquad \gamma_{k,n} = \arctan\!\left( \frac{H_{k,n+1}}{\sqrt{\sum_{i=1}^{n} H_{k,i}^2}} \right),$$
$\rho_{i,j}$ is the $N \times N$ Givens rotation that equals the identity except in rows and columns $i$ and $j$, where it contains $\begin{bmatrix} \cos\gamma_{i,j} & -\sin\gamma_{i,j} \\ \sin\gamma_{i,j} & \cos\gamma_{i,j} \end{bmatrix}$, and $U$ $(N \times K) = \begin{bmatrix} I \\ 0 \end{bmatrix}$.
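A small sketch (illustrative; the indexing convention and the exact angle-to-loading mapping may differ from the paper's) that builds $H' = B_1 \cdots B_K U$ from Givens rotations and verifies $HH' = I$:

```python
import numpy as np

def givens(N, i, j, angle):
    """N x N rotation in the (i, j) coordinate plane (0-based indices)."""
    G = np.eye(N)
    c, s = np.cos(angle), np.sin(angle)
    G[i, i], G[j, j] = c, c
    G[i, j], G[j, i] = -s, s
    return G

def row_orthogonal_H(angles, N, K):
    """Build H' = B_1 B_2 ... B_K U with B_k = rho_{k,k+1} ... rho_{k,N}
    and U = [I_K; 0] (N x K); angles[k] holds the N-1-k angles for factor k."""
    Hp = np.vstack([np.eye(K), np.zeros((N - K, K))])   # U
    for k in reversed(range(K)):
        Bk = np.eye(N)
        for idx, j in enumerate(range(k + 1, N)):
            Bk = Bk @ givens(N, k, j, angles[k][idx])
        Hp = Bk @ Hp
    return Hp                                           # N x K, orthonormal columns

# quick check that H H' = I
N, K = 4, 2
rng = np.random.default_rng(0)
angles = [rng.uniform(-np.pi / 2, np.pi / 2, size=N - 1 - k) for k in range(K)]
Hp = row_orthogonal_H(angles, N, K)
print(np.allclose(Hp.T @ Hp, np.eye(K)))   # True
```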
Conclusion
This paper
argues that LSSMs are subject to weak identification issues;
shows that posterior averaging is beneficial when forecasting with weakly identified LSSMs;
offers a normalization which can alleviate communication problems caused by weak identification;
describes a novel Gibbs sampler for Gaussian LSSMs.