dynamic prediction pools: an investigation of financial frictions

Dynamic Prediction Pools:An Investigation of Financial Frictions and

Forecasting Performance

Marco Del Negro, Raiden B. HasegawaFederal Reserve Bank of New York

Frank SchorfheideUniversity of Pennsylvania

March 2014

Disclaimer: The views expressed here do not necessarily reflect those of the Federal Reserve Bankof New York or the Federal Reserve System

Introduction

Model uncertainty is pervasive.

How should we combine models for forecasting, policy analysis, riskassessment?

Especially, if we suspect that their relative performance may changeover time (e.g., financial crisis vs ‘normal’ times)...

We consider dynamic prediction pools of the form

p(y1:T |λ1:T ,Mλ) =T∏t=1

{λtp(yt |y1:t−1,M1)

+(1− λt)p(yt |y1:t−1,M2)

}.

Del Negro, Hasegawa, Schorfheide Dynamic Pools

Application

Are financial variables (spreads) useful in forecasting macroeconomic outcomes?

Many macroeconomists paid scant attention to financialfrictions/variables before the recent crisis.

Why? Did these models not forecast very well in ‘normal’ times – ordid they/we not look at existing data well enough?

Which models should policymakers use now?

We focus on forecasting with DSGE models (Del Negro andSchorfheide 2013, Handbook of Economic Forecasting).


DSGE forecasts of the Great RecessionSmets-Wouters w/ Smets-Wouters w/

Infl Exp Infl Exp & Fin FrictionsOutput Growth

2004 2005 2006 2007 2008 2009 2010 2011 2012−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

3

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

3

2004 2005 2006 2007 2008 2009 2010 2011 2012−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

3

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

3

Inflation

2004 2005 2006 2007 2008 2009 2010 2011 2012−0.5

0

0.5

1

1.5

−0.5

0

0.5

1

1.5

2004 2005 2006 2007 2008 2009 2010 2011 2012−1

−0.5

0

0.5

1

1.5

−1

−0.5

0

0.5

1

1.5


Combining Models - A Stylized Framework

Estimation of large-scale DSGE models can be tedious andcomputationally costly – and is typically delegated to expertmodelers.

We have a principal-agent setting in mind...

Principal = policy maker = Dudley, Yellen, ... who aggregatesinformation obtained from modelers.

Agents = econometric modelers = Del Negro, Hasegawa,Schorfheide... who provide principal with predictive densitiesp(yt |y1:t−1,Mi ), t = 1, . . . ,T .

The agents are rewarded based on the realized value ofln p(yt |y1:t−1,Mi ) (induces truth-telling).


Bayesian Model Averaging (BMA)

At any time T the policy maker can use the predictive densities toform marginal likelihoods:

p(y1:T |Mi ) =T∏t=1

p(yt |y1:t−1,Mi ).

The predictive likelihoods can be used to update model probabilities.

Assume λBMAT = P[M1 is correct|y1:T ] and let λBMA

0 be priorprobability of M1.

Then

λBMAT =

λBMA0 p(y1:T |M1)

λBMA0 p(y1:T |M1) + (1− λBMA

0 )p(y1:T |M2).


Bayesian Model Averaging (BMA)

In real time, policy maker can form time T predictive densities foryT+1 as

p(yT+1|y1:T ) = λBMAT p(yT+1|y1:T ,M1)

+(1− λBMAT )p(yT+1|y1:T ,M2).

Policy maker can use p(yT+1|y1:T ) to make decisions that minimizeposterior expected loss – provided that M1 and M2 are the onlymodels on the table.


What If...

the policy maker believes that the econometric modelers (Del Negro,Hasegawa, Schorfheide ...) didn’t get it quite right?

DGP = p(y1:T )

p(y1:T |M1) p(y1:T |M2)

λBMAT

a.s.−→ 1 or 0 as T −→∞ (unless models generate very similarforecasts) – Dawid (1984).

Asymptotically, no model averaging!


Remedy

A policy maker concerned about misspecification of M1 could createnew models by creating time-varying convex combinations ofpredictive densities (requires no DSGE model estimation skills!):

DGP = p(y1:T )

p(y1:T |M1) p(y1:T |M2)

p(y1:T |λ1:T ,Mλ) =∏T

t=1

{λt p (yt |y1:t−1,M1) + (1− λt) p (yt |y1:t−1,M2)

}Policy maker only needs to make inference with respect to λ1:T .


Some References

Bayesian model averaging: Leamer (1978), ...

Dynamic model averaging: Raftery et al. (2010), Koop and Koribilis(2011)

Static prediction pools: Hall and Mitchell (2007), Geweke andAmisano (2011), ...

Nonlinear prediction pools: Gneiting and Ranjan (2013), Fawcett etal. (2013)

Dynamic prediction pools with regime switching: Waggoner and Zha(2012)

General dynamic combination of predictive densities usingconvolutions: Billio et al. (2013)


Our Contribution

Replace static (i.e., constant λ) weights used in Geweke andAmisano (2011) with dynamic (i.e., time-varying λ1:T ) weights.

Conduct inference on the time series of λ1:T :

Is there time variation?

Was the weight skewed toward one model vs. the other before theGreat Recession?

What is λT now?

DSGE model application


Odds and Ends

Information set used to estimate DSGE models may be larger thany1:t . Let It = {y1:t , z1:t}.

Modelers report p(yt |It−1,Mi ).

For any set of beliefs p(zt |yt , It−1,P), t = 1, . . . ,T , the policymaker can form the joint distribution:

p(y1:T , z1:T |λ1:T )

=T∏t=1

{(λtp(yt |It−1,M1) + (1− λt)p(yt |It−1,M1)

)×p(zt |yt , It−1,P)

}Thus,

p(λ1:T |IT ) ∝ p(λ1:T )T∏t=1

(λtp(yt |It−1,M1)+(1−λt)p(yt |It−1,M1)

).


Dynamic Pools - (Hierarchical) Prior

We index prior by hyperparameter ρ that controls the amount of“smoothing.”

Prior p(λ1:T |ρ) for sequence λ1:T :

xt = ρxt−1 +√

1− ρ2εt , εt ∼ iid N(0, 1), x0 ∼ N(0, 1),

λt = Φ(xt)

where Φ(.) is the Gaussian CDF.

Remarks:

Unconditionally, λt ∼ U[0, 1] for all t.

As ρ −→ 1: dynamic pool −→ static pool.

Alternative (not yet implemented): Allow for a mean µ of xt so thatρ < 1 6−→ E[λt ] = 1/2 for t →∞ (prior, µ ∼ N (0, σ2

µ))


Dynamic Pools - Nonlinear State Space System

Measurement equation:

p(yt |It−1, λt) = λtp(yt |It−1,M1) + (1− λt)p(yt |It−1,M2)

Transition equation:

λt = Φ(xt), xt = ρxt−1 +√

1− ρ2εt , εt ∼ iid N(0, 1)

Use particle filter to construct the sequence p(λt |ρ, y1:t).

Predictive density:

p(yt |y1:t−1, ρ)

=

∫ {λtp(yt |It−1,M1)

+(1− λt)p(yt |It−1,M2)

}p(λt |y1:t−1, ρ)dλt

= λt|t−1p(yt |It−1,M1) + (1− λt|t−1)p(yt |It−1,M2),

where λt|t−1 = E[λt |ρ, y1:t−1].


The DSGE Models

Smets and Wouters (2007), augmented with time-varying targetinflation rate: SWπ.

A version of SW with financial frictions along the lines ofBernanke-Gertler-Gilchrist (1999) and spreads as observable:SWπFF.


SWπFF: Financial Frictions

Gross nominal return on capital:

Rkt = λrkt + (1− λ)qkt − qkt−1 + πt

Smets-Wouters: arbitrage condition between return on capital andreturn on nominal bond:

Et [Rkt+1] = Rt + bt ,

where bt is a shock (“spread”/“discount”).

Smets-Wouters w/ Financial Frictions: arbitrage condition is

Et [Rkt+1] = Rt + bt+ζsp,b

(qkt + kt − nt

)︸︷︷︸leverage

+σω,t

where σω,t is an additional shock, nt is an additional endogenous

variable, and Rkt − Rt is treated as observed:

Et [Rkt+1 − Rt ] = (Baa Corp. rate - 10y Treas. yield)/4


Empirical Analysis – Data

Variables to be forecast (yt): output growth, inflation.

The first predictive density is computed for 1992:Q1 (t = 1) and thelast for 2011:Q2 (t = T ).

All of our information sets (I0 to IT−1) correspond to real-timedata sets starting in 1964:Q1.

The information sets contain:

SWπ: output growth, inflation, fed funds, consumption growth,investment growth, wage growth, hours worked, 10-yrs InflationExpectations from Surveys.

SWπFF: . . . + spread


Empirical Analysis – Multi-Step Forecasting

In practice, the policy maker may be interested in multi-stepforecasts (and so are we): e.g., average growth/inflation over hperiods:

yt,h =1

h

h−1∑s=0

yt−s .

All subsequent results are based on h = 4.


Empirical Analysis – Multi-Step Forecasting

Recall that we left p(zt |yt , It−1,P) unspecified. This makes itinfeasible to generate multi-step predictions (without furtherassumptions, e.g., It−1 = y1;t−1).

As is common in the literature on predictive regressionsyt = β0 + β1xt−h + ut , we estimate the pooling weights directly,ignoring the overlap between yt,h, yt−1,h, etc.

(Pseudo) measurement equation:

p(h)(yt,h|It−h, λt) = λ(h)t p(yt,h|It−h,M1)+(1−λ(h)

t )p(yt,h|It−h,M2)

Particle filter generates pseudo posteriors p(h)(λ(h)t |ρ, y1:t,h).

We currently use the following predictive density for forecasting:

p(h)(yt,h|ρ, y1:t−h,h) = λ(h)t−h|t−hp(yt,h|It−h,M1)

+(1− λ(h)t−h|t−h)p(yt,h|It−h,M2).


Empirical Analysis

Questions:

1 Is there significant time variation in the relative forecastingperformance of the two models, as captured by the pseudo posteriordistribution p(h)(λ

(h)t |ρ, y1:t,h)?

2 Does p(h)(λ(h)t |ρ, y1:t,h) change rapidly enough when estimated in real

time to offer useful guidance to policy makers or forecasters?

3 Does the dynamic pool (DP) perform better in real time than thestatic pool (SP), or BMA weights, and by how much?

Benchmarks:

1 Static pool based on pseudo posterior p(h)(λ(h)|y1:t,h). FollowingGeweke and Amisano, we use posterior mode rather than posteriormean.

2 Pseudo BMA based on multistep predictive densitiesp(yt,h|It−h,Mi ).


Filtered Posterior p(h)(λ(h)t |ρ = 0.9, y1:t,h), t = 1, . . . ,T –

Weight on SWπFF


Filtered Posterior p(h)(λ(h)t |ρ = 0.9, y1:t,h), t = 1, . . . ,T –

Weight on SWπFF

Whole Sample Great Recession


Log Predictive Scores: SWπFF (red) vs SWπ (blue) vs DP(black)


Dynamic Pool (black), Static Pool (purple), and BMA(green) Weights in Real Time


Pseudo Posterior of ρ: p(h)(ρ|y1:T ,h)


Cumulative Log Predictive Scores

Component Models Model PoolingModel Log score Method Log scoreSWπ -306.34 BMA -275.57

SWπFF -259.58 SP -264.67

EW -260.42

DP -258.53


Estimate Parameters and Combination Weights Jointly?

We introduced a policy maker and several modelers.

In our framework the DSGE models are not re-estimated for differentλ1:T sequences.

If one were to remove the principal-agent friction, one could formthe joint distribution:

p(y1:T , λ1:T , θ(1), θ(2)) =

( T∏t=1

λtp(yt |y1:t−1, θ(1),M1)

+ (1− λt)p(yt |y1:t−1, θ(2),M2)

)× p(θ(1))p(θ(2))p(λ1:T )

Thus, posterior of (θ(1), θ(2)) will generally depend on λ1:T :

p(θ(1), θ(2), λ1:T |y1:T ) = p(θ(1), θ(2)|y1:T , λ1:T )p(λ1:T |y1:T ).

We think thatin central bank application re-estimating the models for differentλ1:T ’s is not practical;our approach allows for different information sets.


Conclusions

Methodology:

There is evidence of time-variation in relative forecastingperformance of different models over time (see also Amisano andGeweke 2013) → Dynamic Pools seem worth exploring.

Substantive:

Evidence of time-variation in relative forecasting performance ofDSGE models with and without financial frictions ...

... yet no excuse for having ignored FF prior to the Great Recession.

Should use financial friction model now.

Evidence in favor of non-linearities?


dynamic prediction pools: an investigation of financial frictions

Documents