2.5 Forecasting and Impulse Response Functions
Principles of forecasting
Forecast based on conditional expectations. Suppose we are interested in forecasting the value of $y_{t+1}$ based on a set of variables $X_t$ (an $m \times 1$ vector). Let $y_{t+1|t}$ denote such a forecast.
To evaluate the usefulness of the forecast we need to specify a loss function. Here we specify a quadratic loss function, meaning that $y_{t+1|t}$ is chosen to minimize $E(y_{t+1} - y_{t+1|t})^2$. This quantity is known as the mean squared error associated with the forecast $y_{t+1|t}$, denoted
$$MSE(y_{t+1|t}) = E(y_{t+1} - y_{t+1|t})^2$$
Fundamental result: the forecast with the smallest MSE is the expectation of $y_{t+1}$ conditional on $X_t$, that is,
$$y_{t+1|t} = E(y_{t+1}|X_t)$$
We now verify the claim. Let $g(X_t)$ be any other function and let $y_{t+1|t} = g(X_t)$. The associated MSE is
$$E[y_{t+1} - g(X_t)]^2 = E[y_{t+1} - E(y_{t+1}|X_t) + E(y_{t+1}|X_t) - g(X_t)]^2$$
$$= E[y_{t+1} - E(y_{t+1}|X_t)]^2 + 2E\{[y_{t+1} - E(y_{t+1}|X_t)][E(y_{t+1}|X_t) - g(X_t)]\} + E[E(y_{t+1}|X_t) - g(X_t)]^2$$
Define
$$\eta_{t+1} \equiv [y_{t+1} - E(y_{t+1}|X_t)][E(y_{t+1}|X_t) - g(X_t)] \tag{24}$$
The conditional expectation is
$$E(\eta_{t+1}|X_t) = E\{[y_{t+1} - E(y_{t+1}|X_t)][E(y_{t+1}|X_t) - g(X_t)] \mid X_t\}$$
$$= [E(y_{t+1}|X_t) - g(X_t)]\, E\{[y_{t+1} - E(y_{t+1}|X_t)] \mid X_t\}$$
$$= [E(y_{t+1}|X_t) - g(X_t)][E(y_{t+1}|X_t) - E(y_{t+1}|X_t)] = 0$$
Therefore, by the law of iterated expectations,
$$E(\eta_{t+1}) = E(E(\eta_{t+1}|X_t)) = 0$$
This means that
$$E[y_{t+1} - g(X_t)]^2 = E[y_{t+1} - E(y_{t+1}|X_t)]^2 + E[E(y_{t+1}|X_t) - g(X_t)]^2$$
Therefore the function that minimizes the MSE is
$$g(X_t) = E(y_{t+1}|X_t)$$
$E(y_{t+1}|X_t)$ is the optimal forecast of $y_{t+1}$ conditional on $X_t$ under a quadratic loss function. The MSE of this optimal forecast is
$$E[y_{t+1} - g(X_t)]^2 = E[y_{t+1} - E(y_{t+1}|X_t)]^2$$
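To make the result concrete, here is a minimal simulation sketch; the data-generating process $y_{t+1} = x_t^2 + u_{t+1}$ is an illustrative assumption, not from the notes. The conditional mean $x_t^2$ attains a visibly smaller MSE than a competing rule, here the unconditional mean.

```python
# Illustrative check: with y_{t+1} = x_t^2 + u_{t+1}, the conditional mean
# E(y_{t+1}|x_t) = x_t^2 should attain the smallest MSE of any forecast rule.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = x**2 + rng.normal(size=x.size)        # true conditional mean is x**2

mse_cond = np.mean((y - x**2) ** 2)       # forecast E(y|x): MSE ~ Var(u) = 1
mse_const = np.mean((y - 1.0) ** 2)       # unconditional mean E(y) = 1: MSE ~ 3
print(mse_cond, mse_const)
```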
Forecast based on linear projections. We now restrict the class of forecasts we consider to linear functions of $X_t$:
$$y_{t+1|t} = \alpha' X_t$$
Suppose $\alpha'$ is such that the resulting forecast error is uncorrelated with $X_t$:
$$E[(y_{t+1} - \alpha' X_t)X_t'] = 0 \tag{25}$$
If (25) holds, then we call $\alpha' X_t$ the linear projection of $y_{t+1}$ on $X_t$. The linear projection produces the smallest mean squared forecast error among the class of linear forecasting rules. To verify this, let $g'X_t$ be any arbitrary linear forecasting rule.
$$E(y_{t+1} - g'X_t)^2 = E(y_{t+1} - \alpha'X_t + \alpha'X_t - g'X_t)^2$$
$$= E(y_{t+1} - \alpha'X_t)^2 + 2E[(y_{t+1} - \alpha'X_t)(\alpha'X_t - g'X_t)] + E(\alpha'X_t - g'X_t)^2$$
The middle term is
$$E[(y_{t+1} - \alpha'X_t)(\alpha'X_t - g'X_t)] = E[(y_{t+1} - \alpha'X_t)X_t'](\alpha - g) = 0$$
by the definition of linear projection. Thus
$$E(y_{t+1} - g'X_t)^2 = E(y_{t+1} - \alpha'X_t)^2 + E(\alpha'X_t - g'X_t)^2$$
The optimal linear forecast is therefore $g'X_t = \alpha'X_t$.
We use the notation
$$P(y_{t+1}|X_t) = \alpha'X_t$$
to indicate the linear projection of $y_{t+1}$ on $X_t$. Notice that
$$MSE[P(y_{t+1}|X_t)] \geq MSE[E(y_{t+1}|X_t)]$$
since the conditional expectation is optimal among all forecasts, linear or not.
The projection coefficient $\alpha$ can be calculated in terms of the moments of $y_{t+1}$ and $X_t$. From (25),
$$E(y_{t+1}X_t') = \alpha' E(X_t X_t')$$
$$\alpha' = E(y_{t+1}X_t')[E(X_t X_t')]^{-1}$$
It is easy to generalize to the multivariate case. Let $Y_{t+1}$ be an $n \times 1$ vector, $\alpha$ an $m \times n$ matrix, and $X_t$ an $m \times 1$ vector. The linear projection is defined to be the vector $\alpha'X_t$ satisfying
$$E[(Y_{t+1} - \alpha'X_t)X_t'] = 0$$
Linear projections and OLS regression. There is a close relationship between the OLS estimator and the linear projection coefficient. If $y_{t+1}$ and $X_t$ are stationary processes that are also ergodic for second moments, then
$$\frac{1}{T}\sum_{t=1}^{T} X_t X_t' \xrightarrow{p} E(X_t X_t')$$
$$\frac{1}{T}\sum_{t=1}^{T} X_t y_{t+1} \xrightarrow{p} E(X_t y_{t+1})$$
implying
$$\hat{\beta} \xrightarrow{p} \alpha$$
where $\hat{\beta}$ is the OLS coefficient from regressing $y_{t+1}$ on $X_t$.
The OLS regression yields a consistent estimate of the linear projection coefficient.
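As a sketch of this consistency result (all numerical values below are assumptions chosen for illustration), one can simulate data whose population projection coefficient is known and check that OLS recovers it:

```python
# OLS recovers the population projection coefficient
# alpha' = E(y_{t+1} X_t')[E(X_t X_t')]^{-1} in a large sample.
import numpy as np

rng = np.random.default_rng(1)
T = 50_000
X = rng.normal(size=(T, 2))                    # X_t: two regressors
alpha = np.array([0.6, -0.3])                  # population projection coefficient
y = X @ alpha + rng.normal(size=T)             # projection error uncorrelated with X_t

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)   # OLS: [sum X_t X_t']^{-1} sum X_t y_{t+1}
print(beta_ols)                                # -> approximately [0.6, -0.3]
```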
Forecasting with VAR models
Iterated forecast. Let us consider the VAR(p) in companion form,
$$Y_t = A Y_{t-1} + e_t \tag{26}$$
where $e_t$ is white noise. The $h$-step ahead linear predictor of $Y_{t+h}$ conditional on the information available at time $t$, $Y_{t+h|t}$, is given by
$$Y_{t+h|t} = A^h Y_t = A Y_{t+h-1|t} \tag{27}$$
where the first $n$ rows of $Y_{t+h|t}$ contain the optimal forecast of the original $n$ variables. From (27), the forecast of $Y_{t+h}$ at any horizon can be computed recursively.
This predictor is optimal in the sense that, as shown above, it delivers the minimum MSE among forecasts that are linear functions of $Y_t$.
Using
$$Y_{t+h} = A^h Y_t + \sum_{i=0}^{h-1} A^i e_{t+h-i} \tag{28}$$
we get the forecast error
$$Y_{t+h} - Y_{t+h|t} = \sum_{i=0}^{h-1} A^i e_{t+h-i} \tag{29}$$
From the forecast error it is easy to obtain the mean squared error, i.e. the covariance matrix of the forecast error:
$$MSE[Y_{t+h|t}] = E(Y_{t+h} - Y_{t+h|t})(Y_{t+h} - Y_{t+h|t})' = \Sigma(h) = \sum_{i=0}^{h-1} A^i \Omega A^{i\prime} = \Sigma(h-1) + A^{h-1}\Omega A^{(h-1)\prime}$$
The MSE for the original $n$ variables is the upper-left $n \times n$ block. Notice that the MSE is non-decreasing in $h$ and, as $h \to \infty$, approaches the unconditional variance of $Y_t$.
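A minimal sketch of recursions (27)-(29) for a VAR(1), where the companion matrix, innovation covariance, and current observation are illustrative assumptions:

```python
# Iterated forecast Y_{t+h|t} = A Y_{t+h-1|t} and the MSE recursion
# Sigma(h) = Sigma(h-1) + A^{h-1} Omega A^{h-1}'.
import numpy as np

A = np.array([[0.8, -0.1], [-0.2, 0.5]])     # companion matrix (a VAR(1) here)
Omega = np.eye(2)                            # innovation covariance (assumed)
Y_t = np.array([1.0, 0.5])                   # last observed state

H = 12
forecast = Y_t.copy()
Sigma = np.zeros((2, 2))                     # Sigma(0) = 0
A_pow = np.eye(2)                            # holds A^{h-1}
for h in range(1, H + 1):
    forecast = A @ forecast                  # Y_{t+h|t} = A Y_{t+h-1|t}
    Sigma = Sigma + A_pow @ Omega @ A_pow.T  # Sigma(h) = Sigma(h-1) + A^{h-1} Omega A^{h-1}'
    A_pow = A_pow @ A
print(forecast, np.diag(Sigma))              # point forecast and error variances at h = H
```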
Direct forecast. An alternative is to compute the direct forecast, i.e. the projection of $Y_{t+h}$ on $Y_t$. To see this, consider a bivariate VAR(p) with variables $x_t$ and $y_t$. We want to forecast $x_{t+h}$ given the information available at time $t$. The direct forecast works as follows (a code sketch follows the steps):
1. Estimate the projection equation
$$x_t = a + \sum_{i=0}^{p-1} \phi_i x_{t-h-i} + \sum_{i=0}^{p-1} \psi_i y_{t-h-i} + \varepsilon_t$$
2. Using the estimated coefficients, the predictor $x_{t+h|t}$ is obtained as
$$x_{t+h|t} = a + \sum_{i=0}^{p-1} \phi_i x_{t-i} + \sum_{i=0}^{p-1} \psi_i y_{t-i}$$
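A sketch of the two steps for the bivariate case; the function and variable names are hypothetical, and a VAR package would normally handle the bookkeeping:

```python
# Direct h-step forecast: regress x_t on information dated t-h and earlier,
# then apply the fitted coefficients to the most recent observations.
import numpy as np

def direct_forecast(x, y, h, p):
    """h-step direct forecast of x using p lags each of x and y (1-D arrays)."""
    T = len(x)
    rows, targets = [], []
    for t in range(h + p - 1, T):         # need x_{t-h-(p-1)} to exist
        lags = [x[t - h - i] for i in range(p)] + [y[t - h - i] for i in range(p)]
        rows.append([1.0] + lags)         # constant plus regressors dated t-h,...
        targets.append(x[t])
    coef, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    # step 2: same coefficients applied to the latest p observations of x and y
    z = [1.0] + [x[T - 1 - i] for i in range(p)] + [y[T - 1 - i] for i in range(p)]
    return np.array(z) @ coef
```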
Forecast evaluation: pseudo out-of-sample exercises. A key issue in forecasting is evaluating the forecasting accuracy of a model of interest. In particular, we are often interested in comparing the performance of competing forecasting models. How can we perform such a forecast evaluation? Answer: we can compare mean squared errors using pseudo out-of-sample forecast exercises.
Suppose we have a sample of $T$ observations. Let $\tau = T_0 < T$. A pseudo out-of-sample exercise works as follows:
1. We use $\tau$ observations to estimate the parameters of the model.
2. We forecast $Y_{\tau+j}$ with $Y_{\tau+j|\tau}$, $j = 1, 2, \ldots, s$.
3. We compute the forecast errors $w_{\tau+j|\tau} = Y_{\tau+j} - Y_{\tau+j|\tau}$, $j = 1, 2, \ldots, s$.
4. We update $\tau = \tau + 1$ and repeat steps 1-3.
We repeat steps 1-4 up to the end of the sample and compute the mean squared error for each variable $i = 1, \ldots, n$,
$$MSE_i(j) = \frac{1}{T - T_0 - j} \sum_{\tau = T_0}^{T-j} w^2_{i,\tau+j|\tau}$$
or the root mean squared error,
$$RMSE_i(j) = \sqrt{\frac{1}{T - T_0 - j} \sum_{\tau = T_0}^{T-j} w^2_{i,\tau+j|\tau}}$$
Repeating the same exercise for various models, we can choose the one with the smallest MSE or RMSE.
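A compact sketch of the recursive-scheme exercise for a univariate AR(1) forecaster; the model and function name are assumptions for illustration, and any fit/forecast pair would slot in:

```python
# Pseudo out-of-sample RMSE: re-estimate on an expanding window, forecast
# h steps ahead, collect the errors, and average their squares.
import numpy as np

def pseudo_oos_rmse(y, T0, h=1):
    """Recursive-scheme RMSE of h-step forecasts from an AR(1) with constant."""
    errors = []
    for tau in range(T0, len(y) - h + 1):
        ys = y[:tau]                                   # estimation sample
        Z = np.column_stack([np.ones(tau - 1), ys[:-1]])
        c, phi = np.linalg.lstsq(Z, ys[1:], rcond=None)[0]
        f = ys[-1]
        for _ in range(h):                             # iterate the forecast h steps
            f = c + phi * f
        errors.append(y[tau - 1 + h] - f)              # forecast error at horizon h
    return np.sqrt(np.mean(np.square(errors)))
```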
Application: Forecasting inflation and unemployment
D'Agostino, Gambetti and Giannone (Journal of Applied Econometrics, forthcoming) consider a trivariate VAR(2) model including inflation, unemployment and the short-term interest rate. The sample spans 1948:I-2007:IV, and forecasts are computed up to 12 quarters ahead. Estimation is in real time, using both VAR and AR models for each of the series, and is done both recursively and with a rolling window.
Impulse Response Functions
Impulse response functions represent the mechanisms through which shocks spread over time. Let us consider the Wold representation of a covariance stationary VAR(p),
$$Y_t = C(L)\varepsilon_t = \sum_{i=0}^{\infty} C_i \varepsilon_{t-i} \tag{30}$$
The matrix $C_j$ has the interpretation
$$\frac{\partial Y_t}{\partial \varepsilon_{t-j}'} = C_j \tag{31}$$
or
$$\frac{\partial Y_{t+j}}{\partial \varepsilon_t'} = C_j \tag{32}$$
That is, the row $i$, column $k$ element of $C_j$ identifies the consequences of a unit increase in the $k$th variable's innovation at date $t$ for the value of the $i$th variable at time $t+j$, holding all other innovations at all dates constant.
Example 1. Let us assume that the estimated matrix of VAR coefficients is
$$A = \begin{pmatrix} 0.8 & -0.1 \\ -0.2 & 0.5 \end{pmatrix} \tag{33}$$
with eigenvalues 0.8562 and 0.4438. We generate the impulse response functions of the Wold representation,
$$C_j = A^j$$
Figure 3: Impulse response functions
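A short sketch reproducing Example 1: compute the eigenvalues and the Wold coefficients $C_j = A^j$ (the horizon $H$ is an arbitrary choice):

```python
# IRFs of a VAR(1): the Wold coefficients are the powers of A, and
# stationarity corresponds to eigenvalues inside the unit circle.
import numpy as np

A = np.array([[0.8, -0.1], [-0.2, 0.5]])
print(np.linalg.eigvals(A))                  # ~0.8562 and ~0.4438: stable VAR

H = 20
C = np.empty((H + 1, 2, 2))
C[0] = np.eye(2)
for j in range(1, H + 1):
    C[j] = A @ C[j - 1]                      # Wold coefficients C_j = A^j
# C[j, i, k]: response of variable i at t+j to a unit innovation in variable k at t
```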
Example 2. Let us now assume that the second variable does not Granger cause the first one, so that
$$A = \begin{pmatrix} 0.8 & 0 \\ -0.2 & 0.5 \end{pmatrix} \tag{34}$$
with eigenvalues 0.8 and 0.5. The impulse response functions are plotted in the next figure.
Figure 4: Impulse response functions when the second variable does not Granger cause the
first one.
Cumulated impulse response functions. Suppose $Y_t$ is a vector of trending variables (e.g. log prices and log output), so we take first differences to achieve stationarity. The model is
$$\Delta Y_t = (1 - L)Y_t = C(L)\varepsilon_t$$
We know how to estimate, interpret, and conduct inference on $C(L)$. But suppose we are interested in the response of the levels of $Y_t$ rather than their first differences (the level of output and prices rather than their growth rates). How can we find these responses? We transform the model:
$$Y_t = Y_{t-1} + C(L)\varepsilon_t$$
The effect of $\varepsilon_t$ on $Y_t$ is $C_0$. Now substituting forward we obtain
$$Y_{t+1} = Y_{t-1} + C(L)\varepsilon_t + C(L)\varepsilon_{t+1} = Y_{t-1} + C_0\varepsilon_{t+1} + (C_0 + C_1)\varepsilon_t + \ldots \tag{35}$$
and for two periods ahead
$$Y_{t+2} = Y_{t-1} + C(L)\varepsilon_t + C(L)\varepsilon_{t+1} + C(L)\varepsilon_{t+2} = Y_{t-1} + C_0\varepsilon_{t+2} + (C_0 + C_1)\varepsilon_{t+1} + (C_0 + C_1 + C_2)\varepsilon_t + \ldots \tag{36}$$
So the effect of $\varepsilon_t$ on $Y_{t+1}$ is $(C_0 + C_1)$ and on $Y_{t+2}$ is $(C_0 + C_1 + C_2)$. In general, the effect of $\varepsilon_t$ on $Y_{t+j}$ is
$$\bar{C}_j = C_0 + C_1 + \ldots + C_j$$
and the $\bar{C}_j$ are called the cumulated impulse response functions.
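A sketch of the computation: given the IRFs of the differenced series, the level responses are running sums, and their limit is the long-run effect (here illustrated with the VAR(1) matrix from Example 1; the horizon is arbitrary):

```python
# Cumulated IRFs: if C[j] is the response of Delta Y at horizon j, the
# response of the level of Y is the running sum C_0 + ... + C_j.
import numpy as np

A = np.array([[0.8, -0.1], [-0.2, 0.5]])
H = 50
C = np.array([np.linalg.matrix_power(A, j) for j in range(H + 1)])  # IRFs of Delta Y
C_bar = np.cumsum(C, axis=0)            # cumulated IRFs: C_bar[j] = C_0 + ... + C_j
print(C_bar[-1])                        # approaches the long-run effect (I - A)^{-1}
print(np.linalg.inv(np.eye(2) - A))
```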
Error Bands for Impulse Response Functions
Method I: Asymptotic (Hamilton, 1994).
Method II: Bootstrap. The idea behind bootstrapping (Runkle, 1987) is to obtain estimates of the small-sample distribution of the impulse response functions without assuming that the shocks are Gaussian. The steps are as follows (a code sketch follows the list):
1. Estimate the VAR and save the coefficients $\hat{\pi}$ and the fitted residuals $\{\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_T\}$.
2. Draw uniformly from $\{\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_T\}$, set $u_1^{(1)}$ equal to the selected realization, and use it to construct
$$Y_1^{(1)} = A_1 Y_0 + A_2 Y_{-1} + \ldots + A_p Y_{-p+1} + u_1^{(1)} \tag{37}$$
3. Taking a second draw (with replacement) $u_2^{(1)}$, generate
$$Y_2^{(1)} = A_1 Y_1^{(1)} + A_2 Y_0 + \ldots + A_p Y_{-p+2} + u_2^{(1)} \tag{38}$$
4. Proceeding in this fashion, generate a sample of length $T$, $\{Y_1^{(1)}, Y_2^{(1)}, \ldots, Y_T^{(1)}\}$, and use it to compute $\hat{\pi}^{(1)}$ and the implied impulse response functions $C^{(1)}(L)$.
5. Repeat steps 2-4 $M$ times, collect the $M$ realizations $C^{(l)}(L)$, $l = 1, \ldots, M$, and take, for each element of the impulse response functions and each horizon, the $\alpha$th and $(1-\alpha)$th percentiles to construct confidence bands.
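Below is a compact sketch of these steps for a VAR(1) without a constant; the function signature, horizon, and band level are assumptions, and the procedure generalizes to VAR(p) with deterministic terms:

```python
# Residual bootstrap for IRF bands: resample residuals, rebuild the sample,
# re-estimate, recompute C_j = A^j, and take pointwise percentiles.
import numpy as np

def bootstrap_irf_bands(Y, H=20, M=500, alpha=0.05, seed=0):
    """Y: (T, n) data. Returns (lower, upper) percentile bands for C_j = A^j."""
    rng = np.random.default_rng(seed)
    X, Z = Y[1:], Y[:-1]
    A_hat = np.linalg.solve(Z.T @ Z, Z.T @ X).T         # OLS: Y_t = A Y_{t-1} + u_t
    U = X - Z @ A_hat.T                                  # fitted residuals
    n = Y.shape[1]
    irfs = np.empty((M, H + 1, n, n))
    for m in range(M):
        Yb = np.empty_like(Y)
        Yb[0] = Y[0]                                     # keep the initial condition
        draws = U[rng.integers(0, len(U), size=len(Y) - 1)]  # resample residuals
        for t in range(1, len(Y)):
            Yb[t] = A_hat @ Yb[t - 1] + draws[t - 1]     # rebuild the sample
        Xb, Zb = Yb[1:], Yb[:-1]
        A_b = np.linalg.solve(Zb.T @ Zb, Zb.T @ Xb).T    # re-estimate on bootstrap data
        irfs[m] = [np.linalg.matrix_power(A_b, j) for j in range(H + 1)]
    return (np.percentile(irfs, 100 * alpha, axis=0),
            np.percentile(irfs, 100 * (1 - alpha), axis=0))
```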
Method III: Monte Carlo. Same as the bootstrap, but the residuals are drawn from a normal distribution.
Example: A Monetary VAR with bootstrap bands
We estimate the standard monetary VAR, which includes real output growth, the inflation rate and the federal funds rate. These three variables are the core variables for monetary policy analysis in VAR models. Data are taken from the St. Louis Fed FRED II database.