
Page 1: Nonlinear Time Series Modeling - Columbia University, rdavis/lectures/Cop_P3.pdf · 2004-10-03

MaPhySto Workshop 9/04

Nonlinear Time Series Modeling
Part III: Nonlinear and NonGaussian State-Space Models

Richard A. Davis
Colorado State University
(http://www.stat.colostate.edu/~rdavis/lectures)

MaPhySto Workshop, Copenhagen, September 27–30, 2004

Page 2:

Part III: Nonlinear and NonGaussian State-Space Models

1. Introduction
   1.1 Motivating examples
   1.2 Linear state-space models
       • Setup
       • Random walk + noise
       • Local linear trend + seasonality
       • Kalman filtering and smoothing
       • Other applications (missing values)
   1.3 Generalized state-space models
2. Observation-driven models
   2.1 GLARMA models for time series of counts
       • Properties
       • Existence and uniqueness of stationary distributions
       • Maximum likelihood estimation and asymptotic theory
       • Application to asthma data
   2.2 GLARMA extensions
       • Bernoulli
   2.3 Other (BIN)

Page 3:

3. Parameter-driven models
   3.1 Estimation
       • GLM
       • Importance sampling
       • Approximation to the likelihood
   3.2 Simulation and application
       • Time series of counts
       • Stochastic volatility
   3.3 How good is the posterior approximation?
       • Posterior mode vs posterior mean
   3.4 Application to estimating structural breaks
       • Poisson model
       • Stochastic volatility model

Page 4:

1.1 Motivating Examples: Daily Asthma Presentations (1990–1993)

[Figure: daily asthma presentation counts, one panel per year (1990, 1991, 1992, 1993), counts 0–14, labeled Jan–Dec.]

Page 5:

Example: Pound-Dollar Exchange Rates (Oct 1, 1981 – Jun 28, 1985; Koopman website)

[Figure: daily log returns (exchange rates), days 0–800, with sample ACF of the returns, of the squared returns, and of the absolute values, lags 0–40.]

Page 6:

1.2 Linear State-Space Models: Setup

Observation equation:

(1) Yt = Gt Xt + Wt , Wt ~ WN(0, Rt )

State equation:

(2) Xt+1 = Ft Xt + Vt , Vt ~ WN(0, Qt )

(Wt and Vt are uncorrelated.)

Def: Yt has a state-space representation if there exists a state-space model for Yt given by (1) and (2).

Page 7:

Example (ARMA(1,1)):

Suppose Yt follows the recursions

Yt = φ Yt-1 + Zt + θ Zt-1 , Zt ~ WN(0, σ²).

Let Xt be the AR(1) process defined by

Xt = φ Xt-1 + Zt ,

so that Yt = Xt + θ Xt-1. Setting 𝐗t = [Xt-1, Xt]′, we obtain the state-space representation

𝐗t+1 = [[0, 1], [0, φ]] 𝐗t + [0, Zt+1]′   (state equation)

Yt = [θ 1] 𝐗t   (observation equation)
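A quick numerical check (a sketch with illustrative parameter values, not from the slides) confirms that this representation reproduces the ARMA(1,1) recursions:

```python
import numpy as np

# Check numerically that the state-space representation reproduces the
# ARMA(1,1) recursions. phi, theta and the seed are illustrative choices.
rng = np.random.default_rng(0)
phi, theta, n = 0.6, 0.4, 200
Z = rng.normal(size=n)

# AR(1) state: X_t = phi X_{t-1} + Z_t, started at X_0 = Z_0
X = np.zeros(n)
X[0] = Z[0]
for t in range(1, n):
    X[t] = phi * X[t - 1] + Z[t]

# Observation equation: Y_t = [theta 1] [X_{t-1}, X_t]'
Y = np.array([theta * X[t - 1] + X[t] for t in range(1, n)])

# Direct ARMA(1,1) recursion on the same noise, matched at the start
Y2 = np.zeros(n)
Y2[1] = theta * X[0] + X[1]
for t in range(2, n):
    Y2[t] = phi * Y2[t - 1] + Z[t] + theta * Z[t - 1]

assert np.allclose(Y[1:], Y2[2:])   # identical sample paths
```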

Page 8:

ARMA(1,1) cont (canonical observable representation):

Suppose Yt follows the recursions

Yt = φ Yt-1 + Zt + θ Zt-1 , Zt~WN(0,σ2).

and again let Xt be the AR(1) process defined by

Xt+1 = φ Xt + (φ + θ) Zt (state eqn)

Then

Yt = Xt+ Zt (observation eqn)

To see this, note

Yt − φ Yt-1 = Xt− φ Xt-1 + Zt − φ Zt-1

= (φ + θ) Zt-1 + Zt − φ Zt-1

= Zt + θ Zt-1

(Unlike the previous representation, this state equation has dimension 1.)

Page 9:

Random walk plus noise

The observed process is modeled as

Yt = Xt + Wt , Wt~WN(0,σw2),

where the local level Xt, follows the random walk

Xt+1 = Xt + Vt , Vt~WN(0,σv2).

Note that the differenced data follows a MA(1) process, i.e.,

Dt = Yt − Yt-1 = Xt − Xt-1 + Wt − Wt-1 = Vt-1 + Wt − Wt-1

has lag-1 correlation given by

ρ(1) = −σw² / (σv² + 2σw²),

and ρ(1) = −1/2 if and only if σv² = 0. This latter condition is equivalent to the signal being constant.
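This relation is easy to confirm by simulation; the sketch below uses illustrative variances and compares the lag-1 sample autocorrelation of the differenced series with the formula above:

```python
import numpy as np

# Simulate random walk plus noise and check the lag-1 correlation of the
# differenced series against -sigma_w^2 / (sigma_v^2 + 2 sigma_w^2).
# Variances and sample size are illustrative choices.
rng = np.random.default_rng(1)
n, sv, sw = 200_000, 0.5, 1.0
V = rng.normal(scale=sv, size=n)
W = rng.normal(scale=sw, size=n)
X = np.concatenate([[0.0], np.cumsum(V[:-1])])   # X_{t+1} = X_t + V_t
Y = X + W
D = np.diff(Y)                                   # D_t = V_{t-1} + W_t - W_{t-1}

rho1_theory = -sw**2 / (sv**2 + 2 * sw**2)
D0 = D - D.mean()
rho1_sample = (D0[1:] * D0[:-1]).sum() / (D0 * D0).sum()

assert abs(rho1_sample - rho1_theory) < 0.01
assert rho1_theory >= -0.5     # equality would require sigma_v^2 = 0
```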

Page 10:

Random walk plus noise (cont)

The differenced process Dt can be expressed as the MA(1) process

Dt = Zt + θ Zt-1 , Zt ~ WN(0, σ²).

Matching up lag-1 correlations, we have

ρ(1) = −σw² / (σv² + 2σw²) = θ / (1 + θ²),

so that the signal is constant if and only if θ = −1. This is referred to as the unit root problem.

Prediction. The best linear predictor of Yt+1 in terms of Y1, . . . , Yt is given by

Ŷt+1 = Ŷt + at (Yt − Ŷt) = (1 − at) Ŷt + at Yt .

For large t, at converges to θ + 1. The one-step predictor is thus given by exponential smoothing, which is known to be optimal for ARIMA(0,1,1) models!
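The limiting predictor can be written out explicitly; the sketch below (illustrative numbers, fixed weight a = 1 + θ) checks that the recursion Ŷt+1 = Ŷt + a(Yt − Ŷt) unrolls into the geometrically decaying weights of exponential smoothing:

```python
# Sketch: the steady-state one-step predictor as exponential smoothing.
# With smoothing weight a = 1 + theta (theta in (-1, 0] for the random
# walk plus noise model), the recursion unrolls into geometrically
# decaying weights on past observations. Values here are illustrative.
theta = -0.6
a = 1 + theta
y = [3.0, 1.0, 4.0, 1.0, 5.0]

pred = y[0]                          # initialize at the first observation
for obs in y:
    pred = pred + a * (obs - pred)   # Yhat_{t+1} = Yhat_t + a (Y_t - Yhat_t)

# Equivalent closed form: weights a (1-a)^k on the most recent observations
weights = [a * (1 - a) ** k for k in range(len(y))]
closed = sum(w * obs for w, obs in zip(weights, reversed(y)))
closed += (1 - a) ** len(y) * y[0]   # remaining weight on the start value

assert abs(pred - closed) < 1e-12
```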

Page 11:

Local linear trend plus seasonality model

This model takes the form

Yt = Mt + St + Wt
     (trend + seasonal + noise)

where the local linear trend satisfies

Mt = Mt-1 + Bt-1 + Vt1 ,
Bt = Bt-1 + Vt2 ,

and the seasonal component (with period 12) satisfies

St = −St-1 − ⋯ − St-11 + Vt3 .

With state variable Xt = [Mt, Bt, St, . . ., St-10]′, this has the state-space form

Xt+1 = F Xt + Vt ,
Yt = [1 0 1 0 ⋯ 0] Xt + Wt .

Page 12:

Kalman Filter

Recursive method for producing:

• predictors of the signal P( Xt+1 | Y1, . . . , Yt).

• predictors of the series P( Yt+1 | Y1, . . . , Yt).

• filtered values of the signal P( Xt+1 | Y1, . . . , Yt+1).

• Gaussian likelihood of (Y1, . . . , Yn ).

Remarks:

1. Excellent method for calculating best predictors of missing observations.

2. Can handle computation of the Gaussian likelihood with missing observations.
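For the random walk plus noise model, the recursions reduce to a scalar filter. The sketch below (function name, parameter values, and simulated data are illustrative, not from the slides) produces the one-step state predictors and the Gaussian log-likelihood:

```python
import numpy as np

# Sketch: a scalar Kalman filter for Y_t = X_t + W_t, X_{t+1} = X_t + V_t.
# Returns one-step predictors of the state and the Gaussian log-likelihood.
def kalman_rw_plus_noise(y, sv2, sw2, x0=0.0, p0=1e6):
    x, p = x0, p0                 # predicted state mean and variance
    loglik, preds = 0.0, []
    for obs in y:
        preds.append(x)
        f = p + sw2               # prediction variance of Y_t
        loglik += -0.5 * (np.log(2 * np.pi * f) + (obs - x) ** 2 / f)
        k = p / f                 # Kalman gain
        x = x + k * (obs - x)     # filtered state
        p = p * (1 - k) + sv2     # predict X_{t+1}
    return np.array(preds), loglik

rng = np.random.default_rng(2)
n, sv, sw = 500, 0.3, 1.0
X = np.cumsum(rng.normal(scale=sv, size=n))
y = X + rng.normal(scale=sw, size=n)
preds, ll = kalman_rw_plus_noise(y, sv**2, sw**2)

# After a burn-in, the predictor tracks the signal better than the raw data
assert np.mean((preds[50:] - X[50:]) ** 2) < np.mean((y[50:] - X[50:]) ** 2)
```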

Page 13:

Example (airpass data):

[Figure: monthly airline passenger counts (in thousands, 100–600), 1949–1961, with one-step predictors overlaid on the actual data.]

Page 14:

Predicted state components.

[Figure: predicted local level, slope, and seasonal components (in thousands), 1949–1961.]

Page 15:

Example (power usage): (with Sarah Streett)

[Figure: daily power usage for 1992–1995, one panel per year, days 0–150.]

Page 16:

Goal: Forecast maximum power demand 1–2 weeks in advance.

Legendre polynomials: orthogonal polynomials w.r.t. dx on [0,1]:

f0(x) = 1, f1(x) = 2x − 1, f2(x) = 6x² − 6x + 1, . . .

Tentative regression model:

Yt = β0 p0(t) + β1 p1(t) + β2 p2(t) + β3 p3(t) + β4 p5(t) + β5 p7(t) + β6 p9(t) + εt , t = 1, 2, …, 183,

where pj(t) = fj(t/183).

State-space formulation:

Yt = pt βt + Wt ,   (observation equation)
βt = βt-1 + Vt .   (state equation)
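The polynomials fj can be generated by the shifted-Legendre three-term recurrence; the sketch below (helper names are illustrative) matches the first few polynomials listed above:

```python
# Sketch: shifted Legendre polynomials on [0,1] via the recurrence
# (n+1) f_{n+1}(x) = (2n+1)(2x-1) f_n(x) - n f_{n-1}(x),
# consistent with f0 = 1, f1 = 2x-1, f2 = 6x^2 - 6x + 1 above.
def shifted_legendre(n, x):
    f_prev, f = 1.0, 2.0 * x - 1.0
    if n == 0:
        return f_prev
    for k in range(1, n):
        f_prev, f = f, ((2 * k + 1) * (2 * x - 1) * f - k * f_prev) / (k + 1)
    return f

# p_j(t) = f_j(t / 183), as in the regression model
def p(j, t):
    return shifted_legendre(j, t / 183)

x = 0.3
assert abs(shifted_legendre(2, x) - (6 * x**2 - 6 * x + 1)) < 1e-12
```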

Page 17:

Polynomial fits (using an asymmetric loss function).

[Figure: Legendre polynomial fits (o: Y(t), *: Ŷ(t)) to daily power usage for 1992–1995, one panel per year, days 0–150.]

Page 18:

Example: power usage (cont)

The model: [equations shown as an image in the original slides]

Page 19:

Example: power usage (cont)

The model: [equations shown as an image in the original slides]

Also, the variance is modeled as a function of rainfall, where R(t) is the amount of rainfall on day t. If there is a large rainfall on a given day, then only small changes in the coefficients are allowed for the next several days.

Page 20:

Example: power usage (cont)

Kalman recursions and prediction: [details shown as an image in the original slides]

Page 21:

Example: power usage (cont)

Robust Kalman filter (Cipra and Romera 1991): [details shown as an image in the original slides]

Page 22:

One-step predictions using the robustified Kalman filter:

[Figure: 1992 one-step predictions (o: Y(t), *: Ŷ(t)), days 0–150, usage 0–12000.]

Page 23:

One-step predictions using the robustified Kalman filter:

[Figure: 1993 one-step predictions (o: Y(t), *: Ŷ(t)), days 0–150, usage 0–12000.]

Page 24:

One-step predictions using the robustified Kalman filter:

[Figure: 1994 one-step predictions (o: Y(t), *: Ŷ(t)), days 0–150, usage 0–12000.]

Page 25:

One-step predictions using the robustified Kalman filter:

[Figure: 1995 one-step predictions (o: Y(t), *: Ŷ(t)), days 0–150, usage 0–12000.]

Page 26:

1.3 Generalized State-Space Models

Observations: y(t) = (y1, . . ., yt)
States: α(t) = (α1, . . ., αt)

Observation equation:

p(yt | αt) := p(yt | αt, α(t-1), y(t-1))

State equation:

• observation-driven: p(αt+1 | y(t)) := p(αt+1 | αt, α(t-1), y(t))
• parameter-driven: p(αt+1 | αt) := p(αt+1 | αt, α(t-1), y(t))

Page 27:

Example for Time Series of Counts

Count data: Y1, . . . , Yn
Regression (explanatory) vector: xt

Model: given xt and a stochastic process αt, the Yt are independent Poisson distributed with mean

µt = exp(xtT β + αt).

The distribution of the stochastic process αt may depend on a vector of parameters γ.

Note: αt ≡ 0 corresponds to the standard Poisson regression model.

Primary objective: inference about β.

Page 28:

Parameter-Driven Specification

Parameter-driven specification (assume Yt | µt is Poisson(µt)):

log µt = xtT β + αt ,

where αt is a stationary Gaussian process, e.g. the AR(1) process

(αt + σ²/2) = φ(αt-1 + σ²/2) + εt , εt ~ IID N(0, σ²(1 − φ²)).

Advantages:
• properties of the model (ergodicity and mixing) are easy to derive.
• the regression parameters are interpretable:
  E(Yt) = exp(xtT β) E exp(αt) = exp(xtT β), if E[exp(αt)] = 1.

Disadvantages:
• estimation is difficult: the likelihood function is not easily calculated (MCEM, importance sampling, estimating equations).
• model building can be laborious.
• prediction is more difficult.
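The interpretability point can be illustrated by simulation. In the sketch below (intercept-only regression, illustrative parameters), the centering by −σ²/2 makes E[exp(αt)] = 1, so the sample mean of Yt is close to exp(β0):

```python
import numpy as np

# Sketch: simulate the parameter-driven Poisson model with the AR(1)
# latent process above. The centering by -sigma^2/2 gives E[exp(alpha_t)] = 1,
# so E(Y_t) = exp(x_t' beta). All parameter values are illustrative.
rng = np.random.default_rng(3)
n, phi, sigma, beta0 = 100_000, 0.5, 0.4, 1.0

a = np.zeros(n)                     # a_t = alpha_t + sigma^2 / 2
a[0] = rng.normal(scale=sigma)      # stationary start: Var(a_t) = sigma^2
eps = rng.normal(scale=sigma * np.sqrt(1 - phi**2), size=n)
for t in range(1, n):
    a[t] = phi * a[t - 1] + eps[t]
alpha = a - sigma**2 / 2

mu = np.exp(beta0 + alpha)          # here x_t' beta = beta0 (intercept only)
Y = rng.poisson(mu)

# E(Y_t) should be close to exp(beta0) since E[exp(alpha_t)] = 1
assert abs(Y.mean() - np.exp(beta0)) < 0.05
```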

Page 29:

Observation-Driven Specification

Observation-driven specification (assume Yt | µt is Poisson(µt)):

log µt = xtT β + αt ,

where αt is a function of past observations Ys, s < t, e.g. αt = γ1Yt-1 + … + γpYt-p.

Advantages:
• likelihood is easy to calculate.
• prediction is straightforward (at least one lead-time ahead).

Disadvantages:
• stability behavior, such as stationarity and ergodicity, is difficult to derive.
• xtT β is not easily interpretable. In the special case above,
  E(Yt) = exp(xtT β) E exp(γ1Yt-1 + … + γpYt-p).

Page 30:

Financial Time Series Example (cont)

A parameter-driven model for financial data (Taylor '86):

Model:

Yt = σt Zt , Zt ~ IID N(0,1),
αt = γ + φ αt-1 + εt , εt ~ IID N(0, σ²),

where αt = 2 log σt.

The resulting observation and state transition densities are

p(yt | αt) = n(yt; 0, exp(αt)),
p(αt+1 | αt) = n(αt+1; γ + φ αt, σ²).

Properties:
• Martingale difference sequence.
• Stationary.
• Strongly mixing at a geometric rate.
• Estimation can be difficult: the likelihood cannot be calculated in closed form.
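A short simulation (illustrative parameter values) reproduces the first property and the dependence in the squares that motivates the model:

```python
import numpy as np

# Sketch: simulate Taylor's stochastic volatility model and check the
# stylized facts: Y_t is uncorrelated (an MGD) while Y_t^2 shows positive
# autocorrelation. Parameter values are illustrative, not from the slides.
rng = np.random.default_rng(4)
n, gamma, phi, sigma = 200_000, -0.2, 0.95, 0.3

alpha = np.zeros(n)
alpha[0] = gamma / (1 - phi)       # start at the stationary mean
eps = rng.normal(scale=sigma, size=n)
for t in range(1, n):
    alpha[t] = gamma + phi * alpha[t - 1] + eps[t]
Y = np.exp(alpha / 2) * rng.normal(size=n)   # sigma_t = exp(alpha_t / 2)

def acf1(x):
    x0 = x - x.mean()
    return (x0[1:] * x0[:-1]).sum() / (x0 * x0).sum()

assert abs(acf1(Y)) < 0.01          # returns look white
assert acf1(Y**2) > 0.05            # squared returns are dependent
```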

Page 31:

Financial Time Series Example

An observation-driven model for financial data:

Model (GARCH(p,q)):

Yt = σt Zt , Zt ~ IID N(0,1),
σt² = α0 + α1 Yt-1² + ⋯ + αp Yt-p² + β1 σt-1² + ⋯ + βq σt-q².

Special case (ARCH(1) = GARCH(1,0)): the resulting observation and state transition density/equations are

p(yt | σt) = n(yt; 0, σt²),
σt² = α0 + α1 Yt-1².

Properties:
• Martingale difference sequence.
• Stationary for α1 ∈ [0, 2e^E), E = Euler's constant.
• Strongly mixing at a geometric rate.
• For general ARCH (GARCH), properties are difficult to establish.
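The stationarity condition α1 < 2e^E comes from requiring E[log(α1 Z²)] < 0 for Z ~ N(0,1), using E[log Z²] = −log 2 − E. A Monte Carlo sketch of this equivalence:

```python
import numpy as np

# Sketch: the ARCH(1) strict-stationarity boundary alpha_1 = 2*exp(E)
# (E = Euler-Mascheroni constant) follows from E[log(alpha_1 Z^2)] < 0,
# since E[log Z^2] = -log 2 - E for Z ~ N(0,1). Monte Carlo check:
rng = np.random.default_rng(5)
E = 0.5772156649015329                  # Euler-Mascheroni constant
Z = rng.normal(size=2_000_000)
mean_log_z2 = np.log(Z**2).mean()
assert abs(mean_log_z2 - (-np.log(2) - E)) < 0.01

boundary = 2 * np.exp(E)                # roughly 3.56
# Below the boundary E[log(alpha_1 Z^2)] < 0; above it the sign flips
assert np.log(0.98 * boundary) + mean_log_z2 < 0
assert np.log(1.02 * boundary) + mean_log_z2 > 0
```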

Page 32:

2. Observation-Driven Models

2.1 GLARMA models for time series of counts
    • Properties
    • Existence and uniqueness of stationary distributions
    • Maximum likelihood estimation and asymptotic theory
    • Application to asthma data
2.2 GLARMA extensions
    • Bernoulli case
2.3 Other
    • BIN

Page 33:

2.1 Generalized Linear ARMA (GLARMA) Model for Poisson Counts

Model: Yt | µt is Poisson(µt) with log µt = xtT β + αt.

Two components in the specification of αt (see also Shephard (1994)).

1. An uncorrelated (martingale difference) sequence: for λ > 0, define

   et = (Yt − µt) / µt^λ.

   (The specification of λ will be described later.)

2. Form a linear process driven by the MGD sequence et:

   log µt = xtT β + αt ,  where  αt = Σ_{i=1}^∞ ψi et-i .

Since the conditional mean µt is based on the whole past, the model is no longer Markov.
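A minimal simulation of this model (one moving-average weight ψ1 and an intercept-only regression, both illustrative choices) checks the martingale-difference behavior of et:

```python
import numpy as np

# Sketch: simulate the Poisson GLARMA recursion with lambda = 0.5 and a
# single weight psi_1, checking that e_t behaves like a unit-variance
# martingale difference sequence. All parameter values are illustrative.
rng = np.random.default_rng(6)
n, beta0, psi1, lam = 200_000, 1.0, 0.3, 0.5

e = np.zeros(n)
Y = np.zeros(n)
for t in range(1, n):
    alpha_t = psi1 * e[t - 1]               # alpha_t = sum_i psi_i e_{t-i}
    mu_t = np.exp(beta0 + alpha_t)          # log mu_t = x_t' beta + alpha_t
    Y[t] = rng.poisson(mu_t)
    e[t] = (Y[t] - mu_t) / mu_t**lam        # e_t = (Y_t - mu_t) / mu_t^lambda

assert abs(e[100:].mean()) < 0.01               # MGD: mean zero
assert abs((e[100:] ** 2).mean() - 1.0) < 0.02  # E(e_t^2) = 1 when lambda = 0.5
```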

Page 34:

Properties of the Model

1. et is a MG difference sequence: E(et | Ft-1) = 0.

2. et is an uncorrelated sequence (follows from 1).

3. E(et²) = E(µt^{1−2λ}) = 1 if λ = .5.

4. Set

   Wt = log µt = xtT β + αt ,  with  et = (Yt − µt)/µt^λ  and  αt = Σ_{i=1}^∞ ψi et-i ,

   so that

   E(Wt) = xtT β

   and

   Var(Wt) = Σ_{i=1}^∞ ψi² E(µt-i^{1−2λ}) = Σ_{i=1}^∞ ψi²  (if λ = .5).

Page 35:

Properties continued

5. Wt = xtT β + Σ_{i=1}^∞ ψi et-i .

   It follows that Wt has properties similar to the latent process specification:

   Cov(Wt, Wt+h) = Σ_{i=1}^∞ ψi ψi+h E(µt-i^{1−2λ}),

   and, by using the results for the latent process case and assuming the linear process part is nearly Gaussian, we obtain

   E(µt) = E exp(xtT β + αt) ≈ exp(xtT β + Σ_{i=1}^∞ ψi²/2).

   By adjusting the intercept term, E(µt) can be interpreted as exp(xtT β).

Page 36:

Properties continued

6. (GLARMA model). Let Ut be an ARMA process driven by the MGD sequence et, i.e.,

   Ut = φ1Ut-1 + . . . + φpUt-p + et + θ1et-1 + . . . + θqet-q .

   Then the best predictor of Ut based on the infinite past is

   Ût = Σ_{i=1}^∞ ψi et-i ,

   where

   Σ_{i=1}^∞ ψi z^i = φ^{-1}(z) θ(z) − 1.

   The model for log µt is then

   Wt = xtT β + Zt ,

   where

   Zt = Ût = φ1(Zt-1 + et-1) + ⋯ + φp(Zt-p + et-p) + θ1et-1 + ⋯ + θqet-q .

Page 37:

Existence and uniqueness of a stationary distribution in the simple case

Consider the simplest form of the model with λ = 1, given by

Wt = β + γ (Yt-1 − e^{Wt-1}) e^{−Wt-1} .

Theorem: The Markov process Wt has a unique stationary distribution.

Idea of proof:
• State space is [β − γ, ∞) (if γ > 0) and (−∞, β − γ] (if γ < 0).
• Satisfies Doeblin's condition: there exists a probability measure ν such that for some m > 1, ε > 0, and δ > 0,
  ν(A) > ε implies P^m(x, A) ≥ δ for all x.
• The chain is strongly aperiodic.
• It follows that the chain Wt is uniformly ergodic (Thm 16.0.2 (iv) in Meyn and Tweedie (1993)).

Page 38:

Existence of a stationary distribution in the case .5 ≤ λ < 1

Consider the process

Wt = β + γ (Yt-1 − e^{Wt-1}) e^{−λWt-1} .

Proposition: The Markov process Wt has at least one stationary distribution.

Idea of proof:
• Wt is weak Feller.
• Wt is bounded in probability on average, i.e., for each x, the sequence

  k^{-1} Σ_{i=1}^k P^i(x, ·), k = 1, 2, . . . ,

  is tight.
• There exists at least one stationary distribution (Thm 12.0.1 in M&T).

Lemma: If a MC Xt is weak Feller and P(x, ·), x ∈ X, is tight, then Xt is bounded in probability on average and hence has a stationary distribution.

Note: For our case, we can show tightness of P(x, ·), x ∈ X, using a Markov-style inequality.

Page 39:

Uniqueness of the stationary distribution in the case .5 ≤ λ < 1?

Theorem (M&T '93): If the Markov process Xt is an e-chain which is bounded in probability on average, then there exists a unique stationary distribution if and only if there exists a reachable point x*.

For the process Wt = β + γ (Yt-1 − e^{Wt-1}) e^{−λWt-1}, we have:
• Wt is bounded in probability uniformly over the state space.
• Wt has a reachable point x*, a zero of the equation 0 = x* − β + γ exp((1−λ)x*).
• e-chain?

Reachable point: x* is a reachable point if for every open set O containing x*,

Σ_{n=1}^∞ P^n(x, O) > 0  for all x.

e-chain: For every continuous f with compact support, the sequence of functions P^n f, n = 1, …, is equicontinuous on compact sets.

Page 40:

Estimation for Poisson GLARMA

Model: Yt | µt is Poisson(µt), with

log µt = xtT β + αt ,  αt = Σ_{i=1}^∞ ψi et-i .

Let δ = (βT, γT)T be the parameter vector for the model (γ corresponds to the parameters in the linear process part).

Log-likelihood:

L(δ) = Σ_{t=1}^n (Yt Wt(δ) − e^{Wt(δ)}),

where

Wt(δ) = xtT β + Σ_{i=1}^∞ ψi(δ) et-i(δ).

First and second derivatives of the likelihood can easily be computed recursively, and Newton-Raphson methods are then implementable. For example,

∂L(δ)/∂δ = Σ_{t=1}^n (Yt − e^{Wt(δ)}) ∂Wt(δ)/∂δ,

and the term ∂Wt(δ)/∂δ can be computed recursively.
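In the GLM special case (ψi ≡ 0, so Wt = xtT β), the Newton-Raphson step takes a particularly simple form; a sketch with simulated data (the design and parameter values are illustrative):

```python
import numpy as np

# Sketch: Newton-Raphson on the Poisson log-likelihood
# L(beta) = sum_t (Y_t W_t - exp(W_t)), W_t = x_t' beta, i.e. the
# GLM special case (psi_i = 0) of the GLARMA likelihood above.
rng = np.random.default_rng(8)
n = 50_000
x = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=n)])
beta_true = np.array([1.0, 0.5])
Y = rng.poisson(np.exp(x @ beta_true))

beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(x @ beta)
    score = x.T @ (Y - mu)                 # dL/dbeta
    info = x.T @ (x * mu[:, None])         # minus d2L/dbeta2
    beta = beta + np.linalg.solve(info, score)

assert np.allclose(beta, beta_true, atol=0.05)
```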

Page 41:

Asymptotic Results for MLE

Define the array of random variables

ηnt = n^{-1/2} (Yt − e^{Wt(δ)}) ∂Wt(δ)/∂δ.

Properties of ηnt:
• ηnt is a martingale difference sequence.

  Σ_{t=1}^n E(ηnt ηntT | Ft-1) →P V(δ),

  Σ_{t=1}^n E(ηnt ηntT I(|ηnt| > ε) | Ft-1) →P 0.

Using a MG central limit theorem, it "follows" that

n^{1/2} (δ̂ − δ) →D N(0, V^{-1}),

where

V(δ) = lim_{n→∞} n^{-1} Σ_{t=1}^n E[ e^{Wt} (∂Wt/∂δ)(∂Wt/∂δ)T ].

Page 42:

Simulation Results

Model 1: Wt = β0 + γ (Yt-1 − e^{Wt-1}) e^{−Wt-1}, n = 500, nreps = 5000.

Parameter    Mean    SD      SD (from likelihood)
β0 = 1.50    1.499   0.0263  0.0265
γ  = 0.25    0.249   0.0403  0.0408
β0 = 1.50    1.499   0.0366  0.0364
γ  = 0.75    0.750   0.0218  0.0218
β0 = 3.00    3.000   0.0125  0.0125
γ  = 0.25    0.249   0.0431  0.0430
β0 = 3.00    3.000   0.0175  0.0174
γ  = 0.75    0.750   0.0270  0.0271

Model 2: Wt = β0 + β1 t/500 + γ (Yt-1 − e^{Wt-1}) e^{−Wt-1}, n = 500, nreps = 5000.

Parameter    Mean    SD      SD (from likelihood)
β0 = 1.00    1.000   0.0286  0.0284
β1 = 0.50    0.500   0.0035  0.0034
γ  = 0.25    0.248   0.0420  0.0426
β0 = 1.50    0.998   0.0795  0.0805
β1 = -.15    -.150   0.0171  0.0173
γ  = 0.25    0.247   0.0337  0.0339

Page 43:

Application to Sydney Asthma Count Data

Data: Y1, . . . , Y1461, daily asthma presentations in a Campbelltown hospital.

Preliminary analysis identified:
• no upward or downward trend;
• an annual cycle modeled by cos(2πt/365), sin(2πt/365);
• a seasonal (school term) effect modeled by the Beta(2.5, 5) density shape

  Pij(t) = (1/B(2.5, 5)) ((t − Tij)/100)^{1.5} (1 − (t − Tij)/100)^{4},

  where B(2.5, 5) is the beta function and Tij is the start of the jth school term in year i;
• a day-of-the-week effect modeled by separate indicator variables for Sunday and Monday (increase in admittance on these days compared to Tues–Sat);
• of the meteorological variables (max/min temp, humidity) and pollution variables (ozone, NO, NO2), only humidity at lags of 12–20 days and NO2 (max) appear to have an association.

Page 44:

Model for Asthma Data

Trend function:

xtT = (1, St, Mt, cos(2πt/365), sin(2πt/365), P11(t), P12(t), P21(t), P22(t), P31(t), P32(t), P41(t), P42(t), Ht, Nt)

(Ht = (1/7) Σ_{i=0}^{6} ht-12-i, and ht is the residual from an annual cycle fitted to the daily average humidity at 0900 and 1500.)

Model for αt:

MA(7): αt = θ7 et-7.

Page 45:

Results for Asthma Data

Term              Est     SE     T-ratio
Intercept         0.583   0.062   9.46
Sunday effect     0.197   0.056   3.53
Monday effect     0.230   0.055   4.20
cos(2πt/365)     -0.214   0.039  -5.54
sin(2πt/365)      0.176   0.040   4.35
Term 1, 1990      0.200   0.056   3.54
Term 2, 1990      0.132   0.057   2.31
Term 1, 1991      0.087   0.066   1.32
Term 2, 1991      0.172   0.057   2.99
Term 1, 1992      0.254   0.055   4.66
Term 2, 1992      0.308   0.049   6.31
Term 1, 1993      0.439   0.050   8.77
Term 2, 1993      0.116   0.061   1.91
Humidity Ht/20    0.169   0.055   3.09
NO2 max          -0.104   0.033  -3.16
MA, lag 7 (θ7)    0.042   0.018   2.32

Page 46:

Asthma Data: observed counts and conditional means.

[Figure: daily asthma counts with fitted conditional means, one panel per year (1990–1993), day of year 0–365.]

47MaPhySto Workshop 9/04

Summary Remarks for Observation-Driven Models

The observation model for the Poisson counts proposed here has the following properties.

1. Easily interpretable on the linear predictor scale and on the scale of the mean µt with the regression parameters directly interpretable as the amount by which the mean of the count process at time t will change for a unit change in the regressor variable.

2. An approximately unbiased plot of the µt can be generated by

3. Is easy to predict with.

4. Provides a mechanism for adjusting the inference about the regression parameter β for a form of serial dependence.

5. Generalizes to ARMA type lag structure.

6. Estimation (approx MLE) is easy to carry out.

.)ˆ5.ˆexp(ˆ1

2∑∞

=

ψ−=µi

itt W
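The correction in item 2 is the usual lognormal mean adjustment: if A ~ N(0, s²) then E exp(A) = exp(s²/2), so subtracting .5 Σ ψ̂i² from Ŵt in the exponent removes the bias when αt is nearly Gaussian. A Monte Carlo sketch (illustrative s²):

```python
import numpy as np

# Sketch: the lognormal mean adjustment behind the bias correction above.
# If A ~ N(0, s2) then E exp(A) = exp(s2 / 2). The value of s2 is illustrative.
rng = np.random.default_rng(9)
s2 = 0.36
A = rng.normal(scale=np.sqrt(s2), size=2_000_000)
assert abs(np.exp(A).mean() - np.exp(s2 / 2)) < 0.005
```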

Page 48: Nonlinear Time Series Modeling - Columbia Universityrdavis/lectures/Cop_P3.pdf · 2004-10-03 · ARMA(1,1) cont (canonical observable representation): Suppose {Y t} follows the recursions

48MaPhySto Workshop 9/04

2.2 GLARMA Extensions (Binary data)

Binary data: Y1, . . . , Yn

Regression (explanatory) variable: xt

Model: Distribution of the Yt given xt and the past is Bernoulli(pt), i.e.,

P(Yt = 1| Ft-1) = pt and P(Yt = 0 | Ft-1) = 1−pt.

As before, construct an MGD (martingale difference) sequence

$$e_t = \frac{Y_t - p_t}{(p_t(1-p_t))^{1/2}},$$

and, using the logistic link function, the GLARMA model becomes

$$W_t = \log\frac{p_t}{1-p_t} = x_t^T\beta + Z_t,$$

with

$$Z_t = \phi_1(Z_{t-1}+e_{t-1}) + \cdots + \phi_p(Z_{t-p}+e_{t-p}) + \theta_1 e_{t-1} + \cdots + \theta_q e_{t-q}.$$


49MaPhySto Workshop 9/04

A Simple GLARMA Model for Price Activity (R&S)

Model for price change: the price change C_i of the ith transaction has the following components:

• Y_i: activity (0 or 1)
• D_i: direction (-1 or 1)
• S_i: size (1, 2, 3, . . .)

Rydberg and Shephard consider a model for these components. An autologistic model is used for the activity.

Simple GLARMA(0,1) model for price activity: Y_t is a Bernoulli rv representing a price change at the tth transaction. Assume Y_t given F_{t-1} is Bernoulli(p_t), i.e.,

P(Y_t = 1 | F_{t-1}) = p_t = 1 - P(Y_t = 0 | F_{t-1}),

where

$$p_t = \frac{e^{\sigma Z_t}}{1+e^{\sigma Z_t}} \quad\text{and}\quad Z_t = \frac{Y_{t-1}-p_{t-1}}{(p_{t-1}(1-p_{t-1}))^{1/2}}.$$


50MaPhySto Workshop 9/04

Existence of Stationary Solns for the Simple GLARMA Model

Consider the process

$$Z_t = \frac{Y_{t-1}-p_{t-1}}{(p_{t-1}(1-p_{t-1}))^{1/2}},$$

where Y_{t-1} is Bernoulli with parameter

$$p_{t-1} = \frac{e^{\sigma Z_{t-1}}}{1+e^{\sigma Z_{t-1}}}.$$

Proposition: The Markov process Z_t has a unique stationary distribution.

Idea of proof:

• Z_t is an e-chain.
• Z_t is bounded in probability uniformly over the state space.
• Z_t possesses a reachable point (x* is the solution to x + e^{σx/2} = 0).
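The chain above is straightforward to simulate; a minimal sketch in Python (NumPy assumed, intercept-free as on the slide):

```python
import numpy as np

# Minimal sketch of the simple GLARMA price-activity chain:
# p_{t-1} = exp(sigma * Z_{t-1}) / (1 + exp(sigma * Z_{t-1})),
# Y_{t-1} | F_{t-2} ~ Bernoulli(p_{t-1}),
# Z_t = (Y_{t-1} - p_{t-1}) / sqrt(p_{t-1} * (1 - p_{t-1})).
def simulate_glarma_chain(n, sigma, seed=0):
    rng = np.random.default_rng(seed)
    Z = np.zeros(n)
    for t in range(1, n):
        p = 1.0 / (1.0 + np.exp(-sigma * Z[t - 1]))
        Y = rng.binomial(1, p)
        Z[t] = (Y - p) / np.sqrt(p * (1 - p))
    return Z

Z = simulate_glarma_chain(500, sigma=0.5)
```

A long simulated path can then be used to inspect the stationary distribution whose existence the proposition asserts.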


51MaPhySto Workshop 9/04

2.3 BIN Models: A Modeling Framework for Stock Prices(Davis, Rydberg, Shephard, Streett)

Consider the model for the price of an asset at time t given by

$$p(t) = p(0) + \sum_{i=1}^{N(t)} Z_i,$$

where

• N(t) is the number of trades up to time t,
• Z_i is the price change of the ith transaction.

Then, for a fixed time period ∆,

$$p((t+1)\Delta) - p(t\Delta) = \sum_{i=N(t\Delta)+1}^{N((t+1)\Delta)} Z_i$$

denotes the rate of return on the investment during the tth time interval, and

$$N_t := N((t+1)\Delta) - N(t\Delta)$$

denotes the number of trades in [t∆, (t+1)∆).


52MaPhySto Workshop 9/04

The Bin Model for the Number of Trades

Bin(p,q) model: the distribution of the number of trades N_t in [t∆, (t+1)∆), conditional on information up to time t∆-, is Poisson with mean

$$\lambda_t = \alpha + \sum_{j=1}^{p}\gamma_j\lambda_{t-j} + \sum_{j=1}^{q}\delta_j N_{t-j}, \qquad \alpha \ge 0,\ 0 \le \gamma_j, \delta_j < 1.$$

Proposition: For the Bin(1,1) model,

$$\lambda_t = \alpha + \gamma\lambda_{t-1} + \delta N_{t-1},$$

there exists a unique stationary solution.

Idea of proof:

• λ_t is an e-chain.
• λ_t is bounded in probability on average.
• λ_t possesses a reachable point (x* = α/(1-γ)).
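The Bin(1,1) recursion is also easy to simulate; a minimal sketch (taking γ as the λ-coefficient and δ as the N-coefficient, consistent with the reachable point x* = α/(1-γ)):

```python
import numpy as np

# Sketch: simulate the Bin(1,1) trade-count model
# N_t | F_{t-1} ~ Poisson(lambda_t),
# lambda_t = alpha + gamma * lambda_{t-1} + delta * N_{t-1}.
def simulate_bin11(n, alpha, gamma, delta, seed=0):
    rng = np.random.default_rng(seed)
    lam = np.empty(n)
    N = np.empty(n, dtype=np.int64)
    lam[0] = alpha / (1.0 - gamma)   # start at the reachable point
    N[0] = rng.poisson(lam[0])
    for t in range(1, n):
        lam[t] = alpha + gamma * lam[t - 1] + delta * N[t - 1]
        N[t] = rng.poisson(lam[t])
    return N, lam

N, lam = simulate_bin11(1000, alpha=0.5, gamma=0.4, delta=0.3)
```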


53MaPhySto Workshop 9/04

3. Parameter-Driven Models

3.1 Estimation
	GLM
	Importance sampling
	Approximation to the likelihood
3.2 Simulation and Application
	Time series of counts
	Stochastic volatility
3.3 How good is the posterior approximation?
	Posterior mode vs posterior mean
3.4 Application to estimating structural breaks (tomorrow)
	Poisson model
	Stochastic volatility model


54MaPhySto Workshop 9/04

Exponential Family Setup for Parameter-Driven Model

Time series data: Y1, . . . , Yn

Regression (explanatory) vector: xt

Observation equation:

$$p(y_t \mid \alpha_t) = \exp\{(\alpha_t + \beta^T x_t)y_t - b(\alpha_t + \beta^T x_t) + c(y_t)\}.$$

State equation: α_t follows an autoregressive process satisfying the recursions

$$\alpha_t = \gamma + \phi_1\alpha_{t-1} + \phi_2\alpha_{t-2} + \cdots + \phi_p\alpha_{t-p} + \varepsilon_t,$$

where ε_t ~ IID N(0, σ²).

Note: α_t ≡ 0 corresponds to a standard generalized linear model.

Original primary objective: inference about β.
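The Poisson case (b(θ) = e^θ) of this setup is easy to simulate; a minimal sketch, assuming an AR(1) state and an intercept-only regressor:

```python
import numpy as np

# Sketch: simulate Y_t | alpha_t ~ Pois(exp(x_t' beta + alpha_t)) with an
# AR(1) state alpha_t = phi * alpha_{t-1} + eps_t, eps_t ~ N(0, sigma2).
# Assumption for brevity: x_t is an intercept only.
def simulate_poisson_pd(n, beta, phi, sigma2, seed=0):
    rng = np.random.default_rng(seed)
    alpha = np.zeros(n)
    for t in range(1, n):
        alpha[t] = phi * alpha[t - 1] + rng.normal(0.0, np.sqrt(sigma2))
    y = rng.poisson(np.exp(beta + alpha))
    return y, alpha

y, alpha = simulate_poisson_pd(200, beta=0.7, phi=0.5, sigma2=0.3)
```

These parameter values match the simulation examples used later in the slides.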


55MaPhySto Workshop 9/04

Estimation Methods for Parameter-Driven Models

• GLM (ignores the presence of the latent process, i.e., α_t = 0)
• Estimating equations (Zeger '88)
• Monte Carlo EM (Chan and Ledolter '95)
• Importance sampling (Durbin & Koopman '01, Kuk '99, Kuk & Chen '97)
• Approximate likelihood (Davis, Dunsmuir & Wang '98)


56MaPhySto Workshop 9/04

Estimation Methods — Estimating Equations

Estimating equations (Zeger '88): let β̂ be the solution to the equation

$$\frac{\partial \mu^T}{\partial \beta}\,\Gamma_n^{-1}(y-\mu) = 0,$$

where µ = exp(Xβ) and Γ_n = var(Y_n).

Iteratively reweighted least squares can be used to compute β̂. (See Zeger for details and asymptotic results.)
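With a working independence covariance (Γ_n = diag(µ)), the estimating equation reduces to ordinary Poisson IWLS; a minimal sketch of that special case (not Zeger's full procedure, which uses the latent-process covariance):

```python
import numpy as np

# Iteratively reweighted least squares for the Poisson working model,
# i.e., the estimating equation with Gamma_n = diag(mu).
def poisson_iwls(X, y, n_iter=30):
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        z = X @ beta + (y - mu) / mu               # working response
        W = mu                                     # working weights
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

X = np.ones((6, 1))                                # intercept-only example
y = np.array([1.0, 2.0, 0.0, 3.0, 1.0, 2.0])
b = poisson_iwls(X, y)
```

For the intercept-only model this converges to log(ȳ), the Poisson MLE.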


57MaPhySto Workshop 9/04

Estimation Methods — MCEM

Monte Carlo EM (Chan and Ledolter '95): given ψ^(k) from the k-th iteration, ψ^(k+1) is computed in two steps:

E-step: compute Q(ψ | ψ^(k)) = E(L(ψ; y_n, α_n) | y_n, ψ^(k)), where
• L(ψ; y_n, α_n) is the log-likelihood based on (y_n, α_n),
• the expectation is taken with respect to p(α_n | y_n, ψ^(k)).

M-step: update ψ^(k) by maximizing Q(ψ | ψ^(k)) with respect to ψ.

Note: this procedure is relatively straightforward except for drawing samples from p(α_n | y_n, ψ^(k)) in the E-step. Chan and Ledolter use a Gibbs sampler for this.


58MaPhySto Workshop 9/04

GLM estimation (linear regression model)

Suppose Y_t follows the linear model with time series errors given by

$$Y_t = x_t^T\beta + W_t,$$

where W_t is a stationary (ARMA) time series.

• Estimate β by ordinary least squares (OLS).
• The OLS estimate β̂_OLS has the same asymptotic efficiency as the MLE.
• The asymptotic covariance matrix of β̂_OLS depends on the ARMA parameters.
• Identify and estimate the ARMA parameters using the estimated residuals
  Ŵ_t = Y_t - x_t^T β̂_OLS.
• Re-estimate β and the ARMA parameters using full MLE.


59MaPhySto Workshop 9/04

Estimation Methods Specialized to the Poisson Example — GLM Estimation

Model: Y_t | α_t, x_t ~ Pois(exp(x_t^T β + α_t)).

GLM log-likelihood:

$$l(\beta) = \sum_{t=1}^n y_t x_t^T\beta - \sum_{t=1}^n e^{x_t^T\beta} - \log\prod_{t=1}^n y_t!$$

(This likelihood ignores the presence of the latent process.)
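The GLM log-likelihood is a one-liner to evaluate; a minimal sketch:

```python
import numpy as np
from math import lgamma

# l(beta) = sum_t [ y_t * x_t'beta - exp(x_t'beta) - log(y_t!) ],
# the Poisson GLM log-likelihood that ignores the latent process.
def glm_loglik(beta, X, y):
    eta = X @ beta
    log_fact = np.array([lgamma(v + 1.0) for v in y])
    return float(np.sum(y * eta - np.exp(eta) - log_fact))

# Tiny check at beta = 0: each term is -1 - log(y_t!)
val = glm_loglik(np.array([0.0]), np.ones((3, 1)), np.array([1.0, 0.0, 2.0]))
```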

Assumptions on regressors:

$$\Omega_{I,n} = \frac1n\sum_{t=1}^n x_t x_t^T\,\mu_t(\beta) \to \Omega_I(\beta),$$

$$\Omega_{II,n} = \frac1n\sum_{t=1}^n\sum_{s=1}^n x_t x_s^T\,\mu_t(\beta)\mu_s(\beta)\,\gamma(t-s) \to \Omega_{II}(\beta).$$


60MaPhySto Workshop 9/04

Theory of GLM Estimation in the Presence of a Latent Process

Theorem (Davis, Dunsmuir, Wang '00). Let β̂ be the GLM estimate of β obtained by maximizing l(β) for the Poisson regression model with a stationary lognormal latent process. Then

$$n^{1/2}(\hat\beta - \beta) \to_d N\big(0,\ \Omega_I^{-1} + \Omega_I^{-1}\Omega_{II}\Omega_I^{-1}\big).$$

Notes:

1. n^{-1}Ω_I^{-1} is the asymptotic covariance matrix from a standard GLM analysis.
2. n^{-1}Ω_I^{-1}Ω_{II}Ω_I^{-1} is the additional contribution due to the presence of the latent process.
3. The result is also valid for more general latent processes (mixing, etc.).
4. The x_t can depend on the sample size n.


61MaPhySto Workshop 9/04

When Does the CLT Apply?

The conditions on the regressors hold for:

1. Trend functions x_{nt} = f(t/n), where f is a continuous function on [0,1]. In this case,

$$\frac1n\sum_{t=1}^n x_t x_t^T\,\mu_t \to \int_0^1 f(t)f^T(t)\,e^{\beta^T f(t)}\,dt,$$

$$\frac1n\sum_{t=1}^n\sum_{s=1}^n x_t x_s^T\,\mu_t\mu_s\,\gamma(t-s) \to \sum_h \gamma(h)\int_0^1 f(t)f^T(t)\,e^{2\beta^T f(t)}\,dt.$$

Remark: x_{nt} = (1, t/n) corresponds to linear regression and works. However, x_t = (1, t) need not produce consistent estimates (for example, if the true slope is negative).


62MaPhySto Workshop 9/04

When Does the CLT Apply? (cont)

2. Harmonic functions to specify annual or weekly effects, e.g., x_t = cos(2πt/7).

3. Stationary processes (e.g., a seasonally adjusted temperature series).


63MaPhySto Workshop 9/04

Application to Polio Data

[Figure: monthly polio counts, 1970-1984; x-axis Year, y-axis Counts (0-14).]


64MaPhySto Workshop 9/04

Application to Model for Polio Data

Assume α_t follows a log-normal AR(1), where

$$\alpha_t + \sigma^2/2 = \phi(\alpha_{t-1} + \sigma^2/2) + \eta_t, \qquad \eta_t \sim \text{IID } N(0, \sigma^2(1-\phi^2)),$$

with φ = .82, σ² = .57.

Term            βZ (Zeger)  s.e.   βGLM    s.e. (GLM)  s.e. (asym)
Intercept        0.17       0.13    0.207   0.075       0.205
Trend (×10⁻³)   -4.35       2.68   -4.80    1.40        4.12
cos(2πt/12)     -0.11       0.16   -0.15    0.097       0.157
sin(2πt/12)     -0.048      0.17   -0.53    0.109       0.168
cos(2πt/6)       0.20       0.14    0.169   0.098       0.122
sin(2πt/6)      -0.41       0.14   -0.432   0.101       0.125

Simulation (mean, s.e.): 0.150 (0.213), -4.89 (3.94), -0.145 (0.144), -0.531 (0.168), 0.167 (0.123), -0.440 (0.125).


65MaPhySto Workshop 9/04

Polio Data with Estimated Regression Function

[Figure: monthly polio counts, 1970-1984, with the estimated regression function overlaid; x-axis Year, y-axis Counts (0-14).]


66MaPhySto Workshop 9/04

Estimation Methods — Importance Sampling (Durbin and Koopman)

Model: Y_t | α_t, x_t ~ Pois(exp(x_t^T β + α_t)),
α_t = φα_{t-1} + ε_t, ε_t ~ IID N(0, σ²).

Relative likelihood: let ψ = (β, φ, σ²) and suppose g(y_n, α_n; ψ_0) is an approximating joint density for Y_n = (Y_1, . . . , Y_n)' and α_n = (α_1, . . . , α_n)'. Then

$$L(\psi) = \int p(y_n\mid\alpha_n)\,p(\alpha_n)\,d\alpha_n = \int \frac{p(y_n\mid\alpha_n)\,p(\alpha_n)}{g(y_n,\alpha_n;\psi_0)}\,g(y_n,\alpha_n;\psi_0)\,d\alpha_n,$$

so that, writing g(y_n, α_n; ψ_0) = g(α_n | y_n; ψ_0) g(y_n; ψ_0) and L_g(ψ_0) = g(y_n; ψ_0),

$$\frac{L(\psi)}{L_g(\psi_0)} = \int \frac{p(y_n\mid\alpha_n)\,p(\alpha_n)}{g(y_n,\alpha_n;\psi_0)}\,g(\alpha_n\mid y_n;\psi_0)\,d\alpha_n.$$


67MaPhySto Workshop 9/04

Importance Sampling (cont)

$$\frac{L(\psi)}{L_g(\psi_0)} = E_g\!\left[\frac{p(y_n\mid\alpha_n)\,p(\alpha_n)}{g(y_n,\alpha_n;\psi_0)}\,\Big|\,y_n;\psi_0\right] \approx \frac1N\sum_{j=1}^N \frac{p(y_n\mid\alpha_n^{(j)})\,p(\alpha_n^{(j)})}{g(y_n,\alpha_n^{(j)};\psi_0)},$$

where

$$\alpha_n^{(j)},\ j=1,\ldots,N,\ \text{iid} \sim g(\alpha_n\mid y_n;\psi_0).$$

Notes:

• This is a "one-sample" approximation to the relative likelihood. That is, for one realization of the α's, we have, in principle, an approximation to the whole likelihood function.
• The approximation is only good in a neighborhood of ψ_0. Geyer suggests maximizing the ratio with respect to ψ and iterating, replacing ψ_0 with the maximizer ψ̂.
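A toy version of the one-sample idea, taking the prior at ψ_0 as g purely to keep the sketch short (an assumption: Durbin and Koopman's g is the Gaussian approximating posterior, not the prior):

```python
import numpy as np

# One set of draws alpha^(j) ~ g(.|y; psi0) approximates the whole
# relative-likelihood curve psi -> L(psi)/L_g(psi0).
# Toy model (assumption): alpha_t iid N(0, s^2), Y_t|alpha_t ~ Pois(exp(alpha_t)),
# psi = s, and g taken to be the prior at s0.
def rel_lik_curve(s_grid, y, draws, s0):
    def log_prior(a, s):
        return -0.5 * np.sum(a ** 2, axis=1) / s ** 2 - a.shape[1] * np.log(s)
    log_obs = np.sum(y * draws - np.exp(draws), axis=1)   # Poisson log-lik up to log(y!)
    return [np.mean(np.exp(log_obs + log_prior(draws, s) - log_prior(draws, s0)))
            for s in s_grid]

rng = np.random.default_rng(0)
y = rng.poisson(1.0, size=10).astype(float)
draws = rng.normal(0.0, 0.5, size=(2000, 10))             # one realization of the alphas
curve = rel_lik_curve([0.4, 0.5, 0.6], y, draws, s0=0.5)
```

The same `draws` array is reused for every value on the grid, which is exactly the "one-sample" property noted above.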


68MaPhySto Workshop 9/04

Importance Sampling — example

Simulation example: Y_t | α_t ~ Pois(exp(.7 + α_t)),
α_t = .5α_{t-1} + ε_t, ε_t ~ IID N(0, .3), n = 200, N = 1000.

[Figure: four panels of estimated relative likelihood curves in φ, for φ_0 = -0.9, -0.367, 0.029, and 0.321; x-axis φ from -1.0 to 1.0, y-axis likelihood.]


69MaPhySto Workshop 9/04

Importance Sampling — example

Simulation example: β = .7, φ = .5, σ² = .3, n = 200, N = 1000; 50 realizations plotted.

[Figure: six panels of estimated relative likelihood curves in φ, for φ_0 = -0.5, -0.25, 0, 0.25, 0.5, and 0.75; x-axis φ from -1.0 to 1.0, y-axis likelihood.]


70MaPhySto Workshop 9/04

Importance Sampling (cont)

Instead, one can use ψ_0 = ψ to get

$$L(\psi) = L_g(\psi)\,E_g\!\left[\frac{p(y_n\mid\alpha_n)\,p(\alpha_n)}{g(y_n,\alpha_n;\psi)}\,\Big|\,y_n;\psi\right] \approx L_g(\psi)\,\frac1N\sum_{j=1}^N \frac{p(y_n\mid\alpha_n^{(j)})\,p(\alpha_n^{(j)})}{g(y_n,\alpha_n^{(j)};\psi)},$$

where

$$\alpha_n^{(j)},\ j=1,\ldots,N,\ \text{iid} \sim g(\alpha_n\mid y_n;\psi),$$

and where, for each ψ, the α_n^{(j)} are generated using the same noise sequence. This is the idea behind Durbin and Koopman's implementation.


71MaPhySto Workshop 9/04

Importance Sampling — example

Simulation example: Y_t | α_t ~ Pois(exp(.7 + α_t)),
α_t = .5α_{t-1} + ε_t, ε_t ~ IID N(0, .3), n = 200, N = 1000.

[Figure: estimated log-likelihood as a function of φ over (0.3, 0.7).]


72MaPhySto Workshop 9/04

Importance Sampling (cont)

Choice of importance density g: Durbin and Koopman suggest a linear state-space approximating model

$$Y_t = \mu_t + x_t^T\beta + \alpha_t + Z_t, \qquad Z_t \sim N(0, H_t),$$

with

$$H_t = e^{-(\hat\alpha_t + x_t^T\beta)}, \qquad \mu_t = y_t - y_t e^{-(\hat\alpha_t + x_t^T\beta)} - \hat\alpha_t - x_t^T\beta + 1, \qquad \hat\alpha_t = E_g(\alpha_t\mid y_n),$$

where the $\hat\alpha_t$ are calculated recursively under the approximating model until convergence.

With this choice of approximating model, it turns out that

$$g(\alpha_n\mid y_n;\psi_0) \sim N\big(\Gamma_n^{-1}\tilde y_n,\ \Gamma_n^{-1}\big),$$

where

$$\tilde y_n = y_n - e^{X\beta+\hat\alpha_n} + e^{X\beta+\hat\alpha_n}\hat\alpha_n, \qquad \Gamma_n = \mathrm{diag}\big(e^{X\beta+\hat\alpha_n}\big) + \big(E(\alpha_n\alpha_n')\big)^{-1}.$$


73MaPhySto Workshop 9/04

Importance Sampling (cont)

Components required in the calculation:

• g(y_n, α_n): compute $\tilde y_n'\Gamma_n^{-1}\tilde y_n$ and $\det(\Gamma_n)$
• compute $\Gamma_n^{-1}\tilde y_n$
• simulate from $N(\Gamma_n^{-1}\tilde y_n,\ \Gamma_n^{-1})$
• simulate from $N(0,\ \Gamma_n^{-1})$

Remark: these quantities can be computed quickly using a version of the innovations algorithm or the Kalman smoothing recursions.
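Without the Kalman recursions, the simulation step can be sketched with a dense Cholesky factor of the precision matrix (fine for moderate n; the recursions exploit the band structure for speed):

```python
import numpy as np

# Draw from N(Gamma^{-1} ytilde, Gamma^{-1}) given the precision Gamma:
# if Gamma = L L' (Cholesky), then mean + L^{-T} z with z ~ N(0, I)
# has covariance L^{-T} L^{-1} = Gamma^{-1}.
def draw_posterior(Gamma, ytilde, rng):
    L = np.linalg.cholesky(Gamma)
    mean = np.linalg.solve(Gamma, ytilde)      # Gamma^{-1} ytilde
    z = rng.standard_normal(len(ytilde))
    return mean + np.linalg.solve(L.T, z)

rng = np.random.default_rng(0)
Gamma = np.array([[2.0, 0.5], [0.5, 2.0]])     # toy precision matrix
ytilde = np.array([1.0, 1.0])
x = draw_posterior(Gamma, ytilde, rng)
```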


74MaPhySto Workshop 9/04

Estimation Methods — Approximation to the Likelihood

General setup:

$$p(y_n, \alpha_n) \propto p(y_n\mid\alpha_n)\,|G_n|^{1/2}\exp\{-(\alpha_n-\mu)^T G_n(\alpha_n-\mu)/2\},$$

where

$$G_n^{-1} = E(\alpha_n-\mu)(\alpha_n-\mu)^T.$$

Likelihood:

$$L(\psi) = \int p(y_n\mid\alpha_n)\,p(\alpha_n)\,d\alpha_n.$$

Log-likelihood based on y_n and α_n:

$$l(\psi; y_n, \alpha_n) = \log p(y_n, \alpha_n; \psi) = -\frac{n}{2}\log(2\pi) + \frac12\log|G_n| + l(\theta; y_n\mid\alpha_n) - \frac12(\alpha_n-\mu)^T G_n(\alpha_n-\mu).$$


75MaPhySto Workshop 9/04

Estimation Methods — Approximation to the Likelihood (cont)

If T(α; α*) denotes the second-order Taylor series expansion of l(θ; y_n | α_n) around α*, with remainder

R(α; α*) = l(θ; y_n | α_n) - T(α; α*),

where α* is the mode of l(ψ; y_n, α_n), we have

$$l(\psi; y_n, \alpha_n) = -\frac{n}{2}\log(2\pi) + \frac12\log|G_n| + h^* - \frac12(\alpha^*-\mu)^T G_n(\alpha^*-\mu) - \frac12(\alpha_n-\alpha^*)^T(K^*+G_n)(\alpha_n-\alpha^*) + R(\alpha_n;\alpha^*),$$

where

$$h^* = l(\theta; y_n\mid\alpha_n)\big|_{\alpha_n=\alpha^*}, \qquad K^* = -\frac{\partial^2}{\partial\alpha_n\,\partial\alpha_n^T}\,l(\theta; y_n\mid\alpha_n)\Big|_{\alpha_n=\alpha^*}.$$


76MaPhySto Workshop 9/04

Estimation Methods — Approximation to the Likelihood (cont)

Let p_a(α_n | y_n, ψ) be the posterior based on the log-likelihood neglecting R(α; α*), i.e.,

$$p_a(\alpha_n\mid y_n,\psi) = \phi\big(\alpha_n;\ \alpha^*,\ (K^*+G_n)^{-1}\big).$$

Then

$$L(\psi; y_n) = L_a(\psi; y_n)\,Er_a(\psi),$$

where

$$L_a(\psi; y_n) = \frac{|G_n|^{1/2}}{|K^*+G_n|^{1/2}}\,\exp\{h^* - (\alpha^*-\mu)^T G_n(\alpha^*-\mu)/2\}$$

and

$$Er_a(\psi) = \int \exp R(\alpha_n;\alpha^*)\,p_a(\alpha_n\mid y_n;\psi)\,d\alpha_n.$$


77MaPhySto Workshop 9/04

Estimation Methods — Approximation to the Likelihood (cont)

Notes:

1. The approximating posterior,

$$p_a(\alpha_n\mid y_n,\psi) = \phi\big(\alpha_n;\ \alpha^*,\ (K^*+G_n)^{-1}\big),$$

is identical to the importance sampling density used by Durbin and Koopman for the case of exponential families.

2. In the traditional Bayesian setting, the posterior is approximately p_a for large n (see Bernardo and Smith, 1994).


78MaPhySto Workshop 9/04

Estimation Methods — Approximation to the Likelihood (cont)

1. Importance sampling (IS). Recall that

$$L(\psi; y_n) = L_a(\psi; y_n)\,Er_a(\psi).$$

Estimate the likelihood by

$$\hat L(\psi; y_n) = L_a(\psi; y_n)\,\widehat{Er}_a(\psi),$$

where

$$\widehat{Er}_a(\psi) = \frac1N\sum_{i=1}^N \exp\big(R(\alpha^{(i)},\alpha^*)\big), \qquad \alpha^{(j)},\ j=1,\ldots,N,\ \text{iid} \sim p_a(\alpha_n\mid y_n;\psi).$$


79MaPhySto Workshop 9/04

Estimation Methods — Approximation to the Likelihood (cont)

2. Approximate likelihood (AL). If p_a(α_n | y_n, ψ) is concentrated around α_n = α*, then the error term Er_a(ψ) is close to 1. This suggests ignoring the error term and approximating the likelihood by

$$\hat L(\psi; y_n) = L_a(\psi; y_n).$$

3. Hybrid estimate (AIS). Approximate e(ψ) = log Er_a(ψ) by a linear function. Let ψ̂_AL be the maximum likelihood estimator based on the objective function L_a(ψ). Then

$$e(\psi) \approx e(\hat\psi_{AL}) + \dot e(\hat\psi_{AL})(\psi - \hat\psi_{AL}).$$

Ignoring the constant term, we have

$$\hat L(\psi; y_n) \propto L_a(\psi; y_n)\exp\big(\dot e(\hat\psi_{AL})(\psi - \hat\psi_{AL})\big).$$

Use one run of importance sampling to estimate $\dot e(\hat\psi_{AL})$.


80MaPhySto Workshop 9/04

Estimation Methods — Approximation to the Likelihood (cont)

Case of the exponential family:

$$L_a(\psi; y_n) = \frac{|G_n|^{1/2}}{|K^*+G_n|^{1/2}}\exp\big(y_n^T\alpha^* - 1^T b(\alpha^*) + 1^T c(y_n) - (\alpha^*-\mu)^T G_n(\alpha^*-\mu)/2\big),$$

where

$$K^* = \mathrm{diag}\Big(\frac{\partial^2}{\partial\alpha_t^2}\,b(\alpha_t)\Big|_{\alpha_t=\alpha_t^*}\Big)$$

and α* is the solution to the equation

$$y_n - \frac{\partial}{\partial\alpha_n}\,b(\alpha_n) - G_n(\alpha_n-\mu) = 0.$$

Using a Taylor expansion, the latter equation can be solved iteratively.
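For the Poisson case (b(α) = e^α) the iteration is plain Newton-Raphson; a minimal sketch with a generic prior precision G_n:

```python
import numpy as np

# Solve y - b'(alpha) - G (alpha - mu) = 0 for the posterior mode alpha*
# by Newton-Raphson, specialized to b(a) = exp(a), so b'(a) = exp(a)
# and K* = diag(exp(alpha*)).
def posterior_mode(y, G, mu, n_iter=50):
    a = np.array(mu, dtype=float).copy()
    for _ in range(n_iter):
        grad = y - np.exp(a) - G @ (a - mu)
        hess = np.diag(np.exp(a)) + G          # = -d(grad)/d(alpha), positive definite
        a = a + np.linalg.solve(hess, grad)
    return a

G = np.eye(5) / 0.3                            # toy prior precision (assumption)
mu = np.zeros(5)
y = np.array([0.0, 1.0, 2.0, 0.0, 3.0])
a_star = posterior_mode(y, G, mu)
```

Since the objective is strictly concave, the iteration converges to the unique mode.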


81MaPhySto Workshop 9/04

A Closer Look at log(Er_a(ψ))

$$\log\big(\widehat{Er}_a(\psi)\big) = \log\Big(\frac1N\sum_{i=1}^N \exp\big(R(\alpha^{(i)},\alpha^*)\big)\Big), \qquad e(\psi) \approx e(\hat\psi_{AL}) + \dot e(\hat\psi_{AL})(\psi-\hat\psi_{AL}).$$

[Figure: log Er_a(ψ) as a function of φ over (0.3, 0.7), with its linear approximation overlaid.]


82MaPhySto Workshop 9/04

Comparison between IS and AIS

Model: Y_t | α_t ~ Pois(exp(.7 + α_t)), α_t = .5α_{t-1} + ε_t, ε_t ~ IID N(0, .3), n = 200.

[Figure: estimated log-likelihood curves in φ over (0.3, 0.7) for importance sampling and for the hybrid AL + IS.]


83MaPhySto Workshop 9/04

Comparison between IS and AIS (cont)

$$\log\big(\widehat{Er}_a(\psi)\big) = \log\Big(\frac{1}{1000}\sum_{i=1}^{1000}\exp\big(R(\alpha^{(i)},\alpha^*)\big)\Big), \qquad e(\psi) \approx e(\hat\psi_{AL}) + \dot e(\hat\psi_{AL})(\psi-\hat\psi_{AL}).$$

[Figure: log-likelihood curves over φ in (0.3, 0.7) together with log Er_a(ψ) and its linear approximation over the same range.]


84MaPhySto Workshop 9/04

Comparison between IS and AIS

Model: Y_t | α_t ~ Pois(exp(1.0 + α_t)),
α_t = 1.25α_{t-1} - .75α_{t-2} + ε_t, ε_t ~ IID N(0, .2), n = 200.

Estimation methods:

Importance sampling (N = 100):
      β       φ1      φ2      σ²
mean  0.9943  1.2520  -.7536  0.1925
rmse  0.0832  0.0749  0.0668  0.0507

Approximate likelihood + IS:
      β       φ1      φ2      σ²
mean  0.9943  1.2520  -.7586  0.1925
rmse  0.0832  0.0749  0.0669  0.0507


85MaPhySto Workshop 9/04

Comparison between IS and AIS (cont)

Model: Y_t | α_t ~ Pois(exp(1.0 + α_t)),
α_t = 1.25α_{t-1} - .75α_{t-2} + .2α_{t-3} + ε_t, ε_t ~ IID N(0, .2), n = 200.

Estimation methods:

Importance sampling (N = 100):
      β       φ1      φ2      φ3      σ²
mean  1.0009  1.2473  -.7496  0.1896  0.1969
rmse  0.1139  0.2765  0.3815  0.2011  0.0783

Approximate likelihood + IS:
      β       φ1      φ2      φ3      σ²
mean  1.0009  1.2471  -.7493  0.1894  0.1969
rmse  0.1139  0.2759  0.3809  0.2009  0.0783


86MaPhySto Workshop 9/04

Simulation Results

Model: Y_t | α_t ~ Pois(exp(.7 + α_t)), α_t = .5α_{t-1} + ε_t, ε_t ~ IID N(0, .3), n = 200.

Simulation results based on 1000 replications:

Importance sampling (N = 100):
      β       φ       σ²
mean  0.7038  0.4783  0.2939
std   0.0970  0.1364  0.0789

Approximate likelihood:
mean  0.7035  0.4627  0.2956
std   0.0972  0.1413  0.0785

AL + IS (AIS):
mean  0.7039  0.4784  0.2938
std   0.0969  0.1364  0.0789


87MaPhySto Workshop 9/04

Model: Y_t | α_t ~ Pois(exp(.7 + α_t)), α_t = .5α_{t-1} + ε_t, ε_t ~ IID N(0, .3), n = 200.

[Figure: three rows of panels (Approx likelihood, Importance Sampling, AL + IS (AIS)) showing sampling densities of the estimates of β, φ, and σ².]


88MaPhySto Workshop 9/04

Application to Model Fitting for the Polio Data

AR model order selection for α_t:

Approximate likelihood (AL):
p          0       1       2       3       4       5       13
AIC        518.0   512.3   512.3   513.9   512.3   514.2   527.4
log(like) -252.0  -248.1  -247.1  -246.9  -245.2  -245.1  -243.7

Approximate likelihood + IS (AIS):
p          0       1       2       3       4       5       13
AIC        519.8   512.6   512.1   513.9   511.5   514.3   527.4
log(like) -252.9  -248.3  -247.0  -246.9  -244.8  -245.2  -243.7


89MaPhySto Workshop 9/04

Application to Model Fitting for the Polio Data

Model for α_t: α_t = φα_{t-1} + ε_t, ε_t ~ IID N(0, σ²). Importance sampling with N = 1000.

Term            β̂_IS    SE     β̂_AL    SE     β̂_AIS   SE
Intercept        0.238  0.283   0.242  0.273   0.239  0.285
Trend (×10⁻³)   -3.744  2.823  -3.814  2.767  -3.746  2.867
cos(2πt/12)      0.160  0.152   0.162  0.142   0.161  0.151
sin(2πt/12)     -0.481  0.168  -0.482  0.166  -0.480  0.164
cos(2πt/6)       0.414  0.128   0.413  0.128   0.414  0.122
sin(2πt/6)      -0.011  0.126  -0.011  0.129  -0.011  0.127
φ                0.661  0.202   0.627  0.229   0.661  0.209
σ²               0.272  0.115   0.289  0.122   0.272  0.112


90MaPhySto Workshop 9/04

Simulation Results

Stochastic volatility model: Y_t = σ_t Z_t, Z_t ~ IID N(0,1),
α_t = γ + φα_{t-1} + ε_t, ε_t ~ IID N(0, σ²), where α_t = 2 log σ_t; n = 500, N = 100, NR = 500.

CV = 10:
    True    AL      RMSE   AIS     RMSE
γ   -.411  -.491    .210   -.481   .195
φ   0.950  0.940    .025   0.942   .023
σ   0.484  0.479    .065   0.477   .064

CV = 1:
    True    AL      RMSE   AIS     RMSE
γ   -.368  -.500    .341   -.483   .299
φ   0.950  0.932    .046   0.934   .040
σ   0.260  0.270    .068   0.269   .064


91MaPhySto Workshop 9/04

$$\log\big(\widehat{Er}_a(\psi)\big) = \log\Big(\frac{1}{500}\sum_{i=1}^{500}\exp\big(R(\alpha^{(i)},\alpha^*)\big)\Big), \qquad e(\psi) \approx e(\hat\psi_{AL}) + \dot e(\hat\psi_{AL})(\psi-\hat\psi_{AL}).$$

[Figure: for the stochastic volatility model, estimated log-likelihood and log Er_a(ψ) as functions of φ over (0.940, 0.960), with the linear approximation overlaid.]


92MaPhySto Workshop 9/04

Application to Pound-Dollar Exchange Rates

Stochastic volatility model: Y_t = σ_t Z_t, Z_t ~ IID N(0,1),
α_t = γ + φα_{t-1} + ε_t, ε_t ~ IID N(0, σ²), where α_t = 2 log σ_t; N = 1000, B = 500.

     AL       BC       SE      AIS      BC       SE
γ   -0.0227  -0.0140  0.0198  -0.0230  -0.0153  0.0173
φ    0.9750   0.9845  0.0194   0.9747   0.9832  0.0166
σ²   0.0267   0.0228  0.0141   0.0273   0.0228  0.0138


93MaPhySto Workshop 9/04

Application to Sydney Asthma Count Data

Data: Y_1, . . . , Y_1461, daily asthma presentations at a Campbelltown hospital.

Preliminary analysis identified:

• no upward or downward trend;
• an annual cycle, modeled by cos(2πt/365) and sin(2πt/365);
• a school-term (seasonal) effect, modeled by

$$P_{ij}(t) = \frac{1}{B(2.5,5)}\Big(\frac{t-T_{ij}}{100}\Big)^{1.5}\Big(1-\frac{t-T_{ij}}{100}\Big)^{4},$$

where B(2.5, 5) is the beta function and T_ij is the start of the jth school term in year i;
• a day-of-the-week effect, modeled by separate indicator variables for Sunday and Monday (increased admittance on these days compared to Tuesday-Saturday);
• of the meteorological variables (max/min temperature, humidity) and pollution variables (ozone, NO, NO2), only humidity at lags of 12-20 days and NO2 (max) appear to have an association.


94MaPhySto Workshop 9/04

Results for Asthma Data (AL & AIS, N = 100)

Term             AL      SE     AIS     SE
Intercept         0.568  .064    0.569  .066
Sunday effect     0.199  .053    0.199  .051
Monday effect     0.225  .053    0.226  .052
cos(2πt/365)     -0.214  .041   -0.214  .041
sin(2πt/365)      0.177  .043    0.177  .048
Term 1, 1990      0.199  .063    0.200  .065
Term 2, 1990      0.133  .066    0.133  .063
Term 1, 1991      0.085  .070    0.085  .078
Term 2, 1991      0.171  .064    0.171  .069
Term 1, 1992      0.249  .061    0.249  .066
Term 2, 1992      0.302  .060    0.302  .058
Term 1, 1993      0.431  .060    0.432  .062
Term 2, 1993      0.114  .067    0.114  .075
Humidity H_t/20   0.009  .003    0.009  .003
NO2 max          -0.101  .033   -0.101  .034
AR(1), φ          0.774  .327    0.786  .394
σ²                0.011  .015    0.010  .013


95MaPhySto Workshop 9/04

Asthma Data: observed and conditional mean

[Figure: four panels (1990, 1991, 1992, 1993), each plotting the observed daily counts and the conditional mean against Day of Year (0-300); y-axis: Counts.]

Page 96

Is the posterior distribution close to normal?

Compare the posterior mean with the posterior mode: the posterior mean can be computed using SIR (sampling importance resampling) or particle filtering.

Posterior mode: The mode of p(αn | yn) is α*, found at the last iteration of the approximate likelihood (AL) procedure.

Posterior mean: The mean of p(αn | yn) can be found using SIR.

Let α(1), α(2), . . . , α(N) be independent draws from the approximating multivariate distribution pa(αn | yn). For N large, an approximate iid sample from p(αn | yn) can be obtained by drawing a random sample from α(1), α(2), . . . , α(N) with probabilities

w_i = \frac{p(\alpha^{(i)} \mid y_n)}{p_a(\alpha^{(i)} \mid y_n)} \propto \exp\{ R(\alpha^{(i)}; \alpha^*) \}, \qquad i = 1, \ldots, N,

normalized to the resampling probabilities \pi_i = w_i \big/ \sum_{j=1}^{N} w_j.
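The SIR step above can be sketched generically. This is not the lecture's code: a toy Gaussian target stands in for p(αn | yn), a wider Gaussian proposal stands in for pa(αn | yn), and the weight is computed directly as a log-density ratio rather than via the remainder term R:

```python
import numpy as np

rng = np.random.default_rng(0)

def sir(draws, log_target, log_proposal, rng):
    """Sampling importance resampling: reweight proposal draws by the
    (unnormalized) target/proposal ratio, then resample with those weights."""
    logw = log_target(draws) - log_proposal(draws)
    logw -= logw.max()                 # stabilize before exponentiating
    w = np.exp(logw)
    probs = w / w.sum()                # pi_i = w_i / sum_j w_j
    idx = rng.choice(len(draws), size=len(draws), p=probs)
    return draws[idx]

# Toy check: proposal N(0, 2^2), target N(1, 1); after resampling the
# draws should approximate the target.
draws = rng.normal(0.0, 2.0, size=20000)
log_p = lambda a: -0.5 * (a - 1.0) ** 2      # target log density (up to a constant)
log_pa = lambda a: -0.5 * (a / 2.0) ** 2     # proposal log density (up to a constant)
resampled = sir(draws, log_p, log_pa, rng)
```

Subtracting the maximum log-weight before exponentiating avoids overflow, a standard trick when the log-ratio can be large.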

Page 97

Posterior mean vs posterior mode?

Polio data: blue = mean, red = mode

[Figure: smoothed state vector plotted against t (0-150); posterior mean and posterior mode overlaid.]

Page 98

Posterior mean vs posterior mode?

Pound/US exchange rate data: blue = mean, red = mode

[Figure: posterior mean plotted against time (0-800); posterior mean and posterior mode overlaid.]

Page 99

Is the posterior distribution close to normal?

Suppose α(1), α(2), . . . , α(M) are independent draws from the multivariate distribution p(αn | yn), generated using SIR. Then

d_j^2 = (\alpha^{(j)} - \alpha^*)^T (K + G) (\alpha^{(j)} - \alpha^*) \;\overset{iid}{\sim}\; \chi^2(n)

approximately, if the posterior is close to normal.
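A minimal sketch of this normality diagnostic: compute the quadratic forms d_j^2 for each draw and compare them to chi-square(n) behavior. Here an identity precision matrix stands in for the slide's K + G, and exact normal draws stand in for the SIR output, so the check passes by construction:

```python
import numpy as np

rng = np.random.default_rng(1)

def mahalanobis_sq(draws, mode, precision):
    """d_j^2 = (a_j - mode)^T P (a_j - mode) for each row a_j of draws."""
    diff = draws - mode
    # quadratic form per draw, summing over both state indices
    return np.einsum('ji,ik,jk->j', diff, precision, diff)

n, M = 5, 2000
mode = np.zeros(n)                          # stands in for the posterior mode alpha*
precision = np.eye(n)                       # stands in for K + G on the slide
draws = rng.multivariate_normal(mode, np.linalg.inv(precision), size=M)
d2 = mahalanobis_sq(draws, mode, precision)
# In practice one would QQ-plot sorted d2 against chi-square(n) quantiles;
# under normality, E[d_j^2] = n.
```

For real SIR output, a QQ plot of the sorted d_j^2 against chi-square(n) quantiles (as in the figure below) makes departures from posterior normality visible.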

[Figure: nine chi-square QQ plots (observed quantile vs chi-square quantile): three panels for n = 50, M = 100; three for n = 100, M = 150; and three for n = 200, M = 250.]

Correlations are not significant.

Page 100

Summary Remarks

1. Importance sampling offers a clean method for estimation in parameter-driven models.

2. The approximation to the likelihood is a non-simulation-based procedure that may have great potential, especially with large sample sizes and/or a large number of explanatory variables.

3. The hybrid approach is a good compromise between importance sampling and the approximate likelihood method.

4. The approximate likelihood and hybrid approaches are amenable to bootstrap procedures for bias correction.

5. The posterior mode matches the posterior mean reasonably well.

6. The approximate likelihood approach may be useful for structural break detection, the subject of tomorrow's lecture.