Laplace Likelihood and LAD Estimation for Non-invertible MA(1)

Richard A. Davis and F. Jay Breidt, Colorado State University
Nan-Jung Hsu, National Tsing-Hua University
Murray Rosenblatt, U. of California, San Diego

(http://www.stat.colostate.edu/~rdavis/lectures)

Waseda 1/06

MA(1) unit root problem

MA(1): (world’s simplest time series model!)

$Y_t = Z_t - \theta Z_{t-1}$, $\{Z_t\} \sim \mathrm{IID}(0, \sigma^2)$

Properties:

• $|\theta| < 1 \Rightarrow Z_t = \sum_{j=0}^{\infty} \theta^{j} Y_{t-j}$ (invertible)

• $|\theta| > 1 \Rightarrow Z_t = -\sum_{j=1}^{\infty} \theta^{-j} Y_{t+j}$ (non-invertible)

• $|\theta| = 1 \Rightarrow Z_t \in \overline{\mathrm{sp}}\{Y_t, Y_{t-1}, \ldots\}$ and $Z_t \in \overline{\mathrm{sp}}\{Y_{t+1}, Y_{t+2}, \ldots\}$

  $\Rightarrow P_{\overline{\mathrm{sp}}\{Y_s,\, s \neq 0\}} Y_0 = Y_0$ (perfect interpolation)

• $|\theta| < 1 \Rightarrow \hat\theta_{mle}$ is $\mathrm{AN}(\theta, (1-\theta^2)/n)$

MLE = maximum (Gaussian) likelihood, n = sample size

What if θ = 1?

Why study MA(1) with a unit root?

a) differencing (to remove non-stationarity)

• linear trend model: $X_t = a + bt + Z_t$.
  $Y_t = X_t - X_{t-1} = b + Z_t - Z_{t-1}$ ~ MA(1) with $\theta = 1$.

• seasonal model: $X_t = s_t + Z_t$, $s_t$ seasonal component w/ period 12.
  $Y_t = X_t - X_{t-12} = Z_t - Z_{t-12}$ ~ MA(12) with $\theta = 1$.

b) random walk + noise

$X_t = X_{t-1} + U_t$ (random walk signal)

$Y_t = X_t + V_t$ (random walk signal + noise)

Then $Y_t - Y_{t-1} = U_t + V_t - V_{t-1}$ ~ MA(1) with $\theta = 1$ if and only if $\mathrm{Var}(U_t) = 0$.
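The differencing argument in (a) can be checked numerically. The sketch below is only an illustration: the trend coefficients, sample size, seed and Gaussian noise are assumptions, not values from the slides. It differences a simulated linear-trend series and compares the sample lag-1 autocorrelation with the value $-\theta/(1+\theta^2) = -1/2$ implied by an MA(1) with $\theta = 1$.

```python
import numpy as np

def lag1_autocorr(y):
    """Sample lag-1 autocorrelation of a 1-D series."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    return float(np.dot(y[1:], y[:-1]) / np.dot(y, y))

rng = np.random.default_rng(0)
n = 5000
a, b = 2.0, 0.5                     # illustrative trend coefficients (assumed)
z = rng.standard_normal(n + 1)      # Z_0, ..., Z_n (Gaussian chosen for convenience)
t = np.arange(n + 1)
x = a + b * t + z                   # X_t = a + b t + Z_t
y = np.diff(x)                      # Y_t = X_t - X_{t-1} = b + Z_t - Z_{t-1}

# For an MA(1) with theta = 1 the lag-1 autocorrelation is -1/2.
rho1 = lag1_autocorr(y)
print(round(rho1, 2), round(float(np.mean(y)), 2))
```

The differenced series has mean close to b and lag-1 autocorrelation close to -1/2, which is the signature of the unit root introduced by over-differencing.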

Identifiability and the Gaussian likelihood

Identifiability

• $|\theta| > 1 \Rightarrow Y_t = \varepsilon_t - \theta^{-1} \varepsilon_{t-1}$, where $\{\varepsilon_t\} \sim \mathrm{WN}(0, \theta^2\sigma^2)$.

• $\{\varepsilon_t\}$ is IID if and only if $\{Z_t\}$ is Gaussian (Breidt and Davis `91)

• $\{\varepsilon_t\}$ is a special case of an All-Pass Model (Breidt, Davis, Trindade `01, Andrews et al. `05a, `05b)

Gaussian Likelihood

$L_G(\theta, \sigma^2) = L_G(1/\theta, \theta^2\sigma^2) \Rightarrow \theta$ is only identifiable for $|\theta| \le 1$.

Notes:

i) this implies $L_G(\theta) = L_G(1/\theta)$ for the profile likelihood and $\theta = 1$ is a critical point, $L_G'(1) = 0$.

ii) a pile-up effect ensues, i.e., $P(\hat\theta = 1) > 0$ even if $\theta < 1$.
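The identity $L_G(\theta, \sigma^2) = L_G(1/\theta, \theta^2\sigma^2)$ holds because both parameter pairs give the same autocovariances, $\gamma(0) = \sigma^2(1+\theta^2)$ and $\gamma(1) = -\theta\sigma^2$. A minimal numerical check follows, assuming the exact Gaussian likelihood is computed directly from the tridiagonal covariance matrix. The function name, seed and parameter values are illustrative choices, not part of the slides.

```python
import numpy as np

def ma1_gaussian_loglik(y, theta, sigma2):
    """Exact Gaussian log-likelihood of Y_t = Z_t - theta Z_{t-1}, Var(Z_t) = sigma2."""
    n = len(y)
    gamma0 = sigma2 * (1.0 + theta**2)      # Var(Y_t)
    gamma1 = -theta * sigma2                # Cov(Y_t, Y_{t-1})
    cov = gamma0 * np.eye(n) + gamma1 * (np.eye(n, k=1) + np.eye(n, k=-1))
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

rng = np.random.default_rng(1)
z = rng.laplace(size=101)                   # non-Gaussian noise, as on the next slide
y = z[1:] - 0.8 * z[:-1]                    # 100 observations with theta_0 = 0.8

theta, sigma2 = 0.8, 1.3                    # arbitrary evaluation point (assumed)
ll_inside = ma1_gaussian_loglik(y, theta, sigma2)
ll_outside = ma1_gaussian_loglik(y, 1.0 / theta, theta**2 * sigma2)
print(ll_inside, ll_outside)
```

The two values agree up to floating-point error, so the Gaussian likelihood cannot distinguish $\theta$ from $1/\theta$.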

Gaussian likelihood, non-Gaussian data

[Figure: Gaussian likelihood plotted against theta (0.5 to 2.0), labeled $\theta_0 = .8$, $\theta_0 = 1.0$ and $\theta_0 = 1.25$.]

100 observations from $Y_t = Z_t - \theta_0 Z_{t-1}$, $\{Z_t\}$ ~ IID, Laplace pdf

Gaussian MLE for near-unit roots

Idea: build parameter normalization into the likelihood function.

Model: $Y_t = Z_t - (1 - \beta/n) Z_{t-1}$, $t = 1, \ldots, n$.

$\beta = n(1-\theta)$, $\theta = 1 - \beta/n$, $\theta_0 = 1 - \gamma/n$

Gaussian Likelihood: $L_n(\beta) = l_n(1 - \beta/n) - l_n(1)$, $l_n(\cdot)$ = profile log-like.

Theorem (Davis and Dunsmuir `96): Under $\theta_0 = 1 - \gamma/n$, $L_n(\beta) \to_d Z_\gamma(\beta)$ on $C[0,\infty)$.

Results:

• $n(1 - \hat\theta_{mle}) \to_d \hat\beta_{mle} = \mathrm{argmax}\, Z_\gamma(\beta)$

• $n(1 - \hat\theta_{LL}) \to_d \hat\beta_{LL} = \mathrm{arglocalmax}\, Z_\gamma(\beta)$

• $P(\hat\theta_{LL} = 1) \to P(\hat\beta_{LL} = 0) = .6518$ if $\gamma = 0$.

Extensions of MLE (Gaussian likelihood)

i) non-zero mean (Chen and Davis `00): same type of limit, except pile-up is more excessive.

$P(\hat\theta_{mle} = 1) \to .955$

This makes hypothesis testing easy!

Reject $H_0$: $\theta = 1$ if $\hat\theta_{mle} < 1$ (size of test is .045)

ii) heavy tails (Davis and Mikosch `98): $\{Z_t\}$ symmetric alpha stable (SαS). Then the max Gaussian likelihood estimator has the same normalizing rate, i.e.,

$n(1 - \hat\theta_{LL}) \to_d \hat\beta_{LL}$, $P(\hat\theta_{LL} = 1) \to P(\hat\beta_{LL} = 0)$.

The pile-up decreases with increasing tail heaviness.

Laplace likelihood/LAD estimation

If noise distribution is non-Gaussian, the MA(1) parameter θ is identifiable for all real values.

Q1. For MLE (non-Gaussian) does one have $1/n$ or $1/n^{1/2}$ asymptotics?

Q2. Is there a pile-up effect?

Look at this problem with non-Gaussian likelihood

• Specifically, consider Laplace likelihood / Least Absolute Deviations for unit root only (not near-unit root)

• Preliminary results only!

Non-Gaussian likelihood – Joint and Exact

Model. $Y_t = Z_t - \theta Z_{t-1}$, $\{Z_t\}$ ~ IID with median 0 and $EZ^4 < \infty$.

Initial variable.

$Z_{init} = \begin{cases} Z_0, & \text{if } |\theta| \le 1, \\ Z_n - \sum_{t=1}^{n} Y_t, & \text{otherwise.} \end{cases}$

Joint density: Let $\mathbf{y}_n = (Y_1, \ldots, Y_n)$, then

$f(\mathbf{y}_n, z_{init}) = f(z_0, z_1, \ldots, z_n) \left( 1_{\{|\theta| \le 1\}} + |\theta|^{-n} 1_{\{|\theta| > 1\}} \right),$

where the $z_t$ are solved

forward by: $z_t = Y_t + \theta z_{t-1}$, $t = 1, \ldots, n$ for $|\theta| \le 1$ with $z_0 = z_{init}$

backward by: $z_{t-1} = \theta^{-1}(z_t - Y_t)$, $t = n, \ldots, 1$ for $|\theta| > 1$ with $z_n = z_{init} + Y_1 + \cdots + Y_n$

Note: integrate out $z_{init}$ to get Exact likelihood.

$f(\mathbf{y}_n) = \int_{-\infty}^{\infty} f(\mathbf{y}_n, z_{init})\, dz_{init}$
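The forward and backward recursions can be sketched in a few lines. The function name `residuals` and the simulated check are illustrative assumptions. The check feeds in the true $\theta$ and the true initial variable and confirms that the recursion returns the noise sequence in both the invertible and non-invertible cases.

```python
import numpy as np

def residuals(y, theta, z_init):
    """Return (z_0, ..., z_n) implied by y = (Y_1, ..., Y_n), theta and z_init."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    z = np.empty(n + 1)
    if abs(theta) <= 1:
        z[0] = z_init                          # z_0 = z_init
        for t in range(1, n + 1):              # forward: z_t = Y_t + theta z_{t-1}
            z[t] = y[t - 1] + theta * z[t - 1]
    else:
        z[n] = z_init + y.sum()                # z_n = z_init + Y_1 + ... + Y_n
        for t in range(n, 0, -1):              # backward: z_{t-1} = (z_t - Y_t) / theta
            z[t - 1] = (z[t] - y[t - 1]) / theta
    return z

rng = np.random.default_rng(2)
z_true = rng.laplace(size=51)                  # Z_0, ..., Z_50

# Invertible case: Z_init = Z_0.
y_inv = z_true[1:] - 0.8 * z_true[:-1]
z_fwd = residuals(y_inv, 0.8, z_true[0])

# Non-invertible case: Z_init = Z_n - sum_t Y_t.
y_non = z_true[1:] - 1.25 * z_true[:-1]
z_bwd = residuals(y_non, 1.25, z_true[-1] - y_non.sum())
```

Both recursions are numerically stable in the direction used: the forward pass multiplies by $|\theta| \le 1$ and the backward pass divides by $|\theta| > 1$.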

Laplace likelihood examples

100 observations from $Y_t = Z_t - \theta_0 Z_{t-1}$, $\{Z_t\}$ ~ IID Laplace pdf

[Figure: two surface plots over theta (0.5 to 2) and z.init (-2 to 2), vertical axis Z (0 to 1); panels $\theta_0 = .8$ and $\theta_0 = 1.0$.]

Laplace likelihood, Laplace noise

100 observations from $Y_t = Z_t - \theta_0 Z_{t-1}$, $\{Z_t\}$ ~ IID Laplace pdf

[Figure: two panels, "Exact likelihood" and "Joint likelihood at zmax(θ)", each plotting Laplace likelihood against theta (0.5 to 2.0), labeled $\theta_0 = .8$, $\theta_0 = 1.0$ and $\theta_0 = 1.25$.]

Laplace likelihood-LAD estimation

(Joint) Laplace log-likelihood. ($\sigma = E|Z_0|$ is a scale parameter)

$L(\theta, z_{init}, \sigma) = -(n+1) \log 2\sigma - \sigma^{-1} \sum_{t=0}^{n} |z_t| - n (\log|\theta|)\, 1_{\{|\theta| > 1\}}$

Maximizing wrt σ, we obtain

$\hat\sigma = \sum_{t=0}^{n} |z_t| / (n+1)$

so that maximizing L is equivalent to minimizing

$l_n(\theta, z_{init}) = \begin{cases} \sum_{t=0}^{n} |z_t|, & \text{if } |\theta| \le 1, \\ \sum_{t=0}^{n} |z_t|\,|\theta|, & \text{otherwise.} \end{cases}$
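A minimal sketch of the LAD criterion $l_n(\theta, z_{init})$ follows, with the $|\theta|$ factor taken as stated on the slide. The helper names, the grid ranges and the brute-force grid search are assumptions made for illustration. The slides do not say which optimizer was used.

```python
import numpy as np

def residuals(y, theta, z_init):
    """(z_0, ..., z_n) from the forward (|theta| <= 1) or backward (|theta| > 1) recursion."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    z = np.empty(n + 1)
    if abs(theta) <= 1:
        z[0] = z_init
        for t in range(1, n + 1):
            z[t] = y[t - 1] + theta * z[t - 1]
    else:
        z[n] = z_init + y.sum()
        for t in range(n, 0, -1):
            z[t - 1] = (z[t] - y[t - 1]) / theta
    return z

def lad_objective(y, theta, z_init):
    """l_n(theta, z_init): sum |z_t| if |theta| <= 1, |theta| * sum |z_t| otherwise."""
    total = np.abs(residuals(y, theta, z_init)).sum()
    return total if abs(theta) <= 1 else abs(theta) * total

def sigma_hat(y, theta, z_init):
    """Scale estimate sum_{t=0}^n |z_t| / (n + 1) at a given (theta, z_init)."""
    return np.abs(residuals(y, theta, z_init)).mean()

rng = np.random.default_rng(3)
n = 200
z = rng.laplace(scale=1.0, size=n + 1)       # Laplace noise with sigma = E|Z_0| = 1
y = z[1:] - z[:-1]                           # unit root: theta_0 = 1

# Coarse grid search over (theta, z_init); grid ranges are arbitrary choices.
thetas = np.linspace(0.5, 2.0, 76)
z_inits = np.linspace(-3.0, 3.0, 31)
grid = np.array([[lad_objective(y, th, zi) for zi in z_inits] for th in thetas])
i, j = np.unravel_index(np.argmin(grid), grid.shape)
theta_grid, z_init_grid = thetas[i], z_inits[j]
print(theta_grid, z_init_grid, sigma_hat(y, theta_grid, z_init_grid))
```

A grid search finds a global minimizer on the grid only. The Local estimators discussed later would instead need a local search started near $\theta = 1$.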

Joint Laplace likelihood – limit results

Result 1. Under the parameterizations $\theta = 1 + \beta/n$ and $z_{init} = Z_0 + \alpha\sigma/n^{1/2}$, we have

$U_n(\beta, \alpha) = \sigma^{-1} \left( l_n(\theta, z_{init}) - l_n(1, Z_0) \right) \to_d U(\beta, \alpha)$

on $C(\mathbb{R}^2)$, where

$U(\beta, \alpha) = \int_0^1 \left( \beta \int_0^{s-} e^{\beta(s-t)}\, dS(t) + \alpha e^{\beta s} \right) dW(s) + f(0) \int_0^1 \left( \beta \int_0^{s} e^{\beta(s-t)}\, dS(t) + \alpha e^{\beta s} \right)^2 ds$

for $\beta \le 0$, and

$U(\beta, \alpha) = \int_0^1 \left( -\beta \int_{s+}^{1} e^{-\beta(t-s)}\, dS(t) + \alpha e^{-\beta(1-s)} \right) dW(s) + f(0) \int_0^1 \left( -\beta \int_{s}^{1} e^{-\beta(t-s)}\, dS(t) + \alpha e^{-\beta(1-s)} \right)^2 ds$

for $\beta > 0$.

Joint Laplace likelihood – limit results

From the limit,

$U_n(\beta, \alpha) \to_d U(\beta, \alpha),$

it suggests (from the continuous mapping theorem?) that limit(optimum(criterion)) = optimum(limit(criterion)).

So for the Local optimizer of the Joint likelihood

$\left( n(\hat\theta_{LJ} - 1),\; n^{1/2} \sigma^{-1} (\hat z^{LJ}_{init} - Z_0) \right) \to_d (\hat\beta_{LJ}, \hat\alpha_{LJ}),$

where

$(\hat\beta_{LJ}, \hat\alpha_{LJ}) = \text{(local) argmin}\; U(\beta, \alpha).$

The limits contain correlated Brownian Motions S(t) and W(t), obtained as the limits of the partial sum processes

$S_n(t) = \frac{1}{\sqrt{n}} \sum_{i=0}^{[nt]} Z_i/\sigma \to_d S(t), \qquad W_n(t) = \frac{1}{\sqrt{n}} \sum_{i=0}^{[nt]} \mathrm{sign}(Z_i) \to_d W(t).$

Joint Laplace likelihood – limit results

Might expect a similar result to hold for the Global optimizer of the Joint likelihood

$\left( n(\hat\theta_{GJ} - 1),\; n^{1/2} \sigma^{-1} (\hat z^{GJ}_{init} - Z_0) \right) \to_d (\hat\beta_{GJ}, \hat\alpha_{GJ}),$

where

$(\hat\beta_{GJ}, \hat\alpha_{GJ}) = \text{argmin}\; U(\beta, \alpha).$

Exact Laplace likelihood – limit results

Exact Laplace Likelihood:

$L_n(\theta, \sigma) = \int_{-\infty}^{\infty} f(\mathbf{y}_n, z_{init})\, dz_{init}$

Result 2. For the Global optimizer of the Exact likelihood,

$n(\hat\theta_{GE} - 1) \to_d \hat\beta_{GE},$

where

$\hat\beta_{GE} = \text{argmin}\; U^*(\beta),$

and U*(β) is a stochastic process defined in terms of S(t) and W(t).

In addition, for the Local optimizer of the Exact likelihood

$n(\hat\theta_{LE} - 1) \to_d \hat\beta_{LE}, \qquad \hat\beta_{LE} = \text{(local) argmin}\; U^*(\beta).$

Simulating from the limit process

Step 1. Simulate two indep sequences $(W_1, \ldots, W_m)$ and $(V_1, \ldots, V_m)$ of iid N(0,1) random variables with m = 100000.

Step 2. Form W(t) and V(t) by the partial sum processes,

$W(t) = \sum_{j=1}^{[100000 t]} W_j / \sqrt{100000}$ and $V(t) = \sum_{j=1}^{[100000 t]} V_j / \sqrt{100000}.$

Step 3. Set $S(t) = W(t) + c_1 V(t)$, where

$c_1 = \sqrt{\mathrm{Var}(Z_t) / (E|Z_0|)^2 - 1}.$

Limit process depends only on $c_1$ and f(0).

Step 4. Compute U(β,α) and U*(β) from the definition.

Step 5. Determine the respective Local and Global minimizers of Joint limit U(β,α) and Exact limit U*(β) numerically.
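Steps 1 to 3 can be sketched as follows. The function names and the seed are assumptions. The constant $c_1$ is computed so that $\mathrm{Var}\,S(1) = 1 + c_1^2 = \mathrm{Var}(Z_t)/(E|Z_0|)^2$, which matches the scaling of the partial sums of $Z_i/\sigma$ with $\sigma = E|Z_0|$. Steps 4 and 5 (evaluating and minimizing U and U*) are not attempted here.

```python
import numpy as np

def c1_from_moments(var_z, mean_abs_z):
    """c_1 such that 1 + c_1^2 = Var(Z) / (E|Z|)^2."""
    return float(np.sqrt(var_z / mean_abs_z**2 - 1.0))

def simulate_limit_paths(c1, m=100_000, seed=0):
    """Steps 1-3: discretized W, V and S = W + c1 * V on the grid t = j/m."""
    rng = np.random.default_rng(seed)
    w_incr = rng.standard_normal(m)           # Step 1: (W_1, ..., W_m)
    v_incr = rng.standard_normal(m)           #         (V_1, ..., V_m)
    t = np.arange(1, m + 1) / m
    w = np.cumsum(w_incr) / np.sqrt(m)        # Step 2: W(j/m)
    v = np.cumsum(v_incr) / np.sqrt(m)        #         V(j/m)
    s = w + c1 * v                            # Step 3: S(t) = W(t) + c1 V(t)
    return t, w, v, s

# Laplace noise with scale sigma: Var(Z) = 2 sigma^2 and E|Z| = sigma, so c1 = 1.
c1_laplace = c1_from_moments(2.0, 1.0)
# Standard Gaussian noise: Var(Z) = 1 and E|Z| = sqrt(2/pi).
c1_gauss = c1_from_moments(1.0, np.sqrt(2.0 / np.pi))

t, w, v, s = simulate_limit_paths(c1_laplace)
```

With Laplace noise the construction reduces to $S = W + V$, which is the case used in the Laplace pile-up remark later in the talk.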


Simulated realizations of the limit processes

Simulate Joint and Exact limit processes, U(β,α(β)), U*(β).

• Simulate realization of each limit process, joint and exact

• Compute local and global optima

• Repeat…

• Build up limit distribution functions

[Figure: realizations of U and U* ("U and U star") plotted against beta (-20 to 20); t(5) pdf.]

Limit cdf

[Figure: two panels, "Joint Lap Like" and "Exact Lap Like", each plotting the cdf (0.0 to 1.0) against beta.min (-20 to 20).]

red graph = Laplace pdf for $Z_t$; blue graph = Gaussian pdf for $Z_t$

Local results

Laplace noise, θ = 1, σ = 1, 1000 reps

n     statistic   Exact $\hat\theta_{LE}$   Joint $\hat\theta_{LJ}$   $\hat\sigma$
20    bias        -.0057        -.0033        -.0208
20    s.d.         .1438         .0656         .2430
20    rmse         .1439         .0657         .2438
20    asymp        .1207         .0526         .2236
50    bias         .0000         .0004         .0293
50    s.d.         .0574         .0208         .1511
50    rmse         .0574         .0208         .1539
50    asymp        .0483         .0211         .1414
100   bias         .0005        -.0003        -.0025
100   s.d.         .0303         .0107         .1000
100   rmse         .0303         .0107         .1000
100   asymp        .0241         .0105         .1000
200   bias         .0005         .0000        -.0016
200   s.d.         .0140         .0058         .0718
200   rmse         .0141         .0058         .0718
200   asymp        .0121         .0053         .0707

Simulation results: Global Exact and Global Joint

Laplace noise, θ = 1, σ = 1, 1000 reps

n     statistic   Exact $\hat\theta_{GE}$   Joint $\hat\theta_{GJ}$   Local $\hat\theta_{LJ}$
20    bias        -.047         -.050         -.003
20    rmse         .224          .213          .144
50    bias        -.013          .002          .000
50    rmse         .096          .078          .057
100   bias         .003         -.003          .000
100   rmse         .051          .034          .011
200   bias         .000          .000          .000
200   rmse         .028          .014          .006

Global Exact = MLE

Global Joint = maximize over θ and $z_{init}$

Note:

• Local dominates Global

• Joint dominates Exact (rmse is half the size)


Analysis of pile-up probabilities

Look back at realizations of the limit processes, U(β,α(β)), U*(β).

• When is there a local optimum at θ = 1?

• Check derivatives

• Negative derivative from the left

• Positive derivative from the right

• Local optimum at θ = 1

[Figure: realizations of U and U* ("U and U star") plotted against beta (-20 to 20); t(5) pdf.]

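Pile-up probabilities (Joint)

Result 3. (Local Joint Laplace likelihood)

$P(\hat\theta_{LJ} = 1) \to P(0 < Y < 1),$

where

$Y = \int_0^1 S(s)\, dW(s) - W(1) \int_0^1 S(s)\, ds + \frac{W(1)}{2 f(0)} \int_0^1 \left( W(s) - W(1)/2 \right) ds.$

Idea: look at derivatives

$P(\hat\theta_{LJ} = 1) = P\left( \lim_{\beta \uparrow 0} \frac{\partial}{\partial\beta} U_n(\beta, \hat\alpha(\beta)) < 0 \text{ and } \lim_{\beta \downarrow 0} \frac{\partial}{\partial\beta} U_n(\beta, \hat\alpha(\beta)) > 0 \right)$

$\to P\left( \lim_{\beta \uparrow 0} \frac{\partial}{\partial\beta} U(\beta, \hat\alpha(\beta)) < 0 \text{ and } \lim_{\beta \downarrow 0} \frac{\partial}{\partial\beta} U(\beta, \hat\alpha(\beta)) > 0 \right).$

Now,

$\lim_{\beta \uparrow 0} \frac{\partial}{\partial\beta} U(\beta, \hat\alpha(\beta)) = Y - 1, \qquad \lim_{\beta \downarrow 0} \frac{\partial}{\partial\beta} U(\beta, \hat\alpha(\beta)) = Y,$

and the result follows.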

Pile-up probabilities (Exact)

Result 4. (Local Exact Laplace likelihood)

$P(\hat\theta_{LE} = 1) \to P\left[ \tfrac{1}{2} < Y < 1 - \tfrac{1}{2} \right] = 0$

The pile-up probability is always zero for the Local Exact, and always positive for the Local Joint (see Result 3).

Remark. (Laplace pile-up)

If $Z_t$ has a Laplace density

$f(z) = \frac{1}{2\sigma} e^{-|z|/\sigma},$

then

$Y = \int_0^1 \left[ sW(1) - W(s) \right] dV(s) + \tfrac{1}{2},$

where W(s) and V(s) are independent standard Brownian motions.

Laplace pile-up probabilities (cont)

It follows that Local Joint has pile-up probability

$P(\hat\theta_{LJ} = 1) \to P(0 < Y < 1)$

$= P\left( 0 < \int_0^1 \left[ sW(1) - W(s) \right] dV(s) + .5 < 1 \right)$

$= E\left[ P\left( -.5 < \int_0^1 \left[ sW(1) - W(s) \right] dV(s) < .5 \;\Big|\; W(t),\, t \in [0,1] \right) \right]$

$= E\left[ 2\Phi\left( .5 \left\{ \int_0^1 \left[ sW(1) - W(s) \right]^2 ds \right\}^{-1/2} \right) - 1 \right]$

$\approx 0.820$

But no pile-up probability for Local Exact: $P(\hat\theta_{LE} = 1) \to 0$ (Result 4).

Remark:

• if Local does not pile up, Global does not pile up

• if Local does pile up, Global probably does as well
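The last expectation can be approximated by Monte Carlo, since conditional on W the stochastic integral is Gaussian with mean 0 and variance $\int_0^1 [sW(1) - W(s)]^2 ds$. The sketch below is an illustration under assumed settings: the number of replications, the grid size, the seed and the Riemann-sum discretization are not taken from the slides.

```python
import math
import numpy as np

def laplace_pileup_mc(reps=4000, m=500, seed=0):
    """Monte Carlo estimate of E[2 Phi(.5 / sqrt(int_0^1 (s W(1) - W(s))^2 ds)) - 1]."""
    rng = np.random.default_rng(seed)
    dw = rng.standard_normal((reps, m)) / np.sqrt(m)
    w = np.cumsum(dw, axis=1)                      # W(j/m), j = 1, ..., m, one path per row
    s = np.arange(1, m + 1) / m
    bridge = s * w[:, -1:] - w                     # s W(1) - W(s)
    var = np.mean(bridge**2, axis=1)               # Riemann sum for the conditional variance
    # Standard normal cdf via math.erf, vectorized over the replications.
    phi = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))
    return float(np.mean(2.0 * phi(0.5 / np.sqrt(var)) - 1.0))

estimate = laplace_pileup_mc()
print(round(estimate, 3))
```

With these settings the estimate should land near the value of about 0.820 quoted above, up to Monte Carlo and discretization error.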

Simulation results – pile-up probabilities

Pile-up probabilities for Local Joint: $P(\hat\theta_{LJ} = 1)$

n       20     50     100    200    500    ∞
Gau    .827   .859   .873   .844   .855   .858
Lap    .796   .806   .819   .819   .809   .820
Unif   .831   .864   .864   .843   .841   .836
t(5)   .796   .823   .817   .831   .846   .827

(No pile-up probabilities for Local Exact.)