

Econometrics Journal (2006), volume 9, pp. 423–447. doi: 10.1111/j.1368-423X.2006.00192.x

A comparison of alternative asymptotic frameworks to analyse a structural change in a linear time trend

AI DENG∗ AND PIERRE PERRON†

∗Bates White, LLC, 1300 Eye Street NW, Suite 600, Washington, DC 20005, E-mail: [email protected]

†Department of Economics, Boston University, 270 Bay State Road, Boston, MA, 02215, USA, E-mail: [email protected]

Received: August 2005

Summary This paper considers various asymptotic approximations to the finite sample distribution of the estimate of the break date in a simple one-break model for a linear trend function that exhibits a change in slope, with or without a concurrent change in intercept. The noise component is either stationary or has an autoregressive unit root. Our main focus is on comparing the so-called ‘bounded-trend’ and ‘unbounded-trend’ asymptotic frameworks. Not surprisingly, the ‘bounded-trend’ asymptotic framework is of little use when the noise component is integrated. When the noise component is stationary, we obtain the following results. If the intercept does not change and is not allowed to change in the estimation, both frameworks yield the same approximation. However, when the intercept is allowed to change, whether or not it actually changes in the data, the ‘bounded-trend’ asymptotic framework completely misses important features of the finite sample distribution of the estimate of the break date, especially the pronounced bimodality that was uncovered by Perron and Zhu (2005) and shown to be well captured using the ‘unbounded-trend’ asymptotic framework. Simulation experiments confirm our theoretical findings, which expose the drawbacks of using the ‘bounded-trend’ asymptotic framework in the context of structural change models.

Keywords: Change-point, Confidence intervals, Shrinking shifts, Bounded trend, Level shift.

JEL classification numbers: C22

1. INTRODUCTION

Estimating and forming confidence intervals for break dates in structural change models is an important issue of practical interest (see e.g. Perron 2006, for an extensive review). To obtain confidence intervals, the most common approach is to use the asymptotic distribution of the estimates of the break dates, though other approaches are possible such as methods based on inverting a test statistic (see e.g. Siegmund 1988). In the context of models with stationary regressors, it is well known since the work of Hinkley (1970) that the limit distribution of the maximum likelihood estimate (or other estimates) depends on the finite sample distribution of the errors. A solution to this problem is to consider an asymptotic framework whereby the magnitude of the change shrinks at a suitable rate as the sample size increases, in which case the limit distribution is invariant to the finite sample distribution of the errors (see e.g. Yao 1987; Picard 1985). For comprehensive results in linear regression models, see Bai (1997) for a single break, and Bai and Perron (1998) for the multiple break case. In general, the shrinking shift asymptotic framework provides a reliable guide unless the magnitude of the change is very small, in which case the confidence intervals are liberal, or very large, in which case they are conservative (see e.g. Bai and Perron 2005).

© Royal Economic Society 2006. Published by Blackwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA, 02148, USA.

In the context of regression models with trending regressors, there is an additional layer of complexity related to the way in which the trend is specified. Consider the case where the regression function is a simple linear trend. One approach is simply to specify the regression function as, say, $\mu + \beta t$, as done in Perron and Zhu (2005), henceforth referred to as PZ, and Bai et al. (1998). A more common approach is to specify it in the form $\mu + \beta (t/T)$, so that the trend is bounded and indeed restricted to be in the interval $[0, 1]$, see, for example, Bai (1997). This is a special case of a ‘bounded-trend’ asymptotic framework analysed by Andrews and McDermott (1995). A common argument is that the ‘bounded-trend’ asymptotic framework yields more tractable results, but little is known about which approach delivers a better approximation to the distribution of the estimates of the break dates and the parameters of the trend function.

The aim of this paper is to inquire about which asymptotic framework provides the most reliable approximation to the finite sample distribution of the estimate of the break date in a simple one-break model for a linear trend function of the form

$$y_t = \mu + \beta t + u_t,$$

where $y_t$ is a scalar observable variable and $u_t$ is a noise component that can be stationary, I(0), or have an autoregressive unit root, I(1). While a simple model, it is nevertheless an important one in practice. For example, it is often of interest to assess whether the rate of growth of a series such as real GDP exhibits a change and, if so, to form a confidence interval for the break date. Allowing the errors to be I(0) or I(1) permits the series to be trend- or difference-stationary, see, for example, Nelson and Plosser (1982) and Perron (1989).

The main findings are the following. First, if the noise function $u_t$ is I(1), the ‘bounded-trend’ asymptotic framework is of little use. This is not surprising since with a ‘bounded-trend’ specification, the noise then dominates the signal. Of more interest are the results pertaining to the case with a stationary noise function. If the intercept $\mu$ does not change concurrently with the slope $\beta$ and the fitted intercept is not allowed to change, then all frameworks are basically equivalent. However, if the fitted intercept is allowed to change, whether or not $\mu$ changes, the ‘bounded-trend’ asymptotic framework completely misses important features of the finite sample distribution of the estimate of the break date, especially the pronounced bimodality that was uncovered by PZ and shown to be well captured using the alternative ‘unbounded-trend’ asymptotic framework.

The paper is organized as follows. Section 2 presents the various models considered, while Section 3 summarizes the relevant results from PZ pertaining to the ‘unbounded-trend’ asymptotic framework. Section 4 provides results about limit distributions using the ‘bounded-trend’ asymptotic framework, with fixed and shrinking shifts. Section 5 presents simulation experiments that assess how well the various asymptotic frameworks approximate the finite sample distribution. Section 6 offers brief concluding remarks and an appendix contains some technical derivations. As a matter of notation, we let ‘$\Rightarrow$’ denote weak convergence in distribution under the sup metric, $\to_d$ denote convergence in distribution, $\to$ denote convergence of a deterministic sequence, and $W(r)$ denote the unit Wiener process.


2. THE MODELS

Throughout, it is assumed that some variable of interest, $y_t$, is the sum of some systematic part $d_t$ and a random component, $u_t$, that is,

$$y_t = d_t + u_t.$$

The models analysed differ according to the assumptions made about both components $d_t$ and $u_t$. For the random component $u_t$, we specify $E(u_t) = 0$ and alternatively one of the following two assumptions:

• Assumption 1. $u_t \sim I(0)$. More specifically, $u_t$ is such that $T^{-1/2} \sum_{t=1}^{[Tr]} u_t \Rightarrow \sigma W(r)$, where $\sigma^2 = \lim_{T \to \infty} T^{-1} E\big(\sum_{t=1}^{T} u_t\big)^2$ exists and is strictly positive.

• Assumption 2. $u_t \sim I(1)$. More specifically, $u_t = \sum_{j=1}^{t} \varepsilon_j$, where the sequence $\varepsilon_t$ is assumed to be I(0) as defined in Assumption 1.

Remark 1 There are many sets of sufficient conditions to guarantee that the weak convergence result stated in Assumption 1 holds. One that is fairly general is that used in Phillips and Perron (1988), namely (a) $\sup_t E|u_t|^{\gamma+\eta} < \infty$ for some $\gamma > 2$ and $\eta > 0$ and (b) $\{u_t\}_1^{\infty}$ is strong mixing with mixing numbers $\alpha_m$ that satisfy $\sum_{1}^{\infty} \alpha_m^{1-2/\gamma} < \infty$. Alternatively, we can assume that $u_t$ is a linear process such that $u_t = \sum_{i=0}^{\infty} c_i e_{t-i}$, where $\{e_t, \mathcal{F}_{t-1}\}$ is a martingale difference sequence with $\mathcal{F}_{t-1}$ the filtration to which $e_t$ is adapted. Also, $\sum_{i=0}^{\infty} i|c_i| < \infty$ (see Phillips and Solo 1992). Either set of conditions includes the popular stationary and invertible ARMA processes.
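As an illustration of Assumptions 1 and 2, the following minimal sketch generates an I(0) series and the corresponding I(1) series. The AR(1) specification, the Gaussian innovations, the zero starting value and the function names are assumptions made here for illustration; they are not taken from the paper. For an AR(1) with coefficient $\rho$ and innovation variance $\sigma_e^2$, the long-run variance $\sigma^2$ of Assumption 1 equals $\sigma_e^2/(1-\rho)^2$.

```python
import random


def simulate_noise(T, rho=0.5, integrated=False, seed=0):
    """Return [u_1, ..., u_T]: an AR(1) series (I(0)) or its partial sums (I(1)).

    Illustrative assumption: Gaussian AR(1) started at zero, so the I(0)
    series is only asymptotically stationary.
    """
    rng = random.Random(seed)
    eps, prev = [], 0.0
    for _ in range(T):
        prev = rho * prev + rng.gauss(0.0, 1.0)  # stable when |rho| < 1
        eps.append(prev)
    if not integrated:
        return eps
    u, total = [], 0.0
    for e in eps:  # Assumption 2: u_t = eps_1 + ... + eps_t
        total += e
        u.append(total)
    return u


def long_run_variance_ar1(rho, sigma_e=1.0):
    """sigma^2 of Assumption 1 for an AR(1): sigma_e^2 / (1 - rho)^2."""
    return sigma_e ** 2 / (1.0 - rho) ** 2
```

With the same seed, the I(1) draw is the running sum of the I(0) draw, which mirrors the way Assumption 2 is built on Assumption 1.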

For the systematic component $d_t$, we consider three cases. As a matter of notation, the time of break is denoted $T_1$ and the break fraction $\lambda = T_1/T$. We fix $\lambda$ and let $T_1 = [\lambda T]$ increase with $T$ (where $[\cdot]$ denotes the integer part). Also, $1(A)$ is the indicator function, which takes value 1 if $A$ holds and 0 otherwise.

• Models I.a and I.b: Joint broken trend with I(1) or I(0) errors. Here, $d_t$ is a first-order linear trend with a one time change in slope such that the trend function is joined at the time of break, specified by:

$$d_t = \mu_1 + \beta_1 t + \beta_b B_t,$$

where $B_t = (t - T_1)1(t > T_1)$ is a dummy variable for the slope change. Here, the slope coefficient changes from $\beta_1$ to $\beta_1 + \beta_b$ at the time point $T_1$. However, the trend function is continuous at $T_1$. For this reason, this specification is referred to as a ‘joint broken trend’.

• Models II.a and II.b: Local disjoint broken trend with I(1) or I(0) errors. Here, $d_t$ is a first-order linear trend with a one time change in intercept and slope such that in the absence of an intercept change, the trend function is joined at the time of break, that is,

$$d_t = \mu_1 + \beta_1 t + \mu_b C_t + \beta_b B_t,$$

where $C_t = 1(t > T_1)$ is a dummy variable for the level shift. Note that $\mu_b$ and $\beta_b$ capture the change in the intercept and slope coefficients. At the break point $T_1$, the slope changes by $\beta_b$ and the level shifts by $\mu_b$, which is negligible compared to the level of the series $\mu_1 + \beta_1 T_1$, hence the label ‘local disjoint segmented trend’.


• Models III.a and III.b: Global disjoint broken trend with I(1) or I(0) errors. The third specification is similar except that the trend function is not restricted to be joined at the time of break (in the absence of a change in intercept). If one wants to model a permanent shift in the level of the series such that the trend function is discontinuous at the break date even asymptotically, we can specify the DGP as

$$d_t = \mu_1 + \beta_1 t + \mu_b C_t + \beta_b B_t^{dj}, \qquad (1)$$

where $B_t^{dj} = t\,1(t > T_1)$. Note that (1) can be written in terms of the parameters of Model II as

$$d_t = \mu_1 + \beta_1 t + (\mu_b + \beta_b T_1) C_t + \beta_b B_t, \qquad (2)$$

which highlights the fact that the key difference is an intercept shift that gets large as the sample size increases. We label this model as a ‘global disjoint segmented trend’ since, in contrast to the previous ‘local disjoint segmented trend’, the implied (relative to the overall level of the trend function) level shift at the break date converges to $\beta_b/\beta_1 \neq 0$ as $T \to \infty$, since $d_{T_1+1} - d_{T_1} = \beta_1 + \mu_b + \beta_b (T_1 + 1)$. Note that Model III is the one used by Bai et al. (1998) with shrinking shifts.

Hence, we have six different models labelled as follows: (I.a) joint broken trend with I(1) errors; (I.b) joint broken trend with I(0) errors; (II.a) local disjoint broken trend with I(1) errors; (II.b) local disjoint broken trend with I(0) errors; (III.a) global disjoint broken trend with I(1) errors; and (III.b) global disjoint broken trend with I(0) errors. Note that, in empirical applications, using Model II or III yields exactly the same results for the estimates of the parameters $T_1$, $\mu_1$, $\beta_1$ and $\beta_b$. Nevertheless, the two specifications yield drastically different asymptotic results, in particular pertaining to the rate of convergence and the asymptotic distribution of the estimate of the break date. Limiting results obtained from Model II (local disjoint broken trend) provide good approximations to the finite sample distributions when the shift in level is small, while those from Model III will do so when the shift in level is large. Hence, both asymptotic frameworks are complementary.
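The link between equations (1) and (2) can be checked numerically. The sketch below is only illustrative: the function names and parameter values are hypothetical choices made here. It evaluates the Model III trend and the Model II trend with level shift $\mu_b + \beta_b T_1$ on the same dates, and also computes the jump $d_{T_1+1} - d_{T_1}$.

```python
def trend_model_III(t, T1, mu1, beta1, mu_b, beta_b):
    """Equation (1): d_t = mu1 + beta1*t + mu_b*C_t + beta_b*Bdj_t."""
    C = 1.0 if t > T1 else 0.0
    B_dj = t * C
    return mu1 + beta1 * t + mu_b * C + beta_b * B_dj


def trend_model_II(t, T1, mu1, beta1, mu_b, beta_b):
    """Model II: d_t = mu1 + beta1*t + mu_b*C_t + beta_b*B_t."""
    C = 1.0 if t > T1 else 0.0
    B = (t - T1) * C
    return mu1 + beta1 * t + mu_b * C + beta_b * B


# Hypothetical values, chosen only for the check.
T, T1 = 100, 40
pars = dict(mu1=1.0, beta1=0.2, mu_b=-0.5, beta_b=0.05)

# Equation (2): Model III is Model II with level shift mu_b + beta_b*T1.
shifted = dict(pars, mu_b=pars["mu_b"] + pars["beta_b"] * T1)
max_gap = max(
    abs(trend_model_III(t, T1, **pars) - trend_model_II(t, T1, **shifted))
    for t in range(1, T + 1)
)

# Jump at the break date: beta1 + mu_b + beta_b*(T1 + 1).
jump = trend_model_III(T1 + 1, T1, **pars) - trend_model_III(T1, T1, **pars)
```

Here `max_gap` is zero up to rounding error, and `jump` grows linearly in `T1`, which is the sense in which the Model III level shift becomes large with the sample size.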

All specifications discussed can be expressed in matrix notation as

$$Y = X_{T_1} \gamma + U,$$

where $Y' = [y_1, \ldots, y_T]$, $U' = [u_1, \ldots, u_T]$, $X'_{T_1} = [x(T_1)_1, \ldots, x(T_1)_T]$, and where, for Models I, $x(T_1)'_t = [1, t, B_t]$, $\gamma' = (\mu_1, \beta_1, \beta_b)$, for Models II, $x(T_1)'_t = [1, t, C_t, B_t]$, $\gamma' = (\mu_1, \beta_1, \mu_b, \beta_b)$, and for Models III, $x(T_1)'_t = [1, t, C_t, B_t^{dj}]$, $\gamma' = (\mu_1, \beta_1, \mu_b, \beta_b)$. Note that the matrix $X_{T_1}$ depends on the postulated value of the break date $T_1$. Since the parameters are assumed to be obtained using a global least-squares criterion, the estimate of the break date is

$$\hat{T}_1 = \arg\min_{T_1 \in \mathcal{T}_1} Y'(I - P_{T_1})Y,$$

where $\mathcal{T}_1 = \{[\pi T_1], \ldots, [(1 - \pi)T_1]\}$ for some $0 < \pi < 1$, and where $P_{T_1}$ is the projection matrix constructed using $X_{T_1}$, that is, $P_{T_1} = X_{T_1}(X'_{T_1} X_{T_1})^{-1} X'_{T_1}$. Denoting by $X_{\hat{T}_1}$ the matrix $X$ constructed using the least-squares estimate of the break date $\hat{T}_1$, the least-squares estimate of the coefficients $\gamma$ is $\hat{\gamma} = (X'_{\hat{T}_1} X_{\hat{T}_1})^{-1} X'_{\hat{T}_1} Y$ and the resulting sum of squared residuals is, for an estimated break fraction $\hat{\lambda} = \hat{T}_1/T$,

$$SSR(\hat{\lambda}) = \sum_{t=1}^{T} \hat{u}_t^2 = \sum_{t=1}^{T} \big(y_t - x(\hat{T}_1)'_t \hat{\gamma}\big)^2 = Y'(I - P_{\hat{T}_1})Y,$$


where $P_{\hat{T}_1}$ is the projection matrix associated with $X_{\hat{T}_1}$, that is, $P_{\hat{T}_1} = X_{\hat{T}_1}(X'_{\hat{T}_1} X_{\hat{T}_1})^{-1} X'_{\hat{T}_1}$.

The true values of the unknown coefficients will be denoted with a 0 superscript, that is, $\gamma^0 = (\mu_1^0, \beta_1^0, \mu_b^0, \beta_b^0)'$, $T_1^0$, $\lambda^0 = T_1^0/T$; $X_{T_1^0}$ is the matrix of regressors constructed using the true value $T_1^0$ for the break date, and $P_{T_1^0}$ is the associated projection matrix, that is, $P_{T_1^0} = X_{T_1^0}(X'_{T_1^0} X_{T_1^0})^{-1} X'_{T_1^0}$. So the true data generating process is assumed to be $Y = X_{T_1^0} \gamma^0 + U$. We assume throughout that $\beta_b^0 \neq 0$ and $\lambda^0 \in (0, 1)$.
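The global least-squares estimate of the break date can be sketched as a grid search over candidate dates. The code below is a minimal illustration for Model II, not the authors' implementation. It uses the fact that the regressors $[1, t, C_t, B_t]$ span the same space as two separate straight lines, one per regime, so the SSR for a candidate $T_1$ is the sum of two simple-regression SSRs. The trimming fraction of 0.15, the search range based on $T$, and all function names are assumptions made here.

```python
import random


def ssr_line(ts, ys):
    """Sum of squared residuals from an OLS fit of ys on a constant and ts."""
    n = len(ts)
    tbar, ybar = sum(ts) / n, sum(ys) / n
    stt = sum((t - tbar) ** 2 for t in ts)
    sty = sum((t - tbar) * (y - ybar) for t, y in zip(ts, ys))
    syy = sum((y - ybar) ** 2 for y in ys)
    return syy - sty ** 2 / stt


def estimate_break_model_II(y, trim=0.15):
    """Return (T1_hat, SSR) minimizing the Model II SSR over trimmed dates."""
    T = len(y)
    ts = list(range(1, T + 1))
    best_T1, best_ssr = None, float("inf")
    for T1 in range(int(trim * T), int((1 - trim) * T) + 1):
        # Observations 1..T1 form the first regime, T1+1..T the second.
        ssr = ssr_line(ts[:T1], y[:T1]) + ssr_line(ts[T1:], y[T1:])
        if ssr < best_ssr:
            best_T1, best_ssr = T1, ssr
    return best_T1, best_ssr


def simulate_model_II(T, T1, mu_b, beta_b, sigma, rng):
    """Model II.b data with mu_1 = beta_1 = 0 and i.i.d. N(0, sigma^2) errors."""
    return [
        (mu_b + beta_b * (t - T1)) * (t > T1) + rng.gauss(0.0, sigma)
        for t in range(1, T + 1)
    ]
```

Repeating `estimate_break_model_II(simulate_model_II(...))` over many replications gives an empirical distribution of $\hat{T}_1$ of the kind studied in the simulation section.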

3. SUMMARY OF ASYMPTOTIC RESULTS FOR THE UNBOUNDED-TREND CASE

For ease of comparison with the results presented below for the ‘bounded-trend’ asymptotic framework, we now summarize the relevant results from PZ pertaining to the limit distribution of $\hat{T}_1$ for each case.

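Theorem 1 (1) In Model I.a, $\sqrt{T}(\hat{\lambda} - \lambda^0) \to_d N(0, 2\sigma^2/(15(\beta_b^0)^2))$;

(2) In Model I.b, $T^{3/2}(\hat{\lambda} - \lambda^0) \to_d N(0, 4\sigma^2/[\lambda^0(1 - \lambda^0)(\beta_b^0)^2])$;

(3) For Model II.a, define

$$\xi_1(\lambda^0) \equiv \left[ \int_0^1 W(r)\,dr,\ \int_0^1 r W(r)\,dr,\ \int_{\lambda^0}^1 W(r)\,dr,\ \int_{\lambda^0}^1 (r - \lambda^0) W(r)\,dr \right]',$$

$$\xi_2(\lambda^0) = \left[ 0,\ 0,\ W(\lambda^0),\ \int_{\lambda^0}^1 W(r)\,dr \right]', \qquad \xi_3(\lambda^0) \equiv \int_0^{\lambda^0} \frac{3r^2 - 2r\lambda^0}{(\lambda^0)^2}\,dW(r),$$

$$\xi_4(\lambda^0) \equiv \int_{\lambda^0}^1 \frac{(r - 1)(3r - 2\lambda^0 - 1)}{(1 - \lambda^0)^2}\,dW(r),$$

$$\Omega_1(\lambda^0) \equiv \begin{bmatrix}
\dfrac{4}{\lambda^0} & -\dfrac{6}{(\lambda^0)^2} & \dfrac{2}{\lambda^0} & \dfrac{6}{(\lambda^0)^2} \\[2ex]
-\dfrac{6}{(\lambda^0)^2} & \dfrac{12}{(\lambda^0)^3} & -\dfrac{6}{(\lambda^0)^2} & -\dfrac{12}{(\lambda^0)^3} \\[2ex]
\dfrac{2}{\lambda^0} & -\dfrac{6}{(\lambda^0)^2} & \dfrac{4}{\lambda^0(1 - \lambda^0)} & \dfrac{6(1 - 2\lambda^0)}{(\lambda^0)^2(1 - \lambda^0)^2} \\[2ex]
\dfrac{6}{(\lambda^0)^2} & -\dfrac{12}{(\lambda^0)^3} & \dfrac{6(1 - 2\lambda^0)}{(\lambda^0)^2(1 - \lambda^0)^2} & \dfrac{12\,(3(\lambda^0)^2 - 3\lambda^0 + 1)}{(\lambda^0)^3(1 - \lambda^0)^3}
\end{bmatrix},$$

$$\Omega_2(\lambda^0) \equiv \begin{bmatrix}
-\dfrac{4}{(\lambda^0)^2} & \dfrac{12}{(\lambda^0)^3} & -\dfrac{2}{(\lambda^0)^2} & -\dfrac{12}{(\lambda^0)^3} \\[2ex]
\dfrac{12}{(\lambda^0)^3} & -\dfrac{36}{(\lambda^0)^4} & \dfrac{12}{(\lambda^0)^3} & \dfrac{36}{(\lambda^0)^4} \\[2ex]
-\dfrac{2}{(\lambda^0)^2} & \dfrac{12}{(\lambda^0)^3} & \dfrac{4(2\lambda^0 - 1)}{(\lambda^0)^2(1 - \lambda^0)^2} & \dfrac{12\,(3(\lambda^0)^2 - 3\lambda^0 + 1)}{(\lambda^0)^3(\lambda^0 - 1)^3} \\[2ex]
-\dfrac{12}{(\lambda^0)^3} & \dfrac{36}{(\lambda^0)^4} & \dfrac{12\,(3(\lambda^0)^2 - 3\lambda^0 + 1)}{(\lambda^0)^3(\lambda^0 - 1)^3} & \dfrac{36\,(4(\lambda^0)^3 - 6(\lambda^0)^2 + 4\lambda^0 - 1)}{(\lambda^0)^4(\lambda^0 - 1)^4}
\end{bmatrix}.$$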


Also define $Z(m)$ as follows: $Z(0) = 0$, $Z(m) = Z_1(m)$ for $m < 0$ and $Z(m) = Z_2(m)$ for $m > 0$, with

$$Z_1(m) = (\beta_b^0)^2 |m|^3/3 + m^2 \sigma \beta_b^0 \xi_4 + m \sigma^2 \left[ 2\xi_2' \Omega_1 \xi_1 - \xi_1' \Omega_2 \xi_1 \right], \quad m < 0,$$

$$Z_2(m) = (\beta_b^0)^2 |m|^3/3 + m^2 \sigma \beta_b^0 \xi_3 + m \sigma^2 \left[ 2\xi_2' \Omega_1 \xi_1 - \xi_1' \Omega_2 \xi_1 \right], \quad m > 0.$$

Then, $\sqrt{T}(\hat{\lambda} - \lambda^0) \to_d \arg\min_m Z(m)$.

(4) For Model II.b, define a stochastic process $S(m)$ on the set of integers as follows: $S(0) = 0$, $S(m) = S_1(m)$ for $m < 0$ and $S(m) = S_2(m)$ for $m > 0$, with

$$S_1(m) = \sum_{k=m+1}^{0} (\mu_b^0 + \beta_b^0 k)^2 - 2 \sum_{k=m+1}^{0} (\mu_b^0 + \beta_b^0 k) u_k, \quad m = -1, -2, \ldots,$$

$$S_2(m) = \sum_{k=1}^{m} (\mu_b^0 + \beta_b^0 k)^2 + 2 \sum_{k=1}^{m} (\mu_b^0 + \beta_b^0 k) u_k, \quad m = 1, 2, \ldots.$$

If $\{u_t\}$ is strictly stationary with a continuous distribution, $S$ is a two-sided random walk with drift, and $T(\hat{\lambda} - \lambda^0) \to_d \arg\min_m S(m)$.

(5) In Models III.a and III.b, $|\hat{\lambda} - \lambda^0| = o_p(T^{-3})$.
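The limit in part (4) can be explored by simulation. The sketch below draws the two-sided process $S(m)$ and returns its minimizer. It assumes i.i.d. $N(0, \sigma^2)$ errors and truncates the search to $|m| \le M$; the truncation, the function name and the defaults are assumptions made here for illustration and are not part of the theorem.

```python
import random


def argmin_S(mu_b, beta_b, sigma, M=50, rng=random):
    """Draw arg min over |m| <= M of S(m), with i.i.d. N(0, sigma^2) errors."""
    best_m, best_val = 0, 0.0  # S(0) = 0

    # Right branch S_2(m): add the k = m term at each step.
    s = 0.0
    for m in range(1, M + 1):
        c = mu_b + beta_b * m
        s += c * c + 2.0 * c * rng.gauss(0.0, sigma)
        if s < best_val:
            best_m, best_val = m, s

    # Left branch S_1(m): the sum runs over k = m + 1, ..., 0,
    # so adding the term for k yields S_1(k - 1).
    s = 0.0
    for k in range(0, -M, -1):
        c = mu_b + beta_b * k
        s += c * c - 2.0 * c * rng.gauss(0.0, sigma)
        if s < best_val:
            best_m, best_val = k - 1, s

    return best_m
```

Collecting many draws, for example `[argmin_S(-0.3, 0.05, 0.1 ** 0.5, rng=random.Random(0)) for _ in range(5000)]`, gives a Monte Carlo approximation to the limit distribution of $T(\hat{\lambda} - \lambda^0)$ under these assumptions.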

For a detailed discussion of these results, see PZ. However, some important features need to be stressed to understand some comparisons to be made later. First, for Models I.a and I.b, the limiting distributions of the estimate of the break date do not depend on the structure of the errors and remain the same irrespective of the nature of the serial correlation. This is in stark contrast to results obtained in a stationary context, in which case the limiting distribution of the estimate of the break date, in this fixed shift case, not only depends on the properties of the residuals but in particular on their exact distribution (see e.g. Bai, 1997). With shrinking shifts, the exact distribution of the errors is no longer present but the nature of the serial correlation still affects the limit distribution. For Model I with an ‘unbounded-trend’ asymptotic framework, there is no need to resort to shrinking shifts asymptotic approximations. As expected, the rate of convergence is higher with I(0) errors. The limit distributions depend on the exact nature of the errors only for Model II with I(0) errors. Most importantly, comparing the results for Models I and II [with either I(1) or I(0) errors], the level shift plays an important role in the limiting distribution of the estimated break date. Suppose that the data generating process specifies no level shift, that is, $\mu_b^0 = 0$. In Model I, no level shift is allowed in the regression while in Model II it is allowed via the regressor $C_t$. The results show that introducing such an irrelevant regressor changes the rate of convergence of the estimated break date and its asymptotic distribution, which shows strong bimodality as shown in PZ.

4. BOUNDED-TREND ASYMPTOTIC RESULTS

To describe the bounded trend asymptotic framework, we follow the treatment of Andrews and McDermott (1995). Let $T^*$ denote the actual size of the sample available. The idea is to embed the series of interest $(y_1, \ldots, y_{T^*})$ in a triangular array of random variables such that the shape of the trend function is mimicked in the limit but its magnitude remains bounded. With the triangular array denoted $\{y_{T,1}, \ldots, y_{T,T}\}$ with $y_{T,t} = d_{T,t} + u_t$, the number of rows is $T$ and this value is increased to infinity when doing the asymptotics. The embedding is achieved by specifying a trend function of the form $d_{T,t} = d_{tT^*/T}$ so that $d_{T^*,t} = d_t$ for all $t \le T$. In our cases, this leads to trend functions of the following forms. For Models I,

$$d_t = \mu_1 + \beta_1 \frac{t}{T} T^* + \beta_b \frac{B_t}{T} T^*,$$

for Models II:

$$d_t = \mu_1 + \beta_1 \frac{t}{T} T^* + \mu_b C_t + \beta_b \frac{B_t}{T} T^*,$$

and for Models III:

$$d_t = \mu_1 + \beta_1 \frac{t}{T} T^* + \mu_b C_t + \beta_b \frac{B_t^{dj}}{T} T^*.$$
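The effect of the embedding can be seen in a small numerical sketch for Model I. The function names and parameter values below are hypothetical and chosen only for illustration. When $T = T^*$ the bounded and unbounded trends coincide, while for larger $T$ the bounded trend keeps the same shape and the same end value.

```python
def bounded_trend_model_I(t, T, T_star, lam, mu1, beta1, beta_b):
    """Bounded-trend Model I: mu1 + beta1*(t/T)*T* + beta_b*(B_t/T)*T*."""
    T1 = int(lam * T)  # break date T1 = [lam * T]
    B = (t - T1) if t > T1 else 0
    return mu1 + beta1 * (t / T) * T_star + beta_b * (B / T) * T_star


def unbounded_trend_model_I(t, T, lam, mu1, beta1, beta_b):
    """Unbounded-trend Model I: mu1 + beta1*t + beta_b*B_t."""
    T1 = int(lam * T)
    B = (t - T1) if t > T1 else 0
    return mu1 + beta1 * t + beta_b * B


# Hypothetical values: T* = 100 observations, break at mid-sample.
T_star, lam = 100, 0.5
mu1, beta1, beta_b = 0.0, 0.02, 0.01

# End-of-sample trend level as the number of rows T grows.
end_bounded = [bounded_trend_model_I(T, T, T_star, lam, mu1, beta1, beta_b)
               for T in (100, 1000, 10000)]
end_unbounded = [unbounded_trend_model_I(T, T, lam, mu1, beta1, beta_b)
                 for T in (100, 1000, 10000)]
```

In this example `end_bounded` stays at the same level for every $T$, whereas `end_unbounded` grows with $T$. With I(1) noise, whose scale grows with the sample size, this is why the signal in the bounded trend ends up dominated.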

4.1. Fixed shifts

We first consider the limit distributions of the estimate $\hat{T}_1 \equiv T\hat{\lambda}$ when the magnitudes of the change coefficients $(\mu_b^0, \beta_b^0)$ are fixed. The results are presented in the following Theorem, proved in the appendix.

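Theorem 2 (1) For Model I.a, with $\Lambda = (\pi, 1 - \pi)$, $\hat{\lambda} \to_d \arg\max_{\lambda \in \Lambda} m(\lambda)' Q(\lambda) m(\lambda)$, where

$$m(\lambda) \equiv \left( \int_0^1 W(r)\,dr,\ \int_0^1 r W(r)\,dr,\ \int_{\lambda}^1 (r - \lambda) W(r)\,dr \right)',$$

$$Q(\lambda) = \begin{pmatrix}
1 & \dfrac{1}{2} & \dfrac{(1 - \lambda)^2}{2} \\[2ex]
\dfrac{1}{2} & \dfrac{1}{3} & \dfrac{(1 - \lambda)^2(\lambda + 2)}{6} \\[2ex]
\dfrac{(1 - \lambda)^2}{2} & \dfrac{(1 - \lambda)^2(\lambda + 2)}{6} & \dfrac{(1 - \lambda)^3}{3}
\end{pmatrix}.$$

(2) For Model I.b, $\sqrt{T}(\hat{\lambda} - \lambda^0) \to_d N(0, 4\sigma^2/[\lambda^0(1 - \lambda^0)(T^* \beta_b^0)^2])$.

(3) For Model II.a, $\hat{\lambda} \to_d \arg\max_{\lambda \in \Lambda} \xi_1(\lambda)' \Omega_1(\lambda) \xi_1(\lambda)$ with $\xi_1(\cdot)$ and $\Omega_1(\cdot)$ as defined in Theorem 1(3).

(4) For Model II.b, define a stochastic process $S^*(m)$ on the set of integers as follows: $S^*(0) = 0$, $S^*(m) = S_1^*(m)$ for $m < 0$ and $S^*(m) = S_2^*(m)$ for $m > 0$, with

$$S_1^*(m) = |m| \mu_b^0 - 2 \sum_{k=m+1}^{0} u_k, \quad m = -1, -2, \ldots,$$

$$S_2^*(m) = m \mu_b^0 + 2 \sum_{k=1}^{m} u_k, \quad m = 1, 2, \ldots.$$

If $\{u_t\}$ is strictly stationary with a continuous distribution, $S^*$ is a two-sided random walk with drift, and $T(\hat{\lambda} - \lambda^0) \to_d \arg\min_m S^*(m)$.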


(5) For Model III.a, $\hat{\lambda} \Rightarrow \arg\max_{\lambda \in \Lambda} \left\{ 3\left( \int_0^1 r W(r)\,dr \right)^2 / \lambda^3 + 3\left( \int_{\lambda}^1 r W(r)\,dr \right)^2 / [1 - \lambda^3] \right\}$.

(6) For Model III.b, define a stochastic process $Z^*(m)$ on the set of integers as follows: $Z^*(0) = 0$, $Z^*(m) = Z_1(m)$ for $m < 0$ and $Z^*(m) = Z_2(m)$ for $m > 0$, with

$$Z_1(m) = |m| (\mu_b^0 + \beta_b^0 T^* \lambda^0) - 2 \sum_{k=m+1}^{0} u_k, \quad m = -1, -2, \ldots,$$

$$Z_2(m) = m (\mu_b^0 + \beta_b^0 T^* \lambda^0) + 2 \sum_{k=1}^{m} u_k, \quad m = 1, 2, \ldots.$$

If $\{u_t\}$ is strictly stationary with a continuous distribution, $Z^*$ is a two-sided random walk with drift, and $T(\hat{\lambda} - \lambda^0) \to_d \arg\min_m Z^*(m)$.

Consider first the cases with I(1) errors (I.a, II.a and III.a). In all cases, the estimate of the break fraction is not consistent. Moreover, the limit distributions do not involve any parameters of the model, in particular the magnitude of the change in slope. Also, these limit distributions are those obtained assuming no change in the trend function. This is fairly intuitive since the trend function is bounded while the noise component, being I(1), is stochastically unbounded. Hence, the noise component dominates any signal in the trend. As shown in PZ, the ‘unbounded-trend’ asymptotic distribution provides a good approximation to the finite sample distribution in the case of Model I.a. For Model II.a, it fails to deliver a limit distribution that involves the magnitude of the intercept shift, which influences greatly the finite sample distribution. However, for this case, PZ provides a stochastic expansion which provides a very accurate approximation.

Consider now the cases with I(0) errors. First, in the case of Model I.b, for which no intercept shift is present or permitted, the results are the same as in the ‘unbounded-trend’ asymptotics. The reason is that the mapping from the asymptotic result to the approximation of the finite sample distribution is done by setting $T^* = T$. It is then easy to see that the same approximation applies in both asymptotic frameworks. Things are, however, very different when the intercept is allowed to change, as in Models II and III.

Consider the limit distribution for Model II.b. Of special relevance is the fact that the limit distribution in the ‘bounded-trend’ asymptotic framework is independent of the value of the slope change $\beta_b^0$, unlike for the ‘unbounded-trend’ asymptotic distribution. More interestingly, the two limit distributions would be the same if the slope change is zero (though strictly, the ‘unbounded-trend’ limit distribution is valid only if the change in slope is non-zero). This insensitivity of the limit distribution to the value of the change in slope implies automatically a bad finite sample approximation unless the slope change is very small (since the finite sample distribution is highly sensitive to changes in $\beta_b^0$).

The limit distribution in Model III.b does depend on the value of the slope shift $\beta_b^0$. However, it is easy to see that it is the same as for Model II.b, but with a level shift $\mu_b^0 + \beta_b^0 T^* \lambda^0$ instead of $\mu_b^0$. The reason for this is that we can write Model III in the form of Model II with the corresponding change in the level shift, see equation (2). What transpires from these results is that once allowance is made for a level shift, the importance of the slope shift is masked and no longer has a first-order effect on the limit distribution. This contrasts sharply with the ‘unbounded-trend’ asymptotic framework, whereby both the slope and level shifts influence the shape of the limit distribution.

The most intriguing feature of the ‘bounded-trend’ asymptotic result is the following. Consider, for simplicity, the case where the errors $u_t$ are i.i.d. and suppose the true process involves a trend with a change in slope but with both segments joined at the time of break. If one uses Model I, where no allowance is made for a concurrent level shift at the time of break, the asymptotic approximation is $T^{3/2}(\hat{\lambda} - \lambda^0) \approx N(0, 4\sigma^2/[\lambda^0(1 - \lambda^0)(\beta_b^0)^2])$, the same as for the ‘unbounded-trend’ asymptotic framework. Consider now simply allowing for the possibility of a level shift, i.e., using the regression pertaining to Model II. Then the asymptotic approximation is $T(\hat{\lambda} - \lambda^0) \approx \arg\min_m 2W^*(m)$ with $W^*(m)$ a two-sided random walk defined by $W^*(m) = -\sum_{k=m+1}^{0} u_k$ for $m < 0$ and by $W^*(m) = \sum_{k=1}^{m} u_k$ for $m > 0$. Hence, introducing an irrelevant level shift regressor not only reduces the rate of convergence (as in the ‘unbounded-trend’ asymptotic framework) but also completely eliminates the influence of the magnitude of the change in slope $\beta_b^0$ (unlike what occurs in the ‘unbounded-trend’ asymptotic framework).

4.2. Shrinking shifts

We next consider the bounded-trend asymptotic framework with shrinking shifts specified as follows:

• Assumption 3. Let $\delta_T = (\mu_b, \beta_b)$ and $\delta^0 = (\mu_b^0, \beta_b^0) \neq 0$, then $\delta_T = \delta^0 v_T$, where $v_T$ is a positive number such that $v_T \to 0$, and $T^{1/2-\alpha} v_T \to \infty$ for some $\alpha \in (0, 1/2)$.

In what follows, for reasons discussed above, we only consider the case with I(0) errors. The limit distributions of $\hat{T}_1$ are stated in the following Theorem, whose proof is omitted as the arguments closely parallel those for the proof of Theorem 1.

Theorem 3 Under Assumption 3, we have,

(1) For Model I.b, $\sqrt{T} v_T (\hat{\lambda} - \lambda^0) \to_d N(0, 4\sigma^2/[\lambda^0(1 - \lambda^0)(T^* \beta_b^0)^2])$.

(2) For Model II.b, define a stochastic process $H(m)$ as follows: $H(0) = 0$, $H(m) = H_1(m)$ for $m < 0$ and $H(m) = H_2(m)$ for $m > 0$, with

$$H_1(m) = |m| \mu_b^0 - 2\psi_1 W_1(-m), \quad m \le 0,$$

$$H_2(m) = m \mu_b^0 + 2\psi_2 W_2(m), \quad m > 0,$$

where $W_i$ ($i = 1, 2$) are two independent standard Wiener processes defined on $[0, \infty)$, $\psi_1^2 = \lim E\big(T_1^{-1/2} \sum_{t=1}^{T_1} u_t\big)^2$ and $\psi_2^2 = \lim E\big((T - T_1)^{-1/2} \sum_{t=T_1+1}^{T} u_t\big)^2$. If $\{u_t\}$ is strictly stationary with a continuous distribution, $H$ is a two-sided random walk with drift, and $T v_T^2 (\hat{\lambda} - \lambda^0) \to_d \arg\min_m H(m)$.

(3) For Model III.b, we have from Bai (1997),

$$\frac{\delta_T' g(\lambda^0) g(\lambda^0)' \delta_T}{\psi_1^2}\, T(\hat{\lambda} - \lambda^0) \to_d \arg\max_s J(s),$$

where

$$J(s) = \begin{cases} W_1(-s) - |s|/2, & \text{if } s \le 0, \\ \sqrt{\psi_2/\psi_1}\, W_2(s) - |s|/2, & \text{if } s > 0, \end{cases}$$

with $\psi_1$ and $\psi_2$ defined as above, and $g(\lambda^0 T^*) = (1, \lambda^0 T^*)'$.


For Model I.b, the results do not change qualitatively: the limit distribution is the same, only the rate of convergence is slower, as expected (the limit distribution was already independent of the finite sample distribution of the errors, so taking shrinking shifts is not expected to change that back). In Models II and III, using shrinking shifts effectively implies limit distributions that are no longer dependent on the finite sample distribution of the errors. But now, the slope shift parameter $\beta_b^0$ no longer influences even the limit distribution for Model III. Since changes in this parameter have an important effect on the finite sample distribution, it can be concluded that using a shrinking shift asymptotic framework is of no help in this ‘bounded-trend’ asymptotic framework.

5. SIMULATIONS

In this section, we use simulations to assess the adequacy of the approximations for the various asymptotic distributions. Given the theoretical results, our focus is on Model II.b, which involves I(0) errors and both fitted slope and intercept changes. Throughout, the number of replications is 5,000 and all graphs for the probability density functions are obtained from the empirical distributions of the estimates using a kernel smoothing method as in PZ.¹ We consider two sets of simulations. The first uses i.i.d. errors and a wide grid for the values of the slope and intercept changes. The second considers processes with a noise component that is serially correlated and the generated data are calibrated to historical log real per capita GDP series for a variety of countries.

5.1. Base case simulations

Our first set of simulations is aimed to assess the extent to which the various approximationsperform for a range of values for the slope and intercept changes. To that effect we simulate datafrom Model II with i.i.d. N (0, σ 2) errors and the following combinations for the parameters ofinterest:

μ0b = [0, −0.1, −0.3, −0.5, −1.0] ,

β0b = [−0.1, −0.05, −0.03, 0.03, 0.05, 0.1].

The other parameters are set to $\mu_1 = \beta_1 = 0$ (without loss of generality, since the estimates are invariant to the value of these parameters) and $\sigma^2 = 0.1$. Since the errors are i.i.d. Normal, the fixed and shrinking shifts asymptotic distributions are the same. Hence, we make comparisons between the finite sample distribution, the ‘unbounded-trend’ asymptotic approximation (labelled PZ) from Model II.b, and the ‘bounded-trend’ asymptotic approximation (labelled BT), also from Model II.b. The range of values used for $\mu_b^0$ and $\beta_b^0$ is intended to be representative of values likely to be encountered in practice. For instance, the values for $\beta_b^0/\sigma$ range from $-0.32$ to $+0.32$. If one considers a change in the slope of a time series in logarithmic form, the upper bound is a 32% change, which is very large. For the historical real per capita GDP series analysed below, the values range from 0 to 0.18. For the intercept changes, the grid used implies that $|\mu_b^0/\sigma| \le 3.16$. For the GDP series analysed below, the value of this quantity ranges from 0.9 to −3.0. As we shall see, a level shift of −3.0 is very large and implies a nearly degenerate distribution of the estimate of the break date at the true value. Hence, our grid is likely to cover most cases of interest for economic applications. Of course, this being a limited simulation design, we cannot claim complete generality for our results.

¹ That is, for a given set of statistics, say $\{X_i\}_{i=1,\ldots,N}$, the pdf at value $x$ is estimated by $f(x) = (N h_x)^{-1} \sum_{i=1}^{N} K((x - X_i)/h_x)$, where $K(\cdot)$ is the kernel function and $h_x$ is the bandwidth. In our case, $N = 5{,}000$ and we use the standard Normal distribution as the kernel function. Since the estimates of the break date are discrete integers, the cross-validation method for choosing the optimal bandwidth does not work well in this case. As a rule of thumb, we simply let $h_x = 0.3\sigma_x$, where $\sigma_x$ is the estimated standard deviation of a given sample of statistics $\{X_i\}_{i=1,\ldots,N}$.
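The kernel estimate described in footnote 1 can be sketched as follows. This is only an illustration under stated assumptions: the function name, the evaluation grid and the artificial integer-valued draws are hypothetical, and the sample standard deviation with `ddof=1` is one possible reading of "estimated standard deviation".

```python
import numpy as np

def gaussian_kde_rule_of_thumb(samples, grid, scale=0.3):
    """Estimate f(x) = (N*h)^(-1) * sum_i K((x - X_i)/h) on a grid,
    with a standard Normal kernel K and bandwidth h = scale * std(samples)."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    h = scale * samples.std(ddof=1)  # rule of thumb h_x = 0.3 * sigma_x
    z = (grid[:, None] - samples[None, :]) / h
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (n * h)

rng = np.random.default_rng(0)
# Hypothetical stand-in for 5,000 integer-valued break date estimation errors.
draws = np.rint(rng.normal(0.0, 5.0, size=5000))
grid = np.linspace(-30.0, 30.0, 601)
density = gaussian_kde_rule_of_thumb(draws, grid)
# Riemann-sum check that the estimated density integrates to roughly one.
area = float(np.sum(density) * (grid[1] - grid[0]))
print(round(area, 3))
```

A fixed multiple of the standard deviation avoids the cross-validation step that, as the footnote notes, behaves poorly when the data are integers.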

The results are presented in Figures 1 to 5. They confirm the limited simulations reported in PZ to the effect that the ‘unbounded-trend’ asymptotic distribution provides a very good approximation. In particular, when the level shift is small, the finite sample distribution is bimodal; the ‘unbounded-trend’ asymptotic distribution captures this feature well, while the ‘bounded-trend’ asymptotic distribution misses it, along with most other aspects of the finite sample distribution. The ‘bounded-trend’ approximation performs best when the intercept shift is large, in which case the bimodal nature of the finite sample distribution is the weakest, though the ‘unbounded-trend’ limit distribution still performs better. With a very large level shift, $\mu_b^0 = -1$, the estimates are very precise and all distributions are nearly degenerate at the true value. When the slope change is small, both approximations are nearly identical, but for a large slope change the ‘unbounded-trend’ asymptotic distribution is again a superior approximation, especially when the changes in level and slope are of opposite sign. We also performed experiments with positive values for the level shift. The results are qualitatively similar, except that the position of the dominant mode is inverted.
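A minimal sketch of how a finite sample distribution of this kind could be simulated is given below. It assumes the usual PZ regressors, which are not restated in this section: $C_t = 1(t > T_1)$ and $B_t = (t - T_1)1(t > T_1)$. The sample size $T = 100$, the true break date at mid-sample, the 10-observation trimming, the number of replications and all function names are hypothetical choices made for illustration, not values taken from the paper.

```python
import numpy as np

def design(T, T1):
    """Regressors (1, t, C_t, B_t) for a candidate break date T1 (assumed form)."""
    t = np.arange(1, T + 1, dtype=float)
    C = (t > T1).astype(float)           # level shift dummy
    B = np.where(t > T1, t - T1, 0.0)    # slope shift regressor
    return np.column_stack([np.ones(T), t, C, B])

def simulate_break_dates(T, T1_0, mu_b, beta_b, sigma2, reps, trim, seed=0):
    """Return estimated minus true break dates over `reps` replications."""
    rng = np.random.default_rng(seed)
    # mu_1 = beta_1 = 0 without loss of generality, as in the text.
    signal = design(T, T1_0) @ np.array([0.0, 0.0, mu_b, beta_b])
    Y = signal + rng.normal(0.0, np.sqrt(sigma2), size=(reps, T))
    candidates = np.arange(trim, T - trim + 1)
    total = np.sum(Y**2, axis=1)
    ssr = np.empty((reps, candidates.size))
    for j, T1 in enumerate(candidates):
        Q, _ = np.linalg.qr(design(T, T1))        # orthonormal basis of the regressors
        ssr[:, j] = total - np.sum((Y @ Q)**2, axis=1)  # SSR = y'y - y'Py
    return candidates[np.argmin(ssr, axis=1)] - T1_0

errors = simulate_break_dates(T=100, T1_0=50, mu_b=-1.0, beta_b=0.03,
                              sigma2=0.1, reps=500, trim=10)
share_exact = float(np.mean(errors == 0))
print(share_exact)
```

Setting `mu_b=0.0` in the call gives draws for the no-level-shift case, whose histogram can be compared with the bimodal shape discussed above.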

5.2. Simulations calibrated to empirical time series

While the previous simulation exercise is useful to assess the general properties of the approximations, it remains to show how the approximations fare in the context of data generating processes that are most likely to occur in practical applications. To that effect, we now consider a simulation design calibrated to historical (log) real per capita GDP series for a variety of countries. These are a subset of the series analysed in PZ and they cover the period from 1870 to 1986 for seven different countries for which the noise function was found to be I(0) in Perron (1992): Australia, Canada, Denmark, France, Germany, the United Kingdom and the United States.² The comparison is made among three asymptotic approximations, namely, the ‘unbounded-trend’, the

² This data set is the same as used by Kormendi and Meguire (1990) and Perron (1992) and was obtained through the Journal of Money, Credit and Banking editorial office. All series are real GDP except for the United States, for which real GNP is used. For the United States, the series is real GNP from the National Income and Products Accounts for the period 1929–1986, spliced to Romer’s (1989) estimates for the period 1870–1928. For the United Kingdom, the series is real GDP from Feinstein (1972) for the period 1870–1947, spliced to the International Financial Statistics (IFS) series of the IMF for the period 1948–1986. For the remaining countries, the series are indices of annual real GDP from Maddison (1982), spliced to the postwar IFS data. The population series used are from the same sources. A logarithmic transformation is applied.


Figure 1. Finite sample versus asymptotic approximations: $\mu_b^0 = 0$; $\beta_b^0 = -0.1, -0.05, -0.03, 0.03, 0.05, 0.1$. [Six panels, one per value of $\beta_b^0$, horizontal axis from −60 to 60; each panel plots the ‘finite sample’, ‘PZ asymptotics’ and ‘BT asymptotics’ densities.]

Figure 2. Finite sample versus asymptotic approximations: $\mu_b^0 = -0.1$; $\beta_b^0 = -0.1, -0.05, -0.03, 0.03, 0.05, 0.1$. [Six panels, one per value of $\beta_b^0$, horizontal axis from −30 to 30; each panel plots the ‘finite sample’, ‘PZ asymptotics’ and ‘BT asymptotics’ densities.]

Figure 3. Finite sample versus asymptotic approximations: $\mu_b^0 = -0.3$; $\beta_b^0 = -0.1, -0.05, -0.03, 0.03, 0.05, 0.1$. [Six panels, one per value of $\beta_b^0$, horizontal axis from −30 to 30; each panel plots the ‘finite sample’, ‘PZ asymptotics’ and ‘BT asymptotics’ densities.]

Figure 4. Finite sample versus asymptotic approximations: $\mu_b^0 = -0.5$; $\beta_b^0 = -0.1, -0.05, -0.03, 0.03, 0.05, 0.1$. [Six panels, one per value of $\beta_b^0$, horizontal axis from −30 to 30; each panel plots the ‘finite sample’, ‘PZ asymptotics’ and ‘BT asymptotics’ densities.]

Figure 5. Finite sample versus asymptotic approximations: $\mu_b^0 = -1$; $\beta_b^0 = -0.1, -0.05, -0.03, 0.03, 0.05, 0.1$. [Six panels, one per value of $\beta_b^0$, horizontal axis from −4 to 4; each panel plots the ‘finite sample’, ‘PZ asymptotics’ and ‘BT asymptotics’ densities.]

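Table 1. Parameter estimates; model II.b, real per capita GDP series.

| | AUS | CAN | DEN | FRA | GER | UK | US |
|---|---|---|---|---|---|---|---|
| $\mu_1$ | 1.033 | 0.347 | 2.155 | 2.111 | 0.934 | −2.275 | 0.702 |
| $\beta_1$ | 0.002 | 0.016 | 0.016 | 0.011 | 0.014 | 0.010 | 0.016 |
| $\mu_b$ | −0.139 | −0.284 | −0.215 | −0.460 | −0.280 | −0.250 | 0.180 |
| $\beta_b$ | 0.020 | 0.013 | 0.014 | 0.031 | 0.030 | 0.008 | 0.0001 |
| $\rho_1$ | 0.75 | 0.81 | 0.72 | 0.86 | 1.14 | 0.70 | 1.05 |
| $\rho_2$ | 0.23 | −0.11 | | −0.13 | −0.33 | 0.03 | −0.28 |
| $\rho_3$ | −0.24 | 0.09 | | | | | |
| $\sum_{i=1}^{p} \rho_i$ | 0.74 | 0.79 | 0.72 | 0.73 | 0.81 | 0.73 | 0.77 |
| $\sigma_\varepsilon$ | 0.032 | 0.048 | 0.033 | 0.065 | 0.088 | 0.030 | 0.066 |
| $\psi^2$ | 0.019 | 0.050 | 0.011 | 0.027 | 0.063 | 0.007 | 0.040 |
| $T_1$ | 1929 | 1930 | 1939 | 1943 | 1945 | 1919 | 1940 |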

‘bounded-trend with fixed shifts’ and the ‘bounded-trend with shrinking shifts’. Table 1 presents the relevant parameter estimates used to calibrate the simulations, obtained by estimating the following model:

$$
\begin{aligned}
y_t &= d_t + u_t, \\
d_t &= \mu_1 + \beta_1 t + \mu_b C_t + \beta_b B_t, \\
u_t &= \sum_{j=1}^{p} \rho_j u_{t-j} + \varepsilon_t. \qquad (3)
\end{aligned}
$$

The noise component is modelled as an AR($p$), using the BIC to select the order. To simulate the asymptotic distributions in the bounded and unbounded-trend asymptotic frameworks, we proceed as follows. From the parameter estimates $(\rho_1, \ldots, \rho_p)$ and $\sigma_\varepsilon^2$, we generate recursively a sequence of errors $u_t = \sum_{j=1}^{p} \rho_j u_{t-j} + e_t$, where $e_t \sim$ i.i.d. $N(0, \sigma_\varepsilon^2)$ and $y_0 = \ldots = y_{-(p-1)} = 0$. These are used along with the estimates of $\mu_b$ and $\beta_b$ (in the case of the unbounded-trend distribution) to construct the quantities $S_1(m)$ and $S_2(m)$ in Theorems 1(4) and 2(4). For the bounded-trend with shrinking shifts asymptotic distribution, we approximate the Wiener processes by partial sums of i.i.d. $N(0, 1)$ random variables. For the purpose of these simulation experiments, we assumed that $\psi_1 = \psi_2 = \psi$, that is, that the correlation structure is the same before and after the break. Then $\psi$ is estimated using the full sample of least-squares residuals from regression (3), using Andrews’ (1991) data-dependent method with an AR(1) approximation. The number of replications is 2,000.
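The recursive generation of the errors can be sketched as below. The sketch reads the zero initial conditions as applying to the error recursion itself, which is an interpretation; the function name is hypothetical, and the example call uses the Australian AR(3) estimates from Table 1 with $T = 117$ annual observations (1870 to 1986).

```python
import numpy as np

def simulate_ar_errors(rho, sigma_eps, T, rng):
    """Generate u_t = sum_j rho_j * u_{t-j} + e_t, e_t ~ i.i.d. N(0, sigma_eps^2),
    starting the recursion from p initial values equal to zero."""
    rho = np.asarray(rho, dtype=float)
    p = rho.size
    u = np.zeros(T + p)                  # first p entries hold the zero start-up values
    e = rng.normal(0.0, sigma_eps, size=T)
    for t in range(T):
        lags = u[t:t + p][::-1]          # (u_{t-1}, ..., u_{t-p}) in lag order
        u[t + p] = rho @ lags + e[t]
    return u[p:]

rng = np.random.default_rng(42)
# Australia, Table 1: rho = (0.75, 0.23, -0.24), sigma_eps = 0.032.
u_aus = simulate_ar_errors([0.75, 0.23, -0.24], 0.032, T=117, rng=rng)
print(u_aus[:3])
```

Repeating the call for each replication, and adding the calibrated trend function, would give the simulated series on which the break date is re-estimated.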

There is always the possibility that such a simple model is misspecified for the actual data considered. This does not invalidate the usefulness of the experiments, which is to compare the relative merits of various asymptotic frameworks in providing good approximations for a wide range of plausible cases. It may be that the parameter estimates obtained are somewhat biased or inefficient, but by using a wide range of values given by the results for the seven countries, we can assess the sensitivity of the approximations to specific values.

The results are presented in Figures 6 and 7. Although there is a wide variety in the shapes of the finite sample distributions, overall the ‘unbounded-trend’ asymptotic framework clearly performs best in providing a good approximation. For Australia, Canada, Denmark, France and Germany, it is indeed very accurate. For Canada and Denmark, the ‘bounded-trend with shrinking-shift’ approximation is also good (because the break in slope is small); and for Germany, the ‘bounded-trend with fixed shift’ is better than the ‘unbounded-trend’ approximation in the centre of the distribution, while the reverse ranking holds in the tails, especially the right tail. For the United States, none of the approximations works very well because of the wide spread of the finite sample distribution, caused by the fact that the shift in slope is very small, if at all present. For the United Kingdom, the model calibration is such that the break is identified with high precision, in the sense that the finite sample distribution and all approximations are nearly degenerate at 0 (in fact, at most three values out of the 2,000 replications are non-zero). Therefore, the kernel smoothed densities are not a good indication of the true nature of the distribution, since they are greatly influenced by a few outlying values; hence a graph is not reported. This feature is due to the fact that the level shift is very large, with $|\mu_b/\psi| = 3.0$.

Figure 6. Finite sample versus asymptotic approximations: empirically calibrated comparisons. [Three panels: Australia, Canada, Denmark; each panel plots the ‘finite sample’, ‘PZ asymptotics’, ‘BT asymptotics’ and ‘BT shrinking asymptotics’ densities.]

The results in the first set of simulations, which involved serially uncorrelated errors, revealed that the bounded-trend asymptotic distribution performs best and can be close to the unbounded-trend and exact distributions when the level shift is large. In the case of the experiments in this section, the level shifts are indeed large for many countries (Canada, Denmark, France, Germany and the United Kingdom), which is why the distributions are not bimodal. What the results in this section show is that the presence of serial correlation exacerbates the extent to which the bounded-trend asymptotic distribution provides a less accurate approximation to the exact distribution. On the other hand, the quality of the approximations provided by the unbounded-trend asymptotic distribution is seen to be robust to the presence of serial correlation in the noise component.

Figure 7. Finite sample versus asymptotic approximations: empirically calibrated comparisons. [Three panels: France, Germany, United States; each panel plots the ‘finite sample’, ‘PZ asymptotics’, ‘BT asymptotics’ and ‘BT shrinking asymptotics’ densities.]

6. CONCLUSIONS

We considered the adequacy of various asymptotic approximations pertaining to the estimate of the break date in a simple linear trend model with a single change in slope and a possible concurrent change in level. Our focus has been on comparing the limit distributions provided by the so-called ‘bounded-trend’ and ‘unbounded-trend’ frameworks. As expected, when the noise function $u_t$ is I(1), the ‘bounded-trend’ asymptotic framework is of little use. Of more interest are the results pertaining to the case with a stationary noise component. If the level does not change concurrently with the slope and the fitted intercept is not allowed to change, then all frameworks are basically equivalent. However, if the fitted intercept is allowed to change, whether or not there is a level shift, the ‘bounded-trend’ asymptotic framework completely misses important features of the finite sample distribution of the estimate of the break date, especially the pronounced bimodality that was uncovered by PZ and shown to be well captured using the alternative ‘unbounded-trend’ asymptotic framework. Simulation experiments confirm our theoretical findings, which expose the drawbacks of using the ‘bounded-trend’ asymptotic framework in the context of structural change models.

REFERENCES

Andrews, D. W. K. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59, 817–58.

Andrews, D. W. K. and C. J. McDermott (1995). Nonlinear econometric models with deterministically trending variables. Review of Economic Studies 62, 343–60.

Bai, J. (1997). Estimation of a change point in multiple regression models. The Review of Economics and Statistics 79, 551–63.

Bai, J., R. Lumsdaine and J. H. Stock (1998). Testing for and dating common breaks in multivariate time series. Review of Economic Studies 65, 395–432.

Bai, J. and P. Perron (1998). Estimating and testing linear models with multiple structural changes. Econometrica 66, 47–78.

Bai, J. and P. Perron (2005). Multiple structural change models: A simulation analysis. In D. Corbea, S. Durlauf and B. E. Hansen (eds.), Econometric Theory and Practice: Frontiers of Analysis and Applied Research, London: Cambridge University Press.

Feinstein, C. H. (1972). National income expenditure and output of the United Kingdom, vol. 6. In Richard Stone (series ed.), Studies in the National Income and Expenditure of the United Kingdom, London: Cambridge University Press.

Hinkley, D. V. (1970). Inference about the change-point in a sequence of random variables. Biometrika 57, 1–17.

Kormendi, R. C. and P. Meguire (1990). A multicountry characterization of the nonstationarity of aggregate output. Journal of Money, Credit and Banking 22, 77–93.

Maddison, A. (1982). Phases of Capitalist Development. London: Oxford University Press.

Nelson, C. R. and C. I. Plosser (1982). Trends and random walks in macroeconomics time series: some evidence and implications. Journal of Monetary Economics 10, 139–62.

Perron, P. (1989). The great crash, the oil price shock and the unit root hypothesis. Econometrica 57, 1361–1401.

Perron, P. (1992). Trend, unit root and structural change: A multi-country study with historical data. Proceedings of the Business and Economic Statistics Section, American Statistical Association, 144–49.

Perron, P. (2006). Dealing with structural breaks. In T. C. Mills and K. Petterson (eds.), Palgrave Handbook of Econometrics, Vol. 1, Econometric Theory, Palgrave Macmillan, New York, 278–352.

Perron, P. and X. Zhu (2005). Structural breaks with deterministic and stochastic trends. Journal of Econometrics 129, 65–119.

Picard, D. (1985). Testing and estimating change-points in time series. Journal of Applied Probability 17, 841–67.

Phillips, P. C. B. and P. Perron (1988). Testing for a unit root in time series regression. Biometrika 75, 335–46.

Phillips, P. C. B. and V. Solo (1992). Asymptotics for linear processes. Annals of Statistics 20, 971–1001.

Romer, C. (1989). The prewar business cycle reconsidered: New estimates of gross national product, 1969-1928. Journal of Political Economy 97, 1–37.

Siegmund, D. (1988). Confidence sets in change-point problems. International Statistical Review 56, 31–48.

Yao, Y.-C. (1987). Approximating the distribution of the maximum likelihood estimate of the change-point in a sequence of independent random variables. Annals of Statistics 15, 1321–1328.

APPENDIX A:

From the properties of projections, we have for all $T$, $SSR(\hat\lambda) \le SSR(\lambda^0)$, which implies the following inequality, as shown in PZ:

$$
\gamma^{0\prime}(X_{T_1^0} - X_{T_1})'(I - P_{T_1})(X_{T_1^0} - X_{T_1})\gamma^0 + 2\gamma^{0\prime}(X_{T_1^0} - X_{T_1})'(I - P_{T_1})U + U'(P_{T_1^0} - P_{T_1})U \le 0. \qquad (A.1)
$$

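Note also that

$$
\begin{aligned}
\arg\min_{T_1} [SSR(T_1)] &= \arg\min_{T_1} [SSR(T_1) - SSR(T_1^0)] \\
&= \arg\min_{T_1} \Big[ \gamma^{0\prime}(X_{T_1^0} - X_{T_1})'(I - P_{T_1})(X_{T_1^0} - X_{T_1})\gamma^0 + 2\gamma^{0\prime}(X_{T_1^0} - X_{T_1})'(I - P_{T_1})U \\
&\qquad\qquad + U'(P_{T_1^0} - P_{T_1})U \Big].
\end{aligned}
$$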

This will be employed to derive the asymptotic distribution of the least-squares estimate of the break fraction, $\hat\lambda = \hat T_1/T$. Note that throughout we use the labels $O(T^a)$ and $O_p(T^a)$ in their strict sense, that is, meaning that the variables are not $o(T^a)$ and $o_p(T^a)$, respectively. We shall repeatedly use the following notation:

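$$
\begin{aligned}
(XX) &\equiv \gamma^{0\prime}(X_{T_1^0} - X_{T_1})'(I - P_{T_1})(X_{T_1^0} - X_{T_1})\gamma^0, \\
(XU) &\equiv \gamma^{0\prime}(X_{T_1^0} - X_{T_1})'(I - P_{T_1})U, \\
(UU) &\equiv U'(P_{T_1^0} - P_{T_1})U.
\end{aligned}
$$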
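As a numerical illustration of this notation, the sketch below checks that $SSR(T_1) - SSR(T_1^0) = (XX) + 2(XU) + (UU)$ for one simulated sample. It assumes the regressors $(1, t, C_t, B_t)$ with $C_t = 1(t > T_1)$ and $B_t = (t - T_1)1(t > T_1)$; the sample size, break dates, parameter values and helper names are hypothetical. The identity itself only uses $(I - P_{T_1})X_{T_1} = 0$, so it does not depend on this particular choice of regressors.

```python
import numpy as np

def design(T, T1):
    """Regressors (1, t, C_t, B_t) for break date T1 (assumed form)."""
    t = np.arange(1, T + 1, dtype=float)
    C = (t > T1).astype(float)
    B = np.where(t > T1, t - T1, 0.0)
    return np.column_stack([np.ones(T), t, C, B])

def projection(X):
    """Orthogonal projection matrix onto the column space of X."""
    Q, _ = np.linalg.qr(X)
    return Q @ Q.T

T, T1_true, T1_cand = 60, 30, 36                 # hypothetical dates
gamma0 = np.array([0.0, 0.0, -0.3, 0.05])        # (mu_1, beta_1, mu_b, beta_b)
rng = np.random.default_rng(7)
U = rng.normal(0.0, np.sqrt(0.1), size=T)

X0, X1 = design(T, T1_true), design(T, T1_cand)
Y = X0 @ gamma0 + U
P0, P1 = projection(X0), projection(X1)
I = np.eye(T)

d = (X0 - X1) @ gamma0                           # (X_{T1^0} - X_{T1}) gamma^0
XX = d @ (I - P1) @ d
XU = d @ (I - P1) @ U
UU = U @ (P0 - P1) @ U

lhs = Y @ (I - P1) @ Y - Y @ (I - P0) @ Y        # SSR(T1) - SSR(T1^0)
rhs = XX + 2.0 * XU + UU
print(lhs, rhs)
```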

Proof of Theorem 1: We start by analysing consistency and then the rate of convergence when the estimate is consistent. The results can be obtained from the following Lemma, which is similar to Lemma 1 in PZ.

Lemma A.1 (1) For Model I.a,

$$(XX) = |T_1 - T_1^0|^2\, O_p(T^{-1}), \quad (XU) = |T_1 - T_1^0|\, O_p(T^{1/2}), \quad (UU) = |T_1 - T_1^0|\, O_p(T).$$

(2) For Model I.b,

$$(XX) = |T_1 - T_1^0|^2\, O_p(T^{-1}), \quad (XU) = |T_1 - T_1^0|\, O_p(T^{-1/2}), \quad (UU) = |T_1 - T_1^0|\, O_p(T^{-1}).$$

(3) For Model II.a,

$$(XX) = |T_1 - T_1^0|\, O_p(1), \quad (XU) = |T_1 - T_1^0|\, O_p(T^{1/2}), \quad (UU) = |T_1 - T_1^0|\, O_p(T).$$

(4) For Model II.b,

$$(XX) = |T_1 - T_1^0|\, O_p(1), \quad (XU) = |T_1 - T_1^0|^{1/2}\, O_p(1), \quad (UU) = |T_1 - T_1^0|^{1/2}\, O_p(T^{-1/2}).$$

(5) For Model III.a,

$$(XX) = |T_1 - T_1^0|\, O_p(1), \quad (XU) = |T_1 - T_1^0|\, O_p(T^{1/2}), \quad (UU) = |T_1 - T_1^0|\, O_p(T).$$

(6) For Model III.b,

$$(XX) = |T_1 - T_1^0|\, O_p(1), \quad (XU) = |T_1 - T_1^0|^{1/2}\, O_p(1), \quad (UU) = |T_1 - T_1^0|^{1/2}\, O_p(T^{-1/2}).$$

The proof is similar to that of Lemma 1 in PZ and, hence, omitted. Using Lemma A.1, it is relatively easy to establish the following results about consistency and rates of convergence.

Lemma A.2 For Models I.a, II.a and III.a (with I(1) errors), $\hat\lambda - \lambda^0 = O_p(1)$ and the estimate of the break fraction is not consistent for the true break fraction. For Models I.b, II.b and III.b (with I(0) errors), $\hat\lambda \to_p \lambda^0$.

Lemma A.3 In the case of I(0) errors, for Model I.b, $\hat\lambda - \lambda^0 = O_p(T^{-1/2})$, and for Models II.b and III.b, $\hat\lambda - \lambda^0 = O_p(T^{-1})$.

We are now in a position to derive the limit distributions. We start with the case where the errors are I(1).

Proof for Models I.a, II.a and III.a: In all cases, as can be verified from Lemma A.1, the term (UU) dominates the other two. Since this term does not involve any parameter and, in particular, the magnitude of the changes, this immediately implies that the limit distribution is that corresponding to the no break case. We have for all three cases:

$$
\begin{aligned}
\hat\lambda &= \arg\min_{\lambda \in \Lambda} \frac{U'(P_{T_1^0} - P_{T_1})U}{T^2} + o_p(1) = \arg\max_{\lambda \in \Lambda} \frac{U' P_{T_1} U}{T^2} + o_p(1) \\
&= \arg\max_{\lambda \in \Lambda} \frac{U' X_{T_1}(X_{T_1}' X_{T_1})^{-1} X_{T_1}' U}{T^2}.
\end{aligned}
$$

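For Model I.a, let $D_{T^*} = \mathrm{diag}(1, T^*, T^*)$; we have

$$
T^{-1/2} X_{T_1}' U =
\begin{pmatrix}
T^{-1/2} \sum_{t=1}^{T} u_t \\
T^* T^{-3/2} \sum_{t=1}^{T} t\, u_t \\
T^* T^{-3/2} \sum_{t=T_1+1}^{T} (t - T_1)\, u_t
\end{pmatrix}
\Rightarrow D_{T^*} m(\lambda), \qquad (A.2)
$$

and $T^{-1} X_{T_1}' X_{T_1} \to D_{T^*} Q(\lambda) D_{T^*}$, so that $\hat\lambda = \arg\max_{\lambda \in \Lambda} m(\lambda)' Q(\lambda)^{-1} m(\lambda)$. The limit distribution stated for Models II.a and III.a can be obtained analogously. We now consider the limit distributions for the cases with I(0) errors.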

Proof for Model I.b: Given the rate of convergence stated in Lemma A.3, we can restrict the search to the set of break dates that satisfy $|T_1 - T_1^0| = O_p(\sqrt{T})$. Then, the terms (XX) and (XU) are both $O_p(1)$, and (UU) $= o_p(1)$. Hence, we have $\hat\lambda = \arg\max_{\lambda}[(XX) + 2(XU) + o_p(1)]$. Without loss of generality, we set $\mu_1 = \beta_1 = 0$, since the estimates are invariant to these values and, hence, $\gamma^{0\prime} = (0, 0, \beta_b^0)$. Defining $m_T = |T_1 - T_1^0|/\sqrt{T}$, we have

$$
\begin{aligned}
T^{-1/2} X_{T_1}'(X_{T_1^0} - X_{T_1})\gamma^0
&= \beta_b^0
\begin{pmatrix}
\frac{1}{\sqrt{T}} \frac{T^*}{T} \left( \sum_{t=1}^{T_1 - T_1^0} t + (T - T_1)(T_1 - T_1^0) \right) \\
\frac{1}{\sqrt{T}} \left( \frac{T^*}{T} \right)^2 \left( \sum_{t=1}^{T_1 - T_1^0} t\,(T_1^0 + t) + (T_1 - T_1^0) \sum_{t=T_1+1}^{T} t \right) \\
\frac{1}{\sqrt{T}} \left( \frac{T^*}{T} \right)^2 (T_1 - T_1^0) \sum_{t=1}^{T - T_1} t
\end{pmatrix} \\
&= \beta_b^0
\begin{pmatrix}
(1 - \lambda^0)\, T^* \\
\frac{1 - (\lambda^0)^2}{2} (T^*)^2 \\
\frac{(1 - \lambda^0)^2}{2} (T^*)^2
\end{pmatrix}
m_T + o_p(1).
\end{aligned}
$$

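Hence,

$$
\begin{aligned}
&\gamma^{0\prime}(X_{T_1^0} - X_{T_1})' X_{T_1} (X_{T_1}' X_{T_1})^{-1} X_{T_1}' (X_{T_1^0} - X_{T_1})\gamma^0 \\
&\quad = T^{-1/2}\gamma^{0\prime}(X_{T_1^0} - X_{T_1})' X_{T_1} (T^{-1} X_{T_1}' X_{T_1})^{-1}\, T^{-1/2} X_{T_1}' (X_{T_1^0} - X_{T_1})\gamma^0 \\
&\quad = (\beta_b^0)^2
\begin{pmatrix}
(1 - \lambda^0)\, T^* \\
\frac{1 - (\lambda^0)^2}{2} (T^*)^2 \\
\frac{(1 - \lambda^0)^2}{2} (T^*)^2
\end{pmatrix}'
(D_{T^*} Q(\lambda) D_{T^*})^{-1}
\begin{pmatrix}
(1 - \lambda^0)\, T^* \\
\frac{1 - (\lambda^0)^2}{2} (T^*)^2 \\
\frac{(1 - \lambda^0)^2}{2} (T^*)^2
\end{pmatrix}
m_T^2 + o_p(1) \\
&\quad = (\beta_b^0)^2 \left[ \frac{(1 - \lambda^0)(4 - \lambda^0)}{4} (T^*)^2 \right] m_T^2 + o_p(1).
\end{aligned}
$$

Furthermore,

$$
\gamma^{0\prime}(X_{T_1^0} - X_{T_1})'(X_{T_1^0} - X_{T_1})\gamma^0 = (\beta_b^0)^2 (T^*)^2 (1 - \lambda^0)\, m_T^2 + o_p(1),
$$

and, subtracting the projected part from this expression, (XX) $= (\beta_b^0)^2 (T^*)^2 (1 - \lambda^0) \lambda^0 m_T^2 / 4 + o_p(1)$. Next,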

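$$
\gamma^{0\prime}(X_{T_1^0} - X_{T_1})' U = \beta_b^0 T^* \left( \sum_{t=T_1^0+1}^{T_1} \frac{t - T_1^0}{T} u_t + \sum_{t=T_1+1}^{T} \frac{T_1 - T_1^0}{T} u_t \right) \Rightarrow \beta_b^0 T^* m_T \int_{\lambda^0}^{1} dW(r),
$$

using (A.2). Then, (XU) is such that

$$
\begin{aligned}
(XU) &= \beta_b^0 m_T \sigma \left[ T^* \int_{\lambda^0}^{1} dW(r) - \left[ -T^* \frac{1 - \lambda^0}{2}, \;\; \frac{3(1 - \lambda^0)}{2\lambda^0}, \;\; \frac{3(2\lambda^0 - 1)}{2\lambda^0 (1 - \lambda^0)} \right]
\begin{bmatrix}
\int_0^1 dW(r) \\
T^* \int_0^1 r\, dW(r) \\
T^* \int_{\lambda^0}^1 (r - \lambda^0)\, dW(r)
\end{bmatrix}
\right] + o_p(1) \\
&= \beta_b^0 m_T \sigma T^* \zeta + o_p(1),
\end{aligned}
$$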

where $\zeta$ is defined in PZ, equation A-8. The result then follows from the fact that

$$
m_T^* = \arg\max_{m_T} \left[ (T^* \beta_b^0)^2\, \frac{2(1 - \lambda^0)\lambda^0}{4}\, m_T^2 + \beta_b^0 m_T \sigma T^* \zeta + o_p(1) \right].
$$


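Proof for Model II.b: In this case, from Lemma A.3, $T(\hat\lambda - \lambda^0) = O_p(1)$ and we can restrict the analysis to break dates in a set such that $|T_1 - T_1^0| = O_p(1)$. Following PZ, we define the following quantities. For $T_1^0 \ge T_1$,

$$
g_1(T_1 - T_1^0) = \sum_{t=T_1+1}^{T_1^0} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right], \qquad
h_1(T_1 - T_1^0) = \sum_{t=T_1+1}^{T_1^0} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right]^2,
$$

and, for $T_1 \ge T_1^0$,

$$
g_2(T_1 - T_1^0) = \sum_{t=T_1^0+1}^{T_1} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right], \qquad
h_2(T_1 - T_1^0) = \sum_{t=T_1^0+1}^{T_1} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right]^2.
$$

Let $n = T_1 - T_1^0$ and $k = t - T_1^0$; then

$$
\text{for } n < 0: \quad g_1(n) = \sum_{k=n+1}^{0} \left[ \mu_b^0 + \beta_b^0 \frac{k}{T} T^* \right], \qquad h_1(n) = \sum_{k=n+1}^{0} \left[ \mu_b^0 + \beta_b^0 \frac{k}{T} T^* \right]^2,
$$

$$
\text{for } n > 0: \quad g_2(n) = \sum_{k=1}^{n} \left[ \mu_b^0 + \beta_b^0 \frac{k}{T} T^* \right], \qquad h_2(n) = \sum_{k=1}^{n} \left[ \mu_b^0 + \beta_b^0 \frac{k}{T} T^* \right]^2.
$$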
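For a fixed offset $n$, the slope term $\beta_b^0 (k/T) T^*$ in these sums shrinks as $T$ grows with $T^*$ held fixed, so the sums end up being governed by the level shift alone. The sketch below illustrates this numerically; the function name and the parameter values ($\mu_b^0 = -0.3$, $\beta_b^0 = 0.1$, $T^* = 1$, $n = \pm 3$) are hypothetical.

```python
import numpy as np

def g_and_h(n, mu_b, beta_b, T, T_star):
    """Return (g(n), h(n)): the sums of mu_b + beta_b*(k/T)*T_star and of its
    square over k = 1..n when n > 0, or over k = n+1..0 when n < 0."""
    if n == 0:
        return 0.0, 0.0
    k = np.arange(1, n + 1) if n > 0 else np.arange(n + 1, 1)
    terms = mu_b + beta_b * (k / T) * T_star
    return float(terms.sum()), float((terms**2).sum())

mu_b, beta_b, T_star = -0.3, 0.1, 1.0
for T in (10**2, 10**4, 10**6):
    # As T grows, g approaches |n| * mu_b and h approaches |n| * mu_b**2.
    print(T, g_and_h(3, mu_b, beta_b, T, T_star), g_and_h(-3, mu_b, beta_b, T, T_star))

g_pos, h_pos = g_and_h(3, mu_b, beta_b, 10**8, T_star)
g_neg, h_neg = g_and_h(-3, mu_b, beta_b, 10**8, T_star)
```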

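Now consider driving $T$ to infinity while keeping $T^*$ fixed. Since $n = T_1 - T_1^0 = O_p(1)$,

$$
\text{for } n < 0: \quad g_1(n) = -n\mu_b^0 + o_p(1), \qquad h_1(n) = -n(\mu_b^0)^2 + o_p(1),
$$

$$
\text{for } n > 0: \quad g_2(n) = n\mu_b^0 + o_p(1), \qquad h_2(n) = n(\mu_b^0)^2 + o_p(1).
$$

Now, for $T_1 > T_1^0$,

$$
\begin{aligned}
&\gamma^{0\prime}(X_{T_1^0} - X_{T_1})'(I - P_{T_1})(X_{T_1^0} - X_{T_1})\gamma^0
= \sum_{t=T_1^0+1}^{T_1} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right]^2 \\
&\quad - \frac{1}{\sqrt{T}} \sum_{t=T_1^0+1}^{T_1} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right] x(T_1)_t' \left( \frac{X_{T_1}' X_{T_1}}{T} \right)^{-1} \frac{1}{\sqrt{T}} \sum_{t=T_1^0+1}^{T_1} x(T_1)_t \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right],
\end{aligned}
$$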

where $x(T_1)_t = (1, (t/T)T^*, 0, 0)$ for $T_1^0 + 1 \le t \le T_1$. Note that $\sum_{t=T_1^0+1}^{T_1} [\mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^*]^2 = h_2$ and

$$
\begin{aligned}
&\frac{1}{\sqrt{T}} \sum_{t=T_1^0+1}^{T_1} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right] x(T_1)_t' \\
&\quad = \frac{1}{\sqrt{T}}\, g_2 \left( 1, \; \frac{T_1^0}{T} T^*, \; 0, \; 0 \right) + \frac{1}{\sqrt{T}} \sum_{t=T_1^0+1}^{T_1} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right] \left( 0, \; \frac{t - T_1^0}{T} T^*, \; 0, \; 0 \right) \\
&\quad \le \left\{ \frac{1}{\sqrt{T}} |g_2| + \frac{1}{\sqrt{T}} |g_2| \frac{T_1 - T_1^0}{T} T^* \right\} \left( 1, \; \frac{T_1^0}{T} T^*, \; 0, \; 0 \right) = O_p(|g_2|\, T^{-1/2}), \qquad (A.3)
\end{aligned}
$$

since $(T_1 - T_1^0)/T \to 0$. Also, $(T^{-1} X_{T_1}' X_{T_1})^{-1} = O_p(1)$, hence $\gamma^{0\prime}(X_{T_1^0} - X_{T_1})' P_{T_1} (X_{T_1^0} - X_{T_1})\gamma^0 = O_p(|g_2|^2 T^{-1}) = o_p(1)$. Therefore, we have (XX) $= h_2 + o_p(1)$ for $T_1 > T_1^0$, and (XX) $= h_1 + o_p(1)$ for $T_1^0 > T_1$. For the term (XU), we first have, for $T_1 > T_1^0$,

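$$
\begin{aligned}
&\gamma^{0\prime}(X_{T_1^0} - X_{T_1})'(I - P_{T_1})U \\
&\quad = \sum_{t=T_1^0+1}^{T_1} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right] u_t
- \left[ \frac{1}{\sqrt{T}} \sum_{t=T_1^0+1}^{T_1} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right] x(T_1)_t' \right] \left( \frac{X_{T_1}' X_{T_1}}{T} \right)^{-1} \frac{1}{\sqrt{T}} X_{T_1}' U \\
&\quad = \sum_{t=T_1^0+1}^{T_1} \left[ \mu_b^0 + \beta_b^0 \frac{t - T_1^0}{T} T^* \right] u_t + o_p(1),
\end{aligned}
$$

where the last equality follows from (A.3). Hence,

$$
(XU) =
\begin{cases}
\sum_{t=T_1^0+1}^{T_1} \mu_b^0\, u_t + o_p(1) & \text{if } T_1 > T_1^0, \\
0 & \text{if } T_1 = T_1^0, \\
-\sum_{t=T_1+1}^{T_1^0} \mu_b^0\, u_t + o_p(1) & \text{if } T_1 < T_1^0,
\end{cases}
$$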

and the result follows combining the limits of (XX) and (XU), since the term (UU) is dominated.

Proof for Model III.b: The result follows from Bai (1997) with simple modifications.
