Cointegrating Regressions with Time Heterogeneity

This article was downloaded by: [Stony Brook University] On: 29 October 2014, At: 01:59
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Econometric Reviews
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lecr20

Cointegrating Regressions with Time Heterogeneity
Chang Sik Kim (a) & Joon Y. Park (b)
(a) School of Economics, Sungkyunkwan University, Seoul, South Korea
(b) Department of Economics, Indiana University, Bloomington, Indiana, USA, and Sungkyunkwan University, Seoul, South Korea
Published online: 26 Feb 2010.

To cite this article: Chang Sik Kim & Joon Y. Park (2010) Cointegrating Regressions with Time Heterogeneity, Econometric Reviews, 29:4, 397-438, DOI: 10.1080/07474930903562221

To link to this article: http://dx.doi.org/10.1080/07474930903562221

Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


Econometric Reviews, 29(4):397–438, 2010
Copyright © Taylor & Francis Group, LLC
ISSN: 0747-4938 print/1532-4168 online
DOI: 10.1080/07474930903562221

COINTEGRATING REGRESSIONS WITH TIME HETEROGENEITY

Chang Sik Kim¹ and Joon Y. Park²

¹School of Economics, Sungkyunkwan University, Seoul, South Korea
²Department of Economics, Indiana University, Bloomington, Indiana, USA, and Sungkyunkwan University, Seoul, South Korea

This article considers the cointegrating regression with errors whose variances change smoothly over time. The model can be used to describe a long-run cointegrating relationship, the tightness of which varies with time. Heteroskedasticity in the errors is modeled nonparametrically and is assumed to be generated by a smooth function of time. We show that it can be consistently estimated by the kernel method. Given consistent estimates for error variances, the cointegrating relationship can be efficiently estimated by the usual generalized least squares (GLS) correction for heteroskedastic errors. It is shown that the U.S. money demand function, both for M1 and M2, is well fitted by such a cointegrating model with an increasing trend in error variances. Moreover, we find that the bilateral purchasing power parities among leading industrialized countries such as the United States, Japan, Canada, and the United Kingdom have changed rather conspicuously over the past thirty years. In particular, it appears that they have all generally become tighter during the period.

Keywords Cointegrating regression; GLS correction for heteroskedasticity; Kernel estimation; Time heterogeneity.

JEL Classification C22.

1. INTRODUCTION

Since the publication of an influential article by Engle and Granger (1987), cointegration has widely been regarded as a useful notion to model long-run economic relationships. Naturally, many authors have formulated and tested various economic long-run relationships using the concept of cointegration. Their empirical results, however, have more often provided negative, rather than positive, evidence for the presence of cointegration. Though some authors successfully modeled long-run economic equilibria within the framework of cointegration, many others found little empirical evidence of cointegration even for some well-postulated long-run economic relationships.

Address correspondence to Chang Sik Kim, School of Economics, Sungkyunkwan University, Seoul, South Korea; E-mail: [email protected]

There can be various reasons for this frustrating evidence. It may be attributed to model misspecifications due to the presence of omitted variables and/or neglected structural changes, among many other possibilities, for which the empirical researchers are to blame. However, there are also models that the existing concept of cointegration is too simple and restrictive to fit properly. In particular, many long-run economic relationships exhibit characteristics that change over time. This is likely to be so, since cointegration describes a long-run relationship and is usually estimated using data whose time span is rather long. Obviously, the simple static cointegration model is inappropriate for such relationships.

The purpose of this article is to introduce a more flexible cointegration model. This is to accommodate a broader class of economic long-run relationships within the framework of cointegration. In particular, we allow for the long-run equilibrium error variance to change and vary with time, similarly as in Hansen (1995). While he specifies the error variance as a continuous function of a nearly nonstationary AR process, it is modeled nonparametrically as a smooth function of time in our model. Such modeling seems appropriate for an economic long-run relationship, the tightness of which evolves with time. The changing variance of our model can be estimated using the standard kernel method.

Stationary time series models with various types of time heterogeneity have been considered in the literature. The stationary regression with the same type of time heterogeneity as the one we deal with in this article was studied earlier by Robinson (1987, 1989, 1991). More recently, Phillips and Xu (2006) and Xu and Phillips (2008) investigated the stationary autoregressive model with time heterogeneity in the error variance. Harvey and Robinson (1988) considered the GLS estimation of regression in the presence of deterministic and nonstationary error variances, and Hidalgo (1992) developed an adaptive estimation method in the presence of an unknown form of heteroskedasticity. Early work by Andrews (1987) established the consistency of least squares in regressions with integrated regressors under extremely mild error conditions, allowing even for explosively trending and non-mixing errors.

Moreover, much attention has recently been paid to unit root models with various types of time-varying error variances. Hamori and Tokihisa (1997) and Kim et al. (2002) considered the unit root test in a model with an abrupt change in volatility. A more extensive analysis of the effect of time-varying variances on the unit root test was given in Cavaliere (2004) and Cavaliere and Taylor (2007). In particular, they developed unit root tests that are robust to a very general specification of time-varying volatility. Beare (2004) proposed nonparametrically corrected tests for the unit root, and Boswijk (2005) derived the asymptotic power envelope of the unit root test for the model driven by a known nonstationary volatility process, and showed that the power envelope can be attained by nonparametric adaptive estimation of the volatility process. Boswijk and Zu (2007) extended the results obtained by Boswijk (2005) to the multivariate case, and developed an efficient cointegration test.

Our model, on the other hand, may simply be viewed as a cointegrating regression with heteroskedasticity. The serial correlation in the cointegrating regression can be ignored without losing efficiency, as shown in Phillips and Park (1986). For the cointegrating regression with serially correlated errors, ordinary least squares (OLS) becomes asymptotically equivalent to GLS. However, the asymptotic equivalence of OLS and GLS does not necessarily hold for the cointegrating regression with heteroskedastic errors. We show in the article that neglected heterogeneity may result in inefficiency and invalidity of OLS, if it has a certain tendency and cannot be averaged out. Particularly, it may distort the size of the test of cointegration and yield inefficient estimates for the regression coefficients. Through simulation, we show that the adverse effect of neglected heterogeneity can be substantial, both in small and large samples.

To demonstrate the usefulness of our model and methodology, we consider two economic long-run relationships: the money demand equation and the purchasing power parity condition. These are among the economic relationships that have recently been investigated most intensely by empirical researchers. In the article, we show that the U.S. money demand equation is well fitted by the cointegrating regression with changing variance. In fact, it is found that the time heterogeneity in the equation has an increasing trend, with a peak occurring roughly in the first half of the 1990s. Moreover, many of the bilateral purchasing power parities among the United States, Japan, Canada, and the United Kingdom hold when the heterogeneity in the error variance is accounted for. The time heterogeneity for the PPP model fluctuates considerably in all cases, especially in the 1970s and 1980s. However, it has an obvious decreasing trend. Except for the period of the Asian financial crisis in the late 1990s, the PPP conditions for U.S.–Japan, Japan–Canada, Japan–U.K., and Canada–U.K. all seem to have become tighter recently.

The rest of the article is organized as follows. Section 2 introduces the model with some preliminary results. Theoretical results on the model are given in Section 3. We first consider the simple regression with strictly exogenous regressors to highlight the effect of variance changes on the statistical theory for the cointegrating regression. Then the theory is extended to more general cointegrating regressions. This is done using the method of canonical cointegrating regressions, developed by Park (1992). We discuss the specification tests in Section 4, which we may use to check the adequacy of our model specification. Section 5 introduces empirical applications of our model and methodology. There we report the empirical results for the money demand equation and the purchasing power parity condition. The results from a simple Monte Carlo simulation are given in Section 6. The effect of changing variance on the test of cointegration is examined and the relative efficiency of the heteroskedasticity-corrected procedure is computed. Section 7 concludes the article. All the proofs for the theoretical results of the article are in the Mathematical Appendix.

A word on notation. For a random vector $z = (z_i)$, $\|z\|_p = \max_i (\mathbb{E}|z_i|^p)^{1/p}$ denotes the $L^p$-norm. For a function $\psi = (\psi_i)$ from a subset of $\mathbb{R}$ into $\mathbb{R}^p$, we define the supremum norm $\|\psi\|$ by $\|\psi\| = \max_i \sup_x |\psi_i(x)|$. If applied to a vector or a matrix, $|\cdot|$ denotes the maximum modulus of their elements. The space of cadlag functions on the unit interval $[0, 1]$ endowed with the Skorohod topology is denoted by $D[0, 1]$. As usual, $D[0, 1]^k$ signifies the product space of $k$ copies of $D[0, 1]$. The standard notations $\to_p$ and $\to_d$ are used to denote convergence in probability and in distribution, respectively. Vector Brownian motion with covariance matrix $\Omega$ is denoted by $BM(\Omega)$. The theoretical results in the article rely heavily on the invariance principle $B_n \to_d BM(\Omega)$ for a $k$-dimensional partial sum process $B_n$. The convergence $\to_d$ in distribution here is the weak convergence of the probability measures in $D[0, 1]^k$. Finally, we denote by $\mathbb{MN}(0, V)$ the mixed normal distribution with variance $V$.

2. THE MODEL AND PRELIMINARY RESULTS

We consider the regression model

$$y_t = x_t'\beta + e_t \tag{1}$$

for $t = 1, \ldots, n$. It is assumed in (1) that $(x_t)$ is an $m \times 1$ integrated process of order one and $(e_t)$ is generated as

$$e_t = \sigma_t u_t, \tag{2}$$

where $(u_t)$ is a weakly stationary process and $(\sigma_t)$ is a non-negative numerical sequence. Let $(u_t)$ have unit variance, so that $\mathrm{var}(e_t) = \sigma_t^2$. We specify

$$\sigma_t^2 = f\left(\frac{t}{n}\right) \tag{3}$$

with a smooth function $f$ defined on the unit interval $[0, 1]$.
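To make the specification in (1)–(3) concrete, the following is a minimal simulation sketch. The function name `simulate_cr_het`, the choice of Gaussian random-walk regressors, the iid standard normal $(u_t)$, and the particular variance function are all illustrative assumptions; the model itself allows far more general integrated regressors and serially correlated errors.

```python
import numpy as np

def simulate_cr_het(n, beta, f, rng):
    """Simulate y_t = x_t' beta + e_t with e_t = sigma_t * u_t and sigma_t^2 = f(t/n).

    Illustrative assumptions: x_t is a vector of independent Gaussian random
    walks and u_t is iid N(0, 1), the simplest unit-variance stationary case.
    """
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    m = beta.size
    # Regressors: m independent random walks, i.e. an I(1) process.
    x = np.cumsum(rng.standard_normal((n, m)), axis=0)
    # Unit-variance stationary component of the error.
    u = rng.standard_normal(n)
    # Heterogeneity: sigma_t^2 = f(t/n) on the grid t/n, t = 1, ..., n.
    r = np.arange(1, n + 1) / n
    sigma = np.sqrt(f(r))
    e = sigma * u
    y = x @ beta + e
    return y, x, e, sigma

rng = np.random.default_rng(0)
f = lambda r: (0.5 + 2.0 * r) ** 2   # hypothetical smooth, strictly positive f
y, x, e, sigma = simulate_cr_het(500, [1.0], f, rng)
```

With this hypothetical $f$, the error standard deviation rises roughly linearly from 0.5 to 2.5 over the sample, so the cointegrating relationship loosens as time goes on.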


The model given by (1)–(3) can be viewed as a cointegrating regression with heteroskedastic errors.¹ The error variance is specified as changing smoothly over time and modeled nonparametrically as being generated by a smooth function. Our model is comparable to the regression considered by Hansen (1995). In his volatility regression model, the errors are assumed to be martingale differences and conditional variances are modeled as being generated by a continuous function of a nearly nonstationary AR process. Our model allows for serial correlation in the errors, with the simplifying assumption that the error process has the same autocovariance structure with their magnitude changing over time. Such a model seems practically very useful in describing a long-run relationship, the tightness of which is believed to change slowly over time.

Throughout the article, we assume that the regressor $(x_t)$ is an integrated process driven by a general stationary process. As a natural extension, we may consider a model which allows for time heterogeneity in the regressor as well as in the regression error. Such a model may be found to be more appropriate in some applications. In fact, some evidence has been found of the presence of nonstationary volatility in macroeconomic and financial data. Watson (1999) and McConnell and Perez-Quiros (2000) noticed trending behaviors in variability and volatility breaks for U.S. short and long term interest rates and GDP series over some periods. As can be clearly seen from the development of our theory, the asymptotic theory for the extended model can be readily established in a similar manner. For expositional purposes, however, we will not explicitly deal with this model in the article.

Now we write

$$e_t^2 = \sigma_t^2 u_t^2 = f\left(\frac{t}{n}\right) + \sigma_t^2 v_t, \tag{4}$$

where

$$v_t = u_t^2 - 1.$$

The heteroskedasticity of the errors in regression (1) is given as a nonparametric regression function $f$ in (4), which can be estimated using the kernel estimator

$$\hat f_n(r) = \frac{1}{n h_n} \sum_{t=1}^{n} e_t^2 K\left(\frac{r - t/n}{h_n}\right), \tag{5}$$

proposed by Priestley and Chao (1972), where $K$ is the kernel function and $h_n$ is the bandwidth parameter. Other types of kernel estimators such as the one by Gasser and Müller (1979) can also be used. The reader is referred to, e.g., Härdle (1990) for more details.

¹This is just a prototypical model, which can be readily extended to the model with deterministic trends. All our subsequent theory applies to more general models with some obvious modifications.
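The kernel estimator in (5) is a weighted average of squared errors and translates directly into a few lines of array code. The sketch below is only an illustration: the function names and the Gaussian kernel are assumptions made here (the Gaussian density integrates to one, has zero first moment, a finite second moment, and a continuous bounded derivative), and the bandwidth is left to the caller.

```python
import numpy as np

def gaussian_kernel(s):
    """Standard normal density, used here as an illustrative kernel K."""
    return np.exp(-0.5 * s**2) / np.sqrt(2.0 * np.pi)

def priestley_chao_variance(e, r, h, kernel=gaussian_kernel):
    """Evaluate f_n(r) = (n h)^{-1} sum_t e_t^2 K((r - t/n) / h) at each point of r."""
    e = np.asarray(e, dtype=float)
    r = np.atleast_1d(np.asarray(r, dtype=float))
    n = e.size
    grid = np.arange(1, n + 1) / n                       # design points t/n
    weights = kernel((r[:, None] - grid[None, :]) / h)   # shape (len(r), n)
    return weights @ (e**2) / (n * h)

# Sanity check with constant squared errors: the interior estimate is close to 1.
n = 1000
flat = priestley_chao_variance(np.ones(n), r=[0.5], h=0.05)
```

Near the endpoints of the unit interval part of the kernel mass falls outside the sample, so estimates at $r$ close to 0 or 1 tend to be biased downward; this sketch applies no boundary correction.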

Define

$$w_t = (u_t, \Delta x_t')'. \tag{6}$$

We now introduce the assumptions for $(w_t)$ and $(v_t)$.

Assumption 1. Assume $(w_t)$ is zero mean, weakly stationary, and strong mixing with mixing coefficient $\alpha_m$ of size $-pq/(p - q)$ and $\sup_{t \ge 1} \|w_t\|_p < \infty$ for some $2 < q < p$.

Assumption 2. Assume $(v_t)$ is zero mean, weakly stationary, and strong mixing with mixing coefficient $\alpha_m$ of size $-3p/2(p - 2)$ and $\sup_{t \ge 1} \|v_t\|_p < \infty$ for some $p > 2$.

For our subsequent results, we also need the following assumptions on the heterogeneity generating function $f$, the kernel function $K$, and the bandwidth parameter $h_n$.

Assumption 3. We assume:

(a) $f$ is twice continuously differentiable and $f > 0$ on $[0, 1]$;
(b) $K$ has a continuous and bounded derivative, and satisfies the usual conditions $\int_{-\infty}^{\infty} K(s)\,ds = 1$, $\int_{-\infty}^{\infty} sK(s)\,ds = 0$ and $\int_{-\infty}^{\infty} s^2 K(s)\,ds < \infty$;
(c) $h_n = c n^{-k}$ with $0 < k < 1/4$, where $c$ is some constant.

Assumption 1 was employed previously by Hansen (1992) to develop his results on the weak convergence to stochastic integrals. The mixing coefficients in both Assumptions 1 and 2 satisfy

$$\sum_{m=1}^{\infty} \alpha_m^{1 - 2/p} < \infty, \tag{7}$$

the condition which has often been imposed in the nonstationary time series literature. See, for instance, Phillips (1987). An invariance principle holds under condition (7). If we define for $(w_t)$ and $(v_t)$

$$W_n(r) = \frac{1}{\sqrt{n}} \sum_{t=1}^{[nr]} w_t \quad \text{and} \quad V_n(r) = \frac{1}{\sqrt{n}} \sum_{t=1}^{[nr]} v_t, \tag{8}$$


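then $W_n \to_d BM(\Omega)$ and $V_n \to_d BM(\tau^2)$ under Assumptions 1 and 2, respectively. Whenever condition (7) holds, the autocovariance function of a weakly stationary process is absolutely summable. Therefore, if we define $\Gamma(k) = \mathbb{E}\, w_t w_{t-k}'$ and $c(k) = \mathbb{E}\, v_t v_{t-k}$, then the variances $\Omega$ and $\tau^2$ of the limit Brownian motions are given, respectively, by

$$\Omega = \sum_{k=-\infty}^{\infty} \Gamma(k) \quad \text{and} \quad \tau^2 = \sum_{k=-\infty}^{\infty} c(k)$$

under Assumptions 1 and 2.

Define

$$W = BM(\Omega) \quad \text{and} \quad \Lambda = \sum_{k=1}^{\infty} \Gamma(k). \tag{9}$$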

The following lemma will be used repeatedly throughout the article.

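Lemma 1. Suppose that $\psi : [0, 1] \to \mathbb{R}$ is differentiable with bounded derivative, and that $(w_t)$ satisfies Assumption 1. If we let $\psi_{nt} = \psi(t/n)$ and $z_t = \sum_{k=1}^{t} w_k$, then:

(a) $\dfrac{1}{n^2} \sum_{t=1}^{[nr]} \psi_{nt} z_t z_t' \to_d \int_0^r \psi W W'$,

(b) $\dfrac{1}{n} \sum_{t=1}^{[nr]} \psi_{nt} z_{t-1} w_t' \to_d \int_0^r \psi W\, dW' + \left(\int_0^r \psi\right) \Lambda'$,

as $n \to \infty$.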

Part (a) is a special case of Lemma 1 of Park (1992) and Part (b) generalizes the result in Hansen (1992) on the convergence to stochastic integral to deal with the sample product moments including deterministic sequences.

Now we consider the kernel estimator $\hat f_n$ of $f$ given in (5). We first define

$$c_K = \int_{-\infty}^{\infty} K(s)^2\, ds \quad \text{and} \quad d_K = \int_{-\infty}^{\infty} s^2 K(s)\, ds.$$

Then we have the following lemma.

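Lemma 2. Under Assumptions 2 and 3, we have:

(a) $\mathbb{E} \hat f_n(r) = f(r) + \dfrac{1}{2} d_K h_n^2 f''(r) + o(h_n^2) + O\left(\dfrac{1}{n h_n^2}\right)$,

(b) $n h_n \mathbb{E}\big(\hat f_n(r) - \mathbb{E} \hat f_n(r)\big)^2 = c_K \tau^2 f(r)^2 + o(1)$,

uniformly in $r \in (0, 1)$ as $n \to \infty$, where $\tau^2$ is the long-run variance of $(v_t)$.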


Part (a) is well known (see, for instance, Härdle, 1990) and included here for the sake of completeness. In Part (b), we extend the existing result to stationary mixing sequences with changing variance. Our result is identical to the iid case except that the variance of the kernel estimator is proportional to $f(r)^2$. In particular, the estimate $\hat f_n(r)$ of $f(r)$ is uniformly consistent in $r \in (0, 1)$. The optimal bandwidth contraction rate which balances off the variance and the squared bias is $n^{-1/5}$, as in the usual kernel density and regression function estimation. Furthermore, we have the following proposition.
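The $n^{-1/5}$ rate can be recovered from Lemma 2 by a short heuristic calculation. The sketch below treats the leading terms of Lemma 2 as exact, drops the $o(\cdot)$ and $O(1/(n h_n^2))$ remainders, assumes $f''(r) \neq 0$, and writes $\tau^2$ for the long-run variance of $(v_t)$; it is an informal illustration rather than part of the formal argument.

```latex
\begin{align*}
\mathrm{MSE}(h)
  &\approx \underbrace{\tfrac{1}{4}\, d_K^2\, f''(r)^2\, h^4}_{\text{squared bias, Lemma 2(a)}}
   \;+\; \underbrace{\frac{c_K\, \tau^2 f(r)^2}{n h}}_{\text{variance, Lemma 2(b)}}, \\[4pt]
\frac{d}{dh}\,\mathrm{MSE}(h)
  &= d_K^2\, f''(r)^2\, h^3 \;-\; \frac{c_K\, \tau^2 f(r)^2}{n h^2} \;=\; 0
  \quad\Longrightarrow\quad
  h^{\ast} = \left( \frac{c_K\, \tau^2 f(r)^2}{d_K^2\, f''(r)^2} \right)^{1/5} n^{-1/5}.
\end{align*}
```

The resulting exponent $k = 1/5$ lies inside the range $0 < k < 1/4$ permitted by Assumption 3(c), and with $h \propto n^{-1/5}$ the dropped term $O(1/(n h^2)) = O(n^{-3/5})$ is indeed of smaller order than the bias term $h^2 = O(n^{-2/5})$.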

Proposition 3. Under Assumptions 2 and 3, we have

$$\sqrt{n h_n}\,\big(\hat f_n(r) - \mathbb{E} \hat f_n(r)\big) \to_d \mathbb{N}\big(0,\, c_K \tau^2 f(r)^2\big)$$

as $n \to \infty$.

Therefore, the asymptotic normality also holds. Once again, the variance of the limiting distribution is proportional to $f(r)^2$.

3. STATISTICAL INFERENCE

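Let

$$\hat\beta = (X'X)^{-1} X'y \tag{10}$$

be the OLS estimate for $\beta$ in (1), where $X$ and $y$ are defined as usual from $(x_t)$ and $(y_t)$, respectively, for $t = 1, \ldots, n$. The presence of heteroskedasticity in the errors also motivates us to consider the nonparametric GLS estimator $\tilde\beta$ defined by

$$\tilde\beta = (\tilde X'\tilde X)^{-1} \tilde X'\tilde y, \tag{11}$$

where the $t$th rows, $\tilde x_t'$ and $\tilde y_t$, of $\tilde X$ and $\tilde y$ are given by

$$\tilde x_t = \frac{x_t}{\sigma_t} \quad \text{and} \quad \tilde y_t = \frac{y_t}{\sigma_t} \tag{12}$$

for $t = 1, \ldots, n$.

In practice, $(\sigma_t)$ are unobserved and must be estimated. We may consistently estimate $(\sigma_t)$ by

$$\tilde\sigma_{nt} = \sqrt{\tilde f_n\left(\frac{t}{n}\right)},$$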


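where $\tilde f_n$ is given by

$$\tilde f_n(r) = \frac{1}{n h_n} \sum_{t=1}^{n} \hat e_t^2 K\left(\frac{r - t/n}{h_n}\right). \tag{13}$$

The kernel estimate $\tilde f_n$ in (13) is identical to $\hat f_n$ defined in (5), if $(e_t)$ are replaced in the latter by the OLS residuals $(\hat e_t)$.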

Lemma 4. Under Assumptions 1 and 3, we have

$$\big\| \tilde f_n - \hat f_n \big\| = O_p(n^{-1} h_n^{-2})$$

as $n \to \infty$.

The use of the OLS residuals $(\hat e_t)$, in place of the true errors $(e_t)$, in the estimation of $f$ thus does not affect our asymptotic results.

We need to introduce an additional assumption to ensure that the use of estimated $\sigma_t$ does not change the asymptotics for the estimators and the test statistics considered in the article. We let $\mathcal{F}_t = \sigma\big((w_s)_{s=1}^{t}\big)$, and assume the following.

Assumption 4. $\mathbb{E}\big(w_t \mid \mathcal{F}_{t-1}, (v_s)_{s=t}^{n}\big) = \mathbb{E}\big(w_t \mid \mathcal{F}_{t-1}\big)$ for all $n$.

Assumption 4 requires that for each $n$, $(v_s)_{s=t}^{n}$ should not provide any additional information on the expectation of the innovation $w_t - \mathbb{E}(w_t \mid \mathcal{F}_{t-1})$ of $(w_t)$. The condition in Assumption 4 is not absolutely necessary, but greatly simplifies the proofs of our subsequent results. It does not appear to be overly restrictive. Recall that $(v_t)$ is generated from $(u_t)$ by $v_t = u_t^2 - 1$. Assumption 4 is satisfied for a broad class of models, including those generated by symmetrically distributed innovations.

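Lemma 5. Let Assumptions 1–4 hold. We have

$$\frac{1}{n^2} \sum_{t=1}^{n} \frac{1}{\tilde\sigma_{nt}^2}\, z_t z_t' \to_d \int_0^1 f^{-1} W W' \tag{14}$$

as $n \to \infty$. Moreover, it follows that

$$\frac{1}{n} \sum_{t=1}^{n} \sigma_t z_{t-1} w_t',\ \ \frac{1}{n} \sum_{t=1}^{n} \tilde\sigma_{nt} z_{t-1} w_t' \to_d \int_0^1 f^{1/2} W\, dW' + \left(\int_0^1 f^{1/2}\right) \Lambda' \tag{15}$$

$$\frac{1}{n} \sum_{t=1}^{n} \frac{\sigma_t}{\tilde\sigma_{nt}^2}\, z_{t-1} w_t',\ \ \frac{1}{n} \sum_{t=1}^{n} \frac{1}{\tilde\sigma_{nt}}\, z_{t-1} w_t' \to_d \int_0^1 f^{-1/2} W\, dW' + \left(\int_0^1 f^{-1/2}\right) \Lambda' \tag{16}$$

as $n \to \infty$.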


Given Assumptions 1–4, the results in Lemma 5 can readily be obtained from Lemma 1. They imply in particular that the use of the estimate $\tilde\sigma_{nt}$ in place of $\sigma_t$ is valid and does not affect our asymptotics. We may, therefore, treat $\sigma_t$ as known in the subsequent development of our theory.

To compare the asymptotic distributions of the OLS estimator $\hat\beta$ and the GLS estimator $\tilde\beta$ more effectively, we first look at the simplest case where the regressors $(x_t)$ are generated independently of the errors $(e_t)$ in regression (1). Such regressors will be called strictly exogenous in the article. As shown by Park and Phillips (1988) and Phillips and Park (1986), the limiting distributions of the least squares estimators for the standard cointegrating regressions are mixed normal in this case of strict exogeneity. To present the subsequent results, we partition the limit Brownian motion $W$ and its variance $\Omega$ accordingly as $(w_t)$ defined in (6) and denote them by $W = (W_1, W_2')'$ and $\Omega = \mathrm{diag}(\omega_{11}, \Omega_{22})$.

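Lemma 6. Assume strict exogeneity, and let Assumptions 1–4 hold. Then we have as $n \to \infty$

$$n(\hat\beta - \beta) \to_d \mathbb{MN}(0, \omega_{11} Q)$$

$$n(\tilde\beta - \beta) \to_d \mathbb{MN}(0, \omega_{11} P),$$

where $P = \left(\int_0^1 f^{-1} W_2 W_2'\right)^{-1}$ and $Q = \left(\int_0^1 W_2 W_2'\right)^{-1} \left(\int_0^1 f\, W_2 W_2'\right) \left(\int_0^1 W_2 W_2'\right)^{-1}$. Moreover, we have $P \le Q$ a.s.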

Lemma 6 implies that both the OLS and GLS estimators in (10) and (11) have mixed normal distributions, in case of strict exogeneity. Interestingly, the asymptotic equivalence of OLS and GLS in Phillips and Park (1986) does not hold here. They show that the GLS procedure which utilizes the serial correlation structure of the errors does not improve the asymptotic efficiency of the OLS estimator in cointegrating regressions. Consequently, the serial correlation in the errors can be ignored without losing efficiency. Our result here shows that their result applies only for the cointegrating regressions with serially correlated errors. When heterogeneity is present, GLS is unambiguously more efficient than OLS.
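The efficiency ranking in Lemma 6 can be illustrated with a small Monte Carlo experiment. The design below is an assumption made purely for illustration (a single random-walk regressor without intercept, iid Gaussian errors drawn independently of the regressor, a hypothetical variance function, and GLS computed with the true $\sigma_t$); it is not the simulation design reported in the article.

```python
import numpy as np

def mc_ols_vs_gls(n=200, reps=2000, beta=1.0, seed=0):
    """Compare n*(estimate - beta) for OLS and infeasible GLS under strict exogeneity."""
    rng = np.random.default_rng(seed)
    r = np.arange(1, n + 1) / n
    sigma = 0.2 + 5.0 * r                     # sigma_t = sqrt(f(t/n)), f(r) = (0.2 + 5r)^2
    # One random-walk regressor path per replication (rows).
    x = np.cumsum(rng.standard_normal((reps, n)), axis=1)
    # Heteroskedastic errors generated independently of x.
    e = sigma * rng.standard_normal((reps, n))
    y = beta * x + e
    # OLS slope for each replication.
    b_ols = (x * y).sum(axis=1) / (x**2).sum(axis=1)
    # GLS: OLS on data standardized by the true sigma_t, as in (12).
    xw, yw = x / sigma, y / sigma
    b_gls = (xw * yw).sum(axis=1) / (xw**2).sum(axis=1)
    return n * (b_ols - beta), n * (b_gls - beta)

ols_err, gls_err = mc_ols_vs_gls()
ratio = gls_err.var() / ols_err.var()
print(f"var(GLS) / var(OLS) = {ratio:.2f}")
```

A ratio below one is what the lemma leads one to expect; how far below depends on how strongly the assumed $f$ varies over the sample.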

The asymptotic Gaussianity in Lemma 6 does not hold for more general models. We now consider the efficient estimation of the regression (1), when there is cross correlation between the innovations of the regressors and regression errors. Here we extend the method of canonical cointegrating regressions (CCR) by Park (1992) to our regression model with nonparametric heterogeneity. The procedure requires transformations of $(y_t)$ and $(x_t)$ in (1), using the stationary components of the model. To define the required transformations more explicitly, we need to introduce some additional notation.


In addition to the limit Brownian motion $W$ and parameter matrix $\Lambda$ defined in (9), we let $\Sigma = \Gamma(0)$ so that $\Omega = \Sigma + \Lambda + \Lambda'$. We also define

$$\Delta = \Sigma + \Lambda.$$

It will be useful to partition $\Delta$ as $\Omega$ and denote by $\Delta_{ij}$, correspondingly as $\Omega_{ij}$ for $i, j = 1, 2$, the partitioned submatrices. If the partitioned submatrices are indeed scalars or vectors, then the lowercase letters $\omega_{ij}$ and $\delta_{ij}$ will be used. Finally, define

$$\Delta_2 = (\delta_{12}', \Delta_{22}'),$$

which will also be used to define the CCR transformation given below.

We now consider the regression

$$y_t^* = x_t^{*\prime} \beta + e_t^*, \tag{17}$$

where

$$y_t^* = y_t - \beta' \Delta_2 \Sigma^{-1} w_t - \sigma_t\, \omega_{12} \Omega_{22}^{-1} \Delta x_t$$

$$x_t^* = x_t - \Delta_2 \Sigma^{-1} w_t.$$

The CCR error $(e_t^*)$ is given by

$$e_t^* = \sigma_t \big(u_t - \omega_{12} \Omega_{22}^{-1} \Delta x_t\big),$$

which is asymptotically independent of the innovations of the regressor $(\Delta x_t^*)$. The GLS procedure in the transformed model (17), therefore, yields an efficient and optimal estimator in the sense of Phillips (1991) and Saikkonen (1991).
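The CCR transformation is plain linear algebra once the nuisance matrices are available. The sketch below assumes they are supplied by the caller (in practice they would be replaced by consistent estimates, together with estimates of $\beta$ and $u_t$); the function name `ccr_transform`, the argument names, and the array layout are illustrative choices, with the matrix written $\Delta_2$ stored as an $m \times (1+m)$ array and $\Omega_{22}$ taken to be symmetric.

```python
import numpy as np

def ccr_transform(y, x, w, sigma, beta, Sigma, Delta2, omega12, Omega22):
    """Apply the CCR transformation to (y_t, x_t).

    Shapes (illustrative convention): y (n,), x (n, m), w (n, 1+m) with rows
    w_t = (u_t, dx_t')', sigma (n,), beta (m,), Sigma (1+m, 1+m),
    Delta2 (m, 1+m), omega12 (m,), Omega22 (m, m).
    """
    dx = w[:, 1:]                                    # innovations of the regressors
    A = Delta2 @ np.linalg.inv(Sigma)                # A @ w_t = Delta2 Sigma^{-1} w_t
    correction = w @ A.T                             # row t holds (A w_t)'
    x_star = x - correction
    # omega12 Omega22^{-1} dx_t for every t (uses symmetry of Omega22).
    endog = dx @ np.linalg.solve(Omega22, omega12)
    y_star = y - correction @ beta - sigma * endog
    return y_star, x_star
```

By construction, `y_star - x_star @ beta` equals `sigma * (u - endog)`, which is the CCR error $e_t^*$ stated above; checking this identity numerically is a convenient way to confirm that the array conventions have been matched correctly.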

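Denote by $\tilde\beta^*$ the estimator from regression (17) with nonparametric GLS correction for heterogeneity. More explicitly, we define

$$\tilde\beta^* = (\tilde X^{*\prime} \tilde X^*)^{-1} \tilde X^{*\prime} \tilde y^*, \tag{18}$$

where the $t$th rows $\tilde x_t^{*\prime}$ and $\tilde y_t^*$ of $\tilde X^*$ and $\tilde y^*$ are given by

$$\tilde x_t^* = \frac{x_t^*}{\sigma_t} \quad \text{and} \quad \tilde y_t^* = \frac{y_t^*}{\sigma_t} \tag{19}$$

for $t = 1, \ldots, n$. For the sake of comparison, we let $\hat\beta^*$ be the OLS estimator from the transformed regression (17). In the article, we call $\hat\beta^*$ and $\tilde\beta^*$ the CCR-OLS and the CCR-GLS estimators, respectively.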

Both the CCR transformation and the GLS procedure introduced above involve unknown parameters, which should of course be estimated for practical applications. This causes no problem. We may easily get preliminary consistent estimates for the parameters, once we run the first-step feasible GLS regression with the estimated $\sigma_t$ and get the residual $(\hat u_t)$ and $\tilde\beta$. For the consistent estimation of $\Omega$, $\Delta$ and $\Sigma$, see Park (1992) and the references cited there.

In sum, the required procedure is as follows:

(a) Obtain the OLS residual $(\hat e_t)$ from regression (1);
(b) Estimate $\tilde\sigma_{nt} = \sqrt{\tilde f_n(t/n)}$;
(c) Get $(\hat u_t)$, where $\hat u_t = \hat e_t / \tilde\sigma_{nt}$;
(d) Compute the CCR transformation using $(\hat u_t)$;
(e) Apply OLS and GLS on regression (17) to get the CCR-OLS and CCR-GLS estimators.

Of course, the procedure can be iterated. In the article, we partially iterate the procedure for computational convenience. Firstly, we go through steps (a) and (b). Secondly, we obtain the GLS residual $(\tilde e_t)$ to replace the OLS residual $(\hat e_t)$ in step (a), and re-estimate $\sigma_t$ in step (b) using $(\tilde e_t)$.
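The partial iteration over steps (a) and (b) can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a Gaussian kernel, takes the bandwidth `h` from the caller, evaluates the kernel estimate only at the sample points $t/n$, and stops at the feasible GLS estimate of regression (1) without performing the CCR steps (d) and (e). The function names are hypothetical.

```python
import numpy as np

def kernel_variance(resid, h):
    """Kernel estimate of f(t/n), t = 1..n, from squared residuals (Gaussian kernel)."""
    n = resid.size
    grid = np.arange(1, n + 1) / n
    s = (grid[:, None] - grid[None, :]) / h
    K = np.exp(-0.5 * s**2) / np.sqrt(2.0 * np.pi)
    return K @ resid**2 / (n * h)

def feasible_gls(y, X, h, n_iter=1):
    """OLS, then GLS with kernel-estimated variances, re-estimated n_iter times."""
    # Step (a): start from the OLS estimate of regression (1).
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter + 1):
        # Residuals from the current estimate (OLS on the first pass, GLS afterwards).
        resid = y - X @ beta
        # Step (b): sigma_nt = sqrt(f_n(t/n)) from the current residuals.
        sigma = np.sqrt(kernel_variance(resid, h))
        # GLS as OLS on the standardized data.
        beta = np.linalg.lstsq(X / sigma[:, None], y / sigma, rcond=None)[0]
    return beta, sigma
```

With `n_iter=1` the variance function is estimated once from the OLS residuals and once more from the GLS residuals, which mirrors the partial iteration described above; larger values continue the same cycle.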

In what follows, we denote, respectively, by $\hat\beta^*$ and $\tilde\beta^*$ the CCR-OLS and the CCR-GLS estimators with the consistently estimated parameters. The following theorem can be deduced.

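Theorem 7. Under Assumptions 1–4, we have as $n \to \infty$

$$n(\hat\beta^* - \beta) \to_d \mathbb{MN}(0, \omega_*^2 Q)$$

$$n(\tilde\beta^* - \beta) \to_d \mathbb{MN}(0, \omega_*^2 P),$$

where $P = \left(\int_0^1 f^{-1} W_2 W_2'\right)^{-1}$ and $Q = \left(\int_0^1 W_2 W_2'\right)^{-1} \left(\int_0^1 f\, W_2 W_2'\right) \left(\int_0^1 W_2 W_2'\right)^{-1}$ are as defined in Lemma 6, and $\omega_*^2 = \omega_{11} - \omega_{12} \Omega_{22}^{-1} \omega_{21}$.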

Theorem 7 extends the results in Lemma 6 to general cointegrated models. Both the CCR-OLS estimator $\hat\beta^*$ and the CCR-GLS estimator $\tilde\beta^*$ are asymptotically mixed normal for general models. Of the two, $\tilde\beta^*$ is more efficient, since the limit variance of GLS is no larger than that of OLS a.s., as shown in Lemma 6. Moreover, the results in Theorem 7 imply that the inference on the regression coefficient $\beta$ can be done in the CCR, similarly as in the standard regression model, if a modified statistic is used. For the modification, we only need to replace the usual error variance estimate by a consistent estimate for the long-run variance $\omega_*^2$ of the CCR error. This is to allow for the serial correlation in the regression errors and the presence of the unit roots in the regressors.

4. SPECIFICATION TESTS

In this section, we discuss how the adequacy of our model specification in (1) and (2) can be tested. For this purpose, we first introduce the class of models that can be regarded as alternatives to our model. How to discriminate each of them against our model will then be described. In place of our model, we may consider regression (1) with

$$e_t = u_t.$$

The resulting model reduces, of course, to the usual cointegrating regression. We will refer to this model as CR-HOM (homoskedastic cointegrating regression), as opposed to our model which will be called CR-HET (heteroskedastic cointegrating regression) for comparison.

Both of CR-HOM and CR-HET are cointegrated models. They providesensible, authentic regressions. In this sense, they are in sharp contrastwith the regressions with nonstationary integrated errors, which yieldnonsensical and spurious results. Such regressions may be modeled as

et =t∑

s=1

us �

With (1), this gives us nothing but a general representation of the non-cointegrated models. The models will be designated as NON-CR (non-cointegrating regression) henceforth.

There is another class of models that are of some particular interest inour context here. They may be specified as

$$y_t = x_t'\beta_t + u_t, \qquad (20)$$

where $\beta_t$ is time-varying and given by $\beta_t = f(t/n)$, similarly to our specification of the time heterogeneity in (3). These models will subsequently be referred to as TVC-CR (cointegrating regression with time-varying coefficients). We may also consider the TVC-CR model with heteroskedastic errors. Adding time heterogeneity to the TVC-CR model, however, would not change any of our subsequent discussions on the model, at least qualitatively. We will thus assume homoskedastic errors whenever we discuss the TVC-CR model. The TVC-CR model is analyzed in Park and Hahn (1999).2

The TVC-CR model in (20) may be viewed as a non-cointegrated model with fixed coefficients. It is indeed easy to see that the TVC-CR model (20) can be rewritten as the fixed coefficients model (1) with

$$e_t = u_t + x_t'(\beta_t - \beta).$$

2 The regression (20) with stationary regressors and time heterogeneity is considered in Robinson (1989). See also Robinson (1991) for a more general time-varying nonlinear regression with stationary regressors.


The neglected change in the regression coefficient would thus result in the spuriousness of the underlying regression. In this regard, the TVC-CR model can be sharply discriminated from the CR-HET model proposed in the article: the former is essentially a non-cointegrated model, while the latter is a cointegrated model. In general, misspecification of the regression coefficient has much more devastating effects in cointegrating regressions than the presence of heterogeneity in the errors. In what follows, we will concentrate on the time changes in the volatility of cointegrating errors. The time changes in cointegrating vectors will not be discussed any further.

The models considered in this section are summarized in the following table.

Cointegrating Regressions    Spurious Regressions
CR-HOM                       NON-CR
CR-HET                       TVC-CR

For the cointegrating regressions, the model CR-HOM is nested in the more general CR-HET model. For the two models of the spurious regressions, no such relationship exists.

Tests of Cointegration

The tests by Park (1990) and Shin (1994) can be used to test the null of the presence of cointegration in the CR-HOM model. For the test developed by Park (1990), we simply include some additional regressors and see whether the usual Wald-type tests can detect the superfluousness of the added regressors. In the presence of cointegration, the redundancy is likely to be detected. When there is no cointegration, however, this is not so, since the underlying regression becomes spurious. The test will be referred to as the variable addition test (VAT). The test proposed by Shin (1994) tests whether the random walk component of the regression error has zero variance, in which case the regression error becomes stationary and the underlying regression supports a cointegrating relationship. Under the assumptions of strict exogeneity and Gaussianity, his test becomes the LM test, and will thus be referred to as such in subsequent discussions.

It is rather straightforward to extend the tests so that they are applicable to the CR-HET model. For the VAT approach, we may simply test whether � is significantly different from zero in the regression

$$y_t^* = \tilde s_t'\,� + x_t^{*\prime}\beta + u_t^*,$$


where $\tilde s_t = s_t/\sigma_t$, i.e., the GLS transform of the superfluously added regressors ($s_t$), and ($y_t^*$) and ($x_t^*$) are defined as in (19). In the article, we use time polynomials $t^k$, k = 1, 2, 3, as superfluous regressors, as suggested by Park (1990). To test for the significance of �, we may use the standard statistic with the usual variance estimate replaced by a consistent estimate of the long-run variance $\omega_*^2$ of the CCR error, as explained in the discussion of Theorem 7. For the LM test, we may look at the regression

$$y_t = x_t'\beta + u_t,$$

where ($y_t$) and ($x_t$) are given as in (12), to get the fitted residuals ($\hat u_t$), say. The LM test statistic can then be defined using ($\hat u_t$).
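The variable addition idea can be sketched as a Wald statistic on added time polynomials. This is a simplified illustration under stated assumptions: the inputs are taken as given (in the article they would be the CCR-transformed series), `sigma` is the estimated volatility path, `omega2` is a long-run variance estimate supplied by the caller, and no intercept is included. Rescaling the trend to t/n does not change the Wald statistic. The name `vat_statistic` is hypothetical.

```python
import numpy as np

def vat_statistic(y, x, sigma, omega2, degree=3):
    """Wald statistic for the added time polynomials after GLS weighting."""
    n = len(y)
    trend = (np.arange(1, n + 1) / n)[:, None] ** np.arange(1, degree + 1)
    Z = np.column_stack([trend, x]) / sigma[:, None]  # GLS-weighted regressors
    coef = np.linalg.lstsq(Z, y / sigma, rcond=None)[0]
    cov = omega2 * np.linalg.inv(Z.T @ Z)  # long-run variance replaces the usual one
    g = coef[:degree]  # coefficients on the superfluous regressors
    return float(g @ np.linalg.solve(cov[:degree, :degree], g))

rng = np.random.default_rng(1)
n = 200
x = np.cumsum(rng.standard_normal(n))
sigma = np.ones(n)
u = 0.1 * rng.standard_normal(n)
stat_null = vat_statistic(x + u, x, sigma, omega2=0.01)
stat_alt = vat_statistic(x + u + 5 * np.arange(1, n + 1) / n, x, sigma, omega2=0.01)
print(stat_null, stat_alt)
```

With three added terms, the statistic would be compared with the 5% chi-square critical value 7.81 reported in Table 1.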

Theorem 8. Let Assumptions 1–4 hold.

(a) Let the true model be given by CR-HET. Then VAT is asymptotically chi-square with degrees of freedom given by the number of added superfluous regressors. On the other hand,

$$\mathrm{LM} \to_d \int_0^1 M^2$$

as n → ∞, where

$$M(r) = W_1(r) - \left(\int_0^1 f^{-1/2}\,dW_1\,W_2'\right)\left(\int_0^1 f^{-1}W_2W_2'\right)^{-1}\int_0^r f^{-1/2}W_2.$$

(b) Let the true model be given by NON-CR or TVC-CR. Then we have

$$\mathrm{VAT},\ \mathrm{LM} \to_d \infty$$

as n → ∞.

Part (a) of Theorem 8 extends the results in Park (1990) and Shin (1994) to cointegrated models with time heterogeneity. The VAT for the CR-HET model has a chi-square limiting distribution, as for the CR-HOM model, if the GLS correction is made for heterogeneity. The limiting distribution of the LM statistic for the CR-HET model, however, is quite different from that for the CR-HOM model, even with the necessary correction for the heterogeneity. For the CR-HET model it depends in particular on the heterogeneity generating function f in a quite complex manner. There does not seem to exist any simple modification to make it independent of f. This would severely restrict the applicability of the LM test in the CR-HET model, since the critical values are dependent upon the unknown function f.3 For this reason, we will not use the LM test for the rest of our article.

Tests of Non-Cointegration

We may use the usual residual based tests (RBTs) of non-cointegration, such as the ADF test and the Phillips modified t-test, to discriminate the CR-HOM and CR-HET models against the NON-CR model. For the RBTs, the null hypothesis to be tested is the lack of cointegration, and thus the model NON-CR is maintained. It is well known that the tests are consistent, i.e., they reject the null of non-cointegration with probability one asymptotically when cointegration is present. Though the existing literature only considers the CR-HOM model, it is trivial to show that the tests are also consistent against the CR-HET model.

It is easy to see that the RBTs would discriminate the NON-CR model against both the CR-HOM model and the CR-HET model, perfectly and equally effectively in the limit. They may, however, have differing discriminatory powers against the CR-HOM and CR-HET models in finite samples. Indeed, it does not seem unreasonable to expect that the tests would have less discriminatory power against the CR-HET model than against the CR-HOM model, the former having more unstable regression errors. This is confirmed in our simulation, the details of which will be reported in a later section. If the true model is CR-HET, the RBTs are thus more likely to fail to reject the null hypothesis of non-cointegration. The CR-HET model would thus have a higher chance of being regarded as the NON-CR model, if the RBTs are used to test for non-cointegration.
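A residual based test can be sketched as a Dickey-Fuller t-ratio computed on the residuals of the levels regression. The sketch below is deliberately simplified and is an assumption rather than the procedure used in the article: it includes no lagged differences (so it is not the ADF test with a data-dependent lag choice), and the function names are hypothetical.

```python
import numpy as np

def df_tstat(e):
    """t-ratio of a in the regression  e_t - e_{t-1} = a * e_{t-1} + error."""
    e = np.asarray(e, dtype=float)
    de, lag = np.diff(e), e[:-1]
    a = (lag @ de) / (lag @ lag)
    resid = de - a * lag
    s2 = resid @ resid / (len(de) - 1)
    return a / np.sqrt(s2 / (lag @ lag))

def residual_based_test(y, x):
    """Fit y on a constant and x by OLS, then test the residuals for a unit root."""
    X = np.column_stack([np.ones(len(y)), x])
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return df_tstat(e)

rng = np.random.default_rng(2)
n = 300
x = np.cumsum(rng.standard_normal(n))
y = 1.0 + 2.0 * x + rng.standard_normal(n)  # cointegrated, homoskedastic errors
stat = residual_based_test(y, x)
print(stat)
```

A large negative value rejects the null of non-cointegration; a value above the critical value leaves the regression classified as NON-CR.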

5. EMPIRICAL APPLICATIONS

To demonstrate the empirical relevance of our model, we consider two economic long-run relationships: the money demand equation and the purchasing power parity condition. These are perhaps the most intensely investigated economic relationships within the framework of cointegration. Earlier work by Meltzer (1963) and Lucas (1988) on the historical U.S. money demand function has subsequently been reformulated and reinterpreted using the notion of cointegration by many authors. They include Johansen and Juselius (1990), Baba et al. (1992), Hafer and Jansen (1991), Miller (1991), and Stock and Watson (1993), among many others. Their empirical evidence is generally supportive of the presence of cointegration. It now seems to be widely agreed that there exists a stable long-run money demand function.

3 The critical values of the test may be simulated each time using estimated f. This, however, is not quite desirable, since the estimated f introduces additional variability in the computed critical values.

The application of the cointegration methodology to purchasing power parity has been much less successful. Fisher and Park (1991), Johansen and Juselius (1992), and Juselius (1995), among many others, study the parity condition within the framework of cointegration and provide some supportive evidence for cointegration. However, in most cases that they consider, the evidence is somewhat weak and does not apply to the strict form of the parity condition. It seems that purchasing power parity may well be supported as a long-run cointegrating relationship, but the relationship must be relatively weak.

For the money demand (MD) function, we consider the regression

$$m_t = �'c_t + \beta_1 y_t + \beta_2 r_t + e_t, \qquad (21)$$

where $m_t$, $y_t$, and $r_t$ signify the log real balance, log real income, and interest rates, respectively. An absolute version of the purchasing power parity (PPP) condition is specified as

$$p_t = �'c_t + \beta_1 x_t + \beta_2 p_t^* + e_t, \qquad (22)$$

where $x_t$ is the log of foreign currency price of a unit of foreign exchange, and $p_t$ and $p_t^*$ are the logs of prices, domestic and foreign, respectively. In (21) and (22), $c_t$ denotes the deterministic regressors such as constant and dummy variables to allow for a nonzero intercept and possible structural breaks, and $e_t$ is the regression error.

For both M1 and M2, the U.S. money demand function was fitted to the MD Eq. (21). To estimate the MD equation, we used monthly data on M1, M2, personal income, and 10-year treasury bond rates (M1SL, M2SL, PI, and GS10, respectively), and the CPI (CPIAUCNS) to deflate the nominal values, provided by the Federal Reserve Bank of St. Louis. The data span January 1960 to December 2004. Moreover, the purchasing power parity conditions for U.S.–U.K., Japan–Canada, Japan–U.S., Japan–U.K., and Canada–U.S. were examined based on the PPP Eq. (22). For this purpose, we used the monthly IFS data on CPI indexes and the exchange rates from January 1973 to December 2004. The empirical results are summarized in Tables 1 and 2. For the results reported here and elsewhere in this section, the ADF test was run with the number of lagged differences selected using the Modified Information Criteria (MIC) rule of Ng and Perron (2001). Moreover, the long-run variance for the Phillips t-test and the CCR and VAT procedures was estimated using the Parzen window with the automatic bandwidth selection method of Andrews (1991).

All the time series used in our empirical analysis are widely believed to be nonstationary and have unit roots. Indeed, our results for the unit root tests were strongly supportive of the presence of a unit root for the majority of the time series used here. This was true regardless of whether or not we maintained the existence of a drift term. There was only one exception. For the Japanese price, the Phillips t-test and the ADF test yielded contradictory results. The former rejected the unit root in the series, while the latter supported its presence rather strongly. To save space, however, we do not report further details.

TABLE 1 Tests of cointegration

                        ADF       Phillips   VAT-HOM   VAT-HET
MD    M1               −2.275    −2.254       9.111     5.003
      M2               −3.116    −3.301       9.397     5.611
PPP   U.K.–U.S.        −3.959    −2.327      14.420     7.184
      Japan–U.S.       −4.587    −4.946      22.216     6.205
      Japan–Canada     −3.122    −5.230      16.610     4.918
      Japan–U.K.       −5.226    −5.160      24.180     6.305
      Canada–U.S.      −1.478    −1.706       9.843     6.197
Critical values (5%): −3.77 (ADF and Phillips), 7.81 (VAT-HOM and VAT-HET)

For the MD Eq. (21), we consider intercept dummies for two possible breaks: one in July 1985, and the other in February 1994, a period known to have experienced a dramatic change of monetary policy and the deregulation of financial institutions in the U.S.4 The CCR-GLS based F-values for the break dummies are highly significant. For the PPP Eq. (22), the break points were slightly different across the countries considered.5

TABLE 2 Estimated cointegrating regressions (t-statistics in parentheses)

                        β1                  β2
MD    M1                0.238 (5.17)       −0.020 (−4.62)
      M2                1.023 (30.1)       −0.020 (−7.68)
PPP   U.K.–U.S.        −0.065 (−1.49)       1.380 (51.33)
      Japan–U.S.       −0.035 (−1.99)       0.508 (16.12)
      Japan–Canada     −0.043 (−2.86)       0.523 (20.86)
      Japan–U.K.       −0.042 (−2.43)       0.413 (22.89)
      Canada–U.S.       0.014 (0.22)        1.053 (40.73)

4 The Federal Reserve resumed intervention in 1985, after a long break, following the Plaza Meeting, when the G5 agreed to intervene more heavily to depreciate the U.S. dollar in relation to the Japanese yen and German Deutsche Mark.

5 The second oil crisis break was considered in the PPP equations of Canada–U.S. and Japan–U.S., and the Asian financial crisis break was included in four cases except U.K.–U.S., since it turned out to be insignificant for U.K.–U.S. For U.K.–U.S., a break point around the Gulf War period was considered.


Two major break points, the second oil crisis in 1979 and the Asian financial crisis in 1998 or 1999, were generally considered in the PPP equations. Their CCR-GLS based F-values are highly significant in all five cases.

Table 1 reports the actual values computed for various cointegration tests. As we can see from Table 1, the RBTs show little evidence of cointegration for both the MD Eq. (21) and the PPP Eq. (22). In particular, the null of no cointegration cannot be rejected in any of the cases for the MD relationship and in some of the PPP relationships, if based on the ADF or Phillips t-test. The critical value −3.77, adapted from Phillips and Ouliaris (1990), is for the simple cointegrating model that includes only the constant term and purely stochastic regressors. Strictly, it is not directly applicable to our models. Our models include dummy variables and have regressors that could possibly have deterministic time trends. The critical values for such regressions are known to be much smaller. With the correct critical values, the RBTs would therefore find it even harder to reject the null of no cointegration.

The results from the VATs are quite different and relatively more supportive of the presence of cointegration, especially when the time heterogeneity is accounted for. For the VATs, we use three time polynomial terms t, t^2, and t^3 as superfluous regressors. The limiting null distributions of the test statistics are, therefore, chi-square with three degrees of freedom. The VATs are employed to test for both the CR-HOM and CR-HET models.6 If the VAT is applied to the CR-HOM model, in every case it rejects the specification of the MD equation and the PPP condition as long-run cointegrating relationships, which is similar to the results based on the ADF or Phillips t-test. However, such negative evidence all disappears if we allow for the time heterogeneity and base the VAT on the CR-HET model. In all cases, the test values are below the 5% critical value of 7.81. The negative results from the VAT-HOM may well be due to the neglected heterogeneity, rather than the lack of cointegration. The critical values for the VATs are not affected, at least asymptotically, by the inclusion of dummy variables and/or the existence of deterministic time trends in the regressors.7

Figures 1 and 2 present the estimated time heterogeneity for the MD and PPP equations in (21) and (22). The function f, which is introduced in (3) and generates the time heterogeneity in the regression errors, was estimated by the nonparametric kernel method. The 95% confidence interval was obtained from our result in Proposition 3. The bandwidth parameter was set to $h_n = cn^{-1/5}$ with the constant c chosen by the usual cross-validation method [see, e.g., Härdle and Vieu, 1992]. For both equations, the presence of time heterogeneity appears to be evident.

6 The VATs based on the CR-HOM and CR-HET models are referred to, respectively, as the VAT-HOM and VAT-HET.

7 The inclusion of dummy variables, however, may induce important finite sample size distortions.


FIGURE 1 Estimated time heterogeneity: Money demand equation.

FIGURE 2 Estimated time heterogeneity: Purchasing power parity.

There may be various economic reasons for the presence of time heterogeneity in the investigated MD function and PPP condition. For the U.S. MD function, the time heterogeneity appears to have an increasing trend with a peak occurring roughly in the first half of the 1990s. As deregulation of financial institutions progressed and new financial commodities such as NOW accounts were introduced, the traditional distinction of money became somewhat vague, and this change in the economic environment may be reflected as a changing variance of the cointegration error. The time heterogeneity for the PPP model fluctuates considerably in all cases, especially in the 1970s and 1980s. However, it has an obvious decreasing trend. Except for the period of the Asian financial crisis in the late 1990s, the PPP conditions for U.S.–Japan, Japan–Canada, Japan–U.K., and Canada–U.S. all seem to have tightened recently. This may be regarded as a natural consequence of various worldwide and bilateral trade agreements concluded in the 1980s and 1990s, which have provided a freer environment for world trade and capital movement.

The estimates of the parameters and their t-statistics (in parentheses) for the MD Eq. (21) and the PPP Eq. (22) are summarized in Table 2. They are obtained from the CCR-GLS methodology introduced in Section 3. The estimated coefficients are generally significant and sensible. This may well be taken as additional supportive evidence for our specification of the MD function and the PPP condition as the cointegrating regression with time heterogeneity.

6. MONTE CARLO SIMULATION

A set of simulations was performed to see the finite sample effects of the presence of time heterogeneity on statistical inference for cointegrating regressions. For cointegrating regressions with errors having time heterogeneity, our theory predicts that the heteroskedasticity-corrected GLS procedure is more efficient than the usual OLS procedure. Also, we may reasonably expect that it is harder to detect the presence of cointegration for such regressions than for regressions with homoskedastic errors. Time heterogeneity is therefore likely to have an adverse effect on the test of cointegration.

The simulation is based on the model specified in (1) and (2). We let $w_t = (u_t, \Delta x_t)'$ be generated by

$$w_t = Aw_{t-1} + \varepsilon_t + B\varepsilon_{t-1},$$

where ($\varepsilon_t$) are iid $\mathbb{N}(0, \Sigma)$, and A and B are parameter matrices which are further specified as

$$A = \begin{pmatrix} 0.6 & 0.6 \\ -0.3 & -0.4 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0.2 & 0 \\ 0 & 0.2 \end{pmatrix}.$$

The time series ($w_t$) is, therefore, generated as an invertible Gaussian vector ARMA process. Note that the invertibility condition is satisfied and ($w_t$) is stationary. The covariance matrix of ($\varepsilon_t$) is given by

$$\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$

with the values of ρ = −0.6, −0.3, 0.3, 0.6.

The time heterogeneity $\sigma_t^2$ is specified using a quadratic function f. To calibrate the design of our simulation to a more economically relevant example, we set the parameter values of the function f to be close to what we have for the empirical models discussed in the previous section. The time heterogeneities we obtained in our applications differ widely across models, i.e., the estimated variances have maximum/minimum ratios that vary approximately from 15 to 160, with the minimum values ranging from 0.00009 to 0.01. Therefore, we consider the following two specifications of f for the model with mild time heterogeneity (MHT) and the model with severe time heterogeneity (SHT):

$$\text{MHT}: \quad \sigma_t^2 = 0.0005 + 0.0075\left(\frac{t}{n}\right)^2$$

$$\text{SHT}: \quad \sigma_t^2 = 0.0001 + 0.02\left(\frac{t}{n}\right)^2$$

Note that the maximum/minimum ratios of the mild and severe heterogeneity models are set to be approximately 15 and 200, respectively. For the homoskedastic model, we simply set $\sigma_t^2 = 1$ for all t = 1, ..., n.
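The design above can be sketched as a small data generator. Several choices below are assumptions not stated in this section: the error is formed as $e_t = \sigma_t u_t$, the regressor is scalar, the coefficient value `beta=1.0` and the burn-in length are arbitrary, and the function names are hypothetical.

```python
import numpy as np

A = np.array([[0.6, 0.6], [-0.3, -0.4]])
B = np.array([[0.2, 0.0], [0.0, 0.2]])

def variance_path(n, design):
    """Variance sigma_t^2 for t = 1, ..., n under the HOM, MHT, or SHT design."""
    r = np.arange(1, n + 1) / n
    if design == "MHT":
        return 0.0005 + 0.0075 * r**2
    if design == "SHT":
        return 0.0001 + 0.02 * r**2
    return np.ones(n)  # homoskedastic case

def simulate(n, rho, design, beta=1.0, rng=None, burn=100):
    """Draw (y, x) with w_t = (u_t, dx_t)' following the vector ARMA(1,1)."""
    rng = np.random.default_rng() if rng is None else rng
    Sigma = np.array([[1.0, rho], [rho, 1.0]])
    total = n + burn + 1
    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=total)
    w = np.zeros((total, 2))
    for t in range(1, total):
        w[t] = A @ w[t - 1] + eps[t] + B @ eps[t - 1]
    u, dx = w[-n:, 0], w[-n:, 1]  # drop the burn-in observations
    x = np.cumsum(dx)
    y = beta * x + np.sqrt(variance_path(n, design)) * u
    return y, x

for design in ("MHT", "SHT"):
    v = variance_path(1000, design)
    print(design, v.max() / v.min())
y, x = simulate(300, rho=-0.6, design="SHT", rng=np.random.default_rng(3))
```

The printed ratios are close to 16 and 201, since the maximum is attained at t = n; the slope-to-intercept ratios of the two quadratics are 15 and 200.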


TABLE 3 Type I errors of VATs

                     VAT-HOM                   VAT-HET
   ρ          HOM     MHT     SHT       HOM     MHT     SHT
n = 100
  −0.6        5.02   10.25   14.12      3.18    2.02    2.40
  −0.3        4.89   10.54   13.98      3.27    2.60    3.26
   0.3        4.82   11.06   13.71      3.40    2.71    3.06
   0.6        4.95   10.22   13.83      3.41    2.55    2.95
n = 300
  −0.6        4.50   11.53   14.20      4.34    4.59    5.57
  −0.3        5.12   11.54   13.95      4.11    4.87    5.43
   0.3        4.89   11.94   13.91      4.07    4.63    5.33
   0.6        4.67   11.57   13.62      4.67    4.40    5.03

We first investigate the finite sample performance of the VATs. Both the VAT-HOM and VAT-HET are considered for the CR-HOM and CR-HET models. For the CR-HET model, only the VAT-HET is valid. The VAT-HOM is based on the incorrect critical values and becomes invalid. Both the VAT-HOM and VAT-HET are valid for the CR-HOM model.8 For the application of the VATs in the article, we follow Park (1990) and use three time polynomial terms, t, t^2, and t^3, as the superfluous regressors.

Table 3 reports the type I errors of the 5% VATs for our simulation model introduced above. Samples of sizes 100 and 300 are drawn 10,000 times for the simulation. Here we look at the type I errors, i.e., the probabilities of falsely rejecting the presence of cointegration. They are the probabilities that we are mainly concerned with, since we are primarily interested in the possibility that cointegrating relationships are falsely rejected because of heteroskedasticity in the errors. As mentioned above, the time heterogeneity generally invalidates the tests if not properly accounted for. Indeed, as is clearly seen from Table 3, the VAT-HOM has large size distortions when the time heterogeneity is present: its rejection rates are roughly 10–14% at the nominal 5% level. We are thus led to over-reject the null hypothesis of cointegration when it is true. When we allow for the time heterogeneity and the VAT-HET is used, the size distortions are substantially reduced. For the small samples (n = 100), the VAT-HET under-rejects the null hypothesis, with rejection rates of about 2–3%. However, the under-rejection quickly disappears as the sample size increases. When there is no time heterogeneity, VAT-HOM performs better as expected, but the size distortion of VAT-HET seems very mild.

We also consider the effect of the time heterogeneity on the RBTs such as the ADF test and the Phillips t-test. They test the null hypothesis of no cointegration, so we are primarily interested in the type II errors. These are the probabilities of falsely accepting the null of no cointegration when the true model is given by the cointegrating regression with heteroskedastic errors. The finite sample type II errors for the 5% tests are presented in Table 4. To compute the Phillips t-test, we estimate the long-run variance using the Parzen window with the automatic bandwidth selection method of Andrews (1991). For the ADF test, we use the MIC rule of Ng and Perron (2001) to select the autoregressive orders. As for the VATs, samples of sizes 100 and 300 are drawn and simulated 10,000 times.

8 Though both of the tests are valid in this case, it would of course be more appropriate to use the VAT-HOM. The VAT-HET introduces additional sampling variations due to the estimation of time heterogeneity, which is unnecessary for the CR-HOM model.

TABLE 4 Type II errors of RBTs

                       ADF                      Phillips
   ρ          HOM     MHT     SHT       HOM     MHT     SHT
n = 100
  −0.6       21.93   47.11   50.19      1.55    7.99    9.01
  −0.3       22.18   47.83   51.44      1.42    7.44    9.56
   0.3       22.95   46.92   50.68      1.83    7.43    9.53
   0.6       22.70   46.99   51.35      1.54    7.76    9.62
n = 300
  −0.6        7.70   14.06   16.52      0.000   0.001   0.001
  −0.3        7.87   14.03   16.71      0.000   0.001   0.001
   0.3        7.59   14.80   16.10      0.000   0.001   0.001
   0.6        7.67   14.23   16.66      0.000   0.001   0.001

As seen from Table 4, the adverse effect of time heterogeneity on testing for cointegration can be severe. The type II errors can be substantially bigger for the CR-HET model than for the CR-HOM model, especially when the sample size is small. The model we use for the homoskedastic case represents a well formulated cointegrating regression, and we may, therefore, expect that the type II errors of the RBTs are small in this case. Relatively, the ADF test has much bigger type II errors than the Phillips t-test. The ADF test has type II errors as big as 50% for samples of size 100. They decrease as the sample size increases, but remain substantial unless the sample size gets fairly large. In contrast, the Phillips t-test has very small type II errors. It has almost negligible type II errors except for the case of very small samples.9 Overall, it appears that the Phillips t-test has much smaller type II errors than the ADF test.10

9 The type II errors of the Phillips t-test increase as the model approaches a non-cointegrated system and the number of regressors increases, and become nonnegligible. We did more extensive simulations along this line, though we do not report the details here to save space.

10 This, of course, does not imply that the Phillips t-test has more discriminatory power than the ADF test. As is well known, the former generally has more severe size distortions than the latter. Here our purpose is to demonstrate the effect of the presence of time heterogeneity on these tests, and not to compare their relative performances.


TABLE 5 Mean bias and RMSE of OLS, CCR-OLS, and CCR-GLS

                      Mean bias                         RMSE
   ρ        OLS      CCR-OLS   CCR-GLS       OLS      CCR-OLS   CCR-GLS
HOM
  −0.6    0.00085   0.00021   0.00020      0.00145   0.00111   0.00113
  −0.3    0.00085   0.00020   0.00020      0.00143   0.00110   0.00112
   0.3    0.00086   0.00021   0.00021      0.00145   0.00111   0.00113
   0.6    0.00085   0.00020   0.00020      0.00145   0.00110   0.00112
MHT
  −0.6    0.00132   0.00028   0.00030      0.00242   0.00186   0.00158
  −0.3    0.00130   0.00030   0.00031      0.00242   0.00187   0.00155
   0.3    0.00131   0.00028   0.00029      0.00241   0.00185   0.00154
   0.6    0.00131   0.00030   0.00030      0.00240   0.00184   0.00154
SHT
  −0.6    0.00181   0.00031   0.00033      0.00351   0.00269   0.00171
  −0.3    0.00180   0.00031   0.00034      0.00352   0.00274   0.00175
   0.3    0.00184   0.00032   0.00033      0.00354   0.00272   0.00172
   0.6    0.00183   0.00032   0.00032      0.00352   0.00268   0.00170

Finally, we investigate the relative efficiency of the heteroskedasticity-corrected procedure. For this purpose we look at the estimates for $\beta$ obtained by OLS, CCR-OLS, and CCR-GLS, which are introduced in Section 3 above.11 The nuisance parameters for the CCR transform are estimated using the Parzen window with the data dependent bandwidth choice suggested by Andrews (1991). The number of iterations is 10,000. To save space, we only report the results for n = 300 in Table 5 and Figs. 3 and 4. The results for n = 100 are very similar qualitatively.

11 The comparison between the OLS and CCR estimators is not of main interest here. We report the results for the OLS estimator to compare the improvement made by CCR-OLS relative to OLS with that made by CCR-GLS relative to CCR-OLS.

FIGURE 3 Estimated densities under mild time heterogeneity.

For the model used in our simulation, the regressor is not strictly exogenous. Hence, the OLS estimator of $\beta$ is asymptotically biased and inefficient. This is clearly shown in Table 5 and Figs. 3 and 4. The bias of the OLS estimator is apparent in all cases. The bias is shown to be corrected by the CCR methodology. As shown in Theorem 7, the CCR-GLS estimator $\tilde\beta^*$ is more efficient than the CCR-OLS estimator $\hat\beta^*$ in terms of the RMSE. The relative efficiency of the CCR-GLS estimator over the CCR-OLS estimator becomes more apparent as the time heterogeneity becomes more severe. For our model with severe heterogeneity, the efficiency gain for the CCR-GLS estimator can be larger than 30% relative to the CCR-OLS estimator. In all cases, the mean biases of the CCR-GLS estimator are largely comparable to those of the CCR-OLS estimator. When there is no time heterogeneity, the RMSE of the CCR-GLS estimator is slightly bigger in some cases, and this is due to the additional noise from the nonparametric estimation of the function f. The effect of the additional noise, however, does not seem to be significant.
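The size of the efficiency gain can be checked directly from the RMSE columns of Table 5. The short calculation below uses the ρ = −0.6 rows; the dictionary layout and the function name `rmse_gain` are only illustrative.

```python
# RMSE of (CCR-OLS, CCR-GLS) from Table 5, rows with rho = -0.6
rmse = {
    "HOM": (0.00111, 0.00113),
    "MHT": (0.00186, 0.00158),
    "SHT": (0.00269, 0.00171),
}

def rmse_gain(ccr_ols, ccr_gls):
    """Percentage reduction in RMSE from using CCR-GLS instead of CCR-OLS."""
    return 100 * (1 - ccr_gls / ccr_ols)

gains = {design: rmse_gain(*pair) for design, pair in rmse.items()}
for design, gain in gains.items():
    print(f"{design}: {gain:.1f}%")
```

The reduction is about 15% under mild heterogeneity and about 36% under severe heterogeneity, while under homoskedasticity CCR-GLS is roughly 2% worse, in line with the discussion above.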

7. CONCLUDING REMARKS

We consider in the article the cointegrating regression model with time heterogeneity. The error variance is allowed to change smoothly over time to describe an economic relationship, the tightness of which is believed to vary. An efficient method of estimating such a model is proposed and the relevant asymptotic theory is developed. The specification tests which can be used to check the adequacy of our model are also suggested. We show by the Monte Carlo simulation method that our methodology effectively analyzes such a model in finite samples. Empirical applications included in the article demonstrate the practical usefulness of our model and methodology.

In our model, the error variance is specified as a function of time. Of course, time is used here simply as a proxy for many unknown factors which have jointly affected the tightness of a given economic relationship. Our theory developed in the article can be extended to a wide class of cointegrated models with more general forms of time heterogeneity, including those driven by nondeterministic processes possibly with multiple breaks. We leave them for future research.

FIGURE 4 Estimated densities under severe time heterogeneity.

MATHEMATICAL APPENDIX

Proof of Lemma 1. Part (a) is a special case of Lemma 1 of Park (1992),and the proof is, therefore, omitted. To show Part (b), we let �t =�((ws)

ts=1) and �t(·) = �(· |�t). Define

wt =∞∑k=0

(�twt+k − �t−1wt+k),

�t =∞∑k=1

�twt+k �

It follows that

wt = wt + (�t−1 − �t) and �t−1wt = 0,


and, therefore, in particular, (wt) is a martingale difference sequence. Moreover, as shown in Hansen (1992), we have

1√nmax1≤t≤n

|�t | →p 0 (23)

as n → ∞.In what follows, we suppress n in �nt and simply write �t for notational

brevity. Define a process Qn by

Qn(r ) = 1√n�[nr ]z[nr ]

with Q = �W . Also, let Wn and W n be defined, respectively, from (wt) and(wt) as in (8). Then

1n

[nr ]∑t=1

�t zt−1w ′t =

∫ r

0QndWn =

∫ r

0QndW n + n(r ),

where

n(r ) = 1n

[nr ]∑t=1

(�t zt − �t−1zt−1)�′t − 1

n�[nr ]z[nr ]�′

[nr ]+1

for r ∈ [0, 1].Clearly, (Qn ,Wn) →d (Q ,W ) in D[0, 1]2� as n → ∞, where � = m + 1.

Since

‖Wn − W n‖ ≤ 2√nmax1≤t≤n

|�t | →p 0

by (23), we also have (Qn ,W n) →d (Q ,W ). Consequently, as Kurtz andProtter (1991) show, ∫ r

0QndW n →d

∫ r

0QdW

as n → ∞.Next we consider n(r ). First, notice that

max1≤t≤n

∣∣∣∣ 1n �t zt�′t+1

∣∣∣∣ ≤ ‖�‖‖Wn‖ 1√nmax1≤t≤n

|�t | →p 0,


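due in particular to (23). Now we write

$$\frac{1}{n} \sum_{t=1}^{[nr]} (\sigma_t z_t - \sigma_{t-1} z_{t-1}) \xi_t' = \frac{1}{n} \sum_{t=1}^{[nr]} (\sigma_t - \sigma_{t-1}) z_t \xi_t' + \frac{1}{n} \sum_{t=1}^{[nr]} \sigma_{t-1} w_t \xi_t' = R_n(r) + T_n(r).$$

Let $\dot\sigma$ be the derivative of $\sigma$. It can be easily shown that

$$\|R_n\| \le \|\dot\sigma\| \left| \frac{1}{n^2} \sum_{t=1}^{[nr]} z_t \xi_t' \right| \le \|\dot\sigma\| \left( \frac{1}{n^{3/2}} \sum_{t=1}^{[nr]} |z_t| \right) \left( \frac{1}{\sqrt{n}} \max_{1 \le t \le n} |\xi_t| \right) = o_p(1),$$

due to (23).

If we define $m_t = w_t \xi_t' - \mathbf{E} w_t \xi_t'$, then $(m_t)$ is an $L^{q/2}$-mixingale and

$$\max_{1 \le t \le n} \left| \frac{1}{n} \sum_{s=1}^{t} \sigma_{s-1} m_s \right| \to_p 0 \tag{24}$$

as $n \to \infty$, due to Theorems 3.2 and 3.3 of Hansen (1992). We have from (24)

$$T_n(r) = \frac{1}{n} \sum_{t=1}^{[nr]} \sigma_t \mathbf{E}(w_t \xi_t') + o_p(1) = \frac{1}{n} \sum_{t=1}^{[nr]} \sigma_t \left( \sum_{k=1}^{\infty} \mathbf{E}(w_t w_{t+k}') \right) + o_p(1) = \left( \int_0^r \sigma \right) \Lambda' + o_p(1)$$

uniformly in $r \in [0, 1]$, as was to be shown. □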
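As a concrete illustration of the martingale decomposition used in the proof of Lemma 1, consider an MA(1) sequence. This worked example is not part of the original proof. The innovation $\varepsilon_t$ (assumed i.i.d. with mean zero and finite variance), the coefficient $\theta$, and the labels $\tilde w_t$ for the martingale difference part and $\xi_t$ for the remainder are notation assumed here, and the conditional expectations $\mathbf{E}_t$ are taken with respect to the past of $(\varepsilon_t)$ up to time $t$.

```latex
\begin{align*}
w_t &= \varepsilon_t + \theta \varepsilon_{t-1}, \\
\mathbf{E}_t w_{t+1} &= \theta \varepsilon_t, \qquad \mathbf{E}_t w_{t+k} = 0 \quad (k \ge 2), \\
\xi_t &= \sum_{k=1}^{\infty} \mathbf{E}_t w_{t+k} = \theta \varepsilon_t, \\
\tilde w_t &= \sum_{k=0}^{\infty} \left( \mathbf{E}_t w_{t+k} - \mathbf{E}_{t-1} w_{t+k} \right)
  = (w_t - \theta \varepsilon_{t-1}) + \theta \varepsilon_t = (1 + \theta) \varepsilon_t, \\
\tilde w_t + (\xi_{t-1} - \xi_t) &= (1 + \theta) \varepsilon_t + \theta \varepsilon_{t-1} - \theta \varepsilon_t = w_t .
\end{align*}
```

In this special case $(\tilde w_t)$ is a martingale difference sequence by construction, and $n^{-1/2} \max_{1 \le t \le n} |\xi_t| \to_p 0$ follows from the finite variance of $\varepsilon_t$, which is what (23) requires.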

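Proof of Lemma 2. Part (a) is well known. To show Part (b), we write

$$n h_n \mathbf{E}\left( f_n(r) - \mathbf{E} f_n(r) \right)^2 = \frac{1}{n h_n} \sum_{|k| < n} \sum_t f\left( \frac{t}{n} \right) f\left( \frac{t - k}{n} \right) K\left( \frac{r - t/n}{h_n} \right) K\left( \frac{r - (t - k)/n}{h_n} \right) c(k)$$

$$= \left( \sum_{|k| < n} c(k) \right) \frac{1}{n h_n} \sum_t f\left( \frac{t}{n} \right)^2 K\left( \frac{r - t/n}{h_n} \right)^2 + R_n(r)$$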


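with

$$\|R_n\| \le \left( \frac{1}{n} \sum_{|k| < n} |k| |c(k)| \right) \|f\| \|\dot f\| \|K\| \left\| \frac{1}{n h_n} \sum_{t=1}^{n} \left| K\left( \frac{\cdot - t/n}{h_n} \right) \right| \right\| + \left( \frac{1}{n h_n} \sum_{|k| < n} |k| |c(k)| \right) \|\dot K\| \|f\|^2 \left\| \frac{1}{n h_n} \sum_{t=1}^{n} \left| K\left( \frac{\cdot - t/n}{h_n} \right) \right| \right\|,$$

where $\dot f$ and $\dot K$ are the derivatives respectively of $f$ and $K$. By a strong mixing inequality (see Corollary A.2 of Hall and Heyde, 1980), we have

$$|c(k)| = |\mathbf{E} v_t v_{t-k}| \le \left( \sup_{t \ge 1} \|v_t\|_p \right) \alpha_k^{1 - 2/p},$$

from which it follows that $\sum_k |c(k)| < \infty$. Consequently,

$$\frac{1}{n} \sum_{|k| < n} |k| |c(k)| \to 0$$

by the Kronecker lemma, and we have $\|R_n\| \to 0$. The stated result now follows easily. □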
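The kernel smoother analyzed in Lemma 2 can be sketched numerically as follows. This is a minimal illustration, assuming the estimator has the form $f_n(r) = (n h_n)^{-1} \sum_t e_t^2 K((r - t/n)/h_n)$ that appears in the proofs of this appendix. The Epanechnikov kernel and the function names `epanechnikov` and `kernel_variance` are assumptions made here for illustration and are not taken from the article.

```python
def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1]; an illustrative choice of K (assumption)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_variance(e2, r, h, kernel=epanechnikov):
    """Return (1 / (n h)) * sum_t e2[t-1] * K((r - t/n) / h) for t = 1, ..., n.

    e2 holds the squared errors e_t^2, r is the point in [0, 1] at which the
    variance function is evaluated, and h is the bandwidth h_n.
    """
    n = len(e2)
    total = 0.0
    for t, value in enumerate(e2, start=1):
        total += value * kernel((r - t / n) / h)
    return total / (n * h)


if __name__ == "__main__":
    n = 1000
    # Noise-free squared errors with variance profile f(r) = 1 + r.
    e2 = [1.0 + t / n for t in range(1, n + 1)]
    for r in (0.25, 0.5, 0.75):
        print(r, kernel_variance(e2, r, h=0.1))
```

No boundary correction is applied in this sketch, so values of `r` within one bandwidth of 0 or 1 are biased downward because part of the kernel mass falls outside the sample.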

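Proof of Proposition 3. By convention, we let $f$ be defined on the entire $\mathbb{R}$ by setting $f(r) = 0$ outside the unit interval. Likewise, we assume that the empirical process $V_n$ is defined on $\mathbb{R}$ by letting $V_n(r) = V_n(1)$ for all $r \ge 1$ and $V_n(r) = 0$ for $r \le 0$. We now write

$$\sqrt{n h_n} \left( f_n(r) - \mathbf{E} f_n(r) \right) = \frac{1}{\sqrt{n h_n}} \sum_{t=1}^{n} f\left( \frac{t}{n} \right) K\left( \frac{r - t/n}{h_n} \right) v_t$$

$$= \frac{1}{\sqrt{h_n}} \int_{-\infty}^{\infty} f(s) K\left( \frac{r - s}{h_n} \right) dV_n(s)$$

$$= \int_{-\infty}^{\infty} f(r - h_n s) K(s) \, d\bar V_n(s)$$

$$= f(r) \int_{-\infty}^{\infty} K(s) \, d\bar V_n(s) + S_n(r),$$

where $\bar V_n$ is defined as

$$\bar V_n(s) = \frac{1}{\sqrt{h_n}} \left( V_n(r - h_n s) - V_n(r) \right)$$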


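and

$$S_n(r) = \int_{-\infty}^{\infty} \left( f(r - h_n s) - f(r) \right) K(s) \, d\bar V_n(s).$$

The integrals with respect to both $V_n$ and $\bar V_n$ are defined path-by-path in the sense of Stieltjes.

We now have

$$\mathbf{E}\left( \frac{S_n(r)}{h_n} \right)^2 \le \mathbf{E}\left( \frac{1}{\sqrt{n h_n}} \sum_{t=1}^{n} \left| \frac{r - t/n}{h_n} \right| \left| K\left( \frac{r - t/n}{h_n} \right) \right| v_t \right)^2$$

$$= \left( \sum_{|k| < n} c(k) \right) \left( \frac{1}{n h_n} \sum_{t=1}^{n} \left( \frac{r - t/n}{h_n} \right)^2 K\left( \frac{r - t/n}{h_n} \right)^2 \right) + R_n(r),$$

where $R_n(r)$ is bounded for each $r \in [0, 1]$ by

$$\left( \frac{1}{n h_n} \sum_{|k| < n} |k| |c(k)| \right) \left( \|K\| \frac{1}{n h_n} \sum_{t=1}^{n} \left| \frac{r - t/n}{h_n} \right| \left| K\left( \frac{r - t/n}{h_n} \right) \right| + \|\dot K\| \frac{1}{n h_n} \sum_{t=1}^{n} \left| \frac{r - t/n}{h_n} \right|^2 \left| K\left( \frac{r - t/n}{h_n} \right) \right| \right) = o(1).$$

Therefore,

$$S_n(r) = O_p(h_n^2) = o_p(1).$$

Moreover, if we let $V$ be a Brownian motion defined on the entire $\mathbb{R}$, then

$$\bar V_n \to_d V.$$

This is because $\bar V_n$ converges weakly to $V$, if restricted to a closed interval $[0, M]$ for any $M$. The reader is referred to Theorem 5.23 of Pollard (1984). We thus have

$$f(r) \int_{-\infty}^{\infty} K(s) \, d\bar V_n(s) \to_d f(r) \int_{-\infty}^{\infty} K(s) \, dV(s) = \mathbb{N}\left( 0, c_K \omega^2 f(r)^2 \right),$$

as was to be shown. □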

Proof of Lemma 4. Note that

$$\hat f_n(r) - f_n(r) = \frac{1}{n h_n} \sum_{t=1}^{n} (\hat e_t^2 - e_t^2) K\left( \frac{r - t/n}{h_n} \right)$$


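and that

$$\hat e_t^2 - e_t^2 = -2 \left( \sum_{s=1}^{n} e_s x_s' \right) \left( \sum_{s=1}^{n} x_s x_s' \right)^{-1} x_t e_t + \left( \sum_{s=1}^{n} e_s x_s' \right) \left( \sum_{s=1}^{n} x_s x_s' \right)^{-1} x_t x_t' \left( \sum_{s=1}^{n} x_s x_s' \right)^{-1} \left( \sum_{s=1}^{n} x_s e_s \right).$$

Since it is well known that

$$\left( \frac{1}{n^2} \sum_{t=1}^{n} x_t x_t' \right)^{-1} = O_p(1) \quad \text{and} \quad \frac{1}{n} \sum_{t=1}^{n} \sigma_t x_t u_t = O_p(1),$$

the stated result would follow if we can show

$$\sum_{t=1}^{n} x_t x_t' K\left( \frac{r - t/n}{h_n} \right) = O_p(n^2 h_n) \tag{25}$$

$$\sum_{t=1}^{n} \sigma_t x_t u_t K\left( \frac{r - t/n}{h_n} \right) = O_p(n h_n^{-1}) \tag{26}$$

uniformly in $r \in [0, 1]$.

Partition $W_n = (W_{1n}, W_{2n}')'$ conformably with $(w_t)$. To get the result in (25), we simply note that

$$\left| \sum_{t=1}^{n} x_t x_t' K\left( \frac{r - t/n}{h_n} \right) \right| \le n^2 h_n \|W_{2n}\|^2 \frac{1}{n h_n} \sum_{t=1}^{n} K\left( \frac{r - t/n}{h_n} \right)$$

$$= n^2 h_n \|W_{2n}\|^2 \left( \frac{1}{h_n} \int_0^1 K\left( \frac{r - s}{h_n} \right) ds + O(n^{-1} h_n^{-2}) \right)$$

$$= n^2 h_n \|W_{2n}\|^2 \left( \int_0^1 K(r - s h_n) \, ds + O(n^{-1} h_n^{-2}) \right)$$

$$= n^2 h_n \|W_{2n}\|^2 \left( K(r) + O(h_n) + O(n^{-1} h_n^{-2}) \right).$$

The proof for (26) is more involved, and will be done in two steps. First, if we let $\tilde u_t = u_t - \mathbf{E}(u_t \mid \mathcal{F}_{t-1})$ with $\mathcal{F}_t = \sigma((w_s)_{s=1}^t)$, and write

$$\sum_{t=1}^{n} \sigma_t x_t u_t K\left( \frac{r - t/n}{h_n} \right) = \sum_{t=1}^{n} \sigma_t x_{t-1} u_t K\left( \frac{r - t/n}{h_n} \right) + R_{1n} = \sum_{t=1}^{n} \sigma_t x_{t-1} \tilde u_t K\left( \frac{r - t/n}{h_n} \right) + R_{2n},$$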


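then it can be shown that

$$R_{1n}, R_{2n} = o_p(n) \tag{27}$$

uniformly in $r \in [0, 1]$. Secondly, we may deduce that

$$\sum_{t=1}^{n} \sigma_t x_{t-1} \tilde u_t K\left( \frac{r - t/n}{h_n} \right) = O_p(n h_n^{-1}) \tag{28}$$

uniformly in $r \in [0, 1]$.

To prove (27), we note that for any uniformly integrable $L^1$-mixingale $(m_t)$

$$\frac{1}{n} \sum_{t=1}^{n} \sigma_t m_t K\left( \frac{r - t/n}{h_n} \right) \to_p 0$$

uniformly in $r \in [0, 1]$. It follows that

$$R_{1n} = \sum_{t=1}^{n} \sigma_t \Delta x_t u_t K\left( \frac{r - t/n}{h_n} \right) = \sum_{t=1}^{n} \sigma_t \mathbf{E}(\Delta x_t u_t) K\left( \frac{r - t/n}{h_n} \right) + o_p(n)$$

uniformly in $r \in [0, 1]$. Moreover, if we write

$$u_t = \tilde u_t + (\xi_{t-1} - \xi_t)$$

similarly as in the proof of Lemma 1, then

$$R_{2n} = \sum_{t=1}^{n} \sigma_t x_{t-1} (\xi_{t-1} - \xi_t) K\left( \frac{r - t/n}{h_n} \right) = \sum_{t=1}^{n} \sigma_t \Delta x_t \xi_t K\left( \frac{r - t/n}{h_n} \right) + o_p(n) = \sum_{t=1}^{n} \sigma_t \mathbf{E}(\Delta x_t \xi_t) K\left( \frac{r - t/n}{h_n} \right) + o_p(n)$$

uniformly in $r \in [0, 1]$.

To obtain the result in (28), we fix $r_0 \in [0, 1]$ and note that

$$\sum_{t=1}^{n} \sigma_t x_{t-1} \tilde u_t K\left( \frac{r_0 - t/n}{h_n} \right) = O_p(n h_n^{1/2}),$$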


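which can be shown as in the proof of Proposition 3. Furthermore, we have

$$\sum_{t=1}^{n} \sigma_t x_{t-1} \tilde u_t \left[ K\left( \frac{r - t/n}{h_n} \right) - K\left( \frac{r_0 - t/n}{h_n} \right) \right] = O_p(n h_n^{-1})$$

uniformly in $r$ in a neighborhood of $r_0 \in [0, 1]$, since its second moment is bounded by the expectation of

$$\tau^2 \sum_{t=1}^{n} \sigma_t^2 x_{t-1} x_{t-1}' \left[ K\left( \frac{r - t/n}{h_n} \right) - K\left( \frac{r_0 - t/n}{h_n} \right) \right]^2 \le h_n^{-2} |r - r_0|^2 \|\dot K\|^2 \tau^2 \sum_{t=1}^{n} \sigma_t^2 \|x_{t-1}\|^2,$$

where $\tau^2$ is the conditional variance of $(\tilde u_t)$ and $\dot K$ is the derivative of $K$. The proof is now complete. □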

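Proof of Lemma 5. We define $\tilde\sigma_{nt}$ by $\tilde\sigma_{nt} = \sqrt{f_n(t/n)}$, and show first that the estimates $\hat\sigma_{nt}$ and $\tilde\sigma_{nt}$ of $\sigma_t$ yield the same asymptotic results. This readily follows from Lemma 4. We have for (14)

$$\left| \sum_{t=1}^{n} \left( \frac{1}{\hat\sigma_{nt}^2} - \frac{1}{\tilde\sigma_{nt}^2} \right) z_t z_t' \right| \le \left( \max_{1 \le t \le n} \frac{|\hat\sigma_{nt}^2 - \tilde\sigma_{nt}^2|}{\hat\sigma_{nt}^2 \tilde\sigma_{nt}^2} \right) \sum_{t=1}^{n} |z_t|^2 = O_p(n h_n^{-2}) \tag{29}$$

due to Lemma 4, since

$$\max_{1 \le t \le n} \left| \hat\sigma_{nt}^2 - \tilde\sigma_{nt}^2 \right| \le \|\hat f_n - f_n\|$$

and

$$\min_{1 \le t \le n} \hat\sigma_{nt}^2 = \min_{1 \le t \le n} \tilde\sigma_{nt}^2 + O_p(n^{-1} h_n^{-2}) = \inf_{0 \le r \le 1} |f(r)| + O_p(n^{-1} h_n^{-2}).$$

Note that $f$ is bounded away from zero by Assumption 3(a), being a continuous function on a compact interval.

Also, we may easily deduce for (15) and (16) that

$$\left| \sum_{t=1}^{n} (\hat\sigma_{nt} - \tilde\sigma_{nt}) z_{t-1} w_t' \right| \le \left( \max_{1 \le t \le n} |\hat\sigma_{nt} - \tilde\sigma_{nt}| \right) \left( \sum_{t=1}^{n} |z_{t-1}|^2 \right)^{1/2} \left( \sum_{t=1}^{n} |w_t|^2 \right)^{1/2} = O_p(n^{1/2} h_n^{-2}) \tag{30}$$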


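and similarly,

$$\left| \sum_{t=1}^{n} \left( \frac{\sigma_t}{\hat\sigma_{nt}^2} - \frac{\sigma_t}{\tilde\sigma_{nt}^2} \right) z_{t-1} w_t' \right| = O_p(n^{1/2} h_n^{-1/2}) \tag{31}$$

$$\left| \sum_{t=1}^{n} \left( \frac{1}{\hat\sigma_{nt}} - \frac{1}{\tilde\sigma_{nt}} \right) z_{t-1} w_t' \right| = O_p(n^{1/2} h_n^{-1/2}). \tag{32}$$

Clearly, $O_p(n h_n^{-2}) = o_p(n^2)$ and $O_p(n^{1/2} h_n^{-2}) = o_p(n)$ given our Assumption 3(c). The results in (29)–(32) prove that $\hat\sigma_{nt}$ and $\tilde\sigma_{nt}$ give us the same asymptotics.

We now derive the results in (14)–(16) with $\hat\sigma_{nt}$ replaced by $\tilde\sigma_{nt}$. For (14), we simply note that

$$\left| \sum_{t=1}^{n} \left( \frac{1}{\tilde\sigma_{nt}^2} - \frac{1}{\sigma_t^2} \right) z_t z_t' \right| \le \left( \max_{1 \le t \le n} \frac{|\tilde\sigma_{nt}^2 - \sigma_t^2|}{\tilde\sigma_{nt}^2 \sigma_t^2} \right) \sum_{t=1}^{n} |z_t|^2 = O_p(n^2 h_n) + O_p(n^{3/2} h_n^{-1/2}),$$

due to our results in Lemma 2. The stated result, therefore, follows immediately from Lemma 1.

For the result in (15), we write

$$\frac{1}{n} \sum_{t=1}^{n} \tilde\sigma_{nt} z_{t-1} w_t' = \frac{1}{n} \sum_{t=1}^{n} \tilde\sigma_{nt} z_{t-1} \tilde w_t' + \frac{1}{n} \sum_{t=1}^{n} \tilde\sigma_{nt} w_t \xi_t' + o_p(1), \tag{33}$$

precisely as in the proof of Lemma 1. For the second term in the right-hand side of (33), we have

$$\frac{1}{n} \sum_{t=1}^{n} \tilde\sigma_{nt} w_t \xi_t' = \frac{1}{n} \sum_{t=1}^{n} \sigma_t \mathbf{E}(w_t \xi_t') + o_p(1) \to_p \left( \int_0^1 f^{1/2} \right) \Lambda' \tag{34}$$

as in the proof of Lemma 1. To get the asymptotics for the first term in (33), let $\mathcal{G}_{nt} = \sigma(\mathcal{F}_t, (v_s)_{s=t+1}^{n})$, and note that $(\tilde w_t, \mathcal{G}_{nt})$ is a martingale difference sequence by Assumption 4. We may show

$$f_n \to_p f \tag{35}$$

uniformly on $[0, 1]$. We would then have for any continuous function $T$

$$T(f_n, W_n) \to_d T(f, W),$$

and it follows from Kurtz and Protter (1991) that

$$\frac{1}{n} \sum_{t=1}^{n} \tilde\sigma_{nt} z_{t-1} \tilde w_t' = \int_0^1 f_n^{1/2} W_n \, d\widetilde W_n' + o_p(1) \to_d \int_0^1 f^{1/2} W \, dW'$$

which, together with (34), establishes the result in (15).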


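We now show (35) to complete the proof for (15). Since

$$f_n(r) = \frac{1}{n h_n} \sum_{t=1}^{n} f\left( \frac{t}{n} \right) K\left( \frac{r - t/n}{h_n} \right) + \frac{1}{n h_n} \sum_{t=1}^{n} \sigma_t^2 v_t K\left( \frac{r - t/n}{h_n} \right),$$

it suffices to show that

$$\sup_{0 \le r \le 1} \left| \frac{1}{n h_n} \sum_{t=1}^{n} \sigma_t^2 v_t K\left( \frac{r - t/n}{h_n} \right) \right| \to_p 0 \tag{36}$$

as $n \to \infty$.

For any fixed $r_0 \in [0, 1]$, we have

$$\sum_{t=1}^{n} \sigma_t^2 v_t K\left( \frac{r_0 - t/n}{h_n} \right) = O_p(n^{1/2} h_n^{1/2})$$

as shown in the proof of Lemma 2. Moreover, we have

$$\sum_{t=1}^{n} \sigma_t^2 v_t \left[ K\left( \frac{r - t/n}{h_n} \right) - K\left( \frac{r_0 - t/n}{h_n} \right) \right] = O_p(n^{1/2} h_n^{-1})$$

uniformly in $r$ in a neighborhood of $r_0 \in [0, 1]$, since its mean squared error is bounded, up to a constant, by

$$h_n^{-2} |r - r_0|^2 \|\dot K\|^2 \sum_{t=1}^{n} \sum_{s=1}^{n} |\mathbf{E}(v_t v_s)| = h_n^{-2} |r - r_0|^2 \|\dot K\|^2 \sum_{|k| < n} (n - |k|) |c(k)|,$$

where $\dot K$ is the derivative of $K$ as defined earlier. This proves (35). The proof for (16) is completely analogous and omitted. □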
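The two-step idea behind Lemma 5, replacing the unknown $\sigma_t$ by a kernel estimate built from first-stage residuals, can be sketched as follows. This is a simplified illustration under stated assumptions and not the authors' procedure: it uses a single regressor without intercept, ignores serial correlation and endogeneity (so no CCR transformation is applied), and uses an Epanechnikov kernel without boundary correction. All function names and the `floor` safeguard are hypothetical.

```python
import random


def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1]; an illustrative choice of K (assumption)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def ols_slope(y, x):
    """Least squares slope in y_t = beta * x_t + e_t (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def fitted_variances(resid, h, floor=1e-8):
    """Kernel estimate of the variance function at r = t/n, t = 1, ..., n."""
    n = len(resid)
    out = []
    for t in range(1, n + 1):
        r = t / n
        s = sum(resid[j - 1] ** 2 * epanechnikov((r - j / n) / h)
                for j in range(1, n + 1))
        # The floor keeps the weights finite if a local estimate is zero.
        out.append(max(s / (n * h), floor))
    return out


def feasible_gls_slope(y, x, h):
    """Weighted least squares slope using kernel-estimated variances."""
    b_ols = ols_slope(y, x)
    resid = [yi - b_ols * xi for yi, xi in zip(y, x)]
    f_hat = fitted_variances(resid, h)
    num = sum(xi * yi / fi for xi, yi, fi in zip(x, y, f_hat))
    den = sum(xi * xi / fi for xi, fi in zip(x, f_hat))
    return num / den


if __name__ == "__main__":
    random.seed(0)
    n, beta = 300, 2.0
    x, level = [], 0.0
    for _ in range(n):
        level += random.gauss(0.0, 1.0)  # random walk regressor
        x.append(level)
    sigma = [0.2 + 3.0 * t / n for t in range(1, n + 1)]  # smooth volatility
    y = [beta * xi + s * random.gauss(0.0, 1.0) for xi, s in zip(x, sigma)]
    print("OLS :", ols_slope(y, x))
    print("FGLS:", feasible_gls_slope(y, x, h=0.15))
```

The bandwidth `h` plays the role of $h_n$; the rate conditions that make the estimated and true weights asymptotically equivalent are those of Assumption 3(c) and are not checked by this sketch.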

Proof of Lemma 6. Due to the results in Lemma 5, we may assume w.l.o.g. that $\sigma_t$ is known. By the strict exogeneity of $(u_t)$, the result in Lemma 1(b) reduces to

$$\frac{1}{n} \sum_{t=1}^{n} x_t e_t = \frac{1}{n} \sum_{t=1}^{n} \sigma_t x_t u_t \to_d \int_0^1 f^{1/2} W_2 \, dW_1.$$

We, therefore, have by the continuous mapping theorem

$$n(\hat\beta - \beta) \to_d \left( \int_0^1 W_2 W_2' \right)^{-1} \left( \int_0^1 f^{1/2} W_2 \, dW_1 \right).$$


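Similarly, the results in Lemma 1 yield

$$\frac{1}{n^2} \sum_{t=1}^{n} \frac{x_t}{\sigma_t} \frac{x_t'}{\sigma_t} \to_d \int_0^1 f^{-1} W_2 W_2'$$

$$\frac{1}{n} \sum_{t=1}^{n} \frac{x_t}{\sigma_t} \frac{e_t}{\sigma_t} = \frac{1}{n} \sum_{t=1}^{n} \frac{x_t}{\sigma_t} u_t \to_d \int_0^1 f^{-1/2} W_2 \, dW_1,$$

and it follows that

$$n(\tilde\beta - \beta) \to_d \left( \int_0^1 f^{-1} W_2 W_2' \right)^{-1} \left( \int_0^1 f^{-1/2} W_2 \, dW_1 \right),$$

which completes the proof of the first part. For the second part, let

$$D(r) = f(r)^{-1/2} W_2(r) - \int_0^1 W_2 W_2' \left( \int_0^1 f W_2 W_2' \right)^{-1} f(r)^{1/2} W_2(r)$$

and note that

$$\int_0^1 D D' = \int_0^1 f^{-1} W_2 W_2' - \int_0^1 W_2 W_2' \left( \int_0^1 f W_2 W_2' \right)^{-1} \int_0^1 W_2 W_2' \ge 0 \quad \text{a.s.}$$

This completes the proof. □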
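The inequality $\int_0^1 DD' \ge 0$ has a simple finite-sample analogue that can be checked numerically. The sketch below assumes a scalar regressor, a known volatility sequence, and serially independent errors with unit-variance $u_t$, so that the variances of the two estimators conditional on the regressor path have closed forms. The function name `conditional_variances` is hypothetical.

```python
import random


def conditional_variances(x, sigma):
    """Conditional variances of OLS and GLS slopes given the regressor path.

    Model (assumed for illustration): y_t = beta * x_t + sigma_t * u_t with
    independent u_t of unit variance. Then
        Var(OLS | x) = sum(x_t^2 sigma_t^2) / (sum x_t^2)^2,
        Var(GLS | x) = 1 / sum(x_t^2 / sigma_t^2).
    By the Cauchy-Schwarz inequality the second never exceeds the first.
    """
    sxx = sum(xi * xi for xi in x)
    var_ols = sum((xi * si) ** 2 for xi, si in zip(x, sigma)) / sxx ** 2
    var_gls = 1.0 / sum((xi / si) ** 2 for xi, si in zip(x, sigma))
    return var_ols, var_gls


if __name__ == "__main__":
    random.seed(1)
    n = 500
    x, level = [], 0.0
    for _ in range(n):
        level += random.gauss(0.0, 1.0)  # random walk regressor
        x.append(level)
    sigma = [0.2 + 4.8 * t / n for t in range(1, n + 1)]  # strong heterogeneity
    v_ols, v_gls = conditional_variances(x, sigma)
    print("relative efficiency of OLS:", v_gls / v_ols)
```

The ratio printed at the end lies in $(0, 1]$ and equals one only when $\sigma_t$ is constant over the support of the regressor, which mirrors the homogeneous case in which OLS and GLS coincide.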

Proof of Theorem 7. As before, we may assume that $\sigma_t$ is known due to the results in Lemma 5. Moreover, given our results in Lemmas 2 and 4, it is straightforward to show that replacing the unknown parameters by their consistent estimates in the CCR transformations does not change any of the subsequent asymptotic results in the CR-HET model. This is a direct extension of the result in Park (1992) for the CR-HOM model. We omit the details to save space.

Let $W_1$ and $W_2$ be given as before, and define

$$W_1^* = W_1 - \omega_{12} \Omega_{22}^{-1} W_2.$$

Then we may deduce similarly as in Park (1992) that

$$n(\tilde\beta^* - \beta) = \left( \frac{1}{n^2} \sum_{t=1}^{n} \frac{x_t^*}{\sigma_t} \frac{x_t^{*\prime}}{\sigma_t} \right)^{-1} \left( \frac{1}{n} \sum_{t=1}^{n} \frac{x_t^*}{\sigma_t} \frac{e_t^*}{\sigma_t} \right) \to_d \left( \int_0^1 f^{-1} W_2 W_2' \right)^{-1} \left( \int_0^1 f^{-1/2} W_2 \, dW_1^* \right),$$


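and that

$$n(\hat\beta^* - \beta) = \left( \frac{1}{n^2} \sum_{t=1}^{n} x_t^* x_t^{*\prime} \right)^{-1} \left( \frac{1}{n} \sum_{t=1}^{n} x_t^* e_t^* \right) \to_d \left( \int_0^1 W_2 W_2' \right)^{-1} \left( \int_0^1 f^{1/2} W_2 \, dW_1^* \right).$$

The stated results now follow immediately upon noticing that $W_1^*$ is independent of $W_2$. □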

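Proof of Theorem 8. We first prove the results in part (a). The result for VAT is immediate from Theorem 7. To derive the limiting distribution of LM, we first note that

$$\hat u_t = \frac{e_t}{\hat\sigma_{nt}} - \left( \sum_{s=1}^{n} \frac{1}{\hat\sigma_{ns}^2} e_s x_s' \right) \left( \sum_{s=1}^{n} \frac{1}{\hat\sigma_{ns}^2} x_s x_s' \right)^{-1} \frac{x_t}{\hat\sigma_{nt}}. \tag{37}$$

Let $\tilde\sigma_{nt}$ be defined as in the proof of Lemma 5. We have

$$\max_{1 \le t \le n} \left| \sum_{s=1}^{t} \left( \frac{\sigma_s}{\hat\sigma_{ns}} - \frac{\sigma_s}{\tilde\sigma_{ns}} \right) u_s \right| \le \left( \max_{1 \le t \le n} \frac{\sigma_t |\hat\sigma_{nt} - \tilde\sigma_{nt}|}{\hat\sigma_{nt} \tilde\sigma_{nt}} \right) \sum_{t=1}^{n} |u_t| = O_p(h_n^{-2}) \tag{38}$$

due to Lemma 4. Also, if we define

$$R_n(r) = \frac{1}{\sqrt{n}} \sum_{t=1}^{[nr]} \left( \frac{\sigma_t}{\tilde\sigma_{nt}} - 1 \right) u_t,$$

then $R_n \to_d 0$, and therefore,

$$\max_{1 \le t \le n} \left| \sum_{s=1}^{t} \left( \frac{\sigma_s}{\tilde\sigma_{ns}} - 1 \right) u_s \right| = o_p(n^{1/2}). \tag{39}$$

It follows from (38) and (39) that we may assume $\sigma_t$ to be known in deriving the limiting distribution of LM. The rest of the proof for part (a) is obvious and omitted.

We now prove the results in part (b). The consistency of both VAT and LM against the alternative hypothesis of TVC-CR is shown in Park and Hahn (1999). To prove their consistency against the alternative hypothesis of NON-CR, we assume that $\beta = 0$. This causes no loss in generality, since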


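here we just use the fact that the error is generated by an integrated process. Under this convention, define

$$H_n(r) = W_{1n}(r) - \int_0^1 W_{1n} W_{2n}' \left( \int_0^1 W_{2n} W_{2n}' \right)^{-1} W_{2n}(r). \tag{40}$$

Then we have

$$\hat f_n(r) = \frac{1}{n h_n} \sum_{t=1}^{n} \hat e_t^2 K\left( \frac{r - t/n}{h_n} \right) = n \left( \frac{1}{h_n} \int_0^1 H_n(s)^2 K\left( \frac{r - s}{h_n} \right) ds + O_p(n^{-1} h_n^{-2}) \right) = n \left( H_n(r)^2 + O_p(h_n) + O_p(n^{-1} h_n^{-2}) \right)$$

uniformly in $r \in [0, 1]$. It, therefore, follows that

$$n^{-1} \hat f_n \to_d H^2 \tag{41}$$

as $n \to \infty$, where

$$H(r) = W_1(r) - \int_0^1 W_1 W_2' \left( \int_0^1 W_2 W_2' \right)^{-1} W_2(r)$$

is the limit process of $H_n$ in (40).

If we define

$$U_n(r) = \sum_{t=1}^{n} \hat u_{t-1} 1\left\{ \frac{t - 1}{n} \le r < \frac{t}{n} \right\},$$

then it follows from (37) and (41) that $U_n \to_d U$, where

$$U(r) = \frac{W_1(r)}{|H(r)|} - \left( \int_0^1 H^{-2} W_1 W_2' \right) \left( \int_0^1 H^{-2} W_2 W_2' \right)^{-1} \frac{W_2(r)}{|H(r)|}.$$

Let $\hat\omega_{11}$ be the usual long-run variance estimator of $\omega_{11}$, which is given by

$$\hat\omega_{11} = \frac{1}{n} \sum_{|k| \le \ell_n} \varpi\left( \frac{k}{\ell_n} \right) \sum_t \hat u_t \hat u_{t-k},$$

where $\ell_n$ is the lag truncation number that increases with the sample size $n$, and $\varpi$ is an integrable weight function with support $[-1, 1]$. Then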


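we may deduce that

$$\ell_n^{-1} \hat\omega_{11} = \left( \frac{1}{\ell_n} \sum_{|k| \le \ell_n} \varpi\left( \frac{k}{\ell_n} \right) \right) \left( \frac{1}{n} \sum_{t=1}^{n} \hat u_t^2 \right) + o_p(1) = \left( \int_{-1}^{1} \varpi \right) \left( \int_0^1 U^2 \right) + o_p(1)$$

as $n \to \infty$, and therefore,

$$\hat\omega_{11}^{-1} = O_p(\ell_n^{-1}) \tag{42}$$

for all large $n$.

We now show that the VAT statistic diverges under the NON-CR model. To avoid introducing additional notation, we consider the significance test for $\beta$. As shown in Park (1990), the asymptotic behavior of the significance test for the coefficients on the added variables is qualitatively the same as that for $\beta$. It is immediate from (41) that

$$\frac{1}{n} \sum_{t=1}^{n} \frac{1}{\hat\sigma_{nt}^2} x_t x_t' = \int_0^1 H_n^{-2} W_{2n} W_{2n}' + o_p(1) \to_d \int_0^1 H^{-2} W_2 W_2' \tag{43}$$

$$\frac{1}{n} \sum_{t=1}^{n} \frac{1}{\hat\sigma_{nt}^2} x_t e_t = \int_0^1 H_n^{-2} W_{2n} W_{1n} + o_p(1) \to_d \int_0^1 H^{-2} W_2 W_1 \tag{44}$$

as $n \to \infty$. The consistency of the VAT test can now be easily established from (42)–(44). We may similarly establish the consistency of LM from (42) and

$$\frac{1}{n} \sum_{t=1}^{n} \left( \frac{1}{n} \sum_{s=1}^{t} \hat u_s \right)^2 \to_d \int_0^1 \left( \int_0^r U \right)^2 dr.$$

This completes the proof. □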
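The kernel-weighted long-run variance estimator used in the proof of Theorem 8 can be sketched as follows. This is an illustration under stated assumptions: the Bartlett weight function is chosen here only as a common example of an integrable weight with support $[-1, 1]$, and the names `bartlett` and `long_run_variance` are hypothetical. The article does not specify this particular weight.

```python
def bartlett(u):
    """Bartlett weight 1 - |u| on [-1, 1]; an illustrative choice (assumption)."""
    return 1.0 - abs(u) if abs(u) <= 1.0 else 0.0


def long_run_variance(u, lag, weight=bartlett):
    """Return (1/n) * sum_{|k| <= lag} weight(k / lag) * sum_t u_t u_{t-k}.

    u is the residual series and lag >= 1 is the lag truncation number.
    """
    n = len(u)
    total = 0.0
    for k in range(-lag, lag + 1):
        a = abs(k)
        # Sample cross product at lag |k|; symmetric in the sign of k.
        cross = sum(u[t] * u[t - a] for t in range(a, n))
        total += weight(k / lag) * cross
    return total / n


if __name__ == "__main__":
    # A persistent series: partial sums of an alternating pattern of blocks.
    steps = [1.0 if (t // 10) % 2 == 0 else -1.0 for t in range(200)]
    path, level = [], 0.0
    for s in steps:
        level += s
        path.append(level)
    for lag in (2, 4, 8, 16):
        print(lag, long_run_variance(path, lag))
```

For a persistent residual series the estimate tends to grow with the lag truncation number, which is the mechanism behind (42): under NON-CR the estimator is of order $\ell_n$ rather than bounded.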

ACKNOWLEDGMENTS

This is a substantially revised and extended version of the article written by the second author and Jonghan Park, the first draft of which was completed in 1999 and circulated under the title “Longrun Relationships Evolving over Time.” We would also like to thank Yoosoon Chang for helpful discussions and suggestions, and two anonymous referees for some useful comments. The financial support from the Korea Research Foundation is gratefully acknowledged.


REFERENCES

Andrews, D. W. K. (1987). Least squares regression with integrated or dynamic regressors under weak error assumptions. Econometric Theory 3:98–116.

Andrews, D. W. K. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59:817–858.

Baba, Y., Hendry, D. F., Starr, R. F. (1992). The demand for M1 in the USA: 1960–1988. The Review of Economic Studies 59:25–60.

Beare, B. K. (2004). Robustifying unit root tests to permanent changes in innovation variance. Working paper, Yale University.

Boswijk, H. P. (2005). Adaptive testing for a unit root with nonstationary volatility. Working paper, Amsterdam School of Economics.

Boswijk, H. P., Zu, Y. (2007). Testing for cointegration with nonstationary volatility. Working paper, Amsterdam School of Economics.

Cavaliere, G. (2004). Unit root tests under time-varying variance shifts. Econometric Reviews 23:259–92.

Cavaliere, G., Taylor, A. M. R. (2007). Testing for unit roots in time series models with nonstationary volatility. Journal of Econometrics 140:919–47.

Engle, R. F., Granger, C. W. J. (1987). Co-integration and error correction: Representation, estimation, and testing. Econometrica 55:251–76.

Fisher, E., Park, J. Y. (1991). Testing purchasing power parity under the null hypothesis of cointegration. Economic Journal 101:1476–84.

Gasser, T., Müller, H. G. (1979). Kernel estimation of regression function. In: Smoothing Techniques for Curve Estimation. Vol. 757 Lecture Notes in Mathematics, pp. 22–68.

Hafer, R. W., Jansen, D. W. (1991). The demand for money in the United States: Evidence from cointegration tests. Journal of Money, Credit and Banking 23:155–168.

Hall, P., Heyde, C. C. (1980). Martingale Limit Theory and Its Application. New York: Academic Press.

Hamori, S., Tokihisa, A. (1997). Testing for a unit root in the presence of a variance shift. Economics Letters 57:245–253.

Hansen, B. E. (1992). Convergence to stochastic integrals for dependent and heterogeneous processes. Econometric Theory 8:489–500.

Hansen, B. E. (1995). Regression with nonstationary volatility. Econometrica 63:1113–1132.

Härdle, W. (1990). Applied Nonparametric Regression. New York: Cambridge University Press.

Härdle, W., Vieu, P. (1992). Kernel regression smoothing of time series. Journal of Time Series Analysis 13:209–232.

Harvey, A. C., Robinson, P. C. (1988). Efficient estimation of nonstationary time series regression. Journal of Time Series Analysis 9:201–214.

Hidalgo, J. (1992). Adaptive estimation in time series regression models with heteroskedasticity of unknown form. Econometric Theory 8:161–187.

Johansen, S., Juselius, K. (1990). Maximum likelihood estimation and inference on cointegration – with applications to the demand for money. Oxford Bulletin of Economics and Statistics 52:169–210.

Johansen, S., Juselius, K. (1992). Testing structural hypotheses in a multivariate cointegration analysis of the PPP and UIP of UK. Journal of Econometrics 53:211–44.

Juselius, K. (1995). Do purchasing power parity and uncovered interest rate parity hold in the long run? An example of likelihood inference in a multivariate time-series model. Journal of Econometrics 69:211–40.

Kim, T. H., Leybourne, S., Newbold, P. (2002). Unit root tests with a break in innovation variance. Journal of Econometrics 109:211–240.

Kurtz, T. G., Protter, P. (1991). Weak limit theorems for stochastic integrals and stochastic differential equations. The Annals of Probability 19:1035–1070.

Lucas, R. E. (1988). Money demand in the United States: a quantitative review. Carnegie–Rochester Conference Series on Public Policy 29:137–168.

McConnell, M. M., Perez-Quiros, G. (2000). Output fluctuations in the United States: what has changed since the early 1980s? American Economic Review 90:1464–1476.

Meltzer, A. H. (1963). The demand for money: the evidence from the time series. Journal of Political Economy 71:219–246.


Miller, S. (1991). Monetary dynamics: an application of cointegration and error correction. Journal of Money, Credit, and Banking 23:139–154.

Ng, S., Perron, P. (2001). Lag length selection and the construction of unit root tests with good size and power. Econometrica 69:1519–1554.

Park, J. Y. (1990). Testing for unit roots and cointegration by variable addition. In: Rhodes, G. F., Fomby, T. B. eds. Advances in Econometrics, Vol. 8, 107–133, Greenwich: JAI Press.

Park, J. Y. (1992). Canonical cointegrating regressions. Econometrica 60:119–143.

Park, J. Y., Hahn, S. B. (1999). Cointegrating regressions with time varying coefficients. Econometric Theory 15:664–703.

Park, J. Y., Phillips, P. C. B. (1988). Statistical inference in regressions with integrated processes: Part 1. Econometric Theory 4:468–498.

Phillips, P. C. B. (1987). Time series regression with unit roots. Econometrica 55:277–302.

Phillips, P. C. B. (1991). Optimal inference in cointegrated systems. Econometrica 59:283–306.

Phillips, P. C. B., Park, J. Y. (1986). Asymptotic equivalence of ordinary least squares and generalized least squares in regressions with integrated regressors. Journal of the American Statistical Association 83:111–115.

Phillips, P. C. B., Ouliaris, S. (1990). Asymptotic properties of residual based tests for cointegration. Econometrica 58:165–193.

Phillips, P. C. B., Xu, K.-L. (2006). Inference in autoregression under heteroskedasticity. Journal of Time Series Analysis 27:289–308.

Pollard, D. (1984). Convergence of Stochastic Processes. New York: Springer-Verlag.

Priestley, M. B., Chao, M. T. (1972). Nonparametric function fitting. Journal of Royal Statistical Society Series B 34:385–392.

Robinson, P. M. (1987). Asymptotically efficient estimation in the presence of heteroskedasticity of unknown form. Econometrica 55:875–891.

Robinson, P. M. (1989). Nonparametric estimation of time-varying parameters. In: Hackl, P. ed. Statistical Analysis and Forecasting of Economic Structural Change, Berlin: Springer-Verlag, pp. 253–264.

Robinson, P. M. (1991). Time-varying nonlinear regression. In: Hackl, P., Westland, A. H. eds. Economic Structural Change: Analysis and Forecasting, Berlin: Springer-Verlag, pp. 179–190.

Saikkonen, P. (1991). Asymptotically efficient estimation of cointegration regression. Econometric Theory 7:1–21.

Shin, Y. (1994). A residual-based test of the null of cointegration against the alternative of no cointegration. Econometric Theory 10:91–115.

Stock, J. H., Watson, M. W. (1993). A simple estimator of cointegrating vectors in higher order integrated systems. Econometrica 61:783–820.

Watson, M. (1999). Explaining the increased variability in the long-term interest rate. Economic Quarterly 85:71–95, Federal Reserve Bank of Richmond.

Xu, K.-L., Phillips, P. C. B. (2008). Adaptive estimation of autoregressive models with time-varying variances. Journal of Econometrics 142:265–280.
