

The Treatment-Effect Estimation:

A Case Study of The 2008 Economic Stimulus Package of China

Min Ouyang∗

School of Economics and Management

Tsinghua University, Beijing, 100084 China

Yulei Peng

Department of Economics, University of Arkansas

Fayetteville, Arkansas 72701-1201 US

December 2013

Abstract

In 2008, the Chinese government put forth an economic stimulus package of 586 billion USD to

minimize the impact of the global financial crisis. This is considered one of the most important

macroeconomic policy interventions in the past decade. This paper employs this policy intervention

as a case study to address the challenge of evaluating the treatment effects under time-varying latent

factors. In particular, we return to the framework of Hsiao, Ching, and Wan (2012), relax their

linear conditional mean assumption, and extend it to a semi-parametric setting. The asymptotic

distribution properties of the average treatment effect estimator are derived and studied. Both the

linear model and the semi-parametric model are applied to study the treatment effect of the 2008

stimulus package on China’s macroeconomy. The estimation results show that the stimulus package

raised the annual real GDP growth in China by about 3.2%, but only temporarily. These results

are robust to the linear setting, the semi-parametric setting, and various control group selections. The

temporary boost in macroeconomic outcome is also evident in the estimation of other economic

indicators such as real investment, real consumption, real export, and real import.

JEL codes: C10, H73, H22.

Key words: treatment effects, stationary data, 2008 economic stimulus plan of China, parametric

estimation, semi-parametric estimation

∗The corresponding author: Min Ouyang. Email: [email protected]. We thank Qi Li and Cheng Hsiao for helpful comments and ChongEn Bai for insightful discussions. Zhe Lu, Matthew Kidder, and Meng Li provided excellent research assistance. Some of the data is provided by Bing Li. Errors are ours.


1 Introduction

Researchers usually face the challenge of estimating the counterfactuals when evaluating the consequences

of policy interventions. As a case study, consider one of the most significant macroeconomic

policy interventions in the past decade: China’s 2008 economic stimulus package. Starting from the

end of 2008, the central Chinese government put forth an economic stimulus package of four trillion

RMB – equivalent to 586 billion USD, about three times the size of the U.S. stimulus package at the

time – as an attempt to minimize the impact of the global financial crisis. The influence of this policy

intervention has been much debated among policy makers; therefore, estimating its treatment effect

on China’s macroeconomy would provide intriguing evidence for future policy discussions. However,

in evaluating this policy, simply comparing China’s macroeconomic outcome before and after the end

of 2008 would not be sufficient, because the post-treatment outcome is under the influence of many

factors other than China’s stimulus package. Such factors can be, for example, weather, trade development,

population growth, technological advancements, or other governments’ policy interventions.

All these factors vary before and after the end of 2008 and, most importantly, their variations are either

unobservable, unmeasurable, or very difficult to control for. Therefore, the counterfactuals – the

hypothetical macroeconomic outcome of China in the absence of the 2008 stimulus package – become

the key element in estimating the treatment effects of this policy intervention.

There are several ways to construct the counterfactuals with time-varying latent factors. One can

estimate a latent factor model as suggested by Bai and Ng (2002). Unfortunately, this method applies

only when both the cross-section and the time-series dimensions are sufficiently big, which is rarely

the case when evaluating macro policies. Or, one can perform a time-series prediction by estimating

a VAR model. But a VAR prediction cannot incorporate any post-treatment shocks as its prediction

is out of the sample. Alternatively, one can use data changes in a group of regions not under the

policy intervention to control for the time-varying latent factors. This is the conventional difference-

in-difference (hereafter DID) approach (Card and Krueger, 1994). However, the DID is subject to

quite a few restrictive assumptions: first, the treatment assignment must be random (no selection

bias); second, the treatment and control regions must share the same latent factors; third, common

latent factors must have the same quantitative influence on each group so that taking differences

removes the effects. These assumptions are hard to maintain in applied work because, in reality,

policy makers’ choice of the treatment region is never random. Most importantly, regions differ in

terms of demographics, culture, and economic and political institutions, so that most latent factors as

well as their influences do vary cross-section. Failure to incorporate such cross-section heterogeneity

can severely bias the estimates. One can also use the synthetic control method proposed by Abadie and Gardeazabal (2003) and Abadie, Diamond, and Hainmueller (2010) to estimate average treatment effects. The synthetic control method relaxes the equal weight restriction of the DID method. However, when the treated unit and the control units exhibit different trending behaviors (in the absence of treatment), the synthetic control method may result in poor in-sample fit; in such cases the synthetic control method should not be used (Abadie, Diamond, and Hainmueller 2010, page 495).

Hsiao, Ching, and Wan (hereafter HCW) (2012) initiate an approach that offers more flexibility

than the DID and the synthetic control methods. They propose to estimate the correlations between

the treatment and control regions based on the pre-treatment data. In this way, regions that display

strong correlations should share common latent factors with the treatment region. More importantly,

estimating the correlations rather than taking simple differences allows the influence of common latent

factors to vary cross-section. This makes the approach much more applicable to reality. Bai, Li, and

Ouyang (2014) extend the HCW approach to the case with non-stationary data and prove that, as long

as the data are I(1), the OLS estimator of the correlation coefficient is the unique and consistent

estimator that can uncover the post-treatment counterfactuals using the post-treatment observations

of the control regions.

In this paper, as a first step we return to the original framework of HCW (2012). The HCW (2012)

method relies on a key assumption that the conditional mean of the error term is a linear function of the

latent factors (Assumption 6); otherwise the consistency of the OLS estimator cannot be established.

We relax this assumption by allowing the conditional mean to have a semi-parametric setting. We

show that, under quite general conditions such as those in Robinson (1988) and Fan and Li (1999), one can

consistently estimate the counterfactuals and can therefore uncover the true treatment effects.

Accordingly, we adopt the HCW framework to estimate the counterfactual macroeconomic outcome

of China in the absence of the 2008 stimulus package. In selecting control economies, we look for those

that not only display strong correlations with the Chinese economy, but are relatively independent

of the 2008 stimulus package post-treatment. More specifically, we examine the trade relationships

between China and potential control economies. Economies whose total trade volume with China

constitutes only a small fraction of their own GDP should be relatively independent of any policy

interventions implemented in China. We thereby maintain an important assumption of HCW (2012)

that the policy intervention must remain exogenous to the control group.

Applying both the parametric and semi-parametric settings, we estimate that the 2008 stimulus

package raised the annual real GDP growth in China by about 3.2%, but only temporarily. In

particular, the estimated post-treatment counterfactual growth is 3.2% lower than the actual growth;

however, our results show that the counterfactual and actual growth coincide with each other starting


from about two years after the policy intervention. This suggests the stimulus package had a temporary

growth effect of 3.2% on China’s real GDP for about two years. These results remain robust to

linear setting, semi-parametric setting, and various control group selections. The temporary boost in

economic activities of the stimulus package is also evident in the estimation of other macroeconomic

indicators such as real investment, real consumption, real export, and real import.

The rest of the paper is organized as follows. Section 2 presents the model set up and extends the

model to a semi-parametric setting. Section 3 applies the model to estimate the treatment effects of the 2008 Fiscal Stimulus

Plan of China. We conclude in Section 4.

2 The Model

Let $y_{it}^1$ and $y_{it}^0$ denote unit $i$'s outcome in period $t$ with and without treatment, respectively. The treatment intervention effect for the $i$th unit at time $t$ is defined as

$$\Delta_{it} = y_{it}^1 - y_{it}^0. \tag{1}$$

However, often we do not simultaneously observe $y_{it}^0$ and $y_{it}^1$. The observed data are in the form

$$y_{it} = d_{it} y_{it}^1 + (1 - d_{it}) y_{it}^0, \tag{2}$$

where $d_{it} = 1$ if the $i$th unit is under the treatment at time $t$, and $d_{it} = 0$ otherwise.

Let ft be a K × 1 vector of unobservable common factors which is the main force that drives all

the yit to change over time. Following Hsiao, Ching and Wan (2012), in this paper we assume that ft is a strictly stationary process.1 We consider the case in which no unit receives treatment for t = 1, ..., T1. For t = T1 + 1, ..., T, one unit, say the first unit without loss of generality, receives the treatment, while all other units yjt, j = 2, ..., N, remain untreated. So there is no treatment for yjt for j = 2, ..., N and for all t = 1, ..., T.

Following HCW (2012) we consider the following factor model

$$y_{it}^0 = \alpha_i + b_i' f_t + u_{it}, \quad i = 1, \dots, N;\ t = 1, \dots, T_1, \tag{3}$$

where $\alpha_i$ is unit $i$'s individual-specific intercept, $b_i$ is a $K \times 1$ vector of factor loadings, and $u_{it}$ is a stationary I(0) error term.

1 The case in which $f_t$ is non-stationary is considered by Bai, Li and Ouyang (2012).

Let $y_t = (y_{1t}, \dots, y_{Nt})'$ be an $N \times 1$ vector of $y_{it}$ at time $t$. Since there is no policy intervention before $T_1$, the observed $y_t$ takes the form

$$y_t = y_t^0 = \alpha + B f_t + u_t \quad \text{for } t = 1, \dots, T_1, \tag{4}$$

where $\alpha = (\alpha_1, \dots, \alpha_N)'$, $B$ is an $N \times K$ matrix with the $i$th row given by $b_i'$, and $u_t = (u_{1t}, \dots, u_{Nt})'$.

Since there is a policy change to the first unit at time $T_1 + 1$, from time $T_1 + 1$ on we have

$$y_{1t} = y_{1t}^1 \quad \text{for } t = T_1 + 1, \dots, T. \tag{5}$$

We assume that the other units are not affected by the new policy applied to the first unit. Therefore,

$$y_{it} = y_{it}^0 = \alpha_i + b_i' f_t + u_{it}, \quad \text{for } i = 2, \dots, N \text{ and } t = 1, \dots, T. \tag{6}$$

Since $y_{1t}^0$ is not observable for $t \ge T_1 + 1$, we need to estimate the counterfactual outcome $y_{1t}^0$. If $T$ and $N$ are large, the method of Bai and Ng (2002) identifies the number of common factors, $K$, and estimates $f_t$ along with $B$ by the maximum likelihood approach. Hsiao, Ching and Wan (2012) suggest an alternative method that uses $Y_t = (y_{2t}, \dots, y_{Nt})'$ in lieu of $f_t$ to predict $y_{1t}^0$ for post-treatment period $t$. The validity of HCW’s approach depends on a linear conditional expectation function assumption; if the conditional mean function is not linear, their proposed estimator may not lead to consistent estimation of the average treatment effects. In this paper we generalize HCW’s method to allow for a flexible conditional mean function, so that we can consistently estimate the average treatment effects in a way that is robust to functional form misspecification.

2.1 The Parametric Regression Model

In this subsection we discuss HCW’s proposed estimation method. Recall that $y_t = (y_{1t}, Y_t')'$, where $Y_t = (y_{2t}, \dots, y_{Nt})'$. The pre-treatment data $y_t^0$ are generated by the factor model

$$y_t^0 = \alpha + B f_t + u_t, \quad \text{for } t = 1, \dots, T_1, \tag{7}$$

where $\alpha = (\alpha_1, \dots, \alpha_N)'$, $B$ is an $N \times K$ factor loading matrix, $f_t$ is the unobserved $K \times 1$ vector of common factors, and $u_t$ is an $N \times 1$ idiosyncratic I(0) error. Let $a = (1, -\gamma')'$, where $\gamma = (\gamma_2, \dots, \gamma_N)'$, be such that $a'B = 0$, i.e., $a \in \mathcal{N}(B)$, where $\mathcal{N}(B)$ is the null space of $B$. Then we have

$$a'y_t \equiv y_{1t} - \gamma'Y_t = a'\alpha + a'u_t,$$

because $a'B = 0$, so that the common factors drop out of the right-hand side of the above equation. Rearranging terms, we obtain

$$y_{1t} = \gamma_1 + \gamma'Y_t + u_{1t}^*, \tag{8}$$

where $\gamma_1 = a'\alpha$ and $u_{1t}^* = a'u_t = u_{1t} - \gamma'U_t$ with $U_t = (u_{2t}, \dots, u_{Nt})'$.
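The cancellation of the common factors can be checked numerically. The following is a minimal sketch on simulated data, assuming hypothetical sizes (N = 4 units, K = 2 factors) and randomly drawn loadings; none of these values come from the paper. It solves for a vector a = (1, −γ′)′ with a′B = 0 and confirms that a′y_t contains no factor term.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, T = 4, 2, 50                      # hypothetical sizes, with N - 1 >= K
B = rng.normal(size=(N, K))             # factor loadings (row i is b_i')
alpha = rng.normal(size=N)              # unit-specific intercepts
f = rng.normal(size=(T, K))             # common factors f_t
u = 0.1 * rng.normal(size=(T, N))       # idiosyncratic errors u_t
y = alpha + f @ B.T + u                 # y[t] = alpha + B f_t + u_t

# Solve B[1:].T @ gamma = B[0], so that a = (1, -gamma')' satisfies a'B = 0.
gamma, *_ = np.linalg.lstsq(B[1:].T, B[0], rcond=None)
a = np.concatenate(([1.0], -gamma))

lhs = y @ a                             # a'y_t = y_1t - gamma'Y_t
rhs = a @ alpha + u @ a                 # a'alpha + a'u_t, no factor term
print(np.allclose(a @ B, 0.0), np.allclose(lhs, rhs))
```

When N − 1 > K the vector γ is not unique; the least-squares call simply returns the minimum-norm solution, which is one valid choice.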

Because $u_{1t}^*$ depends on all the $u_{jt}$ for $j = 1, \dots, N$, $u_{1t}^*$ and $Y_t$ are correlated. We can decompose $u_{1t}^*$ as $u_{1t}^* = E(u_{1t}^*|Y_t) + \varepsilon_{1t}$, where $\varepsilon_{1t} = u_{1t}^* - E(u_{1t}^*|Y_t)$ and $E(\varepsilon_{1t}|Y_t) = 0$. Then we can rewrite (8) as

$$y_{1t} = \gamma_1 + \gamma'Y_t + E(u_{1t}^*|Y_t) + \varepsilon_{1t}. \tag{9}$$

HCW further assume a linear conditional mean function:

$$E(u_{1t}^*|Y_t) = a_0 + b_0'Y_t. \tag{10}$$

Equation (10) is in fact equivalent to Assumption 6 of HCW (2012).

Replacing $E(u_{1t}^*|Y_t)$ in (9) by (10), we obtain

$$y_{1t} = \delta_1 + \delta'Y_t + \varepsilon_{1t}, \tag{11}$$

where $\delta_1 = \gamma_1 + a_0$ and $\delta = \gamma + b_0$. Because $E(\varepsilon_{1t}|Y_t) = 0$ by the definition of $\varepsilon_{1t}$, and the regression function in (11) is linear in $Y_t$ under (10), an OLS regression of $y_{1t}$ on $(1, Y_t')$ yields a consistent estimate of $(\delta_1, \delta')'$.

Let $(\hat\delta_1, \hat\delta')$ denote the OLS estimator of $(\delta_1, \delta')$ based on (11). HCW suggest using

$$\hat y_{1t}^0 = \hat\delta_1 + Y_t'\hat\delta, \quad t = T_1 + 1, \dots, T,$$

to predict the counterfactual outcome $y_{1t}^0$. The average treatment effect is then estimated by $T_2^{-1} \sum_{t=T_1+1}^{T} \hat\Delta_{1t}$, where $T_2 = T - T_1$ and $\hat\Delta_{1t} = y_{1t} - \hat y_{1t}^0$.
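The parametric procedure can be illustrated with a short sketch. The function name `hcw_ate`, the simulated one-factor data, and the true effect of 2.0 are all assumptions made for illustration; this is not the authors' code. The regression is fitted on the pre-treatment periods only, and the fitted coefficients are then applied to the post-treatment control outcomes.

```python
import numpy as np

def hcw_ate(y1, Y, T1):
    """OLS counterfactual prediction in the spirit of equation (11).

    y1 : (T,) outcomes of the treated unit
    Y  : (T, N-1) outcomes of the control units
    T1 : number of pre-treatment periods
    """
    X = np.column_stack([np.ones(len(y1)), Y])               # regressors (1, Y_t')
    coef, *_ = np.linalg.lstsq(X[:T1], y1[:T1], rcond=None)  # pre-treatment fit
    y0_hat = X[T1:] @ coef                                   # predicted counterfactual
    effects = y1[T1:] - y0_hat                               # period-by-period effects
    return effects.mean(), effects

# Simulated data (purely illustrative): one common factor, true effect 2.0.
rng = np.random.default_rng(1)
T, T1, N = 120, 80, 5
f = rng.normal(size=T)
b = rng.uniform(0.5, 1.5, size=N)
y = 1.0 + np.outer(f, b) + 0.1 * rng.normal(size=(T, N))
y[T1:, 0] += 2.0                                             # treat unit 1 after T1
ate, effects = hcw_ate(y[:, 0], y[:, 1:], T1)
print(round(ate, 2))
```

Because the simulated factor and errors are Gaussian, the linear conditional mean in (10) holds here by construction, so the estimate should land close to the assumed effect.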

However, the linear conditional mean functional form (10) is restrictive,2 and a misspecified regression functional form may lead to inconsistent estimates of the treatment effects.

2.2 The Semi-parametric Regression Model

In this subsection we discuss how to relax the linear conditional mean function assumption. Without imposing assumption (10), we can write (9) as

$$y_{1t} = g(Y_t) + \varepsilon_{1t}, \tag{12}$$

where $g(Y_t) = \gamma_1 + \gamma'Y_t + E(u_{1t}^*|Y_t)$. The functional form of $g(\cdot)$ is unknown if the functional form of $E(u_{1t}^*|Y_t)$ is unspecified. Nevertheless, we can estimate $g(\cdot)$ by a standard nonparametric estimation method, say the local constant kernel method:

$$\hat g(Y_t) = \frac{\sum_{s=1}^{T_1} y_{1s} K_{1s,2t}}{\sum_{s=1}^{T_1} K_{1s,2t}}, \quad t = T_1 + 1, \dots, T, \tag{13}$$

where $K_{1s,2t} = \prod_{j=2}^{N} k((y_{js} - y_{jt})/h_j)$ is the product kernel function and $h_j$ is the smoothing parameter associated with covariate $y_{jt}$ for $j = 2, \dots, N$. The average treatment effect is then estimated by

$$\hat\Delta_{1,NP} = \frac{1}{T_2} \sum_{t=T_1+1}^{T} [y_{1t} - \hat g(Y_t)]. \tag{14}$$

2 When $f_t$ and $u_t$ are jointly normally distributed, $y_t$ is jointly normally distributed and (10) holds. When $f_t$ and $u_t$ are not jointly normally distributed, however, (10) may not hold in general.
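Equations (13) and (14) can be sketched in a few lines. The code below assumes a Gaussian product kernel and a fixed, hand-picked bandwidth; the paper does not specify either choice, and the function names and simulated data are hypothetical. Kernel normalizing constants are omitted because they cancel in the ratio.

```python
import numpy as np

def nw_counterfactual(y1_pre, Y_pre, Y_post, h):
    """Local-constant (Nadaraya-Watson) prediction as in equation (13).

    y1_pre : (T1,) pre-treatment outcomes of the treated unit
    Y_pre  : (T1, N-1) pre-treatment control outcomes
    Y_post : (T2, N-1) post-treatment control outcomes
    h      : (N-1,) bandwidths, one per control unit
    """
    # Scaled differences (y_js - y_jt) / h_j, shape (T2, T1, N-1).
    d = (Y_pre[None, :, :] - Y_post[:, None, :]) / h
    K = np.exp(-0.5 * d**2).prod(axis=2)      # Gaussian product kernel (assumed)
    return (K @ y1_pre) / K.sum(axis=1)       # kernel-weighted average of y_1s

def nw_ate(y1, Y, T1, h):
    """Average of y_1t minus the kernel prediction, as in equation (14)."""
    g_hat = nw_counterfactual(y1[:T1], Y[:T1], Y[T1:], h)
    return np.mean(y1[T1:] - g_hat)

# Simulated data (illustrative): nonlinear g and a true effect of 1.0.
rng = np.random.default_rng(2)
T, T1 = 400, 300
Y = rng.uniform(-1, 1, size=(T, 1))
y1 = np.sin(2 * Y[:, 0]) + 0.05 * rng.normal(size=T)
y1[T1:] += 1.0
ate = nw_ate(y1, Y, T1, h=np.array([0.1]))
print(round(ate, 2))
```

In practice the bandwidths would be chosen by a data-driven rule such as cross-validation, and post-treatment points far from all pre-treatment observations would need care because the denominator can approach zero.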

When $N$ is not small, the above estimator suffers from the ‘curse of dimensionality’ problem. Alternatively, one may use a semi-parametric regression function to approximate $g(\cdot)$, say $g(Y_t) = \beta'z_{1t} + \theta(z_{2t})$, where $z_{1t}$ and $z_{2t}$ are non-overlapping sets of variables whose union forms $Y_t$, i.e., $Y_t' = (z_{1t}', z_{2t}')$. Let $q$ be the dimension of $z_{2t}$. When $q$ is low, the ‘curse of dimensionality’ problem is greatly ameliorated. We can use the method of Robinson (1988) to estimate the partially linear model. In fact, the partially linear model includes both a linear model and a general nonparametric model as special cases. It gives rise to a linear model if $z_{1t} = Y_t$ and $z_{2t}$ is an empty set of covariates, and it leads to a nonparametric regression model when $z_{2t} = Y_t$ and $z_{1t}$ is an empty set.

Let $\hat\beta$ and $\hat\theta(z_{2t})$ denote the semi-parametric estimators of $\beta$ and $\theta(z_{2t})$, respectively, obtained from the pre-treatment data by the estimation method of Robinson (1988) or by the estimators of Fan and Li (1999), who consider the weakly dependent data case. Under quite general conditions, Robinson (1988) and Fan and Li (1999) show that $\hat\beta - \beta = O_p(T_1^{-1/2})$. The counterfactual outcome $y_{1t}^0$ is then estimated by $\hat y_{1t}^0 = z_{1t}'\hat\beta + \hat\theta(z_{2t})$ for $t = T_1 + 1, \dots, T$, and the average treatment effect is estimated by

$$\hat\Delta_{1,PL} = \frac{1}{T_2} \sum_{t=T_1+1}^{T} \left[ y_{1t} - z_{1t}'\hat\beta - \hat\theta(z_{2t}) \right]. \tag{15}$$
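A simplified sketch of a Robinson-type double-residual estimator for the partially linear model is given below. It assumes a scalar nonparametric covariate (q = 1), a Gaussian kernel, and a fixed bandwidth, and it omits the trimming or density weighting used in Robinson (1988); the function names and the simulated data are hypothetical.

```python
import numpy as np

def nw_fit(target, z_train, z_eval, h):
    """Kernel regression of each column of target (2-D) on a scalar covariate."""
    w = np.exp(-0.5 * ((z_eval[:, None] - z_train[None, :]) / h) ** 2)
    return (w @ target) / w.sum(axis=1, keepdims=True)

def partially_linear_ate(y1, z1, z2, T1, h):
    """Double-residual estimation of y = z1'beta + theta(z2) + e, then eq. (15).

    y1 : (T,), z1 : (T, p), z2 : (T,) scalar nonparametric covariate.
    """
    yp, z1p, z2p = y1[:T1, None], z1[:T1], z2[:T1]
    # Step 1: partial z2 out of y and z1 using pre-treatment data only.
    ey = yp - nw_fit(yp, z2p, z2p, h)
    ez = z1p - nw_fit(z1p, z2p, z2p, h)
    # Step 2: OLS of residual on residual gives beta_hat.
    beta, *_ = np.linalg.lstsq(ez, ey[:, 0], rcond=None)
    # Step 3: theta_hat is the kernel regression of y - z1'beta_hat on z2.
    partial = yp - (z1p @ beta)[:, None]
    theta_post = nw_fit(partial, z2p, z2[T1:], h)[:, 0]
    y0_hat = z1[T1:] @ beta + theta_post          # predicted counterfactual
    return np.mean(y1[T1:] - y0_hat), beta

# Simulated data (illustrative): beta = (0.5, -0.3), true effect 1.5.
rng = np.random.default_rng(3)
T, T1 = 500, 400
z1 = rng.normal(size=(T, 2))
z2 = rng.uniform(-1, 1, size=T)
y1 = z1 @ np.array([0.5, -0.3]) + np.cos(2 * z2) + 0.05 * rng.normal(size=T)
y1[T1:] += 1.5
ate, beta = partially_linear_ate(y1, z1, z2, T1, h=0.1)
print(round(ate, 2), np.round(beta, 2))
```

Keeping only one covariate in the nonparametric part is what limits the dimensionality problem: the kernel smoothing is always over a scalar, while the remaining controls enter linearly.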

We will show that $\hat\Delta_1$ consistently estimates the sample mean of $\Delta_{1t}$ (i.e., $\bar\Delta_1 = T_2^{-1} \sum_{t=T_1+1}^{T} \Delta_{1t}$) whether $\Delta_{1t}$ is stationary or not. In the case that $\Delta_{1t}$ is a strictly stationary process, $\hat\Delta_1$ consistently estimates the population mean $\Delta_1 = E(\Delta_{1t})$.

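To establish the consistency of $\hat\Delta_1$, we need some regularity assumptions. We consider β-mixing processes. Let $\{W_t\} \equiv \{y_t, f_t\}$ be a strictly stationary β-mixing process, i.e., as $\tau \to \infty$,

$$\beta_\tau = \sup_{s \in \mathbb{N}} E\left\{ \sup_{A \in \mathcal{M}_{s+\tau}^{\infty}(W)} \left[ \left| P(A \mid \mathcal{M}_{-\infty}^{s}(W)) - P(A) \right| \right] \right\} \to 0,$$

where $\mathcal{M}_s^t(W)$ denotes $\sigma(W_s, \dots, W_t)$, the sigma algebra generated by $(W_s, \dots, W_t)$, for $s < t$.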

We will use the class of kernel functions $\mathcal{K}_l$ for some positive integer $l$, and the class of smooth functions $\mathcal{G}_\mu^\alpha$ introduced by Robinson (1988). For readers’ convenience, we restate these two definitions here.

Definition 1 $\mathcal{K}_l$, $l \ge 2$, is the class of even functions $k: \mathbb{R} \to \mathbb{R}$ satisfying

$$\int_{\mathbb{R}} v^i k(v)\,dv = \delta_{i0} \quad (i = 0, 1, \dots, l - 1),$$

$$k(v) = O\left((1 + |v|^{l+1+\delta})^{-1}\right) \quad \text{for some } \delta > 0,$$

where $\delta_{ij}$ is the Kronecker delta.

Definition 2 $\mathcal{G}_\mu^\alpha$, where $\alpha > 0$ and $\mu \ge 1$ is a positive integer, is the class of functions $g: \mathbb{R} \to \mathbb{R}$ satisfying: $g$ is $\mu$ times partially differentiable; for some $\rho > 0$, $\sup_{y \in \phi_{z\rho}} |g(y) - g(z) - Q_g(y, z)| \le G_g(z)|y - z|^\mu$ for all $z$, where $\phi_{z\rho} = \{y : |y - z| < \rho\}$; $Q_g$ is a $(\mu - 1)$th degree homogeneous polynomial in $y - z$ with coefficients the partial derivatives of $g$ at $z$ of orders 1 through $\mu - 1$; and $g(z)$, its partial derivatives of order $\mu - 1$ and less, and $G_g(z)$ all have finite $\alpha$th moments.

Definition 1 characterizes the class of (possibly higher order) kernel functions. When $l > 2$, $k$ is a higher order kernel function, which is often used to reduce nonparametric estimation bias. Definition 2 imposes smoothness and moment conditions on a function $g \in \mathcal{G}_\mu^\alpha$. It means that $g$ is differentiable up to order $\mu$ and has a Taylor expansion given by $[g(z) + Q_g(y, z)]$ with a remainder satisfying a local Lipschitz condition. Also, $g$ and its partial derivative functions (up to order $\mu$) all have finite $\alpha$th moments.

Let $f(\cdot)$ be the density function of $z_{2t}$ and let $\xi_t = E(z_{1t}|z_{2t})$. Define $\eta_t = z_{1t} - E(z_{1t}|z_{2t})$. We make the following assumptions.

Assumption A1. $\{y_t, f_t\}$, $t = 1, \dots$, is a strictly stationary β-mixing process with the mixing coefficient $\beta_\tau$ satisfying $\beta_\tau^{\delta/(1+\delta)} = O(\tau^{-2+\epsilon})$ for some $0 < \epsilon < 1$, $0 < \delta \le 1/2$ and $q\delta/(1 + \delta) < 2$.

Assumption A2. $f \in \mathcal{G}_{\nu-1}^{\infty}$ and $g \in \mathcal{G}_{\nu}^{4}$, where $\nu \ge 2$ is a positive integer. Also, $f$, $g$ and $\xi$ all satisfy some (global) Lipschitz conditions: $|\theta(y) - \theta(z)| \le D_\theta(z)|y - z|$ for all $y, z \in \mathbb{R}^q$, where $D_\theta$ has finite 4th moments and $\theta = f, g, \xi$.

Assumption A3. $\varepsilon_{1t}$ is a martingale difference process, $E(\varepsilon_{1t}^{4+\delta}) < \infty$ and $E(\eta_j^{4+\delta}) < \infty$ for some $\delta > 0$, where $\eta_j$ is the $j$th component of $\eta$ for $j = 1, \dots, N - q$. Also, $\sigma^2(z_1, z_2) = E(v_t^2 | z_{1t} = z_1, z_{2t} = z_2)$ and $\sigma_v^2(z_2) = E(v_t^2 | z_{2t} = z_2)$ both belong to $\mathcal{G}_1^2$, where $v_t = \varepsilon_{1t}$ or $\eta_{jt}$, $j = 1, \dots, N - q$.

Assumption A4. (i) $k \in \mathcal{K}_2$. (ii) $h_j \to 0$ ($j = 1, \dots, q$) and $T_1(h_1 \cdots h_q) \to \infty$ as $T_1 \to \infty$. (iii) $T_2/T_1 = O(1)$.

Will add some remarks to discuss the above assumptions later.

Proposition 2.1 Let $\bar\Delta_1 = T_2^{-1} \sum_{t=T_1+1}^{T} \Delta_{1t}$ and $\|h\| = \sqrt{\sum_{j=1}^{q} h_j^2}$. Under assumptions A1 to A4, we have

(i) $\hat\Delta_{1,PL} - \bar\Delta_1 = O_p(\|h\|^2 + T_1^{-1/2} + T_2^{-1/2})$.

If, in addition to the above conditions, $\Delta_{1t}$ is a strictly stationary process for $t = T_1 + 1, \dots, T$, then with $\Delta_1 = E(\Delta_{1t})$ we have

(ii) $\hat\Delta_{1,PL} - \Delta_1 = O_p(\|h\|^2 + T_1^{-1/2} + T_2^{-1/2})$.

Note that Proposition 2.1 (i) does not require the treatment effects $\Delta_{1t}$ to be a stationary process. Our average treatment effect estimator $\hat\Delta_1$ consistently estimates the sample mean of $\Delta_{1t}$ even if $\Delta_{1t}$ is non-stationary. When $\Delta_{1t}$ is a strictly stationary process, Proposition 2.1 (ii) states that $\hat\Delta_1$ consistently estimates the population mean $\Delta_1 = E(\Delta_{1t})$.

The result of Proposition 2.1 requires both $T_1$ and $T_2$ to be large. This is intuitive. We need $T_1$ to be large in order to accurately estimate the conditional mean function using the pre-treatment data. We also need $T_2$ to be large in order to consistently estimate the average treatment effect using post-treatment data (so that $T_2^{-1} \sum_{t=T_1+1}^{T} \varepsilon_{1t} \xrightarrow{p} 0$).

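The next theorem establishes the asymptotic normality of our average treatment effect estimator. We make two additional assumptions.

Assumption A5. (i) $k \in \mathcal{K}_\nu$. (ii) $h_1 = \dots = h_q = h$, $T_1 h^{2\nu} \to 0$ and $T_1 h^{2q} \to \infty$ as $T_1 \to \infty$. (iii) $E(\varepsilon_{1t}|\Delta_{1t}) = 0$ for $t = T_1 + 1, \dots, T$.

Assumption A6. (i) For $t \ge T_1 + 1$, $\Delta_{1t}$ and $\varepsilon_{1t}$ are independent of $Y_s$ for $s \le T_1$. (ii) $\Delta_{1t}$ is a strictly stationary process and $T_2^{-1/2} \sum_{t=T_1+1}^{T} (\Delta_{1t} - E(\Delta_{1t})) \xrightarrow{d} N(0, \sigma_\Delta^2)$, where $\sigma_\Delta^2 = \lim_{T_2 \to \infty} Var(T_2^{-1/2} \sum_{t=T_1+1}^{T} \Delta_{1t})$.

Theorem 2.2 (i) Under assumptions A1 to A6, if $T_2/T_1 \to 0$ as $T_2 \to \infty$, we have

$$\sqrt{T_2}\left[\hat\Delta_{1,PL} - \bar\Delta_1\right] \xrightarrow{d} N(0, \sigma_\varepsilon^2 + \sigma_\Delta^2),$$

where $\sigma_\varepsilon^2 = var(\varepsilon_{1t})$.

(ii) Under assumptions A1 to A6, if $T_2/T_1 \to \lambda \in (0, \infty)$ as $T_2 \to \infty$, we have

$$\sqrt{T_2}\left[\hat\Delta_{1,PL} - \Delta_1\right] \xrightarrow{d} N(0, \sigma_\varepsilon^2(1 + \lambda) + \sigma_\Delta^2 + V_3),$$

where $V_3$ is given in the Appendix.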

Note that for Theorem 2.2 (i) we do not require $\Delta_{1t}$ to be a stationary process. For example, $\Delta_{1t}$ can contain a (deterministic or stochastic) trend. In this case our estimator $\hat\Delta_1$ still consistently estimates $\bar\Delta_1$ (because $\hat\Delta_1 - \bar\Delta_1 = O_p(T_1^{-1/2})$), even though $\Delta_{1t}$ may be an explosive process.

3 The Treatment Effects of the 2008 Economic Stimulus Package of China

We apply the model to study the treatment effects of the 2008 Chinese Economic Stimulus Program.

On November 9 of 2008, the central government of China announced an economic stimulus package

of four trillion RMB (equivalent to 586 billion USD) to minimize the influence of the global financial

crisis. In relative terms, this package was the biggest stimulus program in the world, equal to about three times the size of the U.S. effort.3 The program was implemented from the fourth quarter of 2008 through 2009 and 2010, and has been considered one of the most significant economic events in China over the past decade. Existing discussions of this stimulus program have been twofold. On the one hand, proponents assert that it led China to become the first major economy to emerge from the global financial crisis, helped to stabilize the world economy, and assisted many other countries in recovering from the downturn. On the other hand, critics maintain that it made matters worse by pumping excessive investment into an economy that later ended up overheated by over-investment.

In this section, we do not intend to provide a normative evaluation of the 2008 stimulus package. Instead, we aim to produce quantitative estimates of the influence of this program on China’s key macro variables – real GDP and its private components: investment, consumption, import, and export. Note that, to achieve this purpose, we cannot simply compare values before and after the fourth quarter of 2008, because the impact of the global financial crisis has changed over time. Moreover, around 2008 other countries, such as the U.S., also put forth economic stimulus packages that have inevitably influenced the Chinese economy through channels such as international trade. Failure to control for all these time-varying latent factors would falsely attribute their influence to the 2008 Chinese stimulus package, causing biases in the estimates.

3 The package amounted to 12.5% of the Chinese 2008 GDP to be spent over 27 months, while the U.S. stimulus package was worth about 5% of the U.S. GDP. It has been argued that the actual implementation size of the Chinese stimulus program was even bigger than its announced value due to over-lending by banks to local governments (Wong 2011).

Hence, we apply the HCW framework to estimate the counterfactuals, namely, the hypothetical

outcome of the Chinese economy in the absence of the 2008 Chinese stimulus package but under

the influence of other time-varying latent factors. We do this by first estimating the correlations

between China’s macro variables and those of control countries using the pre-treatment data and

then applying the estimates to construct the post-treatment counterfactuals for China. The difference

between the predicted counterfactuals and the actual observed post-treatment data would be the

estimated treatment effects of the 2008 Chinese Stimulus Program.

3.1 The Data

Data on potential control economies are from the OECD statistical databases. The OECD provides

data on real GDP growth, real consumption growth, real investment growth, real export growth, and

real import growth for 39 countries up to the first quarter of 2013. However, the specific coverage

of the time frame differs by variable and by country. Data for China is from the National Bureau of

Statistics of China (NBSC). The NBSC provides the quarterly series on real GDP growth from the

first quarter of 1992 to the first quarter of 2013, and quarterly series on nominal consumption and

nominal fixed asset formation from the first quarter of 2000 to the fourth quarter of 2011. The China

Customs Statistical Yearbook (CCSY) provides the quarterly series on nominal values of imports and

exports from the first quarter of 1990 to the second quarter of 2012.

Note that all series for China are in nominal terms except for that on real GDP. To estimate

the real influences of the fiscal stimulus, we have to convert nominal variables into real terms with

appropriate price deflators. We adjust nominal consumption by the consumer price index (CPI) and

nominal fixed asset formation (investment) with the price index of fixed assets investment (PIFAI).

Quarterly series on CPI and PIFAI are both from the NBSC. The CCSY publishes a monthly series

on the price index of imported goods as well as that of exported goods. We convert the monthly series

into quarterly series by taking the monthly averages, and apply the two price indexes to deflate the

nominal values of imports and exports correspondingly. All variables are measured in quarterly series

on annual growth – the log difference between the level of the current quarter and that of the same quarter of the previous year. In this way, we keep the series stationary and remove any remaining seasonality.4

4 The OECD series are already seasonally adjusted, but the series for China are not seasonally adjusted. We measure annual growth instead of removing the seasonality statistically, in order to avoid potential biases arising from differences in the econometric setting of the seasonal adjustments.
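The data transformations described above (monthly-to-quarterly averaging of price indexes, deflation of nominal series, and year-over-year log growth) can be sketched as follows. The helper names and all numbers are hypothetical, and the sketch assumes a fixed-base price index with base 100, which may differ from how the NBSC and CCSY indexes are actually published.

```python
import math

def monthly_to_quarterly(monthly_index):
    """Average each block of three monthly values into one quarterly value."""
    return [sum(monthly_index[i:i + 3]) / 3
            for i in range(0, len(monthly_index) - 2, 3)]

def deflate(nominal, price_index, base=100.0):
    """Convert nominal values to real terms with a fixed-base price index."""
    return [n / (p / base) for n, p in zip(nominal, price_index)]

def annual_log_growth(levels):
    """Log difference between each quarter and the same quarter a year earlier."""
    return [math.log(levels[t]) - math.log(levels[t - 4])
            for t in range(4, len(levels))]

# Hypothetical two-year example: prices rise 10% in the second year.
monthly_prices = [100.0] * 12 + [110.0] * 12
quarterly_prices = monthly_to_quarterly(monthly_prices)
nominal = [100.0, 120.0, 110.0, 130.0, 121.0, 145.2, 133.1, 157.3]
real = deflate(nominal, quarterly_prices)
growth = annual_log_growth(real)
print([round(g, 4) for g in growth])
```

Comparing each quarter with the same quarter of the previous year removes a stable seasonal pattern without running a separate seasonal-adjustment procedure, which is the motivation given in the text.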


Before applying the econometric model, we test for the existence of unit roots in all the data series.

Table 1 lists the augmented Dickey-Fuller test results for the 39 OECD countries and for China. All

test specifications include a drift term and two lags. Table 1 implies that we can reject the existence

of a unit root for most of the series. This suggests that most of our data series are stationary with a

trend. Accordingly, we include a time trend in all the regressions and remove it only when the trend

coefficient turns out insignificant statistically.

3.2 The Control Group Selection

Two criteria are to be met when selecting countries into the control group for China. First, the control

country must be relevant, displaying a strong correlation with China in the variable to be estimated.

Second, the country should be exogenous of the treatment; in other words, the 2008 Chinese Stimulus

Program should have negligible impact on the control economies.

The exogeneity criterion is emphasized in HCW (2012) as Assumption 5. Violating this criterion

would cause failure in identifying the influence of latent factors on the treatment country. In reality,

however, no country can be fully exogenous to treatment events in China, as in recent years China

has rapidly developed a stronger relationship with the rest of the world.5 Nonetheless, we can identify

countries that are relatively exogenous to China’s policy interventions by investigating trade relationships.

If the volume of a country’s trade with China constitutes only a small fraction of this country’s

GDP, then the influence of the 2008 Fiscal Stimulus on this particular country should be relatively

minimal and negligible.

Accordingly, we construct an exogeneity measure that equals a country’s total trade volume with

China (the sum of its export to China and its import from China) as a ratio of its own GDP. Table 2

presents the average value of this measure for the years of 2008, 2009, and 2010 for the 39 countries.

The threshold value is set at 5%. Namely, we consider countries with a ratio below 5% as relatively

exogenous to China’s policy interventions. As Table 2 shows, most countries satisfy the exogeneity

criterion except for eight: Australia, Canada, Chile, Indonesia, Japan, Korea, Luxembourg, and South

Africa. Therefore, we experiment with estimations using the other 31 countries to find those that generate the best fit and thus also satisfy the relevance criterion.
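The exogeneity screen amounts to a simple ratio and threshold rule, sketched below. The country names and trade figures are hypothetical and are not taken from Table 2; the paper averages the ratio over 2008, 2009, and 2010, whereas this sketch uses a single value per country for brevity.

```python
def exogeneity_ratio(exports_to_china, imports_from_china, gdp):
    """Total trade with China as a share of the country's own GDP."""
    return (exports_to_china + imports_from_china) / gdp

def select_controls(trade_data, threshold=0.05):
    """Keep countries whose trade-to-GDP ratio with China is below the threshold."""
    return sorted(
        name for name, (ex, im, gdp) in trade_data.items()
        if exogeneity_ratio(ex, im, gdp) < threshold
    )

# Hypothetical figures in billions of USD: (exports, imports, GDP).
trade_data = {
    "Country A": (10.0, 15.0, 1000.0),   # ratio 0.025, kept
    "Country B": (40.0, 45.0, 900.0),    # ratio about 0.094, dropped
    "Country C": (5.0, 4.0, 300.0),      # ratio 0.03, kept
}
controls = select_controls(trade_data)
print(controls)
```

Countries that pass this screen are only candidates; the relevance criterion is then checked separately through the fit of the pre-treatment regression.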

Note that, in constructing the exogeneity measure, we divide the trade volume with China by the potential control country’s GDP, not by China’s GDP. In other words, it does not violate the exogeneity criterion if one country’s trade with China takes a sizable share of Chinese GDP. This is because we want to avoid causality running from China to the control countries, but not in the opposite direction. As a matter of fact, a sizable share of China’s GDP accounted for by a country’s trade with China is exactly why this country displays strong relevance to China and why it serves as a good candidate to control for time-varying factors other than the treatment.

5 For example, it has been reported that the 2008 Economic Stimulus money has financed the purchase of certain mineral deposits by China from Australia and from some countries in South America.

3.3 The Estimation

We estimate the following parametric setting and a more flexible semi-parametric setting:

$$y_{1t} = \alpha + Y_t'\gamma + \beta_1 t + x_t'\beta_2 + \varepsilon_{1t}, \tag{16}$$

$$y_{1t} = g(Y_t) + \beta_1 t + x_t'\beta_2 + \varepsilon_{1t}. \tag{17}$$

Note that when $g(Y_t) = \alpha + Y_t'\gamma$, model (17) reduces to (16). Hence, the semi-parametric model (17) includes the parametric model (16) as a special case.

In (16) and (17), $y_{1t}$ can be real GDP growth, real consumption growth, real investment growth, real export growth, or real import growth. $\alpha$ is an intercept. $Y_t$ contains the corresponding variables from the control countries. $t$ is a time trend; the trend term $\beta_1 t$ is removed from the regression when its estimated coefficient turns out to be statistically insignificant. $x_t$ contains optional exogenous controls that help with the fit of the data. $\varepsilon_{1t}$ is the error term.

Since the functional form of $g(\cdot)$ is not specified, (17) is a partially linear model with the time trend and the other possible controls $x_t$ as the parametric component. We use the profile least squares method to estimate $\beta = (\beta_1, \beta_2')'$ and $g(\cdot)$. Specifically, let $z_t' = (t, x_t')$, so that $t\beta_1 + x_t'\beta_2 = z_t'\beta$. We first treat $\beta$ as if it were known and estimate $g(Y_t)$ by

$$\tilde g(Y_t) = \frac{T_1^{-1}\sum_{s=1}^{T_1}(y_{1s} - z_s'\beta)K_{h,ts}}{\hat f_t} \equiv A_{1t} - A_{2t}'\beta, \tag{18}$$

where $A_{1t} = T_1^{-1}\sum_{s=1}^{T_1} y_{1s}K_{h,ts}/\hat f_t$, $A_{2t} = T_1^{-1}\sum_{s=1}^{T_1} z_s K_{h,ts}/\hat f_t$, $\hat f_t = T_1^{-1}\sum_{s=1}^{T_1} K_{h,ts}$, and $K_{h,ts} = \prod_{l=1}^{q} h_l^{-1} k((Y_{t,l} - Y_{s,l})/h_l)$ is the product kernel function ($q$ is the dimension of $Y_t$). $\tilde g(Y_t)$ is not feasible because $\beta$ is unknown. Next, we replace $g(Y_t)$ in (17) by $\tilde g(Y_t)$ given by (18); re-arranging terms yields

$$y_{1t} - A_{1t} = (z_t - A_{2t})'\beta + \tilde\varepsilon_{1t}, \tag{19}$$

where $\tilde\varepsilon_{1t} = y_{1t} - A_{1t} - (z_t - A_{2t})'\beta$. Applying the least squares regression method to (19) gives

$$\hat\beta = \left[\sum_{t=1}^{T_1}(z_t - A_{2t})(z_t - A_{2t})'\right]^{-1}\sum_{t=1}^{T_1}(z_t - A_{2t})(y_{1t} - A_{1t}). \tag{20}$$

A feasible estimator for $g(Y_t)$ is obtained from (18) with $\beta$ replaced by $\hat\beta$:

$$\hat g(Y_t) = \frac{T_1^{-1}\sum_{s=1}^{T_1}(y_{1s} - z_s'\hat\beta)K_{h,ts}}{\hat f_t}. \tag{21}$$
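The profile least squares steps in (18)-(21) can be sketched as follows. This is a minimal illustration on simulated data, not the code used for the results in this paper. The Gaussian kernel, the fixed bandwidth `h = 0.2`, the scaled trend `t / T`, the data-generating process, and all function names are assumptions made for the example.

```python
import math
import random


def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def kernel_row(Y, t, h):
    """Product Gaussian kernel K_{h,ts} for all s, evaluated at Y[t]."""
    row = []
    for ys in Y:
        w = 1.0
        for ysl, ytl, hl in zip(ys, Y[t], h):
            u = (ytl - ysl) / hl
            w *= math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * hl)
        row.append(w)
    return row


def profile_least_squares(y, Z, Y, h):
    """Return (beta_hat, g_hat) following equations (18)-(21)."""
    T, d = len(y), len(Z[0])
    A1, A2, K = [], [], []
    for t in range(T):
        k = kernel_row(Y, t, h)
        total = sum(k)                        # equals T * f_t
        K.append([w / total for w in k])      # weights K_{h,ts} / (T * f_t)
        A1.append(sum(w * v for w, v in zip(K[t], y)))
        A2.append([sum(w * z[j] for w, z in zip(K[t], Z)) for j in range(d)])
    ry = [y[t] - A1[t] for t in range(T)]
    rz = [[Z[t][j] - A2[t][j] for j in range(d)] for t in range(T)]
    xtx = [[sum(r[i] * r[j] for r in rz) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * v for r, v in zip(rz, ry)) for i in range(d)]
    beta = solve(xtx, xty)                    # equation (20)
    resid = [y[s] - sum(b * z for b, z in zip(beta, Z[s])) for s in range(T)]
    g_hat = [sum(w * e for w, e in zip(K[t], resid)) for t in range(T)]  # (21)
    return beta, g_hat


# Simulated example (hypothetical data): g(Y) = sin(2Y), beta = (1.0, 0.5).
random.seed(1)
T = 200
Y = [[random.uniform(-1.0, 1.0)] for _ in range(T)]
Z = [[t / T, random.gauss(0.0, 1.0)] for t in range(T)]   # scaled trend, one control
g_true = [math.sin(2.0 * row[0]) for row in Y]
y = [g_true[t] + 1.0 * Z[t][0] + 0.5 * Z[t][1] + random.gauss(0.0, 0.1)
     for t in range(T)]

beta_hat, g_hat = profile_least_squares(y, Z, Y, h=[0.2])
mae = sum(abs(a - b) for a, b in zip(g_hat, g_true)) / T
print([round(b, 3) for b in beta_hat], round(mae, 3))
```

In applied work the bandwidth would normally be chosen by a data-driven rule rather than fixed in advance, and the sum in (18) could exclude the own observation; both choices are left out here to keep the sketch close to the formulas.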

When estimating (16) and (17), we put each macro indicator for China on the left-hand side and use the same indicator from the control countries as $Y_t$ on the right-hand side. This is based on the belief that the same macro variables are more likely to share common latent factors.

3.3.1 The Real GDP

As the first step, we apply (16) to estimate the treatment effect on real GDP growth. Panel A of

Table 3 summarizes the parametric estimation results based on the pre-treatment data. The intercept

is suppressed as it turns out to be statistically insignificant. Control countries are Slovenia, Estonia,

and France. Although the real GDP growth series for China starts from the first quarter of 1992, data

for Estonia and Slovenia are not available until the first quarter of 1996. This gives us a total sample

size of 69 running from the first quarter of 1996 to the first quarter of 2013. The pre-treatment sample

size before the fourth quarter of 2008 equals 51 and the post-treatment sample size is 18.⁶

As Table 2 shows, the ratio of total trade volume with China to GDP is below 5% for each of the three control countries, so that the exogeneity criterion is met. Based on the pre-treatment data, the three countries' real GDP growth rates, together with a time trend, display a strong correlation with China's real GDP growth: the adjusted R-square equals 0.9906 and the F-statistic is 1334.41, so that the relevance criterion is also met. Our experiments with other specifications show that the strong relevance of these three countries to China's real GDP growth is quite robust: for example, if we drop the time trend, the adjusted R-square and the F-statistic remain high, at 0.9477 and 505.03 respectively, in the parametric setting.
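As a reminder of how the two relevance statistics are computed, the following sketch uses the textbook formulas for a regression with an intercept and `k` slope coefficients. The toy numbers are hypothetical, and the helper `relevance_stats` is an illustration only; with a suppressed intercept, as in Table 3, software packages may report a differently centred version of these statistics.

```python
def relevance_stats(y, fitted, k):
    """Adjusted R-squared and F-statistic (centred, textbook formulas)
    for a regression with an intercept and k slope coefficients."""
    n = len(y)
    mean_y = sum(y) / n
    ssr = sum((a - b) ** 2 for a, b in zip(y, fitted))   # residual sum of squares
    sst = sum((a - mean_y) ** 2 for a in y)              # total sum of squares
    r2 = 1.0 - ssr / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return adj_r2, f_stat


# Hypothetical toy numbers, not data from the paper.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.0, 4.1, 4.9]
adj_r2, f_stat = relevance_stats(y, fitted, k=1)
print(round(adj_r2, 4), round(f_stat, 1))
```

A high adjusted R-square together with a large F-statistic on the pre-treatment sample is what the relevance criterion asks for.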

To predict the post-treatment counterfactuals for China, we apply the estimated coefficients to the post-treatment data of the control countries. Figure 1 plots, from the fourth quarter of 2008 to the first quarter of 2013, the predicted counterfactuals based on the parametric estimation (the dashed line) and on the semi-parametric estimation (the dotted line), together with the observed under-treatment data (the solid line). As Figure 1 shows, the dashed line and the dotted line almost coincide over the entire sample period, implying that the parametric estimation results are very consistent with the semi-parametric estimation results and thus suggesting that the parametric model is correctly specified. The two hypothetical lines track the solid line quite closely before the treatment, suggesting a good fit by both estimation settings. Starting from the treatment quarter, the two hypothetical lines separate from the solid line: actual growth moves above the hypothetical counterfactual growth, implying a positive treatment effect. Starting from the fourth quarter of 2010, the hypothetical lines move close to the solid line and track it closely again, suggesting the treatment effect was temporary.

⁶ Our search for good control countries suggests that most countries with relatively strong predictive power for China's key macro variables turn out to have short time series of available data. For example, Poland and the Russian Federation display strong predictive power for many variables for China, but Poland's data start from the first quarter of 1996 and the Russian Federation's data start from the first quarter of 2003.
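The counterfactual prediction step for the parametric model (16) can be sketched as follows. The example is a simulation under stated assumptions: the two control series, the common factor, the scaled trend `t / 100`, and the hypothetical effect of 0.03 in the first eight post-treatment quarters are invented for illustration and are not the data behind Figure 1.

```python
import random


def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


random.seed(0)
T1, T2 = 51, 18            # pre- and post-treatment sample sizes, as in Section 3.3.1
TRUE_EFFECT = 0.03         # hypothetical effect in the first eight post-treatment quarters

rows, y = [], []
for t in range(T1 + T2):
    factor = random.gauss(0.0, 0.02)                      # common latent factor
    c1 = 0.03 + factor + random.gauss(0.0, 0.01)          # control country 1
    c2 = 0.02 + 0.5 * factor + random.gauss(0.0, 0.01)    # control country 2
    trend = t / 100.0                                     # scaled time trend
    rows.append([1.0, c1, c2, trend])
    y0 = 0.02 + 0.8 * c1 - 0.3 * c2 + 0.05 * trend + random.gauss(0.0, 0.002)
    treated = T1 <= t < T1 + 8
    y.append(y0 + (TRUE_EFFECT if treated else 0.0))

coef = ols(rows[:T1], y[:T1])                             # fit on pre-treatment data only
counterfactual = [sum(b * x for b, x in zip(coef, r)) for r in rows[T1:]]
effects = [a - c for a, c in zip(y[T1:], counterfactual)]  # actual minus hypothetical

avg_first8 = sum(effects[:8]) / 8
avg_later = sum(effects[8:]) / len(effects[8:])
print(round(avg_first8, 4), round(avg_later, 4))
```

The estimated effect in each post-treatment quarter is simply the gap between the observed series and the prediction built from the control countries, which is how the Treatment columns of Table 3 are defined.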

The actual under-treatment data, the predicted values of the counterfactuals, and the estimated treatment effects are listed in Panel B of Table 3. Based on the parametric estimation, the treatment effect was positive from the beginning of the treatment and peaked at 5.44% in the third quarter of 2009. Based on the semi-parametric estimation, however, the treatment effect did not become evident until the second quarter of 2009, about two quarters after the actual treatment took place. The semi-parametric estimation suggests the treatment effect peaked at 6.12%, also in the third quarter of 2009. Both estimations suggest that, starting from the fourth quarter of 2010, the treatment effect dropped below zero and, apart from a small positive value in the fourth quarter of 2011, remained roughly between -1% and zero.

The summary statistics of the treatment effects are listed at the bottom of Panel B. For the two years after the fiscal stimulus, actual real GDP growth in China averaged 9.39%, while the parametrically estimated counterfactual in the absence of the stimulus averaged 6.19%, suggesting an average treatment effect of 3.20% over these eight quarters; the semi-parametrically estimated counterfactual equals 6.93%, suggesting an average treatment effect of 2.46%. From the fourth quarter of 2010 onward, the estimated treatment effect averages only -0.17% according to the parametric estimation and -0.33% according to the semi-parametric estimation. As a comparison, the standard deviation of the pre-treatment real GDP growth from the actual data is 0.96%. This suggests negligible treatment effects two years after the implementation of the fiscal stimulus.
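The period averages quoted above can be reproduced directly from the quarterly treatment effects in Panel B of Table 3. The snippet below is a plain arithmetic check. The variable names are illustrative, and the values are the rounded entries as tabulated.

```python
# Quarterly treatment effects (actual minus hypothetical growth) from
# Panel B of Table 3, 2008q4 through 2013q1.
parametric = [0.0175, 0.0179, 0.0460, 0.0544, 0.0425, 0.0454, 0.0201, 0.0116,
              -0.0023, -0.0027, -0.0033, -0.0026, 0.0084, -0.0092, -0.0020,
              -0.0012, -0.0022, 0.0007]
semiparametric = [0.0074, -0.0116, 0.0548, 0.0612, 0.0216, 0.0373, 0.0168, 0.0090,
                  -0.0037, -0.0041, -0.0050, -0.0041, 0.0060, -0.0107, -0.0041,
                  -0.0029, -0.0031, -0.0017]


def mean(xs):
    return sum(xs) / len(xs)


SPLIT = 8  # 2008q4-2010q3 covers the first eight post-treatment quarters

for name, series in (("parametric", parametric),
                     ("semi-parametric", semiparametric)):
    early, late = mean(series[:SPLIT]), mean(series[SPLIT:])
    print(f"{name}: 2008q4-2010q3 {early:.4f}, 2010q4-2013q1 {late:.4f}")
```

Because the inputs are rounded to four decimals, the recomputed averages can differ from the tabulated ones in the last digit.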

Several remarks should be made regarding the timing of the treatment effects. Our estimates suggest that, in the absence of the stimulus, Chinese real GDP growth would have continued to fall during the global financial crisis, reaching a bottom of 2% to 3% around the third quarter of 2009. Our estimation suggests the policy intervention brought the recovery forward by about two years. Moreover, although the program was implemented through 2010 only, there is no a priori reason to believe its influence could not last beyond 2010. However, our estimates suggest its impact was indeed temporary: real GDP growth after 2010 would have been about the same even without the stimulus.

3.3.2 The Real GDP Components

As a further step, we examine the treatment effects on the private components of real GDP: investment, consumption, export, and import. Unfortunately, the semi-parametric estimation is not applicable here, as the available sample sizes for these variables in China are very small. Nonetheless, our estimation results on real GDP growth have already indicated that the parametric model is correctly specified, as the semi-parametric estimation produces results very similar to those of the parametric estimation. Therefore, here we estimate the treatment effects on real investment growth, real consumption growth, real export growth, and real import growth parametrically only, for the purpose of providing further insight into how the stimulus package influenced China's macroeconomy.

Panel A of Table 4 presents the estimated weights based on the pre-treatment data. The control countries are Argentina and Mexico for investment; Estonia, Spain, and the United States for consumption; Russia and Sweden for export; and the Netherlands, Spain, Turkey, and the United Kingdom for import. For all these countries, the ratio of total trade volume with China to their own GDP is below 5% according to Table 2, implying that the exogeneity criterion of our approach is satisfied. Panel A shows that the adjusted R-squares and F-statistics are all high, suggesting that the related data series in the control countries are highly correlated with the corresponding variables in China; hence, the relevance criterion is also met.

Figure 2 plots the actual and hypothetical series on real investment growth, real consumption

growth, real export growth, and real import growth before and after the treatment; the corresponding

data series are presented in Panel B of Table 4. The estimated results on the treatment effects can be

summarized as follows:

Firstly, the treatment effects are all estimated to be positive, but the magnitudes of the responses differ. The left column of Figure 2 shows that the hypothetical series track the actual data quite well before the treatment, but move below the actual data series after the treatment. In particular, we estimate that the stimulus package raised China's real investment growth by 22.15%, real consumption growth by 4.61%, real export growth by 8.03%, and real import growth by 7.84%. In short, the 2008 Economic Stimulus Package raised all private components of China's real GDP.

Secondly, we estimate the treatment effects of the stimulus package to be temporary on all variables

except for consumption. The hypothetical series move close to the actual data series again, starting

from the fourth quarter of 2010 or slightly before. This suggests the package had raised China’s

investment, export, and import only temporarily. However, the hypothetical consumption growth

stays below the actual consumption growth throughout the entire sample period.

Thirdly, the timing of the responses to the stimulus package differs. As Figure 2 shows, investment and export responded slightly before the stimulus package actually took place, while consumption and import responded slightly afterward. The positive treatment effect on investment ended by the fourth quarter of 2010, exactly when the treatment effect on real GDP ended. The treatment effects on export and import, however, ended well before the end of 2010. Our estimations show that the positive treatment effect on consumption lasted throughout the sample period.

Last but not least, the patterns displayed in Figure 2 are consistent with the cyclical behavior of investment, consumption, export, and import documented in the macroeconomics literature. Investment is known to be the most volatile macro variable due to rational or adaptive expectations or animal spirits, and our results suggest that China's investment responds the fastest and with the largest magnitude. Consumption is documented to be the least volatile macro variable due to risk aversion or habit formation, and our results show that China's consumption responds the most slowly and with the smallest magnitude, while the effect on it lasts the longest. Perhaps more interestingly, our results also raise some new research questions. For example, Figure 2 shows that real investment growth moves below the hypothetical growth after the fourth quarter of 2010. Does this suggest that the stimulus package suppressed China's investment in the longer run? Moreover, the treatment effect on consumption remains positive throughout our sample period. Has the stimulus package permanently raised China's consumption? We leave the exploration of these questions for future research.

In short, the estimated treatment effects on the private components of China's real GDP are consistent with those on real GDP. That is, the 2008 Economic Stimulus Package raised real GDP growth and growth in its private components, but only temporarily. Our results are consistent with the cyclical patterns of key macro variables documented with Western data.

4 Conclusion

The post-treatment outcome is affected by many time-varying latent factors, which makes it difficult to properly evaluate the effects of policy interventions. HCW (2012) initiate an approach that offers more flexibility by allowing the influence of common latent factors to vary across cross-sectional units, which makes the method much more applicable in practice. In this paper, we relax the linear conditional mean assumption in their approach by extending it to a semi-parametric setting. We then derive the asymptotic distribution of the average treatment effect estimator and apply both HCW (2012) and the semi-parametric model to study the macroeconomic effects arising from the 2008 Chinese Economic Stimulus Program.

Our results show that, even though the economic stimulus brought the recovery forward by about two years, the treatment effect on real GDP growth was indeed temporary. Specifically, the plan raised annual real GDP growth by about 3.2% for the two years after the economic stimulus, but real GDP growth after 2010 would have been about the same without the stimulus. These results remain robust to the linear setting, the semi-parametric setting, and various control group selections. Moreover, this temporary boost is also evident in real investment growth, real consumption growth, real export growth, and real import growth.

References

Abadie, A. and Gardeazabal, J. (2003). The Economic Costs of Conflict: A Case Study of the Basque Country. American Economic Review 93, 113-132.

Abadie, A., Diamond, A. and Hainmueller, J. (2010). Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program. Journal of the American Statistical Association 105, 493-505.

Bai, J. and Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica 70, 191-221.

Bai, C., Li, Q. and Ouyang, M. (2014). Property Taxes and Home Prices: A Tale of Two Cities. Journal of Econometrics, forthcoming.

Carroll, C., Slacalek, J. and Sommer, M. (2011). International evidence on sticky consumption growth. Review of Economics and Statistics 93, 1135-1145.

Card, D. and Krueger, A.B. (1994). Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania. American Economic Review 84(4), 772-793.

Ching, H.S., Hsiao, C. and Wan, S.K. (2012). Impact of CEPA on the labor market of Hong Kong. China Economic Review 23, 975-981.

Fan, Y. and Li, Q. (1999). Root-n-consistent estimation of partially linear time series models. Journal of Nonparametric Statistics 11, 251-269.

Hsiao, C., Ching, H.S. and Wan, S.K. (2012). A panel data approach for program evaluation: Measuring the benefit of political and economic integration of Hong Kong with mainland China. Journal of Applied Econometrics 27(5), 705-740.

Robinson, P. (1988). Root-n-consistent semiparametric regression. Econometrica 56, 931-954.

Wong, C. (2011). The Economic Stimulus Program and Problems of Macroeconomic Management in China. Handouts on the 32nd Annual Meeting of OECD Senior Budget Officials.

Yoshihara, K. (1976). Limiting behavior of U-statistics for stationary, absolutely regular processes. Z. Wahrscheinlichkeitstheorie verw. Gebiete 35, 237-252.

Figure 1: Treatment Effect in Real GDP

[Two line plots of real GDP growth with the series "actual", "hypothetical", and "hypothetical_semi"; the horizontal axis marks 2008q4 and 2010q4. Top panel: full sample. Bottom panel: post-treatment period only.]

Note: "Actual" indicates the plot of the data; "hypothetical" indicates the plot of the hypothetical outcome approximated based on the parametric estimation; "hypothetical-semi" indicates the plot of the hypothetical outcome approximated based on the semi-parametric estimation. Data on actual real GDP growth are from the National Bureau of Statistics of China; estimated growth is based on the estimation results before the implementation of the 2008 economic stimulus package in the fourth quarter of 2008 (2008q4). 2010q4 indicates the fourth quarter of 2010. Real GDP growth is measured as the annual growth rate from the same quarter of the previous year to the current quarter. The top figure plots data from the first quarter of 1996 to the first quarter of 2013; the bottom figure emphasizes the treatment effect by plotting the post-treatment data only. See Table 3 for the estimation results and the numerical values of the plots; see text for details.

Figure 2: Treatment Effects on Private Components of Real GDP

[Eight line plots, each showing the "actual" and "hypothetical" series with 2008q4 and 2010q4 marked on the horizontal axis: (a) Investment: full sample; (b) Investment: after treatment; (c) Consumption: full sample; (d) Consumption: after treatment; (e) Export: full sample; (f) Export: after treatment; (g) Import: full sample; (h) Import: after treatment.]

Note: Data are from the National Bureau of Statistics of China and China's Customs Statistical Yearbooks. All variables are measured as quarterly annual real growth rates, that is, the annual growth rate from the same quarter of the previous year to the current quarter. Estimations are based on data before the fourth quarter of 2008. 2008q4 indicates when the economic stimulus plan was implemented; 2010q4 indicates the fourth quarter of 2010. Hypothetical series are approximated based on estimations with the pre-treatment data. See notes to Table 4 for estimation details. See text for variable measurement and other details.

Table 1: The Augmented Dickey-Fuller Test Results

Countries             GDP      Investment  Consumption  Export   Import
Argentina             0.0323   0.0022      0.0117       0.0170   0.0003
Australia             0.0000   0.0000      0.0046       0.0036   0.0000
Austria               0.1715   0.0189      0.6859       0.0046   0.0003
Belgium               0.0009   0.0188      0.0118       0.0011   0.0005
Brazil                0.0001   0.0016      0.0266       0.0717   0.0002
Canada                0.0030   0.0001      0.0071       0.0088   0.0000
Chile                 0.4241   0.0189      0.0153       0.4795   0.0017
Czech Republic        0.2390   0.0527      0.4041       0.0039   0.0020
Denmark               0.0233   0.0935      0.0274       0.0081   0.0017
Estonia               0.0286   0.2941      0.0068       0.0000   0.0026
Finland               0.0069   0.0988      0.0600       0.0003   0.0001
France                0.0013   0.0010      0.1948       0.0000   0.0000
Germany               0.0004   0.0054      0.1573       0.0023   0.0053
Hungary               0.0052   0.0062      0.2041       0.0006   0.0010
Iceland               0.3766   0.3787      0.0072       0.2609   0.1483
India                 0.2366   0.2465      0.2856       0.0262   0.0005
Indonesia             0.0171   0.0064      0.0252       0.0000   0.0004
Ireland               0.5846   0.7972      0.3765       0.0143   0.0636
Israel                0.0037   0.2681      0.0247       0.0001   0.0058
Italy                 0.0008   0.0121      0.0011       0.0002   0.0000
Japan                 0.0006   0.0280      0.0083       0.0000   0.0073
Korea                 0.0000   0.0005      0.0003       0.0000   0.0000
Luxembourg            0.0480   0.0032      0.0079       0.0006   0.0008
Mexico                0.0014   0.0465      0.1835       0.1263   0.0817
Netherlands           0.0142   0.1078      0.3549       0.0001   0.0023
New Zealand           0.0001   0.0419      0.0215       0.0004   0.0031
Norway                0.0045   0.0248      0.0122       0.0032   0.0163
Poland                0.1625   0.6260      0.2779       0.0281   0.0090
Portugal              0.0334   0.1306      0.0491       0.0030   0.0051
Russian Federation    0.1636   0.2786      0.1249       0.0709   0.0976
Slovak Republic       0.2701   0.0156      0.2740       0.0122   0.0059
Slovenia              0.0107   0.2721      0.4024       0.0013   0.0424
South Africa          0.0041   0.0547      0.0167       0.0027   0.0344
Spain                 0.0508   0.2750      0.0829       0.0401   0.0539
Sweden                0.0009   0.1210      0.0172       0.0001   0.0000
Switzerland           0.0023   0.0004      0.0729       0.0095   0.0024
Turkey                0.0070   0.0019      0.0266       0.0025   0.0014
United Kingdom        0.0009   0.0879      0.0136       0.0013   0.0019
United States         0.0045   0.0340      0.0726       0.0003   0.0010
China                 0.0164   0.0375      0.0174       0.0002   0.0002

Note: This table reports the MacKinnon approximate p-values of the augmented Dickey-Fuller tests on quarterly series for 40 countries. All variables are in real terms and measured as annual growth (the log first difference between each quarter and the same quarter of the last year). Test specifications include a drift and two lags. The sample sizes range from 37 to 85 with an average of 75. See the text for details on price deflators, measurements, and data sources.

Table 2: Exogeneity Measure: Trade with China over GDP Ratio

Countries         trade/GDP    Countries            trade/GDP
Argentina         0.0441       Australia            0.0567*
Austria           0.0118       Belgium              0.0122
Brazil            0.0324       Canada               0.0679*
Chile             0.0966*      Czech Republic       0.0288
Denmark           0.0237       Estonia              0.0285
Finland           0.0400       France               0.0138
Germany           0.0317       Hungary              0.0485
Iceland           0.0076       India                0.0424
Indonesia         0.0618*      Ireland              0.0270
Israel            0.0300       Italy                0.0166
Japan             0.0550*      Korea                0.1998*
Luxembourg        0.0703*      Mexico               0.0161
Netherlands       0.0488       New Zealand          0.0339
Norway            0.0104       Poland               0.0197
Portugal          0.0107       Russian Federation   0.0343
Slovak Republic   0.0301       Slovenia             0.0201
South Africa      0.0654*      Spain                0.0165
Sweden            0.0209       Switzerland          0.0215
Turkey            0.0172       United Kingdom       0.0172
United States     0.0235

Note: This table reports the total nominal trade volume with China as a ratio of each country's nominal GDP for the 39 sample OECD countries. Total trade volume is measured as the sum of the nominal values of imports and exports between the sample country and China. The reported ratios are the average of 2008, 2009, and 2010. Data are from the Chinese Customs Year Books of 2008, 2009, and 2010. * indicates that the ratio exceeds the threshold value of 5% and violates the exogeneity criterion.

Table 3: The Real GDP

Panel A: Control Countries and their parametric weights
R2 = 0.9906, F-stats = 1334.41

Countries     weights      SD       T
Slovenia      0.7001***    0.1048   6.68
Estonia       0.0949***    0.0351   2.71
France        -0.5965***   0.1184   -5.04
time trend    0.0452***    0.0026   17.66

Panel B: Treatment Effects

                          parametric                 semiparametric
quarter    Actual    Hypothetical  Treatment    Hypothetical  Treatment
2008q4     0.0960    0.0785        0.0175       0.0886        0.0074
2009q1     0.0660    0.0481        0.0179       0.0776        -0.0116
2009q2     0.0750    0.0290        0.0460       0.0202        0.0548
2009q3     0.0820    0.0276        0.0544       0.0208        0.0612
2009q4     0.0920    0.0495        0.0425       0.0704        0.0216
2010q1     0.1210    0.0756        0.0454       0.0837        0.0373
2010q2     0.1120    0.0919        0.0201       0.0952        0.0168
2010q3     0.1070    0.0954        0.0116       0.0980        0.0090
2010q4     0.1040    0.1063        -0.0023      0.1077        -0.0037
2011q1     0.0980    0.1007        -0.0027      0.1021        -0.0041
2011q2     0.0960    0.0993        -0.0033      0.1010        -0.0050
2011q3     0.0950    0.0976        -0.0026      0.0991        -0.0041
2011q4     0.0930    0.0846        0.0084       0.0870        0.0060
2012q1     0.0810    0.0902        -0.0092      0.0917        -0.0107
2012q2     0.0780    0.0800        -0.0020      0.0821        -0.0041
2012q3     0.0770    0.0782        -0.0012      0.0799        -0.0029
2012q4     0.0780    0.0802        -0.0022      0.0811        -0.0031
2013q1     0.0770    0.0763        0.0007       0.0787        -0.0017

Average, 2008q4-2010q3
           0.0939    0.0619        0.0319       0.0693        0.0246
Average, 2010q4-2013q1
           0.0877    0.0894        -0.0017      0.0910        -0.0033

Note: The table reports the estimated results for real GDP growth. Real GDP growth is measured as quarterly annual growth: the log first difference between the current real GDP level and that of the same quarter of the last year. Panel A reports the results from the parametric regression $y_{1t} = \delta_1 + \delta' Y_t + \varepsilon_{1t}$. The intercept is suppressed as its coefficient turns out to be statistically insignificant. The sample size is 51. *** indicates significance at the 1% level. Panel B reports the treatment effects estimated both in the parametric setting and the semi-parametric setting: the treatment effect equals the difference between the actual real GDP growth and the hypothetical growth approximated based on estimations. Data for China are from the National Bureau of Statistics of China; data for control countries are from the OECD statistical databases. See text for details.

Table 4: The Private Components of Real GDP

Panel A: Weights of Control Countries

Investment: R2 = 0.8788, F-stats = 374.11
Consumption: R2 = 0.9482, F-stats = 236.77
Export: R2 = 0.9479, F-stats = 195.13
Import: R2 = 0.9055, F-stats = 53.36

Investment                       Consumption                      Export                           Import
Countries  weights     SD        Countries  weights     SD        Countries  weights     SD        Countries  weights     SD
Argentina  0.2296***   0.0789    Estonia    0.0059***   0.0019    Russia     1.9079***   0.1996    NLD        -1.3977***  0.3314
Mexico     1.0455***   0.2965    Spain      0.0153***   0.0053    Sweden     0.7153***   0.2209    Spain      0.9426***   0.2169
                                 U.S.       -0.0448***  0.0079                                     UK         0.4099***   0.1217
                                                                                                   Turkey     0.2492***   0.0533

Controls (weights and SD, in the order printed):
time trend   0.0006***    0.0001    0.0550***   0.0105
dummy1       19.4374***   2.1550
dummy2       -16.8283***  2.0893    -0.1269***  0.0066

Panel B: Treatment Effects

           Investment                    Consumption
quarter    Actual   Predicted  Treat     Actual   Predicted  Treat
2008q4     0.1233   0.0107     0.1126    0.0498   0.1098     -0.0600
2009q1     0.1279   -0.1199    0.2478    0.1037   0.0873     0.0164
2009q2     0.2547   -0.1693    0.4241    0.1196   0.0842     0.0354
2009q3     0.2520   -0.1701    0.4221    0.1064   0.0540     0.0524
2009q4     0.1747   -0.1199    0.2946    0.0930   0.0389     0.0541
2010q1     0.1778   -0.0155    0.1933    0.1200   0.0549     0.0651
2010q2     0.1078   0.0475     0.0603    0.0946   0.0529     0.0417
2010q3     0.0917   0.0741     0.0175    0.0820   0.0422     0.0399
2010q4     0.0971   0.1018     -0.0047   0.0934   0.0189     0.0745
2011q1     0.0840   0.1185     -0.0346   0.0785   0.0044     0.0742
2011q2     0.0788   0.1528     -0.0740   0.0835   0.0179     0.0656
2011q3     0.0611   0.1307     -0.0696   0.1000   0.0356     0.0644
2011q4     0.0875   0.1006     -0.0131   0.1033   0.0274     0.0759

Average, 2008q4-2010q3 (Investment); 2008q4 and afterward (Consumption)
           0.1637   -0.0578    0.2215    0.0945   0.0483     0.0461
Average, 2010q4 and afterward (Investment)
           0.0817   0.1209     -0.0392

           Export                        Import
quarter    Actual   Predicted  Treat     Actual   Predicted  Treat
2008q4     -0.0807  -0.2483    0.1676    -0.1386  -0.0964    -0.0422
2009q1     -0.1323  -0.3109    0.1787    -0.1028  -0.1292    0.0264
2009q2     -0.1515  -0.2390    0.0875    0.0249   -0.0766    0.1015
2009q3     -0.0751  -0.1729    0.0978    0.1174   0.0007     0.1167
2009q4     0.1980   0.1435     0.0545    0.3581   0.0984     0.2596
2010q1     0.2026   0.2599     -0.0574   0.3326   0.1577     0.1749
2010q2     0.3300   0.2106     0.1193    0.1179   0.1405     -0.0225
2010q3     0.1749   0.1803     -0.0053   0.1217   0.1083     0.0134
2010q4     0.1155   0.1610     -0.0455   0.1269   0.1456     -0.0187
2011q1     0.2307   0.0902     0.1406    0.0879   0.1269     -0.0391
2011q2     0.0555   0.0778     -0.0223   0.0190   0.0754     -0.0564
2011q3     0.0621   0.0386     0.0236    0.0405   0.0785     -0.0379
2011q4     0.0140   0.0116     0.0024    0.0204   0.0282     -0.0079
2012q1     0.0638   -0.0033    0.0671    0.0513   0.0433     0.0080
2012q2     0.0781   0.0058     0.0724    0.0680   0.0298     0.0381

Average, 2008q4-2010q3
           0.0582   -0.0221    0.0803    0.1039   0.0254     0.0784
Average, 2010q4 and afterward
           0.0885   0.0545     0.0340    0.0591   0.0753     -0.0162

Note: The table reports the estimated results for real investment, real consumption, real export, and real import. All variables are measured as quarterly annual growth: the log first difference between the current level and that of the same quarter of the last year. Panel A reports the results from the regression $y_{1t} = \alpha + Y_t'\gamma + \beta_1 t + x_t'\beta_2 + \varepsilon_{1t}$. dummy1 indicates a dummy for the first quarter of 2006; dummy2 indicates a dummy for the fourth quarter of 2006. *** indicates significance at the 1% level. Panel B reports the treatment effects, as the difference between the actual data and the hypothetical data approximated using the weights listed in Panel A. Data for China are from the National Bureau of Statistics of China and China's Customs Statistical Yearbooks; data for other countries are from the OECD statistical databases. See text for variable measurements and more details.

5 Appendix

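We first introduce some notation and define some quantities. We use $A_n = B_n + (s.o.)$ to denote that $B_n$ is the leading term of $A_n$, where $(s.o.)$ denotes terms having probability order smaller than that of $B_n$. Define $\xi_t = z_{1t} - E(z_{1t}|z_{2t})$, and note that $y^1_{1t} = y^0_{1t} + \Delta_{1t}$, $y^0_{1t} = z_{1t}'\beta + \theta(z_{2t}) + \varepsilon_{1t}$, and $\hat y^0_{1t} = z_{1t}'\hat\beta + \hat\theta(z_{2t})$, where $\hat\theta(z_{2t})$ is a feasible estimator of $\theta(z_{2t})$ and is defined in (23) below. Hence, for $t \ge T_1 + 1$, we have

$$\begin{aligned}
\hat\Delta_{1t} &= y^1_{1t} - \hat y^0_{1t} \\
&= y_{1t} - z_{1t}'\hat\beta - \hat\theta(z_{2t}) \\
&= \Delta_{1t} - z_{1t}'(\hat\beta - \beta) - [\hat\theta(z_{2t}) - \theta(z_{2t})] + \varepsilon_{1t}.
\end{aligned} \tag{22}$$

From Robinson (1988) or Fan and Li (1999) we know that $\hat\beta - \beta = O_p(T_1^{-1/2})$.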

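Let $\hat E(y_{1t}|z_{2t})$ and $\hat E(z_{1t}|z_{2t})$ denote the kernel estimators of $E(y_{1t}|z_{2t})$ and $E(z_{1t}|z_{2t})$, respectively. Using $\hat\theta(z_{2t}) = \hat E(y_{1t}|z_{2t}) - \hat E(z_{1t}|z_{2t})'\hat\beta$ and $y_{1t} - z_{1t}'\beta = \theta(z_{2t}) + \varepsilon_{1t}$, we obtain

$$\begin{aligned}
\hat\theta(z_{2t}) &= \hat E(y_{1t}|z_{2t}) - \hat E(z_{1t}|z_{2t})'\hat\beta \\
&= \hat E(y_{1t}|z_{2t}) - \hat E(z_{1t}|z_{2t})'\beta - \hat E(z_{1t}|z_{2t})'(\hat\beta - \beta) \\
&= \hat E(y_{1t} - z_{1t}'\beta|z_{2t}) - \hat E(z_{1t}|z_{2t})'(\hat\beta - \beta) \\
&= \theta(z_{2t}) + [\tilde\theta(z_{2t}) - \theta(z_{2t})] + \tilde\varepsilon_{1t} - \hat E(z_{1t}|z_{2t})'(\hat\beta - \beta),
\end{aligned} \tag{23}$$

where in the last equality we added and subtracted $\theta(z_{2t})$ and used $\hat E[y_{1t} - z_{1t}'\beta|z_{2t}] = \hat E[\theta(z_{2t}) + \varepsilon_{1t}|z_{2t}] \equiv \tilde\theta(z_{2t}) + \tilde\varepsilon_{1t}$. Here $\tilde\theta(z_{2t})$ and $\tilde\varepsilon_{1t}$ denote the kernel-smoothed versions of $\theta(z_{2t})$ and $\varepsilon_{1t}$ given in (25) and (26) below.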

We now combine (22) and (23), noting that $\hat E(z_{1t}|z_{2t})'(\hat\beta - \beta) = E(z_{1t}|z_{2t})'(\hat\beta - \beta) + o_p(T_1^{-1/2})$ for any $t \in \{1, \dots, n\}$ by the uniform convergence of $\hat E(z_{1t}|z_{2t}) - E(z_{1t}|z_{2t})$ over the compact set $S_n$. The trimming set $S_n$ is needed to trim data near the boundary of the support. We trim sufficiently small intervals off the boundaries when the boundary of the support of $z_t$ is known. In practice, however, the boundaries of $z_t$ are usually unknown. In this case, we sort the data to find the sample minimum and maximum of $z_t$. With a sufficiently large sample size, the sample minimum and maximum converge to the true boundaries of $z_t$. Hence, we trim sufficiently small intervals off the sample boundaries when the true boundaries of $S$ are unknown. Specifically, let $S = [a_z, b_z]$ denote the support of $z_t$. When $a_z$ and $b_z$ are unknown, we use $\hat a_z \equiv \min_{1 \le t \le n} z_t$ and $\hat b_z \equiv \max_{1 \le t \le n} z_t$ to estimate $a_z$ and $b_z$, respectively. It is well established that $\hat a_z - a_z = O_p(n^{-1})$ and $\hat b_z - b_z = O_p(n^{-1})$ (they are super-consistent estimates). The trimming set can be chosen as $S_n = [\hat a_z + \delta_z, \hat b_z - \delta_z]$, where $\delta_z = \delta_{n,z} \to 0$ as $n \to \infty$ and $h/\delta_z = o(1)$.

26

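Hence, we have, for any $t \in \{1, \dots, n\}$,

$$
\hat\Delta_{1t} = \Delta_{1t} + [\theta_t - \tilde\theta_t] + [\varepsilon_{1t} - \tilde\varepsilon_{1t}] - \xi_t'(\hat\beta - \beta) + o_p(T_1^{-1/2}),
\tag{24}
$$

where $\theta_t = \theta(z_{2t})$ and $\tilde\theta_t = \tilde\theta(z_{2t})$. Here $\tilde\theta_t$ and $\tilde\varepsilon_{1t}$ are defined as follows:

$$
\tilde\theta_t \equiv \tilde\theta(z_{2t}) = T_1^{-1} \sum_{s=1}^{T_1} \theta(z_{2s}) K_{h,ts} / \hat f_t,
\tag{25}
$$

$$
\tilde\varepsilon_{1t} = T_1^{-1} \sum_{s=1}^{T_1} \varepsilon_{1s} K_{h,ts} / \hat f_t,
\tag{26}
$$

where $K_{h,ts} = \prod_{j=1}^{q} h_j^{-1} k((z_{2t,j} - z_{2s,j})/h_j)$, $z_{2t,j}$ is the $j$th component of $z_{2t}$, and $\hat f_t = T_1^{-1} \sum_{s=1}^{T_1} K_{h,ts}$.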
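The product-kernel weights and the smoothed quantities in (25) and (26) can be computed as in the sketch below. The Gaussian choice for $k(\cdot)$ and the function names are assumptions for illustration; any kernel satisfying the paper's conditions could be substituted.

```python
import numpy as np

def gaussian_kernel(v):
    """Univariate kernel k(v); the Gaussian density is an assumed choice."""
    return np.exp(-0.5 * v**2) / np.sqrt(2.0 * np.pi)

def _as_2d(z):
    z = np.asarray(z, dtype=float)
    return z[:, None] if z.ndim == 1 else z

def product_kernel_weights(z_eval, z_pre, h):
    """K_{h,ts} = prod_j h_j^{-1} k((z_{2t,j} - z_{2s,j}) / h_j), shape (n_eval, T1)."""
    z_eval, z_pre = _as_2d(z_eval), _as_2d(z_pre)
    h = np.broadcast_to(np.asarray(h, dtype=float), (z_pre.shape[1],))
    u = (z_eval[:, None, :] - z_pre[None, :, :]) / h  # shape (n_eval, T1, q)
    return np.prod(gaussian_kernel(u) / h, axis=2)

def kernel_smooth(values_pre, z_eval, z_pre, h):
    """T1^{-1} sum_s values_s K_{h,ts} / f_hat_t, the form shared by (25) and (26)."""
    K = product_kernel_weights(z_eval, z_pre, h)
    f_hat = K.mean(axis=1)  # f_hat_t = T1^{-1} sum_s K_{h,ts}
    return (K @ np.asarray(values_pre, dtype=float)) / K.shape[1] / f_hat
```

Passing $\theta(z_{2s})$ as `values_pre` gives $\tilde\theta_t$ and passing $\varepsilon_{1s}$ gives $\tilde\varepsilon_{1t}$. Both are infeasible objects used only in the proof, so in practice this is useful mainly for simulation checks.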

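Consequently we have

$$
\begin{aligned}
\hat\Delta_1 - \bar\Delta_1 &= \frac{1}{T_2} \sum_{t=T_1+1}^{T} [\hat\Delta_{1t} - \Delta_{1t}] \\
&= \frac{1}{T_2} \sum_{t=T_1+1}^{T} [\theta_t - \tilde\theta_t] + \frac{1}{T_2} \sum_{t=T_1+1}^{T} [\varepsilon_{1t} - \tilde\varepsilon_{1t}] - \bar E(z_1|z_2)'(\hat\beta - \beta) + o_p(T_1^{-1/2}) \\
&= A_{1n} + A_{2n} - A_{3n} + o_p(T_1^{-1/2}),
\end{aligned}
\tag{27}
$$

where $\bar E(z_1|z_2) = T_2^{-1} \sum_{t=T_1+1}^{T} E(z_{1t}|z_{2t}) = O_p(1)$, $A_{1n} = T_2^{-1} \sum_{t=T_1+1}^{T} [\theta_t - \tilde\theta_t]$, $A_{2n} = T_2^{-1} \sum_{t=T_1+1}^{T} [\varepsilon_{1t} - \tilde\varepsilon_{1t}]$ and $A_{3n} = \bar E(z_1|z_2)'(\hat\beta - \beta)$.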

Proof of Proposition 2.1 (i)

Using $1/\hat f(z_{2t}) = 1/f(z_{2t}) + (s.o.)$ uniformly in $t \in \{1, \dots, n\}$, it is straightforward to show that $A_{1n} = A_{1n,0} + (s.o.)$, where $A_{1n,0}$ is the leading term of $A_{1n}$ and is defined by

$$
A_{1n,0} = \frac{1}{T_1 T_2} \sum_{t=T_1+1}^{T} \sum_{s=1}^{T_1} (\theta(z_{2t}) - \theta(z_{2s})) K_{h,ts} / f(z_{2t}).
$$

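Let $(\tilde z_{1t}, \tilde z_{2t})$ be an independent process with the same marginal distribution as $(z_{1t}, z_{2t})$. By following the proof arguments in Fan and Li (1999), i.e., using Lemma 1 of Yoshihara (1976), one can show that $E(A_{1n,0}) = E(\tilde A_{1n,0}) + (s.o.)$ and $Var(A_{1n,0}) = Var(\tilde A_{1n,0}) + (s.o.)$, where $\tilde A_{1n,0}$ is defined as $A_{1n,0}$ except that $(z_{1t}, z_{2t})$ are replaced by $(\tilde z_{1t}, \tilde z_{2t})$ wherever they occur. Then it is easy to evaluate $E(\tilde A_{1n,0})$ and $Var(\tilde A_{1n,0})$ as follows:

$$
\begin{aligned}
E(\tilde A_{1n,0}) &= E\{E[(\theta(\tilde z_{2t}) - \theta(\tilde z_{2s})) \tilde K_{h,ts} \,|\, \tilde z_{2t}] / f(\tilde z_{2t})\} \\
&= -(\mu_2/2) \sum_{j=1}^{q} h_j^2 E\{[\theta_{jj}(z_{2t}) f(z_{2t}) + \theta_j(z_{2t}) f_j(z_{2t})] / f(z_{2t})\} + O(\|h\|^3) \\
&\equiv \sum_{j=1}^{q} h_j^2 B_j + O(\|h\|^3) \\
&= O(\|h\|^2),
\end{aligned}
\tag{28}
$$

where $\tilde K_{h,ts}$ is defined the same way as $K_{h,ts}$ but with $z_{2t}$ and $z_{2s}$ replaced by $\tilde z_{2t}$ and $\tilde z_{2s}$, $\mu_2 = \int k(v) v^2 \, dv$ and $B_j = -(\mu_2/2) E\{[\theta_{jj}(z_{2t}) f(z_{2t}) + \theta_j(z_{2t}) f_j(z_{2t})] / f(z_{2t})\}$.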

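Also, it is easy to show that

$$
\begin{aligned}
Var(\tilde A_{1n,0}) &= \frac{1}{T_1^2 T_2^2} \left[ (T_1 T_2^2 + T_1^2 T_2) O(\|h\|^2) + T_1 T_2 O((h_1 \cdots h_q)^{-1}) \right] \\
&= o(T_1^{-1} + T_2^{-1}).
\end{aligned}
\tag{29}
$$

Equations (28) and (29) imply that

$$
A_{1n,0} = \sum_{j=1}^{q} h_j^2 B_j + o_p(T_1^{-1/2} + T_2^{-1/2})
\tag{30}
$$

since $T_2 \|h\|^6 = o(1)$.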
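The role of the bandwidth condition can be unpacked in one worked step. The following is a sketch added for clarity rather than part of the original argument, and the rate $h_j \propto T_2^{-\alpha}$ is an illustrative assumption.

```latex
% The remainder O(\|h\|^3) from (28), measured against the T_2^{-1/2} rate:
\sqrt{T_2}\, O(\|h\|^3) = O\!\left( (T_2 \|h\|^6)^{1/2} \right) = o(1)
\quad \text{whenever} \quad T_2 \|h\|^6 = o(1).

% Illustrative assumption: h_j \propto T_2^{-\alpha} for every j, so that
T_2 \|h\|^6 \propto T_2^{\,1 - 6\alpha} \to 0
\quad \text{if and only if} \quad \alpha > 1/6 .
```

For instance, $\alpha = 1/5$ satisfies the condition. The leading bias terms $h_j^2 B_j$ are then of order $T_2^{-2/5}$, which is larger than $T_2^{-1/2}$, and this is why they are kept explicitly in the centering term rather than absorbed into the remainder.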

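Next, we consider $A_{2n}$. Similarly to the above derivations, one can easily obtain that $A_{2n} = A_{2n,0} + (s.o.)$, where $A_{2n,0}$ is the leading term of $A_{2n}$ and is defined by

$$
\begin{aligned}
A_{2n,0} &= \frac{1}{T_2} \sum_{t=T_1+1}^{T} [\varepsilon_{1t} - \tilde\varepsilon_{1t}] \hat f_t / f_t \\
&= \frac{1}{T_1 T_2} \sum_{t=T_1+1}^{T} \sum_{s=1}^{T_1} (\varepsilon_{1t} - \varepsilon_{1s}) K_{h,ts} / f_t \\
&= \frac{1}{T_1 T_2} \sum_{t=T_1+1}^{T} \sum_{s=1}^{T_1} H_{T_1}(w_t, w_s) \\
&= \frac{1}{T_1} \sum_{s=1}^{T_1} H_s + \frac{1}{T_2} \sum_{t=T_1+1}^{T} H_t + \frac{1}{T_1 T_2} \sum_{s=1}^{T_1} \sum_{t=T_1+1}^{T} [H_{ts} - H_t - H_s] \\
&= \frac{1}{T_1} \sum_{s=1}^{T_1} \varepsilon_{1s} - \frac{1}{T_2} \sum_{t=T_1+1}^{T} \varepsilon_{1t} + o_p(\max\{T_1^{-1/2}, T_2^{-1/2}\}) \\
&= O_p(T_1^{-1/2} + T_2^{-1/2}),
\end{aligned}
\tag{31}
$$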


where $H_{ts} = H_{T_1}(w_t, w_s) = (\varepsilon_{1t} - \varepsilon_{1s}) K_{h,ts} / f_t$, $w_t = (\varepsilon_{1t}, z_{2t})$, $H_t = E[H_{ts}|z_{2t}] = \varepsilon_{1t} + (s.o.)$, $H_s = E[H_{ts}|z_{2s}] = -\varepsilon_{1s} + (s.o.)$, and we also use the fact that the second moment of $B_n \overset{def}{=} \frac{1}{T_1 T_2} \sum_{s=1}^{T_1} \sum_{t=T_1+1}^{T} [H_{ts} - H_t - H_s]$ is $O((T_1 T_2 h_1 \cdots h_q)^{-1}) = o(\max\{T_1^{-1}, T_2^{-1}\})$, which implies that $B_n = o_p(\max\{T_1^{-1/2}, T_2^{-1/2}\})$.

Finally, $A_{3n} = O_p(T_1^{-1/2})$ follows from $\hat\beta - \beta = O_p(T_1^{-1/2})$ (e.g., Robinson (1988) or Fan and Li (1999)). Summarizing the above, we have shown that

$$
\hat\Delta_1 - \bar\Delta_1 = O_p(\|h\|^2 + T_1^{-1/2} + T_2^{-1/2}).
\tag{32}
$$

Proof of Proposition 2.1 (ii)

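By adding and subtracting $\bar\Delta_1$ in (32) and using the fact that $\bar\Delta_1 - \Delta_1 = O_p(T_2^{-1/2})$, we obtain from (32) that

$$
\begin{aligned}
\hat\Delta_1 - \Delta_1 &= \hat\Delta_1 - \bar\Delta_1 + \bar\Delta_1 - \Delta_1 \\
&= O_p(\|h\|^2 + T_1^{-1/2} + T_2^{-1/2}).
\end{aligned}
\tag{33}
$$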

Proof of Theorem 2.2

Proof of Theorem 2.2 (i)

Because $Var(A_{1n}) = Var(A_{1n,0}) + (s.o.) = o(T_1^{-1} + T_2^{-1})$, the variance of $A_{1n}$ is negligible compared with the other terms. Hence, $A_{1n}$ affects the asymptotic distribution of $\hat\Delta_1 - \bar\Delta_1$ only through its mean: $E(A_{1n,0}) = \sum_{j=1}^{q} B_j h_j^2 + O(\|h\|^3)$, as derived in (28). Hence, under the condition that $T_1 \|h\|^6 = o(1)$ and $T_2 \|h\|^6 = o(1)$, we have

$$
A_{1n,0} = \sum_{j=1}^{q} B_j h_j^2 + o_p(T_1^{-1/2} + T_2^{-1/2}),
\tag{34}
$$

where $B_j = -(\mu_2/2) E\{[\theta_{jj}(z_{2t}) f(z_{2t}) + \theta_j(z_{2t}) f_j(z_{2t})] / f(z_{2t})\}$.

From (31) we have that

$$
A_{2n,0} = \frac{1}{T_1} \sum_{s=1}^{T_1} \varepsilon_{1s} - \frac{1}{T_2} \sum_{t=T_1+1}^{T} \varepsilon_{1t} + o_p(\max\{T_1^{-1/2}, T_2^{-1/2}\}).
\tag{35}
$$

Case (i)

Under the assumption that $T_2/T_1 \to 0$ as $T_1, T_2 \to \infty$, we have

$$
A_{2n,0} = -\frac{1}{T_2} \sum_{t=T_1+1}^{T} \varepsilon_{1t} + o_p(T_2^{-1/2}).
\tag{36}
$$


From $\hat\beta - \beta = O_p(T_1^{-1/2})$ we have

$$
A_{3n} = O_p(T_1^{-1/2}) = o_p(T_2^{-1/2}).
\tag{37}
$$

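Summarizing (34), (36) and (37), we have shown that

$$
\sqrt{T_2} \Big[ \hat\Delta_1 - \bar\Delta_1 - \sum_{j=1}^{q} B_j h_j^2 \Big] = -\frac{1}{\sqrt{T_2}} \sum_{t=T_1+1}^{T} \varepsilon_{1t} + o_p(1) \xrightarrow{d} N(0, \sigma_\varepsilon^2),
\tag{38}
$$

and

$$
\begin{aligned}
\sqrt{T_2} \Big[ \hat\Delta_1 - \Delta_1 - \sum_{j=1}^{q} B_j h_j^2 \Big] &= \sqrt{T_2} \Big[ \hat\Delta_1 - \bar\Delta_1 + \bar\Delta_1 - \Delta_1 - \sum_{j=1}^{q} B_j h_j^2 \Big] \\
&= \frac{1}{\sqrt{T_2}} \sum_{t=T_1+1}^{T} [-\varepsilon_{1t} + \Delta_{1t} - \Delta_1] + o_p(1) \\
&\xrightarrow{d} N(0, \sigma_\Delta^2 + \sigma_\varepsilon^2).
\end{aligned}
\tag{39}
$$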

Case (ii).

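If $T_2/T_1 \to \lambda \in (0, \infty)$, from (35) we obtain

$$
\sqrt{T_2} A_{2n,0} = \sqrt{T_2} \Big[ \frac{1}{T_1} \sum_{s=1}^{T_1} \varepsilon_{1s} - \frac{1}{T_2} \sum_{t=T_1+1}^{T} \varepsilon_{1t} \Big] + o_p(1) \xrightarrow{d} N(0, \sigma_\varepsilon^2 (1 + \lambda)).
\tag{40}
$$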
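The variance $\sigma_\varepsilon^2(1 + \lambda)$ in (40) can be checked with a small Monte Carlo experiment. The sketch below assumes i.i.d. normal errors, which is stronger than what the proof requires, and the function name is hypothetical. Under that assumption the variance of the scaled difference is exactly $T_2(\sigma_\varepsilon^2/T_1 + \sigma_\varepsilon^2/T_2) = \sigma_\varepsilon^2(1 + T_2/T_1)$.

```python
import numpy as np

def simulate_scaled_difference(T1, T2, sigma=1.0, n_rep=20000, seed=0):
    """Draws of sqrt(T2) * (pre-period mean - post-period mean) of i.i.d. N(0, sigma^2) errors.

    i.i.d. normality is an assumption made for this illustration only.
    """
    rng = np.random.default_rng(seed)
    pre_mean = rng.normal(0.0, sigma, size=(n_rep, T1)).mean(axis=1)
    post_mean = rng.normal(0.0, sigma, size=(n_rep, T2)).mean(axis=1)
    return np.sqrt(T2) * (pre_mean - post_mean)

draws = simulate_scaled_difference(T1=100, T2=50)
lam = 50 / 100
# The simulated variance should be close to sigma^2 * (1 + lambda) = 1.5.
print(draws.var(), 1.0 * (1 + lam))
```

The extra factor $\lambda$ reflects the sampling noise of the pre-treatment average, which does not vanish relative to the post-treatment noise when $T_1$ and $T_2$ grow at the same rate.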

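By Theorem 2.1 of Fan and Li (1999) we know that

$$
\sqrt{T_1} (\hat\beta - \beta) \xrightarrow{d} N(0, V_0),
\tag{41}
$$

where $V_0 = A^{-1} B A^{-1}$ is the asymptotic variance of $\sqrt{T_1}(\hat\beta - \beta)$ with $A = E(\xi_t \xi_t')$, $B = E[\varepsilon_{1t}^2 \xi_t \xi_t']$ and $\xi_t = z_{1t} - E(z_{1t}|z_{2t})$.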

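Using (41) we have that

$$
\sqrt{T_2} A_{3n} = \sqrt{T_2/T_1}\, E(\xi_t') \sqrt{T_1} (\hat\beta - \beta) + o_p(1) \xrightarrow{d} N(0, \lambda E(\xi_t') V_0 E(\xi_t)) \equiv N(0, V_3),
\tag{42}
$$

where $V_3 = \lambda E(\xi_t') V_0 E(\xi_t)$.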

Given that $\Delta_t$ and $\varepsilon_{1t}$, $t = T_1 + 1, \dots, T$, are uncorrelated with the pre-treatment data, the leading terms of $A_{1n}$ and $A_{2n}$ are asymptotically independent of the leading term of $A_{3n}$.

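Summarizing (34), (40) and (42), we have

$$
\sqrt{T_2} \Big[ \hat\Delta_1 - \bar\Delta_1 - \sum_{j=1}^{q} B_j h_j^2 \Big] \xrightarrow{d} N(0, \sigma_\varepsilon^2 (1 + \lambda) + V_3).
\tag{43}
$$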

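By assumption A6, we have

$$
\sqrt{T_2} (\bar\Delta_1 - \Delta_1) \xrightarrow{d} N(0, \sigma_\Delta^2),
\tag{44}
$$

where $\sigma_\Delta^2 = \lim_{T_2 \to \infty} Var\big(T_2^{-1/2} \sum_{t=T_1+1}^{T} \Delta_{1t}\big)$. Also, given that $E(\varepsilon_{1t}|\Delta_t) = 0$, we have from (43) and (44) that

$$
\begin{aligned}
\sqrt{T_2} \Big[ \hat\Delta_1 - \Delta_1 - \sum_{j=1}^{q} B_j h_j^2 \Big] &= \sqrt{T_2} \Big[ \hat\Delta_1 - \bar\Delta_1 + \bar\Delta_1 - \Delta_1 - \sum_{j=1}^{q} B_j h_j^2 \Big] \\
&\xrightarrow{d} N(0, \sigma_\varepsilon^2 + \sigma_\Delta^2 + V_3).
\end{aligned}
\tag{45}
$$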

This completes the proof of Theorem 2.2.
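As an illustration of how the limiting distribution in (45) might be used, the sketch below builds a bias-corrected normal confidence interval for $\Delta_1$. It takes the variance expression exactly as stated in (45) and assumes the user already has estimates of the bias term $\sum_j B_j h_j^2$ and of $\sigma_\varepsilon^2$, $\sigma_\Delta^2$ and $V_3$. The function name and its arguments are hypothetical, and how to estimate those inputs is not addressed here.

```python
from math import sqrt
from statistics import NormalDist

def ate_confidence_interval(delta_hat, bias, sigma2_eps, sigma2_delta, v3, T2, level=0.95):
    """Normal-approximation interval based on (45); all inputs are user-supplied estimates.

    delta_hat: estimated average treatment effect
    bias:      estimate of sum_j B_j h_j^2 (hypothetical input)
    T2:        number of post-treatment periods
    """
    centre = delta_hat - bias  # remove the leading smoothing bias
    std_err = sqrt((sigma2_eps + sigma2_delta + v3) / T2)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    return centre - z * std_err, centre + z * std_err
```

With made-up inputs `delta_hat=1.0`, `bias=0.2`, variance components summing to 4 and `T2=16`, the standard error is 0.5 and the 95% interval is roughly 0.8 plus or minus 0.98.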
