
Article

Transfer Entropy for Nonparametric Granger Causality Detection: An Evaluation of Different Resampling Methods

Cees Diks 1,2 and Hao Fang 1,2,†,*
1 CeNDEF, Amsterdam School of Economics, University of Amsterdam, The Netherlands
2 Tinbergen Institute, Amsterdam, The Netherlands
* Correspondence: [email protected]; Tel.: +31-06-84124179
† Current address: E. 3.28, Faculty of Economics and Business, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands

Academic Editor: name
Version May 16, 2017 submitted to Entropy

Abstract: The information-theoretical concept transfer entropy is an ideal measure for detecting conditional independence, or Granger causality in a time series setting. The recent literature indeed witnesses an increased interest in applications of entropy-based tests in this direction. However, those tests are typically based on nonparametric entropy estimates for which the development of formal asymptotic theory turns out to be challenging. In this paper, we provide numerical comparisons for simulation-based tests to gain some insights into the statistical behavior of nonparametric transfer entropy-based tests. In particular, surrogate algorithms and smoothed bootstrap procedures are described and compared. We conclude this paper with a financial application to the detection of spillover effects in the global equity market.

Keywords: Transfer entropy; Granger causality; Nonparametric test; Bootstrap; Surrogate data

MSC: 62G09; 62G10; 91B84

JEL Classification: C12; C14; C15

1. Introduction

Entropy, introduced by [1,2], is an information-theoretical concept with several appealing properties, and therefore finds wide application in information theory, thermodynamics and time series analysis. Building on this classical measure, transfer entropy has become the most popular information measure. This concept, coined by [3], was applied to distinguish possible asymmetric information exchange between the variables of a bivariate system. When based on appropriate nonparametric density estimates, transfer entropy is a flexible nonparametric measure for conditional dependence, coupling structure, or Granger causality in a general sense.

The notion of Granger causality was developed in the pioneering work of [4] to capture causal interactions in a linear system. In a more general, model-free setting, a Granger causal effect can be interpreted as the impact of incorporating the history of another variable on the conditional distribution of a future variable, in addition to its own history. Recently, various nonparametric measures have been developed to capture such differences between conditional distributions in more complex, and typically nonlinear, systems. There is a growing list of such methods, based on, among others, correlation-integral methods [5], kernel density estimation [6], Hellinger distances [7], copula functions [8] and empirical likelihood [9].

Submitted to Entropy, pages 1 – 24 www.mdpi.com/journal/entropy


In contrast with the above-mentioned methods, transfer entropy-based causality tests do not attempt to capture the difference between two conditional distributions explicitly. Instead, given its information-theoretical interpretation, transfer entropy is a natural tool for measuring directional information transfer and Granger causality. We refer to [10] and [11] for detailed reviews of the relation between Granger causality and directed information theory. However, the direct application of entropy and its variants, though attractive, turns out to be difficult, if not impossible, due to the lack of asymptotic distribution theory for the test statistics. For example, [12] normalize the entropy to detect serial dependence, with critical values obtained from simulations. [13] provide the asymptotic distribution for the [12] statistic with a specific kernel function. [14] derive a χ² distribution for transfer entropy at the cost of the model-free property.

On the other hand, to obviate the asymptotic problem, several resampling methods for transfer entropy have been developed to provide empirical distributions of the test statistics. Two popular techniques are bootstrapping and surrogate data. Bootstrapping is a random resampling technique proposed by [15] to estimate the properties of an estimator by measuring those properties from approximating distributions. The 'surrogate' approach developed by [16] is another randomization method, initially employing Fourier transforms to provide a benchmark for detecting nonlinearity in a time series. It is worth mentioning that the two methods differ with respect to the statistical properties of the resampled data. For the surrogate method, the null hypothesis is maintained, while the bootstrap method does not seek to impose the null hypothesis on the bootstrapped samples. We refer to [17] and [18] for detailed applications of the two methods.

However, not all resampling methods are suitable for entropy-based dependence measures. As [13] put it, a standard bootstrap fails to deliver a consistent entropy-based statistic because it does not preserve the statistical properties of a degenerate U-statistic. Similarly, with respect to traditional surrogates based on phase randomization of the Fourier transform, [19] criticize the particularly restrictive assumption of a linear Gaussian process, and [20] point out that it cannot preserve the whole statistical structure of the original time series.

As far as we are aware, there are several applications of both methods in entropy-based tests. For example, [7] propose a smoothed local bootstrap for an entropy-based test for serial dependence, [21] apply the stationary bootstrap in partial transfer entropy estimation, [22] use time-shifted surrogates to test the significance of the asymmetry of directional measures of coupling, and [23] introduce the effective transfer entropy, which relies on random-shuffling surrogates in estimation. [24] and [21] provide some comparisons between bootstrap and surrogate methods for entropy-based tests.

In this paper, we adopt transfer entropy as a test statistic for measuring conditional independence (Granger non-causality). Aware that the null distribution may not always be accurate or available in analytical closed form, we resort to resampling techniques for constructing the empirical null distribution. The techniques under consideration include the smoothed local bootstrap, the stationary bootstrap and time-shifted surrogates, all of which have been shown in the literature to apply to entropy-based test statistics. Using different dependence structures, the size and power performance of all methods are examined in simulations.

The remainder of the paper is organized as follows. Section 2 first provides the transfer entropy-based testing framework and a short introduction to kernel density estimation; bandwidth selection rules are then discussed, followed by the resampling methods, including the smoothed local bootstrap and time-shifted surrogates for different dependence structure settings. Section 3 examines the empirical performance of the different resampling methods; the size and power of the tests are presented. Section 4 considers a financial application of the transfer entropy-based nonparametric test, and Section 5 summarizes.


2. Methodology

2.1. Transfer Entropy and the Estimator

Information theory is a branch of the applied mathematical theory of probability and statistics. The central problem of classical information theory is to measure the transmission of information over a noisy channel. Entropy, also referred to as Shannon entropy, is a key measure in the field of information theory, introduced by [1,2]. Entropy measures the uncertainty and randomness associated with a random variable. Supposing that S is a random vector with density f_S(s), its Shannon entropy is defined as

H(S) = − ∫ f_S(s) log( f_S(s) ) ds.    (1)
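As a quick illustration of Equation (1), not part of the original text: for a standard normal variable the differential entropy has the closed form ½ log(2πe), and a Monte Carlo estimate that replaces the integral by a sample average of −log f_S(S) recovers it. A minimal sketch, assuming a univariate Gaussian density:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_entropy_closed_form(sigma=1.0):
    """Closed-form differential entropy of N(0, sigma^2): 0.5*log(2*pi*e*sigma^2)."""
    return 0.5 * np.log(2 * np.pi * np.e * sigma ** 2)

def gaussian_entropy_monte_carlo(n=100_000, sigma=1.0):
    """Monte Carlo version of Equation (1): sample average of -log f_S(S)."""
    s = rng.normal(0.0, sigma, size=n)
    log_f = -0.5 * np.log(2 * np.pi * sigma ** 2) - s ** 2 / (2 * sigma ** 2)
    return -log_f.mean()

print(gaussian_entropy_closed_form())   # ≈ 1.4189
print(gaussian_entropy_monte_carlo())   # close to the closed form
```

The same sample-average device reappears below when the unknown density is replaced by a kernel estimate.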

There is a long history of applying information measures in time series modeling. For example, [25] applies the [26] information criterion to construct a one-sided test for serial independence. Since then, nonparametric tests using entropy measures for dependence between two time series have become prevalent. [12] normalize the entropy measure to identify the lags in a nonlinear bivariate model. [27] study dependence with a transformed metric entropy, which turns out to be a proper measure of distance. [13] provide a new entropy-based test for serial dependence, whose test statistic follows a standard normal distribution asymptotically.

Although those heuristic approaches work for entropy-based measures of dependence, these methodologies do not carry over directly to measures of conditional dependence, i.e., Granger causality. Transfer entropy (TE), a term coined by [3], although it appeared in the literature earlier under different names, is a suitable measure for this purpose. The TE quantifies the amount of information explained in one series k steps ahead by the state of another series, given the current and past state of the series itself. Suppose we have two series {Xt} and {Yt}; for brevity put X = {Xt}, Y = {Yt} and Z = {Yt+k}. Further, we define the trivariate vector Wt = (Xt, Yt, Zt), where Zt = Yt+k, and write W = (X, Y, Z) when there is no danger of confusion. Within this bivariate setting, W is a three-dimensional continuous vector. In this paper, we limit ourselves to k = 1 for simplicity, but the method generalizes easily to multiple steps. The transfer entropy TE_{X→Y} is a nonlinear measure of the amount of information explained in Z by X, accounting for the contribution of Y. Although the TE defined by [3] applies to discrete variables, it is easily generalized to continuous variables. Conditional on Y, TE_{X→Y} is defined as

TE_{X→Y} = E_W [ log ( f_{Z,X|Y}(Z, X|Y) / ( f_{X|Y}(X|Y) f_{Z|Y}(Z|Y) ) ) ]
         = ∫∫∫ f_{X,Y,Z}(x, y, z) log ( f_{Z,X|Y}(z, x|y) / ( f_{X|Y}(x|y) f_{Z|Y}(z|y) ) ) dx dy dz
         = E_W [ log ( f_{X,Y,Z}(X, Y, Z) / f_Y(Y) ) − log ( f_{X,Y}(X, Y) / f_Y(Y) ) − log ( f_{Y,Z}(Y, Z) / f_Y(Y) ) ]
         = E_W [ log f_{X,Y,Z}(X, Y, Z) + log f_Y(Y) − log f_{X,Y}(X, Y) − log f_{Y,Z}(Y, Z) ].    (2)

Using the conditional mutual information I(Z, X|Y), the TE can equivalently be formulated in terms of four Shannon entropy terms:

TE_{X→Y} = I(Z, X|Y)
         = H(Z|Y) − H(Z|X, Y)
         = H(Z, Y) − H(Y) − H(Z, X, Y) + H(X, Y).    (3)
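For jointly Gaussian variables Equation (3) can be evaluated in closed form, since the Shannon entropy of a d-variate normal with covariance Σ is ½ log((2πe)^d |Σ|). The sketch below is an illustration we add here (not part of the original derivation); it checks that the four-entropy combination vanishes when X and Z are conditionally independent given Y:

```python
import numpy as np

def gaussian_entropy(cov):
    """Shannon entropy of a multivariate normal: 0.5 * log((2*pi*e)^d * |cov|)."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** d * np.linalg.det(cov))

def transfer_entropy_gaussian(cov_xyz):
    """Equation (3) for a Gaussian vector with covariance ordered as (X, Y, Z)."""
    c = np.asarray(cov_xyz)
    H_y   = gaussian_entropy(c[1:2, 1:2])   # H(Y)
    H_xy  = gaussian_entropy(c[:2, :2])     # H(X, Y)
    H_yz  = gaussian_entropy(c[1:, 1:])     # H(Y, Z)
    H_xyz = gaussian_entropy(c)             # H(X, Y, Z)
    return H_yz - H_y - H_xyz + H_xy

# X = 0.5 Y + e1 and Z = 0.5 Y + e2 with independent noise terms:
# X and Z are conditionally independent given Y, so the TE is zero.
cov_null = np.array([[1.25, 0.50, 0.25],
                     [0.50, 1.00, 0.50],
                     [0.25, 0.50, 1.25]])
print(transfer_entropy_gaussian(cov_null))   # ≈ 0
```

Raising cov(X, Z) above the value 0.25 implied by the common factor Y makes the statistic strictly positive, in line with Theorem 1 below.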

In order to construct a test for Granger causality based on the TE measure, one first needs to show quantitatively that TE is a proper basis for detecting whether the null hypothesis is satisfied. The following


theorem, a direct application of the Kullback-Leibler criterion, lays the quantitative foundation for testing based on the TE.

Theorem 1. TE_{X→Y} ≥ 0, with equality if and only if f_{Z,X|Y}(Z, X|Y) = f_{X|Y}(X|Y) f_{Z|Y}(Z|Y).

Proof. The proof of Thm. 1 is given by [28].

It is not difficult to verify that the condition for TE_{X→Y} = 0 coincides with the null hypothesis of Granger non-causality defined in Equation (4), also referred to as conditional independence or no coupling. Mathematically speaking, the null hypothesis of Granger non-causality, H0: {Xt} is not a Granger cause of {Yt}, can be phrased as

H0 : f_{X,Y,Z}(x, y, z) / f_Y(y) = ( f_{X,Y}(x, y) / f_Y(y) ) · ( f_{Y,Z}(y, z) / f_Y(y) )    (4)

for (x, y, z) in the support of W. A nonparametric test for Granger non-causality seeks statistical evidence of a violation of Equation (4). There are many nonparametric measures available for this purpose, some of which were mentioned previously. Equation (4) lays the foundation for a model-free test without imposing any parametric assumptions about the data-generating process or the underlying distributions of {Xt} and {Yt}. We only assume two things here. First, {Xt, Yt} is a strictly stationary bivariate process. Second, the process has finite memory, i.e., the variable lags satisfy lX, lY < ∞. The second (finite Markov order) assumption is needed in this nonparametric setting to allow conditioning on the past.

As far as we are aware, the direct use of TE to test Granger non-causality in a nonparametric setting is difficult, if not impossible, due to the lack of asymptotic theory for the test statistic. As [12] put it, very few asymptotic distribution results for entropy-based estimators are available. Although over the years several breakthroughs have been made in applying entropy to testing serial independence, the limiting distribution of the TE statistic is still unknown. One may wish to use simulation techniques to overcome the lack of asymptotic distributions. However, as noted by [7], there are estimation biases in TE statistics for nonparametric dependence measures under the smoothed bootstrap procedure. Even for the parametric test statistic used by [14], the authors noticed that the TE-based estimator is generally biased.

2.2. Density Estimation and Bandwidth Selection

The non-negativity property in Thm. 1 makes TE_{X→Y} a desirable statistic for constructing a one-sided test of conditional independence; any positive divergence from zero is a sign of conditional dependence of Y on X. To estimate the TE, several different approaches are available, such as histogram-based estimators [29], correlation sums [30] and nearest-neighbor estimators [31]. However, the optimal rule for the number of neighbor points is unclear, and as [31] comment, a small number of neighbor points may lead to large statistical errors. A more natural method, kernel density estimation, the properties of which have been well studied, is applied in this paper. With the plug-in kernel estimates of the densities, we may replace the expectation in Equation (2) by a sample average to obtain an estimate of TE_{X→Y}.

A local density estimator of a dW-variate random vector W at Wi is given by

f̂_W(Wi) = ( (n − 1) h^{dW} )^{−1} Σ_{j≠i} K( (Wi − Wj) / h ),    (5)

where K is a kernel function and h is the bandwidth. We take K(·) to be a product kernel function defined as K(W) = ∏_{s=1}^{dW} κ(w^s), where w^s is the sth element of W. Using a standard univariate Gaussian kernel, κ(w^s) = (2π)^{−1/2} e^{−(w^s)²/2}, K(·) is the standard multivariate Gaussian kernel as described by


[32] and [33]. Using Equation (5) as the plug-in density estimator, and replacing the expectation by the sample mean, we obtain the estimator for the transfer entropy given by

Î(Z, X|Y) = (1/n) Σ_{i=1}^{n} ( log f̂(xi, yi, zi) + log f̂(yi) − log f̂(xi, yi) − log f̂(yi, zi) ).    (6a)

If we estimate the Shannon entropy in Equation (1) based on a sample of size n from the dW-dimensional random vector W by the sample average of the plug-in density estimates, we obtain

Ĥ(W) = −(1/n) Σ_{i=1}^{n} log( f̂_W(Wi) ),    (6b)

and Equation (6a) can be equivalently expressed in terms of four entropy estimators, that is,

Î(Z, X|Y) = −Ĥ(xi, yi, zi) − Ĥ(yi) + Ĥ(xi, yi) + Ĥ(yi, zi).    (6c)
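Equations (5) and (6a) translate directly into a plug-in estimator. The following sketch is our illustrative implementation, not the authors' code: it uses leave-one-out Gaussian product-kernel densities and the bandwidth rate of Equation (7) below, with the constant C exposed as a parameter:

```python
import numpy as np

def _log_kde(data, h):
    """Leave-one-out log kernel density estimates (Equation (5)) with a
    standard Gaussian product kernel; `data` has shape (n, d)."""
    n, d = data.shape
    diff = (data[:, None, :] - data[None, :, :]) / h            # (n, n, d)
    logk = -0.5 * np.sum(diff ** 2, axis=2) - 0.5 * d * np.log(2 * np.pi)
    np.fill_diagonal(logk, -np.inf)                             # drop j = i terms
    m = logk.max(axis=1)                                        # log-sum-exp
    lse = m + np.log(np.exp(logk - m[:, None]).sum(axis=1))
    return lse - np.log(n - 1) - d * np.log(h)

def transfer_entropy(x, y, z, C=8.0):
    """Plug-in estimator of I(Z, X|Y) as in Equation (6a), with the
    bandwidth h = C * n^(-2/7) of Equation (7)."""
    x, y, z = (np.asarray(v, dtype=float).reshape(-1, 1) for v in (x, y, z))
    h = C * len(x) ** (-2.0 / 7.0)
    return np.mean(_log_kde(np.hstack([x, y, z]), h) + _log_kde(y, h)
                   - _log_kde(np.hstack([x, y]), h) - _log_kde(np.hstack([y, z]), h))
```

The pairwise-distance array is O(n²) in memory, so for samples beyond a few thousand observations it should be computed in blocks.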

To construct a statistical test, we develop the asymptotic properties of Î(Z, X|Y) defined in Equations (6a) and (6c) in two steps. In the first step, given the density estimates, consistency of the entropy estimates is established, so that the linear combination of the four entropy estimates converges in probability to the true value. Mathematically, the following two theorems ensure the consistency of Î(Z, X|Y).

Theorem 2. Let f̂_W(Wi) be the kernel density estimate of f_W(Wi), where W is a dW-dimensional random vector with sample length n, and let Ĥ(W) be the plug-in estimate of the Shannon entropy as defined in Equation (6b). Then Ĥ(W) converges in probability to H(W).

Proof. The proof of Thm. 2 was initially investigated in [34] and is also discussed in [12].

The basic idea of the proof is to take a Taylor series expansion of log( f̂_W(Wi)) around its mean and use the fact that f̂_W(Wi), given an appropriate bandwidth sequence, converges to f_W(Wi) pointwise. In the next step, the consistency of Î(Z, X|Y) follows from the continuous mapping theorem.

Theorem 3. Given that Ĥ(W) converges in probability to H(W), with Î(Z, X|Y) defined as in Equation (6c), Î(Z, X|Y) converges in probability to I(Z, X|Y).

Proof. The proof is straightforward upon applying the Continuous Mapping Theorem; see Theorem 2.3 in [35].

Before we move to the next section, it is worth having a careful look at the issue of bandwidth selection. The bandwidth h for kernel estimation determines how smooth the density estimate is; a smaller bandwidth reveals more structure in the data, whereas a larger bandwidth delivers a smoother density estimate. Bandwidth selection is essentially a trade-off between bias and variance in density estimation. A very small value of h reduces estimation bias, but increases variance. On the other hand, a large bandwidth reduces estimation variance at the expense of incorporating more bias. See Chapter 3 in [32] and [36] for details.

However, bandwidth selection for the TE statistic is more involved. To the best of our knowledge, there is a gap in the literature concerning optimal bandwidth selection for kernel-based transfer entropy estimators. As [37] show, when evaluating an entropy estimator two types of error arise, one from entropy estimation and the other from density estimation, and the optimal bandwidth for density estimation need not coincide with the optimal one for entropy estimation. Thus, instead of providing the optimal multivariate density estimator, e.g. via the rule-of-thumb in [33], the bandwidth in our study should provide an accurate estimator of Î(Z, X|Y), say in the minimal-MSE sense. [28] developed such a bandwidth rule for their transfer entropy-based estimator. In that paper, rather than directly developing asymptotic properties for transfer entropy, we study the properties of the first-order Taylor expansion of transfer entropy under the null hypothesis. The suggested bandwidth is shown to be MSE-optimal and allows us to obtain asymptotic normality for the test statistic. In principle, the convergence rate of the transfer entropy estimator should be the same as that of the leading term of its Taylor approximation. We therefore propose to use the same rate here, giving

h = C n^{−2/7},    (7)

where C is an unknown constant. This bandwidth delivers a consistent test, since the variance of the local estimate of I(Zi, Xi|Yi) dominates the MSE. In [28] we suggest using C = 4.8 based on simulations, while [6] suggest setting C ≈ 8 for ARCH processes. Our simulations here may also favor a larger value of C, because the squared bias is of higher order and hence of less concern for the transfer entropy-based statistic. A larger bandwidth can better control the estimation variance and deliver a more powerful test. As a robustness check, we adopt C = 8 as well as the C = 4.8 suggested by [28] in our simulation study. To match the Gaussian kernel, we standardize the data before estimating Equations (6a) to (6c), such that the transformed series are centered around zero with unit variance.^1
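The two preprocessing steps just described, the bandwidth rule of Equation (7) and the standardization of the data, can be sketched as follows (a minimal illustration; the function names are ours):

```python
import numpy as np

def bandwidth(n, C=8.0):
    """Bandwidth rule of Equation (7): h = C * n^(-2/7)."""
    return C * n ** (-2.0 / 7.0)

def standardize(series):
    """Center the data and scale to unit variance, to match the
    standard Gaussian kernel used in Equation (5)."""
    series = np.asarray(series, dtype=float)
    return (series - series.mean()) / series.std()

print(round(bandwidth(500, C=8.0), 3))   # ≈ 1.355
```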

2.3. Resampling Methods

To develop simulation-based tests for the null hypothesis of no Granger causality from X to Y given in Equation (4), or equivalently, of conditional independence, we consider three resampling techniques: (1) the time-shifted surrogates developed by [22], (2) the smoothed bootstrap of [7] and (3) the stationary bootstrap introduced by [38]. The first technique is widely applied to coupling measures, for example by [39] and [40], while the latter two have been used for detecting conditional independence for decades. It is worth mentioning that the surrogate and bootstrap methods treat the null quite differently. Surrogate data are supposed to preserve the dependence structure imposed by H0, while bootstrap data are not restricted to H0. It is possible to bootstrap the dataset without restoring the dependence structure of {X, Y, Z} under the null. See [41] for more details. To avoid resampling errors and to make the different methods more comparable, we limit ourselves to methods that impose the null hypothesis on the resampled data. The following three resampling methods are implemented, each with different sampling details.

1. Time-shifted Surrogates

• (TS.a) The first resampling method only deals with the driving variable X. Suppose we have observations {x1, ..., xn}; the time-shifted surrogates are generated by cyclically time-shifting the components of the time series. Specifically, an integer d is randomly generated within the interval ([0.05n], [0.95n]), and the first d values of {x1, ..., xn} are moved to the end of the series, delivering the surrogate sample X* = {x_{d+1}, ..., x_n, x_1, ..., x_d}. Compared with traditional surrogates based on phase randomization of the Fourier transform, the time-shifted surrogates preserve the whole statistical structure of X. The couplings between X and Y are destroyed, so that the null hypothesis of X not causing Y is imposed.

• (TS.b) The second scheme resamples both the driving variable X and the response variable Y separately. Similar to (TS.a), Y* = {y_{c+1}, ..., y_n, y_1, ..., y_c} is created from another random integer c in the range ([0.05n], [0.95n]). In contrast with the standard time-shifted

1 Very similar results are obtained by matching the mean absolute deviation instead of the variance of the standard Gaussian kernel for transfer entropy estimation.


surrogates described in (TS.a), in this setting we add more noise to the coupling between X and Y.
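The cyclic shift in (TS.a) amounts to rotating the series by a random offset. A minimal sketch (our illustration; for (TS.b) the same function would be applied to Y with an independently drawn offset):

```python
import numpy as np

rng = np.random.default_rng(42)

def time_shifted_surrogate(x, rng=rng):
    """(TS.a): cyclically shift x by a random offset d in [0.05 n, 0.95 n],
    preserving the marginal dynamics of x while destroying its coupling
    with any other series."""
    x = np.asarray(x)
    n = len(x)
    d = rng.integers(int(0.05 * n), int(0.95 * n) + 1)
    return np.concatenate([x[d:], x[:d]])
```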

2. Smoothed Local Bootstrap

The smoothed bootstrap selects samples from a smoothed distribution instead of drawing observations from the empirical distribution directly. See [42] for a discussion of the smoothed bootstrap procedure. Under rather mild assumptions, [43] show that there is no need to reproduce the whole dependence structure of the stochastic process to get an asymptotically correct nonparametric dependence estimator. Hence a smoothed bootstrap from the estimated conditional density is able to deliver a consistent statistic. Specifically, we consider two versions of the smoothed bootstrap that differ somewhat in dependence structure.

• (SMB.a) In the first setting, Y* is resampled without replacement by the smoothed local bootstrap. Given the sample Y = {y1, ..., yn}, the bootstrap sample is generated by adding a smoothing noise term ε^Y_i, such that y*_i = y_i + h_b ε^Y_i, where h_b > 0 is the bandwidth used in the bootstrap procedure and ε^Y_i represents a sequence of i.i.d. N(0, 1) random variables. Without random replacement from the original time series, this procedure does not disturb the original dynamics of Y = {y1, ..., yn} at all. After Y* is resampled, both X* and Z* are drawn from the smoothed conditional densities f̂(x|Y*) and f̂(z|Y*), as described in [44].

• (SMB.b) Secondly, we implement the smoothed local bootstrap exactly as in [7]. The only difference between this setting and (SMB.a) is that the bootstrap sample Y* is drawn with replacement from the smoothed kernel density.
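One plausible implementation of (SMB.a) is sketched below. The paper refers to [44] for the conditional draws; here we approximate drawing from f̂(x|Y*) by picking, for each y*_i, a donor index with probability proportional to its Gaussian kernel weight and adding smoothing noise. This donor mechanism and the shared bandwidth are our illustrative choices, not a statement of the authors' exact procedure:

```python
import numpy as np

rng = np.random.default_rng(7)

def smoothed_local_bootstrap(x, y, z, h_b, rng=rng):
    """Sketch of (SMB.a): perturb y with kernel noise (no replacement), then
    redraw x and z from kernel-smoothed conditional densities given y*.
    Drawing z* independently of x* given y* imposes the null of
    conditional independence on the bootstrap sample."""
    x, y, z = map(np.asarray, (x, y, z))
    n = len(y)
    y_star = y + h_b * rng.standard_normal(n)
    # kernel weights of each y*_i against the original y_j
    w = np.exp(-0.5 * ((y_star[:, None] - y[None, :]) / h_b) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    donors_x = np.array([rng.choice(n, p=w[i]) for i in range(n)])
    donors_z = np.array([rng.choice(n, p=w[i]) for i in range(n)])
    x_star = x[donors_x] + h_b * rng.standard_normal(n)
    z_star = z[donors_z] + h_b * rng.standard_normal(n)
    return x_star, y_star, z_star
```

For (SMB.b) the only change would be to first resample y with replacement before adding the smoothing noise.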

3. Stationary Bootstrap

[38] propose the stationary bootstrap to overcome the drawback of non-stationary bootstrap samples in the moving block bootstrap. This method replicates the time dependence of the original data by resampling blocks of data with randomly varying block lengths. The lengths of the bootstrap blocks follow a geometric distribution: given a fixed probability p, the length Li of block i satisfies P(Li = k) = (1 − p)^{k−1} p for k = 1, 2, ..., and the starting point of block i is randomly drawn from the original n observations. To restore the dependence structure exactly under the null, we combine the stationary bootstrap with the smoothed local bootstrap in our simulations.

• (STB) In short, y*_1 is first picked randomly from the original n observations of Y = {y1, ..., yn}, denoted y*_1 = y_s with s ∈ [1, n]. With probability p, y*_2 is picked at random from the data set; with probability 1 − p, y*_2 = y_{s+1}, the observation following y_s in the original series. Proceeding in this way, {y*_1, ..., y*_n} is generated. If y*_i = y_s and s = n, the 'circular boundary condition' kicks in, such that y*_{i+1} = y_1. After Y* = {y*_1, ..., y*_n} is generated, both X* and Z* are randomly drawn from the smoothed conditional densities f̂(x|Y*) and f̂(z|Y*), as in (SMB.b).
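The block scheme in (STB) can be sketched as follows; the default p is illustrative (the implied mean block length is 1/p), and the conditional draws for X* and Z* would then proceed as in the smoothed local bootstrap:

```python
import numpy as np

rng = np.random.default_rng(3)

def stationary_bootstrap(y, p=0.1, rng=rng):
    """(STB) resample of y: blocks with geometric lengths (mean 1/p) and
    circular wrap-around, following the scheme described above."""
    y = np.asarray(y)
    n = len(y)
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    for t in range(1, n):
        if rng.random() < p:                 # start a new block
            idx[t] = rng.integers(n)
        else:                                # continue the current block
            idx[t] = (idx[t - 1] + 1) % n    # circular boundary condition
    return y[idx]
```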

A final remark concerns the difference between this paper and [21]; there, both time-shifted surrogates and the stationary bootstrap are implemented for an entropy-based causality test. However, our paper provides additional insights in several respects. Firstly, the smoothed bootstrap, which has been shown in the literature to work for nonparametric kernel estimators under general dependence structures, is applied in our paper. Secondly, they treat the bootstrap and surrogate samples in a similar way but, as we noted before, the bootstrap method is not designed to impose the null hypothesis; rather, it is designed to keep the dependence structure present in the original data. The stationary bootstrap procedure in [21] might be incompatible with the null hypothesis of conditional independence, since it destroys the dependence completely. Because they restore independence between X and Y rather than conditional independence between X|Y and Z|Y during resampling, the estimated statistics from


the resampled data may not necessarily correspond to the statistic defined under the null. Thirdly, we provide rigorous size and power results in our simulations, which are missing in their paper.

3. Simulation Study

In this section, we investigate the performance of the five resampling methods in detecting conditional dependence for several data-generating processes, including a linear vector autoregressive (VAR) process, a nonlinear VAR process and a bivariate ARCH process. In Equations (8) to (10), we use a single parameter a to control the strength of the conditional dependence. The size assessment is based on testing Granger non-causality from {Xt} to {Yt}, and for the power we use the same process but test Granger non-causality from {Yt} to {Xt}. Specifically, for each data-generating process (DGP), we run 500 simulations for sample sizes n = {200, 500, 1000, 2000}. The number of surrogate and bootstrap replications is 999. We set a = 0.4 to represent moderate dependence in the size investigation and a = 0.1 to evaluate the power of the tests. For fair comparisons between (TS.a) and (TS.b), as well as between (SMB.a) and (SMB.b), we fix the seeds of the random generator in the resampling functions to eliminate the potential effect of randomness. In addition, we use the empirical standard deviation of {Yt} as the bootstrap bandwidth and C = {4.8, 8} in Equation (7) for kernel estimation. Further, the power performance for a two-way causal linkage is investigated in Equation (11), where b = −0.2 and c = 0.1. All empirical rejection rates are summarized in Tables 1 to 4.
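Concretely, each resampling test compares the TE statistic on the original sample with its value on B resampled series, and the empirical rejection rate is the fraction of simulation runs whose p-value falls below the nominal level. A minimal sketch of this bookkeeping step (the helper names and the standard Monte Carlo p-value formula are ours, not quoted from the paper):

```python
import numpy as np

def resampling_pvalue(stat_obs, stats_resampled):
    """One-sided Monte Carlo p-value of an observed statistic T against
    B resampled statistics: p = (1 + #{T*_b >= T}) / (B + 1)."""
    stats_resampled = np.asarray(stats_resampled)
    b = len(stats_resampled)
    return (1 + np.sum(stats_resampled >= stat_obs)) / (b + 1)

def rejection_rate(pvalues, alpha=0.05):
    """Empirical rejection rate at nominal level alpha over simulation runs."""
    return float(np.mean(np.asarray(pvalues) <= alpha))
```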

1. Linear vector autoregressive process (VAR):

   Xt = aYt−1 + εx,t,  εx,t ∼ N(0, 1),
   Yt = aYt−1 + εy,t,  εy,t ∼ N(0, 1).   (8)

2. Nonlinear VAR. This process is considered in [45] to show the failure of the linear Granger causality test:

   Xt = aXt−1Yt−1 + εx,t,  εx,t ∼ N(0, 1),
   Yt = 0.6Yt−1 + εy,t,  εy,t ∼ N(0, 1).   (9)

3. Bivariate ARCH process:

   Xt ∼ N(0, 1 + aY²t−1),
   Yt ∼ N(0, 1 + aY²t−1).   (10)

4. Two-way VAR process:

   Xt = 0.7Xt−1 + bYt−1 + εx,t,  εx,t ∼ N(0, 1),
   Yt = cXt−1 + 0.5Yt−1 + εy,t,  εy,t ∼ N(0, 1).   (11)
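As a concrete illustration of the Monte Carlo design, the linear VAR of Equation (8) can be simulated as follows (a minimal numpy sketch; the function name and seeding scheme are ours, not taken from the paper):

```python
import numpy as np

def simulate_var(n, a=0.4, seed=0):
    """Simulate the linear VAR of Equation (8):
    X_t = a*Y_{t-1} + eps_x,t,  Y_t = a*Y_{t-1} + eps_y,t,
    so that {Y_t} Granger-causes {X_t} but not vice versa."""
    rng = np.random.default_rng(seed)
    eps_x = rng.standard_normal(n + 1)
    eps_y = rng.standard_normal(n + 1)
    x = np.zeros(n + 1)
    y = np.zeros(n + 1)
    for t in range(1, n + 1):
        x[t] = a * y[t - 1] + eps_x[t]
        y[t] = a * y[t - 1] + eps_y[t]
    return x[1:], y[1:]  # drop the initial condition
```

With a = 0.4 and the null hypothesis X → Y this yields the size experiment; with a = 0.1 and the null hypothesis Y → X it yields the power experiment.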

The top panels in Tables 1 to 3 summarize the empirical rejection rates obtained at the 5% and 10% nominal significance levels for the processes in Equations (8) to (10) under the null hypothesis, and the bottom panels report the corresponding empirical power under the alternatives. Generally speaking, the size and power are quite satisfactory for almost all combinations of the constant C, the sample size n and the nominal significance level. The performance differences between the different resampling methods are not substantial.
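For a given DGP and nominal level, an empirical rejection rate is simply the fraction of the 500 Monte Carlo replications in which the resampling p-value falls below the level. A minimal sketch (the (1 + #exceedances)/(B + 1) p-value convention below is a common choice we assume here, not a detail quoted from the paper):

```python
import numpy as np

def surrogate_p_value(stat, resampled_stats):
    """One-sided p-value of an observed statistic against B resampled
    (surrogate or bootstrap) statistics, using the usual
    (1 + #exceedances) / (B + 1) convention."""
    b = len(resampled_stats)
    return (1 + np.sum(np.asarray(resampled_stats) >= stat)) / (b + 1)

def rejection_rate(p_values, alpha=0.05):
    """Fraction of Monte Carlo replications rejecting at level alpha."""
    return float(np.mean(np.asarray(p_values) < alpha))
```

With B = 999 resamples, the smallest attainable p-value is 1/1000, which matches the resampling sample size used in the simulations.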

With respect to the size performance, most of the time the realized rejection rates stay in line with the nominal size. Moreover, the bootstrapping methods outperform the time-shifted surrogate methods in that their empirical size is slightly closer to the nominal size. Lastly, the size of the tests is not very sensitive to the choice of the constant C, apart from the ARCH process given in Equation (10), where all tests are too conservative for the smaller constant C = 4.8.

From the point of view of power, (TS.a) and (SMB.a) seem to outperform their counterparts, yet the differences are very subtle. Along the dimension of sample size, we clearly see that the empirical power increases with the sample size. Furthermore, the results are very robust with respect to different choices of the constant C in the kernel density estimation bandwidth. For the VAR and nonlinear processes given by Equations (8) and (9) a smaller C seems to give more powerful tests, while a larger C is more beneficial for detecting the conditional dependence structure in the ARCH process of Equation (10).

Next, Table 4 presents the empirical power for the two-way VAR process, where the two variables {Xt} and {Yt} are entangled with each other. Due to the setting b = −0.2 and c = 0.1 in Equation (11), {Yt} is a stronger Granger cause of {Xt} than the other way around. As a consequence, the rejection rates reported in Table 4 are overall higher when testing Y → X than X → Y.

To visualize the simulation results, Figures 1 to 3 plot the empirical size and power against the nominal size. Since the performance of the five different resampling methods is quite similar, for simplicity we only show the results for (SMB.a). In each figure, the left (right) panels show the realized size (power), and we choose C = 4.8 (C = 8) for the top (bottom) two panels. We can see from the figures that the empirical performance of the transfer entropy test is overall satisfactory, apart from panel (a) in Figure 3, where the test is too conservative for large sample sizes. The under-rejection is caused by the inappropriate choice C = 4.8, which makes the bandwidth for the kernel estimation too small. Further, the empirical power against the nominal size for the two-way VAR of Equation (11) is reported in Figure 4.

4. Application

In this section, we apply the transfer entropy-based nonparametric test to the detection of financial market interdependence, in terms of both returns and volatility. [46] performed a variance decomposition of the covariance matrix of the error terms from a reduced-form VAR model to investigate spillover effects in the global equity market. More recently, [47] extended the framework to account for the time-varying nature of global volatility spillovers. Their research, although providing simple and intuitive methods for measuring directional linkages between global stock markets, may suffer from the limitations of linear parametric modeling, as discussed earlier. We revisit the topic of spillovers in the global equity market with the nonparametric method.

For our analysis, we use daily nominal stock market indexes from January 1992 to March 2017, obtained from Datastream, for six developed markets: the U.S. (DJIA), Japan (Nikkei 225), Hong Kong (Hangseng), the U.K. (FTSE 100), Germany (DAX 30) and France (CAC 40). The target series are the weekly return and volatility of each index. The weekly returns are calculated as differenced log prices multiplied by 100, from Friday to Friday. Where the Friday price is not available due to a public holiday, we use the Thursday price instead.

The weekly volatility series are generated following [46] by making use of the weekly high, low, opening and closing prices obtained from the underlying daily high, low, opening and closing data. The estimate of the volatility σt² for week t is

σt² = 0.511(Ht − Lt)² − 0.019[(Ct − Ot)(Ht + Lt − 2Ot) − 2(Ht − Ot)(Lt − Ot)] − 0.383(Ct − Ot)²,   (12)

where Ht is the Monday-Friday high, Lt is the Monday-Friday low, Ot is the Monday-Friday open and Ct is the Monday-Friday close (in natural logarithms multiplied by 100). Further, after deleting the volatility estimates for the New Year week in 2002, 2008 and 2013 due to the lack of observations for the Nikkei 225 index, we have 1313 observations in total for the weekly returns and volatilities. The descriptive statistics, LB-test statistics and ADF-test statistics for both series are summarized in Table 5. From the ADF test results, it is clear that all time series are stationary, allowing further analysis.
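The range-based estimator of Equation (12) is straightforward to code; a minimal sketch (function name ours; inputs are the weekly log high, low, open and close, multiplied by 100):

```python
def weekly_volatility(H, L, O, C):
    """Range-based weekly variance estimate of Equation (12).
    H, L, O, C: weekly high, low, opening and closing log prices (x100)."""
    return (0.511 * (H - L) ** 2
            - 0.019 * ((C - O) * (H + L - 2 * O)
                       - 2 * (H - O) * (L - O))
            - 0.383 * (C - O) ** 2)
```

Note that when the market does not move during the week (H = L = O = C) the estimate is zero, as one would expect from a variance proxy.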

We first provide a full-sample analysis of global stock market return and volatility spillovers over the period from January 1992 to March 2017, summarized in Tables 6 and 7. The two tables report the pairwise test statistics for conditional independence between index X and index Y, with the constant C in the kernel estimation bandwidth set to 4.8 or 8 and 999 resampled time series. In other words, we test the one-week-ahead directional linkage from index X to index Y using the five resampling methods described in Section 2. For example, the first line in the top panel in


[Figure: four panels plotting Rejection Rate against Nominal Size for n = 200, 500, 1000, 2000, with the 45-degree line as reference. Panels: (a) Size-Size Plot, C = 4.8; (b) Size-Power, C = 4.8; (c) Size-Size Plot, C = 8; (d) Size-Power, C = 8.]

Figure 1. Size-size and size-power plots of Granger non-causality tests, based on 500 replications and smoothed local bootstrap (a). The DGP is the bivariate VAR specified in Equation (8), with Y affecting X. The left (right) column shows observed rejection rates under the null (alternative) hypothesis. The sample size varies from n = 200 to n = 2000.

Table 6 reports the one-week-ahead influence of DJIA returns upon the other indexes using the first time-shifting surrogate test. Given C = 8, the DJIA return is shown to be a strong Granger cause for the Nikkei, FTSE and CAC at the 1% level, and for the DAX at the 5% level.

Based on Tables 6 and 7, we may draw several conclusions. Firstly, the US and German indexes are the most important return transmitters and Hong Kong is the largest source of volatility spillover, judged by the number of significant linkages. Note that this finding is similar to the result in [46], where the total return (volatility) spillovers from the US (Hong Kong) to the others are much higher than those from any other country. Figure 5 provides a graphic illustration of the global spillover network based on the results of (SMB.a) from Tables 6 and 7. Apart from the main transmitters, we can clearly see that the Nikkei and CAC are the main receivers in the global return spillover network, while the DAX is the main receiver of global volatility transmission.

Secondly, the results obtained are very robust, no matter which resampling method is applied. The differences between the five resampling methods are small, although (TS.a) is seen to be slightly more powerful than (TS.b) in Table 6. The three bootstrapping methods are mutually consistent almost all the time, similar to what we observed in Section 3.


[Figure: four panels plotting Rejection Rate against Nominal Size for n = 200, 500, 1000, 2000, with the 45-degree line as reference. Panels: (a) Size-Size Plot, C = 4.8; (b) Size-Power, C = 4.8; (c) Size-Size Plot, C = 8; (d) Size-Power, C = 8.]

Figure 2. Size-size and size-power plots of Granger non-causality tests, based on 500 replications and smoothed local bootstrap (a). The DGP is the bivariate non-linear VAR specified in Equation (9), with Y affecting X. The left (right) column shows observed rejection rates under the null (alternative) hypothesis. The sample size varies from n = 200 to n = 2000.

However, the summary results in Tables 6 and 7 are static. The statistics measure directional linkages averaged over the whole period from 1992 to 2017. The conditional dependence structure of the time series at any given point in time can be very different. Hence, the full-sample analysis is likely to overlook the cyclical dynamics between each pair of stock indices. To investigate the dynamics in the global stock market, we now move from the full-sample analysis to a rolling-window study. Using a 200-week rolling window that starts at the beginning of the sample and advances in 5-week steps for the iterative evaluation of the conditional dependence, we can assess the variation of the spillover in the global equity market over time.
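The window arithmetic is simple to sketch: with 1313 weekly observations, a 200-week window advanced in 5-week steps yields a sequence of (start, end) index pairs over which the test is recomputed (helper name is ours):

```python
def rolling_windows(n_obs, window=200, step=5):
    """(start, end) index pairs for a rolling window of `window`
    observations advanced by `step` observations at a time."""
    return [(s, s + window) for s in range(0, n_obs - window + 1, step)]
```

For example, rolling_windows(1313) starts at (0, 200), (5, 205), and so on, so each window shares 195 weeks with its predecessor.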

Taking the return series of the DJIA as an illustration 2, we iteratively apply the smoothed local bootstrap method to detect Granger causality from and to the DJIA return in Figure 6. The red line

2 All volatility series are extremely skewed, see Table 5. In a small-sample analysis, the test statistics turn out to be sensitive to clustered outliers, which typically occur during market turmoil. As a result, the volatility dynamics are more erratic and less informative than those of the returns.


[Figure: four panels plotting Rejection Rate against Nominal Size for n = 200, 500, 1000, 2000, with the 45-degree line as reference. Panels: (a) Size-Size Plot, C = 4.8; (b) Size-Power, C = 4.8; (c) Size-Size Plot, C = 8; (d) Size-Power, C = 8.]

Figure 3. Size-size and size-power plots of Granger non-causality tests, based on 500 replications and smoothed local bootstrap (a). The DGP is the bivariate ARCH specified in Equation (10), with Y affecting X. The left (right) column shows observed rejection rates under the null (alternative) hypothesis. The sample size varies from n = 200 to n = 2000.

represents the p-values for the transfer entropy-based test with the DJIA weekly return as the information transmitter, while the blue line shows the p-values associated with testing the DJIA as a receiver of weekly information spillovers from the others. The plots display an event-dependent pattern, particularly around the recent financial crisis; from early 2009 until the end of 2012, all pairwise tests show a strong bi-directional linkage. Besides, the DJIA strongly leads the Nikkei, Hangseng and CAC during the first decade of this century.

Further, we see that the influences from the other indexes on the DJIA differ, typically responding to economic events. For example, the blue line in the second panel of Figure 6 plunges below the 5% level twice before the recent financial crisis, meaning that the Hong Kong Hangseng index caused fluctuations in the DJIA during those two periods; first in the late 1990s and again by the end of 2004. The timing of the first fall matches the 1997 Asian currency crisis, and the latter was in fact caused by China's austerity policy in October 2004.

Finally, the dynamic plots provide additional insights into the sample period that the full-sample analysis may omit. The DAX and CAC are found to be less relevant for future fluctuations in the weekly return of the DJIA according to Table 6 and Figure 5; however, one may clearly see that since 2001 the p-values for DAX→DJIA and CAC→DJIA have been below 5% most of the time, suggesting an increased integration of the global financial markets.

[Figure: four panels plotting Rejection Rate against Nominal Size for n = 200, 500, 1000, 2000, with the 45-degree line as reference. Panels: (a) X → Y, C = 4.8; (b) Y → X, C = 4.8; (c) X → Y, C = 8; (d) Y → X, C = 8.]

Figure 4. Size-power plots of Granger non-causality tests, based on 500 replications and smoothed local bootstrap (a). The DGP is the two-way VAR specified in Equation (11), with X affecting Y and Y affecting X. The left (right) column shows observed rejection rates for testing X (Y) causing Y (X). The sample size varies from n = 200 to n = 2000.

5. Conclusions

This paper provides guidelines for the practical application of transfer entropy in detecting conditional dependence, i.e. Granger causality in a general sense, between two time series. Although there already is a sizable literature applying transfer entropy for this purpose, the asymptotics of the statistic and the performance of the resampling-based tests are still not well understood. We have considered five different resampling method-based tests, all of which have been shown in the literature to be suitable for entropy-related tests, and investigated their size and power comprehensively. Two time-shifted surrogate and three smoothed bootstrap methods are tested on simulated data from linear VAR, nonlinear VAR and ARCH processes. The simulation results in this controlled environment suggest that all five methods achieve reasonable rejection rates under the null as well as the alternative hypotheses. Our results are very robust with respect to the density estimation method, including the procedure used for standardizing the location and scale of the data and the choice of the bandwidth parameter, as long as the convergence rate of the kernel estimator of transfer entropy is consistent with its first-order Taylor expansion.

[Figure: two network diagrams over the nodes DJIA, Nikkei, Hangseng, FTSE, DAX and CAC. Panels: (a) Network of Global Stock Return Spillover; (b) Network of Global Stock Volatility Spillover.]

Figure 5. Graphic representation of pairwise causalities in global stock returns and volatilities. All arrows ("−→") in the graph indicate a significant directional causality at the 5% level.

In the empirical application, we have shown how the proposed resampling techniques can be used on real-world data for detecting conditional dependence. We use global equity data to detect pairwise causalities in the return and volatility series among the world's leading stock indexes. Our work is a nonparametric extension of the spillover measures considered by [46]. In accordance with their findings, we found evidence that the US and German indexes are the most important return transmitters and that Hong Kong is the largest source of volatility spillover. Furthermore, the rolling window-based test for Granger causality in pairwise return series demonstrated that the causal linkages in the global equity market are time-varying rather than static. The overall dependence is tighter during the most recent financial crisis, and the fluctuations of the p-values are shown to be event dependent.

As for future work, there are several directions for potential extension. On the theoretical side, it would be practically meaningful to consider causal linkage detection beyond a single-period lag. Further nonparametric techniques need to be developed to play a role similar to that of information criteria for order selection in parametric estimation. On the empirical side, it will be interesting to further exploit entropy-based statistics in testing conditional dependence when there exists a so-called common factor, i.e., looking at multivariate systems with more than two variables. One potential candidate for this type of test, the partial transfer entropy, has been coined by [48], but its statistical properties have yet to be thoroughly studied.


Acknowledgments: We would like to thank seminar participants at the University of Amsterdam and the Tinbergen Institute. The authors also wish to extend particular gratitude to Simon A. Broda and Chen Zhou for their suggestions on an early version of this paper. We also acknowledge SURFsara for providing the computational environment of the LISA cluster.

References

1. Shannon, C.E. A mathematical theory of communication. Bell System Technical Journal 1948, 27, 379–423; 623–656.
2. Shannon, C.E. Prediction and entropy of printed English. Bell System Technical Journal 1951, 30, 50–64.
3. Schreiber, T. Measuring information transfer. Physical Review Letters 2000, 85, 461.
4. Granger, C.W. Investigating causal relations by econometric models and cross-spectral methods. Econometrica 1969, pp. 424–438.
5. Hiemstra, C.; Jones, J.D. Testing for linear and nonlinear Granger causality in the stock price-volume relation. The Journal of Finance 1994, 49, 1639–1664.
6. Diks, C.; Panchenko, V. A new statistic and practical guidelines for nonparametric Granger causality testing. Journal of Economic Dynamics and Control 2006, 30, 1647–1669.
7. Su, L.; White, H. A nonparametric Hellinger metric test for conditional independence. Econometric Theory 2008, 24, 829–864.
8. Bouezmarni, T.; Rombouts, J.V.; Taamouti, A. Nonparametric copula-based test for conditional independence with applications to Granger causality. Journal of Business & Economic Statistics 2012, 30, 275–287.
9. Su, L.; White, H. Testing conditional independence via empirical likelihood. Journal of Econometrics 2014, 182, 27–44.
10. Hlavácková-Schindler, K.; Paluš, M.; Vejmelka, M.; Bhattacharya, J. Causality detection based on information-theoretic approaches in time series analysis. Physics Reports 2007, 441, 1–46.
11. Amblard, P.O.; Michel, O.J. The relation between Granger causality and directed information theory: A review. Entropy 2012, 15, 113–143.
12. Granger, C.; Lin, J.L. Using the mutual information coefficient to identify lags in nonlinear models. Journal of Time Series Analysis 1994, 15, 371–384.
13. Hong, Y.; White, H. Asymptotic distribution theory for nonparametric entropy measures of serial dependence. Econometrica 2005, 73, 837–901.
14. Barnett, L.; Bossomaier, T. Transfer entropy as a log-likelihood ratio. Physical Review Letters 2012, 109, 138105.
15. Efron, B. Computers and the theory of statistics: Thinking the unthinkable. SIAM Review 1979, 21, 460–480.
16. Theiler, J.; Eubank, S.; Longtin, A.; Galdrikian, B.; Farmer, J.D. Testing for nonlinearity in time series: The method of surrogate data. Physica D: Nonlinear Phenomena 1992, 58, 77–94.
17. Schreiber, T.; Schmitz, A. Surrogate time series. Physica D: Nonlinear Phenomena 2000, 142, 346–382.
18. Horowitz, J.L. The bootstrap. Handbook of Econometrics 2001, 5, 3159–3228.
19. Hinich, M.J.; Mendes, E.M.; Stone, L.; et al. Detecting nonlinearity in time series: Surrogate and bootstrap approaches. Studies in Nonlinear Dynamics & Econometrics 2005, 9.
20. Faes, L.; Porta, A.; Nollo, G. Mutual nonlinear prediction as a tool to evaluate coupling strength and directionality in bivariate time series: Comparison among different strategies based on k nearest neighbors. Physical Review E 2008, 78, 026201.
21. Papana, A.; Kyrtsou, K.; Kugiumtzis, D.; Diks, C.; et al. Assessment of resampling methods for causality testing. Technical report, CeNDEF Working Paper No. 14-08, University of Amsterdam, 2014.
22. Quiroga, R.Q.; Kraskov, A.; Kreuz, T.; Grassberger, P. Performance of different synchronization measures in real data: A case study on electroencephalographic signals. Physical Review E 2002, 65, 041903.
23. Marschinski, R.; Kantz, H. Analysing the information flow between financial time series. The European Physical Journal B - Condensed Matter and Complex Systems 2002, 30, 275–281.
24. Kugiumtzis, D. Evaluation of surrogate and bootstrap tests for nonlinearity in time series. Studies in Nonlinear Dynamics & Econometrics 2008.
25. Robinson, P.M. Consistent nonparametric entropy-based testing. The Review of Economic Studies 1991, 58, 437–453.
26. Kullback, S.; Leibler, R.A. On information and sufficiency. The Annals of Mathematical Statistics 1951, 22, 79–86.
27. Granger, C.; Maasoumi, E.; Racine, J. A dependence metric for possibly nonlinear processes. Journal of Time Series Analysis 2004, 25, 649–669.
28. Diks, C.; Fang, H. Detecting Granger causality with a nonparametric information-based statistic. CeNDEF Working Paper No. 17-03, University of Amsterdam, 2017.
29. Moddemeijer, R. On estimation of entropy and mutual information of continuous distributions. Signal Processing 1989, 16, 233–248.
30. Diks, C.; Manzan, S. Tests for serial independence and linearity based on correlation integrals. Studies in Nonlinear Dynamics and Econometrics 2002, 6-2, 1–20.
31. Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating mutual information. Physical Review E 2004, 69, 066138.
32. Wand, M.P.; Jones, M.C. Kernel Smoothing; Vol. 60, CRC Press, 1994.
33. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Vol. 26, CRC Press, 1986.
34. Joe, H. Estimation of entropy and other functionals of a multivariate density. Annals of the Institute of Statistical Mathematics 1989, 41, 683–697.
35. Van der Vaart, A.W. Asymptotic Statistics; Vol. 3, Cambridge University Press, 2000.
36. Jones, M.C.; Marron, J.S.; Sheather, S.J. A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association 1996, 91, 401–407.
37. He, Y.L.; Liu, J.N.; Wang, X.Z.; Hu, Y.X. Optimal bandwidth selection for re-substitution entropy estimation. Applied Mathematics and Computation 2012, 219, 3425–3460.
38. Politis, D.N.; Romano, J.P. The stationary bootstrap. Journal of the American Statistical Association 1994, 89, 1303–1313.
39. Kugiumtzis, D. Direct-coupling information measure from nonuniform embedding. Physical Review E 2013, 87, 062918.
40. Papana, A.; Kyrtsou, C.; Kugiumtzis, D.; Diks, C. Simulation study of direct causality measures in multivariate time series. Entropy 2013, 15, 2635–2661.
41. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; CRC Press, 1994.
42. Shao, J.; Tu, D. The Jackknife and Bootstrap; Springer Science & Business Media, 2012.
43. Neumann, M.H.; Paparoditis, E. On bootstrapping L2-type statistics in density testing. Statistics & Probability Letters 2000, 50, 137–147.
44. Paparoditis, E.; Politis, D.N. The local bootstrap for kernel estimators under general dependence conditions. Annals of the Institute of Statistical Mathematics 2000, 52, 139–159.
45. Baek, E.G.; Brock, W.A. A general test for nonlinear Granger causality: Bivariate model. Working paper, Iowa State University and University of Wisconsin, Madison, 1992.
46. Diebold, F.X.; Yilmaz, K. Measuring financial asset return and volatility spillovers, with application to global equity markets. The Economic Journal 2009, 119, 158–171.
47. Gamba-Santamaria, S.; Gomez-Gonzalez, J.E.; Hurtado-Guarin, J.L.; Melo-Velandia, L.F.; et al. Volatility spillovers among global stock markets: Measuring total and directional effects. Technical Report No. 983, Banco de la Republica de Colombia, 2017.
48. Vakorin, V.A.; Krakovska, O.A.; McIntosh, A.R. Confounding effects of indirect connections on causality estimation. Journal of Neuroscience Methods 2009, 184, 152–160.

© 2017 by the authors. Submitted to Entropy for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).


Table 1. Observed Size and Power of the Transfer Entropy-based Test for the Linear VAR Process in Equation (8)

Size
                        α = 0.05                                 α = 0.10
   n      TS.a    TS.b    SMB.a   SMB.b   STB        TS.a    TS.b    SMB.a   SMB.b   STB
C = 4.8
   200    0.0560  0.0500  0.0460  0.0420  0.0440     0.0820  0.0740  0.0740  0.0780  0.0780
   500    0.0740  0.0680  0.0660  0.0700  0.0660     0.1160  0.1120  0.1200  0.1160  0.1220
  1000    0.0620  0.0560  0.0600  0.0560  0.0560     0.0940  0.0920  0.0980  0.0960  0.0980
  2000    0.0380  0.0340  0.0380  0.0460  0.0460     0.0940  0.0920  0.0960  0.0980  0.0980
C = 8
   200    0.0500  0.0440  0.0460  0.0460  0.0440     0.1120  0.1000  0.0960  0.0960  0.0920
   500    0.0840  0.0760  0.0760  0.0720  0.0660     0.1360  0.1300  0.1060  0.1160  0.1140
  1000    0.0720  0.0680  0.0620  0.0560  0.0580     0.1280  0.1280  0.1160  0.1260  0.1200
  2000    0.0880  0.0780  0.0820  0.0760  0.0820     0.1420  0.1380  0.1320  0.1340  0.1440

Power
C = 4.8
   200    0.1880  0.1980  0.1900  0.1920  0.1900     0.2780  0.2780  0.2880  0.2860  0.2920
   500    0.3460  0.3460  0.3400  0.3480  0.3420     0.4520  0.4460  0.4580  0.4500  0.4500
  1000    0.5440  0.5340  0.5320  0.5340  0.5280     0.6400  0.6520  0.6460  0.6480  0.6460
  2000    0.7500  0.7420  0.7500  0.7460  0.7460     0.8160  0.8080  0.8100  0.8120  0.8120
C = 8
   200    0.1660  0.1680  0.1640  0.1680  0.1700     0.2660  0.2640  0.2680  0.2740  0.2740
   500    0.2900  0.2900  0.3020  0.3040  0.3020     0.4020  0.3960  0.3940  0.4020  0.3980
  1000    0.4980  0.4980  0.5000  0.4900  0.4980     0.6040  0.6120  0.6120  0.6140  0.6120
  2000    0.8420  0.8380  0.8400  0.8340  0.8460     0.8960  0.8880  0.8900  0.8900  0.8900

Note: Table 1 presents the empirical size and power of the transfer entropy-based test at the 5% and 10% significance levels for the process in Equation (8) for the different resampling methods. The values represent observed rejection rates over 500 realizations. Sample sizes go from 200 to 2,000. The control parameter is a = 0.4 for the size evaluation and a = 0.1 for establishing power. For this simulation study we consider C = 4.8 and C = 8.


Tabl

e2.

Obs

erve

dSi

zean

dPo

wer

ofth

eTr

ansf

erEn

trop

yba

sed

test

for

Non

linea

rVA

RPr

oces

sin

Equa

tion

(9)

Size

α=

0.05

α=

0.10

nTS

.aT

S.b

SMB.

aSM

B.b

STB

TS.a

TS.

bSM

B.a

SMB.

bST

B

C=

4.8

200

0.05

000.

0360

0.03

400.

0380

0.03

600.

0760

0.07

200.

0780

0.07

600.

0720

500

0.05

800.

0580

0.05

800.

0580

0.05

800.

0960

0.09

800.

1040

0.10

400.

0980

1000

0.03

400.

0360

0.03

800.

0360

0.03

800.

0620

0.05

800.

0740

0.07

000.

0720

2000

0.03

800.

0320

0.04

400.

0460

0.04

200.

0780

0.07

000.

0960

0.09

200.

0880

Size, C = 8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.0560  0.0480  0.0460  0.0520  0.0480        0.1000  0.0800  0.0940  0.0940  0.0940
500      0.0640  0.0620  0.0500  0.0480  0.0540        0.1120  0.1100  0.1080  0.1040  0.1040
1000     0.0440  0.0360  0.0320  0.0300  0.0280        0.0900  0.0860  0.0820  0.0780  0.0800
2000     0.0260  0.0300  0.0280  0.0280  0.0260        0.0700  0.0640  0.0740  0.0660  0.0640

Power, C = 4.8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.1400  0.1360  0.1380  0.1400  0.1440        0.2480  0.2420  0.2300  0.2360  0.2300
500      0.3380  0.3360  0.3340  0.3360  0.3320        0.4400  0.4400  0.4440  0.4300  0.4360
1000     0.6060  0.6040  0.6240  0.6220  0.6220        0.7180  0.7260  0.7140  0.7180  0.7160
2000     0.8760  0.8760  0.8780  0.8740  0.8780        0.9440  0.9300  0.9320  0.9300  0.9280

Power, C = 8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.0940  0.0900  0.0900  0.0900  0.0880        0.1800  0.1740  0.1700  0.1680  0.1760
500      0.1880  0.1800  0.1800  0.1760  0.1780        0.2960  0.2960  0.3000  0.2980  0.3020
1000     0.3800  0.3940  0.3900  0.3840  0.3820        0.5520  0.5440  0.5480  0.5520  0.5480
2000     0.8340  0.8300  0.8280  0.8220  0.8300        0.9040  0.9040  0.9060  0.8980  0.9040

Note: Table 2 presents the empirical size and power of the transfer entropy-based test at the 5% and 10% significance levels for the process in Equation (9), for different resampling methods. The values represent observed rejection rates over 500 realizations for nominal size 0.05. Sample sizes range from 200 to 2,000. The control parameter is a = 0.4 for the size evaluation and a = 0.1 for the power evaluation. For this simulation study, we consider C = 4.8 and C = 8.
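The size and power entries above are observed rejection rates from a Monte Carlo experiment. As an illustrative sketch (not the authors' code; `test_pvalue` and `simulate` are hypothetical stand-ins for a resampling-based transfer entropy test and a data-generating process), the bookkeeping looks like this:

```python
import numpy as np

def rejection_rate(pvalues, alpha):
    """Fraction of Monte Carlo p-values falling below the significance level."""
    return float(np.mean(np.asarray(pvalues) < alpha))

def monte_carlo_size(test_pvalue, simulate, n_reps=500, alpha=0.05, seed=0):
    """Empirical size: rejection rate when data are simulated under the null."""
    rng = np.random.default_rng(seed)
    pvals = [test_pvalue(simulate(rng)) for _ in range(n_reps)]
    return rejection_rate(pvals, alpha)

# Dummy test whose p-value is Uniform(0, 1) under the null, so the
# empirical size should land close to the nominal 0.05.
size = monte_carlo_size(
    test_pvalue=lambda data: float(data[0]),
    simulate=lambda rng: rng.uniform(size=1),
)
```

The same loop, applied to data simulated under the alternative, yields the power entries.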


Table 3. Observed Size and Power of the Transfer Entropy-based Test for the Bivariate ARCH Process in Equation (10)

Size, C = 4.8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.0660  0.0580  0.0620  0.0640  0.0620        0.1420  0.1280  0.1260  0.1340  0.1280
500      0.0640  0.0560  0.0560  0.0560  0.0580        0.1120  0.1060  0.1080  0.1020  0.1000
1000     0.0480  0.0460  0.0400  0.0420  0.0340        0.0740  0.0700  0.0700  0.0660  0.0620
2000     0.0220  0.0200  0.0080  0.0080  0.0080        0.0460  0.0500  0.0220  0.0260  0.0180

Size, C = 8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.0920  0.0760  0.0760  0.0720  0.0840        0.1540  0.1360  0.1280  0.1260  0.1320
500      0.1100  0.0900  0.0720  0.0760  0.0800        0.1980  0.1860  0.1620  0.1480  0.1600
1000     0.0920  0.0960  0.0740  0.0720  0.0800        0.1500  0.1500  0.1220  0.1180  0.1240
2000     0.0780  0.0740  0.0660  0.0620  0.0600        0.1260  0.1180  0.1140  0.1160  0.1120

Power, C = 4.8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.2320  0.2240  0.2180  0.2240  0.2160        0.3320  0.3340  0.3320  0.3520  0.3460
500      0.3520  0.3420  0.3520  0.3540  0.3540        0.4800  0.4800  0.4740  0.4780  0.4780
1000     0.5020  0.5060  0.5120  0.5120  0.5000        0.6340  0.6320  0.6340  0.6280  0.6300
2000     0.6020  0.6060  0.5940  0.5920  0.5960        0.7340  0.7280  0.7360  0.7300  0.7280

Power, C = 8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.2940  0.2880  0.3220  0.3180  0.3080        0.4360  0.4320  0.4380  0.4400  0.4340
500      0.5520  0.5500  0.5620  0.5720  0.5680        0.6880  0.6940  0.6880  0.6900  0.6920
1000     0.7720  0.7720  0.7820  0.7800  0.7740        0.8600  0.8560  0.8640  0.8600  0.8640
2000     0.9780  0.9720  0.9780  0.9740  0.9760        0.9900  0.9900  0.9920  0.9920  0.9920

Note: Table 3 presents the empirical size and power of the transfer entropy-based test at the 5% and 10% significance levels for the process in Equation (10), for different resampling methods. The values represent observed rejection rates over 500 realizations for nominal size 0.05. Sample sizes range from 200 to 2,000. The control parameter is a = 0.4 for the size evaluation and a = 0.1 for the power evaluation. For this simulation study, we consider C = 4.8 and C = 8.


Table 4. Observed Power of the Transfer Entropy-based Test for the Two-Way VAR Process in Equation (11)

X → Y, C = 4.8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.2780  0.2600  0.3540  0.3520  0.3560        0.3980  0.3760  0.4560  0.4560  0.4660
500      0.5640  0.5360  0.6260  0.6240  0.6260        0.6780  0.6680  0.7420  0.7400  0.7460
1000     0.8380  0.8320  0.8800  0.8740  0.8780        0.9040  0.8980  0.9200  0.9240  0.9200
2000     0.9900  0.9900  0.9980  0.9960  1.0000        1.0000  1.0000  1.0000  1.0000  1.0000

X → Y, C = 8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.2480  0.2240  0.3080  0.3180  0.3100        0.3540  0.3360  0.3900  0.3940  0.3920
500      0.4180  0.4020  0.4740  0.4700  0.4700        0.5400  0.5320  0.5880  0.5920  0.5800
1000     0.7960  0.7840  0.8140  0.8140  0.8160        0.8560  0.8540  0.8740  0.8720  0.8720
2000     0.9800  0.9780  0.9820  0.9800  0.9800        0.9900  0.9880  0.9900  0.9900  0.9920

Y → X, C = 4.8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.5140  0.5060  0.5480  0.5560  0.5480        0.6120  0.5960  0.6920  0.6920  0.6940
500      0.9400  0.9340  0.9480  0.9440  0.9460        0.9620  0.9580  0.9720  0.9700  0.9740
1000     1.0000  1.0000  1.0000  1.0000  1.0000        1.0000  1.0000  1.0000  1.0000  1.0000
2000     1.0000  1.0000  1.0000  1.0000  1.0000        1.0000  1.0000  1.0000  1.0000  1.0000

Y → X, C = 8
         α = 0.05                                      α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB           TS.a    TS.b    SMB.a   SMB.b   STB
200      0.4040  0.3700  0.4420  0.4400  0.4280        0.5060  0.4900  0.5400  0.5400  0.5320
500      0.8260  0.8220  0.8220  0.8200  0.8160        0.8740  0.8740  0.8700  0.8720  0.8680
1000     0.9820  0.9840  0.9820  0.9800  0.9800        0.9940  0.9940  0.9940  0.9920  0.9940
2000     1.0000  1.0000  1.0000  1.0000  1.0000        1.0000  1.0000  1.0000  1.0000  1.0000

Note: Table 4 presents the empirical power of the transfer entropy-based test at the 5% and 10% significance levels for the process in Equation (11), for different resampling methods. The values represent observed rejection rates over 500 realizations for nominal size 0.05. Sample sizes range from 200 to 2,000. The control parameters are b = −0.2 and c = 0.1 for the power evaluation. For this simulation study, we consider C = 4.8 and C = 8.


Table 5. Descriptive Statistics for Global Stock Market Return and Volatility

Return
            DJIA        Nikkei      Hangseng    FTSE        DAX         CAC
Mean        0.1433      -0.0126     0.1294      0.0823      0.1535      0.0790
Median      0.2911      0.1472      0.2627      0.2121      0.4029      0.1984
Maximum     10.6977     11.4496     13.9169     12.5845     14.9421     12.4321
Minimum     -20.0298    -27.8844    -19.9212    -23.6317    -24.3470    -25.0504
Std. Dev.   2.2321      3.0521      3.3819      2.3367      3.0972      2.9376
Skewness    -0.8851     -0.6978     -0.3773     -0.8643     -0.6398     -0.6803
Kurtosis    10.8430     8.9250      5.9522      13.2777     7.9186      8.0780
LB Test     49.9368**   15.4577     28.7922     61.0916**   28.0474     43.5004**
ADF Test    -38.9512**  -37.1989**  -35.4160**  -38.8850**  -37.2015**  -38.7114**

Volatility
            DJIA        Nikkei      Hangseng    FTSE        DAX         CAC
Mean        4.5983      7.7167      9.6164      5.5306      8.5698      8.2629
Median      2.2155      4.6208      4.5827      2.7161      4.0596      4.7122
Maximum     208.2227    265.9300    379.4385    149.1572    175.0968    179.8414
Minimum     0.0636      0.1882      0.1554      0.1154      0.1263      0.2904
Std. Dev.   9.9961      13.5154     21.3838     10.1167     15.2845     12.7872
Skewness    10.9980     9.6361      10.2868     6.7179      5.3602      6.0357
Kurtosis    180.0844    140.7606    152.4263    67.0128     42.0810     58.3263
LB Test     1924.0870** 933.3972**  1198.6872** 1970.7366** 2770.5973** 1982.4141**
ADF Test    -16.0378**  -17.9044**  -17.4896**  -14.0329**  -14.1928**  -13.6136**

Note: Table 5 presents the descriptive statistics for six globally leading indexes. The sample size is 1313 for both returns and volatilities. The nominal returns are measured by the weekly Friday-to-Friday log price difference multiplied by 100, and the Monday-to-Friday volatilities are calculated following [46]. For the LB-test and ADF-test statistics, the asterisks indicate significance of the corresponding p-value at the 1% (∗∗) level.
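The return construction in the note (Friday-to-Friday log price differences, times 100) can be sketched as follows; the function name is ours, and the realized-volatility measure of [46] is not reproduced here:

```python
import numpy as np

def weekly_log_returns(friday_closes):
    """Nominal weekly return: 100 * log(P_t / P_{t-1}) over Friday closes."""
    prices = np.asarray(friday_closes, dtype=float)
    return 100.0 * np.diff(np.log(prices))

# Four Friday closes give three weekly returns.
returns = weekly_log_returns([100.0, 102.0, 101.0, 105.0])
```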


Table 6. Detection of Conditional Dependence in Global Stock Returns

Entries are p-values for C = 4.8 / C = 8; rows give the "from" market, columns the "to" market.

TS.a
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.367/0.002**    0.972/0.273      0.609/0.009**    0.533/0.012*     0.277/0.001**
Nikkei     0.231/0.081      -                0.997/0.951      0.883/0.059      0.868/0.242      0.004**/0.001**
Hangseng   0.898/0.197      0.963/0.407      -                0.701/0.035*     0.969/0.174      0.640/0.005**
FTSE       0.483/0.004**    0.004**/0.001**  0.419/0.001**    -                0.839/0.185      0.917/0.615
DAX        0.977/0.149      0.027*/0.001**   0.009**/0.001**  0.004**/0.001**  -                0.695/0.025*
CAC        0.995/0.713      0.001**/0.001**  0.294/0.001**    0.918/0.741      0.639/0.203      -

TS.b
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.402/0.004**    0.967/0.341      0.595/0.020*     0.569/0.049*     0.288/0.012*
Nikkei     0.219/0.110      -                0.999/0.956      0.853/0.103      0.855/0.255      0.009**/0.006**
Hangseng   0.899/0.269      0.975/0.485      -                0.696/0.088      0.971/0.231      0.664/0.023*
FTSE       0.477/0.016*     0.006**/0.003**  0.404/0.018*     -                0.847/0.245      0.896/0.642
DAX        0.976/0.221      0.027*/0.002**   0.009**/0.005**  0.011*/0.010**   -                0.692/0.065
CAC        0.993/0.729      0.002**/0.002**  0.346/0.016*     0.907/0.763      0.650/0.244      -

SMB.a
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.564/0.001**    0.999/0.349      0.796/0.003**    0.817/0.014*     0.425/0.001**
Nikkei     0.321/0.147      -                0.988/0.946      0.957/0.085      0.944/0.321      0.006**/0.002**
Hangseng   0.946/0.273      0.967/0.483      -                0.860/0.016*     0.994/0.188      0.793/0.004**
FTSE       0.579/0.005**    0.021*/0.001**   0.701/0.003**    -                0.947/0.297      0.943/0.739
DAX        0.988/0.240      0.044*/0.001**   0.012*/0.001**   0.006**/0.001**  -                0.788/0.020*
CAC        0.993/0.762      0.006**/0.001**  0.594/0.001**    0.946/0.861      0.842/0.270      -

SMB.b
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.583/0.002**    0.994/0.334      0.797/0.001**    0.855/0.014*     0.413/0.001**
Nikkei     0.351/0.155      -                0.992/0.940      0.952/0.088      0.965/0.322      0.002**/0.003**
Hangseng   0.945/0.276      0.970/0.506      -                0.866/0.015*     0.997/0.215      0.829/0.004**
FTSE       0.637/0.006**    0.008**/0.001**  0.714/0.003**    -                0.954/0.345      0.953/0.789
DAX        0.980/0.256      0.034*/0.001**   0.011*/0.001**   0.009**/0.001**  -                0.831/0.042*
CAC        0.993/0.825      0.003**/0.001**  0.591/0.002**    0.980/0.898      0.862/0.304      -

STB
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.645/0.001**    0.996/0.334      0.787/0.002**    0.823/0.015*     0.430/0.001**
Nikkei     0.363/0.114      -                0.989/0.944      0.946/0.079      0.955/0.296      0.003**/0.002**
Hangseng   0.964/0.272      0.984/0.491      -                0.859/0.013*     0.987/0.197      0.786/0.004**
FTSE       0.652/0.005**    0.016*/0.001**   0.688/0.003**    -                0.940/0.262      0.965/0.761
DAX        0.982/0.234      0.048*/0.001**   0.017*/0.001**   0.006**/0.001**  -                0.828/0.029*
CAC        0.996/0.814      0.007**/0.001**  0.578/0.001**    0.967/0.835      0.799/0.265      -

Note: Table 6 presents the statistics for the pairwise transfer entropy-based test on returns of global stock indexes for one-week-ahead conditional non-independence. The results are shown for each of the five resampling methods in Section 2.3. The constant C takes the values 4.8 and 8 as a robustness check. The asterisks indicate significance of the corresponding p-value at the 5% (∗) and 1% (∗∗) levels.
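Tables 6 and 7 are "from → to" matrices of p-values with an empty diagonal. A minimal sketch of how such a matrix can be assembled (the `test` callable is a hypothetical stand-in for any of the five resampling-based tests, not the paper's implementation):

```python
import numpy as np
from itertools import permutations

def pairwise_pvalues(series, test):
    """Build a from->to p-value matrix; the diagonal stays NaN (no self-test)."""
    names = list(series)
    pmat = np.full((len(names), len(names)), np.nan)
    for (i, src), (j, dst) in permutations(enumerate(names), 2):
        # H0: `src` does not Granger-cause `dst`.
        pmat[i, j] = test(series[src], series[dst])
    return names, pmat

# Usage with a placeholder test that always returns 0.5.
names, pmat = pairwise_pvalues(
    {"DJIA": np.zeros(10), "FTSE": np.zeros(10)},
    test=lambda x, y: 0.5,
)
```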


Table 7. Detection of Conditional Dependence in Global Stock Volatilities

Entries are p-values for C = 4.8 / C = 8; rows give the "from" market, columns the "to" market.

TS.a
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.997/0.998      0.998/0.994      0.974/0.853      0.828/0.001**    0.005**/0.001**
Nikkei     0.998/0.995      -                0.996/1.000      0.997/0.994      0.971/0.950      1.000/0.999
Hangseng   0.943/0.003**    0.989/0.992      -                0.822/0.001**    0.001**/0.001**  0.973/0.953
FTSE       0.010**/0.001**  0.955/0.944      0.997/1.000      -                0.806/0.001**    0.985/0.946
DAX        0.975/0.931      0.934/0.898      0.999/0.995      0.001**/0.001**  -                0.072/0.001**
CAC        0.996/0.944      0.988/0.987      0.999/0.997      0.003**/0.001**  0.054/0.001**    -

TS.b
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.993/0.994      0.996/0.992      0.958/0.857      0.806/0.010**    0.018*/0.005**
Nikkei     0.986/0.994      -                0.996/0.997      0.988/0.988      0.942/0.926      0.998/0.996
Hangseng   0.919/0.011*     0.976/0.966      -                0.819/0.002**    0.001**/0.001**  0.965/0.941
FTSE       0.019*/0.003**   0.950/0.927      0.994/0.997      -                0.785/0.002**    0.982/0.932
DAX        0.976/0.904      0.943/0.890      0.997/0.999      0.001**/0.001**  -                0.113/0.009**
CAC        0.998/0.951      0.979/0.983      0.999/0.998      0.009**/0.002**  0.077/0.003**    -

SMB.a
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.823/0.786      0.984/0.968      0.468/0.189      0.210/0.004**    0.027*/0.002**
Nikkei     0.957/0.941      -                1.000/1.000      0.843/0.800      0.743/0.614      0.998/0.993
Hangseng   0.458/0.030*     0.888/0.808      -                0.165/0.007**    0.001**/0.001**  0.839/0.736
FTSE       0.034*/0.001**   0.557/0.417      1.000/1.000      -                0.358/0.001**    0.967/0.793
DAX        0.702/0.255      0.448/0.327      1.000/1.000      0.001**/0.001**  -                0.030*/0.001**
CAC        0.887/0.241      0.673/0.643      1.000/1.000      0.005**/0.001**  0.012*/0.001**   -

SMB.b
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.851/0.806      0.973/0.969      0.563/0.269      0.226/0.004**    0.032*/0.001**
Nikkei     0.950/0.940      -                1.000/1.000      0.888/0.849      0.792/0.675      0.999/0.993
Hangseng   0.451/0.022*     0.919/0.869      -                0.209/0.005**    0.002**/0.001**  0.868/0.793
FTSE       0.023*/0.001**   0.561/0.457      1.000/1.000      -                0.434/0.001**    0.974/0.840
DAX        0.819/0.442      0.491/0.356      1.000/1.000      0.001**/0.001**  -                0.030*/0.001**
CAC        0.958/0.554      0.781/0.696      1.000/1.000      0.009**/0.001**  0.014*/0.001**   -

STB
From \ To  DJIA             Nikkei           Hangseng         FTSE             DAX              CAC
DJIA       -                0.847/0.813      0.972/0.964      0.602/0.278      0.221/0.009**    0.045*/0.004**
Nikkei     0.967/0.956      -                1.000/1.000      0.835/0.789      0.777/0.624      0.998/0.997
Hangseng   0.527/0.033*     0.933/0.887      -                0.222/0.010**    0.003**/0.001**  0.860/0.762
FTSE       0.024*/0.002**   0.635/0.497      1.000/1.000      -                0.405/0.001**    0.977/0.826
DAX        0.795/0.392      0.561/0.410      1.000/1.000      0.004**/0.001**  -                0.027*/0.001**
CAC        0.903/0.561      0.809/0.748      1.000/1.000      0.020*/0.001**   0.010**/0.001**  -

Note: Table 7 presents the statistics for the pairwise transfer entropy-based test on volatilities of global stock indexes for one-week-ahead conditional non-independence. The results are shown for each of the five resampling methods in Section 2.3. The constant C takes the values 4.8 and 8 as a robustness check. The asterisks indicate significance of the corresponding p-value at the 5% (∗) and 1% (∗∗) levels.


[Figure 6 ("Return Spillover Dynamics"): five panels of time-varying p-values from 11/1995 to 12/2016, one per market pair in both directions: DJIA↔Nikkei, DJIA↔HKHS, DJIA↔FTSE, DJIA↔DAX and DJIA↔CAC, each panel marking the 5% significance level.]

Figure 6. The time-varying p-values for the transfer entropy-based Granger causality test in the Return series are presented. The causal linkages from DJIA to the other markets, as well as the linkages from the other markets to DJIA, are tested.