
arXiv:chao-dyn/9909041v1 30 Sep 1999

Improved surrogate data for nonlinearity tests

Thomas Schreiber, Andreas Schmitz
Physics Department, University of Wuppertal, D-42097 Wuppertal, Germany
(Phys. Rev. Lett. 77, 635 (1996))

Current tests for nonlinearity compare a time series to the null hypothesis of a Gaussian linear stochastic process. For this restricted null assumption, random surrogates can be constructed which are constrained by the linear properties of the data. We propose a more general null hypothesis allowing for nonlinear rescalings of a Gaussian linear process. We show that such rescalings cannot be accounted for by a simple amplitude adjustment of the surrogates, which leads to spurious detection of nonlinearity. An iterative algorithm is proposed to make appropriate surrogates which have the same autocorrelations as the data and the same probability distribution.

PACS: 05.45.+b

The paradigm of deterministic chaos has become a very attractive concept for the study of the irregular time evolution of experimental or natural phenomena. Nonlinear methods have indeed been successfully applied to laboratory data from many different systems [1]. However, soon after the first signatures of low dimensional chaos had been reported for field data [2], it turned out that nonlinear algorithms can mistake linear correlations, in particular those of the power law type, for determinism [3]. This has led on the one hand to more critical applications of algorithms like the correlation dimension [4]. On the other hand, significance tests have been proposed which allow for the detection of nonlinearity even when, for example, a clear scaling region is lacking in the correlation integral [5]. The idea is to test results against the null hypothesis of a specific class of linear random processes.

One of the most popular of such tests is the method of

surrogate data [6], which can be used with any nonlinear statistic that characterizes a time series by a single number. The value of the nonlinear discriminating statistic is computed on the measured data and compared to its empirical distribution on a collection of Monte Carlo realizations of the null hypothesis. Usually, the null assumption we want to make is not a very specific one, like a certain particular autoregressive (AR) process. We would rather like to be able to test general assumptions, for example that the data is described by some Gaussian linear random process. Thus we will not try to find a specific faithful model of the data; we will rather design the Monte Carlo realizations to have the same linear properties as the data. The authors of [7] call this a constrained realization approach.

In particular, the null hypothesis of autocorrelated

    Gaussian linear noise can be tested with surrogates which

are by construction Gaussian random numbers but have the same autocorrelations as the signal. Due to the Wiener-Khinchin theorem, this is the case if their power spectra coincide. One can multiply the discrete Fourier transform of the data by random phases and then perform the inverse transform (phase randomized surrogates). Equivalently, one can create Gaussian independent random numbers, take their Fourier transform, replace those amplitudes with the amplitudes of the Fourier transform of the original data, and then invert the Fourier transform. This is similar to a filter in the frequency domain; here the filter is the quotient of the desired and the actual Fourier amplitudes.

In practice, the above null hypothesis is not as interesting as one might like: very few of the time series considered for a nonlinear treatment pass even a simple test for Gaussianity. Therefore we want to consider a more general null hypothesis including the possibility that the data were measured by an instantaneous, invertible measurement function h which does not depend on time n. A time series {s_n}, n = 1, ..., N is consistent with this null hypothesis if there exists an underlying Gaussian linear stochastic signal {x_n} such that $s_n = h(x_n)$ for all n. If the null hypothesis is true, typical realizations of a process which obeys the null are expected to share the same power spectrum and amplitude distribution. But even within the class defined by the null hypothesis, different processes will result in different power spectra and distributions. It is now an essential requirement that the discriminating statistic must not mistake these variations for deviations from the null hypothesis. The tedious way to achieve this is by constructing a pivotal statistic which is insensitive to these differences. The alternative we will pursue here is the constrained realizations approach: the variations in spectrum and distribution within the class defined by the null hypothesis are suppressed by constraining the surrogates to have the same power spectrum as well as the same distribution of values as the data.

In [6], the amplitude adjusted Fourier transform

(AAFT) algorithm is proposed for the testing of this null hypothesis. First, the data {s_n} is rendered Gaussian by rank-ordering according to a set of Gaussian random numbers. The resulting series $\bar{s}_n = g(s_n)$ is Gaussian but follows the measured time evolution {s_n}. Now make phase randomized surrogates for $\{\bar{s}_n\}$, call them $\{\bar{s}'_n\}$. Finally, invert the rescaling g by rank-ordering $\{\bar{s}'_n\}$ according to the distribution of the original data,



FIG. 1. Discrepancy of the power spectra of human breath rate data (solid line) and 19 AAFT surrogates (dashed lines). Here the power spectra have been computed with a square window of length 64. (Figure not reproduced; axes: relative power versus frequency, 0 to 1 Hz.)
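In NumPy-style code, the phase randomized surrogates described earlier (Gaussian independent random numbers contribute the Fourier phases, the data contribute the amplitudes) can be sketched as follows. This is an illustrative sketch, not the authors' code; the function name is ours.

```python
import numpy as np

def phase_randomized_surrogate(s, rng):
    """Surrogate with the power spectrum of s but random Fourier phases.

    Illustrative sketch: Gaussian independent random numbers supply the
    phases, the data supply the amplitudes; this is equivalent to
    randomizing the phases of the data's own Fourier transform.
    """
    s = np.asarray(s, dtype=float)
    n = len(s)
    amplitudes = np.abs(np.fft.rfft(s))      # Fourier amplitudes of the data
    noise = rng.standard_normal(n)           # Gaussian independent numbers ...
    phases = np.angle(np.fft.rfft(noise))    # ... contribute only their phases
    # combine data amplitudes with random phases and invert the transform
    return np.fft.irfft(amplitudes * np.exp(1j * phases), n=n)
```

By construction the surrogate's periodogram coincides with that of the data, while its value distribution is only approximately Gaussian.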

$s'_n = g^{-1}(\bar{s}'_n)$.

The AAFT algorithm should be correct asymptotically in the limit $N \to \infty$ [8]. For finite N however, $\{s'_n\}$ and $\{s_n\}$ have the same distributions of amplitudes by construction, but they do not usually have the same sample power spectra. One of the reasons is that the phase randomization procedure performed on $\{\bar{s}_n\}$ preserves the Gaussian distribution only on average. The fluctuations of $\{\bar{s}_n\}$ and $\{\bar{s}'_n\}$ will differ in detail. The nonlinearity contained in the amplitude adjustment procedure (the rescaling applied is not exactly equal to $g^{-1}$) will turn these into a bias in the empirical power spectrum. Such systematic errors can lead to false rejections of the null hypothesis if a statistic is used which is sensitive to autocorrelations. The second reason is that g is not really the inverse of the nonlinear measurement function h, and instead of recovering $\{x_n\}$ we will find some other Gaussian series. Even if $\{s_n\}$ were Gaussian, g would not be the identity. Again, the two rescalings will lead to an altered spectrum.

In Fig. 1 we see power spectral estimates of a clinical

data set and of 19 AAFT surrogates. The data is taken from data set B of the Santa Fe Institute time series contest [9]. It consists of 4096 samples of the breath rate of a patient with sleep apnea. The sampling interval is 0.5 seconds. The discrepancy of the spectra is significant. A bias towards a white spectrum is noted: power is taken away from the main peak to enhance the low and high frequencies.

The purpose of this letter is to propose an alternative

method of producing surrogate data sets which have the same power spectrum and distribution as a given data set. We do not expect that these two requirements can be exactly fulfilled at the same time for finite N, except for the trivial solution, a cyclic shift of the data set itself. We will rather construct sequences which assume the same values (without replacement) as the data and which have spectra which are practically indistinguishable from that of the data. We can require a specific maximal discrepancy in the power spectrum and report a failure if this accuracy could not be reached.

FIG. 2. Convergence of the iterative scheme to the correct power spectrum while the distribution is kept fixed. First order AR process with nonlinear measurement. The curves were obtained with N = 1024, 2048, ..., 32768, counted from above. We also show the curve 1/i. (Figure not reproduced; axes: relative deviation of spectrum versus iterations.)

The algorithm consists of a simple iteration scheme.
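For later comparison, the three AAFT steps quoted above from [6] can be sketched in NumPy as follows. This is an illustrative sketch under our own naming, not the authors' code.

```python
import numpy as np

def aaft_surrogate(s, rng):
    """Amplitude adjusted Fourier transform surrogate (illustrative sketch).

    1. Rescale the data to a Gaussian series by rank-ordering,
    2. phase-randomize the Gaussianized series,
    3. invert the rescaling by rank-ordering the original values.
    """
    s = np.asarray(s, dtype=float)
    n = len(s)
    ranks = np.argsort(np.argsort(s))        # rank of each data point
    # step 1: Gaussian series following the measured time evolution
    s_bar = np.sort(rng.standard_normal(n))[ranks]
    # step 2: phase randomized surrogate of the Gaussianized series
    amp = np.abs(np.fft.rfft(s_bar))
    phases = np.angle(np.fft.rfft(rng.standard_normal(n)))
    s_bar_prime = np.fft.irfft(amp * np.exp(1j * phases), n=n)
    # step 3: invert the rescaling: rank-order the original data values
    return np.sort(s)[np.argsort(np.argsort(s_bar_prime))]
```

The surrogate takes exactly the values of the data, so the amplitude distribution is matched; as discussed above, its sample power spectrum matches only approximately, which is the bias the iteration scheme below removes.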

Store a sorted list of the values $\{s_n\}$ and the squared amplitudes of the Fourier transform of $\{s_n\}$, $S_k^2 = |\sum_{n=0}^{N-1} s_n e^{i 2\pi k n / N}|^2$. Begin with a random shuffle (without replacement) $\{s_n^{(0)}\}$ of the data [10]. Now each iteration consists of two consecutive steps. First $\{s_n^{(i)}\}$ is brought to the desired sample power spectrum. This is

achieved by taking the Fourier transform of $\{s_n^{(i)}\}$, replacing the squared amplitudes $\{S_k^{2,(i)}\}$ by $\{S_k^2\}$ and then transforming back. The phases of the complex Fourier components are kept. Thus the first step enforces the correct spectrum, but usually the distribution will be modified. Therefore, as the second step, rank-order the resulting series in order to assume exactly the values taken by $\{s_n\}$. Unfortunately, the spectrum of the resulting $\{s_n^{(i+1)}\}$ will be modified again. Therefore the two steps have to be repeated several times.

At each iteration stage we can check the remaining discrepancy of the spectrum and iterate until a given accuracy is reached. For finite N we don't expect convergence in the strict sense. Eventually, the transformation towards the correct spectrum will result in a change which is too small to cause a reordering of the values. Thus after rescaling, the sequence is not changed.

In Fig. 2 we show the convergence of the iteration

scheme as a function of the iteration count i and the length of the time series N. The data here was a first order AR process $x_n = 0.7 x_{n-1} + \eta_n$, measured through $s_n = x_n^3$. The increments $\eta_n$ are independent Gaussian random numbers. For each N = 1024, 2048, ..., 32768 we create a time series and ten surrogates. In order to quantify the convergence, the spectrum was estimated by $S_k^2 = |\sum_{n=0}^{N-1} s_n e^{i 2\pi k n / N}|^2$ and smoothed over 21 frequency bins, $\hat{S}_k^2 = \sum_{j=k-10}^{k+10} S_j^2 / 21$. Note that for the

FIG. 3. For the same process as used in Fig. 2 we show the saturation value of the accuracy for the above values of N. The straight line is $1/\sqrt{N}$. (Figure not reproduced; axes: relative deviation of spectrum versus N.)
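The two-step iteration defined above (stored sorted values and Fourier amplitudes, random shuffle as a starting point, then alternately enforcing spectrum and distribution) can be transcribed into NumPy as follows. This is an illustrative transcription, not the authors' code; the stopping rule shown (stop when the rank-ordering no longer changes the sequence, as discussed above, or after a maximum number of iterations) is our own choice.

```python
import numpy as np

def iterative_surrogate(s, max_iter=1000, rng=None):
    """Surrogate with the data's exact value distribution and a power
    spectrum driven towards that of the data (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.asarray(s, dtype=float)
    n = len(s)
    sorted_vals = np.sort(s)             # stored sorted list of the values
    target_amp = np.abs(np.fft.rfft(s))  # stored Fourier amplitudes of the data
    x = rng.permutation(s)               # random shuffle without replacement
    for _ in range(max_iter):
        # step 1: impose the desired amplitudes, keeping the phases
        phases = np.angle(np.fft.rfft(x))
        y = np.fft.irfft(target_amp * np.exp(1j * phases), n=n)
        # step 2: rank-order to assume exactly the values of the data
        x_new = sorted_vals[np.argsort(np.argsort(y))]
        if np.array_equal(x_new, x):     # rescaling no longer changes anything
            break
        x = x_new
    return x
```

The returned series is always an exact permutation of the data; the remaining spectral discrepancy is what Figs. 2 and 3 quantify.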

generation of surrogates no smoothing is performed. As the (relative) discrepancy of the spectrum at the i-th iteration we use $\sum_{k=0}^{N-1} (\hat{S}_k^{(i)} - \hat{S}_k)^2 / \sum_{k=0}^{N-1} \hat{S}_k^2$. Not surprisingly, progress is fastest in the first iteration, where the random scramble is initially brought from its white spectrum to the desired one (the initial discrepancy of the scramble was 0.2 ± 0.01 for all cases and is not shown in Fig. 2). For i ≥ 1, the discrepancy of the spectrum decreases approximately like 1/i until an N dependent saturation is reached. The saturation value seems to scale like an inverse power of N which depends on the process. For the data underlying Fig. 2 we find a $1/\sqrt{N}$ dependence, see Fig. 3. For comparison, the discrepancy for AAFT surrogates did not fall below 0.015 for all N. We have observed similar scaling behavior for a variety of other linear correlated processes. For data from a discretized Mackey-Glass equation we found exponential convergence $\propto \exp(-0.4 i)$ before a saturation value was reached which decreases approximately like $1/N^{3/2}$. Although we found rapid convergence in all examples we have studied so far, the rate seems to depend both on the distribution of the data and the nature of the correlations. The details of the behavior are not yet understood.

In order to verify that false rejections are indeed

avoided by this scheme, we compared the number of false positives in a test for nonlinearity for the AAFT algorithm and the iterative scheme, the latter as a function of the number of iterations. We performed tests on data sets of 2048 points generated by the instantaneously, monotonously distorted AR process $s_n = x_n \sqrt{|x_n|}$, $x_n = 0.95 x_{n-1} + \eta_n$. The discriminating statistic was a nonlinear prediction error obtained with locally constant fits in two dimensional delay space. For each test, 19 surrogates were created and the null hypothesis was rejected at the 95% level of significance if the prediction error for the data was smaller than those of the 19 surrogates. The number of false rejections was estimated by performing

FIG. 4. Percentage of false rejections as a function of the number of iterations performed. Horizontal line: nominal rejection rate at the 95% level of significance. In this case, 7 iterations are sufficient to render the test accurate. The usual AAFT algorithm yields 66% false rejections. (Figure not reproduced; axes: false rejections, 0% to 50%, versus iterations, 0 to 10.)
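The rank-test protocol used here (19 surrogates, one-sided rejection at the 95% level) is generic and can be sketched as follows. This sketch is ours, not the authors' code; `statistic` and `make_surrogate` are placeholders for the discriminating statistic (e.g. a nonlinear prediction error) and the surrogate generator.

```python
import numpy as np

def surrogate_test(data, statistic, make_surrogate, n_surrogates=19, rng=None):
    """One-sided rank test (illustrative sketch): reject the null at the
    95% level if the statistic of the data is smaller than that of all
    n_surrogates surrogates, where smaller means 'more nonlinear',
    e.g. a smaller nonlinear prediction error."""
    rng = np.random.default_rng() if rng is None else rng
    t_data = statistic(np.asarray(data, dtype=float))
    t_surr = [statistic(make_surrogate(data, rng)) for _ in range(n_surrogates)]
    return t_data < min(t_surr)   # True = reject the null hypothesis
```

Under a true null hypothesis, the data's statistic falls below all 19 surrogate values with probability 1/20, giving the nominal 5% rejection rate; the discussion above shows how surrogate bias can inflate this rate.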

300 independent tests. Instead of the expected 5% false positives we found 66 ± 5% false rejections with the AAFT algorithm. Fig. 4 shows the percentage of false rejections as a function of the number of iterations of the scheme described in this letter. The correct rejection rate for the 95% level of significance is reached after about 7 iterations. This example is particularly dramatic because of the strong correlations, although the nonlinear rescaling is not very severe.

Let us make some further remarks on the proposed algorithm. We decided to use an unwindowed power spectral estimate which puts quite a strong constraint on the surrogates (the spectrum fixes N/2 parameters). Thus it cannot be excluded that the iterative scheme is able to converge only by also adjusting the phases of the Fourier transform in a nontrivial way. This might introduce spurious nonlinearity in the surrogates, in which case we can find the confusing result that there is less nonlinearity in the data than in the surrogates. If the null hypothesis is wrong, we expect more nonlinearity in the data (better nonlinear predictability, smaller estimated dimension, etc.). Therefore we can always use one-sided tests and thus avoid additional false rejections. However, spurious structure in the surrogates can diminish the power of the statistical test. Since an unwindowed power spectral estimate shows strong fluctuations within each frequency bin, it seems unnecessary to require the surrogates to have exactly the same spectrum as the data, including the fluctuations. The variance of the spectral estimate can be reduced for example by windowing, but the frequency content of the windowing function introduces an additional bias.

Let us finally remark that although the null hypothesis

    of a Gaussian linear process measured by a monotonousfunction is the most general we have a proper statisticaltest for, its rejection does not imply nonlinear dynamics.For instance, noninstantaneous measurement functions


(e.g., $s_n = x_n^2 x_{n-1}$) are not included and (correctly) lead

to a rejection of the null hypothesis, although the underlying dynamics may be linear. Another example is first differences of the distorted output from a Gaussian linear process [11].

In conclusion, we established an algorithm to provide

surrogate data sets containing random numbers with a given sample power spectrum and a given distribution of values. The achievable accuracy depends on the nature of the data and in particular the length of the time series.

We thank James Theiler, Daniel Kaplan, Tim Sauer, Peter Grassberger, and Holger Kantz for stimulating discussions. This work was supported by the SFB 237 of the Deutsche Forschungsgemeinschaft.

[1] Many references can be found in S. Vohra, M. Spano, M. Shlesinger, L. Pecora, and W. Ditto, eds., Proceedings of the 1st Experimental Chaos Conference: Arlington, Virginia, October 1-3, 1991, World Scientific, Singapore (1992); E. Ott, T. Sauer, and J. A. Yorke, Coping with Chaos, Wiley, New York (1994).

[2] C. Nicolis and G. Nicolis, Nature 311, 529 (1984); P. Grassberger, Nature 323, 609 (1986); C. Nicolis and G. Nicolis, Nature 326, 523 (1987); P. Grassberger, Nature 326, 524 (1987).

[3] A. R. Osborne and A. Provenzale, Physica D 35, 357 (1989); J. Theiler, Phys. Lett. A 155, 480 (1991).

[4] J. Theiler, J. Opt. Soc. Am. A 7, 1055 (1990); P. Grassberger, T. Schreiber, and C. Schaffrath, Int. J. Bifurcation and Chaos 1, 521 (1991); H. Kantz and T. Schreiber, CHAOS 5, 143 (1995).

[5] W. A. Brock, D. A. Hsieh, and B. LeBaron, Nonlinear dynamics, chaos, and instability: statistical theory and economic evidence, MIT Press, Cambridge, MA (1991).

[6] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. D. Farmer, Physica D 58, 77 (1992).

[7] J. Theiler and D. Prichard, Constrained-realization Monte-Carlo method for hypothesis testing, to appear in Physica D (1996).

[8] Asymptotically for $N \to \infty$, g approaches the inverse of the nonlinear measurement function h with mean squared fluctuations of O(1/N). The phase randomization preserves the Gaussian distribution in that limit. Since the amplitude adjustments are anticorrelated with the signal, the AAFT algorithm introduces a bias towards a flat spectrum which is also expected (and numerically found) to be O(1/N).

[9] D. R. Rigney, A. L. Goldberger, W. Ocasio, Y. Ichimaru, G. B. Moody, and R. Mark, in A. S. Weigend and N. A. Gershenfeld, eds., Time Series Prediction: Forecasting the Future and Understanding the Past, Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XV, Addison-Wesley (1993).

[10] Alternatively, but not equivalently, one could start with a usual AAFT surrogate.

[11] D. Prichard, Phys. Lett. A 191, 245-250 (1994).

