Simultaneous confidence bands in spectral density estimation
Michael H. Neumann
Friedrich-Schiller-Universität Jena
Institut für Stochastik
Ernst-Abbe-Platz 2
D – 07743 Jena, Germany
E-mail: [email protected]
Efstathios Paparoditis
University of Cyprus
Department of Mathematics and Statistics
P.O. Box 20537
CY – 1678 Nicosia, Cyprus
E-mail: [email protected]
Abstract
We propose a method for the construction of simultaneous confidence bands for (a smoothed version of) the spectral density of a Gaussian process based on nonparametric kernel estimators obtained by smoothing the periodogram. A studentized statistic is used to determine the width of the band at each frequency and a frequency domain bootstrap approach is employed in order to estimate the distribution of the supremum of this statistic over all frequencies. We prove by means of strong approximations that the bootstrap estimates consistently the distribution of the supremum deviation of interest and, consequently, that the proposed confidence bands achieve asymptotically the desired simultaneous coverage probability. The behavior of our method in finite sample situations is investigated by simulations and a real-life data example demonstrates its applicability in time series analysis.
2000 Mathematics Subject Classification. Primary 62G15; secondary 62M15.
Keywords and Phrases. Bootstrap, confidence bands, Gaussian processes, spectral density, strong approximation.
Short title. Confidence bands for the spectral density.
1. Introduction
Estimating the spectral density of a stochastic process is an important step in
the statistical analysis of its second order characteristics. Different parametric and
non-parametric procedures have been proposed for this purpose and are now well
investigated in the literature. As in any estimation problem, apart from the con-
struction of point estimators with desirable statistical properties, the construction of
interval estimators that simultaneously contain the unknown spectral density with a
pre-specified probability is also important. Such bands are useful in many situations.
For instance, simultaneous confidence bands can be used to decide if particular fea-
tures of the estimated spectral density are due to the covariance structure of the
underlying process or to the randomness of the spectral estimator used. Confidence
bands are also useful in checking the fit of parametric models. Such checks can be
done by examining if the spectral density of the fitted parametric model lies over all
frequencies within the nonparametrically obtained simultaneous confidence bands
for the spectral density of the process generating the observed time series.
In contrast to point estimators, however, the construction of simultaneous confidence bands for the spectral density has received less attention in the statistical literature, and only a few studies address it. They mainly focus on the parametric case of a finite order autoregressive process. In particular and for Gaussian autoregressive processes, Newton and Pagano (1984) proposed a method for the
construction of simultaneous confidence bands based on properties of the reciprocal
spectral density and Scheffe’s projections. Tomasek (1987) derived simultaneous
confidence bands for the autoregressive spectral density using asymptotic properties
of parametric spectral density estimators and Sidak’s inequality. For the vector au-
toregressive case, Sakai and Sakaguchi (1990) using a method proposed by Koslov
and Jones (1985) and Hrafnkelsson and Newton (2000) extending the method pro-
posed by Tomasek (1987), developed different procedures for the construction of
simultaneous confidence bands for the components of the spectral density matrix
or of particular functions thereof. Although the assumption of a finite order au-
toregressive structure allows the implementation of (efficient) parametric spectral
density estimators for the construction of confidence bands it largely restricts the
applicability of the methods proposed.
This paper proposes a nonparametric method to construct simultaneous confi-
dence bands for (a smoothed version of) the spectral density of Gaussian processes.
The method does not rely on parametric structural assumptions on the underlying
stochastic process. Whenever one constructs nonparametric pointwise confidence
intervals or simultaneous confidence bands one faces a notorious bias problem. It
results from the fact that nonparametric curve estimation in the supremum norm
is an ill-posed inverse statistical problem. Problems at the practical level, even un-
der smoothness conditions on fX , emerge as follows. If the bandwidth is chosen of
mean-square-error (MSE) optimal order, then bias and standard deviation will be
of the same order of magnitude. The stochastic term can be taken into account by
asymptotic theory (the limiting process is a certain Gaussian process) or eventually
even better by some bootstrap technique. There is, however, no really satisfactory
approach to deal with the bias term. One can try to estimate it explicitly, however,
consistency of this estimator requires that some degrees of smoothness of fX are not
used by the initial estimator. Alternatively, one can choose a bandwidth of smaller than MSE-optimal order to keep the bias negligible. This is hardly practicable, since no well-motivated rule for choosing an undersmoothing bandwidth is available, especially for any finite n. These problems can also be seen from a different angle: both remedies against the bias problem necessarily require that the underlying estimator is not asymptotically optimal in the mean-square sense. To circumvent these problems, we urge the reader to rethink the initial goal of setting up a confidence band for fX and suggest instead constructing the confidence band for a kernel-smoothed version of fX , which turns the problem into a well-posed one.
We define a convolution operator $K_h(\cdot)$ as
\[ K_h(f_X)(\lambda) = \int K_h(\lambda - \omega)\, f_X(\omega)\, d\omega, \]
where $K_h(\cdot) = h^{-1}K(\cdot/h)$, and $K$ and $h = h_n$ are the smoothing kernel and the smoothing bandwidth, respectively. Our aim is to construct a confidence band for $K_h(f_X)$.
The method proposed uses, as a starting point, a nonparametric kernel-type
estimator of the spectral density obtained by smoothing the sample spectral density
(periodogram). To determine appropriately the width of the confidence band at
each frequency, the distribution of the supremum deviation over all frequencies of a
studentized version of the nonparametric estimator applied is used. The width of
the confidence band varies then according to the changing variability associated with
estimating the underlying spectral density at different frequencies. The distribution
of the supremum deviation of the studentized statistic involved in our construction
is estimated using a frequency domain bootstrap procedure which exploits the fact
that periodogram ordinates of a Gaussian noise process at the Fourier frequencies
are independent. This allows the approximation of the random behavior of sums
of weakly dependent random variables by that of independent ones. Asymptotic
validity of the bootstrap procedure proposed to approximate the desired supremum
distribution is then established by means of strong approximations. Using this
basic result we prove that the confidence bands obtained achieve asymptotically the
desired simultaneous coverage probability.
The paper is organized as follows. After stating the main assumptions imposed
in Section 2, we introduce the nonparametric spectral density estimator and the
basic studentized statistic used in our approach. The bootstrap method proposed
to approximate the supremum deviations is presented and its asymptotic validity
is established. We conclude this section by stating the main result of the paper re-
garding the asymptotic behavior of the coverage probability of the confidence bands
proposed. Section 3 presents some numerical examples illustrating the behavior of
our method in finite sample situations and a real-life data example demonstrates
its applicability in time series analysis. Finally, proofs of all results are deferred to
Section 4.
2. Confidence bands for the Spectral density
2.1. Preliminaries. We consider real-valued random variables X1, X2, . . . , Xn ob-
served from a stochastic process (Xt)t∈Z satisfying the following assumption.
Assumption 1: $(X_t)_{t\in\mathbb{Z}}$ is a zero mean, stationary Gaussian process satisfying
\[ \sum_{k=0}^{\infty} k\,|c_k| < \infty, \tag{2.1} \]
where $c_k = \mathrm{cov}(X_t, X_{t+k})$ is the autocovariance at lag $k \in \mathbb{Z}$. Furthermore, we assume that the spectral density $f_X$ of $(X_t)_{t\in\mathbb{Z}}$ is everywhere positive. Notice that by (2.1), $f_X$ exists, is Lipschitz continuous and is given by
\[ f_X(\lambda) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} c_k \cos(\lambda k), \quad \lambda \in [-\pi, \pi]. \]
[Note to co-author: Stathis, or would you prefer $e^{-i\lambda k}$ in the sum?]
Moreover, we assume that $f_X$ is bounded away from zero, that is,
\[ \inf_{\lambda\in[-\pi,\pi]} f_X(\lambda) > 0. \tag{2.2} \]
Our aim is to devise simultaneous confidence bands for fX or for some smoothed
version thereof, cf. Section 2.2, with an asymptotic coverage probability of 1−α, for
some given α ∈ (0, 1). Toward this goal we first consider a class of nonparametric
estimators of fX . A common starting point for many nonparametric estimators
proposed in the literature is the periodogram
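\[ I_{n,X}(\lambda) = \frac{1}{2\pi}\,|J_{n,X}(\lambda)|^2 = \frac{1}{2\pi} \sum_{k=-(n-1)}^{n-1} \cos(\lambda k) \left( \frac{1}{n} \sum_{t=1}^{n-k} X_t X_{t+k} \right), \quad \lambda \in [-\pi, \pi], \]
where $J_{n,X}(\lambda) = n^{-1/2} \sum_{t=1}^{n} X_t e^{-i\lambda t}$ is the finite Fourier transform of $X_1, X_2, \ldots, X_n$. Commonly the periodogram is calculated at the Fourier frequencies $\lambda_k = 2\pi k/n$, $k \in K_n = \{-[(n-1)/2], \ldots, [n/2]\}$.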
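The two expressions for the periodogram can be checked against each other numerically. The following is a minimal sketch using only the Python standard library; the function names and the toy series are hypothetical, the transform is computed by direct summation rather than an FFT, and the autocovariance form is read with |k| in the inner sum for negative lags (an assumption about the intended meaning).

```python
import cmath
import math


def fourier_indices(n):
    """Index set K_n = {-[(n-1)/2], ..., [n/2]} of the Fourier frequencies."""
    return list(range(-((n - 1) // 2), n // 2 + 1))


def periodogram(x, lam):
    """I_{n,X}(lam) = |J_{n,X}(lam)|^2 / (2*pi), with t running from 1 to n."""
    n = len(x)
    j = sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x, start=1))
    return abs(j) ** 2 / (2.0 * math.pi * n)


def periodogram_via_autocov(x, lam):
    """Same quantity via sample autocovariances (negative lags use |k|)."""
    n = len(x)
    total = 0.0
    for k in range(-(n - 1), n):
        lag = abs(k)
        c_k = sum(x[t] * x[t + lag] for t in range(n - lag)) / n
        total += math.cos(lam * k) * c_k
    return total / (2.0 * math.pi)


# Toy series, chosen only for illustration.
x = [0.5, -1.2, 0.3, 2.0, -0.7, 0.1, 1.1, -0.4]
n = len(x)
freqs = [2.0 * math.pi * k / n for k in fourier_indices(n)]
values = [periodogram(x, lam) for lam in freqs]
```

Both functions should agree up to rounding error at every frequency, and the periodogram is symmetric in the frequency argument.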
The periodogram is not a consistent estimator of $f_X(\lambda)$, and a class of consistent estimators is obtained by smoothing $I_{n,X}(\lambda)$ over different frequencies, i.e., by considering
\[ \hat f_{n,X}(\lambda) = \sum_{k\in\mathbb{Z}} w_{n,k}(\lambda)\, I_{n,X}(\lambda_k). \tag{2.3} \]
In the following we derive our results for commonly used kernel estimators of $f_X$ by setting either
\[ w_{n,k}(\lambda) = \frac{2\pi}{n}\, K_h(\lambda - \lambda_k); \tag{2.4} \]
cf. Priestley (1981), or
\[ w_{n,k}(\lambda) = \int_{\lambda_k - \pi/n}^{\lambda_k + \pi/n} K_h(\lambda - \omega)\, d\omega; \tag{2.5} \]
cf. Müller and Prewitt (1992).
Notice that it may happen that we include in (2.3) some λk outside the interval
[−π, π] since we do not use one-sided kernels for estimation near the ends of [−π, π].
Notice further that in defining the periodogram, we could equally well use the mean-corrected observations,
\[ I_{n,X-\bar X_n}(\lambda) = \frac{1}{2\pi n} \left| \sum_{t=1}^{n} (X_t - \bar X_n)\, e^{-i\lambda t} \right|^2, \]
where $\bar X_n = n^{-1} \sum_{t=1}^{n} X_t$ is the mean of the observed series. This would allow us to drop the assumption that the process has zero mean. The asymptotic theory developed in this paper carries over to this case as well, since $I_{n,X-\bar X_n}(\lambda_k) = I_{n,X}(\lambda_k)$ for $k \in \mathbb{Z}$ with $k \bmod n \neq 0$, which implies that the difference of the corresponding kernel estimators is of negligible size.
We will assume that
Assumption 2: $K : \mathbb{R} \to \mathbb{R}$ is a nonnegative and symmetric kernel with bounded total variation and support $[-\pi, \pi]$. Furthermore, $\int_{-\pi}^{\pi} K(x)\,dx = 1$.
Assumption 3: The smoothing bandwidth $h = h_n$ depends on $n$ and the sequence $(h_n)_{n\in\mathbb{N}}$ fulfills $h_n \sim n^{-\eta}$ for some $\eta \in (0, 1)$.
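As an illustration of the kernel estimator with the weights (2.4), here is a minimal sketch in plain Python. It assumes the Bartlett-Priestley kernel that is used later in Section 3.1 (which satisfies Assumption 2); the function names, the toy series and the bandwidth are hypothetical choices. Note that the weights may involve Fourier frequencies outside [−π, π], as remarked above.

```python
import cmath
import math


def bp_kernel(x):
    """Bartlett-Priestley kernel with support [-pi, pi]; it integrates to one."""
    if abs(x) > math.pi:
        return 0.0
    return 3.0 * (1.0 - (x / math.pi) ** 2) / (4.0 * math.pi)


def periodogram(x, lam):
    """I_{n,X}(lam) by direct summation (an FFT would be used in practice)."""
    n = len(x)
    j = sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x, start=1))
    return abs(j) ** 2 / (2.0 * math.pi * n)


def weights(lam, n, h, kernel=bp_kernel):
    """Return {k: w_{n,k}(lam)} from (2.4), keeping only nonzero weights."""
    k_lo = math.floor((lam - math.pi * h) * n / (2.0 * math.pi))
    k_hi = math.ceil((lam + math.pi * h) * n / (2.0 * math.pi))
    out = {}
    for k in range(k_lo, k_hi + 1):
        lam_k = 2.0 * math.pi * k / n
        w = (2.0 * math.pi / n) * kernel((lam - lam_k) / h) / h  # (2*pi/n) * K_h
        if w > 0.0:
            out[k] = w
    return out


def kernel_estimate(x, lam, h, kernel=bp_kernel):
    """Weighted average (2.3) of periodogram ordinates around lam."""
    n = len(x)
    return sum(w * periodogram(x, 2.0 * math.pi * k / n)
               for k, w in weights(lam, n, h, kernel).items())


# Arbitrary illustrative series and bandwidth.
x = [math.sin(0.9 * t) + 0.3 * math.cos(2.1 * t) for t in range(1, 65)]
estimate = kernel_estimate(x, 0.9, h=0.2)
```

The weights form a Riemann sum of the kernel, so they add up to approximately one; near the end of the frequency range, indices k beyond [n/2] receive positive weight and the periodogram is evaluated there through its 2π-periodicity.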
Instead of the class of estimators (2.3) based on a weighted average of the pe-
riodogram over the Fourier frequencies, we may also consider estimators of fX(λ)
which are based on a convolution of the periodogram with a kernel function, i.e.,
estimators given by
\[ \tilde f_{n,X}(\lambda) = \int_{-\pi}^{\pi} K_h(\lambda - \omega)\, I_{n,X}(\omega)\, d\omega. \tag{2.6} \]
Approximating the above integral by the corresponding Riemann sum gives
\[ \frac{2\pi}{n} \sum_{k} K_h(\lambda - \lambda_k)\, I_{n,X}(\lambda_k). \]
By Theorem 5.9.1 of Brillinger (1981, p. 162), we have that if $K$ has a bounded derivative, then
\[ \left| \tilde f_{n,X}(\lambda) - \frac{2\pi}{n} \sum_{k} K_h(\lambda - \lambda_k)\, I_{n,X}(\lambda_k) \right| = O_P\big(n^{-1}h^{-2} + \log(n)\,(nh)^{-1}\big), \]
where the $O_P$ term does not depend on $\lambda$. Thus if the kernel $K$ satisfies the aforementioned smoothness condition and if $h_n \sim n^{-\eta}$ for some $\eta \in (0, 1/3)$, then the asymptotic behavior of the estimators $\hat f_{n,X}(\lambda)$ and $\tilde f_{n,X}(\lambda)$ is identical. This suggests that properties established for the confidence bands based on the estimator (2.3) will carry over to those using estimator (2.6).
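The restriction η < 1/3 can be traced by a short calculation. The sketch below assumes, in line with the order of the standard deviation given in Section 2.2, that the stochastic fluctuation of the kernel estimator is of order (nh)^{-1/2}, and compares the Riemann-sum approximation error with it.

```latex
% Approximation error relative to the stochastic error, with h = n^{-\eta}
\[
\frac{n^{-1}h^{-2}}{(nh)^{-1/2}} = n^{-1/2}h^{-3/2} = n^{(3\eta-1)/2} \to 0
\iff \eta < \tfrac{1}{3},
\qquad
\frac{\log(n)\,(nh)^{-1}}{(nh)^{-1/2}} = \log(n)\,(nh)^{-1/2} \to 0
\ \text{ for all } \eta \in (0,1).
\]
```

Only the first term imposes a constraint on the bandwidth; the second is negligible for every admissible η.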
2.2. Simultaneous confidence bands. We begin our construction of a confidence
band for Kh(fX) by considering the studentized statistic
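\[ D_n(\lambda) = \frac{\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda)}{\hat\sigma(\hat f_{n,X}(\lambda))}, \quad \lambda \in [-\pi, \pi], \tag{2.7} \]
where $\hat\sigma(\hat f_{n,X}(\lambda))$ is an estimator of the standard deviation of the kernel estimator $\hat f_{n,X}(\lambda)$, i.e., of $\sigma(\hat f_{n,X}(\lambda)) = \sqrt{\mathrm{var}(\hat f_{n,X}(\lambda))}$. We have that $\mathrm{var}(I_{n,X}(\lambda_k)) = (1 + \delta_k) f_X^2(\lambda_k) + O(n^{-1/2})$ and $\mathrm{cov}(I_{n,X}(\lambda_{k_1}), I_{n,X}(\lambda_{k_2})) = O(n^{-1})$ for $k_1 \neq k_2$, where $\delta_k = 1$ if $\lambda_k$ is $0$ or a multiple of $\pm\pi$ and $\delta_k = 0$ otherwise; see Brockwell and Davis (1991), Th. 10.3.2. This implies, in conjunction with $\sup_\lambda \{\sum_k |w_{n,k}(\lambda)|\} = O(1)$ and $\sup_\lambda \{\sum_k w_{n,k}^2(\lambda)\} = O(n^{-1}h^{-1})$, that
\[ \sigma^2(\hat f_{n,X}(\lambda)) = \sum_{k} w_{n,k}^2(\lambda)\,(1 + \delta_k)\, f_X^2(\lambda_k) + O(n^{-3/2}h^{-1} + n^{-1}). \tag{2.8} \]
This suggests the estimator
\[ \hat\sigma^2(\hat f_{n,X}(\lambda)) = \sum_{k} w_{n,k}^2(\lambda)\,(1 + \delta_k)\, \hat f_{n,X}^2(\lambda_k) \]
of $\sigma^2(\hat f_{n,X}(\lambda))$, which is used in (2.7).
Based on (2.7), a $(1-\alpha)100\%$ simultaneous confidence band for $K_h(f_X)$ is obtained as
\[ \big[\, \hat f_{n,X}(\lambda) - t_{n,\alpha}\,\hat\sigma(\hat f_{n,X}(\lambda)),\ \hat f_{n,X}(\lambda) + t_{n,\alpha}\,\hat\sigma(\hat f_{n,X}(\lambda)) \,\big], \tag{2.9} \]
where $t_{n,\alpha}$ denotes the upper $\alpha$-percentage point of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$. Observe that the width of the interval (2.9) is proportional to $\hat\sigma(\hat f_{n,X}(\lambda))$, which reflects the varying difficulty in estimating the unknown spectral density $f_X(\lambda)$ at different frequencies $\lambda$. Implementation of the above confidence band requires knowledge of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$. To approximate this distribution we propose in the following a frequency domain bootstrap procedure which imitates the distribution of a tractable approximation of the studentized statistic (2.7).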
To elaborate on the approximation of Dn(λ) used, recall first the basic fact that
every non-deterministic stationary Gaussian process can be written as a causal linear
Gaussian process (see Proposition 2.1 of Fan and Yao (2003, p. 33)), that is, there
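exists a sequence of independent innovations $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$ such that
\[ X_t = \sum_{k=0}^{\infty} \psi_k\, \varepsilon_{t-k} \tag{2.10} \]
and the coefficients $\{\psi_k,\ k \in \mathbb{N} \cup \{0\}\}$ satisfy $\sum_{k=0}^{\infty} \psi_k^2 < \infty$. [Note to co-author: Stathis, is a different result also possible here? After all, we now even assume $\sum_k |c_k|\,k < \infty$, see the new (2.1)...] Let $\psi(\omega) = \sum_{k=0}^{\infty} \psi_k \omega^k$ and denote by
\[ I_{n,\varepsilon}(\lambda) = \frac{1}{2\pi}\,|J_{n,\varepsilon}(\lambda)|^2 \]
the periodogram of the Gaussian noise series $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$, i.e., $J_{n,\varepsilon}(\lambda)$ is given by $J_{n,\varepsilon}(\lambda) = n^{-1/2} \sum_{t=1}^{n} \varepsilon_t e^{-i\lambda t}$. By Theorem 10.3.1 of Brockwell and Davis (1991, p. 347) we can express the periodogram as
\[ I_{n,X}(\lambda) = |\psi(e^{-i\lambda})|^2\, I_{n,\varepsilon}(\lambda) + (2\pi)^{-1} R_n(\lambda), \tag{2.11} \]
where
\[ R_n(\lambda) = \psi(e^{-i\lambda})\, J_{n,\varepsilon}(\lambda)\, Y_n(-\lambda) + \psi(e^{i\lambda})\, J_{n,\varepsilon}(-\lambda)\, Y_n(\lambda) + |Y_n(\lambda)|^2 \]
and
\[ Y_n(\lambda) = n^{-1/2} \sum_{k=0}^{\infty} \psi_k e^{-i\lambda k} \left( \sum_{t=1-k}^{n-k} \varepsilon_t e^{-i\lambda t} - \sum_{t=1}^{n} \varepsilon_t e^{-i\lambda t} \right). \]
The random variable $J_{n,\varepsilon}(\lambda)$ is complex normal distributed with mean zero and variance $\sigma_\varepsilon^2$, while $Y_n(\lambda)$ is complex normal with mean zero and variance of order $O(n^{-1})$.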
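Using (2.11) and $|\psi(e^{-i\lambda})|^2 = f_X(\lambda)/f_\varepsilon(\lambda) = 2\pi f_X(\lambda)/\sigma_\varepsilon^2$ we can decompose $D_n(\lambda)$ as follows:
\[
\begin{aligned}
D_n(\lambda) ={}& \sum_{k} w_{n,k}(\lambda)\, f_X(\lambda_k) \big( 2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1 \big) \big/ \hat\sigma(\hat f_{n,X}(\lambda)) \\
&+ (2\pi)^{-1} \sum_{k} w_{n,k}(\lambda)\, R_n(\lambda_k) \big/ \hat\sigma(\hat f_{n,X}(\lambda)) \\
&+ \Big( \sum_{k} w_{n,k}(\lambda)\, f_X(\lambda_k) - K_h(f_X)(\lambda) \Big) \big/ \hat\sigma(\hat f_{n,X}(\lambda)).
\end{aligned}
\tag{2.12}
\]
We argue in the following that instead of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$ it suffices to consider $\sup_{\lambda\in[-\pi,\pi]} |\tilde D_n(\lambda)|$, where
\[ \tilde D_n(\lambda) = \sum_{k} w_{n,k}(\lambda)\, f_X(\lambda_k) \big( 2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1 \big) \big/ \hat\sigma(\hat f_{n,X}(\lambda)), \tag{2.13} \]
that is, the contributions of the second and of the third term on the right hand side of (2.12) to the distribution of the supremum of interest are asymptotically negligible. Notice that the study of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |\tilde D_n(\lambda)|$ is simpler than that of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$ because $\sum_k w_{n,k}(\lambda) f_X(\lambda_k) (2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1)$ is a weighted sum of independent random variables due to the fact that the $I_{n,\varepsilon}(\lambda_k)$'s are periodogram ordinates of a Gaussian white noise series at the Fourier frequencies.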
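The distributional fact behind this simplification can be illustrated by simulation. The sketch below (standard library only; function name, sample size, seed and number of repetitions are arbitrary illustrative choices) generates Gaussian white noise and computes the scaled ordinates 2πI(λ_k)/σ² at the interior Fourier frequencies, which for Gaussian noise follow a χ²₂/2 distribution, i.e., a standard exponential distribution with mean and variance one.

```python
import cmath
import math
import random


def scaled_ordinates(eps, sigma2=1.0):
    """2*pi*I_{n,eps}(lambda_k)/sigma2 for the interior frequencies 1 <= k < n/2."""
    n = len(eps)
    out = []
    for k in range(1, (n + 1) // 2):
        lam = 2.0 * math.pi * k / n
        j = sum(e * cmath.exp(-1j * lam * t) for t, e in enumerate(eps, start=1))
        out.append(abs(j) ** 2 / (n * sigma2))  # |J|^2 / sigma2 with J = n^{-1/2} * sum
    return out


rng = random.Random(1)  # fixed seed, for reproducibility only
n, reps = 32, 200
sample = []
for _ in range(reps):
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    sample.extend(scaled_ordinates(eps))

mean = sum(sample) / len(sample)
var = sum((s - mean) ** 2 for s in sample) / (len(sample) - 1)
print(round(mean, 3), round(var, 3))  # both should be close to 1
```

The frequencies 0 and π are excluded here because the ordinates there follow a χ²₁ distribution instead.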
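To see why the distribution of the supremum of $|\tilde D_n(\lambda)|$ approximates correctly the corresponding distribution of $|D_n(\lambda)|$, notice first that because of $(\inf_\lambda \hat\sigma(\hat f_{n,X}(\lambda)))^{-1} = O_P((nh)^{1/2})$ we get by the properties of the kernel $K$ and of the spectral density $f_X$ that
\[
\begin{aligned}
\sup_{\lambda\in[-\pi,\pi]} \left| \frac{\sum_k w_{n,k}(\lambda) f_X(\lambda_k) - K_h(f_X)(\lambda)}{\hat\sigma(\hat f_{n,X}(\lambda))} \right|
&\le \Big( \inf_\lambda \hat\sigma(\hat f_{n,X}(\lambda)) \Big)^{-1} \sup_\lambda \Big| \sum_k w_{n,k}(\lambda) f_X(\lambda_k) - K_h(f_X)(\lambda) \Big| \\
&= O_P((nh)^{1/2})\; O((nh)^{-1}) = O_P((nh)^{-1/2}).
\end{aligned}
\tag{2.14}
\]
Furthermore, using the bound $P(|Y| > x) \le \sqrt{2/\pi}\,(1/x)\,e^{-x^2/2}$, for $Y \sim N(0, 1)$, we obtain that for all $\gamma < \infty$ there exists a $C_\gamma < \infty$ such that
\[ \max_k \left\{ P\left( |R_n(\lambda_k)| > C_\gamma \frac{\log n}{\sqrt{n}} \right) \right\} = O(n^{-\gamma}). \tag{2.15} \]
It now follows from (2.15) and (2.2) that for all $\gamma < \infty$ there exists $C_\gamma < \infty$ such that
\[
\begin{aligned}
& P\left( \sup_{\lambda\in[-\pi,\pi]} \left\{ \Big| \sum_k w_{n,k}(\lambda) R_n(\lambda_k) \Big| \Big/ \hat\sigma(\hat f_{n,X}(\lambda)) \right\} > C_\gamma h^{1/2} \log n \right) \\
&\quad \le P\left( \Big( \inf_\lambda \hat\sigma(\hat f_{n,X}(\lambda)) \Big)^{-1} \sup_\lambda \Big\{ \sum_k |w_{n,k}(\lambda)| \Big\} \times \max_k \{ |R_n(\lambda_k)| \} > C_\gamma h^{1/2} \log n \right) \\
&\quad = O(n^{-\gamma}).
\end{aligned}
\tag{2.16}
\]
Using (2.14) and (2.16) we finally obtain that
\[
\Big| \sup_\lambda |D_n(\lambda)| - \sup_\lambda |\tilde D_n(\lambda)| \Big| \le \sup_\lambda \big| D_n(\lambda) - \tilde D_n(\lambda) \big| = O_P((nh)^{-1/2}) + O_P(h^{1/2} \log n),
\tag{2.17}
\]
that is, the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n(\lambda)|$ can be well approximated by the distribution of $\sup_{\lambda\in[-\pi,\pi]} |\tilde D_n(\lambda)|$.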
2.3. Bootstrap Approximations. In view of (2.17) it is clear that in order to eval-
uate the distribution of supλ |Dn(λ)| appropriately, it suffices to mimic the behavior
of the random variables
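\[ \xi_k = f_X(\lambda_k) \big( 2\pi I_{n,\varepsilon}(\lambda_k)/\sigma_\varepsilon^2 - 1 \big) \]
by the bootstrap. Since the innovations $\varepsilon_t$ are independent with $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$, the random variables $\xi_0, \ldots, \xi_{[n/2]}$ are independent with
\[
\xi_k \sim
\begin{cases}
f_X(\lambda_k)\,(\chi_2^2/2 - 1), & \text{if } 1 \le k < n/2, \\
f_X(\lambda_k)\,(\chi_1^2 - 1), & \text{if } k \in \{0, n/2\}.
\end{cases}
\]
Here $\chi_m^2$ denotes the $\chi^2$-distribution with $m$ degrees of freedom. Thus to mimic $\xi_k$ it is natural to generate independent random variables $\gamma_0^*, \ldots, \gamma_{[n/2]}^*$, which are also independent of the original sample $X_1, \ldots, X_n$, with
\[
\gamma_k^* \sim
\begin{cases}
\chi_2^2/2 - 1, & \text{if } 1 \le k < n/2, \\
\chi_1^2 - 1, & \text{if } k \in \{0, n/2\}.
\end{cases}
\]
The bootstrap counterparts of the $\xi_k$ are then defined as
\[ \xi_k^* = \hat f_{n,X}(\lambda_k)\, \gamma_k^*, \quad k = 0, \ldots, [n/2]. \]
According to the $2\pi$-periodicity and the symmetry of the periodogram we define further $\xi_{[n/2]+k}^* = \xi_{n-[n/2]-k}^*$ ($k = 1, \ldots, n - [n/2]$) and $\xi_{-k}^* = \xi_k^*$ ($k = 1, \ldots, n$). The bootstrap counterpart of $\tilde D_n(\lambda) = \sum_k w_{n,k}(\lambda)\, \xi_k / \hat\sigma(\hat f_{n,X}(\lambda))$ is then given by
\[ D_n^*(\lambda) = \sum_{k} w_{n,k}(\lambda)\, \xi_k^* \big/ \hat\sigma(\hat f_{n,X}(\lambda)), \quad \lambda \in [-\pi, \pi]. \]
Based on this bootstrap approximation, the $(1-\alpha)100\%$ simultaneous confidence band for $K_h(f_X)$ we propose is given by
\[ \big[\, \hat f_{n,X}(\lambda) - t_{n,\alpha}^*\,\hat\sigma(\hat f_{n,X}(\lambda)),\ \hat f_{n,X}(\lambda) + t_{n,\alpha}^*\,\hat\sigma(\hat f_{n,X}(\lambda)) \,\big], \]
where $t_{n,\alpha}^*$ denotes the upper $\alpha$-percentage point of the distribution of $\sup_{\lambda\in[-\pi,\pi]} |D_n^*(\lambda)|$.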
Note that this distribution can be evaluated by Monte Carlo simulation.
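The Monte Carlo evaluation can be sketched as follows in plain Python. This is an illustrative reading of the procedure under stated assumptions, not a reference implementation: it uses the weights (2.4) with the Bartlett-Priestley kernel of Section 3.1, approximates the supremum over [−π, π] by the maximum over the Fourier frequencies in [0, π], studentizes the bootstrap statistic with the estimated standard deviation, and all function names and default values are hypothetical.

```python
import cmath
import math
import random


def bp_kernel(x):
    """Bartlett-Priestley kernel on [-pi, pi]."""
    return 3.0 * (1.0 - (x / math.pi) ** 2) / (4.0 * math.pi) if abs(x) <= math.pi else 0.0


def fold(k, n):
    """Map any integer k to {0, ..., [n/2]} via 2*pi-periodicity and symmetry."""
    r = k % n
    return n - r if r > n // 2 else r


def periodogram_ordinates(x):
    """I_{n,X}(lambda_k), k = 0, ..., [n/2], by direct summation."""
    n = len(x)
    out = []
    for k in range(n // 2 + 1):
        lam = 2.0 * math.pi * k / n
        j = sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x, start=1))
        out.append(abs(j) ** 2 / (2.0 * math.pi * n))
    return out


def smooth(values, w, n):
    """sum_d w[d] * values[fold(j - d)] for every grid index j = 0, ..., [n/2]."""
    m = (len(w) - 1) // 2
    return [sum(w[d + m] * values[fold(j - d, n)] for d in range(-m, m + 1))
            for j in range(n // 2 + 1)]


def bootstrap_band(x, h, alpha=0.05, n_boot=200, rng=None):
    """Return (f_hat, lower, upper, t_star) on the grid lambda_j, j = 0, ..., [n/2]."""
    rng = rng or random.Random(0)
    n = len(x)
    m = int(n * h / 2)  # weights (2.4) vanish for |j - k| > n*h/2
    w = [2.0 * math.pi / (n * h) * bp_kernel(2.0 * math.pi * d / (n * h))
         for d in range(-m, m + 1)]
    special = {0, n // 2} if n % 2 == 0 else {0}  # frequencies 0 and pi (delta_k = 1)
    f_hat = smooth(periodogram_ordinates(x), w, n)
    var_terms = [(2.0 if k in special else 1.0) * f_hat[k] ** 2 for k in range(n // 2 + 1)]
    sigma = [math.sqrt(v) for v in smooth(var_terms, [wd ** 2 for wd in w], n)]
    sups = []
    for _ in range(n_boot):
        # gamma*_k: chi^2_1 - 1 at frequencies 0 and pi, chi^2_2/2 - 1 = Exp(1) - 1 elsewhere
        gamma = [(rng.gauss(0.0, 1.0) ** 2 if k in special else rng.expovariate(1.0)) - 1.0
                 for k in range(n // 2 + 1)]
        xi_star = [f * g for f, g in zip(f_hat, gamma)]
        d_star = smooth(xi_star, w, n)
        sups.append(max(abs(d) / s for d, s in zip(d_star, sigma)))
    sups.sort()
    t_star = sups[min(n_boot - 1, math.ceil((1.0 - alpha) * n_boot) - 1)]
    lower = [f - t_star * s for f, s in zip(f_hat, sigma)]
    upper = [f + t_star * s for f, s in zip(f_hat, sigma)]
    return f_hat, lower, upper, t_star


rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(64)]  # illustrative Gaussian white noise
f_hat, lower, upper, t_star = bootstrap_band(x, h=0.3, n_boot=100, rng=rng)
```

Restricting the maximum to the Fourier grid keeps the cost at one smoothing pass per bootstrap replicate; a finer frequency grid would approximate the supremum more closely at proportionally higher cost.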
The following proposition establishes asymptotic validity of the bootstrap procedure proposed because it shows that $(\tilde D_n(\lambda))_{\lambda\in[-\pi,\pi]}$ is consistently mimicked by its bootstrap analogue $(D_n^*(\lambda))_{\lambda\in[-\pi,\pi]}$.
Proposition 2.1. Suppose that for every $\lambda \in [-\pi, \pi]$ the weights $\{w_{n,k}(\lambda),\ k \in \mathbb{Z}\}$ are given by (2.4) or (2.5) and that Assumptions 1 to 3 are satisfied. Then, there exists a coupling of the random variables $\xi_{-n}, \ldots, \xi_n$ and $\xi_{-n}^*, \ldots, \xi_n^*$ (the latter having a distribution conditioned on $X_1, \ldots, X_n$) on an appropriate joint probability space such that
\[ P\left( \sup_{\lambda\in[-\pi,\pi]} \big| \tilde D_n(\lambda) - D_n^*(\lambda) \big| > n^{\delta} \big( (nh)^{-1/4} + h \big) \right) = O(n^{-\gamma}) \]
holds for arbitrary $\delta > 0$.
Notice that the bootstrap procedure used to generate replicates of the $\xi_k$'s is not new. It was proposed, in a context different from the one considered here, by Hurvich and Zeger (1987). Franke and Härdle (1992) investigated asymptotic properties of a version of this procedure based on i.i.d. resampling of estimated frequency domain residuals $I_{n,X}(\lambda_k)/\hat f_{n,X}(\lambda_k)$ instead of the $\chi^2$-distributed random variables $\gamma_k^*$. See also Dahlhaus and Janas (1996) for the asymptotic properties of this procedure for different classes of periodogram based statistics.
It is worth mentioning here that, as a careful inspection of the proof of Proposition 2.1 shows, in order for the bootstrap to estimate consistently the random behavior of $(\tilde D_n(\lambda))_{\lambda\in[-\pi,\pi]}$, the random variables used to mimic the behavior of the $\xi_k$'s can be alternatively defined as
\[ \xi_k^{+} = I_{n,X}(\lambda_k)\, \gamma_k^*, \quad k = 0, \ldots, [n/2]. \]
That is, the periodogram $I_{n,X}(\lambda_k)$ can be used in place of the estimated spectral density $\hat f_{n,X}(\lambda_k)$ and the distribution of $\sum_k w_{n,k}(\lambda)\, \xi_k^* / \hat\sigma(\hat f_{n,X}(\lambda))$ can be imitated by that of $\sum_k w_{n,k}(\lambda)\, \xi_k^{+} / \hat\sigma(\hat f_{n,X}(\lambda))$. Our simulation findings suggest, however, that using the estimated spectral density $\hat f_{n,X}(\lambda_k)$ leads to better results in finite sample situations.
2.4. Main Results. We first give a lemma which provides a concentration inequal-
ity for the supremum deviation and which implies that the strong approximation
result stated in Proposition 2.1 is good enough for proving consistency of the boot-
strap method.
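Lemma 2.2. Suppose that for every $\lambda \in [-\pi, \pi]$ the weights $\{w_{n,k}(\lambda),\ k \in \mathbb{Z}\}$ are given by (2.4) or (2.5) and that Assumptions 1 to 3 are satisfied. Then
\[ P\left( \sup_{\lambda\in[-\pi,\pi]} \big\{ \big| \hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \big| \big\} \in [c, d] \right) = O\Big( (d - c) \sqrt{nh \log n} + h (\log n)^{3/2} + n^{\delta} (nh)^{-1/2} \Big) \]
holds for arbitrary $\delta > 0$.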
The following theorem is the main result of this paper. It states that the proposed
bootstrap confidence band achieves asymptotically the desired coverage probability.
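Theorem 2.3. Suppose that for every $\lambda \in [-\pi, \pi]$ the weights $\{w_{n,k}(\lambda),\ k \in \mathbb{Z}\}$ are given by (2.4) or (2.5) and that Assumptions 1 to 3 are satisfied. Then
\[ P\Big( K_h(f_X)(\lambda) \in \big[\, \hat f_{n,X}(\lambda) - t_{n,\alpha}^*\,\hat\sigma(\hat f_{n,X}(\lambda)),\ \hat f_{n,X}(\lambda) + t_{n,\alpha}^*\,\hat\sigma(\hat f_{n,X}(\lambda)) \,\big] \ \text{for all } \lambda \in [-\pi, \pi] \Big) \underset{n\to\infty}{\longrightarrow} 1 - \alpha. \]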
Remark 1. Nonparametric confidence intervals or bands directly for the function of interest still dominate the literature. As argued in the Introduction, we decided to deviate from this common practice and devised confidence bands for a smoothed version $K_h(f_X)$ of the spectral density $f_X$. Nevertheless, the approximation results derived here also allow us to establish simultaneous confidence bands for $f_X$, provided that the maximum bias, $\sup_\lambda |f_X(\lambda) - K_h(f_X)(\lambda)|$, is of negligible order $o_P((nh \log n)^{-1/2})$. This can be achieved either by choosing $h = h_n$ of smaller than mean-square-error optimal order or by an explicit subsequent bias correction. However, as discussed in the Introduction, we do not see a well-motivated rule for choosing an undersmoothing bandwidth $h$ for a given sample size $n$. The alternative of using a subsequent bias correction seems to be less problematic at first glance. However, this bias correction can only be successful if some degrees of smoothness of $f_X$ are not used by the initial estimator and are hence left for the correction step. Besides these technical difficulties, we think that these approaches are also awkward from the conceptual point of view. Both approaches require the assumption of a sufficient degree of smoothness of $f_X$, a condition that can hardly be checked by any test. (This just reflects the fact that nonparametric curve estimation is an ill-posed statistical inverse problem.) In contrast, confidence bands for $K_h(f_X)$ do not suffer at all from these difficulties.
3. Numerical Examples
3.1. Simulations. To investigate the finite sample performance of our procedure a
small simulation study has been conducted using the following two linear processes:
1. $X_t = 0.276X_{t-1} - 0.084X_{t-2} + 0.048X_{t-3} - 0.039X_{t-4} + 0.043X_{t-5} + 0.09X_{t-6} + 0.21X_{t-7} + \varepsilon_t$,
and
2. $X_t = X_{t-1} - 0.4X_{t-2} - 0.9\varepsilon_{t-1} + \varepsilon_t$.
In both generating equations (εt)t∈Z is an i.i.d. process with standard Gaussian
distributed random variables. The first, high order autoregressive (AR) process has
been used by Tomasek (1987). The second autoregressive moving-average (ARMA)
process has been chosen such that the large parameter of its moving average part
makes it difficult to approximate its spectral density by that of a low order autore-
gressive process.
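For readers who wish to reproduce such series, the two processes can be simulated with a short recursion. The sketch below uses only the Python standard library; the function names, the burn-in length and the seeds are hypothetical choices, and the spectral density function implements the standard ARMA formula σ²|θ(e^{-iλ})|²/(2π|φ(e^{-iλ})|²).

```python
import cmath
import math
import random

# Coefficients of process 1 (AR part) and process 2 (AR and MA parts).
AR7 = [0.276, -0.084, 0.048, -0.039, 0.043, 0.09, 0.21]
ARMA_AR, ARMA_MA = [1.0, -0.4], [-0.9]


def simulate_arma(ar, ma, n, burn_in=500, rng=None):
    """Simulate X_t = sum_i ar[i] X_{t-i} + eps_t + sum_i ma[i] eps_{t-i}."""
    rng = rng or random.Random(0)
    total = n + burn_in
    eps = [rng.gauss(0.0, 1.0) for _ in range(total)]
    x = [0.0] * total
    for t in range(total):
        ar_part = sum(a * x[t - i] for i, a in enumerate(ar, start=1) if t - i >= 0)
        ma_part = sum(b * eps[t - i] for i, b in enumerate(ma, start=1) if t - i >= 0)
        x[t] = ar_part + ma_part + eps[t]
    return x[burn_in:]  # drop the burn-in so that start-up effects are negligible


def arma_spectral_density(ar, ma, lam, sigma2=1.0):
    """Spectral density of a causal ARMA process with innovation variance sigma2."""
    z = cmath.exp(-1j * lam)
    theta = 1.0 + sum(b * z ** i for i, b in enumerate(ma, start=1))
    phi = 1.0 - sum(a * z ** i for i, a in enumerate(ar, start=1))
    return sigma2 / (2.0 * math.pi) * abs(theta) ** 2 / abs(phi) ** 2


x1 = simulate_arma(AR7, [], 256, rng=random.Random(1))
x2 = simulate_arma(ARMA_AR, ARMA_MA, 256, rng=random.Random(2))
f2_at_zero = arma_spectral_density(ARMA_AR, ARMA_MA, 0.0)
```

The true spectral density returned by `arma_spectral_density` is what a coverage study would compare the bands against, after smoothing it with the same kernel and bandwidth.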
We first investigate how well the method proposed estimates the exact confidence bands. For this, realizations of length $n = 256$ and $n = 1024$ of both processes have been considered. The estimator $\hat f_{n,X}(\lambda)$ has been obtained using the kernel weights $w_{n,k}(\lambda) = 2\pi K_h(\lambda - \lambda_k)/n$ with $K(\cdot)$ the Bartlett-Priestley kernel $K(x) = \mathbf{1}_{[-\pi,\pi]}(x)\, 3(1 - (x/\pi)^2)/(4\pi)$ and the bandwidths $h = 0.12$ for $n = 256$ and $h = 0.07$ for $n = 1024$; see the discussion below for these particular choices of the smoothing bandwidth $h$. For each process and sample size we have calculated the exact confidence bands (2.9) by using 1000 replications to get estimators of the exact percentage points $t_{n,\alpha}$ of the distribution of $\sup_\lambda |D_n(\lambda)|$ and of the standard deviation $\sigma(\hat f_{n,X}(\lambda))$.
The estimated exact confidence bands have then been compared with the confidence bands obtained by using the method proposed in this paper. To get a typical series as basis for this comparison, we generated 51 independent realizations of each process and of each sample size considered, and for each realization we calculated the estimation error $n^{-1} \sum_{k=-[(n-1)/2]}^{[n/2]} (\hat f_{n,X}(\lambda_k) - f_X(\lambda_k))^2$. We then selected for our comparison the series with the median value of this error. For the series so selected, the percentage points $t_{n,\alpha}^*$ of the bootstrap confidence bands have been estimated using 1000 bootstrap replications of $\sup_\lambda |D_n^*(\lambda)|$, and the standard deviation $\hat\sigma(\hat f_{n,X}(\lambda))$ has been calculated as the square root of
\[ \hat\sigma^2(\hat f_{n,X}(\lambda)) = n^{-2} \sum_{k} K_h^2(\lambda - \lambda_k)\,(1 + \delta_k)\, \hat f_{n,X}^2(\lambda_k). \]
The results obtained are shown in Figure 1 for the AR-process and in Figure 2 for the ARMA-process, respectively.
Please insert Figure 1 and Figure 2 about here
We next investigate how well the bootstrap based confidence bands achieve the desired nominal coverage probability. Here we include in our simulation study also the moving-average process
3. $X_t = \varepsilon_t + 0.276\varepsilon_{t-1} - 0.084\varepsilon_{t-2} + 0.048\varepsilon_{t-3} - 0.039\varepsilon_{t-4} + 0.043\varepsilon_{t-5} + 0.09\varepsilon_{t-6} + 0.21\varepsilon_{t-7}$,
which has the same parameters as the autoregressive process 1. For this, the empirical coverage probabilities of the estimated bootstrap confidence bands have been calculated for different sample sizes and different choices of the smoothing bandwidth $h$. Nominal coverage probabilities of 90% and 95% have been considered.
Notice that since we estimate the spectral density nonparametrically, the choice of $h$ is crucial for our analysis. To deal with this problem we calculated the empirical coverage probabilities for three fixed values of $h$ and for a choice of $h$ based on a cross-validation criterion like the one proposed by Beltrão and Bloomfield (1987); cf. also Hurvich (1985). The three fixed values of $h$ correspond approximately to the mean value of the bandwidth selected by the aforementioned cross-validation method and to the values obtained by adding and subtracting two times its standard deviation. The empirical coverage probabilities obtained over 200 trials and 1000 bootstrap replications are summarized in Table 1.
Please insert Table 1 about here
According to the results obtained, our method to construct confidence bands works very satisfactorily in estimating accurately the exact confidence bands of interest and leads to empirical coverage probabilities that are close to the desired nominal probabilities.
3.2. A real-life data example. We apply the method proposed to construct confi-
dence bands to the egg-price data set analyzed in Fan and Yao (2003). In particular,
we demonstrate how the simultaneous confidence bands obtained using the proce-
dure proposed in this paper can be used to evaluate the fit of parametric models.
The data set considered consists of n = 1201 weekly egg prices at a German agri-
cultural market between April 1967 and May 1990. Since the data exhibit a clear
nonstationarity feature, Fan and Yao (2003, Chapter 3.6) considered the first-order
differences of the series. Using the first 300 observations, Fan and Yao (2003) pro-
posed two different models as appropriate for this data set, an ARMA(1,2) and a
MA(7) model. We re-estimated these models using the whole series of 1200 obser-
vations and evaluated their fit using the estimated simultaneous confidence bands
for the spectral density of the observed series.
In particular, Figure 3 shows the estimated spectral density (solid line) of the differenced egg-price series together with a 95% bootstrap confidence band (dotted lines) obtained using B = 1000 bootstrap replications. The nonparametric spectral density estimator has been obtained using the Bartlett-Priestley kernel and a bandwidth of h = 0.12 selected by cross-validation. Displayed in the same plots are the smoothed spectral densities of the two fitted parametric models (dashed lines); the same bandwidth and kernel have been used to smooth their theoretical spectral densities. An inspection of these plots reveals that the ARMA(1,2) model provides a better fit to the egg-price data than the MA(7) model. Moreover, the latter model should be rejected as inappropriate because it fails to capture satisfactorily the low frequency behavior of the egg-price differences.
Please Insert Figure 3 about here
4. Proofs
Before we begin with the proofs of the assertions we introduce some notation.
We will generally use γ to denote an arbitrarily large and δ to denote an arbitrarily
small positive constant. For any sequence of random variables (Yn)n∈N and any
sequences of nonnegative constants (αn)n∈N and (βn)n∈N, we write
\[ Y_n = \widetilde{O}(\alpha_n, \beta_n), \]
if there exists some $C < \infty$ such that
\[ P(|Y_n| > C\alpha_n) \le C\beta_n. \]
This notion is obviously stronger than the commonly used OP . It is quite an effec-
tive short hand in our context where we have to derive several times results of the
type that a large number of random variables is simultaneously below corresponding
threshold values, with a high probability.
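Proof of Proposition 2.1. Abbreviate $\hat\sigma(\hat f_{n,X}(\lambda))$ and $\sigma(\hat f_{n,X}(\lambda))$ by $\hat\sigma(\lambda)$ and $\sigma(\lambda)$, respectively. In view of (2.17), it suffices to construct a coupling of the underlying random variables such that the bootstrap deviation process $(\sum_k w_{n,k}(\lambda)\xi_k^*/\hat\sigma(\lambda))_{\lambda\in[-\pi,\pi]}$ is close to the process $(\sum_k w_{n,k}(\lambda)\xi_k/\hat\sigma(\lambda))_{\lambda\in[-\pi,\pi]}$ with a high probability. We have, similarly to (2.17), that
\[
\begin{aligned}
\big| E\hat f_{n,X}(\lambda) - f_X(\lambda) \big|
&\le \big| E\hat f_{n,X}(\lambda) - K_h(f_X)(\lambda) \big| + \big| K_h(f_X)(\lambda) - f_X(\lambda) \big| \\
&= O\big( (nh)^{-1} + n^{-1/2} \log n \big) + O(h^2).
\end{aligned}
\tag{4.1}
\]
Note that we obtain from (2.11) that
\[ \hat f_{n,X}(\lambda) - E\hat f_{n,X}(\lambda) = \sum_{k} w_{n,k}(\lambda) \big( \xi_k + R_n(\lambda_k) \big). \]
Now it follows from Rosenthal's inequality that, for all $p \ge 2$,
\[ E\Big| \sum_{k} w_{n,k}(\lambda)\, \xi_k \Big|^p = O((nh)^{-p/2}), \]
which implies in conjunction with (2.15) by Markov's inequality that
\[ \hat f_{n,X}(\lambda) - E\hat f_{n,X}(\lambda) = \widetilde{O}(n^{\delta}(nh)^{-1/2},\ n^{-\gamma}). \tag{4.2} \]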
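(4.1) and (4.2) imply that
\[ \max_k \big| \hat f_{n,X}^2(\lambda_k) - f_X^2(\lambda_k) \big| = \widetilde{O}(n^{\delta}(nh)^{-1/2},\ n^{-\gamma}) + O(h^2). \tag{4.3} \]
Therefore, we obtain by (2.2) that
\[
\begin{aligned}
\sup_{\lambda\in[-\pi,\pi]} |\hat\sigma(\lambda) - \sigma(\lambda)|
&\le \sup_{\lambda\in[-\pi,\pi]} \left\{ \frac{|\hat\sigma^2(\lambda) - \sigma^2(\lambda)|}{\sigma(\lambda)} \right\} \\
&= O((nh)^{1/2}) \left\{ \Big( \sum_{k} w_{n,k}^2(\lambda)\, \big| \hat f_{n,X}^2(\lambda_k) - f_X^2(\lambda_k) \big| \Big) + O(n^{-3/2}h^{-1} + n^{-1}) \right\} \\
&= \widetilde{O}(n^{\delta}(nh)^{-1},\ n^{-\gamma}) + O\big( (nh)^{-1/2}h^2 + n^{-1}h^{-1/2} + n^{-1/2}h^{1/2} \big).
\end{aligned}
\tag{4.4}
\]
Hence, we can ignore the effect of estimating the unknown standard deviation, that is, it suffices to construct such a coupling for the linear statistics
\[ \sum_{j} w_{n,j}(\lambda)\, \xi_j^* / \sigma(\lambda) \quad \text{and} \quad \sum_{j} w_{n,j}(\lambda)\, \xi_j / \sigma(\lambda). \]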
We do this in three steps. First, we replace the $\xi_j$ by normal random variables $\eta_j \sim N(0, \mathrm{var}(\xi_j))$ such that $\sup_\lambda \{|\sum_j w_{n,j}(\lambda)\xi_j - \sum_j w_{n,j}(\lambda)\eta_j|\}$ is small. Then we replace in complete analogy the $\xi_j^*$ by normal random variables $\eta_j^* \sim N(0, \mathrm{var}(\xi_j^*))$ such that $\sup_\lambda \{|\sum_j w_{n,j}(\lambda)\xi_j^* - \sum_j w_{n,j}(\lambda)\eta_j^*|\}$ is small. And finally, we construct a coupling of the $\eta_j$ with the $\eta_j^*$ such that $\sup_\lambda \{|\sum_j w_{n,j}(\lambda)\eta_j - \sum_j w_{n,j}(\lambda)\eta_j^*|\}$ is small. Gluing these three couplings together we obtain the desired result.
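We begin with the first coupling. Recall that the random variables $\xi_0, \ldots, \xi_{[n/2]}$ are independent with
\[
\xi_j \sim
\begin{cases}
f_X(\lambda_j)\,(\chi_2^2/2 - 1), & \text{if } 1 \le j < n/2, \\
f_X(\lambda_j)\,(\chi_1^2 - 1), & \text{if } j \in \{0, n/2\}.
\end{cases}
\]
Define
\[
v_j := \mathrm{var}(\xi_j) =
\begin{cases}
f_X^2(\lambda_j), & \text{if } 1 \le j < n/2, \\
2 f_X^2(\lambda_j), & \text{if } j \in \{0, n/2\}.
\end{cases}
\]
According to Corollary 4 in Sakhanenko (1991, p. 76), there exists a coupling of $\xi_0, \ldots, \xi_{[n/2]}$ with independent random variables $\eta_0, \ldots, \eta_{[n/2]}$, $\eta_j \sim N(0, v_j)$, such that, with $S_k = \sum_{0\le j\le k} \xi_j$ and $\widetilde S_k = \sum_{0\le j\le k} \eta_j$ ($0 \le k \le [n/2]$), the following inequality holds for some $C < \infty$ and arbitrary $\alpha \ge 2$:
\[ P\left( \max_{0\le k\le [n/2]} \{ |S_k - \widetilde S_k| \} > C\alpha x \right) \le \sum_{k=0}^{[n/2]} E|\xi_k|^{\alpha} / x^{\alpha} + P\left( \max_{0\le k\le [n/2]} \{ |\xi_k| \} > x \right). \]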
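Since we can majorize the right-hand side by $2\bigl(\sum_{k=0}^{[n/2]} E|\xi_k|^{\alpha}\bigr)/x^{\alpha}$ and since all moments of the $\xi_k$ are bounded, we obtain with the choice of $x = n^{\delta}$ and $\alpha = (\gamma+1)/\delta$ that
$$P\Bigl( \max_{0\le k\le [n/2]} \{|S_k - \widetilde S_k|\} > C\alpha n^{\delta} \Bigr) \le 2\,([n/2]+1) \max_{0\le k\le [n/2]} \{E|\xi_k|^{\alpha}\}\; n^{-\alpha\delta} = O(n^{-\gamma}). \tag{4.5}$$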
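The choice of $x$ and $\alpha$ can be traced through the exponents. In the short worked computation below, $C'$ is a hypothetical constant standing for twice the uniform bound on the $\alpha$-th absolute moments; it does not appear in the original argument.

```latex
\[
2\,([n/2]+1)\,\max_{0\le k\le[n/2]} E|\xi_k|^{\alpha}\; x^{-\alpha}
\;\le\; C'\, n \cdot n^{-\alpha\delta}
\;=\; C'\, n \cdot n^{-(\gamma+1)}
\;=\; C'\, n^{-\gamma},
\qquad x = n^{\delta},\quad \alpha = (\gamma+1)/\delta .
\]
```

The requirement $\alpha \ge 2$ is met whenever $\delta \le (\gamma+1)/2$, which is no restriction for small $\delta$.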
Recall that, according to the $2\pi$-periodicity and symmetry of the periodogram, $\xi_{[n/2]+k} = \xi_{n-[n/2]-k}$ ($k = 1, \ldots, n-[n/2]$) and $\xi_{-k} = \xi_k$ ($k = 1, \ldots, n$). Accordingly we set $\eta_{[n/2]+k} = \eta_{n-[n/2]-k}$ ($k = 1, \ldots, n-[n/2]$) and $\eta_{-k} = \eta_k$ ($k = 1, \ldots, n$). Now we extend the above definition of $S_k$ and $\widetilde S_k$ by setting, this time for $-n \le k \le n$,
$$S_k = \sum_{j=-n}^{k} \xi_j - \sum_{j=-n}^{-1} \xi_j \quad\text{and}\quad \widetilde S_k = \sum_{j=-n}^{k} \eta_j - \sum_{j=-n}^{-1} \eta_j.$$
Then we have, for $-n < j \le n$, that $\xi_j$ and $\eta_j$ can be recovered from these partial sum processes as $\xi_j = S_j - S_{j-1}$ and $\eta_j = \widetilde S_j - \widetilde S_{j-1}$. Note that we have, for $k = 1, \ldots, n-[n/2]$, that
$$S_{[n/2]+k} = S_{[n/2]} + \xi_{n-[n/2]-1} + \cdots + \xi_{n-[n/2]-k} = S_{[n/2]} + S_{n-[n/2]-1} - S_{n-[n/2]-k-1}$$
and, analogously,
$$\widetilde S_{[n/2]+k} = \widetilde S_{[n/2]} + \widetilde S_{n-[n/2]-1} - \widetilde S_{n-[n/2]-k-1}.$$
Furthermore, it follows, for $k = -n, \ldots, -1$, that $S_k = -\sum_{j=k+1}^{-1} \xi_j = -\sum_{j=1}^{-k-1} \xi_j = S_0 - S_{-k-1}$ and $\widetilde S_k = \widetilde S_0 - \widetilde S_{-k-1}$. Therefore, we obtain from (4.5) that
$$\max_{-n\le k\le n} \bigl\{ |S_k - \widetilde S_k| \bigr\} = O\bigl(n^{\delta},\, n^{-\gamma}\bigr). \tag{4.6}$$
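It follows from the bounded total variation of the kernel $K$ that the sequence of weights $(w_{n,j})_j$ satisfies $\sup_\lambda\{\sum_j |w_{n,j}(\lambda) - w_{n,j+1}(\lambda)|\} = O((nh)^{-1})$. Therefore, we obtain from (4.6) by summation by parts that
$$\sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\xi_j - \sum_j w_{n,j}(\lambda)\eta_j \Bigr| \le \sup_{\lambda} \sum_j |w_{n,j}(\lambda) - w_{n,j+1}(\lambda)|\, |S_j - \widetilde S_j| = O\bigl(n^{\delta}(nh)^{-1},\, n^{-\gamma}\bigr). \tag{4.7}$$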
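The step from (4.6) to (4.7) is summation by parts: a weighted sum of differences is rewritten as a sum of weight increments times partial-sum differences. The sketch below checks this identity and the resulting bound numerically. Everything in it is an illustrative assumption: the weights come from a hypothetical Epanechnikov-type kernel that vanishes at both ends, and the two sequences are simply independent draws, not the coupled variables of the proof.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 200

xi = rng.exponential(1.0, m) - 1.0   # centred, non-Gaussian, variance 1
eta = rng.normal(0.0, 1.0, m)        # Gaussian, variance 1 (no coupling here)
D = np.cumsum(xi - eta)              # D_j = S_j - S~_j, partial-sum differences

# Hypothetical smooth weights that vanish at both ends of the index range.
u = np.linspace(-1.0, 1.0, m)
w = 0.75 * (1.0 - u**2)
w /= w.sum()

direct = np.sum(w * (xi - eta))

# Abel summation: sum_j w_j (D_j - D_{j-1})
#   = w_{m-1} D_{m-1} + sum_{j < m-1} (w_j - w_{j+1}) D_j,  with D_{-1} = 0.
abel = w[-1] * D[-1] + np.sum((w[:-1] - w[1:]) * D[:-1])

# Bound: (boundary weight + total variation of the weights) * max_j |D_j|.
bound = (abs(w[-1]) + np.sum(np.abs(np.diff(w)))) * np.max(np.abs(D))

print(direct, abel, bound)
```

The total variation of smooth weights is of the order of their maximum, which is why the bound gains a factor $(nh)^{-1}$ over the size of the partial-sum differences.
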
On the bootstrap side, we proceed similarly. Let
$$v_j^* := \mathrm{var}(\xi_j^*) = \begin{cases} f_{n,X}^2(\lambda_j), & \text{if } 1 \le j < n/2, \\ 2 f_{n,X}^2(\lambda_j), & \text{if } j \in \{0, n/2\}. \end{cases}$$
Note that the $v_j^*$ can be conveniently bounded by a constant; that is, for any $\gamma < \infty$ there exists a $C_\gamma < \infty$ such that
$$P\Bigl( \max_{0\le j\le [n/2]} \{v_j^*\} > C_\gamma \Bigr) = O(n^{-\gamma}).$$
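Conditionally on the event that $\max_j\{v_j^*\} \le C_\gamma$, we can again apply Corollary 4 in Sakhanenko (1991) to show that there exist independent random variables $\eta_0^*, \ldots, \eta_{[n/2]}^*$, $\eta_j^* \sim N(0, v_j^*)$, such that, with $S_k^* = \sum_{j=0}^{k} \xi_j^*$ and $\widetilde S_k^* = \sum_{j=0}^{k} \eta_j^*$,
$$\max_{0\le k\le [n/2]} \bigl\{ |S_k^* - \widetilde S_k^*| \bigr\} = O\bigl(n^{\delta},\, n^{-\gamma}\bigr)$$
holds. Defining $\eta_{[n/2]+k}^* = \eta_{n-[n/2]-k}^*$ ($k = 1, \ldots, n-[n/2]$) and $\eta_{-k}^* = \eta_k^*$ ($k = 1, \ldots, n$) and extending the previous definition of $S_k^*$ and $\widetilde S_k^*$ by setting $S_k^* = \sum_{j=-n}^{k} \xi_j^* - \sum_{j=-n}^{-1} \xi_j^*$ and $\widetilde S_k^* = \sum_{j=-n}^{k} \eta_j^* - \sum_{j=-n}^{-1} \eta_j^*$, respectively, we obtain as in (4.6) that
$$\max_{-n\le k\le n} \bigl\{ |S_k^* - \widetilde S_k^*| \bigr\} = O\bigl(n^{\delta},\, n^{-\gamma}\bigr). \tag{4.8}$$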
This implies, analogously to (4.7), that
$$\sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\xi_j^* - \sum_j w_{n,j}(\lambda)\eta_j^* \Bigr| = O\bigl(n^{\delta}(nh)^{-1},\, n^{-\gamma}\bigr). \tag{4.9}$$
Finally, it remains to construct a coupling of the $\eta_j$ and the $\eta_j^*$ such that $\sup_\lambda\{|\sum_j w_{n,j}(\lambda)\eta_j - \sum_j w_{n,j}(\lambda)\eta_j^*|\}$ is small with high probability. In contrast to the pairs of random variables $(\xi_j, \eta_j)$ and $(\xi_j^*, \eta_j^*)$, which have different distributions but matching variances, the sequences $(\eta_j)_j$ and $(\eta_j^*)_j$ consist of random variables from a convolution-invariant family but with different variances. On the other hand, because of the bounded total variation of $K$, the weights $w_{n,j}(\lambda)$ are relatively smooth in $j$. Hence, the following coupling of $\eta = (\eta_0, \ldots, \eta_{[n/2]})'$ and $\eta^* = (\eta_0^*, \ldots, \eta_{[n/2]}^*)'$ will prove to be appropriate. We decompose $\eta$ and $\eta^*$ into $\Delta \asymp h^{-1}$ packages of respective lengths $d_j \asymp nh$, that is,
$$\eta = (\eta_{1,1}, \ldots, \eta_{1,d_1}, \ldots, \eta_{\Delta,1}, \ldots, \eta_{\Delta,d_\Delta})',$$
$$\eta^* = (\eta_{1,1}^*, \ldots, \eta_{1,d_1}^*, \ldots, \eta_{\Delta,1}^*, \ldots, \eta_{\Delta,d_\Delta}^*)'.$$
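Let $v_{j,k} = E\eta_{j,k}^2$, $v_{j,k}^* = E{\eta_{j,k}^*}^2$, and $w_{n,j,k}(\lambda) = w_{n,l}(\lambda)$ if $l$ corresponds to $(j,k)$. Furthermore, let $V_j = \sum_{k=1}^{d_j} v_{j,k}$ and $V_j^* = \sum_{k=1}^{d_j} v_{j,k}^*$ ($j = 1, \ldots, \Delta$). We define
$$t_{j,k} = \sum_{l=1}^{k} v_{j,l}, \qquad t_{j,k}^* = \sum_{l=1}^{k} v_{j,l}^*,$$
$$s_{j,k} = (j-1) + t_{j,k}/V_j, \qquad s_{j,k}^* = (j-1) + t_{j,k}^*/V_j^*.$$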
The coupling of $\eta$ and $\eta^*$ will be defined by expressing both vectors by increments of the same Wiener process. This Wiener process serves as an appropriate tool to connect the $\eta_{j,k}$ with the $\eta_{j,k}^*$ in such a way that partial sums of these with slowly changing weights are close to each other. By interpolation with independent Brownian bridges we build a Wiener process $(W(t))_{t\in[0,\Delta]}$ such that
$$\eta_{j,k} = V_j^{1/2}\, \bigl( W(s_{j,k}) - W(s_{j,k-1}) \bigr).$$
Now we define, conditioned on $X_1, \ldots, X_n$, independent random variables $\eta_{j,k}^* \sim N(0, v_{j,k}^*)$ as
$$\eta_{j,k}^* = {V_j^*}^{1/2}\, \bigl( W(s_{j,k}^*) - W(s_{j,k-1}^*) \bigr).$$
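The construction for a single package can be sketched in code. The version below is a simplification and rests on stated assumptions: it simulates the Wiener process forward on the merged grid of both sets of time points rather than interpolating given $\eta_{j,k}$ with Brownian bridges, it rescales the package to the unit interval, and the function name `couple_package` and the example variances are hypothetical.

```python
import numpy as np


def couple_package(v, v_star, rng):
    """Couple two Gaussian vectors with variances v and v_star.

    Both vectors are built from increments of one Wiener path on [0, 1],
    read off at time points s_k = t_k / V and s*_k = t*_k / V*.
    """
    v = np.asarray(v, dtype=float)
    v_star = np.asarray(v_star, dtype=float)
    V, V_star = v.sum(), v_star.sum()

    s = np.concatenate(([0.0], np.cumsum(v) / V))
    s_star = np.concatenate(([0.0], np.cumsum(v_star) / V_star))
    s[-1] = s_star[-1] = 1.0  # guard against rounding in the last point

    # One Wiener path on the sorted union of both sets of time points.
    grid = np.union1d(s, s_star)
    increments = rng.normal(0.0, np.sqrt(np.diff(grid)))
    W = np.concatenate(([0.0], np.cumsum(increments)))

    W_s = W[np.searchsorted(grid, s)]
    W_s_star = W[np.searchsorted(grid, s_star)]

    eta = np.sqrt(V) * np.diff(W_s)                  # var(eta_k)  = v_k
    eta_star = np.sqrt(V_star) * np.diff(W_s_star)   # var(eta*_k) = v*_k
    return eta, eta_star


rng = np.random.default_rng(0)
v = np.array([1.0, 2.0, 3.0])           # hypothetical variances v_{j,k}
v_star = np.array([1.2, 1.9, 3.1])      # hypothetical variances v*_{j,k}
eta, eta_star = couple_package(v, v_star, rng)
print(np.abs(eta - eta_star).max())
```

Because both vectors use the same path, their rescaled totals coincide exactly: $\sum_k \eta_{j,k}/V_j^{1/2}$ and $\sum_k \eta_{j,k}^*/{V_j^*}^{1/2}$ both equal the increment of $W$ over the whole package.
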
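Moreover, the remaining $\eta_{j,k}$ and $\eta_{j,k}^*$ are again defined according to the properties of $2\pi$-periodicity and symmetry of the periodogram, that is, $\eta_{[n/2]+k} = \eta_{n-[n/2]-k}$, $\eta_{[n/2]+k}^* = \eta_{n-[n/2]-k}^*$ ($1 \le k \le n-[n/2]$), and $\eta_{-k} = \eta_k$, $\eta_{-k}^* = \eta_k^*$ ($1 \le k \le n$).
We decompose $\sum_l w_{n,l}(\lambda)(\eta_l - \eta_l^*) = \sum_j \sum_k w_{n,j,k}(\lambda)(\eta_{j,k} - \eta_{j,k}^*)$ into a "coarse structure" term,
$$\Delta_1(\lambda) = \sum_j \bigl( V_j^{1/2} - {V_j^*}^{1/2} \bigr) \sum_k w_{n,j,k}(\lambda)\, \bigl( W(s_{j,k}^*) - W(s_{j,k-1}^*) \bigr),$$
and a "fine structure" term,
$$\Delta_2(\lambda) = \sum_j V_j^{1/2} \sum_k w_{n,j,k}(\lambda)\, \Bigl[ \bigl( W(s_{j,k}) - W(s_{j,k-1}) \bigr) - \bigl( W(s_{j,k}^*) - W(s_{j,k-1}^*) \bigr) \Bigr].$$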
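We obtain from (4.3) that
$$\max_{j,k} \{|t_{j,k} - t_{j,k}^*|\} = \max_{j,k} \Bigl| \sum_{l=1}^{k} \bigl( f_{n,X}^2(\lambda_{j,l}) - f_X^2(\lambda_{j,l}) \bigr) \bigl( 1 + I(\lambda_{j,l} \bmod \pi = 0) \bigr) \Bigr| = O\bigl(n^{\delta}(nh)^{1/2},\, n^{-\gamma}\bigr) + O(nh^3). \tag{4.10}$$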
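This yields by $V_j \asymp V_j^* \asymp nh$ (for $V_j^*$, with a probability exceeding $1 - O(n^{-\gamma})$) that
$$\max_j \bigl\{ |V_j^{1/2} - {V_j^*}^{1/2}| \bigr\} = \max_j \frac{|V_j - V_j^*|}{V_j^{1/2} + {V_j^*}^{1/2}} = O\bigl(n^{\delta} + n^{1/2}h^{5/2},\, n^{-\gamma}\bigr).$$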
Therefore, and since $\sum_k w_{n,j,k}(\lambda)\bigl(W(s_{j,k}^*) - W(s_{j,k-1}^*)\bigr)$ is normally distributed with zero mean and a variance of order $O((nh)^{-2})$, we get immediately that
$$|\Delta_1(\lambda)| = O\bigl(n^{\delta}(nh)^{-1} + (nh)^{-1/2}h^2,\, n^{-\gamma}\bigr).$$
Proving this on an appropriate sequence of increasingly fine grids, we also obtain that
$$\sup_{\lambda\in[-\pi,\pi]} \{|\Delta_1(\lambda)|\} = O\bigl(n^{\delta}[(nh)^{-1} + (nh)^{-1/2}h^2],\, n^{-\gamma}\bigr). \tag{4.11}$$
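To estimate $\Delta_2(\lambda)$, we rewrite it as
$$\begin{aligned}
\Delta_2(\lambda) &= \sum_j V_j^{1/2} \sum_k w_{n,j,k}(\lambda) \Bigl[ \int_{s_{j,k-1}}^{s_{j,k}} dW(t) - \int_{s_{j,k-1}^*}^{s_{j,k}^*} dW(t) \Bigr] \\
&= \sum_j V_j^{1/2} \int_{j-1}^{j} \bigl[ w_t(\lambda) - w_t^*(\lambda) \bigr]\, dW(t),
\end{aligned}$$
where $w_t(\lambda) = w_{n,j,k}(\lambda)$ if $t \in (s_{j,k-1}, s_{j,k}]$, and $w_t^*(\lambda) = w_{n,j,k}(\lambda)$ if $t \in (s_{j,k-1}^*, s_{j,k}^*]$.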
(Note that the integrands in the integrals above are piecewise constant, which means that the integrals can be computed as weighted sums of increments of $W$; the more general concept of stochastic integrals is not needed here.) We conclude from (4.10) that
$$|s_{j,k} - s_{j,k}^*| \le \frac{|t_{j,k} - t_{j,k}^*|}{V_j} + \frac{t_{j,k}^*}{V_j^*}\, \frac{|V_j^* - V_j|}{V_j} = O\bigl(n^{\delta}(nh)^{-1/2} + h^2,\, n^{-\gamma}\bigr).$$
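Moreover, since $K$ has finite total variation, we obtain from this relation that
$$\int \bigl( w_t(\lambda) - w_t^*(\lambda) \bigr)^2\, dt = O\bigl(n^{\delta}(nh)^{-5/2} + n^{-2},\, n^{-\gamma}\bigr), \tag{4.12}$$
which yields that
$$\Delta_2(\lambda) = O\bigl(n^{\delta}[(nh)^{-3/4} + (nh)^{-1/2}h],\, n^{-\gamma}\bigr).$$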
Proving this again on an appropriate sequence of increasingly fine grids, we conclude that
$$\sup_{\lambda\in[-\pi,\pi]} \{|\Delta_2(\lambda)|\} = O\bigl(n^{\delta}[(nh)^{-3/4} + (nh)^{-1/2}h],\, n^{-\gamma}\bigr). \tag{4.13}$$
The assertion now follows from (4.7), (4.9), (4.11) and (4.13). □
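Proof of Lemma 2.2. As in the proof of Proposition 2.1, (4.4) yields that we can again ignore the effect of estimating the unknown standard deviation of $f_{n,X}(\lambda)$ and consider $(f_{n,X}(\lambda) - K_h(f_X)(\lambda))/\sigma(\lambda)$ instead of $(f_{n,X}(\lambda) - K_h(f_X)(\lambda))/\hat\sigma(\lambda)$. Recall from (2.16) and (4.7) that
$$\sup_{\lambda\in[-\pi,\pi]} \bigl\{ |f_{n,X}(\lambda) - K_h(f_X)(\lambda)| \bigr\} = \sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\eta_j \Bigr| + O\Bigl( \frac{\log n}{\sqrt{n}} + n^{\delta}(nh)^{-1},\, n^{-\gamma} \Bigr), \tag{4.14}$$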
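where the $\eta_j$ are independent normal random variables. We approximate the latter supremum by the maximum over the Fourier frequencies. Since $\sum_j |w_{n,j}(\lambda) - w_{n,j}(\omega)| = O(|\lambda - \omega|/h)$, we obtain that
$$\begin{aligned}
&\Bigl| \sup_{\lambda\in[-\pi,\pi]} \Bigl| \sum_j w_{n,j}(\lambda)\eta_j \Bigr| - \max_{-[n/2]\le k\le [n/2]} \Bigl| \sum_j w_{n,j}(\lambda_k)\eta_j \Bigr| \Bigr| \\
&\qquad \le \max_{-[n/2]\le k\le [n/2]}\; \sup_{\lambda\in[\lambda_k - \pi/n,\, \lambda_k + \pi/n]\cap[-\pi,\pi]}\; \sum_j |w_{n,j}(\lambda) - w_{n,j}(\lambda_k)| \times \max_j \{|\eta_j|\} \\
&\qquad = O\Bigl( \frac{\sqrt{\log n}}{nh},\, n^{-\gamma} \Bigr).
\end{aligned} \tag{4.15}$$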
The random variables $\sum_j w_{n,j}(\lambda_k)\eta_j$ ($k = -[n/2], \ldots, [n/2]$) are jointly normally distributed, and it follows from (i) of Lemma 3.1 in Neumann (2001) that $\max_{-[n/2]\le k\le [n/2]}\{|\sum_j w_{n,j}(\lambda_k)\eta_j|\}$ has a density $p_n^*$ with
$$\sup_t \{p_n^*(t)\} = O\bigl( \sqrt{nh \log n} \bigr).$$
This implies that
$$P\Bigl( \max_{-[n/2]\le k\le [n/2]} \Bigl| \sum_j w_{n,j}(\lambda_k)\eta_j \Bigr| \in [c, d] \Bigr) = O\bigl( (d-c) \sqrt{nh \log n} \bigr). \tag{4.16}$$
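By (4.14), (4.15) and (4.16) we obtain, with $c_{n,\gamma} = C_\gamma \bigl( \log n/\sqrt{n} + n^{\delta}(nh)^{-1} + \sqrt{\log n}/(nh) \bigr)$, that
$$\begin{aligned}
&P\Bigl( \sup_{\lambda\in[-\pi,\pi]} \bigl\{ |f_{n,X}(\lambda) - K_h(f_X)(\lambda)| \bigr\} \in [c, d] \Bigr) \\
&\qquad \le P\Bigl( \max_{-[n/2]\le k\le [n/2]} \Bigl| \sum_j w_{n,j}(\lambda_k)\eta_j \Bigr| \in [c - c_{n,\gamma},\, d + c_{n,\gamma}] \Bigr) + O(n^{-\gamma}) \\
&\qquad = O\bigl( (d-c)\sqrt{nh \log n} + h(\log n)^{3/2} + n^{\delta} \log n\, (nh)^{-1/2} \bigr).
\end{aligned}$$
□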
Proof of Theorem 2.3. It follows from Proposition 2.1 and Lemma 2.2 that
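$$\sup_t \left| P\left( \sup_{\lambda\in[-\pi,\pi]} \frac{|f_{n,X}(\lambda) - K_h(f_X)(\lambda)|}{\sigma(f_{n,X}(\lambda))} \le t \right) - P\left( \sup_{\lambda\in[-\pi,\pi]} \frac{|\sum_j w_{n,j}(\lambda)\xi_j^*|}{\sigma(f_{n,X}(\lambda))} \le t \right) \right| = o_P(1).$$
This implies
$$P\left( \sup_{\lambda\in[-\pi,\pi]} \frac{|f_{n,X}(\lambda) - K_h(f_X)(\lambda)|}{\sigma(f_{n,X}(\lambda))} \le t \right) \Bigg|_{t = t_{n,\alpha}^*} = 1 - \alpha + o_P(1),$$
which yields that
$$P\left( \sup_{\lambda\in[-\pi,\pi]} \frac{|f_{n,X}(\lambda) - K_h(f_X)(\lambda)|}{\sigma(f_{n,X}(\lambda))} \le t_{n,\alpha}^* \right) = 1 - \alpha + o(1).$$
□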
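In practice, the critical value $t_{n,\alpha}^*$ is a quantile of the bootstrap distribution of the studentized supremum statistic, and it can be approximated by Monte Carlo. The sketch below is only an illustration under explicit assumptions: the bootstrap variables are generated as $\xi_j^* = \hat f_j\,(E_j - 1)$ with standard exponential $E_j$, the doubled variance at the frequencies $0$ and $n/2$ is ignored, the supremum is taken over a finite grid, and the function name, the weight matrix and the input spectrum are all hypothetical.

```python
import numpy as np


def bootstrap_sup_quantile(f_hat, W, alpha=0.05, n_boot=500, seed=0):
    """Empirical (1 - alpha) quantile of max_k |sum_j W[k, j] xi*_j| / sigma_hat[k].

    f_hat : spectral density estimate at m grid frequencies (hypothetical input)
    W     : (m, m) weight matrix, row k holding the weights w_{n,j}(lambda_k)
    """
    rng = np.random.default_rng(seed)
    f_hat = np.asarray(f_hat, dtype=float)

    # Plug-in standard deviation: sigma_hat_k^2 = sum_j W[k, j]^2 f_hat_j^2.
    sigma_hat = np.sqrt((W**2) @ (f_hat**2))

    # Assumed multiplier scheme: xi*_j = f_hat_j * (E_j - 1), E_j ~ Exp(1).
    xi_star = f_hat * (rng.exponential(1.0, (n_boot, f_hat.size)) - 1.0)

    # One studentized maximum per bootstrap replicate.
    stats = np.max(np.abs(xi_star @ W.T) / sigma_hat, axis=1)
    return np.quantile(stats, 1.0 - alpha)


lam = np.linspace(0.0, np.pi, 64)
f_hat = 1.0 / (1.25 - np.cos(lam))   # hypothetical AR(1)-shaped estimate
h = 0.3                              # hypothetical bandwidth
K = np.maximum(1.0 - ((lam[:, None] - lam[None, :]) / h) ** 2, 0.0)
W = K / K.sum(axis=1, keepdims=True)

t_95 = bootstrap_sup_quantile(f_hat, W, alpha=0.05)
t_90 = bootstrap_sup_quantile(f_hat, W, alpha=0.10)
print(t_90, t_95)
```

A band would then take the form estimate $\pm\, t_{n,\alpha}^* \cdot \hat\sigma(\lambda)$ at each grid frequency; using the same seed for both levels makes the two quantiles come from the same set of replicates.
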
Acknowledgment. We thank two referees for many helpful comments.
References
Beltrao, K. L. and Bloomfield, P. (1987). Determining the bandwidth of a kernel spectrum estimate. Journal of Time Series Analysis 8, 21–38.
Brillinger, D. R. (1981). Time Series. Data Analysis and Theory. New York: McGraw-Hill.
Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods, 2nd edition. New York: Springer.
Dahlhaus, R. and Janas, D. (1996). A frequency domain bootstrap for ratio statistics in time series analysis. Annals of Statistics 24, 1934–1963.
Fan, J. and Yao, Q. (2003). Nonlinear Time Series: Nonparametric and Parametric Methods. New York: Springer.
Franke, J. and Härdle, W. (1992). On bootstrapping kernel spectral estimates. Annals of Statistics 20, 121–145.
Hrafnkelsson, B. and Newton, J. H. (2000). Asymptotic simultaneous confidence bands for vector autoregressive spectra. Biometrika 87, 173–182.
Hurvich, C. M. (1985). Data-driven choice of spectrum estimation: extending the applicability of cross-validation methods. Journal of the American Statistical Association 80, 933–940.
Hurvich, C. M. and Zeger, S. L. (1987). Frequency domain bootstrap methods for time series. Technical Report 87-115. Graduate School of Business Administration. New York University.
Koslov, J. W. and Jones, R. H. (1985). A unified approach to confidence bounds for the autoregressive spectral estimator. Journal of Time Series Analysis 6, 141–151.
Müller, H. G. and Prewitt, K. (1992). Weak convergence and adaptive peak estimation for spectral densities. Annals of Statistics 20, 1329–1349.
Neumann, M. H. (2001). On robustness of model-based bootstrap schemes in nonparametric time series analysis. Statistics 35, 1–40.
Newton, J. H. and Pagano, M. (1984). Simultaneous confidence bands for autoregressive spectra. Biometrika 71, 197–202.
Priestley, M. B. (1981). Spectral Analysis and Time Series. New York: Academic Press.
Sakai, H. and Sakaguchi, F. (1990). Simultaneous confidence bands for the spectral estimate of two-channel autoregressive processes. Journal of Time Series Analysis 11, 49–56.
Sakhanenko, A. I. (1991). On the accuracy of normal approximation in the invariance principle. Siberian Advances in Mathematics 1, 58–91.
Tomasek, L. (1987). Asymptotic simultaneous confidence bands for autoregressive spectral density. Journal of Time Series Analysis 8, 469–491.
                    n = 256                  n = 512                  n = 1024
Process       h      90%     95%       h      90%     95%       h      90%     95%
AR(7)        0.14   0.840   0.920     0.10   0.885   0.940     0.09   0.920   0.955
             0.12   0.845   0.915     0.08   0.875   0.935     0.07   0.910   0.950
             0.10   0.860   0.900     0.06   0.840   0.925     0.05   0.850   0.925
             CV     0.820   0.905     CV     0.845   0.935     CV     0.885   0.930
ARMA(2,1)    0.14   0.875   0.930     0.10   0.890   0.920     0.09   0.875   0.955
             0.12   0.860   0.915     0.08   0.870   0.910     0.07   0.885   0.945
             0.10   0.860   0.905     0.06   0.835   0.895     0.05   0.840   0.920
             CV     0.885   0.925     CV     0.850   0.905     CV     0.860   0.935
MA(7)        0.14   0.885   0.955     0.10   0.865   0.900     0.09   0.865   0.930
             0.12   0.835   0.930     0.08   0.840   0.900     0.07   0.850   0.910
             0.10   0.840   0.900     0.06   0.810   0.860     0.05   0.820   0.910
             CV     0.865   0.920     CV     0.835   0.910     CV     0.860   0.925

Table 1: Empirical coverage probabilities of 90% and 95% confidence bands for different sample sizes n and smoothing bandwidths h. CV refers to the results obtained using a cross-validation criterion to select the bandwidth.
[Figure 1: two panels (top and bottom), each plotting Power against Frequency over the range 0.0 to 3.0.]

Figure 1. Simultaneous 95% confidence bands for the spectral density of the ARMA(2,1) model. The solid line in both panels is the estimated spectral density, the dashed lines refer to the estimated exact confidence bands, and the dotted lines to the bootstrap confidence bands. The top panel presents the results for n = 256 and the bottom panel for n = 1024.
[Figure 2: two panels (top and bottom), each plotting Power against Frequency over the range 0.0 to 3.0.]

Figure 2. Simultaneous 95% confidence bands for the spectral density of the AR(7) model. The solid line in both panels is the estimated spectral density, the dashed lines refer to the estimated exact confidence bands, and the dotted lines to the bootstrap confidence bands. The top panel presents the results for n = 256 and the bottom panel for n = 1024.
[Figure 3: two panels (top and bottom), each plotting Power against Frequency over the range 0.0 to 3.0.]

Figure 3. Estimated spectral density (solid line) of the differenced German egg-price data together with 95% confidence bands (dotted lines). The dashed line in the top graph is the smoothed spectral density of the fitted ARMA(1,2) model and in the bottom graph that of the fitted MA(7) model.