nrcse...whittle likelihood approximation to gaussian likelihood using periodogram: where the sum is...
TRANSCRIPT
2.1 Spatial covariances
NRCSE
Valid covariance functionsBochner’s theorem: The class ofcovariance functions is the class ofpositive definite functions C:
Why?
aij!
i! ajC(si,sj ) " 0
aij
!i
! ajC(si,sj ) = Var( ai! Z(si ))
Spectral representation
By the spectral representation any isotropiccontinuous correlation on Rd is of the form
By isotropy, the expectation depends only onthe distribution G of . Let Y be uniform on theunit sphere. Then
�
!(v) = E eiuTX"
#
$
% , v = u ,X &Rd
�
X
�
!(v) = Eeiv X Y
= E"Y (v X )
Isotropic correlation
Jv(u) is a Bessel function of the firstkind and order v.Hence
and in the case d=2
(Hankel transform)
�
!(v) = "Y (sv)0
#
$ dG(s)
�
!Y(u) =2
u
"
#
$
%
d
2&1
'd
2
"
#
$
% Jd2&1(u)
!(v) = J0 (sv)dG(s)0
"
#
The Bessel function J0
0 200 400 600 800 1000 1200 1400 1600-0.5
0
0.5
1
–0.403
The exponential correlation
A commonly used correlation functionis ρ(v) = e–v/φ. Corresponds to aGaussian process with continuous butnot differentiable sample paths.More generally, ρ(v) = c(v=0) + (1-c)e–v/φ
has a nugget c, corresponding tomeasurement error and spatialcorrelation at small distances.All isotropic correlations are a mixtureof a nugget and a continuous isotropiccorrelation.
The squared exponential
Using yields
corresponding to an underlyingGaussian field with analytic paths.This is sometimes called the Gaussiancovariance, for no really good reason.A generalization is the power(ed)exponential correlation function,
G'(x) =2x
!2e"4x2 /!2
!(v) = e" v
#( )2
!(v) = exp " v#$% &'
(( ), 0 < ( ) 2
The spherical
Corresponding variogram
!(v) =1" 1.5v + 0.5 v
#( )3
; h < #
0, otherwise
$%&
'&
!2+"2
23 t
# + (t# )3( ); 0 $ t $ #
!2+ "
2; t > #
nugget
sill range
The Matérn class
where is a modified Bessel function ofthe third kind and order κ. It corresponds toa spatial field with κ–1 continuousderivativesκ=1/2 is exponential;κ=1 is Whittle’s spatial correlation; yields squared exponential.
G'(x) =2!
"2!
x
(x2 + "#2 )1+!
!(v) =1
2"#1$(")v
%&'(
)*+
"
K"
v
%&'(
)*+
K!
! " #
Some othercovariance/variogram
families
VariogramCovarianceName
NonePower law
NoneLinear
Rationalquadratic
Wave!2 sin("t)
"t
!2 (1"
t2
1+ #t2)
!2+ "
2 (1#sin($t)
$t)
!2+
"2t2
1+ #t2
!2+ "
2t
!2+ "
2t#
RecallMethod of moments: square of all pairwisedifferences, smoothed over lag bins
Problems: Not necessarily a valid variogramNot very robust
Estimation ofvariograms
! (h) =1
N(h)(Z(si ) " Z(sj ))
2
i,j#N(h)
$
N(h) = (i, j) : h!"h
2# si ! sj # h+
"h
2
$%&
'()
! (v) = "2 (1# $(v))
A robust empiricalvariogram estimator
(Z(x)-Z(y))2 is chi-squared for GaussiandataFourth root is variance stabilizingCressie and Hawkins:
!! (h) =
1
N(h)Z(si ) " Z(sj )
12#
$%&
'()
4
0.457 +0.494
N(h)
Least squares
Minimize
Alternatives:•fourth root transformation•weighting by 1/γ2
•generalized least squares
�
!! ([ Z(si ) " Z(sj)]2" # ( si " sj ;!)( )
j
$i
$2
Maximum likelihood
Z~Nn(µ,Σ) Σ= α[ρ(si-sj;θ)] = α V(θ)Maximize
and θ maximizes the profile likelihood
�
!(µ ,! ,") = #n
2log(2$! ) #
1
2logdetV(")
+1
2!(Z# µ )TV(")#1(Z# µ )
�
ˆ µ = 1TZ / n ˆ ! = G(ˆ " ) / n G(") = (Z# ˆ µ )TV(")#1(Z# ˆ µ )
�
! * (!) = "n
2log
G2(!)
n"1
2logdetV(!)
Parana data
ml
ls
A peculiar ml fit
Some more fits
All together now...
Bayesian kriging
Instead of estimating the parameters,we put a prior distribution on them, andupdate the distribution using the data.Model:Prior:
Posterior:
(Z !) ~ N(",#2 C($) + %2I)
f(!) = f(")f(#2 )f($)f(%2 )
f(! Z = z) " f(!) f(z #)f($2 )f(%)f(&2 )d''' $2d%d&2
geoR
Prior is assigned to φ and . Thelatter assumed zero unless specified.The distributions are discretized.Default prior on mean β is flat (if notspecified, assumed constant).(Lots of different assignments arepossible)
! "
Prior/posterior of φ
Variogram estimates
meanmedianWLS
Bayes vs universal kriging
Bayes predictive mean Universal kriging
Spectral representationStationary processes
Spectral process Y has stationaryincrements
If F has a density f, it is called thespectral density.
Z(s) = exp(isT!)dY(!)Rd
"
E dY(!)2= dF(!)
Cov(Z(s1),Z(s2 )) = ei(s1-s2 )T!f(!)d!R2
"
Estimating the spectrum
For process observed on nxn grid,estimate spectrum by periodogram
Equivalent to DFT of sample covariance
In,n (!) =1
(2"n)2z(j)ei!
Tj
j#J
$
2
! =2"j
n; J = (n# 1) / 2$% &' ,...,n# (n# 1) / 2$% &'{ }
2
Properties of theperiodogram
Periodogram values at Fourierfrequencies (j,k)π/Δ are•uncorrelated•asymptotically unbiased•not consistentTo get a consistent estimate of thespectrum, smooth over nearbyfrequencies
Some commonisotropic spectra
Squared exponential
Matérn
f(!)="2
2#$exp(% !
2/ 4$)
C(r) = "2 exp(%$ r
2)
f(!) = "(#2+ !
2)$%$1
C(r) =&"(# r )%K
%(# r )
2%$1'(% + 1)#2%
A simulated process
Z(s) = gjk cos 2!js1m
+ks2n
"#$
%&'+Ujk
()*
+,-k=.15
15
/j=0
15
/
gjk = exp(! j+ 6 ! k tan(20°) )
Thetford canopy heights
39-year thinned commercialplantation of Scots pine inThetford Forest, UKDensity 1000 trees/ha36m x 120m area surveyed forcrown heightFocus on 32 x 32 subset
Spectrum of canopy heights
Whittle likelihood
Approximation to Gaussian likelihoodusing periodogram:
where the sum is over Fourier frequencies,avoiding 0, and f is the spectral densityTakes O(N logN) operations to calculateinstead of O(N3).
!(!) = logf(";!) +IN,N (")
f(";!)
#$%
&'("
)
Using non-gridded data
Consider
where
Then Y is stationary with spectraldensity
Viewing Y as a lattice process, it hasspectral density
Y(x) = !"2 h(x " s)# Z(s)ds
h(x) = 1( xi ! " / 2, i = 1,2)
fY (!) =1
"2H(!)
2fZ (!)
f! ,Y (") = H(" +
2#q
!)
2
fZq$Z2% (" +
2#q
!)
Estimation
Let where Jx is the grid square with centerx and nx is the number of sites in thesquare. Define the tapered periodogram
where . The Whittlelikelihood is approximately
Yn2(x) =
1
nxh(si ! x)Z(si )
i"Jx
#
Ig1Yn2(!) =
1
g12 (x)"
g1(x)Yn2 (x)e# ixT!
"2
g1(x) = nx / n
!Y
=n2
2!( )2
logf" ,Y (2!j / n) +Ig1,Yn2
(2!j / n)
f" ,Y (2!j / n)
#$%
&%
'(%
)%j
*
A simulated example
Estimated variogram
trueexact mleapprox mle
Thetford revisited
Features depend on spatial location
Some references
Bertil Matern; Spatial Variation. Medd. StatensSkogsforskningsinst.49: 5, Ch 2-3.(Reprinted in Springer Lecture Notes In Statistics, vol. 36)
Cressie: ch. 2.3.1, 2.4, 2.6.
P. Guttorp, M. Fuentes and P. D. Sampson (2006): Usingtransforms to analyze space-time processes.http://www.nrcse.washington.edu/pdf/trs80.pdf (toappear,SemStat 2004 proceedings, Chapman & Hall)
Banerjee, S., Carlin, B. P. and Gelfand, A. E.: HierarchicalModeling and Analysis for Spatial Data, Chapman andHall, 2004. pp. 129-135.