geostat´ıstica - leg-ufprpdf:verao.pdf · 2007. 9. 25. · geostat´ıstica 1 paulo justiniano...
TRANSCRIPT
![Page 1: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/1.jpg)
Geostatistics¹
Paulo Justiniano Ribeiro Jr
Laboratório de Estatística e Geoinformação, Universidade Federal do Paraná and ESALQ/USP (Brazil)
in collaboration with
Peter J Diggle
Lancaster University (UK) and Johns Hopkins University School of Public Health (USA)
Summer course, IME/USP, Jan 2007
¹ Modified version of the slides for the course "Model Based Geostatistics" presented by PJD & PJRJr at IBC-2006, Montreal, CA, and based on the book Diggle & Ribeiro Jr (2007), Springer.
SESSION 1
Spatial statistics – an overview
See another set of slides
SESSION 2
Introduction and motivating examples
Geostatistics
• traditionally, a self-contained methodology for spatial prediction, developed at École des Mines, Fontainebleau, France
• nowadays, that part of spatial statistics which is concerned with data obtained by spatially discrete sampling of a spatially continuous process
Motivating examples
In the following examples we should identify:
• the structure of the available data
• the underlying process
• the scientific objectives
• the nature of the response variable(s) and potential covariates
• combine elements/features for a possible statistical model
Example 1.1: Measured surface elevations
[Figure: surface elevation data, with axes X, Y and Z]
require(geoR) ; data(elevation) ; ?elevation
Potential distinction between S(x) and Y(x)
Example 1.2: Residual contamination from nuclear weapons testing
[Figure: map of sampling locations, X Coord against Y Coord; label: western area]
Example 1.3: Childhood malaria in Gambia
[Figure: map of sampling locations, W−E (kilometres) against N−S (kilometres); areas labelled Western, Central and Eastern]
Example 1.3: continued
[Figure: scatterplot of prevalence against greenness]
Correlation between prevalence and greenness of vegetation?
Example 1.4: Soil data
[Figure: two maps of sampling locations, X Coord against Y Coord]
Ca (left panel) and Mg (right panel) concentrations
Example 1.4: Continued
[Figure: scatterplot of mg020 against ca020]
Correlation between local Ca and Mg concentrations.
Example 1.4: Continued
[Figure: four panels plotting ca020 against (a) E−W coordinate, (b) N−S coordinate, (c) elevation, (d) region]
Covariate relationships for Ca concentrations.
Terminology and Notation
• data format: $(x_i, y_i)$
• $x_i$ fixed or stochastically independent of $Y_i$
• S(x) is the assumed underlying stochastic process
• model specification: $[S, Y] = [S][Y \mid S]$
Note: [·] denotes "the distribution of" the quantity inside the brackets.
Support
• $x_i$ is in principle a point, but sometimes measurements are taken on (maybe small) portions
• revisiting the examples (e.g. elevation and rongelap) we can see contrasting situations
• $S(x) = \int w(r)\, S^*(x - r)\, dr$
• smoothness of w(·) constrains allowable forms for the correlation function
• support vs data from discrete spatial variation
Multivariate responses and covariates
• $Y(x_i)$ can be a vector of observable variables
• measurements are not necessarily taken at coincident locations
• data structure $(x_i, y_i, d_i)$ can include covariates (potential explanatory variables)
• jargon: external trend and trend surface (coordinates or functions of them as covariates)
• the distinction between multivariate responses and covariates is not always sharp; pragmatically, it may depend on the objectives and/or availability of data
• revisiting examples
Sampling design
• uniform vs non-uniform
– coverage of the area
– estimation of spatial correlation
– practical constraints
• preferential vs non-preferential
– effects on inference
– marked point process
Scientific objectives
• Estimation: inference on model parameters
• Prediction: inference on the process (S(x) or some functional of it)
• Hypothesis testing (typically not a main concern)
Generalised linear models
• GLMs, and marginal and mixed models
• GLGM: generalised linear geostatistical models
• ingredients:
1. a Gaussian process S(x), the signal
2. data generating mechanism given the signal
3. relation to explanatory variables
$$h(\mu_i) = S(x_i) + \sum_{k=1}^{p} \beta_k d_k(x_i)$$
• Gaussian and other models - revisiting examples
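The relation between the signal, the explanatory variables and the conditional mean can be sketched as follows. This is only an illustration: the logit link is an assumed choice of h (it might suit a prevalence response), and the function names, signal value, coefficients and covariate values are all hypothetical.

```python
import math

def linear_predictor(signal, betas, covariates):
    """eta_i = S(x_i) + sum_k beta_k * d_k(x_i) at a single location."""
    return signal + sum(b * d for b, d in zip(betas, covariates))

def inverse_logit(eta):
    """Inverse of the (assumed) logit link h(mu) = log(mu / (1 - mu))."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical values: signal S(x_i), two coefficients, two covariates
# (an intercept term and one explanatory variable).
eta = linear_predictor(signal=0.3, betas=[-1.0, 0.02], covariates=[1.0, 45.0])
mu = inverse_logit(eta)  # conditional mean of the response given the signal
```

With a different response distribution only the link function changes; the linear predictor keeps the same form.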
Model-based Geostatistics
• the application of general principles of statistical modelling and inference to geostatistical problems
• Example: kriging as minimum mean square error prediction under Gaussian modelling assumptions
Final remarks
• course (book) structure
• statistical (pre)-requisites:
– exploratory analysis, regression, statistical modelling and inference.
– likelihood and Bayesian inference.
– computational methods, including MCMC.
• computation (geoR and geoRglm)
– R software (http://www.r-project.org)
– geoR (http://www.leg.ufpr.br/geoR) and geoRglm (http://www.leg.ufpr.br/geoRglm) packages
Some computational resources
• geoR package: http://www.leg.ufpr.br/geoR
• geoRglm package: http://www.leg.ufpr.br/geoRglm
• R-project: http://www.R-project.org
• CRAN spatial task view: http://cran.r-project.org/src/contrib/Views/Spatial.html
• AI-Geostats web-site: http://www.ai-geostats.org
SESSION 3
An overview of model based geostatistics
Aims
• an overview of a canonical geostatistical analysis
• highlighting basic concepts, model features, results to be obtained
• steps of a typical data analysis
• using the surface elevation data as a running example
Design
• what and where to address questions of scientific interest
• elevation data: map the true surface
• how many: sample size
– statistical criteria
– but typically limited by practical constraints: time, costs, operational issues, etc.
• where: design locations
– completely random vs completely regular
– different motivations, need to compromise
– opportunistic designs: concerns about preferential sampling and impact on inferences
Model formulation
• here, "unusually", before EDA (observational data)
• just a basic reference model anticipating issues for EDA
• elevation data: best guess of the true underlying surface from the available sparse data
• scientific reasoning: continuity and differentiability
• measurement process: distinction between S(x) and Y(x)
A basic reference model
Gaussian geostatistics
The model:
• $[Y, S] = [S][Y \mid S]$
• Stationary Gaussian process $\{S(x) : x \in \mathbb{R}^2\}$
· $E[S(x)] = \mu$
· $\mathrm{Cov}\{S(x), S(x')\} = \sigma^2 \rho(\|x - x'\|)$
• Mutually independent $Y_i \mid S(\cdot) \sim N(S(x_i), \tau^2)$
Equivalent to: $Y(x) = S(x) + \epsilon$, where the $\epsilon$ are mutually independent $N(0, \tau^2)$
Correlation function
• core of the spatially continuous models
• ρ(u) is positive definite (any $\sum_{i=1}^{m} a_i S(x_i)$ has a non-negative variance)
• here u ≥ 0, symmetric
• typically assuming a parametric form for ρ(·)
• The Matérn class
$$\rho(u) = \{2^{\kappa-1}\Gamma(\kappa)\}^{-1}(u/\phi)^{\kappa}K_{\kappa}(u/\phi)$$
• stationarity assumption
Some possible extensions
• transformation of the response variable (Box-Cox)
$$Y^* = \begin{cases} (Y^{\lambda} - 1)/\lambda & : \lambda \neq 0 \\ \log Y & : \lambda = 0 \end{cases}$$
• non-constant mean model (covariates or trend surface)
• more general covariance functions
• non-stationary covariance structure
word of caution: decision on one will probably affect the other
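The Box-Cox transformation listed among the extensions can be written as a short function. This is a minimal sketch; the function name and the error handling for non-positive values are illustrative choices, not part of the course material.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("the Box-Cox transform requires y > 0")
    if lam == 0:
        return math.log(y)          # limiting case lambda = 0
    return (y ** lam - 1.0) / lam   # general case lambda != 0
```

For example, `box_cox(4.0, 0.5)` is a shifted and scaled square root, and values of `lam` close to zero give results close to `math.log(y)`, which is why the two branches join continuously.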
Exploratory data analysis
• non-spatial vs spatial
• Non-spatial
– outliers
– non-normality
– arbitrary mean model: choice of potential covariates
Spatial EDA - some tools and issues
• spatial outliers
• trend surfaces (scatterplots against covariates)
• other potential spatial covariates
• GIS tools
Circle plot
[Figure: two circle plots of the data locations, X Coord against Y Coord]
Two possible visualisations: left, data values divided into quintiles; right, grey shade proportional to data, circle sizes proportional to a covariate value (elevation).
A quick exploratory display
[Figure: four panels: data locations (X Coord against Y Coord), Y Coord against data, data against X Coord, and a histogram of the data]
Residual plots
[Figure: four panels: data locations (X Coord against Y Coord), Y Coord against residuals, residuals against X Coord, and a histogram of the residuals]
Comments
• spatially varying mean vs correlation in the response variables around the mean
• keep it simple!
• likelihood based methods for model choice
Variograms
• Theoretical variogram (for constant mean), with $u = \|x_i - x_j\|$
$$2V(u) = \mathrm{Var}\{Y(x_i) - Y(x_j)\} = E\{[Y(x_i) - Y(x_j)]^2\}$$
• Empirical (semi-)variogram: $\hat{V}(u)$
• biased for non-constant mean
• higher order polynomials vs spatial correlation
• Monte Carlo envelopes for empirical variograms
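An empirical semivariogram can be computed by averaging half the squared differences between pairs of observations within distance bins. The sketch below assumes the usual method-of-moments estimator and a constant mean; the function name, the binning rule and the toy data are illustrative and do not come from the course data sets.

```python
import math

def empirical_variogram(coords, values, bin_width, n_bins):
    """Method-of-moments semivariogram: for each distance bin
    [k * bin_width, (k + 1) * bin_width), half the average squared
    difference between the pairs of observations falling in that bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            u = math.dist(coords[i], coords[j])   # distance between locations
            k = int(u // bin_width)
            if k < n_bins:
                sums[k] += 0.5 * (values[i] - values[j]) ** 2
                counts[k] += 1
    # Empty bins are reported as None rather than as zero.
    return [s / c if c else None for s, c in zip(sums, counts)], counts

# Toy data on a transect (illustrative values only).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 3.0]
v_hat, n_pairs = empirical_variogram(coords, values, bin_width=1.0, n_bins=4)
```

Returning the pair counts alongside the estimates is a deliberate choice: bins with few pairs give unstable estimates and are worth flagging before any model is fitted.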
An aside: distinction between parameter estimation and spatial prediction
• assume a set of locations $x_i : i = 1, \dots, n$ on a lattice covering the area
• interest: average level of pollution over the region
• consider the sample mean:
$$\bar{S} = n^{-1}\sum_{i=1}^{n} S_i$$
. . .
• within a parameter estimation problem
– estimator of the constant mean parameter $\mu = E[S(x)]$
– precision given by the M.S.E. $E[(\bar{S} - \mu)^2]$
– $\mathrm{Var}[\bar{S}] = n^{-2}\sum_{i=1}^{n}\sum_{j=1}^{n}\mathrm{Cov}(S_i, S_j) \ge \sigma^2/n$
• within a prediction problem
– predictor of the spatial average $S_A = |A|^{-1}\int_A S(x)\,dx$
– precision given by the M.S.E. $E[(\bar{S} - S_A)^2]$, where $S_A$ is a random variable
– precision (can even approach zero) given by
$$E[(\bar{S} - S_A)^2] = n^{-2}\sum_{i=1}^{n}\sum_{j=1}^{n}\mathrm{Cov}(S_i, S_j) + |A|^{-2}\int_A\int_A \mathrm{Cov}\{S(x), S(x')\}\,dx\,dx' - 2(n|A|)^{-1}\sum_{i=1}^{n}\int_A \mathrm{Cov}\{S(x), S(x_i)\}\,dx.$$
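The contrast between the two precisions can be checked numerically. The sketch below is one possible illustration under stated assumptions: the region is A = [0, 1] (so |A| = 1), the covariance is exponential with hypothetical parameters σ² = 1 and φ = 0.3, and the integrals over A are approximated by averages over a fine grid.

```python
import math

def cov(x1, x2, sigma2=1.0, phi=0.3):
    """Exponential covariance sigma2 * exp(-|x1 - x2| / phi) (assumed model)."""
    return sigma2 * math.exp(-abs(x1 - x2) / phi)

def mean_cov(xs, ys):
    """Average of cov(x, y) over all pairs with x in xs and y in ys."""
    return sum(cov(x, y) for x in xs for y in ys) / (len(xs) * len(ys))

n, m = 10, 200
sites = [(i + 0.5) / n for i in range(n)]   # sampling lattice on A = [0, 1]
grid = [(k + 0.5) / m for k in range(m)]    # fine grid approximating integrals over A

# Estimation problem: variance of the sample mean as an estimator of mu.
var_mean = mean_cov(sites, sites)

# Prediction problem: mean square error of the sample mean as a predictor of S_A.
mse_pred = (mean_cov(sites, sites)
            + mean_cov(grid, grid)
            - 2.0 * mean_cov(grid, sites))
```

Under these assumptions `var_mean` stays above σ²/n because the observations are positively correlated, while `mse_pred` is much smaller: the same correlation that hampers estimation of µ makes the sample mean a good predictor of the spatial average.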
Inference
• parameter estimation: likelihood based methods (other approaches are also used)
• spatial prediction: simple kriging
$$\hat{S}(x) = \mu + \sum_{i=1}^{n} w_i(x)(y_i - \mu)$$
• straightforward extension for µ(x)
• Parameter uncertainty? Usually ignored in traditional geostatistics (plug-in prediction)
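A plug-in simple kriging predictor can be sketched in a few lines. The slide does not give the weights explicitly; the code assumes the standard Gaussian-model result that the weight vector solves Var(Y) w = Cov(Y, S(x)), with an exponential correlation function, one-dimensional locations and all parameters treated as known. The helper names and the data values are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def simple_kriging(x0, xs, ys, mu, sigma2, phi, tau2):
    """Plug-in simple kriging predictor of S(x0) from data ys at locations xs."""
    def rho(u):
        return math.exp(-abs(u) / phi)           # assumed exponential correlation
    var_y = [[sigma2 * rho(a - b) + (tau2 if i == j else 0.0)
              for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    cov_s = [sigma2 * rho(x0 - a) for a in xs]
    w = solve(var_y, cov_s)                      # kriging weights w_i(x0)
    return mu + sum(wi * (yi - mu) for wi, yi in zip(w, ys))

xs, ys = [0.1, 0.4, 0.8], [5.2, 6.1, 4.7]        # illustrative data
at_datum = simple_kriging(0.4, xs, ys, mu=5.0, sigma2=1.0, phi=0.25, tau2=0.0)
far_away = simple_kriging(50.0, xs, ys, mu=5.0, sigma2=1.0, phi=0.25, tau2=0.0)
```

With no measurement error (`tau2=0.0`) the predictor reproduces the observed value at a data location, and far from all data it falls back to the mean µ.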
Summary (I): Notation
• $(Y_i, x_i) : i = 1, \dots, n$ is the basic format for geostatistical data
• $\{x_i : i = 1, \dots, n\}$ is the sampling design
• {Y(x) : x ∈ A} is the measurement process
• {S(x) : x ∈ A} is the signal process
• T = F(S) is the target for prediction
• [S, Y] = [S][Y|S] is the geostatistical model
Summary (II): A canonical geostatistical data analysis
Basic steps:
• exploratory data analysis
• model choice
• inference on the model parameters
• spatial prediction
Assumptions:
• stationarity (translation): global mean, variance and spatial correlation
• isotropy (rotation)
• Gaussianity
— demo 01 —
Summary (III): Core Geostatistical Problems
Design
• how many locations?
• how many measurements?
• spatial layout of the locations?
• what to measure at each location?
Modelling
• probability model for the signal, [S]
• conditional probability model for the measurements, [Y|S]
Estimation
• assign values to unknown model parameters
• make inferences about (functions of) model parameters
Prediction
• evaluate [T|Y], the conditional distribution of the target given the data
SESSION 4
Linear (Gaussian) geostatistical models
Opening remarks
• Gaussian stochastic processes are widely used
• physical representation, behaviour and *tractability*
• underlying structure of many geostatistical methods
• benchmark for hierarchical models
This section focuses on the characterisation, properties and simulation of Gaussian models.
The reference model
• Equivalent model formulation for the Gaussian model
$$Y_i = \mu + S(x_i) + Z_i$$
• Schematic representation in 1-D:
[Figure: data plotted against locations on the interval from 0 to 1]
Covariance function
• The assumed stationary Gaussian spatial process S(x) is fully specified by:
– the mean function $\mu = E[S(x)]$
– the covariance function $\mathrm{Cov}\{S(x), S(x')\} = \sigma^2\rho(x, x')$
• A symmetric function Cov(·) is positive definite if
$$\sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j\, \mathrm{Cov}(x_i - x_j) \ge 0$$
for all $a_i \in \mathbb{R}$, $x_i \in \mathbb{R}^d$ and $n \in \mathbb{N}$.
• a function $\mathrm{Cov}(\cdot) : \mathbb{R}^d \to \mathbb{R}$ is a valid covariance function iff Cov(·) is positive definite
• geostatistics uses covariance functions to characterise spatial processes
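The positive definiteness condition can be probed numerically by evaluating the double sum for chosen coefficients. The sketch below is illustrative only: it uses one-dimensional locations, an exponential covariance as an example of a valid function, and a hypothetical "box" function that is symmetric but fails the condition.

```python
import math
import random

def quadratic_form(cov_fn, xs, a):
    """sum_i sum_j a_i a_j Cov(x_i - x_j) for 1-D locations xs."""
    return sum(ai * aj * cov_fn(xi - xj)
               for ai, xi in zip(a, xs) for aj, xj in zip(a, xs))

def exponential(u, phi=0.5):
    return math.exp(-abs(u) / phi)          # a valid covariance function

def box(u):
    return 1.0 if abs(u) < 1.0 else 0.0     # symmetric, but not positive definite

# Random coefficients never give a negative value for the exponential model.
rng = random.Random(1)
xs = [rng.uniform(0.0, 3.0) for _ in range(8)]
forms = [quadratic_form(exponential, xs, [rng.gauss(0.0, 1.0) for _ in xs])
         for _ in range(100)]
worst_exponential = min(forms)

# One set of locations and coefficients is enough to rule out the box function.
counterexample = quadratic_form(box, [0.0, 0.75, 1.5], [1.0, -1.4, 1.0])
```

A numerical check of this kind can only disprove validity, as the negative `counterexample` does; it cannot prove that a function is positive definite for all n, locations and coefficients.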
Properties
1. $\mathrm{Cov}[Z(x), Z(x + 0)] = \mathrm{Var}[Z(x)] = \mathrm{Cov}(0) \ge 0$
2. $\mathrm{Cov}(u) = \mathrm{Cov}(-u)$
3. $\mathrm{Cov}(0) \ge |\mathrm{Cov}(u)|$
4. $\mathrm{Cov}(u) = \mathrm{Cov}[Z(x), Z(x + u)] = \mathrm{Cov}[Z(0), Z(u)]$
5. If $\mathrm{Cov}_j(u)$, $j = 1, 2, \dots, k$, are valid covariance functions, then $\sum_{j=1}^{k} b_j \mathrm{Cov}_j(u)$ is valid for $b_j \ge 0\ \forall j$
6. If $\mathrm{Cov}_j(u)$, $j = 1, 2, \dots, k$, are valid covariance functions, then $\prod_{j=1}^{k} \mathrm{Cov}_j(u)$ is valid
7. If Cov(u) is valid in $\mathbb{R}^d$, then it is also valid in $\mathbb{R}^p$, $p < d$
Smoothness
• A formal description of the smoothness of a spatial surface S(x) is its degree of differentiability.
• A process S(x) is mean-square continuous if, for all x,
$$E[\{S(x + u) - S(x)\}^2] \to 0 \text{ as } u \to 0$$
• S(x) is mean-square differentiable if there exists a process S′(x) such that, for all x,
$$E\left[\left\{\frac{S(x + u) - S(x)}{u} - S'(x)\right\}^2\right] \to 0 \text{ as } u \to 0$$
• the mean-square differentiability of S(x) is directly linked to the differentiability of its covariance function
• Let S(x) be a stationary Gaussian process with correlation function $\rho(u) : u \in \mathbb{R}$. Then:
– S(x) is mean-square continuous iff ρ(u) is continuous at u = 0;
– S(x) is k times mean-square differentiable iff ρ(u) is (at least) 2k times differentiable at u = 0.
Spectral representation
Bochner's theorem (iff):
$$\mathrm{Cov}(u) = \int_{-\infty}^{+\infty} \exp\{iwu\}\, s(w)\, dw$$
• s(w) is the spectral density function
• Cov(u) and s(w) form a Fourier pair (the latter can be expressed as a function of the former)
• provides an alternative way to estimate the covariance structure from the data, using the periodogram as an estimate of s(w)
• provides ways of testing the validity of covariance functions and/or deriving new ones
The Matérn family of correlation functions
$$\rho(u) = \{2^{\kappa-1}\Gamma(\kappa)\}^{-1}(u/\phi)^{\kappa}K_{\kappa}(u/\phi)$$
• parameters κ > 0 (smoothness of S(x)) and φ > 0 (extent of the spatial correlation)
• $K_{\kappa}(\cdot)$ denotes the modified Bessel function of order κ
• for κ = 0.5, $\rho(u) = \exp\{-u/\phi\}$: exponential correlation function
• for κ → ∞, $\rho(u) = \exp\{-(u/\phi)^2\}$: Gaussian correlation function
• κ and φ are not orthogonal
– scale parameters φ are not comparable for different orders κ of the Matérn correlation function
– reparametrisation: $\alpha = 2\phi\sqrt{\kappa}$
– effects on parameter estimation
[Figure: left panel, Matérn correlation functions ρ(u) against u for κ = 0.5, φ = 0.25; κ = 1.5, φ = 0.16; κ = 2.5, φ = 0.13. Right panel, S(x) against x.]
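For half-integer orders the Matérn correlation reduces to an exponential times a polynomial, which avoids the Bessel function. The sketch below uses these closed forms for κ = 0.5, 1.5 and 2.5 in the parametrisation given above, and finds by bisection the distance at which the correlation falls to 0.05. The function names and the bisection settings are illustrative; the (κ, φ) pairs are the ones in the figure legend.

```python
import math

def matern(u, phi, kappa):
    """Matérn correlation for the half-integer orders with simple closed
    forms (kappa = 0.5, 1.5, 2.5), written in terms of t = u / phi."""
    t = u / phi
    if kappa == 0.5:
        poly = 1.0
    elif kappa == 1.5:
        poly = 1.0 + t
    elif kappa == 2.5:
        poly = 1.0 + t + t * t / 3.0
    else:
        raise ValueError("general kappa needs the Bessel function K_kappa")
    return poly * math.exp(-t)

def practical_range(phi, kappa, level=0.05):
    """Distance at which the correlation drops to `level`, found by bisection
    (the correlation is decreasing in u for these models)."""
    lo, hi = 0.0, 100.0 * phi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if matern(mid, phi, kappa) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# (kappa, phi) pairs taken from the figure legend.
ranges = [practical_range(phi, kappa)
          for kappa, phi in [(0.5, 0.25), (1.5, 0.16), (2.5, 0.13)]]
```

The three computed distances are close to one another (roughly 0.75), which illustrates why φ has to change with κ when correlation functions of different orders are to be compared on the same scale.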
Notes
• many other families have been proposed in the literature
• correlation functions are typically, but not necessarily, decreasing functions
• models valid in d dimensions are valid for lower but not necessarily higher dimensions. Matérn is valid in 3-D.
• Matérn models are ⌈κ⌉ − 1 times mean-square differentiable. In the example, κ = 0.5, 1.5 and 2.5 correspond to processes that are mean-square continuous, once differentiable and twice differentiable, respectively.
• Whittle (1954) proposed a special case with κ = 1
• for monotonic models, the practical range is defined as the distance at which the correlation is 0.05.
• we assume here point support for all data. For different supports (misaligned data) regularisation is needed.
Properties of the process (revisited)
• Strict stationarity
• Weak (second-order, covariance) stationarity
• isotropy
Variogram representations
• intrinsic stationarity (intrinsic random functions, Matheron, 1973)
• validity (Gneiting, Sasvari and Schlather, 2001)
Simulating from the model
• For a finite set of locations x, S(x) is multivariate Gaussian.
• A "standard" way of obtaining (unconditional) simulations of S(x) is:
– define the locations
– define values for the model parameters
– compute $\Sigma$ using the correlation function
– obtain $\Sigma^{1/2}$, e.g. by Cholesky factorization or singular value decomposition
– obtain simulations $S = \Sigma^{1/2} Z$, where Z is a vector of independent standard normal variates.
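The simulation recipe above can be sketched in a few lines of Python. This is a minimal illustration, not the course demo: it assumes 1-D locations, a zero mean and an exponential correlation function, and the helper names (`exp_corr`, `cholesky`, `simulate`) are invented for this sketch.

```python
import math
import random


def exp_corr(u, phi):
    """Exponential correlation rho(u) = exp(-u / phi) (illustrative choice)."""
    return math.exp(-u / phi)


def cholesky(a):
    """Return lower-triangular L with L L' = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def simulate(locs, sigma2, phi, rng):
    """One unconditional zero-mean realisation S = L Z at 1-D locations."""
    n = len(locs)
    cov = [[sigma2 * exp_corr(abs(locs[i] - locs[j]), phi) for j in range(n)]
           for i in range(n)]
    low = cholesky(cov)                          # plays the role of Sigma^{1/2}
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]  # independent N(0, 1) draws
    s = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    return s, low, cov


locs = [i / 20 for i in range(21)]               # 21 equally spaced points on [0, 1]
s, low, cov = simulate(locs, sigma2=1.0, phi=0.25, rng=random.Random(1))
```

The factorisation costs of the order of n³ operations, so this direct approach is only practical for a moderate number of locations.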
Simulating from the model (cont.)
• Large simulations are often needed in practice and require other methods, e.g.:
– Wood and Chan (1994) – fast Fourier transforms
– Rue and Tjelmeland (2002) – approximation by Gaussian Markov random fields; Gibbs scheme using approximated sparse (n − 1) × (n − 1) full conditionals (GMRFlib)
– Schlather (2001) – package RandomFields: implements a diversity of methods (circulant embedding, turning bands, etc.)
— demo 02 —
SESSION 5
Linear (Gaussian) geostatistical models (cont.)
Other families (I): powered exponential
$$\rho(u) = \exp\{-(u/\phi)^{\kappa}\}$$
• scale parameter φ and shape parameter κ
• non-orthogonal parameters
• 0 < κ ≤ 2
• non-differentiable for κ < 2 and infinitely differentiable for κ = 2
• asymptotic behaviour (practical range)
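As a worked illustration of the practical range for this family, setting ρ(u0) = 0.05 gives u0 = φ(−log 0.05)^(1/κ). The sketch below evaluates this for the three (κ, φ) pairs that appear in the figure legend on the following slide; the function names are hypothetical. All three pairs give a practical range close to 0.75, which is presumably why they were chosen for the comparison.

```python
import math


def powered_exp_corr(u, phi, kappa):
    """Powered exponential correlation rho(u) = exp{-(u / phi)^kappa}."""
    return math.exp(-((u / phi) ** kappa))


def practical_range(phi, kappa, level=0.05):
    """Distance u0 at which the correlation drops to `level`."""
    return phi * (-math.log(level)) ** (1.0 / kappa)


pairs = [(0.7, 0.16), (1.0, 0.25), (2.0, 0.43)]   # (kappa, phi) from the figure legend
for kappa, phi in pairs:
    u0 = practical_range(phi, kappa)
    print(kappa, phi, round(u0, 3), round(powered_exp_corr(u0, phi, kappa), 3))
```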
[Figure: powered exponential correlation functions ρ(u) against u, and simulated realisations S(x) against x, for (κ = 0.7, φ = 0.16), (κ = 1, φ = 0.25) and (κ = 2, φ = 0.43).]
Other families (II): spherical model
$$\rho(u) = \begin{cases} 1 - 1.5\,(u/\phi) + 0.5\,(u/\phi)^3 & \text{for } 0 \le u \le \phi \\ 0 & \text{for } u > \phi \end{cases}$$
• finite range φ
• non-differentiable at the origin
• only once differentiable at u = φ
• potential difficulties for MLE
• corresponds to the overlapping volume between two spheres (of diameter φ, with centres a distance u apart)
Other families (III): wave model
$$\rho(u) = (u/\phi)^{-1} \sin(u/\phi)$$
• non-monotone
• oscillatory behaviour reflected in realisations
[Figure: the wave correlation function ρ(u) against u, and a simulated realisation S(x) against x.]
The nugget effect
• discontinuity at the origin
• interpretations
– Var [Y |S]
– measurement error (Y )
– micro-scale variation (S)
– combination of both
• importance for sampling design
• measurement error and micro-scale variation are usually indistinguishable, except when there are repeated measurements at coincident locations
• impact on predictions and their variance
Spatial trends
• the term refers to a non-constant mean function µ(x)
• trend surface and covariates
• deterministic vs stochastic: interpretation of the process
• exploratory analysis: possible non-linear relations
Directional effects
• environmental conditions (wind, flow, soil formation, etc.) can induce directional effects
• non-invariant properties of the covariance function under rotation
• simplest model: geometric anisotropy
• new coordinates by rotation and stretching of the original coordinates:
$$(x_1', x_2') = (x_1, x_2) \begin{pmatrix} \cos\psi_A & -\sin\psi_A \\ \sin\psi_A & \cos\psi_A \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1/\psi_R \end{pmatrix}$$
• adds two parameters to the covariance function
• $(\psi_A, \psi_R)$: anisotropy angle and ratio parameters
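The coordinate transformation can be written out directly. The sketch below implements the matrix product exactly as displayed on the slide (row vector times rotation matrix times diag(1, 1/ψR)); the function name `to_isotropic` is made up here, and angle conventions in actual software may differ from this literal reading.

```python
import math


def to_isotropic(coords, psi_a, psi_r):
    """Apply (x1, x2) -> (x1, x2) R(psi_a) diag(1, 1/psi_r) to each location."""
    c, s = math.cos(psi_a), math.sin(psi_a)
    out = []
    for x1, x2 in coords:
        r1 = x1 * c + x2 * s       # row vector times first column (cos, sin)
        r2 = -x1 * s + x2 * c      # row vector times second column (-sin, cos)
        out.append((r1, r2 / psi_r))
    return out


# Illustrative 4 x 4 grid and anisotropy parameters (values chosen arbitrarily).
grid = [(i / 3, j / 3) for j in range(4) for i in range(4)]
new_coords = to_isotropic(grid, psi_a=math.pi / 3, psi_r=2.0)
```

An isotropic correlation function evaluated on distances between the transformed coordinates then gives a geometrically anisotropic model in the original coordinates.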
[Figure: sixteen locations, numbered 1 to 16, plotted in two panels.]
Figure: Realisations of geometrically anisotropic Gaussian spatial processes whose principal axis runs diagonally across the square region, with anisotropy parameters (π/3, 4) for the left-hand panel and (3π/4, 2) for the right-hand panel.
Non-stationary models
• Stationarity is a convenient working assumption, which can be relaxed in various ways.
– Functional relationship between mean and variance: sometimes handled by a data transformation
– Non-constant mean: replace constant µ by
$$\mu(x) = F\beta = \sum_{j=1}^{k} \beta_j f_j(x)$$
Non-stationary models (cont.)
• Non-stationary random variation:
– intrinsic variation, a weaker hypothesis than stationarity (the process has stationary increments, cf. the random walk model in time series), widely used as the default model for discrete spatial variation (Besag, York and Mollié, 1991).
– spatial deformation methods (Sampson and Guttorp, 1992) seek to achieve stationarity by complex transformations of the geographical space, x.
– spatial convolutions (Higdon 1998, 2002; Fuentes and Smith)
– low-rank models (Hastie, 1996)
– non-Euclidean distances (Rathbun, 1998)
– locally directional effects
There is a need to balance the increased flexibility of general modelling assumptions against over-modelling of sparse data, which leads to poor identifiability of model parameters.
An illustration
Figure: Globally (left) and locally (right) directional models, shown on the unit square.
Other topics
• transformed Gaussian models
• non-Gaussian (GLM) models (Gotway & Stroup, 1997; Diggle, Tawn & Moyeed, 1998)
• unconditional and conditional simulations
• decomposing the error term Z ("nugget effect"): Z = short scale variation + measurement error
• multivariate models
– second order properties
– constructions
Constructing multivariate models
One example: A common-component model
• assume independent processes $S_0^*(\cdot)$, $S_1^*(\cdot)$ and $S_2^*(\cdot)$
• define a bivariate process $S(\cdot) = \{S_1(\cdot), S_2(\cdot)\}$
• $S_j(x) = S_0^*(x) + S_j^*(x)$, j = 1, 2.
• S(·) is a valid bivariate process with covariance structure
$$\mathrm{Cov}\{S_j(x), S_{j'}(x-u)\} = \mathrm{Cov}_0(u) + I(j = j')\,\mathrm{Cov}_j(u)$$
• for different units, it requires additional scaling parameters so that $S_{0j}^*(x) = \sigma_{0j} R(x)$, where R(x) has unit variance.
More general constructions are presented by Chilès and Delfiner (1999), Gelfand, Schmidt, Banerjee & Sirmans (2004) and Schmidt & Gelfand (2003).
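The covariance structure of the common-component model follows in one step from independence of the component processes. A short derivation, assuming each $S_j^*$ is stationary with covariance function $\mathrm{Cov}_j(u)$:

```latex
\begin{aligned}
\mathrm{Cov}\{S_j(x), S_{j'}(x-u)\}
  &= \mathrm{Cov}\{S_0^*(x) + S_j^*(x),\; S_0^*(x-u) + S_{j'}^*(x-u)\} \\
  &= \mathrm{Cov}\{S_0^*(x), S_0^*(x-u)\}
     + \mathrm{Cov}\{S_j^*(x), S_{j'}^*(x-u)\} \\
  &= \mathrm{Cov}_0(u) + I(j = j')\,\mathrm{Cov}_j(u).
\end{aligned}
```

The cross terms between $S_0^*$ and $S_j^*$ vanish by independence, and the last term is non-zero only when both components are the same process.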
SESSION 6
Parameter Estimation
Opening remarks
• The canonical problem is spatial prediction of the form
$$\hat S(x) = \mu(x) + \sum_{i=1}^{n} w_i(x)\,\{y_i - \mu(x_i)\}$$
• The prediction problem can be tackled by adopting some criterion (e.g. minimising the MSPE)
$$\mathrm{MSPE}(\hat T) = E[(\hat T - T)^2], \quad \text{e.g. above } T = S(x)$$
• However, this requires knowledge about the model parameters
• infer first and second-moment properties of the process from the available data
First moment properties (trend estimation)
• The OLS estimator
$$\hat\beta = (D'D)^{-1} D'Y$$
is unbiased irrespective of the covariance structure (assuming the mean model is correct)
• A more efficient GLS estimator:
$$\hat\beta = (D'V^{-1}D)^{-1} D'V^{-1}Y$$
– unbiased
– smaller variance
– coincides with the MLE under the Gaussian model
– requires knowledge about covariance parameters
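A brief derivation may help to see why both estimators are unbiased while GLS is the more efficient. It assumes $E[Y] = D\beta$ and $\mathrm{Var}[Y] = V$ with V known; the subscripts OLS and GLS are labels added here.

```latex
\begin{aligned}
E[\hat\beta_{\mathrm{OLS}}] &= (D'D)^{-1} D' E[Y] = (D'D)^{-1} D'D\,\beta = \beta, \\
E[\hat\beta_{\mathrm{GLS}}] &= (D'V^{-1}D)^{-1} D'V^{-1} D\,\beta = \beta, \\
\mathrm{Var}[\hat\beta_{\mathrm{OLS}}] &= (D'D)^{-1} D' V D\, (D'D)^{-1}, \\
\mathrm{Var}[\hat\beta_{\mathrm{GLS}}] &= (D'V^{-1}D)^{-1}.
\end{aligned}
```

The difference $\mathrm{Var}[\hat\beta_{\mathrm{OLS}}] - \mathrm{Var}[\hat\beta_{\mathrm{GLS}}]$ is positive semi-definite, and the two variances coincide when V is proportional to the identity.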
• for a non-constant mean, OLS residuals can inform about the covariance structure
$$R = Y - D\hat\beta$$
• strategies: two stages (which can be iterated) or joint estimation
Second order properties
Under the assumed model, for $u = \|x_i - x_j\|$
• Variances and covariances:
$$\mathrm{Var}[Y(x)] = \tau^2 + \sigma^2, \qquad \mathrm{Cov}[Y(x_i), Y(x_j)] = \sigma^2 \rho(u) \quad (i \ne j)$$
• The (theoretical) variogram
$$V(x_i, x_j) = V(u) = \tfrac{1}{2}\,\mathrm{Var}\{Y(x_i) - Y(x_j)\}$$
• Under stationarity
$$V(u) = \tau^2 + \sigma^2\{1 - \rho(u)\}$$
• So the theoretical variogram is a function which summarises the second-order properties of the process
Terminology
• the nugget variance: $\tau^2$
• the sill: $\sigma^2 = \mathrm{Var}\{S(x)\}$
• the total sill: $\tau^2 + \sigma^2 = \mathrm{Var}\{Y(x)\}$
• the range: φ, such that $\rho_0(u) = \rho(u/\phi)$
• the practical range: $u_0$, such that
– $\rho(u) = 0$ for $u \ge u_0$ (finite range correlation models)
– $\rho(u_0) = 0.05$ (correlation functions approaching zero asymptotically)
– or, in terms of the variogram, $V(u_0) = \tau^2 + 0.95\sigma^2$
– this is just a practical convention!
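The terminology can be checked numerically on a simple case. The sketch below assumes an exponential correlation function and arbitrary illustrative parameter values; for that model the practical range is u0 = −φ log 0.05, roughly 3φ.

```python
import math


def exp_variogram(u, tau2, sigma2, phi):
    """V(u) = tau2 + sigma2 * {1 - rho(u)} for u > 0, with rho(u) = exp(-u / phi)."""
    return tau2 + sigma2 * (1.0 - math.exp(-u / phi))


tau2, sigma2, phi = 0.2, 0.8, 1.0                     # illustrative values only
u0 = -phi * math.log(0.05)                            # practical range: rho(u0) = 0.05
nugget = exp_variogram(1e-12, tau2, sigma2, phi)      # behaviour as u -> 0+
total_sill = exp_variogram(1e6, tau2, sigma2, phi)    # behaviour as u -> infinity
at_u0 = exp_variogram(u0, tau2, sigma2, phi)          # tau2 + 0.95 * sigma2
```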
Schematic representation
[Figure: schematic variogram V(u) against u, with the sill, the nugget and the practical range marked.]
Paradigms for parameter estimation
• Ad hoc (variogram based) methods
– compute an empirical variogram
– fit a theoretical covariance model
• Likelihood-based methods
– typically under Gaussian assumptions
– more generally needs MCMC or approximations
– optimal under stated assumptions, robustness issues
– full likelihood not feasible for large data-sets
– variations on the likelihood function (pseudo-likelihoods)
• Bayesian implementation, which combines estimation and prediction
Empirical variograms
• The theoretical variogram suggests an empirical estimate of V(u):
$$\hat V(u) = \mathrm{average}\{0.5\,[y(x_i) - y(x_j)]^2\} = \mathrm{average}\{v_{ij}\}$$
where each average is taken over all pairs $[y(x_i), y(x_j)]$ such that $\|x_i - x_j\| \approx u$
• the variogram cloud is a scatterplot of the points $(u_{ij}, v_{ij})$
• the empirical variogram is derived from the variogram cloud by averaging within bins: $u - h/2 \le u_{ij} < u + h/2$
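A minimal sketch of the binned empirical variogram follows. It is not the geoR implementation: the function name is hypothetical, bins are taken as (k−1)h < u ≤ kh with mid-points (k−0.5)h, and pairs at zero distance are left out.

```python
import math


def empirical_variogram(coords, y, h, u_max):
    """Average the ordinates v_ij = 0.5 * (y_i - y_j)^2 within distance bins.

    Returns a list of (bin mid-point, average ordinate, number of pairs).
    """
    n_bins = int(math.ceil(u_max / h))
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            u = math.dist(coords[i], coords[j])      # Euclidean distance u_ij
            if 0.0 < u <= u_max:
                k = min(int(math.ceil(u / h)) - 1, n_bins - 1)
                sums[k] += 0.5 * (y[i] - y[j]) ** 2  # variogram cloud ordinate
                counts[k] += 1
    return [((k + 0.5) * h, sums[k] / counts[k], counts[k])
            for k in range(n_bins) if counts[k] > 0]


# Illustrative data: four locations on the corners of the unit square.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y = [1.0, 2.0, 0.0, 3.0]
variogram_bins = empirical_variogram(coords, y, h=1.0, u_max=2.0)
```

The double loop visits all n(n − 1)/2 pairs, which is fine for exploratory work on moderate data sets.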
[Figure: two panels showing V(u) against u.]
• sample variogram ordinates $\hat V_k$, computed from pairs with $(k-1)h < u_{ij} < kh$
• convention: $u_k = (k - 0.5)h$ (mid-point of the interval)
• may adopt distinct bin widths $h_k$
• excludes zero from the smallest bin (deliberate)
• typically limited to distances $u < u_{\max}$
Variations on empirical variograms
• for a process with non-constant mean (covariates), replace $y(x_i)$ by residuals $r(x_i) = y(x_i) - \hat\mu(x_i)$ from a trend removal
• kernel or spline smoothers may be used; however, notice that the $\tfrac{1}{2}n(n-1)$ points are not independent
• may not be worth the trouble (bandwidth issues, etc.) considering the exploratory purposes
• a diversity of alternative estimators is available
Exploring directional effects
Difficulties with empirical variograms
• $v_{ij} \sim V(u_{ij})\,\chi^2_1$
• the $v_{ij}$ are correlated
• the variogram cloud is therefore unstable, both pointwise and in its overall shape
• binning removes the first objection to the variogram cloud, but not the second
• the empirical variogram is sensitive to mis-specification of µ(x)
Variogram model fitting
• fitting a typically non-linear variogram function (e.g. the Matérn) to the empirical variogram provides a way to estimate the model parameters.
• e.g. a weighted least squares criterion minimises
$$W(\theta) = \sum_k n_k \{\hat V_k - V(u_k; \theta)\}^2$$
where θ denotes the vector of covariance parameters and $\hat V_k$ is the average of $n_k$ variogram ordinates $v_{ij}$.
• in practice u is usually limited to a certain distance
• variations include:
– fitting models to the variogram cloud
– other estimators for the empirical variogram
– different proposals for weights
– . . .
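The weighted least squares criterion is easy to write down. The sketch below assumes an exponential variogram model and, purely for simplicity, minimises W(θ) by exhaustive search over a coarse grid rather than with a numerical optimiser; the function names and the noise-free binned values are made up for illustration.

```python
import math


def exp_variogram(u, tau2, sigma2, phi):
    """Theoretical variogram V(u; theta) with exponential correlation."""
    return tau2 + sigma2 * (1.0 - math.exp(-u / phi))


def wls_criterion(theta, bins):
    """W(theta) = sum_k n_k * (Vhat_k - V(u_k; theta))^2; bins holds (u_k, Vhat_k, n_k)."""
    return sum(n * (v - exp_variogram(u, *theta)) ** 2 for u, v, n in bins)


def fit_by_grid(bins, tau2_grid, sigma2_grid, phi_grid):
    """Return the grid point (tau2, sigma2, phi) with the smallest W(theta)."""
    candidates = ((t, s, p) for t in tau2_grid for s in sigma2_grid for p in phi_grid)
    return min(candidates, key=lambda theta: wls_criterion(theta, bins))


# Illustrative binned variogram generated from known parameters, without noise.
truth = (0.1, 1.0, 0.5)
bins = [(u, exp_variogram(u, *truth), 30) for u in (0.1, 0.3, 0.5, 0.7, 0.9, 1.1)]
grid = [i / 10 for i in range(0, 21)]
best = fit_by_grid(bins, grid, grid[1:], grid[1:])   # sigma2 and phi must be positive
```

With real, noisy empirical variograms the minimiser depends on the binning and on the weights, in line with the cautions on the following slides.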
Comments on variograms - I
• equally good fits for different ”extrapolations” at origin
[Figure: V(u) against u.]
Comments on variograms - II
• correlation between variogram points
Figure: Empirical variograms, V(u) against u, for three simulations from the same model.
Comments on variograms - III
• sensitivity to the specification of the mean
• solid smooth line: true model; dotted: empirical variogram; solid: empirical variogram from true residuals; dashed: empirical variogram from estimated residuals.
[Figure: V(u) against u.]
Computing variograms
— demo on variograms —
Parameter estimation: maximum likelihood
For the basic geostatistical model
$$Y \sim \mathrm{MVN}(\mu \mathbf{1}, \sigma^2 R + \tau^2 I)$$
$\mathbf{1}$ denotes an n-element vector of ones,
I is the n × n identity matrix,
R is the n × n matrix with (i, j)th element $\rho(u_{ij})$, where $u_{ij} = \|x_i - x_j\|$ is the Euclidean distance between $x_i$ and $x_j$.
Or, more generally, for
$$S(x_i) = \mu(x_i) + S_c(x_i)$$
$$\mu(x_i) = \sum_{j=1}^{k} f_j(x_i)\,\beta_j, \quad \text{i.e. } \mu = D\beta \text{ in vector form},$$
where the i-th row of D holds the covariate values $f_1(x_i), \ldots, f_k(x_i)$ at location $x_i$
$$Y \sim \mathrm{MVN}(D\beta, \sigma^2 R + \tau^2 I)$$
The log-likelihood function is, up to an additive constant,
$$L(\beta, \tau, \sigma, \phi, \kappa) = -0.5\{\log|\sigma^2 R + \tau^2 I| + (y - D\beta)'(\sigma^2 R + \tau^2 I)^{-1}(y - D\beta)\}.$$
• reparametrise $\nu^2 = \tau^2/\sigma^2$ and denote $\sigma^2 V = \sigma^2(R + \nu^2 I)$
• the log-likelihood function is maximised for
$$\hat\beta(V) = (D'V^{-1}D)^{-1} D'V^{-1} y$$
$$\hat\sigma^2(V) = n^{-1}(y - D\hat\beta)'V^{-1}(y - D\hat\beta)$$
• concentrated likelihood: substitute $(\beta, \sigma^2)$ by $(\hat\beta, \hat\sigma^2)$ and the maximisation reduces to
$$L_0(\nu^2, \phi, \kappa) = -0.5\{n \log \hat\sigma^2(V) + \log|R + \nu^2 I|\}$$
(again up to an additive constant).
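The concentrated log-likelihood can be evaluated with a single Cholesky factorisation of V. The sketch below is a simplified illustration under stated assumptions: constant mean (D is a column of ones), 1-D locations, exponential correlation with κ fixed, and a crude grid search over (φ, ν²) in place of a numerical optimiser. All function names are hypothetical.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L' = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def forward_solve(low, b):
    """Solve low x = b for lower-triangular low."""
    x = []
    for i, row in enumerate(low):
        x.append((b[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x


def concentrated_loglik(locs, y, phi, nu2):
    """Return (L0, beta_hat, sigma2_hat) for V = R + nu2 * I and a constant mean."""
    n = len(y)
    v = [[math.exp(-abs(locs[i] - locs[j]) / phi) + (nu2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(v)
    a = forward_solve(low, [1.0] * n)                 # L^{-1} 1
    b = forward_solve(low, y)                         # L^{-1} y
    beta = sum(p * q for p, q in zip(a, b)) / sum(p * p for p in a)   # GLS mean
    sigma2 = sum((q - beta * p) ** 2 for p, q in zip(a, b)) / n
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(n))         # log|V|
    return -0.5 * (n * math.log(sigma2) + logdet), beta, sigma2


locs = [0.0, 1.0, 2.0, 3.0]                           # illustrative data
y = [1.0, 2.0, 3.0, 4.0]
grid = [(phi, nu2) for phi in (0.5, 1.0, 2.0) for nu2 in (0.0, 0.5, 1.0)]
best = max(grid, key=lambda p: concentrated_loglik(locs, y, *p)[0])
```

Working with $L^{-1}\mathbf{1}$ and $L^{-1}y$ avoids forming $V^{-1}$ explicitly, since $\mathbf{1}'V^{-1}y = (L^{-1}\mathbf{1})'(L^{-1}y)$.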
Some technical issues
• poor quadratic approximations, unreliable Hessian matrices
• identifiability issues for more than two parameters in the correlation function
• for models such as the Matérn and powered exponential, φ and κ are not orthogonal
• for the Matérn correlation function we suggest taking κ in a discrete set {0.5, 1, 2, 3, . . . , N} ("profiling")
• another possible approach is reparametrisation, such as replacing φ by $\alpha = 2\sqrt{\kappa}\,\phi$ (Handcock and Wallis)
• stability: e.g. Zhang's comments on $\sigma^2/\phi$
• reparametrisations and asymptotics, e.g. $\theta_1 = \log(\sigma^2/\phi^{2\kappa})$ and $\theta_2 = \log(\phi^{2\kappa})$
Note: variations on the likelihood
• we strongly favour likelihood-based methods.
• examining profile likelihoods can be revealing about model identifiability and parameter uncertainty.
• restricted maximum likelihood is widely recommended, leading to less biased estimators, but is sensitive to mis-specification of the mean model. In spatial models the distinction between µ(x) and S(x) is not sharp.
• composite likelihood uses independent contributions to the likelihood function from each pair of points.
• approximate likelihoods are useful for large data-sets.
• Markov random fields can be used to approximate geostatistical models.
• . . .
Example: Surface elevation data
model with constant mean
model      µ        σ²       φ      τ²      logL
κ = 0.5    863.71   4087.6   6.12   0       −244.6
κ = 1.5    848.32   3510.1   1.2    48.16   −242.1
κ = 2.5    844.63   3206.9   0.74   70.82   −242.33
model with linear trend
model      β0       β1      β2       σ²       φ      τ²      logL
κ = 0.5    919.1    −5.58   −15.52   1731.8   2.49   0       −242.71
κ = 1.5    912.49   −4.99   −16.46   1693.1   0.81   34.9    −240.08
κ = 2.5    912.14   −4.81   −17.11   1595.1   0.54   54.72   −239.75
[Figure: two panels showing V(u) against u.]
SESSION 7
Geostatistical spatial prediction (kriging)
Prediction – general results
goal: predict the realised value of a (scalar) r.v. T, using data y, a realisation of a (vector) r.v. Y.
predictor: a predictor of T is any function of Y, $\hat T = t(Y)$
a criterion (minimum MSPE): the best predictor minimises
$$\mathrm{MSPE}(\hat T) = E[(\hat T - T)^2]$$
The MMSEP of T is $\hat T = E(T \mid Y)$
The prediction mean square error of $\hat T$ is
$$E[(\hat T - T)^2] = E_Y[\mathrm{Var}(T \mid Y)],$$
(the prediction variance is an estimate of $\mathrm{MSPE}(\hat T)$).
$E[(\hat T - T)^2] \le \mathrm{Var}(T)$, with equality if T and Y are independent random variables.
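The results above follow from conditioning on Y. A short derivation for an arbitrary predictor $\hat T = t(Y)$, using only the law of total expectation and the law of total variance:

```latex
\begin{aligned}
E[(\hat T - T)^2]
  &= E_Y\!\left[ E\{(t(Y) - T)^2 \mid Y\} \right] \\
  &= E_Y\!\left[ \mathrm{Var}(T \mid Y) \right]
     + E_Y\!\left[ \{t(Y) - E(T \mid Y)\}^2 \right], \\
\mathrm{Var}(T)
  &= E_Y\!\left[ \mathrm{Var}(T \mid Y) \right]
     + \mathrm{Var}_Y\!\left[ E(T \mid Y) \right].
\end{aligned}
```

The second term in the first identity is non-negative and vanishes when $t(Y) = E(T \mid Y)$, which gives the minimum MSPE predictor; the second identity gives the bound by $\mathrm{Var}(T)$.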
Prediction – general results (cont.)
• We call $\hat T$ the least squares predictor for T, and $\mathrm{Var}(T \mid Y)$ its prediction variance
• $\mathrm{Var}(T) - \mathrm{Var}(T \mid Y)$ measures the contribution of the data (exploiting dependence between T and Y)
• point prediction and prediction variance are summaries
• the complete answer is the distribution [T|Y] (analytically or a sample from it)
• not transformation invariant: $\hat T$ being the best predictor for T does NOT necessarily imply that $g(\hat T)$ is the best predictor for g(T).
Prediction – Linear Gaussian model
Suppose the target for prediction is T = S(x)
The MMSEP is $\hat T = E[S(x) \mid Y]$
• [S(x), Y] are jointly multivariate Gaussian, with mean vector $\mu \mathbf{1}$ and variance matrix
$$\begin{bmatrix} \sigma^2 & \sigma^2 r' \\ \sigma^2 r & \tau^2 I + \sigma^2 R \end{bmatrix}$$
where r is a vector with elements $r_i = \rho(\|x - x_i\|)$, i = 1, . . . , n.
• $\hat T = E[S(x) \mid Y] = \mu + \sigma^2 r'(\tau^2 I + \sigma^2 R)^{-1}(Y - \mu \mathbf{1})$   (1)
• $\mathrm{Var}[S(x) \mid Y] = \sigma^2 - \sigma^2 r'(\tau^2 I + \sigma^2 R)^{-1}\sigma^2 r$
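Equation (1) and the conditional variance can be computed with one linear solve. The following is a minimal sketch under stated assumptions: known parameters, 1-D locations and an exponential correlation function; the names `solve_spd` and `simple_kriging` are invented for this illustration.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L' = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_spd(a, b):
    """Solve a x = b via Cholesky: forward then backward substitution."""
    low = cholesky(a)
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def simple_kriging(x0, locs, y, mu, sigma2, tau2, phi):
    """Return (prediction, prediction variance, weights) for S(x0)."""
    n = len(locs)
    cov_y = [[sigma2 * math.exp(-abs(locs[i] - locs[j]) / phi)
              + (tau2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    c0 = [sigma2 * math.exp(-abs(x0 - xi) / phi) for xi in locs]   # sigma2 * r
    w = solve_spd(cov_y, c0)                 # weights w_i(x0)
    pred = mu + sum(wi * (yi - mu) for wi, yi in zip(w, y))
    var = sigma2 - sum(wi * ci for wi, ci in zip(w, c0))
    return pred, var, w


locs = [0.2, 0.5, 0.9]                       # illustrative data
y = [1.0, -0.5, 0.3]
pred, var, w = simple_kriging(0.6, locs, y, mu=0.0, sigma2=1.0, tau2=0.1, phi=0.3)
```

With τ² = 0 the predictor interpolates the data, and far from all data locations it reverts to µ with variance σ².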
Prediction – Linear Gaussian model (cont.)
• for the Gaussian model $\hat T$ is linear in Y, so that
$$\hat T = w_0(x) + \sum_{i=1}^{n} w_i(x) Y_i$$
• equivalent to a least squares problem: find the $w_i$ which minimise $\mathrm{MSPE}(\hat T)$ within the class of linear predictors.
• Because the conditional variance does not depend on Y, the prediction MSE is equal to the prediction variance.
• Equality of prediction MSE and prediction variance is a special property of the multivariate Gaussian distribution, not a general result.
Prediction – Linear Gaussian model (cont.)
• Construction of the surface $\hat S(x)$, where $\hat T = \hat S(x)$ is given by (1), is called simple kriging.
• Assumes known model parameters.
• This name is a reference to D.G. Krige, who pioneered the use of statistical methods in the South African mining industry (Krige, 1951).
Features of spatial prediction
The minimum mean square error predictor for S(x) is given by
$$\hat T = \hat S(x) = \mu + \sum_{i=1}^{n} w_i(x)(Y_i - \mu) = \Big\{1 - \sum_{i=1}^{n} w_i(x)\Big\}\mu + \sum_{i=1}^{n} w_i(x) Y_i$$
• shows that the predictor $\hat S(x)$ compromises between its unconditional mean µ and the observed data Y,
• the nature of the compromise depends on the target location x, the data-locations $x_i$ and the values of the model parameters,
• $w_i(x)$ are the prediction weights.
[Figure: two panels showing Y(x) against locations.]
Swiss rainfall data – trans-Gaussian model
[Figure: map with axes Coordinate X (km) and Coordinate Y (km).]
$$Y_i^* = h_\lambda(Y_i) = \begin{cases} (y_i^{\lambda} - 1)/\lambda & \text{if } \lambda \ne 0 \\ \log(y_i) & \text{if } \lambda = 0 \end{cases}$$
$$\ell(\beta, \theta, \lambda) = -\frac{1}{2}\{\log|\sigma^2 V| + (h_\lambda(y) - D\beta)'\{\sigma^2 V\}^{-1}(h_\lambda(y) - D\beta)\} + \sum_{i=1}^{n} \log\big(y_i^{\lambda - 1}\big)$$
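The Box-Cox transformation and the Jacobian term of the log-likelihood are simple to code. A small sketch with hypothetical function names; the last term of the log-likelihood is written using $\sum_i \log(y_i^{\lambda-1}) = (\lambda - 1)\sum_i \log y_i$.

```python
import math


def box_cox(y, lam):
    """h_lambda(y): (y^lambda - 1) / lambda, or log(y) when lambda = 0 (y > 0)."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def log_jacobian(ys, lam):
    """Jacobian term sum_i log(y_i^(lambda - 1)) of the transformed-data likelihood."""
    return (lam - 1.0) * sum(math.log(v) for v in ys)


rain = [4.0, 9.0, 16.0]                               # made-up positive observations
transformed = [box_cox(v, 0.5) for v in rain]         # square-root scale
jac = log_jacobian(rain, 0.5)
```

The Jacobian term is what makes log-likelihood values comparable across different values of λ, because it expresses the likelihood on the scale of the original data.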
κ     µ       σ²       φ       τ²     log L
0.5   18.36   118.82   87.97   2.48   −2464.315
1     20.13   105.06   35.79   6.92   −2462.438
2     21.36   88.58    17.73   8.72   −2464.185
[Figure: three panels of plotted points, with vertical axes from −2464.0 to −2462.5 and horizontal axes from 50 to 300, 20 to 70 and 5 to 9.]
[Figure: map panels (axes X Coord and Y Coord) and a histogram (axis Frequency).]
SESSION 8
Bayesian Inference
Bayesian BasicsBayesian inference deals with parameter uncertainty by treat-ing parameters as random variables, and expressing inferencesabout parameters in terms of their conditional distributions,given all observed data.
• model specification includes model parameters:
[Y, θ] = [θ][Y |θ]
• inference using Bayes’ Theorem:
[Y, θ] = [Y |θ][θ] = [Y ][θ|Y ]
• to derive the posterior distribution
[θ|Y ] = [Y |θ][θ]/[Y ] ∝ [Y |θ][θ]
• The prior distribution [θ] express the uncertainty aboutthe model parameters
![Page 113: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/113.jpg)
• The posterior distribution [θ|Y ] express the revised un-certainty after observing Y
• conjugacy is achieved in particular models where conve-nient choices of [θ] produces [θ|Y ] within the same family
• more generally [θ|Y ] may be an unknown and [Y ] =∫
[Y |θ][θ]dθmay need to be evaluated numerically.
• probability statements and estimates are based on theposterior density obtained through
p(θ|y) =ℓ(θ; y)π(θ)
∫
ℓ(θ; y)π(θ)dθ
are usually expressed as summary statistics (mean, me-dian, mode) and/or Bayesian credibility intervals
• credible intervals are not uniquely defined (e.g. quantilebased, highest density interval, etc)
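As a rough illustration of these points, the sketch below evaluates a posterior on a grid, approximates [Y] numerically, and computes both a quantile-based and a highest-density interval. The binomial data (7 successes in 20 trials) and the uniform prior are assumptions made for the example; the conjugate Beta posterior serves as a check.

```r
# Hypothetical data: y successes out of n Bernoulli trials
y <- 7
n <- 20

step  <- 0.001
theta <- seq(step / 2, 1 - step / 2, by = step)   # grid midpoints on (0, 1)
prior <- dbeta(theta, 1, 1)                       # uniform prior pi(theta)
lik   <- dbinom(y, n, theta)                      # likelihood l(theta; y)

marginal <- sum(lik * prior) * step               # numerical approximation to [Y]
post     <- lik * prior / marginal                # posterior density on the grid

# Summaries: posterior mean and a central quantile-based 95% interval
post_mean   <- sum(theta * post) * step
cdf         <- cumsum(post) * step
ci_quantile <- c(theta[which(cdf >= 0.025)[1]], theta[which(cdf >= 0.975)[1]])

# Highest-density 95% interval (adequate here because the posterior is unimodal)
ord    <- order(post, decreasing = TRUE)
keep   <- ord[cumsum(post[ord]) * step <= 0.95]
ci_hpd <- range(theta[keep])

# Conjugate check: a Beta(1, 1) prior gives a Beta(1 + y, 1 + n - y) posterior
exact_mean <- (1 + y) / (2 + n)
exact_ci   <- qbeta(c(0.025, 0.975), 1 + y, 1 + n - y)
```

The two intervals differ because the posterior is skewed, which is the sense in which credible intervals are not uniquely defined.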
Prediction
For Bayesian prediction, expand Bayes' theorem to include the prediction target T, so that uncertainty about the model parameters is accounted for.
• the joint model becomes
[Y, T, θ] = [Y, T|θ][θ]
• derive the predictive distribution
[T|Y] = ∫ [T, θ|Y] dθ = ∫ [T|Y, θ][θ|Y] dθ
• this can be interpreted as a weighted average of the predictions [T|Y, θ] over the possible values of θ, with weights given by [θ|Y]
• in general, as data become more abundant, [θ|Y] concentrates around a single value θ̂
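A minimal Monte Carlo sketch of the predictive integral, under assumptions chosen only for illustration: independent N(θ, σ²) observations with σ² known, a flat prior on θ (so that [θ|y] is N(ȳ, σ²/n)), and T a new observation from the same model.

```r
set.seed(11)
sigma2 <- 4                                   # known observation variance (assumed)
y <- rnorm(5, mean = 10, sd = sqrt(sigma2))   # hypothetical data
n <- length(y)
m <- 100000                                   # number of Monte Carlo draws

# Draws from [theta | y] under a flat prior: N(ybar, sigma2 / n)
theta <- rnorm(m, mean(y), sqrt(sigma2 / n))

# Bayesian predictive: draw T from [T | theta] for each posterior draw of theta
t_bayes <- rnorm(m, theta, sqrt(sigma2))

# Plug-in predictive: theta fixed at its point estimate
t_plugin <- rnorm(m, mean(y), sqrt(sigma2))

c(plugin = var(t_plugin),
  bayes = var(t_bayes),
  exact_bayes = sigma2 * (1 + 1 / n))
```

Averaging over [θ|y] inflates the predictive variance from σ² to σ²(1 + 1/n); the extra term vanishes as n grows, which is when the Bayesian and plug-in predictions agree.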

Bayesian inference for the geostatistical model
Bayesian inference for the geostatistical model extends the previous results to account for both Y and S, as specified by the adopted model.
• model specification:
[Y, S, θ] = [θ][Y, S|θ] = [θ][S|θ][Y|S, θ]
• inference using Bayes' Theorem:
[Y, S, θ] = [Y, S|θ][θ] = [Y][θ, S|Y]
• to derive the posterior distribution
[θ|Y] = ∫ [θ, S|Y] dS = ∫ [Y|S, θ][S|θ][θ] / [Y] dS
• where [Y] = ∫∫ [Y|S, θ][S|θ][θ] dS dθ is typically difficult to evaluate
• for prediction
[Y, T, S, θ] = [Y, T|S, θ][S|θ][θ]
• derive the predictive distribution
[T|Y] = ∫∫ [T, S, θ|Y] dS dθ = ∫∫ [T|Y, S, θ][S, θ|Y] dS dθ
• and exploit the conditional independence structure of the model to simplify the calculations
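One way to make the last point concrete, assuming the prediction target T depends on the data only through the signal (for example T = S(x₀) at an unsampled location x₀, an illustrative choice): given S and θ, the observations Y carry no further information about T, so [T|Y, S, θ] = [T|S, θ]. Factorising [S, θ|Y] = [S|Y, θ][θ|Y] then gives

$$
[T|Y] = \int\!\!\int [T|S, \theta]\,[S|Y, \theta]\,[\theta|Y]\; dS\, d\theta .
$$

Read from right to left, this suggests a sampling scheme: draw θ from [θ|Y], then S from [S|Y, θ], then T from [T|S, θ].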

Notes I
• the likelihood function occupies a central role in both classical and Bayesian inference
• plug-in prediction corresponds to inferences about [T|Y, θ̂], with the estimate θ̂ treated as if it were the true value
• Bayesian prediction is a weighted average of plug-in predictions, with different plug-in values of θ weighted according to their conditional probabilities given the observed data
• Bayesian prediction is usually more cautious than plug-in prediction: allowance for parameter uncertainty usually results in wider prediction intervals

Notes II
1. The need to evaluate the integral which defines [Y] represented a major obstacle to practical application.
2. The development of Markov chain Monte Carlo (MCMC) methods has transformed the situation.
3. BUT, for geostatistical problems, reliable implementation of MCMC is not straightforward: geostatistical models do not have the natural Markovian structure that the algorithms need to work well.
4. In particular, for the Gaussian model other algorithms can be implemented.

Results for the Gaussian models - I
• fixing covariance parameters and assuming a (conjugate) prior for β
$$
\beta \sim N(m_\beta \,;\, \sigma^2 V_\beta)
$$
• the posterior is given by
$$
[\beta|Y] \sim N\big( (V_\beta^{-1} + D'R^{-1}D)^{-1}(V_\beta^{-1} m_\beta + D'R^{-1}y) \,;\; \sigma^2 (V_\beta^{-1} + D'R^{-1}D)^{-1} \big) \sim N(\tilde\beta \,;\, \sigma^2 V_{\tilde\beta})
$$
• and the predictive distribution is
$$
p(S^*|Y, \sigma^2, \phi) = \int p(S^*|Y, \beta, \sigma^2, \phi)\; p(\beta|Y, \sigma^2, \phi)\; d\beta
$$
• with mean and variance given by
$$
E[S^*|Y] = (D_0 - r'V^{-1}D)(V_\beta^{-1} + D'V^{-1}D)^{-1}V_\beta^{-1}m_\beta + \left[ r'V^{-1} + (D_0 - r'V^{-1}D)(V_\beta^{-1} + D'V^{-1}D)^{-1}D'V^{-1} \right] Y
$$
$$
\mathrm{Var}[S^*|Y] = \sigma^2 \left[ V_0 - r'V^{-1}r + (D_0 - r'V^{-1}D)(V_\beta^{-1} + D'V^{-1}D)^{-1}(D_0 - r'V^{-1}D)' \right]
$$
• the predicted mean balances between the prior and a weighted average of the data
• the predictive variance has three interpretable components: the a priori variance, the reduction due to the data and the uncertainty in the mean
• as V_β → ∞, the results can be related to REML and universal (or ordinary) kriging
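The posterior for β and its behaviour as V_β grows can be checked numerically. The sketch below uses base R only; the simulated data, the linear trend, the exponential correlation function with φ = 0.25 and the helper name `posterior_beta` are all assumptions made for the illustration.

```r
set.seed(10)
n <- 30
x <- seq(0, 1, length.out = n)                 # hypothetical 1-D locations
D <- cbind(1, x)                               # trend matrix (intercept and slope)
R <- exp(-as.matrix(dist(x)) / 0.25)           # correlation matrix for fixed phi
sigma2 <- 1.5
y <- drop(D %*% c(2, -1) + t(chol(sigma2 * R)) %*% rnorm(n))

# Posterior mean and variance of beta for the prior N(m_beta, sigma2 * V_beta)
posterior_beta <- function(m_beta, V_beta) {
  Ri     <- solve(R)
  V_post <- solve(solve(V_beta) + t(D) %*% Ri %*% D)
  b_post <- V_post %*% (solve(V_beta) %*% m_beta + t(D) %*% Ri %*% y)
  list(mean = drop(b_post), var = sigma2 * V_post)
}

# Generalised least squares estimate, for comparison
gls <- drop(solve(t(D) %*% solve(R) %*% D, t(D) %*% solve(R) %*% y))

tight <- posterior_beta(c(0, 0), diag(1e-6, 2))   # very informative prior
vague <- posterior_beta(c(0, 0), diag(1e6, 2))    # nearly flat prior

rbind(gls = gls, vague = vague$mean, tight = tight$mean)
```

With the nearly flat prior the posterior mean is numerically indistinguishable from the generalised least squares estimate, while the tight prior pulls it to the prior mean.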

Results for the Gaussian models - II
• fixing correlation parameters and assuming a (conjugate) prior
$$
[\beta, \sigma^2] \sim N\chi^2_{ScI}(m_b, V_b, n_\sigma, S^2_\sigma)
$$
given by
$$
[\beta|\sigma^2] \sim N(m_b \,;\, \sigma^2 V_b) \quad \text{and} \quad [\sigma^2] \sim \chi^2_{ScI}(n_\sigma, S^2_\sigma)
$$
• the posterior is
$$
[\beta, \sigma^2|y, \phi] \sim N\chi^2_{ScI}(\tilde\beta, V_{\tilde\beta}, n_\sigma + n, S^2)
$$
$$
\tilde\beta = V_{\tilde\beta}(V_b^{-1} m_b + D'R^{-1}y)
$$
$$
V_{\tilde\beta} = (V_b^{-1} + D'R^{-1}D)^{-1}
$$
$$
S^2 = \frac{n_\sigma S^2_\sigma + m_b' V_b^{-1} m_b + y'R^{-1}y - \tilde\beta' V_{\tilde\beta}^{-1} \tilde\beta}{n_\sigma + n}
$$
• the predictive distribution is
$$
[S^*|y] \sim t_{n_\sigma + n}(\mu^*, S^2 \Sigma^*)
$$
• with mean and variance given by
$$
E[S^*|y] = \mu^*, \qquad \mathrm{Var}[S^*|y] = \frac{n_\sigma + n}{n_\sigma + n - 2}\, S^2 \Sigma^*,
$$
$$
\mu^* = (D^* - r'V^{-1}D) V_{\tilde\beta} V_b^{-1} m_b + \left[ r'V^{-1} + (D^* - r'V^{-1}D) V_{\tilde\beta} D'V^{-1} \right] y,
$$
$$
\Sigma^* = V_0 - r'V^{-1}r + (D^* - r'V^{-1}D)(V_b^{-1} + V_{\hat\beta}^{-1})^{-1}(D^* - r'V^{-1}D)',
$$
with $V_{\hat\beta} = (D'V^{-1}D)^{-1}$.
• valid if τ² = 0
• for τ² > 0, ν² = τ²/σ² can be regarded as a correlation parameter

Results for the Gaussian models - III
Assume a prior p(β, σ², φ) ∝ (1/σ²) p(φ).
• the posterior distribution for the parameters is:
p(β, σ², φ|y) = p(β, σ²|y, φ) p(φ|y)
• where p(β, σ²|y, φ) can be obtained analytically and
$$
pr(\phi|y) \propto pr(\phi)\, |V_{\tilde\beta}|^{\frac{1}{2}}\, |R_y|^{-\frac{1}{2}}\, (S^2)^{-\frac{n-p}{2}}
$$
• analogous results hold for a more general prior:
$$
[\beta|\sigma^2, \phi] \sim N(m_b, \sigma^2 V_b) \quad \text{and} \quad [\sigma^2|\phi] \sim \chi^2_{ScI}(n_\sigma, S^2_\sigma)
$$
• the choice of prior for φ can be critical (Berger, De Oliveira & Sansó, 2001)

Algorithm 1:
1. Discretise the distribution [φ|y], i.e. choose a range of values for φ which is sensible for the particular application, and assign a discrete uniform prior for φ on a set of values spanning the chosen range.
2. Compute the posterior probabilities on this discrete support set, defining a discrete posterior distribution with probability mass function pr(φ|y), say.
3. Sample a value of φ from the discrete distribution pr(φ|y).
4. Attach the sampled value of φ to the distribution [β, σ²|y, φ] and sample from this distribution.
5. Repeat steps (3) and (4) as many times as required; the resulting sample of triplets (β, σ², φ) is a sample from the joint posterior distribution.
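The steps of Algorithm 1 can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the geoR implementation: the data are simulated on a line, the mean is constant (p = 1), the correlation function is exponential with no nugget, and the prior is p(β, σ², φ) ∝ p(φ)/σ², for which [σ²|y, φ] is scaled inverse chi-square with n − p degrees of freedom. The helper `summaries` is hypothetical.

```r
set.seed(2007)

# Hypothetical data: constant mean, exponential correlation, no nugget
n <- 30
x <- seq(0, 1, length.out = n)
d <- as.matrix(dist(x))
D <- matrix(1, n, 1)                       # trend matrix
p <- ncol(D)
y <- drop(10 + t(chol(2 * exp(-d / 0.2))) %*% rnorm(n))

# Quantities needed for one value of phi
summaries <- function(phi) {
  U  <- chol(exp(-d / phi))                # correlation matrix R = U'U
  Ri <- chol2inv(U)                        # R^{-1}
  V  <- solve(t(D) %*% Ri %*% D)           # V_beta-tilde
  b  <- V %*% t(D) %*% Ri %*% y            # beta-tilde
  S2 <- drop(t(y) %*% Ri %*% y - t(b) %*% solve(V) %*% b) / (n - p)
  # log of |V|^(1/2) |R|^(-1/2) (S2)^(-(n - p)/2)
  log_kernel <- 0.5 * as.numeric(determinant(V)$modulus) -
    sum(log(diag(U))) - 0.5 * (n - p) * log(S2)
  list(V = V, b = drop(b), S2 = S2, log_kernel = log_kernel)
}

# Steps 1-2: discrete uniform prior on a grid, then posterior probabilities
phi_grid <- seq(0.05, 1, by = 0.05)
fits     <- lapply(phi_grid, summaries)
log_k    <- sapply(fits, function(f) f$log_kernel)
post_phi <- exp(log_k - max(log_k))
post_phi <- post_phi / sum(post_phi)

# Steps 3-5: sample phi, then (sigma2, beta) from [beta, sigma2 | y, phi]
m   <- 1000
idx <- sample(seq_along(phi_grid), m, replace = TRUE, prob = post_phi)
draws <- t(sapply(idx, function(k) {
  f      <- fits[[k]]
  sigma2 <- (n - p) * f$S2 / rchisq(1, df = n - p)   # scaled inverse chi-square
  beta   <- f$b + drop(t(chol(sigma2 * f$V)) %*% rnorm(p))
  c(beta = beta, sigma2 = sigma2, phi = phi_grid[k])
}))

head(draws)
```

Because the uniform prior on the grid is constant, it drops out of the normalised posterior probabilities; a non-uniform discrete prior would simply add its log to `log_kernel`.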

The predictive distribution is given by:
$$
\begin{aligned}
p(S^*|Y) &= \int\!\!\int\!\!\int p(S^*, \beta, \sigma^2, \phi|Y)\; d\beta\, d\sigma^2\, d\phi \\
&= \int \left[ \int\!\!\int p(s^*, \beta, \sigma^2|y, \phi)\; d\beta\, d\sigma^2 \right] pr(\phi|y)\; d\phi \\
&= \int p(S^*|Y, \phi)\; p(\phi|y)\; d\phi .
\end{aligned}
$$
Algorithm 2:
1. Discretise [φ|Y], as in Algorithm 1.
2. Compute the posterior probabilities on the discrete support set. Denote the resulting distribution pr(φ|y).
3. Sample a value of φ from pr(φ|y).
4. Attach the sampled value of φ to [s*|y, φ] and sample from it, obtaining realisations of the predictive distribution.
5. Repeat steps (3) and (4) to generate a sample from the required predictive distribution.

Notes
1. Algorithms of the same kind can be used when τ and/or κ are also treated as unknown parameters.
2. We specify a discrete prior distribution on a multi-dimensional grid of values.
3. This implies extra computational load (but no new principles).

Elevation data
[Figure: prior and posterior distributions for φ (support 0 to 5.6) and for τ²rel (support 0 to 0.9), with density on the vertical axes; two panels of densities f(y) over the ranges 870 to 910 and 860 to 930.]
[Figure: prior and posterior distributions for two parameters (supports 0 to 135 and 0 to 0.5); two maps over X Coord (0 to 300) and Y Coord (-50 to 250), with colour scales 100 to 400 and 20 to 80.]

Table 1: Swiss rainfall data: posterior means and 95% central quantile-based credible intervals for the model parameters.

| parameter | estimate | 95% interval |
|---|---|---|
| β | 144.35 | [53.08, 224.28] |
| σ² | 13662.15 | [8713.18, 27116.35] |
| φ | 49.97 | [30, 82.5] |
| ν² | 0.03 | [0, 0.05] |

[Figure: density f(A200) of A200 over roughly 0.36 to 0.44; map over X Coord (0 to 300) and Y Coord (-50 to 250) with colour scale 0.2 to 0.8.]

SESSION 9
Generalised linear geostatistical models

Generalized linear geostatistical model
• preserving the assumption of a zero-mean, stationary Gaussian process S(·),
• our basic model can be generalized by replacing the assumption that the Yi|S(·) are mutually independent N(S(xi), τ²) with the assumption that the Yi|S(·) are mutually independent within the class of generalized linear models (GLM)
• with a link function h(µi) = Σ_{j=1}^{p} dij βj + S(xi), where µi = E[Yi|S(·)]
• this defines a generalized linear mixed model (GLMM) with correlated random effects
• which provides a way to adapt classical GLMs for geostatistical applications

GLGM
• usually just a single realisation is available, in contrast with GLMMs for longitudinal data analysis
• the GLM approach is most appealing when it follows a natural sampling mechanism, such as a Poisson model for counts and logistic-linear models for binary/binomial responses
• in principle, transformed models can be considered for skewed distributions
• variograms for such processes can be obtained, although they provide a less obvious summary statistic
• empirical variograms of GLM residuals can be used for exploratory analysis

An example: a Poisson model
• [Y(xi) | S(xi)] is Poisson with density
$$
f(y_i; \zeta_i) = \exp(-\zeta_i)\, \zeta_i^{y_i} / y_i! , \qquad y_i = 0, 1, 2, \ldots
$$
• link: E[Y(xi) | S(xi)] = ζi, with h(ζi) = µ + S(xi)
• log-link: h(·) = log(·), so that ζi = exp{µ + S(xi)}
• more generally, the models can be expanded to allow for covariates and/or uncorrelated random effects
$$
h(\zeta_i) = \sum_{j=1}^{p} d_{ij}\beta_j + S(x_i) + Z_i
$$
which, unlike Gaussian models, distinguishes between the two components of the nugget effect: the Poisson variation is the analogue of measurement error, and the spatially uncorrelated component Zi accounts for the short-scale variation

[Figure: two panels on the unit square, with grey scales 5 to 15 (left) and 200 to 1000 (right).]
Simulations from the Poisson model; grey-scale shading represents the data values on a regular grid of sampling locations and contours represent the conditional expectation surface, with µ = 0.5 on the left panel and µ = 5 on the right panel.

Another example: a Binomial logistic model
• [Y(xi) | S(xi)] is Binomial with density
$$
f(y_i; \zeta_i) = \binom{n_i}{y_i}\, \zeta_i^{y_i} (1 - \zeta_i)^{n_i - y_i} , \qquad y_i = 0, 1, \ldots, n_i
$$
• logistic link: E[Y(xi) | S(xi)] = ni ζi, with ζi = exp{µi}/(1 + exp{µi})
• linear predictor: µi = µ + S(xi)
• again, this can be expanded as
$$
h(\zeta_i) = \sum_{j=1}^{p} d_{ij}\beta_j + S(x_i) + Z_i
$$
• the data are typically more informative with larger values of ni

A simulated example from a binary model
[Figure: binary data plotted against location on the unit interval (axes: locations, data).]
• in this example the binary sequence is not very informative about S(x)
• wide intervals compared to the prior mean of p(x)

Inference
• Likelihood function
$$
L(\theta) = \int_{\mathbb{R}^n} \prod_{i=1}^{n} f(y_i; h^{-1}(s_i))\, f(s|\theta)\; ds_1 \ldots ds_n
$$
• Involves a high-dimensional (numerical) integration
• MCMC algorithms can exploit the conditional independence structure of the model
[Diagram: conditional independence graph with nodes θ, β, S*, S and Y.]

Prediction with known parameters
• Simulate s(1), …, s(m) from [S|y] (using MCMC).
• Simulate s*(j) from [S*|s(j)], j = 1, …, m (multivariate Gaussian).
• Approximate E[T(S*)|y] by (1/m) Σ_{j=1}^{m} T(s*(j)).
• If possible, reduce Monte Carlo error by
– calculating E[T(S*)|s(j)] directly
– estimating E[T(S*)|y] by (1/m) Σ_{j=1}^{m} E[T(S*)|s(j)]

MCMC for conditional simulation
• Let S = D′β + Σ^{1/2}Γ, Γ ∼ N_n(0, I).
• Conditional density of [Γ|Y = y]:
f(γ|y) ∝ f(y|γ)f(γ)
Langevin-Hastings algorithm
• Proposal: γ′ from a N_n(ξ(γ), hI), where ξ(γ) = γ + (h/2)∇ log f(γ|y).
• E.g. for the Poisson-log spatial model:
∇ log f(γ|y) = −γ + (Σ^{1/2})′(y − exp(s)), where s = Σ^{1/2}γ.
• The expression generalises to other generalised linear spatial models.
• MCMC output γ(1), …, γ(m). Multiply by Σ^{1/2} to obtain s(1), …, s(m) from [S|y].
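A minimal base-R sketch of the Langevin-Hastings update for the Poisson-log model with known parameters. The locations, the exponential covariance, the constant mean `beta`, the step size `h` and the chain length are assumptions made for illustration; geoRglm's own sampler is not reproduced here. The Metropolis-Hastings correction uses the standard ratio for an asymmetric Gaussian proposal.

```r
set.seed(42)

# Hypothetical setting: 25 locations on a line, exponential covariance
n     <- 25
x     <- seq(0, 1, length.out = n)
Sigma <- exp(-as.matrix(dist(x)) / 0.2)
A     <- t(chol(Sigma))                     # A %*% t(A) = Sigma, so A plays Sigma^{1/2}
beta  <- 0.5                                # known constant mean (assumed)
y     <- rpois(n, exp(beta + drop(A %*% rnorm(n))))

# log f(gamma | y) up to a constant, and its gradient
log_target <- function(g) {
  s <- beta + drop(A %*% g)
  -0.5 * sum(g^2) + sum(y * s - exp(s))
}
grad_log_target <- function(g) {
  s <- beta + drop(A %*% g)
  -g + drop(crossprod(A, y - exp(s)))
}

h   <- 0.05                                 # proposal variance (needs tuning)
m   <- 2000
g   <- rep(0, n)
out <- matrix(NA_real_, m, n)
acc <- 0
for (j in seq_len(m)) {
  xi_g <- g + (h / 2) * grad_log_target(g)          # proposal mean xi(gamma)
  prop <- xi_g + sqrt(h) * rnorm(n)                 # gamma' ~ N(xi(gamma), h I)
  xi_p <- prop + (h / 2) * grad_log_target(prop)    # xi(gamma') for the reverse move
  log_ratio <- log_target(prop) - log_target(g) -
    sum((g - xi_p)^2) / (2 * h) + sum((prop - xi_g)^2) / (2 * h)
  if (log(runif(1)) < log_ratio) {
    g   <- prop
    acc <- acc + 1
  }
  out[j, ] <- g
}

# Transform back: each row is a draw of S = beta + Sigma^{1/2} gamma
s_samples <- beta + out %*% t(A)
acc / m                                     # acceptance rate, useful for tuning h
```

In practice an initial portion of the chain would be discarded as burn-in and `h` adjusted until the acceptance rate is reasonable.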

MCMC for Bayesian inference
Posterior:
• Update Γ from [Γ|y, β, log(σ), log(φ)] (Langevin-Hastings, described earlier)
• Update β from [β|Γ, log(σ), log(φ)] (RW-Metropolis)
• Update log(σ) from [log(σ)|Γ, β, log(φ)] (RW-Metropolis)
• Update log(φ) from [log(φ)|Γ, β, log(σ)] (RW-Metropolis)
Predictive:
• Simulate (s(j), β(j), σ²(j), φ(j)), j = 1, …, m (using MCMC)
• Simulate s*(j) from [S*|s(j), β(j), σ²(j), φ(j)], j = 1, …, m (multivariate Gaussian)

Comments
• Marginalisation w.r.t. β and σ² is possible using conjugate priors.
• A discrete prior for φ is an advantage (reduced computing time).
• Thinning avoids storing a large sample of high-dimensional quantities.
• Similar algorithms can be used for MCMC maximum likelihood estimation.
Example code from geoRglm
— demo 03 —

SESSION 10
Case studies on generalised linear geostatistical models

A simulated Poisson data set
[Figure: simulated counts displayed at a 10 × 10 grid of locations on the unit square (axes: X Coord, Y Coord). Counts as extracted, one grid row per line:]
1 0 0 3 3 4 4 0 1 1
0 0 2 5 10 3 3 2 0 0
0 1 3 7 8 9 2 2 3 0
0 0 0 0 4 2 1 1 1 3
0 4 8 5 3 2 0 6 5 12
2 4 8 9 7 1 3 3 8 16
3 6 16 6 8 2 4 4 5 8
3 3 8 5 2 4 4 1 6 4
2 1 6 4 3 2 2 2 1 2
0 1 3 3 1 0 5 0 2 4

R code for simulation
library(geoR)
## setting the seed
set.seed(371)
## defining the data locations on a grid
cp <- expand.grid(seq(0, 1, length.out = 10), seq(0, 1, length.out = 10))
## simulating from the S process
s <- grf(grid = cp, cov.pars = c(2, 0.2), cov.model = "matern",
         kappa = 1.5)
## visualising the S process
image(s, col = gray(seq(1, 0.25, length.out = 21)))
## inverse link function
lambda <- exp(0.5 + s$data)
## simulating the data
y <- rpois(length(s$data), lambda = lambda)
## visualising the data
text(cp[, 1], cp[, 2], y, cex = 1.5, font = 2)

R code for the data analysis
library(geoRglm)
set.seed(371)
## dt is assumed to be a geodata object holding the simulated counts; it is not defined on this slide
## tuning the MCMC algorithm
MCc <- mcmc.control(S.scale = 0.025, phi.scale = 0.1, n.iter = 110000,
                    burn.in = 10000, thin = 100, phi.start = 0.2)
## prior specification
PGC <- prior.glm.control(phi.prior = "exponential", phi = 0.2,
                         phi.discrete = seq(0, 2, by = 0.02), tausq.rel = 0)
## output options
OC <- output.glm.control(sim.predict = TRUE)
## choosing 2 locations for prediction
locs <- cbind(c(0.75, 0.15), c(0.25, 0.5))
## running the MCMC for the Poisson model
pkb <- pois.krige.bayes(dt, locations = locs, prior = PGC, mcmc.input = MCc, output = OC)

Summaries of the posterior for the simulated Poisson data: posterior means and 95% central quantile-based intervals.

| parameter | true value | posterior mean | 95% interval |
|---|---|---|---|
| β | 0.5 | 0.4 | [0.08, 1.58] |
| σ² | 2.0 | 1.24 | [0.8, 2.76] |
| φ | 0.2 | 0.48 | [0.3, 1.05] |

[Figure: posterior densities of β, σ² and φ, and pairwise scatterplots of the posterior samples (σ² against β, φ against β, φ against σ²).]
[Figure: density of S(x)|y, over roughly 0.0 to 0.4.]
Rongelap Island
— see other set of slides —
The Gambia malaria
— see other set of slides —

Spatial prediction in tropical disease epidemiology

African Programme for Onchocerciasis Control
• “river blindness” – an endemic disease in wet tropical regions
• donation programme of mass treatment with ivermectin
• approximately 30 million treatments to date
• serious adverse reactions experienced by some patients highly co-infected with Loa loa parasites
• precautionary measures put in place before mass treatment in areas of high Loa loa prevalence
http://www.who.int/pbd/blindness/onchocerciasis/en
The Loa loa prediction problem
Ground-truth survey data
• random sample of subjects in each of a number of villages
• blood-samples test positive/negative for Loa loa
Environmental data (satellite images)
• measured on regular grid to cover region of interest
• elevation, green-ness of vegetation
Objectives
• predict local prevalence throughout study-region (Cameroon)
• compute local exceedance probabilities,
P(prevalence > 0.2|data)

Loa loa: a generalised linear model
• Latent spatial process
S(x) ∼ SGP{0, σ²ρ(u)}
ρ(u) = exp(−|u|/φ)
• Linear predictor
d(x) = environmental variables at location x
η(x) = d(x)′β + S(x)
log[p(x)/{1 − p(x)}] = η(x)
• Error distribution
Yi|S(·) ∼ Bin{ni, p(xi)}
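The logit link and the exceedance probability used later for this model can be illustrated with a few lines of base R. The draws of η(x) below are simulated from a normal distribution purely as a stand-in for MCMC output at a single location; the mean and standard deviation are hypothetical.

```r
set.seed(3)

# Stand-in for posterior draws of the linear predictor eta(x) at one location
eta_draws <- rnorm(5000, mean = -1.2, sd = 0.6)

# Invert the logit link: p(x) = exp(eta) / (1 + exp(eta))
p_draws <- plogis(eta_draws)

# Monte Carlo estimate of P(prevalence > 0.2 | data): proportion of draws above 0.2
exceed <- mean(p_draws > 0.2)

c(mean_prevalence = mean(p_draws), exceedance = exceed)
```

Repeating the calculation at every prediction location gives a map of exceedance probabilities rather than a single number.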

Schematic representation of Loa loa model
[Figure: prevalence (0 to 1) against location (0 to 1), decomposed into environmental gradient, local fluctuation and sampling error.]

The modelling strategy
• use relationship between environmental variables and ground-truth prevalence to construct preliminary predictions via logistic regression
• use local deviations from regression model to estimate smooth residual spatial variation
• Bayesian paradigm for quantification of uncertainty in resulting model-based predictions

logit prevalence vs elevation
[Figure: scatterplot of logit prevalence (-5 to 0) against elevation (0 to 1500).]

logit prevalence vs MAX = max NDVI
[Figure: scatterplot of logit prevalence (-5 to 0) against Max Greenness (0.65 to 0.90).]

logit prevalence vs SD = std.error NDVI
[Figure: scatterplot of logit prevalence (-5 to 0) against Std Greenness (0.10 to 0.20).]

Comparing non-spatial and spatial predictions in Cameroon
Non-spatial
[Figure: observed prevalence (%) against predicted prevalence 'without ground truth data'.]
Spatial
[Figure: observed prevalence (%) against predicted prevalence 'with ground truth data' (%).]
Probabilistic prediction in Cameroon

Next Steps
• analysis confirms value of local ground-truth prevalence data
• in some areas, need more ground-truth data to reduce predictive uncertainty
• but parasitological surveys are expensive
Field-work is difficult!
![Page 165: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/165.jpg)
RAPLOA
• a cheaper alternative to parasitological sampling:
– have you ever experienced eye-worm?
– did it look like this photograph?
– did it go away within a week?
• RAPLOA data to be collected:
– in sample of villages previously surveyed parasitologically (to calibrate parasitology vs RAPLOA estimates)
– in villages not surveyed parasitologically (to reduce local uncertainty)
• bivariate model needed for combined analysis of parasitological and RAPLOA prevalence estimates
![Page 166: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/166.jpg)
UNDP/World Bank/WHO Special Programme for Research & Training in Tropical Diseases (TDR)
Report of a multi-centre study
Rapid Assessment Procedures for Loiasis
TDR/IDE/RP/RAPL/01.1
![Page 167: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/167.jpg)
RAPLOA calibration
[Figure: two scatterplots by survey location. Left: parasitology prevalence against RAPLOA prevalence (both axes 0 to 100). Right: parasitology logit against RAPLOA logit (both axes −6 to 2)]
Empirical logit transformation linearises relationship.
Colour-coding corresponds to four surveys in different regions.
![Page 168: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/168.jpg)
RAPLOA calibration (ctd)
[Figure: parasitology prevalence against RAPLOA prevalence (both axes 0 to 100), by survey location]
Fit linear functional relationship on logit scale and back-transform
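The calibration step can be sketched in a few lines. This is only an illustration under stated assumptions: the village counts are invented, the function names are hypothetical, and ordinary least squares is used as a simple stand-in for the linear functional relationship named on the slide (which allows for error in both variables).

```python
import math


def empirical_logit(positives, n):
    """Empirical logit of `positives` out of `n`; the 0.5 offsets keep it finite at 0 or n."""
    return math.log((positives + 0.5) / (n - positives + 0.5))


def inverse_logit(eta):
    """Back-transform from the logit scale to a proportion."""
    return 1.0 / (1.0 + math.exp(-eta))


def fit_line(x, y):
    """Ordinary least squares intercept and slope for y on x (stand-in for a functional fit)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical village-level counts (positives, sample size); not real survey data.
raploa = [(5, 80), (20, 80), (35, 80), (50, 80), (65, 80)]
parasitology = [(1, 80), (4, 80), (10, 80), (20, 80), (34, 80)]

x = [empirical_logit(pos, n) for pos, n in raploa]
y = [empirical_logit(pos, n) for pos, n in parasitology]
intercept, slope = fit_line(x, y)


def predict_parasitology_prevalence(raploa_prevalence):
    """Map a RAPLOA prevalence in (0, 1) to a parasitology prevalence via the fitted line."""
    eta = math.log(raploa_prevalence / (1.0 - raploa_prevalence))
    return inverse_logit(intercept + slope * eta)
```

Because the line is fitted on the logit scale, the back-transformed calibration curve is automatically confined to the interval (0, 1).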
![Page 169: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/169.jpg)
Parasitology/RAPLOA bivariate model
• treat prevalence estimates as conditionally independent binomial responses
• with bivariate latent Gaussian process in linear predictor
• asymmetric formulation,
$S_1(x) = \alpha + \beta S_2(x) + Z(x)$
• low-rank spline representation of $S_2(x)$ to ease computation
![Page 170: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/170.jpg)
Including individual-level variation
What to do when parasitological and RAPLOA estimates are from the same individuals?
• model for bivariate binary response at individual level
• $P_{ijk} = P(\text{positive in village } i, \text{ method } j, \text{ individual } k)$
• $\text{logit}(P_{ijk}) = \mu_{ijk} + S_j(x_i) + U_{ik}$
– linear model for $\mu_{ijk}$
– $U_{ik} \sim N(0, \nu^2)$
• straightforward in principle, but computationally awkward
• may not make much difference in practice
![Page 171: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/171.jpg)
SESSION 11
More on generalised linear geostatistical models
![Page 172: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/172.jpg)
Covariance functions and variograms
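• In non-Gaussian settings, the variogram is a less natural summary statistic but can still be useful as a diagnostic tool
• for the GLGM with constant mean:
$$E[Y(x_i) \mid S(x_i)] = \mu_i = g(\alpha + S_i), \qquad v_i = v(\mu_i)$$
$$\begin{aligned}
\gamma_Y(u) &= E\left[\tfrac{1}{2}(Y_i - Y_j)^2\right] \\
&= \tfrac{1}{2} E_S\left[E_Y\left[(Y_i - Y_j)^2 \mid S(\cdot)\right]\right] \\
&= \tfrac{1}{2}\left(E_S\left[\{g(\alpha + S_i) - g(\alpha + S_j)\}^2\right] + 2E_S\left[v(g(\alpha + S_i))\right]\right) \\
&\approx g'(\alpha)^2 \gamma_S(u) + \tau^2
\end{aligned}$$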
![Page 173: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/173.jpg)
• the variogram on the Y-scale is approximately proportional to the variogram of S(·) plus an intercept
• the intercept represents an average nugget effect induced by the variance of the error distribution of the model
• however it relies on a linear approximation to the inverse link function
• it may be inadequate for diagnostic analysis since the essence of the generalized linear model family is its explicit incorporation of a non-linear relationship between Y and S(x).
• The exact variogram depends on higher moments of S(·)
• explicit results are available only in special cases.
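One such special case, worked here as an illustration rather than taken from the slides, is the Poisson model with log link, so that $g(\cdot) = \exp(\cdot)$ and $v(\mu) = \mu$. It is assumed that $S(\cdot)$ is a zero-mean stationary Gaussian process with variance $\sigma^2$ and correlation function $\rho(u)$, so that $\gamma_S(u) = \sigma^2\{1 - \rho(u)\}$. Using $E[e^{2S_i}] = e^{2\sigma^2}$ and $E[e^{S_i + S_j}] = e^{\sigma^2\{1 + \rho(u)\}}$:

```latex
\begin{aligned}
\gamma_Y(u)
  &= \tfrac{1}{2} E_S\left[\{e^{\alpha + S_i} - e^{\alpha + S_j}\}^2\right]
     + E_S\left[e^{\alpha + S_i}\right] \\
  &= e^{2\alpha + \sigma^2}\left(e^{\sigma^2} - e^{\sigma^2 \rho(u)}\right)
     + e^{\alpha + \sigma^2/2}
\end{aligned}
```

For small $\sigma^2$, $e^{\sigma^2} - e^{\sigma^2\rho(u)} \approx \sigma^2\{1 - \rho(u)\} = \gamma_S(u)$ and $e^{2\alpha + \sigma^2} \approx e^{2\alpha} = g'(\alpha)^2$, which recovers the approximate form with intercept $e^{\alpha + \sigma^2/2}$. For larger $\sigma^2$ the exact expression is no longer proportional to $\gamma_S(u)$, which is the caveat stated above.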
![Page 174: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/174.jpg)
Spatial survival analysis
• specified through the hazard function $h(t) = f(t)/\{1 - F(t)\}$
• $h(t)\,\delta t$ is the conditional probability that the event will occur in the interval $(t, t + \delta t)$, given that it has not occurred by time $t$
• proportional hazards model with $\lambda_0(t)$ an unspecified baseline hazard function
$$h_i(t) = \lambda_0(t) \exp(d_i'\beta)$$
• $h_i(t)/h_j(t)$ does not change over time
• alternatively, fully specified models are proposed
• frailty, corresponding to random effects, can be introduced by
$$h_i(t) = \lambda_0(t) \exp(d_i'\beta + U_i) = \lambda_0(t)\, W_i \exp(d_i'\beta), \qquad W_i = \exp(U_i)$$
![Page 175: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/175.jpg)
• e.g. log-Gaussian frailty model and gamma frailty model
• replacing $U_i$ by $S(x_i)$ introduces spatial frailties (Li & Ryan, 2002; Banerjee, Wall & Carlin, 2003)
• $E[S(x)] = -0.5\,\text{Var}[S(x)]$ preserves interpretation of $\exp\{S(x)\}$ as a frailty process
• other possible approaches, e.g. Henderson, Shimakura and Gorst (2002) extend the gamma-frailty model
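Two of the points above can be checked numerically. The sketch below assumes a hypothetical Weibull baseline hazard and invented covariate values. It shows that the hazard ratio between two individuals is constant over time, and that a log-Gaussian frailty with mean equal to −0.5 times its variance has expectation one, so that it rescales the hazard without shifting its average level.

```python
import math
import random


def weibull_baseline(t, shape=1.5, scale=2.0):
    """A hypothetical fully specified baseline hazard lambda0(t) of Weibull form."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def hazard(t, covariates, beta, frailty=1.0):
    """Proportional hazards with multiplicative frailty: lambda0(t) * W * exp(d'beta)."""
    linear_predictor = sum(d * b for d, b in zip(covariates, beta))
    return weibull_baseline(t) * frailty * math.exp(linear_predictor)


beta = [0.7, -0.3]                   # illustrative regression coefficients
d_i, d_j = [1.0, 2.0], [0.0, 1.0]    # illustrative covariate vectors
hazard_ratios = [hazard(t, d_i, beta) / hazard(t, d_j, beta) for t in (0.5, 1.0, 5.0)]


def mean_log_gaussian_frailty(variance, n_draws, seed=1):
    """Monte Carlo mean of W = exp(U), with U ~ N(-0.5 * variance, variance)."""
    rng = random.Random(seed)
    mean = -0.5 * variance
    sd = math.sqrt(variance)
    return sum(math.exp(rng.gauss(mean, sd)) for _ in range(n_draws)) / n_draws


frailty_mean = mean_log_gaussian_frailty(variance=0.5, n_draws=200_000)
```

Every entry of `hazard_ratios` equals exp{(d_i − d_j)′β} up to rounding, whatever baseline is chosen, and `frailty_mean` should be close to 1.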
![Page 176: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/176.jpg)
Geostatistical models for point processes
• Two possible connections between point processes and geostatistics:
1. measurement process replaced by a point process
2. choice of data locations for Y (xi)
![Page 177: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/177.jpg)
Cox point processes
Definition:
A Cox process is a point process in which there is an unobserved, non-negative-valued stochastic process $S = \{S(x) : x \in \mathbb{R}^2\}$ such that, conditional on $S$, the observed point process is an inhomogeneous Poisson process with spatially varying intensity $S(x)$.
• fits into the general geostatistical framework
• derived as limiting form of a geostatistical model as $\delta \to 0$ for locations on a lattice of spacing $\delta$
• log-Gaussian Cox process is a tractable form of Cox process (e.g. Møller, Syversveen and Waagepetersen, 1998; Brix & Diggle, 2001)
• inference generally requires computationally intensive Monte Carlo methods, whose implementation involves careful tuning
![Page 178: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/178.jpg)
• moment-based method provides an analogue of the variogram, for exploratory analysis and preliminary estimation of model parameters
![Page 179: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/179.jpg)
Cox point processes
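• intensity surface $\Lambda(x) = \exp\{S(x)\}$, where $S(\cdot)$ is a stationary Gaussian process with mean $\mu$ and covariance function $\gamma(u)$
• $\Lambda(x)$ has mean $\lambda = \exp\{\mu + 0.5\gamma(0)\}$ and covariance function $\phi(u) = \lambda^2[\exp\{\gamma(u)\} - 1]$
• $\lambda$ also represents the expected number of points per unit area in the Cox process
• $K(s)$: reduced second moment measure of a stationary point process
• $\lambda K(s)$: expected number of further points within distance $s$ of an arbitrary point of the process
• For the log-Gaussian Cox process
$$K(s) = \pi s^2 + 2\pi\lambda^{-2} \int_0^s \phi(u)\,u\,du$$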
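As a numerical illustration, the theoretical K-function can be evaluated by simple quadrature. The sketch uses the fact that, for a log-Gaussian Cox process whose Gaussian process has covariance function γ(u), the covariance of the intensity surface divided by λ² is exp{γ(u)} − 1, so λ cancels from the formula. The exponential covariance and the function names are assumptions made for the example.

```python
import math


def exponential_covariance(u, sigma2, phi):
    """Assumed covariance gamma(u) = sigma2 * exp(-u / phi) of the Gaussian process S."""
    return sigma2 * math.exp(-u / phi)


def lgcp_k_function(s, sigma2, phi, n_steps=2000):
    """K(s) = pi s^2 + 2 pi * integral_0^s [exp{gamma(u)} - 1] u du, by the midpoint rule."""
    h = s / n_steps
    integral = 0.0
    for k in range(n_steps):
        u = (k + 0.5) * h
        integral += (math.exp(exponential_covariance(u, sigma2, phi)) - 1.0) * u * h
    return math.pi * s ** 2 + 2.0 * math.pi * integral


k_poisson = lgcp_k_function(0.1, sigma2=0.0, phi=0.05)    # no clustering: pi * s^2
k_clustered = lgcp_k_function(0.1, sigma2=1.0, phi=0.05)  # exceeds pi * s^2
```

With `sigma2 = 0` the process reduces to a homogeneous Poisson process and K(s) = πs²; positive `sigma2` adds the excess that reflects aggregation.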
![Page 180: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/180.jpg)
• A non-parametric estimator:
$$\hat{K}(s) = \frac{|A|}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i} w_{ij}^{-1}\, I(u_{ij} \le s)$$
• $w_{ij}$ allows for edge correction
• preliminary estimates of model parameters can then be obtained by minimising a measure of the discrepancy between theoretical and empirical K-functions
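A minimal sketch of the estimator follows, with the simplifying assumption that every edge-correction weight is set to 1, so the estimate is biased slightly downwards near the boundary of A. The point pattern is simulated as completely random on the unit square, for which the theoretical value is πs²; all names are illustrative.

```python
import math
import random


def k_estimate(points, s, area=1.0):
    """Non-parametric K estimate with all edge-correction weights w_ij set to 1."""
    n = len(points)
    count = 0
    for i in range(n):
        xi, yi = points[i]
        for j in range(n):
            # I(u_ij <= s) over ordered pairs with j != i
            if j != i and math.hypot(xi - points[j][0], yi - points[j][1]) <= s:
                count += 1
    return area * count / (n * (n - 1))


rng = random.Random(42)
csr_points = [(rng.random(), rng.random()) for _ in range(400)]
k_hat = k_estimate(csr_points, s=0.05)
k_csr = math.pi * 0.05 ** 2  # theoretical K(s) under complete spatial randomness
```

A preliminary parameter estimate would then come from comparing `k_hat`, evaluated over a range of distances, with a theoretical K-function and minimising a discrepancy measure such as an integrated squared difference.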
![Page 181: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/181.jpg)
Geostatistics and marked point processes
locations X signal S measurements Y
• Usually write geostatistical model as
[S, Y ] = [S][Y |S]
• What if X is stochastic? Usual implicit assumption is
[X,S, Y ] = [X][S][Y |S],
hence can ignore [X] for inference about [S, Y ].
• Resulting likelihood:
$$L(\theta) = \int [S][Y \mid S]\,dS$$
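For the linear Gaussian model this integral has a closed form, which is worth writing out as a reference point. The notation here is assumed for the illustration: $S$ at the $n$ data locations is multivariate Gaussian with mean zero and covariance $\sigma^2 R(\phi)$, and $Y \mid S$ is Gaussian with mean $\mu 1 + S$ and covariance $\tau^2 I$.

```latex
\begin{aligned}
Y &\sim \mathrm{MVN}\left(\mu 1,\; \sigma^2 R(\phi) + \tau^2 I\right), \\
\log L(\theta) &= -\tfrac{1}{2}\left\{ n \log(2\pi)
   + \log\left|\sigma^2 R(\phi) + \tau^2 I\right|
   + (y - \mu 1)'\left(\sigma^2 R(\phi) + \tau^2 I\right)^{-1}(y - \mu 1) \right\}
\end{aligned}
```

Outside the Gaussian case no such closed form is available in general, and the integral has to be evaluated numerically, typically by Monte Carlo.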
![Page 182: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/182.jpg)
Marked point processes
locations X marks Y
• X is a point process
• Y need only be defined at points of X
• natural factorisation of [X, Y ]?
![Page 183: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/183.jpg)
Example 1. Spatial distribution of disease
X : population at risk
Y : case or non-case
• Natural factorisation is [X,Y ] = [X][Y |X]
• Usual scientific focus is [Y |X]
• Hence, can ignore [X]
![Page 184: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/184.jpg)
Example 2. Growth of natural forests
X : location of tree
Y : size of tree
• Two candidate models:
– competitive interactions ⇒ [X, Y ] = [X][Y |X]
– environmental heterogeneity ⇒ [X, Y ] = [Y ][X|Y ]?
• focus of scientific interest?
![Page 185: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/185.jpg)
Preferential sampling
locations X signal S measurements Y
• Conventional model:
[X,S, Y ] = [S][X][Y |S] (1)
• Preferential sampling model:
[X,S, Y ] = [S][X|S][Y |S,X] (2)
• Key point for inference: even if [Y|S,X] in (2) and [Y|S] in (1) are algebraically the same, the term [X|S] in (2) cannot be ignored for inference about [S, Y], because of the shared dependence on the unobserved process S
![Page 186: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/186.jpg)
A model for preferential sampling
[X,S, Y ] = [S][X|S][Y |S,X]
• [S] = SGP(0, σ², ρ) (stationary Gaussian process)
• [X|S] = inhomogeneous Poisson process with intensity
λ(x) = exp{α + βS(x)}
• [Y|S,X] = N{µ + S(x), τ²} (independent Gaussian)
![Page 187: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/187.jpg)
Simulation of preferential sampling model
[Figure: three panels on the unit square, each showing sampling locations over the signal surface]
Locations (dots) and underlying signal process (grey-scale):
• left-hand panel: uniform non-preferential
• centre-panel: clustered preferential
• right-hand panel: clustered non-preferential
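The effect of preferential sampling can be reproduced in a small simulation. To keep the sketch short and free of matrix algebra it works in one dimension rather than on the unit square: with exponential correlation on a regular grid the Gaussian process is an AR(1) sequence. Locations are drawn conditionally on their number, with probability proportional to exp{βS(x)}, so α plays no role here. All parameter values and function names are illustrative assumptions.

```python
import math
import random


def simulate_signal(n_grid, sigma2, phi, rng):
    """Zero-mean stationary Gaussian process on a regular grid over [0, 1] with
    correlation exp(-u / phi), simulated via its AR(1) representation."""
    step = 1.0 / (n_grid - 1)
    rho = math.exp(-step / phi)
    sd = math.sqrt(sigma2)
    signal = [rng.gauss(0.0, sd)]
    for _ in range(n_grid - 1):
        innovation = sd * math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        signal.append(rho * signal[-1] + innovation)
    return signal


def sample_locations(signal, n, beta, rng):
    """Draw n grid indices with probability proportional to exp{beta * S(x)}.
    beta = 0 gives uniform, non-preferential sampling."""
    weights = [math.exp(beta * s) for s in signal]
    return rng.choices(range(len(signal)), weights=weights, k=n)


def measure(signal, indices, mu, tau2, rng):
    """Y = mu + S(x) + independent N(0, tau2) noise at the sampled locations."""
    return [mu + signal[i] + rng.gauss(0.0, math.sqrt(tau2)) for i in indices]


rng = random.Random(2007)
signal = simulate_signal(n_grid=1001, sigma2=1.0, phi=0.1, rng=rng)
mu, tau2 = 4.0, 0.1
uniform_y = measure(signal, sample_locations(signal, 200, 0.0, rng), mu, tau2, rng)
preferential_y = measure(signal, sample_locations(signal, 200, 2.0, rng), mu, tau2, rng)

true_mean = mu + sum(signal) / len(signal)          # average of mu + S(x) over the grid
uniform_mean = sum(uniform_y) / len(uniform_y)
preferential_mean = sum(preferential_y) / len(preferential_y)
```

With β > 0 the sampled locations concentrate where S(x) is high, so `preferential_mean` overshoots `true_mean`, while `uniform_mean` does not show this systematic bias.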
![Page 188: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/188.jpg)
Likelihood inference
[X,S, Y ] = [S][X|S][Y |S,X]
• data are X and Y, hence likelihood is
$$L(\theta) = \int [X, S, Y]\,dS = E_S\big[[X \mid S][Y \mid S, X]\big]$$
• evaluate expectation by Monte Carlo,
$$L_{MC}(\theta) = m^{-1} \sum_{j=1}^{m} [X \mid S_j][Y \mid S_j, X]$$
• use antithetic pairs $S_{2j} = -S_{2j-1}$
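The benefit of antithetic pairs can be seen on a toy problem. In the sketch below S is a scalar N(0, σ²) variable and the integrand exp(βS) is only a stand-in for the product [X|S][Y|S,X], chosen because its expectation, exp(β²σ²/2), is known exactly. Since S is symmetric about zero, −S has the same distribution as S, which is what makes the pairing valid.

```python
import math
import random


def integrand(s, beta=0.5):
    """Toy stand-in for [X|S][Y|S,X] with scalar S; not the model's actual terms."""
    return math.exp(beta * s)


def mc_plain(m, sigma, rng):
    """Plain Monte Carlo average over m independent draws S_j ~ N(0, sigma^2)."""
    return sum(integrand(rng.gauss(0.0, sigma)) for _ in range(m)) / m


def mc_antithetic(m, sigma, rng):
    """Monte Carlo average over m // 2 antithetic pairs (S, -S)."""
    n_pairs = m // 2
    total = 0.0
    for _ in range(n_pairs):
        s = rng.gauss(0.0, sigma)
        total += integrand(s) + integrand(-s)
    return total / (2 * n_pairs)


def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


rng = random.Random(11)
exact = math.exp(0.5 ** 2 / 2.0)  # E[exp(beta * S)] for beta = 0.5, sigma = 1
plain_runs = [mc_plain(200, 1.0, rng) for _ in range(300)]
antithetic_runs = [mc_antithetic(200, 1.0, rng) for _ in range(300)]
antithetic_mean = sum(antithetic_runs) / len(antithetic_runs)
```

Both estimators use the same number of integrand evaluations per run, but the antithetic version has visibly smaller run-to-run variance because the integrand is monotone in S.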
![Page 189: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/189.jpg)
Practical solutions to weak identifiability
1. Strong Bayesian priors (if you can believe them)
2. Explanatory variables as surrogate for S
3. Two-stage sampling
![Page 190: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/190.jpg)
SESSION 12
Geostatistical design
![Page 191: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/191.jpg)
Geostatistical design
• Retrospective
Add to, or delete from, an existing set of measurement locations xi ∈ A : i = 1, . . . , n.
• Prospective
Choose optimal positions for a new set of measurement locations xi ∈ A : i = 1, . . . , n.
![Page 192: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/192.jpg)
Naïve design folklore
• Spatial correlation decreases with increasing distance.
• Therefore, close pairs of points are wasteful.
• Therefore, spatially regular designs are a good thing.
![Page 193: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/193.jpg)
Less naïve design folklore
• Spatial correlation decreases with increasing distance.
• Therefore, close pairs of points are wasteful if you know the correct model.
• But in practice, at best, you need to estimate unknown model parameters.
• And to estimate model parameters, you need your design to include a wide range of inter-point distances.
• Therefore, spatially regular designs should be tempered by the inclusion of some close pairs of points.
![Page 194: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/194.jpg)
Examples of compromise designs
[Figure: two example designs on the unit square]
A) Lattice plus close pairs design
B) Lattice plus in-fill design
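One possible way to construct the two compromise designs is sketched below. The details are assumptions made for illustration, not taken from the slides: the close-pairs design adds m points, each uniform on a disc of radius `delta` around a distinct randomly chosen lattice point, and the in-fill design adds an r × r finer lattice in the interior of each of m randomly chosen lattice cells.

```python
import math
import random


def lattice(k):
    """k x k regular lattice on the unit square, including the boundary (k >= 2)."""
    return [(i / (k - 1), j / (k - 1)) for i in range(k) for j in range(k)]


def lattice_plus_close_pairs(k, m, delta, rng):
    """k x k lattice plus m extra points, each uniform on the part of a disc of radius
    delta, centred on a distinct randomly chosen lattice point, that lies in the square."""
    base = lattice(k)
    extras = []
    for cx, cy in rng.sample(base, m):
        while True:  # rejection step keeps the new point inside the unit square
            radius = delta * math.sqrt(rng.random())
            angle = 2.0 * math.pi * rng.random()
            x, y = cx + radius * math.cos(angle), cy + radius * math.sin(angle)
            if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
                extras.append((x, y))
                break
    return base + extras


def lattice_plus_infill(k, m, r, rng):
    """k x k lattice plus an r x r finer lattice inside each of m randomly chosen cells."""
    base = lattice(k)
    width = 1.0 / (k - 1)
    cells = [(i, j) for i in range(k - 1) for j in range(k - 1)]
    extras = []
    for i, j in rng.sample(cells, m):
        for a in range(r):
            for b in range(r):
                extras.append(((i + (a + 1) / (r + 1)) * width,
                               (j + (b + 1) / (r + 1)) * width))
    return base + extras


rng = random.Random(3)
close_pairs_design = lattice_plus_close_pairs(k=8, m=10, delta=0.03, rng=rng)
infill_design = lattice_plus_infill(k=8, m=3, r=3, rng=rng)
```

Both designs keep the even coverage of the lattice while adding short inter-point distances, which carry the information about the nugget variance and the small-scale correlation.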
![Page 195: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/195.jpg)
Designs for parameter estimation
Comparison of random and square lattice designs, each with n = 100 sample locations, with respect to three design criteria: spatial maximum of mean square prediction error M(x); spatial average of mean square prediction error M(x); scaled mean square error, 100 × MSE(T), for T = ∫ S(x)dx. The simulation model is a stationary Gaussian process with parameters µ = 0, σ² + τ² = 1, correlation function ρ(u) = exp(−u/φ) and nugget variance τ². The tabulated figures are averages of each design criterion over N = 500 replicate simulations.

| τ² | φ | max M(x): Random | max M(x): Lattice | average M(x): Random | average M(x): Lattice | MSE(T): Random | MSE(T): Lattice |
|---|---|---|---|---|---|---|---|
| 0 | 0.05 | 9.28 | 8.20 | 0.77 | 0.71 | 0.53 | 0.40 |
| 0 | 0.15 | 5.41 | 3.61 | 0.40 | 0.30 | 0.49 | 0.18 |
| 0 | 0.25 | 3.67 | 2.17 | 0.26 | 0.19 | 0.34 | 0.10 |
| 0.1 | 0.05 | 9.57 | 8.53 | 0.81 | 0.76 | 0.54 | 0.41 |
| 0.1 | 0.15 | 6.22 | 4.59 | 0.50 | 0.41 | 0.56 | 0.28 |
| 0.1 | 0.25 | 4.44 | 3.34 | 0.37 | 0.30 | 0.47 | 0.22 |
| 0.3 | 0.05 | 10.10 | 9.62 | 0.88 | 0.86 | 0.51 | 0.40 |
| 0.3 | 0.15 | 7.45 | 6.63 | 0.65 | 0.60 | 0.68 | 0.43 |
| 0.3 | 0.25 | 6.23 | 5.70 | 0.55 | 0.51 | 0.58 | 0.38 |
![Page 196: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/196.jpg)
A Bayesian design criterion
Assume goal is prediction of S(x) for all x ∈ A.
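$$[S \mid Y] = \int [S \mid Y, \theta][\theta \mid Y]\,d\theta$$
For retrospective design, minimise
$$v = \int_A \text{Var}\{S(x) \mid Y\}\,dx$$
For prospective design, minimise
$$E(v) = \int_y \int_A \text{Var}\{S(x) \mid y\}\, f(y)\,dx\,dy$$
where $f(y)$ corresponds to
$$[Y] = \int [Y \mid \theta][\theta]\,d\theta$$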
![Page 197: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/197.jpg)
Results
Retrospective: deletion of points from a monitoring network
[Figure: start design on the unit square]
![Page 198: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/198.jpg)
Selected final designs
[Figure: nine final designs on the unit square. Panels A to C: τ²/σ² = 0; panels D to F: τ²/σ² = 0.3; panels G to I: τ²/σ² = 0.6]
![Page 199: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/199.jpg)
Prospective: regular lattice vs compromise designs
[Figure: design criterion plotted against maximum φ in five panels, for τ² = 0, 0.2, 0.4, 0.6 and 0.8; each panel compares the regular, infilling and close pairs designs]
![Page 201: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/201.jpg)
Monitoring salinity in the Kattegat basin
[Figure: maps of the Kattegat basin (panels A and B), labelled with Denmark (Jutland), Denmark (Zealand), Sweden, Germany, the North Sea and the Baltic Sea]
Solid dots are locations deleted for reduced design.
![Page 202: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/202.jpg)
Further remarks on geostatistical design
1. Conceptually more complex problems include:
(a) design when some sub-areas are more interesting than others;
(b) design for best prediction of non-linear functionals of S(·);
(c) multi-stage designs (see next session).
2. Theoretically optimal designs may not be realistic (e.g. Loa loa photo).
3. Goal here is NOT optimal design, but to suggest constructions for good, general-purpose designs.
![Page 203: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/203.jpg)
Closing remarks
• There is nothing special about geostatistics.
• Parameter uncertainty can have a material impact on prediction.
• Bayesian paradigm deals naturally with parameter uncertainty.
• Implementation through MCMC is not wholly satisfactory:
– sensitivity to priors?
– convergence of algorithms?
– routine implementation on large data-sets?
![Page 204: Geostat´ıstica - LEG-UFPRpdf:verao.pdf · 2007. 9. 25. · Geostat´ıstica 1 Paulo Justiniano Ribeiro Jr Laborato´rio de Estat´ıstica e Geoinforma¸ca˜o, Universidade Federal](https://reader033.vdocuments.net/reader033/viewer/2022060715/607b859046db404090487e22/html5/thumbnails/204.jpg)
• Model-based approach clarifies distinctions between:
– the substantive problem;
– formulation of an appropriate model;
– inference within the chosen model;
– diagnostic checking and re-formulation.
• Analyse problems, not data:
– what is the scientific question?
– what data will best allow us to answer the question?
– what is a reasonable model to impose on the data?
– inference: avoid ad hoc methods if possible
– fit, reflect, re-formulate as necessary
– answer the question.