
Basic geostatistics

Austin Troy

Interpolation

Three methods in ArcGIS

• IDW

• SPLINE

• Kriging

Inverse Distance Weighting

• Weighted moving average

• Weights by distance

• Assumes an unknown value is influenced more by nearby points than by far-away points, but we can control the rate of decay

• Validity testing requires taking additional observations.

• Sensitive to the sampling pattern; tends to produce circular patterns around sample points

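Inverse Distance Weighting

$$\hat{z}(x_0) = \sum_{i=1}^{n} \lambda_i \, z(x_i)$$

Where λi are given by some weighting function and

$$\sum_{i=1}^{n} \lambda_i = 1$$

• Common form of weighting function is d^{-p}, yielding:

$$\hat{z}(x_0) = \frac{\sum_{i=1}^{n} z(x_i) \, d_{ij}^{-p}}{\sum_{i=1}^{n} d_{ij}^{-p}}$$

where d_ij is the distance between sample point x_i and the location being predicted.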
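As a minimal sketch of the weighted average above, the following Python function computes an IDW prediction at a single location. The function name `idw_predict`, the (x, y, z) tuple layout, and the sample values are illustrative assumptions, not part of ArcGIS. It uses every sample point rather than a search neighborhood.

```python
import math

def idw_predict(samples, x0, p=2.0):
    """Predict z at location x0 from (x, y, z) samples using weights d**-p."""
    num = 0.0
    den = 0.0
    for x, y, z in samples:
        d = math.hypot(x - x0[0], y - x0[1])
        if d == 0.0:
            # The prediction location coincides with a sample point.
            return z
        w = d ** -p          # weight decays with distance; p controls how fast
        num += w * z
        den += w
    # Dividing by the sum of weights makes the weights sum to 1.
    return num / den

# Made-up sample points for illustration only.
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (0.0, 1.0, 30.0)]

# All three points are equidistant from (0.5, 0.5), so the result
# is their plain mean (about 20).
print(idw_predict(samples, (0.5, 0.5), p=2))
```

With p = 0 every weight equals 1, so the prediction is the plain mean of the samples wherever it is made.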

IDW-How it works

• Z value field: numeric attribute to be interpolated

• Power: determines the relationship between weighting and distance; where p = 0, there is no decrease in influence with distance; as p increases, distant points become less influential in interpolating the Z value at a given pixel

• Neighborhood type: standard or smooth

Neighborhood types: Standard

• Specifications

– # neighbors to include

– Ellipse height/width

– Sector type/angles: to ensure inclusion of obs. from all directions

Neighborhood types: Smooth

• 3 Ellipses

• Smoothing factor: size of gaps between ellipses

– Inner: same weighting as “standard”

– In between: weights multiplied by sigmoidal value from 1 (inner edge) to 0 (outer edge).

IDW Weights

IDW - P parameter

• What is the best P to use?

• Minimize Root Mean Squared Prediction Error (RMSPE)

• Use cross validation to check.

• You can look for an optimal P by testing your sample point data against a validation data set
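One way to sketch this search is leave-one-out cross validation: each sample point is predicted from all the others, and the power with the lowest RMSPE is kept. The function names, candidate powers, and data below are hypothetical, and the code uses all remaining points as neighbors, which is simpler than the neighborhood options in the software.

```python
import math

def idw_predict(samples, x0, p):
    """IDW prediction at x0 from (x, y, z) samples with weights d**-p."""
    num = den = 0.0
    for x, y, z in samples:
        d = math.hypot(x - x0[0], y - x0[1])
        if d == 0.0:
            return z
        w = d ** -p
        num += w * z
        den += w
    return num / den

def loo_rmspe(samples, p):
    """Leave-one-out RMSPE: predict each point from all the others."""
    sq_err = 0.0
    for i, (x, y, z) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        sq_err += (idw_predict(others, (x, y), p) - z) ** 2
    return math.sqrt(sq_err / len(samples))

def best_power(samples, candidates):
    """Return the candidate power with the lowest leave-one-out RMSPE."""
    return min(candidates, key=lambda p: loo_rmspe(samples, p))

# Made-up sample points (x, y, z) for illustration only.
samples = [(0, 0, 1.0), (1, 0, 2.1), (2, 0, 2.9), (0, 1, 3.0),
           (1, 1, 4.2), (2, 1, 5.1), (0, 2, 4.9), (1, 2, 6.0), (2, 2, 7.2)]
powers = [0.5, 1, 2, 3, 4]

for p in powers:
    print(p, round(loo_rmspe(samples, p), 3))
print("lowest RMSPE at p =", best_power(samples, powers))
```

A separate validation data set can be used the same way: predict at the validation locations from the sample points and compute the RMSPE of those predictions instead.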

Plot of model fits

The blue line indicates the degree of spatial autocorrelation (required for interpolation). The closer it is to the dashed (1:1) line, the more perfectly autocorrelated the data are.

Where the blue line is horizontal, this indicates data independence

A mean prediction error near zero means the predictions are unbiased

Plot of model errors

Spline Method

• Another option for interpolation method

• This fits a curve through the sample data and assigns values to other locations based on their location on the curve

• Thin plate splines create a surface that passes through sample points with the least possible change in slope at all points, that is with a minimum curvature surface.

• Uses piece-wise functions fitted to a small number of data points, but joins are continuous, hence can modify one part of curve without having to recompute whole

• Overall function is continuous with continuous first and second derivatives.
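The "minimum curvature" condition can be written out for the classical two-dimensional thin plate spline. This is the textbook formulation, offered as a sketch; the symbols f, J, and (x_i, y_i, z_i) are notation introduced here, and software implementations may add further terms to the functional.

$$J(f) = \iint \left( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 \right) \, dx \, dy$$

The interpolating surface is the function f that minimizes J(f) subject to passing through every sample point:

$$f(x_i, y_i) = z_i, \quad i = 1, \dots, n$$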

Kriging Method

• Kriging is a geostatistical method and a probabilistic method, unlike the others, which are deterministic. That is, there is a probability associated with each prediction. Kriging has both a deterministic component, μ(s), and a probabilistic component, ε(s):

Z(s) = μ(s) + ε(s), where both are functions of location s

• Like IDW in that taking weighted moving average, but the weights (λ) are based on statistically derived autocorrelation measures.

• Interpolation parameters (e.g. weights) are chosen to optimize fn

• Assumes that variable in space can be modeled as sum of three components: 1) structure/deterministic part, 2) random but spatially correlated part and 3) spatially uncorrelated random part

Semivariance

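• Semivariogram(distance h) = 0.5 * average [ (value at location i – value at location j)² ] OR

$$\gamma(h) = \frac{1}{2n} \sum_{i=1}^{n} \{ z(x_i) - z(x_i + h) \}^2$$

where n is the number of pairs of points separated by distance h.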

• Based on the scatter of points, the computer (Geostatistical analyst) fits a curve through those points

• The inverse is the covariance matrix, which shows correlation over space
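The semivariance calculation can be sketched in a few lines of Python. This is an illustrative simplification: pairs are grouped into lag bins by distance only (no direction and no square cells), and the function name, arguments, and data are assumptions made for the example.

```python
import math
from itertools import combinations

def empirical_semivariogram(samples, lag_width, n_lags):
    """Return (bin centre, semivariance, pair count) for each non-empty lag bin.

    samples is a list of (x, y, z) tuples. Each pair of points contributes
    (z_i - z_j)**2 to the bin that contains its separation distance.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        d = math.hypot(x1 - x2, y1 - y2)
        k = int(d // lag_width)          # index of the lag bin
        if k < n_lags:
            sums[k] += (z1 - z2) ** 2
            counts[k] += 1
    result = []
    for k in range(n_lags):
        if counts[k]:
            centre = (k + 0.5) * lag_width
            # 0.5 * average squared difference over the pairs in the bin
            result.append((centre, sums[k] / (2 * counts[k]), counts[k]))
    return result

# Made-up transect of three points, for illustration only.
line = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0)]
for centre, gamma, pairs in empirical_semivariogram(line, lag_width=1.0, n_lags=3):
    print(centre, gamma, pairs)
```

Plotting the semivariance against the bin centres gives the binned scatter of points through which a model curve is then fitted.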

Kriging Method

• Hence, foundation of Kriging is notion of spatial autocorrelation, or tendency of values of entities closer in space to be related.

• Autocorrelation can be assessed using a semivariogram.

• Where autocorrelation exists, the semivariance should increase with distance until it reaches the variance around the mean, where it flattens out. That value is called the “sill.” The distance at which the sill is reached is the “range”; over the sloped part of the curve, within the range, values are related to each other. The intercept is the “nugget.”
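To make the nugget, sill, and range concrete, here is a sketch of the spherical model, one commonly used functional form for the semivariogram. Choosing the spherical form is an assumption made for illustration, and the function and parameter names are hypothetical.

```python
def spherical_variogram(h, nugget, sill, rng):
    """Spherical semivariogram model.

    h      : separation distance
    nugget : intercept, the semivariance just above h = 0
    sill   : value at which the curve flattens out
    rng    : range, the distance at which the sill is reached
    """
    if h <= 0:
        return 0.0                      # a point compared with itself
    if h >= rng:
        return sill                     # beyond the range: no autocorrelation
    r = h / rng
    # Rises from the nugget to the sill over the range.
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

# Made-up parameter values for illustration only.
for h in [0, 1, 5, 10, 20]:
    print(h, spherical_variogram(h, nugget=0.1, sill=1.0, rng=10.0))
```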

Kriging Method

• Semivariograms measure the strength of statistical correlation as a function of distance; they quantify spatial autocorrelation

• Because Kriging is based on the semivariogram, it is probabilistic, while IDW and Spline are deterministic

• Kriging associates some probability with each prediction, hence it provides not just a surface, but some measure of the accuracy of that surface

• Kriging equations are determined by fitting line through points so as to minimize weighted sum of squares between points and line

• These equations are weighted based on spatial autocorrelation, which is determined from the semivariograms
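The sketch below shows how semivariogram-based weights and a prediction variance can be computed. It solves the textbook ordinary kriging system with a spherical model, which is not necessarily what Geostatistical Analyst does internally. All function names, the small linear solver, and the data are assumptions made for illustration.

```python
import math

def spherical(h, nugget=0.0, sill=1.0, rng=2.0):
    """Spherical semivariogram model (parameter values are made up)."""
    if h <= 0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ordinary_kriging(samples, x0, gamma=spherical):
    """Return (prediction, kriging variance, weights) at location x0."""
    n = len(samples)
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    # Semivariances between all sample pairs, plus the row and column
    # that force the weights to sum to 1 (Lagrange multiplier).
    A = [[gamma(dist(samples[i], samples[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    A.append([1.0] * n + [0.0])
    # Semivariances between each sample and the prediction location.
    b = [gamma(dist(s, x0)) for s in samples] + [1.0]
    sol = solve(A, b)
    weights, mu = sol[:n], sol[n]
    pred = sum(w * s[2] for w, s in zip(weights, samples))
    var = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return pred, var, weights

# Made-up sample points (x, y, z) at the corners of a unit square.
samples = [(0, 0, 1.0), (1, 0, 2.0), (0, 1, 3.0), (1, 1, 4.0)]
pred, var, weights = ordinary_kriging(samples, (0.5, 0.5))
print(pred, var, weights)
```

Unlike IDW, the same calculation returns a variance for each prediction, which is what makes a prediction standard error map possible.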

Steps

• Variogram cloud; can use bins to make box plot

• Empirical variogram: choose bins and lags

• Model variogram: fit function through empirical variogram

– Functional forms?

Variogram

• Plots semi-variance against distance between points

• Where autocorrelation exists, the semivariance should have slope

• Is binned to simplify

• Binned values generated by grouping empirical points using square cells one lag wide.

• To show local variation around mean

Binning with average only

Binning with average and grouping

Functional Forms

From Fortin and Dale Spatial Analysis

Kriging Method

• We can then use a scatter plot of predicted versus actual values to see the extent to which our model actually predicts the values

• If the blue line and the points lie along the 1:1 line this indicates that the kriging model predicts the data well

Kriging Method

• Produces four types of prediction maps:

• Prediction Map: Predicted values

• Probability Map: Probability that the value exceeds a threshold x

• Prediction Standard Error Map: fit of model

• Quantile maps: Probability that value over certain quantile
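As a rough illustration of how a prediction and its standard error can yield an exceedance probability, the sketch below assumes the prediction error at a location is normally distributed. That normality assumption, the function name, and the numbers are all introduced here for the example; the software may compute its probability maps differently.

```python
import math

def prob_exceeds(prediction, std_error, threshold):
    """P(value > threshold), ASSUMING a normal prediction error.

    prediction : kriged value at the location
    std_error  : prediction standard error at the location
    threshold  : the value x of interest
    """
    z = (threshold - prediction) / std_error
    # Upper tail of the standard normal distribution: 1 - Phi(z).
    return 0.5 * math.erfc(z / math.sqrt(2))

# Made-up numbers for illustration only.
print(prob_exceeds(prediction=10.0, std_error=2.0, threshold=12.0))
```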

Kriging: Ordinary vs. Universal

• Universal kriging is known as kriging in the presence of universal trends.

• Universal kriging is used where there is an underlying trend beyond the simple spatial autocorrelation

• Generally this trend occurs at a different scale

• Trend may be fn of some geographic feature that occurs on one part of the map

Z(s) = µ(s) + ε(s),