gis modeling - university of colorado s outline • we will begin with geostatistics which means...

33
GIS Modeling Class (Block) 9: Variogram & Kriging Geography 4203 / 5203

Upload: phungtruc

Post on 19-Mar-2018

217 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

GIS Modeling

Class (Block) 9: Variogram &

Kriging

Geography 4203 / 5203

Page 2: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Some Updates

• Today class + one proposal

presentation

• Feb 22 Proposal Presentations

• Feb 25 Readings discussion

(Interpolation)

Page 3: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Last Lecture

• You have seen the first part of spatial estimationwhich was about interpolation techniques

• You hopefully obtained an idea of what the basicidea of interpolation is and when and why me aresupposed to make use of it

• You had some insights into the concepts of differenttechniques such as NN, pycnophylactiv I., IDW,Splines,…

• You hopefully understood the mathematicalfoundation of these approaches to better understandwhat you are doing in the lab

Page 4: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Today‘s Outline

• We will begin with Geostatistics which means wetalk about Kriging

• It is important to understand the difference betweenKriging and interpolation techniques we had so far

• We will talk about the conceptual basics, constraintsand methods for Kriging without going too much intodetail

• Here we need a deeper understanding ofautocorrelation and regionalized variable theoryas well as the incorporation of semivariogrammodels into estimations

Page 5: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Learning Objectives

• You will understand how Kriging works and

why it is called an optimized local

interpolator

• You will better understand terms like

autocorrelation, semivariance, covariance

• You will understand the mathematical

foundation of Kriging and how we make use

of spatial autocorrelation and semivariance to

estimate weights for estimating

Page 6: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Back to How to Improve

Spatial Estimation…• Remember how we incorporated autocorrelation

into interpolation so far

• “Implicitly” - we just assumed the data show spatialdependence, we knew this from data exploration

• Parameter settings e.g. in IDW are arbitrary(Radius?, # neighbors?, Weighting?) - we get moreknowledge through testing different combinations

• How to make theory-based choices and parameter-settings?

• This is where Geostatistical methods come into thepicture to help us with this decision-making process

Page 7: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

• Predictions can be

improved by incorporating

the knowledge of

autocorrelation (knowing

the value at one location

provides information of

locations nearby)

Page 8: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Geostatistics

• Too many definitions to go into detail

• Basically, it is about the analysis andinference of continuously-distributedvariables (temperature, concentrations,…)

• Analysis: describing the spatial variability ofthe phenomenon under study

• Inference: Estimation of the values atunknown locations (unknown values)

• Overall we need techniques for statisticalestimation of spatial phenomena and this iswhat Geostatistics provides…

Page 9: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Tobler and his Message -

how to…Theory-based• According to Tobler we expect attribute

values close together to be more similarthan those more distant

• This expectation drives the idea ofinterpolation and Geostatistical methods

• A graphical representation of this is thevariogram cloud

• Presentation of the square root of thedifference between attribute pairs againstthe distance between them

Page 10: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Variogram Cloudiness

• Let’s look at oneexample fromO’Sullivan andUnwin (2003)

• Might be weunderstand moreabout spatialautocorrelationand how to catch it

• Do we have atrend in here?

Page 11: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Variogram Cloudiness

• Can you see a

trend?

• If so what does

it mean?

• Remember the

visual trend in

the surface…

Page 12: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Variogram Cloudiness

• This is the representationof N-S (open circles) andE-W (filled circles)directions

• Which one shows aclearer trend and why?

• How do we call thiseffect?

• Why are there so fewsample points now?

Page 13: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Variogram Cloudiness

• For only few points we havelots of pairs to be representedand a variogram orsemivariogram becomes hardto be interpreted

• How to summarize thevariogram cloud?

• One way is to use box-plotsshowing means, medians,quartiles and extremes forindividual lags

Page 14: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Regionalized VariableTheory (RVT)

• By putting together the last slides youcan understand what RVT means

• The value of a variable in space (atlocation x) can be expressed bysumming together three components:

- A structural component (e.g., aspatial trend or just the mean, m(x) )

- Random, spatially autocorrelated(“regionalized”) variable (local spatialautocorrelation = ’(x))

- Random “noise” (stochastic variation,not dependent on location = ’’)

Z(x) = m(x) + ’(x) + ’’

Page 15: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Regionalized VariableTheory (RVT)

• Combination of these three components in amathematical model to develop an estimationfunction (function is applied to measured data toestimate values across area)

• m(x) is our trend (global trend surface)

• ’’ is the random component that has nothing todo with location

• ’(x) is our regionalized component and describes thelocal variation of the variable in space

• This latter relationship can be represented by adispersion measure…

Page 16: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Experimental Semivariance

• … here we call it the experimental semivariance forlag h:

• N(h) is the number of pairs separated by lag h (+/- lagtolerance)

• h - lag width or the distance between pairs of points• z - is the value of a point at u or u+h• Remember what we did with the variogram cloud by

“binning” the amount of data points

2( )

1

1ˆ( ) ( ) ( )

2 ( )

N h

h z u z u hN h =

= +r r

r

Page 17: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

The Experimental

Semiviogram

Nugget: Initial semivariancewhen autocorrelation is highest(intercept); or just the uncertaintywhere distance is close to 0 ( )Sill: Point where the curve levelsoff (inherent variation wherethere is little autocorrelation)Range: Lag distance where thesill is reached (over whichdifferences are spatiallydependent)

Page 18: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Lag and Lag Tolerance

• This has to do with “binning”

• Lag distance is the direct(planar) distance betweenpoints a and b

• Tolerances used to definesets of values that are“similar” distances apart(distances rarely repeat inreality)

Page 19: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Semivariance, Covariance

and Correlation

from Bohling (2005)

Spread vs. Similarity

Page 20: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Semivariogram Model

• The empirical semivariogram allows us toderive a semivariogram model to representsemivariance as a function of separationdistance

• Thus we can infer the characteristics of theunderlying process using the model

• Compute the semivariance between points

• Interpolate between sample points using anoptimal interpolator (“kriging”)

Page 21: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Constraints for a

Semivariogram Model

• Monotonically increasing

• Asymptotic max (sill)

• Non-negative intercept

(nugget)

• Anisotropy

Page 22: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Semivariogram Models

Exponential modelGaussian modelSperical model

from http://www.asu.edu/sas/sasdoc/sashtml/stat/chap34/sect12.htm

a - range

c - (nugget + sill) …however fitting the semivariogram is complex

What can you think of would happen if there is a trend in the data?

Page 23: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Where is the Best Fit?

from Bohling (2005)

Page 24: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Why is Anisotropy

Important• So far an isotropic structure of spatial

correlation has been assumed

• The semivariogram depends on the lagdistance not on direction (omnidirectionalsemivariogram)

• In reality we often have differences indifferent directions and need an anisotropicsemivariogram

• One model is geometric anisotropy (samesill in all direction but within different ranges)

Page 25: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Geometric Anisotropy

• Find ranges in three orthogonal principal directions

and create a three-dimensional lag vector h =

(hx,hy,hz)

• Transform this 3D vector into an equivalent isotropic

lag:

• Semivariance values computed for pairs that fall

within directional bands and prescribed lag limits

(searching for directional dependence)

Page 26: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Geometric Anisotropy

• These directionalbands are definedby azimuthaldirection, angulartolerance andbandwidth

• You did thisalready in the laband hopefullyknew what youwere doing…

Page 27: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Kriging

• Statistically-based “optimal”estimator of spatialvariables

• Predictions based onregionalized variabletheory - you know thisnow!

• Kriging first used byMatheron (1963) in honourof D.G. Krige - a southAfrican mining engineerwho laid the groundwork for“geostatistics”

Page 28: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Kriging Principles

• Similar to IDW, weights are used with

measured variables to estimate values at

unknown locations (a weighted average)

• Weights are given in a statistically optimal

fashion (given a specific model, assumptions

about the trend, autocorrelation and

stochastic variation in the predicted variable)

• In other words: We use the semivariance

model to formulate the weights

Page 29: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Ordinary Kriging

• For a basic understanding we look at Ordinary Kriging• Weighting of data points according to distance• Estimated values z are the sum of a regional mean

m(x) and a spatially correlated random component ’(x)

• (m(x) is implicit in the system)

• Assumptions:no trendisotropyvariogram can be defined with math modelsame semivariogram applies for the whole study area(variation is a function of distance, not location +constant mean - stationarity of the spatial process)

Page 30: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Ordinary Kriging• You see it looks similar to IDW

• But: Weight computation is much more

complex based on the inverted variogram

• Weights reach the value zero when we

reach the range a

• Ordinary: no trend, regional mean is

estimated from sample

Page 31: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

How Semivariance

Determines Kriging Weights

• Weighted average of theobserved attribute values

• Weights are a function of thevariogram model & sum to 1(they should to be unbiased!)

• You see the sizes of thepoints proportional to theirweights they get

• Points with distances > rangewill have a weight of zero,WHY?

Page 32: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

Typical Working Steps

• Describing the data / identifyingspatial autocorrelation and trends(experimental semivariogram)

• Building the semivariogram modelby using the mathematicalfunction

• Using the semivariogram modelfor defining the weights

• Evaluate interpolated surfaces

Page 33: GIS Modeling - University of Colorado  s Outline • We will begin with Geostatistics which means we talk about Kriging • It is important to understand the difference between

A First Summary

• Kriging is complex

• Here you obtained a first overview of the principlesof this technique

• Weights are derived from the semivariogram modelto intelligently infere unmeasured values

• You have seen the underlying rules of thesemivariogram and hopefully understood whyautocorrelation, directional trends andstationarity are important

• You also understood the difference to other localinterpolators, don’t you?