comparison of ordinary kriging and artificial neural network

8/8/2019 Comparison of Ordinary Kriging and Artificial Neural Network

1/36

Presented by: Pejman Tahmasebi

Supervisor: Dr. Katibeh

August, 2010


2/36

Role of ANN (Artificial Neural Network) and Geostatistics in Environmental Sciences

What is an ANN?

What are the most prevalent geostatistical methods?

What is the difference between simulation and estimation?

A case study applying and comparing ordinary kriging (OK) and ANN


3/36


4/36

Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to perform these behaviours.

An appropriate model/simulation of the nervous system should be able to produce similar responses and behaviours in artificial systems.

The nervous system is built from relatively simple units, the neurons, so copying their behaviour and functionality should be the solution.


5/36



6/36

ANNs

NN Mathematics

Architectures

Learning Algorithms

Methodology

Problems


7/36

[Figure: layered network mapping four inputs $x_1, \dots, x_4$ to one output]

Inputs:

$$y_k^1 = f(x_k, w_k^1), \quad k = 1, \dots, 4, \qquad y^1 = (y_1^1, y_2^1, y_3^1, y_4^1)^T$$

$$y_k^2 = f(y^1, w_k^2), \quad k = 1, 2, 3, \qquad y^2 = (y_1^2, y_2^2, y_3^2)^T$$

Output:

$$y_{Out} = f(y^2, w_1^3)$$


8/36

MLP neural networks

RBF

[Figure: two network diagrams, each mapping an input x to an output y_out]


9/36

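Error measure:

$$E = \frac{1}{N} \sum_{t=1}^{N} \left( F(x_t; W) - y_t \right)^2$$

Rule for changing the synaptic weights:

$$\Delta w_i^j = -c \cdot \frac{\partial E}{\partial w_i^j}(W), \qquad w_i^{j,\,new} = w_i^j + \Delta w_i^j$$

c is the learning parameter (usually a constant)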
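The update rule on this slide can be illustrated for the simplest possible case, a single linear neuron F(x; W) = w*x + b. This is only a minimal sketch: the linear model, the training data, the value c = 0.05 and the function names are all assumptions chosen for illustration, not taken from the slides.

```python
def mse(w, b, xs, ys):
    """E = (1/N) * sum_t (F(x_t; W) - y_t)^2 for the linear model F = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, xs, ys, c):
    """One update: w_new = w + delta_w with delta_w = -c * dE/dw (same for b)."""
    n = len(xs)
    dE_dw = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    dE_db = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - c * dE_dw, b - c * dE_db


# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0
errors = [mse(w, b, xs, ys)]
for _ in range(2000):
    w, b = gradient_step(w, b, xs, ys, c=0.05)  # c: learning parameter (assumed)
    errors.append(mse(w, b, xs, ys))
```

With a constant c that is small enough, the error shrinks at every step and the weights approach the values that generated the data; too large a c would make the iteration diverge.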


10/36

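MLP neural network with p layers

[Figure: network with input x, layers 1, 2, ..., p-1, p, and output y_out]

$$y_k^1 = \frac{1}{1 + e^{-(w_k^1)^T x - a_k^1}}, \quad k = 1, \dots, M_1, \qquad y^1 = (y_1^1, \dots, y_{M_1}^1)^T$$

$$y_k^2 = \frac{1}{1 + e^{-(w_k^2)^T y^1 - a_k^2}}, \quad k = 1, \dots, M_2, \qquad y^2 = (y_1^2, \dots, y_{M_2}^2)^T$$

$$\dots$$

$$y_{out} = F(x; W) = (w^p)^T y^{p-1}$$

Data: $(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)$

Error: $E(t) = (y_{out}(t) - y_t)^2 = (F(x_t; W) - y_t)^2$

It is very complicated to calculate the weight changes.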


11/36

Solution of the complicated learning:

Calculate first the changes for the synaptic weights of the output neuron.

Calculate the changes backward starting from layer p-1, and propagate backward the local error terms.

The method is still relatively complicated, but it is much simpler than the original optimization problem.
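The backward procedure described on this slide can be sketched for a network with one hidden sigmoid layer and a linear output neuron. This is an illustrative sketch only: the network size, the weights, the input and the target are hypothetical, and the finite-difference comparison at the end is added here purely as a sanity check.

```python
import math


def forward(x, W, a, v):
    """Hidden sigmoid layer followed by a linear output neuron."""
    hidden = [
        1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(wk, x)) + ak)))
        for wk, ak in zip(W, a)
    ]
    y_out = sum(vk * hk for vk, hk in zip(v, hidden))
    return hidden, y_out


def backprop(x, target, W, a, v):
    """Gradients of E = (y_out - target)^2, computed from the output backward."""
    hidden, y_out = forward(x, W, a, v)
    delta_out = 2.0 * (y_out - target)            # local error at the output
    grad_v = [delta_out * hk for hk in hidden]    # output-neuron weights first
    # Propagate the local error back through each sigmoid: dy/dz = y * (1 - y).
    delta_hidden = [delta_out * vk * hk * (1.0 - hk) for vk, hk in zip(v, hidden)]
    grad_W = [[dk * xj for xj in x] for dk in delta_hidden]
    grad_a = delta_hidden[:]
    return grad_W, grad_a, grad_v


# Hypothetical tiny network: 2 inputs, 2 hidden neurons, 1 output.
x, target = [0.5, -1.0], 0.3
W = [[0.1, -0.2], [0.4, 0.3]]
a = [0.0, 0.1]
v = [0.7, -0.5]
grad_W, grad_a, grad_v = backprop(x, target, W, a, v)

# Finite-difference check of one hidden-layer weight gradient.
eps = 1e-6
W_plus = [row[:] for row in W]
W_plus[0][1] += eps
W_minus = [row[:] for row in W]
W_minus[0][1] -= eps
numeric = (
    (forward(x, W_plus, a, v)[1] - target) ** 2
    - (forward(x, W_minus, a, v)[1] - target) ** 2
) / (2 * eps)
```

The local error terms (`delta_out`, `delta_hidden`) are what make the computation cheap: each layer reuses the term of the layer after it instead of differentiating the whole network again.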


12/36

Why NN?

Application

Results

Methodology

Geological Setting

Complex Estimation

Data Analysis

Network Learning Algorithm

Network Architecture

Validation

Testing

Developing for new locations


13/36

Geostatistical analysis is distinct from other spatial models in the statistics literature in that it assumes the region of study is continuous

Observations could be taken at any point within the study area

Interpolation at points in between observed locations makes sense

[Figure: surface plot with axes X and Y (0 to 20) and Z (0 to 0.5)]


14/36

Spatial modeling is based on the assumption that observations close in space tend to co-vary more strongly than those far from each other.

Positively co-vary: values tend to be similar. E.g. elevation (or depth) tends to be similar for locations close together.

Negatively co-vary: values tend to be opposite. E.g. density of an organism that is highly spatially clustered, where observations in between clusters are low and values within clusters are high.


15/36

Definition: two variables are said to co-vary if their correlation coefficient is not zero

$$\sigma_{XY} = \operatorname{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \rho \, \sigma_X \sigma_Y$$

where $\rho$ is the correlation coefficient between X and Y and $\sigma_X$ ($\sigma_Y$) is the standard deviation of X (Y)

Consider this in the context of a single variable

E.g. do nearest neighbors have non-zero covariance?
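The nearest-neighbour question on this slide can be made concrete by correlating a series of values with itself shifted by one location. The sketch below assumes a hypothetical transect of equally spaced measurements; the data and function names are illustrative only.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance E[(x - mu_x)(y - mu_y)]."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """rho = cov(x, y) / (sigma_x * sigma_y)."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical values measured at equally spaced locations along a transect.
z = [1.0, 1.2, 1.1, 1.5, 1.9, 2.0, 2.4, 2.3]

# Pair every value with its nearest neighbour (lag 1) and correlate the pairs.
rho_lag1 = correlation(z[:-1], z[1:])
```

A clearly non-zero `rho_lag1` is the single-variable analogue of two variables co-varying: neighbouring locations carry information about each other, which is what kriging exploits.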


16/36

Notation

Z(s) is the random process at location s = (x, y)

z(s) is the observed value of the process at location s = (x, y)

D is the study region

The sample is the set {z(s) : s ∈ D}. We say that it is a partial realization of the random spatial process {Z(s) : s ∈ D}


17/36

$$Z(s) = \mu(s) + W(s) + \eta(s) + \varepsilon(s)$$

$$Z(s) = \mu(s) + \delta(s) + \varepsilon(s)$$

where

$\mu(s)$ is the mean structure, called the large-scale non-spatial trend

$\delta(s) = W(s) + \eta(s)$ is a zero-mean, stationary process with autocorrelation, which combines the smooth small-scale and micro-scale variation

$\varepsilon(s)$ is the random noise term with zero mean and constant variance, which is independent of $W(s)$ and $\eta(s)$


18/36

The theory of regionalised variables leads to an optimal interpolation method, in the sense that the prediction variance is minimized.

This is based on the theory of random functions, and requires certain assumptions.

The result is a Best Linear Unbiased Predictor (BLUP), which satisfies certain criteria for optimality.


19/36

In OK, we model the value of variable z at location s_i as the sum of a regional mean m and a spatially-correlated random component e(s_i):

Z(s_i) = m + e(s_i)

The regional mean m is estimated from the sample, but not as the simple average, because there is spatial dependence. It is implicit in the OK system.


20/36

Predict at points, with unknown mean (which must also be estimated) and no trend

Each point $x_0$ is predicted as the weighted average of the values at all samples

$$Z^*(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i)$$

The weights assigned to each sample point sum to 1

Therefore, the prediction is unbiased

Ordinary: no trend or strata; regional mean must be estimated from sample


21/36

Linear combination of nearest neighbours

[Figure: sample points x1, x2, x3, x4 around the prediction point x0]

Local Means:

$$Z^*(x_0) = \frac{1}{n} \sum_{i=1}^{n} Z(x_i)$$

Inverse Distance Weights:

$$Z^*(x_0) = \frac{\sum_{i=1}^{n} \frac{1}{d_i} Z(x_i)}{\sum_{i=1}^{n} \frac{1}{d_i}}$$

Kriging:

$$Z^*(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i)$$
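The local-mean and inverse-distance estimators named on this slide differ from kriging only in how the weights are chosen. A minimal sketch of the two simpler schemes follows; the coordinates, the sample values and the `power` parameter are hypothetical and serve only as an illustration.

```python
import math


def local_mean(values):
    """Equal weight 1/n for every neighbour."""
    return sum(values) / len(values)


def inverse_distance(target, points, values, power=1.0):
    """Weights proportional to 1 / d_i**power, normalised to sum to 1."""
    raw = []
    for (px, py), value in zip(points, values):
        d = math.hypot(px - target[0], py - target[1])
        if d == 0.0:
            return value  # the target coincides with a sample location
        raw.append(1.0 / d ** power)
    total = sum(raw)
    return sum(w / total * v for w, v in zip(raw, values))


# Hypothetical neighbours of the prediction point x0 = (0, 0).
points = [(0.0, 50.0), (-50.0, 100.0), (-150.0, 0.0), (50.0, -50.0)]
values = [10.0, 14.0, 20.0, 8.0]

z_mean = local_mean(values)
z_idw = inverse_distance((0.0, 0.0), points, values)
```

Both estimators ignore how the samples are arranged relative to each other; kriging replaces these fixed weights with weights derived from the variogram, so clustered samples are not counted twice.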


22/36

[Figure: sample points x1, x2, x3, x4 around the prediction point x0]

1. Variogram analysis

2. Variogram adjustment

3. Semivariogram fitting model

$$\gamma(h) = C_0 + C_1 \left[ \frac{3}{2} \frac{h}{a} - \frac{1}{2} \left( \frac{h}{a} \right)^3 \right] = C_0 + C_1 \, \mathrm{Sph}(h), \quad 0 < h \le a$$

4. Kriging estimator
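The spherical semivariogram model on this slide is simple to evaluate directly. The sketch below is illustrative: the function and argument names are assumptions, and the parameter values C0 = 2, C1 = 20, a = 200 are the ones that appear in the worked example later in the deck.

```python
def spherical_variogram(h, nugget, partial_sill, a):
    """gamma(h) = C0 + C1 * [1.5 (h/a) - 0.5 (h/a)^3] for 0 < h <= a.

    gamma(0) = 0 by definition (the nugget C0 is a jump just above h = 0),
    and gamma(h) = C0 + C1 (the sill) for every h beyond the range a.
    """
    if h == 0.0:
        return 0.0
    if h >= a:
        return nugget + partial_sill
    r = h / a
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


gamma_50 = spherical_variogram(50.0, nugget=2.0, partial_sill=20.0, a=200.0)
cov_50 = (2.0 + 20.0) - gamma_50  # covariance C(h) = (C0 + C1) - gamma(h)
```

Here `cov_50` evaluates to 12.65625, which rounds to the value 12.66 quoted for C01 in the worked example.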


23/36

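Kriging estimator:

$$Z^*(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i)$$

$$\begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} & 1 \\ C_{21} & C_{22} & \cdots & C_{2n} & 1 \\ \vdots & \vdots & & \vdots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} & 1 \\ 1 & 1 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \\ \alpha \end{bmatrix} = \begin{bmatrix} C_{10} \\ C_{20} \\ \vdots \\ C_{n0} \\ 1 \end{bmatrix}$$

Covariance matrix elements:

$$C_{ij} = (C_0 + C_1) - \gamma(h)$$

Substituting the values we find the weights

Variance:

$$\sigma_{ko}^2 = (C_0 + C_1) - \lambda^T k$$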


24/36

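[Figure: sample points x1, x2, x3, x4 around the prediction point x0, on a grid with spacing 50]

$$\begin{bmatrix} C_{11} & C_{12} & C_{13} & C_{14} & 1 \\ C_{21} & C_{22} & C_{23} & C_{24} & 1 \\ C_{31} & C_{32} & C_{33} & C_{34} & 1 \\ C_{41} & C_{42} & C_{43} & C_{44} & 1 \\ 1 & 1 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ \lambda_4 \\ \alpha \end{bmatrix} = \begin{bmatrix} C_{01} \\ C_{02} \\ C_{03} \\ C_{04} \\ 1 \end{bmatrix}$$

Estimator:

$$Z^*(x_0) = \sum_{i=1}^{4} \lambda_i Z(x_i)$$

Matrix elements: $C_{ij} = C_0 + C_1 - \gamma(h)$

Theoretical model: spherical, with $C_0 = 2$, $C_1 = 20$, $a = 200$

$$C_{12} = C_{21} = C_{04} = C_0 + C_1 - \gamma(50\sqrt{2}) = (2 + 20) - \left[ 2 + 20 \left( 1.5 \, \frac{50\sqrt{2}}{200} - 0.5 \, \frac{(50\sqrt{2})^3}{(200)^3} \right) \right] = 9.84$$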


25/36

[Figure: sample points x1, x2, x3, x4 around the prediction point x0, on a grid with spacing 50]

$$C_{14} = C_{41} = C_{02} = (C_0 + C_1) - \gamma\left[ \sqrt{(100)^2 + (50)^2} \right] = 4.98$$

$$C_{13} = C_{31} = (C_0 + C_1) - \gamma\left[ \sqrt{(150)^2 + (50)^2} \right] = 1.23$$

$$C_{23} = C_{32} = (C_0 + C_1) - \gamma\left[ \sqrt{(100)^2 + (100)^2} \right] = 2.33$$

$$C_{24} = C_{42} = (C_0 + C_1) - \gamma\left[ \sqrt{(100)^2 + (150)^2} \right] = 0.29$$

$$C_{34} = C_{43} = (C_0 + C_1) - \gamma\left[ \sqrt{(200)^2 + (50)^2} \right] = 0$$

$$C_{01} = (C_0 + C_1) - \gamma(50) = 12.66$$

$$C_{03} = (C_0 + C_1) - \gamma(150) = 1.72$$

$$C_{11} = C_{22} = C_{33} = C_{44} = (C_0 + C_1) - \gamma(0) = 22$$


26/36

[Figure: sample points x1, x2, x3, x4 around the prediction point x0, on a grid with spacing 50]

Substituting the values $C_{ij}$, we find the following weights:

$\lambda_1 = 0.518$, $\lambda_2 = 0.022$, $\lambda_3 = 0.089$, $\lambda_4 = 0.371$

The estimator is

$$Z^*(x_0) = 0.518\, z(x_1) + 0.022\, z(x_2) + 0.089\, z(x_3) + 0.371\, z(x_4)$$
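The worked example can be reproduced numerically by building the ordinary kriging system and solving it. The sketch below makes two assumptions that are not stated on the slides: the sample coordinates were chosen here so that all pairwise distances match the ones listed in the example (the figure itself did not survive), and a small Gaussian elimination routine is used as the solver. The spherical model parameters C0 = 2, C1 = 20, a = 200 are the ones from the example.

```python
import math

C0, C1, A = 2.0, 20.0, 200.0  # nugget, partial sill, range


def covariance(h):
    """C(h) = (C0 + C1) - gamma(h) for a spherical gamma; C(0) = C0 + C1."""
    if h == 0.0:
        return C0 + C1
    if h >= A:
        return 0.0
    r = h / A
    return C1 * (1.0 - 1.5 * r + 0.5 * r ** 3)


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def ordinary_kriging(samples, target):
    """Return (weights, alpha, kriging variance) for one prediction point."""
    n = len(samples)
    K = [[covariance(distance(p, q)) for q in samples] + [1.0] for p in samples]
    K.append([1.0] * n + [0.0])  # unbiasedness constraint: weights sum to 1
    k = [covariance(distance(p, target)) for p in samples] + [1.0]
    solution = solve(K, k)
    variance = (C0 + C1) - sum(s * b for s, b in zip(solution, k))
    return solution[:n], solution[n], variance


# Assumed coordinates of x1..x4 (reconstructed from the listed distances).
samples = [(0.0, 50.0), (-50.0, 100.0), (-150.0, 0.0), (50.0, -50.0)]
weights, alpha, variance = ordinary_kriging(samples, (0.0, 0.0))
```

Under these assumptions the computed weights agree with the values 0.518, 0.022, 0.089 and 0.371 to within rounding. Note how x2 and x3 receive small weights: they are far from x0 and partly screened by closer samples.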


27/36

The study investigated the hypothesis that non-linearity matters in the spatial mapping of complex patterns of groundwater arsenic contamination.

One ANN and a variogram model were used to represent the spatial structure of arsenic contamination.

With the ANN, the probability of successfully detecting a well as safe or unsafe was found to be at least 15% larger than that by kriging under the country-wide scenario.


28/36

Extensive groundwater contamination by arsenic is observed in many alluvial aquifers of the world today.

Soluble arsenic compounds are generally rapidly absorbed into the body from the gastrointestinal tract.

Studies have shown that twenty years of sustained consumption of contaminated water exceeding 50 µg/l of arsenic can cause internal cancers and affect 10% of all exposed.

Detection of groundwater arsenic contamination can prevent widespread diseases which could otherwise be very costly to treat.


29/36

Spatial mapping of arsenic contamination on the basis of sparse in situ sampling data can be considered one such cost-effective and non-structural method of contamination detection at non-sampled locations.

Conventional methods for spatial mapping of groundwater contamination based on linear geostatistical theory (such as kriging) can, however, have high uncertainty at non-sampled locations.

The objective of this study is to explore the validity of the hypothesis that non-linearity matters in the spatial mapping of complex patterns of groundwater arsenic contamination.


30/36

Arsenic data were obtained from the British Geological Survey (BGS) which, in collaboration with local authorities in Bangladesh, surveyed randomly selected wells from 1998 to 2000.

Measurement of arsenic was taken at a single depth close to the screen for each well, wherein the depths varied from 10 to 300 ft below the surface.

Arsenic measurements of the BGS-DPHE (2001) survey were based on the Atomic Absorption Spectrophotometric (AAS) method, which can be considered a very reliable method for arsenic testing.


31/36

The weights were trained using the back propagation (BP) algorithm

Class (1): predicted concentration is less than 10 parts per billion (ppb, or µg/l)

Class (2): predicted concentration is between 10 and 50 ppb

Class (3): predicted concentration is higher than 50 ppb

Note that 10 and 50 ppb are the safe limits prescribed by the World Health Organization (WHO) and the Bangladesh Government, respectively.
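The three-class scheme can be written as a small helper. This is only a sketch: the function name is hypothetical, and placing concentrations of exactly 10 and 50 ppb in class 2 is an assumption, since the slide does not say how the boundaries are handled.

```python
WHO_LIMIT_PPB = 10.0         # safe limit prescribed by the WHO
BANGLADESH_LIMIT_PPB = 50.0  # safe limit prescribed by the Bangladesh Government


def arsenic_class(concentration_ppb):
    """Map a concentration in ppb to class 1 (< 10), 2 (10 to 50) or 3 (> 50)."""
    if concentration_ppb < WHO_LIMIT_PPB:
        return 1
    if concentration_ppb <= BANGLADESH_LIMIT_PPB:
        return 2  # boundary values are assigned to class 2 (assumption)
    return 3


classes = [arsenic_class(c) for c in (5.0, 10.0, 35.0, 50.0, 75.0)]
```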


32/36

In this study, the Levenberg-Marquardt (LM) algorithm was used for training of the ANN.

This algorithm is a trust-region-based method with a hyperspherical trust region that has proved to be a better solution in searching for the minima.

In order to elicit the essential features of the spatial pattern of arsenic data, and thereby facilitate the modeling equally for each mapping tool, data preprocessing was performed.

In the un-preprocessed format, the spatial nature of arsenic data is known to be highly irregular in the southern and south central regions of Bangladesh.

Data from each well were grouped in 5 × 5 km grids.
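Grouping wells into 5 × 5 km cells can be sketched as follows. The slides do not say how the values inside a cell were combined, so taking the cell mean is an assumption made here, as are the function name, the coordinate convention (kilometres from an arbitrary origin) and the example wells.

```python
from collections import defaultdict


def grid_average(wells, cell_km=5.0):
    """Group wells into square cells and average the arsenic values per cell.

    wells: iterable of (x_km, y_km, arsenic_ppb) tuples.
    Returns a dict mapping (column, row) cell indices to the mean value.
    """
    cells = defaultdict(list)
    for x, y, value in wells:
        cells[(int(x // cell_km), int(y // cell_km))].append(value)
    return {cell: sum(vals) / len(vals) for cell, vals in cells.items()}


# Hypothetical wells: the first two share a cell, the third lies in the next one.
wells = [(1.0, 1.0, 10.0), (4.0, 2.0, 30.0), (6.0, 1.0, 50.0)]
gridded = grid_average(wells)
```

Averaging within cells smooths the highly irregular raw pattern, so both mapping tools start from the same regularised data.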


33/36


34/36

For assessing the accuracy of each method for spatial interpolation of arsenic concentration at non-sampled locations, the following three metrics were used:

1. Probability of successful detection: This is the probability that the predicted class value matches the in-situ class value of a non-sampled well.

2. Probability of false hope: This is the probability that the predicted class value is underestimated significantly, leading to an unsafe well being predicted wrongly as safe for a non-sampled well.

3. Probability of false alarm: This is the probability that the predicted class value is overestimated significantly, leading to a safe well being predicted wrongly as unsafe for a non-sampled well.
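The three metrics can be estimated as simple frequencies over a set of validation wells. The sketch below is illustrative and rests on an assumption: a well is treated as unsafe when its class is at or above `unsafe_from` (class 3 by default), because the slides do not state which class boundary separates safe from unsafe. The example class labels are hypothetical.

```python
def detection_metrics(observed, predicted, unsafe_from=3):
    """Frequencies of successful detection, false hope and false alarm.

    observed, predicted: equal-length sequences of class labels (1, 2 or 3).
    unsafe_from: lowest class regarded as unsafe (assumption, see lead-in).
    """
    pairs = list(zip(observed, predicted))
    n = len(pairs)
    success = sum(o == p for o, p in pairs)
    # Unsafe well predicted as safe.
    false_hope = sum(o >= unsafe_from and p < unsafe_from for o, p in pairs)
    # Safe well predicted as unsafe.
    false_alarm = sum(o < unsafe_from and p >= unsafe_from for o, p in pairs)
    return {
        "success": success / n,
        "false_hope": false_hope / n,
        "false_alarm": false_alarm / n,
    }


# Hypothetical in-situ and predicted classes for five validation wells.
observed = [1, 2, 3, 3, 1]
predicted = [1, 2, 3, 1, 3]
metrics = detection_metrics(observed, predicted)
```

False hope is the costlier error in practice, since it labels contaminated water as drinkable, which is why it is tracked separately from the overall hit rate.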


35/36

We clearly observe that the ANN, by virtue of its ability to generalize the spatial pattern using a highly nonlinear network, shows considerably more accuracy when compared to ordinary kriging subject to the same breadth and constraints in data.

The probability for successful detection is at least 15% higher than that by kriging for the country as a whole.


36/36

The mapping (spatial interpolation) is made relatively easier by a technique.

The study demonstrated that ANNs can also be used to map, with noticeably higher accuracy than kriging, the complex and seemingly erratic spatial pattern of groundwater contamination, provided that reasonable data preprocessing and exploratory data analysis are performed.

The challenge now is to find practical ways to leverage the information gained from chaos analysis towards the robust design of ANN-type mapping schemes that can build upon conventional kriging methods.
