
Environmental and Ecological Statistics 8, 361–377, 2001

GIS and geostatistics: Essential partners

for spatial analysis

P. A. BURROUGH

Utrecht Centre for Environment and Landscape Dynamics (UCEL),

Faculty of Geographical Sciences, Utrecht University, Post Box 80.115, 3508 TC Utrecht,

The Netherlands

E-mail: [email protected]

Received June 1999; Revised May 2001

Initially, geographical information systems (GIS) concentrated on two issues: automated map

making, and facilitating the comparison of data on thematic maps. The first required high quality

graphics, vector data models and powerful data bases; the second is based on grid cells that can be

manipulated by suites of mathematical operators collectively termed ``map algebra''. Both kinds of

GIS are widely available and are taught in many universities and technical colleges. After more than

20 years of development, most standard GIS provide both kinds of functionality and good quality

graphic display, but until recently they have not included the methods of statistics and geostatistics as

tools for spatial analysis.

Recently, standard statistical packages have been linked to GIS for both exploratory data analysis

and statistical analysis and hypothesis testing. Standard statistical packages include methods for the

analysis of random samples of cases or objects that are not necessarily co-located in space; if the

results of statistical analysis display a spatial pattern then that is because the underlying data also share that pattern.

Geostatistics addresses the need to make predictions of sampled attributes (i.e., maps) at

unsampled locations from sparse, often expensive data. To make up for lack of hard data

geostatistics has concentrated on the development of powerful methods based on stochastic theory.

Though there have been recent moves to incorporate ancillary data in geostatistical analyses,

insufficient attention has been paid to using modern methods of data display for the visualization of

results.

GIS can serve geostatistics by aiding geo-registration of data, facilitating spatial exploratory data

analysis, providing a spatial context for interpolation and conditional simulation, as well as

providing easy-to-use and effective tools for data display and visualization. The value of

geostatistics for GIS lies in the provision of reliable interpolation methods with known errors,

methods of upscaling and generalization, and for supplying multiple realizations of spatial patterns

that can be used in environmental modeling. These stochastic methods are improving understanding

of how errors in models of spatial processes accrue from errors in data or incompleteness in the structure of the models.

New developments in GIS, based on ideas taken from map algebra, cellular automata and image

analysis are providing high level programming languages for modeling dynamic processes such as

erosion or the development of alluvial fans and deltas. Research has demonstrated that these models

need stochastic inputs to yield realistic results. Non-stochastic tools such as fuzzy subsets have been

shown to be useful for spatial analysis when probabilistic approaches are inappropriate or

impossible. The conclusion is that in spite of differences in history and approach, the linkage of GIS,

statistics and geostatistics provides a powerful, and complementary suite of tools for spatial analysis

in the agricultural, earth and environmental sciences.

1352-8505 © 2001 Kluwer Academic Publishers



Keywords: geographic information systems, geostatistics, statistical methods, spatial analysis, environmental modeling, map algebra, fuzzy sets


1. Introduction: GIS, statistics and geostatistics

Geographical information systems, in the sense of computer tools for handling spatial data

(Burrough and McDonnell, 1998), have been used since the late 1960s (Coppock and

Rhind, 1991). Their initial development was mainly in North America, stimulated by the

need to map, plan and manage large areas of terrain, but major contributions came also

from Britain and other European countries, and from Japan and Australasia. Initially there

were two different kinds of GIS. The first kind, dominated by cartographers, aimed at

automating the map making process: ultimately this was to replace the paper map by the

much more flexible electronic database. Initially, the essential ingredients of this approach

were geometrical accuracy, and elegant hard copy output. The second approach, pioneered

by the Harvard Laboratory for Computer Graphics, focused on spatial analysis, in

particular the overlaying of different thematic maps so that relations and conflicts in land

use could be resolved. Whereas the first approach was an automated version of the

cartographer's eye, arm and hand, and insisted on full cartographic design standards, the

Harvard approach concentrated on the clever combination of data linked to a gridded

division of space. As the computer output devices of the time were limited to line printers

having a unit cell measuring 1/6 × 1/10 inch, differences in values could only be indicated

by overprinting different alphanumeric characters, so the gridded (or raster) maps were not at all pretty. GIS circa 1980 consisted of two opposing camps, the one with expensive,

beautiful, but essentially dumb products that were the electronic equivalent of paper maps;

the other, a sort of mapping spreadsheet, in which spatial analysis could be carried out with

great mathematical flexibility, but ugly results and huge demands on the then limited

computer memories. Developments in computer technology and the analysis of remotely

sensed images have reinforced the gridded approach for environmental study. Initially,

however, the differences in budgets and apparatus between the remote sensing

professionals and environmental scientists ensured that raster GIS and the classification

and display of remotely sensed images remained separate areas of development.

Technical advances since the 1980s have ensured that the division of GIS practitioners

into two opposing camps has largely disappeared, and the input of gridded maps and

remotely sensed images to GIS has now become standard practice. True, there are still

arguments today as to whether the raster (gridded) or the vector (point, line, polygon)

approach is better, but the discussion now focuses on the correct choice of spatial

paradigm for a given application, and not on the limitations of the approaches per se

(Burrough and McDonnell, 1998). Today, most commercial GIS provide facilities for

working with raster or vector data, either individually, or in combination. They also

provide database facilities for storing, retrieving, modifying the attributes of the spatial

entities that have been recognized for the given application, and many also include their

own internal programming languages which allow the user to treat the spatial data as

inputs to a virtually unlimited range of environmental models (Burrough, 1996).


In brief, GIS are sets of computer tools for the storage, retrieval, analysis and display of spatial data. GIS may also be required to supply data to numerical models of

environmental processes (e.g., air quality, water quality and quantity, plant-soil-

environment responses, etc.) and display the results of these models as cartographically

acceptable screen or hard copy images. By convention, GIS analyses are almost

exclusively deterministic and data are assumed to be exact. Apart from specialists (e.g.,

Heuvelink and Lemmens, 2000) the GIS community has shown little regard for issues of

uncertainty and spatio-temporal variability apart from geometric precision. This is not

because of computational problems, but because market forces have determined that many

GIS applications need not address these issues.

1.1 GIS and statistics

Statistical theory and practice for describing the average properties of samples, and for

hypothesis testing are well known in environmental science. Conventionally, the

geographical location of the individual observations is not taken into account, but if

these methods are used for attributes of spatially located objects then one may be able to

set up and test hypotheses as to whether geographically separate, but eponymous objects

(e.g., instances of soil series, land use classes) really share the same sets of attributes.

Statistical spatial data analysis (SSDA) (Wise et al., 2001) treats the objects in the spatial

data base (points, lines, areas, pixels) as though they and their attributes were samples

from a larger population. As Wise et al. (2001) point out, two main approaches have been

developed: exploratory spatial data analysis (ESDA) and confirmatory spatial data

analysis (CSDA). ESDA is a spatial extension of Tukey's (1977) methods for robust and visual analysis of data: the accent is on descriptive univariate and multivariate statistics

(means, deviations, ranges, correlations, principal components) in which one searches for

outliers or oddities in the value patterns of the spatial objects under consideration. In

CSDA, attention is focused on building empirical regression models and/or the testing of

hypotheses.

Several standard statistical packages (SPSS, S-plus, etc.) include a wide range of

methods for EDA and CDA, though they may not include all the hyper data links

envisaged by the developers of ESDA (e.g., Wise et al., 2001). Nevertheless, today it is

comparatively easy to link a statistical analysis of tabular attribute data to a set of

geographical objects in a GIS like ARC-VIEW, either via a DBase file (e.g., using SPSS)

or embedded links (using S-plus).

As an example of simple descriptive statistical analysis linked to GIS, consider Fig. 1,

which shows a soil map with three soil types and 126 sample locations.

In the study area the soil is usually less than 100 cm thick over bedrock. In a GIS

analysis we might want to test the hypothesis that there is no significant difference in soil

thickness between the three soil types so that the map pattern may be simplified without

loss of information. Visual inspection of the right hand figure suggests that the different

soil types do have different soil thickness, and this is easily confirmed by extracting the

observed thickness data for each site and carrying out an ANOVA analysis for all soil

types. As Table 1 shows, the mean soil thickness per soil type does differ significantly; the

analysis returns an F-value of 22.67 with p < 0.001. A post-hoc Scheffé test suggests that all


three soil types have significantly different means (Table 2) so there is little point in

simplifying the soil map.
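
The F-value quoted above can be approximately reproduced from the summary statistics of Table 1 alone. The following is a minimal Python sketch, not the statistical package procedure used in the study. It assumes that the standard errors in Table 1 are standard errors of the mean (so that each group variance is n × SE²); the function name `anova_from_summary` is hypothetical.

```python
def anova_from_summary(groups):
    """One-way ANOVA F statistic from per-group (n, mean, standard error).

    Assumption: each standard error is a standard error of the mean,
    so the group variance is n * se**2.
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group sum of squares, recovered from the standard errors.
    ss_within = sum((n - 1) * n * se ** 2 for n, _, se in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# (N, mean thickness in cm, standard error) for Ct, Cr and Ia, from Table 1.
table1 = [(36, 51.15, 4.20), (31, 67.02, 4.48), (59, 32.76, 2.79)]
f_value, df_between, df_within = anova_from_summary(table1)
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

With the rounded values of Table 1 this gives an F-value of roughly 22.7 on 2 and 123 degrees of freedom, close to the reported 22.67; the small difference is presumably due to rounding in the table.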

As another example of straightforward statistical analysis using a linked statistics

package, Fig. 2 presents the results of carrying out a multivariate discriminant analysis on

all the 20 attributes of the soil collected at each of the 126 sample sites. This clearly shows

that though the centroids of the three soil types clearly differ in multivariate space, there is

considerable overlap.

1.2 GIS and geostatistics

As noted, the standard GIS approach to recording and analyzing the attributes of

pre-defined objects implies no spatial variation within an object, and all change occurs at

object boundaries. In many applications (hydrology, oceanography, earth sciences, soil

Figure 1. Left: Soil profile classes at sample sites (dot is unit Cr, small circle with dot is unit Ct and

large circle with dot is unit Ia). Right: Soil thickness at sample sites (dot is 0–40 cm, small flag is 40–80 cm,

and large flag is > 80 cm).

Table 1. Descriptive statistics of soil thickness (cm) for each soil type.

Soil Type      N     Mean    Std. Error
Ct            36    51.15       4.20
Cr            31    67.02       4.48
Ia            59    32.76       2.79
Total        126    46.45       2.42


science to name but a few), this approach is not always sensible and it is better to consider

the variation of the attribute in terms of a continuous, but noisy surface. This surface is

often constructed by interpolation from sets of point data. Though there are many methods

for interpolation (see Burrough and McDonnell, 1998), most of these treat the data as if

they can be modeled by a smooth, differentiable surface and no attention is paid to the

uncertainty of the results. The methods of geostatistics (Matheron, 1965; Journel, 1996;

Goovaerts, 1997) use the stochastic theory of spatial correlation both for interpolation

and for apportioning uncertainty.

Although still unfamiliar to many GIS users, in terms of technical development, the

Table 2. Post hoc Scheffé test indicates that all three soil types have significantly different thicknesses.

                        Subset for alpha = 0.05
Soil Type      N        1          2          3
3 (Ia)        59     32.7617
1 (Ct)        36                51.1528
2 (Cr)        31                           67.0232
Sig.                  1.000      1.000      1.000

Figure 2. Plot of discriminant functions for all 126 soil observations compared with map classes.




methods of geostatistics are of similar age to GIS, but have different roots. Whereas GIS was seen as a way to automate the creation of exact, deterministic models of the world in a

dominantly cartographic context, geostatistics is about making predictions under

conditions of uncertainty and limited information. The path of geostatistics from its

founders Krige and Matheron in the 1960s and 1970s to present day exponents such as

Journel, Goovaerts and others emphasizes the role of chance in spatial prediction. Where

GIS ignores statistical variation, geostatistics uses the understanding of statistical variation

as an important source of information for improving predictions of an attribute at

unsampled points, given a limited set of measurements. Geostatistics is therefore a very

useful ``add on'' or extension to the GIS toolkit for spatial analysis.

A central aspect of geostatistics is the use of spatial autocovariance structures, often

represented by the (semi)variogram, or its cousin the autocovariogram, which differentiate

different kinds of spatial variation. The semivariance indicates the degree of similarity of

values of a regionalized variable $Z$ over a given sample spacing or lag, $h$. Semivariograms

(Fig. 3) are graphs of the semivariance $\gamma(h)$ against sample spacing or lag, $h$: they are defined as:

$$\gamma(h) = \frac{1}{2}\,\mathrm{Var}\{Z(x_i) - Z(x_i + h)\} \tag{1}$$

and estimated from sampled data by:

$$\gamma(h) = \frac{1}{2n} \sum_{i=1}^{n} \{z(x_i) - z(x_i + h)\}^2 \tag{2}$$

where $n$ is the number of pairs of samples, and $z(x_i)$, $z(x_i + h)$ are measurements separated by a

distance $h$.

In practice, $\gamma(h)$ is estimated from sets of point samples which can be extracted from the

GIS data base. Because experimentally derived semivariances do not always follow a

smooth increase with sample spacing, a theoretical variogram model is fitted to the data

(Burrough and McDonnell, 1998; Deutsch and Journel, 1998; Goovaerts, 1997). The

interpolation weights for predicting the value of attribute z at unsampled locations x are

derived with the help of this fitted model and the method is known as ordinary point

kriging (OPK) after its first exponent. Predictions can also be computed for units of land

(blocks) larger than those sampled, thereby smoothing out local variations; this is known

as block kriging. Much practical geostatistics is concerned with the estimation and fitting

of variograms to experimental data (Pannatier, 1996) followed by interpolation or

conditional simulation of gridded surfaces (Pebesma and Wesseling, 1998). Besides

interpolation, kriging provides information on interpolation errors. Knowledge of the

spatial correlation structures may also be used to generate sets of equiprobable realizations

(simulations) of the attribute z that can be of great value for studying error propagation

through spatial models that may be linked to the GIS.
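
The estimator of Equation 2 can be sketched in a few lines of Python. This is an illustration only, not the implementation of any of the packages cited above; the function name `empirical_semivariogram`, the rule of assigning each pair to its nearest lag class, and the transect data are all assumptions made for the example.

```python
import math


def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Estimate the semivariance for lag classes 1..n_lags (units of lag_width).

    Returns a list of (mean pair distance, semivariance, number of pairs),
    one entry per lag class that contains at least one pair.
    """
    sums = [0.0] * (n_lags + 1)    # sum of squared differences per lag class
    dists = [0.0] * (n_lags + 1)   # sum of pair distances per lag class
    counts = [0] * (n_lags + 1)    # number of pairs per lag class
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            k = int(d / lag_width + 0.5)   # nearest lag class (assumed binning rule)
            if 1 <= k <= n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                dists[k] += d
                counts[k] += 1
    result = []
    for k in range(1, n_lags + 1):
        if counts[k] > 0:
            result.append((dists[k] / counts[k], sums[k] / (2 * counts[k]), counts[k]))
    return result


# Hypothetical transect: five points 1 m apart with a linear trend in z.
coords = [(float(x), 0.0) for x in range(5)]
values = [0.0, 1.0, 2.0, 3.0, 4.0]
variogram = empirical_semivariogram(coords, values, lag_width=1.0, n_lags=3)
for lag, gamma, n_pairs in variogram:
    print(f"lag {lag:.1f}: gamma = {gamma:.2f} from {n_pairs} pairs")
```

The number of pairs returned for each lag corresponds to the counts annotated on a plot such as Fig. 3; lags supported by few pairs are usually given less weight when a model is fitted.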

For many users of GIS, kriging is no more than an alternative method of interpolation

(see Burrough and McDonnell, 1998 for references). Indeed, many statisticians and

geographers use other methods for statistical spatial analysis (cf. Bailey and Gatrell, 1995;

Cressie, 1991). The general lack of appreciation of geostatistics by the GIS community

during the seminal years from the mid-1970s to the mid-1990s was due to many factors,

including the publication of Matheron's original treatise in French (Matheron, 1965),


which is therefore inaccessible to most native English speakers. Until the mid-1990s, the

high prices charged for geostatistics software packages and their almost exclusive use by

mining corporations made it difficult to teach geostatistics in many universities. Of course,

a contributing factor to the lack of interest in geostatistics by the GIS practitioner is its grounding in mathematical statistics which clearly baffles those of us who have little

feeling for the statistical treatment of sampling, variance analysis and correlation and

regression.

2. The mutual benefits of linking GIS, statistics and geostatistics

In this Section I present some examples of the ways in which GIS, statistics and

geostatistics complement each other in spatial analysis.

2.1 The value of GIS for geostatistics

Besides acting as a spatial database, GIS provides several benefits to statisticians and

geostatisticians that are largely concerned with the correct geometric registration of

sample data, prior data analysis, the linking of hard and soft data, and the presentation of

results.

Geo-registration. As with all spatial data, spatial analysis must be carried out on data

that have been collected with reference to a properly defined coordinate system. GIS can

Figure 3. Example of a semivariogram fitted to experimental data. The numbers indicate the

numbers of pairs of points used at each lag.




provide the means to register the locations of samples directly (via GPS or other methods), or to convert local coordinates to standard coordinates. The use of standard coordinates

ensures that data collected at different times can be properly combined and overlaid on

conventional maps. The use of standard coordinate systems is particularly important when

international databases are created from different sources, such as occurs in Europe, for

example.

Exploratory spatial data analysis. As already noted, ESDA is a useful toolkit for

examining data prior to analysis. For geostatisticians, the presence and location of spatial

outliers, or other irregularities in the data may have important consequences for the fitting

of variograms, or for determining whether data should be transformed to logarithms. GIS

often provide search engines that can be linked to statistical packages to determine

whether any given data set contains anomalies or unexpected structure. The underlying

reasons for such anomalies may sometimes be easily seen when these data are displayed

on a map together with other information. Not all users of ESDA in GIS use conventional

geostatistics, however, and other measures of spatial autocorrelation such as Moran's I

statistic are often used (Pereira et al., 1998).
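
For readers unfamiliar with it, Moran's I can be computed directly from attribute values and a matrix of spatial weights. The sketch below uses the standard textbook form of the statistic; the text above only names the measure, so the binary contiguity weights, the chain layout of sites and the function names are assumptions made for illustration.

```python
def morans_i(values, weights):
    """Moran's I for values[i] with spatial weights[i][j] (zero diagonal)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations for every ordered pair of sites.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / w_total) * cross / sum(d * d for d in dev)


def chain_weights(n):
    """Binary contiguity weights for n sites along a line (hypothetical layout)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


w = chain_weights(4)
trend = morans_i([1.0, 2.0, 3.0, 4.0], w)            # neighbours alike: positive I
alternating = morans_i([1.0, -1.0, 1.0, -1.0], w)    # neighbours opposite: negative I
print(trend, alternating)
```

Positive values indicate that neighbouring sites tend to carry similar values, negative values that they tend to differ; unlike the variogram, the result depends on how the weights are defined.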

Spatial context and the use of external information. Increasingly, the suite of

geostatistical methods currently available allow the user to incorporate external

information that can be used to modify, and possibly improve, the predictions or

simulations required. Geostatisticians term the external information ``secondary'',

because they believe that the ``hard data'' measured at the sample locations is most

important. But GIS practitioners might prefer to call the ``primary data'' that which

separates a landscape into its main components (different soils, or rock types, or land

cover classes), regarding the sampled data as merely filling in the details that were not

apparent at the smaller map scale. In any case, GIS makes it possible to incorporate data

from other aspects of the environment with the geostatistical study of autocorrelation structures, so that differentiated knowledge of different patterns of variation can be used to

best effect. For example, in the c. 5 × 2 km study area used in Principles of Geographical

Information Systems (Burrough and McDonnell, 1998) the distribution of heavy metals

(zinc) in the top soils of the river alluvium was clearly influenced by flooding regime,

which in turn is affected by factors such as distance from the river and the relative

elevation of the floodplain. Fig. 4 shows how the extra information may be used in several

ways. Stratified kriging involves dividing the original set of 155 soil samples into classes

based on flooding frequency (a simple ``point-in-polygon'' search in GIS) to yield three

strata. Variograms were estimated for each stratum and these were interpolated to yield a

single map (Fig. 4b). In a second approach, a multiple regression model was computed

from the triplets of zinc level, elevation and distance to river measured at all data points

(Fig. 4c). A third approach, known as ``Universal kriging'', directly incorporates the trend

in the estimation of the interpolation weights (Fig. 4d), and Fig. 4e illustrates how both stratification

and trends may be combined.
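
The ``point-in-polygon'' search used to stratify the samples can be sketched as follows. This is a generic ray-casting test, not the routine of any particular GIS; the rectangular strata, their names and the sample values are hypothetical and only stand in for mapped flooding-frequency classes.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def stratify(samples, strata):
    """Group (x, y, value) samples by the first stratum polygon that contains them."""
    groups = {name: [] for name in strata}
    for x, y, value in samples:
        for name, polygon in strata.items():
            if point_in_polygon(x, y, polygon):
                groups[name].append((x, y, value))
                break
    return groups


# Hypothetical flooding-frequency strata (rectangles) and zinc samples.
strata = {
    "frequent": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "rare": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
samples = [(2.0, 3.0, 640.0), (15.0, 5.0, 210.0), (8.0, 9.0, 580.0)]
groups = stratify(samples, strata)
print({name: len(members) for name, members in groups.items()})
```

Each group can then be passed separately to variogram estimation and kriging, which is the essence of the stratified approach described above.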

The results clearly show the differences in the patterns obtained with and without the

ancillary data. The single, or combined incorporation of external information through

stratification and strata-specific trends yielded maps with good levels of prediction and a

spatial resolution that was better than could have been obtained from ordinary point

kriging alone. Other examples are given in Goovaerts (1997, 1999).

Display and visualization: 2D, 3D, plus time. Who is the recipient of a geostatistical

interpolation? If a geostatistician, or statistician, then simple maps and tables of numbers


may suffice, but environmental managers need to see how the results relate to other aspects

of the terrain. Today it is easy to import the results of a kriging interpolation into a GIS and

display the results in conjunction with a scanned topographic map, or display them in 3D

over a digital elevation model (DEM) of the landscape from which the samples were taken

(Fig. 5). Such presentation invites visual interpretation, the re-evaluation of results and the

discovery of more information, and therefore is an essential part of the spatial analysis

process.

Figure 4. Results of interpolating the ln(Zinc) levels of topsoils (0–10 cm) in a frequently flooded

part of the Maas floodplain, Limburg, NL. a: ordinary point kriging, b: OPK within different flooding

strata, c: using a regression model based on elevation and distance from the river, d: universal kriging

with a single trend, e: universal kriging with stratification and different trends for each stratum.




2.2 The value of geostatistics for GIS

Besides providing powerful means of interpolating point data to areas, there are many

useful ways in which statistics and geostatistics can bring major improvements to the

understanding of uncertainty and error in GIS-based spatial analyses. This is particularly

so for most kinds of GIS-based environmental modeling where a priori we are dealing

with incomplete data and uncertainty. Indeed, to pretend, as the standard GIS paradigms

do, that all data are exactly known, and exactly located, is not to recognize reality.

Geostatistics provides at least the following attractive options for environmental GIS

and environmental decision support systems: interpolation from point data and estimates

of error bounds, estimates of error propagation and uncertainty ranges for spatial and

temporal modeling, and data reduction and generalization.

Interpolation errors. Although surfaces interpolated by kriging are smooth, all forms of

kriging yield estimates of the estimation uncertainty or kriging error. Such values can be

mapped to provide error surfaces which can be combined with other information. Kriging

errors depend on the form of the variogram and the disposition of observations: the more

Figure 5. 3-Dimensional display of interpolation results obtained from stratified kriging on a digital elevation model with shading and transparency floated above a scanned topographic map. Dark gray

zones indicate heavy metal concentrations.


data surrounding an unsampled location, and the stronger the autocorrelation structure, the lower the estimation variance.
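
The dependence of the kriging error on the variogram and on the data configuration can be illustrated with a small ordinary point kriging sketch in pure Python. It is a teaching example under stated assumptions, not the code of any package cited here: the spherical variogram model and its parameters (nugget 0, sill 1, range 30), the four sample locations and their values are all hypothetical.

```python
import math


def spherical(h, nugget, sill, rng):
    """Spherical variogram model (an assumed model); gamma(0) is 0 by definition."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(coords, values, target, gamma):
    """Return (prediction, kriging variance, weights) at the target location."""
    n = len(coords)
    # Kriging system: semivariances between data points plus the
    # unbiasedness constraint (weights sum to 1) via a Lagrange multiplier.
    a = [[gamma(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    g0 = [gamma(math.dist(c, target)) for c in coords]
    sol = solve(a, g0 + [1.0])
    weights, mu = sol[:n], sol[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, g0)) + mu
    return prediction, variance, weights


coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
values = [1.0, 2.0, 3.0, 4.0]
gamma = lambda h: spherical(h, nugget=0.0, sill=1.0, rng=30.0)
pred_4, var_4, w_4 = ordinary_kriging(coords, values, (5.0, 5.0), gamma)
pred_2, var_2, w_2 = ordinary_kriging(coords[:2], values[:2], (5.0, 5.0), gamma)
print(pred_4, var_4, var_2)
```

In this configuration the kriging variance obtained from all four surrounding points is smaller than that obtained from two of them, and it does not depend on the measured values themselves, only on the variogram and the positions of the data.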

Error propagation in spatial models. When data from interpolated surfaces are used as

inputs to numerical models, the error surfaces associated with kriging interpolation may be

used to understand the propagation of errors through spatial models. Heuvelink (1998) gives

both theory and examples of using Taylor series expansion on interpolated data to compute

error propagation through cartographic models; see also Burrough and McDonnell (1998).

An increasingly popular alternative to the Taylor expansion method is to use methods of

conditional simulation (Pebesma and Wesseling, 1998) to provide sets of multiple

realizations of data surfaces for inputs to numerical models like the 3D groundwater model

``MODFLOW'', so that error propagation and model sensitivity can be followed using

Monte Carlo methods (e.g., Bierkens, 1994; Gomez-Hernandez and Journel, 1992).
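
The two routes to error propagation mentioned above can be compared on a toy model. The sketch below is a simplification: it treats a single location with independent, normally distributed inputs, whereas conditional simulation generates whole surfaces that preserve spatial correlation. The model `load`, its inputs and their standard deviations are hypothetical.

```python
import math
import random


def taylor_sd(model, means, sds, step=1e-4):
    """First-order Taylor estimate of the output sd for independent inputs."""
    variance = 0.0
    for i, sd in enumerate(sds):
        up = list(means)
        down = list(means)
        up[i] += step
        down[i] -= step
        slope = (model(*up) - model(*down)) / (2 * step)   # central difference
        variance += (slope * sd) ** 2
    return math.sqrt(variance)


def monte_carlo_sd(model, means, sds, n_runs, seed=0):
    """Monte Carlo estimate of output mean and sd for independent normal inputs."""
    rng = random.Random(seed)
    outputs = [model(*[rng.gauss(m, s) for m, s in zip(means, sds)])
               for _ in range(n_runs)]
    mean = sum(outputs) / n_runs
    var = sum((y - mean) ** 2 for y in outputs) / (n_runs - 1)
    return mean, math.sqrt(var)


def load(concentration, depth):
    """Hypothetical model: a load computed as concentration times depth."""
    return concentration * depth


means, sds = (10.0, 5.0), (1.0, 0.5)   # e.g. kriging predictions and kriging errors
sd_taylor = taylor_sd(load, means, sds)
mc_mean, sd_mc = monte_carlo_sd(load, means, sds, n_runs=20000)
print(sd_taylor, mc_mean, sd_mc)
```

For a nearly linear model such as this one the two estimates agree closely; the Monte Carlo route becomes more attractive as models become strongly non-linear or too complex to differentiate.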

Monte Carlo techniques using conditional simulation may also be useful for comparing data collected at different times and locations within the same area. Recent work on the

redistribution of ¹³⁷Cs fallout from the Chernobyl nuclear disaster in 1986 has shown that

the normal decay of radiocaesium levels and uptake rates in cow's milk can be temporally

reversed if the cows are grazing on recently flooded, poorly drained peat soils (Burrough

and McDonnell, 1998; Burrough et al., 1999a). The data for these studies consisted of

radionuclide determinations made on bulked soil samples taken in 1988 and 1993.

Unfortunately, the samples were collected at different sites in the two years, so it was

difficult to use the raw data to test the hypothesis that the flood events had really enhanced

radiocaesium levels near the rivers. However, by computing the variograms for the data

sets from both years and using these to compute sets of conditional simulations of the

normalized differences of radiocaesium in the topsoil between the two sampling times and

at all sampled sites, it was possible to establish a clear relation between the incidence of

flooding and flood-induced enhancement of radiocaesium which could enter the food

chain (Burrough et al., 1999a). Fig. 6 shows clearly that although there seem to be

systematic differences between the two years (mean values for 1993 exceed those for 1988

by 0.5–1.0 standard errors), sites within 1.5 km of a flooding river are not only more

variable, but many have higher levels of radiocaesium.

Data reduction and spatial generalization. In some applications there may be too much

data, which may need to be reduced to manageable proportions or common coordinates.

An example is the need to compare the yields of different crops over several years on the

same plot when yields have been recorded using data loggers and GPS. For example,

Burrough and Swindell (1997) report the collection of annual yield data for three

successive crops on a 5 ha field at the experimental farm of the Royal College of

Agriculture, Cirencester, UK. Data were collected on wheat, barley and oilseed rape in

successive years by a combine harvester fitted with a data logger whose location was

pinpointed by locally referenced GPS. The spatial resolution of the sample was

approximately 4 m (the width of the harvester) × 2.5 m (along the cut), and each survey

yielded some 2000 samples or more.

Because of locational noise in the GPS and errors in the amount of crop cut each 2.5 m

by the harvester, it was not possible to relate the yields of the three crops directly to

location in the field nor to investigate links between crop yields and soil conditions. To

generalize and smooth the data, for each year an isotropic variogram was computed: the

data were then interpolated to a common grid of 2.5 m resolution using block kriging with




units of 25 × 25 m. Each annual map was normalized to give a map showing relative

yield; these three maps were then combined to give a three year, normalized average.

Comparison of the normalized average yield map with a computer enhanced, scanned

aerial image of the site (Fig. 7) demonstrates clear relations between site conditions and

normalized crop yields that otherwise were not apparent.
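
The normalization and averaging step can be sketched with plain nested lists standing in for the kriged grids. The text does not state how the maps were normalized; the sketch assumes division by the field mean, so that 1.0 marks an average cell, and the 2 × 2 grids of yields are invented for illustration.

```python
def relative_yield(grid):
    """Divide every cell by the field mean (assumed normalization)."""
    cells = [v for row in grid for v in row]
    mean = sum(cells) / len(cells)
    return [[v / mean for v in row] for row in grid]


def average_maps(grids):
    """Cell-by-cell mean of several grids that share the same dimensions."""
    n = len(grids)
    rows, cols = len(grids[0]), len(grids[0][0])
    return [[sum(g[r][c] for g in grids) / n for c in range(cols)]
            for r in range(rows)]


# Hypothetical interpolated yields (t/ha) for three crops on the same grid.
wheat = [[6.0, 8.0], [10.0, 8.0]]
barley = [[4.5, 6.0], [7.5, 6.0]]
rape = [[3.0, 3.0], [3.0, 3.0]]
average = average_maps([relative_yield(g) for g in (wheat, barley, rape)])
print(average)
```

Normalizing before averaging prevents the crop with the largest absolute yields from dominating the combined map.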

Figure 6. Plots of conditional simulations for the 1988–1993 normalized differences of ¹³⁷Cs at data

points, with distance to rivers that flood.

Figure 7. Comparison between aerial photo image of field A and displayed on its right, the average,

standardized crop yields as interpolated using block kriging.


Geostatistics and remote sensing. The application of geostatistical methods in the analysis of remotely sensed images is a topic in itself. Here I refer the reader to the recent

issue of Photogrammetric Engineering and Remote Sensing (January, 1999) for a recent

compilation of research. Remote Sensing applications of geostatistics have less to do with

interpolation from sparse data (the images are complete unless masked by cloud cover, in

which case interpolation could be used to fill in the gaps) than with the description and analysis

of gridded, stochastic surfaces and the simulation of multiscale data sets.

3. Stochastic inputs to the modeling of spatial processes

As already indicated, geostatistical methods of conditional simulation are useful for

following the propagation of errors through spatial models that may be linked to, or run from GIS. Recent research in the modeling of dynamic spatial processes (van Deursen,

1995; Takeyama and Couclelis, 1997; Wesseling et al., 1996) indicates the value of

including an understanding of errors and roughness in many models of dynamic spatial

processes, particularly when processes are non-linear.

Stability of the topology of drainage nets. The automatic derivation of surface topology

from a gridded digital elevation model is now a standard operation in GIS that are used for

hydrological projects (Fig. 8a). The usual procedure is to use thin plate splines to

interpolate a DEM (digital elevation model) from digitized contours to a fine grid so that

the resulting topological net is free from discontinuities (Mitasova and Hofierka, 1993).

Unfortunately, although smooth interpolators guarantee continuity in surface topology,

they also constrain the topology to a single set of drainage lines, which may result in serious artefacts in hydrological derivatives such as wetness indices (see Burrough and

McDonnell, 1998 for definitions). Simple methods, such as the D8 algorithm, for deriving

drainage nets from gridded surfaces, produce a unique solution in which the main stream

line is only one cell wide (e.g., Fig. 8a). Large differences in the size of the upstream

contributing catchment area between a cell on the main drainage line and its off-line

neighbor may arise. This is counter-intuitive, because we expect cells close to each other

to have similar conditions and contributing areas, especially in the bottoms of valleys. A

Figure 8. a: Single realization of a drainage network derived from a smooth DEM; b: average image

computed from 100 realizations derived from the initial DEM plus 10 cm root mean square (RMS)

error.




better idea of surface water drainage may be obtained by considering the average properties of a suite of possible drainage nets that are obtained when surface roughness is

added to the DEM. The roughness can easily be modeled by a small Gaussian noise which

is added to each cell (a standard deviation equal to 0.1% of the maximum relief difference

in the area is enough as a first approximation); the result yields one possible realization of

the net. Repeating the procedure 100–1000 times with different random values for

roughness creates an average probability density map of the cumulative contributing area

(Fig. 8b) which appears to be more realistic than the single deterministic solution. Note

that one cannot compute Fig. 8b by passing a moving window smoothing function over

Fig. 8a.
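The averaging procedure can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the software used for Fig. 8: the function names, the synthetic V-shaped test surface and the grid size are hypothetical, the D8 routine is a minimal version without pit filling (flow simply stops at local depressions), and only the noise level of 0.1% of the relief is taken from the text.

```python
import numpy as np

# Row/column offsets of the eight neighbours used by D8.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_contributing_area(dem):
    """Contributing area per cell (in cells, counting the cell itself)."""
    rows, cols = dem.shape
    area = np.ones((rows, cols))
    # Process cells from highest to lowest, so that each cell has collected
    # all of its upstream area before passing it on.
    for flat in np.argsort(dem, axis=None)[::-1]:
        r, c = divmod(int(flat), cols)
        best_slope, target = 0.0, None
        for dr, dc in NEIGHBOURS:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                slope = (dem[r, c] - dem[rr, cc]) / np.hypot(dr, dc)
                if slope > best_slope:
                    best_slope, target = slope, (rr, cc)
        if target is not None:  # cells with no lower neighbour pass nothing on
            area[target] += area[r, c]
    return area


def mean_contributing_area(dem, n_realizations=100, noise_fraction=0.001, seed=0):
    """Average D8 contributing area over noisy realizations of `dem`."""
    rng = np.random.default_rng(seed)
    sigma = noise_fraction * (dem.max() - dem.min())  # 0.1% of relief by default
    total = np.zeros(dem.shape)
    for _ in range(n_realizations):
        # Spatially uncorrelated Gaussian roughness, redrawn for every realization.
        noisy = dem + rng.normal(0.0, sigma, size=dem.shape)
        total += d8_contributing_area(noisy)
    return total / n_realizations


# Hypothetical test surface: a V-shaped valley draining towards the last row.
rows, cols = 20, 21
row_idx, col_idx = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
dem = 0.02 * (rows - 1 - row_idx) + 0.01 * np.abs(col_idx - cols // 2)

single = d8_contributing_area(dem)                       # one deterministic net
average = mean_contributing_area(dem, n_realizations=100)  # mean over 100 nets
```

On a steep synthetic surface such as this one the added noise may hardly move the flow paths; the spreading described in the text is to be expected mainly where local gradients are small, as on valley floors. The `noise_fraction` argument is exposed so that this sensitivity can be explored.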

The effects of small errors on the derived flow paths may be effectively demonstrated by displaying the whole set as a movie, in which the amplitudes and locations of the swings of drainage paths resulting from the minor errors become very apparent. Though this example uses spatially uncorrelated noise for each realization of the DEM surface, one could of course examine the effects of spatially correlated noise on the model by first creating a set of conditional simulations based on a known or assumed variogram. Repeating the analysis for multiple realizations and displaying these using dynamic visualization enhances understanding of the results.
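As an indication of how spatially correlated noise might be generated, the following Python sketch draws Gaussian fields whose covariance follows an exponential variogram model, using a Cholesky factorization of the covariance matrix. Several points are assumptions made for illustration: the function name `correlated_noise`, the exponential model, the range of 3 cells and the sill of (0.1 m)^2 (chosen to echo the 10 cm RMS error of Fig. 8). The simulation is also unconditional, that is, it is not constrained to honor observations; conditional simulation as mentioned in the text requires an additional kriging step, for which dedicated geostatistical software is normally used.

```python
import numpy as np


def correlated_noise(shape, sill, corr_range, n_fields=1, seed=0):
    """Unconditional Gaussian fields with covariance sill * exp(-h / corr_range)."""
    rows, cols = shape
    rr, cc = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    coords = np.column_stack([rr.ravel(), cc.ravel()]).astype(float)
    # Pairwise distances h between all cell centres, in cell units.
    diff = coords[:, None, :] - coords[None, :, :]
    h = np.sqrt((diff ** 2).sum(axis=-1))
    # Covariance implied by an exponential variogram without nugget.
    cov = sill * np.exp(-h / corr_range)
    # A small diagonal jitter keeps the factorization numerically stable.
    chol = np.linalg.cholesky(cov + 1e-10 * sill * np.eye(rows * cols))
    rng = np.random.default_rng(seed)
    white = rng.standard_normal((rows * cols, n_fields))
    # Colouring white noise with the Cholesky factor imposes the covariance.
    return (chol @ white).T.reshape(n_fields, rows, cols)


# 200 realizations on a 10 x 10 grid; each can be added to a DEM of that shape.
fields = correlated_noise((10, 10), sill=0.1 ** 2, corr_range=3.0, n_fields=200)
```

The dense covariance matrix limits this approach to small grids; it is meant only to make the idea concrete.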

Adding stochasticity to make a deterministic process model work properly. In certain situations it appears to be necessary to add roughness to a surface so that a well-known deterministic process can be modeled effectively; this is illustrated using the example of the creation of an alluvial fan. If a hillside is modeled as a smooth inclined plane, then the topology consists merely of a set of parallel lines that run from top to bottom, much like the way rain falling on the windscreen of a stationary car runs off in parallel streams. These streams can be ``forced'' to merge if the initial surface is roughened (e.g., Liverpool and Edwards, 1995). In the case of the alluvial fan, each ``event'' by which material falls down the slope and is added to the fan modifies the surface roughness in a way that is very difficult to predict, but which must not be ignored. The initial roughness is thus modified by feedback from the sedimentation process, so that for each cycle there is a new surface for the flow and deposition. If the deposits are sufficiently large, the surface topology changes with each cycle.
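The contrast between the smooth and the roughened plane can be made concrete with a small Python experiment. This is a sketch under assumptions, not the fan model itself: the function names, the 30 x 30 grid, the unit slope and the noise level of 0.2 elevation units are arbitrary illustrative choices, and the feedback from sedimentation is not modeled. Flow lines are started from every cell of the top row and followed along the steepest D8 descent; the number of distinct end points indicates how far the lines have merged.

```python
import numpy as np


def steepest_receiver(dem, r, c):
    """D8 downslope neighbour of cell (r, c), or None if no neighbour is lower."""
    rows, cols = dem.shape
    best_slope, target = 0.0, None
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                slope = (dem[r, c] - dem[rr, cc]) / np.hypot(dr, dc)
                if slope > best_slope:
                    best_slope, target = slope, (rr, cc)
    return target


def flow_end_point(dem, start):
    """Follow the steepest descent from `start` until no lower neighbour exists."""
    cell = start
    while True:
        nxt = steepest_receiver(dem, *cell)
        if nxt is None:
            return cell
        cell = nxt


rows, cols = 30, 30
# Smooth inclined plane: elevation falls by one unit per row.
plane = np.repeat(np.arange(rows - 1, -1, -1, dtype=float)[:, None], cols, axis=1)
# Roughened plane; the noise level is an arbitrary illustrative choice.
rng = np.random.default_rng(1)
rough = plane + rng.normal(0.0, 0.2, size=plane.shape)

ends_plane = {flow_end_point(plane, (0, c)) for c in range(cols)}
ends_rough = {flow_end_point(rough, (0, c)) for c in range(cols)}
print(len(ends_plane), "end points on the smooth plane")
print(len(ends_rough), "end points on the roughened plane")
```

On the smooth plane every flow line stays in its own column, so there are as many end points as starting cells. On the roughened plane the count can only be equal or smaller, and one would typically expect it to be smaller because lines that meet cannot separate again.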

The need for initial roughness which is modified but maintained during the development of the delta is a nice example of how a better understanding of the physical process may arise by linking geostatistics with interactive dynamic modeling. Ongoing research in Utrecht and elsewhere is beginning to demonstrate the value of conditional simulation in dynamic, as well as static, models of landscape change (see Karssenberg et al., 2001).

4. Non-stochastic tools for analyzing uncertainty in spatial data: fuzzy subsets

In many situations we know there is uncertainty, but we neither know nor can construct the probability distributions. We may also be uncertain how to define the geographical objects in the data base (Burrough and Frank, 1996). The development of fuzzy subsets in environmental science is increasingly being seen not as a replacement for statistics and geostatistics, but as a complementary suite of methods for operating in uncertain conditions. The main uses of fuzzy subsets in GIS are for the selection and retrieval of data under conditions of uncertainty (e.g., Burrough and McDonnell, 1998; Canters, 1997), and in creating multivariate classes that overlap (fuzzy k-means) (Burrough et al., 1999b). Data retrieval using fuzzy subsets has been demonstrated to be less error prone than conventional Boolean SQL methods (Heuvelink and Burrough, 1993). Fuzzy memberships can be interpolated using kriging (de Gruijter et al., 1997; Burrough and McDonnell, 1998) and the application of fuzzy k-means to derivatives of digital elevation models provides convincing and objective methods for classifying terrain (Burrough et al., 2000, 2001). Fuzzy subsets can also be used to address issues of the crispness of spatial boundaries (e.g., Lagacherie et al., 1996) or the intervisibility across 3D surfaces (Fisher, 1995). Fuzzy subsets may also be used to define sensible ways to select point data for kriging.
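The difference between Boolean and fuzzy retrieval can be illustrated with a short Python sketch. Everything in it is assumed for the purpose of illustration: the function name `fuzzy_at_most`, the bell-shaped membership function (one common form among several), the use of the minimum as the fuzzy intersection, and the attribute values, thresholds and transition widths, which are invented and not taken from any data set.

```python
import numpy as np


def fuzzy_at_most(z, threshold, width):
    """Membership 1 at or below `threshold`, falling to 0.5 at threshold + width."""
    z = np.asarray(z, dtype=float)
    excess = np.clip(z - threshold, 0.0, None)  # how far the limit is exceeded
    return 1.0 / (1.0 + (excess / width) ** 2)


# Invented attribute values for four hypothetical sites.
slope = np.array([4.0, 9.9, 10.1, 25.0])  # percent
depth = np.array([0.5, 1.1, 0.9, 0.4])    # metres

# Boolean retrieval: both crisp conditions must hold exactly.
boolean_select = (slope <= 10.0) & (depth <= 1.0)

# Fuzzy retrieval: joint membership is the minimum of the two memberships.
fuzzy_select = np.minimum(
    fuzzy_at_most(slope, threshold=10.0, width=2.0),
    fuzzy_at_most(depth, threshold=1.0, width=0.2),
)
```

The Boolean query rejects the second and third sites because each misses one limit by a small margin, possibly less than the measurement error, whereas their joint fuzzy memberships remain high (about 0.8 and close to 1). The fourth site, which fails the slope criterion by a wide margin, receives a membership near zero under both approaches.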

5. Conclusions

This review has demonstrated that GIS, statistics and geostatistics have much to give to each other, particularly when GIS are used for environmental analysis. Geostatistics benefits from having standard methods of geographical registration, data storage, retrieval and display, while GIS benefit from being able to incorporate proven methods for testing hypotheses, for handling and understanding errors in data, and for illustrating their effects on the outcomes of models used for environmental management. In some situations, geostatistics may be supplemented by non-probabilistic methods of handling uncertainty such as those provided by fuzzy subsets.

References

Bailey, T.C. and Gatrell, A.C. (1995) Interactive Spatial Data Analysis, Longman, Harlow, 413 pp.

Bierkens, M.F.P. (1994) Complex Confining Layers: A Stochastic Analysis of Hydraulic Properties at Various Scales, Royal Dutch Geographical Association (KNAW)/Faculty of Geographical Sciences, University of Utrecht, Utrecht, NL.

Burrough, P.A. (1996) Opportunities and limitations of GIS-based modeling of solute transport at the regional scale. In: Application of GIS to the Modeling of Non-Point Source Pollutants in the Vadose Zone, SSSA Special Publication 48, Soil Science Society of America, Madison, 19-37.

Burrough, P.A. and Frank, A. (1996) (eds), Geographic Objects with Indeterminate Boundaries, GISDATA Series 2, Taylor and Francis, London.

Burrough, P.A., van Gaans, P.F.M., and MacMillan, R.A. (2000) High-resolution landform classification using fuzzy k-means. Journal of Fuzzy Sets and Systems, 113, 37-52.

Burrough, P.A., van Gaans, P.F.M., Wilson, J., and Hansen, A.J. (2001) Fuzzy k-means classification of topo-climatic data as an aid to forest mapping in the Greater Yellowstone Area, USA. Landscape Ecology, 16, 523-46.

Burrough, P.A. and McDonnell, R.A. (1998) Principles of Geographical Information Systems, Oxford, Oxford University Press, 330 pp.

Burrough, P.A. and Swindell, J. (1997) Optimal mapping of site-specific multivariate soil properties. In Precision Agriculture: Spatial and Temporal Variability of Environmental Quality, J. Lake, G. Bock, and J. Goode (eds), Proc: CIBA Foundation Symposium 210, John Wiley and Sons, Chichester, pp. 208-20.




Burrough, P.A., van der Perk, M., Howard, B., Prister, B., Sansone, U., and Voitsekhovitch, O.V. (1999a) Environmental mobility of Radiocaesium in the Pripyat Catchment, Ukraine/Belarus. Water, Air and Soil Pollution, 110, 35-55.


Canters, F. (1997) Evaluating the uncertainty of area estimates derived from fuzzy land-cover classification. Photogrammetric Engineering and Remote Sensing, 63, 403-14.

Coppock, J.T. and Rhind, D.W. (1991) The history of GIS. In: Geographical Information Systems, Vol. 1, Principles, D.J. Maguire, M.F. Goodchild, and D.W. Rhind (eds), Longman Scientific and Technical, New York, pp. 21-43.

Cressie, N. (1991) Statistics for Spatial Data, Wiley, New York, 900 pp.

De Gruijter, J.J., de Walvoort, D., and van Gaans, P. (1997) Continuous soil maps - a fuzzy set approach to bridge the gap between aggregation levels of process and distribution models. Geoderma, 77, 169-95.

Deutsch, C. and Journel, A.G. (1998) GSLIB Geostatistical Handbook, 2nd edition, Oxford.

Fisher, P.F. (1995) An exploration of probable viewsheds in landscape planning. Environment and Planning B: Planning and Design, 22, 527-46.

Gomez-Hernandez, J.J. and Journel, A.G. (1992) Joint sequential simulation of multigaussian fields. In: A. Soares (ed), Proc. Fourth Geostatistics Congress, Troia, Portugal. Quantitative Geology and Geostatistics, (5), 85-94, Dordrecht, Kluwer Academic Publishers.

Goovaerts, P. (1997) Geostatistics for Natural Resources Evaluation, Oxford University Press, 483 pp.

Goovaerts, P. (1999) Using elevation to aid the geostatistical mapping of rainfall erosivity. CATENA, 34, 227-42.

Heuvelink, G.B.M. (1998) Error Propagation in Environmental Modeling, Taylor and Francis, London, 127 pp.

Heuvelink, G.B.M. and Burrough, P.A. (1993) Error propagation in cartographic modeling using Boolean logic and continuous classification. Int. J. Geographical Information Systems, 7, 231-46.

Heuvelink, G.B.M. and Lemmens, T. (2000) (eds), Accuracy 2000. Proceedings of the 4th International Meeting on Accuracy in Spatial Data, Amsterdam, July, Delft University Press, Delft.

Karssenberg, D.J., Törnqvist, T., and Bridges, J. (2001) Conditioning a process-based model of sedimentary architecture to well data. Journal of Sedimentary Research, 71(6).

Lagacherie, P., Andrieux, P., and Bouzigues, R. (1996) Fuzziness and uncertainty of soil boundaries: from reality to coding in GIS. In: P.A. Burrough and A.U. Frank (eds), Geographical Objects with Indeterminate Boundaries, Taylor and Francis, London, pp. 275-86.

Liverpool, T. and Edwards, S. (1995) Modeling meandering rivers. Physical Review Letters, 75, 3016.

Matheron, G. (1965) La Théorie des Variables Régionalisées et ses Applications, Masson, Paris.

Mitasova, H. and Hofierka, J. (1993) Interpolation by regularized spline with tension: Application to terrain modeling and surface geometry analysis. Mathematical Geology, 25, 657-69.

Pannatier, Y. (1996) Variowin. Software for spatial data analysis in 2D. Statistics and Computing, Springer Verlag, Berlin, 91 pp.

Pebesma, E. and Wesseling, C.G. (1998) GSTAT: A program for geostatistical modeling, prediction and simulation. Computers and Geosciences, 24, 17-31.

Pereira, J.M.C., Carreiras, J.M.B., and Perestrello de Vasconcelos, M.J. (1998) Exploratory data analysis of the spatial distribution of wildfires in Portugal 1980-1989. Geographical Systems, 5, 355-90.

Takeyama, M. and Couclelis, H.M. (1997) Map dynamics: integrating cellular automata and GIS through Geo-Algebra. International Journal of Geographical Information Science, 11, 73-92.


Tukey, J.W. (1977) Exploratory Data Analysis, Addison-Wesley, Reading, Massachusetts.

Van Deursen, W.P.A. and Wesseling, C.G. (1995) PCRaster, Department of Physical Geography, Utrecht University.

Wesseling, C.G., Karssenberg, D., Burrough, P.A., and van Deursen, W.P.A. (1996) Integrating dynamic environmental models in GIS: The development of a dynamic modeling language. Transactions in GIS, 1, 40-8.

Wise, S., Haining, R., and Ma, J. (2001) Providing spatial statistical data analysis functionality for the GIS user. The SAGE project. International Journal of Geographical Information Science, 15, 239-254.

Biographical sketch

Peter A. Burrough has been Professor of Physical Geography and Geographical Information Systems in the Faculty of Geographical Sciences, University of Utrecht, since 1984. Dr. Burrough is also the Director of the Utrecht Centre for Environment and Landscape Dynamics (UCEL) and Chairman of the Interfaculty Centre for Hydrology, Utrecht (ICHU). He is a member of the advisory committee on Earth Sciences, Physical Geography and Geology for the Dutch National Science Foundation NWO, and a member of the Scientific Board for the ``Fonds voor Wetenschappelijk Onderzoek'' (FWO) for Vlaanderen, Belgium.

