Chapter 11 Spatial Analysis
Credit to Prof Michael Goodchild
What is spatial analysis?
Methods for working with spatial data
to detect patterns, anomalies
to find answers to questions
to test or confirm theories (deductive reasoning)
to generate new theories and generalizations (inductive reasoning)
Methods for adding value to data
in doing scientific research
in trying to convince others
A collaboration between human and machine
the machine does things the human finds too tedious, difficult, or complex to do by hand
the human directs, makes interpretations and inferences
Ranging from simple to complex
some methods are mathematically sophisticated, e.g. statistical tests
other methods are visual, intuitive, simple, e.g. making and examining maps
The Snow map
Cholera outbreak in Soho, 1854
Dr John Snow and the pump
inference regarding the transmission mechanism for cholera
see www.jsi.com
Updating Snow
Openshaw's map of childhood leukemia in N England
Types of spatial analysis
Data types
discrete objects (points, lines, areas)
fields
spatially intensive, spatially extensive
nominal, ordinal, interval, ratio, cyclic variables
Application domains
Objectives
Data types
Nominal
e.g. vegetation class
no implied order, no arithmetic operations
no average
"central" value is the commonest class (mode)
Ordinal
e.g. ranking from best to worst
implied order, but no arithmetic operations
no average
"central" value has half of cases above, half below (median)
Interval
e.g. Fahrenheit temperature
differences make sense
arbitrary zero point
"central" value is the mean
Ratio
e.g. weight
ratios make sense
absolute zero point
"central" value is the mean
Cyclic
e.g. aspect
be careful with arithmetic
the arithmetic average of 1° and 359° is 180°, but the sensible average is 0° (north)
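The warning about cyclic variables can be illustrated with a short sketch. It averages directions as unit vectors instead of as plain numbers; the function name `circular_mean_deg` is illustrative and does not come from the slides.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction of angles in degrees, found by averaging unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    if math.hypot(s, c) < 1e-12:
        return None  # the directions cancel out, so no mean direction exists
    return math.degrees(math.atan2(s, c)) % 360.0

naive = (1 + 359) / 2                   # plain arithmetic mean ignores the wrap at 360
circular = circular_mean_deg([1, 359])  # very close to 0 (equivalently 360), i.e. north
```

Because of floating-point rounding the circular result may come out as a value just above 0 or just below 360; both describe the same direction.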
Six distinct objectives
Queries and reasoning
Measurements
Transformations
Descriptive summaries
Optimization
Hypothesis testing
QUERIES
In ArcMap
map view
table view
linked views
histogram view
scatterplot view
Exploratory spatial data analysis
interactive methods to explore spatial data
use of linked views
finding anomalies
mining large masses of data
SQL
structured or standard query language
e.g. SELECT * FROM counties WHERE median_value > 100000
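A query of this kind can be tried with Python's built-in `sqlite3` module. This is only a sketch: the `counties` table, its `median_value` column, and the three rows are hypothetical values made up for illustration.

```python
import sqlite3

# Hypothetical table and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counties (name TEXT, median_value REAL)")
conn.executemany(
    "INSERT INTO counties VALUES (?, ?)",
    [("County A", 85000), ("County B", 120000), ("County C", 250000)],
)

# Attribute query: keep only the counties whose median value exceeds 100,000.
rows = conn.execute(
    "SELECT name FROM counties WHERE median_value > 100000 ORDER BY name"
).fetchall()
selected = [name for (name,) in rows]
conn.close()
```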
REASONING WITH GIS
We spend our lives in the vague world of human discourse
"is Santa Barbara north of LA?"
a GIS needs to know exactly what is meant by "north of"
is Reno east or west of San Diego?
we tend to think of the US as a square, with two N-S coasts
how to design a GIS to provide driving directions?
to direct people through airports?
a GIS would be easier to use if it could "think" and "talk" more like humans
or if there could be smooth transitions between our vague world and its precise world
in our vague world, terms like "north of" are context-specific
geographically relevant terms like "across" or "in" have many meanings
MEASUREMENT WITH GIS
Measurements are often difficult to make by hand from maps
Distance and length
calculation from metric coordinates
straight-line distance on a plane
Pythagorean distance
$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
distance on a spherical Earth
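from $(\mathrm{lat}_1, \mathrm{long}_1)$ to $(\mathrm{lat}_2, \mathrm{long}_2)$
$R$ is the radius of the Earth, roughly 6378 km
$d = R \arccos[\sin \mathrm{lat}_1 \sin \mathrm{lat}_2 + \cos \mathrm{lat}_1 \cos \mathrm{lat}_2 \cos(\mathrm{long}_1 - \mathrm{long}_2)]$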
Length of a complex object
add the lengths of polyline or polygon segments
Two types of distortion
if segments are straight, the length of a curved feature will in general be underestimated
this applies to lines and to the boundaries of areas
lengths are measured in the horizontal plane
so they are underestimated in hilly areas
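The three length calculations on this slide can be sketched as follows. The function names are illustrative, latitudes and longitudes are assumed to be in degrees, and the radius is the rough value quoted above.

```python
import math

EARTH_RADIUS_KM = 6378.0  # rough value quoted on the slide

def planar_distance(x1, y1, x2, y2):
    """Pythagorean (straight-line) distance between two points on a plane."""
    return math.hypot(x1 - x2, y1 - y2)

def spherical_distance_km(lat1, long1, lat2, long2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance on a spherical Earth, inputs in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlong = math.radians(long1 - long2)
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(dlong))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding outside [-1, 1]
    return radius_km * math.acos(cos_angle)

def polyline_length(vertices):
    """Length of a polyline: the sum of its straight segment lengths."""
    return sum(planar_distance(x1, y1, x2, y2)
               for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]))
```

For example, `polyline_length([(0, 0), (3, 4), (3, 10)])` adds a segment of length 5 to one of length 6. The arccos form loses accuracy for very short distances, so this sketch is best read as an illustration of the formula.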
Area (of a polygon)
proceed in clockwise direction around the polygon
for each segment
drop perpendiculars to the x axis
this constructs a trapezium
compute the area of the trapezium
difference in x times average of y
keep a cumulative sum of areas
at the end, the sum will be the area of the polygon
when might the algorithm fail?
islands must all be scanned clockwise
holes must be scanned anticlockwise
holes have negative area
because of limited computer precision
results could be wrong if the area is very small and the
coordinate values are very large
e.g. in UTM or SPC
need double precision for calculations
but not for results
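The trapezium algorithm above can be sketched in a few lines. The function name and the example coordinates are illustrative; Python floats are already double precision, which matches the advice about calculations.

```python
def polygon_area(vertices):
    """Signed area by the trapezium rule.

    Positive for rings listed clockwise, negative for anticlockwise rings.
    The ring is closed automatically from the last vertex back to the first.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Trapezium under this segment: difference in x times average of y.
        total += (x2 - x1) * (y1 + y2) / 2.0
    return total

island = [(0, 0), (0, 4), (4, 4), (4, 0)]  # scanned clockwise
hole = [(1, 1), (2, 1), (2, 2), (1, 2)]    # scanned anticlockwise
net_area = polygon_area(island) + polygon_area(hole)
```

The hole contributes a negative area, so adding the two rings gives the area of the island with the hole removed.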
applying the algorithm to a coverage
keep running total for each polygon
for each arc
proceed segment by segment from FNODE to TNODE
add trapezia areas to R polygon area
subtract from L polygon area
on completing all arcs, totals are correct areas
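A minimal sketch of the coverage version follows, assuming each arc is stored as a vertex list running from FNODE to TNODE together with the labels of its left and right polygons. The data layout and the two-triangle example are hypothetical.

```python
from collections import defaultdict

def coverage_areas(arcs):
    """arcs: iterable of (vertices, left_polygon, right_polygon).

    Vertices run from FNODE to TNODE. Each arc's trapezium sum is added to
    the polygon on its right and subtracted from the polygon on its left.
    """
    totals = defaultdict(float)
    for vertices, left, right in arcs:
        arc_sum = 0.0
        for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
            arc_sum += (x2 - x1) * (y1 + y2) / 2.0
        totals[right] += arc_sum
        totals[left] -= arc_sum
    return dict(totals)

# A 2 x 2 square split along its diagonal into triangles A and B.
arcs = [
    ([(0, 0), (2, 2)], "A", "B"),                 # shared diagonal
    ([(0, 0), (0, 2), (2, 2)], "outside", "A"),   # outer boundary of A
    ([(2, 2), (2, 0), (0, 0)], "outside", "B"),   # outer boundary of B
]
areas = coverage_areas(arcs)
```

In this example the surrounding "outside" polygon ends up with the negative of the total mapped area, which is a useful check that every arc has been processed once.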
Shape
how to measure shape of an area?
a compact shape has a small perimeter for a given area
compare perimeter to the perimeter of a circle of the same area
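$\text{shape} = \text{perimeter} / [3.54 \sqrt{\text{area}}]$
3.54 is approximately $2\sqrt{\pi}$, so a circle scores 1 and less compact shapes score higher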
other types of districts designed with GIS
administrative regions
sales districts
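The shape measure defined on this slide can be computed directly. This sketch uses the exact constant $2\sqrt{\pi}$ in place of the rounded 3.54; the function name is illustrative.

```python
import math

def shape_index(perimeter, area):
    """Perimeter divided by the perimeter of a circle with the same area."""
    return perimeter / (2.0 * math.sqrt(math.pi) * math.sqrt(area))

circle = shape_index(2.0 * math.pi * 1.0, math.pi * 1.0 ** 2)  # unit circle
square = shape_index(4.0, 1.0)                                 # unit square
```

A circle gives a value of 1 (up to rounding), and a unit square gives about 1.13, so larger values indicate less compact shapes.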
Slope and aspect
measured from DEM raster
by comparing elevations of points in a 3x3 neighborhood
slope and aspect at one point estimated from elevations of it and
surrounding 8 points
various methods
important to know how your favorite GIS calculates slope
number points row by row from top left
from 1 to 9
b denotes slope in the x direction
c denotes slope in the y direction
D is the spacing of points
e.g. 30m for USGS DEMs
find the slope that fits best to the 9 elevations
minimizes the total of squared differences
between point elevation and the fitted slope
weighting four closer neighbors higher
$b = dz/dx = (z_3 + 2z_6 + z_9 - z_1 - 2z_4 - z_7) / 8D$
$c = dz/dy = (z_1 + 2z_2 + z_3 - z_7 - 2z_8 - z_9) / 8D$
$\tan(\text{slope}) = \sqrt{b^2 + c^2}$
slope defined as angle
or rise over horizontal run
or rise over actual run
$\tan(\text{aspect}) = b/c$
aspect measured clockwise from vertical to direction of steepest slope
add 180 to aspect if c is positive, 360 to aspect if c is negative and b is positive
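The slope and aspect formulas can be sketched for a single 3x3 window as below. The function name is illustrative, the rows are assumed to be listed from north to south, and the aspect is obtained with `atan2` on the downhill vector, which gives the same angles as the add-180 or add-360 rule and also copes with c = 0.

```python
import math

def slope_aspect(z, spacing):
    """z: 3x3 elevations, rows listed north to south, numbered 1 to 9 row by row.

    Returns (slope in degrees, aspect in degrees clockwise from north).
    Aspect is None on flat ground, where it is undefined.
    """
    (z1, z2, z3), (z4, _z5, z6), (z7, z8, z9) = z
    b = (z3 + 2 * z6 + z9 - z1 - 2 * z4 - z7) / (8.0 * spacing)  # slope in x
    c = (z1 + 2 * z2 + z3 - z7 - 2 * z8 - z9) / (8.0 * spacing)  # slope in y
    slope_deg = math.degrees(math.atan(math.hypot(b, c)))
    if b == 0 and c == 0:
        return slope_deg, None
    # The downhill direction is (-b, -c); atan2(east, north) gives its azimuth.
    aspect_deg = math.degrees(math.atan2(-b, -c)) % 360.0
    return slope_deg, aspect_deg

# A plane rising to the east at 45 degrees, with 30 m point spacing.
rises_east = [[0, 30, 60], [0, 30, 60], [0, 30, 60]]
slope, aspect = slope_aspect(rises_east, 30.0)
```

Here the surface faces west, so the aspect comes out at 270 degrees. The centre elevation is not used by these particular formulas.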
TRANSFORMATIONS
Buffering
Point in polygon
Polygon overlay
Spatial interpolation
Density estimation
Buffering
Transformations create new objects and data sets from existing objects and data sets
buffering takes points, lines, or areas and creates areas
every location within the resulting area is either:
in/on the original object
within the defined buffer width of the original object
Two versions
discrete object:
for every object, result is a new polygon object
new objects may overlap
field (objects cannot overlap):
every location on the map has one of two values:
inside buffer distance
outside buffer distance
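The field version of buffering can be sketched by testing every raster cell centre against a line object. The function names, the road, and the grid layout are all hypothetical; building the buffer polygon itself (the discrete object version) is not attempted here.

```python
import math

def distance_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point (px, py) to the segment from a to b."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def in_buffer(point, polyline, width):
    """True if the point is within the buffer width of any polyline segment."""
    px, py = point
    return any(distance_to_segment(px, py, ax, ay, bx, by) <= width
               for (ax, ay), (bx, by) in zip(polyline, polyline[1:]))

def buffer_field(polyline, width, n_cols, n_rows, cell_size):
    """Raster field: each cell is True (inside) or False (outside the buffer)."""
    return [[in_buffer(((col + 0.5) * cell_size, (row + 0.5) * cell_size),
                       polyline, width)
             for col in range(n_cols)]
            for row in range(n_rows)]

road = [(0, 0), (10, 0)]
field = buffer_field(road, 1.0, n_cols=2, n_rows=3, cell_size=1.0)
```

Only the row of cells nearest the road falls inside the one-unit buffer in this example.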
Applications
find all households within 1 mile of a proposed new freeway
and send them notification of proposal
find all areas of Los Padres National Forest beyond 1 mile from a road
find all liquor stores within 1 mile of a school
and notify them of a proposed change in the law
Variants
raster and vector versions
vary the object's buffer width according to an attribute value
e.g. noise buffers depending on road traffic volume
vary the rate of spread according to a friction field
only in raster
e.g. travel speed varies
Thiessen polygons for point objects
the area closest to each point forms a polygon
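Thiessen polygons can be approximated on a raster by labelling each cell with its nearest point. This is a sketch of that raster approximation only, not a construction of the vector polygons; the function names and sample points are illustrative.

```python
import math

def nearest_point_index(x, y, points):
    """Index of the point closest to location (x, y)."""
    return min(range(len(points)),
               key=lambda i: math.hypot(x - points[i][0], y - points[i][1]))

def thiessen_raster(points, n_cols, n_rows, cell_size):
    """Label each cell centre with the index of its nearest point."""
    return [[nearest_point_index((col + 0.5) * cell_size,
                                 (row + 0.5) * cell_size, points)
             for col in range(n_cols)]
            for row in range(n_rows)]

stations = [(0.5, 0.5), (3.5, 0.5)]
labels = thiessen_raster(stations, n_cols=4, n_rows=1, cell_size=1.0)
```

Cells that are exactly equidistant from two points are assigned to the lower index, which is an arbitrary choice of this sketch.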
POINT IN POLYGON
Determine whether a given point lies inside or outside a given polygon
assign a set of points to a set of polygons
e.g. count numbers of accidents in counties
e.g. whose property does this phone pole lie in?
Algorithm
draw a line from the point to infinity
count intersections with the polygon boundary
inside if the count is odd
outside if the count is even
Field case
point must lie in exactly one polygon
Discrete object case
point can lie in any number of polygons, including zero
Issues
algorithm for a coverage
what if the point lies on the boundary?
special cases
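The odd/even counting algorithm can be sketched with a ray drawn horizontally to the right of the point. The function names and the two square "counties" are hypothetical, and points lying exactly on a boundary are not given any special treatment, which is one of the issues listed above.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count crossings of a ray running from (x, y) towards +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this boundary segment straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside  # each crossing flips between odd and even
    return inside

def count_points(points, polygons):
    """Count how many points fall in each named polygon."""
    return {name: sum(point_in_polygon(x, y, ring) for x, y in points)
            for name, ring in polygons.items()}

counties = {
    "west": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "east": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
accidents = [(1, 1), (3, 1), (3.5, 0.5), (5, 5)]
counts = count_points(accidents, counties)
```

The last accident lies outside both polygons, so it is not counted anywhere.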
POLYGON OVERLAY
Create polygons by overlaying existing polygons
how many polygons are created when two polygons are overlaid?
Discrete object case
find overlaps between two polygons
e.g. a property and an easement
creates a collection of polygons
Field case
overlay two complete coverages
creates a new coverage
e.g. find all areas that are owned by the Forest Service and classified
as wetland
in vector or raster
in raster the values in each cell are combined, e.g. added
Issues
major computing workload
indexing
swamped by slivers
tolerance
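Raster overlay is the simpler of the two cases because cells line up one to one. The sketch below combines two small rasters cell by cell; the ownership and land cover grids are made-up values, and the vector case, with its slivers and tolerances, is not attempted.

```python
def overlay(raster_a, raster_b, combine):
    """Combine two rasters of equal shape cell by cell."""
    return [[combine(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(raster_a, raster_b)]

# Hypothetical 2 x 3 rasters: land owner and land cover class.
owner = [["FS", "FS", "private"],
         ["FS", "private", "private"]]
cover = [["wetland", "forest", "wetland"],
         ["wetland", "wetland", "forest"]]

# Cells owned by the Forest Service and classified as wetland.
fs_wetland = overlay(owner, cover, lambda o, c: o == "FS" and c == "wetland")

# Numeric rasters can be combined the same way, e.g. added.
summed = overlay([[1, 2]], [[10, 20]], lambda a, b: a + b)
```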
SPATIAL INTERPOLATION
What is interpolation?
intelligent guesswork
an interval/ratio variable conceived as a field
temperature
soil pH
population density
sampled at observation points
needed:
values at other points
a complete surface
a contour map
a TIN
a raster of point values
Two methods commonly used in GIS
inverse-distance weighting (IDW)
Kriging (geostatistics)
Moving average/distance weighted average/inverse distance weighting
estimates are averages of the values at n known points
known values $z_1, z_2, \ldots, z_n$
unknown value $z = \sum_i w_i z_i / \sum_i w_i$
where $w$ is some function of distance $d$, such as:
$w = 1/d^k$
$w = e^{-kd}$
an almost infinite variety of algorithms may be used, variations
include:
the nature of the distance function
varying the number of points used
the direction from which they are selected
this is the most widely used method
objections to this method arise from the fact that the range of interpolated
values is limited by the range of the data
other problems include:
how many points should be included in the averaging?
what to do about irregularly spaced points?
how to deal with edge effects?
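A minimal sketch of inverse-distance weighting with the weight $w = 1/d^k$ follows. It uses every known point, so it does not address the questions of how many neighbours or which sectors to use; the function name and sample values are illustrative.

```python
import math

def idw(x, y, samples, power=2.0):
    """Inverse-distance weighted estimate at (x, y).

    samples: list of (xi, yi, zi) known points; weights are 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return zi  # exactly at a known point: return its value
        w = 1.0 / d ** power
        weighted_sum += w * zi
        weight_total += w
    return weighted_sum / weight_total

samples = [(0, 0, 10.0), (2, 0, 20.0)]
midway = idw(1, 0, samples)      # equal distances, so a plain average
far_away = idw(100, 50, samples)
```

However far the estimation point is from the data, the result stays between the smallest and largest known values, which is the objection raised above.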
Example
ozone concentrations at CA measurement stations
objectives:
1. estimate a complete field, make a map
2. estimate ozone concentrations at other locations
e.g. cities
data sets:
measuring stations and concentrations (point shapefile)
CA outline (polygon shapefile)
DEM (raster)
CA cities (point shapefile)
IDW wizard in Geostatistical Analyst
opening screen defines data source
next screen defines interpolation method
which power of distance? (2)
how many sectors? (4)
how many neighbors in each sector? (10-15)
next screen gives results of cross-validation
results map
Kriging
developed by Georges Matheron, as the "theory of regionalized variables",
and D.G. Krige as an optimal method of interpolation for use in the mining
industry
the basis of this technique is the rate at which the variance between points
changes over space
this is expressed in the variogram which shows how the average
difference between values at points changes with distance between
points
Kriging is based on an analysis of the data, then an application of the results
of this analysis to interpolation
Variograms
vertical axis is $E[(z_i - z_j)^2]$, i.e. the "expectation" of the squared difference
i.e. the average squared difference in elevation of any two points distance $d$ apart
d (horizontal axis) is distance between i and j
most variograms show behavior like the diagram
the upper limit (asymptote) is called the sill
the distance at which this limit is reached is called the range
the intersection with the y axis is called the nugget
a non-zero nugget indicates that repeated measurements at the
same point yield different values
in developing the variogram it is necessary to make some assumptions about
the nature of the observed variation on the surface:
simple Kriging assumes that the surface has a constant mean, no
underlying trend and that all variation is statistical
universal Kriging assumes that there is a deterministic trend in the
surface that underlies the statistical variation
in either case, once trends have been accounted for (or assumed not to
exist), all other variation is assumed to be a function of distance
Deriving the variogram
the input data for Kriging is usually an irregularly spaced sample of points
to compute a variogram we need to determine how variance increases with
distance
begin by dividing the range of distance into a set of discrete intervals, e.g. 10
intervals between distance 0 and the maximum distance in the study area
for every pair of points, compute distance and the squared difference
in z values
assign each pair to one of the distance ranges, and accumulate total
variance in each range
after every pair has been used (or a sample of pairs in a large dataset)
compute the average variance in each distance range
plot this value at the midpoint distance of each range
fit one of a standard set of curve shapes to the points
"model" the variogram
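The binning steps for deriving a variogram can be sketched as below, up to the point where values are ready to plot; fitting a model curve is left out. The function name, bin settings, and sample points are illustrative. The sketch follows these slides in averaging the squared differences; many texts plot half of that value and call it the semivariogram.

```python
import math
from itertools import combinations

def empirical_variogram(samples, n_bins, max_distance):
    """Average squared difference in z for pairs grouped by distance.

    samples: list of (x, y, z). Returns (bin midpoint, average) for each
    distance range that contains at least one pair.
    """
    width = max_distance / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        d = math.hypot(x1 - x2, y1 - y2)
        if d > max_distance:
            continue  # pair is beyond the range being analysed
        k = min(int(d / width), n_bins - 1)
        sums[k] += (z1 - z2) ** 2
        counts[k] += 1
    return [((k + 0.5) * width, sums[k] / counts[k])
            for k in range(n_bins) if counts[k] > 0]

# Four points on a line with a steady trend in z.
points = [(0, 0, 0.0), (1, 0, 1.0), (2, 0, 2.0), (3, 0, 3.0)]
variogram_points = empirical_variogram(points, n_bins=3, max_distance=4.5)
```

For a large dataset the loop over all pairs could be replaced by a loop over a random sample of pairs, as the slide suggests.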
Computing the estimates
once the variogram has been developed, it is used to estimate distance weights
for interpolation
interpolated values are the sum of the weighted values of some number of
known points where weights depend on the distance between the interpolated
and known points
weights are selected so that the estimates are:
unbiased (if used repeatedly, Kriging would give the correct result on
average)
minimum variance (variation between repeated estimates is minimum)
problems with this method:
when the number of data points is large this technique is
computationally very intensive
the estimation of the variogram is not simple, no one technique is best
since there are several crucial assumptions that must be made about
the statistical nature of the variation, results from this technique can
never be absolute
simple Kriging routines are available in:
the Surface II package (Kansas Geological Survey)
Surfer (Golden Software)
the GEOEAS package for the PC, developed by the US Environmental Protection Agency
ArcInfo 8, as the add-on Geostatistical Analyst
DENSITY ESTIMATION
Suppose you had a map of discrete objects and wanted to calculate their density
density of population
density of cases of a disease
density of roads in an area
density would form a field
density estimation is one way of creating a field from a set of discrete
objects
Methods
count the number of points in every cell of a raster
measure the length of lines, e.g. roads
result depends on cell size
result is very noisy, erratic
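The cell-counting method can be sketched as follows, with made-up points and a grid anchored at the origin. Running it with two cell sizes shows how strongly the result depends on that choice.

```python
def count_density(points, n_cols, n_rows, cell_size):
    """Points per unit area in each raster cell (grid origin at 0, 0)."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col, row = int(x // cell_size), int(y // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grid[row][col] += 1  # points outside the grid are ignored
    cell_area = cell_size ** 2
    return [[count / cell_area for count in row] for row in grid]

cases = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (5, 5)]
fine = count_density(cases, n_cols=2, n_rows=2, cell_size=1.0)
coarse = count_density(cases, n_cols=1, n_rows=1, cell_size=2.0)
```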
Density estimation using kernels
think of each point being replaced by a pile of sand of constant shape
add the piles to create a surface
example kernel
width of the kernel determines the smoothness of the surface
Density estimation and spatial interpolation applied to the same data
density of ozone measuring stations
using Spatial Analyst
kernel is too small (radius of 16 km)
kernel radius 150 km
what's the difference?
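The "pile of sand" idea can be sketched with one particular kernel. The quartic kernel used here is an assumption of this sketch, since the slides do not say which kernel shape was used; it is scaled so that each point contributes a total volume of 1.

```python
import math

def quartic_kernel(d, radius):
    """Quartic kernel in two dimensions; zero at and beyond the radius."""
    if d >= radius:
        return 0.0
    u = 1.0 - (d / radius) ** 2
    return 3.0 * u * u / (math.pi * radius ** 2)

def kernel_density(x, y, points, radius):
    """Density at (x, y): the sum of the kernels centred on every point."""
    return sum(quartic_kernel(math.hypot(x - px, y - py), radius)
               for px, py in points)

stations = [(0.0, 0.0), (1.0, 0.0)]
narrow = kernel_density(0.0, 0.0, stations, radius=0.5)  # sees only the first point
wide = kernel_density(0.0, 0.0, stations, radius=5.0)    # both points contribute
```

A small radius gives a spiky surface with isolated peaks at each point, while a large radius spreads each point over a wide area and gives a smoother, lower surface.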