Page 1: Spatial Association and spatial statistic techniques

Spatial Association and spatial statistic techniques

Danlin Yu

Ph.D. Candidate

Dept. of Geography, UWM

Page 2: Spatial Association and spatial statistic techniques

Detecting Spatial Association

What is spatial association?

Spatial objects tend to relate to one another

Types of spatial association

Spatial autocorrelation: similar (dissimilar) values in space tend to cluster together

Spatial heterogeneity: spatial regimes, space is not homogeneous

Autocorrelation and heterogeneity are closely related

Page 3: Spatial Association and spatial statistic techniques

Detecting spatial association

Why study spatial association?

It is inherent in geographic research

When working with spatial data, analyses based on regular statistics are VERY likely to be misleading or incorrect

How to detect spatial association

Power of GIS

Exploratory Spatial Data Analysis (ESDA): let the data speak

Page 4: Spatial Association and spatial statistic techniques

Background

The first law of geography:

Everything is related, but things nearby are more related than things far away

Characteristics of spatial statistics:

The existence of spatial association violates an important statistical assumption: independence

Spatial patterns are the results of spatial processes; the pattern we see is one of numerous possible outcomes of the same spatial process

Page 5: Spatial Association and spatial statistic techniques

Types of spatial association

Point spatial association

Distance is critical in deciding point spatial association

Line spatial association

Distance and path

Areal spatial association

Distance and contiguity

Page 6: Spatial Association and spatial statistic techniques

Today’s topic: univariate SA

Univariate: for pattern detection

Examples: per capita GDP for economic performance pattern; surface temperature for local climate pattern, etc.

Central question: is the pattern we see a result of some specific process (usually a random or normal process, which is our null hypothesis)?

Multivariate: spatial regression or geographically weighted regression (GWR)

Page 7: Spatial Association and spatial statistic techniques

Researching means

Hypothesis testing to answer this question is conducted by means of spatial statistics

For univariate geographic data, there are a few indexes in the literature:

Moran’s Index (Moran’s I)

Geary’s Index (Geary’s c)

Getis’s G or O

Page 8: Spatial Association and spatial statistic techniques

Spatial statistic indexes

The purposes of the three indexes are very similar: based on the geographic data, calculate an index and test it against the null hypothesis

The most often encountered index is Moran’s I

The discussion of Moran’s I is applicable to the other indexes, subject to minor adjustments

Page 9: Spatial Association and spatial statistic techniques

Moran’s Index (I)

Structured like the Pearson’s product-moment statistic: measure of covariance

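$$I = \frac{n}{\sum_{i}^{n}\sum_{j}^{n} w_{ij}} \cdot \frac{\sum_{i}^{n}\sum_{j}^{n} w_{ij}(y_i-\bar{y})(y_j-\bar{y})}{\sum_{i}^{n}(y_i-\bar{y})^2}$$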

Page 10: Spatial Association and spatial statistic techniques

Moran’s I

$w_{ij}$ is the weight: $w_{ij}=1$ if locations i and j are adjacent and zero otherwise ($w_{ii}=0$; a region is not adjacent to itself).

$y_i$ and $\bar{y}$ are the value of the variable at the ith location and the mean of the variable, respectively

n is the total number of observations

I is used to test hypotheses concerning similarity
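As a minimal sketch of the formula above, the following plain-Python function computes the global Moran’s I from a list of values and a weights matrix. The function name `morans_i` and the four-location example are illustrative assumptions, not part of the original slides or of any package.

```python
def morans_i(values, weights):
    """Global Moran's I for n values and an n x n weights matrix (list of lists)."""
    n = len(values)
    mean = sum(values) / n
    dev = [y - mean for y in values]              # y_i - y_bar
    total_w = sum(sum(row) for row in weights)    # sum of all w_ij
    # Weighted cross-products of deviations over all pairs (i, j)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    sum_sq = sum(d * d for d in dev)              # sum of squared deviations
    return (n / total_w) * (cross / sum_sq)


# Hypothetical example: four locations in a row, adjacent ones are neighbors.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

trend = morans_i([1.0, 2.0, 3.0, 4.0], w)        # similar values side by side
alternating = morans_i([1.0, 4.0, 1.0, 4.0], w)  # dissimilar values side by side
print(trend, alternating)                        # about 0.33 and about -1.0
```

A smooth trend gives a positive index (similar values are neighbors), while an alternating pattern gives a negative one.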

Page 11: Spatial Association and spatial statistic techniques

Determining the weights

Two rules:

Distance: locations within a certain distance are considered neighbors

Border-sharing (for areal units only): areas sharing borders are considered as neighbors

Weights matrix: could be symmetric or asymmetric – binary weights matrix, general weights matrix (distance decaying)

Page 12: Spatial Association and spatial statistic techniques

Determining the weights

Spatial weights matrix should be constructed judiciously

Ideally, related to general concepts from spatial interaction theory, such as the notions of accessibility and potential etc.

Page 13: Spatial Association and spatial statistic techniques

Determining the weights

When used in hypothesis testing, this requirement is less stringent

Since our purpose is to test the null – spatial independence

Still, trying a few structures is a good idea – border sharing, different distances

Page 14: Spatial Association and spatial statistic techniques

Determining the weights

A typical symmetric weights matrix is a binary weights matrix in which neighbors are coded as 1 and all others as 0

Without loss of generality, it is usually row standardized: all elements of each row add up to 1
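The distance rule and row standardization can be sketched as follows. This is an illustration under stated assumptions: the helper names `distance_weights` and `row_standardize`, the point coordinates and the 1.5 threshold are all hypothetical, and locations without neighbors are simply left with an all-zero row.

```python
import math


def distance_weights(points, threshold):
    """Binary weights: w_ij = 1 if i != j and the distance is within threshold."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= threshold:
                w[i][j] = 1.0
    return w


def row_standardize(w):
    """Divide each row by its sum so that rows with neighbors add up to 1."""
    out = []
    for row in w:
        s = sum(row)
        out.append([x / s if s > 0 else 0.0 for x in row])
    return out


# Hypothetical coordinates; the last point has no neighbor within the threshold.
points = [(0, 0), (1, 0), (2, 0), (5, 5)]
w_bin = distance_weights(points, 1.5)
w_std = row_standardize(w_bin)
print(w_std)
```

Note that the binary matrix is symmetric, but the row-standardized one generally is not: here the first location has one neighbor and the second has two, so the weight from the first to the second differs from the weight in the other direction.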

Page 15: Spatial Association and spatial statistic techniques

Hypothesis testing

The expected value and the variance of Moran’s I are used for testing

However, under the null hypothesis, Moran’s I usually does not follow a normal distribution

Alternatives:

Random permutation

Saddlepoint approximation
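For reference, the test based on the expected value and variance can be written as below. The expected value is the standard result under the null hypothesis of no spatial autocorrelation; the variance depends on the weights matrix and on the sampling assumption (normality or randomization), so it is left symbolic here rather than spelled out.

```latex
E[I] = -\frac{1}{n-1}, \qquad
z = \frac{I - E[I]}{\sqrt{\operatorname{Var}[I]}}
```

The score z is compared with a standard normal distribution, which is exactly the step that becomes unreliable when I is not normally distributed.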

Page 16: Spatial Association and spatial statistic techniques

Hypothesis testing

Monte Carlo (random) permutation for Moran’s I

Randomly rearrange the values among the locations and recalculate I each time (e.g., 999 times)

Compare the actual I with the 999 I values obtained from the random arrangements

If the actual I lies above the 95th percentile or below the 5th percentile of the permuted values, it is said to be pseudo-significant at the 5% level (positive or negative association, respectively)
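The permutation procedure can be sketched in plain Python as follows. The names `morans_i` and `permutation_test`, the fixed seed, and the 20-location chain with a linear trend are assumptions made for illustration; the pseudo p-value uses the common (count + 1) / (permutations + 1) convention.

```python
import random


def morans_i(values, weights):
    """Global Moran's I for n values and an n x n weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [y - mean for y in values]
    total_w = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / total_w) * (cross / sum(d * d for d in dev))


def permutation_test(values, weights, n_perm=999, seed=0):
    """Return the observed I and the upper- and lower-tail pseudo p-values."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    observed = morans_i(values, weights)
    shuffled = list(values)
    sims = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # reassign the values to locations at random
        sims.append(morans_i(shuffled, weights))
    n_ge = sum(1 for s in sims if s >= observed)
    n_le = sum(1 for s in sims if s <= observed)
    p_upper = (n_ge + 1) / (n_perm + 1)   # evidence for positive association
    p_lower = (n_le + 1) / (n_perm + 1)   # evidence for negative association
    return observed, p_upper, p_lower


# Hypothetical data: 20 locations in a row with a smooth trend.
n = 20
w = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
values = [float(i) for i in range(n)]
observed, p_upper, p_lower = permutation_test(values, w)
print(observed, p_upper, p_lower)
```

A small upper-tail value indicates pseudo-significant positive association; a small lower-tail value indicates negative association.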

Page 17: Spatial Association and spatial statistic techniques

Hypothesis testing

Saddlepoint approximation (Tiefelsdorf, 2001)

The exact distribution of Moran’s I can be obtained, but it is computationally prohibitive even for medium-sized data sets

A saddlepoint distribution approximates the exact distribution with reasonable accuracy

It is based on the ratio of quadratic forms in normal variables

Usually, random permutation will do the job

Page 18: Spatial Association and spatial statistic techniques

Global and local (1)

The Moran’s I just introduced is based on simultaneous measurements from many locations; hence, it is a GLOBAL statistic

Global statistics provide only a limited set of spatial association measurements

You see the overall pattern, but the details are ignored: the tree and forest dilemma

Page 19: Spatial Association and spatial statistic techniques

Global and local (2)

Recently, a number of statistics have been developed to measure dependence in portions of the study area: the local statistics

In spatial data analysis, they are known as Local Indicators of Spatial Association (LISA), a term introduced by Anselin (1995)

Page 20: Spatial Association and spatial statistic techniques

Global and local (3)

Definition of LISA (Anselin, 1995)

The local statistic for each observation gives an indication of the extent of significant spatial clustering of similar values around that observation

The sum of the local statistics for all observations is proportional (or equal) to a corresponding global statistic

Page 21: Spatial Association and spatial statistic techniques

Global and local (4)

Local statistics are well suited to:

Identify the existence of pockets or “hot spots”

Assess assumptions of stationarity

Identify distances beyond which no discernible association obtains

Global and local statistics are often used together for thorough understanding of spatial association and processes

Page 22: Spatial Association and spatial statistic techniques

Global and local (5)

This discussion is based on the decomposition of Moran’s I into its local version

Other indexes can be decomposed similarly; however, there is an important aspect of Moran’s I that assists further understanding in spatial analysis

It can be decomposed into its local version, AND a graphic version – Moran’s scatterplot

Page 23: Spatial Association and spatial statistic techniques

Local Moran’s I

Following Anselin’s (1995) definition, a local Moran’s $I_i$ may be defined as:

$$I_i = z_i \sum_{j}^{n} w_{ij} z_j$$

The $z_i$ are the deviations of the $y_i$ from their mean

The weights are row standardized
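A minimal sketch of this definition, assuming row-standardized weights as stated on the slide. The function name `local_morans_i` and the four-location example are illustrative assumptions. The last lines check the LISA property that the local values add up to something proportional to the global index: dividing their sum by the sum of squared deviations gives the global Moran’s I for row-standardized weights.

```python
def local_morans_i(values, weights):
    """Local Moran's I_i = z_i * sum_j w_ij * z_j for every location i."""
    n = len(values)
    mean = sum(values) / n
    z = [y - mean for y in values]                 # deviations from the mean
    return [z[i] * sum(weights[i][j] * z[j] for j in range(n))
            for i in range(n)]


# Hypothetical example: four locations in a row, row-standardized weights.
w = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
y = [1.0, 2.0, 3.0, 4.0]

local = local_morans_i(y, w)
mean_y = sum(y) / len(y)
sum_sq = sum((v - mean_y) ** 2 for v in y)
global_i = sum(local) / sum_sq    # global Moran's I recovered from the local values
print(local, global_i)
```

In this example the two end locations contribute more to the global index than the two middle ones, which is the kind of detail the global statistic alone hides.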

Page 24: Spatial Association and spatial statistic techniques

Local Moran’s I

Hypothesis testing for the local Moran’s I is more complex

The distribution of the local Moran’s I is definitely not normal; furthermore, it is influenced by the global pattern

Random permutation won’t work: for one specific location, the mean and variance of the local Moran’s I keep changing during the permutation, which is not the case for the global statistic

Page 25: Spatial Association and spatial statistic techniques

Local Moran’s I

Exact distribution of local Moran’s I can be obtained, but extremely computationally prohibitive

The saddlepoint approximation is thus far one potential resolution

Details can be found at Tiefelsdorf (2000; 2002)

Page 26: Spatial Association and spatial statistic techniques

Local Moran’s I

In addition, local Moran’s Is correlate with one another due to overlapping neighbors

Bonferroni correction or other correction methods are needed for acquiring robust testing results
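As a small illustration of the Bonferroni idea, each of the m local tests is judged against alpha / m instead of alpha. The function name `bonferroni_significant` and the p-values below are hypothetical and serve only to show the arithmetic; they are not results from the slides.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag the tests that remain significant at level alpha after Bonferroni correction."""
    m = len(p_values)
    threshold = alpha / m            # stricter per-test level
    return [p <= threshold for p in p_values]


# Hypothetical p-values for four local statistics.
p_values = [0.001, 0.02, 0.04, 0.30]
flags = bonferroni_significant(p_values)
print(flags)
```

With four tests the per-test level is 0.0125, so only the first value stays significant even though three of them are below 0.05. The correction is conservative, which is one reason other correction methods are also used.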

These are all done in the SPDEP package in R

Page 27: Spatial Association and spatial statistic techniques

Moran’s scatterplot

A graphic tool for detecting local spatial association

Derived directly from the global Moran’s I

It can be used together with the local Moran’s I for better understanding

Page 28: Spatial Association and spatial statistic techniques

Moran’s scatterplot

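Recall the formula of Moran’s I:

$$I = \frac{n}{\sum_{i}^{n}\sum_{j}^{n} w_{ij}} \cdot \frac{\sum_{i}^{n}\sum_{j}^{n} w_{ij}(y_i-\bar{y})(y_j-\bar{y})}{\sum_{i}^{n}(y_i-\bar{y})^2}$$

If a row-standardized weights matrix is used, each row sums to 1, so all the weights together sum to n and the first term equals 1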

Page 29: Spatial Association and spatial statistic techniques

Moran’s scatterplot

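Therefore, I could be re-written as:

$$I = \frac{\sum_{i}^{n}\sum_{j}^{n} w_{ij}(y_i-\bar{y})(y_j-\bar{y})}{\sum_{i}^{n}(y_i-\bar{y})^2}$$

Or:

$$I = \frac{\sum_{i}^{n}(y_i-\bar{y})\left(\sum_{j}^{n} w_{ij}(y_j-\bar{y})\right)}{\sum_{i}^{n}(y_i-\bar{y})^2}$$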

Page 30: Spatial Association and spatial statistic techniques

Moran’s scatterplot

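Recall the coefficient of the linear regression, b:

$$b = \frac{\sum_{i}^{n}(\mathit{ind}_i-\overline{\mathit{ind}})(\mathit{dep}_i-\overline{\mathit{dep}})}{\sum_{i}^{n}(\mathit{ind}_i-\overline{\mathit{ind}})^2}$$

$\mathit{ind}_i$ and $\mathit{dep}_i$ are the independent and dependent variables; the “bar” versions are their means, respectively; and b is the regression coefficient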

Page 31: Spatial Association and spatial statistic techniques

Moran’s scatterplot

Yes, there is a similarity between Moran’s I and the regression coefficient b

Actually, $\sum_{j}^{n} w_{ij}(y_j-\bar{y})$ is the so-called “spatial lag” of location i.

So, I is formally equivalent to the regression coefficient in a regression of a location’s spatial lag on the location’s own value

Page 32: Spatial Association and spatial statistic techniques

Moran’s Scatterplot

This interpretation enables us to visualize Moran’s I in a scatterplot of a location’s spatial lag and itself – the Moran’s scatterplot

Moran’s I is the slope of the regression line

A lack of fit (in the scatterplot) would indicate important local spatial process and associations (local pockets/non-stationarity)
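The slope interpretation can be checked numerically with a short sketch. It assumes a row-standardized weights matrix, as in the derivation; the helper names `row_standardize`, `spatial_lag` and `ols_slope` and the five hypothetical values are illustrative only.

```python
def row_standardize(w):
    """Divide each row by its sum (assumes every location has a neighbor)."""
    return [[x / sum(row) for x in row] for row in w]


def spatial_lag(z, w):
    """Spatial lag of location i: sum_j w_ij * z_j."""
    n = len(z)
    return [sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]


def ols_slope(x, y):
    """Slope b of the least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den


# Hypothetical data: five locations in a row.
y = [3.0, 1.0, 4.0, 1.0, 5.0]
n = len(y)
binary = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
w = row_standardize(binary)

mean = sum(y) / n
z = [v - mean for v in y]                  # deviations from the mean
lag = spatial_lag(z, w)                    # vertical axis of the scatterplot

slope = ols_slope(z, lag)                  # slope of the regression line
moran = sum(z[i] * lag[i] for i in range(n)) / sum(v * v for v in z)
print(slope, moran)                        # the two numbers agree
```

The agreement holds for any values because the deviations sum to zero, so centering the spatial lag does not change the numerator of the slope.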

Page 33: Spatial Association and spatial statistic techniques

Moran’s scatterplot

The scatterplot is centered on the coordinate origin

The first and third quadrants of the plot represent positive association (high-high and low-low), while the second and fourth represent negative association (low-high and high-low)

The density of points in the quadrants represents the dominating local spatial process
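The quadrant reading can be sketched as a simple classification of each location by the signs of its deviation and its spatial lag. The function name `moran_quadrants`, the labels and the two four-location examples are assumptions for illustration; points lying exactly on an axis are labelled separately.

```python
def moran_quadrants(values, weights):
    """Label each location by its quadrant in the Moran scatterplot."""
    n = len(values)
    mean = sum(values) / n
    z = [y - mean for y in values]                       # horizontal axis
    lag = [sum(weights[i][j] * z[j] for j in range(n))   # vertical axis
           for i in range(n)]
    labels = []
    for zi, li in zip(z, lag):
        if zi > 0 and li > 0:
            labels.append("high-high")    # first quadrant
        elif zi < 0 and li > 0:
            labels.append("low-high")     # second quadrant
        elif zi < 0 and li < 0:
            labels.append("low-low")      # third quadrant
        elif zi > 0 and li < 0:
            labels.append("high-low")     # fourth quadrant
        else:
            labels.append("on axis")
    return labels


# Hypothetical example: four locations in a row, row-standardized weights.
w = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]

trend = moran_quadrants([1.0, 2.0, 3.0, 4.0], w)
alternating = moran_quadrants([1.0, 4.0, 1.0, 4.0], w)
print(trend)
print(alternating)
```

Counting the labels per quadrant gives a rough numerical version of the "density of the quadrants" mentioned above.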

Page 34: Spatial Association and spatial statistic techniques

Moran’s scatterplot

A so-called LOWESS (LOcally Weighted rEgression Scatterplot Smoothing) curve can aid the visual effects

Turning of the LOWESS curve usually indicates interesting local pockets, regimes or non-stationarity

An example: demonstration in R

Page 35: Spatial Association and spatial statistic techniques

More about Moran’s Scatterplot

A very important ESDA tool for spatial data analysis

Further information can be found in: Anselin, L. (1996). The Moran scatterplot as an ESDA tool to assess local instability in spatial association. pp. 111–125 in M. M. Fischer, H. J. Scholten and D. Unwin (eds.), Spatial Analytical Perspectives on GIS. London: Taylor and Francis

Page 36: Spatial Association and spatial statistic techniques

An analytical example

Spatial pattern detection in China’s provincial development

The variable used: per capita GDP

Dynamic patterns – global Moran’s I

Specific local spatial process – local Moran’s I and the Moran’s scatterplot

Page 37: Spatial Association and spatial statistic techniques

[Figure: map of China showing the Eastern, Central and Western regions, with provinces shaded by per capita GDP in yuan; legend classes: 175-291, 292-430, 431-680, 681-1290, 1291-2498]

China: per capita GDP in 1978

Page 38: Spatial Association and spatial statistic techniques

[Figure: map of China showing the Eastern, Central and Western regions, with provinces shaded by per capita GDP in yuan; legend classes: 869-1913, 1914-3162, 3163-4532, 4533-8411, 8412-15593]

China: per capita GDP in 2000

Page 39: Spatial Association and spatial statistic techniques

An analytical example

[Figure: line chart of global Moran’s I (vertical axis, 0 to 0.25) by year (horizontal axis, 1978 to 2000)]

Dynamic change of global Moran’s I from 1978 to 2000; all values are significant at the 5% level per random permutation

Page 40: Spatial Association and spatial statistic techniques

An analytical example

There is a clustering trend in China’s provincial-level development (represented by per capita GDP)

But the global Moran’s I cannot tell on which side the clustering takes place: do high values cluster, or low values?

Page 41: Spatial Association and spatial statistic techniques

[Figure: Moran scatterplot; horizontal axis: GDP per capita (standardized); vertical axis: spatial lag of GDP per capita (standardized); points labelled with province abbreviations]

The Moran’s scatterplot in 1978

Page 42: Spatial Association and spatial statistic techniques

[Figure: Moran scatterplot; horizontal axis: GDP per capita (standardized); vertical axis: spatial lag of GDP per capita (standardized); points labelled with province abbreviations]

The Moran’s scatterplot in 2000

Page 43: Spatial Association and spatial statistic techniques

[Figure: map of China showing the Eastern, Central and Western regions, with provinces shaded by local Moran’s I; legend classes: < -0.3, -0.3 to 0, 0 to 0.3, 0.3 to 1.0, > 1.0]

Local Moran’s I in 1978

Page 44: Spatial Association and spatial statistic techniques

[Figure: map of China showing the Eastern, Central and Western regions, with provinces shaded by local Moran’s I; legend classes: -0.3 to 0, 0 to 0.3, 0.3 to 1.0, > 1.0]

Local Moran’s I in 2000

Page 45: Spatial Association and spatial statistic techniques

An analytical example

First, China’s coast-interior divide persisted

Interior provinces exhibit great geographical similarity in economic development and in their spatial contributions to the global Moran’s I

Second, the municipalities (Beijing, Tianjin, Shanghai) always contribute the most

Shanghai’s position is worth noting: its development changed the spatial pattern the most

Page 46: Spatial Association and spatial statistic techniques

An analytical example

Third, Guangdong’s contribution to the global index corresponds with its changing spatial behavior depicted in the Moran scatterplot

Fourth, while most of the interior provinces have similar patterns, coastal provinces vary greatly

Page 47: Spatial Association and spatial statistic techniques

An analytical example

Fifth, Shandong fell into the low-low quadrant, and contributed very little to the global index

Sixth, Guizhou and Yunnan, two provinces in southwest China, contributed relatively highly to the global index in 2000

The poorest ones tend to form a poor cluster

Page 48: Spatial Association and spatial statistic techniques

Demo – with R and SPDEP

A little demonstration

The software package R: freeware, powerful, open source

Packages: spdep and maptools

If you have spatial data and are interested in using ESDA, you can approach me about your research