axhausen, k.w. (2007) problems with errors: a brief...
TRANSCRIPT
1
Preferred citation style for this presentation
Axhausen, K.W. (2007) Problems with errors: A brief introduction, Englishseminar 2007, Schliersee, February 2007.
Problems with errors: A brief introduction
KW Axhausen
IVTETHZürich
February 2007
3
Example: Population growth of Swiss municipalities
4
Example: Link speeds (Kanton Zürich)
0-1920-3940-5960-7980-99
100-119>120
Km/h
5
Software
Hierarchial (multilevel) models: • MLWin• Any software, which estimates “mixed models” (SAS, GLIM,
LIMDEP, etc.)
Spatial error and lag models:• S or R (integrated into ArcGIS)• GeoDA• LeSage’s Econometric Toolbox for MatLab
6
Starting point
OLS assumes:
y Dependent variableβ Vector of parametersX Matrix of independent variablesε Errorσ Variance of the error
εβ += Xy~ (0, )iid Nε σ
7
What can go wrong ?
Heteroscedacity 1
Heteroscedacity 2
Collinearity
ˆ~ yε
~ xε1 0 00 1 0
cov( , )0
0 0 1
i jx x
⎧ ⎫⎪ ⎪⎪ ⎪≠ ⎨ ⎬⎪ ⎪⎪ ⎪⎩ ⎭
K
K
M M O
K
8
What else can go wrong ?
1 0 00 1 0
cov( , )0
0 0 1
n mε ε
⎧ ⎫⎪ ⎪⎪ ⎪≠ ⎨ ⎬⎪ ⎪⎪ ⎪⎩ ⎭
K
K
M M O
K
Spatial or temporal vicinity
9
Hierarchical regression (Simplest 2-level model)
with:
and:
ijijijij xxy 1100 ββ +=
0 0 0 0ij j ijuβ β ε= + +
1 1 1 1ij j ijuβ β ε= + +
fixed part random part
random partfixed part
Example:
y Relative population growthβ0,1 Parameterx0 Constantx1 Change in accessibilityu Systematic error (departure of the
j-th Cantons intercept (slope) from the overall value)
ε Error (departure of the i-th municipality‘s actual score from the predicted score)
i Level 1 (Municipality)j Level 2 (Kanton)
~ (0, )iid Nε σ
10
Example: Swiss population growth
Accessibility change
Rel
ativ
e po
pula
tion
grow
th
11
Example: “Systematic errors”
inte
rcep
tS
lope
Rank
Rank
12
Example: Neighbourhoods in Swiss population growth
OtherHighest interceptsHighest slopes
13
Spatial regression models
Spatial autoregressive model (SAR):
Spatial error model (SEM)
Spatial autoregressive and spatial error model combined (SAC):
with W: neighborhood matrix (contiguity matrix) with row sum=1ρ: influence factor of spatial autoregressive dependenceλ: influence factor of spatial dependence of error
εβρ ++= XyWy A
uXyWy A ++= βρ ελ += uWu E
uXy += β ελ += uWu E
~ (0, )iid Nε σ
14
Neighbors: Euclidean distance vs. network distance
Five spatially symmetric nearest neighbors by Euclidean distance
Five neighbors within a network distance of up to two intersections
15
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 160
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SE-neighbors (λ)
AR-n
eigh
bors
-17283--17250
-17317--17283
-17350--17317
-17383--17350
-17417--17383
-17450--17417
Log-Likelihood
AR-n
eigh
bors
(ρ)
0.0 2.0 5.9 12.5 21.1 31.9 45.0 60.4 78.20.0
2.0
5.9
12.5
21.1
31.9
45.0
60.4
78.2
SE-neighbors (λ)
AR
-nei
ghbo
rs
-17150--17117
-17183--17150
-17217--17183
-17250--17217
-17283--17250
-17317--17283
-17350--17317
-17383--17350
-17417--17383
-17450--17417
Log-Likelihood
Nearest neighbors byEuclidean distance
Nearest neighbors bynetwork distance of intersections
SAR: ρ4=0.249 -SEM: - λ7=0.374SAC: ρ11=0.242 λ3=0.173
SAR: ρ5.9=0.298 -SEM: - λ21.1=0.632SAC: ρ12.5=0.299 λ2.0 =0.142
Log-Likelihood Measure: Choosing Best Model (W-Matrix)
16
1 intersection (~2 neighbors) 5 intersections (~32 neighbors)
The CPU memory problem of the W-Matrices
17
Predictions
Ordinary least squares and weighted least squares (OLS, WLS):
Spatial autoregressive model (SAR):
Spatial error model (SEM):
Spatial autoregressive and spatial error model combined (SAC):
βXy =ˆ
βXy =ˆ
βρ XWIy A1)(ˆ −−=
βρ XWIy A1)(ˆ −−=
18
Do you always check ?
• Collinearity (Covariance matrix of the X) ?
• Correlations between y and ε ?
• Correlations between X and ε ? • Presence of structural units (e.g. organisational membership,
social networks and groups, cohorts, life style groups, residential clustering) ?
• Temporal correlations (DW – Test) ?• Spatial correlations (Moran’s I) ?
19
spatially correlated error?
auto-regressive correlation?
hetero- scedastic residuals?
homo- scedastic residuals?
spatially correlated error?
Spatial autoregression Spatially correlated error
OLS
WLS
SAR
SAC
No spatial correlation
SEM
Choosing appropriate regression model
20
Thanks to
Martin Tschopp (Hierarchical models)
Michael Bernard and Jeremy Hackney (Spatial lag and error models)