
INTEGRATING SPATIAL STATISTICS AND GIS FOR REGIONAL STUDIES IN THAILAND

TRAN Hung, Dr.; Yoshifumi YASUOKA, Prof.
Institute of Industrial Science, University of Tokyo
4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan
Tel: +81-3-5452-6415 Fax: +81-3-5452-6410 E-mail: [email protected]

Abstract: This paper utilizes spatial statistical techniques integrated with GIS to analyze regional development problems associated with the rapid expansion of the regional city of Chiang Mai – Lamphun in Northern Thailand during the 1986–1994 period. A set of combined GIS procedures is used to integrate up-to-date information from remotely sensed, physio-graphical and socio-economic data in order to create a comprehensive spatial database for the study region. Then, with a combination of logical and statistical operations, reliable cross-sectional spatialized study variables describing physical, socio-demographic and economic aspects of the regional development at the sub-district level are derived for subsequent spatial statistical analysis. The spatial patterns of regional development in the Chiang Mai – Lamphun area over time are traced out in various development aspects using spatial autocorrelation and spatial association statistics. The unbalanced regional development pattern and significant urban-rural disparity associated with rapid urban growth in the region are explored. Then, spatial statistical modeling is adopted to simultaneously model spatialized physical and socio-economic factors affecting the urban-rural disparity and urban-rural interaction in order to provide insights into the region-wide spatial impacts on rural surroundings of the rapid economic growth in the Chiang Mai – Lamphun urban centers. This paper demonstrates that GIS can help to solve the “data barriers” problem in developing countries, while its integration with spatial statistics is useful in gaining further insights into regional development problems.

Keywords: Spatial Statistics, GIS, Regional Development, Developing Countries, Thailand.

1. INTRODUCTION

Up-to-date and reliable information is vital for the management of a region’s human and natural resources and for dealing with regional development decisions that have a spatial context (Klosterman, 1995). A comprehensive information base could reduce uncertainty and enhance decision-making. Managers and policy makers may wish to integrate social, economic and environmental data in order to formulate strategic development plans (Kliskey, 1995). In developing countries, however, data barriers persist for both institutional and technical reasons. Although institutional issues are being recognized and governments have started to invest millions of dollars in collecting data, data management and usage are still far from a satisfactory level. Information on various aspects of regional development – social, economic and environmental data – is originally collected for different purposes, at different scales, over different time frames and with different underlying assumptions about the nature of the phenomena. This creates technical difficulties for the integration of social and environmental data, and explains the scarcity of successful empirical research on regional development analysis in developing countries.

In recent years, Geographic Information Systems (GIS) have become an important tool for regional and urban research. As about 80-90% of the data collected and used for regional and environmental information systems are related to geography (Huxhold, 1991), GIS provides an integrated computing environment for social and environmental data integration. It is widely recognized that GIS provides a large range of analytical capabilities to operate on topological relationships or spatial aspects of geographical data, on the non-spatial attributes of such data, or on non-spatial and spatial attributes combined. GIS facilitates the integration of disparate data sets, the creation of new and derivative data sets, and the development and analysis of spatially explicit variables. Furthermore, the integration of GIS with spatial statistical analysis has the potential to become a powerful analytical toolbox, enabling regional and social scientists to gain fundamental insight into the nature of the spatial structures of regional development (Brown, 1996). Many efforts have been made to apply GIS, spatial statistics and modeling to regional studies. Key contributions to this emerging literature include those by Getis and Ord (1992), Anselin (1994, 1995), Chou (1995), and Bao et al. (1995), who contributed to the building of theoretical concepts.

The objective of this empirical study is twofold: (1) to demonstrate GIS capability in integrating disparate datasets and creating a comprehensive regional spatial database, and (2) to analyze regional development problems associated with the rapid expansion of the regional city of Chiang Mai – Lamphun in Northern Thailand. Specifically, this study attempts to provide answers to the following questions:

1. What is the spatial pattern of urban growth centers in the Chiang Mai - Lamphun area?

2. To what spatial extent do the urban growth centers have impacts on their rural surroundings?

3. What are the socio-economic factors explaining the pattern and intensity of urban core - rural periphery interactions?

2. STUDY AREA

The Chiang Mai - Lamphun area is defined as the first regional growth center for the North, according to Thailand’s Fourth National Development Plan (1977-1981), in line with decentralized/regionally balanced urban development. It has been continuously included as one of the Industrial Promotion Zones of subsequent Plans. The expected development impacts of those growth centers are to provide services to and induce growth in the hinterland through diffusion of innovations and strengthening of forward and backward linkages (Lo, 1981). However, as indicated by Sharma (1984) and Potter & Unwin (1989), polarization forces tend to be stronger than trickle-down forces, which may produce a spatial structure of a dominant core with a dependent periphery and widen income inequalities.

The study area is located approximately between latitude 18°08′ N and 19°06′ N and longitude 98°30′ E and 99°25′ E, with a total area of 5806 km². Administratively, the study area is composed of ten districts of Chiang Mai Province and six districts of Lamphun Province, resulting in a total of 146 subdistricts or tambols¹ (Figure 1). Topographically, the area covers most of the Chiang Mai basin associated with the Ping River, surrounded by hilly to mountainous terrain. This natural condition allows one to consider the area as an independent functional economic area² in spatial regional analysis. The transportation network is concentrated around Chiang Mai City and Lamphun Municipality, indicating a regional pattern with two major urban growth centers (Figure 2). In fact, Chiang Mai City has become the monocentric economic, financial and cultural center for the whole region. Moreover, during the last two decades, the area has experienced rapid urban expansion, with ever more rapid industrial establishment (Suwan et al., 1992). From 1986 onward, the Northern Industrial Estate, with 87 projects implemented (as of December 1994), was built in Muang Lamphun District. Lamphun Municipality has thus emerged as a new industrial center in the area. The average income rose as labor shifted from the agricultural to the manufacturing sector. After large investments had been made in the Chiang Mai - Lamphun urban centers during recent decades, it is of interest to investigate what impact they have had on rural areas, and whether any ‘trickling-down’ effect has occurred. Moreover, an understanding of development patterns, phases and constraints, and an appraisal of how far urbanization and industrialization could contribute to the development of the rural hinterland, are necessary to arrive at recommendations for development planning in the region, which might be applicable to other regional cities as well.

Figure 1. Administrative boundaries in the Chiang Mai – Lamphun area

Figure 2. Transportation network in the Chiang Mai – Lamphun area

3. GIS SPATIAL DATA INTEGRATION

3.1 Data Collection and Database Building

With regional development issues in focus, data for the Chiang Mai – Lamphun area are collected from various government offices in the form of physio-graphic data (e.g., topographic, administrative, land-use, industrial location and transportation network maps) and socio-economic indicators. Satellite TM images of different dates (1986 and 1994) are used to provide up-to-date land cover and land use information. Image processing procedures are used to classify the georeferenced remotely sensed images and to produce updated classified land-use maps and transportation network maps in raster format (ERDAS Imagine). The spatial physio-graphic data sets from paper maps are classified, digitized and fed into a vector GIS (Arc/Info). The classified images are then integrated with other environmental and societal data sets through raster-to-vector data conversion to update and build time series data. As a loose data integration approach was used in this research, data are managed in both raster and vector formats and converted from one to the other when the data analysis requires it.

The major source of socio-economic data is the National Rural Development Database (NRD-2C), which provides survey data at the village level every two years from 1986, comprising more than 100 economic and social indicators. Data are also collected from other government documents and statistical records at provincial and municipal offices. They are selected, reclassified and combined based on the basic administrative unit IDs (village code numbers) in dBase IV format. A program in Visual FoxPro is written to automate the process of extracting, normalizing and combining socio-economic indicators including population, income, education, health, natural environmental conditions, services, agriculture and industrial activities, work force, capital investment, employment, etc. The detailed database building procedures are discussed in Tran (1998).

3.2 Integrated GIS Database Management

As data management in GIS facilitates the integration of diverse data sets and determines the analyses possible with those data, some data transformation routines are built in this research to facilitate the conversion of physio-graphic and socio-economic data to a common spatial structure (e.g., a set of areal units). The spatial physio-graphic data in the GIS database are summarized/regionalized by administrative units in order to be compatible with the socio-economic data. Based on an internal homogeneity criterion, the tambol is chosen as the basic spatial unit for data integration in this study.

Compilation of socio-economic data: The socio-economic data are aggregated from the village to the tambol level, and are normalized as relative shares of the total population of each respective tambol, in order to further reduce the effect of unequal tambol sizes.
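The aggregation and normalization step can be sketched as follows. This is a minimal illustration in Python, not the Visual FoxPro program used in the study; the record layout, the field names (`tambol_id`, `population`, `farmers`) and the sample values are hypothetical.

```python
from collections import defaultdict


def aggregate_to_tambol(villages, count_fields):
    """Sum village-level counts per tambol, then express each count
    as a share of the tambol's total population."""
    totals = defaultdict(lambda: defaultdict(float))
    for village in villages:
        tambol = totals[village["tambol_id"]]
        tambol["population"] += village["population"]
        for field in count_fields:
            tambol[field] += village[field]

    shares = {}
    for tambol_id, tambol in totals.items():
        population = tambol["population"]
        # A tambol with no recorded population gets None rather than a division error.
        shares[tambol_id] = {
            field: (tambol[field] / population if population > 0 else None)
            for field in count_fields
        }
        shares[tambol_id]["population"] = population
    return shares


# Hypothetical village records (not data from the study).
villages = [
    {"tambol_id": "T01", "population": 400, "farmers": 300},
    {"tambol_id": "T01", "population": 600, "farmers": 300},
    {"tambol_id": "T02", "population": 500, "farmers": 100},
]
tambol_shares = aggregate_to_tambol(villages, ["farmers"])
```

Expressing counts as shares of the tambol population, rather than as raw totals, is what keeps large and small tambols comparable in the later statistics.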

Creating Spatial Indicators within GIS: The spatial physio-graphic data such as land use types, road networks, irrigation networks and industrial factories are aggregated to the tambol level using spatial overlay and logical-statistical analysis functions in GIS (Arc/Info). Some accessibility measures, such as the distance from residential areas to the nearest roads and nearest factories, are derived through GIS spatial join functions utilizing the locational information of the data. The common summarizing/regionalizing procedures are presented in Figure 3.

Comprehensive spatial GIS database: With all socio-economic data aggregated and all spatial physio-graphic data regionalized to the common tambol level, the GIS join function through a key item (tambol ID) is used to complete the comprehensive spatial GIS database for the Chiang Mai – Lamphun area. The GIS database thus contains comprehensive spatial information characterizing the development states of the Chiang Mai – Lamphun area (in 1986 and 1994) for each tambol in terms of:

- spatial physio-graphic data: % of urban land-use, industrial land-use and agricultural land-use, road length density, irrigation length density, median distance from industrial land to the closest residential areas, median distance from residential areas to the nearest road;

- demographic aspect: population density;

- economic aspect: average household property taxes, travel time to the nearest town and commercial center, % of farmers, per capita number of vehicles, number of factories, per capita industrial capital investment, % of factory employees, average household income, % of people working far from home; and

- social aspect: levels of primary and secondary education, illiteracy rate, etc.

Figure 3. GIS procedure to regionalize the spatial physio-graphic indicators to the tambol level from the Chiang Mai - Lamphun spatial database. [Flowchart: maps of spatial features (land-use, roads, industries, etc.) and the administrative map (tambol boundaries) are combined by spatial overlaying and joining; the database is then updated with the derived spatial variables.]

4. SPATIAL STATISTICAL ANALYSIS

The resulting spatial data sets in a unified format are useful for further empirical analysis of regional spatial development patterns using various multivariate statistical and spatial statistical analysis techniques. Following Potter and Unwin (1989), we explore the development impacts of an urban growth center on its rural surroundings under three main headings: demographic, economic and social aspects. We chose population density to represent the demographic aspect, as population pressure could be one of the important factors pushing rural people from their villages to look for employment in other places. Based on available data, the social aspect is represented by different levels of educational attainment (illiteracy, primary education, secondary education), as education is an essential qualification for rural people to find employment in urban areas (Sriboonruang, 1992). The large set of available economic variables representing the primary, secondary and tertiary sectors is submitted to factor analysis in order to identify the underlying dimensions, or factors, of the existing economic structure. As a result, the economic structure of the study area is represented by three major composite economic factors, each with its group of high-factor-loading original variables, as summarized in Table 1. The detailed procedure to derive the three major economic factors and their interpretation are beyond the scope of this paper; they are discussed in Tran (1998).

As an important building block in spatial information theory, the concept of spatial autocorrelation provides insight into the spatial patterns and association of spatial data. In its most general sense, spatial autocorrelation [A-1] (Moran I) is concerned with the degree of clustering of similar objects, or indicates the extent to which the occurrence of one feature is influenced by the distribution of similar features. In a regional setting, it can be seen as an indicator of a causal process, which suggests the degree of influence exerted by an urban center upon its rural periphery (e.g., spatial interaction processes, externalities, spatial diffusion, copy-catting, spill-overs, etc.). Furthermore, LISA statistics [A-2], suggested by Anselin (1994), can provide further insights into the nature of the core – periphery structure, as the local Moran statistic allows for the identification of spatial agglomerative patterns, while the local Geary allows for the identification of spatial patterns of similarity/dissimilarity (interactions). As exploratory spatial analyses reveal strong spatial dependence in the core – periphery structure, spatial modeling [A-4] (which accounts for spatial effects) is chosen to gain insight into the mechanism by which significant socio-economic factors influence core – periphery interactions. Supplementary technical notes on the measurement of spatial statistics and the spatial modeling adopted in this study are given in the Appendix.
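As an illustration of the global statistic, the following minimal Python sketch computes Moran's I in its standard textbook form, I = (n / S0) Σi Σj wij (xi − x̄)(xj − x̄) / Σi (xi − x̄)², where S0 is the sum of all weights. The notation of Appendix [A-1] may differ; the function name, the toy weight matrix and the sample values are assumptions for illustration only, and the study itself used SpaceStat.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix
    (list of lists), where weights[i][j] > 0 marks j as a neighbor of i."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)          # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    sum_sq = sum(d * d for d in dev)
    return (n / s0) * cross / sum_sq


# Four hypothetical units arranged in a line: 0-1, 1-2 and 2-3 are neighbors.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 2, 3, 4], line_weights)    # similar values adjacent
alternating = morans_i([1, 4, 1, 4], line_weights)  # dissimilar values adjacent
```

In this toy case the smoothly increasing series gives a positive index and the alternating series a negative one, which is the sense in which a positive Moran index indicates clustering of similar values.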

Using the derived set of spatial variables, the integrated spatial statistical analysis and GIS are applied in three main steps: (1) exploring the overall spatial structure of regional development in terms of various socio-economic indicators using spatial association statistics; (2) exploring the spatial extent of the impact zones of the two major growth centers; and (3) spatial modeling of socio-economic factors to understand the roles of the growth centers in spreading development to their rural hinterland. The GIS selection and manipulation functions of Arc/Info 7.0 utilize spatial information such as location, topology and distance to create spatial weight matrices for exploratory SDA statistical modules. SpaceStat 1.80, developed by Luc Anselin (Anselin, 1995), has been used for analyses of global/local spatial patterns, spatial cross-correlograms and spatial modeling, while the Regional Analysis System (RAS) module developed by Shuming Bao (Bao et al., 1995) has been applied for ring analysis. Then, location-specific results of the spatial statistical analyses are transferred back to the GIS (ArcView 3.2) for visualization and mapping.

Table 1. Factor characteristics and respective groups of high-factor-loading economic variables.

• Factor 1 (Index of Urban-biased Economy) highly positively correlates with percentage of urbanized and residential areas, road density, property taxes and proportion of trading population.

• Factor 2 (Index of Industrial-based Economy) highly positively correlates with normalized total number of industrial employees, number of employees in large-scale factories, number of factories, total capital investments, and percentage of industrial land-use.

• Factor 3 (Index of Lacking Opportunity) highly positively correlates with travel time to nearest town and market centers, median distance to industrial centers and nearest roads, and farmer population, and negatively correlates with percentage of agricultural land.

5. SPATIAL IMPACTS OF URBAN CENTERS

5.1 Spatial Patterns of Urban Growth Centers

To identify the spatial agglomeration of urban and industrial development for the whole region, the global Moran indexes [A-1] for the levels of Urban-biased and Industrial-based Economies are calculated. Visually, the study area is characterized by significant clustering patterns of transportation network, population density and average household income around Chiang Mai City and Lamphun Municipality, as shown in Figures 2, 5 and 6 respectively. The clustering patterns of Urban-biased and Industrial-based Economies are confirmed by significant positive Moran indexes of 0.66 and 0.26, with significance levels lower than 0.01%. The Moran scatterplots [A-1] (Figure 4) show a good fit of the regression line for the level of Urban-biased Economy (i.e., highly concentrated urban development around Chiang Mai City) and a relatively poor, albeit significant, fit for the level of Industrial-based Economy (i.e., relatively more scattered industrial development in the study area).

To gain further insight into localized urban core – rural periphery inter-dependencies in terms of urban and industrial development, LISA statistics [A-2] are calculated. The calculated LISA statistics indicate a local, positive spatial association (++) based on the level of Urban-biased Economy (significantly positive Ii and Gi) around Chiang Mai City. This confirms that the urbanization process has spread through the growth [A-2.1] of the Chiang Mai City core onto the rural periphery. On the other hand, the LISA statistics indicate a local, positive spatial association (++) of Industrial-based Economy, but a negative association (-+) of Urban-biased Economy for Lamphun Municipality. This confirms that Lamphun Municipality, with its moderate level of urbanization (spread through decentralization [A-2.2]), serves only as a growth center to spread industrial development. Its urban infrastructure does not seem to have caught up with the rapid industrial development at the Lamphun Industrial Estate.
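A minimal Python sketch of the local Moran statistic is given below, using Anselin's common form Ii = zi Σj wij zj with standardized values zi and row-standardized weights. This is an illustration under assumptions: the exact definitions in Appendix [A-2], including the (++) and (-+) labels, may differ, and the function names and sample values are hypothetical.

```python
import math


def row_standardize(weights):
    """Scale each row of the weight matrix so that it sums to one."""
    result = []
    for row in weights:
        total = sum(row)
        result.append([w / total if total > 0 else 0.0 for w in row])
    return result


def local_moran(values, weights):
    """Return (local Moran Ii, standardized values z, spatial lags of z)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population s.d.
    z = [(v - mean) / sd for v in values]
    w = row_standardize(weights)
    lag = [sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]
    return [z[i] * lag[i] for i in range(n)], z, lag


# Four hypothetical units in a line; high values sit next to high, low next to low.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
local_i, z, lag = local_moran([10, 8, 2, 1], line_weights)
```

A positive Ii means that a unit and its neighbors deviate from the mean in the same direction; with row-standardized weights, the average of the local values equals the global Moran index, which is why the local statistics can be read as a decomposition of the global pattern.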

Figure 4. Moran’s scatterplots for Urban-biased Economy and Industrial-based Economy indexes by tambol in 1994. [Two panels. Urban-biased Economy: lagged Factor 1 in 1994 plotted against the Index of Urban-biased Economy in 1994, with fitted line y = 0.6639x - 0.0365. Industrial-based Economy: lagged Factor 2 in 1994 plotted against the Index of Industrial-based Economy in 1994, with fitted line y = 0.2686x + 0.0308.]

Figure 5. Population density by tambol in the Chiang Mai – Lamphun area in 1994

Figure 6. Interpolated surface of average household income by tambol in the Chiang Mai – Lamphun area in 1994 (using spherical kriging method)

5.2 Spatial Extent of Urban/Industrial Centers by Ring Analysis

The spatial extent of urban centers (by exploring the urban core – rural periphery relationship spatially) can be addressed using LISA statistics in the modified ring analysis (Bao et al., 1995). Starting with the assumption of an isotropic structure of space (classical central place theory), the space can be divided into rings at different distances from the urban cores. By comparing the local Moran indexes for the urban core within different rings, we test the extent of spatial association between development in the city core and its effects on the rural periphery. Based on the nature of the local spatial associations described above, the indexes of Urban-biased and Industrial-based Economies are used in studying the spatial spreading effects of Chiang Mai City and Lamphun Municipality, respectively; results are shown in Table 2.

For Chiang Mai City, the center of the rings is chosen at the Chang Klan tambol centroid (in the center of the inner city). All the tambols of the study area are then divided into five rings, centered at the urban core, according to the adjacency criterion. To identify the scope of spatial association of the urban core with the rural areas, the local Moran and local Geary indexes (p < 0.05) within different rings are calculated. A significant local Moran index value indicates that relatively high values are associated with the core area. As evident in Table 2, there is a significant positive spatial relationship (++) between the core tambol and adjacent tambols. Chiang Mai City can be confirmed again as the spread through growth [A-2.1] type of growth center, i.e., the Chiang Mai suburban tambols receive spreading effects from the growth of the city core in terms of Urban-biased Economy (e.g., spatial expansion of urban facilities). In addition, the local Moran index value for the urban core is significant (p < 0.05) within the 3rd ring. This suggests that the economic development of the Chiang Mai City core is strongly associated with the growth of rural areas within a radius of about eleven kilometers.

For Lamphun Municipality, the centroid of Nai Muang tambol is chosen as the center of the rings. A significantly negative spatial relationship (-+) is observed between the municipality core and adjacent tambols, known as spread through decentralization [A-2.2], meaning that the growth of adjacent tambols is associated with slow growth in the Municipality core. The local Moran index value for the Municipality core is significant (p < 0.05) within the 1st ring, seven kilometers in radius. This delineates the impact zone of significant association between the urban core and the rural periphery of this industrial cluster, which exerts spatial influence on rural areas within seven kilometers. The combined impact zones of Chiang Mai City and Lamphun Municipality are then mapped as shown in Figure 7, which resembles the interpolated surface of the average household income distribution (Figure 6). In addition, using ANOVA, it is found that there are significant differences between the areas inside and outside the delineated impact zones in terms of average household income (F = 6.809 and p < 0.001%), proportions of population working outside their home tambols (F = 2.942 and p < 0.5%) and proportions of population having secondary and higher education (F = 9.203 and p < 0.001%). This suggests that people living inside the delineated impact zones, close to major urban centers, have benefited from recent urban and industrial development significantly more than people outside the zones.
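The ring statistics reported in Table 2 can be illustrated with a minimal Python sketch. It follows the note to Table 2, where ZMEANi is the average standardized value of the units within a given distance of the core, with the core itself excluded. Two points are assumptions made here for illustration: ring membership is decided by straight-line distance between centroids in kilometers (the study assigns rings by an adjacency criterion), and the core's local Moran value is taken as CORE_Z × ZMEANi. The latter agrees closely with most rows of Table 2 (for example, 3.70841 × 0.1567 ≈ 0.5811), but it is not stated in the text. All names and sample values are hypothetical.

```python
import math


def ring_statistics(core, centroids, z, radius_km):
    """Mean standardized value of units within radius_km of the core
    (core excluded), and the core's local Moran value for that ring."""
    core_x, core_y = centroids[core]
    members = sorted(
        name
        for name, (x, y) in centroids.items()
        if name != core and math.hypot(x - core_x, y - core_y) <= radius_km
    )
    if not members:
        return None, None, members
    zmean = sum(z[name] for name in members) / len(members)
    return zmean, z[core] * zmean, members


# Hypothetical centroids (km, projected coordinates) and standardized values.
centroids = {"core": (0, 0), "A": (2, 0), "B": (0, 5), "C": (9, 0), "D": (20, 0)}
z = {"core": 3.7, "A": 1.0, "B": 0.5, "C": 0.0, "D": -1.0}

# Aggregated rings: each radius includes all units of the smaller rings.
rings = {radius: ring_statistics("core", centroids, z, radius) for radius in (3, 6, 11)}
```

Because the rings are aggregated, the statistic weakens as more distant and less similar units enter the average; the radius at which it stops being significant is what the text reads as the edge of the impact zone.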

5.3 Spatial Extent of Industrial Center by Spatial Cross-Correlograms

From another perspective, the effects of spatial configuration on the measure of spatial autocorrelation can be examined using spatial correlograms [A-3], which show the variations of the coefficient over higher-order spatial relationships (Chou, 1995). While spatial autocorrelation has been used for characterizing the spatial pattern of a phenomenon, the concept of spatial cross-association (multiple spatial correlation [A-3]) can be useful to characterize the relationship of two or more phenomena in the spatial domain. In the regional development context, spatial relationships (interactions) tend to go beyond immediate neighbors (or very close distances). Certain variations in spatial patterns, thus, may not be detected using statistics derived from the direct spatial relationship alone. To combine multivariate spatial correlation with spatial correlograms, we propose the so-called spatial cross-correlograms, where the multivariate spatial correlation coefficients [A-3] are used instead of the Moran indexes of Chou’s spatial correlograms (Tran, 1998).

While strong effects of urban development on socio-economic life in the study area were detected by significant Pearson’s correlation coefficients between the index of Urban-biased Economy and socio-demographic indicators, no significant direct effect was found for industrial development. However, a significant positive first-order spatial correlation coefficient³ (r = 0.133) between the indexes of Urban-biased and Industrial-based Economies suggests a possible indirect impact of industrialization on other aspects of development. To shed more light on the spatial effects that an industrial center exerts upon its rural surroundings in different social and demographic aspects, the spatial cross-correlograms are exploited.

Table 2. Identifying the urban core - rural area linkages for the Chiang Mai - Lamphun area using local Moran and local Geary indexes of concentric rings

1. Impact Rings of Chiang Mai City

The urban core at centroid of tambol Chang Klan - Var. Factor 1; CORE_Z: 3.70841
Local Moran and Geary indexes within the distance of the ith Ring - aggregated rings

Ring  Dist(km)  ZMEANi   Ii       E(Ii)    V(Ii)   Z(Ii)     Ci       E(Ci)    V(Ci)   Z(Ci)
--------------------------------------------------------------------------------------------------
2     3         0.1567   0.5811   -0.0046  0.0040  9.2609**  0.0679   0.7171   0.0076  -7.4597**
3     6         0.2069   0.7675   -0.0183  0.0136  6.7477**  1.7728   2.8684   0.0257  -6.8360**
4     11        0.1191   0.4159   -0.0582  0.0206  3.9973*   8.2118   9.1173   0.0391  -4.5810*
5     15        0.0061   0.0227   -0.0877  0.0061  1.4127    13.5566  13.7272  0.0116  -1.5865

2. Impact Rings of Lamphun Municipality

The urban core at centroid of tambol Bang Klang - Var. Factor 2; CORE_Z: 3.72787
Local Moran and Geary indexes within the distance of the ith Ring - aggregated rings

Ring  Dist(km)  ZMEANi   Ii       E(Ii)    V(Ii)   Z(Ii)     Ci       E(Ci)    V(Ci)   Z(Ci)
--------------------------------------------------------------------------------------------------
2     7         0.0324   0.1207   -0.0053  0.0046  1.8617*   0.5697   0.8276   0.0105  -1.93904*
3     11        0.0317   0.1181   -0.0463  0.0219  1.1098    7.0319   7.2413   0.0504  -0.9327
4     17        0.0116   0.0432   -0.0760  0.0144  0.9927    11.7461  11.8964  0.0331  -0.8256
5     21        -0.0306  -0.1142  -0.0945  0.0012  -0.5690   14.8379  14.7929  0.0027  0.8603

Note: * indicates pseudo-significance at p<0.05, ** at p<0.01. Zi are standardized values converted from the original inputs Factor 1 or Factor 2 (Factor 1 ~ N(0,1), Factor 2 ~ N(0,1)).

ZMEANi are the average values of the units surrounding unit i within the designated distance. The value of unit i is not included.

Figure 7. Combined significant impact zones of urban growth centers in the Chiang Mai – Lamphun area.

As distance is essential in spatial interaction models, spatial weight matrices based on distance criteria represent well the possibilities of interaction between pairs of points in space. Hence, the spatial correlation coefficient for each distance indicates the intensity of the spatial interaction at that distance. In this study, the centers of industrial establishment groups within each tambol are assigned as polygon label points for distance calculation. Based on distances between neighboring tambol centers ranging from four to ten kilometers, the spatial weight matrices for distance bands of 2, 4, 5, 10, 15, 20, 25, 30, 35 and 40 km are calculated. (The spatial weight w_ij(d) here is set to one if the centroid of tambol j falls within a given distance d from the centroid of tambol i, and zero otherwise.) Then, multiple spatial correlation coefficients [A-3] between the index of Industrial-based Economy (Factor 2) and the index of Urban-biased Economy (Factor 1), Population Density, Proportion of Primary-educated Population, and Proportion of Working-out Population for 1994 are calculated using SpaceStat 1.80. The spatial cross-correlograms are constructed as graphs of the multiple spatial correlation coefficients by distance band (Figure 8).
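The construction of distance-band weights and of a cross-correlogram can be sketched in Python as follows. The weight rule is the one stated above (w_ij(d) = 1 if the centroid of j lies within distance d of the centroid of i, zero otherwise, with no self-neighbors). The coefficient, however, is an assumption: a generic bivariate Moran-type cross-product is used here as a stand-in, and the multiple spatial correlation coefficient defined in Appendix [A-3] and computed by SpaceStat may differ. Function names and sample data are hypothetical.

```python
import math


def distance_band_weights(points, d):
    """Binary weights: 1 if two distinct points lie within distance d, else 0."""
    n = len(points)
    return [
        [1.0 if i != j and math.dist(points[i], points[j]) <= d else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def standardize(values):
    """Convert values to z-scores using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def cross_moran(x, y, weights):
    """Assumed bivariate Moran-type coefficient: weighted cross-product of
    z-scores of x at i and y at j, divided by the sum of weights.
    With x == y it reduces to the global Moran's I."""
    zx, zy = standardize(x), standardize(y)
    n = len(x)
    s0 = sum(map(sum, weights))
    if s0 == 0:
        return None  # no pairs fall within this distance band
    cross = sum(weights[i][j] * zx[i] * zy[j] for i in range(n) for j in range(n))
    return cross / s0


def cross_correlogram(points, x, y, bands):
    """One coefficient per distance band, as (distance, coefficient) pairs."""
    return [(d, cross_moran(x, y, distance_band_weights(points, d))) for d in bands]


# Hypothetical centroids 5 km apart along a line, and one sample variable.
points = [(0, 0), (5, 0), (10, 0), (15, 0)]
factor = [1, 2, 3, 4]
correlogram = cross_correlogram(points, factor, factor, [2, 5])
```

Plotting the coefficient against the band distance gives the correlogram; bands that are too narrow to contain any pair of centroids return no value, which is one reason the choice of bands should reflect the actual spacing between tambol centers.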

Based on both amplitude and wavelength, the behavior of the cross-correlograms provides much more reliable information than any single Moran index or spatial correlation coefficient, revealing less evident spatial impacts of industrial development. For all development aspects, the constructed spatial cross-correlograms show no significant spatial impacts of industrial development on rural areas at any distance farther than 20 kilometers, even for highly mobile labor flows. Moreover, the maxima of the cross-correlograms show the distance at which the most intensive spatial interaction is possibly taking place. The spatial cross-correlograms for the index of Urban-biased Economy and Population Density have maxima at around five to ten kilometers (which apparently equals the current commuting distance of workers), showing spatial relationships between industrial establishments and inner urban centers. As for urban-rural linkages, the spatial cross-correlogram for the Proportion of Working-out Population shows weak, albeit significant, relationships (correlation) with the level of industrial development within 15 to 20 kilometers, and a maximum at four to 15 kilometers for 1994. Revealing this hidden spatial relationship provides valuable input into the spatial modeling of core-periphery interactions below.

Figure 8. Spatial cross-correlograms based on multiple spatial correlation coefficients varying by different distance-based adjacency weight matrices for level of industrial-based economy with other aspects of development. [Line chart. Horizontal axis: distance adjacency criteria (km), 0 to 40. Vertical axis: spatial correlation coefficients, -0.40 to 0.40. Series: Factor 2, Factor 1, PDENS94, W_OUT94, P_EDUC94.]

5.4 Spatial Modeling of Core-Periphery Interactions

Given the “push” and “pull” factors in urban-rural linkages, the rural population tends to look for employment outside its place of residence (Kaur, 1995). Thus, the rural labor outflow, which indicates both the job attraction of urban centers and the excess of free labor released from the agricultural sector, can represent the intensity of urban-rural linkages resulting from regional development. The Proportion of Working-out Population of each tambol is used as the response variable to model the socio-economic factors significantly influencing the intensity and extent of core-periphery interaction. The Proportion of Working-out Population may initially be formulated as a linear regression model of demographic (population density) and social (various levels of educational attainment) indicators and the three major economic factors (Table 1).

To avoid possible non-normal errors, the dependent variable is transformed using the natural logarithm function and then submitted to classical OLS regression analysis using SpaceStat 1.80. The insignificant explanatory variables are excluded from the model based on the t-value (p = 0.1), and the eventual OLS linear regression model is derived as shown in Table 3. The regression diagnostics show a significant spatial autocorrelation error (at a significance level of 0.001 %), indicating a significant deviation from the basic assumption of linear regression analysis that sample observations are spatially independent and, thus, reducing the validity of the significance tests. Moreover, as the flow of labor is a spatially dependent process (indicated by a strong positive Moran I of 0.4096), the explanation is not complete without some characterization of spatial interaction. Therefore, in order to improve the model estimates and account for spatial effects, the spatial-lag regression model [A-4] is adopted using SpaceStat 1.80, with its output shown in Table 4. The spatial lag term is highly significant and, more importantly, its addition reduced the spatial autocorrelation in the model residuals to an insignificant level (p = 0.125). The pseudo R2 of 0.4921 (vs. an R2 of 0.4567 for the OLS model), the log-likelihood of -88.428 (vs. -101.514) and the Akaike Information Criterion of 122.855 (vs. 171.027) show a significant improvement in overall fit and more reliable parameter estimates for the spatial lag model.
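The diagnostic step described above can be illustrated with a minimal sketch: fit OLS to ln(y + 1), then compute Moran's I of the residuals. This does not reproduce SpaceStat or the study data. The synthetic inputs, the line-shaped adjacency and the function names `ols_log_response` and `residual_morans_i` are assumptions made only for illustration, and no significance test is included.

```python
import numpy as np

def ols_log_response(y, X):
    """OLS of ln(y + 1) on X; an intercept column is added here."""
    y_log = np.log1p(np.asarray(y, dtype=float))
    design = np.column_stack([np.ones(len(y_log)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(design, y_log, rcond=None)
    resid = y_log - design @ beta
    return beta, resid

def residual_morans_i(resid, w):
    """Moran's I of residuals (they already have zero mean when an intercept is fitted)."""
    w = np.asarray(w, dtype=float)
    return (resid.size / w.sum()) * (resid @ w @ resid) / (resid @ resid)

# Synthetic, made-up data: 30 units arranged in a line (hypothetical layout).
rng = np.random.default_rng(0)
n = 30
X = rng.normal(size=(n, 2))      # two made-up explanatory variables
y = rng.poisson(5.0, size=n)     # made-up non-negative response
w = np.zeros((n, n))
idx = np.arange(n - 1)
w[idx, idx + 1] = 1.0            # neighbour to the right
w[idx + 1, idx] = 1.0            # neighbour to the left

beta, resid = ols_log_response(y, X)
i_resid = residual_morans_i(resid, w)
print(beta, i_resid)
```

A residual Moran's I far from zero would suggest, as in the text, that the independence assumption behind the OLS significance tests is doubtful.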

According to the signs of the estimated parameters for the explanatory determinants and the meaning of the model (Table 4), the significant ‘pull’ factors are the level of Urban-biased Economy (Factor 3), the level of Industrial-based Economy (Factor 2), and Population Density, while the significant ‘push’ factors are the Rural-Urban Indicator and the spatial lag of the response variable itself. A closer look at the original economic variables constituting Factor 3 (Table 1) reveals that accessibility is, indeed, a crucial factor in providing rural people the opportunity to move out and seek employment, i.e., in intensifying the urban-rural interaction. Moreover, the model also supports the conventional wisdom that land pressure is the real force affecting the outflow of free agricultural labor (Table 1). However, for the study area, it is found that population pressure is not a factor affecting the outflow of rural labor. The rural rather than the urban population tends to rush out to seek employment, as the urban centers are providing sufficient employment opportunities. Concerning the economic factors, the level of Urban-biased Economy (Factor 1) does not appear to significantly affect the outflow of labor from rural areas, while the level of Industrial-based Economy (Factor 2) is significantly attracting rural labor. These findings have spatial relevance since urbanization is, in fact, concentrated mostly around Chiang Mai City, and the rapid industrialization process in the study area since 1986 appears to have had a favorable impact on employment generation for the rural population.

Table 3 Results of traditional regression analysis (OLS) with test diagnostics

Dependent Variable: ln(Working-out Population + 1)

R2 = 0.4567 R2-adj = 0.4187 Log-likelihood = -101.514 AIC = 171.027

Variable                 Coefficient     Std. Err.      t-value      Prob
Constant                 1.94702         0.197472       9.859745     0.000000
Factor 1                 0.1777          0.100939       1.760476     0.080511
Factor 3                 -0.452775       0.0591491      -7.654810    0.000000
Illiteracy rate          -0.00745726     0.00679591     -1.097315    0.144387
Pop. Density             -0.000138259    5.54217E-05    -2.494679    0.013770
Rural-Urban Indicator    0.630054        0.205611       3.064298     0.002618

Regression Diagnostics

Multicollinearity condition number = 7.165699
Kiefer-Salmon (error normality) = 11.435 (p = 0.003)
Koenker-Bassett test (heteroskedasticity) = 33.138 (p = 0.000)
Moran's I (error) = 0.276 (p = 0.000)
Lagrange Multiplier (error) = 29.165 (p = 0.000)
Lagrange Multiplier (lag) = 37.499 (p = 0.000)

Table 4 Results of spatial-lag regression analysis solved through maximum likelihood, with diagnostics of residuals.

Response Variable: ln(Working-out Population + 1)

pseudo R2 = 0.4921 Log-likelihood = -88.428 AIC = 122.855

Variable                                  Coefficient     Std. Err.      z-value      Prob
Lagged ln(Working-out Population + 1)     0.0497432       0.0126568      3.930157     0.000085
Constant                                  1.39591         0.232631       6.000540     0.000000
Rural-Urban Indicator                     0.464627        0.187835       2.473583     0.013377
Factor 3                                  -0.322413       0.0597546      -5.395623    0.000000
Pop. Density                              -9.88531E-05    5.02207E-05    -1.968374    0.049025
Lagged Factor 2                           0.186305        0.105104       1.772585     0.076297

Regression Diagnostics

Breusch-Pagan (heteroskedasticity) = 36.347 (p = 0.000)
Lagrange Multiplier test (spatial error dependence) = 1.966 (p = 0.125)

6. CONCLUDING REMARKS

The paper has indicated that GIS is a useful technical platform for integrating social and environmental data for regional development studies in Thailand. It facilitated the manipulation of large amounts of geographic data, the generation of spatial variables from the GIS database to supplement the available spatialized socio-economic indicators, and the construction of the topological structure, which altogether facilitated the spatial analysis of complicated spatial phenomena. Furthermore, with the developed spatial databases, GIS can serve as an efficient technical vehicle for spatial analysis and spatial modeling functions to gain insights into regional development problems; e.g., the patterns and roles of the two major growth centers in the regional development of the Chiang Mai – Lamphun area have been empirically explored. The combination of exploratory and explanatory spatial data analyses has revealed the impact of demographic, economic and social factors upon the spatial relationship between the urban core and the rural periphery. Specifically, the exploratory analyses based on LISA statistics, ring analysis and spatial cross-correlograms have revealed indirect spatial neighborhood relationships between various indicators, which had hardly been researched so far. Accounting for the spatial association inherent in the data resulted in a spatial model that better extracts information from the variables and has more precise estimates of model coefficients than does the OLS model. With core-periphery interaction explored, the development impacts of urban and industrial expansion are evaluated in order to enhance growth center development strategies through facilitating various scenarios. Finally, findings from this meso-scale study provide an overall picture of regional development, which has additional importance in opening up a vast scope for further detailed research work in the region as decentralization planning is on the increase in developing Thailand.

ENDNOTES

1 Tambol, equivalent to a sub-district in other countries, is Thailand’s smallest administrative unit with a clearly defined spatial border. Since 1994, each tambol has had its own elected local government, the so-called Tambol Administrative Organization, with a certain degree of administrative decision-making autonomy.

2 A Functional Economic Area is defined as a relatively self-contained labor market, which contains a metropolitan central city and hinterlands within commuting distance (Bao et al., 1995).

3 The first-order spatial correlation coefficient is calculated following equations (A5) to (A7) with the spatial weight matrix defined by the direct adjacency criterion, i.e., wij is set to one if tambol j is adjacent to tambol i, and zero otherwise.

REFERENCES

Anselin L., 1994. “Local Indicators of Spatial Association – LISA”. Research Paper 9331, Morgantown, WV: Regional Research Institute, West Virginia University.

Anselin L., 1995. SpaceStat, A Software Program for the Analysis of Spatial Data, Version 1.80. Morgantown, WV: Regional Research Institute, West Virginia University.

Bao S., Henry M. S. & Barkley D., 1995. “RAS: A Regional Analysis System integrated with Arc/Info”. Computers, Environment and Urban Systems, Vol. 19, No. 1, pp. 37-56.

Brown D. G., 1996. “Spatial statistics and GIS applied to internal migration in Rwanda, Central Africa”. Practical Handbook of Spatial Statistics, Arlinghaus S. L. (ed.). New York: CRC.

Chou Y. H., 1995. “Spatial patterns and spatial autocorrelation”. Spatial Information Theory. A Theoretical Basis for GIS, Frank A. U. & Kuhn W. (eds.), Springer, New York.

Getis A. & Ord J. K., 1992. “The analysis of spatial association by the use of distance statistics”. Geographical Analysis, Vol. 24, pp. 189-206.

Huxhold W. E., 1991. Introduction to Urban Geographic Information Systems. New York: Oxford University Press Inc.

Kaur R., 1995. Urban-Rural Relations. A geographic analysis. Anmol, New Delhi.

Kliskey A. D., 1995. “The role and functionality of GIS as a planning tool in natural-resource management”. Computers, Environment and Urban Systems, Vol. 19, No. 1, pp. 15-22.

Klosterman R. E., 1995. “The appropriateness of geographic information systems for regional planning in the developing world”. Computers, Environment and Urban Systems, Vol. 19, No. 1, pp. 1-13.

Lo F-C. (ed.), 1981. Rural-urban relations and regional development. United Nations Centre for Regional Development, Nagoya, Japan.

Potter R. B. & Unwin T. (eds.), 1989. The geography of urban-rural interaction in developing countries. Routledge, London.

Setty D. E., 1991. “Rural Industrialization, Small-scale and Cottage Industries in Asia”. UNCRD Working Paper No. 91-2. United Nations Centre for Regional Development, Nagoya, Japan.

Sharma P. R., 1984. “Growth centres and regional development: Aspects of theory and policy”. HABITAT International, Vol. 8, No. 2, pp. 133-150.

Sriboonruang S., 1992. Chiang Mai province and its emerging development problems. Faculty of Economics, Chiang Mai University, Chiang Mai, Thailand.

Suwan M. et al., 1992. Impacts of industrialisation upon the villager’s life in Northern Thailand. Faculty of Social Science, Chiang Mai University, Chiang Mai, Thailand.

Tran H., 1998. Integrating GIS with spatial data analysis to study the development impacts of urbanisation and industrialisation: Case study of Chiang Mai - Lamphun area, Thailand. Ph.D. Dissertation No. SR-98-3, Asian Institute of Technology, Bangkok, Thailand.

Wartenberg D., 1985. “Multivariate spatial correlation: a method for exploratory geographical analysis”. Geographical Analysis, Vol. 17, pp. 263-283.

APPENDIX: MEASUREMENT OF SPATIAL STATISTICS AND MODELING

[A-1] Moran Index and Moran Scatterplot

To measure global spatial autocorrelation, one of the most popular indicators is the Moran I, which is defined by:

$$I = \frac{N}{S_0} \, \frac{\sum_i \sum_j w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2} \tag{A1}$$

where N is the number of observed geographic units; xi is the attribute value observed at unit i and x̄ is the mean over all units; wij denotes the spatial relationship between the ith and jth geographic units, which equals 1 for adjacent units and 0 otherwise; S0 = ΣiΣjwij is the total number of adjacent pairs. The value of the Moran I is generally between -1 and 1, indicating the spatial clustering pattern of a phenomenon. The Moran I is positive when nearby objects tend to be similar in attributes, and negative when they tend to be more dissimilar than what is normally expected. It is approximately zero when attribute values are arranged randomly and independently in space. Equation (A1) shows that the Moran I is calculated based on the specification of the spatial weight matrix {wij}, which can be defined by contiguity and/or distance criteria. In this study, spatial weights are defined by spatial adjacency (i.e., wij is set to one if tambol j is adjacent to tambol i, and zero otherwise) for all spatial analyses, unless otherwise indicated.
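As a minimal sketch of equation (A1), the Moran I can be computed directly from a vector of values and a binary adjacency matrix. The four-unit chain layout and the names `morans_i` and `chain_adjacency` are hypothetical and only serve the illustration; they are not part of the study or of SpaceStat.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I following equation (A1)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    s0 = w.sum()               # S0: sum of all weights
    d = x - x.mean()           # deviations from the mean
    cross = d @ w @ d          # sum_i sum_j w_ij * d_i * d_j
    return (n / s0) * cross / (d @ d)

def chain_adjacency(n):
    """Binary adjacency for n units arranged in a line (hypothetical layout)."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w

w = chain_adjacency(4)
trend = morans_i([1, 2, 3, 4], w)        # neighbours are similar
checker = morans_i([1, -1, 1, -1], w)    # neighbours alternate
print(trend, checker)
```

For this toy layout the smoothly increasing values give a positive index (1/3) and the alternating values give a negative one (-1), in line with the interpretation given above.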

On the other hand, the Moran I can be expressed in matrix notation as (Anselin, 1995):

$$I = \frac{N}{S_0} \, \frac{y'Wy}{y'y} \tag{A2}$$

where y is a vector of observations in deviation from the mean, W is the spatial weight matrix (when W is row-standardized, N = S0), and Wy is the associated spatial lag, which is a weighted average of the neighboring values. Thus, the Moran I gives a formal indication of the degree of linear association between a vector of observed values y and its spatial lag Wy. To visualize and summarize the overall pattern of linear association, Anselin (1994) suggested a bivariate scatterplot of the spatial lag Wy against y, which is referred to as a Moran scatterplot. The Moran I, here, can be interpreted as the slope of a regression line of the spatial lag Wy on y, and a lack of fit would indicate important local pockets of nonstationarity. In addition, the Moran scatterplot can be used as a means to identify “outliers”, i.e., locations with extreme values with respect to the central tendency reflected by the regression slope.
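The slope interpretation can be checked numerically with a small sketch: with a row-standardized W and every unit having at least one neighbour, the OLS slope of Wy on y coincides with the Moran I of equation (A2). The four-unit chain and the function names below are assumptions for illustration only.

```python
import numpy as np

def row_standardize(w):
    """Scale each row to sum to one; rows without neighbours stay zero."""
    w = np.asarray(w, dtype=float)
    sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, sums, out=np.zeros_like(w), where=sums > 0)

def moran_scatter(x, w):
    """Return (y, Wy, slope, I) for the Moran scatterplot of x."""
    ws = row_standardize(w)
    y = np.asarray(x, dtype=float)
    y = y - y.mean()                     # deviations from the mean
    wy = ws @ y                          # spatial lag: mean of neighbouring values
    slope = np.polyfit(y, wy, 1)[0]      # OLS slope of Wy on y
    moran = (y.size / ws.sum()) * (y @ wy) / (y @ y)   # equation (A2)
    return y, wy, slope, moran

# Hypothetical layout: four units in a line, each adjacent to the next.
w = np.zeros((4, 4))
for i in range(3):
    w[i, i + 1] = w[i + 1, i] = 1.0

y, wy, slope, moran = moran_scatter([1, 2, 3, 4], w)
print(slope, moran)
```

Plotting `wy` against `y` would give the Moran scatterplot itself; points far from the fitted line are the candidate "outliers" mentioned above.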

[A-2] Local Spatial Statistics and Core-periphery Inter-dependencies

Decomposing a global indicator into the contribution of each observation in order to assess the influence of individual locations, Anselin (1994) proposed Local Indicators of Spatial Association (LISA) as measurements of local spatial association, which include the local Moran and local Geary. The local Moran and local Geary statistics for each observation i are defined as follows (Anselin, 1994):

$$I_i(d) = Z_i \sum_{j \neq i}^{n} w_{ij} Z_j \tag{A3}$$

$$C_i(d) = \sum_{j \neq i}^{n} w_{ij} (Z_i - Z_j)^2 \tag{A4}$$

where the observations Zi and Zj are in standardized form (with a mean of zero and a variance of one). The spatial weights wij are in row-standardized form. So, Ii is the product of Zi and the average of the observations in the surrounding locations. A significant local Moran with consistent signs between Zi and its standardized value suggests that location i is associated with relatively high values in surrounding locations, and vice versa. On the other hand, Ci is a measure of the weighted sum of squared differences between Zi and the values of its surrounding locations. A small and significant Ci suggests a positive spatial association (similarity) of observation i with its surrounding observations, while a large and significant Ci suggests a negative spatial association (dissimilarity).
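A minimal sketch of equations (A3) and (A4) is given below. It assumes a binary adjacency matrix in which every unit has at least one neighbour, uses the population variance for standardization, and does not attempt the significance assessment that the interpretation above relies on. The function name and the four-unit chain are hypothetical.

```python
import numpy as np

def local_moran_geary(x, w):
    """Local Moran I_i (A3) and local Geary C_i (A4) with row-standardized weights."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()            # mean 0, variance 1 (population form)
    w = np.asarray(w, dtype=float).copy()
    np.fill_diagonal(w, 0.0)                # enforce j != i
    w = w / w.sum(axis=1, keepdims=True)    # assumes no unit is without neighbours
    local_i = z * (w @ z)                   # Z_i times the average of neighbouring Z_j
    diff2 = (z[:, None] - z[None, :]) ** 2  # squared differences (Z_i - Z_j)^2
    local_c = (w * diff2).sum(axis=1)
    return local_i, local_c

# Hypothetical layout: four units in a line.
w = np.zeros((4, 4))
for i in range(3):
    w[i, i + 1] = w[i + 1, i] = 1.0

local_i, local_c = local_moran_geary([1, 2, 3, 4], w)
print(local_i, local_c)
```

Under these assumptions the mean of the local Moran values equals the global Moran I computed with the same row-standardized weights, which reflects the decomposition idea behind LISA.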

In a regional analysis context, Bao et al. (1995) extended Anselin’s work by analyzing urban core – rural periphery interdependencies based on combinations of local Moran with local Geary statistics. The spatial association between urban cores and their surrounding rural areas may suggest one of the following four types:

[A-2.1] Spread through growth (++): Rural growth is associated with rapid growth in the urban core, i.e., a significant local Moran index with consistent signs between Zi and its standardized value, and a small and significant local Geary index;

[A-2.2] Spread through decentralization (-+): Rural growth is associated with slow growth in the urban core, i.e., a significant local Moran index with consistent signs between Zi and its standardized value, and a large and significant local Geary index;

[A-2.3] Backwash (+-): Urban core growth is associated with slow growth or decline in the rural areas, i.e., a significant local Moran index with inconsistent signs between Zi and its standardized value, and a large and significant local Geary index;

[A-2.4] Independence (?): Growth in rural areas is not closely associated with changes in economic activity in the urban core, i.e., the local Moran and local Geary indexes are not significant.

[A-3] Spatial Correlation and Spatial Correlograms

To extend the concept of Pearson’s correlation between two variables, spatial correlation takes into account the spatial effects of adjacent areas. For irregularly spaced data (areal features), the multivariate measure of spatial correlation computed in SpaceStat 1.80 follows the approach suggested by Wartenberg (1985). First, all variables are standardized:

$$z_k = (x_k - \mu_k) / \sigma_k \tag{A5}$$

where the subscript k refers to the vector of observations on the kth variable, µk is the mean of variable k, and σk is its standard deviation. Also, the spatial weight matrix is converted to a stochastic matrix, i.e., a matrix for which all elements sum to one. Since the adjacency-based weights are symmetric, the resulting matrix (Ws) is also symmetric, with elements

$$w^{s}_{ij} = w_{ij} / \sum_i \sum_j w_{ij} \tag{A6}$$

where wij are the elements of the unstandardized weights matrix. A matrix of coefficients of spatial association is constructed as:

$$M = Z' W^{s} Z \tag{A7}$$

where Z is a matrix with the values for the standardized variables as columns. The association represented in this matrix is similar in form to a bivariate Moran index between two variables (for more technical details see Anselin, 1995).
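Equations (A5) to (A7) can be sketched in a few lines. The sketch assumes population standard deviations in (A5) and a symmetric binary adjacency matrix; the function name and the toy data are hypothetical and are not taken from SpaceStat or the study.

```python
import numpy as np

def spatial_correlation_matrix(x, w):
    """Matrix M = Z' Ws Z of spatial association, following (A5) to (A7)."""
    x = np.asarray(x, dtype=float)               # shape: (units, variables)
    z = (x - x.mean(axis=0)) / x.std(axis=0)     # (A5), column by column
    w = np.asarray(w, dtype=float)
    ws = w / w.sum()                             # (A6): all elements sum to one
    return z.T @ ws @ z                          # (A7)

# Hypothetical layout: four units in a line, two made-up variables.
w = np.zeros((4, 4))
for i in range(3):
    w[i, i + 1] = w[i + 1, i] = 1.0
x = np.array([[1.0, 1.0],
              [2.0, -1.0],
              [3.0, 1.0],
              [4.0, -1.0]])

m = spatial_correlation_matrix(x, w)
print(m)
```

With this standardization the diagonal entries of M coincide with the Moran I of each variable under the same weights, while the off-diagonal entries play the role of the spatial cross-correlation between pairs of variables.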

Similar to the Moran I, the spatial correlation coefficient is calculated based on the specification of the spatial weight matrix. Chou (1995) proposed spatial correlograms as a graph of the Moran I against different-order spatial relationships (specified by either contiguity or distance criteria), based on which one could define the wavelength and amplitude of the spatial patterns of the phenomenon under study, e.g., clustering or agglomeration effects of regional industrial development.
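A spatial cross-correlogram of the kind described here can be sketched by recomputing the bivariate coefficient of (A7) for a series of distance-based weight matrices. The sketch assumes cumulative distance bands (units within d kilometers count as adjacent), which may differ from the adjacency criteria used in the study; the coordinates, values and function names are made up for illustration.

```python
import numpy as np

def distance_band_weights(coords, d_max):
    """Binary weights: 1 if two distinct units lie within d_max of each other."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    w = (dist <= d_max).astype(float)
    np.fill_diagonal(w, 0.0)
    return w

def cross_correlogram(a, b, coords, thresholds):
    """Bivariate spatial correlation of a and b for each distance threshold."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    za = (a - a.mean()) / a.std()
    zb = (b - b.mean()) / b.std()
    values = []
    for d in thresholds:
        w = distance_band_weights(coords, d)
        s0 = w.sum()
        # No pair within the band: the coefficient is undefined.
        values.append(za @ w @ zb / s0 if s0 > 0 else np.nan)
    return np.array(values)

# Made-up units located 5 km apart along a line.
coords = [[0.0, 0.0], [5.0, 0.0], [10.0, 0.0], [15.0, 0.0], [20.0, 0.0]]
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 1.0, 4.0, 3.0, 6.0]
corr = cross_correlogram(a, b, coords, [5.0, 10.0, 20.0])
print(corr)
```

Plotting `corr` against the thresholds gives the correlogram; the position of its maximum and the distance at which it flattens correspond to the amplitude and wavelength readings discussed in the text.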

[A-4] Spatial Lag Regression Model

Linear models are widely used to demonstrate the co-variation of a response variable with its major socio-economic determinants. The presence of spatial dependence in cross-sectional geo-referenced data could be utilized to interpret the form of spatial interaction, the precise nature of spatial spill-over and the economic and social processes that lie behind this. Transactions occurring near each other may exhibit an adjacency effect, which could be incorporated into the model as an additional explanatory variable in the form of a spatial lag. Formally, a mixed regressive-spatial-autoregressive model includes a spatially lagged variable, Wy, as one of the explanatory variables (Anselin, 1995):

$$y = \rho W y + X\beta + \varepsilon \tag{A8}$$

where y is a vector of observations on the response variable, Wy is the spatial lag of y, ρ is the spatial autoregressive coefficient, X is a matrix of observations on the (exogenous) explanatory variables with the associated vector of regression coefficients β, and ε is a vector of error terms. The estimate for ρ can clearly be considered as an indication of spatial autocorrelation, for example, as an alternative to the use of the Moran (I), Geary (c), or spatial association (G) statistics. As the correlation of the lag Wy (as one of the explanatory variables) with the error term invalidates the optimality of OLS as an estimator for this model, a maximum likelihood (ML) approach needs to be used instead. The estimates of the coefficients in a mixed regressive-spatial-autoregressive model can be interpreted in several ways. The inclusion of Wy in addition to other explanatory variables allows one to assess the degree of spatial dependence in the model, while controlling for the effect of the other explanatory variables. Hence, the main interest is in the spatial effect. Alternatively, the inclusion of Wy allows one to assess the significance of the other (non-spatial) explanatory variables after the spatial dependence is controlled for.
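As a short worked step, the correlation between Wy and the error term can be seen from the reduced form of (A8). Under the added assumption of independent normal errors with common variance σ² (an assumption made here for illustration, with I_N denoting the N × N identity matrix), the log-likelihood that an ML procedure would maximize takes the standard form:

$$y = (I_N - \rho W)^{-1}(X\beta + \varepsilon)$$

$$\ln L(\rho, \beta, \sigma^2) = \ln\lvert I_N - \rho W\rvert - \frac{N}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}(y - \rho W y - X\beta)'(y - \rho W y - X\beta)$$

The first expression shows that every element of y, and hence of Wy, depends on ε. The Jacobian term ln|I_N − ρW| in the second is what distinguishes the ML estimator from a least-squares fit that simply treats Wy as another regressor.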
