Page 1: Developing a Tutorial for Grouping Analysis in ArcGIS

Developing a Tutorial for Grouping Analysis in ArcGIS

Daniel Pierre

May 29, 2014

Page 2: Developing a Tutorial for Grouping Analysis in ArcGIS

1. Introduction

2. Data

3. Grouping Analysis Workflows

4. Tutorial Exercises

5. Conclusions: Recommendations

Presentation Outline

Page 3: Developing a Tutorial for Grouping Analysis in ArcGIS

Lauren Rosenshein Bennett, MS

Geoprocessing Product Engineer

Dr. Konrad Dramowicz

Faculty, Centre of Geographic


Dr. Ela Dramowicz

Faculty, Centre of Geographic


Introduction

Project Sponsor & Supervisors

Page 4: Developing a Tutorial for Grouping Analysis in ArcGIS

Introduction

• Experimental testing of tool with multiple datasets

• Incorporation of Grouping Analysis with other tools

• Review of technical literature on clustering algorithms

• Review of existing tutorials

Project Overview

Page 5: Developing a Tutorial for Grouping Analysis in ArcGIS

Introduction

• Introduced at ArcGIS 10.1

• Available with Basic, Standard and Advanced license levels

• Found in the Spatial Statistics toolbox, within the Mapping Clusters toolset

• Script tool

Grouping Analysis Tool

Page 6: Developing a Tutorial for Grouping Analysis in ArcGIS

Introduction

• “...Performs a classification procedure that tries to find natural clusters in your data.” - Esri

• An aid for data comprehension

• Feature similarity is based on attributes specified as analysis fields and, optionally, spatial constraints

• Given a number of groups, features within each output group are as similar as possible while groups are as different as possible

Grouping Analysis Tool

Page 7: Developing a Tutorial for Grouping Analysis in ArcGIS

Introduction

• Two algorithm types: cluster analysis (traditional K-means) and regionalization (spatial K-means)

• Thirteen parameters (six required)

• Grouping results contingent on the number of groups, analysis fields, and type of spatial constraint

Grouping Analysis Tool

Page 8: Developing a Tutorial for Grouping Analysis in ArcGIS

Data

Features:

• Esri

• City of Vancouver

Multivariate Data:

• World Bank

• BBC

• Weatherbase

• Statistics Canada

Data Sources

Page 9: Developing a Tutorial for Grouping Analysis in ArcGIS

Data

• Data Enrichment (ArcGIS Online)

• HTML table import

• Spreadsheet reformatting

• Table joins

• Feature class edits

Data Preparation

Page 10: Developing a Tutorial for Grouping Analysis in ArcGIS

Data

Selection Criteria:

• Two scales of analysis

• Illustration of various spatial constraint effects on results

• Sufficient number of features

• Visible spatial patterns in results

Tutorial Datasets

Page 11: Developing a Tutorial for Grouping Analysis in ArcGIS

General Steps:

• Exploratory data analysis

• Preprocessing

• Determining appropriate Grouping Analysis settings

• Postprocessing, interpretation and evaluation of results

Grouping Analysis Workflows

Page 12: Developing a Tutorial for Grouping Analysis in ArcGIS

Exploratory Data Analysis

1. Distribution of variable values

• Thematic mapping

• Spatial autocorrelation

2. Spatial relationships among features

• Contiguity of features and number of neighbours

• Spatial autocorrelation

Exploratory Data Analysis

Page 13: Developing a Tutorial for Grouping Analysis in ArcGIS

Exploratory Data Analysis

• Explore distribution of dataset variables

• Choropleth maps and graduated symbol maps

• Identify set of variables to be used for Grouping Analysis

Thematic Mapping

Page 14: Developing a Tutorial for Grouping Analysis in ArcGIS

Exploratory Data Analysis

• Analyze contiguity relationships among features

• Polygon Neighbors tool

• Determine relative connectivity of features by counting number of neighbours

• Frequency tool

Spatial Relationships
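The neighbour-counting step above can be sketched outside ArcGIS. This is a minimal illustration in plain Python, not the Polygon Neighbors or Frequency tools themselves: the pair list and the function names `count_neighbors` and `connectivity_frequency` are hypothetical, and the input is assumed to be a table of (source ID, neighbour ID) pairs.

```python
from collections import Counter

# Hypothetical neighbour pairs, in the spirit of a polygon-neighbour table:
# each tuple is (source feature ID, neighbouring feature ID).
neighbor_pairs = [
    ("A", "B"), ("B", "A"),
    ("B", "C"), ("C", "B"),
    ("B", "D"), ("D", "B"),
    ("C", "D"), ("D", "C"),
]

def count_neighbors(pairs):
    """Return {feature ID: number of distinct neighbours}."""
    neighbors = {}
    for src, nbr in pairs:
        neighbors.setdefault(src, set()).add(nbr)
    return {fid: len(nbrs) for fid, nbrs in neighbors.items()}

def connectivity_frequency(neighbor_counts):
    """Return {number of neighbours: how many features have that many}."""
    return dict(Counter(neighbor_counts.values()))

counts = count_neighbors(neighbor_pairs)
frequency = connectivity_frequency(counts)
print(counts)
print(frequency)
```

Features with very few neighbours stand out in the frequency table, which is the kind of relative connectivity the slide refers to.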

Page 15: Developing a Tutorial for Grouping Analysis in ArcGIS

Exploratory Data Analysis

• Analyze contiguity and/or proximity relationships among features using GeoDa

• Create spatial weights

• Display histogram of feature connectivity according to defined spatial relationships

• Histogram linked to map and attribute table

Alternative Approach

Page 16: Developing a Tutorial for Grouping Analysis in ArcGIS

Exploratory Data Analysis

• Considers attribute values and location of features simultaneously

• Moran’s I statistic determines whether spatial pattern of values is dispersed, random or clustered

• Significance of pattern evaluated with corresponding z-score

• One variable at a time

Spatial Autocorrelation
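As a rough illustration of the statistic named above, the sketch below computes Global Moran's I for one variable using binary neighbour weights. It is not the ArcGIS tool: the function name `morans_i`, the neighbour dictionary, and the four-feature chain are assumptions made for the example, and the z-score used to judge significance is not computed here.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary weights.

    values    -- list of attribute values, one per feature
    neighbors -- dict mapping a feature index to the indices of its neighbours
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # S0 is the sum of all weights; with binary weights it is the number of links.
    s0 = sum(len(nbrs) for nbrs in neighbors.values())
    # Cross-products of deviations for every neighbouring pair.
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbors.items() for j in nbrs)
    squared = sum(d * d for d in dev)
    return (n / s0) * (cross / squared)

# Four features in a row; each one touches only the features next to it.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

clustered = morans_i([1, 2, 3, 4], chain)    # similar values sit side by side
dispersed = morans_i([1, -1, 1, -1], chain)  # values alternate along the row
print(clustered, dispersed)
```

A positive value suggests clustering of similar values and a negative value suggests dispersion; values near zero are consistent with a random pattern.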

Page 17: Developing a Tutorial for Grouping Analysis in ArcGIS

Preprocessing

Use hot spots to limit study area for Grouping Analysis:

• Calculate incremental spatial autocorrelation

• Identify distance band of most intense clustering

• Create hot spot map

• Select features from original dataset based on location of hot spots

Preprocessing

Page 18: Developing a Tutorial for Grouping Analysis in ArcGIS

Grouping Analysis Settings

1. How many groups should be created?

2. Which analysis fields should be used?

3. Is a spatial constraint necessary? If so, which type is appropriate?

Grouping Analysis Settings: Key Considerations

Page 19: Developing a Tutorial for Grouping Analysis in ArcGIS

Grouping Analysis Settings

• Default number is 2

• Sturges’ rule:

C = 1 + 3.3 log10(n), where C is the number of groups and n is the number of features

• Evaluate the optimal number of groups (up to a maximum of 15)

Number of Groups
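A small sketch of Sturges' rule as a starting point for the number of groups. The base-10 logarithm follows from the 3.3 coefficient. Rounding to the nearest integer, the floor of 2 (the tool default) and the ceiling of 15 (the evaluation maximum noted above) are assumptions made for this example, as is the function name `sturges_group_count`.

```python
import math

def sturges_group_count(n_features, max_groups=15):
    """Suggest a number of groups from Sturges' rule, C = 1 + 3.3 log10(n)."""
    if n_features < 1:
        raise ValueError("n_features must be at least 1")
    raw = 1 + 3.3 * math.log10(n_features)
    # Rounding, the floor of 2 and the ceiling of max_groups are assumptions.
    return max(2, min(max_groups, round(raw)))

for n in (10, 77, 100, 1_000_000):
    print(n, sturges_group_count(n))
```

The result is only a first guess; the slides treat the final number of groups as something to evaluate iteratively.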

Page 20: Developing a Tutorial for Grouping Analysis in ArcGIS

Grouping Analysis Settings

Two vs. Three Groups

Page 21: Developing a Tutorial for Grouping Analysis in ArcGIS

Grouping Analysis Settings

• Generally driven by research purpose and objectives of grouping

• Guide selection of analysis fields with exploratory data analysis findings

• Spatial variables may be used as indirect spatial constraints

• Assess effectiveness of fields to distinguish features with output report

Analysis Fields

Page 22: Developing a Tutorial for Grouping Analysis in ArcGIS

Grouping Analysis Settings

Temperature: Spatial Variable

Page 23: Developing a Tutorial for Grouping Analysis in ArcGIS

Grouping Analysis Settings

• Choice of spatial constraint or no spatial constraint determines which algorithm is used for grouping

• No spatial constraint – traditional K-Means (data space only)

• Any spatial constraint – Spatial ‘K’luster Analysis by Tree Edge Removal (SKATER) method (spatial K-Means)

Spatial Constraints
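For the case with no spatial constraint, the idea behind K-means can be sketched as follows. This is a textbook version of Lloyd's algorithm operating in data space only, not the tool's implementation: the function name `kmeans`, the caller-supplied starting centers, and the lack of any field standardization are assumptions for illustration. It requires Python 3.8 or later for `math.dist`.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm. points and centers are lists of equal-length tuples."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the group with the nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # leave an empty group's center where it is
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# One analysis field, four features, two starting centers (all hypothetical).
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
labels, centers = kmeans(points, [(0.0,), (11.0,)])
print(labels, centers)
```

Because nothing in this procedure looks at feature location, members of a group can end up scattered across the map, which is what a spatial constraint is meant to prevent.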

Page 24: Developing a Tutorial for Grouping Analysis in ArcGIS

Grouping Analysis Settings

No Spatial Constraint vs. Spatial Constraint

Page 25: Developing a Tutorial for Grouping Analysis in ArcGIS

Grouping Analysis Settings

• Contiguity – edges only (“rook” type) or edges and corners (“queen” type)

• Delaunay triangulation – contiguity of representations of features as Voronoi polygons

• Proximity – K nearest neighbours

• Spatial weights

Spatial Constraint Types
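The difference between the two contiguity types is easiest to see on a regular grid of square polygons. The sketch below is illustrative only: `grid_neighbors` is a hypothetical helper, and real polygon data would need a geometric test for shared edges and corners rather than row and column offsets.

```python
def grid_neighbors(row, col, n_rows, n_cols, queen=False):
    """Neighbours of a cell in a regular grid of square polygons."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # shared edges ("rook")
    if queen:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # shared corners too
    return [
        (row + dr, col + dc)
        for dr, dc in offsets
        if 0 <= row + dr < n_rows and 0 <= col + dc < n_cols
    ]

# Centre cell of a 3 x 3 grid under each contiguity type.
rook = grid_neighbors(1, 1, 3, 3)
queen = grid_neighbors(1, 1, 3, 3, queen=True)
print(len(rook), len(queen))
```

Queen contiguity always yields at least as many neighbours as rook contiguity, so it places a looser constraint on which features may share a group.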

Page 26: Developing a Tutorial for Grouping Analysis in ArcGIS

Grouping Analysis Settings

• Evaluate optimal number of groups

• Guide selection of analysis fields with calculated R² values

• Visually assess results of specified spatial constraint

Iterative Process for Optimizing Grouping Analysis
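One way to read a per-field R² value is as the share of a field's total variation that is accounted for by group membership. The sketch below assumes that definition, (total sum of squares minus within-group sum of squares) divided by total sum of squares; the function name `r_squared` and the sample values are hypothetical, and the output report should be treated as the authoritative source for the tool's own numbers.

```python
def r_squared(values, labels):
    """Share of a field's variation explained by group membership (assumed definition)."""
    mean = sum(values) / len(values)
    total_ss = sum((v - mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # a constant field cannot distinguish groups
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    within_ss = 0.0
    for members in groups.values():
        group_mean = sum(members) / len(members)
        within_ss += sum((v - group_mean) ** 2 for v in members)
    return (total_ss - within_ss) / total_ss

values = [0.0, 1.0, 10.0, 11.0]
good = r_squared(values, [0, 0, 1, 1])   # groups follow the values
poor = r_squared(values, [0, 1, 0, 1])   # groups cut across the values
print(good, poor)
```

A field with a value near 1 separates the groups well; a field with a value near 0 contributes little and is a candidate for removal in the next iteration.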

Page 27: Developing a Tutorial for Grouping Analysis in ArcGIS

Interpretation & Evaluation

• Spatial distribution of groups (map)

• Global statistics (output report)

• Group and variable statistics (output report)

• Group profiles

Interpretation of Results

Page 28: Developing a Tutorial for Grouping Analysis in ArcGIS

Interpretation & Evaluation

• Compare group means with each other and global range

Group Profiles

Page 29: Developing a Tutorial for Grouping Analysis in ArcGIS

Interpretation & Evaluation

• Compare group means and ranges for each variable

Group Profiles (2)

Page 30: Developing a Tutorial for Grouping Analysis in ArcGIS

• Consider global mean, median and range for each variable

Group Profiles (3)

Interpretation & Evaluation

Page 31: Developing a Tutorial for Grouping Analysis in ArcGIS

Interpretation & Evaluation

• Global Moran’s I statistic

• Determine spatial pattern of group membership

• Measure spatial compactness of group membership

• Clustered groups generally desired

Evaluation of Results: Spatial Autocorrelation

Figure labels: Dispersed, Random, Clustered

Page 32: Developing a Tutorial for Grouping Analysis in ArcGIS

Interpretation & Evaluation

• Smallest to largest group

• Indicator of balance in group membership

• Balanced number of group members generally desired for comparison of statistics

• Frequency tool

Evaluation of Results: Cluster Size Ratio
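The cluster size ratio described above reduces to a count per group followed by a division. A minimal sketch, assuming group membership is available as a list of labels (the function name `cluster_size_ratio` and the sample labels are hypothetical):

```python
from collections import Counter

def cluster_size_ratio(labels):
    """Ratio of the smallest group's size to the largest group's size."""
    sizes = Counter(labels)
    return min(sizes.values()) / max(sizes.values())

unbalanced = cluster_size_ratio([1, 1, 1, 2, 2, 3])
balanced = cluster_size_ratio([1, 1, 2, 2])
print(unbalanced, balanced)
```

A ratio of 1 means all groups have the same number of members; values close to 0 indicate that at least one group is much smaller than the largest.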

Page 33: Developing a Tutorial for Grouping Analysis in ArcGIS

Interpretation & Evaluation

• Goodness measure that combines concepts of cohesion and separation

• Adapted from cluster analysis to consider attribute data and location

• Silhouette coefficient is calculated for every feature and the average is taken for the entire dataset

Evaluation of Results: Silhouette

Page 34: Developing a Tutorial for Grouping Analysis in ArcGIS

Interpretation & Evaluation

(B – A) / max(A, B) where

A is the distance between a feature and its group center

B is the distance between the feature and its neighbouring group center

Silhouette Coefficient
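The formula above can be sketched as follows, averaging the coefficient over all features. Several points are assumptions: the "neighbouring group" is taken to be the group with the nearest other center, group centers are plain means of their members, and each feature is represented by a tuple of whatever attribute and location values the analyst chooses to include. The function name `mean_silhouette` is hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

def mean_silhouette(points, labels):
    """Average of (B - A) / max(A, B) over all features, using group centers."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    if len(groups) < 2:
        raise ValueError("at least two groups are required")
    centers = {
        lab: tuple(sum(col) / len(members) for col in zip(*members))
        for lab, members in groups.items()
    }
    scores = []
    for p, lab in zip(points, labels):
        a = math.dist(p, centers[lab])  # distance to own group center
        # Distance to the nearest other group center (assumed "neighbouring" group).
        b = min(math.dist(p, c) for other, c in centers.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

points = [(0.0,), (1.0,), (10.0,), (11.0,)]
well_separated = mean_silhouette(points, [0, 0, 1, 1])
mixed = mean_silhouette(points, [0, 1, 0, 1])
print(well_separated, mixed)
```

The first labelling yields a value close to 1 and the second a value near 0, matching the intuition that cohesive, well-separated groups score high.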

Page 35: Developing a Tutorial for Grouping Analysis in ArcGIS

Interpretation & Evaluation

• Range between –1 (poor) and 1 (excellent)

• < 0.2 indicates poor clustering

• > 0.5 indicates good partition of the data

Silhouette Coefficient Values

Page 36: Developing a Tutorial for Grouping Analysis in ArcGIS

Tutorial Exercises

• Six exercises

• Two scenarios (3 exercises for each)

• Suitable for users at all levels of experience

• Exercises take the user through the steps of preprocessing, group creation, interpretation and evaluation of results outlined here

Grouping Analysis Tutorial

Page 37: Developing a Tutorial for Grouping Analysis in ArcGIS

Tutorial Exercises

Exercises:

1. Data exploration

2. Grouping for exploratory data analysis

3. Using Spatial Statistics tools to target areas of interest

Scenario 1: Analysis of Crime in Chicago

Page 38: Developing a Tutorial for Grouping Analysis in ArcGIS

Tutorial Exercises

Exercises:

4. Create groups and use results to write profiles

5. Explore effects of spatial constraints

6. Evaluation of results

Scenario 2: Analysis of Olympic Results

Page 39: Developing a Tutorial for Grouping Analysis in ArcGIS

Tutorial Exercises

1. All tutorial exercises use polygon data exclusively; point features not covered

2. Space-time constraints using spatial weights matrix file not covered

3. Geared toward the general user; no exercises specifically target advanced users

Limitations

Page 40: Developing a Tutorial for Grouping Analysis in ArcGIS

Recommendations

1. Exploratory data analysis

2. Grouping Analysis

3. Evaluation of results

Recommendations: Enhancements and Additional Tools

Page 41: Developing a Tutorial for Grouping Analysis in ArcGIS

Recommendations

• Multi-step process using Polygon Neighbors, Frequency and table joins could be simplified

• Dynamic linking of objects can make use of existing ArcGIS functionality

Determining Spatial Relationships Among Features

Page 42: Developing a Tutorial for Grouping Analysis in ArcGIS

Recommendations

• Expand types of spatial relationships that can be analyzed

• Enable the analysis of higher order relationships

Determining Spatial Relationships Among Features (continued)

Page 43: Developing a Tutorial for Grouping Analysis in ArcGIS

Recommendations

• Tools for determining most useful diagnostic or predictor variables

• Guide selection of analysis fields for data partitioning

• Adapt neural networks or other data mining tools to work with spatial constraints

Identification of Useful Diagnostic Variables

Page 44: Developing a Tutorial for Grouping Analysis in ArcGIS

Recommendations

Grouping Analysis Tool Enhancements

• Create unique identifier

• Replace null values

Page 45: Developing a Tutorial for Grouping Analysis in ArcGIS

Recommendations

• Spatial weights matrix can be used as the spatial constraint for creating groups

• Custom weights require either manual table creation or programming

• Solution: interactive feature selection

User-defined spatial relationships among features

Page 46: Developing a Tutorial for Grouping Analysis in ArcGIS

Recommendations

• Expand beyond R² and F-statistic values in output report

• Adapt methods used to evaluate cluster analysis algorithms (e.g. Silhouette)

• Challenge: universally applicable evaluation methods may not be feasible

Evaluation of Results

Page 47: Developing a Tutorial for Grouping Analysis in ArcGIS

THANK YOU!

