Lecture 9: Spatial Data Mining - Fudan University (admis.fudan.edu.cn/member/sgzhou/courses/data...)
Data Mining: Tech. & Appl.
Lecture 9: Spatial Data Mining
Zhou Shuigeng
May 27, 2007
Outline
• Spatial Databases
• Spatial Data Mining
• Spatial Data Warehousing
• Spatial Data Mining Methods
• Summary
• References
Spatial Data
• Spatial data has location or geo-referenced features
• Some of these features are:
  • Address, latitude/longitude (explicit)
  • Location-based partitions in databases (implicit)
Spatial Databases
• Spatial Database Systems (SDBS)
  • database systems supporting spatial data types in their data model and implementation
  • objects with location and extension in a multi-dimensional space
Spatial Data Format
• Raster Data
  • represents spatial data as rows / columns of pixels (volume representation)
  • obtained from equipment such as earth observation satellites, which measure the emitted / reflected amplitude in some frequency band
• Vector Data
  • represents spatial data by their boundary (boundary representation)
  • points, lines, polygons, polyhedrons, etc.
  • often obtained from raster data using image processing methods
Spatial Queries (1)
• Spatial selection may involve specialized selection comparison operations:
  • Near
  • North, South, East, West
  • Contained in
  • Overlap/intersect
• Region (range) query: find objects that intersect a given region
• Nearest neighbor query: find the object closest to an identified object
• Distance scan: find objects within a certain distance of an identified object, where the distance is made increasingly larger
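The three query types above can be sketched on plain point data. This is a minimal brute-force illustration, not how an SDBS evaluates queries; the point set and the function names `range_query`, `nearest_neighbor`, and `distance_scan` are hypothetical.

```python
import math

# Hypothetical point objects: name -> (x, y)
points = {"a": (1.0, 1.0), "b": (4.0, 5.0), "c": (6.0, 2.0), "d": (2.0, 3.0)}

def range_query(pts, xmin, ymin, xmax, ymax):
    """Names of points that fall inside the query rectangle."""
    return sorted(n for n, (x, y) in pts.items()
                  if xmin <= x <= xmax and ymin <= y <= ymax)

def nearest_neighbor(pts, name):
    """Name of the point closest to the identified point."""
    q = pts[name]
    others = (n for n in pts if n != name)
    return min(others, key=lambda n: math.dist(pts[n], q))

def distance_scan(pts, name, radii):
    """For each (increasing) radius, the points within that distance."""
    q = pts[name]
    return [sorted(n for n in pts if n != name and math.dist(pts[n], q) <= r)
            for r in radii]

print(range_query(points, 0, 0, 3, 4))
print(nearest_neighbor(points, "a"))
print(distance_scan(points, "a", [3, 5.05, 6]))
```

Every function scans all points, which is exactly the cost that spatial index structures are meant to avoid.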
Spatial Queries (2)
Spatial Queries (3)
Spatial Data Structures
• Data structures designed specifically to store or index spatial data
• Often based on B-tree or Binary Search Tree
• Cluster data on disk based on geographic location
• May represent complex spatial structure by placing the spatial object in a containing structure of a specific geographic shape
• Techniques:
  • Quad Tree
  • R-Tree
  • k-D Tree
MBR
• Minimum Bounding Rectangle
• Smallest rectangle that completely contains the object
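As a small illustration of the definition, the sketch below computes an axis-aligned MBR from a polygon's vertices and tests whether two MBRs overlap. Axis alignment, the `(xmin, ymin, xmax, ymax)` tuple layout, and the sample shapes are assumptions made for this example.

```python
def mbr(vertices):
    """Axis-aligned minimum bounding rectangle as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))

def mbr_intersects(a, b):
    """True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# Hypothetical objects given by their boundary vertices
triangle = [(1, 1), (4, 2), (2, 5)]
square = [(3, 4), (6, 4), (6, 7), (3, 7)]
far_away = [(10, 10), (11, 10), (11, 12)]

print(mbr(triangle))
print(mbr_intersects(mbr(triangle), mbr(square)))
print(mbr_intersects(mbr(triangle), mbr(far_away)))
```

In this data the MBRs of the triangle and the square overlap although the shapes themselves do not touch, so an MBR test can only rule pairs out; it cannot confirm a true intersection.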
MBR Examples
Quad Tree
• Hierarchical decomposition of the space into quadrants (MBRs)
• Each level in the tree represents the object as the set of quadrants which contain any portion of the object
• Each lower level is a more exact representation of the object
• The number of levels is determined by the degree of accuracy desired
Quad Tree Example
R-Tree
• As with the Quad Tree, the region is divided into successively smaller rectangles (MBRs)
• Rectangles need not be of the same size or number at each level
• Rectangles may actually overlap
• Lowest level cell has only one object
• Tree maintenance algorithms similar to those for B-trees
R-Tree Example
k-D Tree
• Designed for multi-attribute data, not necessarily spatial
• Variation of binary search tree
• Each level is used to index one of the dimensions of the spatial object
• Lowest level cell has only one object
• Divisions not based on MBRs but on successive divisions of the dimension range
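A minimal sketch of the idea follows: each tree level splits on one dimension in turn, and a range search prunes subtrees using the split value. It uses the common variant that stores one point in every node and splits at the median, which is an assumption of this example rather than something the slide specifies; `Node`, `build_kdtree`, and `range_search` are hypothetical names.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    point: tuple            # the point stored at this node
    axis: int               # dimension used to split at this level
    left: Optional["Node"]  # points with a smaller-or-equal coordinate on axis
    right: Optional["Node"] # points with a larger-or-equal coordinate on axis

def build_kdtree(points, depth=0):
    """Recursively split on the median, cycling through the dimensions."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build_kdtree(pts[:mid], depth + 1),
                build_kdtree(pts[mid + 1:], depth + 1))

def range_search(node, lo, hi):
    """All stored points p with lo[k] <= p[k] <= hi[k] in every dimension k."""
    if node is None:
        return []
    found = []
    if all(l <= c <= h for l, c, h in zip(lo, node.point, hi)):
        found.append(node.point)
    a = node.axis
    if lo[a] <= node.point[a]:   # query box reaches into the left half
        found += range_search(node.left, lo, hi)
    if node.point[a] <= hi[a]:   # query box reaches into the right half
        found += range_search(node.right, lo, hi)
    return found

tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(tree.point)
print(sorted(range_search(tree, (3, 1), (8, 5))))
```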
k-D Tree Example
Topological Relationships
• Disjoint
• Overlaps or Intersects
• Equals
• Covered by or inside or contained in
• Covers or contains
Distance Between Objects
• Euclidean
• Manhattan
• Extensions:
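For point objects, the two named distances can be written directly. This is only a sketch for points given as coordinate tuples; distances between extended objects (lines, polygons) need additional conventions that are not shown here.

```python
def euclidean(p, q):
    """Straight-line distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan(p, q):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean((0, 0), (3, 4)))
print(manhattan((0, 0), (3, 4)))
```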
Outline
• Spatial Databases
• What’s Spatial Data Mining?
• Spatial Data Warehousing
• Spatial Data Mining Methods
• Summary
• References
Spatial Data Mining (SDM)
• The process of discovering interesting, useful, non-trivial patterns from large spatial datasets
• Spatial patterns
  • Spatial outliers, discontinuities
    • bad traffic sensors on highways
  • Location prediction models
    • model to identify habitat of endangered species
  • Spatial clusters
    • crime hot-spots, cancer clusters
  • Co-location patterns
    • predator-prey species, symbiosis
    • dental health and fluoride
Spatial Cluster: Example
• The 1854 Asiatic cholera outbreak in London
Spatial Outliers: Example
• Spatial Outliers
  • Traffic Data in Twin Cities
  • Abnormal Sensor Detections
  • Spatial and Temporal Outliers
Predictive Models: Example
• Location Prediction: Bird Habitat Prediction
  • Given training data
  • Predictive model building
  • Predict new data
Co-locations: Example
• Given: A collection of different types of spatial events
• Find: Co-located subsets of event types
Data in Spatial Data Mining
• Non-spatial Information
  • Same as data in traditional data mining
  • Numerical, categorical, ordinal, boolean, etc.
  • e.g., city name, city population
• Spatial Information
  • Spatial attribute: geographically referenced
    • Neighborhood and extent
    • Location, e.g., longitude, latitude, elevation
  • Spatial data representations
    • Raster: gridded space
    • Vector: point, line, polygon
    • Graph: node, edge, path
Relationships on Data in Spatial Data Mining (1)
• Relationships on non-spatial data
  • Explicit
  • Arithmetic, ranking (ordering), etc.
  • Object is an instance of a class, class is a subclass of another class, object is part of another object, object is a member of a set
Relationships on Data in Spatial Data Mining (2)
• Relationships on spatial data
  • Many are implicit
  • Relationship categories
    • Set-oriented: union, intersection, membership, etc.
    • Topological: meet, within, overlap, etc.
    • Directional: North, NE, left, above, behind, etc.
    • Metric: e.g., Euclidean: distance, area, perimeter
    • Dynamic: update, create, destroy, etc.
    • Shape-based and visibility
  • Granularity
Relationships on Data in Spatial Data Mining (3)
• Granularity of spatial data
• Examples of granularity
What’s NOT Spatial Data Mining
• Simple querying of spatial data
  • Find neighbors of Canada given names and boundaries of all countries
• Testing a hypothesis via a primary data analysis
  • Female chimpanzee territories are smaller than male territories
• Uninteresting or obvious patterns in spatial data
  • Heavy rainfall in Minneapolis is correlated with heavy rainfall in St. Paul, given that the two cities are 10 miles apart
• Mining of non-spatial data
  • Diaper sales and beer sales are correlated in the evening
SDM Applications
• Geology
• GIS Systems
• Environmental Science
• Agriculture
• Medicine
• Robotics
• May involve both spatial and temporal aspects
Outline
• Spatial Databases
• Spatial Data Mining
• Spatial Data Warehousing
• Spatial Data Mining Methods
• Summary
• References
Spatial Data Warehousing
• Spatial data warehouse: integrated, subject-oriented, time-variant, and nonvolatile spatial data repository
• Spatial data integration: a big issue
  • Structure-specific formats (raster- vs. vector-based, OO vs. relational models, different storage and indexing, etc.)
  • Vendor-specific formats (ESRI, MapInfo, Intergraph, IDRISI, etc.)
  • Geo-specific formats (geographic vs. equal area projection, etc.)
• Spatial data cube: multidimensional spatial database
  • Both dimensions and measures may contain spatial components
Dimensions and Measures in Spatial Data Warehouse
• Dimensions
  • non-spatial
    • e.g., “25-30 degrees” generalizes to “hot” (both are strings)
  • spatial-to-nonspatial
    • e.g., Seattle generalizes to the description “Pacific Northwest” (as a string)
  • spatial-to-spatial
    • e.g., Seattle generalizes to Pacific Northwest (as a spatial region)
• Measures
  • numerical (e.g., monthly revenue of a region)
    • distributive (e.g., count, sum)
    • algebraic (e.g., average)
    • holistic (e.g., median, rank)
  • spatial
    • collection of spatial pointers (e.g., pointers to all regions with temperature of 25-30 degrees in July)
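The difference between the three kinds of numerical measure shows up when sub-regions are rolled up into a larger region. The sketch below uses made-up per-region temperature readings (hypothetical data): sums and counts combine directly, the average is derived from them, and the median cannot be recovered from per-region medians.

```python
from statistics import median

# Hypothetical readings per sub-region
regions = {"north": [12.0, 15.0, 11.0], "south": [20.0, 22.0]}

# Distributive: per-region partial results combine directly when rolling up
partials = {name: (sum(v), len(v)) for name, v in regions.items()}
total_sum = sum(s for s, _ in partials.values())
total_count = sum(c for _, c in partials.values())

# Algebraic: computed from a fixed number of distributive measures
average = total_sum / total_count

# Holistic: needs the underlying values, not per-region summaries
all_values = [x for v in regions.values() for x in v]
overall_median = median(all_values)
median_of_medians = median(median(v) for v in regions.values())

print(total_sum, total_count, average)
print(overall_median, median_of_medians)
```

Here the median of the per-region medians (16.5) differs from the true overall median (15.0), which is why holistic measures are harder to precompute in a data cube.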
Spatial-to-Spatial Generalization
• Generalize detailed geographic points into clustered regions, such as business, residential, industrial, or agricultural areas, according to land usage
• Requires the merging of a set of geographic areas by spatial operations
  • Dissolve
  • Merge
  • Clip
  • Intersect
  • Union
Example: British Columbia Weather Pattern Analysis
• Input
  • A map with about 3,000 weather probes scattered in B.C.
  • Daily data for temperature, precipitation, wind velocity, etc.
  • Data warehouse using star schema
• Output
  • A map that reveals patterns: merged (similar) regions
• Goals
  • Interactive analysis (drill-down, slice, dice, pivot, roll-up)
  • Fast response time
  • Minimizing storage space used
• Challenge
  • A merged region may contain hundreds of “primitive” regions (polygons)
Star Schema of the BC Weather Warehouse
• Spatial data warehouse
  • Dimensions
    • region_name
    • time
    • temperature
    • precipitation
  • Measurements
    • region_map
    • area
    • count
• Fact table
• Dimension table
Dynamic Merging of Spatial Objects
• Materializing (precomputing) all? Too much storage space
• On-line merge? Slow, expensive
• Precompute rough approximations? Accuracy trade-off
• A better way: object-based, selective (partial) materialization
Methods for Computing Spatial Data Cubes
• On-line aggregation: collect and store pointers to spatial objects in a spatial data cube
  • expensive and slow, need efficient aggregation techniques
• Precompute and store all the possible combinations
  • huge space overhead
• Precompute and store rough approximations in a spatial data cube
  • accuracy trade-off
• Selective computation: only materialize those which will be accessed frequently
  • a reasonable choice
Outline
• Spatial Databases
• Spatial Data Mining
• Spatial Data Warehousing
• Spatial Data Mining Methods
• Summary
• References
Spatial Mining Tasks
• Spatial correlation
• Spatial regression
• Spatial association
• Spatial co-location
• Spatial classification
• Spatial clustering
• Spatial outlier detection
Spatial Auto-correlation (SA)
• First Law of Geography
  • All things are related, but nearby things are more related than distant things
  • Tobler [1970]
• Examples
  • People with similar backgrounds tend to live in the same area
  • Economies of nearby regions tend to be similar
  • Changes in temperature occur gradually over space (and time)
Spatial Correlation Measures
• Spatial autocorrelation
  • Measures
    • distance-based (e.g., K-function)
    • neighbor-based (e.g., Moran’s I)
• Spatial cross-correlation
  • Measures
    • distance-based, e.g., cross K-function
Moran’s I Measure
• Definition
  • $z = \{x_1 - \bar{x}, \ldots, x_n - \bar{x}\}$
  • $x_i$: data values; $\bar{x}$: mean of $x$; $n$: number of data values
  • $W$ is the row-normalized contiguity matrix
Moran’s I Measure
• Ranges between -1 and +1
  • higher positive value ⇒ high SA, clustered, attract
  • lower negative value ⇒ interspersed, de-clustered, repel
  • e.g., spatial randomness ⇒ MI = 0
  • e.g., distribution of vegetation durability ⇒ MI = 0.7
  • e.g., checkerboard ⇒ MI = -1
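The checkerboard case can be checked numerically. The sketch below assumes the common matrix form $MI = \frac{z W z^{T}}{z z^{T}}$ (the formula itself does not survive in the extracted slide text) and assumes rook contiguity on a regular grid, where each cell's neighbors are the cells directly above, below, left, and right. Averaging over a cell's neighbors is equivalent to multiplying by a row-normalized $W$; `morans_i` is a hypothetical helper name.

```python
def morans_i(grid):
    """Moran's I on a 2-D grid with row-normalized rook contiguity."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    z = [[v - mean for v in row] for row in grid]  # deviations from the mean
    num = 0.0  # accumulates z W z^T
    den = 0.0  # accumulates z z^T
    for r in range(rows):
        for c in range(cols):
            nbrs = [z[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols]
            # Row-normalized weights: each neighbor gets weight 1 / len(nbrs)
            num += z[r][c] * sum(nbrs) / len(nbrs)
            den += z[r][c] ** 2
    return num / den

checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
two_halves = [[0, 0, 1, 1] for _ in range(4)]

print(morans_i(checkerboard))
print(morans_i(two_halves))
```

On the checkerboard every neighbor has the opposite value, so the result is -1; the grid split into two homogeneous halves gives a clearly positive value.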
K-Function
• K-function definition:
  • Test against randomness for a point pattern
  • $K(h) = \lambda^{-1} E[\text{number of events within distance } h \text{ of an arbitrary event}]$
    • $\lambda$ is the intensity of events
  • Models departure from randomness over a wide range of scales
K-Function: Example
• For Poisson complete spatial randomness (CSR): $K(h) = \pi h^2$
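A rough way to use this benchmark is to estimate $K(h)$ from data and compare it with $\pi h^2$. The sketch below uses a naive estimator with no edge correction, $\hat{K}(h) = \frac{A}{n^2} \sum_{i} \sum_{j \neq i} \mathbf{1}[d_{ij} \le h]$, where $A$ is the area of the study region and $\hat{\lambda} = n / A$. The estimator choice, the point pattern, and the name `k_hat` are assumptions made for illustration.

```python
import math

def k_hat(points, area, h):
    """Naive K-function estimate (no edge correction)."""
    n = len(points)
    # Ordered pairs (i, j), i != j, that lie within distance h of each other
    pairs = sum(1 for i, p in enumerate(points)
                for j, q in enumerate(points)
                if i != j and math.dist(p, q) <= h)
    # lambda^-1 * (average neighbor count) = (area / n) * (pairs / n)
    return area * pairs / n ** 2

# Hypothetical pattern in a 10 x 10 study region: three points close together
events = [(0, 0), (1, 0), (0, 1), (5, 5)]
h = 1.5
estimate = k_hat(events, area=100.0, h=h)
csr_value = math.pi * h ** 2

print(estimate, csr_value)
print("clustered at this scale" if estimate > csr_value else "not clustered")
```

An estimate well above $\pi h^2$ suggests clustering at scale $h$, and one well below suggests regular spacing; with only four points this is purely illustrative.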
Cross-Correlation
• Cross K-function definition
  • $K_{ij}(h) = \lambda^{-1} E[\text{number of type } j \text{ events within distance } h \text{ of a randomly chosen type } i \text{ event}]$
  • Cross K-function of some pair of spatial feature types
• Example
  • Which pairs are frequently co-located?
Cross-Correlation: Example
Cross-Correlation: Example
Location Prediction
• Given
  • n spatial objects:
  • d different features / maps:
  • a dependent (target) class:
  • a family of function mappings:
• Find
  • a classifier predicting the location of objects of the given classes which maximizes classification accuracy
Location Prediction: Example
• Known nest locations
• Task: predict other nest locations using the maps below
Location Prediction: Methods
• Prediction
  • Continuous: trend, e.g., regression
    • Location aware: spatial autoregressive model (SAR)
  • Discrete: classification, e.g., Bayesian classifier
    • Location aware: Markov random fields (MRF)
Spatial Contextual Model: SAR
• Spatial Autoregressive Model (SAR)
  • $y = \rho W y + X\beta + \varepsilon$
  • Assume that dependent values $y$ are related to each other: $y_i = f(y_j)$ for $i \neq j$
  • Directly model spatial autocorrelation using $W$
• Geographically Weighted Regression (GWR)
  • A method of analyzing spatially varying relationships
    • parameter estimates vary locally
  • Models with Gaussian, logistic or Poisson forms can be fitted
  • Example: $y = X\beta' + \varepsilon'$, where $\beta'$ and $\varepsilon'$ are location dependent
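To see what the $\rho W y$ term does, the sketch below solves the SAR equation for $y$ on three locations in a row, with the noise term set to zero so the result is deterministic. It is not an estimation procedure: $\rho$, $W$, and the values of $X\beta$ are hypothetical, and the fixed-point iteration used here relies on $|\rho| < 1$ with a row-normalized $W$ to converge.

```python
def sar_solve(W, xb, rho, iters=200):
    """Iterate y <- rho * W y + X beta until it settles (noise-free SAR)."""
    y = list(xb)
    for _ in range(iters):
        y = [rho * sum(w * yj for w, yj in zip(row, y)) + b
             for row, b in zip(W, xb)]
    return y

# Three locations in a line: 0 - 1 - 2, row-normalized contiguity matrix
W = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]

# Hypothetical X beta: only location 0 has a non-zero trend term
y = sar_solve(W, xb=[1.0, 0.0, 0.0], rho=0.5)
print([round(v, 4) for v in y])

# With rho = 0 the model reduces to ordinary regression: y equals X beta
print(sar_solve(W, xb=[1.0, 0.0, 0.0], rho=0.0))
```

With $\rho = 0.5$ the effect at location 0 leaks into locations 1 and 2 even though their own $X\beta$ is zero, which is the spatial dependence the model is meant to capture.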
Spatial Contextual Model: MRF
• Markov Random Fields Gaussian Mixture Model (MRF-GMM)
  • Undirected graph to represent the interdependency relationship of random variables
  • A variable depends only on its neighbors
  • Independent of all other variables
  • $f_C(s_i)$ independent of $f_C(s_j)$ if $W(s_i, s_j) = 0$
  • Predict $f_C(s_i)$, given feature value $X$ and neighborhood class label $C_N$
  • Assume $\Pr(c_i)$, $\Pr(X, C_N \mid c_i)$ and $\Pr(X, C_N)$ are mixtures of Gaussian distributions
Spatial Association Rules
• A spatial association rule is an association rule containing at least one spatial neighborhood relation
• Spatial association rule: A ⇒ B [s%, c%]
  • A and B are sets of spatial or non-spatial predicates
    • Topological relations: intersects, overlaps, disjoint, etc.
    • Spatial orientations: left_of, west_of, under, etc.
    • Distance information: close_to, within_distance, etc.
  • s% is the support and c% is the confidence of the rule
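Support and confidence can be computed once each object is described by the set of predicates it satisfies. The sketch below uses a tiny made-up table of towns (hypothetical data and predicate names) and takes support as the fraction of objects satisfying both A and B, and confidence as the fraction of objects satisfying A that also satisfy B.

```python
# Each hypothetical town is described by the predicates that hold for it
towns = [
    {"is_a_large_town", "intersects_highway", "adjacent_to_water"},
    {"is_a_large_town", "intersects_highway", "adjacent_to_water"},
    {"is_a_large_town", "intersects_highway"},
    {"is_a_large_town"},
    {"adjacent_to_water"},
]

def rule_stats(objects, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    n_a = sum(1 for o in objects if antecedent <= o)
    n_ab = sum(1 for o in objects if (antecedent | consequent) <= o)
    support = n_ab / len(objects)
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence

support, confidence = rule_stats(
    towns,
    antecedent={"is_a_large_town", "intersects_highway"},
    consequent={"adjacent_to_water"},
)
print(f"support={support:.0%} confidence={confidence:.0%}")
```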
Spatial Association Rules Mining Methods
• Examples
  • is_a(x, large_town) ∧ intersect(x, highway) ⇒ adjacent_to(x, water) [7%, 85%]
• Two approaches
  • Transaction-based approach
  • Transaction-free approach
![Page 59: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/59.jpg)
Transaction-Based Approach
Determine object type of interest (target object type)
Transform spatial database into set of transactions
  Transaction = one target object plus set of neighboring objects
  Neighborhood definition is crucial
Apply (modified) algorithm for mining frequent itemsets
  e.g., Apriori algorithm
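The steps above can be sketched as follows, assuming point objects and a distance-based neighborhood. The object list, the radius, and the helper names are hypothetical. The itemset search is a simplified level-wise (Apriori-style) loop that omits the usual subset-pruning step.

```python
from math import dist

# Hypothetical point objects: (name, type, x, y). Real data would use
# geometries and a spatial index rather than a full pairwise scan.
OBJECTS = [
    ("t1", "town", 0.0, 0.0),
    ("t2", "town", 10.0, 0.0),
    ("t3", "town", 20.0, 0.0),
    ("h1", "highway", 1.0, 0.0),
    ("w1", "water", 0.0, 1.0),
    ("h2", "highway", 10.0, 1.0),
    ("w2", "water", 11.0, 0.0),
    ("h3", "highway", 20.0, 1.0),
]


def build_transactions(objs, target_type, radius):
    """One transaction per target object: the types of its neighbors."""
    txns = []
    for name, typ, x, y in objs:
        if typ != target_type:
            continue
        items = frozenset(
            other_type
            for other_name, other_type, ox, oy in objs
            if other_name != name and dist((x, y), (ox, oy)) <= radius
        )
        txns.append(items)
    return txns


def frequent_itemsets(txns, min_support):
    """Level-wise search: size-(k+1) candidates come from frequent k-itemsets."""
    result = {}
    candidates = [frozenset([i]) for i in sorted({i for t in txns for i in t})]
    k = 1
    while candidates:
        frequent = {}
        for cand in candidates:
            support = sum(1 for t in txns if cand <= t) / len(txns)
            if support >= min_support:
                frequent[cand] = support
        result.update(frequent)
        k += 1
        # Join step: unions of frequent itemsets that have exactly k items.
        candidates = list({a | b for a in frequent for b in frequent if len(a | b) == k})
    return result


transactions = build_transactions(OBJECTS, "town", radius=1.5)
freq = frequent_itemsets(transactions, min_support=0.6)
```

Changing `radius` changes which objects count as neighbors and therefore which itemsets are frequent, which is why the neighborhood definition matters so much.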
![Page 60: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/60.jpg)
Progressive Refinement Mining of Spatial Association Rules
Hierarchy of spatial neighborhood relations
  “g_close_to” may be specialized to near_by, touch, intersect, contain, etc.
Basic idea: if two objects do not fulfill a rough relationship (such as intersect), they cannot fulfill a refined relationship (such as meet)
Two-step procedure for spatial neighborhood relations
  Step 1: Rough spatial computation (as a filter)
    Using MBR or R-tree for rough estimation
  Step 2: Detailed spatial algorithm (as refinement)
    Is very expensive (e.g., intersect test)
    Apply only to those objects which have passed the rough spatial association test (no less than min_support)
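A small sketch of the filter-and-refine idea, assuming disc-shaped objects so that the exact test stays short. Real geometries would need a much costlier exact test, and an R-tree would replace the pairwise scan. The data and function names are hypothetical, and the min_support check from the slide is omitted.

```python
from math import hypot

# Hypothetical objects: name -> (center_x, center_y, radius).
DISCS = {
    "A": (0.0, 0.0, 1.0),
    "B": (1.5, 0.0, 1.0),
    "C": (1.8, 1.8, 1.0),
    "D": (10.0, 10.0, 1.0),
}


def mbr(disc):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a disc."""
    x, y, r = disc
    return (x - r, y - r, x + r, y + r)


def mbr_intersects(a, b):
    """Step 1 (filter): cheap rectangle overlap test."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def exact_intersects(d1, d2):
    """Step 2 (refinement): exact test, standing in for an expensive one."""
    return hypot(d1[0] - d2[0], d1[1] - d2[1]) <= d1[2] + d2[2]


def filter_and_refine(objs):
    names = sorted(objs)
    # Only pairs whose MBRs overlap survive the filter.
    candidates = [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if mbr_intersects(mbr(objs[a]), mbr(objs[b]))
    ]
    # The exact test runs on the surviving candidates only.
    confirmed = [(a, b) for a, b in candidates if exact_intersects(objs[a], objs[b])]
    return candidates, confirmed


candidates, confirmed = filter_and_refine(DISCS)
```

The pair (A, C) passes the rectangle filter but fails the exact test, which shows why the filter can produce false positives but never false negatives.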
![Page 61: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/61.jpg)
Example
![Page 62: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/62.jpg)
Example
![Page 63: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/63.jpg)
Transaction-Free Approach
Transaction-based approach requires target object type, which restricts set of rules discovered
Alternative approach: based on cliques of neighboring objects
  R-proximity neighborhoods
Database: set of spatial features of different types (e.g., A, B, C):
Example of R-proximity neighborhoods
![Page 64: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/64.jpg)
Transaction-Free Approach
Co-location: set of feature types, e.g., {A,C} or {A,B,C}
Participation ratio of fi in c: proportion of instances of feature (type) fi participating in co-location c
  participation ratio of A in {A,B} = 2/3 = 0.67
  participation ratio of B in {A,B} = 2/2 = 1.0
Participation index: minimum participation ratio over all features fi in a co-location c
  participation index of {A,B} = min{0.67, 1.0} = 0.67
Participation index is an upper bound of the cross-K function (Spatial Statistics)
Participation index is monotonically non-increasing with increasing co-location size
Goal: find all co-locations whose participation index is at least a given minimum threshold
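The definitions above can be sketched in a few lines. The slide's example figure is not available here, so the instances and neighbor pairs below are hypothetical, chosen so that {A,B} reproduces the ratios 2/3 and 2/2 quoted on the slide. An instance participates in a co-location if it belongs to at least one clique containing one instance of every feature type.

```python
from itertools import combinations, product

# Hypothetical instances per feature type and hypothetical neighbor pairs.
INSTANCES = {"A": ["A1", "A2", "A3"], "B": ["B1", "B2"], "C": ["C1"]}
NEIGHBORS = {
    frozenset(p)
    for p in [("A1", "B1"), ("A2", "B2"), ("A1", "C1"), ("B1", "C1")]
}


def row_instances(colocation):
    """Instance tuples (one per feature type) that are pairwise neighbors."""
    rows = []
    for combo in product(*(INSTANCES[f] for f in colocation)):
        if all(frozenset(pair) in NEIGHBORS for pair in combinations(combo, 2)):
            rows.append(combo)
    return rows


def participation_ratios(colocation):
    """Fraction of each feature's instances that appear in some clique."""
    rows = row_instances(colocation)
    return {
        f: len({row[i] for row in rows}) / len(INSTANCES[f])
        for i, f in enumerate(colocation)
    }


def participation_index(colocation):
    """Minimum participation ratio over the features of the co-location."""
    return min(participation_ratios(colocation).values())


pi_ab = participation_index(("A", "B"))
pi_abc = participation_index(("A", "B", "C"))
```

In this toy data the index of {A,B,C} is no larger than that of {A,B}, consistent with the monotonicity property that makes the index usable for pruning.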
![Page 65: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/65.jpg)
The Method
Alternatives for generating co-location candidates
  combinatorial join, geometric join, hybrid approach
Pruning of candidates using the participation index
Multi-resolution pruning
  Start with coarse-resolution neighborhood definition
  Prune if coarse-resolution participation falls below threshold
    anti-monotone because of spatial auto-correlation
  Decrease resolution of neighborhood definition
![Page 66: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/66.jpg)
Spatial Cluster Analysis
Mining clusters: k-means, k-medoids, hierarchical, density-based, etc.
Analysis of distinct features of the clusters
![Page 67: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/67.jpg)
Constraint-Based Clustering
Constraints on individual objects
  Simple selection of relevant objects before clustering
Clustering parameters as constraints
  K-means, density-based: radius, min-# of points
Constraints specified on clusters using SQL aggregates
  Sum of the profits in each cluster > $1 million
Constraints imposed by physical obstacles
  Clustering with obstructed distance
![Page 68: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/68.jpg)
Constraint-Based Clustering: Planning ATM Locations
[Figure panels: “Spatial data with obstacles” (labels: Mountain, River, Bridge) and “Clustering without taking obstacles into consideration” (labels: C1, C2, C3, C4)]
![Page 69: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/69.jpg)
Mining Spatiotemporal Data
Spatiotemporal data
  Data has spatial extensions and changes with time
  Ex: Forest fires, moving objects, hurricanes, and earthquakes
Automatic anomaly detection in massive moving objects
  Moving objects are ubiquitous: GPS, radar, etc.
  Ex: Maritime vessel surveillance
  Problem: Automatic anomaly detection
![Page 70: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/70.jpg)
Analysis: Mining Anomaly in Moving Objects
Raw analysis of collected data does not fully convey “anomaly” information
More effective analysis relies on higher semantic features
Examples:
  A speed boat moving quickly in open water
  A fishing boat moving slowly into the docks
  A yacht circling slowly around landmark during night hours
![Page 71: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/71.jpg)
Framework: Motif-Based Feature Analysis
Motif-based representation
  A motif is a prototypical movement pattern
  View a movement path as a sequence of motif expressions
Motif-oriented feature space
  Automated motif feature extraction
  Semantic-level features
Classification
  Anomaly detection via classification
  High-dimensional classifier
![Page 72: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/72.jpg)
Movement Motifs
Prototypical movement of object
  Right-turn, U-turn
Can be either defined by an expert or discovered automatically from data
  Defined in our framework
Extracted in movement paths
  Path becomes a set of motif expressions
![Page 73: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/73.jpg)
Motif Expression Attributes
Each motif expression has attributes (e.g., speed, location, size)
Attributes express how a motif was expressed
Conveys semantic information useful for classification
  A tight circle at 30mph near landmark Y
  A tight circle at 10mph in location X
![Page 74: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/74.jpg)
Motif-Oriented Feature Space
Attributes describe how motifs are expressed
Let there be A attributes; each path is a set of (A+1)-tuples
  {(mi, v1, v2, …, vA), (mj, v1, v2, …, vA)}
Naïve feature space construction
  1. Let each distinct (mj, v1, v2, …, vA) be a feature
  2. If a path exhibits a particular motif expression, its value is 1. Otherwise, its value is 0.
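The two construction steps can be sketched as below, assuming motif expressions are tuples of (motif, speed in mph, location). The paths, attribute choice, and function name are hypothetical illustrations.

```python
# Hypothetical paths: each path is a set of motif expressions
# (motif, speed_mph, location), i.e. A = 2 attributes per expression.
PATHS = {
    "p1": {("tight_circle", 30, "Y"), ("right_turn", 10, "X")},
    "p2": {("tight_circle", 10, "X")},
}


def naive_feature_space(paths):
    """Step 1: one feature per distinct motif expression.
    Step 2: a path's value for a feature is 1 if it exhibits it, else 0."""
    features = sorted({expr for exprs in paths.values() for expr in exprs})
    vectors = {
        pid: [1 if f in exprs else 0 for f in features]
        for pid, exprs in paths.items()
    }
    return features, vectors


features, vectors = naive_feature_space(PATHS)
```

Note that the two tight circles become unrelated features because their attribute values differ, which is the generalization problem this representation runs into.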
![Page 75: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/75.jpg)
Analyzing Naïve Feature Space
Let there be M distinct motifs and V different possible values for each of the A attributes
Size of feature space is M * V^A
V is usually very large due to high granularity of measurements
  E.g., seconds for time or meters for location
Modest values for A and M could lead to extremely high-dimensional feature space
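As a quick worked example of the size formula, the snippet below plugs in hypothetical values (10 motifs, 3 attributes, 1000 distinct values per attribute). The numbers are assumptions chosen only to show the scale.

```python
def naive_feature_space_size(num_motifs, values_per_attribute, num_attributes):
    """Size of the naive feature space: M * V^A."""
    return num_motifs * values_per_attribute ** num_attributes


# Hypothetical values: M = 10, V = 1000, A = 3.
size = naive_feature_space_size(10, 1000, 3)
print(size)  # 10000000000
```

Even with only three attributes, ten billion binary features is far more than any realistic training set could cover.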
![Page 76: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/76.jpg)
More on Naïve Feature Space
High-dimensional feature space could make effective learning hard
More importantly, highly granular features make generalization impossible!
  (mj, v1, 10:01am, …, vA) vs. (mj, v1, 10:02am, …, vA)
  Learning on one feature has no effect on another feature
Intuition: should have features that describe general high-level concepts
  “Early Morning” instead of 2:03am, 2:04am, …
  “Near Location X” instead of “50m west of Location X”
Solution: Clustering on naïve feature space
![Page 77: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/77.jpg)
Motif Feature Extraction
For each motif attribute, cluster values to form higher-level concepts
Frequency and distribution in learning data dictate the final clusters
Hierarchical micro-clustering
  Small clusters so concepts are not merged unnecessarily
  Hierarchy allows flexibility in describing objects
    For example: “afternoon” vs. “early afternoon” and “late afternoon”
![Page 78: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/78.jpg)
Feature Clustering
Rough, fast micro-clustering method based on BIRCH (SIGMOD’96)
A micro-cluster is represented by a CF Vector: CF = (n, LS, SS)
Centroid and radius can be calculated from CF Vector
CF Additive Theorem allows two CF Vectors to be combined quickly and losslessly
CF Tree is a hierarchy of CF Vectors
  A parent CF Vector holds information for all descendant CF Vectors
  Leaf CF Vector corresponds to a set of actual points
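A minimal sketch of a CF Vector, assuming the usual BIRCH definitions: n is the point count, LS the per-dimension linear sum, and SS the sum of squared norms. The class and method names are illustrative, and no CF Tree is built here.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class CF:
    n: int       # number of points summarized
    ls: tuple    # linear sum of the points, per dimension
    ss: float    # sum of squared norms of the points

    @staticmethod
    def from_point(point):
        return CF(1, tuple(point), sum(v * v for v in point))

    def __add__(self, other):
        # Additivity: merging two clusters just adds the components.
        return CF(
            self.n + other.n,
            tuple(a + b for a, b in zip(self.ls, other.ls)),
            self.ss + other.ss,
        )

    def centroid(self):
        return tuple(v / self.n for v in self.ls)

    def radius(self):
        # Root mean squared distance of the points from the centroid.
        c = self.centroid()
        variance = self.ss / self.n - sum(v * v for v in c)
        return sqrt(max(variance, 0.0))


left = CF.from_point((0.0, 0.0)) + CF.from_point((2.0, 0.0))
merged = left + CF.from_point((4.0, 0.0))
```

Because merging only adds three components, a parent entry can summarize all of its descendants without revisiting the underlying points.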
![Page 79: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/79.jpg)
More on Feature Clustering
Build CF Tree from raw data, much like B-tree
Two parameters in clustering
  B: branching factor of CF Tree
  T: radius threshold of CF Vector
Parameters control how fine micro-clusters are constructed
Hierarchical agglomerative clustering on leaves of CF Tree
Entire process is efficient: O(N)
![Page 80: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/80.jpg)
Extracted Feature Space
Leaf nodes in final clustering become the new features
More general than the original naïve feature space
Dimensionality could still be moderately high
Use Support Vector Machine for classification
![Page 81: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/81.jpg)
Experiments
Synthetic data
  Generated at motif-expression level
  Abnormal paths are injected with abnormal motif expressions
Classifiers
  SVM using naïve feature space
  SVM using extracted feature spaces of varying refinement levels
![Page 82: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/82.jpg)
Experiment
![Page 83: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/83.jpg)
Experiment (2)
![Page 84: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/84.jpg)
Summary: Moving Object Anomaly Detection
Higher-level semantic analysis of moving objects yields better results
Automated feature extraction
Future work
  Automatic determination of T parameter
  Better use of feature space hierarchy
  Other analysis, such as clustering and local outlier detection for anomaly detection
  Mining other knowledge for moving objects
![Page 85: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/85.jpg)
Outline
Spatial Databases
Spatial Data Mining
Spatial Data Warehousing
Spatial Data Mining Methods
Summary
References
![Page 86: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/86.jpg)
Summary (1)
What’s Special About Spatial Data Mining?
  Input Data
  Statistical Foundation
  Output Patterns
  Computational Process
![Page 87: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/87.jpg)
Summary (2)
![Page 88: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/88.jpg)
References (1)
J. Roddick, K. Hornsby, and M. Spiliopoulou, Yet Another Bibliography of Temporal, Spatial and Spatio-temporal Data Mining Research, KDD Workshop, 2001.
S. Shekhar, C. T. Lu, and P. Zhang, A Unified Approach to Detecting Spatial Outliers, GeoInformatica, 7(2), Kluwer Academic Publishers, 2003.
S. Shekhar and S. Chawla, Spatial Databases: A Tour, Prentice Hall, 2003.
S. Shekhar, P. Schrater, R. Vatsavai, W. Wu, and S. Chawla, Spatial Contextual Classification and Prediction Models for Mining Geospatial Data, IEEE Transactions on Multimedia (special issue on Multimedia Databases), 2002.
![Page 89: Lecture 9 Spatial Data Mining - Fudan Universityadmis.fudan.edu.cn/member/sgzhou/courses/data... · Daily data for temperature, precipitation, wind velocity, etc. Data warehouse using](https://reader034.vdocuments.net/reader034/viewer/2022042911/5f4200623b6c6207fc05b8e3/html5/thumbnails/89.jpg)
References (2)
S. Shekhar and Y. Huang, Discovering Spatial Co-location Patterns: A Summary of Results, SSTD, 2001.
A. Fotheringham, C. Brunsdon, and M. Charlton, Geographically Weighted Regression: The Analysis of Spatially Varying Relationships, John Wiley & Sons, 2002.
P. Tan, M. Steinbach, V. Kumar, C. Potter, S. Klooster, and A. Torregrosa, Finding Spatio-Temporal Patterns in Earth Science Data, KDD Workshop on Temporal Data Mining, 2001.
P. Zhang, Y. Huang, S. Shekhar, and V. Kumar, Exploiting Spatial Autocorrelation to Efficiently Process Correlation-Based Similarity Queries, SSTD, 2003.