![Page 1: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/1.jpg)
Mining Weather Data for Decision Support
Roy George
Army High Performance Computing Research Center
Clark Atlanta University
Atlanta, GA 30314
![Page 2: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/2.jpg)
2
Research
• Clustering Algorithms for Data Mining
  • Spatio-Temporal Domain
  • Parallelization of Algorithms
• Algorithms for Feature Extraction and Knowledge Discovery
![Page 3: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/3.jpg)
3
Challenges of Geographical Data
• Complexities associated with data volume
  • Terabyte databases
• Domain complexities
  • Interesting signals hidden by stronger patterns
• Complexities caused by local variation
  • Systems are interconnected
• Data gathering and sampling
  • Interpretation of aggregated data
• Formalizing the domain
![Page 4: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/4.jpg)
4
Background: Issues with Hard Clustering
Issue: Force data with imprecision and/or uncertainty into discrete classes
Result: Missing important outliers, boundary patterns
Approach: Use of Approximate Clustering Technique
![Page 5: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/5.jpg)
5
Background: K-Means Clustering
• Partition the data into K clusters that are homogeneous
• Algorithm
  • Select K time series as initial centroids
  • Assign each time series to the most similar centroid
  • Re-compute the centroids
  • Repeat until the centroids do not change
• Variations are based on different measures of similarity
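The K-Means steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the Euclidean similarity measure, and the random choice of initial centroids are all assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Distance between two equal-length time series (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_series(series_list):
    """Point-wise mean of a non-empty list of equal-length series."""
    n = len(series_list)
    return [sum(vals) / n for vals in zip(*series_list)]


def k_means(series, k, max_iter=100, seed=0):
    """Hard K-means over time series; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Select K time series as initial centroids.
    centroids = [list(s) for s in rng.sample(series, k)]
    labels = None
    for _ in range(max_iter):
        # Assign each time series to the most similar (nearest) centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(s, centroids[j]))
            for s in series
        ]
        # Unchanged assignments imply unchanged centroids, so stop here.
        if new_labels == labels:
            break
        labels = new_labels
        # Re-compute each centroid as the mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(series, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_series(members)
    return centroids, labels
```

Swapping `euclidean` for another distance (for example one based on correlation) gives the "different measures of similarity" variations mentioned on the slide.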
![Page 6: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/6.jpg)
6
Unsupervised Fuzzy K-Means (UKFM) Clustering
• Choose the initial number of clusters
• Develop a clustering using Fuzzy K-Means
• Merge the cluster pair that has maximum correlation
• Compute the validity measure
• Repeat until the termination condition is reached
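A rough sketch of two building blocks of this procedure follows. It assumes the standard fuzzy c-means update equations for the Fuzzy K-Means step and assumes that "correlation" between clusters means the Pearson correlation of their centroids; the slides specify neither, and the validity measure and termination condition are left out because they are not stated. All function names are hypothetical.

```python
import math


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def memberships(series, centroids, m=2.0):
    """u[j][i]: degree to which series j belongs to cluster i (each row sums to 1)."""
    u = []
    for s in series:
        d = [distance(s, c) for c in centroids]
        zeros = [i for i, di in enumerate(d) if di == 0.0]
        if zeros:
            # The series coincides with a centroid: full membership there.
            row = [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
        else:
            inv = [di ** (-2.0 / (m - 1.0)) for di in d]
            total = sum(inv)
            row = [v / total for v in inv]
        u.append(row)
    return u


def update_centroids(series, u, old_centroids, m=2.0):
    """Membership-weighted mean of the series for each cluster."""
    new = []
    for i, old in enumerate(old_centroids):
        weights = [row[i] ** m for row in u]
        total = sum(weights)
        if total == 0.0:
            new.append(list(old))  # no series belongs here; keep the old centroid
            continue
        new.append([
            sum(w * s[t] for w, s in zip(weights, series)) / total
            for t in range(len(series[0]))
        ])
    return new


def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:
        return 0.0
    return cov / math.sqrt(va * vb)


def most_correlated_pair(centroids):
    """Return (r, i, j) for the pair of centroids with maximum correlation."""
    best = None
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            r = pearson(centroids[i], centroids[j])
            if best is None or r > best[0]:
                best = (r, i, j)
    return best
```

Alternating `memberships` and `update_centroids` until the centroids settle gives one clustering stage; `most_correlated_pair` then nominates the pair to merge before the next stage.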
![Page 7: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/7.jpg)
7
UKFM Results: Weather Data Set
• Initial: 11 Clusters
• Optimal: 8 Clusters
• Final: 4 Clusters
![Page 8: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/8.jpg)
8
Global Earth Science Data
• Collaborative effort with V. Kumar (UMinn)
• Test bed for UKFM (comparison with existing techniques)
• Data Set
  • Global Sea Pressure (1989 – 1993)
• Ocean Climate Indices
  • Capture Teleconnections
• Result
  • UKFM can capture even weaker OCIs using coarse clusters
![Page 9: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/9.jpg)
9
Global Climate Data (Sea Level Pressure)
Intermediate: 60 Clusters
![Page 10: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/10.jpg)
10
Global Climate Data (Sea Level Pressure)
Final: 26 Clusters
![Page 11: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/11.jpg)
11
Relation with SOI
![Page 12: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/12.jpg)
12
Integrating Multiple Datasets in UKFM Clustering
• Motivation: Data-based approach to determining “interesting” clusters
  • Validate using multiple datasets
• Rule: Retain clusters that have supporting data
• Applicable in data-rich environments
![Page 13: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/13.jpg)
13
UKFM Clustering with Multi-Dataset Validation
• Choose the initial number of clusters
• Develop a clustering using Fuzzy K-Means
• Validate each cluster with the other datasets D_i, i = 1, ..., n
• Merge if the clusters are uncorrelated; else consider the next candidate pair to merge
• Repeat until the termination condition is reached
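One possible reading of the validation step is sketched below. The slides do not define how a cluster is "validated" against another dataset, so everything here is an assumption for illustration: a cluster is a list of location keys, each dataset maps a location to its time series, a cluster counts as supported when its members are on average well correlated in at least one other dataset, and a pair is merged only when neither cluster is supported. The names `cohesion`, `is_supported`, `next_merge` and the 0.5 threshold are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:
        return 0.0
    return cov / math.sqrt(va * vb)


def cohesion(members, dataset):
    """Mean pairwise correlation of the member locations' series in one dataset."""
    pairs = list(combinations(members, 2))
    if not pairs:
        return 0.0
    return sum(pearson(dataset[a], dataset[b]) for a, b in pairs) / len(pairs)


def is_supported(members, other_datasets, threshold=0.5):
    """Assumed rule: a cluster is supported if it is cohesive in any other dataset."""
    return any(cohesion(members, d) >= threshold for d in other_datasets)


def next_merge(candidate_pairs, clusters, other_datasets, threshold=0.5):
    """Walk candidate pairs in order; return the first pair in which neither
    cluster has supporting data, or None if every pair must be retained."""
    for i, j in candidate_pairs:
        if (not is_supported(clusters[i], other_datasets, threshold)
                and not is_supported(clusters[j], other_datasets, threshold)):
            return (i, j)
    return None
```

Here `candidate_pairs` would be ordered by decreasing cluster correlation, so the loop mirrors "merge, else consider the next candidate pair" while retaining any cluster that another dataset backs up.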
![Page 14: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/14.jpg)
14
UKFM Multi-Dataset Results
[Figure panels: Height, Pressure, Temperature, Windspeed]
![Page 15: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/15.jpg)
15
Multi-threading Parallel Algorithm
For each clustering stage
  For each iteration
    Slaves: Calculate M for each cluster
    Master: Normalize M
    Slaves: Calculate C for each cluster
    Master: Normalize C
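The control flow of one iteration can be sketched with a thread pool. This assumes M is the fuzzy membership matrix and C the set of cluster centroids, which the slide does not spell out, and it reads "Normalize C" as dividing each worker's weighted sum by its total weight. The function names are hypothetical. On standard CPython the global interpreter lock prevents CPU-bound threads from running in parallel, so the sketch shows the master/worker structure rather than the reported speed-up.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def raw_membership(series, centroid, m=2.0):
    """Worker: unnormalized membership of every series in one cluster."""
    out = []
    for s in series:
        d = math.sqrt(sum((x - y) ** 2 for x, y in zip(s, centroid)))
        out.append((d + 1e-12) ** (-2.0 / (m - 1.0)))  # epsilon avoids 1/0
    return out


def raw_centroid(series, column, m=2.0):
    """Worker: weighted sum of the series and total weight for one cluster."""
    weights = [u ** m for u in column]
    sums = [
        sum(w * s[t] for w, s in zip(weights, series))
        for t in range(len(series[0]))
    ]
    return sums, sum(weights)


def parallel_iteration(series, centroids, workers=4, m=2.0):
    """One iteration; M[i][j] is the membership of series j in cluster i."""
    n = len(series)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Workers: calculate M for each cluster.
        raw = list(pool.map(lambda c: raw_membership(series, c, m), centroids))
        # Master: normalize M so each series' memberships sum to 1.
        totals = [sum(col[j] for col in raw) for j in range(n)]
        M = [[col[j] / totals[j] for j in range(n)] for col in raw]
        # Workers: calculate C for each cluster.
        parts = list(pool.map(lambda col: raw_centroid(series, col, m), M))
    # Master: normalize C by the total weight of each cluster.
    C = [[v / total for v in sums] for sums, total in parts]
    return M, C
```

Each cluster is an independent unit of work in both worker phases, which is why the work divides evenly when the number of clusters is a multiple of the number of processors.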
![Page 16: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/16.jpg)
16
Multi-threading Result
Implemented on Sun Fire workstation with four 900-MHz UltraSPARC® III processors
Near Linear Speed Up Obtained
![Page 17: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/17.jpg)
17
Relevance to the Army
• Directly supports the FBKOF STO (B. Broome)
• Development of the Weather Information and Tactical Support (WITS) System
![Page 18: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/18.jpg)
18
Weather Information and Tactical Support (WITS)
Objective: Extract patterns from weather data and fuse them with external databases (logistics, terrain, forces, etc.) for higher-level planning
![Page 19: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/19.jpg)
19
Approach
• Development of an OLAP Weather Repository
  • GA Weather (1981-2002)
  • Sources: Nat. Weather Svc, GA Env. Network
• Development of WITS Modules
  • Ad-hoc Querying
  • Real time
  • Analysis and Planning
• Effects on Army Systems
  • Integration with IWEDA
[Figure: Abstract Data Representation. Labels: YEAR, MONTH, DAY; TEMPERATURE, PRECIPITATION, WIND SPEED, etc.]
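The YEAR / MONTH / DAY hierarchy in the abstract data representation suggests the usual OLAP roll-up, where daily measurements are aggregated to coarser time levels. The sketch below is only an illustration of that idea: the record layout, the field names and the `roll_up` helper are assumptions, and the slides do not describe the actual repository schema.

```python
from collections import defaultdict

# Key fields used at each level of the (assumed) time hierarchy.
LEVELS = {
    "year": ("year",),
    "month": ("year", "month"),
    "day": ("year", "month", "day"),
}


def roll_up(records, level, measure):
    """Average one measure over all records sharing the same key at `level`."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[field] for field in LEVELS[level])
        groups[key].append(rec[measure])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Made-up sample rows, for illustration only.
records = [
    {"year": 2001, "month": 7, "day": 1, "temperature": 29.0, "wind_speed": 4.0},
    {"year": 2001, "month": 7, "day": 2, "temperature": 31.0, "wind_speed": 6.0},
    {"year": 2001, "month": 8, "day": 1, "temperature": 33.0, "wind_speed": 2.0},
]

monthly = roll_up(records, "month", "temperature")
yearly = roll_up(records, "year", "temperature")
```

Ad-hoc queries of the kind the querying module targets would then amount to picking a level, a measure and an aggregate.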
![Page 20: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/20.jpg)
20
WITS System Design
[Figure: system architecture. Components: User Interface; Data Warehouse; Data Mining Modules; Query Modules; Knowledge Bases (IWEDA); Data Cleaning & Transformation; Data Acquisition Agents; Real Time Module; TAPS Module; IQ Module]
![Page 21: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/21.jpg)
21
WITS/IQ
![Page 22: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/22.jpg)
22
WITS/IQ
![Page 23: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/23.jpg)
23
WITS/IWEDA
![Page 24: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/24.jpg)
24
WITS/Analysis
![Page 25: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/25.jpg)
25
WITS/Analysis
![Page 26: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/26.jpg)
26
Work in Progress
• Characterization of Analysis Queries
• Incorporation of Data Mining Algorithms into WITS
• Enhancement of WITS/TAPS
• Implementation of WITS/Real
![Page 27: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/27.jpg)
27
Hybrid Genetic Fuzzy Systems for Feature Extraction and Knowledge Discovery
![Page 28: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/28.jpg)
28
Project Goals
• Design and implement a hybrid genetic fuzzy system for knowledge discovery.
• Develop API/Tools.
• Apply tools to Army-related problems.
![Page 29: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/29.jpg)
29
Contribution
• Hybrid system based on the Simple Genetic Algorithm (SGA).
• Enhanced the SGA by adding three levels of knowledge discovery.
  • Level 1: Discovers up to k possible rules for a given set of inputs and outputs. It then attempts to minimize the number of rules and tune the knowledge base.
  • Level 2: Takes the set of rules from Level 1 and further minimizes the rules. It also tunes the knowledge base.
  • Level 3: Makes one last attempt to further tune the architecture of the knowledge base.
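For readers unfamiliar with the SGA that the hybrid system builds on, a bare-bones version is sketched below: fitness-proportional (roulette-wheel) selection, single-point crossover and bit-flip mutation over bit strings. It does not include the three knowledge-discovery levels, whose details the slides do not give. The function name, parameter defaults and the requirement that `fitness` return a non-negative number are assumptions of this sketch.

```python
import random


def simple_ga(fitness, n_bits, pop_size=30, generations=60,
              p_cross=0.7, p_mut=0.01, seed=0):
    """Maximize `fitness` (non-negative) over bit strings of length n_bits."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)[:]
    for _ in range(generations):
        # Roulette-wheel weights; the epsilon keeps the total positive.
        weights = [fitness(ind) + 1e-9 for ind in pop]
        nxt = []
        while len(nxt) < pop_size:
            a, b = rng.choices(pop, weights=weights, k=2)
            a, b = a[:], b[:]
            # Single-point crossover.
            if n_bits > 1 and rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            # Bit-flip mutation.
            for child in (a, b):
                for i in range(n_bits):
                    if rng.random() < p_mut:
                        child[i] ^= 1
                nxt.append(child)
        pop = nxt[:pop_size]
        candidate = max(pop, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate[:]
    return best
```

In a rule-discovery setting, one natural encoding (again an assumption) is one bit per candidate rule, with the fitness rewarding prediction accuracy and penalizing the number of active rules.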
![Page 30: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/30.jpg)
30
Rule Discovery
• Search for k possible rules from the set of p possible rules. k is an input parameter of the GA application.
• Discover the smallest value of k, thereby reducing the number of rules needed.
Example Rules:
If INPUT_1 is low AND INPUT_2 is medium THEN OUTPUT_1 is high
If INPUT_1 is high THEN OUTPUT_1 is low
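To make the example rules concrete, the sketch below evaluates them with a very small fuzzy inference step. The slides do not give the membership functions or the inference method, so the following are assumptions: inputs are scaled to [0, 1], the terms low / medium / high are triangular sets, AND is the minimum, and the output is a firing-strength-weighted average of the output terms' peaks.

```python
def triangle(x, left, peak, right):
    """Triangular membership; acts as a shoulder when left == peak or peak == right."""
    if x <= left:
        return 1.0 if left == peak else 0.0
    if x >= right:
        return 1.0 if peak == right else 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed fuzzy sets over inputs and outputs scaled to [0, 1].
TERMS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}

# The two example rules from the slide: (antecedent, output term).
RULES = [
    ({"INPUT_1": "low", "INPUT_2": "medium"}, "high"),
    ({"INPUT_1": "high"}, "low"),
]


def firing_strength(antecedent, inputs):
    """AND of all conditions, taken as the minimum membership."""
    return min(triangle(inputs[name], *TERMS[term])
               for name, term in antecedent.items())


def infer(rules, inputs):
    """Weighted average of output-term peaks; None if no rule fires."""
    num = den = 0.0
    for antecedent, out_term in rules:
        w = firing_strength(antecedent, inputs)
        num += w * TERMS[out_term][1]
        den += w
    return num / den if den > 0.0 else None
```

With these assumed sets, an input of `{"INPUT_1": 0.0, "INPUT_2": 0.5}` fires only the first rule, and an input with `INPUT_1` at 1.0 fires only the second.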
![Page 31: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/31.jpg)
31
Relevance to the Army
• Collaborators: Jeff Passner, John Raby (ARL)
• IMETS weather modeling
  • Post-processing used to predict additional parameters
  • Visibility, Turbulence, Fog, etc.
• Use of Knowledge Discovery to Predict Parameters
![Page 32: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/32.jpg)
32
Visibility Application
• Generate and tune a system that can predict visibility based on input parameters
• Tasks for the fuzzy genetic system
  • Search for a set of k rules from p possible rules that describe the relationship of the input parameters with the output (visibility)
  • Concurrently discover the architecture and optimize the performance of the knowledge bases in relation to the k rules
![Page 33: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/33.jpg)
33
Results for Low Visibility Classifier
![Page 34: Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314](https://reader035.vdocuments.net/reader035/viewer/2022062803/56649f475503460f94c698d8/html5/thumbnails/34.jpg)
34
Results for Medium Visibility Classifier