Spatial Data Mining, CS 697
Assignment 1
February 16, 2010
Pradnya Khutafale, Peter Lucas, and Chris Maio
Advisor: Dr. Wei Ding
Computer Science Department
UMass Boston

Discovery of Climate Indices using Clustering
Principal Investigators: Vipin Kumar (University of Minnesota), Michael Steinbach (University of Minnesota)
Collaborators: Steven Klooster (Cal. State Univ., Monterey Bay), Christopher Potter (NASA Ames Research Center), Pang-Ning Tan (Michigan State University)

Researchers
Department of Computer Science and Engineering: Michael Steinbach, Pang-Ning Tan, Vipin Kumar
Leading educators in the field of spatial data mining
Investigating the use of data mining techniques to find interesting spatio-temporal patterns in Earth science data
Regarded as leaders in the field of climate index identification and data mining research

Researchers
NASA Ames Research Center team members: Chris Potter, Steven Klooster
Working on cutting-edge computer science methods and technologies for solving complex environmental problems

Presentation Outline
Background (Chris): Climate Change, Earth System Linkages
Earth Science Data and Climate Indices (Chris)
Existing Eigenvalue Techniques and Limitations (Pete)
New Clustering-Based Methodology (Pete)
Results and Comparisons (Pradnya)
Conclusions and Future Research (Pradnya and Pete)

Background: Climate Change
IPCC predictions:
Rise in global temperatures
Extinctions of plants and animals
Sea-level rise

Background: Climate Change Impacts
Climate change leads to significant changes in rainfall and soil moisture (drought and flood)
Agricultural activities (crop growth cycle) and world food supplies are greatly affected by climatic factors (desertification)
Climate change increases the frequency, intensity, and distribution of natural hazards such as hurricanes and other storms

Background: Earth System Linkages
Ocean, atmosphere, and land processes are highly coupled
Climate phenomena in one location can affect the climate at a faraway location; these links are known as climate teleconnections
Understanding climate teleconnections is key to understanding and predicting ecosystem response to climate change

Earth Science Data: Time Series Data
Sea Surface Temperature (SST)
Sea Level Pressure (SLP)

Earth Science Data: Data Acquisition
Thousands of floats, buoys, and other remote sensing devices throughout the oceans collect enormous amounts of oceanographic data, which are periodically transmitted to shore via satellite (Naval Research Laboratory).

Earth Science Data: Preprocessing Required
The spatial and temporal nature of the data poses a number of challenges:
Noisy
Cycles of varying lengths and regularity
Strong seasonal component
Displays long-term trends
Displays temporal and spatial autocorrelation
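
The slide does not say which preprocessing steps the researchers applied. As a rough sketch only, the code below shows one common way to handle two of the listed challenges: a per-month z-score to remove the seasonal component, followed by subtraction of a least-squares line to remove a long-term trend. The function names and the synthetic series are assumptions made for illustration.

```python
import numpy as np

def monthly_zscore(series, period=12):
    """Remove the seasonal cycle by standardizing each calendar month separately."""
    x = np.asarray(series, dtype=float)
    out = np.empty_like(x)
    for m in range(period):
        vals = x[m::period]                      # all Januaries, all Februaries, ...
        out[m::period] = (vals - vals.mean()) / vals.std()
    return out

def remove_linear_trend(series):
    """Subtract a least-squares straight line from the series."""
    x = np.asarray(series, dtype=float)
    t = np.arange(x.size)
    slope, intercept = np.polyfit(t, x, 1)
    return x - (slope * t + intercept)

# Hypothetical monthly series: seasonal cycle + slow trend + noise.
rng = np.random.default_rng(0)
t = np.arange(480)                               # 40 years of monthly values
raw = 10 * np.sin(2 * np.pi * t / 12) + 0.01 * t + rng.normal(0, 1, t.size)
clean = remove_linear_trend(monthly_zscore(raw))
```

After these two steps each calendar month has zero mean and the series has no residual linear slope, so what remains is the anomaly signal that a clustering or SVD step would work on.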

Climate Indices
Climate indices are time series that summarize the physical behavior of different regions of the ocean and atmosphere
They distill climate variability at a regional or global scale into a single, manageable time series
Usually based on sea level pressure and sea surface temperature
Past methods of identification were painstakingly slow and tedious

Climate Indices
Climate index: NINO 1+2

Climate Indices: El Niño Correlations
SST of El Niño correlated indices

Climate Indices: Detection of Climate Indices
Earth scientists have devoted a significant amount of time to discovering climate indices
Traditional approaches include direct observation of climate phenomena (El Niño)
Use of linear algebra techniques, including eigenvalue analysis

Climate Indices: Eigenvalue Analysis
Driven by the massive amount of data obtained from satellites and remote sensing devices
Provides a way to quickly and automatically detect patterns in large amounts of data
[Figure: Jason-2 IR satellite image]

Climate Indices: Eigenvalue Analysis
Eigenvalue techniques include:
Principal Component Analysis (PCA)
Singular Value Decomposition (SVD)
Limitations of eigenvalue analysis:
Weaker signals may be masked by stronger signals
All discovered signals must be orthogonal to each other, making it difficult to attach a physical interpretation to them

Climate Indices: Alternative Clustering Methodology
Uses data mining techniques and the enormous amount of remote sensing data to find climate indices
Analysis yields clusters that represent ocean regions with relatively homogeneous behavior
Centroids of these clusters summarize the behavior of a particular region
Finding "meaningful" clusters will enable Earth scientists to better predict changes in the climate system

Climate Indices: Benefits of Clustering
Discovered signals do not need to be orthogonal or statistically independent of one another
Signals are more easily interpreted
Weaker signals are more readily detected
It provides an efficient way to determine the influence of one large set of points (all ocean points) on another large set of points (all land points)

Climate Indices: Results of Clustering Methodology
Candidate indices that are highly correlated with known indices represent a rediscovery of well-known indices and validate the method
Variants of well-known indices may be better predictors of land behavior for some regions of land
Cluster centroids that have medium or low correlation with known indices may represent new Earth science phenomena

Eigenvalue Techniques: Finding Spatial or Temporal Patterns using SVD Analysis
SVD: Singular Value Decomposition
Earth scientists have typically used SVD analysis to identify climate indices
Goal: to find a new set of attributes that better describe the variability in the data, through dimensionality reduction
Its operation can be thought of as revealing the internal structure of the data in the way that best explains the variance in the data
[Photo: Karl Pearson, statistician, 1857 – 1936]

Eigenvalue Techniques: Overview of SVD Analysis
These techniques are applied to a data set in the form of an m-by-n data matrix
m rows (objects)
n columns (attributes)
Data matrix: a variation of record data in that it consists of all numeric attributes
[Figure: example of a data matrix]

Eigenvalue Techniques: Overview of SVD Analysis
Assume the data objects in a matrix all have the same fixed set of attributes
Each data object can be thought of as a point, or vector, in multidimensional space
Each spatial dimension represents a distinct attribute describing the object

Simple Example of SVD Analysis
Intuitive explanations of SVD are hard to find on the web
Again, SVD is a way to expose the underlying structure of a matrix
Simple example using golf: 3 golfers play 9 holes, and each pars every hole
How can we predict the score for a player on a given hole?
Assume two vectors, Player Ability and Hole Difficulty
Predicted score = Player Ability * Hole Difficulty
Hole Difficulty is the left singular vector
Player Ability is the right singular vector
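
The golf example can be checked numerically. The sketch below builds a 9-by-3 score table in which every golfer pars every hole and confirms that it has a single non-zero singular value, so one left singular vector (hole difficulty) and one right singular vector (player ability) rebuild it exactly. The par values are made up for illustration; they are not given in the slides.

```python
import numpy as np

# Hypothetical pars for 9 holes (hole difficulty) and 3 equally able golfers.
hole_difficulty = np.array([4, 4, 3, 4, 5, 4, 3, 5, 4], dtype=float)
player_ability = np.ones(3)

# Rows are holes, columns are players; every player pars every hole.
scores = np.outer(hole_difficulty, player_ability)

U, s, Vt = np.linalg.svd(scores, full_matrices=False)

# Only one singular value is numerically non-zero: the matrix has rank 1.
rank = int(np.sum(s > 1e-10 * s[0]))

# The first left singular vector is proportional to hole difficulty and the
# first right singular vector to player ability (each up to scale and sign).
rebuilt = s[0] * np.outer(U[:, 0], Vt[0])
```

Real scores would not be exactly rank 1; the remaining singular vectors would then capture how individual players deviate from par on particular holes.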

Eigenvalue Techniques: Finding Spatial or Temporal Patterns using SVD Analysis
Given a data matrix whose rows are time series from various points on the globe, the objective is to discover the strong temporal or spatial patterns in the data
SVD decomposes the matrix into two sets of patterns: a set of spatial patterns (left singular vectors) and a set of temporal patterns (right singular vectors)
We can plot the temporal patterns as regular line plots and the spatial patterns on a spatial grid to visualize them
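
A minimal sketch of this decomposition on synthetic data follows. Rows are grid points and columns are months, so the first left singular vector gives one value per location (a spatial pattern) and the first right singular vector gives one value per month (a temporal pattern). The grid size, the oscillation period, and the noise level are all assumptions; no real SST data is used.

```python
import numpy as np

rng = np.random.default_rng(1)
n_points, n_months = 50, 240

# Hypothetical data: one oscillation shared by all grid points with a
# location-dependent strength, plus a little noise.
temporal = np.sin(2 * np.pi * np.arange(n_months) / 48)
spatial = rng.uniform(0.5, 2.0, n_points)
data = np.outer(spatial, temporal) + 0.1 * rng.normal(size=(n_points, n_months))

# Center each row (each location's time series) before decomposing.
data -= data.mean(axis=1, keepdims=True)

U, s, Vt = np.linalg.svd(data, full_matrices=False)
spatial_pattern = U[:, 0]     # one value per grid point: plot on a map
temporal_pattern = Vt[0]      # one value per month: plot as a line
explained = s[0] ** 2 / np.sum(s ** 2)   # share of variance in the first pattern
```

Because a single strong signal dominates here, the first pair of singular vectors captures almost all of the variance, which also illustrates why weaker signals can end up hidden in later components.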

Eigenvalue Techniques: Example of Plotting SST (Sea Surface Temperature)
[Figure: strongest spatial pattern of SST]
[Figure: temporal pattern of SST (blue) plotted against the NINO4 index (green)]

Eigenvalue Techniques: Limitations of SVD Analysis
Only useful for finding a few of the strongest signals
Smaller patterns in the data may be obscured
Signals must be orthogonal to each other (that is, uncorrelated)
May not identify all patterns in the data
Efficiency can be a concern

Clustering Methods: Clustering-Based Methodology for the Discovery of Climate Indices
Two key steps for finding climate indices:
1. Find candidate indices using clustering
2. Evaluate these candidate indices for Earth science significance
Clustering method used for this study: the SNN ("Shared Nearest Neighbor") clustering algorithm

Clustering Methods: Finding Candidate Indices Using Clustering
SNN clustering algorithm:
First finds the nearest neighbors of each data point
Next, redefines the similarity between pairs of points in terms of how many nearest neighbors the two points share
Using this definition of similarity, the algorithm identifies core points
These core points are used to build clusters
SNN clustering can run in O(n log n) time when an index structure speeds up the nearest-neighbor search; a brute-force neighbor search is O(n^2)
[Figure: graph of the functions n log n and n]
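
The first three steps can be sketched as follows. This is a brute-force illustration, not the researchers' implementation: the parameter names `k`, `eps`, and `min_pts` and their values are assumptions, and the sketch follows the common convention that two points have non-zero SNN similarity only when each appears in the other's nearest-neighbor list.

```python
import numpy as np

def snn_core_points(points, k=10, eps=4, min_pts=5):
    """Return (snn_similarity, is_core) for a small data set.

    Brute-force O(n^2) version for illustration only.
    """
    x = np.asarray(points, dtype=float)
    n = x.shape[0]
    # Pairwise Euclidean distances; a point is never its own neighbor.
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    knn = np.argsort(dist, axis=1)[:, :k]

    # member[i, j] is True if j is in i's k-nearest-neighbor list.
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], knn] = True

    # Count shared neighbors, kept only where i and j list each other.
    shared = member.astype(int) @ member.astype(int).T
    snn = np.where(member & member.T, shared, 0)

    # SNN density: number of points whose SNN similarity is at least eps.
    density = (snn >= eps).sum(axis=1)
    return snn, density >= min_pts

# Hypothetical data: two tight blobs and one far-away outlier.
rng = np.random.default_rng(2)
blob_a = rng.normal(0.0, 0.3, size=(30, 2))
blob_b = rng.normal(5.0, 0.3, size=(30, 2))
outlier = np.array([[20.0, 20.0]])
snn, is_core = snn_core_points(np.vstack([blob_a, blob_b, outlier]))
```

Points in different blobs share no neighbors, so their SNN similarity is zero, and the outlier is in nobody's neighbor list, so it can never become a core point. Building the final clusters from the core points is left out of the sketch.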

Clustering Methods: Evaluation of Candidate Indices
Indices must be evaluated in terms of Earth science significance (meaning the strength of the association between the behavior of a candidate index and land climate)
Goal is to find a numerical measure of the strength of the association between the behavior of an index and land climate
To evaluate the influence of climate indices on land, the researchers use area-weighted correlation
Definition: the weighted average of the correlation of the candidate index with all land points, where the weight is based on the area of the land grid point

Clustering Methods: Calculating Area-Weighted Correlation
Step 1: Compute the correlation of the time series of the candidate index with the time series associated with each land point
Step 2: Compute the weighted average of the correlations, where the weight associated with each land point is its area
The resulting area-weighted correlation lies between 0 and 1
General formula for a weighted average: $\bar{M} = \frac{\sum_c W_c M_c}{\sum_c W_c}$
$W_c$ = weight of each value $M_c$
$M_c$ = a value to average
[Figure: general correlation scale, with 1 being the strongest]
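
The two steps can be sketched in a few lines. Two things here are assumptions rather than statements from the slides: the area of a grid cell is taken to be proportional to the cosine of its latitude, and the absolute value of each correlation is used, which is one way to obtain a result that stays between 0 and 1. The function and argument names are hypothetical.

```python
import numpy as np

def area_weighted_correlation(index, land_series, latitudes_deg):
    """Weighted average of |corr(index, land point)| using area weights.

    Assumes grid-cell area is proportional to cos(latitude) and uses the
    absolute correlation so that the result lies in [0, 1].
    """
    index = np.asarray(index, dtype=float)
    land = np.asarray(land_series, dtype=float)       # shape (n_points, n_times)
    weights = np.cos(np.deg2rad(np.asarray(latitudes_deg, dtype=float)))

    # Step 1: Pearson correlation of the index with every land time series.
    idx_c = index - index.mean()
    land_c = land - land.mean(axis=1, keepdims=True)
    corr = (land_c @ idx_c) / (np.linalg.norm(land_c, axis=1) * np.linalg.norm(idx_c))

    # Step 2: weighted average of the correlations, weight = grid-cell area.
    return float(np.sum(weights * np.abs(corr)) / np.sum(weights))
```

For example, an index that is perfectly correlated with one land point and perfectly anti-correlated with another would score 1 under this definition, whatever the two latitudes are.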

Clustering Methods: Comparison of Area-Weighted Correlations
A baseline is developed to compare the area-weighted correlation values of candidate indices
[Figure: histogram of the area-weighted correlation of 1000 random time series]
No random time series has an area-weighted correlation above 0.1; this value serves as the baseline for judging whether a candidate index is a good one
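
The idea behind the baseline can be imitated with synthetic data. The sketch below scores 1000 random time series against a made-up set of land series and records the largest value, which then acts as the threshold a real candidate must beat. The land data, the latitude-based weights, the array sizes, and the use of absolute correlation are all assumptions, so the numbers it produces are not the ones from the study.

```python
import numpy as np

rng = np.random.default_rng(3)
n_land, n_months, n_random = 200, 492, 1000   # hypothetical sizes

# Hypothetical stand-in for land climate data: rows are land grid points.
land = rng.normal(size=(n_land, n_months))
weights = np.cos(np.deg2rad(rng.uniform(-60, 70, n_land)))  # assumed area weights

# Standardize each land series once so correlations reduce to dot products.
land_z = (land - land.mean(axis=1, keepdims=True)) / land.std(axis=1, keepdims=True)

def awc(index):
    """Area-weighted average of |correlation| between index and all land rows."""
    z = (index - index.mean()) / index.std()
    corr = land_z @ z / z.size
    return np.sum(weights * np.abs(corr)) / np.sum(weights)

# Area-weighted correlation of 1000 random time series.
random_awc = np.array([awc(rng.normal(size=n_months)) for _ in range(n_random)])
counts, edges = np.histogram(random_awc, bins=20)   # data for the histogram
baseline = random_awc.max()
```

Real land series are spatially autocorrelated, so random indices would score higher against them than against the independent noise used here; the sketch only shows the procedure.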

Clustering Methods: Validation of Comparison Baseline
The figure shows the area-weighted correlations of 11 known indices
Note that 10 of the 11 indices have an area-weighted correlation above 0.1
If a candidate index shows an area-weighted correlation above 0.1, investigate it
[Figure: area-weighted correlation of well-known climate indices]

Results: SST-Based Candidate Indices
Used SST data for the period from 1958 to 1998 and applied SNN clustering
Obtained 107 clusters
Cluster centroids were used to categorize the clusters into groups G0, G1, G2, and G3, depending on their correlation with known indices

Results: 107 Sea Surface Temperature (SST) Clusters
Find the correlation with known indices such as SOI and NINO 1+2
Find the area-weighted correlation with land

Results: SST Cluster Correlation
Correlation of known indices with SST cluster centroids and SVD components

Results: G0, clusters with correlation to known indices >= 0.8
Very highly correlated
Rediscovered well-known indices
Serve to validate the approach
NINO 1+2
NINO 3
NINO 3.4
NINO 4

Results: G0, SST Cluster Correlation
Correlation of known indices with SST cluster centroids and SVD components

Results: G1, clusters with correlation to known indices from 0.4 to 0.8

Results: G1, Cluster 29 vs. El Niño Indices
[Figure: Cluster 29]

Results: G2, clusters with correlation to known indices from 0.25 to 0.4
Less correlated
May represent new Earth science phenomena
May be a new index

Results: Cluster 62 vs. El Niño Indices
[Figure: Cluster 62]

Results: G3, clusters with correlation to known indices <= 0.25
Less correlated
May represent new Earth science phenomena or weaker versions of known phenomena
New index
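
The four groups can be expressed as a small lookup on the strongest correlation a cluster centroid has with any known index. The thresholds come from the slides; how the boundary values 0.4 and 0.25 are assigned, and the use of the absolute value of the correlation, are assumptions, as is the function name.

```python
def assign_group(correlations_with_known):
    """Map a centroid's correlations with the known indices to G0..G3.

    Uses the largest absolute correlation (an assumption) and the
    thresholds 0.8, 0.4 and 0.25 quoted in the slides.
    """
    best = max(abs(c) for c in correlations_with_known)
    if best >= 0.8:
        return "G0"   # rediscovered well-known index
    if best >= 0.4:
        return "G1"   # moderately correlated with a known index
    if best > 0.25:
        return "G2"   # less correlated, possibly a new index
    return "G3"       # weakly correlated, possibly a new phenomenon
```

A centroid whose best correlation with the known indices is 0.85 would land in G0 under this rule, while one whose best correlation is 0.3 would land in G2.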

Results: SLP-Based Candidate Indices
SLP data for the period from 1958 to 1998
Candidate indices derived from the differences between all pairs of cluster centroids
Negatively correlated pairs are interesting candidates
25 clusters found
[Figure: 25 sea level pressure based clusters]
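
One reading of this slide is that each pair of negatively correlated SLP cluster centroids yields a candidate index equal to the difference of the two centroid time series. The sketch below follows that reading, which is an interpretation rather than something the slides state outright; the function name and the toy centroids are hypothetical.

```python
import numpy as np
from itertools import combinations

def slp_candidate_indices(centroids):
    """Difference series for every pair of negatively correlated centroids."""
    c = np.asarray(centroids, dtype=float)    # shape (n_clusters, n_times)
    corr = np.corrcoef(c)                     # pairwise centroid correlations
    candidates = {}
    for i, j in combinations(range(len(c)), 2):
        if corr[i, j] < 0:                    # keep only negatively correlated pairs
            candidates[(i, j)] = c[i] - c[j]
    return candidates

# Toy centroids: series 1 moves opposite to series 0 and 2.
toy = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 2, 3, 5]]
pairs = slp_candidate_indices(toy)
```

With 25 clusters there are 300 possible pairs, so filtering on negative correlation keeps the number of candidates to evaluate manageable.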

Results: SLP Clusters Pairwise Correlation
Note: only negative correlation values are shown

Comparisons: Comparison with SVD-Based Indices
[Figure: correlation of cluster centroids with land temperature]
[Figure: correlation of first 30 SVD components with land temperature]

Comparisons: SST Clusters Performance Comparison
Correlation for known indices with SST cluster centroids and SVD components

Comparisons: SLP Clusters Performance Comparison
Area-weighted correlation for known indices with SLP cluster centroids and SVD components

Conclusions
Demonstrated that clustering is a viable alternative to the eigenvalue-based approach for the discovery of climate indices
Can replicate many well-known climate indices
Have also discovered variants of known indices that may be "better" for some regions
Some indices may represent new Earth science phenomena
No need for discovered indices to be orthogonal
No need to pre-select the area to analyze

Future Work
Investigation of candidate indices by Earth scientists
Investigate whether there are climate indices that cannot be represented by clusters
Noise elimination and other preprocessing improvements
Aggregation

Questions?