Exploiting Data Topology in Visualization and Clustering of Self-Organizing Maps
Kadim Taşdemir and Erzsébet Merényi, Senior Member
TNN, 2011
Presented by Hung-Yi Cai, 2011/3/9
Intelligent Database Systems Lab
National Yunlin University of Science and Technology (國立雲林科技大學)
N.Y.U.S.T.
I. M.
Outline
· Motivation
· Objectives
· Previous Study
· Methodology
· Experiments
· Conclusions
· Comments
Motivation
· Different aspects of the information learned by the SOM are presented by existing methods, but data topology, which is present in the SOM’s knowledge, is greatly underutilized.
· Data topology can be integrated into the visualization of the SOM and thereby provide a more elaborate view of the cluster structure than existing schemes.
Objectives
· To integrate the data topology, present in the SOM’s knowledge, into the visualization of the SOM for improved capture of clusters.
· This objective will be accomplished through a new concept of the “connectivity matrix” and its specific rendering over the SOM.
Previous Study
· SOM is a topology-preserving mapping
─ Ideally, prototypes (neurons) that are neighbors in the SOM map are also neighbors (centroids of neighboring Voronoi polyhedra) in data space, and vice versa.
· Growing SOM
─ It appears less robust than the Kohonen SOM because of the large number of parameters needing adjustment.
· ViSOM
─ It requires a relatively large number of prototypes even for small data sets.
Methodology
· Topology visualization through connectivity matrix of SOM prototypes
· CONNvis: visualization of the connectivity matrix
· Assessment of topology preservation with CONNvis
Topology visualization through connectivity matrix of SOM prototypes
· Induced Delaunay Triangulation and Voronoi
─ It can be determined from the relationships of the best matching units (BMUs) and the second BMUs.
· Connectivity Matrix
─ It is a weighted analog of A (the adjacency matrix of the induced Delaunay triangulation), where the weights indicate the density distribution of the input data among the prototypes adjacent in M (the data manifold).
$$RF_i = \bigcup_{j=1}^{N} RF_{ij}, \qquad CONN(i,j) = |RF_{ij}| + |RF_{ji}|$$
─ where $RF_{ij}$ is the set of data samples for which $w_i$ is the BMU and $w_j$ is the second BMU.
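The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `connectivity_matrix`, the use of Euclidean distance to find the BMUs, and the toy data are all assumptions made for the example.

```python
import numpy as np

def connectivity_matrix(data, prototypes):
    """CONN(i, j) = |RF_ij| + |RF_ji|, counted from BMU / second-BMU pairs."""
    # Squared Euclidean distance from every sample to every prototype
    # (Euclidean distance is an assumption of this sketch).
    d2 = ((data[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d2, axis=1)
    bmu, second = order[:, 0], order[:, 1]
    n = prototypes.shape[0]
    rf = np.zeros((n, n), dtype=int)      # rf[i, j] = |RF_ij|
    np.add.at(rf, (bmu, second), 1)       # one count per data sample
    return rf + rf.T                      # symmetric by construction

# Toy 2-D example with three prototypes and four samples (made-up values).
prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
data = np.array([[0.1, 0.0], [0.4, 0.0], [0.9, 0.0], [4.0, 0.0]])
conn = connectivity_matrix(data, prototypes)
```

In this toy case the first two prototypes share three samples and the last two share one, while the first and third are never a BMU / second-BMU pair and therefore stay unconnected.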
CONNvis: visualization of the connectivity matrix
· Line width: global importance
─ Proportional to the strength of the connection; reflects the density distribution among the connected units.
· Line color: local importance
─ A ranking of the connectivity strengths of $w_i$.
─ Reveals the most-to-least dense regions local to $w_i$ in data space.
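The two rendering cues can be derived from a CONN matrix roughly as follows. This is a sketch under assumptions: the helper name `connection_ranks`, the tie-breaking by index order, and the normalization `conn / conn.max()` for relative width are illustrative choices, and the matrix values are made up.

```python
import numpy as np

def connection_ranks(conn):
    """rank[i, j] = k if w_j is the k-th strongest neighbor of w_i (0 = unconnected)."""
    n = conn.shape[0]
    rank = np.zeros_like(conn)
    for i in range(n):
        neighbors = np.flatnonzero(conn[i])
        # Strongest connection first; a stable sort keeps index order on ties.
        ordered = neighbors[np.argsort(-conn[i, neighbors], kind="stable")]
        rank[i, ordered] = np.arange(1, len(ordered) + 1)
    return rank

conn = np.array([[0, 3, 0, 5],
                 [3, 0, 1, 0],
                 [0, 1, 0, 2],
                 [5, 0, 2, 0]])
rank = connection_ranks(conn)      # local importance (would drive line color)
width = conn / conn.max()          # global importance (would drive line width)
```

Note that the ranking is not symmetric: the link between prototypes 0 and 1 is only the second strongest for prototype 0 but the strongest for prototype 1, which is why the ranking is described as local to each $w_i$.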
The threshold of width
Assessment of topology preservation with CONNvis
· Topology violations
─ Connected neural units that are not immediate neighbors in the map (forward topology violations);
─ Unconnected neural units that are immediate neighbors in the map (backward topology violations).
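Both kinds of violation can be listed by comparing the CONN matrix with lattice adjacency. The sketch below assumes a rectangular lattice where "immediate neighbors" means the 8 surrounding cells (Chebyshev distance 1); that neighborhood choice, the function name `topology_violations`, and the example values are assumptions of this illustration.

```python
import numpy as np

def topology_violations(conn, grid_pos):
    """Return (forward, backward) lists of prototype index pairs [i, j], i < j."""
    # Chebyshev distance on the lattice: 1 means one of the 8 surrounding cells.
    diff = np.abs(grid_pos[:, None, :] - grid_pos[None, :, :]).max(axis=2)
    adjacent = diff == 1
    connected = conn > 0
    # Look only at the upper triangle so each pair is reported once.
    upper = np.triu(np.ones_like(connected, dtype=bool), k=1)
    forward = np.argwhere(connected & ~adjacent & upper)    # connected, not lattice neighbors
    backward = np.argwhere(~connected & adjacent & upper)   # lattice neighbors, not connected
    return forward.tolist(), backward.tolist()

# Four prototypes on a 1 x 4 lattice (made-up connectivity values).
grid_pos = np.array([[0, 0], [0, 1], [0, 2], [0, 3]])
conn = np.array([[0, 3, 0, 5],
                 [3, 0, 1, 0],
                 [0, 1, 0, 0],
                 [5, 0, 0, 0]])
forward, backward = topology_violations(conn, grid_pos)
```

Here prototypes 0 and 3 are connected but sit three cells apart (a forward violation), while prototypes 2 and 3 are lattice neighbors with no connection (a backward violation).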
Remove weak connections
· Remove weak connections that link any two coarse clusters X and Y at their boundary
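As a rough illustration of how pruning weak connections separates clusters, the sketch below drops every connection under a fixed global threshold and labels the remaining connected components. This is a simplified stand-in, not the procedure from the paper: the single threshold `min_strength`, the function name `coarse_clusters`, and the example matrix are all assumptions.

```python
import numpy as np

def coarse_clusters(conn, min_strength):
    """Label prototypes by connected component after dropping weak links."""
    strong = conn >= min_strength          # keep only sufficiently strong links
    n = conn.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:                        # depth-first traversal of one component
            i = stack.pop()
            for j in np.flatnonzero(strong[i]):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(int(j))
        current += 1
    return labels

conn = np.array([[0, 6, 1, 0],
                 [6, 0, 0, 0],
                 [1, 0, 0, 7],
                 [0, 0, 7, 0]])
labels = coarse_clusters(conn, min_strength=2)
```

With the weak link of strength 1 removed, prototypes {0, 1} and {2, 3} fall into separate groups; keeping it (`min_strength=1`) merges all four into one.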
Experiments
· A real remote sensing spectral image of Ocean City
Experiments
· Comparison with U-matrix and ISOMAP
Conclusions
· CONNvis integrates data distribution into the customary Delaunay triangulation, which, when displayed on the SOM grid, enables 2-D visualization of the manifold structure regardless of the data dimensionality.
· CONNvis is also unique among SOM representations in that it shows both forward and backward topology violations on the SOM grid.
Comments
· Advantages
─ CONNvis greatly assists in detailed identification of cluster boundaries.
· Applications
─ Data clustering