Intelligent Database Systems Lab
Presenter : BEI-YI JIANG
Authors : GUENAEL CABANES, YOUNES BENNANI, DOMINIQUE FRESNEAU
2012. ELSEVIER
Improving the Quality of Self-Organizing Maps by Self-Intersection Avoidance
Outlines
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• The exponential growth of data produces very large databases, on the order of terabytes.
• The growing number of data dimensions and data objects presents tremendous challenges for effective data analysis and data exploration methods and tools.
Objectives
• Develop a method of describing data from enriched and segmented prototypes using a topological clustering algorithm.
• Provide data visualizations via maps and graphs that allow a comprehensive exploration of the data structure.
Methodology
• Prototype enrichment
• Clustering of prototypes
• Modeling data distributions
• Visualization
• Prototype enrichment
Methodology-learning data structure
Input: The distance matrix Dist(w, x) between the M prototypes w and the N data x.
Output: The density Di and the local variability si associated with each prototype wi. The neighborhood values vi,j associated with each pair of prototypes wi and wj.
• Principle
Methodology-learning data structure
− Density modes: a measure of the data density surrounding the prototype (local density).
− Local variability: the average distance between a prototype and the data it represents.
− The neighborhood: a neighborhood measure defined for each pair of prototypes.
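The three quantities above can be sketched in Python as follows. This is a minimal illustration and not necessarily the authors' exact formulation: it assumes a Gaussian-kernel estimate for the density (with a hypothetical bandwidth parameter sigma), takes the local variability as the mean distance from a prototype to the data for which it is the closest prototype, and counts as neighborhood value the number of data whose two closest prototypes are wi and wj. The function name enrich_prototypes is also hypothetical.

```python
import math


def enrich_prototypes(dist, sigma=1.0):
    """Sketch of prototype enrichment from a distance matrix.

    dist[i][k] is the distance between prototype w_i and data point x_k.
    sigma is an assumed kernel bandwidth (not given in the slides).
    Returns (density, variability, neighborhood).
    """
    m, n = len(dist), len(dist[0])

    # Density D_i: Gaussian-kernel average over all data (assumed estimator).
    norm = n * sigma * math.sqrt(2 * math.pi)
    density = [
        sum(math.exp(-d * d / (2 * sigma ** 2)) for d in row) / norm
        for row in dist
    ]

    dist_sums = [0.0] * m
    counts = [0] * m
    neighborhood = [[0] * m for _ in range(m)]

    for k in range(n):
        # Rank the prototypes by distance to data point x_k.
        ranked = sorted(range(m), key=lambda i: dist[i][k])
        first = ranked[0]
        dist_sums[first] += dist[first][k]
        counts[first] += 1
        if m > 1:
            # v_ij: x_k has w_i and w_j as its two closest prototypes.
            second = ranked[1]
            neighborhood[first][second] += 1
            neighborhood[second][first] += 1

    # Local variability s_i: mean distance to the data the prototype represents.
    variability = [
        dist_sums[i] / counts[i] if counts[i] else 0.0 for i in range(m)
    ]
    return density, variability, neighborhood
```

A prototype that represents no data gets a variability of 0.0 here, which is an arbitrary convention chosen for the sketch.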
• Clustering of prototypes
Methodology-learning data structure
Input: Density values Di. Neighborhood values vi,j.
Output: The clusters of prototypes.
• Presents some interesting qualities
Methodology-learning data structure
− The number of clusters is automatically detected by the algorithm.
− Non-linearly separable clusters and non-hyperspherical clusters can be detected.
− The algorithm can deal with noise (i.e. touching clusters) by using density estimation.
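One way to picture how clusters can emerge from the density values Di and the neighborhood values vi,j without fixing their number in advance is a density-ascent labeling on the neighborhood graph. The sketch below is a simplified illustration under stated assumptions, not the authors' algorithm: each prototype follows its connected neighbor of highest density until it reaches a local density maximum, and every local maximum seeds one cluster. Any later step that merges touching clusters is left out. The function name cluster_prototypes is hypothetical.

```python
def cluster_prototypes(density, neighborhood):
    """Assign each prototype the index of the density maximum it climbs to.

    density[i] is D_i; neighborhood[i][j] > 0 means w_i and w_j are connected.
    Returns a list of labels, one per prototype.
    """
    m = len(density)

    def uphill(i):
        # Connected prototype with the highest density strictly above D_i,
        # or i itself when i is a local density maximum.
        best = i
        for j in range(m):
            if j != i and neighborhood[i][j] > 0 and density[j] > density[best]:
                best = j
        return best

    labels = [None] * m
    for i in range(m):
        path = []
        cur = i
        # Climb until reaching a labeled prototype or a local maximum.
        while labels[cur] is None:
            nxt = uphill(cur)
            if nxt == cur:
                labels[cur] = cur  # local maximum: seed of a new cluster
                break
            path.append(cur)
            cur = nxt
        for p in path:
            labels[p] = labels[cur]
    return labels
```

With this sketch the number of clusters is simply len(set(labels)). Because membership follows graph connections rather than distance to a center, cluster shapes are not constrained to be hyperspherical.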
• Modeling data distributions
• Density function
Methodology-learning data structure
Conclusions
• Propose a new data structure modeling method, based on the learning of prototypes.
• Propose a new coclustering algorithm to solve different kinds of problems. The results are easy to read and understand, and are fully compatible with biologists' knowledge.
• Propose a visualization method that highlights the data structure within and between groups.