A Link-Based Cluster Ensemble Approach for Categorical Data Clustering
Intelligent Database Systems Lab
Presenter: Jian-Ren Chen
Authors: Natthakan Iam-On, Tossapon Boongoen, Simon Garrett, and Chris Price
2012, IEEE
Outlines
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• Cluster ensembles combine different clustering decisions in such a way as to achieve accuracy superior to that of any individual clustering.
Objectives
• A new link-based approach improves the conventional binary cluster-association matrix by estimating its unknown entries from the similarity between clusters in an ensemble.
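To make the "conventional matrix" concrete, here is a minimal sketch of how a binary cluster-association matrix can be built from an ensemble. The function name `binary_association_matrix` and the toy labels are illustrative assumptions, not taken from the slides. Each row is a data point, each column is one cluster from one base clustering, and the zero entries are the "unknown" ones that the link-based approach tries to fill in.

```python
import numpy as np

def binary_association_matrix(ensemble):
    """Stack one-hot encodings of every base clustering side by side.

    ensemble: list of label sequences, one per base clustering, all of
              length N (one label per data point).
    Returns an N x P 0/1 matrix, where P is the total number of clusters
    across the whole ensemble.
    """
    columns = []
    for labels in ensemble:
        labels = np.asarray(labels)
        for cluster in np.unique(labels):
            # 1 where the point belongs to this cluster, 0 otherwise.
            columns.append((labels == cluster).astype(float))
    return np.column_stack(columns)

# Two hypothetical base clusterings of five data points.
ensemble = [[0, 0, 1, 1, 1], [0, 1, 1, 2, 2]]
bm = binary_association_matrix(ensemble)
```

Every row contains exactly one 1 per base clustering; all other entries are 0, which says nothing about how close the point is to the remaining clusters.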
Methodology
Creating a Cluster Ensemble
Generating a Refined Matrix
Applying a Consensus Function to RM
Methodology
Creating a Cluster Ensemble
• Type I (Direct ensemble)
• Type II (Full-space ensemble)
• Type III (Subspace ensemble)
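The slide only names the three ensemble types. As a rough illustration of Type I, the sketch below assumes that a "direct" ensemble treats every categorical attribute as one base clustering, with the attribute's values acting as cluster labels. The helper name `direct_ensemble` and the sample records are hypothetical. Types II and III would need a base clustering algorithm run on the full attribute set or on attribute subsets, which is not shown here.

```python
def direct_ensemble(records):
    """Treat every categorical attribute as one base clustering (assumed
    reading of "direct ensemble")."""
    n_attributes = len(records[0])
    ensemble = []
    for j in range(n_attributes):
        values = [row[j] for row in records]
        # Map each distinct attribute value to an integer cluster label.
        label_of = {v: i for i, v in enumerate(sorted(set(values)))}
        ensemble.append([label_of[v] for v in values])
    return ensemble

# Hypothetical categorical records: (colour, size).
records = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "large"),
]
ensemble = direct_ensemble(records)
```

Under this assumption the ensemble size equals the number of attributes, and no clustering algorithm has to be run at all.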
Methodology
Applying a Consensus Function to RM
• Given a graph G = (V, W), where W holds the edge weights,
• SPEC finds the eigenvectors of W that belong to its K largest eigenvalues
• and uses them as the columns of another matrix U.
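The eigenvector step can be sketched with NumPy as follows. This is only an illustration under stated assumptions: W is taken to be a symmetric weight matrix, the function name `spectral_embedding` is made up for this example, and the row normalisation is the usual spectral-clustering convention rather than something the slide states. The rows of U would then typically be grouped with a standard algorithm such as k-means, which is not shown.

```python
import numpy as np

def spectral_embedding(W, k):
    """Return U, whose columns are the eigenvectors of symmetric W that
    belong to its k largest eigenvalues, with rows scaled to unit length."""
    W = np.asarray(W, dtype=float)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(W)
    U = eigenvectors[:, -k:]
    # Row normalisation (assumed step); guard against all-zero rows.
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return U / norms

# Toy graph with two disconnected groups of vertices: {0, 1} and {2, 3}.
W = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])
U = spectral_embedding(W, k=2)
```

In this toy case vertices of the same group receive identical rows in U, while rows from different groups are orthogonal, so any reasonable clustering of the rows recovers the two groups.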
Experiments
• Investigated Data Sets
Conclusions
• The refined matrix (RM) is constructed efficiently from the similarity among categorical labels (clusters), computed with the Weighted Triple-Quality (WTQ) similarity algorithm.
• The link-based method usually achieves better clustering results than the methods it was compared with.
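The slides do not spell out how WTQ fills the matrix, so the following is only a sketch based on one reading of the method, and its details should be treated as assumptions: clusters are linked by the Jaccard overlap of their members, two clusters are considered similar when they share neighbouring clusters (each shared neighbour k contributes 1 / W_k, where W_k is the total edge weight at k), and similarities are scaled by a decay factor `dc`. The names `refine_matrix`, `clustering_of`, and the value 0.9 are illustrative.

```python
import numpy as np

def refine_matrix(bm, clustering_of, dc=0.9):
    """Fill the zero entries of a binary association matrix (sketch).

    bm:            N x P 0/1 matrix (rows: data points, columns: clusters).
    clustering_of: clustering_of[c] is the base clustering of cluster c.
    dc:            decay factor in (0, 1] (assumed value).
    """
    bm = np.asarray(bm, dtype=float)
    clustering_of = np.asarray(clustering_of)

    # Edge weights between clusters: Jaccard overlap of their member sets.
    inter = bm.T @ bm
    sizes = np.diag(inter)
    union = sizes[:, None] + sizes[None, :] - inter
    w = np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)
    np.fill_diagonal(w, 0.0)

    # WTQ: sum 1 / W_k over the neighbours k shared by each cluster pair.
    adjacent = (w > 0).astype(float)
    total = w.sum(axis=0)
    inv = np.divide(1.0, total, out=np.zeros_like(total), where=total > 0)
    wtq = (adjacent * inv) @ adjacent.T
    np.fill_diagonal(wtq, 0.0)
    sim = dc * wtq / wtq.max() if wtq.max() > 0 else wtq

    rm = bm.copy()
    for i in range(bm.shape[0]):
        for c in range(bm.shape[1]):
            if bm[i, c] == 0:
                # Cluster of the same base clustering that contains point i.
                same = (clustering_of == clustering_of[c]) & (bm[i] == 1)
                own = np.flatnonzero(same)[0]
                rm[i, c] = sim[own, c]
    return rm

# Hypothetical ensemble: clustering 0 has two clusters, clustering 1 has three.
bm = np.array([
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
])
rm = refine_matrix(bm, clustering_of=[0, 0, 1, 1, 1])
```

Known memberships stay at 1, while former zeros become values in [0, 1) that reflect how similar the point's own cluster is to the other clusters of the same base clustering.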
Comments
• Advantages
– The link-based method is efficient.
• Applications
– Categorical data clustering