MGR: An information theory based hierarchical divisive clustering algorithm for categorical data

Intelligent Database Systems Lab. Presenter: BEI-YI JIANG. Authors: Hongwu Qin, Xiuqin Ma, Tutut Herawan, Jasni Mohamad Zain. 2014, KBS.



Page 1: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Intelligent Database Systems Lab

Presenter: BEI-YI JIANG

Authors: Hongwu Qin, Xiuqin Ma, Tutut Herawan, Jasni Mohamad Zain

2014. KBS

MGR: An information theory based hierarchical divisive clustering algorithm for categorical data

Page 2: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Outlines

• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

Page 3: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Motivation

• Many algorithms for clustering categorical data have low clustering accuracy, while others have high computational complexity.

Page 4: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Objectives

• Propose a new hierarchical divisive clustering algorithm for categorical data, termed MGR, based on information theory.

• Achieve better clustering performance and efficiency.

Page 5: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

Page 6: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

1. Information system
2. Mean gain ratio and entropy of cluster
3. Algorithm
4. Computational complexity

Page 7: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Information system
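As a rough illustration of what an information system looks like in practice, the sketch below assumes the usual reading: a table of objects, each described by categorical attributes, where every attribute splits the objects into equivalence classes. The toy table, the attribute names, and the helper `partition` are hypothetical and not taken from the slides.

```python
from collections import defaultdict

# Hypothetical toy table: each dict is one object, each key one categorical attribute.
objects = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "large"},
    {"color": "blue", "size": "small"},
    {"color": "blue", "size": "small"},
]

def partition(rows, attribute):
    """Group object indices by their value on `attribute`.

    Each group is one equivalence class: objects that the attribute cannot
    tell apart.
    """
    classes = defaultdict(list)
    for index, row in enumerate(rows):
        classes[row[attribute]].append(index)
    return dict(classes)

print(partition(objects, "color"))
print(partition(objects, "size"))
```

Every attribute yields its own partition of the same objects; a divisive method chooses one of these partitions to split on.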

Page 8: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Mean gain ratio and entropy of cluster

Page 9: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Mean gain ratio and entropy of cluster

Page 10: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Mean gain ratio and entropy of cluster

Page 11: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Mean gain ratio and entropy of cluster
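The formulas from these slides are not reproduced in the transcript. A minimal sketch, assuming the conventional C4.5-style definitions: the gain ratio of a candidate attribute with respect to another attribute is the information gain divided by the candidate's own entropy, and the mean gain ratio averages this over all other attributes. The paper's exact definitions may differ, and the function names and toy rows below are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (base 2) of a list of categorical values."""
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in Counter(values).values())

def conditional_entropy(target, given):
    """Entropy of `target` within each group of `given`, weighted by group size."""
    groups = {}
    for t, g in zip(target, given):
        groups.setdefault(g, []).append(t)
    return sum(len(ts) / len(target) * entropy(ts) for ts in groups.values())

def gain_ratio(rows, split_attr, other_attr):
    """Assumed definition: (H(other) - H(other | split)) / H(split)."""
    split = [r[split_attr] for r in rows]
    other = [r[other_attr] for r in rows]
    split_entropy = entropy(split)
    if split_entropy == 0:  # a constant attribute cannot split anything
        return 0.0
    return (entropy(other) - conditional_entropy(other, split)) / split_entropy

def mean_gain_ratio(rows, attr):
    """Average gain ratio of `attr` against every other attribute."""
    others = [a for a in rows[0] if a != attr]
    return sum(gain_ratio(rows, attr, o) for o in others) / len(others)

# Hypothetical toy rows: "a" determines "b" and is independent of "c".
rows = [
    {"a": "x", "b": "p", "c": "u"},
    {"a": "x", "b": "p", "c": "v"},
    {"a": "y", "b": "q", "c": "u"},
    {"a": "y", "b": "q", "c": "v"},
]
for attr in rows[0]:
    print(attr, mean_gain_ratio(rows, attr))
```

On this toy table, attribute `a` fully determines `b` and tells nothing about `c`, so its mean gain ratio is 0.5, while `c` scores 0. An attribute with a high score is one whose partition explains the other attributes well.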

Page 12: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Algorithm

Page 13: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Algorithm

Page 14: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Algorithm
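The algorithm steps themselves are not preserved in this transcript. The sketch below shows one way a divisive loop of this kind could look, under stated assumptions: pick the attribute with the highest mean gain ratio, partition the remaining objects on it, split off the equivalence class with the lowest entropy as a cluster, and repeat on the rest. The stopping rule, the cluster entropy (sum of per-attribute entropies), and all function names are assumptions for illustration, not the authors' specification.

```python
import math
from collections import Counter, defaultdict

def entropy(values):
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in Counter(values).values())

def mean_gain_ratio(rows, attr):
    """Average C4.5-style gain ratio of `attr` against every other attribute (assumed form)."""
    split_entropy = entropy([r[attr] for r in rows])
    others = [a for a in rows[0] if a != attr]
    if split_entropy == 0 or not others:
        return 0.0
    total = 0.0
    for other in others:
        groups = defaultdict(list)
        for r in rows:
            groups[r[attr]].append(r[other])
        conditional = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
        total += (entropy([r[other] for r in rows]) - conditional) / split_entropy
    return total / len(others)

def cluster_entropy(rows):
    """Assumed cluster entropy: sum of per-attribute entropies inside the cluster."""
    return sum(entropy([r[a] for r in rows]) for a in rows[0])

def divisive_clustering(rows, max_clusters=None):
    """Split off one low-entropy equivalence class per round (illustrative only)."""
    remaining, clusters = list(rows), []
    while max_clusters is None or len(clusters) < max_clusters - 1:
        # Only attributes that still take more than one value can split the data.
        candidates = [a for a in remaining[0] if len({r[a] for r in remaining}) > 1]
        if not candidates:
            break
        best = max(candidates, key=lambda a: mean_gain_ratio(remaining, a))
        classes = defaultdict(list)
        for r in remaining:
            classes[r[best]].append(r)
        purest = min(classes.values(), key=cluster_entropy)
        clusters.append(purest)
        remaining = [r for r in remaining if r[best] != purest[0][best]]
    clusters.append(remaining)
    return clusters

# Hypothetical toy data with two obvious groups.
data = [
    {"a": "x", "b": "p"}, {"a": "x", "b": "p"},
    {"a": "y", "b": "q"}, {"a": "y", "b": "q"},
]
for cluster in divisive_clustering(data):
    print(cluster)
```

With `max_clusters` left as `None`, the loop runs until the remaining objects are indistinguishable, so the number of clusters does not have to be fixed in advance; passing a value caps the number of clusters returned.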

Page 15: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Example

Page 16: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Comparisons with MMR

Page 17: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Comparisons with MMR

Page 18: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Comparisons with MMR

Page 19: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Comparisons with MMR

Page 20: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Methodology

• Comparisons with MMR

Page 21: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Experiments

• Manually label
• Randomly select 100 English articles from Wikipedia
• Labeled 3072 concepts that belong to 29044 categories (7780 relevant categories)

Page 22: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Experiments

Page 23: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Experiments

Page 24: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Experiments

Page 25: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Experiments

Page 26: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Experiments

Page 27: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Experiments

Page 28: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Experiments

Page 29: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Conclusions

• MGR has better clustering accuracy and stability.

• MGR has better clustering efficiency and scalability.

Page 30: MGR: An information theory based hierarchical divisive clustering  algorithm for categorical data

Comments

• Advantages
– Better clustering accuracy and stability
– Works without specifying the number of clusters

• Applications
– Categorical data
– Clustering