Intelligent Database Systems Lab
N.Y.U.S.T. I.M.
A cluster validity measure with a hybrid parameter search method
for the support vector clustering algorithm
Presenter: Lin, Shu-Han
Authors: Jeen-Shing Wang, Jen-Chieh Chiang
PR (2008)
Outline
- Introduction of SVC
- Motivation
- Objective
- Methodology
- Experiments
- Conclusion
- Comments
SVC
- SVC is derived from SVMs
- SVMs are a supervised classification technique
  - Fast convergence
  - Good generalization performance
  - Robustness to noise
- SVC is an unsupervised approach
1. Data points are mapped to a high-dimensional feature space using a Gaussian kernel.
2. Look for the smallest sphere that encloses the data.
3. Map the sphere back to data space to form a set of contours.
4. The contours are treated as the cluster boundaries.
SVC - Sphere Analysis
To find the minimal enclosing sphere (center a) with soft margin:
To solve this problem, the Lagrangian function is introduced:
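The equations on this slide did not survive extraction. As a hedged reconstruction, assuming the slide follows the standard SVC formulation of Ben-Hur et al., the soft-margin problem and its Lagrangian are usually written as below. The symbols \Phi (feature map), R (radius), \xi_j (slack variables), and \beta_j, \mu_j (Lagrange multipliers) are assumptions; only a and C appear in the surviving slide text.

```latex
\[
\min_{R,\,a,\,\xi} \; R^2 + C \sum_j \xi_j
\quad \text{s.t.} \quad
\|\Phi(x_j) - a\|^2 \le R^2 + \xi_j, \qquad \xi_j \ge 0
\]
\[
L = R^2 - \sum_j \bigl(R^2 + \xi_j - \|\Phi(x_j) - a\|^2\bigr)\beta_j
    - \sum_j \xi_j \mu_j + C \sum_j \xi_j,
\qquad \beta_j \ge 0,\; \mu_j \ge 0
\]
```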
SVC - Sphere Analysis
SVC - Sphere Analysis
Karush-Kuhn-Tucker complementarity:
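The conditions themselves are missing from the extracted slide. Assuming the standard SVC notation (slack variables \xi_j, Lagrange multipliers \beta_j and \mu_j, feature map \Phi, sphere radius R and center a, most of which are not visible in the surviving text), the KKT complementarity conditions are usually stated as:

```latex
\[
\xi_j \, \mu_j = 0
\]
\[
\bigl(R^2 + \xi_j - \|\Phi(x_j) - a\|^2\bigr)\,\beta_j = 0
\]
```

Under these conditions, a point with \xi_j > 0 lies outside the sphere and has \mu_j = 0, while a point strictly inside the sphere has \beta_j = 0.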
SVC - Sphere Analysis
To find the minimal enclosing sphere with soft margin, solve the Wolfe dual optimization problem:
- C: regulates how many outliers are allowed
- Bound SV: outlier
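The dual problem itself was an image on the slide and is not preserved. A sketch of the usual Wolfe dual for SVC, assuming the standard notation (multipliers \beta_j and kernel K are assumptions not visible in the surviving text):

```latex
\[
\max_{\beta} \; W = \sum_j K(x_j, x_j)\,\beta_j - \sum_{i,j} \beta_i \beta_j K(x_i, x_j)
\quad \text{s.t.} \quad
0 \le \beta_j \le C, \qquad \sum_j \beta_j = 1
\]
```

In this formulation, points with 0 < \beta_j < C lie on the sphere surface (support vectors), and points with \beta_j = C lie outside it (bounded support vectors), which is why they are read as outliers.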
SVC - Sphere Analysis
- The distance (similarity) between x and the sphere center a:
- Mercer kernel: Gaussian
- Gaussian function:
- q: controls the number of clusters and the smoothness/tightness of the cluster boundaries.
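The formulas for the distance and the Gaussian function are not preserved in the slide text. The sketch below assumes the standard SVC forms, K(x, y) = exp(-q ||x - y||^2) and R^2(x) = K(x, x) - 2 Σ_j β_j K(x_j, x) + Σ_{i,j} β_i β_j K(x_i, x_j). The function names and the example multipliers are hypothetical.

```python
import math

def gaussian_kernel(x, y, q):
    """K(x, y) = exp(-q * ||x - y||^2) for two equal-length sequences."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-q * sq_dist)

def squared_distance_to_center(x, points, beta, q):
    """Squared feature-space distance R^2(x) between x and the sphere center a."""
    cross = sum(b * gaussian_kernel(p, x, q) for p, b in zip(points, beta))
    quad = sum(
        bi * bj * gaussian_kernel(pi, pj, q)
        for pi, bi in zip(points, beta)
        for pj, bj in zip(points, beta)
    )
    return gaussian_kernel(x, x, q) - 2.0 * cross + quad

points = [(0.0, 0.0), (1.0, 0.0)]
beta = [0.5, 0.5]  # hypothetical multipliers (assumption); they must sum to 1
r2_near = squared_distance_to_center((0.5, 0.0), points, beta, q=1.0)
r2_far = squared_distance_to_center((5.0, 0.0), points, beta, q=1.0)
```

A point close to the data gets a smaller R^2 than a distant one, so thresholding R^2(x) at the sphere radius yields the contours that SVC treats as cluster boundaries.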
Motivation
- Drawbacks of cluster validation measures
  - Compactness
    - Different densities or sizes
    - Decreases monotonically as the # of clusters increases
  - Separation
    - Irregular cluster structures
Motivation
- Their previous study can handle
  - Different sizes
  - Different densities
  - Arbitrary shapes
- But…
Objectives – A cluster validity method and a parameter search algorithm for SVC
- Automatically determine the two parameters:
  - q: increasing q leads to an increasing # of clusters
  - C: regulates the existence of outliers and overlapping clusters
- Identify the optimal cluster structure
Methodology – Idea
- q is related to the densities of the clusters
- Each cluster structure corresponds to an interval of q
- Identifying the optimal structure is equivalent to finding the largest interval
- N = 64, max # of clusters = √N = 8
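The idea of picking the largest q interval can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes q has already been scanned on a grid and that "same cluster structure" can be approximated by "same number of clusters". The function name and the scan values are hypothetical.

```python
def largest_stable_interval(q_values, cluster_counts):
    """Return (q_start, q_end, count) for the widest run of equal cluster counts.

    q_values must be sorted in ascending order and aligned with cluster_counts.
    """
    best = (q_values[0], q_values[0], cluster_counts[0])
    start = 0
    for i in range(1, len(q_values) + 1):
        # A run ends at the end of the scan or when the cluster count changes.
        if i == len(q_values) or cluster_counts[i] != cluster_counts[start]:
            run = (q_values[start], q_values[i - 1], cluster_counts[start])
            if run[1] - run[0] > best[1] - best[0]:
                best = run
            start = i
    return best

# Hypothetical scan: these cluster counts are made up for illustration.
q_scan = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
counts = [1, 2, 3, 3, 3, 4, 6]
interval = largest_stable_interval(q_scan, counts)
```

Here the three-cluster structure persists over the widest range of q, so it would be the candidate for the optimal structure.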
Methodology – Problem
- How to locate the overall search range of q
- How to detect outliers/noise
- How to identify the largest interval
Methodology – Locate range of q
- Lower bound
- Upper bound: employ K-Means to get clusters, and compute the variance vi of each cluster
  - Sort the clusters in ascending order of cluster size
  - n = 3: use the variances of the 3 biggest clusters
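The variance step can be sketched as below. The slide's formula for turning these variances into the upper bound of q is not preserved, so the sketch stops at collecting them. It assumes cluster labels are already available from some K-Means run and that "variance" means the mean squared distance to the cluster centroid; both are assumptions, and the function name is hypothetical.

```python
def variances_of_biggest_clusters(points, labels, n=3):
    """Return the variances of the n largest clusters, biggest cluster last.

    Variance here is the mean squared distance to the cluster centroid,
    which is an assumption; the slide does not define it.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    stats = []
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        variance = sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members
        ) / len(members)
        stats.append((len(members), variance))

    stats.sort(key=lambda s: s[0])  # ascending order of cluster size
    return [variance for _, variance in stats[-n:]]

# Hypothetical labels, standing in for the output of any K-Means run.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (10.0, 12.0), (10.0, 14.0), (50.0, 50.0)]
labels = [0, 0, 1, 1, 1, 2]
top_two = variances_of_biggest_clusters(points, labels, n=2)
```

Restricting attention to the biggest clusters keeps tiny clusters, which may be noise, from influencing the bound.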
Methodology – Outlier Detection
- Set q = qmax, the tightest value of q
- Outliers appear as singleton clusters
- From these we get Copt, then remove the outliers
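The singleton step can be illustrated with a minimal sketch. It assumes cluster labels from an SVC run at q = qmax are already available, and it reads "singleton" as a cluster containing exactly one point. The function name and labels are hypothetical, and the computation of Copt is not shown because its formula is not preserved in the slide text.

```python
from collections import Counter

def split_singletons(labels):
    """Return (outlier_indices, inlier_indices), treating size-1 clusters as outliers."""
    sizes = Counter(labels)
    outliers = [i for i, label in enumerate(labels) if sizes[label] == 1]
    inliers = [i for i, label in enumerate(labels) if sizes[label] > 1]
    return outliers, inliers

# Hypothetical cluster labels, standing in for an SVC run at q = qmax.
labels_at_qmax = [0, 0, 1, 2, 2, 2, 3]
outliers, inliers = split_singletons(labels_at_qmax)
```

The inlier indices are the points that would be kept for the subsequent search over q.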
Methodology – the largest interval
qopt
Methodology – the largest interval
- Fibonacci search: locate the interval where the cluster structure is the same
- Bisection search
- n: iteration
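One plausible role for the bisection step is to narrow down the value of q at which the cluster structure changes. The sketch below shows only that idea and is not the authors' exact procedure. `count_clusters` is a hypothetical callback standing in for running SVC at a given q, and the toy step function is made up.

```python
def bisect_boundary(count_clusters, q_low, q_high, iterations=20):
    """Narrow down the q at which the cluster count stops matching count_clusters(q_low).

    Assumes count_clusters(q_low) != count_clusters(q_high) and that the
    structure changes only once inside [q_low, q_high].
    """
    target = count_clusters(q_low)
    for _ in range(iterations):
        mid = (q_low + q_high) / 2.0
        if count_clusters(mid) == target:
            q_low = mid   # structure unchanged: boundary lies to the right
        else:
            q_high = mid  # structure changed: boundary lies to the left
    return q_low, q_high

# Hypothetical stand-in for running SVC and counting clusters at a given q.
def toy_count(q):
    return 3 if q < 2.3 else 4

low, high = bisect_boundary(toy_count, 1.0, 4.0, iterations=30)
```

Each iteration halves the bracket, so n iterations shrink an initial range of width w to w / 2^n.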
Methodology – Overview
1. Locate range of q
2. Outlier detection
3. Identify the largest interval
Experiments - Benchmark and Artificial Examples
Experiments - Outlier
Copt
Experiments
?
Conclusions
- A new measure: inspired by the observations of q
- Determines the optimal cluster structure with its corresponding range of q and C
Comments
- Advantage
  - Inspired by observation of the parameter
- Drawback
  - …
- Application
  - SVC
  - DBSCAN: MinPts / Eps