Minimum-Distance-to-Means Clustering for Vector Quantization: New Algorithms and Applications
A short presentation of two interesting unsupervised learning algorithms for vector quantization recently published in the literature
Biography
Andrea Baraldi
• Laurea in Elect. Engineering, Univ. Bologna, 1989
• Consultant at ESA-ESRIN, 1991-1993
• Research associate at ISAO-CNR, Bologna, 1994-1996
• Post-doctoral fellowship at ICSI, Berkeley, 1997-1999
Scientific interests
• Remote sensing applications
• Image processing
• Computer vision
• Artificial intelligence (neural networks)
About this presentation
• Basic concepts related to minimum-distance-to-means clustering
• Applications in data analysis and image processing
• Interesting clustering models taken from the literature:
  - Fully self-Organizing Simplified Adaptive Resonance Theory (FOSART, IEEE TSMC, 1999)
  - Enhanced Linde-Buzo-Gray (ELBG, IJKIES, 2000)
Minimum-distance-to-means clustering
• Clustering as an ill-posed problem (heuristic techniques for grouping the data at hand)
• Cost function minimization (inductive learning to characterize future samples):
  - Mean-square-error minimization = minimum-distance-to-means (vector quantization; see the sketch below)
  - Entropy maximization (equiprobable cluster detection)
  - Joint probability maximization (pdf estimation)
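To make the minimum-distance-to-means idea concrete, here is a minimal NumPy sketch (not part of the original slides) of the mean-square-error distortion that such a quantizer minimizes; the function name and array layout are illustrative choices.

    import numpy as np

    def mse_distortion(X, codebook):
        """MSE distortion of a codebook over a data set (illustrative sketch).

        X        : (N, d) array of data points
        codebook : (Nc, d) array of codewords (cluster means)
        """
        # Squared Euclidean distance from every point to every codeword.
        d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        # Minimum-distance-to-means: each point is quantized to its
        # nearest mean, and the cost is the resulting mean squared error.
        return d2.min(axis=1).mean()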
Applications of unsupervised vector quantizers
• Detection of hidden data structures (data clustering, perceptual grouping)
• First-stage unsupervised learning in RBF networks (data classification, function regression) (Bruzzone, IEEE TGARS, 1999)
• Pixel-based initialization of context-based image segmentation techniques (image partitioning and classification)
FOSART by A. Baraldi, ISAO-CNR, IEEE TSMC, 1999
Input parameters:
• ρ ∈ (0, 1] (ART-based vigilance threshold; the vigilance mechanism is sketched below)
• ε (convergence threshold, e.g., 0.001)
Properties:
• Constructive: generates (resp. removes) units and lateral connections on an example-driven (resp. mini-batch) basis
• Topology-preserving
• Minimum-distance-to-means clustering
• On-line learning
• Soft-to-hard competitive
• Incapable of shifting codewords through non-contiguous Voronoi regions
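A minimal sketch of the ART-style vigilance mechanism behind FOSART's constructive behaviour. This illustrates the general idea only, not the FOSART model itself; the Gaussian similarity and the function name are assumptions of the sketch.

    import numpy as np

    def vigilance_match(x, prototypes, rho):
        """Illustrative ART-style vigilance test (not the full FOSART model).

        x          : (d,) input sample
        prototypes : list of (d,) category templates
        rho        : vigilance threshold in (0, 1]; a larger rho demands a
                     closer match, hence more units get created.
        Returns the index of the best-matching prototype, or None when no
        prototype is close enough and a new unit should be created.
        """
        if not prototypes:
            return None
        dists = [np.linalg.norm(x - p) for p in prototypes]
        best = int(np.argmin(dists))
        # A simple Gaussian similarity in (0, 1], used here only to give
        # the vigilance test a concrete form; FOSART's actual activation
        # and match functions differ.
        if np.exp(-dists[best] ** 2) >= rho:
            return best        # resonance: update the winning template
        return None            # mismatch: constructive step, add unit at x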
FOSART APPLICATIONS: Perceptual grouping of non-convex data sets
• Input: non-convex data set (circular ring plus three Gaussian clusters), 140 data points.
• FOSART processing: 11 templates, 3 maps.
FOSART APPLICATIONS: 3-D surface reconstruction
• Input: 3-D digitized human face, 9371 data points.
• Output: 3370 nodes, 60 maps.
ELBG by M. Russo and G. Patanè, Univ. Messina, IJKIES, 2000
• c-means minimum-distance-to-means clustering (MacQueen, 1967; LBG, 1980); a generic batch iteration is sketched below
• Initialized by means of random selection or splitting-by-two (Moody and Darken, 1988)
• Non-constructive
• Batch learning
• Hard competitive
• Capable of shifting codewords through non-contiguous Voronoi regions (in line with LBG-U, Fritzke, 1997)
Input parameters:
• c (number of clusters)
• ε (convergence threshold, e.g., 0.001)
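For reference, a generic batch LBG / c-means iteration in NumPy. This is a sketch of the classical scheme that ELBG builds on, not of ELBG itself; the relative-drop convergence test is one common choice for applying the threshold ε.

    import numpy as np

    def lbg(X, codebook, eps=0.001, max_iter=100):
        """Batch c-means / LBG iteration (generic sketch, not ELBG itself).

        X        : (N, d) data array
        codebook : (c, d) float array, e.g., from random selection or
                   splitting-by-two, as mentioned in the slide
        eps      : convergence threshold on the relative distortion drop
        """
        prev = np.inf
        for _ in range(max_iter):
            # Hard competitive step: nearest-codeword (Voronoi) assignment.
            d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            dist = d2.min(axis=1).mean()
            # Stop when the relative distortion drop falls below eps.
            if prev - dist <= eps * dist:
                break
            prev = dist
            # Batch update: move each codeword to the mean of its cell.
            for i in range(len(codebook)):
                cell = X[labels == i]
                if len(cell):
                    codebook[i] = cell.mean(axis=0)
        return codebook, labels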
Combination of ELBG with FOSART
• FOSART initializes ELBG
Input parameters of the two-stage clustering system are:
• ρ ∈ (0, 1] (ART-based vigilance threshold)
• ε (convergence threshold, e.g., 0.001)
ELBG algorithm
• Ym: codebook at iteration m
• P(Ym): Voronoi (ideal) partition
• S(Ym): non-Voronoi (sub-optimal) partition
• D{Ym, S(Ym)} ≥ D{Ym, P(Ym)}
• Voronoi cell Si, i = 1, …, Nc, such that Si = {x ∈ X : d(x, yi) ≤ d(x, yj), j = 1, …, Nc, j ≠ i} (computed in the sketch below)
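The Voronoi partition defined above can be computed directly. A small NumPy sketch (function names are illustrative; ties are resolved to the lowest index, as argmin does):

    import numpy as np

    def voronoi_partition(X, Y):
        """Voronoi partition P(Y) of data set X induced by codebook Y.

        Implements the slide's definition: cell S_i collects every x in X
        with d(x, y_i) <= d(x, y_j) for all j != i.
        """
        d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
        labels = d.argmin(axis=1)   # index of the nearest codeword
        return [X[labels == i] for i in range(len(Y))]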
ELBG block
• Utility Ui = Di / Dmean, Ui ∈ [0, ∞), i = 1, …, Nc: a dimensionless distortion measure (see the sketch below)
• “Low” utility (Ui < 1): distortion below average → codeword to be shifted
• “High” utility (Ui > 1): distortion above average → cell to be split
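In code, the utility of each cell is the ratio of its total distortion to the average cell distortion. A minimal sketch (names are illustrative):

    import numpy as np

    def utilities(cells, Y):
        """Dimensionless utilities U_i = D_i / D_mean of the Voronoi cells.

        cells : list of (n_i, d) arrays, the Voronoi partition of X
        Y     : (Nc, d) codebook
        U_i < 1 marks a codeword to shift; U_i > 1 marks a cell to split.
        """
        # Total squared-error distortion D_i of each cell.
        D = np.array([((c - y) ** 2).sum() for c, y in zip(cells, Y)])
        return D / D.mean()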
ELBG block: iterative scheme
• C.1) Sequential search of cell Si to be shifted (distortion below average)
• C.2) Stochastic search of cell Sp to be split (distortion above average; see the sketch below)
• C.3)
  a) Detection of codeword yn closest to yi;
  b) “Local” LBG arrangement of codewords yi and yp;
  c) Arrangement of yn such that S’n = Sn ∪ Si;
• C.4) Compute D’n, D’p and D’i
• C.5) If (D’n + D’p + D’i) < (Dn + Dp + Di) then accept the shift
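A sketch of the stochastic search of step C.2. The slide does not spell out the sampling rule, so this sketch assumes a roulette-wheel choice among the above-average cells, with probability proportional to utility:

    import numpy as np

    def pick_cell_to_split(U, rng=None):
        """Step C.2 sketch: stochastic search of a cell S_p to split.

        U : (Nc,) array of utilities; only cells with U_p > 1 (distortion
        above average) are candidates. Sampling proportional to utility
        is an assumption of this sketch.
        """
        if rng is None:
            rng = np.random.default_rng()
        candidates = np.flatnonzero(U > 1.0)
        if candidates.size == 0:
            return None                      # nothing worth splitting
        p = U[candidates] / U[candidates].sum()
        return int(rng.choice(candidates, p=p))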
ELBG block: initial situation before the codeword-shift attempt
• C.1) Sequential search of cell Si to be shifted
• C.2) Stochastic search of cell Sp to be split
• C.3.a) Detection of codeword yn closest to yi;
ELBG block: initialization of the “local” LBG arrangement of yi and yp
• C.3.b) “Local” LBG arrangement of codewords yi and yp;
ELBG block: situation after the initialization of the codeword-shift attempt
• C.3.a) Detection of codeword yn closest to yi;
• C.3.b) “Local” LBG arrangement of codewords yi and yp;
ELBG block: situation after the codeword-shift attempt
• C.3.b) “Local” LBG arrangement of codewords yi and yp;
• C.3.c) Arrangement of yn such that S’n = Sn ∪ Si;
• C.4) Compute D’n, D’p and D’i
• C.5) If (D’n + D’p + D’i) < (Dn + Dp + Di) then accept the shift
Examples
• Polynomial case (Russo and Patanè, IJKIES, 2000)
• Cantor distribution (same as above)
• Fritzke's 2-D data set (same as above)
• RBF network classification (Baraldi and Blonda, IGARSS 2000)
Lena image compression
Comparison of Modified-LBG and ELBG in the clustering of the 16-dimensional Lena data set, consisting of 16384 vectors (Russo and Patanè, IJKIES, 2000):

  c    | Modified-LBG*               | ELBG with splitting-by-two   | ELBG with FOSART
       | PSNR (dB)  MSE    Iter.     | PSNR (dB)  MSE    Iter.      | PSNR (dB)  MSE    Iter.
 ------+-----------------------------+------------------------------+------------------------
  256  | 31.92      668.6  20        | 31.97      660.9  46 + 8     | 31.98      659.4  3 + 10
  512  | 33.09      510.7  17        | 33.17      499.2  54 + 8     | 33.22      494.0  3 + 9
  1024 | 34.42      376.0  19        | 34.72      349.3  64 + 9     | 34.78      344.3  3 + 9

Iter. for the ELBG columns is reported as initialization + ELBG iterations (splitting + ELBG and FOSART + ELBG, respectively).
*: Modified-LBG (Lee et al., IEEE Signal Proc. Lett., 1997); results taken from the literature (Russo and Patanè, IJKIES, 2000).
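The PSNR and MSE columns are mutually consistent: dividing the tabulated per-vector MSE by the vector dimension (16, i.e., 4x4 pixel blocks of the 512x512 image) gives the per-pixel MSE, from which PSNR follows for 8-bit data. A small sketch that reproduces the table's PSNR values:

    import math

    def psnr_db(mse_per_vector, dim=16, peak=255.0):
        """PSNR in dB from the table's per-vector MSE.

        dim  : vector dimension (16 for the 4x4-block Lena data set)
        peak : maximum pixel value for 8-bit images
        """
        return 10.0 * math.log10(peak ** 2 / (mse_per_vector / dim))

    # psnr_db(668.6) -> 31.92 dB, matching the first Modified-LBG row.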
Conclusions
ELBG (+ FOSART):
• is stable with respect to changes in initial conditions (i.e., it is effective in approaching the absolute minimum of the cost function)
• is fast to converge
• features low overhead with respect to traditional LBG (< 5%)
• performs better than or equal to other minimum-distance-to-means clustering algorithms found in the literature