
Self-Organizing Maps for imprecise data

Pierpaolo D’Urso a,∗, Livia De Giovanni b, Riccardo Massari a

a Dipartimento di Scienze Sociali, Sapienza University of Rome, P.za Aldo Moro, 5 - 00185 Rome, Italy
b Dipartimento di Scienze Politiche, LUISS Guido Carli, Viale Romania, 32 - 00197 Rome, Italy

Received 13 August 2012; received in revised form 28 June 2013; accepted 21 September 2013

Abstract

Self-Organizing Maps (SOMs) consist of a set of neurons arranged in such a way that there are neighbourhood relationships among neurons. Following an unsupervised learning procedure, the input space is divided into regions with a common nearest neuron (vector quantization), allowing clustering of the input vectors. In this paper, we propose an extension of the SOMs for imprecisely observed data (Self-Organizing Maps for imprecise data, SOMs-ID). The learning algorithm is based on two distances for imprecise data. In order to illustrate the main features and to compare the performance of the proposed method, we provide a simulation study and different substantive applications.
© 2013 Published by Elsevier B.V.

Keywords: Imprecise data; Fuzziness; Distance measures for imprecise data; SOMs for imprecise data; Vector quantization for imprecise data

1. Introduction

Self-Organizing Maps (SOMs) (Kohonen [1,2]) consist of a set of neurons arranged in a linear (one-dimensional) or rectangular (two-dimensional) configuration, such that there are neighbourhood relations among the neurons. Each neuron is attached to a weight vector of the same dimension as the input space (the multi-dimensional space of the units or input vectors). After completion of training, by assigning each input vector to the neuron with the nearest weight vector (reference vector), the SOMs are able to divide the input space into regions with a common nearest weight vector (vector quantization), allowing clustering of the input vectors. Moreover, under appropriate training, because of the neighbourhood relation contributed by the interconnection among neurons, the SOMs exhibit the important property of topology preservation. In other words, if two input vectors are close in the input space, the corresponding reference vectors (closest neurons) will also be close in the neural network. Therefore, at least in two-dimensional neural networks, visualization is also possible. The density of the weight vectors of an organized map reflects the density of the input: in clustered areas the weight vectors are close to each other, and in the empty space between the clusters they are more sparse.

In the literature, a great deal of attention has been paid to the SOMs for traditional (numeric) data (“precise” observations). However, there are various real situations in which the data are not precisely observed (imprecise observations).

* Corresponding author.
E-mail addresses: [email protected] (P. D’Urso), [email protected] (L. De Giovanni), [email protected] (R. Massari).


In the literature on SOMs, there are only a few works. Bock [3,4] proposed to visualize symbolic data (i.e. interval-valued data) by using SOMs. Chen et al. [5] suggested a batch version of the SOMs for symbolic data. D’Urso and De Giovanni [6] proposed Midpoint Radius-based Self-Organizing Maps (MR-SOMs) for interval-valued data, showing a suggestive telecommunications application. Hajjar and Hamdan [7] suggested an algorithm to train the SOMs for interval data based on the city-block distance. Yang et al. [8] suggested SOMs for symbolic data. To our knowledge, there are no works in the literature on SOMs for fuzzy data.

In this paper, by considering two distance measures for fuzzy data [9,10], we propose SOMs allowing clustering and vector quantization of imprecise (i.e. fuzzy) data.

In particular, in Section 2 we show the fuzzy management of imprecisely observed data. In Section 3, following a fuzzy formalization of the data, two distance measures for imprecise observations are illustrated [9,10]; in particular, we prove that the measure introduced by Coppi et al. [9] is a metric. These metrics are utilized in the Self-Organizing Maps for imprecise data (SOMs-ID) proposed in Section 4. In order to assess the performance of the suggested SOMs-ID, a simulation study is illustrated in Section 5. Several suggestive applications are shown in Section 6. Final remarks are made in Section 7.

2. Fuzzy management of imprecise information

A statistical reasoning process may be looked at as a specific cognitive process characterized by the simultaneous management of information and uncertainty [11]. In this framework, various sources of uncertainty can be taken into account [12]: (1) sampling uncertainty connected to the data generation process; (2) uncertainty regarding the various theoretical ingredients considered in the data analysis process; (3) uncertainty concerning the observation or nature of empirical data (i.e., imprecision, vagueness, etc.).

In this paper, we focus on the specific case in which the empirical information (the data) is imprecisely or vaguely observed. As we will see below, the imprecision is assumed to be represented by means of fuzzy sets, giving rise to fuzzy statistical variables, parametrized in the form of LR fuzzy variables.

2.1. A fuzzy formalization of the imprecise data (fuzzy data)

We can formalize the imprecise data mathematically in a fuzzy framework by considering a general class of fuzzy data, the so-called LR fuzzy data [13,14]. These data can be stored in a (fuzzy) data matrix, i.e. the LR fuzzy data matrix (I observation units × J fuzzy variables) defined as:

X ≡ {xij = (c1ij, c2ij, lij, rij)LR : i = 1, ..., I; j = 1, ..., J},   (2.1)

where xij = (c1ij, c2ij, lij, rij)LR represents the j-th LR fuzzy variable observed on the i-th observation unit, c1ij and c2ij (c2ij > c1ij) denote, respectively, the left and right “center” (the interval [c1ij, c2ij] is usually referred to as the “core” of the fuzzy number xij), and lij and rij the left and right spread, respectively, with the following membership function:

μxij(uij) = L((c1ij − uij)/lij)   if uij ≤ c1ij (lij > 0),
μxij(uij) = 1                    if c1ij ≤ uij ≤ c2ij,        (2.2)
μxij(uij) = R((uij − c2ij)/rij)  if uij ≥ c2ij (rij > 0),

where L (and R) is a decreasing “shape” function from R+ to [0,1] with L(0) = 1; L(zij) < 1 for all zij > 0, ∀i, j; L(zij) > 0 for all zij < 1, ∀i, j; L(1) = 0 (or L(zij) > 0 for all zij and L(+∞) = 0). The fuzzy number xij = (c1ij, c2ij, lij, rij)LR (i = 1, ..., I; j = 1, ..., J) consists of an interval which runs from c1ij − lij to c2ij + rij, and the membership function gives differential weights to the values in the interval to the left and to the right of the left and right “centers”, respectively.

The most common LR fuzzy datum is the trapezoidal one (with trapezoidal membership function). In particular, for an LR fuzzy number xij, if L and R are of the form:

L(z) = R(z) = 1 − z^α for 0 ≤ z ≤ 1, and 0 otherwise,   (2.3)


with α = 1, then X ≡ {xij : i = 1, ..., I; j = 1, ..., J} is a trapezoidal fuzzy data matrix whose elements have the following membership functions:

μxij(uij) = 1 − (c1ij − uij)/lij   if uij ≤ c1ij (lij > 0),
μxij(uij) = 1                     if c1ij ≤ uij ≤ c2ij,        (2.4)
μxij(uij) = 1 − (uij − c2ij)/rij  if uij ≥ c2ij (rij > 0).

When c1ij = c2ij, we obtain a particular type of LR fuzzy number, denoted as xij = (cij, lij, rij)LR, where cij denotes the center, i = 1, ..., I; j = 1, ..., J, determining the following particular case of LR fuzzy data matrix:

X ≡ {xij = (cij, lij, rij)LR : i = 1, ..., I; j = 1, ..., J}.   (2.5)

Particular cases of LR fuzzy data are the triangular, parabolic and square root ones (when L and R are of the form (2.3) with α = 1, α = 2 and α = 1/2, respectively). Each case takes into account a different level of fuzziness around the centers of the fuzzy numbers. Specifically, the square root case denotes a low level of fuzziness, the triangular case a medium level, and the parabolic case a high level.
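As a concrete illustration, the membership function (2.2) with the power-shaped L and R of (2.3) can be evaluated in a few lines of Python; this is a minimal sketch (the function name and data layout are ours, not part of the paper), in which alpha = 1, 2 and 1/2 give the trapezoidal, parabolic and square root cases, respectively.

    import numpy as np

    def lr_membership(u, c1, c2, l, r, alpha=1.0):
        # LR fuzzy number (c1, c2, l, r)_LR with L(z) = R(z) = max(1 - z**alpha, 0),
        # cf. (2.2)-(2.3); returns the membership values for the points in u.
        u = np.asarray(u, dtype=float)
        mu = np.zeros_like(u)
        mu[(u >= c1) & (u <= c2)] = 1.0                  # the core [c1, c2]
        if l > 0:
            left = u < c1
            mu[left] = np.maximum(1.0 - ((c1 - u[left]) / l) ** alpha, 0.0)
        if r > 0:
            right = u > c2
            mu[right] = np.maximum(1.0 - ((u[right] - c2) / r) ** alpha, 0.0)
        return mu

    # trapezoidal datum with core [1.5, 3.5] and support [0.5, 4.5]
    print(lr_membership([0.5, 1.0, 2.0, 4.0, 4.5], c1=1.5, c2=3.5, l=1.0, r=1.0))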

Notice that two very important topics, connected with the representation of some terms of natural language by means of fuzzy data, are the elicitation and specification of the membership functions.

Remark 1 (On the elicitation of the membership functions). As remarked by Coppi et al. [12], “as for the subjectivistic approach to probability, also the choice of the membership functions is subjective. In general, these are determined by experts in the problem area. In fact, the membership functions are context-sensitive. Furthermore, the functions are not determined in an arbitrary way, but are based on a sound psychological/linguistic foundation. It follows that the choice of the membership function should be made in such a way that a function captures the approximate reasoning of the person involved. In this respect, the elicitation of a membership function requires a deep psychological understanding.”

Remark 2 (On the specification of the membership functions). In the statistical analysis of fuzzy multivariate data, particular attention must be paid to the specification of the membership functions when we deal simultaneously with J variables. In particular, we have two possible approaches: the conjunctive approach and the disjunctive approach [15]. In the conjunctive approach, we take into account the fuzzy relationship defined on the Cartesian product of the reference universes of the J variables. From the statistical point of view, the adoption of the conjunctive approach to multi-dimensional fuzzy variables involves a specific interest in studying the fuzzy relationship looked at as a “variable” in itself, which could be observed on the I objects. Conversely, in the disjunctive approach, we are not interested in studying a fuzzy variable which constitutes the resultant of the J original variables. Instead, our interest focuses upon the set of the J “juxtaposed” variables, observed as a whole in the group of I objects. In this case, we have J membership functions, and the investigation of the links among the J fuzzy variables is carried out directly on the matrix of fuzzy data concerning the I J-variate observations [15,16].

Without loss of generality, let us consider the two-dimensional case; furthermore, let us assume that a one-dimensional fuzzy variable is represented by a (symmetrical) triangular membership function, and that a two-dimensional fuzzy variable is represented by two (symmetrical) triangular membership functions (disjunctive approach) or by a conical membership function (conjunctive approach). An example of the geometrical representation of the considered membership functions based on the disjunctive and conjunctive approaches is shown in Fig. 1. An analytical representation of two triangular fuzzy variables with (symmetrical) triangular membership functions (disjunctive approach) [15,16] can easily be obtained from (2.4) by fixing c1ij = c2ij, lij = rij and j = 1, 2. For an analytical formalization of a conical fuzzy variable with conical membership function (conjunctive approach), see Celminš [17,18].

3. Distance measures for imprecise data

By formalizing the imprecise data in a fuzzy manner, we can compare objects with imprecise information by using distance measures for fuzzy data.

In the literature, several proximity measures (dissimilarity, similarity and distance measures) have been suggested in a fuzzy framework [16].


Fig. 1. Examples of membership functions based on the disjunctive and conjunctive approach. Source: Celminš [17,18].

Some of these measures are defined by suitably comparing the membership functions of the fuzzy data. These distances can be classified according to different approaches [19,20]: the “functional approach”, in which the membership functions are compared by means of Minkowski and Canberra distances extended to the fuzzy case [21,22]; the “information theoretic approach”, based on the definition of fuzzy entropy [23]; and the “set theoretic approach”, based on the concepts of fuzzy union and intersection [20,22,24,25].

Other kinds of dissimilarities compare the fuzzy data by directly using the empirical information represented by the centers and the spreads of the fuzzy data, i.e. the fuzzy observations collected in the matrix (2.1), and by adopting suitable weighting systems that somehow capture the information connected to the shape of the membership functions (see, e.g., [9,10,26–28]).

In this paper, for comparing fuzzy multivariate data, we consider two weighted distances:

1. the Coppi–D’Urso–Giordani distance (CDG distance) [9];
2. the Extended Yang–Ko distance (EYK distance) [10].

3.1. CDG distance measure

The CDG dissimilarity measure for LR fuzzy data has been proposed by Coppi et al. [9]. It represents a generalization of the dissimilarity for symmetrical fuzzy data suggested by D’Urso and Giordani [26]. By means of the CDG measure, the dissimilarity between each pair of objects is computed by comparing the fuzzy data observed on each object, considering separately the (squared) distances for the centers and the spreads of the fuzzy data and using a suitable weighting system for such distance components. Thus, by considering the i-th and i′-th objects, we have the following distance measure:

CDGd(xi, xi′) = [wC²(‖c1i − c1i′‖² + ‖c2i − c2i′‖²) + wS²(‖li − li′‖² + ‖ri − ri′‖²)]^(1/2),   (3.1)

where:

c1i ≡ (c1i1, ..., c1ij, ..., c1iJ)′,  c1i′ ≡ (c1i′1, ..., c1i′j, ..., c1i′J)′,
c2i ≡ (c2i1, ..., c2ij, ..., c2iJ)′,  c2i′ ≡ (c2i′1, ..., c2i′j, ..., c2i′J)′,
li ≡ (li1, ..., lij, ..., liJ)′,     li′ ≡ (li′1, ..., li′j, ..., li′J)′,
ri ≡ (ri1, ..., rij, ..., riJ)′,     ri′ ≡ (ri′1, ..., ri′j, ..., ri′J)′;

‖·‖ is the Euclidean norm; wC and wS are suitable weights for the center and spread components of CDGd(xi, xi′), where xi and xi′ denote the fuzzy data vectors for the i-th and i′-th objects, respectively, i.e. xi ≡ {xij = (c1ij, c2ij, lij, rij)LR : j = 1, ..., J} and xi′ ≡ {xi′j = (c1i′j, c2i′j, li′j, ri′j)LR : j = 1, ..., J}.
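For concreteness, a minimal Python sketch of (3.1) follows; the dictionary layout and the parametrization wC = 1 − ν, wS = ν (anticipating Remark 3 below) are our own choices, not the authors’ implementation.

    import numpy as np

    def cdg_distance(x, y, nu=0.5):
        # CDG distance (3.1): x and y are dicts holding the J-dimensional
        # arrays 'c1', 'c2', 'l', 'r' of an LR fuzzy data vector.
        wc, ws = 1.0 - nu, nu                       # center and spread weights
        d2 = (wc ** 2 * (np.sum((x['c1'] - y['c1']) ** 2)
                         + np.sum((x['c2'] - y['c2']) ** 2))
              + ws ** 2 * (np.sum((x['l'] - y['l']) ** 2)
                           + np.sum((x['r'] - y['r']) ** 2)))
        return np.sqrt(d2)

With ν = 0.5 the center and spread components enter with equal weight; smaller values of ν down-weight the spreads, in line with the coherence condition discussed in Remark 3 below.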

Notice that the CDG distance measure does not depend on the shape of the functions L and R. In the following Proposition 1 we prove that (3.1) is a metric.

Proposition 1. (X, CDGd(x, y)) is a metric space.

Proof. Denoting by x, y and z three LR fuzzy data vectors, the following properties are satisfied:

(1) Identity: ∀x, CDGd(x, x) = 0.
(2) Non-negativity: ∀ distinct x, y, CDGd(x, y) > 0.
(3) Symmetry: ∀ distinct x, y, CDGd(x, y) = CDGd(y, x).
(4) Triangular inequality: ∀ distinct x, y, z, CDGd(x, y) ≤ CDGd(x, z) + CDGd(z, y).

CDGd²(x, y) = wC²(‖c1x − c1y‖² + ‖c2x − c2y‖²) + wS²(‖lx − ly‖² + ‖rx − ry‖²)
= wC²(‖c1x − c1z + c1z − c1y‖² + ‖c2x − c2z + c2z − c2y‖²) + wS²(‖lx − lz + lz − ly‖² + ‖rx − rz + rz − ry‖²)
= CDGd²(x, z) + CDGd²(z, y) + 2wC²(⟨c1x − c1z, c1z − c1y⟩ + ⟨c2x − c2z, c2z − c2y⟩) + 2wS²(⟨lx − lz, lz − ly⟩ + ⟨rx − rz, rz − ry⟩)
(by the Cauchy–Schwarz inequality)
≤ CDGd²(x, z) + CDGd²(z, y) + 2 CDGd(x, z) · CDGd(z, y)
= (CDGd(x, z) + CDGd(z, y))²,

where ⟨·,·⟩ denotes the inner product between two vectors.

Thus, by (1)–(4), CDGd(x, y) is a metric. □

Remark 3 (On the weighting selection). The weights can be chosen subjectively a priori by taking into account external or subjective information (external weighting system), or can be computed objectively within a suitable data analysis procedure (internal weighting system).

We can obtain the distance by weighting the (left and right) center and the (left and right) spread distances differently. As the membership function value at the centers is maximum, we assume that the weight of the (left and right) center distances is higher than (or at least equal to) that of the (left and right) spread distances. Then, we assume the following conditions: wC + wS = 1 (normalization condition) and wC ≥ wS ≥ 0 (coherence condition). We can set wC = (1 − ν), wS = ν. When the normalization condition is satisfied, the coherence condition turns into 0 ≤ ν ≤ 0.5.

Notice that the weights wC and wS are intrinsically associated with the components of the distance (center and spread distances); by means of them, we can properly tune the influence of the two components of the fuzzy entity (center and spread) when calculating the distance. In fact, by means of the coherence condition wC ≥ wS ≥ 0 (i.e., the center component of the fuzzy data is given more than or equal importance with respect to the spread component), we exclude the anomalous case in which the spread component, which represents the uncertainty around the centers of the fuzzy number, has more importance than the center component, which represents the core information of each fuzzy datum. In this way, we take into account the intuitive assumption of fuzzy set theory that the membership function value at the centers is maximum. By the normalization condition wC + wS = 1, we can easily assess, in a comparative manner, the contributions of the center and spread components in the computation of CDGd(x, y).

As we can see from (3.1), we assume that the weights for the left and right center (squared) distances and for the left and right spread (squared) distances are, respectively, the same.

Moreover, as we will see in Section 4, for selecting the weights we prefer to adopt an objective criterion; in fact, the weight values are not fixed a priori, but are suitably computed in the data analysis procedure.


Table 1
Features of the weighted distances.

Weighted distance measures     Data                         Specification   Information on the           Weighting
for fuzzy multivariate data                                 approach        membership function shape    system criterion
CDG distance                   LR fuzzy multivariate data   disjunctive     external                     internal/external
EYK distance                   LR fuzzy multivariate data   disjunctive     external                     external

3.2. EYK distance measure

In this section, we consider a multivariate version of the distance measure for LR fuzzy data suggested by Yang and Ko [10], denoted as the EYK distance. That is, in a multivariate framework, the Yang–Ko distance can be formalized as follows:

EYKd(xi, xi′) = [‖c1i − c1i′‖² + ‖c2i − c2i′‖² + ‖(c1i − λli) − (c1i′ − λli′)‖² + ‖(c2i + ρri) − (c2i′ + ρri′)‖²]^(1/2),   (3.2)

where λ = ∫_0^1 L⁻¹(ω) dω and ρ = ∫_0^1 R⁻¹(ω) dω are parameters which summarize the shape of the left and right tails of the membership function. Then, for each membership function, we have particular values of λ and ρ. For instance, λ = ρ = 1/2 for a triangular membership function, λ = ρ = 2/3 for a parabolic membership function, and so on (for more details, see [10,16]).
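A companion sketch of (3.2), under the same (hypothetical) data layout used for the CDG distance above; λ = ρ = 1/2 corresponds to triangular membership functions.

    import numpy as np

    def eyk_distance(x, y, lam=0.5, rho=0.5):
        # EYK distance (3.2): lam and rho summarize the left and right
        # tails of the membership function (1/2 triangular, 2/3 parabolic).
        d2 = (np.sum((x['c1'] - y['c1']) ** 2)
              + np.sum((x['c2'] - y['c2']) ** 2)
              + np.sum(((x['c1'] - lam * x['l']) - (y['c1'] - lam * y['l'])) ** 2)
              + np.sum(((x['c2'] + rho * x['r']) - (y['c2'] + rho * y['r'])) ** 2))
        return np.sqrt(d2)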

Proposition 2. (X, EYKd(x, y)) is a metric space.

Proof. See Yang and Ko [10]. □

3.3. A comparative assessment

The rationale underlying the definition of the CDG and EYK distances is analogous to that which characterizes the subjectivist approach to probability. As in the subjectivistic approach to probability, the choice of the membership functions is subjective and their shape is properly defined on the basis of useful a priori information before the analysis process. In fact, as illustrated in Remark 1, the membership functions are context-sensitive, and hence are determined by experts in the problem area.

The features of the two weighted distance measures are summarized in Table 1. In particular, as we can see, for both measures:

– the empirical information can be represented by LR fuzzy multivariate data;
– a disjunctive approach for the specification of the membership function has been adopted;
– the choice of the shape of the membership functions is carried out prior to the utilization of the two measures in the data analysis process;
– in the EYK distance the criterion for selecting the weights is external, i.e. the weights are suitably fixed a priori (the weights are determined before applying the distance in the process of analysis); for the CDG measure the weights can be fixed subjectively a priori by considering external or subjective conditions (external weighting system), or can be computed objectively within a suitable data analysis procedure, e.g. a clustering procedure (internal weighting system) (see Section 4).

4. SOMs for imprecise data (SOMs-ID)

In this section, in order to classify imprecise data, the distance measures (3.1) and (3.2) are exploited in the SOMs framework.

The SOM is a network (topology, lattice) of P functional units, or neurons, arranged in a one-dimensional or multi-dimensional configuration. Each neuron p (1 ≤ p ≤ P) has a (scalar or vectorial) location (coordinate) rp, dependent on the configuration (one-dimensional or multi-dimensional), and an initial J-dimensional weight μp = (μp1, ..., μpj, ..., μpJ).

Then there is a set of I J-dimensional input vectors ξi = (ξi1, ..., ξij, ..., ξiJ). In Fig. 2, a two-dimensional input space mapped to a one-dimensional configuration of neurons (left) and a J-dimensional input space mapped to a two-dimensional configuration of neurons (right) are presented.

Fig. 2. A two-dimensional input space mapped to a one-dimensional configuration of neurons (left; neural network top, input space bottom); a J-dimensional input space mapped to a two-dimensional configuration of neurons (right).

At ordering step s, an input vector ξi(s) is compared, in any metric [2], with the weight vectors, and the winner neuron c (response or best matching unit, bmu), whose weight vector μc is closest to ξi(s), is selected. The learning rule of the weights is the following [2]:

μp(s + 1) = α(s)ξi(s) + (1 − α(s))μp(s)   if p ∈ Nc,
μp(s + 1) = μp(s)                         otherwise,   (4.1)

where α(s) is the learning rate and Nc = Nc(s) is the topological neighbourhood of μc. The neighbourhood is often given in terms of a neighbourhood function. In this case, the learning rule of the weights is the following:

μp(s + 1) = α(s)hp,i(s)ξi(s) + (1 − α(s)hp,i(s))μp(s),   (4.2)

where hp,i(s) is the neighbourhood function, measuring the distance between the locations of neuron p and the closest (winner) neuron c to the input vector ξi(s). A frequently used form for the neighbourhood function is hp,i(s) = exp(−‖rp − rc‖²/(2σ²(s))), where rp and rc identify the locations (coordinates) of neurons p and c in the configuration (topology) and σ(s) is the (decreasing) width of the neighbourhood.

According to Kohonen [1,2], the randomly chosen initial values for μp gradually change to new values in a learning process specified by (4.1) or (4.2) such that, as s → ∞, the weight vectors of the neurons μ1, ..., μp, ..., μP become ordered (neurons with nearer locations exhibit smaller distances between their weights), and the probability density function of the weight vectors finally approximates some monotonic function of the probability density function p(ξ) of the J-dimensional continuous random variable ξ.
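The learning rule (4.2) with the Gaussian neighbourhood above is easy to state in code. The following sketch performs one ordering step for a classical (crisp) SOM; all identifiers are illustrative, and the fuzzy-data version of this step (developed below) replaces the Euclidean winner search with (3.1) or (3.2).

    import numpy as np

    def som_step(weights, coords, xi, alpha, sigma):
        # weights: (P, J) weight vectors; coords: (P, D) neuron locations r_p;
        # xi: (J,) input vector at the current ordering step.
        c = np.argmin(np.sum((weights - xi) ** 2, axis=1))   # winner neuron
        h = np.exp(-np.sum((coords - coords[c]) ** 2, axis=1)
                   / (2.0 * sigma ** 2))                     # h_{p,i}(s)
        # (4.2): mu_p(s+1) = alpha*h*xi + (1 - alpha*h)*mu_p(s)
        weights += (alpha * h)[:, None] * (xi - weights)
        return weights, c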


The quality of learning in the SOMs is measured through the average expected quantization error and the average expected distortion measure. They are defined as:

∫_{R^J} dg(μc, ξ) p(ξ) dξ,   (4.3)

∫_{R^J} ∑_{p=1}^{P} hp,ξ(s) dg²(μp, ξ) p(ξ) dξ,   (4.4)

respectively, where dg is a generalized distance function [2], ξ ∈ R^J is the input vector, μc is the weight vector closest to the input vector ξ according to dg, and hp,ξ(s) is the degree of neighbourhood between the locations of neuron p and winner neuron c.

It has been thought that the SOMs algorithm might be derivable from the optimization (minimization) of the average expected distortion measure by computing the gradient of (4.4). Exact optimization of (4.4) with respect to μp is still an unsolved problem and is computationally extremely heavy (the best approximate solution is based on stochastic approximation). Kohonen [2] has shown that the point density of the weights obtained by optimizing (4.4) turns out to be different from the density derived directly on the basis of (4.2), denoting that the weight vectors produced by the basic SOMs algorithm in general do not coincide with those obtained by optimizing the average expected distortion measure. Nonetheless, Ritter and Schulten [29] have shown that, in the case of a discrete random variable ξ, the learning process driven by (4.2) finds weight vectors minimizing the average expected distortion measure.

The average quantization error and the average distortion measure are the sample counterparts of (4.3) and (4.4) and are defined as:

I⁻¹ ∑_{i=1}^{I} dg(μc, ξi),   (4.5)

∑_{i=1}^{I} ∑_{p=1}^{P} hp,i(s) dg²(μp, ξi).   (4.6)

In this work we use the SOMs algorithm (4.2). In order to use the SOMs for clustering and vector quantization of imprecise data, the winner is selected on the basis of the distances for imprecise data introduced in Section 3. As a metric is needed for selecting the winner [2], the distances considered are (3.1) and (3.2).

According to (3.1), the distance considered between the weight vector μp(s) and the generic input vector i sorted for updating the SOMs at ordering step s, ξi(s) = xi, is the following:

CDGd(μp(s), ξi(s)) = [wC²(‖μ1p(s) − c1i(s)‖² + ‖μ2p(s) − c2i(s)‖²) + wS²(‖μlp(s) − li(s)‖² + ‖μrp(s) − ri(s)‖²)]^(1/2)
= [(1 − ν(s))²(‖μ1p(s) − c1i(s)‖² + ‖μ2p(s) − c2i(s)‖²) + ν(s)²(‖μlp(s) − li(s)‖² + ‖μrp(s) − ri(s)‖²)]^(1/2),   (4.7)

where μp(s) is the vector of left and right centers and left and right spreads for the weight vector of neuron p, in which μ1p(s), μ2p(s), μlp(s) and μrp(s) are the J-dimensional vectors of left centers, right centers, left spreads and right spreads, respectively.

The average quantization error and the average distortion measure with the considered distance (3.1) turn into:

I⁻¹ ∑_{i=1}^{I} CDGd(μc, ξi),   (4.8)

∑_{i=1}^{I} ∑_{p=1}^{P} hp,i(s) CDGd²(μp, ξi).   (4.9)


In order to improve the quality of learning, the value of ν is determined so as to minimize the average distortion measure (4.9). By computing the derivative of (4.9) with respect to ν(s), the optimal value of ν(s) results:

ν(s) = min{ [∑_{i=1}^{I} ∑_{p=1}^{P} hp,i(s)(‖μ1p(s) − c1i(s)‖² + ‖μ2p(s) − c2i(s)‖²)]
× [∑_{i=1}^{I} ∑_{p=1}^{P} hp,i(s)(‖μ1p(s) − c1i(s)‖² + ‖μ2p(s) − c2i(s)‖² + ‖μlp(s) − li(s)‖² + ‖μrp(s) − ri(s)‖²)]⁻¹, 0.5 }.   (4.10)

According to (3.2), the distance considered between the weight vector μp(s) and the generic input vector i sorted for updating the SOMs at ordering step s, ξi(s) = xi, is the following:

EYKd(μp(s), ξi(s)) = [‖μ1p(s) − c1i(s)‖² + ‖μ2p(s) − c2i(s)‖² + ‖(μ1p(s) − λμlp(s)) − (c1i(s) − λli(s))‖² + ‖(μ2p(s) + ρμrp(s)) − (c2i(s) + ρri(s))‖²]^(1/2).   (4.11)

The average quantization error and the average distortion measure with the considered distance (3.2) turn into:

I⁻¹ ∑_{i=1}^{I} EYKd(μc, ξi),   (4.12)

∑_{i=1}^{I} ∑_{p=1}^{P} hp,i(s) EYKd²(μp, ξi).   (4.13)

Whether distance (3.1) or (3.2) is considered, the updating rules for the weight vector μp(s), when the generic input vector i is sorted for updating the SOMs at ordering step s, ξi(s) = xi, are:

μ1p(s + 1) = α(s)hp,i(s)c1i(s) + (1 − α(s)hp,i(s))μ1p(s),
μ2p(s + 1) = α(s)hp,i(s)c2i(s) + (1 − α(s)hp,i(s))μ2p(s),
μlp(s + 1) = α(s)hp,i(s)li(s) + (1 − α(s)hp,i(s))μlp(s),
μrp(s + 1) = α(s)hp,i(s)ri(s) + (1 − α(s)hp,i(s))μrp(s),   (4.14)

where c is the neuron closest to ξi(s).

Algorithm (CDGSOM-ID: SOMs-ID model, CDGd(·) distance).

Step 0 Fix the topology of the map (one-dimensional, or linear; two-dimensional, or rectangular), the size of the map (the number of neurons P), the learning rate α(s), the neighbourhood function hp,i(s), ν(s) and the maximum number of iterations (maxiter). Generate randomly the weights μp(0), p = 1, ..., P.
Step 1 Select an input vector i for updating the SOMs at ordering step s, ξi(s) = xi, and determine the neuron c closest to xi (winner) according to (4.7). Update the weights of the map μp according to (4.14).
Step 2 Update α(s), hp,i(s) and ν(s).
Step 3 If the iteration number s = maxiter, the algorithm has converged; otherwise go to Step 1.
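A runnable sketch of this algorithm follows, under the schedules used in Section 5 (exponentially decaying α(s) and σ(s), Gaussian neighbourhood, one-dimensional topology). The data layout and all names are ours; moreover, for brevity the ν update applies (4.10) to the current input only rather than to the whole dataset, a simplification of the batch formula.

    import numpy as np

    def train_cdgsom_id(C1, C2, SL, SR, P=2, maxiter=10000, alpha0=0.1, seed=0):
        # C1, C2, SL, SR: (I, J) arrays of left/right centers and spreads;
        # P neurons on a (1 x P) map.
        rng = np.random.default_rng(seed)
        I, J = C1.shape
        coords = np.arange(P, dtype=float)           # neuron locations r_p
        W = {k: rng.random((P, J)) for k in ('c1', 'c2', 'l', 'r')}
        sigma0 = max(P / 2.0, 1.0)                   # maximum semi-radius
        timeconst = maxiter / max(np.log(sigma0), 1e-9)
        nu = 0.5
        for s in range(maxiter):
            alpha = alpha0 * np.exp(-(s + 1) / maxiter)
            sigma = sigma0 * np.exp(-(s + 1) / timeconst)
            i = rng.integers(I)                      # sample an input vector
            # squared center / spread components of (4.7) for every neuron
            dc = (np.sum((W['c1'] - C1[i]) ** 2, axis=1)
                  + np.sum((W['c2'] - C2[i]) ** 2, axis=1))
            ds = (np.sum((W['l'] - SL[i]) ** 2, axis=1)
                  + np.sum((W['r'] - SR[i]) ** 2, axis=1))
            c = np.argmin((1 - nu) ** 2 * dc + nu ** 2 * ds)   # winner, (4.7)
            h = np.exp(-(coords - coords[c]) ** 2 / (2 * sigma ** 2))
            # weight updates (4.14)
            for key, row in (('c1', C1[i]), ('c2', C2[i]),
                             ('l', SL[i]), ('r', SR[i])):
                W[key] += (alpha * h)[:, None] * (row - W[key])
            # nu as in (4.10), restricted here to the current input
            nu = min(np.sum(h * dc) / max(np.sum(h * (dc + ds)), 1e-12), 0.5)
        return W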

Algorithm (EYKSOM-ID: SOMs-ID model, EYKd(·) distance).

Step 0 Fix the topology of the map (one-dimensional, or linear; two-dimensional, or rectangular), the size of the map (the number of neurons P), the learning rate α(s), the neighbourhood function hp,i(s) and the maximum number of iterations (maxiter). Generate randomly the weights μp(0), p = 1, ..., P.


Step 1 Select an input vector i for updating the SOMs at ordering step s, ξi(s) = xi, and determine the neuron c closest to xi (winner) according to (4.11). Update the weights of the map μp according to (4.14).
Step 2 Update α(s) and hp,i(s).
Step 3 If the iteration number s = maxiter, the algorithm has converged; otherwise go to Step 1.

Although the basic principles of self-organizing systems are simple, the behaviour of the process is difficult to describe in mathematical terms.

Ordering. In Kohonen [1,2] the (self-)ordering of the weights is proved, restricting the considerations to a one-dimensional topology of neurons, to each of which a scalar-valued input ξ is connected, showing that if ξ(s) is a random variable, considering the intermediate “states” (various types of partial sequences of the μp) of the process, then an index of disorder, D = ∑_{p=2}^{P} |μp − μp−1| − |μ1 − μP|, more often decreases than increases during updating.

In Conti and De Giovanni [30] the (self-)ordering of the weights with respect to a one-dimensional topology and scalar-valued input vectors is rigorously justified. The results hold for general metrics. The conditions on the learning rate under which convergence to an ordered state is obtained are ∑_{s=0}^{∞} α(s) = ∞ and lim_{s→∞} α(s) = 0. See also Ritter and Schulten [31].

Generalizations of the (self-)ordering ability of the SOM to multi-dimensional input spaces and multi-dimensional topologies of the neurons have been considered.

With respect to the dimension of the input space, Budinich and Taylor [32] give an intuitive necessary and sufficient condition for the decrease of D that applies to the case of a multi-dimensional input space and one-dimensional topologies.

With respect to the topology of the network, Kohonen [1,2] assumes that in considering multi-dimensional topologies results similar to the one-dimensional case can be obtained. In Budinich and Taylor [32] the problems at the origin of ordering in higher-dimensional topologies are intuitively explained.

Vector quantization. The probability density function of the weight vectors at the stable state is studied in Ritter [33] in simple cases.

According to the above-mentioned results, and since Propositions 1 and 2 hold true, we can state, at least for a one-dimensional topology, the following Proposition 3.

Proposition 3. The algorithms CDGSOM-ID and EYKSOM-ID converge to ordered values for the weights if the conditions ∑_{s=0}^{∞} α(s) = ∞ and lim_{s→∞} α(s) = 0 are satisfied. Moreover, the probability density function of the weights μp finally approximates some monotonic function of the probability density function p(ξ) of the J-dimensional continuous random variable ξ.

Proof. See Kohonen [1,2], Budinich and Taylor [32], Conti and De Giovanni [30]. □

It can be concluded that (the weights of) the proposed SOMs exhibit the ability of convergence to low-dimensional (at least one-dimensional) ordered values. Moreover, the SOMs perform a “non-linear projection” of the probability density function p(ξ) of the J-dimensional continuous random variable ξ onto a low-dimensional (at least one-dimensional) display.

The density of the weight vectors of the neurons reflects the density of the input vectors: in clustered areas the weight vectors are close to each other; thus the cluster structure can be learnt by computing the distances between weight vectors.

5. Simulation study

Three simulation studies have been developed to investigate the capabilities of the CDGSOM-ID and EYKSOM-ID algorithms.

It is worth noting that the number of iterations is mainly a heuristic issue. As remarked by Kohonen [34], even if the number of iterations should be reasonably large and depends on the size of the dataset, 10 000 steps and even less may be enough.


Table 2
Data generation process – (2 × 1) SOMs-ID. (c1 = (left) center, c2 = right center, l = (left) spread, r = right spread.)

Scenario “centers”:
(1) triangular symmetric:   c1: I/2 from U[0,1], I/2 from U[1.5,2.5];  l: I from U[0,1].
(2) triangular asymmetric:  c1: I/2 from U[0,1], I/2 from U[1.5,2.5];  l: I from U[0,1];  r: I from U[0,0.5].
(3) trapezoidal symmetric:  c1: I/2 from U[0,1], I/2 from U[1.5,2.5];  c2: I/2 from U[1,2], I/2 from U[2.5,3.5];  l: I from U[0,1].
(4) trapezoidal asymmetric: c1: I/2 from U[0,1], I/2 from U[1.5,2.5];  c2: I/2 from U[1,2], I/2 from U[2.5,3.5];  l: I from U[0,1];  r: I from U[0,0.5].

Scenario “centers/spreads”:
(5) triangular symmetric:   c1: I/2 from U[0,1], I/2 from U[1.5,2.5];  l: I/2 from U[0,1], I/2 from U[1.5,2.5].
(6) triangular asymmetric:  c1: I/2 from U[0,1], I/2 from U[1.5,2.5];  l: I/2 from U[0,1], I/2 from U[1.5,2.5];  r: I/2 from U[0,0.5], I/2 from U[1,2].
(7) trapezoidal symmetric:  c1: I/2 from U[0,1], I/2 from U[1.5,2.5];  c2: I/2 from U[1,2], I/2 from U[2.5,3.5];  l: I/2 from U[0,1], I/2 from U[1.5,2.5].
(8) trapezoidal asymmetric: c1: I/2 from U[0,1], I/2 from U[1.5,2.5];  c2: I/2 from U[1,2], I/2 from U[2.5,3.5];  l: I/2 from U[0,1], I/2 from U[1.5,2.5];  r: I/2 from U[0,0.5], I/2 from U[1,2].

The learning rate function is α(s) = α(0) · exp(−(s + 1)/maxiter), where α(0) is set to 0.1.

The neighbourhood function between neuron p and winner neuron c for unit ξi(s) at step s is hp,i(s) = exp(−‖rp − rc‖²/(2σ²(s))), where σ(s) = σ(0) · exp(−(s + 1)/timeconst), timeconst = maxiter/log(σ(0)), and σ(0) is the maximum semi-radius of the SOM, i.e., given an SOM with (R × C) topology, σ(0) = max(R,C)/2.

The learning rate function satisfies the conditions required for the convergence of the SOMs-ID to a stable state.

The ordering ability of the SOMs-ID is measured through the analysis of the distances between the weight vectors and the related distances between their locations (closest neurons should have closest weight vectors). The topology preservation ability of the SOMs-ID (closest input vectors should have closest neurons in the SOMs) is measured through the Spearman correlation coefficient between the ranks of the I(I − 1)/2 distances between input vectors and the ranks of the distances of the weight vectors of the related closest neurons [35].
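Both the schedules and the topology-preservation index can be written compactly. In the following sketch (our names; crisp vectors are used for brevity, and with fuzzy data pdist would be replaced by pairwise (3.1) or (3.2) distances), scipy’s spearmanr compares the I(I − 1)/2 input distances with the distances between the weight vectors of the corresponding winner neurons.

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    def schedules(s, maxiter, alpha0=0.1, R=1, C=2):
        # alpha(s) and sigma(s) as specified above
        sigma0 = max(R, C) / 2.0
        timeconst = maxiter / max(np.log(sigma0), 1e-9)
        alpha = alpha0 * np.exp(-(s + 1) / maxiter)
        sigma = sigma0 * np.exp(-(s + 1) / timeconst)
        return alpha, sigma

    def topology_preservation(X, W, winners):
        # Spearman correlation between ranks of the pairwise input
        # distances and of the distances between the winners' weights.
        return spearmanr(pdist(X), pdist(W[winners])).correlation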

5.1. Classification ability of the SOMs-ID

Two scenarios, “separated centers” (centers) and “separated centers and spreads” (centers/spreads), have been considered. For each scenario, I = 80 bivariate (J = 2) fuzzy numbers have been generated, considering four types of fuzzy numbers, namely triangular (symmetric and asymmetric) and trapezoidal (symmetric and asymmetric).

The random variables generating the data, for both centers and spreads, have a Uniform distribution. Other random variables, validated by real-life situations, may be considered for generating the centers and/or the spreads.

In the triangular symmetric centers scheme, the spreads are randomly generated from U[0,1], whereas the centers of the input vectors belonging to the first cluster (I/2 input vectors) are generated from U[0,1] and those of the input vectors belonging to the second cluster (I/2 input vectors) from U[1.5,2.5]. In the asymmetric case, the right spreads are generated from U[0,0.5]. In the trapezoidal symmetric centers scheme, the spreads are randomly generated from U[0,1], whereas the two centers of the input vectors belonging to the first cluster (I/2 input vectors) are generated from U[0,1] and U[1,2], and those of the input vectors belonging to the second cluster (I/2 input vectors) from U[1.5,2.5] and U[2.5,3.5]. In the asymmetric case, the right spreads are generated from U[0,0.5]. Thus, in the centers scheme, the input vectors are distinguished with respect to the values of the centers.

In the centers/spreads scheme, the centers are generated as in the centers scenario. In the symmetric scheme (either triangular or trapezoidal), the spreads of the input vectors belonging to the first cluster (I/2 input vectors) are randomly generated from U[0,1], whereas the spreads of the input vectors belonging to the second cluster (I/2 input vectors) are generated from U[1.5,2.5]. In the asymmetric scheme (either triangular or trapezoidal), the right spreads are generated from U[0,0.5] and U[1,2], respectively.

The generation process of the fuzzy variables is summarized in Table 2.
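For instance, scenario 5 of Table 2 (triangular symmetric, separated centers and spreads) could be generated as follows; this is a sketch of the stated uniform schemes, and the names are ours.

    import numpy as np

    def generate_scenario5(I=80, J=2, seed=0):
        # Two clusters of I/2 symmetric triangular fuzzy numbers each:
        # centers and spreads from U[0,1] (cluster 1) and U[1.5,2.5] (cluster 2).
        rng = np.random.default_rng(seed)
        half = I // 2
        centers = np.vstack([rng.uniform(0.0, 1.0, (half, J)),
                             rng.uniform(1.5, 2.5, (half, J))])
        spreads = np.vstack([rng.uniform(0.0, 1.0, (half, J)),
                             rng.uniform(1.5, 2.5, (half, J))])
        return centers, spreads   # symmetric: left spread = right spread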


Fig. 3. Centers/spreads scenario; triangular (left) and trapezoidal (right); expected values of the generative random variables; dashed lines represent the asymmetric spreads.

Table 3
Performances of the SOMs-ID – values are averaged over 100 simulations.

SOM-ID       Scenario          Fuzzy      Adjusted     ν      Quantization   Distortion   Spearman
                               variable   Rand index          error          measure      correlation
CDGSOM-ID    centers           1          1            0.50   0.337          70.315       0.858
                               2          1            0.50   0.296          69.051       0.865
                               3          1            0.50   0.388          121.561      0.866
                               4          1            0.50   0.356          126.913      0.866
             centers/spreads   5          1            0.34   0.317          166.210      0.866
                               6          1            0.36   0.314          148.679      0.866
                               7          1            0.49   0.410          250.153      0.866
                               8          1            0.50   0.386          213.923      0.866
EYKSOM-ID    centers           1          1            –      0.673          753.443      0.864
                               2          1            –      0.667          750.063      0.866
                               3          1            –      0.848          993.435      0.866
                               4          1            –      0.842          977.571      0.866
             centers/spreads   5          1            –      0.723          892.602      0.866
                               6          1            –      0.705          807.443      0.866
                               7          1            –      0.840          1087.776     0.866
                               8          1            –      0.863          1034.427     0.866

The centers/spreads scenario is graphically represented in Fig. 3.

An SOM-ID with one-dimensional topology and P = 2 neurons ((1 × 2) topology) has been trained with the simulated data. The classification of the input vectors has been obtained by assigning each unit to the closest neuron (and related weight vector) in the SOM-ID. Each SOM-ID has been trained with the simulated data 100 times, to ensure that the final results do not depend on the initial random choice of the weight vectors. Then, the classification obtained for each replication has been compared with the “reference” classification (the first 40 units in one node, the remaining in the other node) by means of the adjusted Rand index [36]. Results are summarized in Table 3, where the average values of the adjusted Rand index, of the quantization error, of the distortion measure, of the Spearman correlation index and of the parameter ν (for the CDGSOM-ID) are reported.
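The comparison with the reference partition can be carried out, for example, with scikit-learn’s adjusted_rand_score (our choice of tool, not the paper’s):

    from sklearn.metrics import adjusted_rand_score

    reference = [0] * 40 + [1] * 40   # first 40 units in one node, rest in the other
    winners = [0] * 40 + [1] * 40     # index of the closest neuron for each unit
    print(adjusted_rand_score(reference, winners))   # 1.0 for a perfect recovery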

In Figs. 4–7 the results for the centers/spreads scenario are presented for the asymmetric triangular and trapezoidal schemes, considering the two distances. In sub-panels (a), the evolution of the quantization error (i) and of the distortion measure (ii) (and of ν (iii), for the CDGSOM-ID) is presented. In sub-panels (b), the weight vectors of the centers and of the upper/lower limits of each fuzzy variable are shown by means of parallel coordinate plots.

Some comments follow.

– The percentage of correct classification is 100% for both SOMs-ID, in each replication of the training of the SOMs-ID with the simulated data, as can be seen from the mean value of the adjusted Rand index;
– the weight vectors at the stable state tend to describe the probability density function of the simulated data, as they converge to the expected values of the generative random variables;


Fig. 4. Evolution of quantization error, distortion measure, ν (a) and weight vectors (b): centers/spreads scenario, asymmetric triangular scheme, CDGSOM-ID.

Fig. 5. Evolution of quantization error, distortion measure, ν (a) and weight vectors (b): centers/spreads scenario, asymmetric trapezoidal scheme, CDGSOM-ID.

– the value of ν is 0.5 for the centers scheme and less than 0.5 in the centers/spreads scheme. The relevance of the spreads decreases due to their increased variability. This holds true especially for the triangular case, as in the trapezoidal case the presence of variability in two centers offsets the variability of the spreads;
– the changes in the values of ν, of the quantization error and of the distortion measure are bigger at the beginning, due to the larger neighbourhood size, and then stabilize;


Fig. 6. Evolution of quantization error, distortion measure (a) and weight vectors (b): centers/spreads scenario, asymmetric triangular scheme, EYKSOM-ID.

Fig. 7. Evolution of quantization error, distortion measure (a) and weight vectors (b): centers/spreads scenario, asymmetric trapezoidal scheme, EYKSOM-ID.

– in both scenarios, it is expected that the quality of learning in the asymmetric case outperforms that in the symmetric case, due to the lower variability of the spreads; the CDGSOM-ID captures this better than the EYKSOM-ID (the relative differences between the quality-of-learning measures in the symmetric and related asymmetric cases are greater in the CDGSOM-ID, with respect to both the quantization error and the distortion measure);
– the Spearman correlation coefficients show the ordering ability of the SOMs-ID (closest input vectors are assigned to closest – the same, for P = 2 – neurons).


Table 4
Percentage of replications with correct classification, average adjusted Rand index and average Spearman correlation index – SOMs-ID with (1 × 3) topology, trained on simulated datasets with two groups – 100 replications.

SOM-ID       Scenario          Fuzzy      Percentage of correct   Adjusted     Spearman
                               variable   classification          Rand index   correlation
CDGSOM-ID    centers/spreads   5          78                      0.98         0.865
                               6          99                      1.00         0.866
                               7          97                      1.00         0.866
                               8          98                      1.00         0.866
EYKSOM-ID    centers/spreads   5          93                      0.98         0.863
                               6          67                      0.91         0.859
                               7          98                      1.00         0.866
                               8          72                      0.97         0.862

Note that, as stated before, the initial radius σ(0) is set equal to max(R,C)/2. This choice of the radius in the proposed neighbourhood function (or any other convex neighbourhood function) ensures that the convergence time of the SOM to optimal ordered configurations is the shortest [37]. For consistency, we stick to this rule throughout the paper, both for the simulations and for the applicative examples. However, when dealing with (1 × 2) maps (but also with (2 × 2) maps) this rule could involve a potential problem, since the radius σ(s) remains constant during the training process. For this reason, we have carried out further experiments by setting σ(0) = max(R,C), obtaining results very similar to those reported in Table 3. For instance, for scenario 1 (centers scenario, triangular symmetric fuzzy data), the average of the adjusted Rand index over the 100 simulations is equal to 1.

A further issue regards the detection of the number of groups when using an SOM-ID. One would expect that, if an SOM-ID with more than two neurons is trained on a dataset with two groups, the input vectors have only two neurons as closest neurons and the third neuron should remain empty.

Therefore, the following experiments have been run. For the datasets generated under the second scenario, we have trained SOMs-ID with one-dimensional topology and P = 3 neurons ((1 × 3) topology) and replicated each experiment 100 times. Results are reported in Table 4.

First, we have computed the percentage of replications in which all units have been correctly classified into two groups, i.e. the first 40 units in one group and the second 40 units in an adjacent group. Note that this requirement is particularly strict, since even if only one unit is not correctly allocated, we do not label the classification as “correct”. For this reason, we have also considered the adjusted Rand index, averaged over the 100 replications, since this index can also account for “near correct” classifications. The adjusted Rand index averaged over the 100 replications is always very high, showing that training a (1 × 3) SOM-ID map on a dataset that contains two groups is likely to lead to a classification in which one neuron is empty, or at most only a few units are classified in the third neuron.

Indeed, as observed at the end of Section 4, the density of the weight vectors of the SOMs reflects the density of the input vectors.

It should be noted that, since we deal with fuzzy data, in each scenario the two simulated groups partially overlap, even if the centers are well separated. Consider scenario 5 in Table 2. Taking into account the spreads, the upper values of the generated fuzzy numbers in the first group range from 0 to 2, while the lower values in the second group range from −1 to 1. Then, even in the presence of overlapping due to the fuzziness of the input vectors, we are able to detect the proper number of clusters. To see what happens when the centers of the fuzzy data in the two clusters are less separated, we have conducted a further experiment, considering a generation scheme similar to that of scenario 5, with the centers of the second group generated from a U[1,2] (scenario 5a). We have generated 100 datasets and trained a CDGSOM-ID with (1 × 3) topology on the simulated datasets. From Table 4 we know that for scenario 5 the average adjusted Rand index is equal to 0.98. For scenario 5a we observe an average adjusted Rand index equal to 0.9, indicating, as expected, a slight deterioration in the classification performance.

Finally, we generalize the “centers” scenario to the case in which there are four groups, in order to evaluate the classification ability of SOMs-ID with two-dimensional topology ((2 × 2) topology). In Table 5 the data generation process is reported.


Table 5
Data generation process – (2 × 2) SOMs-ID (centers scenario; I denotes the number of units).

triangular symmetric (1)
  (left) center c1: I/4 from U[0,1]; I/4 from U[1.5,2.5]; I/4 from U[3,4]; I/4 from U[4.5,5.5]
  (left) spread l = right spread r: I from U[0,1]

triangular asymmetric (2)
  (left) center c1: I/4 from U[0,1]; I/4 from U[1.5,2.5]; I/4 from U[3,4]; I/4 from U[4.5,5.5]
  (left) spread l: I from U[0,1]; right spread r: I from U[0,0.5]

trapezoidal symmetric (3)
  left center c1: I/4 from U[0,1]; I/4 from U[1.5,2.5]; I/4 from U[3,4]; I/4 from U[4.5,5.5]
  right center c2: I/4 from U[1,2]; I/4 from U[2.5,3.5]; I/4 from U[4,5]; I/4 from U[5.5,6]
  (left) spread l = right spread r: I from U[0,1]

trapezoidal asymmetric (4)
  left center c1: I/4 from U[0,1]; I/4 from U[1.5,2.5]; I/4 from U[3,4]; I/4 from U[4.5,5.5]
  right center c2: I/4 from U[1,2]; I/4 from U[2.5,3.5]; I/4 from U[4,5]; I/4 from U[5.5,6]
  left spread l: I from U[0,1]; right spread r: I from U[0,0.5]

Table 6
Percentage of replications with correct classification, average adjusted Rand index and average Spearman correlation index – SOMs-ID with (2 × 2) topology, trained on simulated datasets with four groups (100 replications).

SOM-ID      Scenario  Fuzzy variable  % correct  Adjusted Rand index  Spearman correlation
CDGSOM-ID   centers   1               100        1.00                 0.934
                      2               100        1.00                 0.935
                      3               100        1.00                 0.934
                      4                99        1.00                 0.954
EYKSOM-ID   centers   1               100        1.00                 0.932
                      2               100        1.00                 0.937
                      3                99        1.00                 0.940
                      4               100        1.00                 0.935

In Table 6 we report the results of the experiments. As can be seen, the algorithm is able to correctly classify the units, whatever distance measure is used. Evidence concerning the Spearman correlation index will be discussed in Section 5.2.

5.2. Vector quantization, input density mapping and ordering ability of the SOMs-ID

To show the ability of the SOMs-ID to change the random initial values of the weight vectors of the map so that the density of the weight vectors reflects the density of the input samples, a one-dimensional map has been trained with 500 triangular univariate and bivariate fuzzy numbers (J = 1, 2) generated from a U[0,1] both for the center and for the spread. Three map sizes P = 2, 3, 4 ((1 × 2), (1 × 3), (1 × 4) topologies) have been considered.
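The following sketch outlines on-line training of such a one-dimensional map on (center, spread) data; the distance weights centers and spreads by ν and 1 − ν in the spirit of the CDG distance, while the decay laws for the learning rate and the radius are illustrative assumptions, not the paper's exact schedules.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_som_1d(data, P=3, n_iter=10_000, alpha0=0.1, nu=0.5):
    # On-line training of a (1 x P) map on rows (center, spread).
    # Squared distance: nu * (center diff)^2 + (1 - nu) * (spread diff)^2,
    # in the spirit of the CDG distance; the exact distance and update
    # rules of the paper are given in Sections 3-4 and are assumed here.
    W = rng.uniform(0, 1, size=(P, data.shape[1]))    # random initial weights
    sigma0 = max(P, 1) / 2                            # sigma(0) = max(R, C) / 2
    for s in range(n_iter):
        x = data[rng.integers(len(data))]
        d = nu * (W[:, 0] - x[0]) ** 2 + (1 - nu) * (W[:, 1] - x[1]) ** 2
        winner = int(np.argmin(d))
        alpha = alpha0 * (1.0 - s / n_iter)           # decaying learning rate
        sigma = max(sigma0 * (1.0 - s / n_iter), 0.5) # decaying radius
        h = np.exp(-(np.arange(P) - winner) ** 2 / (2 * sigma ** 2))
        W += alpha * h[:, None] * (x - W)             # pull weights towards x
    return W

# 500 univariate triangular fuzzy numbers, center and spread from U[0,1]
weights = train_som_1d(rng.uniform(0, 1, size=(500, 2)), P=3)
```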

The data generation process is summarized in Table 7. The weight vectors of the SOMs-ID are presented in Table 8 (in brackets, the distances between the weight vectors of adjacent neurons). In Figs. 8–9 the centers of the weight vectors of the neurons are presented for J = 1 and J = 2, respectively.

The CDGSOM-ID has been trained keeping the value of ν both fixed at 0.5 and variable. The results are summarized in Table 9. Some comments follow.


Table 7
Data generation process.

J = 1, P = 2, 3, 4: center I from U[0,1]; (left) spread I from U[0,1]
J = 2, P = 2, 3, 4: center I from U[0,1]; (left) spread I from U[0,1]

Table 8
Weight vectors of the neurons (distances between weight vectors of adjacent neurons in brackets).

J = 1, CDGSOM-ID (ν = 0.35–0.40)
  center  P = 2: 0.23, 0.74 (0.51)
          P = 3: 0.15, 0.47, 0.81 (0.23, 0.34)
          P = 4: 0.10, 0.35, 0.60, 0.85 (0.25, 0.25, 0.25)
  spread  P = 2: 0.37, 0.50
          P = 3: 0.50, 0.66, 0.52
          P = 4: 0.42, 0.50, 0.56, 0.46

J = 1, EYKSOM-ID
  center  P = 2: 0.23, 0.74 (0.51)
          P = 3: 0.11, 0.50, 0.78 (0.39, 0.28)
          P = 4: 0.15, 0.44, 0.63, 0.81 (0.29, 0.19, 0.46)
  spread  P = 2: 0.49, 0.56
          P = 3: 0.47, 0.52, 0.51
          P = 4: 0.42, 0.50, 0.56, 0.46

J = 2, CDGSOM-ID (ν = 0.30–0.35); weight vectors given as pairs of components
  center  P = 2: (0.73, 0.27), (0.28, 0.73)
          P = 3: (0.17, 0.66), (0.74, 0.63), (0.54, 0.18)
          P = 4: (0.71, 0.67), (0.73, 0.21), (0.28, 0.32), (0.18, 0.82)
  spread  P = 2: (0.44, 0.57), (0.52, 0.50)
          P = 3: (0.60, 0.51), (0.52, 0.45), (0.54, 0.50)
          P = 4: (0.48, 0.45), (0.46, 0.43), (0.51, 0.60), (0.62, 0.54)

J = 2, EYKSOM-ID
  center  P = 2: (0.63, 0.29), (0.30, 0.69)
          P = 3: (0.67, 0.76), (0.17, 0.54), (0.64, 0.27)
          P = 4: (0.32, 0.86), (0.71, 0.77), (0.75, 0.26), (0.19, 0.15)
  spread  P = 2: (0.46, 0.56), (0.50, 0.55)
          P = 3: (0.55, 0.50), (0.52, 0.58), (0.50, 0.46)
          P = 4: (0.56, 0.49), (0.47, 0.47), (0.50, 0.53), (0.54, 0.52)

Fig. 8. Weight vectors of CDGSOM-ID (left), of EYKSOM-ID (right), J = 1, P = 2, 3, 4 from bottom to top.

– Table 8 and Figs. 8–9 show that the SOMs-ID reproduce the probability density function of the input vectors; both for J = 1 and J = 2 the weight vectors are almost evenly spaced in the interval (0,1) or in the unit square, respectively. Notice that CDGSOM-ID reproduces the probability density function of the input vectors slightly better than EYKSOM-ID: the ratios (μi+1 − μi)/(μi − μi−1), obtained from the values in parentheses in Table 8, are almost 1, as required by the uniform generative random variables [1] (see the sketch after this list);


Fig. 9. Weight vectors of CDGSOM-ID (left), of EYKSOM-ID (right), J = 2, P = 2, 3, 4 from left to right (lines connect adjacent neurons in the topology). Only 100 input vectors have been drawn.

Table 9
Performances of the CDGSOM-ID with different values of ν (columns: P = 2, 3, 4).

ν fixed = 0.5
  J = 1  quantization error: 0.29, 0.26, 0.24
         distortion measure: 121.73, 135, 137
  J = 2  quantization error: 0.16, 0.14, 0.12
         distortion measure: 66.84, 60.25, 60.45

ν variable
  J = 1  quantization error: 0.27, 0.24, 0.22
         distortion measure: 97, 118, 121
  J = 2  quantization error: 0.15, 0.12, 0.10
         distortion measure: 56.88, 51.37, 40.90

– the ordering ability (closest neurons in the topology have closest weights) is visible in Table 8 and Figs. 8–9 (lines connecting adjacent neurons show the topology of the map for J = 2);

– the reproduction of the probability density function of the input vectors is less evident for the spreads, due to the value of ν;

– keeping the value of ν fixed, the quality of learning of the map decreases, as shown in Table 9.
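The evenness check mentioned in the first comment of this list is easy to reproduce; the following sketch computes the adjacent-gap ratios for the CDGSOM-ID centers (J = 1, P = 4) printed in Table 8.

```python
import numpy as np

def adjacent_gap_ratios(mu):
    # Ratios (mu[i+1] - mu[i]) / (mu[i] - mu[i-1]) between consecutive gaps
    # of the sorted weight centers; values near 1 indicate evenly spaced
    # weights, as expected when the inputs are uniformly distributed.
    gaps = np.diff(np.sort(np.asarray(mu, dtype=float)))
    return gaps[1:] / gaps[:-1]

# CDGSOM-ID centers for J = 1, P = 4 from Table 8:
print(adjacent_gap_ratios([0.10, 0.35, 0.60, 0.85]))  # [1. 1.]
```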

Finally, in Table 6 the values of the Spearman correlation index, averaged over the 100 replications of the experiment described in the previous section for a two-dimensional map, are reported. The correlation between the ranks of the distances between input vectors and the ranks of the distances between the weight vectors of the related closest neurons detects the ordering ability of the map trained on the data. As can be seen, the high values of the index prove the ordering ability of SOMs-ID, irrespective of the distance measure adopted, even in the case of a two-dimensional map.
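A sketch of this diagnostic, computing the Spearman correlation between the input-space distances and the map-space distances of the corresponding winning neurons (plain Euclidean distances are assumed here for illustration):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def ordering_index(inputs, neuron_positions, winners):
    # Spearman correlation between the ranks of the pairwise distances of
    # the input vectors and the ranks of the distances between the grid
    # positions of their winning neurons; high values indicate that the
    # map preserves the topology of the input space.
    d_inputs = pdist(np.asarray(inputs, dtype=float))
    d_map = pdist(np.asarray(neuron_positions, dtype=float)[winners])
    rho, _ = spearmanr(d_inputs, d_map)
    return rho
```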


Table 10
Weight vectors/centers and performances of the CDGSOM-ID and of FCN [10] (quantization error computed with the Euclidean distance).

triangular
  CDGSOM-ID  μ1: c = 12.9072, l = 1.2352, r = 1.3051   quantization error 1.7
             μ2: c = 23.9294, l = 0.9707, r = 1.0674
             μ3: c = 38.8404, l = 1.2808, r = 0.8963
  FCN [10]   μ1: c = 12.7730, l = 1.1500, r = 1.1850   quantization error 2.0
             μ2: c = 24.0870, l = 0.9140, r = 1.0500
             μ3: c = 39.5210, l = 1.2830, r = 0.8700

trapezoidal
  CDGSOM-ID  μ1: c1 = 38.3689, c2 = 40.4200, l = 1.1937, r = 0.9827   quantization error 2.1
             μ2: c1 = 23.3998, c2 = 25.4670, l = 0.9522, r = 1.0242
             μ3: c1 = 12.1656, c2 = 13.9050, l = 1.2188, r = 1.1876
  FCN [10]   μ1: c1 = 12.8900, c2 = 14.6710, l = 1.1370, r = 1.2100   quantization error 2.1
             μ2: c1 = 23.9720, c2 = 25.8960, l = 0.9050, r = 1.0400
             μ3: c1 = 39.4010, c2 = 41.3320, l = 1.2870, r = 0.8750

5.3. Vector quantization and classification ability of the SOMs-ID

Two datasets from Yang and Ko [10] are considered: (1) 30 triangular fuzzy numbers and (2) 30 trapezoidal fuzzy numbers. SOMs-ID with the proposed distances have then been trained with P = 3 neurons ((1 × 3) topology).

The weight vectors of the neurons are presented in Table 10, together with the centers obtained by using the FCN (Fuzzy Clustering Number) procedure by Yang and Ko [10]. In order to compare the quality of learning of the two methodologies, the Euclidean distance has been used in the computation of the quantization error. It is worth noting that the obtained value of ν for the SOMs-ID is 0.5, so that the distance used in CDGSOM-ID is proportional to the Euclidean distance. Both models identify three clusters. The CDGSOM-ID slightly outperforms FCN in the quality of learning, in particular in the case of triangular fuzzy numbers.
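For reference, the quantization error used in this comparison can be computed as the average Euclidean distance of each input vector from the weight vector of its closest neuron; a minimal sketch, assuming the fuzzy-number parameters are stacked as rows:

```python
import numpy as np

def quantization_error(inputs, weights):
    # Average Euclidean distance between each input vector and the weight
    # vector of its nearest neuron; rows stack the fuzzy-number parameters,
    # e.g. (c, l, r) for triangular or (c1, c2, l, r) for trapezoidal data.
    inputs = np.asarray(inputs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    d = np.linalg.norm(inputs[:, None, :] - weights[None, :, :], axis=2)
    return d.min(axis=1).mean()
```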

6. Applications

In this section, different applications are presented to illustrate the effectiveness of the proposed SOMs-ID. As in Section 5, the number of iterations is set to 10 000, the initial value of the learning rate function is α(0) = 0.1 and σ(0) is the maximum semi-radius of the SOM.

6.1. Tea data

The data drawn from Hung and Yang [38] and Hung et al. [39], regarding the evaluation of 70 kinds of Taiwanese tea, are considered. In particular, 10 experts evaluate each kind of tea by assigning, for 4 criteria (attributes) – appearance, tincture, liquid colour and aroma – 5 different quality levels: perfect, good, medium, poor, bad. These quality terms represent the imprecision and ambiguity inherent in human perception. Since fuzzy sets can be suitably utilized for describing the ambiguity and imprecision in natural language, the authors define these quality terms using symmetrical triangular fuzzy numbers, i.e.: Xperfect = (1, 0.25, 0), Xgood = (0.75, 0.25, 0.25), Xmedium = (0.5, 0.25, 0.25), Xpoor = (0.25, 0.25, 0.25), Xbad = (0, 0, 0.25) [38,39].

Following Hung and Yang [38], from the (original) multivariate data (p = 4) univariate data (p = 1) are obtained by averaging, in a fuzzy manner, the fuzzy scores on the 4 attributes. Notice that tea no. 1 (called White-tip Oolong) is the best and the most famous Taiwanese tea; in fact, its expert (fuzzy) average evaluation is the highest. For this reason, this type of tea can be considered a special tea and hence an outlier in the tea data.
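A minimal sketch of this encoding and of the averaging step; the component-wise mean used below is an assumption consistent with standard LR fuzzy arithmetic, not a quotation of the authors' exact operator.

```python
import numpy as np

# Quality terms as triangular fuzzy numbers (center, left spread, right spread)
TERMS = {
    "perfect": (1.00, 0.25, 0.00),
    "good":    (0.75, 0.25, 0.25),
    "medium":  (0.50, 0.25, 0.25),
    "poor":    (0.25, 0.25, 0.25),
    "bad":     (0.00, 0.00, 0.25),
}

def fuzzy_average(labels):
    # Component-wise mean of the fuzzy scores on the 4 attributes
    # (an assumed, standard way to average LR fuzzy numbers).
    return np.array([TERMS[lab] for lab in labels]).mean(axis=0)

print(fuzzy_average(["good", "perfect", "good", "medium"]))
```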

SOMs-ID with one-dimensional topology and P = 5 neurons have been trained ((1 × 5) topology). The classification is obtained by considering the neurons closest to the input vectors (Table 11).

In Fig. 10 the centers of the weight vectors of the neurons are presented for both SOMs-ID. The spreads are not presented due to their low variability (particularly in the case of the right spread).


Table 11
Results of the SOMs-ID – Tea data.

CDGSOM-ID, with outlier (ν = 0.5): quantization error 0.014, distortion measure 0.506, Spearman correlation 0.865
  neuron 1: teas 1–5;  neuron 2: 6–20;  neuron 3: 21–39;  neuron 4: 40–60;  neuron 5: 61–70

CDGSOM-ID, without outlier (ν = 0.5): quantization error 0.011, distortion measure 0.248, Spearman correlation 0.890
  neuron 1: 61–70;  neuron 2: 56–60;  neuron 3: 40–55;  neuron 4: 6, 21–39;  neuron 5: 2–5, 7–20

EYKSOM-ID, with outlier: quantization error 0.035, distortion measure 4.826, Spearman correlation 0.853
  neuron 1: 61–70;  neuron 2: 40–60;  neuron 3: 21–39;  neuron 4: 3–20;  neuron 5: 1–2

EYKSOM-ID, without outlier: quantization error 0.020, distortion measure 1.194, Spearman correlation 0.882
  neuron 1: 63–70;  neuron 2: 56–62;  neuron 3: 40–55;  neuron 4: 21–39;  neuron 5: 2–20

Fig. 10. Weight vectors of the CDGSOM-ID (bottom), EYKSOM-ID (top), with (left) and without (right) tea no. 1, J = 1 – Tea data.

Some comments follow:

– the obtained value of ν for CDGSOM-ID is 0.5;

– the neurons present ordering with respect to weights (farther neurons exhibit greater distance); the ordering of the neurons with respect to weights is shown in Fig. 10 for both SOMs-ID;

– the value of the correlation coefficient is always high; moreover, it is slightly higher with the CDG distance than with the EYK distance, but the differences are negligible;

– the distance between neurons 1–2 and 4–5 is greater than between neurons 2–3 and 3–4, because the density of the input vectors is low for very high or very low values of the centers of the fuzzy numbers;

– the distance between neurons 1–2 is greater when tea no. 1 (outlier) is considered;

– when tea no. 1 (outlier) is not present in the data, teas nos. 61 and 62 properly share the neuron with teas 63–70 in CDGSOM-ID, while they share the neuron with teas 56–60 in EYKSOM-ID; teas nos. 61 and 62 differ from teas nos. 56–60 in the center and in the left spread, and from teas 63–70 only in the center;

– when tea no. 1 (outlier) is present in the data, teas nos. 1 and 2 share the neuron with teas nos. 3–5 in CDGSOM-ID, while they are mapped onto a separate neuron in EYKSOM-ID; in this case the relevance of the centers in the computation of the distance is higher for EYKSOM-ID, which specializes one neuron almost exclusively on tea no. 1 due to the low variability of the spreads (the value of ν is in fact 0.5 in CDGSOM-ID);


– in Hung and Yang [38] the identification of tea no. 1 as an outlier requires the use of 6 clusters; notice that with P = 6 neurons ((1 × 6) topology), teas nos. 1 and 2 are assigned a separate neuron, both with CDGSOM-ID and with EYKSOM-ID.

6.2. Students data

A set of undergraduate students attending the Course of Statistics at the Faculty of Political Sciences of Sapienza University of Rome in the academic year 2009–2010 has been asked to fill in a questionnaire [9]. The items of the questionnaire refer to opinions and perceptions about the recent global economic and financial crisis. The questions are the following:

Q1: opinion about the origin of the crisis in financial speculation;
Q2: opinion about the utility of a new regulation of financial markets;
Q3: feeling about the need for a drastic change of the economic system;
Q4: opinion on the adequacy of the EU economic measures to face up to the crisis;
Q5: opinions about the Italian economic measures;
Q6: perception about the trend of the Italian economy during the following three years.

The responses to each of these questions are collected by means of a Visual Analogue Scale (see, e.g. [40]), inviting each respondent to draw a vertical mark along a segment of length one joining the two opposite extreme opinions, according to the location of the respondent's opinion [9]. The data are fuzzified considering triangular symmetric LR fuzzy numbers with center (c1 = c2 = c) equal to the distance between the left bound and the mark, and spread obtained as in Coppi et al. [9]:

$$
l = r =
\begin{cases}
0.125, & 0.125 \le c \le 0.875,\\
c, & c < 0.125,\\
1 - c, & c > 0.875.
\end{cases}
$$

The underlying assumption is the intuitive iterative splitting of the segment into two halves in order to better approximate the assessment. This mechanism suggests associating with any given location a fuzzy interval of length 0.25. Hence, the spreads are set equal to 0.125, except for extreme locations, where the imprecision naturally decreases the closer the mark gets to the bounds.

The SOMs-ID with one-dimensional topology and P = 3 neurons have been trained ((1 × 3) topology). The classification is obtained by considering the neurons closest to the input vectors (Table 12).

Table 12
Results of the SOMs-ID – Students data.

CDGSOM-ID, with outlier
  node 1: 3 9 11 12 19 20 21 26
  node 2: 4 5 8 13 14 15 16 17 18 23
  node 3: 1 2 6 7 10 22 24 25 27

CDGSOM-ID, without outlier
  node 1: 3 9 11 12 19 20 21 26
  node 2: 4 5 8 13 14 15 16 17 18
  node 3: 1 2 6 7 10 22 24 25 27

EYKSOM-ID, with outlier
  node 1: 1 2 6 7 10 22 24 25 27
  node 2: 4 5 8 13 14 15 16 17 18 23
  node 3: 3 9 11 12 19 20 21 26

EYKSOM-ID, without outlier
  node 1: 3 9 11 12 19 20 21
  node 2: 4 5 8 13 14 15 16 17 18 26
  node 3: 1 2 6 7 10 22 24 25 27

The weight vectors are shown in Figs. 11(a)–11(b). The figures refer to the case with the outlier, since the results obtained without the outlier are very similar.

The performances of the SOMs-ID are presented in Table 13, both when student 23 (outlier) is considered and when it is not.


Fig. 11. Weight vectors of the CDGSOM-ID (top), EYKSOM-ID (bottom), with the presence of the outlier in the data, J = 6 – Students data.

Table 13
Performances of the SOMs-ID – Students data.

SOM-ID      Outlier  Quantization error  Distortion measure  Spearman correlation  ν
CDGSOM-ID   Yes      0.178               6.437               0.749                 0.5
            No       0.159               5.767               0.820                 0.5
EYKSOM-ID   Yes      0.561               72.050              0.749                 –
            No       0.511               65.393              0.804                 –

It has to be noted:

– the obtained value of ν for CDGSOM-ID is 0.5;

– the neurons preserve ordering with respect to weights (farther neurons exhibit greater distance);

– both the CDGSOM-ID and the EYKSOM-ID detect the presence of three well interpretable groups of students. The students sharing neuron 3 in CDGSOM-ID (corresponding to neuron 1 in EYKSOM-ID) think that the crisis is due to financial speculation and trust neither a reconsideration of the economic and financial rules nor the initiatives of the Italian government, but only a full reorganization of the economic system; the students sharing neuron 1 in CDGSOM-ID (corresponding to neuron 3 in EYKSOM-ID) generally agree that the crisis is not a relevant problem and that the measures taken by the EU and by the Italian government are helpful; finally, the students sharing neuron 2 think that the crisis is due to financial speculation and that the EU is working better than the Italian government;



– student 23 can be considered an outlier (he does not think that the crisis is due to financial speculation, exhibiting the lowest value of the answer to Q1, but trusts in a regulation of financial markets); the performances of the SOMs-ID improve when student 23 is not considered (Table 13);

– with respect to the results obtained in Coppi et al. [9] by using FkMC-F (Fuzzy k-Means clustering model for fuzzy data) and PkMC-F (Possibilistic k-Means clustering model for fuzzy data), we observe the same cluster composition; the proposed CDGSOM-ID and EYKSOM-ID detect the outlier student 23, whose membership degrees to the three clusters reduce to 0 in PkMC-F, by assigning it the highest distance from its closest neuron;

– with respect to the results obtained in Coppi et al. [9], the topology of the proposed CDGSOM-ID and EYKSOM-ID shows that the students with the most different opinions about the recent global economic and financial crisis are the ones that strongly believe, or strongly do not believe, that a drastic change of the economic system is needed (they are assigned to the border neurons in the topology of the SOMs, neurons 1,1 and 3,1).

6.3. Temperature data

The minimum and maximum temperatures of 37 cities in degrees centigrade for each month of the year are considered [8,41]. Both CDGSOM-ID and EYKSOM-ID with one-dimensional topology and P = 4 neurons have been trained ((1 × 4) topology). From the interval-valued data in Yang et al. [8], triangular fuzzy data have been obtained by considering the center of each interval and the related symmetric spreads. The classification is obtained by considering the neurons closest to the input vectors (Table 14).

Table 14
Results of the SOMs-ID – Temperature data.

CDGSOM-ID
  neuron 1: Bahrain, Bombay, Calcutta, Colombo, Dubai, Kuala Lumpur, Madras, Manila, New Deli, Singapore
  neuron 2: Cairo, Hong Kong, Mauritius, Mexico City, Nairobi, Sidney
  neuron 3: Athens, Lisbon, Madrid, New York, Rome, San Francisco, Seoul, Tehran, Tokyo
  neuron 4: Amsterdam, Copenhagen, Frankfurt, Geneva, London, Moscow, Munich, Paris, Stockholm, Toronto, Vienna, Zurich

EYKSOM-ID
  neuron 1: Bahrain, Bombay, Cairo, Calcutta, Colombo, Dubai, Hong Kong, Kuala Lumpur, Madras, Manila, New Deli, Singapore
  neuron 2: Mauritius, Mexico City, Nairobi, Sidney
  neuron 3: Athens, Lisbon, Madrid, New York, Rome, San Francisco, Seoul, Tehran, Tokyo
  neuron 4: Amsterdam, Copenhagen, Frankfurt, Geneva, London, Moscow, Munich, Paris, Stockholm, Toronto, Vienna, Zurich
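The interval-to-fuzzy conversion described above amounts to taking midpoints and half-widths; a minimal sketch with hypothetical input values:

```python
import numpy as np

def intervals_to_triangular(t_min, t_max):
    # Symmetric triangular fuzzy numbers from interval data: the center is
    # the interval midpoint, the spread is half the interval width.
    t_min = np.asarray(t_min, dtype=float)
    t_max = np.asarray(t_max, dtype=float)
    return (t_min + t_max) / 2, (t_max - t_min) / 2

# Hypothetical city, January: minimum 3, maximum 11 -> center 7, spread 4
print(intervals_to_triangular([3.0], [11.0]))
```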

The weight vectors are shown in Figs. 12(a)–12(b). It has to be noted:

– the obtained value of ν for CDGSOM-ID is 0.5;

– the neurons present ordering with respect to weights (farther neurons exhibit greater distance); this is graphically shown in Fig. 12, where the level of the temperature gradually decreases from neuron 1,1 to neuron 4,1; in fact, with both distance measures we detect four well separated climatic zones;

– some differences in the classification between CDGSOM-ID and EYKSOM-ID are due to the fact that in CDGSOM-ID the obtained value of ν gives the same relevance to centers and spreads, whereas in EYKSOM-ID the relevance of the centers is higher and, as a consequence, the weights are sharper (Fig. 12).

The proposed CDGSOM-ID and EYKSOM-ID have been compared on the same dataset with two methodologies for clustering interval-valued data, namely the MSV agglomerative-based clustering [41] and the symbolic SOM S-SOM [8].


Fig. 12. Weight vectors of the CDGSOM-ID (top), EYKSOM-ID (bottom), J = 12 – Temperature data.

In the MSV agglomerative-based clustering the degree of similarity between units i and l, described by J interval-valued variables, is based on the degree of overlapping of the intervals describing each of the J variables. The MSV is then used in an agglomerative clustering methodology.

In the symbolic SOM S-SOM the degree of similarity between units i and l, described by J variables that may be quantitative or qualitative, is based, with respect to the quantitative interval-valued variables, on the comparison of the position (center), the span (spread) and the content (overlapping) of the intervals describing each variable. The similarity is then used in an SOM-based clustering methodology.

Some observations follow:

– the MSV reserves a single cluster to each of the cities Mauritius and Tehran; the other two clusters discriminate the cities according to the latitude, below the 40°–60° band and within the 40°–60° band;

– the symbolic SOM S-SOM reserves a cluster to the cities of Athens and Tehran; two of the other three clusters discriminate between the tropical zone and the temperate zone; the last cluster is rather difficult to interpret, since the cities assigned to it share only a similar average temperature around the year.

The CDGSOM-ID and the EYKSOM-ID seem to obtain better clustering results, as they identify 4 climatic zones ordered in the topology (Figs. 12(a)–12(b)).

At the basis of the different classification with respect to the MSV agglomerative-based clustering and the S-SOM there are both the dissimilarity measures used and the clustering methodologies. It is to be noticed that, in the case of no overlapping between two intervals, the multivalued asymmetric similarity measure for interval-valued data does not consider the extent of the dissimilarity.


Table 15
Performances of the SOMs-ID – Students' satisfaction data (columns: P = 2, 3, 4, 5, 6).

CDGSOM-ID, with outlier
  quantization error:   0.433, 0.324, 0.283, 0.243, 0.219
  distortion measure:   37.866, 35.932, 25.945, 22.909, 21.177
  Spearman correlation: 0.606, 0.779, 0.755, 0.849, 0.812

CDGSOM-ID, without outlier
  quantization error:   0.485, 0.343, 0.291, 0.277, 0.273
  distortion measure:   31.984, 38.185, 24.286, 18.473, 15.464
  Spearman correlation: 0.455, 0.824, 0.842, 0.818, 0.831

EYKSOM-ID, with outlier
  quantization error:   1.069, 0.715, 0.595, 0.477, 0.416
  distortion measure:   264.224, 238.152, 165.232, 171.594, 156.254
  Spearman correlation: 0.606, 0.779, 0.771, 0.844, 0.791

EYKSOM-ID, without outlier
  quantization error:   1.137, 0.726, 0.546, 0.528, 0.476
  distortion measure:   201.997, 267.129, 160.342, 106.782, 79.714
  Spearman correlation: 0.454, 0.824, 0.842, 0.845, 0.836

Table 16
Results of the SOMs-ID, P = 3 – Students' satisfaction data.

CDGSOM-ID, with outlier
  neuron 1: 4 10 17 23 24 27
  neuron 2: 1 3 5 6 8 9 11 13 14 15 18 19 22 25 26
  neuron 3: 2 7 12 16 20 21

CDGSOM-ID, without outlier
  neuron 1: 4 10 17 23 24 27
  neuron 2: 1 3 5 6 8 9 11 13 14 15 18 19 22 25 26
  neuron 3: 2 7 12 20 21

EYKSOM-ID, with outlier
  neuron 1: 4 10 17 23 24 27
  neuron 2: 1 3 5 6 8 9 11 13 14 15 18 19 22 25 26
  neuron 3: 2 7 12 16 20 21

EYKSOM-ID, without outlier
  neuron 1: 2 7 12 20 21
  neuron 2: 1 3 5 6 8 9 11 13 14 15 18 19 22 25 26
  neuron 3: 4 10 17 23 24 27

6.4. Students’ satisfaction data

The following dataset has been drawn from Sinova et al. [42]; it regards a sample of 27 students. Each student has been asked to rate her/his satisfaction with five different aspects of each course attended during the II Summer School of the European Centre for Soft Computing, held in Spain in July 2008.

Students report their opinions using the scale of fuzzy numbers. Table 2 in Sinova et al. [42] collects the “overall” ratings, which have been converted to trapezoidal fuzzy numbers. Representing opinions in this way has the advantage of reflecting the vagueness and the subjectivity of the potential responses.

As observed by Sinova et al. [42], one student (namely the 16th) represents an outlier in the data; hence we examine the topological structure of the data both with the inclusion and with the exclusion of the outlier.

Since we do not have any a priori information about the “clustering structure” of the data, we try to find the best representation of the data by running the algorithm with different numbers of nodes, setting a one-dimensional map ((1 × 2), (1 × 3), (1 × 4), (1 × 5), (1 × 6) topologies). The classification is obtained by considering the neurons closest to the input vectors.

The performances of the SOMs-ID are presented in Table 15, both in the case when student 16 (outlier) is considered and when it is removed from the data.
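An elbow-style scan of this kind can be written generically; the following sketch assumes training and quantization-error routines like those sketched in Sections 5.2 and 5.3 (the names are placeholders).

```python
def scan_map_sizes(data, train, quantization_error, sizes=(2, 3, 4, 5, 6)):
    # Elbow-style scan over (1 x P) topologies: train a map for each P and
    # collect the quantization error; P is chosen where further increases
    # yield only marginal improvements. `train` and `quantization_error`
    # are placeholders for the routines sketched earlier.
    return {P: quantization_error(data, train(data, P=P)) for P in sizes}
```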

It has to be noted:

– by applying the CDGSOM-ID both with and without the outlier, we observe that the quality of the learning process and of the topological representation of the input vectors on the map improves substantially from two to three nodes. With four or more nodes there is still a decrease of the average quantization error and, more prominently, of the


average distortion measure. At the same time we observe a slight increase of the correlation between the distances of the closest input vectors and the distances of the locations of their related closest weight vectors. Hence the quality of the learning process changes only slightly (at least for the average quantization error) as the number of nodes increases, the topological preservation of the input vectors is already ensured with three nodes, and the successive changes are almost irrelevant. Based on this evidence, we choose the linear SOM with three nodes to represent the data (Table 16);

– the obtained value of ν for CDGSOM-ID is 0.5;

– the neurons present ordering with respect to weights (farther neurons exhibit greater distance);

– in the (1 × 3) topology, the neuron composition is the same with and without the outlier, indicating the robustness of the CDGSOM-ID algorithm in the presence of outliers (the same robust evidence is obtained with the use of the median in Sinova et al. [42]); this does not hold true with more than three nodes;

– similar evidence is found when exploiting the EYKSOM-ID.

As for the cluster composition, the first node (the last in the case of EYKSOM-ID on the data without the outlier) is characterized by students very satisfied with the courses attended, the second node concerns students with average satisfaction, and the third is that of the unsatisfied students.

7. Final remarks

In this paper we proposed an extension of the Self-Organizing Maps (SOMs) for data imprecisely observed (Self-Organizing Maps for imprecise data, SOMs-ID). To compute the distance between the multivariate imprecise data and the weights of the neurons of the map, we suggested exploiting two different distance measures: the Coppi–D'Urso–Giordani distance (CDG distance) [9] and the extended Yang–Ko distance (EYK distance) [10].

We illustrated the main features of the proposed method with a simulation study and different substantive applications. With respect to other classification models for fuzzy data, the SOMs-ID take advantage of all the properties of the SOMs, both performing clustering and organizing the clusters in a reduced-dimension topology, while allowing imprecise input. Moreover, the weight vectors of the neurons reproduce the probability density function of the input vectors, and the number of clusters does not have to be chosen a priori.

The experimental evidence also allowed us to compare the performance of the SOMs-ID with the two distance measures. Overall, only slight differences, if any, have been observed between the SOMs-ID supplied with the CDG distance and those supplied with the EYK distance, namely CDGSOM-ID and EYKSOM-ID.

References

[1] T. Kohonen, Self-Organization and Associative Memory, 3rd edition, Springer-Verlag, New York, 1989.
[2] T. Kohonen, Self-Organizing Maps, Springer-Verlag, 1995.
[3] H.H. Bock, Clustering methods and Kohonen maps for symbolic data, J. Jpn. Soc. Comput. Stat. 15 (2) (1999) 217–229.
[4] H. Bock, Visualizing symbolic data by Kohonen maps, in: E. Diday, M. Noihome-Fraiture (Eds.), Symbolic Data Analysis and the SODAS Software, Wiley, 2008, pp. 205–234.
[5] D. Chen, W. Hung, M. Yang, et al., A batch version of the SOM for symbolic data, in: 2010 Sixth International Conference on Natural Computation (ICNC), vol. 1, IEEE, 2010, pp. 1–5.
[6] P. D'Urso, L. De Giovanni, Midpoint radius self-organizing maps for interval-valued data with telecommunications application, Appl. Soft Comput. 11 (5) (2011) 3877–3886.
[7] C. Hajjar, H. Hamdan, Self-organizing map based on city-block distance for interval-valued data, in: M. Aiguier, F. Bretaudeau, D. Krob (Eds.), Complex Systems Design & Management, Springer, 2012, pp. 281–292.
[8] M. Yang, W. Hung, D. Chen, Self-organizing map for symbolic data, Fuzzy Sets Syst. 203 (2012) 49–73.
[9] R. Coppi, P. D'Urso, P. Giordani, Fuzzy and possibilistic clustering for fuzzy data, Comput. Stat. Data Anal. 56 (4) (2012) 915–927.
[10] M. Yang, C. Ko, On a class of fuzzy c-numbers clustering procedures for fuzzy data, Fuzzy Sets Syst. 84 (1) (1996) 49–60.
[11] R. Coppi, Management of uncertainty in statistical reasoning: The case of regression analysis, Int. J. Approx. Reason. 47 (3) (2008) 284–305.
[12] R. Coppi, P. Giordani, P. D'Urso, Component models for fuzzy data, Psychometrika 71 (4) (2006) 733–761.
[13] D. Dubois, H. Prade, Possibility Theory, Plenum Press, New York, 1988.
[14] H. Zimmermann, Fuzzy Set Theory and Its Applications, Kluwer, Boston, 2001.
[15] R. Coppi, The fuzzy approach to multivariate statistical analysis, Tech. Rep. 11, Dipartimento di Statistica, Probabilità e Statistiche Applicate, Sapienza Università di Roma, 2003.
[16] P. D'Urso, Clustering of fuzzy data, in: J. de Oliveira, W. Pedrycz (Eds.), Advances in Fuzzy Clustering and Its Applications, J. Wiley and Sons, 2007, pp. 155–192.
[17] A. Celminš, Multidimensional least-squares fitting of fuzzy models, Math. Model. 9 (9) (1987) 669–690.
[18] A. Celminš, A practical approach to nonlinear fuzzy regression, SIAM J. Sci. Stat. Comput. 12 (3) (1991) 521–546.
[19] I. Bloch, On fuzzy distances and their use in image processing under imprecision, Pattern Recognit. 32 (11) (1999) 1873–1895.
[20] R. Zwick, E. Carlstein, D. Budescu, Measures of similarity among fuzzy concepts: A comparative analysis, Int. J. Approx. Reason. 1 (2) (1987) 221–242.
[21] R. Lowen, W. Peeters, Distances between fuzzy sets representing grey level images, Fuzzy Sets Syst. 99 (2) (1998) 135–149.
[22] C. Pappis, N. Karacapilidis, A comparative assessment of measures of similarity of fuzzy values, Fuzzy Sets Syst. 56 (2) (1993) 171–174.
[23] A. De Luca, S. Termini, A definition of non-probabilistic entropy in the setting of fuzzy set theory, Inf. Control 20 (1972) 301–312.
[24] S. Chen, M. Yeh, P. Hsiao, A comparison of similarity measures of fuzzy values, Fuzzy Sets Syst. 72 (1) (1995) 79–89.
[25] W. Wang, New similarity measures on fuzzy sets and on elements, Fuzzy Sets Syst. 85 (3) (1997) 305–309.
[26] P. D'Urso, P. Giordani, A weighted fuzzy c-means clustering model for fuzzy data, Comput. Stat. Data Anal. 50 (6) (2006) 1496–1523.
[27] M. Yang, P. Hwang, D. Chen, Fuzzy clustering algorithms for mixed feature variables, Fuzzy Sets Syst. 141 (2) (2004) 301–317.
[28] M. Yang, H. Liu, Fuzzy clustering procedures for conical fuzzy vector data, Fuzzy Sets Syst. 106 (2) (1999) 189–200.
[29] H. Ritter, K. Schulten, Kohonen's self-organizing maps: Exploring their computational capabilities, in: Proceedings of IEEE International Conference on Neural Networks, IEEE, 1988, pp. 109–116.
[30] P. Conti, L. De Giovanni, On the mathematical treatment of self organization: extension of some classical results, in: International Conference on Artificial Neural Networks, 1991, pp. 1809–1812.
[31] H. Ritter, K. Schulten, Convergence properties of Kohonen's topology conserving maps: fluctuations, stability, and dimension selection, Biol. Cybern. 60 (1) (1988) 59–71.
[32] M. Budinich, J. Taylor, On the ordering conditions for self-organizing maps, Neural Comput. 7 (2) (1995) 284–289.
[33] H. Ritter, Asymptotic level density for a class of vector quantization processes, IEEE Trans. Neural Netw. 2 (1) (1991) 173–175.
[34] T. Kohonen, The self-organizing map, Proc. IEEE 78 (9) (1990) 1464–1480.
[35] H. Bauer, M. Herrmann, T. Villmann, Neural maps and topographic vector quantization, Neural Netw. 12 (4) (1999) 659–676.
[36] L. Hubert, P. Arabie, Comparing partitions, J. Classif. 2 (1) (1985) 193–218.
[37] E. Erwin, K. Obermayer, K. Schulten, Convergence properties of self-organizing maps, in: T. Kohonen, K. Mäkisara, O. Simula, J. Kangas (Eds.), Artificial Neural Networks, 1991, pp. 409–414.
[38] W. Hung, M. Yang, Fuzzy clustering on LR-type fuzzy numbers with an application in Taiwanese tea evaluation, Fuzzy Sets Syst. 150 (3) (2005) 561–577.
[39] W. Hung, M. Yang, E. Lee, A robust clustering procedure for fuzzy data, Comput. Math. Appl. 60 (2010) 151–165.
[40] E. Huskisson, Visual analogue scale, in: R. Melzack (Ed.), Pain Measurement and Assessment, Raven Press, New York, 1983, pp. 33–37.
[41] D.S. Guru, B.B. Kiranagi, P. Nagabhushan, Multivalued type proximity measure and concept of mutual similarity value useful for clustering symbolic patterns, Pattern Recognit. Lett. 25 (10) (2004) 1203–1213.
[42] B. Sinova, M. Ángeles Gil, A. Colubi, S. Van Aelst, The median of a random fuzzy number. The 1-norm distance approach, Fuzzy Sets Syst. 200 (2011) 99–115.
