Set-valued prototypes through consensus analysis
TRANSCRIPT
Prototypes definition | Consensus Analysis | Partitioning methods | Simulated data examples | Application on real data | Conclusions
Set-valued prototypes through Consensus Analysis
M. Fordellone (1), F. Palumbo (2)
(1) Department of Statistical Sciences, University of Padua (Italy)
email: [email protected]
(2) Department of Political Sciences, University of Naples (Italy)
email: [email protected]
IFCS Conference, July 6th 2015, Bologna (Italy)
M. Fordellone, F. Palumbo Set-valued prototypes through Consensus Analysis
Outline
1 Prototypes definition
    What is a prototype?
2 Consensus Analysis
    Consensus clustering
    Consensus measurement
3 Partitioning methods
    k-Means
    Fuzzy criterion
    Fuzzy c-Means (FCM) and Archetypal Analysis (AA)
4 Simulated data examples
    Eight experimental contexts
5 Application on real data
    I.P.I.P. test
What is a prototype?
According to Rosch (1975, 1999), prototypes are the elements that represent a category better than the others.
Smith and Medin (1981) refer to the concept of category as the highest order of genera that cannot be defined by a mere listing of properties shared by all elements.
A prototype is not necessarily a real element of the category: it can be an observed or an unobserved (abstract) entity (Medin and Schaffer, 1978).
Consensus concept
Finding and measuring the agreement between two or more partitions of the same data set is of substantial interest in cluster analysis.
This particular case of consensus analysis is also known as consensus clustering.
Comparing partitions
Let $X$ be an $N \times J$ data matrix, and let $T$ and $V$ be two partitions of $X$. Then $n_{rc}$ ($r = 1, \dots, R$; $c = 1, \dots, C$) is the number of objects assigned to class $t_r$ by the first partitioning criterion and to class $v_c$ by the second. Consensus between the partitions $T$ and $V$ is evaluated starting from the entries of the cross-classifying contingency table.
Table: Contingency table

                 Partition V
Partition T      v_1     v_2     ...     v_C     Total
t_1              n_11    n_12    ...     n_1C    n_1·
t_2              n_21    n_22    ...     n_2C    n_2·
...              ...     ...     ...     ...     ...
t_R              n_R1    n_R2    ...     n_RC    n_R·
Total            n_·1    n_·2    ...     n_·C    n
Measure of Consensus
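Number of pairs that can be formed from n units:
\[ S = \binom{n}{2} = \frac{n(n-1)}{2} \]

Total number of Agreements:
\[ A = \binom{n}{2} + \sum_{r=1}^{R} \sum_{c=1}^{C} n_{rc}^{2} - \frac{1}{2} \left[ \sum_{r=1}^{R} n_{r\cdot}^{2} + \sum_{c=1}^{C} n_{\cdot c}^{2} \right] \]

Total number of Disagreements:
\[ D = \frac{1}{2} \left[ \sum_{r=1}^{R} n_{r\cdot}^{2} + \sum_{c=1}^{C} n_{\cdot c}^{2} \right] - \sum_{r=1}^{R} \sum_{c=1}^{C} n_{rc}^{2} \]

Table: Measures of Consensus

Authors                 Measure        Range
Rand (1971)             A/S            [0, 1]
Arabie et al. (1973)    D/S            [0, 1]
Hubert (1977)           (A - D)/S      [-1, 1]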
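
The quantities S, A and D can be computed directly from two label vectors. The following is a minimal Python sketch, not the authors' code: the function name `consensus_measures` and the dictionary keys are illustrative assumptions, and the inputs are assumed to be two equal-length sequences of class labels for at least two units.

```python
from collections import Counter
from math import comb


def consensus_measures(t_labels, v_labels):
    """Return S, A, D and the three ratios for two partitions of the same units."""
    if len(t_labels) != len(v_labels):
        raise ValueError("both partitions must label the same units")
    n = len(t_labels)
    # Cells n_rc of the contingency table and its margins n_r. and n_.c
    n_rc = Counter(zip(t_labels, v_labels))
    n_r = Counter(t_labels)
    n_c = Counter(v_labels)

    sum_sq_cells = sum(v * v for v in n_rc.values())
    sum_sq_margins = sum(v * v for v in n_r.values()) + sum(v * v for v in n_c.values())

    s = comb(n, 2)                       # number of pairs of units
    half_margins = sum_sq_margins // 2   # the sum of squared margins is always even
    a = s + sum_sq_cells - half_margins  # agreements
    d = half_margins - sum_sq_cells      # disagreements
    return {"S": s, "A": a, "D": d,
            "rand": a / s, "arabie": d / s, "hubert": (a - d) / s}


m = consensus_measures([0, 0, 1, 1, 2, 2], [0, 0, 1, 1, 1, 2])
print(m["S"], m["A"], m["D"])  # 15 12 3
```

In this small example 12 of the 15 pairs are treated the same way by both partitions, so A/S = 0.8; A and D always add up to S.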
k-Means method
The k-Means method was developed by MacQueen (1967), who suggested the name k-Means for an algorithm that assigns each unit to the group having the nearest centroid (mean). The iterative procedure consists of four principal steps:
1 Randomly select K group centers;
2 Calculate the distance between each data point and the group centers;
3 Assign each data point to the group whose center is nearest;
4 Recalculate the group centers.
The procedure repeats from step 2 until no assignment changes.
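
The four steps can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `k_means`, the `max_iter` and `seed` parameters, the use of squared Euclidean distance, and the choice of initial centers among the data points are assumptions, not details given on the slide.

```python
import numpy as np


def k_means(x, k, max_iter=100, seed=0):
    """Plain k-Means on an (n, p) array; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly select K distinct data points as initial group centers.
    centers = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    labels = None
    for _ in range(max_iter):
        # Step 2: squared Euclidean distance from each point to each center.
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Step 3: assign each point to its nearest center.
        new_labels = dist.argmin(axis=1)
        # Stop when no assignment changes.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 4: recompute each center as the mean of its group
        # (an empty group keeps its previous center).
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels, centers


x = np.array([[0.0], [1.0], [2.0], [100.0], [101.0], [102.0]])
labels, centers = k_means(x, k=2)
```

The result depends on the random start, so in practice the procedure is usually run from several initializations and the best solution is kept.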
Fuzzy clustering
In fuzzy clustering, data elements can belong to more than one group, according to a measure of association given by a set of membership levels.
The memberships, which take values in [0, 1], indicate the strength of the association between each data element and each group.
In our case, each unit can be univocally assigned to the group for which its membership degree is highest.
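
As a small illustration of the maximum-membership rule, the sketch below turns a membership matrix (units in rows, groups in columns) into a crisp assignment. The function name `harden_memberships` is hypothetical, and the tie-breaking behaviour (first group wins) is simply what NumPy's `argmax` does, not something specified on the slide.

```python
import numpy as np


def harden_memberships(memberships):
    """Assign each unit (row) to the group (column) with the largest membership."""
    memberships = np.asarray(memberships, dtype=float)
    return memberships.argmax(axis=1)


u = [[0.70, 0.20, 0.10],
     [0.10, 0.30, 0.60],
     [0.25, 0.50, 0.25]]
print(harden_memberships(u).tolist())  # [0, 2, 1]
```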
FCM and AA
Fuzzy c-Means (Bezdek et al., 1984) and Archetypal Analysis (Cutler and Breiman, 1994) can be seen as fuzzy versions of k-Means under different constraints.
Fuzzy c-Means minimizes the sum of distances between each point and a set of K centers; Archetypal Analysis minimizes the sum of distances between each point and a set of K archetypes, each defined as a convex combination of extreme points.
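Fuzzy c-Means

\[ W = \sum_{i=1}^{n} \sum_{k=1}^{K} \gamma_{ik}^{2} \, \lVert x_i - c_k \rVert^{2} \]

$\gamma_{ik}$ is the membership level of the i-th unit in the k-th group
$c_k$ is the center of the k-th group
Constraints: $\sum_{k=1}^{K} \gamma_{ik} = 1$; $\gamma_{ik} \ge 0$, $\forall k \in \{1, 2, \dots, K\}$

Archetypal Analysis

\[ J = \sum_{i=1}^{n} \sum_{k=1}^{K} \lVert x_i - \delta_{ik} a_k \rVert^{2} \]

$\delta_{ik}$ is the membership level of the i-th unit in the k-th group
$a_k = \sum_{i=1}^{n} x_i \beta_{ik}$ is the archetype of the k-th group
Constraints: $\sum_{k=1}^{K} \delta_{ik} = 1$; $\delta_{ik} \ge 0$; $\sum_{i=1}^{n} \beta_{ik} = 1$; $\beta_{ik} \ge 0$, $\forall k \in \{1, 2, \dots, K\}$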
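
To make the Fuzzy c-Means criterion concrete, the sketch below evaluates W and performs one alternating update of memberships and centers. It assumes squared Euclidean distance and the usual closed-form updates for squared memberships (memberships proportional to the inverse squared distance, centers as means weighted by the squared memberships). The function names `fcm_objective` and `fcm_step` and the `eps` guard are illustrative, not taken from the slides.

```python
import numpy as np


def fcm_objective(x, memberships, centers):
    """W = sum_i sum_k gamma_ik^2 * ||x_i - c_k||^2."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float((memberships ** 2 * sq_dist).sum())


def fcm_step(x, centers, eps=1e-12):
    """One alternating update: memberships for fixed centers, then new centers."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # gamma_ik is proportional to 1 / ||x_i - c_k||^2; each row is rescaled to sum to 1.
    inv = 1.0 / np.maximum(sq_dist, eps)
    memberships = inv / inv.sum(axis=1, keepdims=True)
    # c_k is the mean of the units weighted by gamma_ik^2.
    weights = memberships ** 2
    new_centers = (weights.T @ x) / weights.sum(axis=0)[:, None]
    return memberships, new_centers


x = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
centers = np.array([[0.5], [9.0]])
for _ in range(20):
    memberships, centers = fcm_step(x, centers)
print(fcm_objective(x, memberships, centers))
```

Each step satisfies both constraints by construction (non-negative memberships that sum to one over the groups), and W does not increase from one step to the next.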
Simulated data
Three groups of units in different experimental contexts have been generated by a multivariate Gaussian distribution with eight dimensions (four variables are white noise).

Table: Experimental contexts

          Size    Correlation    Kurtosis
Case 1    900     0.2-0.4        β = 3
Case 2    300     0.2-0.4        β = 3
Case 3    900     0.2-0.4        β < 3
Case 4    300     0.2-0.4        β < 3
Case 5    900     0.6-0.8        β = 3
Case 6    300     0.6-0.8        β = 3
Case 7    900     0.6-0.8        β < 3
Case 8    300     0.6-0.8        β < 3
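
A data set of this kind could be generated along the following lines. This is only a sketch under stated assumptions: the group means, the single common correlation `rho`, the equal group sizes and the function name `simulate_groups` are all hypothetical, since the slides do not report them, and the sketch covers only the Gaussian cases (β = 3), not the cases with kurtosis below 3.

```python
import numpy as np


def simulate_groups(n_per_group=300, rho=0.3, seed=0):
    """Three Gaussian groups in eight dimensions: four informative, four noise."""
    rng = np.random.default_rng(seed)
    # Hypothetical group means on the four informative variables.
    means = np.array([[0, 0, 0, 0], [4, 4, 0, 0], [0, 4, 4, 0]], dtype=float)
    # Equicorrelation matrix: unit variances and a common correlation rho.
    cov = np.full((4, 4), rho) + (1.0 - rho) * np.eye(4)
    blocks, labels = [], []
    for g, mu in enumerate(means):
        informative = rng.multivariate_normal(mu, cov, size=n_per_group)
        noise = rng.standard_normal((n_per_group, 4))  # white-noise variables
        blocks.append(np.hstack([informative, noise]))
        labels.append(np.full(n_per_group, g))
    return np.vstack(blocks), np.concatenate(labels)


x, y = simulate_groups(n_per_group=300, rho=0.3)
print(x.shape)  # (900, 8)
```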
[Figures] Simulated data, Cases 1 to 8: K-means groups
[Figures] Simulated data, Cases 1 to 8: Memberships FCM and AA
Simulated data: Case 1Consensus Analysis between FCM and AA
M. Fordellone, F. Palumbo Set-valued prototypes through Consensus Analysis
Prototypes definitionConsensus Analysis
Partitioning methodsSimulated data examplesApplication on real data
Conclusions
Eight experimental contexts
Simulated data: Case 2Consensus Analysis between FCM and AA
M. Fordellone, F. Palumbo Set-valued prototypes through Consensus Analysis
Prototypes definitionConsensus Analysis
Partitioning methodsSimulated data examplesApplication on real data
Conclusions
Eight experimental contexts
Simulated data: Case 3Consensus Analysis between FCM and AA
M. Fordellone, F. Palumbo Set-valued prototypes through Consensus Analysis
Prototypes definitionConsensus Analysis
Partitioning methodsSimulated data examplesApplication on real data
Conclusions
Eight experimental contexts
Simulated data: Case 4Consensus Analysis between FCM and AA
M. Fordellone, F. Palumbo Set-valued prototypes through Consensus Analysis
Prototypes definitionConsensus Analysis
Partitioning methodsSimulated data examplesApplication on real data
Conclusions
Eight experimental contexts
Simulated data: Case 5Consensus Analysis between FCM and AA
M. Fordellone, F. Palumbo Set-valued prototypes through Consensus Analysis
Prototypes definitionConsensus Analysis
Partitioning methodsSimulated data examplesApplication on real data
Conclusions
Eight experimental contexts
Simulated data: Case 6Consensus Analysis between FCM and AA
M. Fordellone, F. Palumbo Set-valued prototypes through Consensus Analysis
Prototypes definitionConsensus Analysis
Partitioning methodsSimulated data examplesApplication on real data
Conclusions
Eight experimental contexts
Simulated data: Case 7Consensus Analysis between FCM and AA
Simulated data, Case 8: Consensus Analysis between FCM and AA
Simulated data, Case 1: Consensus groups FCM and AA
Simulated data, Case 2: Consensus groups FCM and AA
Simulated data, Case 3: Consensus groups FCM and AA
Simulated data, Case 4: Consensus groups FCM and AA
Simulated data, Case 5: Consensus groups FCM and AA
Simulated data, Case 6: Consensus groups FCM and AA
Simulated data, Case 7: Consensus groups FCM and AA
Simulated data, Case 8: Consensus groups FCM and AA
Simulated data: Summary
Table: Results of Consensus Analysis and definition of the prototypes
(Experimental Conditions: N, Corr., Kurt.; Prototyping Results: K, Size; Consensus Measuring: Rand, Arabie, Hubert)
N    Corr.    Kurt.  K  Size          Rand   Arabie  Hubert
900  0.2-0.4  β = 3  3  900 (100.0%)  1.000  0.000   1.000
300  0.2-0.4  β = 3  3  300 (100.0%)  1.000  0.000   1.000
900  0.2-0.4  β < 3  3  625 (69.4%)   0.725  0.275   0.449
300  0.2-0.4  β < 3  3  185 (61.7%)   0.683  0.317   0.365
900  0.6-0.8  β = 3  3  599 (66.6%)   0.753  0.247   0.506
300  0.6-0.8  β = 3  3  202 (67.3%)   0.758  0.242   0.517
900  0.6-0.8  β < 3  3  533 (59.2%)   0.698  0.302   0.397
300  0.6-0.8  β < 3  3  189 (63.0%)   0.720  0.280   0.439
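The consensus measures in the summary table compare two partitions of the same units. As a rough illustration only (not the authors' code), the sketch below computes the Rand index and the Hubert-Arabie adjusted Rand index from pair counts. The function names are hypothetical, and it is an assumption that the "Rand" and "Hubert" columns correspond to these two standard definitions.

```python
from collections import Counter
from math import comb


def pair_counts(labels_a, labels_b):
    """Count unit pairs grouped together in both partitions, in A, in B, and overall."""
    joint = Counter(zip(labels_a, labels_b))   # contingency table cells
    in_a = Counter(labels_a)                   # cluster sizes in partition A
    in_b = Counter(labels_b)                   # cluster sizes in partition B
    both = sum(comb(c, 2) for c in joint.values())
    a = sum(comb(c, 2) for c in in_a.values())
    b = sum(comb(c, 2) for c in in_b.values())
    return both, a, b, comb(len(labels_a), 2)


def rand_index(labels_a, labels_b):
    """Share of pairs on which the two partitions agree (together or apart)."""
    both, a, b, total = pair_counts(labels_a, labels_b)
    return (total + 2 * both - a - b) / total


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance agreement (Hubert-Arabie form)."""
    both, a, b, total = pair_counts(labels_a, labels_b)
    expected = a * b / total
    max_index = (a + b) / 2
    if max_index == expected:   # degenerate case, e.g. a single cluster in both
        return 1.0
    return (both - expected) / (max_index - expected)


# Toy example: the same grouping with relabelled clusters counts as full agreement.
fcm_like = [0, 0, 1, 1, 2, 2]
aa_like = [1, 1, 0, 0, 2, 2]
print(rand_index(fcm_like, aa_like), adjusted_rand_index(fcm_like, aa_like))
```

Both indices depend only on which units are grouped together, so the arbitrary cluster labels produced by two different methods do not need to be matched beforehand.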
I.P.I.P. test: About data
Web Site: http://personality-testing.info/_rawdata/
Four different scales were used as part of an experimental DISC personality test. The scales are from the International Personality Item Pool (http://ipip.ori.org/newCPIKey.htm). The scales used are:
Assertiveness: the quality of being self-assured and confident without being aggressive
Social confidence: generally described as a state of being certain
Adventurousness: represented by activities with some potential for physical danger
Dominance: conceptualized as a measure of individual differences in levels of group-based discrimination
The dataset consists of 40 items (10 for each scale) and 898 individuals. The items were rated on a 5-point scale, where:
1=Strongly disagree,
2=Disagree,
3=Neither agree nor disagree,
4=Agree,
5=Strongly agree.
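To make the data layout concrete, here is a small sketch of a response matrix of this shape, built from synthetic ratings rather than the real file. It assumes that the 40 items are stored scale by scale (ten consecutive columns per scale) and it ignores any reverse-keyed items. Both are assumptions for illustration, and the names `SCALES`, `validate` and `scale_means` are hypothetical.

```python
import random

SCALES = ["Assertiveness", "Social confidence", "Adventurousness", "Dominance"]
ITEMS_PER_SCALE = 10
N_ITEMS = len(SCALES) * ITEMS_PER_SCALE   # 40 items
N_INDIVIDUALS = 898
LIKERT = range(1, 6)                      # 1=Strongly disagree ... 5=Strongly agree


def validate(responses):
    """Check that every row has 40 ratings, each on the 1-5 scale."""
    for row in responses:
        if len(row) != N_ITEMS:
            raise ValueError("each individual must have 40 item ratings")
        if any(value not in LIKERT for value in row):
            raise ValueError("ratings must be integers from 1 to 5")


def scale_means(row):
    """Mean rating per scale, assuming items are ordered scale by scale."""
    means = {}
    for i, name in enumerate(SCALES):
        items = row[i * ITEMS_PER_SCALE:(i + 1) * ITEMS_PER_SCALE]
        means[name] = sum(items) / ITEMS_PER_SCALE
    return means


# Synthetic stand-in for the real data: 898 individuals x 40 items.
rng = random.Random(0)
responses = [[rng.randint(1, 5) for _ in range(N_ITEMS)] for _ in range(N_INDIVIDUALS)]
validate(responses)
print(scale_means(responses[0]))
```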
I.P.I.P. test: Principal Component Analysis
I.P.I.P. test: Scree-plots FCM and AA
I.P.I.P. test: K-means groups
I.P.I.P. test: Memberships FCM and AA
I.P.I.P. test: Consensus Analysis between FCM and AA
I.P.I.P. test: Consensus groups FCM and AA
I.P.I.P. test: Description of prototypes
Conclusions
The results of the applications confirm the following hypotheses:
When the groups are well defined and do not overlap, the consensus analysis between the two partitioning methods highlights the presence of the groups;
The simulation was useful for studying the factors that most strongly affect the consensus between the two approaches: first, the correlation between variables; second, the presence of multivariate outliers (different kurtosis levels).
We believe that defining prototypes through the consensus approach is more reliable than the classical approaches: identifying the groups with respect to the consensus criterion guarantees more homogeneous prototypes.