agglomerative hierarchical clustering - computer science · 2010. 11. 2. · agglomerative...

58
Agglomerative Hierarchical Clustering 10/14/2010 1 Loomis & Romanczyk Outline Introduction Distance Example S.L. Dist. Matrices Cluster Plots Dendrodgrams Summary References Questions Extra Stuff T.L. Dist. Matrices A.L. Dist. Matrices Agglomerative Hierarchical Clustering James Loomis & Paul Romanczyk October 27, 2010

Upload: others

Post on 20-Feb-2021

22 views

Category:

Documents


2 download

TRANSCRIPT

  • AgglomerativeHierarchicalClustering

    10/14/20101

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Agglomerative Hierarchical Clustering

    James Loomis & Paul Romanczyk

    October 27, 2010

  • AgglomerativeHierarchicalClustering

    10/14/20102

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Outline

    Outline

    Introduction

    Distance

    ExampleSingle Linkage Distance MatricesCluster PlotsDendrogramsSummary

    References

    Questions

    Extra StuffTotal Linkage Distance MatricesAverage Linkage Distance Matrices

  • AgglomerativeHierarchicalClustering

    10/14/20103

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Hierarchical ClusteringHierarchical clustering:

    I Clustering using a hierarchy of clusters

    I May be represented in a tree structure (dendrogram)

    I Root - a single cluster containing all observations

    I Leaves - individual observations.

    x1

    x8

    x7

    x6

    x5

    x4

    x3

    x2

  • AgglomerativeHierarchicalClustering

    10/14/20103

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Hierarchical ClusteringHierarchical clustering:

    I Clustering using a hierarchy of clusters

    I May be represented in a tree structure (dendrogram)

    I Root - a single cluster containing all observations

    I Leaves - individual observations.

    x1

    x8

    x7

    x6

    x5

    x4

    x3

    x2

  • AgglomerativeHierarchicalClustering

    10/14/20104

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Dendrogram

    [Duda et al., 2001] Figure 10.11

  • AgglomerativeHierarchicalClustering

    10/14/20105

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Two Distinct Approaches:

    I Agglomerative (bottom up, clumping)I Start with n singleton clustersI Successively merge (”clump”) clustersI Computation from one level to another generally simplerI For small number of clusters, takes many iterations

    I Divisive (top down, splitting)I Start with one clusterI Successively split clustersI Single iteration is more expensiveI With fewer clusters, fewer iterations

  • AgglomerativeHierarchicalClustering

    10/14/20105

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Two Distinct Approaches:I Agglomerative (bottom up, clumping)

    I Start with n singleton clustersI Successively merge (”clump”) clustersI Computation from one level to another generally simplerI For small number of clusters, takes many iterations

    I Divisive (top down, splitting)I Start with one clusterI Successively split clustersI Single iteration is more expensiveI With fewer clusters, fewer iterations

  • AgglomerativeHierarchicalClustering

    10/14/20105

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Two Distinct Approaches:I Agglomerative (bottom up, clumping)

    I Start with n singleton clustersI Successively merge (”clump”) clustersI Computation from one level to another generally simplerI For small number of clusters, takes many iterations

    I Divisive (top down, splitting)I Start with one clusterI Successively split clustersI Single iteration is more expensiveI With fewer clusters, fewer iterations

  • AgglomerativeHierarchicalClustering

    10/14/20106

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Agglomerative Clustering Algorithm

    1 c , ĉ ← n2 Di ← {xi} where i = 1, . . . , n

    3 do ĉ ← ĉ − 14 find nearest clusters Di , Dj5 merge Di and Dj6 until c = ĉ

    7 return c clusters

    How do we determine which two clusters are nearest?

  • AgglomerativeHierarchicalClustering

    10/14/20106

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Agglomerative Clustering Algorithm

    1 c , ĉ ← n2 Di ← {xi} where i = 1, . . . , n

    3 do ĉ ← ĉ − 14 find nearest clusters Di , Dj5 merge Di and Dj6 until c = ĉ

    7 return c clusters

    How do we determine which two clusters are nearest?

  • AgglomerativeHierarchicalClustering

    10/14/20107

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Properties of Distance

    I Distance is non-negative.I D(x , y) ≥ 0

    I D(x , y) = 0 if and only if x = y .I Distance is symmetric.

    I D(x , y) = D(y , x)

    I Distance satisfies the triangle inequalityI D(x , z) ≤ D(x , y) + D(y , z)

  • AgglomerativeHierarchicalClustering

    10/14/20108

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Distance Measures—Between Points

    Let ~x1 = [ x1,1 x1,2 · · · x1,n ]T and~x2 = [ x2,1 x2,2 · · · x2,n ]T

    Name Formula

    Manhattan d1(~x1, ~x2) =∑n

    i=1 |x1,i − x2,i |Euclidian d2(~x1, ~x2) =

    √∑ni=1 |x1,i − x2,i |

    2

    P-norm dp(~x1, ~x2) =p√∑n

    i=1 |x1,i − x2,i |p

    Statistical ds(~x1, ~x2) =

    √∑ni=1

    (x1,i−x2,i

    σi

    )2Mahalanobis dm(~x1, ~x2) =

    √(~x1 − ~µ)Σ−1(~x2 − ~µ)T

    Cosine dc(~x1, ~x2) =~x1

    T ~x1||~x2||·||~x2||

    Chebyshev dC (~x1, ~x2) = max(|x1,i − x2,i |)

  • AgglomerativeHierarchicalClustering

    10/14/20109

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Distance Measures—Between Clusters

    Single Linkage d(U,V ),W = min{dU,W , dV ,W }

    d2,4

    1

    2

    3

    4 5

    Complete Linkage d(U,V ),W = max{dU,W , dV ,W }

    d1,5

    1

    2

    3

    4 5

    Average Linkage d(U,V ),W =

    ∑i

    ∑j

    di ,j

    NU,V NW2∑

    i=1

    5∑j=3

    di ,j

    2·3

    1

    2

    3

    4 5

  • AgglomerativeHierarchicalClustering

    10/14/201010

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Data

    x y

    -1.3508 0.9010-0.3674 1.1548-1.5895 -0.0732-1.3615 0.1443-0.7088 0.33240.3155 -0.32201.6638 0.25670.4751 0.25822.0778 0.28481.3015 -1.0126

  • AgglomerativeHierarchicalClustering

    10/14/201011

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using single linkage. Iteration: 11 2 3 4 5 6 7 8 9 10

    1 0.00 1.02 1.00 0.76 0.86 2.07 3.08 1.94 3.48 3.272 1.02 0.00 1.73 1.42 0.89 1.63 2.22 1.23 2.60 2.743 1.00 1.73 0.00 0.32 0.97 1.92 3.27 2.09 3.68 3.044 0.76 1.42 0.32 0.00 0.68 1.74 3.03 1.84 3.44 2.905 0.86 0.89 0.97 0.68 0.00 1.22 2.37 1.19 2.79 2.426 2.07 1.63 1.92 1.74 1.22 0.00 1.47 0.60 1.86 1.207 3.08 2.22 3.27 3.03 2.37 1.47 0.00 1.19 0.41 1.328 1.94 1.23 2.09 1.84 1.19 0.60 1.19 0.00 1.60 1.529 3.48 2.60 3.68 3.44 2.79 1.86 0.41 1.60 0.00 1.51

    10 3.27 2.74 3.04 2.90 2.42 1.20 1.32 1.52 1.51 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201012

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using single linkage. Iteration: 21 2 5 6 7 8 9 10 11

    1 0.00 1.02 0.86 2.07 3.08 1.94 3.48 3.27 0.762 1.02 0.00 0.89 1.63 2.22 1.23 2.60 2.74 1.425 0.86 0.89 0.00 1.22 2.37 1.19 2.79 2.42 0.686 2.07 1.63 1.22 0.00 1.47 0.60 1.86 1.20 1.747 3.08 2.22 2.37 1.47 0.00 1.19 0.41 1.32 3.038 1.94 1.23 1.19 0.60 1.19 0.00 1.60 1.52 1.849 3.48 2.60 2.79 1.86 0.41 1.60 0.00 1.51 3.44

    10 3.27 2.74 2.42 1.20 1.32 1.52 1.51 0.00 2.9011 0.76 1.42 0.68 1.74 3.03 1.84 3.44 2.90 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201013

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using single linkage. Iteration: 31 2 5 6 8 10 11 12

    1 0.00 1.02 0.86 2.07 1.94 3.27 0.76 3.082 1.02 0.00 0.89 1.63 1.23 2.74 1.42 2.225 0.86 0.89 0.00 1.22 1.19 2.42 0.68 2.376 2.07 1.63 1.22 0.00 0.60 1.20 1.74 1.478 1.94 1.23 1.19 0.60 0.00 1.52 1.84 1.19

    10 3.27 2.74 2.42 1.20 1.52 0.00 2.90 1.3211 0.76 1.42 0.68 1.74 1.84 2.90 0.00 3.0312 3.08 2.22 2.37 1.47 1.19 1.32 3.03 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201014

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using single linkage. Iteration: 41 2 5 10 11 12 13

    1 0.00 1.02 0.86 3.27 0.76 3.08 1.942 1.02 0.00 0.89 2.74 1.42 2.22 1.235 0.86 0.89 0.00 2.42 0.68 2.37 1.19

    10 3.27 2.74 2.42 0.00 2.90 1.32 1.2011 0.76 1.42 0.68 2.90 0.00 3.03 1.7412 3.08 2.22 2.37 1.32 3.03 0.00 1.1913 1.94 1.23 1.19 1.20 1.74 1.19 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201015

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using single linkage. Iteration: 51 2 10 12 13 14

    1 0.00 1.02 3.27 3.08 1.94 0.762 1.02 0.00 2.74 2.22 1.23 0.89

    10 3.27 2.74 0.00 1.32 1.20 2.4212 3.08 2.22 1.32 0.00 1.19 2.3713 1.94 1.23 1.20 1.19 0.00 1.1914 0.76 0.89 2.42 2.37 1.19 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201016

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using single linkage. Iteration: 62 10 12 13 15

    2 0.00 2.74 2.22 1.23 0.8910 2.74 0.00 1.32 1.20 2.4212 2.22 1.32 0.00 1.19 2.3713 1.23 1.20 1.19 0.00 1.1915 0.89 2.42 2.37 1.19 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201017

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using single linkage. Iteration: 710 12 13 16

    10 0.00 1.32 1.20 2.4212 1.32 0.00 1.19 2.2213 1.20 1.19 0.00 1.1916 2.42 2.22 1.19 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201018

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using single linkage. Iteration: 810 12 17

    10 0.00 1.32 1.2012 1.32 0.00 1.1917 1.20 1.19 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201019

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using single linkage. Iteration: 910 18

    10 0.00 1.2018 1.20 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201020

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 1

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201021

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 2

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201022

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 3

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201023

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 4

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201024

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 5

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201025

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 6

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201026

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 7

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201027

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 8

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201028

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 9

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201029

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Linkage Step 10

    −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−1.5

    −1

    −0.5

    0

    0.5

    1

    1.5

    1

    2

    3

    4

    5

    6

    7 8 9

    10

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201030

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Dendrogram

    1 2 3 5 7 10 8 6 9 4

    0.5

    0.6

    0.7

    0.8

    0.9

    1

    1.1

    Class

    Dis

    tance

  • AgglomerativeHierarchicalClustering

    10/14/201031

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Dendrogram

    1 2 3 5 4 7 10 6 9 8

    0.5

    1

    1.5

    2

    2.5

    3

    3.5

    Class

    Dis

    tance

  • AgglomerativeHierarchicalClustering

    10/14/201032

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Dendrogram

    1 2 3 7 10 4 5 6 9 8

    0.4

    0.6

    0.8

    1

    1.2

    1.4

    1.6

    1.8

    2

    Class

    Dis

    tance

  • AgglomerativeHierarchicalClustering

    10/14/201033

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example–Dendrogram

    1 2 3 5 7 10 8 6 9 40

    0.5

    1

    1.5

    2

    2.5

    3

    3.5

    Class

    Distan

    ce

    1 2 3 5 4 7 10 6 9 80

    0.5

    1

    1.5

    2

    2.5

    3

    3.5

    Class

    Distan

    ce

    1 2 3 7 10 4 5 6 9 80

    0.5

    1

    1.5

    2

    2.5

    3

    3.5

    Class

    Distan

    ce

    Single Linkage Complete Linkage Average Linkage

  • AgglomerativeHierarchicalClustering

    10/14/201034

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Single Linkage Complete Linkage Average Linkagedist. action dist. action dist. action

    0.3151 {4, 3} → 11 0.3151 {4, 3} → 11 0.3151 {4, 3} → 110.4149 {9, 7} → 12 0.4149 {9, 7} → 12 0.4149 {9, 7} → 120.6018 {8, 6} → 13 0.6018 {8, 6} → 13 0.6018 {8, 6} → 130.6792 {11, 5} → 14 0.8576 {5, 1} → 14 0.8244 {11, 5} → 140.7568 {14, 1} → 15 1.0030 {14, 11} → 15 0.8724 {14, 1} → 150.8904 {15, 2} → 16 1.5119 {12, 10} → 16 1.2640 {15, 2} → 161.1862 {16, 13} → 17 1.6271 {13, 2} → 17 1.3598 {13, 10} → 171.1887 {17, 12} → 18 2.0910 {17, 15} → 18 1.4924 {17, 12} → 181.2038 {10, 18} → 19 3.2706 {16, 18} → 19 2.4476 {16, 18} → 19

  • AgglomerativeHierarchicalClustering

    10/14/201035

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    References

    Duda, R., Hart, P., and Stork, D. (2001).Pattern Classification.John Wiley and Sons, 2nd edition.

    Johnson, R. and Wichern, D. (2007).Applied multivariate statistical data analysis.Prentice Hall: Upper Saddle River, NJ, 6th edition.

  • AgglomerativeHierarchicalClustering

    10/14/201036

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Questions

  • AgglomerativeHierarchicalClustering

    10/14/201037

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using total linkage. Iteration: 11 2 3 4 5 6 7 8 9 10

    1 0.00 1.02 1.00 0.76 0.86 2.07 3.08 1.94 3.48 3.272 1.02 0.00 1.73 1.42 0.89 1.63 2.22 1.23 2.60 2.743 1.00 1.73 0.00 0.32 0.97 1.92 3.27 2.09 3.68 3.044 0.76 1.42 0.32 0.00 0.68 1.74 3.03 1.84 3.44 2.905 0.86 0.89 0.97 0.68 0.00 1.22 2.37 1.19 2.79 2.426 2.07 1.63 1.92 1.74 1.22 0.00 1.47 0.60 1.86 1.207 3.08 2.22 3.27 3.03 2.37 1.47 0.00 1.19 0.41 1.328 1.94 1.23 2.09 1.84 1.19 0.60 1.19 0.00 1.60 1.529 3.48 2.60 3.68 3.44 2.79 1.86 0.41 1.60 0.00 1.51

    10 3.27 2.74 3.04 2.90 2.42 1.20 1.32 1.52 1.51 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201038

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using total linkage. Iteration: 21 2 5 6 7 8 9 10 11

    1 0.00 1.02 0.86 2.07 3.08 1.94 3.48 3.27 1.002 1.02 0.00 0.89 1.63 2.22 1.23 2.60 2.74 1.735 0.86 0.89 0.00 1.22 2.37 1.19 2.79 2.42 0.976 2.07 1.63 1.22 0.00 1.47 0.60 1.86 1.20 1.927 3.08 2.22 2.37 1.47 0.00 1.19 0.41 1.32 3.278 1.94 1.23 1.19 0.60 1.19 0.00 1.60 1.52 2.099 3.48 2.60 2.79 1.86 0.41 1.60 0.00 1.51 3.68

    10 3.27 2.74 2.42 1.20 1.32 1.52 1.51 0.00 3.0411 1.00 1.73 0.97 1.92 3.27 2.09 3.68 3.04 0.32

  • AgglomerativeHierarchicalClustering

    10/14/201039

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using total linkage. Iteration: 31 2 5 6 8 10 11 12

    1 0.00 1.02 0.86 2.07 1.94 3.27 1.00 3.482 1.02 0.00 0.89 1.63 1.23 2.74 1.73 2.605 0.86 0.89 0.00 1.22 1.19 2.42 0.97 2.796 2.07 1.63 1.22 0.00 0.60 1.20 1.92 1.868 1.94 1.23 1.19 0.60 0.00 1.52 2.09 1.60

    10 3.27 2.74 2.42 1.20 1.52 0.00 3.04 1.5111 1.00 1.73 0.97 1.92 2.09 3.04 0.32 3.6812 3.48 2.60 2.79 1.86 1.60 1.51 3.68 0.41

  • AgglomerativeHierarchicalClustering

    10/14/201040

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using total linkage. Iteration: 41 2 5 10 11 12 13

    1 0.00 1.02 0.86 3.27 1.00 3.48 2.072 1.02 0.00 0.89 2.74 1.73 2.60 1.635 0.86 0.89 0.00 2.42 0.97 2.79 1.22

    10 3.27 2.74 2.42 0.00 3.04 1.51 1.5211 1.00 1.73 0.97 3.04 0.32 3.68 2.0912 3.48 2.60 2.79 1.51 3.68 0.41 1.8613 2.07 1.63 1.22 1.52 2.09 1.86 0.60

  • AgglomerativeHierarchicalClustering

    10/14/201041

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using total linkage. Iteration: 52 10 11 12 13 14

    2 0.00 2.74 1.73 2.60 1.63 1.0210 2.74 0.00 3.04 1.51 1.52 3.2711 1.73 3.04 0.32 3.68 2.09 1.0012 2.60 1.51 3.68 0.41 1.86 3.4813 1.63 1.52 2.09 1.86 0.60 2.0714 1.02 3.27 1.00 3.48 2.07 0.86

  • AgglomerativeHierarchicalClustering

    10/14/201042

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using total linkage. Iteration: 62 10 12 13 15

    2 0.00 2.74 2.60 1.63 1.7310 2.74 0.00 1.51 1.52 3.2712 2.60 1.51 0.41 1.86 3.6813 1.63 1.52 1.86 0.60 2.0915 1.73 3.27 3.68 2.09 1.00

  • AgglomerativeHierarchicalClustering

    10/14/201043

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using total linkage. Iteration: 72 13 15 16

    2 0.00 1.63 1.73 2.7413 1.63 0.60 2.09 1.8615 1.73 2.09 1.00 3.6816 2.74 1.86 3.68 1.51

  • AgglomerativeHierarchicalClustering

    10/14/201044

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using total linkage. Iteration: 815 16 17

    15 1.00 3.68 2.0916 3.68 1.51 2.7417 2.09 2.74 1.63

  • AgglomerativeHierarchicalClustering

    10/14/201045

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using total linkage. Iteration: 916 18

    16 0.00 3.2718 3.27 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201046

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using average linkage. Iteration: 11 2 3 4 5 6 7 8 9 10

    1 0.00 1.02 1.00 0.76 0.86 2.07 3.08 1.94 3.48 3.272 1.02 0.00 1.73 1.42 0.89 1.63 2.22 1.23 2.60 2.743 1.00 1.73 0.00 0.32 0.97 1.92 3.27 2.09 3.68 3.044 0.76 1.42 0.32 0.00 0.68 1.74 3.03 1.84 3.44 2.905 0.86 0.89 0.97 0.68 0.00 1.22 2.37 1.19 2.79 2.426 2.07 1.63 1.92 1.74 1.22 0.00 1.47 0.60 1.86 1.207 3.08 2.22 3.27 3.03 2.37 1.47 0.00 1.19 0.41 1.328 1.94 1.23 2.09 1.84 1.19 0.60 1.19 0.00 1.60 1.529 3.48 2.60 3.68 3.44 2.79 1.86 0.41 1.60 0.00 1.51

    10 3.27 2.74 3.04 2.90 2.42 1.20 1.32 1.52 1.51 0.00

  • AgglomerativeHierarchicalClustering

    10/14/201047

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using average linkage. Iteration: 21 2 5 6 7 8 9 10 11

    1 0.00 1.02 0.86 2.07 3.08 1.94 3.48 3.27 0.882 1.02 0.00 0.89 1.63 2.22 1.23 2.60 2.74 1.585 0.86 0.89 0.00 1.22 2.37 1.19 2.79 2.42 0.826 2.07 1.63 1.22 0.00 1.47 0.60 1.86 1.20 1.837 3.08 2.22 2.37 1.47 0.00 1.19 0.41 1.32 3.158 1.94 1.23 1.19 0.60 1.19 0.00 1.60 1.52 1.979 3.48 2.60 2.79 1.86 0.41 1.60 0.00 1.51 3.56

    10 3.27 2.74 2.42 1.20 1.32 1.52 1.51 0.00 2.9711 0.88 1.58 0.82 1.83 3.15 1.97 3.56 2.97 0.16

  • AgglomerativeHierarchicalClustering

    10/14/201048

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using average linkage. Iteration: 31 2 5 6 8 10 11 12

    1 0.00 1.02 0.86 2.07 1.94 3.27 0.88 3.282 1.02 0.00 0.89 1.63 1.23 2.74 1.58 2.415 0.86 0.89 0.00 1.22 1.19 2.42 0.82 2.586 2.07 1.63 1.22 0.00 0.60 1.20 1.83 1.678 1.94 1.23 1.19 0.60 0.00 1.52 1.97 1.40

    10 3.27 2.74 2.42 1.20 1.52 0.00 2.97 1.4211 0.88 1.58 0.82 1.83 1.97 2.97 0.16 3.3612 3.28 2.41 2.58 1.67 1.40 1.42 3.36 0.21

  • AgglomerativeHierarchicalClustering

    10/14/201049

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using average linkage. Iteration: 41 2 5 10 11 12 13

    1 0.00 1.02 0.86 3.27 0.88 3.28 2.002 1.02 0.00 0.89 2.74 1.58 2.41 1.435 0.86 0.89 0.00 2.42 0.82 2.58 1.20

    10 3.27 2.74 2.42 0.00 2.97 1.42 1.3611 0.88 1.58 0.82 2.97 0.16 3.36 1.9012 3.28 2.41 2.58 1.42 3.36 0.21 1.5313 2.00 1.43 1.20 1.36 1.90 1.53 0.30

  • AgglomerativeHierarchicalClustering

    10/14/201050

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using average linkage. Iteration: 51 2 10 12 13 14

    1 0.00 1.02 3.27 3.28 2.00 0.872 1.02 0.00 2.74 2.41 1.43 1.35

    10 3.27 2.74 0.00 1.42 1.36 2.7912 3.28 2.41 1.42 0.21 1.53 3.1013 2.00 1.43 1.36 1.53 0.30 1.6714 0.87 1.35 2.79 3.10 1.67 0.44

  • AgglomerativeHierarchicalClustering

    10/14/201051

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using average linkage. Iteration: 62 10 12 13 15

    2 0.00 2.74 2.41 1.43 1.2610 2.74 0.00 1.42 1.36 2.9112 2.41 1.42 0.21 1.53 3.1413 1.43 1.36 1.53 0.30 1.7515 1.26 2.91 3.14 1.75 0.57

  • AgglomerativeHierarchicalClustering

    10/14/201052

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using average linkage. Iteration: 710 12 13 16

    10 0.00 1.42 1.36 2.8712 1.42 0.21 1.53 3.0013 1.36 1.53 0.30 1.6916 2.87 3.00 1.69 0.77

  • AgglomerativeHierarchicalClustering

    10/14/201053

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using average linkage. Iteration: 812 16 17

    12 0.21 3.00 1.4916 3.00 0.77 2.0817 1.49 2.08 0.74

  • AgglomerativeHierarchicalClustering

    10/14/201054

    Loomis &Romanczyk

    Outline

    Introduction

    Distance

    Example

    S.L. Dist. Matrices

    Cluster Plots

    Dendrodgrams

    Summary

    References

    Questions

    Extra Stuff

    T.L. Dist. Matrices

    A.L. Dist. Matrices

    Example

    Cluster distances using average linkage. Iteration: 916 18

    16 0.00 2.4518 2.45 0.00

    OutlineIntroductionDistanceExampleSingle Linkage Distance MatricesCluster PlotsDendrogramsSummary

    ReferencesQuestionsExtra StuffTotal Linkage Distance MatricesAverage Linkage Distance Matrices