Introduction to Machine Learning — udel.edu/~amotong/teaching/machine learning/lectures/ (Lec 13...)
Introduction to Machine Learning
Introduction to Machine Learning Amo G. Tong 1
Lecture 13: Unsupervised Learning
• K-means Framework
• Cut-based Framework
• Agglomerative Framework
• Divisive Framework
• Some materials are courtesy of Vibhav Gogate, Carlos Guestrin, Dan Klein & Luke Zettlemoyer, Eric Xing, and Hastie.
• All pictures belong to their creators.
Machine Learning
• Supervised Learning 𝒇(𝒙)
• Parametric
• Regression vs. Classification; Continuous vs. Discrete; Linear vs. Non-linear
• Methods: linear regression, decision tree, neural network, ...
• Non-parametric
• Instance-based learning, KNN
• Reinforcement Learning
• Unsupervised Learning
• Clustering
Clustering
• Input: some data
• Goal: infer group information
Clustering
• Input: some data
• Goal: infer group information
• E.g., group emails, group search results, detect styles.
source : http://ogrisel.github.io/scikit-learn.org/sklearn-tutorial/_images/plot_cluster_comparison_11.png
Clustering
• Input: some data
• Goal: infer group information
• E.g., group emails, group search results, detect styles.
Edge Foci Interest Points. DOI: 10.1109/ICCV.2011.6126263
Clustering (Eric Xing)
• Input: some data
• Goal: infer group information
• Clustering is subjective.
Clustering
• Input: some data
• Goal: infer group information
• Clustering is subjective.
• How do we measure similarity?
• Output:
• a partition
• some pattern that reflects the group information
Clustering
• Input: some data
• Goal: infer group information
• E.g., group emails, group search results, detect styles.
• We have data, but there are no labels.
• We do not know how many clusters there are.
• We do not know which instance belongs to which cluster.
• We do not even know whether a hidden pattern exists.
• But we never give up.
Clustering
• But we never give up.
• Partition based framework
• Hierarchical clustering framework
K-means Framework
• We have some data.
• We can define (a) the similarity between two instances and (b) the center of a set of instances.
• E.g., Euclidean space (real vectors)
• Distance: ‖𝒙1 − 𝒙2‖2
• Similarity = 1/distance
• Center of 𝑥1, …, 𝑥𝑛: 𝒙̄ = (Σ𝑖 𝒙𝑖)/𝑛
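As a concrete illustration of these Euclidean-space choices, here is a minimal Python sketch (the function names are mine; similarity = 1/distance is, as on the slide, undefined for identical points):

```python
import math

def distance(x1, x2):
    # Euclidean distance ||x1 - x2||_2
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def similarity(x1, x2):
    # similarity = 1 / distance (undefined when x1 == x2)
    return 1.0 / distance(x1, x2)

def center(points):
    # component-wise mean: x_bar = (sum_i x_i) / n
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))
```

For instance, `distance((0, 0), (3, 4))` is 5.0 and `center([(0, 0), (2, 2)])` is (1.0, 1.0).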
K-means Framework
• We have some data.
• We can define (a) the similarity between two instances and (b) the center of a set of instances.
• Suppose there are 𝑘 clusters.
• Randomly select 𝑘 centers
• Repeat
• Assign each instance to the closest center. (now we have 𝑘 clusters)
• Recompute the center of each cluster.
• Until convergence or other stopping criteria are met
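The loop above can be sketched in a few lines of Python. This is an illustrative implementation, not the lecture's reference code; initializing the centers by sampling 𝑘 data points and keeping the old center for an empty cluster are my own choices:

```python
import random

def kmeans(data, k, max_iter=100, seed=0):
    # Lloyd's iteration: assign points to the nearest center, then recompute centers.
    rng = random.Random(seed)
    centers = rng.sample(data, k)  # randomly select k centers (here: k data points)
    for _ in range(max_iter):
        # Assignment step: each instance goes to the closest center.
        clusters = [[] for _ in range(k)]
        for x in data:
            j = min(range(k), key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])))
            clusters[j].append(x)
        # Update step: recompute the center (mean) of each non-empty cluster.
        new_centers = [
            tuple(sum(p[d] for p in c) / len(c) for d in range(len(data[0]))) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: nothing changed
            break
        centers = new_centers
    return centers, clusters
```

For example, `kmeans([(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)], 2)` separates the two pairs of nearby points after a couple of iterations.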
K-means Framework (Bishop)
• Example (Euclidean space)
Suppose 𝑘 = 2. Step 1: randomly pick two centers.
K-means Framework (Bishop)
• Example (Euclidean space)
Suppose 𝑘 = 2. Step 1: randomly pick two centers. Step 2: assign points to the closest center.
K-means Framework (Bishop)
• Example (Euclidean space)
Suppose 𝑘 = 2. Step 1: randomly pick two centers. Step 2: assign points to the closest center. Step 3: calculate the center of each cluster.
K-means Framework (Bishop)
• Example (Euclidean space)
Suppose 𝑘 = 2. Step 1: randomly pick two centers. Step 2: assign points to the closest center. Step 3: calculate the center of each cluster. Step 4: assign points to the closest center.
Repeat until convergence.
K-means Framework (Bishop)
• Example (Image Segmentation)
• Formally, partition an image into regions, each of which has a reasonably homogeneous visual appearance.
• Informally, identify the main elements in an image.
• Data: pixels and their colors.
K-means Framework
• Repeat
• Update the assignment.
• Update the means (centers).
• Until convergence or other stopping criteria are met
K-means Framework
• Repeat
• Update the assignment.
• Update the means (centers).
• Until convergence or other stopping criteria are met
• Given the assignment 𝐶, let 𝐶(𝑥) be the mean (center) of the cluster containing 𝑥. Consider the Euclidean distance.
• Will it converge? Yes!
• Consider the potential function 𝑓 = Σ𝑥∈𝐷 dist(𝑥, 𝐶(𝑥))
• 𝑓 never increases and 𝑓 is bounded below ⇒ it converges
K-means Framework
• Given the assignment 𝐶, let 𝐶(𝑥) be the mean (center) of the cluster containing 𝑥. Consider the Euclidean distance.
• Repeat
• Update the assignment.
• Update the means (centers).
• Until convergence or other stopping criteria are met
• Updating the assignment will not increase 𝒇.
• Recomputing the means will not increase 𝒇.
• For a fixed cluster, which point minimizes the sum of distances?
• Try the Lagrange multiplier method (do it yourself).
𝑓 = Σ𝑥∈𝐷 dist(𝑥, 𝐶(𝑥)), where dist(𝒙1, 𝒙2) = ‖𝒙1 − 𝒙2‖²
K-means Framework
• Simple.
• Intuitive: (implicitly) minimizes Σ𝑥∈𝐷 dist(𝑥, 𝐶(𝑥)).
• Not time consuming: O(𝑡𝑘𝑛) (𝑘: # clusters, 𝑡: # iterations, 𝑛: # instances).
• K-means may converge to a local optimum.
• How many clusters are there?
• What is the distance between clusters?
• How to define the mean? What if the attributes are not real numbers?
• Cannot handle noise.
• Not suitable for non-convex patterns (recall the patterns of KNN).
Cut-based Clustering
• Two intuitions behind a good clustering.
• (a) weaken the connection between objects in different clusters
• (b) strengthen the connection between objects within a cluster
Cut-based Clustering
• Two intuitions behind a good clustering.
• (a) weaken the connection between objects in different clusters
• (b) strengthen the connection between objects within a cluster
• Ground set 𝑈 = {𝑣1, …, 𝑣𝑛}
• Similarity between two elements: 𝑠𝑖𝑚(𝑣𝑖, 𝑣𝑗)
• A partition 𝐶1, …, 𝐶𝑘 of 𝑈
• Inner-sim(𝐶𝑖) = Σ𝑢,𝑣∈𝐶𝑖 𝑠𝑖𝑚(𝑢, 𝑣)
• Inter-sim(𝐶𝑖) = Σ𝑢∈𝐶𝑖, 𝑣∉𝐶𝑖 𝑠𝑖𝑚(𝑢, 𝑣) (the cut)
How to measure the goodness of a clustering?
Cost of a clustering 𝐶1, …, 𝐶𝑘: 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
Cut-based Clustering
• Two intuitions behind a good clustering.
• Find a clustering that minimizes 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
An optimal solution exists, but it is hard to find. Enumerating? Polynomial time?
Cut-based Clustering
• Find a clustering that minimizes 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
• An algorithm
• Initialize 𝐶1, … , 𝐶𝑘 randomly.
• Repeat until converge
• Unlock all elements
• Repeat until all elements are locked.
• Randomly select one cluster 𝐶𝑖.
• Randomly select one unlocked element 𝑣 ∈ 𝐶𝑖, if any.
• Move 𝑣 to the cluster such that 𝒄𝒐𝒔𝒕 is maximally decreased.
• Lock 𝑣.
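A Python sketch of this local-search loop (my own rendering: the cost is passed in as a function on partitions, and I add a guard so that a cluster is never emptied, which the slide leaves implicit):

```python
import random

def local_moves(universe, k, cost, seed=0):
    # Randomized local search: repeatedly move a single element to the
    # cluster that maximally decreases the cost, locking it afterwards.
    rng = random.Random(seed)
    order = sorted(universe)
    rng.shuffle(order)
    clusters = [set() for _ in range(k)]
    for i, v in enumerate(order):             # round-robin random initialization
        clusters[i % k].add(v)
    improved = True
    while improved:                           # repeat until converged
        improved = False
        unlocked = set(universe)
        while unlocked:                       # repeat until all elements are locked
            v = rng.choice(sorted(unlocked))  # randomly select an unlocked element
            src = next(c for c in clusters if v in c)
            if len(src) > 1:                  # guard: never empty a cluster
                best, best_cost = src, cost(clusters)
                for dst in clusters:
                    if dst is src:
                        continue
                    src.remove(v); dst.add(v)  # tentatively move v
                    c = cost(clusters)
                    if c < best_cost:
                        best, best_cost = dst, c
                    dst.remove(v); src.add(v)  # undo
                if best is not src:           # apply the best strictly-improving move
                    src.remove(v); best.add(v)
                    improved = True
            unlocked.remove(v)                # lock v
    return clusters
```

Because only strictly cost-decreasing moves are applied, the outer loop terminates: the cost drops at every improving move and there are finitely many partitions.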
Cut-based Clustering
• Find a clustering that minimizes 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
• An algorithm
• Example (𝑘 = 2). Three elements 𝑎, 𝑏, 𝑐 with 𝑠𝑖𝑚(𝑎, 𝑏) = 1, 𝑠𝑖𝑚(𝑎, 𝑐) = 3, 𝑠𝑖𝑚(𝑏, 𝑐) = 2; initial clusters {𝑎} and {𝑏, 𝑐}.
Cost = 0 + (3+1)/2 = 2
• Inner-sim(𝐶𝑖) = ∞ if |𝐶𝑖| = 1
• Or you can do some smoothing by assigning a base similarity.
Cut-based Clustering
• Find a clustering that minimizes 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
• An algorithm
• Example (𝑘 = 2), continued. Current clusters {𝑎} and {𝑏, 𝑐}: cost = 0 + (3+1)/2 = 2
If we move 𝑐 (giving clusters {𝑎, 𝑐} and {𝑏}): cost = (1+2)/3 + 0 = 1
Inner-sim(𝐶𝑖) = ∞ if |𝐶𝑖| = 1
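The arithmetic in this example can be checked mechanically. In the sketch below, the similarity values sim(a, b) = 1, sim(a, c) = 3, sim(b, c) = 2 are reconstructed from the figure (an assumption on my part, but they reproduce all of the costs shown):

```python
import math

# Similarities read off the example figure (assumed values).
S = {("a", "b"): 1, ("a", "c"): 3, ("b", "c"): 2}

def sim(u, v):
    return S.get((u, v), S.get((v, u), 0))

def cost(clusters):
    # cost = sum_i Inter-sim(C_i) / Inner-sim(C_i); Inner-sim = inf for singletons.
    universe = [v for c in clusters for v in c]
    total = 0.0
    for c in clusters:
        inner = sum(sim(u, v) for i, u in enumerate(c) for v in c[i + 1:])
        inter = sum(sim(u, v) for u in c for v in universe if v not in c)
        total += inter / (inner if len(c) > 1 else math.inf)
    return total

print(cost([["a"], ["b", "c"]]))   # 2.0  = 0 + (3+1)/2
print(cost([["a", "c"], ["b"]]))   # 1.0  = (1+2)/3 + 0
print(cost([["a", "b"], ["c"]]))   # 5.0  = (3+2)/1 + 0
```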
Cut-based Clustering
• Find a clustering that minimizes 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
• An algorithm
• Example (𝑘 = 2), continued. Current clusters {𝑎} and {𝑏, 𝑐}: cost = 0 + (3+1)/2 = 2
If we move 𝑐 (giving {𝑎, 𝑐} and {𝑏}): cost = (1+2)/3 + 0 = 1
If we move 𝑏 (giving {𝑎, 𝑏} and {𝑐}): cost = (3+2)/1 + 0 = 5
Inner-sim(𝐶𝑖) = ∞ if |𝐶𝑖| = 1
Cut-based Clustering
• Find a clustering that minimizes 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
• An algorithm
• A heuristic algorithm.
• May not be optimal.
• Is the solution good?
• Reasonable: the cost decreases at every iteration.
• Does it converge?
• Yes.
Cut-based Clustering
• Find a clustering that minimizes 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
• An algorithm
• Initialize 𝐶1, … , 𝐶𝑘 randomly.
• Repeat until converge (converge?)
• Unlock all elements
• Repeat until all elements are locked. (converge?)
• Randomly select one cluster 𝐶𝑖.
• Randomly select one unlocked element 𝑣 ∈ 𝐶𝑖, if any.
• Move 𝑣 to the cluster such that 𝒄𝒐𝒔𝒕 is maximally decreased.
• Lock 𝑣.
Cut-based Clustering
• Find a clustering that minimizes 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
• An algorithm
• Heuristic algorithm.
• May not be optimal.
• Is the solution good?
• Reasonable. Cost is iteratively decreased.
• Does it converge?
• Yes.
• Any other choices?
• Yes
Cut-based Clustering
• Find a clustering that minimizes 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-sim(𝐶𝑖)/Inner-sim(𝐶𝑖)
• An algorithm
• Initialize 𝐶1, … , 𝐶𝑘 randomly.
• Repeat until converge
• Unlock all elements
• Repeat until all elements are locked.
• Randomly select one cluster 𝐶𝑖.
• Randomly select one unlocked element 𝑣 ∈ 𝐶𝑖, if any.
• Move 𝑣 to the cluster such that 𝒄𝒐𝒔𝒕 is maximally decreased.
• Lock 𝑣.
You may instead select the element that, once moved, maximally decreases the cost.
Cut-based Clustering
• Comparison with k-means:
• Both assume the number of clusters is known in advance.
• Both need some initialization.
• Both iteratively improve the solution.
• Cut-based: considers both the inter- and inner-cluster similarity.
• K-means: considers only the inner-cluster similarity.
Agglomerative Clustering
• Idea: combine small clusters.
Agglomerative Clustering
• Idea: combine small clusters.
• Framework:
• Maintain a set of clusters
• Initially, each instance is one cluster
• Repeat
• Merge the two closest clusters
• Until only one cluster remains
• Key: how to define closeness of clusters?
Agglomerative Clustering
• Key: how to define closeness of clusters?
• First, define the closeness of each pair of instances.
• The closeness of two clusters can then be:
• the closest pair (single-link clustering)
• the farthest pair (complete-link clustering, diameter)
• the sum of all pairs? the average of all pairs (average-link)
• Ward's method:
• If you can define the distance within a cluster, merge the pair of clusters that results in the minimum increase in within-cluster distance.
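The framework with the single-, complete-, and average-link choices above can be sketched as follows (an O(n³), clarity-over-speed illustration; Ward's method is omitted, and the names are mine):

```python
def agglomerative(points, dist, linkage="single"):
    # Bottom-up clustering: start with singletons, repeatedly merge the two
    # closest clusters, and record the merge order (a dendrogram as a list).
    clusters = [[p] for p in points]
    merges = []

    def closeness(a, b):
        pair_dists = [dist(x, y) for x in a for y in b]
        if linkage == "single":    # closest pair
            return min(pair_dists)
        if linkage == "complete":  # farthest pair (diameter)
            return max(pair_dists)
        return sum(pair_dists) / len(pair_dists)  # average of all pairs

    while len(clusters) > 1:
        # Find the closest pair of clusters under the chosen linkage.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: closeness(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)] + [merged]
    return merges
```

For 1-D points `[0.0, 1.0, 10.0]` with `dist = lambda a, b: abs(a - b)`, the first merge joins `[0.0]` and `[1.0]` under any of the three linkages.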
Agglomerative Clustering (Hastie)
• The result of agglomerative clustering is a hierarchy of clusters: a dendrogram.
So what if we want 𝑘 clusters? (Cut the dendrogram at the level where 𝑘 branches remain.)
Agglomerative Clustering
• Detect outliers.
Divisive Clustering
• Idea: split a large cluster into two
Divisive Clustering
• Idea: split a large cluster into two
• Framework:
• Maintain a set of clusters
• Initially, all instances form one cluster
• Repeat
• Split one cluster into two
• Until each cluster is a singleton.
• Key: Which cluster should we split? How to split it?
Divisive Clustering (Andrea)
Key: Which cluster should we split? How to split it?
Divisive Clustering
• Idea: split a large cluster into two
• Framework:
• Maintain a set of clusters
• Initially, all instances form one cluster
• Repeat
• Split one cluster into two
• Until each cluster is a singleton.
Which cluster should we split?
• If we grow the entire dendrogram and the splitting rule is local, it does not matter.
• Otherwise, you may select the one with the highest cost.
How to split it? (many choices)
• Equally partition it such that the cost is minimized.
• DIANA.
DIANA:To divide the selected cluster, the algorithm first looks for its most disparate observation (i.e., which has the largest average dissimilarity to the other observations of the selected cluster). This observation initiates the "splinter group". In subsequent steps, the algorithm reassigns observations that are closer to the "splinter group" than to the "old party". The result is a division of the selected cluster into two new clusters.
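A sketch of that splitting procedure in Python (using average dissimilarity as the notion of closeness; the function and variable names are mine):

```python
def diana_split(cluster, dist):
    # DIANA-style split: seed a "splinter group" with the most disparate
    # element, then move over elements closer (on average) to the splinter
    # group than to the remaining "old party".
    def avg(x, group):
        return sum(dist(x, y) for y in group) / len(group) if group else 0.0

    # Most disparate element: largest average dissimilarity to the others.
    seed = max(cluster, key=lambda x: avg(x, [y for y in cluster if y != x]))
    splinter, old = [seed], [x for x in cluster if x != seed]
    moved = True
    while moved and len(old) > 1:
        moved = False
        for x in list(old):
            # Reassign x if it is closer to the splinter group than to the old party.
            if avg(x, splinter) < avg(x, [y for y in old if y != x]):
                old.remove(x)
                splinter.append(x)
                moved = True
    return splinter, old
```

On the 1-D cluster `[0.0, 0.5, 10.0, 10.5]` with absolute-difference dissimilarity, the split recovers the two natural groups `{0.0, 0.5}` and `{10.0, 10.5}`.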
Hierarchical Clustering - Summary
• No need to specify the number of clusters in advance.
• Can be time consuming: time complexity of at least O(𝑛²), where 𝑛 is the total number of objects.
• The hierarchical structure matches intuition in some domains.
• But the interpretation is subjective.
Summary
• K-means
• Cut-based measurements
• Agglomerative clustering
• Divisive clustering
Spring 2019 Amo G. Tong 52
Equal-sized k-clustering
Cut-based k-clustering: 𝐜𝐨𝐬𝐭 = Σ𝑖 Inter-cost(𝐶𝑖)/Inner-cost(𝐶𝑖)
• Initialize 𝐶1, …, 𝐶𝑘 randomly.
• Repeat until converge (converge?)
• Unlock all elements.
• Repeat until all elements are locked. (converge?)
• Randomly select one cluster 𝐶𝑖.
• Randomly select one unlocked element 𝑣 ∈ 𝐶𝑖, if any.
• Move 𝑣 to the cluster such that 𝒄𝒐𝒔𝒕 is maximally decreased.
• Lock 𝑣.
Given a set of 𝑘 ∗ 𝑚 elements, we want an equal-sized k-clustering; that is, each cluster has exactly 𝑚 elements.
Please describe a cut-based algorithm for this purpose.
Hint: how can we take the sizes of the clusters into account?