cluster1 291013
1
Cluster Analysis
What is cluster analysis?
Why use cluster analysis?
Nomenclature
Strengths and weaknesses
5
What cluster analysis does
• Identifies subgroups in a population based on some set of characteristics
6
Some near-synonyms
• Cluster analysis
• Unsupervised learning
• Unsupervised classification
• Automatic classification
• Numerical taxonomy
• Typological analysis
7
Types of cluster analysis
• Connectivity models: Hierarchical clustering
• Centroid models: K-means algorithm
• Distribution models: Latent Class Models
• Density models
• Subspace models
• Graph-based models
8
Hierarchical
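The slide shows hierarchical clustering only as a picture. As a rough sketch of the idea, the code below does bottom-up (agglomerative) clustering with single linkage: every point starts as its own cluster and the two closest clusters are merged until the requested number remains. The function name `single_linkage`, the choice of single linkage, and the toy points are assumptions made for illustration; they are not taken from the slides.

```python
from math import dist

def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return [sorted(c) for c in clusters]

# Made-up 2-D points: two tight groups and one isolated point.
toy_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0)]
hier_result = single_linkage(toy_points, 3)
print(hier_result)
```

Stopping at a different `n_clusters` corresponds to cutting the same merge tree at a different height.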
9
10
K Means
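A minimal sketch of the K-means idea, assuming the standard Lloyd iteration: assign each point to its nearest centroid, move each centroid to the mean of its members, and repeat until the assignments stop changing. The function name `k_means`, the fixed starting centroids, and the toy points are illustrative assumptions.

```python
from math import dist

def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = list(centroids)  # work on a copy of the starting centroids
    labels = None
    for _ in range(max_iter):
        # assignment step: each point goes to its nearest centroid
        new_labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # update step: each centroid moves to the mean of its members
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Made-up points; the first and fourth are used as starting centroids.
km_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
km_labels, km_centroids = k_means(km_points, [km_points[0], km_points[3]])
print(km_labels, km_centroids)
```

In practice the result depends on the starting centroids, so several random starts are usually compared.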
11
Finite mixture model
12
Density
13
Model based cluster analysis
• Distributional
• Latent class
• Latent profile
• Finite mixture model
• Gaussian mixture model
14
Hard vs. Fuzzy Clustering
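To illustrate the contrast, the sketch below computes graded (fuzzy) memberships for a few points and then hardens them by taking the largest membership. It assumes the fuzzy c-means membership formula with fuzzifier `m = 2`; the function name, the cluster centers, and the points are made up for illustration, and the centers are assumed to be distinct.

```python
from math import dist

def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means style membership of one point in each cluster (sums to 1)."""
    d = [dist(point, c) for c in centers]
    if 0.0 in d:  # point sits exactly on a center: full membership there
        return [1.0 if x == 0.0 else 0.0 for x in d]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d[i] / d[j]) ** power for j in range(len(d)))
            for i in range(len(d))]

# Hypothetical cluster centers and points on a line.
centers = [(0.0, 0.0), (10.0, 0.0)]
points = [(1.0, 0.0), (4.0, 0.0), (9.0, 0.0)]

# Fuzzy clustering keeps the full membership vector for each point.
fuzzy = [fuzzy_memberships(p, centers) for p in points]
# Hard clustering keeps only the cluster with the largest membership.
hard = [max(range(len(u)), key=u.__getitem__) for u in fuzzy]

for p, u, h in zip(points, fuzzy, hard):
    print(p, [round(x, 3) for x in u], "-> hard label", h)
```

The middle point shows what hard assignment discards: it is labeled the same as the first point even though its membership is much more evenly split.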
15
Strengths
• Empirical (data-driven)
• Handles complexity better than a linear model
• Better suited for some theoretical tasks
16
Limitations
• Subjective
• Largely data-driven (empirical)
• Computationally intensive
• Variable selection
• Model selection (how many clusters?)
• Limited options for validation
17
RESULTS
18
19
SIMPLE STUNTING MODELS/ss
These models only include binary measures of stunting at each of 4 time points.
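The slides do not say which software or estimator produced these classes. As a sketch only, assuming a latent class model for four binary indicators fitted by EM, the code below shows the general mechanics. The function names, starting values, and response patterns are all made up for illustration; none of this is the study data.

```python
from math import prod

def clamp(p, eps=1e-6):
    """Keep a probability away from exactly 0 or 1 so no pattern gets zero likelihood."""
    return min(max(p, eps), 1.0 - eps)

def lca_em(data, class_weights, item_probs, n_iter=200):
    """EM for a latent class model with binary items.

    class_weights[c] : prior probability of class c
    item_probs[c][j] : P(item j == 1 | class c)
    """
    class_weights = list(class_weights)
    item_probs = [list(row) for row in item_probs]
    n_classes, n_items = len(class_weights), len(data[0])
    post = []
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every response pattern
        post = []
        for row in data:
            joint = [
                class_weights[c] * prod(
                    item_probs[c][j] if row[j] else 1.0 - item_probs[c][j]
                    for j in range(n_items))
                for c in range(n_classes)
            ]
            total = sum(joint)
            post.append([x / total for x in joint])
        # M-step: re-estimate class sizes and item probabilities from the posteriors
        for c in range(n_classes):
            size = sum(p[c] for p in post)
            class_weights[c] = size / len(data)
            item_probs[c] = [
                clamp(sum(p[c] * row[j] for p, row in zip(post, data)) / size)
                for j in range(n_items)
            ]
    return class_weights, item_probs, post  # post comes from the last E-step

# Toy patterns (1 = stunted, 0 = not stunted) at four time points. NOT study data.
toy_data = (
    [(0, 0, 0, 0)] * 6 + [(1, 0, 0, 0)] * 2 + [(0, 1, 0, 0)] * 1 +
    [(1, 1, 1, 1)] * 5 + [(1, 1, 1, 0)] * 2 + [(0, 1, 1, 1)] * 1
)
weights, probs, post = lca_em(toy_data, [0.5, 0.5], [[0.2] * 4, [0.8] * 4])
modal = [max(range(2), key=row.__getitem__) for row in post]
print([round(w, 3) for w in weights])
print([[round(p, 3) for p in row] for row in probs])
```

Each class is described by its item probabilities, which is what makes a class interpretable (or nonsensical) when inspected.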
20
The prevalence of stunting at each age.
21
22
These make the most sense. They also suggest that stunting at 24 and 249 will tell most of the story.
23
These make sense, with group 3 as a “leftovers” group.
24
Nonsensical group
25
Nonsensical groups and/or trivial differences.
26
Nonsensical groups and/or trivial differences.
27
Nonsensical groups and/or trivial differences.
28
Nonsensical groups and/or trivial differences.
29
Dramatic improvement in model fit from the 2-class to the 3-class model. Minima for all three fit measures found in the 4-class model.
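The slides do not name the three fit measures. Assuming they are AIC, BIC, and the sample-size-adjusted BIC, a comparison across class counts could be sketched as follows. The log-likelihoods, the sample size, and the number of items are hypothetical placeholders, not results from this analysis.

```python
from math import log

def n_params(n_classes, n_items):
    """Free parameters in a latent class model with binary items."""
    return (n_classes - 1) + n_classes * n_items

def fit_measures(loglik, k, n):
    """Information criteria from a log-likelihood, k parameters, and n cases."""
    return {
        "AIC": -2 * loglik + 2 * k,
        "BIC": -2 * loglik + k * log(n),
        "aBIC": -2 * loglik + k * log((n + 2) / 24),  # sample-size-adjusted BIC
    }

# Hypothetical values for illustration only.
n_cases, n_items = 500, 8
logliks = {2: -1200.0, 3: -1050.0, 4: -980.0, 5: -978.0, 6: -977.0}

table = {c: fit_measures(ll, n_params(c, n_items), n_cases)
         for c, ll in logliks.items()}
# For each criterion, the class count with the smallest value fits best.
best = {name: min(table, key=lambda c: table[c][name])
        for name in ("AIC", "BIC", "aBIC")}
print(best)
```

Lower values are better for all three criteria; they differ only in how heavily extra parameters are penalized, so they need not agree on the same class count.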
30
Dramatic improvement in model fit from the 2-class to the 3-class model. Best fit for the 4-class model.
31
Best in the 2-class solution. Marginal differences among the 3- to 6-class solutions.
32
Models with more than 5 classes have classes with no modal assignments. The 2- and 3-class models look good; the 4-class model slightly less so.
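A class has no modal assignments when no individual has it as their most likely class. As a small sketch, assuming each individual has a vector of posterior class probabilities, the check can be written as below. The function name and the posterior values are made up for illustration.

```python
from collections import Counter

def modal_counts(posteriors):
    """Count how many individuals have each class as their most likely class."""
    n_classes = len(posteriors[0])
    modal = [max(range(n_classes), key=row.__getitem__) for row in posteriors]
    counts = Counter(modal)
    return [counts.get(c, 0) for c in range(n_classes)]

# Hypothetical posterior class probabilities for four individuals, three classes.
toy_posteriors = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.10, 0.50, 0.40],
    [0.20, 0.45, 0.35],
]
counts = modal_counts(toy_posteriors)
empty_classes = [c for c, n in enumerate(counts) if n == 0]
print(counts, empty_classes)
```

Here the third class receives some probability from every individual but is never the most likely class for anyone, which is the pattern the note flags.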
33
34
35
SIMPLE BODYSIZE MODELS/bs
These models only include IOTF measures of bodysize at each of 4 time points.
36
37
38
39
40
41
42
43
44
45
Dramatic improvement in model fit from the 2-class to the 3-class model. Minima for all three fit measures found in the 6-class model.
46
47
Not great overall. Best for 5 and 6.
48
49
50
51