Bayesian Complementary Clustering, MCMC and Anglo-Saxon placenames Giacomo Zanella [email protected] Department of Statistics University of Warwick, Coventry, UK 14 March 2014



Introduction Problem considered The model MCMC MCMC for 2D kD Real Data

Overview

1. Motivation: historical problem.

2. Modeling part: literature, our approach, model definition.

3. Computational part: MCMC heuristics.

4. Real data analysis.

5. Related problems and future steps.

A classic problem: Cluster Analysis

Aim: organizing objects into groups whose members are “similar”.

Geometrical interpretation: separate points into clusters made of close points.

Figure : A point pattern being divided into two clusters.

Giacomo Zanella (University of Warwick) Bayesian Complementary Clustering, MCMC and Anglo-Saxon placenames 14/03/2014 1 / 60

Popular approaches to Cluster Analysis

Deterministic approach

• K-means clustering,

• Hierarchical clustering,

• ...

Probabilistic approach (model-based clustering)

• Mixture of Gaussians,

• Bayesian cluster models (BCM):
  • Inferences on centers;
  • Inferences on intensity measure;
  • Inferences on cluster partition;

• ...

Figure : k-means clustering.

Figure : Model-based clustering.

Bayesian Cluster Models

Definition (Cluster point process). A cluster process is the superposition $\bigcup_{z \in \mathbf{z}} x_z$ of a collection of independent daughter point processes $x_z$, indexed by the points of a center point process $\mathbf{z}$.

Figure : Centre process z. Figure : Cluster process x.

Observed points $x = \{x_1, \dots, x_{n(x)}\}$

Unobserved centers $\mathbf{z} = \{z_1, \dots, z_{N(z)}\}$ and partition $\rho = \{C_1, \dots, C_{N(\rho)}\}$

Bayesian Cluster models: Example 1

Kang, Jian, et al. "Meta analysis of functional neuroimaging data via Bayesian spatial point processes." Journal of the American Statistical Association 106.493 (2011): 124-134.

Figure : Application of a Bayesian Cluster Model in functional neuroimaging: inferences on activation centers given peak activation locations.

Model-based Clustering: Example 2

Hill, B.J., Kendall, W.S. & Thonnes, E., 2012. Fibre-generated Point Processes and Fields of Orientations. Annals of Applied Statistics, 6(3), pp.994-1020.

Figure : Fibre clustering. Application to fingerprints.

Our clustering problem: original motivation

Problem posed by John Blair (History Professor from Oxford).

Figure : Reconstruction of an Anglo-Saxon settlement in West-Stow, Suffolk. (Image borrowed from John Blair)

Empirical Observations - 1

Stretton, Newton, Burton, Carlton in the region of Gt. Glen

Empirical Observations - 2

Stratton, Charlton, Kingston, Burton in the region of Dorchester

Structure of administrative clusters

Need for cluster analysis

The data as a marked point process

Historical Questions - 1

Is there statistical support for the “administrative clusters hypothesis” coming from the geographical locations of settlements?

Historical Questions - 2

What is the typical intra-cluster dispersion σ?

Historical Questions - 3

Which portion of the settlements are clustered together?

Which placenames tend to cluster together?

Can we provide a list of clusters more strongly supported by the analysis?

Model requirements

Marked Cluster Point Process: each point is assigned a mark (color), in this case placenames.

Complementary clustering: two points of the same color are not admitted in the same cluster.

Inferences on the cluster partition:

π(ρ|x) ∝ π(ρ)π(x|ρ)

$x = \{x_1, \dots, x_n\}$ ↔ Observed points
$\rho = \{C_1, \dots, C_N\}$ ↔ Unobserved partition

BCM with inferences on the partitions

We are given n observed points x = {x1, ..., xn}.

We denote a partition of x by ρ = {C1, ...,CN(ρ)}.

Random Partition Model (RPM)

1. Define a prior distribution π(ρ) for the partition of n elements into clusters (π must be exchangeable with respect to both cluster and point labels).

2. Define the distribution of x|ρ (usually by defining the distribution $h_s(\cdot)$ of a cluster with $s$ elements and supposing that the clusters are independent).

3. Perform Bayesian inferences on the partition:

$$\pi(\rho \mid x) \propto \pi(\rho)\,\pi(x \mid \rho) = \pi(\rho) \prod_{j=1}^{N(\rho)} h_{|C_j|}(x_{C_j})$$

Data Generation Model for x|ρ

Cluster centers: z ∼ Inhomogeneous Poisson Point Process

Data Generation Model for x|ρ

Locations: i.i.d. Gaussians, conditioned on the cluster centers being their means
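The two generative steps (centres, then locations) can be sketched as a small simulation. This is only an illustration under stated assumptions: the centre process is taken to be homogeneous on the unit square (the slides specify an inhomogeneous one), each cluster gets a uniformly chosen size between 1 and k with distinct marks, and the function names and parameter values are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_cluster_process(intensity=20.0, sigma=0.02, k=4, seed=0):
    """Simulate centres z and marked daughter points x on the unit square.

    Assumptions (not from the slides): homogeneous centre intensity, and a
    cluster size drawn uniformly from 1..k with distinct marks per cluster.
    """
    rng = random.Random(seed)
    n_centres = poisson(rng, intensity)
    centres = [(rng.random(), rng.random()) for _ in range(n_centres)]
    points = []  # each entry: (x, y, mark, index of parent centre)
    for c, (zx, zy) in enumerate(centres):
        size = rng.randint(1, k)
        # Distinct marks within a cluster: the complementary constraint.
        marks = rng.sample(range(k), size)
        for mark in marks:
            # Gaussian displacement around the centre, std sigma per axis.
            points.append((rng.gauss(zx, sigma), rng.gauss(zy, sigma), mark, c))
    return centres, points


centres, points = simulate_cluster_process()
```

In the inference problem only the locations and marks are observed; the parent index stored in each tuple is exactly the unobserved partition ρ.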

Prior on the partition π(ρ)

Main idea: pass from a partition of n points into many little clusters to a partition of n points into k big clusters.

Exchangeable prior on partitions
$\rho = \{C_1, \dots, C_{N(\rho)}\}$ partition of $\{1, \dots, n\}$.
$N_l(\rho) := \#\{C_j : |C_j| = l\}$, $l = 1, \dots, n$.

π(ρ) exchangeable over cluster and point indices
⇒ π(ρ) depends only on $N_1(\rho), \dots, N_n(\rho)$ (e.g. Dirichlet process)

Complementary clustering with k types
$N_{k+1}(\rho) = \dots = N_n(\rho) = 0$

⇒ π(ρ) depends only on $N_1(\rho), \dots, N_k(\rho)$

$Y_l(\rho) := l\,N_l(\rho)$, the number of points in clusters of size $l$

⇒ $\sum_{l=1}^{k} Y_l(\rho) = n$ and $Y_l \equiv 0 \pmod{l}$

Prior on the partition π(ρ)

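ρ partition ↔ $(Y_1(\rho), \dots, Y_k(\rho))$

Distribution on the multivariate vector

$$\Pr(Y_1 = y_1, \dots, Y_k = y_k) \propto \begin{cases} \dfrac{n(x)!}{y_1! \cdots y_k!}\, p_1^{y_1} \cdots p_k^{y_k} & \text{if } \sum_{l=1}^{k} y_l = n(x) \text{ and } y_l \equiv 0 \pmod{l}, \\ 0 & \text{otherwise,} \end{cases}$$

with $p = (p_1, \dots, p_k) \sim \mathrm{Dir}(1, \dots, 1)$.

Distribution on the partition

$$\pi(\rho) \propto \frac{1}{\eta(\rho)}\, \frac{n(x)!}{Y_1(\rho)! \cdots Y_k(\rho)!}\, p_1^{Y_1(\rho)} \cdots p_k^{Y_k(\rho)},$$

where $\eta(\rho) = \#\{\tilde{\rho} \mid Y_1(\tilde{\rho}) = Y_1(\rho), \dots, Y_k(\tilde{\rho}) = Y_k(\rho)\}$.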
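As a small illustration of the multinomial-type weight on the size vector, the following sketch evaluates its logarithm for a given p. The function name and argument layout are hypothetical, the result is unnormalized (the slide only gives the distribution up to proportionality), and p is assumed to have strictly positive entries.

```python
import math


def log_size_vector_weight(y, p, n):
    """Unnormalized log Pr(Y_1 = y_1, ..., Y_k = y_k) given p.

    y[l-1] is the number of points lying in clusters of size l, so it must
    be a non-negative multiple of l, and the entries must sum to n.
    Returns -inf when these constraints fail (weight zero).
    """
    if sum(y) != n:
        return -math.inf
    if any(y_l < 0 or y_l % l != 0 for l, y_l in enumerate(y, start=1)):
        return -math.inf
    log_w = math.lgamma(n + 1)  # log n!
    for y_l, p_l in zip(y, p):
        log_w -= math.lgamma(y_l + 1)  # minus log y_l!
        if y_l > 0:
            log_w += y_l * math.log(p_l)
    return log_w


# Four points, k = 2: two singletons and one pair.
w = math.exp(log_size_vector_weight([2, 2], [0.5, 0.5], 4))
```

Working on the log scale with `math.lgamma` avoids overflow in the factorials when n is in the hundreds, as it would be for a placename dataset.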

Posterior distribution

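Random objects

p: probability vector of cluster sizes
ρ: partition into clusters
σ: intra-cluster dispersion
x: observed point process

$$\pi(\rho, \sigma, p \mid x) \propto \pi(p)\,\pi(\sigma)\,\pi(\rho \mid p)\,\pi(x \mid \rho, \sigma) \propto \pi(\sigma)\,\pi(\rho \mid p) \prod_{j=1}^{N(\rho)} h_{|C_j|}(x_{C_j})$$

$$\propto \pi(\sigma) \prod_{l=1}^{k} \left( \frac{N_l!}{(l N_l)!} \right) \prod_{j=1}^{N(\rho)} \left( c_{|C_j|} \exp\left( -\frac{\sum_{i \in C_j} (x_i - \bar{x}_{C_j})^2}{2\sigma^2} \right) \right),$$

where $c_s = \binom{k}{s}^{-1} \dfrac{p_s^s}{s\,W\,(2\pi\sigma^2)^{s-1}}$ and $\bar{x}_{C_j}$ is the mean of the points in $C_j$.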
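The σ-dependent part of the constant c_s can be checked with a short computation. Assuming planar points, a centre distributed uniformly over a window of area W, and isotropic Gaussian displacements with variance σ² per coordinate (and ignoring edge effects), integrating the centre out gives $h_s(x_C) = \exp\big(-\sum_{i \in C} \lVert x_i - \bar{x}_C \rVert^2 / (2\sigma^2)\big) / \big(s\,W\,(2\pi\sigma^2)^{s-1}\big)$. The sketch below evaluates its logarithm; the function name is hypothetical and the combinatorial factors of c_s are deliberately left out.

```python
import math


def log_cluster_density(points, sigma, window_area):
    """log h_s(x_C) for one cluster of s planar points.

    Assumes a uniform centre on a window of the given area and isotropic
    Gaussian displacements; edge effects are ignored.
    """
    s = len(points)
    mean_x = sum(px for px, _ in points) / s
    mean_y = sum(py for _, py in points) / s
    # Sum of squared distances to the cluster mean.
    sq_dev = sum((px - mean_x) ** 2 + (py - mean_y) ** 2 for px, py in points)
    log_norm = math.log(s * window_area) + (s - 1) * math.log(2 * math.pi * sigma ** 2)
    return -sq_dev / (2 * sigma ** 2) - log_norm


singleton = log_cluster_density([(0.3, 0.7)], sigma=0.1, window_area=2.0)
pair = log_cluster_density([(0.0, 0.0), (1.0, 0.0)], sigma=1.0, window_area=1.0)
```

For a singleton the density reduces to 1/W, so `singleton` equals `-log(2)` here; this is a quick sanity check that the normalization is consistent.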

Posterior distribution

Observed point process x is the superposition of all the clusters.

Posterior distribution

We are interested in the posterior distribution of ρ: π(ρ|x)

Intractability of π (ρ|x)

π(ρ|x) is a probability measure on $P_n$, the set of partitions of $\{1, \dots, n\}$.
We are interested in E[f(ρ)] for ρ ∼ π(·|x).

Problems

• π(·|x) is known only up to a normalizing constant.

• Normalizing π(·|x) or sampling exactly from π(·|x) is extremely inefficient (the cardinality of $P_n$ is the Bell number $B_n$, which grows super-exponentially in $n$).

Finding $\rho_{\max} = \arg\max_{\rho} \pi(\rho \mid x)$?

Sampling exactly from π(·|x) is infeasible. Can we at least find the maximum?

Find $\rho_{\max}$ ↔ Optimal Assignment Problem

2D Optimal Assignment Problem
Solvable in polynomial time, $O(n^3)$, with the Hungarian algorithm (Optimal Transportation Theory).

kD Optimal Assignment Problem (k ≥ 3)
NP-hard optimization problem. Not approximable in polynomial time with any deterministic algorithm (not in APX).

Figure : Optimal assignment with 50 red and 50 blue points

Feasible approach: Markov chain Monte Carlo

MCMC approach
Simulate an ergodic Markov chain $(X_n)_{n \ge 0}$ with stationary distribution π.
Estimate $I = E_\pi[f(X)]$ with

$$I_n := \frac{1}{n} \sum_{k=t}^{t+n} f(X_k).$$

The SLLN for Markov chains gives $I_n \to I$ a.s. under mild conditions.

Question: How to design a Markov chain whose stationary distribution is π? Is it easier than sampling or normalizing π?

Reversibility and Stationarity

Reversibility condition
A transition kernel P is reversible with respect to π if, for all $(x, y) \in \mathcal{X} \times \mathcal{X}$,

$$\pi(x)\,P(x, y) = \pi(y)\,P(y, x). \qquad (1)$$

(1) implies πP = π, i.e. π is the stationary distribution of $(X_n)_{n \ge 0}$ driven by P.
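The implication follows by summing (1) over x: $\sum_x \pi(x) P(x, y) = \pi(y) \sum_x P(y, x) = \pi(y)$. A minimal numerical check on a hypothetical three-state chain (the matrix below is made up for illustration and has nothing to do with the placename model):

```python
def is_reversible(pi, P, tol=1e-12):
    """Check the detailed balance condition pi(x) P(x, y) = pi(y) P(y, x)."""
    n = len(pi)
    return all(
        abs(pi[x] * P[x][y] - pi[y] * P[y][x]) <= tol
        for x in range(n)
        for y in range(n)
    )


def apply_kernel(pi, P):
    """Return the row vector pi P."""
    n = len(pi)
    return [sum(pi[x] * P[x][y] for x in range(n)) for y in range(n)]


pi = [0.2, 0.3, 0.5]
P = [
    [0.40, 0.30, 0.30],
    [0.20, 0.30, 0.50],
    [0.12, 0.30, 0.58],
]

reversible = is_reversible(pi, P)
pi_next = apply_kernel(pi, P)  # agrees with pi up to rounding
```

Reversibility is sufficient but not necessary for stationarity; it is convenient because it is a local condition that can be verified pair by pair.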

Figure : “Probability flow” between x and y starting from π

Metropolis-Hastings algorithm idea: Given a transition kernel Q(x, y), make it reversible with respect to π by suppressing some moves with the correct probabilities.

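Metropolis-Hastings algorithm

Metropolis-Hastings Algorithm
Obtain $X_{n+1}$ from $X_n$ by

1. Sample the proposed move $X \sim Q(X_n, \cdot)$
2. Compute the acceptance probability

$$\alpha(X_n, X) = \min\left\{ 1, \frac{f(X)\,Q(X, X_n)}{f(X_n)\,Q(X_n, X)} \right\}$$

3. With probability $\alpha(X_n, X)$ set $X_{n+1} = X$, otherwise set $X_{n+1} = X_n$

Here f denotes the density of the target π, which needs to be known only up to a normalizing constant.

Very general and flexible, but performance depends heavily on the choice of Q.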
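The three steps translate almost line by line into code. The sketch below is a generic finite-state version run on a made-up target over a five-point cycle; the function signature, the toy weights and the random-walk proposal are all assumptions chosen for illustration.

```python
import random


def metropolis_hastings(f, propose, q, x0, n_steps, seed=0):
    """Generic Metropolis-Hastings chain.

    f(x): unnormalized target density; propose(x, rng): draw from Q(x, .);
    q(x, y): proposal probability Q(x, y).
    """
    rng = random.Random(seed)
    x, chain = x0, []
    for _ in range(n_steps):
        y = propose(x, rng)                                   # step 1
        alpha = min(1.0, f(y) * q(y, x) / (f(x) * q(x, y)))   # step 2
        if rng.random() < alpha:                              # step 3
            x = y
        chain.append(x)
    return chain


# Toy target on {0, ..., 4}, nearest-neighbour random walk on a cycle.
weights = [1.0, 2.0, 3.0, 2.0, 2.0]


def f(x):
    return weights[x]


def propose(x, rng):
    return (x + rng.choice([-1, 1])) % 5


def q(x, y):
    return 0.5  # each of the two neighbours is proposed with probability 1/2


chain = metropolis_hastings(f, propose, q, x0=0, n_steps=100000)
freq = [chain.count(s) / len(chain) for s in range(5)]
```

The empirical frequencies in `freq` are ergodic averages of indicator functions, so they should approach the normalized weights as the chain gets longer; only ratios of f are ever used, which is why the normalizing constant is not needed.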

2D case (i.e. 2 colors)

Why start from 2D?

• Simpler: use it to design a good proposal Q for the MH algorithm (understand what can go wrong and how to solve it).

• $\rho_{\max} = \arg\max_{\rho \in P_n} \pi(\rho \mid x)$ is available: a reliable check for the MCMC.

• It allows us to explore pairwise interactions among placenames in the dataset.

2D case (i.e. 2 colors)

Posterior Sample Space for 2D
Partial matchings contained in a complete bipartite graph.

partition $\rho = \{\{1\}, \{2, 6\}, \{3\}, \{4, 7\}, \{5\}\}$ ↔ matching $X(\rho)$

Proposal distribution (I)

Proposal distribution $Q(X_{old}, X_{new})$

1. Pick a red point i and a blue point j uniformly at random.

2. Propose the corresponding move (add/remove/switch).
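One possible reading of the add/remove/switch rule is sketched below. The slides do not spell the rule out, so the case analysis is an assumption: remove the edge if (i, j) is already matched, add it if both points are free, switch the partner if exactly one of them is matched, and leave the state unchanged if both are matched elsewhere. The dictionary representation and the function name are hypothetical; the labels follow the earlier example, assuming points 1 to 5 are red and points 6 and 7 are blue.

```python
def propose_move(matching, i, j):
    """Return the matching obtained after picking red i and blue j.

    matching maps each matched red point to its blue partner.
    The input dictionary is not modified.
    """
    new = dict(matching)
    blue_to_red = {b: r for r, b in matching.items()}
    i_matched = i in matching
    j_matched = j in blue_to_red
    if i_matched and matching[i] == j:
        del new[i]                    # remove the existing edge (i, j)
    elif not i_matched and not j_matched:
        new[i] = j                    # add the edge (i, j)
    elif i_matched and not j_matched:
        new[i] = j                    # switch: i leaves its partner for j
    elif j_matched and not i_matched:
        del new[blue_to_red[j]]       # switch: j leaves its partner for i
        new[i] = j
    # Both matched to other partners: no move in this sketch.
    return new


# Matching for the partition {{1}, {2, 6}, {3}, {4, 7}, {5}}.
current = {2: 6, 4: 7}
after_remove = propose_move(current, 2, 6)
after_switch = propose_move({2: 6}, 3, 6)
```

Each of these moves changes at most one cluster pair, so the moves define the edges of the graph on which the chain walks.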

Proposal distribution (II)

vertices ↔ states of the MC

edges ↔ moves allowed

Proposal Distribution

$X_{new} \sim \mathrm{Unif}(N(X_{old}))$

where $N(X)$ is the set of states connected to $X$.

Question
Do we have good mixing?

Figure : Markov Chain represented by a graph

Uniform proposal leads to bad mixing

Low acceptance rate: $Q(X_{old}, X_{new})$ often proposes to link two far-away points.

Possible solution: Change Q to take into account the geometry of the problem.

Instead of picking a red point i and a blue point j uniformly at random, choose them according to some q(i, j).

$$Q(X_{old}, X_{new}) \propto \begin{cases} q(i, j) & \text{if } X_{old} \text{ goes to } X_{new} \text{ by choosing } \{i, j\}, \\ 0 & \text{if } X_{new} \notin N(X_{old}). \end{cases}$$

Question
What is the optimal choice of q(i, j)?

Different proposal distributions

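Three proposals

1. $q(i, j) \propto 1$;

2. $q(i, j) \propto \pi(X_{new})$;

3. $q(i, j) \propto \sqrt{\pi(X_{new})}$,

where $X_{new}$ is the proposed state.

Intuitive idea
Given the set of allowed moves, good mixing will be obtained when

$$\frac{Q(X_{old}, X_{new})}{Q(X_{new}, X_{old})} \approx \frac{\pi(X_{new})}{\pi(X_{old})}.$$

In general,

$$\frac{Q(X_{old}, X_{new})}{Q(X_{new}, X_{old})} = \frac{q_{X_{old}}(i, j)}{q_{X_{new}}(i, j)}.$$

For proposal 2:

$$\frac{q_{X_{old}}(i, j)}{q_{X_{new}}(i, j)} = \frac{\pi(X_{new}) / \pi(N(X_{old}))}{\pi(X_{old}) / \pi(N(X_{new}))} = \frac{\pi(X_{new})}{\pi(X_{old})} \cdot \frac{\pi(N(X_{new}))}{\pi(N(X_{old}))} \approx \frac{\pi(X_{new})}{\pi(X_{old})} \cdot \frac{\pi(X_{new})}{\pi(X_{old})} = \left( \frac{\pi(X_{new})}{\pi(X_{old})} \right)^2.$$

For proposal 3:

$$\frac{q_{X_{old}}(i, j)}{q_{X_{new}}(i, j)} = \frac{\sqrt{\pi(X_{new})} / \pi(N(X_{old}))}{\sqrt{\pi(X_{old})} / \pi(N(X_{new}))} = \frac{\sqrt{\pi(X_{new})}}{\sqrt{\pi(X_{old})}} \cdot \frac{\pi(N(X_{new}))}{\pi(N(X_{old}))} \approx \frac{\sqrt{\pi(X_{new})}}{\sqrt{\pi(X_{old})}} \cdot \frac{\sqrt{\pi(X_{new})}}{\sqrt{\pi(X_{old})}} = \frac{\pi(X_{new})}{\pi(X_{old})}.$$

Here π(N(X)) stands for the normalizing constant of the proposal out of X, i.e. the sum of the unnormalized proposal weights over the neighbouring states N(X).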
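The three weightings can be compared numerically on a small example. The sketch below builds the proposal $Q(x, y) = g(\pi(y)) / \sum_{z \in N(x)} g(\pi(z))$ for g constant, identity and square root, and computes the stationary average of the Metropolis-Hastings acceptance probability. The target (a made-up distribution on a six-state cycle), the function names and the choice of summary statistic are all assumptions for illustration, not the talk's experiment.

```python
import math


def informed_proposal(pi, neighbours, g):
    """Q(x, y) = g(pi[y]) / sum of g(pi[z]) over z in N(x), for y in N(x)."""
    Q = {}
    for x, nbrs in neighbours.items():
        total = sum(g(pi[z]) for z in nbrs)
        Q[x] = {y: g(pi[y]) / total for y in nbrs}
    return Q


def acceptance(pi, Q, x, y):
    """Metropolis-Hastings acceptance probability for the move x -> y."""
    return min(1.0, pi[y] * Q[y][x] / (pi[x] * Q[x][y]))


def mean_acceptance(pi, Q):
    """Average acceptance probability when x is drawn from pi."""
    return sum(
        pi[x] * qxy * acceptance(pi, Q, x, y)
        for x, row in Q.items()
        for y, qxy in row.items()
    )


# Hypothetical target on a cycle of six states (symmetric neighbourhoods).
pi = {0: 0.02, 1: 0.08, 2: 0.4, 3: 0.4, 4: 0.08, 5: 0.02}
neighbours = {x: [(x - 1) % 6, (x + 1) % 6] for x in pi}

schemes = [("uniform", lambda w: 1.0), ("pi", lambda w: w), ("sqrt_pi", math.sqrt)]
proposals = {name: informed_proposal(pi, neighbours, g) for name, g in schemes}
rates = {name: mean_acceptance(pi, Q) for name, Q in proposals.items()}
```

All three proposals give valid chains once the acceptance step is applied; they differ only in how often moves are accepted and how quickly the chain explores the state space, which is what the heuristic above is about.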

Compare performances of the three proposals (I)

Figure : One panel per proposal: 1) q(i, j) ∝ 1; 2) q(i, j) ∝ π(X_new); 3) q(i, j) ∝ √π(X_new).


Compare performances of the three proposals (II)

Figure : One panel per proposal: 1) q(i, j) ∝ 1; 2) q(i, j) ∝ π(X_new); 3) q(i, j) ∝ √π(X_new).


Multiple proposal scheme


Convergence Diagnostic (I)

Figure : The summary statistic in (a) and (b) is the number of different edges from a fixed matching. In (c) the intensity of gray represents the percentage of time the link has been present in the MCMC.


Convergence Diagnostic (II)

Figure : Left: G&R diagnostic for a 10-dimensional summary of 5 independent runs of the MCMC. Right: average of the value of D obtained by comparing 5 independent runs.

Let $p_{ij} = P_\pi[\{i, j\} \in X]$ and let $p^{(1)}_{ij}$, $p^{(2)}_{ij}$ be the values estimated by two independent MCMC runs. Then

\[
D_{12} := \sup_{\{i,j\} \in E} \left| p^{(1)}_{ij} - p^{(2)}_{ij} \right|
\]

can be used as a measure of proximity.
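A minimal sketch of how D could be computed from two runs, assuming each run is stored as a list of sampled matchings and each matching as a set of (i, j) tuples. The storage format and the names `link_frequencies` and `sup_distance` are assumptions for illustration, not the author's code.

```python
from collections import Counter

def link_frequencies(samples):
    """Estimate p_ij as the fraction of sampled matchings containing link (i, j)."""
    counts = Counter(edge for matching in samples for edge in matching)
    n = len(samples)
    return {edge: c / n for edge, c in counts.items()}

def sup_distance(run_a, run_b):
    """Largest absolute difference between the link frequencies of two runs.

    Links that appear in neither run have estimated frequency 0 in both,
    so taking the maximum over the observed links equals the sup over E.
    """
    p_a = link_frequencies(run_a)
    p_b = link_frequencies(run_b)
    edges = set(p_a) | set(p_b)
    return max((abs(p_a.get(e, 0.0) - p_b.get(e, 0.0)) for e in edges), default=0.0)

# Two tiny hypothetical runs of four sampled matchings each.
RUN_1 = [{(0, 1)}, {(0, 1)}, {(0, 2)}, set()]
RUN_2 = [{(0, 1)}, {(0, 2)}, {(0, 2)}, {(0, 2)}]
D_12 = sup_distance(RUN_1, RUN_2)
```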


Convergence Diagnostic (III)

Single/multiple    q(i, j) ∝     ESS for       ESS for    Time for      Time for
proposal scheme                  10^4 steps    10 sec     GR < 1.002    D < 0.05

Single             √π(X_new)     83.46         64.65      20 · 10^4     28 · 10^4
Single             π(X_new)      30.19         23.38      49 · 10^4     48 · 10^4
Multiple (l = 4)   √π(X_new)     204.54        128.56     6 · 10^4      7 · 10^4

Table : The values refer to a synthetic sample with 100 + 100 points. The Effective Sample Size (ESS) and Gelman & Rubin diagnostic are estimated using the coda package. The running time is evaluated using R software running on a desktop computer with an Intel i7 processor.
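For readers unfamiliar with the GR column, the following sketch computes the basic Gelman and Rubin potential scale reduction factor for a scalar summary. It is a textbook version written in Python for illustration only; the figures in the table come from the coda package in R, which may apply further corrections, and the function name `gelman_rubin` is hypothetical.

```python
import math
from statistics import fmean, variance

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length scalar chains."""
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [fmean(c) for c in chains]
    within = fmean([variance(c) for c in chains])   # W: mean within-chain variance
    between_over_n = variance(chain_means)          # B / n: variance of chain means
    pooled = (n - 1) / n * within + between_over_n  # pooled variance estimate
    return math.sqrt(pooled / within)

# Hypothetical chains: two that agree, and one shifted away from them.
CHAIN_A = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
CHAIN_B = list(CHAIN_A)
CHAIN_C = [v + 5.0 for v in CHAIN_A]
```

Values close to 1 suggest that the chains agree on the summary; chains stuck in different regions, such as `CHAIN_A` and `CHAIN_C`, give values well above 1.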


Multimodality for the complete matching case

For p_noise → 0, π(X) shows multimodality ⇒ the MCMC gets stuck in local maxima.

We used Simulated Tempering and tested it on extreme cases (artificial cycles).


Theoretical bounds for mixing times?

2-color case: monomer-dimer systems
Given a graph G = (V, E) with edge weights w : E → [0, ∞), the state space is {0, 1}^E with probability distribution

\[
\pi(X) \propto \prod_{e \,:\, X(e)=1} w(e) \left( \prod_{i \in V} \mathbb{1}\big(\deg_X(i) \le 1\big) \right). \tag{2}
\]

Jerrum and Sinclair [1996] use canonical paths arguments to prove that

\[
\tau_X(\varepsilon) \le 4\,(\#E)(\#V)\,w'^{2} \left( \log(\#E)\,\#E + \log(\varepsilon^{-1}) \right),
\qquad w' = \max\Big\{ 1, \max_{e \in E} w(e) \Big\}.
\]

Experiment with 100 points (50 blue + 50 red)

JS bound: 5 · 10^14 steps

G&R diagnostic: 10^6 steps
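The bound as stated on this slide is easy to evaluate numerically. The sketch below does so for a complete bipartite graph on 50 + 50 points (#V = 100, #E = 2500). The slide does not give the values of w' and ε behind the 5 · 10^14 figure, so the inputs used here (w' = 1, ε = 0.01) are placeholders and are not expected to reproduce it; `js_bound` is a hypothetical name.

```python
import math

def js_bound(n_edges, n_vertices, w_max, eps):
    """Evaluate 4 (#E)(#V) w'^2 (log(#E) #E + log(1/eps)), with w' = max(1, w_max)."""
    w_prime = max(1.0, w_max)
    return 4 * n_edges * n_vertices * w_prime ** 2 * (
        math.log(n_edges) * n_edges + math.log(1.0 / eps)
    )

# Complete bipartite graph on 50 blue + 50 red points.
N_VERTICES = 100
N_EDGES = 50 * 50

# Placeholder inputs (assumptions): w' = 1 and eps = 0.01.
BOUND_UNIT_WEIGHTS = js_bound(N_EDGES, N_VERTICES, w_max=1.0, eps=0.01)
```

Because the bound scales with w'^2, larger edge weights inflate it quickly, which is consistent with the gap between the theoretical figure and the empirical 10^6 steps reported above.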


General k dimensional case

Posterior sample space for kD
Partial matchings (i.e. hypergraphs of degree at most 1) contained in a complete k-partite hypergraph.

Figure : A complete 3-partite hypergraph.

Figure : A matching in a 3-partite hypergraph. The corresponding partition is {{1}, {2, 3, 4}}.


Reducing kD to 2D (I)

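\[
\pi(\rho \mid x) \propto \prod_{l=1}^{k} \left( \frac{N_l!}{(l N_l)!} \right) \prod_{j=1}^{N(\rho)} c_{|C_j|} \exp\left( - \frac{\sum_{i \in C_j} (x_i - \bar{x}_{C_j})^2}{2\sigma^2} \right).
\]

Lemma
For any $x_1, \dots, x_s, z \in \mathbb{R}^n$ and $\bar{x} = s^{-1} \sum_{i=1}^{s} x_i$ it holds

\[
\sum_{i=1}^{s} (x_i - \bar{x})^2 = \sum_{i=1}^{s} (x_i - z)^2 - s\,(\bar{x} - z)^2.
\]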
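The lemma follows from a short expansion, sketched here with $(\cdot)^2$ read as the squared Euclidean norm and $\langle \cdot, \cdot \rangle$ as the inner product (an assumption about the slide's notation):

```latex
\begin{align*}
\sum_{i=1}^{s} (x_i - z)^2
  &= \sum_{i=1}^{s} \big( (x_i - \bar{x}) + (\bar{x} - z) \big)^2 \\
  &= \sum_{i=1}^{s} (x_i - \bar{x})^2
     + 2 \Big\langle \bar{x} - z, \sum_{i=1}^{s} (x_i - \bar{x}) \Big\rangle
     + s\,(\bar{x} - z)^2 \\
  &= \sum_{i=1}^{s} (x_i - \bar{x})^2 + s\,(\bar{x} - z)^2 ,
\end{align*}
```

since $\sum_{i=1}^{s} (x_i - \bar{x}) = 0$ by the definition of $\bar{x}$. Rearranging gives the stated identity.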


Reducing kD to 2D (II)

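Corollary
For any $x_1, \dots, x_s \in \mathbb{R}^n$, $\bar{x} = s^{-1} \sum_{i=1}^{s} x_i$ and $\bar{x}^{(s-1)} = (s-1)^{-1} \sum_{i=1}^{s-1} x_i$ it holds

\[
\sum_{i=1}^{s} (x_i - \bar{x})^2 = \sum_{i=1}^{s-1} \big( x_i - \bar{x}^{(s-1)} \big)^2 + \frac{s-1}{s} \big( x_s - \bar{x}^{(s-1)} \big)^2.
\]

Proof
Follows from the previous lemma by setting $z = \bar{x}^{(s-1)}$.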
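The corollary says that adding a point to a cluster changes the within-cluster sum of squares by a term that depends only on the new point and the old centroid. A small numerical check, with hypothetical helper names and a made-up three-point cluster in the plane:

```python
def centroid(points):
    """Componentwise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def within_ss(points):
    """Sum of squared distances of the points from their centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)

def within_ss_after_adding(ss_prev, centroid_prev, s, x_new):
    """Sum of squares for s points, given the value for the first s - 1 points."""
    return ss_prev + (s - 1) / s * sq_dist(x_new, centroid_prev)

# Hypothetical cluster of three points in the plane.
CLUSTER = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
DIRECT = within_ss(CLUSTER)
INCREMENTAL = within_ss_after_adding(
    within_ss(CLUSTER[:-1]), centroid(CLUSTER[:-1]), len(CLUSTER), CLUSTER[-1]
)
```

`DIRECT` and `INCREMENTAL` agree up to floating-point rounding, which is the property that lets the points of all other colours in a cluster be summarised by their centroid.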


Reducing kD to 2D (III)

\[
\pi(\rho \mid \sigma, p, x) = \pi^{2D}\big( \rho^{2D} \mid \sigma, p, x^{2D} \big),
\]

where

• $x^{2D} = x^{2D}(x, \rho, i)$ is the 2-color point process obtained by keeping the points of the i-th color and replacing the others with their clusters' centroids.

• $\rho^{2D} = \rho^{2D}(\rho, i)$ is the induced matching.

• $\pi^{2D}$ is the posterior distribution of the 2-color case with slight changes.


kD Algorithm

kD-Algorithm
Obtain $\rho_{new}$ from $\rho_{old}$ as follows:

1) Sample a color $i \sim U(\{1, \dots, k\})$;

2) Evaluate $x^{2D}_{old} = x^{2D}(x, \rho_{old}, i)$ and $\rho^{2D}_{old} = \rho^{2D}(\rho_{old}, i)$;

3) Obtain $\rho^{2D}_{new}$ by performing a move of the 2D-Algorithm using $\pi^{2D}(\cdot \mid \sigma, x^{2D}_{old})$ as target distribution and $\rho^{2D}_{old}$ as current state;

4) Obtain $\rho_{new}$ from $\rho^{2D}_{new}$.
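The four steps can be sketched as follows. This is an illustrative skeleton only: the data layout (a dict from point id to a (color, coordinates) pair, and a partition stored as a list of lists of ids), the function names, and the callable `move_2d` standing in for one move of the 2D-Algorithm are all assumptions. The "slight changes" to the 2-color posterior, such as how cluster sizes enter the target, are not modelled here.

```python
import random

def reduce_to_2d(points, partition, color):
    """Step 2: keep the points of `color`; replace the rest of each cluster by its centroid."""
    kept = {}        # id -> coordinates of the points of the chosen color
    centroids = []   # list of (member ids, centroid coordinates)
    matching = []    # induced matching: (kept point id, centroid index)
    for cluster in partition:
        own = [p for p in cluster if points[p][0] == color]
        rest = [p for p in cluster if points[p][0] != color]
        for p in own:
            kept[p] = points[p][1]
        if rest:
            dim = len(points[rest[0]][1])
            centre = tuple(
                sum(points[p][1][d] for p in rest) / len(rest) for d in range(dim)
            )
            centroids.append((tuple(rest), centre))
            if own:  # at most one point per color in a cluster
                matching.append((own[0], len(centroids) - 1))
    return kept, centroids, matching

def rebuild_partition(kept, centroids, matching):
    """Step 4: turn a 2-color matching back into a partition of all points."""
    matched = {c: p for p, c in matching}
    used = set(matched.values())
    partition = []
    for c, (members, _) in enumerate(centroids):
        cluster = list(members)
        if c in matched:
            cluster.append(matched[c])
        partition.append(sorted(cluster))
    partition.extend([p] for p in kept if p not in used)
    return partition

def kd_step(points, partition, k, move_2d, rng=random):
    """One kD update; `move_2d` is a placeholder for a move of the 2D-Algorithm."""
    color = rng.randrange(1, k + 1)                                     # step 1
    kept, centroids, matching = reduce_to_2d(points, partition, color)  # step 2
    new_matching = move_2d(kept, centroids, matching)                   # step 3
    return rebuild_partition(kept, centroids, new_matching)             # step 4
```

With a `move_2d` that returns the matching unchanged, `kd_step` reproduces the input partition, which is a convenient sanity check before plugging in a real 2-color sampler.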


kD algorithm results


Format of the Dataset

COUNTY   PLACE     PARISH/TOWNSHIP          GRID REF    DATE OF FIRST EVIDENCE
BRK      Bourton   Bourton                  SU 230870   c. 1200
BUC      Bierton   Bierton with Broughton   SP 836152   DB
BUC      Bourton   Buckingham               SP 710333   DB
CHE      Burton    Burton (T)               SJ 509639   DB
CHE      Burton    Burton (T)               SJ 317743   1152
CHE      Buerton   Buerton (T)              SJ 682433   DB

Table : Data available regarding the first 6 settlements with the name Burton.


Data Cleaning

Operations performed

1. Placenames: Burton, Bourton, Bierton, Buerton, etc. → Burton.

2. Locations: SP 836 152 identifies a 100 m × 100 m square; SP 83 15 identifies a 1 km × 1 km square. In both cases the placename is placed at the centre of the square.

3. “Multiple” settlements
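The location rule in step 2 can be sketched as a small parser. It returns the centre of the referenced square as offsets in metres from the south-west corner of the lettered 100 km square; converting the two letters into national coordinates is deliberately left out. The function name `grid_ref_centre` and the returned tuple layout are assumptions for illustration.

```python
def grid_ref_centre(ref):
    """Return (letters, easting_m, northing_m, resolution_m) for a grid reference.

    Eastings and northings are offsets inside the lettered 100 km square and
    point at the centre of the square that the reference identifies.
    """
    parts = ref.split()
    letters, digits = parts[0], "".join(parts[1:])
    if not digits.isdigit() or len(digits) % 2 != 0 or len(digits) > 10:
        raise ValueError(f"unsupported grid reference: {ref!r}")
    half = len(digits) // 2
    resolution = 10 ** (5 - half)  # 3 + 3 digits -> 100 m, 2 + 2 digits -> 1 km
    easting = int(digits[:half]) * resolution + resolution / 2
    northing = int(digits[half:]) * resolution + resolution / 2
    return letters, easting, northing, resolution

# The two examples from the slide.
FINE = grid_ref_centre("SP 836 152")
COARSE = grid_ref_centre("SP 83 15")
```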


Number of Settlements

Placenames          total     # with      # of couples      # of couples
                    number    1km acc.    (by historians)   (by proximity)

Aston/Easton        90        0           1                 8
Bolton              17        1           1                 0
Burh-Stall          29        2           1                 0
Burton              109       2           1                 7
Centres             46        0           0                 0
Charlton/Charlcot   98        3           7                 1
Chesterton          9         0           0                 0
Claeg               84        13          0                 5
Draycot/Drayton     55        1           0                 2
Eaton               33        1           1                 5
Kingston            71        1           1                 1
Knighton            26        1           0                 0
Newbold             34        3           1                 0
Newton              191       5           4                 5
Norton              74        1           8                 1
Stratton            37        0           5                 0
Sutton              101       2           4                 5
Tot                 77        17          1                 1
Walton/Walcot       51        4           1                 0
Weston              85        3           3                 2
Total               1317      60          40                43


Density estimation


Homogeneous and inhomogeneous K-cross functions


Interaction plot based on K-cross functions


Using our model: three names considered


Using our model: higher-order interaction

Figure : Three placenames are considered: Charlton − Newton − Norton. Estimatedposterior distributions for σ, p1, p2 and p3 are shown in red. Prior distributions areshown in blue.


Considering 11 placenames


Future steps - Possible developments

Short term:

• Analyze more carefully the real dataset: sensitivity analysis, consider all 20 placenames.

• Consider heterogeneity among placenames?

• Discuss results with historians for the interpretation.

Longer term:

• (Computation) Theoretical results on optimal proposals for Metropolis-Hastings on discrete spaces?

• (Computation) A more careful comparison with other sampling schemes for clustering and data association problems.

• (Applications) Other contexts where complementary clustering occurs?


Acknowledgments

Prof. Wilfrid Kendall for the supportive and wise supervision, Prof. John Blair for the collaboration, EPSRC for funding.


Thank you


Second problem of the MCMC

Problem: MCMC gets stuck in local maxima.


Simulated Tempering

Aim: overcome multimodality.

Procedure:

• Consider an artificial sample space $X_0 \cup \dots \cup X_s$ made of s + 1 copies of the original sample space X.

• Assign to each copy $X_j$ a probability distribution $\pi_j$ with density $f_\pi^{\beta_j}$ (a tempered version of π), with $\beta_0 = 1$.

• Implement an MCMC running on the new space and keep just the iterations lying in $X_0$.
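The procedure above can be sketched on a toy problem. The example below is not the matching sampler from the talk: it assumes a hypothetical bimodal target on the integers 0..39, a symmetric random-walk proposal, and a hand-picked temperature ladder. It also omits the per-level weighting constants that simulated tempering usually uses to balance the time spent at each level, so the chain remains valid but may visit the levels unevenly.

```python
import math
import random

def simulated_tempering(log_f, propose, x0, betas, n_iter, rng):
    """Run simulated tempering and return the states visited while at level 0.

    log_f   : log of the unnormalised target density f
    propose : symmetric proposal, called as propose(x, rng)
    betas   : inverse temperatures with betas[0] == 1
    """
    x, j, kept = x0, 0, []
    for _ in range(n_iter):
        # Within-level Metropolis move targeting f ** betas[j].
        y = propose(x, rng)
        if math.log(1.0 - rng.random()) < betas[j] * (log_f(y) - log_f(x)):
            x = y
        # Level move: propose a neighbouring level, reject if off the ladder.
        j_new = j + rng.choice((-1, 1))
        if 0 <= j_new < len(betas):
            if math.log(1.0 - rng.random()) < (betas[j_new] - betas[j]) * log_f(x):
                j = j_new
        # Keep only the iterations lying in the original (untempered) copy.
        if j == 0:
            kept.append(x)
    return kept

N = 40  # hypothetical state space {0, ..., 39}

def log_f(x):
    """Toy bimodal target with modes near 8 and 30."""
    return math.log(math.exp(-0.5 * (x - 8) ** 2) + math.exp(-0.5 * (x - 30) ** 2))

def propose(x, rng):
    """Symmetric +/-1 random walk; moves off the ends are turned into self-loops."""
    y = x + rng.choice((-1, 1))
    return y if 0 <= y < N else x

BETAS = (1.0, 0.3, 0.1, 0.03)  # assumed ladder, not taken from the slides
SAMPLES = simulated_tempering(log_f, propose, 8, BETAS, 5000, random.Random(1))
```

At the hotter levels the valley between the two modes is much shallower, so the chain can cross it and later return to level 0 in a different mode.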


Testing Simulated Tempering


Density estimations
