Markov Chain Monte Carlo
Hadas Barkay, Anat Hashavit


Page 1: Markov Chain Monte Carlo

Markov Chain Monte Carlo
Hadas Barkay
Anat Hashavit

Page 2: Markov Chain Monte Carlo

Motivation

Models built to represent diverse types of genetic data tend to have a complex structure, and computations on these structures are often intractable. We see this, for instance, in pedigree analysis.

Stochastic simulation techniques are good ways to obtain approximations, and Markov Chain Monte Carlo methods have proven to be quite useful for such simulations.

Page 3: Markov Chain Monte Carlo

Markov chain

Markov chains are random processes X1, X2, ... having a discrete 1-dimensional index set (1, 2, 3, ...) and a domain D of m states {s1, ..., sm} from which they take their values, satisfying the Markov property: the distribution of X_{t+1} given all the previous states X_t, X_{t-1}, ... depends only on the immediately preceding state X_t:

Pr[X_{t+1} = x_{t+1} | X_t = x_t, X_{t-1} = x_{t-1}, ..., X_0 = x_0] = Pr[X_{t+1} = x_{t+1} | X_t = x_t]

[Figure: a chain of variables X1, X2, X3, ..., X_{i-1}, X_i, X_{i+1} taking values S1, S2, S3, ..., S_{i-1}, S_i, S_{i+1}]

Page 4: Markov Chain Monte Carlo

A Markov chain also consists of:
1. An m-dimensional initial distribution vector π = (π(s1), ..., π(sm)).
2. An m×m transition probability matrix P = (p_ij), where p_ij = P(i,j) = Pr[x_{t+1} = s_j | x_t = s_i].
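As a minimal sketch of how such a chain is simulated from its two ingredients (the function name `simulate_chain` and the two-state matrix are invented for this illustration, not taken from the slides):

```python
import numpy as np

def simulate_chain(pi0, P, n_steps, rng=None):
    """Simulate a Markov chain with initial distribution pi0 and transition
    matrix P, where P[i, j] = Pr[X_{t+1} = s_j | X_t = s_i] (rows sum to 1).
    Returns the sequence of visited state indices."""
    rng = np.random.default_rng(rng)
    m = len(pi0)
    states = np.empty(n_steps, dtype=int)
    states[0] = rng.choice(m, p=pi0)                   # X_1 ~ initial distribution
    for t in range(1, n_steps):
        states[t] = rng.choice(m, p=P[states[t - 1]])  # X_{t+1} ~ row of current state
    return states

# Example: a two-state chain started deterministically in state 0
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
path = simulate_chain([1.0, 0.0], P, 10_000, rng=0)
```

For this P the long-run fraction of time spent in state 0 approaches 5/6, independent of the start, which previews the stationary-distribution discussion below.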

Page 5: Markov Chain Monte Carlo

The multiplications shown in the previous slide are equivalent to multiplying P by itself. We conclude that

P²_ik = Pr(x_{t+2} = s_k | x_t = s_i), and more generally P^n_ik = Pr(x_{t+n} = s_k | x_t = s_i).

Notice that the transition probability between X_t and X_{t+2} can also be derived in the following way:

Pr(x_{t+2} = s_k | x_t = s_i)
  = Σ_j Pr(x_{t+2} = s_k, x_{t+1} = s_j | x_t = s_i)
  = Σ_j Pr(x_{t+2} = s_k | x_{t+1} = s_j) Pr(x_{t+1} = s_j | x_t = s_i)
  = Σ_j p_ij p_jk
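The identity P²_ik = Pr(x_{t+2} = s_k | x_t = s_i) is easy to check numerically; the small matrix below is an arbitrary example chosen for the check, not one from the slides:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# (P @ P)[i, k] = sum_j P[i, j] * P[j, k], the two-step transition probability
P2 = P @ P
# e.g. P2[0, 0] = 0.9*0.9 + 0.1*0.5 = 0.86

# n-step transition probabilities come from repeated multiplication
Pn = np.linalg.matrix_power(P, 5)
```

Each power of P is again a stochastic matrix: its rows still sum to 1.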

Page 6: Markov Chain Monte Carlo

Stationary distribution

A distribution π = (π1, ..., πm) is stationary if πP = π.

• When the stationary distribution π is unique for a transition matrix P, the rows of lim_{n→∞} P^n all coincide with π.

• In other words, the long-run probability of being in state j is πj, regardless of the initial state.
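A stationary distribution can be computed as a left eigenvector of P for eigenvalue 1, and the row-convergence claim checked by taking a high matrix power; the matrix here is an arbitrary ergodic example invented for the illustration:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# pi P = pi  <=>  P^T pi^T = pi^T: take the eigenvector of P^T for eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()                       # normalize to a probability vector

# For an ergodic chain the rows of P^n converge to pi
Pn = np.linalg.matrix_power(P, 100)
```

For this P the stationary distribution is π = (5/6, 1/6), and both rows of P^100 agree with it to machine precision.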

Page 7: Markov Chain Monte Carlo

The gambler example

A gambler wins 1 dollar with probability p and loses 1 dollar with probability 1−p. The gambler's fortune ranges over the states 0, 1, 2, 3, 4, 5, and the game ends when the fortune reaches 0 or 5 (absorbing states). The transition matrix of that Markov chain is:

        0    1    2    3    4    5
  0  [  1    0    0    0    0    0 ]
  1  [ 1−p   0    p    0    0    0 ]
  2  [  0   1−p   0    p    0    0 ]
  3  [  0    0   1−p   0    p    0 ]
  4  [  0    0    0   1−p   0    p ]
  5  [  0    0    0    0    0    1 ]

Page 8: Markov Chain Monte Carlo

There is no unique stationary distribution for this example: any distribution with π0 + π5 = 1 (all mass on the two absorbing states) satisfies πP = π.

For example: π = (0.5, 0, 0, 0, 0, 0.5).

Page 9: Markov Chain Monte Carlo

For p = 0.5 we can see that

lim_{n→∞} P^n =
  [ 1    0  0  0  0   0  ]
  [ 0.8  0  0  0  0  0.2 ]
  [ 0.6  0  0  0  0  0.4 ]
  [ 0.4  0  0  0  0  0.6 ]
  [ 0.2  0  0  0  0  0.8 ]
  [ 0    0  0  0  0   1  ]

The long-run frequency of being in a given state depends on the initial state.
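The limit above can be reproduced by raising the gambler's transition matrix to a high power; this short check builds the p = 0.5 matrix and confirms that row i converges to absorption probabilities (1 − i/5, 0, ..., 0, i/5):

```python
import numpy as np

p = 0.5
P = np.zeros((6, 6))
P[0, 0] = P[5, 5] = 1.0        # fortunes 0 and 5 are absorbing
for i in range(1, 5):
    P[i, i + 1] = p            # win a dollar
    P[i, i - 1] = 1 - p        # lose a dollar

# After many steps, all probability mass sits on the absorbing states
Pn = np.linalg.matrix_power(P, 1000)
```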

Page 10: Markov Chain Monte Carlo

Some properties of Markov chains

A Markov chain is aperiodic if the possible numbers of steps for returning to a state are not restricted to multiples of some period greater than 1.

States i, j communicate if each is reachable from the other, i.e., there exist finite n, n' such that p^n(i,j) > 0 and p^{n'}(j,i) > 0.

A Markov chain is said to be irreducible if all its states communicate.

A Markov chain is said to be ergodic if it is irreducible and aperiodic.

An ergodic Markov chain always has a unique stationary distribution.

Page 11: Markov Chain Monte Carlo

MCMC — Markov Chain Monte Carlo

Goal: calculating an expectation value E_π(f) = Σ_i f(i) π_i when the probabilities π_i are given only up to an unknown normalization factor Z.

Examples:

1. The Boltzmann distribution:

π_i = (1/Z) exp(−E_i / k_b T)

where Z = Σ_i exp(−E_i / k_b T) is the canonical partition function, which is sometimes incalculable.

Page 12: Markov Chain Monte Carlo

2. Constructing the likelihood function: the probability of the data (a sum over all states of all hidden variables), e.g.

Prob(data | θ2) = P(x11, x12, x13, x21, x22, x23)
  = Σ_{l11m, l11f, ..., s23f} [ P(l11m) P(l11f) P(x11 | l11m, l11f) ... P(s13m) P(s13f) P(s23m | s13m, θ2) P(s23f | s13f, θ2) ]

Page 13: Markov Chain Monte Carlo

Method: simulate — don't go over all of the space.

We want to calculate the expectation value

E_π(f) = Σ_s f(s) π_s .

Instead, we calculate an estimator of the expectation value by sampling states S_0, S_1, ..., S_{n−1} from the space:

E_π(f) ≈ (1/n) Σ_{i=0}^{n−1} f(S_i),   with   lim_{n→∞} (1/n) Σ_{i=0}^{n−1} f(S_i) = E_π(f).

MCMC tells us how to carry out the sampling to create a good (unbiased, low variance) estimator.
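A toy illustration of the estimator (i.i.d. sampling stands in here for the output of a chain; the three-state π and the function f are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Target distribution pi over 3 states and a function f on those states
pi = np.array([0.2, 0.3, 0.5])
f = np.array([1.0, 4.0, 9.0])

exact = (pi * f).sum()                       # E_pi[f] = sum_s f(s) pi(s) = 5.9

samples = rng.choice(3, size=200_000, p=pi)  # sampled states S_0, ..., S_{n-1}
estimate = f[samples].mean()                 # (1/n) * sum_i f(S_i)
```

As n grows, the estimate converges to the exact expectation; the point of MCMC is producing such samples when π is only known up to Z.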

Page 14: Markov Chain Monte Carlo

MCMC: create an ergodic (= irreducible, aperiodic) Markov chain and simulate from it.

We need to define an ergodic transition matrix P = (p_ij) for which π is the unique stationary distribution. If P satisfies detailed balance, i.e.

π_i p_ij = π_j p_ji   for all i, j,

then π is stationary for P:

Σ_i π_i p_ij = Σ_i π_j p_ji = π_j Σ_i p_ji = π_j .
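The implication "detailed balance ⇒ stationarity" can be verified on a concrete reversible chain; the 3-state matrix below was constructed by hand for this check (an assumption of the sketch, not a matrix from the slides):

```python
import numpy as np

pi = np.array([0.2, 0.3, 0.5])

# A transition matrix satisfying pi_i * P_ij = pi_j * P_ji for all i, j
P = np.array([[0.0, 0.5, 0.5],
              [1/3, 0.0, 2/3],
              [0.2, 0.4, 0.4]])

flows = pi[:, None] * P       # flows[i, j] = pi_i * p_ij
```

Detailed balance says the flow matrix is symmetric; stationarity (πP = π) then follows, as the assertions below confirm.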

Page 15: Markov Chain Monte Carlo

Metropolis (1953) suggested a simulation algorithm in which one defines P indirectly via a symmetric transition matrix Q in the following way: if you are in state i, propose a state j with probability q_ij, and move from i to j with probability

a_ij = min(1, π_j / π_i).

Hastings (1970) generalized this for a non-symmetric Q:

a_ij = min(1, (π_j q_ji) / (π_i q_ij)).

This assures detailed balance of P, since p_ij = q_ij a_ij for i ≠ j, and

π_i p_ij = π_i q_ij min(1, (π_j q_ji) / (π_i q_ij)) = min(π_i q_ij, π_j q_ji)
         = min(π_j q_ji, π_i q_ij) = π_j q_ji min(1, (π_i q_ij) / (π_j q_ji)) = π_j p_ji .

Page 16: Markov Chain Monte Carlo

Metropolis–Hastings Algorithm:

• Start with any state x_0.

• Step t: the current state is x_{t−1} = i.

  Propose state j with probability q_ij.

  Compute a_ij = min(1, (π_j q_ji) / (π_i q_ij)).

  Accept state j if a_ij = 1; else, draw uniformly from [0,1] a random number c. If c ≤ a_ij, accept state j; else, stay in state i.
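The steps above can be sketched in a few lines; this minimal sampler uses a uniform (hence symmetric) proposal so the acceptance probability reduces to the Metropolis form, and the unnormalized weights play the role of π known only up to Z (all names here are invented for the illustration):

```python
import numpy as np

def metropolis_hastings(pi_unnorm, m, n_steps, rng=None):
    """Metropolis sampler on m states for a target known up to normalization.
    Proposal: q_ij = 1/m (uniform, symmetric), so a_ij = min(1, pi_j / pi_i)."""
    rng = np.random.default_rng(rng)
    x = rng.integers(m)                      # arbitrary starting state x_0
    out = np.empty(n_steps, dtype=int)
    for t in range(n_steps):
        j = rng.integers(m)                  # propose j with probability 1/m
        a = min(1.0, pi_unnorm[j] / pi_unnorm[x])
        if rng.random() < a:                 # accept with probability a_ij
            x = j
        out[t] = x
    return out

weights = np.array([1.0, 2.0, 7.0])          # target pi = weights / Z, Z unknown to the sampler
chain = metropolis_hastings(weights, 3, 100_000, rng=0)
```

The empirical state frequencies of the chain approach (0.1, 0.2, 0.7) even though Z = 10 was never used.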

Page 17: Markov Chain Monte Carlo

Simulated Annealing

Goal: find the most probable state of a Markov chain.

Use the Metropolis algorithm with a different acceptance probability:

a_ij = min(1, (π_j / π_i)^(1/T))

where T is the temperature. Cool gradually, i.e., allow the chain to run m times with an initial temperature T_0, then set a new temperature T_1 < T_0 and allow the chain to run again. Repeat n times, until T_n ≈ 0.

Page 18: Markov Chain Monte Carlo

Simulated Annealing (Kirkpatrick, 1983)

Goal: find the global minimum of a multi-dimensional surface S(x1, x2, ..., xn), when all states have an equal probability.

Use another variation of the Metropolis algorithm. Suppose you are in state i. Draw a new state j, and accept it using the acceptance probability

a_ij = min(1, C exp(−(S_j − S_i) / kT))

where T is the temperature. Cool gradually.

Page 19: Markov Chain Monte Carlo

Parameters to play with:
  n (number of loops)
  m (number of iterations in each loop)
  T0 (initial temperature — shouldn't be too high or too low)
  ΔT (cooling speed — shouldn't be too fast)
  k ("Boltzmann constant" — scaling parameter)
  C (first-step scaling parameter)
  x_0 = (x_{0,1}, x_{0,2}, ..., x_{0,n}) (initial state)
  q_ij (how to draw the next step)
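A one-dimensional sketch of the annealing loop, wiring the parameters above together (C and k are set to 1, the Gaussian proposal and the double-well test surface are assumptions of this illustration):

```python
import numpy as np

def simulated_annealing(S, x0, t0, cooling, n_outer, m_inner, step, rng=None):
    """Minimize S by simulated annealing: n_outer cooling loops of m_inner
    Metropolis steps each, temperature multiplied by `cooling` per loop."""
    rng = np.random.default_rng(rng)
    x, temp = x0, t0
    for _ in range(n_outer):
        for _ in range(m_inner):
            y = x + rng.normal(scale=step)      # draw a candidate state
            delta = S(y) - S(x)
            # downhill moves always accepted; uphill with prob exp(-delta/T)
            if delta <= 0 or rng.random() < np.exp(-delta / temp):
                x = y
        temp *= cooling                          # cool gradually
    return x

S = lambda x: (x ** 2 - 1) ** 2 + 0.3 * x        # double well, global minimum near x = -1
x_best = simulated_annealing(S, x0=2.0, t0=2.0, cooling=0.9,
                             n_outer=60, m_inner=50, step=0.5, rng=0)
```

At high temperature the chain crosses the barrier between the two wells; as T shrinks it settles into a minimum.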

Page 20: Markov Chain Monte Carlo

MCMC Example: Descent GraphMCMC Example: Descent Graph

Corresponds to a specific inheritance vector.

Vertices: the individuals’ genes (2 genes for each individual in the pedigree).

Edges: represent the gene flow specified by the inheritance vector. A child’s gene is connected by an edge to the parent’s gene from which it flowed.

(Taken from tutorial 6)

Page 21: Markov Chain Monte Carlo

Descent Graph

[Figure: a pedigree of founders and non-founders with unordered genotypes such as a/b, a/c and b/d, drawn above a descent graph whose numbered vertices represent the individuals' genes.]

Assume that the descent graph vertices below represent the pedigree on the left.

(Taken from tutorial 6)

Page 22: Markov Chain Monte Carlo

Descent Graph

[Figure: the same pedigree and descent graph as on the previous slide.]

1. Assume that paternally inherited genes are on the left.

2. Assume that non-founders are placed in increasing order.

3. A '1' ('0') is used to denote a paternally (maternally) originated gene.

The gene flow above corresponds to the inheritance vector: v = (1,1; 0,0; 1,1; 1,1; 1,1; 0,0)

(Taken from tutorial 6)

Page 23: Markov Chain Monte Carlo

Creating a Markov Chain of descent graphs:

•A Markov Chain state is a (legal) descent graph.

•We need to define transition rules from one state to another.

We will show 4 possible transition rules: T0, T1, T2a, T2b.

Page 24: Markov Chain Monte Carlo

Transition rule T0 :

Choose parent, child.

Page 25: Markov Chain Monte Carlo

Transition rule T0 :

Choose parent, child. Swap the paternal arc to the maternal arc (or vice versa). A small change in the graph → slow mixing.

Page 26: Markov Chain Monte Carlo

Transition rule T1 :

Choose person i.

Page 27: Markov Chain Monte Carlo

Transition rule T1 :

Choose person i. Perform a T0 transition from i to each one of his/her children.

Page 28: Markov Chain Monte Carlo

Transition rule T2a :

Choose a couple i, j with common children.

Page 29: Markov Chain Monte Carlo

Transition rule T2a :

Choose a couple i, j with common children. Exchange the subtree rooted at the maternal node of i with the subtree rooted at the maternal node of j; exchange also the paternally rooted subtrees.

Page 30: Markov Chain Monte Carlo

Transition rule T2b :

Choose a couple i, j with common children.

Page 31: Markov Chain Monte Carlo

Transition rule T2b :

Choose a couple i, j with common children. Exchange the subtree rooted at the maternal node of i with the subtree rooted at the paternal node of j, and vice versa.

Page 32: Markov Chain Monte Carlo

Problem:

[Figure: two legal descent graphs, a and b, for a pedigree with genotypes 1/1, 2/2, 1/2 and 3/3.]

Two legal graphs of inheritance. There is no way to move from a to b using the allowed transitions (the states don't communicate).

Page 33: Markov Chain Monte Carlo

Solution: allow passage through illegal graphs (tunneling).

[Figure: a path from graph a to graph b that passes through an intermediate, illegal descent graph.]

Page 34: Markov Chain Monte Carlo

Applying tunneling in MCMC:

In order to move from state i to state j (one step of the chain):
(1) Select a transition (+ person + locus) and perform it.
(2) Toss a coin. If it says "head", stop. If it says "tail", go to (1).
(3) If the proposed state j is legal, accept it with the Metropolis probability

a_ij = min(1, π_j / π_i).

This assures that the Markov chain is irreducible.

Page 35: Markov Chain Monte Carlo

One use of descent graph MCMC is the calculation of the location score for location d, given marker evidence M and trait phenotypes T:

location score = log10 [ Pr_d(T | M) / Pr(T) ].

The difficult term to calculate is the conditional probability Pr_d(T | M). Assuming linkage equilibrium we can write

Pr_d(T | M) = Σ_G Pr_d(T | G) Pr(G | M),

where G is a descent graph.

MCMC: the stationary distribution of the chain is the conditional distribution Pr(G | M). If a sequence of legal descent graphs G_1, ..., G_n is generated by running the chain, then for n large enough the estimate is

Pr_d(T | M) ≈ (1/n) Σ_{i=1}^n Pr_d(T | G_i).

The term Pr_d(T | G) can be calculated quickly, using MENDEL.

Page 36: Markov Chain Monte Carlo

The Gibbs sampler

A special case of the Metropolis–Hastings algorithm for Cartesian product state spaces.

Suppose that each sample point i = (i1, i2, ..., im) has m components — for instance, a pedigree of m individuals.

The Gibbs sampler updates one component of i at a time. If the component is chosen randomly and resampled conditional on the remaining components, then the acceptance probability is 1.
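A tiny concrete instance of this scheme (the 2×2 joint table is invented for the illustration): each step picks one of the two components uniformly and resamples it from its conditional given the other, with no accept/reject test needed.

```python
import numpy as np

# Joint distribution over (x1, x2) in {0,1}^2, as a 2x2 table
joint = np.array([[0.40, 0.10],
                  [0.10, 0.40]])

def gibbs(n_steps, rng=None):
    """Gibbs sampler: resample one uniformly chosen component from its full
    conditional given the other component; every such move is accepted."""
    rng = np.random.default_rng(rng)
    x = np.array([0, 0])
    out = np.empty((n_steps, 2), dtype=int)
    for t in range(n_steps):
        c = rng.integers(2)                           # choose a component uniformly
        if c == 0:
            cond = joint[:, x[1]] / joint[:, x[1]].sum()   # Pr[x1 | x2]
        else:
            cond = joint[x[0], :] / joint[x[0], :].sum()   # Pr[x2 | x1]
        x[c] = rng.choice(2, p=cond)                  # resample from the conditional
        out[t] = x
    return out

samples = gibbs(200_000, rng=0)
```

The empirical frequencies of the four configurations converge to the entries of the joint table.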

Page 37: Markov Chain Monte Carlo

The Gibbs sampler

Proof: Let i_c be the uniformly chosen component, and denote the remaining components by i_{−c}. If j is a neighbor of i reachable by changing only component i_c, then j_{−c} = i_{−c}. The proposal probability is

q_ij = (1/m) · π_j / Σ_{k : k_{−c} = i_{−c}} π_k

and satisfies π_i q_ij = π_j q_ji. The ratio appearing in the acceptance probability is therefore

(π_j q_ji) / (π_i q_ij) = [ π_j · (1/m) · π_i / Σ_{k : k_{−c} = j_{−c}} π_k ] / [ π_i · (1/m) · π_j / Σ_{k : k_{−c} = i_{−c}} π_k ] = 1,

since j_{−c} = i_{−c} makes the two normalizing sums equal.

Page 38: Markov Chain Monte Carlo

Example: MCMC with ordered genotypes

Suppose we want to generate a pedigree for which we know only the unordered genotypes of 4 individuals in the last generation.

We make the following simplifying assumptions:
- The population is in Hardy–Weinberg (HW) equilibrium.
- The three loci are in linkage equilibrium.
- There is no crossover interference between the three loci.

Page 39: Markov Chain Monte Carlo

One option is to first simulate the founders' genotypes according to population frequencies, then "drop" genes along the tree according to Mendel's law and the recombination fractions.

[Figure: a three-generation pedigree with unordered three-locus genotypes (e.g. 1,2 / 1,2 / 2,4) shown for the observed individuals.]
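The gene-dropping idea can be sketched for a single locus (the pedigree encoding and function name are assumptions of this illustration; the multi-locus extension with recombination fractions between adjacent loci is omitted):

```python
import numpy as np

def gene_drop(freqs, pedigree, rng=None):
    """Gene dropping at one locus. `freqs` are population allele frequencies;
    `pedigree` is a list of (id, father, mother) in parents-first order,
    founders having father = mother = None. Returns ordered genotypes
    {id: (paternal_allele, maternal_allele)}."""
    rng = np.random.default_rng(rng)
    genotype = {}
    for ind, father, mother in pedigree:
        if father is None:                    # founder: draw from population freqs
            genotype[ind] = tuple(rng.choice(len(freqs), size=2, p=freqs))
        else:                                 # non-founder: Mendelian segregation
            pat = genotype[father][rng.integers(2)]   # one of father's two alleles
            mat = genotype[mother][rng.integers(2)]   # one of mother's two alleles
            genotype[ind] = (pat, mat)
    return genotype

ped = [(1, None, None), (2, None, None), (3, 1, 2), (4, 1, 2)]
g = gene_drop([0.25, 0.25, 0.25, 0.25], ped, rng=0)
```

Each child's paternal allele is one of the father's two alleles and likewise for the maternal allele, which is exactly Mendel's law at a single locus.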

Page 40: Markov Chain Monte Carlo

One realization of an ordered genotype consistent with the data we have is the following pedigree:

[Figure: the same pedigree with ordered three-locus genotypes, e.g. 1|2, 1|2, 2|4, where "|" separates the paternal and maternal alleles.]

Page 41: Markov Chain Monte Carlo

Computing the probability of a certain realization of this pedigree is easy, but even for this small tree the proposal we saw is just one of ((4³)²)⁴ × (2³)¹² = 2⁸⁴ options.

It is obvious we need a better method.

Page 42: Markov Chain Monte Carlo

The Gibbs sampler

The Gibbs sampler updates the genotype of one individual at one locus at a time. This is easy given all other data, due to the local nature of the dependency. We need to compute

Pr[x_im(l), x_ip(l) | x_im(−l), x_ip(−l), x_{−i}, y]

where x_im(l), x_ip(l) are the maternal and paternal alleles at locus l of individual i; x_im(−l), x_ip(−l) are the maternal and paternal alleles at all other loci of individual i; y is the phenotype data; and x_{−i} denotes the data of all other individuals in the pedigree.

Page 43: Markov Chain Monte Carlo

This calculation involves only the phenotype of i and the genotypes of his parents and children. Under the no-crossover-interference assumption, the genotype at one locus across a chromosome depends only on its adjacent loci.

We get the following expression:

Pr[x_im(l), x_ip(l) | x_im(−l), x_ip(−l), x_{−i}, y] =
  Pr[x_im | x_m] Pr[x_ip | x_p] Pr[y_i(l) | x(l)] Π_{k ∈ kids(i)} Pr[x_k | x_i]
  / Σ_{x(l)} Pr[x_im | x_m] Pr[x_ip | x_p] Pr[y_i(l) | x(l)] Π_{k ∈ kids(i)} Pr[x_k | x_i]

Page 44: Markov Chain Monte Carlo

Let's return to the pedigree we showed earlier. Assume that the ordered genotype shown is the current state, and we want to update individual 13 at locus number 2.

There are 3 genotypes consistent with the genotype configuration of the other individuals in the pedigree: 1|1, 2|1, 1|3.

We need to calculate the probability of each one of these states according to the equation shown in the last slide.

Page 45: Markov Chain Monte Carlo

We first compute the paternal and maternal transmission probabilities for each allele in these three genotypes. Then we need the transmission probabilities from individual 13 to his kids.

Once the three have been computed, they are normalized by dividing each one by the total sum. The genotype is sampled according to that distribution.

Page 46: Markov Chain Monte Carlo

Problems:

Updating is done one genotype at a time; for a large pedigree this is a very slow process.

The Gibbs sampler may also be reducible, i.e., some legal states will never be reached. For example, no sequence of single-genotype Gibbs updates leads from

  A|B  A|C          A|C  A|B
  A|A  B|C    to    A|A  C|B

Page 47: Markov Chain Monte Carlo

One solution to this problem is to allow illegal states in the chain, which will later be discarded.

Another is to identify the irreducible components, called islands, and to define Metropolis states allowing jumps from one island to another. However, the whole process of finding the islands and setting up proposals for the Metropolis jumps is not fully automated; the work involved for each different pedigree prevents the widespread use of this method.

Page 48: Markov Chain Monte Carlo

Other MCMC methods that use the Gibbs sampler

The L-sampler (Heath, 1997): a whole-locus Gibbs sampler combining Monte Carlo simulation with single-locus peeling.

Page 49: Markov Chain Monte Carlo

Other MCMC methods that use the Gibbs sampler

The M-sampler (Thompson and Heath, 1999; Thompson, 2000a): a whole-meiosis Gibbs sampler which jointly updates an entire meiosis from the full conditional distribution.

Page 50: Markov Chain Monte Carlo

References

K. Lange, Mathematical and Statistical Methods for Genetic Analysis, chapter 9, Springer (1997).

A. Bureau, T.P. Speed, Course Stat 260: Statistics in Genetics, week 7.

A.W. George, E. Thompson, Discovering Disease Genes: Multipoint Linkage Analysis via a New Markov Chain Monte Carlo Approach, Statistical Science, Vol. 18, No. 4 (2003).

Lecture notes. Tutorial notes.

Page 51: Markov Chain Monte Carlo

The End!!