Genomic Rearrangements CS 374 – Algorithms in Biology Fall 2006 Nandhini N S

Post on 21-Dec-2015


Genomic Rearrangements

CS 374 – Algorithms in Biology, Fall 2006

Nandhini N S


Motivation

• One of the keys to evolution.
• Detecting dynamics between members of the same family.
• An interesting combinatorial problem!
• Everybody loves the Central Limit Theorem (or a variant).


Terminology

Possible rearrangements:
• Reversals
• Translocations
• Fission
• Fusion

Other terms:
• Most parsimonious scenario
• Genomic distance
• Synteny blocks


Describing the problem

Basically a reversal distance problem.

Given permutations π and σ (each permutation represents a gene order), find a series of reversals ρ1, ρ2, …, ρn such that π · ρ1 · ρ2 · … · ρn = σ and n (the genomic distance) is minimum.

Such a shortest series is “the most parsimonious scenario”.
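
To make the problem statement concrete, here is a minimal sketch that finds the reversal distance by brute force. It assumes genes carry an orientation, so permutations are signed and a reversal flips both the order and the signs of a segment. The names `reverse` and `reversal_distance` are illustrative only. Breadth-first search is exponential in n, so this is usable only for very small permutations.

```python
from collections import deque


def reverse(perm, i, j):
    """Reverse perm[i..j] (inclusive) and flip the sign of every element in it."""
    return perm[:i] + tuple(-x for x in reversed(perm[i:j + 1])) + perm[j + 1:]


def reversal_distance(pi, sigma):
    """Smallest number of signed reversals turning pi into sigma (brute force)."""
    pi, sigma = tuple(pi), tuple(sigma)
    dist = {pi: 0}          # permutation -> number of reversals used to reach it
    queue = deque([pi])
    while queue:
        cur = queue.popleft()
        if cur == sigma:
            return dist[cur]
        # Try every possible reversal of a contiguous segment.
        for i in range(len(cur)):
            for j in range(i, len(cur)):
                nxt = reverse(cur, i, j)
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
    raise ValueError("sigma cannot be reached from pi by signed reversals")


print(reversal_distance((-2, -1), (1, 2)))  # one reversal of the whole segment
print(reversal_distance((2, 1), (1, 2)))
```

Under this signed model, (2, 1) needs three reversals to reach (1, 2), for example (2, 1), (−2, 1), (−2, −1), (1, 2).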


Putting it all together

Local alignments -> Synteny blocks -> Breakpoint graph -> Rearrangement scenario.


From Local Alignments to Synteny Blocks

A non-trivial issue!
• False orthologs.
• Micro-rearrangements.
• Sequence similarities in non-coding regions.


Human and Mouse Synteny Blocks


GRIMM-Synteny algorithm

Form an anchor graph whose vertex set is the set of anchors.

The anchors come from local alignments (use BLAST or BLAST-like techniques).


GRIMM-Synteny algorithm, contd.

Connect two vertices of the anchor graph by an edge if the distance between them is smaller than the gap size G.

GRIMM-Synteny algorithm, contd.

Determine the connected components of the anchor graph. Each connected component is called a cluster.

GRIMM-Synteny algorithm, contd.

Delete ‘small’ clusters (shorter than the minimum cluster size C in length).
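
The steps so far (anchor graph, connected components, deletion of small clusters) can be sketched as follows. This is an illustration under stated assumptions, not the GRIMM-Synteny implementation: each anchor is reduced to a hypothetical point (x, y) giving its position in genome 1 and genome 2, the distance between anchors is taken to be the Manhattan distance, and the length of a cluster is taken to be its span in genome 1. The function name `cluster_anchors` is made up for this example.

```python
from itertools import combinations


def cluster_anchors(anchors, gap, min_size):
    """Group anchors into clusters and drop the small ones.

    anchors  : list of (x, y) positions (assumed representation)
    gap      : gap size G; anchors closer than this are joined by an edge
    min_size : minimum cluster size C; clusters with a smaller span are deleted
    """
    parent = list(range(len(anchors)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Anchor graph: an edge joins two anchors whose distance is below the gap.
    # Checking all pairs is quadratic; fine for a sketch.
    for i, j in combinations(range(len(anchors)), 2):
        (x1, y1), (x2, y2) = anchors[i], anchors[j]
        if abs(x1 - x2) + abs(y1 - y2) < gap:
            parent[find(i)] = find(j)

    # Connected components of the anchor graph are the clusters.
    components = {}
    for i, anchor in enumerate(anchors):
        components.setdefault(find(i), []).append(anchor)

    def span(cluster):
        xs = [x for x, _ in cluster]
        return max(xs) - min(xs)

    # Delete clusters shorter than the minimum cluster size.
    return [sorted(c) for c in components.values() if span(c) >= min_size]


anchors = [(0, 0), (5, 6), (12, 11), (100, 200), (300, 50), (304, 55), (330, 70)]
print(cluster_anchors(anchors, gap=20, min_size=10))
```

With these toy values, only the cluster formed by the first three anchors survives the size filter.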

GRIMM-Synteny algorithm, contd.

Determine the cluster order and signs for each genome.

Output the strips in the resulting cluster order as synteny blocks.
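
One way to picture the order-and-signs step is the sketch below. It rests on several assumptions that the slides do not state: a single chromosome per genome, anchors given as hypothetical triples (x, y, strand) with strand +1 or −1, clusters labelled 1..k by their position in genome 1, and the sign of a cluster taken from the majority strand of its anchors. The name `signed_block_order` is illustrative.

```python
def signed_block_order(clusters):
    """Return the block order of genome 1 and the signed order of genome 2.

    clusters : list of clusters, each a list of (x, y, strand) anchors
               (assumed representation; strand is +1 or -1)
    """
    # Label clusters 1..k in the order they appear along genome 1.
    by_x = sorted(clusters, key=lambda c: min(a[0] for a in c))

    placed = []
    for label, cluster in enumerate(by_x, start=1):
        # Majority vote over anchor strands decides the block's sign.
        sign = 1 if sum(a[2] for a in cluster) >= 0 else -1
        start_in_genome2 = min(a[1] for a in cluster)
        placed.append((start_in_genome2, sign * label))

    genome1 = list(range(1, len(by_x) + 1))
    # Reading the blocks along genome 2 gives a signed permutation.
    genome2 = [signed_label for _, signed_label in sorted(placed)]
    return genome1, genome2


clusters = [
    [(100, 500, 1), (120, 520, 1)],
    [(300, 100, -1), (340, 60, -1)],
    [(600, 900, 1)],
]
print(signed_block_order(clusters))  # ([1, 2, 3], [-2, 1, 3])
```

The signed permutation for genome 2 is the kind of input the reversal distance problem expects.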


From Synteny Blocks to the breakpoint graph


From Breakpoint Graph to Rearrangement Scenarios

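b(π) − c(π) + h(π) ≤ d(π) ≤ b(π) − c(π) + h(π) + 1

Here b(π) is the number of breakpoints, c(π) the number of cycles in the breakpoint graph, h(π) the number of hurdles, and d(π) the reversal distance.

“Efficient sorting of genomic permutations by translocation, inversion and block interchange”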
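
As an illustration of the first two terms of the bound, the sketch below builds the breakpoint graph of a signed permutation of 1..n (compared against the identity) and counts b and c. It uses the common construction in which each signed element x becomes the pair (2x−1, 2x), reversed if x is negative, framed by 0 and 2n+1. Here c counts only cycles that are not the trivial ones formed by adjacencies. Hurdles are not computed, so b − c is only a lower bound on the distance. The function name is made up for this example.

```python
def breakpoints_and_cycles(perm):
    """Return (b, c) for a signed permutation of 1..n."""
    n = len(perm)

    # Signed -> unsigned: +x becomes (2x-1, 2x), -x becomes (2x, 2x-1).
    ext = [0]
    for x in perm:
        ext += [2 * x - 1, 2 * x] if x > 0 else [-2 * x, -2 * x - 1]
    ext.append(2 * n + 1)

    # Black edges join neighbouring ends of consecutive elements in perm.
    black = {}
    for k in range(0, 2 * n + 2, 2):
        u, v = ext[k], ext[k + 1]
        black[u], black[v] = v, u

    def gray(v):
        # Gray edges join 2i and 2i+1, i.e. neighbours in the identity.
        return v + 1 if v % 2 == 0 else v - 1

    seen = set()
    total = trivial = 0
    for start in ext:
        if start in seen:
            continue
        total += 1
        black_edges, v = 0, start
        while v not in seen:          # walk black, gray, black, ... around the cycle
            seen.add(v)
            w = black[v]
            seen.add(w)
            v = gray(w)
            black_edges += 1
        if black_edges == 1:          # black and gray edge coincide: an adjacency
            trivial += 1

    b = (n + 1) - trivial             # every non-adjacency is a breakpoint
    c = total - trivial
    return b, c


for p in [(1, 2, 3), (-1,), (-2, -1), (2, 1)]:
    b, c = breakpoints_and_cycles(p)
    print(p, "b =", b, "c =", c, "b - c =", b - c)
```

For (2, 1) this gives b − c = 2, while the actual reversal distance to the identity is 3; the gap is the hurdle term h.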


Reconstructing contiguous regions of an ancestral genome.


Reconstructing regions of an ancestral genome

Segmenting genomes based on pairwise alignments.

Nets -> Orthology Blocks -> Conserved Segments.


Nets to Orthology Blocks to Conserved Segments

First determine alignments

Then the orthology blocks

And then come the conserved segments.


Methodology

Predicting contiguous ancestral regions (CARs) from modern alignments.

Identification of small inversions

Properties of breakpoints.

Inferring CARs.


Consider..


Sundry Details - Small Inversions.

For ambiguous cases, go with the human data (the best documented so far).


A Sanity Check

Define a genome and follow it through its evolution!

Imagine a genome π with n elements that evolves through a series of rearrangements.

It works! 90.8% of the adjacencies predicted in the Boreoeutherian ancestor are correct!
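
The accuracy figure above is a fraction of predicted adjacencies that are correct. A minimal sketch of such a measure is given below, assuming that predicted CARs and the true ancestral chromosomes are both given as lists of signed block orders, and that an adjacency (a, b) read on one strand is the same as (−b, −a) read on the other. The function names are illustrative and the exact scoring used in the study may differ.

```python
def adjacencies(order):
    """Set of adjacencies in a signed order, independent of reading direction."""
    adj = set()
    for a, b in zip(order, order[1:]):
        # (a, b) and (-b, -a) describe the same adjacency; keep one canonical form.
        adj.add(min((a, b), (-b, -a)))
    return adj


def adjacency_precision(predicted_cars, true_chromosomes):
    """Fraction of predicted adjacencies that also occur in the true genome."""
    predicted = set().union(*(adjacencies(c) for c in predicted_cars))
    true = set().union(*(adjacencies(c) for c in true_chromosomes))
    return len(predicted & true) / len(predicted) if predicted else 0.0


true_genome = [[1, 2, 3, 4, 5]]
print(adjacency_precision([[1, 2, 3], [-5, -4]], true_genome))  # 1.0
print(adjacency_precision([[1, 2, -3]], true_genome))           # 0.5
```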


More realism!

• Employed a realistic evolutionary tree with branch lengths based on substitution frequencies.
• Rearrangements: 90% inversions, 5% translocations, 3.75% fusions, 1.25% fissions.
• Modeled the length of each block with a gamma distribution, with shape and scale parameters α = 0.7 and θ = 500.
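
The random choices in such a simulation can be sketched with the Python standard library. The event proportions and the gamma parameters are the ones listed above; everything else (function names, the seed, the sample size) is an assumption made for illustration, and the evolutionary tree itself is not modeled here. Note that `random.gammavariate(alpha, beta)` takes the shape first and the scale second.

```python
import random

# Proportions of rearrangement types, as listed on the slide.
EVENT_WEIGHTS = {
    "inversion": 0.90,
    "translocation": 0.05,
    "fusion": 0.0375,
    "fission": 0.0125,
}


def sample_events(k, rng):
    """Draw k rearrangement types according to EVENT_WEIGHTS."""
    kinds = list(EVENT_WEIGHTS)
    weights = [EVENT_WEIGHTS[kind] for kind in kinds]
    return rng.choices(kinds, weights=weights, k=k)


def sample_block_lengths(k, rng, shape=0.7, scale=500.0):
    """Draw k block lengths from a gamma distribution (shape alpha, scale theta)."""
    return [rng.gammavariate(shape, scale) for _ in range(k)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
events = sample_events(20000, rng)
lengths = sample_block_lengths(20000, rng)
print(events.count("inversion") / len(events))  # close to 0.90
print(sum(lengths) / len(lengths))               # close to shape * scale = 350
```

A gamma distribution with shape below 1 puts most of its mass on short blocks while still allowing occasional long ones.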


Comparison with other reconstructions


Details

• More data needed.
• Looking for better-sequenced outgroups.
• Require improvements in handling large duplications and deletions.
• Modeling gene conversion, and the expansion and contraction of short tandem repeats caused by strand slippage.
• Eventually: nucleotide resolution.


Inferring CARs


Thank you