Post on 29-May-2020

Comparative Genomics: Analysis of the Mouse Genome

Initial sequencing and comparative analysis of the mouse genome.

Mouse Genome Sequencing Consortium 2002, Nature 420:520-562.

Mouse/human genome comparison

• Conservation of synteny: number of chromosome rearrangements

• Repeats

• Evolution of orthologues. Ratio Ka/Ks

• Evolution of gene families

• Selection

Purpose

Highlights

• Genome 14% smaller than human (2.5 Gb vs 2.9 Gb).

• 90% corresponds to regions of conserved synteny.

• 40% can be aligned at the nucleotide level.

• ~0.5 nucleotide substitutions per site since the divergence of the two species.

• 5% under purifying selection.

More highlights

• Various measures of divergence show substantial variation across the genome.

• 30,000 protein-coding genes.

• Dozens of local gene expansions.

• Estimation of the rate of protein evolution in mammals. Certain classes of secreted proteins under positive selection.

• Marked differences in activity but similar types of repeat sequences.

• 80,000 SNPs identified.

Divergence time

p.521

Sequencing strategy

p.522

88 mapped ultracontigs with N50 length = 50.6 Mb

Syntenic segments and syntenic blocks

Size distribution of segments and blocks with conserved synteny between mouse and human

Composition of repeats in the mouse and human genomes, with the fraction of lineage-specific repeats (% of genome)

TEs      Mouse   Lineage specific   Human   Lineage specific
LINEs    19.2    16.5               21.0     7.9
SINEs     8.2     7.6               13.6    10.7
LTR       9.9     8.7                8.6     4.1
DNA       0.9     0.4                3.0     1.0
Total    38.6    33.5               46.4    24.0

Ancestral repeats: mouse 5%, human 22%
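The ancestral-repeat figures appear to be the difference between each genome's total repeat content and its lineage-specific share. A minimal sketch of that arithmetic, using the totals from the table; the dictionary layout and names are illustrative, not taken from the paper:

```python
# Total and lineage-specific interspersed repeat content, as % of genome,
# copied from the "Total" row of the table. Names are illustrative only.
REPEATS = {
    "mouse": {"total": 38.6, "lineage_specific": 33.5},
    "human": {"total": 46.4, "lineage_specific": 24.0},
}

# Assumption: ancestral repeats = total repeats minus lineage-specific repeats.
ancestral = {
    species: round(values["total"] - values["lineage_specific"], 1)
    for species, values in REPEATS.items()
}

for species, percent in ancestral.items():
    print(f"{species}: {percent}% of genome in ancestral repeats")
```

Under that assumption the differences come to about 5% for mouse and 22% for human, matching the figures quoted on the slide.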

Twofold higher nucleotide substitution rate in the mouse lineage

(estimated from comparison of ancestral repeats)

H: 0.17 substitutions per site

Mouse: 0.34 substitutions per site

Age distribution of interspersed repeats in the mouse genome

Pseudogenes in mouse genome: ~14,000. More than half are processed pseudogenes.

Gapdh: a single functional gene and ~400 pseudogenes distributed across 19 of the mouse chromosomes.

Comparison of 12,845 1:1 orthologues

Evolution of Cytochrome P450 gene families in mouse

Changes in genome size

Ancestor 2.9 Gb; Human 2.9 Gb; Mouse 2.5 Gb

Mouse lineage

Lineage-specific repeats   +900 Mb

Deletion                 -1,300 Mb

-----------------------------------------------

Net change                 -400 Mb

Human lineage

Lineage-specific repeats   +700 Mb

Deletion                   -700 Mb

-----------------------------------------------

Net change                    0 Mb

Expected proportion of the ancestral genome retained in both species

76% (human) x 55% (mouse) = 42%
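The bookkeeping on this slide can be checked with a few lines of arithmetic. The sketch below assumes that the retained fraction in each lineage is (ancestral size minus deleted sequence) divided by ancestral size, and that deletions in the two lineages are independent, which is what multiplying the two percentages implies. Function and variable names are illustrative.

```python
# Figures from the slide, in megabases (Mb). Names are illustrative only.
ANCESTOR_MB = 2900

LINEAGES = {
    "human": {"repeats_gained_mb": 700, "deleted_mb": 700},
    "mouse": {"repeats_gained_mb": 900, "deleted_mb": 1300},
}


def present_size_mb(ancestor_mb, repeats_gained_mb, deleted_mb):
    """Present-day size: ancestral size plus insertions minus deletions."""
    return ancestor_mb + repeats_gained_mb - deleted_mb


def retained_fraction(ancestor_mb, deleted_mb):
    """Fraction of the ancestral genome that escaped deletion in one lineage."""
    return (ancestor_mb - deleted_mb) / ancestor_mb


retained = {
    name: retained_fraction(ANCESTOR_MB, values["deleted_mb"])
    for name, values in LINEAGES.items()
}

# Assumption: deletions in the two lineages are independent of each other.
retained_in_both = retained["human"] * retained["mouse"]

for name, values in LINEAGES.items():
    size = present_size_mb(
        ANCESTOR_MB, values["repeats_gained_mb"], values["deleted_mb"]
    )
    print(f"{name}: {size} Mb today, {retained[name]:.0%} of ancestral sequence retained")
print(f"retained in both: {retained_in_both:.0%}")
```

The expected 42% is close to the roughly 40% of the genome that can be aligned at the nucleotide level.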

Neutral substitution rate

• Ancestral repeat sequence.

66.7% nucleotide identity; 0.46-0.47 substitutions per site

• Fourfold degenerate sites in codons of genes

67% nucleotide identity; 0.46-0.47 substitutions per site

Example: n = 100; μ = 0.667 (genome-wide average); p = 0.8; S = 2.8
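The example values are consistent with a conservation score computed as a normalized binomial deviation, S = sqrt(n) (p - μ) / sqrt(μ (1 - μ)), where n is the number of aligned sites in a window, p is the fraction of identical sites, and μ stands for the 0.667 genome-wide average identity. The slide does not show the formula, so treat this as an assumed reconstruction. The function name is illustrative.

```python
import math


def conservation_score(n, p, mu):
    """Normalized deviation of window identity p from the neutral average mu.

    Assumed form: S = sqrt(n) * (p - mu) / sqrt(mu * (1 - mu)).
    n  : number of aligned sites in the window
    p  : fraction of identical sites in the window
    mu : average identity expected under neutral evolution
    """
    return math.sqrt(n) * (p - mu) / math.sqrt(mu * (1 - mu))


# Values from the slide's example.
score = conservation_score(n=100, p=0.8, mu=0.667)
print(f"S = {score:.1f}")  # prints "S = 2.8"
```

With this form, a window of 100 sites at 80% identity sits about 2.8 standard deviations above the neutral expectation.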

Proportion of mammalian genome under evolutionary selection for biological function

Distributions of the conservation score S: S_neutral, S_genome, S_selected

20.8% of the windows are under selection

25.2% of human genome contained in windows

20.8% x 25.2% ≈ 5.2% of genome under selection
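The genome-wide estimate is the product of two fractions: the share of windows judged to be under selection and the share of the human genome that those windows cover. A minimal sketch of that multiplication, with an illustrative function name:

```python
def genome_fraction_under_selection(selected_window_fraction, genome_fraction_in_windows):
    """Fraction of the whole genome under selection.

    Assumes the windows are a representative sample, so the two fractions
    can simply be multiplied.
    """
    return selected_window_fraction * genome_fraction_in_windows


# Values from the slide: 20.8% of windows, covering 25.2% of the human genome.
fraction = genome_fraction_under_selection(0.208, 0.252)
print(f"{fraction:.1%} of the genome under selection")
```

The product is about 5.2%, in line with the roughly 5% quoted in the highlights.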

Proportion of genome under selection

p. 552

• 1.5% protein-coding regions of genes

• 1% UTR of protein-coding genes

• Regulatory regions that control gene expression

• Non-protein coding RNAs (ncRNAs)

• Chromosomal structural elements

• Recent pseudogenes

• Other?
