Very Short Dispersed Repeats
Also, palindromes
What are they?
Short sequences (between 4 and 13-ish nucleotides); one might even say they’re very short.
They occur multiple times in the genome, at dispersed intervals (not repeated right next to each other). Dispersed: “distributed or spread over a wide interval”.
It turns out that many of these very short dispersed sequences are palindromic – what
does that mean?
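The sequence reads the same in the 5’→3’ direction on one strand as it does in the 5’→3’ direction on the complementary strand. In other words, it is identical to its own reverse complement.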
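The definition of a DNA palindrome can be checked mechanically. The sketch below is a minimal illustration, not code from the presentation; the function names are made up for this example, and it assumes sequences contain only the letters A, C, G and T.

```python
# Complement table for the four DNA bases (no ambiguity codes handled).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """A DNA palindrome is identical to its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


print(is_palindrome("GCGATCGC"))   # the HIP1 octamer
print(is_palindrome("AAGTGCGGT"))  # a 9-mer that is not palindromic
```

Note that an odd-length sequence can never pass this test, because its middle base would have to be its own complement.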
Occurrence of Highly Iterated Palindromes (HIP1) in Cyanobacteria
Palindromic Sequence in Cyanobacteria
5’-GCGATCGC-3’
3’-CGCTAGCG-5’
But not all cyanobacteria have this sequence
BLAST Search
What is the purpose of HIP1?
ATGCATGATACGTAGCGATCGCCACCCGGGATTGCGATCGC
Match: GCGATCGC
What genes are nearby?
Prior research links HIP1 to Dam methylase recognition and to DNA profiling.
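The match step above can be sketched as a plain substring scan. This is only an illustration with a hypothetical helper name, not the search tool used for the slides. Because HIP1 is palindromic, a hit on one strand is also a hit at the same place on the other strand, so scanning a single strand is enough.

```python
def find_all(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Step forward by one so overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return positions


example = "ATGCATGATACGTAGCGATCGCCACCCGGGATTGCGATCGC"
hits = find_all(example, "GCGATCGC")
print(hits)  # [14, 33]
```

With gene coordinates available, each hit position could then be compared against gene start and end positions to answer the "what genes are nearby?" question.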
Thermosynechococcus elongatus BP1 & Synechococcus elongatus PCC 6301
Prochlorococcus marinus MIT9301 & Trichodesmium erythraeum IMS101
Results?
Hypothetical proteins: highest peak in Thermosynechococcus and Trichodesmium
Many metabolic proteins such as Ferredoxin, peptidase
Program to overlap nearby genes in different organisms
References
Robinson, P. J., Cranenburgh, R. M., Head, I. M. and Robinson, N. J. (1997), HIP1 propagates in cyanobacterial DNA via nucleotide substitutions but promotes excision at similar frequencies in Escherichia coli and Synechococcus PCC 7942. Molecular Microbiology, 24: 181-189.
Robinson, P. J., Gupta, A., Bleasby, A., Whitton, B., Morby, A. P. Singular over-representation of an octameric palindrome, HIP1, in DNA from many cyanobacteria.
Moya, A., Delaye, L. Abundance and distribution of the highly iterated palindrome 1 (HIP1) among prokaryotes. Mob Genet Elements. 2011 Sep-Oct; 1(3): 159-168.
Occurrences of DNA uptake sequence from Haemophilus influenzae in other Pasteurellaceae bacteria
By: Noha Mudhaffar
AAGTGCGGT
1516 (479 – 1001) 1913428 0.3815231
Highly Repeated Sequences
Family: Pasteurellaceae
Organism                                            Length (bp)  GC fraction  Occurrences of DNA USS
Actinobacillus-succinogenes-130Z                    2319663      0.44918594   1690
Haemophilus-influenzae-86-028NP                     1913428      0.3815231    1516
Actinobacillus-actinomycetemcomitans-HK1651         1995520      0.44412282   1507
Mannheimia-succiniciproducens-MBEL55E               2314078      0.42537978   1485
Haemophilus-influenzae-R2846                        1824242      0.37971717   1461
Haemophilus-somnus-2336                             2263857      0.37378067   1355
Haemophilus-somnus-129PT                            2012878      0.37191722   1245
Haemophilus-influenzae-R2866                        1933340      0.38079283   952
Pasteurella-multocida-subsp-multocida-str-Pm70      2257487      0.40404883   927
Haemophilus-influenzae-86028NP                      1738864      0.38510486   888
Haemophilus-influenzae-Rd-KW20                      1830138      0.38147888   737
Actinobacillus-pleuropneumoniae-L20                 2274482      0.41299513   73
Actinobacillus-pleuropneumoniae-serovar-1-str-4074  2292348      0.41376877   63
Mannheimia-haemolytica                              2498406      0.40754706   59
Haemophilus-ducreyi-35000HP                         1698955      0.38220495   41
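The slides do not say how the occurrence counts were produced. The sketch below shows one plausible way to count a non-palindromic motif such as AAGTGCGGT: scan the forward strand for the motif and for its reverse complement, since a hit for the reverse complement is an occurrence on the opposite strand. The function names are hypothetical, and counting both strands is an assumption of this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def count_overlapping(text: str, motif: str) -> int:
    """Count matches of motif in text, allowing overlaps."""
    count = 0
    start = text.find(motif)
    while start != -1:
        count += 1
        start = text.find(motif, start + 1)
    return count


def count_both_strands(genome: str, motif: str) -> tuple[int, int]:
    """Return (forward-strand hits, reverse-strand hits)."""
    genome, motif = genome.upper(), motif.upper()
    forward = count_overlapping(genome, motif)
    rc = reverse_complement(motif)
    # A palindromic motif equals its reverse complement; avoid counting it twice.
    reverse = 0 if rc == motif else count_overlapping(genome, rc)
    return forward, reverse


# Toy sequence: one USS on the forward strand, one on the reverse strand.
toy = "AAGTGCGGT" + "CCC" + "ACCGCACTT"
print(count_both_strands(toy, "AAGTGCGGT"))  # (1, 1)
```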
Organism                         Length (bp)  GC fraction  Occurrences of DNA USS
Escherichia-coli-DH10B           5004529      0.5103499    19
Escherichia-coli-53638           5289471      0.5095685    22
Escherichia-coli-HS              4643538      0.5081961    5
Escherichia-coli-E24377A         4980187      0.50621593   19
Escherichia-coli-E2348-69        5059346      0.5050742    18
Escherichia-coli-F11             5206906      0.5049682    19
Escherichia-coli-042             5379979      0.50532633   24
Escherichia-coli-E110019         5384084      0.5077157    17
Escherichia-coli-K12             4639221      0.50788873   23
Escherichia-coli-O157-H7         5594477      0.5048416    27
Escherichia-coli-B7A             5202558      0.5084804    15
Escherichia-coli-APEC-O1         5497653      0.5033812    14
Escherichia-coli-B171            5299753      0.50713533   19
Escherichia-coli-CFT073          5231428      0.50474805   7
Escherichia-coli-W3110           4646332      0.5079958    22
Escherichia-coli-E22             5516160      0.506397     22
Escherichia-coli-O157-H7-EDL933  5528445      0.5038297    10
Escherichia-coli-ATCC-8739       4746218      0.5086652    23
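To judge whether a count in these tables is high, it helps to estimate how often the 9-mer would turn up by chance. The sketch below uses a deliberately simple null model that is an assumption of this example, not something stated in the slides: bases are independent, A and T are equally frequent, G and C are equally frequent, and both strands are counted.

```python
def motif_probability(motif: str, gc_fraction: float) -> float:
    """Chance that a random position starts an exact copy of the motif."""
    p_gc = gc_fraction / 2        # probability of G, and of C
    p_at = (1 - gc_fraction) / 2  # probability of A, and of T
    prob = 1.0
    for base in motif.upper():
        prob *= p_gc if base in "GC" else p_at
    return prob


def expected_occurrences(motif: str, genome_length: int,
                         gc_fraction: float, both_strands: bool = True) -> float:
    """Expected number of chance matches under the independent-base model."""
    windows = genome_length - len(motif) + 1
    strands = 2 if both_strands else 1
    return strands * windows * motif_probability(motif, gc_fraction)


# Lengths and GC fractions are taken from the tables above.
print(expected_occurrences("AAGTGCGGT", 1913428, 0.3815231))   # H. influenzae 86-028NP
print(expected_occurrences("AAGTGCGGT", 4639221, 0.50788873))  # E. coli K12
```

Under this model the expectation for Haemophilus-influenzae-86-028NP is near 9, against 1516 observed, while for Escherichia-coli-K12 it is a few dozen, against 23 observed. A real analysis would probably use a richer background model, so these figures are only a rough guide.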
References
Frequency and Distribution of DNA Uptake Signal Sequences in the Haemophilus influenzae Rd Genome.
Genomic Sequence of an Otitis Media Isolate of Nontypeable Haemophilus influenzae: Comparative Study with H. influenzae Serotype d, Strain KW20.
Xu Z, Yue M, Zhou R, Jin Q, Fan Y, et al. (2011) Genomic Characterization of Haemophilus parasuis SH0165, a Highly Virulent Strain of Serovar 5 Prevalent in China. PLoS ONE 6(5): e19631. doi:10.1371/journal.pone.0019631.
DNA uptake signal sequences in naturally transformable bacteria.
The Evolutionary Change of DNA Uptake Sequences in Neisseria meningitidis.
What are DNA Uptake Sequences (DUS)?
Found in Neisseria spp. They constitute ~1% of the genome. Homology: 5’-GCCGTCTGAA-3’
Kingdom: Bacteria
Phylum: Proteobacteria
Class: Betaproteobacteria
Order: Neisseriales
Family: Neisseriaceae
Genus: Neisseria
Species: Neisseria meningitidis
DNA Uptake Sequence
AT-DUS AG-DUS
A Closer Look
Strain                 # of DUS  DUS Sequence          G+C (%)  Length of Genome (bp)
N. meningitidis MC58   1477      5’-ATGCCGTCTGAA-3’    51.5     2272351
N. meningitidis Z2491  1449      5’-AGGCCGTCTGAA-3’    51.8     2184406
N. gonorrhoeae FA1090  1522      5’-ATGCCGTCTGAA-3’    52.7     2153922
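The "~1% of genome" figure can be checked against this table with simple arithmetic. The sketch below assumes that each counted DUS is the full 12-nucleotide form and that copies do not overlap; neither assumption is stated in the slides. It also checks that both dialects contain the 10-nucleotide core 5’-GCCGTCTGAA-3’.

```python
CORE = "GCCGTCTGAA"

# (strain, number of DUS, DUS sequence, genome length in bp), from the table above.
strains = [
    ("N. meningitidis MC58", 1477, "ATGCCGTCTGAA", 2272351),
    ("N. meningitidis Z2491", 1449, "AGGCCGTCTGAA", 2184406),
    ("N. gonorrhoeae FA1090", 1522, "ATGCCGTCTGAA", 2153922),
]


def dus_fraction(n_dus: int, dus_length: int, genome_length: int) -> float:
    """Fraction of the genome occupied by DUS copies, assuming no overlaps."""
    return n_dus * dus_length / genome_length


for name, n_dus, dus, genome_length in strains:
    fraction = dus_fraction(n_dus, len(dus), genome_length)
    print(f"{name}: {fraction:.2%} of genome, contains core: {CORE in dus}")
```

For all three strains the result is roughly 0.8%, which is in line with the ~1% quoted earlier.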
DUS Inversion
Phylogeny: 16S rRNA and DUS
References
http://phil.cdc.gov/phil/details.asp?pid=2678
Frye SA, Nilsen M, Tønjum T, Ambur OH. Dialects of the DNA uptake sequence in Neisseriaceae. PLoS Genet. 2013
Six nucleotide palindromic sequences in Mycobacteriophage genomes
What about them? That’s a great question. I’m glad you asked.
How did we get here from Very Short Dispersed Repeats?
A very short story. Point A: “Singular over-representation of an octameric palindrome, HIP1, in DNA from many cyanobacteria.” From there: what about palindromes in mycobacteriophages?
Avoidance of 6 nt palindromes in Mycobacteriophages
Mycobacterial genomes generally do not avoid 6-nt palindromic sequences. Generally, this means that the viruses that infect them will not either. However, when two of the mycobacteriophage genomes were examined, they were found to avoid palindromes of size 6.
“The sole exception is provided by the two M. tuberculosis phages D29 and L5, which strongly avoid palindromes of size 6.” (Rocha et al., 2001)
L5 and D29: Mycobacteriophage cluster A2
Generated 186 random sequences with the same length as the average Mycobacteriophage genome (70627 nucleotides) and the same GC content (64%), then counted the occurrences of all 6-nucleotide palindromes in these randomly generated sequences.
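The simulation described here can be sketched as follows. This is not the BioBIKE code behind the slides; the function names, the seed, and the choice to draw each base independently are assumptions made for illustration. The defaults mirror the slide's parameters (186 sequences, 70627 nt, 64% GC).

```python
import random
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def all_palindromes(k: int = 6) -> set[str]:
    """Every k-mer (k even) that equals its own reverse complement."""
    result = set()
    for left in product("ACGT", repeat=k // 2):
        half = "".join(left)
        # The right half must be the reverse complement of the left half.
        result.add(half + half.translate(COMPLEMENT)[::-1])
    return result


def random_sequence(length: int, gc_fraction: float, rng: random.Random) -> str:
    """Independent random bases with the requested GC content."""
    at, gc = (1 - gc_fraction) / 2, gc_fraction / 2
    return "".join(rng.choices("ACGT", weights=[at, gc, gc, at], k=length))


def count_palindromes(seq: str, palindromes: set[str], k: int = 6) -> int:
    """Count every window of length k that is a palindrome (overlaps allowed)."""
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] in palindromes)


def simulate(n_sequences: int = 186, length: int = 70627,
             gc_fraction: float = 0.64, seed: int = 0) -> list[int]:
    rng = random.Random(seed)
    palindromes = all_palindromes(6)
    return [count_palindromes(random_sequence(length, gc_fraction, rng), palindromes)
            for _ in range(n_sequences)]


# Small demonstration run; call simulate() with no arguments for all 186 sequences.
counts = simulate(n_sequences=5)
print(min(counts), max(counts))
```

The resulting counts could then be binned into a histogram and compared with the counts from the real phage genomes.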
[Histogram: Occurrences of 6nt palindromes in random Mycobacteriophage-like DNA sequences. x-axis: occurrences, binned from 1240 to 1500 (plus “More”); y-axis: frequency of occurrences, 0 to 45]
Occurrences of all 6 nucleotide palindromes over the actual genomes of Mycobacteriophages (or at least the 186 that BioBIKE knows)
[Histogram: Occurrences of 6nt palindromes in genomes of Mycobacteriophage. x-axis: occurrences, binned from 320 to 2520; y-axis: frequency of occurrences, 0 to 16]
Which phages are outliers to the right? (>2000 occurrences) These 15 (phage, occurrences, cluster, subcluster):
Mycobacterium-phage-Cali 2485, C, C1
Mycobacterium-phage-Catera 2466, C, C1
Mycobacterium-phage-Alice 2437, C, C1
Mycobacterium-phage-LRRHood 2445, C, C1
Mycobacterium-phage-Rizal 2465, C, C1
Mycobacterium-phage-Nappy 2528, C, C1
Mycobacterium-phage-Ghost 2524, C, C1
Mycobacterium-phage-Drazdys 2506, C, C1
Mycobacterium-phage-ScottMcG 2480, C, C1
Mycobacterium-phage-Spud 2485, C, C1
Mycobacterium-phage-Sebata 2519, C, C1
Mycobacterium-phage-Pio 2505, C, C1
Mycobacterium-phage-Bxz1 2501, C, C1
Mycobacterium-phage-LinStu 2478, C, C1
Mycobacterium-phage-ET08 2466, C, C1
All of the cluster C1 phages in BioBIKE!
[Histogram: Right Side. x-axis: occurrences, binned from 2400 to 2540 (plus “More”); y-axis: frequency, 0 to 6]
What’re C1 cluster phages?
“Only two of these (Subclusters C1 and C2) correspond to phages with myoviral morphologies (with contractile tails)”
Okay, so they’re of the family Myoviridae. This means they are: generally lytic, and lack the necessary genes to become lysogenic. They have a contractile tail, and contracting the tail requires ATP.
C cluster phage isolated by Michael Kiflezghi!
Which sequences are occurring so frequently?
GGCGCC GACGTC CGCGCG ACCGGT
GTCGAC GCGCGC CCCGGG TGGCCA CAGCTG AGGCCT CGATCG CTGCAG TGCGCA
Many of these are recognition sites for restriction enzymes. Significant? There’s a chance. Warrants more investigation? It seems likely.
References and credit for pictures
phagesdb.org
Rocha, E., Danchin, A., & Viari, A. (2001). Evolutionary role of Restriction/Modification systems as revealed by comparative genome analysis. Genome Research, 11, 946-958. doi:10.1101/gr.153101
Discussion of avoidance of palindromic sequences of length 4 and 6 and possible reasons for this avoidance in bacteria and bacteriophages. Mentions 2 mycobacteriophages that exhibit an avoidance of 6nt palindromes, L5 and D29.
Article source: Pope WH, Jacobs-Sera D, Russell DA, Peebles CL, Al-Atrache Z, et al. (2011) Expanding the Diversity of Mycobacteriophages: Insights into Genome Architecture and Evolution. PLoS ONE 6(1): e16329. doi:10.1371/journal.pone.0016329
cyanobacteria picture: http://www.um.edu.mt/__data/assets/image/0005/166604/oculatella2.jpg
mycobacteriophage picture: http://openi.nlm.nih.gov/imgs/512/165/2884959/2884959_2711fig1.png
coral snake: http://upload.wikimedia.org/wikipedia/commons/8/8a/Coral_snake.jpg