The impact of prophage on bacterial chromosomes



Molecular Microbiology (2004) 53(1), 9–18 doi:10.1111/j.1365-2958.2004.04113.x

© 2004 Blackwell Publishing Ltd

MicroReview

The impact of prophages on bacterial chromosomes

Carlos Canchaya, Ghislain Fournous and Harald Brüssow*

Nestlé Research Centre, Nutrition and Health Department/Functional Microbiology Group, CH-1000 Lausanne 26 Vers-chez-les-Blanc, Switzerland.

Accepted 27 February, 2004. *For correspondence. E-mail [email protected]; Tel. (+41) 21 785 8676; Fax (+41) 21 785 8544. These authors contributed equally to the database mining.

Summary

Prophages were automatically localized in sequenced bacterial genomes by a simple semantic script leading to the identification of 190 prophages in 115 investigated genomes. The distribution of prophages with respect to presence or absence in a given bacterial species, the location and orientation of the prophages on the replichore was not homogeneous. In bacterial pathogens, prophages are particularly prominent. They frequently encoded virulence genes and were major contributors to the genetic individuality of the strains. However, some commensal and free-living bacteria also showed prominent prophage contributions to the bacterial genomes. Lysogens containing multiple sequence-related prophages can experience rearrangements of the bacterial genome across prophages, leading to prophages with new gene constellations. Transfer RNA genes are the preferred chromosomal integration sites, and a number of prophages also carry tRNA genes. Prophage integration into protein-coding sequences can lead to either gene disruption or new proteins. The phage repressor, immunity and lysogenic conversion genes are frequently transcribed from the prophage. The expression of the latter is sometimes integrated into control circuits linking prophages, the lysogenic bacterium and its animal host. Prophages are apparently as easily acquired as they are lost from the bacterial chromosome. Fixation of prophage genes seems to be restricted to those with functions that have been co-opted by the bacterial host.

Introduction

Microbial genomics revealed that, in some bacteria, a substantial amount of the bacterial DNA is in fact prophage DNA (Canchaya et al., 2003; Casjens, 2003). It also became increasingly clear that prophage DNA has played an important role in the evolution of bacterial pathogenicity (Boyd and Brüssow, 2002). These data have changed our understanding of phage–bacterium interaction from a simple parasite–host relationship into a two-way co-evolution of viral and bacterial genomes. Here, we provide a short overview of the impact of prophage integration on bacterial genome structure and diversification.

Methodological problems of prophage identification

Prophage identification is not an exact science, but a labour-intensive, empirical approach that needs a lot of insight (Casjens, 2003). To keep pace with the increasing number of sequenced bacterial genomes, a simple script was written in our laboratory (Fournous, 2003) that transforms the GenBank file of a bacterial genome into a FASTA file of all its open reading frames (ORFs). A BLASTX search is then conducted with each individual ORF, and the output of the significant database matches is searched by a semantic program for its annotations using positive (e.g. phage, integrase, tail, capsid, terminase, portal) and negative (e.g. macrophage, transposase, transposon, insertion) keywords. The hits are plotted for each ORF position of the genome, and a further script transforms this hit list into prophage genomes by performing a neighbour analysis for further phage-like genes around each individual hit. An example of the automatic data output is shown for Salmonella enterica serovar Typhi strain CT-18. It shows all phage hits (ticks on the outer circle) and the prophage identification by neighbour analysis (green bars) (Fig. 1A). Differences between the automatic and manual annotations by Casjens (2003) shown in the second circle were minimal and concern only remotely phage-like elements (Sti5, Sti6; Fig. 1A, centre).
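The keyword filtering and neighbour analysis described above can be illustrated with a minimal Python sketch. It is not the script of Fournous (2003), which is not reproduced here. The keyword lists contain only the examples quoted in the text, and the function names and the thresholds `max_gap` and `min_hits` are hypothetical values chosen for illustration.

```python
# Keyword lists limited to the examples quoted in the text (assumption:
# the original script may use longer lists).
POSITIVE = ("phage", "integrase", "tail", "capsid", "terminase", "portal")
NEGATIVE = ("macrophage", "transposase", "transposon", "insertion")


def is_phage_hit(annotation):
    """Return True if the annotation of a database match looks phage-related.

    Negative keywords are checked first, so that 'macrophage' is rejected
    even though it contains the positive keyword 'phage'.
    """
    text = annotation.lower()
    if any(word in text for word in NEGATIVE):
        return False
    return any(word in text for word in POSITIVE)


def neighbour_analysis(hit_flags, max_gap=3, min_hits=5):
    """Group phage hits that lie close together along the ORF list.

    hit_flags: one boolean per ORF, in genome order.
    max_gap:   hypothetical number of non-hit ORFs tolerated between two hits.
    min_hits:  hypothetical minimum number of hits for a candidate prophage.
    Returns a list of (first_orf_index, last_orf_index) tuples.
    """
    clusters, current = [], []
    for index, flag in enumerate(hit_flags):
        if not flag:
            continue
        # Close the running cluster when too many non-hit ORFs intervene.
        if current and index - current[-1] - 1 > max_gap:
            if len(current) >= min_hits:
                clusters.append((current[0], current[-1]))
            current = []
        current.append(index)
    if len(current) >= min_hits:
        clusters.append((current[0], current[-1]))
    return clusters
```

Plain substring matching is crude (for example, 'tail' also matches 'detail'), which is one reason why the output of such a filter still needs manual inspection.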

The maps of the candidate prophage elements are shown in Fig. 1A. Sti1 and Sti4 are clearly lambda-like prophages in their genetic organization, but not in their sequence; the structural genes resemble a Photorhabdus prophage. Sti3 is a Mu-like prophage, whereas Sti8 and Sti9 are P2-like prophages. The well-known gene map of these three phage types allowed the tentative identification of morons (extra non-phage genes inserted into the genomes of temperate phages) (Juhala et al., 2000). Notably, the morons encode important candidate virulence factors for this typhoid fever-causing Salmonella serovar (see Fig. 1A legend for annotations). Sti10 resembles a P4-like prophage and also contains a potential moron (Fig. 1A). Sti7 represents a tail gene cluster from a P2-like phage flanked by a pin invertase. The presence of recombinases (integrase, transposase, invertase) points to the mobile character of these DNA elements (Fig. 1A). In contrast, Sti2 and 5 show isolated phage genes that flank pertussis-like toxin genes (Sti2) and Salmonella serovar Typhimurium virulence genes (sspH and msgA) (Sti5). Sti5 and 6 were identified by a manual search (Casjens, 2003), but not by the automatic program (Fig. 1A). On the basis of theoretical models, prophages are predicted to decay by deletion of phage genes resulting in prophage remnants. In contrast, genes conferring a selective advantage such as a moron should be retained (Lawrence et al., 2003). Sti5 fits this prediction, but its prophage derivation must await the discovery of related, but less decayed prophage elements.

Fig. 1. Prophage distribution.

A. Salmonella enterica serovar Typhi strain CT18 and its prophages. The two outer circles represent the circular genome map. Prophages identified by Casjens (2003) are indicated by red boxes. The outermost ring displays the computer output of our prophage detection program (Fournous, 2003). Each tick is a potential phage hit. Prophages identified by automatic neighbour analysis in our program are marked by a small green box. The gene maps of the candidate prophage elements are displayed within the circle with their Sti number as identifier; they are not to scale. The prophage proteins are coloured according to putative function in the top five maps representing lambda- (1, 4), Mu- (3) and P2-like (8, 9) prophages. The colour code is as follows: red, lysogeny; violet, lysis; green, head; brown, head-to-tail; blue, tail; mauve, tail fibre; black, lysogenic conversion; orange, DNA replication; yellow, transcriptional regulation. In the P4-like (10) and prophage remnants (bottom five maps), phage-related genes are coloured red, and candidate lysogenic conversion genes are in black. The following candidate lysogenic conversion genes were identified: 1, msgA; 2, enterohaemolysin-1; 3, enterohaemolysin-2; 4, type II secretion; 5, restriction–modification; 6, sopE; 7, protein kinase/phosphatase; 8, pertussis-like toxin subunits A and B; 9, sspH. Recombinases are identified with I (integrase), T (transposase) and R (invertase).

B. Location of prophages from 115 investigated genomes projected on an idealized replichore of a bacterial genome (ori: origin, ter: terminus).

C. The ordinate gives the number n of prokaryotic genomes, the abscissa the number p of prophages per genome. The height of the bars gives the number of genomes showing the indicated number of prophages.

Distribution and orientation of prophages

One hundred and ninety prophages identified by the program in 115 sequenced prokaryotic genomes were projected on a hypothetical replichore (one half of the chromosome) (Fig. 1B). The position of the prophages is given as a percentage of the distance from ori (origin) to ter (terminus of replication) regardless of which half (left or right) of the chromosome the prophages were identified on. The representation does not take account of the variable length of the different bacterial replichores. Overall, the prophages were relatively evenly distributed, but the density of the prophages was greater near ter than near ori. Approximately half the sequenced prokaryotic genomes do not contain either prophages or well-defined remnants (Fig. 1C). This group contains all sequenced Archaea and most intracellular eubacterial pathogens. It has been argued that the latter have undergone recent genome contractions in which all non-essential genes for the intracellular niche have been lost. Prophages have, however, been observed in insect endosymbionts (phage APSE-1 and Wolbachia prophages), suggesting that intracellular bacteria can harbour phages and prophages. A quarter of all genomes contain one or two prophages. Approximately one-tenth of the genomes contributed the majority of the prophage sequences (Fig. 1C). Bacteria containing six or more prophages in their genome are mainly pathogens, with the notable exception of Lactococcus lactis, an organism used in cheese fermentation and under extreme pressure from bacteriophages. The comparisons of genomes from the same species suggest a stochastic process: prophages might be present or absent (e.g. Streptococcus agalactiae) (Fig. 2A) or different strains within a species may contain one, two or three prophages (e.g. Staphylococcus aureus) (Fig. 2B).

Prophages showed a preferred orientation for integration: the structural genes pointed mostly in the direction of the majority of the surrounding bacterial genes. When prophages changed position from the right to the left replichore, they also changed their orientation (Smoot et al., 2002a; Nakagawa et al., 2003; see also the circled prophage in Fig. 2B). This strong bias in phage orientation is still unexplained, although avoidance of collisions between RNA and DNA polymerases and of interference with the normal terminus functions (dif site) have been proposed (Campbell, 2002).
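The projection of prophage positions onto an idealized replichore, as described in this section, amounts to a simple coordinate transformation. The following Python sketch shows one way to compute it. The function name and its parameters are assumptions made for illustration, and the sketch presumes that ori and ter coordinates are known and distinct for the genome in question.

```python
def replichore_position(position, ori, ter, genome_length):
    """Distance from ori to ter in per cent (0 = ori, 100 = ter).

    All coordinates are bp positions on the same circular map. The
    prophage is placed on whichever of the two replichores contains it,
    so left and right halves are folded onto one idealized replichore.
    """
    # Length of the replichore running clockwise from ori to ter.
    clockwise_arm = (ter - ori) % genome_length
    # Clockwise distance of the prophage from ori.
    offset = (position - ori) % genome_length
    if offset <= clockwise_arm:
        return 100 * offset / clockwise_arm
    # Otherwise the prophage lies on the counter-clockwise replichore.
    counter_arm = genome_length - clockwise_arm
    return 100 * (genome_length - offset) / counter_arm
```

Because each position is divided by the length of its own replichore, genomes with replichores of unequal length are normalized to the same 0 to 100 scale, which matches the statement that the representation does not take account of variable replichore length.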

Prophages contribute to the genetic individuality of a bacterial strain

The role of mobile DNA in the diversification of bacterial genomes becomes apparent when the sequenced genomes from two different strains of the same species are aligned in a dot plot. For example, the alignment of two S. agalactiae serotypes showed about a dozen small gaps (Fig. 2A). The gaps corresponded nearly exclusively to mobile DNA; bioinformatic analysis revealed integrative plasmids, transposons and two prophages. Ten gaps showed an atypical nucleotide composition suggesting lateral gene transfer, and eight of the genomic islands were associated with integrases/transposases (Glaser et al., 2002). The variable genome parts defined by comparative genomics (Glaser et al., 2002), dot plot alignment and microarray hybridization (Tettelin et al., 2002) showed excellent concordance. The relative contribution of prophages to the strain-specific DNA sometimes varied between strains of the same species. In some Staphylococcus aureus strain comparisons, prophages were the major contributors to variability (Fig. 2B) (Kuroda et al., 2001), whereas in other comparisons, they competed with genomic islands and transposons for this role (Baba et al., 2002). An extreme case is presented by Streptococcus pyogenes in which all major gaps in the alignment of different M serotypes could be traced to prophage integration events (Smoot et al., 2002a). As all S. pyogenes and many S. aureus prophages encode proven or suspected virulence factors, the prophage-imposed diversity between the strains might be of clinical relevance (Beres et al., 2002).

The role of prophages in the individuality of the strains is not restricted to Gram-positive bacteria, but can also be seen in Gram-negative bacteria. Two Salmonella Typhi strains could be aligned over the entire genome when allowing for a large chromosomal inversion across two rRNA gene clusters. From 113 ORFs specific for one or other strain, 76 were prophage genes (Deng et al., 2002). In a Salmonella serovar Typhi and Typhimurium comparison, two chromosomal inversions and 12 larger alignment gaps were identified. Nine of the gaps can be traced to prophages or prophage remnants. The gaps represent either prophage insertion/deletion events or prophage replacements at a given chromosomal position. Escherichia coli can serve as a dramatic example in which half of the 1 Mb of DNA that distinguishes E. coli O157:H7 from K-12 is accounted for by prophage DNA (Ohnishi et al., 2001). Interestingly, prophage comparisons between strains sharing a very recent common ancestor (E. coli O157 Sakai and EDL933 or the two M3 S. pyogenes strains) already showed modular exchange reactions (Makino et al., 1999) (Fig. 3B, two tail fibre genes in P5 versus 315.2 comparison), suggesting that prophages are a highly dynamic part of the bacterial genomes.

Even in the comparisons of some closely related sister species such as Listeria innocua and L. monocytogenes (Glaser et al., 2001) or Lactobacillus johnsonii and L. gasseri (Pridmore et al., 2004; T. R. Klaenhammer and F. Desiere, unpublished), prophages account for a substantial part of the major alignment gaps.

DNA–DNA microarray analysis

Microarray analysis demonstrated that, in Salmonella serovar Typhimurium strains, genetic differences were essentially limited to prophage content (Porwollik et al., 2002; Chan et al., 2003). The same was true for S. pyogenes (Smoot et al., 2002b). Vibrio cholerae strains recovered as far back as 1910 demonstrated only 1% difference between the strains (Dziejman et al., 2002). The prepandemic clinical isolates and an environmental non-toxigenic O1 El Tor isolate lacked the CTX prophage encoding the cholera toxin CT, identifying this prophage as a major determinant of genetic variability. Microarray analysis showed that lateral gene transfer has played a fundamental role in the diversification of S. aureus: up to 22% of the genome comprised variable genetic material. Prophages comprised 17% of the total amount of 250 kb mobile and variable DNA (Fitzgerald et al., 2001). As prophages from bacterial pathogens frequently encode virulence factors, one could suspect that the analysis of pathogenic bacteria overestimates the contribution of prophages to the genetic individuality of a given strain. However, similar data were obtained with the gut commensal Lactobacillus johnsonii. Hybridization of eight molecularly distinct strains of L. johnsonii revealed that the prophage DNA contributed approximately half the strain-specific DNA of the reference strain (Fig. 2C) (Ventura et al., 2003a). When lactobacilli belonging to seven different Lactobacillus species were included in that analysis, genes related to the prophages of the reference strain were not detected. This observation demonstrates that mobile DNA is not necessarily widely distributed across species borders.

Fig. 2. Prophages contribute to the individuality of bacterial strains.

A. Dot plot comparison of the DNA sequences from Streptococcus agalactiae strains 2603V/R (vertical) and NEM316 (horizontal). The red boxes identify mobile DNA elements characterized by comparative genomics (Glaser et al., 2002) and microarray analysis (Tettelin et al., 2002). The gaps in the alignment are located by thin horizontal and vertical lines. The numbers are bp positions. The two prophages are annotated with SA1 and SA2. The dot plot was done with the DOTTUP program (http://www.emboss.org/), using the direct and reverse sequence; word size was 15, output format was POSTSCRIPT, the program was run in the direct and reverse direction and the figures were combined.

B. Dot plot of the DNA sequence alignment from the Staphylococcus aureus strains MSSA476 and NCTC 8325. Prophages are annotated as red boxes. A prophage that changed the replichore and the orientation is circled in red. The MSSA476 sequencing data were produced by the Microbial Sequencing group at the Sanger Institute and can be obtained from ftp://ftp.sanger.uk/pub/sa/ (NCBI accession number NC_002953). The NCTC8325 sequencing data were produced by the Staphylococcus aureus Genome Sequencing Project of J. Iandolo at the University of Oklahoma Health Sciences Center (NCBI accession number NC_002954).

C. Microarray analysis in the gut commensal Lactobacillus johnsonii. Outer circle: eight L. johnsonii strains were hybridized against the reference strain NCC533. Inner circle: eight different Lactobacillus species were hybridized against the NCC533 strain (ring 1): L. gasseri ATCC19992 and DSM20234 (2, 3), L. helveticus CNRZ303 (4), L. crispatus DSM20584 (5), L. gallinarum ATCC33199 (6), L. amylovorus DSM20531 (7), L. acidophilus ATCC4356 (8), L. reuteri DSM20016 (9) and L. plantarum ATCC14917 (10). Red, blue and black segments indicate regions that do or do not cross-hybridize with NCC533 and that are not represented on the microarray respectively. The outermost thin circle locates prophages and prophage remnants (orange boxes), integrase (red) and transposase (green) genes. For details, see Ventura et al. (2003a).

Prophages as target regions for chromosomal rearrangement

It is not rare that different prophages from the same lysogen share DNA sequence identity. As these regions are targets for homologous recombination, it was predicted that prophages mediate rearrangements of bacterial chromosomes (Brüssow and Hendrix, 2002). Several genome alignments support this prediction. For example, a Japanese S. pyogenes M3 strain differed from an American M3 isolate mainly by two sequential DNA inversions (Nakagawa et al., 2003). One inversion occurred across two prophages (Fig. 3A). Genomics analysis suggested that the cross-over point was located in the lysis modules. As the lysogenic conversion genes from S. pyogenes prophages are encoded downstream of the lysis genes, the cross-over results in a reshuffling of virulence genes between prophages (Fig. 3B). This might allow a wider horizontal spread of these genes as they become associated with phages with potential new host ranges and belonging to new immunity groups. This flexibility extends the possibilities of conversion gene permutation in polylysogenic hosts such as S. pyogenes. A spectacular case of apparently prophage-mediated recombinations is presented by the Gram-negative plant pathogen Xylella fastidiosa. The pathovars 9a5c and Temecula shared DNA sequence identity essentially over the entire genome, but showed a very complex alignment pattern, suggesting three successive chromosomal inversion events (Van Sluys et al., 2003). Genomics analysis suggested the following sequence of events (Fig. 4): inversion across prophage elements XfP1 and XfP2 sharing extensive DNA sequence identity over their entire length (Canchaya et al., 2003) (Fig. 4A and B), followed by an illegitimate recombination between prophage remnant Xt10 and a chromosomal site to the left of Xt1 (Fig. 4B and C) and, finally, an inversion across the prophages Xt8 and Xt2 sharing highly related integrase genes (Fig. 4C), leading to nearly perfectly aligned genomes (Fig. 4D). A genome inversion between the two sequenced O157 E. coli strains can also be traced to a recombination between prophages 933O and 933P. However, prophages are not a privileged recombination site: two Yersinia pestis genomes differ by a large number of small genome inversions and contain numerous prophage remnants, but only three inversions are flanked at one side by a prophage sequence (Deng et al., 2002). In other bacteria, rDNA and duplicated bacterial genes were used for genome rearrangements (com genes in the case of Fig. 3A).

Fig. 3. Genome rearrangements in Streptococcus pyogenes.

A. The US S. pyogenes serotype M3 strain MGAS315 (top) was aligned with the Japanese S. pyogenes serotype M3 strain SSI-1 (bottom) using the Artemis comparison tool. MGAS315 shows the likely original constellation, SSI-1 experienced a first inversion across the duplicated com genes and a second inversion across prophages P5 and P6. The position of the prophages is indicated by the red boxes.

B. Alignment of the prophages P5 and P6 with the corresponding prophages 315.1 and 315.2 locating the site of recombination within the lysis cassette. The shading connects regions of high DNA sequence identity between the compared phages. The modular structure of the prophages is colour-coded (key in Fig. 1).

Fig. 4. Genome rearrangements in Xylella fastidiosa.

A. Dot plot comparison of X. fastidiosa pathovar Temecula (horizontal) and pathovar 9a5c (vertical). The position of the prophages is indicated by red boxes. The chromosome inversion between prophages XfP1 and XfP2 leads to the dot plot shown in (B). An inversion across prophage Xt10 and an unknown chromosomal site leads to the dot plot depicted in (C). A final inversion between prophages Xt8 and Xt2 results in the near-perfect alignment of the two genomes shown in (D).
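The consequence of an inversion whose break points lie within two prophages can be made concrete with a toy model. In the Python sketch below a chromosome is a list of (gene, strand) pairs. The gene names (int1, lys1, tox1 and so on) are hypothetical placeholders for an integrase, a lysis module and a downstream lysogenic conversion gene, not genes taken from the sequenced strains, and the second prophage is assumed to be integrated in the opposite orientation.

```python
def invert(chromosome, start, end):
    """Invert chromosome[start:end]: reverse gene order and flip strands."""
    flipped = [(name, -strand) for name, strand in reversed(chromosome[start:end])]
    return chromosome[:start] + flipped + chromosome[end:]


# Two prophages in opposite orientation, separated by host genes B and C.
chromosome = [
    ("A", 1),
    ("int1", 1), ("lys1", 1), ("tox1", 1),      # prophage 1
    ("B", 1), ("C", 1),
    ("tox2", -1), ("lys2", -1), ("int2", -1),   # prophage 2, inverted
    ("D", 1),
]

# Cross-over within the lysis modules: everything between lys1 and lys2
# is inverted, including both conversion genes.
rearranged = invert(chromosome, 3, 7)
gene_order = [name for name, _ in rearranged]
```

In this model tox2 ends up downstream of lys1 and tox1 downstream of lys2, so the two prophages have exchanged their conversion genes while the intervening host genes B and C are inverted. Applying the same inversion a second time restores the original order.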

Integration sites

Many prophages from both Gram-negative and Gram-positive bacteria integrate into tRNA genes in a preferred orientation (Campbell, 1992; 2002). Mostly, but not always (Ventura et al., 2003a), the attP site of the phage reconstitutes the tRNA upon integration. Interestingly, an increasing number of prophages are described that carry tRNA genes (Fig. 5D). The phage-encoded tRNAs differed from the chromosomal tRNAs and were transcribed in the lysogen (Ventura et al., 2004). Such constellations might alleviate the consequences of prophage integration events into tRNA genes that do not reconstitute the original tRNA gene. However, integration into tRNA genes is far from being universal. Integration into intergenic regions and into ORFs has also been described (Canchaya et al., 2002). The prophage attachment site frequently faithfully complements the coding sequence of the protein. However, cases of inactivation of the protein-encoding function have also been described, a well-known case being the negative lysogenic conversion of the lipase gene by an S. aureus phage (Lee and Iandolo, 1986). In addition, integration into an ORF can also lead to an altered but functional protein. For example, lambdoid phage 21 inserts within the isocitrate dehydrogenase gene and introduces an alternative 165 bp 3′ end for that gene (Campbell et al., 1992). In rare cases, the phage recombination site attP is located within the phage integrase gene, and prophage integration leads to an altered int gene, which apparently stabilizes the lysogen (Magrini et al., 1999), or the bacterial attB site complements the int gene (Bruttin et al., 1997).

Transcription of prophage genomes

In the lambdoid phages, it is known that most genes inthe integrated prophages are not transcribed, and the bulkof the prophage genes are thus ‘passive genetic cargo’ tothe lysogen. However, this statement does not apply to alllambda prophage genes: lambda genes bor and lom areexpressed from the lysogen and confer serum resistanceto the E. coli lysogen during in vivo growth (Barondessand Beckwith, 1990). Two lambdoid prophages from E.coli O157 encode Shiga-like toxins (Stx), the major viru-lence factor of this important food pathogen. Their expres-sion is tightly controlled. Expression is achieved duringprophage induction and when iron, sensed by the Furtranscriptional regulator, becomes growth limiting (Wag-ner et al., 2002). Stx induces bleeding into the gut, andiron thus becomes available from decaying blood cells. Stx

Fig. 5. Transcription control of prophage genes.A. Genome map of the Corynebacterium diphtheriae strain NCTC13129 (Cerdeno-Tarraga et al., 2003). The transcriptional regulator gene dtxR is shown together with all genes regulated by DtxR (arrows) including the DT toxin encoded by the corynephage F shown in (B). The transcribed lysogenic conversion genes from a Streptococcus pyogenes prophage are shown in (C). They are under the control of SPIF released from pharyngeal cells (Broudy et al., 2001). Map of Lactobacillus plantarum prophage Lp1 in (D). The modular structure of the prophages is indicated by a colour code (key in Fig. 1). A similar modular structure is proposed for the corynephage despite the phylogenetic distance separating the bacterial hosts. The horizontal arrows next to the prophage maps indicate the prophage transcripts. In (B) and (C), only the right prophage end was investigated for transcription.

irp6A124

539

625

pi1

894

922

1061

1296

1520

hmuO

1734

adhA

pi321592162

2356

dtxR

Aori

pi2

DT

speC

mf2-

spd

1

mf2

mf4

B

C

D

SP

IF

Page 8: The Impact of Prophage on Bacterial Chromosomes

16 C. Canchaya, G. Fournous and H. Brüssow

© 2004 Blackwell Publishing Ltd, Molecular Microbiology, 53, 9–18

has no physiological export system, but E. coli O157 has‘learned’ to avoid this suicidal production by chargingbystander intestinal E. coli strains with its synthesis viainfection with the induced prophage (Gamage et al.,2003). In a number of Gram-positive pathogens, importantvirulence factors are encoded between the phage lysingene and attR, the right attachment site (Beres et al.,2002). Only a low transcription level of these genes wasobserved in broth growth, whereas contact with pharyn-geal cells induced the expression of these virulencegenes in S. pyogenes prophages (Broudy et al., 2001)(Fig. 5C). The expression of the diphtheria toxin inCorynebacterium diphtheriae is regulated via the DtxRtranscriptional regulator that binds in an iron-dependentway to operators of many bacterial genes (Fig. 5A),including the diphtheria toxin gene encoded by a cory-nephage (Fig. 5B). DtxR and Fur are the master regula-tors of large iron regulons, and prophage virulence geneexpression thus comes under the control of the host bac-terium. In contrast, constitutive transcription of moron-likeextra phage genes was seen in commensals and free-living bacteria (Ventura et al., 2003b). The morons werelocated in the vicinity of the left and right phage attach-ment sites (Fig. 5D). Interestingly, morons from pathogensand commensals sharing the same habitat (e.g. S. pyo-genes and Lactobacillus plantarum both isolated from theoral cavity) also shared sequence-related putativelysogenic conversion genes in their prophages (e.g. mf-type DNases; Fig. 5C and D).

In S. pyogenes and S. aureus, the prophage moron transcription is increased by prophage induction via an increase in copy number (Broudy et al., 2002; Sumby and Waldor, 2003). In these cases, it is not clear whether these genes are of ecological benefit to the phage, the lysogen or both (Broudy and Fischetti, 2003). Microarray analysis in a number of systems revealed that prophage genes figured prominently among the upregulated genes of the lysogen under stress conditions. This was observed in S. pyogenes growing in phagocytes (Voyich et al., 2003), recovered from mice (Kazmi et al., 2001) and human patients or experiencing growth temperature changes (Smoot et al., 2001).

A prominent upregulation of prophage gene expression was also seen in Gram-negative lysogens upon infection of animals (Dozois et al., 2003) or when changed from planktonic to biofilm growth (Pseudomonas aeruginosa; Whiteley et al., 2001).

Outlook

There is currently a lively discussion about the relative contribution of vertical and horizontal elements of evolution in bacteria. Some researchers argue that the tree-like representations of Darwinian evolution should be replaced by web-like phylogenies as a result of the rampant effect of horizontal gene transfer (HGT) in bacteria (Doolittle, 1999). Other researchers claim that the role of HGT for the evolution of bacterial genomes is overstated (Kurland et al., 2003). The sequencing of multiple strains from the same bacterial species demonstrated that HGT accounted for the majority of the intraspecies genome differences. Prophages are the major contributors to genome diversification in some species; in others, prophages compete with integrative plasmids, transposons or pathogenicity islands (which themselves show links to prophages) for this role. Still other sequenced bacteria lack prophages as a result of sampling bias (e.g. Streptococcus pneumoniae) or perhaps a genuine absence of phages (e.g. Helicobacter pylori).

Prophages played an important and, in some cases, a decisive role in the emergence of bacterial pathogens. V. cholerae, E. coli O157 and C. diphtheriae are examples in which the disease-specifying toxins are encoded by prophages. S. aureus, S. pyogenes and Salmonella are examples of pathogens in which many disease-modifying factors are encoded by multiple prophages and each individual prophage contributes only an incremental virulence increase.

With respect to the impact of HGT on bacterial evolution, we believe that two processes must be distinguished. In the field of medical microbiology and epidemiology, we observe events that occur across time frames that rarely exceed 100 years. Obviously, vertical modes of bacterial genome evolution are not very efficient over these short time periods. Therefore, the emergence of new pathogens is likely to rely on the acquisition of lateral DNA (or loss of DNA or a combination of both). In this context, phages are an ideal carrier for horizontal DNA and thus a likely motor for short-term bacterial diversification. But does this process influence the structure of bacterial genomes over evolutionary time periods? Over this time scale, it is not the acquisition, but the fixation, of prophage sequences that counts (Lawrence and Roth, 1999). In our survey, we have seen little, if any, evidence that prophages were fixed into the chromosome of a bacterial species. Prophages are apparently acquired as easily as they are lost from the chromosome. A site occupied by a prophage in one strain might be empty in another strain or occupied by another prophage. Within the confines of a bacterial species, a few cases of closely related, although not identical, prophages were reported that occupied the same chromosomal site in different bacteria (Casjens, 2003). The Neisseria prophages may be the clearest examples of long residence times during prophage decay. However, this case might be an exception as it relates to Mu-like phages that very rarely excise precisely once they are inserted.

However, fixing of entire prophages would not be expected as it makes no evolutionary sense to bacteria. The mechanism would no doubt be characterized by fixation of particular phage genes, the function of which has been co-opted by the host. Prophage genes without selective value to the host are therefore likely to be deleted. Owing to the lack of surrounding prophage genes, completely fixed prophage genes are thus difficult to detect. In the case of bacterial pathogens, a conspicuous observation is virulence genes flanked by isolated phage-like genes. Examples are Shiga toxin genes in Shigella dysenteriae (McDonough and Butterton, 1999), sopE2 in Salmonella Typhimurium, sspH and pertussis-like toxin genes in Salmonella Typhi (Fig. 1A). As related genes are found as lysogenic conversion genes in well-established prophages (Stx prophages in E. coli O157, SopE prophages in S. Typhimurium), we probably have here examples of fixed prophage genes. As these genes are likely to extend the ecological range of their bacterial hosts, prophages might have a greater impact on bacterial genomes than is indicated by the presence of clear-cut prophage DNA sequences in the sequenced bacterial genomes.

Acknowledgements

We thank Sherwood Casjens, Chris Blake, Anne Constable and Anne Bruttin for critical reading of the manuscript, and the Swiss National Science Foundation for financial support to Carlos Canchaya (research grant 5002-057832).

References

Baba, T., Takeuchi, F., Kuroda, M., Yuzawa, H., Aoki, K., Oguchi, A., et al. (2002) Genome and virulence determinants of high virulence community-acquired MRSA. Lancet 359: 1819–1827.

Barondess, J.J., and Beckwith, J. (1990) A bacterial virulence determinant encoded by lysogenic coliphage lambda. Nature 346: 871–874.

Beres, S.B., Sylva, G.L., Barbian, K.D., Lei, B., Hoff, J.S., Mammarella, N.D., et al. (2002) Genome sequence of a serotype M3 strain of group A Streptococcus: phage-encoded toxins, the high-virulence phenotype, and clone emergence. Proc Natl Acad Sci USA 99: 10078–10083.

Boyd, E.F., and Brüssow, H. (2002) Common themes among bacteriophage-encoded virulence factors and diversity among the bacteriophages involved. Trends Microbiol 10: 521–529.

Broudy, T.B., and Fischetti, V.A. (2003) In vivo lysogenic conversion of Tox(–) Streptococcus pyogenes to Tox(+) with lysogenic streptococci or free phage. Infect Immun 71: 3782–3786.

Broudy, T.B., Pancholi, V., and Fischetti, V.A. (2001) Induction of lysogenic bacteriophage and phage-associated toxin from group A streptococci during coculture with human pharyngeal cells. Infect Immun 69: 1440–1443.

Broudy, T.B., Pancholi, V., and Fischetti, V.A. (2002) The in vitro interaction of Streptococcus pyogenes with human pharyngeal cells induces a phage-encoded extracellular DNase. Infect Immun 70: 2805–2811.

Brüssow, H., and Hendrix, R.W. (2002) Phage genomics: small is beautiful. Cell 108: 13–16.

Bruttin, A., Foley, S., and Brüssow, H. (1997) The site-specific integration system of the temperate Streptococcus thermophilus bacteriophage phiSfi21. Virology 237: 148–158.

Campbell, A.M. (1992) Chromosomal insertion sites for phages and plasmids. J Bacteriol 174: 7495–7499.

Campbell, A.M. (2002) Preferential orientation of natural lambdoid prophages and bacterial chromosome organization. Theor Popul Biol 61: 503–507.

Campbell, A., Schneider, S.J., and Song, B. (1992) Lambdoid phages as elements of bacterial genomes (integrase/phage21/Escherichia coli K-12/icd gene). Genetica 86: 259–267.

Canchaya, C., Desiere, F., McShan, W.M., Ferretti, J.J., Parkhill, J., and Brüssow, H. (2002) Genome analysis of an inducible prophage and prophage remnants integrated in the Streptococcus pyogenes strain SF370. Virology 302: 245–258.

Canchaya, C., Proux, C., Fournous, G., Bruttin, A., and Brüssow, H. (2003) Prophage genomics. Microbiol Mol Biol Rev 67: 238–276.

Casjens, S. (2003) Prophages and bacterial genomics: what have we learned so far? Mol Microbiol 49: 277–300.

Cerdeno-Tarraga, A.M., Efstratiou, A., Dover, L.G., Holden, M.T., Pallen, M., Bentley, S.D., et al. (2003) The complete genome sequence and analysis of Corynebacterium diphtheriae NCTC13129. Nucleic Acids Res 31: 6516–6523.

Chan, K., Baker, S., Kim, C.C., Detweiler, C.S., Dougan, G., and Falkow, S. (2003) Genomic comparison of Salmonella enterica serovars and Salmonella bongori by use of an S. enterica serovar typhimurium DNA microarray. J Bacteriol 185: 553–563.

Deng, W., Burland, V., Plunkett, G., III, Boutin, A., Mayhew, G.F., Liss, P., et al. (2002) Genome sequence of Yersinia pestis KIM. J Bacteriol 184: 4601–4611.

Doolittle, W.F. (1999) Phylogenetic classification and the universal tree. Science 284: 2124–2129.

Dozois, C.M., Daigle, F., and Curtiss, R. (2003) Identification of pathogen-specific and conserved genes expressed in vivo by an avian pathogenic Escherichia coli strain. Proc Natl Acad Sci USA 100: 247–252.

Dziejman, M., Balon, E., Boyd, D., Fraser, C.M., Heidelberg, J.F., and Mekalanos, J.J. (2002) Comparative genomic analysis of Vibrio cholerae: genes that correlate with cholera endemic and pandemic disease. Proc Natl Acad Sci USA 99: 1556–1561.

Fitzgerald, J.R., Sturdevant, D.E., Mackie, S.M., Gill, S.R., and Musser, J.M. (2001) Evolutionary genomics of Staphylococcus aureus: insights into the origin of methicillin-resistant strains and the toxic shock syndrome epidemic. Proc Natl Acad Sci USA 98: 8821–8826.

Fournous, G. (2003) Automatisation de la recherche et de l’analyse de prophages dans les génomes bactériens. Travail de DEA, Université Henri Poincaré, Nancy, France.

Gamage, S.D., Strasser, J.E., Chalk, C.L., and Weiss, A.A. (2003) Nonpathogenic Escherichia coli can contribute to the production of Shiga toxin. Infect Immun 71: 3107–3115.

Glaser, P., Frangeul, L., Buchrieser, C., Rusniok, C., Amend, A., Baquero, F., et al. (2001) Comparative genomics of Listeria species. Science 294: 849–852.

Glaser, P., Rusniok, C., Buchrieser, C., Chevalier, F., Frangeul, L., Msadek, T., et al. (2002) Genome sequence of Streptococcus agalactiae, a pathogen causing invasive neonatal disease. Mol Microbiol 45: 1499–1513.

Juhala, R.J., Ford, M.E., Duda, R.L., Youlton, A., Hatfull, G.F., and Hendrix, R.W. (2000) Genomic sequences of bacteriophages HK97 and HK022: pervasive genetic mosaicism in the lambdoid bacteriophages. J Mol Biol 299: 27–51.

Kazmi, S.U., Kansal, R., Aziz, R.K., Hooshdaran, M., Norrby-Teglund, A., Low, D.E., et al. (2001) Reciprocal, temporal expression of SpeA and SpeB by invasive M1T1 group A streptococcal isolates in vivo. Infect Immun 69: 4988–4995.

Kurland, C.G., Canback, B., and Berg, O.G. (2003) Horizontal gene transfer: a critical view. Proc Natl Acad Sci USA 100: 9658–9662.

Kuroda, M., Ohta, T., Uchiyama, I., Baba, T., Yuzawa, H., Kobayashi, I., et al. (2001) Whole genome sequencing of methicillin-resistant Staphylococcus aureus. Lancet 357: 1225–1240.

Lawrence, J.G., and Roth, J.R. (1999) Genomic flux: genomic evolution by gene loss and acquisition. In Organization of the Prokaryotic Genome. Charlebois, R.L. (ed.). Washington, DC: American Society for Microbiology Press, pp. 263–289.

Lawrence, J.G., Hendrix, R.W., and Casjens, S. (2003) Where are the pseudogenes in bacterial genomes. Trends Microbiol 9: 535–540.

Lee, C.Y., and Iandolo, J.J. (1986) Lysogenic conversion of staphylococcal lipase is caused by insertion of the bacteriophage L54a genome into the lipase structural gene. J Bacteriol 166: 385–391.

McDonough, M.A., and Butterton, J.R. (1999) Spontaneous tandem amplification and deletion of the shiga toxin operon in Shigella dysenteriae 1. Mol Microbiol 34: 1058–1069.

Magrini, V., Storms, M.L., and Youderian, P. (1999) Site-specific recombination of temperate Myxococcus xanthus phage Mx8: regulation of integrase activity by reversible, covalent modification. J Bacteriol 181: 4062–4070.

Makino, K., Yokoyama, K., Kubota, Y., Yutsudo, C.H., Kimura, S., Kurokawa, K., et al. (1999) Complete nucleotide sequence of the prophage VT2-Sakai carrying the verotoxin 2 genes of the enterohemorrhagic Escherichia coli O157:H7 derived from the Sakai outbreak. Genes Genet Syst 74: 227–239.

Nakagawa, I., Kurokawa, K., Yamashita, A., Nakata, M., Tomiyasu, Y., Okahashi, N., et al. (2003) Genome sequence of an M3 strain of Streptococcus pyogenes reveals a large-scale genomic rearrangement in invasive strains and new insights into phage evolution. Genome Res 13: 1042–1055.

Ohnishi, M., Kurokawa, K., and Hayashi, T. (2001) Diversification of Escherichia coli genomes: are bacteriophages the major contributors? Trends Microbiol 9: 481–485.

Porwollik, S., Wong, R.M., and McClelland, M. (2002) Evolutionary genomics of Salmonella: gene acquisitions revealed by microarray analysis. Proc Natl Acad Sci USA 99: 8956–8961.

Pridmore, D., Berger, B., Desiere, F., Vilanova, D., Barretto, C., Pittet, A.C., et al. (2004) The genome sequence of the probiotic intestinal bacterium Lactobacillus johnsonii NCC 533. Proc Natl Acad Sci USA 101: 2512–2517.

Smoot, L.M., Smoot, J.C., Graham, M.R., Somerville, G.A., Sturdevant, D.E., Migliaccio, C.A., et al. (2001) Global differential gene expression in response to growth temperature alteration in group A Streptococcus. Proc Natl Acad Sci USA 98: 10416–10421.

Smoot, J.C., Barbian, K.D., Van Gompel, J.J., Smoot, L.M., Chaussee, M.S., Sylva, G.L., et al. (2002a) Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc Natl Acad Sci USA 99: 4668–4673.

Smoot, L.M., McCormick, J.K., Smoot, J.C., Hoe, N.P., Strickland, I., Cole, R.L., et al. (2002b) Characterization of two novel pyrogenic toxin superantigens made by an acute rheumatic fever clone of Streptococcus pyogenes associated with multiple disease outbreaks. Infect Immun 70: 7095–7104.

Sumby, P., and Waldor, M.K. (2003) Transcription of the toxin genes present within the Staphylococcal phage phiSa3ms is intimately linked with the phage’s life cycle. J Bacteriol 185: 6841–6851.

Tettelin, H., Masignani, V., Cieslewicz, M.J., Eisen, J.A., Peterson, S., Wessels, M.R., et al. (2002) Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae. Proc Natl Acad Sci USA 99: 12391–12396.

Van Sluys, M.A., de Oliveira, M.C., Monteiro-Vitorello, C.B., Miyaki, C.Y., Furlan, L.R., Camargo, L.E., et al. (2003) Comparative analyses of the complete genome sequences of Pierce’s disease and citrus variegated chlorosis strains of Xylella fastidiosa. J Bacteriol 185: 1018–1026.

Ventura, M., Canchaya, C., Pridmore, D., Berger, B., and Brüssow, H. (2003a) Integration and distribution of Lactobacillus johnsonii prophages. J Bacteriol 185: 4603–4608.

Ventura, M., Canchaya, C., Kleerebezem, M., de Vos, W.M., Siezen, R.J., and Brüssow, H. (2003b) The prophage sequences of Lactobacillus plantarum strain WCFS1. Virology 316: 245–255.

Ventura, M., Canchaya, C., Pridmore, R.D., and Brüssow, H. (2004) The prophages of Lactobacillus johnsonii NCC 533: comparative genomics and transcription analysis. Virology 320: 229–242.

Voyich, J.M., Sturdevant, D.E., Braughton, K.R., Kobayashi, S.D., Lei, B., Virtaneva, K., et al. (2003) Genome-wide protective response used by group A Streptococcus to evade destruction by human polymorphonuclear leukocytes. Proc Natl Acad Sci USA 100: 1996–2001.

Wagner, P.L., Livny, J., Neely, M.N., Acheson, D.W., Friedman, D.I., and Waldor, M.K. (2002) Bacteriophage control of Shiga toxin 1 production and release by Escherichia coli. Mol Microbiol 44: 957–970.

Whiteley, M., Bangera, M.G., Bumgarner, R.E., Parsek, M.R., Teitzel, G.M., Lory, S., and Greenberg, E.P. (2001) Gene expression in Pseudomonas aeruginosa biofilms. Nature 413: 860–864.