predicting mutation frequencies in stem–loop structures of...

13
Molecular Microbiology (2003) 48 (2), 429–441 © 2003 Blackwell Publishing Ltd Blackwell Science, LtdOxford, UKMMIMolecular Microbiology0950-382XBlackwell Publishing Ltd, 200348 2429441 Original Article Predicting mutations in stem–loop structuresB. E. Wright et al. Accepted 13 December, 2002. *For correspondence. E-mail [email protected]; Tel. ( + 1) 406 243 6676; Fax ( + 1) 406 243 4184. Predicting mutation frequencies in stem–loop structures of derepressed genes: implications for evolution Barbara E. Wright,* Dennis K. Reschke, Karen H. Schmidt, Jacqueline M. Reimers and William Knight Division of Biological Sciences, University of Montana, Missoula, MT 59812, USA. Summary This work provides evidence that, during transcrip- tion, the mutability (propensity to mutate) of a base in a DNA secondary structure depends both on the sta- bility of the structure and on the extent to which the base is unpaired. Zuker ' s DNA folding computer pro- gram reveals the most probable stem–loop structures (SLSs) and negative energies of folding (– G) for any given nucleotide sequence. We developed an interfac- ing program that calculates (i) the percentage of folds in which each base is unpaired during transcription; and (ii) the mutability index (MI) for each base, expressed as an absolute value and defined as follows: MI = (% total folds in which the base is unpaired) × (highest – G of all folds in which it is unpaired). Thus, MIs predict the relative mutation or reversion frequencies of unpaired bases in SLSs. MIs for 16 mutable bases in auxotrophs, selected during starvation in derepressed genes, are compared with 70 background mutations in lacI and ebgR that were not derepressed during mutant selection. All the results are consistent with the location of known mutable bases in SLSs. Specific conclusions are: (i) Of 16 mutable bases in transcribing genes, 87% have higher MIs than the average base of the sequence analysed, compared with 50% for the 70 background mutations. (ii) In 15 of the mutable bases of transcrib- ing genes, the correlation between MIs and relative mutation frequencies determined experimentally is good. There is no correlation for 35 mutable bases in the lacI gene. (iii) In derepressed auxotrophs, 100% of the codons containing the mutable bases are within one codon ' s length of a stem, compared with 53% for the background mutable bases in lacI . (iv) The data suggest that environmental stressors may cause as well as select mutations in derepressed genes. The implications of these results for evolution are discussed. Introduction Gene-specific transcription-induced mutations have been described in growing cells of prokaryotes (Brock, 1971; Herman and Dworkin, 1971; Beletskii and Bhagwat, 1996) and eukaryotes (Datta and Jinks-Robertson, 1995), as well as in starving cells of Escherichia coli (Lipschutz et al ., 1965; Wright, 1996; 1997; Wright and Minnick, 1997; Rudner et al ., 1999; Wright et al ., 1999). Recent reviews have discussed the mechanisms by which starva- tion and gene derepression can enhance non-random mutations (Foster, 1999; Wright, 2000; Bridges, 2001; Rosenberg, 2001). Previous studies have demonstrated a correlation between mutation rates and transcription as determined by measuring levels of specific mRNA (Wright et al ., 1999). However, accurate rates of mRNA synthesis can only be determined when both mRNA concentration and half-life are known. Current studies on mRNA turn- over (J. M. Reimers et al ., submitted) have confirmed the conclusions of the earlier investigations (Wright et al ., 1999). There are at least two mechanisms by which tran- scription can increase mutation rates in localized areas of the genome: (i) separation of the two DNA strands exposes the non-transcribed strand, which is then vul- nerable to mutations (Lindahl and Nyberg, 1972; 1974; Coulondre et al ., 1978; Singer and Sterns, 1982; Fix and Glickman, 1987; Frederico et al ., 1990; 1993; Skandalis et al ., 1994); and (ii) the advancing RNA polymerase complex drives localized supercoiling (Balke and Gralla, 1987; Liu and Wang, 1987; Dayn et al ., 1991; Dayn et al ., 1992; Rahmouni and Wells, 1992; Krasilnikov et al ., 1999), which creates and stabilizes stem–loop structures (SLSs) containing vulnerable unpaired and mispaired bases (Ripley, 1982; Todd and Glickman, 1982; Pananicolaou and Ripley, 1991). Thus, these two mechanisms, in concert, are ideal for obtaining the vari- ants most likely to survive each kind of stress while minimizing random genomic damage (Wright, 2000).

Upload: others

Post on 10-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

Molecular Microbiology (2003)

48

(2), 429–441

© 2003 Blackwell Publishing Ltd

Blackwell Science, LtdOxford, UKMMIMolecular Microbiology0950-382XBlackwell Publishing Ltd, 200348

2429441

Original Article

Predicting mutations in stem–loop structuresB. E. Wright et al.

Accepted 13 December, 2002. *For correspondence. [email protected]; Tel. (

+

1) 406 243 6676; Fax (

+

1) 406 2434184.

Predicting mutation frequencies in stem–loop structures of derepressed genes: implications for evolution

Barbara E. Wright,* Dennis K. Reschke,Karen H. Schmidt, Jacqueline M. Reimers andWilliam Knight

Division of Biological Sciences, University of Montana, Missoula, MT 59812, USA.

Summary

This work provides evidence that, during transcrip-tion, the mutability (propensity to mutate) of a base ina DNA secondary structure depends both on the sta-bility of the structure and on the extent to which thebase is unpaired. Zuker

'

s DNA folding computer pro-gram reveals the most probable stem–loop structures(SLSs) and negative energies of folding (–

∆∆∆∆

G) for anygiven nucleotide sequence. We developed an interfac-ing program that calculates (i) the percentage of foldsin which each base is unpaired during transcription;and (ii) the mutability index (MI) for each base,expressed as an absolute value and defined asfollows: MI

====

(% total folds in which the base isunpaired)

××××

(highest –

∆∆∆∆

G of all folds in which it isunpaired). Thus, MIs predict the relative mutation orreversion frequencies of unpaired bases in SLSs. MIsfor 16 mutable bases in auxotrophs, selected duringstarvation in derepressed genes, are compared with70 background mutations in

lacI

and

ebgR

that werenot derepressed during mutant selection. All theresults are consistent with the location of knownmutable bases in SLSs. Specific conclusions are: (i)Of 16 mutable bases in transcribing genes, 87% havehigher MIs than the average base of the sequenceanalysed, compared with 50% for the 70 backgroundmutations. (ii) In 15 of the mutable bases of transcrib-ing genes, the correlation between MIs and relativemutation frequencies determined experimentally isgood. There is no correlation for 35 mutable bases inthe

lacI

gene. (iii) In derepressed auxotrophs, 100% ofthe codons containing the mutable bases are withinone codon

'

s length of a stem, compared with 53% for

the background mutable bases in

lacI

. (iv) The datasuggest that environmental stressors may cause aswell as select mutations in derepressed genes.The implications of these results for evolution arediscussed.

Introduction

Gene-specific transcription-induced mutations have beendescribed in growing cells of prokaryotes (Brock, 1971;Herman and Dworkin, 1971; Beletskii and Bhagwat, 1996)and eukaryotes (Datta and Jinks-Robertson, 1995), aswell as in starving cells of

Escherichia coli

(Lipschutz

et al

., 1965; Wright, 1996; 1997; Wright and Minnick,1997; Rudner

et al

., 1999; Wright

et al

., 1999). Recentreviews have discussed the mechanisms by which starva-tion and gene derepression can enhance non-randommutations (Foster, 1999; Wright, 2000; Bridges, 2001;Rosenberg, 2001). Previous studies have demonstrated acorrelation between mutation rates and transcription asdetermined by measuring levels of specific mRNA (Wright

et al

., 1999). However, accurate rates of mRNA synthesiscan only be determined when both mRNA concentrationand half-life are known. Current studies on mRNA turn-over (J. M. Reimers

et al

., submitted) have confirmed theconclusions of the earlier investigations (Wright

et al

.,1999).

There are at least two mechanisms by which tran-scription can increase mutation rates in localized areasof the genome: (i) separation of the two DNA strandsexposes the non-transcribed strand, which is then vul-nerable to mutations (Lindahl and Nyberg, 1972; 1974;Coulondre

et al

., 1978; Singer and Sterns, 1982; Fix andGlickman, 1987; Frederico

et al

., 1990; 1993; Skandalis

et al

., 1994); and (ii) the advancing RNA polymerasecomplex drives localized supercoiling (Balke and Gralla,1987; Liu and Wang, 1987; Dayn

et al

., 1991; Dayn

et al

., 1992; Rahmouni and Wells, 1992; Krasilnikov

et al

., 1999), which creates and stabilizes stem–loopstructures (SLSs) containing vulnerable unpaired andmispaired bases (Ripley, 1982; Todd and Glickman,1982; Pananicolaou and Ripley, 1991). Thus, these twomechanisms, in concert, are ideal for obtaining the vari-ants most likely to survive each kind of stress whileminimizing random genomic damage (Wright, 2000).

Page 2: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

430

B. E. Wright

et al.

© 2003 Blackwell Publishing Ltd,

Molecular Microbiology

,

48

, 429–441

Evidence from many laboratories has demonstrated thattranscription drives localized supercoiling and that stem–loop conformations contain mutable bases. However, nodirect evidence has yet been presented to document acorrelation in specific genes between increased ratesof transcription and enhanced mutation localized inunpaired bases of SLSs.

The highest concentration of supercoiling and SLSsoccurs in the wake (5

) of the transcription bubble, andSLSs can form in either or both strands (cruciforms). DNAsecondary structures not only expose mutable bases, butalso increase the time of their exposure by periodicallyblocking the progress of transcribing RNA polymerasecomplexes during transcript elongation. Transcription-enhanced mutations should therefore occur preferentiallyat pause sites, and mutation rates should be increased byactivators such as guanosine tetraphosphate (ppGpp),which has been shown to lengthen the duration of thesepauses (Kingston

et al

., 1981). Specific effects of stresson transcription are essential to the mechanisms of non-random mutations in evolution (see

Discussion

; Wright,2000).

To explore the possible correlation of secondary struc-ture formation and sites of mutation, Zuker's DNA foldingcomputer program (Benham, 1992; SantaLucia, 1998;Zuker

et al

., 1999) was used. This program folds a seg-ment of ssDNA, predicts the most thermodynamically sta-ble secondary structure that could form and calculatesthe negative energy of folding (–

G) for each structure(http://www.bioinfo.rpi.edu/applications/mfold/old/dna).For example, a 200 nucleotide (nt) sequence containinga known mutable base is copied as input to the program.The program is then instructed to fold a series of 30 ntsegments that include the mutable base, and then to showthe structure with the highest –

G for that base. Thetheoretical methods used in screening DNA sequencesfor sites susceptible to superhelical strand separation andtransition to alternative structures have been verifiedexperimentally in other systems (Benham, 1992). A com-puter program that interfaces with Zuker's program hasbeen developed in this laboratory in order to obtain addi-tional information regarding the mutability of individualbases. This new program calculates the extent to whicheach base is unpaired during transcription (see

Experi-mental procedures

).In four

Escherichia coli

auxotrophs (

leuB

,

argH

,

malT

and

lacZ

) studied in our laboratory, mutant sequences andnumber of mutations have been determined. Sequencesand mutation frequencies of seven

trpA

auxotrophs werefound in the literature (Isbell and Fowler, 1989; Bhamre

et al

., 2001). As controls, data were also obtained fromthe literature for background (‘spontaneous’) mutationsnot isolated in derepressed genes, namely

lacI

and

ebgR

(Schaaper

et al

., 1986; Hall, 1999). The presence of loss-

of-function mutations in these repressor genes is revealedindirectly by allowing the constitutive expression of

β

-galactosidase.

Results

The location of mutable bases in high and low –

G areas of a gene

In order to correlate the location of mutable bases in SLSswith the –

G of their structures, about 130 nt of the non-transcribed strand, including the mutation(s) in

leuB

,

argH

,

malT

or

lacZ

, were folded in successive, overlapping 30 ntsegments, beginning at each fifth nt (Fig. 1). Thus, eachnucleotide was included in six successive folds. The mag-nitudes of the –

G values indicate the relative stability ofthe structures formed by each segment: the greater the –

G value, the more likely a stable secondary structure willform and create a pause site. Figure 1 shows that muta-tions generally occur in structures with high –

Gs. Thatis, in

lacZ

,

leuB

,

argH

,

malT

and codon 49 of

trpA

, muta-tions occur in the highest –

G peaks present. In the otherthree codons of interest in

trpA

, mutations occur in threeof the four highest peaks. It should be pointed out that ahigh –

G results primarily from the stability of the stem(s),and does not necessarily indicate the presence of anunpaired base vulnerable to a chemical change (see

Dis-cussion

). Presumably, mutations usually occur when anunpaired base exists in its highest –

G structure. TheSLSs that form at peak –

G values in selected genes aredescribed below.

The DNA folding program appears to describe a bio-chemical process that is robust. That is, although thestructures formed in the 30 nt window are not as large andstable as those in 40 nt and 50 nt windows (not shown),all three analyses show a peak of high negative energyvalues and structures with a similar conformation at thesite of the mutable base. This suggests a reliable andbuffered system for the dynamic interconversion of sec-ondary structures. Thus, despite probable continuousvariations in the length of segments folded during tran-scription, mutation and (presumably) pause sites gener-ally coincide with peak –

G values. The most stringentcondition used, folding 30 nt (compared with 40 and 50 nt)shows the most specific correlation of maximum peak –

G values with the presence of each mutant nucleotide,reflecting, perhaps, the average number of nucleotidesfolded

in vivo

(Fig. 1). Although the 50 nt foldings hadpeaks at the mutation sites, they also had higher peakselsewhere, and were therefore not as specifically corre-lated with the mutation sites. Very large windows generatevery complex structures but, in choosing a smaller win-dow, we attempted to simulate transcription conditions

insilico

. Therefore, a 30 nt window was chosen for analysing

Page 3: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

Predicting mutations in stem–loop structures

431

© 2003 Blackwell Publishing Ltd,

Molecular Microbiology

,

48

, 429–441

the mutant sequences. However, in the analysis of

lacZ

,it was also found necessary to use a 40 nt window in orderto understand the mechanism of the triple mutation thatoccurred (see below).

In the

trpA

gene (Yanofsky

et al

., 1981; Isbell andFowler, 1989; Bhamre

et al

., 2001), the locations of sevenmutable bases in four codons are known, and thesequences containing these bases are shown (Fig. 1).

Fig. 1.

The location and –

G values of 16 known mutable bases in

E. coli

auxotrophs. Beginning at each fifth nucleotide, 130 nt (

lacZ

, including codon 63;

leuB

, including codon 286;

argH

, including codon 4;

malT

, including codon 545; and

trpA

, including codon 49) or 260 nt (

trpA

, including codons 183, 211 and 234) of the non-transcribed strands were folded in successive, overlapping 30 nt windows. Each bar indicates the beginning of a window, and the number of mutable bases in each fold appears above each bar. The entire

trpA

gene was included in the analysis, but a 340 nt segment without mutable bases was omitted from the following sequence and is indicated by dotted lines. Codon numbers are in parenthesis, and mutable bases are capitalized.…gct aat tga agc cgg tgc tga gcg ctg GAg (49) tta ggc atc ccc ttc tcc gac cca ctg gca gat ggc ccg acg att caa … gca ggc gtg aCc (183) ggc gca gaa aac cgc gcc gcg tta ccc ctc aat cat ctg gtt gcg aag ctg aaa gag tac aac gct gca cct cca ttg cag GGa (211) ttt ggt att tcc gcc ccg gat cag gta aaa gca gcg att gat gca gga gct gcg ggc gcg att tct GGt (234) tcg ccc…

Page 4: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

432

B. E. Wright

et al.

© 2003 Blackwell Publishing Ltd,

Molecular Microbiology

,

48

, 429–441

These seven mutations are also located in high –

Gpeaks. In contrast, Fig. 2 shows an apparently randomdistribution of mutable bases in

lacI

and

ebgR

in whichtranscription was not induced. Note, especially in the large

lacI

data set (Fig. 2 legend), that most of the 414 muta-tions that were sequenced are in the first and secondpositions of the codons (see

Discussion

).

Calculation of a mutability index for mutable bases

DNA folding analyses have suggested that mutable basesin selected, derepressed genes are preferentially locatedin high –

G SLSs, in contrast to mutable bases in non-induced genes. The mutability of a base should dependboth on the stability (–

G) of its SLS and on whether thebase is unpaired in these structures (Lindahl and Nyberg,1972; 1974; Coulondre

et al

., 1978; Singer and Sterns,1982; Fix and Glickman, 1987; Frederico

et al

., 1990;1993; Skandalis

et al

., 1994). Therefore, a program hasbeen developed that will provide this information for eachbase (see

Experimental procedures

). To describe andevaluate these assumed causes of base mutability, wehave defined a new term, called the mutability index (MI),and express it as an absolute value:

MI

=

(% total folds in which the base is unpaired)

×

(highest –

G of all folds in which it is unpaired).

Thus, MIs indicate the relative mutation frequencies ofbases in predicted SLSs. The highest –

G term gives themost weight to the most stable fold containing the

unpaired base. That is, primary importance is given to thestructure in which the base is exposed for the longestperiod of time and therefore has the highest probability ofmutation. In addition, the average –

G and MI of all thebases included in the sequence analysed are calculated,for comparison with individual base –

Gs and MIs. Thisinformation is necessary, as mutable bases exist in differ-ent –

G regions. For example, the average –

G value inthe

lacZ

analysis is 4.5, whereas this value for

leuB

is 2.4(Fig. 1).

The percentage of each wild-type mutable base MIabove or below the average MI in the –

G region of eachbase was calculated for the 16 mutable bases in auxotro-phs and for the 70 mutable bases in control genes. Thepercentage above average MI values for the auxotrophbases is 87%, and this value for

lacI

and

ebgR

is 50%.

Comparing MIs with mutation frequencies

If mutable bases are located in unpaired bases of SLSspredicted by the Zuker DNA folding program, MI valuesmay predict relative mutation frequencies. Sequences andreversion frequencies have been determined for bases inthe same codons of seven

trpA

mutant alleles (Isbell andFowler, 1989; Bhamre

et al

., 2001). In Fig. 3, the SLSswith the highest –

G for wild type and three mutant allelesin codon 49 are shown. In the wild type, a strong stemwith five C*G basepairs is formed. As the G in the firstposition of the codon is in the stem, its MI is very low (0.3)in this 30 nt SLS, as well as in smaller windows (not

Fig. 2.

The location and –

∆G values of mutable bases in the E. coli repressors, lacI and ebgR. Beginning every fifth nt, 30 nt windows were folded. Each bar indicates the nucleotide at the beginning of the window, and the number of mutable bases in each fold appears above each bar. Wild type lacI, 57 mutations (mutable bases are capitalized) have been found:…gtg aaa cca gta ACg tta tac gat gTc GCa gag tat Gcc ggt gTc tCt Tat CAG ACc GTt TCc CGc GTg gTg Aac CAg gcc agc cac GTt tCt gcg aaa Acg cgg gaa aaa GTg Gaa gcg GCg atg Gcg gag ctg aat TAC att cCc aAC cgc gTg GCa CAA CAa cTg gCg Ggc Aaa Cag tCg ttg ctg att ggc gtt gcc acc tcc agt cTg gcc…Wild-type ebgR sequence, 13 mutation sites (mutable bases are capitalized) have been found:…gac atc gca atc gaa gCt ggc gta Tcc ctg gcg aca Gta tcc AGg gtc tta aat gac gAT ccg aca ttg Aat Gtg aaa gaa gag aCg aaa cat cgc att ctc Gag atc gcc Gaa aag ctg gag taC aag acc…

Page 5: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

Predicting mutations in stem–loop structures 433

© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441

shown). However, in higher –∆G folds of 35 or 40 nt, morestable stems occur, and the MI is zero, i.e. this G is pairedin all folds. Therefore, the mutation of this G to an A or T,giving rise to alleles 11 and 88, respectively, is predictedto occur in relatively small SLSs. These latter SLSs haveonly three C*G pairs in their stems and thus lower –∆Gvalues (see bar graph in Fig. 3). However, as they areunpaired much of the time, their MIs are much higher (2.3and 2.5) than that of the wild-type G (0.3). The relative MIvalues calculated for the three mutable bases in thesemutants compare favourably with their relative reversionfrequencies (Isbell and Fowler, 1989; Bhamre et al., 2001;see Fig. 3, table inset). Mutation of the middle base incodon 49 (from an A to a T) retains the strong stem (allele3), and the MI of that T reverting to wild-type A is thehighest of all (3.5). These relative MIs predict that thereversion frequency of allele 3 should be higher than thatof 11 and 88, and that the latter alleles should havecomparable reversion frequencies. The table inset com-pares MIs with reversion frequencies. Figure 4 summa-rizes similar data for four more mutant alleles in codons211 and 234 of trpA. Again, the relative MIs compare wellwith their relative reversion frequencies, and MIs of themutant bases are higher than those of the wild-type baseto which they revert.

Figures 3 and 4 also show that the highest –∆G SLSsof wild-type sequences are often the same as that of theirmutant alleles. For example, in codon 49, the most stableSLS of allele A 3 is the same as that of wild type. Similarly,A 23 and A 26 in codon 211 and A 78 and A 58 in codon234 have the same SLSs as their respective wild types.Transcription should enhance the mutation frequency ofall mutable bases, including those of wild type, by local-ized increases in the number of SLSs.

The most probable SLSs formed by the non-transcribedstrands (NTSs) at peak –∆G values were compared withthose formed by the transcribed strands (TSs) and usuallyfound to mirror one another. In such genes, therefore,mutations are equally probable in the NTS and TS. How-ever, in the argH gene (and, to a lesser extent, leuB),different structures form in 30 nt folds of the two strands,as shown in Fig. 5A. This occurs because of nearestneighbour interactions, the asymmetry of the base stack-ing rules and because a stem structure containing a G*Tbond forms in the TS (SantaLucia, 1998; Zuker et al.,1999). This stem cannot form in the NTS, which has a Cand an A in a comparable region of the SLS. The longerstem present in the TS bestows a higher –∆G and higherMIs to that SLS compared with the NTS, and predicts thatmutations are more likely in the TS. In both the TS and

Fig. 3. Predicted SLSs at peak –∆G values are shown for wild type and three mutant alleles of codon 49 of trpA. The predicted MI values for the mutations (arrows) are compared in the table inset with mutation frequencies determined experimentally in two data sets. See Table 1 and text for details .

Page 6: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

434 B. E. Wright et al.

© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441

the NTS, the MIs of the mutable bases are somewhathigher than the MIs of the wild-type bases (not shown). Inthe NTS, folding 40 nt gives similar MIs for the first twocodon positions but increases the MI in the third positionfrom 1.5 to 2.8. In all folds above 20 nt, the third codonposition has the highest MI value, consistent with thenumber of mutations determined by sequencing 23 rever-tants (Fig. 5A, table inset).

In Fig. 5B, a qualitative comparison of MIs with thenumber of mutation events is shown for the leuBmutant. In most cases, 30 nt SLSs have higher –∆G val-ues than 20 nt SLSs but, in this case, the stems are thesame, and the long sequences at the ends of the 30 ntSLSs may well be paired with their complements. Asshown in the table inset of Fig. 5B, the MIs of the muta-ble bases are comparable and consistent with the

Fig. 4. Predicted SLSs at peak negative energy values for wild type and two mutant alleles for codons 211 and 234 of trpA. Predicted MIs and mutation frequencies determined experimentally are compared in the table inset. See Table 1 and text for details.

Fig. 5. Predicted SLSs of argH (A) and leuB (B) genes. The predicted MI of each mutable base is indicated by an arrow. Inset table shows the MI of each mutable base compared with the number of mutations (No.) determined by sequencing 23 revertants in the case of argH and 36 in the case of leuB. See text for discussion.

Page 7: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

Predicting mutations in stem–loop structures 435

© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441

results of sequencing 36 revertants (Wright and Min-nick, 1997).

Table 1 summarizes the comparisons in the samecodon between calculated MIs and reversion frequenciesfor the seven trpA auxotrophs, and Table 2 summarizessequencing data for the argH, leuB and lacZ auxotrophs.Data are also available for a comparable analysis of 414background mutable bases in lacI (Schaaper and Dunn,1991; Fig. 6). In this sequence, all mutable bases withina codon in which at least one base had three or moremutational events were analysed, giving a total of 35. Thenumber of mutants shown above each base was counted,and the results are summarized in Table 3. Mutation fre-quencies (trpA) and number of mutations (lacI) are plottedagainst MIs to obtain linear regression analyses of thedata in Tables 1 and 3 (Fig. 7). The R2 value for the trpAalleles is 0.57 and for the lacI data set is 0.002. Althoughthe qualitative comparisons of MIs and mutation eventsare good for the mutable bases in Table 2, there are toofew data points for this type of analysis.

The triple mutant

The lacZ stop codon mutant and its triple revertant areshown in Fig. 8. As mentioned previously, a 30 nt win-dow was chosen for the analysis of SLSs. However,when the very long and stable stem was observed, itwas necessary to increase the window to 32 nt in orderto include the TAG codon (Fig. 8). In the course ofsequencing 47 of the revertants, we observed manytypes of single base revertants, a four codon deletionand, on nine occasions, a triple mutation was isolated(Table 2). As seen in Table 2, a number of amino acidsubstitutions can result in an active enzyme, althoughdifferent growth rates are observed. This triple mutationmust have arisen by a templating mechanism. The rever-sion frequency of this stop codon is ≈ 10−9, and three

simultaneous independent mutations would have theunlikely frequency of (10−9)3. Inspection of the sequencesat the end of the 32 nt SLS revealed that, were the struc-ture extended to 40 nt, a bubble would appear in whichtemplating from the complementary strand could changethe TAG stop codon to the triple mutant shown in Fig. 8.There is precedence for this mechanism in the literature(Ripley, 1982), and the existence of this templated triplemutation provides compelling evidence for the role ofSLSs in vivo.

Discussion

Evidence in the literature indicates that DNA sequencecan determine the location of a background mutation, butthe frequency with which it occurs depends primarily uponvarious kinds of DNA-destabilizing metabolic events.When growth is inhibited, DNA replication is minimizedand transcription of specific derepressed genes inresponse to an inhibitor or condition of stress may thenbe a major cause of mutations (Wright, 2000). In 11strains of leuB and argH auxotrophs, isogenic except forrelA, a correlation was found between mutation rates,ppGpp and mRNA levels (Wright, 1996; Wright and Min-nick, 1997; Longacre et al., 1999; Wright et al., 1999).Similar results have been obtained in B. subtilis, in whichreversion rates of three different amino acid auxotrophswere higher in strains able to produce the transcriptionalactivator guanosine tetraphosphate (Rudner et al., 1999).Current studies with malT and lacZ auxotrophs also indi-cate that those strains able to activate transcription (cya+,crp+) have higher reversion rates than the strains (cya andcrp mutants) unable to activate transcription (unpublisheddata). The presence of these mutations in unpaired bases

Table 1. Comparisons in the same codons between MIs and rever-sion frequencies in seven mutable alleles of the derepressed trpAgene.

Codon Mutant allele MI

Revertants/108 cells

Expt Aa Expt Bb

49 trpA 3 3.5 0.4 1.349 trpA 11 2.3 0.06 0.249 trpA 88 2.5 NDc 0.3

211 trpA 23 3.6 1.1 ND211 trpA 46 3.1 0.8 ND234 trpA 58 2.8 0.3 0.3234 trpA 78 3.4 ND 1.8

a. Taken from Table 3 in Bhamre et al. (2001). For allele 58, thefrequencies of A to G and to C were summed.b. Taken from Table 4 in Isbell and Fowler (1989). Only class I fullrevertant frequencies were used.c. No data available.

Table 2. Comparisons between MIs and relative numbers of muta-tions determined by sequencing in eight mutable alleles of threederepressed genes.

Mutant Codon Base change MI No. of mutations

argH 4 T to G, C, or A 0.8 84 G to T or C 0.8 2a

4 A to C, G, or T 1.5 13leuB 286 T to A or G 2.6 20

286 T to C 2.6 16286 G to A, C or T – Not observed

lacZ 63 T to C, G or A 13.9 7b

63 A to G or C 13.9 1963 G to C or T 13.9 1263 TAG to AGC Triple 9

a. This number of revertants is lower than expected, but mutation ofG to A results in a stop codon that would not be observed.b. Theoretically, mutations in the first position should occur as fre-quently as those in the second and third positions. This value may below due to (observed) lower growth rates of these revertants, just asthe A to G revertants (wild type) in the second position (wild type)may be high because these revertants have the highest growth rate.

Page 8: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

436 B. E. Wright et al.

© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441

of predicted SLSs implicates supercoiling and is consis-tent with the literature.

Background point mutations in unpaired bases occur byknown chemical mechanisms having finite, significantactivation energies under physiological conditions, andsuch lesions can subsequently be immortalized by repli-cation. The thermodynamic properties of hydrolytic reac-tions that occur in nucleic acids under these conditionsare such that the deamination of C is much greater thanthat of A, and the depurination of G or A is much greaterthan the depyrimidation of C or T (reviewed by Singer andSterns, 1982). Also, the oxidation potential of a G located5′ to another G is greater than it is next to a C or T(reviewed by Sugden and Stearns, 2000). Because of itssize, an A is more likely than a C or a T to replace a G.The lacI data set (Fig. 6) is sufficiently large to documentthese statements, i.e. 75% of the G mutations are to A,and 88% of the C mutations are to T. A recent analysis(Wright et al., 2002) of 14 000 mutations in the human p53gene shows a similar pattern, with values of 74% and 91%respectively.

A significant source of unpaired and mispaired basessubject to the kinds of mutations described above are theloops of SLSs that can often elude detection and repair(Moore et al., 1999). These secondary DNA structures arenaturally present because of the frequent occurrence ofinverted complements in DNA sequences (Lilley, 1980;Papyotatos and Wells, 1981). It should be pointed out that,in general, specific positions in these SLSs (e.g. a basenear a stem) are uniquely vulnerable to mutation, regard-less of which base is present at that position. Duringtranscript elongation and the probable continuous varia-tions in the length of the nucleotide segments folded invivo, the stems, which create the SLSs, are the invariantpart. Thus, the most reliable location for an unpaired basethat evolved to be mutable is near a stem. In prokaryotes,the most mutable bases are in codons that are within acodon's length of a stem (Figs 3–5 and 8). In the eukary-otic p53 tumour suppressor gene, the most mutable basesare located immediately next to stems in stable SLSs.The predicted MIs correlate well (R2 = 0.76) with thoseobserved for 14 000 human cancers, whereas no such

Fig. 6. The number of background mutations in lacI (Schaaper and Dunn, 1991).

Page 9: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

Predicting mutations in stem–loop structures 437

© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441

correlation (R2 = 0.0005) is seen for nearby control bases(Wright et al., 2002).

Do measured mutation rates really represent mutationrates in vivo? An appropriate gene to consider in discuss-ing the true frequency of background mutations (whetheror not they are silent) is lacI, in which 25% of the basesare known to be mutable (Fig. 6). However, 90% of these

mutations occur in the first two codon positions, aschanges in the third position are usually silent, i.e. do notalter the encoded amino acid. As mutations are alsoundoubtedly occurring in the third position, these can beadded to the 57 observed mutated bases. Thus, about40% of the bases in lacI are vulnerable to mutation. Thisdoes not include all the other silent mutations that donot inactivate this repressor. Thus, the majority of allbases appear to be mutable, but many of these are notdetectable.

Although SLSs are similar for mutable bases in theauxotrophs and in lacI, a number of differences have beennoted. In lacI, mutable bases occur more frequently at agreater distance from a stem, or in an unlooped, singlestrand 8–11 bases from a stem. The existence of such astructure may be unlikely, as that length of ssDNA wouldprobably be bonded to its complement as dsDNA. Thesame mechanisms that give rise to the lacI mutationsmust of course occur at some frequency in derepressedauxotroph genes.

The analysis of background mutable bases in lacI indi-cates that as many mutations occur in high as in low –∆GSLSs. In this system, all mutations have equal probabili-ties of being detected and sequenced. However, thisis not the case with revertants of auxotrophs selectedin derepressed genes during starvation. Transcriptionenhances localized supercoiling and increases the con-centration of SLSs that harbour unpaired bases suscepti-ble to mutation. Therefore, these vulnerable bases locatedin high –∆G SLSs are selected during starvation. Suchbases will mutate the most frequently, enrich the mutantpopulation with the most progeny and have the highestprobability of being isolated.

Many kinds of environmental challenge can target spe-cific genes for supercoiling and enhanced mutagenesis.Data summarized in this article indicate that nutritionalstress can localize enhanced mutability in SLSs resultingfrom selective transcription. These circumstances can

Table 3. Comparisons in the same codon between MIs and the num-ber of background mutations in 35 mutable bases of lacI.

Allele MI No. of mutations

41 3.6 242 0.4 356 0.6 657 0.6 1080 3.9 381 3.2 582 3.2 183 0.8 1784 0.8 586 1.4 787 4.3 389 4.7 190 4.7 492 4.7 793 3.4 1695 2.7 596 0.3 10

104 3.1 29105 3.4 5116 1.5 4117 0.1 1140 4.2 1141 3.5 12149 0.6 3150 2.2 2167 3.5 15168 3.5 6169 3.3 1177 3.8 6178 1.3 1185 2.0 9186 3.5 15188 2.9 6189 2.8 3190 3.1 3

Fig. 7. Linear regression analyses for the data in Tables 1 and 3. Mutation frequencies or number of mutations are plotted against MIs for seven mutable alleles in the derepressed trpA gene (Isbell and Fowler, 1989; Bhamre et al., 2001), and for 35 background mutable alleles in lacI (Schaaper and Dunn, 1991).

Page 10: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

438 B. E. Wright et al.

© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441

result in ‘forward’ mutations beneficial to evolution. Forexample, during starvation for ribitol in the presence ofxylitol, two existing genes are modified to regulate thetransport and use of xylose as a new carbon source(Lerner et al., 1964; Wright, 2000).

Other stressors, such as extreme osmotic conditions,heat shock, host–pathogen interactions or metal toxicity,could direct mutations to related genes potentially ableto alleviate these problems. (Borowiec and Gralla, 1987;Higgins et al., 1988; Pruss and Drlica, 1989; Ansariet al., 1992; Dorman, 1995; Jordi et al., 1995; Lefstin andYamamoto, 1998; Massey et al., 1999).

Specific areas of the genome are targeted for localizedtranscription and supercoiling, increasing the concentra-tion of SLSs and unpaired bases vulnerable to mutation.This mechanism provides a continuous feedback loopbetween the ever-changing environment, productive tar-gets for localized enhanced mutation frequencies andselection of the fittest. As most mutations are detrimental,directing mutations to those genes regulating the cell'sresponse to each type of stress minimizes genome-widegenetic damage while creating the most appropriate vari-ants for accelerating evolution.

Just as DNA secondary structures may be consideredan additional dimension of gene expression (Wells et al.,1980), so may the mutations that occur because of theirlocation in these structures. These structures are presentmany thousands of times more frequently than expectedby chance (Lilley, 1980) and, in viral DNAs, they are over-represented in regions that appear to have regulatorysignificance (Müller and Fitch, 1982). In higher organ-isms, these hypermutable sequences have apparentlybecome localized to increase variants in genes of criticalimportance to the survival of the organism (Forsdyke,1995). However, the same mechanisms by which stresscan cause mutations and accelerate evolution in microor-ganisms may cause cancer in multicellular organisms(Rady et al., 1992; Ionov et al., 1993; Skandalis et al.,

1994; Panagopoulos et al., 1997; Colman et al., 2000),where survival of the fittest mutant can result in abnormalgrowth.

There is reason to believe that the evolution of the DNAworld was paralleled by the development of systems suchas supercoiling to help maintain the integrity, geometryand functionality of DNA (López-García, 1999). Theessential role of negative supercoiling in the regulation ofall aspects of DNA behaviour, especially in response toenvironmental stress, is becoming increasingly apparent.If specific conditions of stress target particular areas ofthe genome for negative supercoiling, the formation ofsecondary structures will be facilitated; if stress first acti-vates transcription, this will in turn drive supercoiling. Byeither scenario, the resulting increase in concentration ofSLSs will result in localized mutations, and a numberof circumstances can apparently converge to makeresponses to particular adverse conditions quite specific.

In bases within the same codon, differential effects ofother variables affecting mutability should be minimal.These contiguous bases obviously have a very similarmicroenvironment with respect to supercoiling domains(Sinden and Pettijohn, 1981; Krasilnikov et al., 1999),ssDNA binding proteins, accessory proteins involved intranscription, repair systems and so on. Therefore, com-paring their relative mutation frequencies or events is idealfor evaluating the relevance of the factors used in calcu-lating MI, and for testing the validity of our proposed modelfor predicting mutability. In fact, all our results are consis-tent with the location of mutable bases in SLSs. Based onthis assumption, MIs for mutable bases in selected geneshave, with unanticipated success, predicted 15 relativemutation frequencies (trpA) or events (argH, leuB andlacZ). This correlation occurred in spite of the fact that anMI depends upon only two variables: (i) the highest –∆Gof all the SLSs in which it is unpaired; and (ii) the extentto which the base is unpaired in all its foldings duringtranscription.

Fig. 8. The 32 nt and 40 nt SLSs of lacZ. When 40 nt are folded, it can be seen that the triple mutation, AGC, could arise by templating to the complementary TCG sequence in the opposite strand.

Page 11: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

Predicting mutations in stem–loop structures 439

© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441

Experimental procedures

The analysis of DNA secondary structures

A new software tool called ‘mfg’ was developed for use in thispaper for predicting mutation frequencies. Given an inputsequence, the set of all subsequences containing that baseis generated. From this set, mfg calculates the single-stranded percentage of the base and the energy of the moststable structure containing the base. These are multiplied toproduce the MI value for that base. A full description ofthis program and instructions for its use can be found athttp://biology.dbs.umt.edu/wright/upload/mfg.html. The pro-gram currently runs on the Windows platform only, but sourcecode is available for porting to other platforms.

Mutation rate conditions used to obtain revertants for sequencing

Mutation rates for argH and leuB were determined asdescribed previously (Wright and Minnick, 1997). The rever-sion rate of codon 63 of lacZ was determined as follows: E.coli CSH2 cells from a 12- to 24-h-old rich media plate werediluted in 75 ml of minimal medium, consisting of 50 mMsodium phosphate buffer, pH 6.5, 1.0 g l−1 (NH4)2SO4, 1.0 gl−1 MgSO4, 0.3 mM glucose and 0.3 mM tryptophan to a celldensity of 5 × 10−4 ml−1. This inoculum was divided into 45 2-cm-diameter test tubes at 1.5 ml per tube and grown for 18–28 h with shaking, at a 45° angle at 37°C. Cell growth wasfollowed by hourly sampling companion tubes at a 1:10 dilu-tion in buffered saline. When the cell concentration reachedthe end of log growth, serial dilutions were made from twocompanion cultures, and 10−6 and 10−7 dilutions were platedon rich nutrient agar plates and incubated for 24 h at 37°Cto determine viable counts. The contents of 40 culture tubeswere distributed on selective [5 g l−1 lactose, salts (as above),0.3 mM tryptophan, 0.1 mM NaCl] agar plates. Plates wereincubated for 40–48 h at 37°C and evaluated for revertants(Luria and Delbrück, 1943). Although mutation rates per seare not reported in this paper, the above procedures wereused to obtain the revertants sequenced (Tables 2 and 3).

Sequencing of revertants

Revertants were isolated from mutation rate plates containinga single colony at 72 h, restruck on selective plates andgrown for 24 h at 37°C. An inoculum was transferred to 2 mlof Luria–Bertani (LB) broth and grown for 8 h with shaking at37°C. The culture was transferred to a microcentrifuge tubeand pelleted at 10 000 g for 10 min. A Qiagen DNeasy tissuekit was used to extract chromosomal DNA. The lacZ DNAwas amplified by polymerase chain reaction (PCR) in a50 µl volume using 500 ng of template and the primer pair:LacZ5′ (5′-ATGACCATGATTACGGATTC-3′) and LacZ3′ (5′-TTATTTTTGACACCAGACCAAC-3′) used at a concentrationof 50 pmol per reaction. Other components of the PCR, allfrom New England BioLabs, were 10× polymerase buffer,VentR DNA polymerase, MgSO4 and dNTPs. After 30 cycles,2 µl of the product was visualized on a 1% agarose gelcontaining 0.5 µg ml−1 of ethidium bromide, and the remain-der was cleaned with a Qiaquick PCR purification kit

(Qiagen). Sequencing was performed at the Murdock Molec-ular Biology Facility using a BigDye Terminator ready reactionkit with an Applied Biosystems Model 373A stretch DNAsequencer.

Acknowledgements

We are indebted to Drs Robert G. Fowler, Bryn Bridges, RoelM. Schaaper and Scott Samuels for valuable criticisms of andsuggestions for the manuscript. We thank Dr George Cardfor the CSH2 E. coli strain. This work was supported byNational Institutes of Health grants R15CA88893 andR55CA99242 and the Stella Duncan Memorial ResearchInstitute.

References

Ansari, A.Z., Chael, M.L., and O'Halloran, T.V. (1992) Allos-teric underwinding of DNA is a critical step in positivecontrol of transcription by Hg-MerR. Nature 355: 87–89.

Balke, V.L., and Gralla, J.D. (1987) Changes in the linkingnumber of supercoiled DNA accompany growth transitionsin Escherichia coli. J Bacteriol 169: 4499–4506.

Beletskii, A., and Bhagwat, A.S. (1996) Transcription-inducedmutations: increase in C-to-T mutations in the non-transcribed strand during transcription in Escherichia coli.Proc Natl Acad Sci USA 93: 13919–13924.

Benham, C.J. (1992) Energetics of the strand separationtransition in superhelical DNA. J Mol Biol 225: 835–847.

Bhamre, S., Bedrick, B.G., Koyama, C.A., White, S.J., andFowler, R.G. (2001) An aerobic recA-, umuC-dependentpathway of spontaneous base-pair substitution mutagene-sis in Escherichia coli. Mutat Res 473: 229–247.

Borowiec, J.A., and Gralla, J.D. (1987) All three elements ofthe lac ps promoter mediate its transcriptional response toDNA supercoiling. J Mol Biol 195: 89–97.

Bridges, B.A. (2001) Hypermutation in bacteria and othercellular systems. Phil Trans R Soc London B 356: 29–39.

Brock, R.D. (1971) Differential mutation of the β-galactosi-dase gene of Escherichia coli. Mutat Res 11: 181–186.

Colman, M.S., Afshari, C.A., and Barrett, J.C. (2000) Regu-lation of p53 stability and activity in response to genotoxicstress. Mutat Res 462: 179–188.

Coulondre, C., Miller, J.H., Farabaugh, P.J., and Gilbert, W.(1978) Molecular basis of base substitution hotspots inEscherichia coli. Nature 274: 775–780.

Datta, A., and Jinks-Robertson, S. (1995) Association ofincreased spontaneous mutation rates with high levels oftranscription in yeast. Science 268: 1616–1619.

Dayn, A., Malkhosyan, S., Duzhy, D., Lyamichev, V.,Panchenko, Y., and Mirkin, S. (1991) Formation of (dA-dT)n

cruciforms in Escherichia coli cells under different environ-mental conditions. J Bacteriol 173: 2658–2664.

Dayn, A., Malkhosyan, S., and Mirkin, S.M. (1992) Transcrip-tionally driven cruciform formation in vivo. Nucleic AcidsRes 20: 5991–5997.

Dorman, C.J. (1995) DNA topology and the global control ofbacterial gene expression: implications for the regulationof virulence gene expression. Microbiology 141: 1271–1280.

Fix, D.F., and Glickman, B.W. (1987) Asymmetric cytosine

Page 12: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

440 B. E. Wright et al.

© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441

deamination revealed by spontaneous mutational specific-ity in an UngG– strain of Escherichia coli. Mol Gen Genet209: 78–82.

Forsdyke, D.R. (1995) Conservation of stem-loop potential inintrons of snake venom phospholipase A2 genes: an appli-cation of FORS-D analysis. Mol Evol Biol 12: 1157–1165.

Foster, P.L. (1999) Mechanisms of stationary phase muta-tion: a decade of adaptive mutation. Annu Rev Genet 33:57–88.

Frederico, L.A., Kunkel, T.A., and Shaw, B.R. (1990) A sen-sitive genetic assay for the detection of cytosine deamina-tion: determination of rate constants and the activationenergy. Biochemistry 29: 2532–2537.

Frederico, L.A., Kunkel, T.A., and Shaw, B.R. (1993)Cytosine deamination in mismatched base pairs. Biochem-istry 32: 6523–6530.

Hall, B.G. (1999) Spectra of spontaneous growth-dependentand adaptive mutations at ebgR. J Bacteriol 181: 1149–1155.

Herman, R.K., and Dworkin, N.B. (1971) Effect of geneinduction on the rate of mutagenesis by ICR-191 in Escher-ichia coli. J Bacteriol 106: 543–550.

Higgins, C.F., Dorman, C.J., Stirling, D.A., Waddell, L.,Booth, I.R., May, G., and Bremer, E. (1988) A physiologicalrole for DNA supercoiling in the osmotic regulation of geneexpression in S. typhimurium and E. coli. Cell 52: 569–584.

Ionov, Y., Peinado, M.A., Malkhosyan, S., Shibata, D., andPerucho, M. (1993) Ubiquitous somatic mutations in simplerepeated sequences reveal a new mechanism for coloniccarcinogenesis. Nature 363: 558–561.

Isbell, R.J., and Fowler, R.G. (1989) Temperature-dependentmutational specificity of an Escherichia coli mutator,dnaQ49, defective in 3′→5′ exonuclease (proofreading)activity. Mutat Res 213: 149–156.

Jordi, B.J.A.M., Owen-Hughes, T.A., Hulton, C.S.J., andHiggins, C.F. (1995) DNA twist, flexibility and transcriptionof the osmoregulated proU promoter of Salmonella typh-imurium. EMBO J 14: 5690–5700.

Kingston, R.E., Nierman, W.C., and Chamberlin, M.J. (1981)A direct effect of guanosine tetraphosphate on pausing ofEscherichia coli RNA polymerase during RNA chain elon-gation. J Biol Chem 256: 2787–2797.

Krasilnikov, A.S., Podtelezhnikov, A., Vologodskii, A., andMirkin, S.M. (1999) Large-scale effects of transcriptionalDNA supercoiling in vivo. J Mol Biol 292: 1149–1160.

Lefstin, J.A., and Yamamoto, K.R. (1998) Allosteric effects ofDNA on transcriptional regulators. Nature 392: 885–888.

Lerner, S.A., Wu, T.T., and Lin, E.C.C. (1964) Evolution of acatabolic pathway in bacteria. Science 146: 1313–1315.

Lilley, D.M.J. (1980) The inverted repeat as a recognizablestructural feature in supercoiled molecules. Proc Natl AcadSci USA 77: 6468–6472.

Lindahl, T., and Nyberg, B. (1972) Rate of depurination ofnative deoxyribonucleic acid. Biochemistry 11: 3610–3618.

Lindahl, T., and Nyberg, B. (1974) Heat-induced deaminationof cytosine residues in deoxyribonucleic acid. Biochemistry13: 3405–3410.

Lipschutz, R., Falk, R., and Avigad, G. (1965) Mutation ratesof the induced and repressed locus of alkaline phos-phatase in Escherichia coli. Isr J Med Sci 1: 23–24.

Liu, L.F., and Wang, J.C. (1987) Supercoiling and the DNA

template during transcription. Proc Natl Acad Sci USA 84:7024–7027.

Longacre, A., Reimers, J.M., and Wright, B.E. (1999) Spec-ificity of transcription-enhanced mutations. Ann NY AcadSci 870: 383–385.

López-García, P. (1999) DNA supercoiling and temperatureadaptation: a clue to early diversification of life? J Mol Evol49: 439–452.

Luria, S., and Delbrück, M. (1943) Mutations of bacteria fromvirus sensitivity to virus resistance. Genetics 28: 491–504.

Massey, R.C., Rainey, P.B., Sheehan, B.J., Keane, O.M.,and Dorman, C.J. (1999) Environmentally constrainedmutation and adaptive evolution in Salmonella. Curr Biol 9:1477–1480.

Moore, H., Greenwell, P.W., Liu, C.-P., Arnheim, N., andPetes, T.D. (1999) Triplet repeats form secondary struc-tures that escape DNA repair in yeast. Proc Natl Acad SciUSA 96: 1504–1509.

Müller, U.R., and Fitch, W.M. (1982) Evolutionary selectionfor perfect hairpin structures in viral DNAs. Nature 928:582–585.

Panagopoulos, I., Lasen, C., Isaksson, M., Mitelman, F.,Mandahl, N., and Aman, P. (1997) Characteristicsequence motifs at the breakpoints of the hybrid genesFUS/CHOP, EWS/CHOP and FUS/ERG in myxoid liposa-rcoma and acute myeloid leukemia. Oncogene 15: 1357–1362.

Pananicolaou, C., and Ripley, L.S. (1991) An in vitro appro-ach to identifying specificity determinants of mutagenesismediated by DNA misalignments. J Mol Biol 221: 805–821.

Papyotatos, N., and Wells, R.D. (1981) Cruciform structuresin superhelical DNA. Nature 289: 466–470.

Pruss, G.J., and Drlica, K. (1989) DNA supercoiling andprokaryotic transcription. Cell 56: 521–523.

Rady, P., Scinicariello, F., Wagner, F., and King, S. (1992)p53 mutations in basal cell carcinoma. Cancer Res 52:3804–3806.

Rahmouni, A.R., and Wells, R.D. (1992) Direct evidence forthe effect of transcription on local DNA supercoiling in vivo.J Mol Biol 223: 131–144.

Ripley, L.S. (1982) Model for the participation of quasi-palindromic DNA sequences in frameshift mutation. ProcNatl Acad Sci USA 79: 4128–4132.

Rosenberg, S.M. (2001) Evolving responsively: adaptivemutation. Nature Rev Genet 2: 504–515.

Rudner, R., Murray, A., and Huda, N. (1999) Is there a linkbetween mutation rates and the stringent response inBacillus subtilis? Ann NY Acad Sci 870: 418–422.

SantaLucia, J., Jr (1998) A unified view of polymer, dumbbell,and oligonucleotide DNA nearest-neighbor thermodynam-ics. Proc Natl Acad Sci USA 95: 1460–1465.

Schaaper, R.M., and Dunn, R.L. (1991) Spontaneous muta-tion in the Escherichia coli lacI gene. Genetics 129: 317–326.

Schaaper, R.M., Danforth, B.N., and Glickman, B.L. (1986)Mechanisms of spontaneous mutagenesis: an analysis ofthe spectrum of spontaneous mutation in the Escherichiacoli lacI gene. J Mol Biol 189: 273–284.

Sinden, R.R., and Pettijohn, D.E. (1981) Chromosomes inliving Escherichia coli are segregated into domains ofsupercoiling. Proc Natl Acad Sci USA 78: 224–228.

Page 13: Predicting mutation frequencies in stem–loop structures of ...hs.umt.edu/dbs/labs/wright/documents/j.1365-2958... · argH, including codon 4; malT, including codon 545; and trpA,

Predicting mutations in stem–loop structures 441

© 2003 Blackwell Publishing Ltd, Molecular Microbiology, 48, 429–441

Singer, K.D., and Sterns, D.M. (1982) Chemical Mutagene-sis. Annu Rev Biochem 52: 655–693.

Skandalis, A., Ford, B.N., and Glickman, B.W. (1994) Strandbias in mutation involving 5-methylcytosine deamination inthe human hprt gene. Mutat Res 314: 21–26.

Sugden, K.D., and Stearns, D.M. (2000) The role of chro-mium (V) in the mechanism of chromate-induced oxidativeDNA damage and cancer. J Environ Pathol Toxicol Oncol19: 215–230.

Todd, P.A., and Glickman, B.W. (1982) Mutational specificityof UV light in Escherichia coli: indications for a role ofsecondary structure. Proc Natl Acad Sci USA 79: 4123–4127.

Wells, R.D., Goodman, T.C., Hillen, W., Horn, G.T., Klein,R.D., Larson, J.E., et al. (1980) DNA structure and generegulation. Prog Nucleic Acid Res Mol Biol 24: 167–267.

Wright, B.E. (1996) The effect of the stringent response onmutations in Escherichia coli K12. Mol Microbiol 19: 213–219.

Wright, B.E. (1997) Does selective gene activation directevolution? FEBS Lett 402: 4–8.

Wright, B.E. (2000) A biochemical mechanism for nonrandommutations and evolution. J Bacteriol 182: 2993–3001.

Wright, B.E., and Minnick, M.F. (1997) Reversion rates in aleuB auxotroph of Escherichia coli K12 correlate withppGpp levels during exponential growth. Microbiology 143:847–854.

Wright, B.E., Longacre, A., and Reimers, J.M. (1999) Hyper-mutation in derepressed operons of Escherichia coli K12.Proc Natl Acad Sci USA 96: 5089–5094.

Wright, B.E., Reimers, J.M., Schmidt, K.H., and Reschke,D.K. (2002) Hypermutable bases in the p53 cancer geneare at vulnerable positions in DNA secondary structures.Cancer Res 62: 5641–5644.

Yanofsky, C.T., Platt, T., Crawford, I.P., Nichols, B.P.,Christie, G.E., Horowitz, H., et al. (1981) The completenucleotide sequence of the tryptophan operon of Escheri-chia coli. Nucleic Acids Res 9: 6647–6668.

Zuker, M., Mathews, D.H., and Turner, D.H. (1999) A Practi-cal Guide. In RNA Biochemistry and Biotechnology.Barciszewski, J., and Clark, B.F.C. (eds). Dordrecht:Kluwer Academic Publishers, pp. 11–43.