transcription initiation from dihydrofolate reductase
MOLECULAR AND CELLULAR BIOLOGY, Feb. 1990, p. 653-661, Vol. 10, No. 2
0270-7306/90/020653-09$02.00/0
Copyright © 1990, American Society for Microbiology
Transcription Initiation from the Dihydrofolate Reductase Promoter Is Positioned by HIP1 Binding at the Initiation Site
ANNA L. MEANS AND PEGGY J. FARNHAM* McArdle Laboratory for Cancer Research, University of Wisconsin, 1400 University Avenue, Madison, Wisconsin 53706
Received 30 August 1989/Accepted 31 October 1989
We have identified a sequence element that specifies the position of transcription initiation for the dihydrofolate reductase gene. Unlike the functionally analogous TATA box that directs RNA polymerase II to initiate transcription 30 nucleotides downstream, the positioning element of the dihydrofolate reductase promoter is located directly at the site of transcription initiation. By using DNase I footprint analysis, we have shown that a protein binds to this initiator element. Transcription initiated at the dihydrofolate reductase initiator element when 28 nucleotides were inserted between it and all other upstream sequences, or when it was placed on either side of the DNA helix, suggesting that there is no strict spatial requirement between the initiator and an upstream element. Although neither a single Sp1-binding site nor a single initiator element was sufficient for transcriptional activity, the combination of one Sp1-binding site and the dihydrofolate reductase initiator element cloned into a plasmid vector resulted in transcription starting at the initiator element. We have also shown that the simian virus 40 late major initiation site has striking sequence homology to the dihydrofolate reductase initiation site and that the same, or a similar, protein binds to both sites. Examination of the sequences at other RNA polymerase II initiation sites suggests that we have identified an element that is important in the transcription of other housekeeping genes. We have thus named the protein that binds to the initiator element HIP1 (Housekeeping Initiator Protein 1).
Interactions between transcription factors and specific DNA sequences within an RNA polymerase II promoter can be grouped into two categories, depending upon how they influence transcription. One class of factors, usually binding at least 50 base pairs (bp) upstream of the initiation site, regulates the efficiency of transcription, presumably by altering the rate or conformation of polymerase attachment. Examples of this class of factors are Sp1 (6, 26), Ap1 (1, 28), and Ap2 (24, 35). Deletion of binding sites for these factors results in a gradual reduction in transcriptional activity until all such sites are removed. The second class of transcription factors specifies the site of initiation. The only previously characterized transcription factor known to influence the site of initiation is TFIID, a protein that binds to an A+T-rich sequence called a TATA box and directs RNA polymerase II to start transcription approximately 30 bp downstream (5, 39). Deletion of the TATA box can result in spurious initiations and a low level of transcription (5). However, not every gene contains a TFIID consensus sequence at the correct distance upstream of the transcription initiation site. In particular, many cellular genes that are expressed at low levels and encode proteins found in all cell types (so-called housekeeping genes) do not have a TFIID consensus sequence. One example is the dihydrofolate reductase (DHFR) gene.
The DHFR gene is expressed throughout the cell cycle in proliferating cells, but its transcription rate increases sevenfold at the G1-S phase boundary (17). We wish to understand the mechanism of this regulation and to characterize the factors required for the transcription of DHFR and other housekeeping genes. Toward this goal, we have identified DHFR promoter deletions that define the 5' and 3' boundaries of the region that is absolutely required for DHFR transcription. We refer to the region defined by these deletions, containing nucleotides -65 to +15, as the DHFR minimal promoter. We have examined the DNA-protein interactions in this minimal promoter; we found that a protein binds to the transcription initiation site of the DHFR gene and specifies the site of initiation, and that binding of this factor and Sp1 are sufficient for accurate transcription initiation.
* Corresponding author.
MATERIALS AND METHODS
Cells and extract. HeLa cells were grown in alpha minimal essential medium plus 5% supplemented calf serum (Hyclone) to a density of 2 x 10^5 to 5 x 10^5 cells per ml. Nuclear extracts (14) were made from approximately 10^9 cells either on the day of cell harvest or from frozen cells (3).
Construction of plasmids. Plasmids containing sequences from the murine DHFR gene are pST410, pBSpro18, pDFX120, pBSpro19, pDMM285, and pSR320 containing DHFR nucleotides -356 to +61, -50 to +52, -65 to +52, -87 to +52, -270 to +15, and -356 to -30, respectively. pST410 was created by insertion of a SmaI-TaqI fragment (nucleotides -356 to +61) into the SmaI and AccI sites of pUC9. pBSpro18 and pBSpro19 were created by insertion of EcoRI-XbaI fragments from pdpro18 (31) and pdpro19 (31) into the EcoRI and XbaI sites of pBSM13+ (Stratagene Inc.). pDMM285 was created by insertion of a MaeI fragment (nucleotides -270 to +15) into the SmaI site of pUC9. pSR320 was created by insertion of a PvuII-RsaI fragment (which includes DHFR sequences from -356 to -30, as well as vector DNA) from pSS625 (17) into the SmaI site of pUC9. pDFX120 was created by insertion of a FokI-XbaI fragment from pBSpro19 into pBSM13-. Note that, unlike previous papers in which the translation start codon was numbered +1, numbering is relative to the transcription start site at +1 (the position of the translation start codon is now +56). pSVS contains the entire simian virus 40 (SV40) genome (19). pGemHindIII-C consists of the HindIII C fragment of SV40 inserted into the HindIII site of pGemZF and was a gift from the laboratory of J. E. Mertz.
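The change of numbering convention noted above can be illustrated with a short sketch. It relies only on the stated fact that the translation start codon, formerly +1, is now +56. It also assumes the common convention that there is no position 0 (position -1 is followed directly by +1), which the text does not state explicitly. The function and constant names are illustrative.

```python
# Stated in the text: the old +1 (translation start) is +56 in the new numbering.
OLD_PLUS1_IN_NEW_NUMBERING = 56


def _to_linear(pos: int) -> int:
    """Map a signed position without a zero (..., -2, -1, +1, +2, ...) to a gap-free index."""
    if pos == 0:
        raise ValueError("position 0 does not exist in this convention (assumed)")
    return pos - 1 if pos > 0 else pos


def _from_linear(idx: int) -> int:
    """Inverse of _to_linear."""
    return idx + 1 if idx >= 0 else idx


def translation_to_transcription_numbering(old_pos: int) -> int:
    """Convert a position numbered from the translation start to transcription-start numbering."""
    shift = _to_linear(OLD_PLUS1_IN_NEW_NUMBERING) - _to_linear(1)
    return _from_linear(_to_linear(old_pos) + shift)


if __name__ == "__main__":
    for old in (1, -1, -55, -56):
        print(old, "->", translation_to_transcription_numbering(old))
```

Under these assumptions the transcription start site (+1 here) corresponds to -55 in the older, translation-based numbering.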
654 MEANS AND FARNHAM
pSVSN contains the SphI-NaeI fragment from pSVS (nucleotides 200 to 345) cloned into pBSM13- at the SphI and SmaI sites. pST410mp19 contains the EcoRI-HindIII fragment of pST410 cloned into the corresponding sites of M13mp19. pSTUmp19 was derived from pST410mp19 by site-directed mutagenesis with the Bio-Rad Muta-Gene in vitro mutagenesis kit and contains a single-base-pair substitution at nucleotide -17 which changes the G of the coding strand to a C, thereby creating a StuI restriction site. pSTUmp19 was cut with StuI, and 10- or 14-bp linkers were inserted. pSTU+10mp19 contains one copy of the XhoI linker 5'-CCCTCGAGGG-3'. pSTU+14mp19 contains one XbaI linker 5'-CTAGTCTAGACTAG-3'; pSTU+28mp19 contains two XbaI linkers. pGC was constructed by inserting the Sp1-binding-site oligonucleotide
5'-GATCGGGGCGGGGC-3'
    3'-CCCCGCCCCGCTAG-5'
into the BamHI site of pUC19. pGCDI was constructed from pGC by inserting the DHFR initiation site oligonucleotide
5'-AATTCATTTCGCGCCAAACTTGACG-3'
    3'-GTAAAGCGCGGTTTGAACTGCTTAA-5'
into the EcoRI site, digesting with XmaI, removing the 5' overhang with mung bean nuclease, and inserting the 14-bp XbaI linker shown above. Thus, pGCDI contains an Sp1-binding site and the DHFR initiation site separated by 37 bp of polylinker sequence (see Fig. 3C).
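As a consistency check on the two double-stranded oligonucleotides above, the following sketch verifies that the strands are complementary and reports the single-stranded 5' overhangs. It assumes that each lower strand is written 3' to 5' and begins pairing four nucleotides into the upper strand, and that the lower strand of the DHFR oligonucleotide reads 3'-GTAAAGCGCGGTTTGAACTGCTTAA-5'. The function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def check_duplex(top_5to3: str, bottom_3to5: str, overhang: int = 4):
    """Return (paired_ok, top_overhang, bottom_overhang), overhangs read 5'->3'.

    The bottom strand is written 3'->5' and is assumed to start pairing
    `overhang` nucleotides into the top strand.
    """
    paired_top = top_5to3[overhang:]
    paired_bottom = bottom_3to5[: len(paired_top)]
    paired_ok = paired_top.translate(COMPLEMENT) == paired_bottom
    top_overhang = top_5to3[:overhang]
    # The unpaired tail of the bottom strand, reversed so it reads 5'->3'.
    bottom_overhang = bottom_3to5[len(paired_top):][::-1]
    return paired_ok, top_overhang, bottom_overhang


sp1_site = check_duplex("GATCGGGGCGGGGC", "CCCCGCCCCGCTAG")
dhfr_site = check_duplex("AATTCATTTCGCGCCAAACTTGACG", "GTAAAGCGCGGTTTGAACTGCTTAA")

if __name__ == "__main__":
    print("Sp1 site oligonucleotide:", sp1_site)
    print("DHFR initiation site oligonucleotide:", dhfr_site)
```

Both duplexes pair fully in this check; the first carries 5'-GATC overhangs (compatible with a BamHI site) and the second 5'-AATT overhangs (compatible with an EcoRI site), in agreement with the cloning sites named in the text.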
In vitro transcriptions. Templates for in vitro transcriptions were prepared as follows: pST410 (-356 to +61), pBSpro19 (-87 to +52), pBSpro18 (-50 to +52), pDFX120 (-65 to +52), pDMM285 (-270 to +15), pSTUmp19, pSTU+10mp19, pSTU+14mp19, and pSTU+28mp19 were cleaved with PvuII; pST410 (-258 to +61) was cleaved with NotI and PvuII; pSR320 (-356 to -30) was cleaved with HaeII; pSVS was cleaved with SphI and NdeI; pGemHindIII-C was cleaved with HindIII; pGC and pGCDI were cleaved with HindIII and NdeI. The promoter-containing fragments were all isolated by polyacrylamide gel electrophoresis followed by electroelution.
In vitro transcription reactions (final volume, 25 µl) were performed as described previously (18), with modifications for primer extension analysis or oligonucleotide competition. For analysis by primer extension, 5 nM DNA was incubated for 15 min at 24°C with 2.4 µg of nuclear extract per µl in 6 mM MgCl2-24 mM Tris hydrochloride (pH 7.4)-12% (vol/vol) glycerol-60 mM KCl-0.12 mM EDTA-0.3 mM dithiothreitol-0.12 mM phenylmethylsulfonyl fluoride. Nucleoside triphosphates were then added to final concentrations of 600 µM GTP, CTP, and UTP and 200 µM ATP. After an additional 15 min at 24°C, the reactions were stopped, and the products were extracted and precipitated (18). The precipitates were suspended in 10 µl containing 100 fmol of 32P-end-labeled primer (29), 0.5 M NaCl, 10 mM Tris (pH 7.5), and 5 mM EDTA. This mixture was heated at 85°C for 5 min and then incubated at 60°C for 60 min. Then 40 µl containing 10 U of avian myeloblastosis virus reverse transcriptase (Life Sciences, Inc.), 20 U of RNasin (Promega Biotech), 10 mM MgCl2, 12.5 mM dithiothreitol, 1.25 mM each deoxynucleoside triphosphate, and 12.5 mM Tris (pH 8.5) was added, and incubation was continued for 45 min at 42°C. The reactions were stopped by addition of 50 µl of 1% sodium dodecyl sulfate (wt/vol) and 20 mM NaCl, precipitated with ethanol, and loaded onto an 8 M urea-8% polyacrylamide gel. The primer used for Fig. 3A anneals to pUC19 nucleotides 455 to 479, and the primer used for Fig. 3B anneals to pUC19 nucleotides 358 to 375.
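The quantities in the protocol above can be turned into absolute amounts with simple arithmetic. The sketch below is illustrative only. It assumes that the template and extract concentrations apply to the 25-µl final transcription volume, and that the concentrations quoted for the 10-µl annealing mix and the 40-µl reverse transcriptase mix describe those mixes before they are combined; neither assumption is stated explicitly in the text. All names are hypothetical.

```python
def amount_in_reaction(concentration: float, volume_ul: float) -> float:
    """Concentration x volume.

    nM x µl gives fmol; µM x µl gives pmol; µg/µl x µl gives µg.
    """
    return concentration * volume_ul


def diluted(stock_conc: float, added_ul: float, total_ul: float) -> float:
    """Concentration of a component after `added_ul` of stock ends up in `total_ul`."""
    return stock_conc * added_ul / total_ul


TRANSCRIPTION_UL = 25.0  # final transcription volume given in the text
template_fmol = amount_in_reaction(5.0, TRANSCRIPTION_UL)   # 5 nM template DNA
extract_ug = amount_in_reaction(2.4, TRANSCRIPTION_UL)      # 2.4 µg extract per µl
gtp_pmol = amount_in_reaction(600.0, TRANSCRIPTION_UL)      # 600 µM GTP

EXTENSION_UL = 10.0 + 40.0  # annealing mix plus reverse transcriptase mix
dntp_mM = diluted(1.25, 40.0, EXTENSION_UL)   # each deoxynucleoside triphosphate
mgcl2_mM = diluted(10.0, 40.0, EXTENSION_UL)
nacl_M = diluted(0.5, 10.0, EXTENSION_UL)     # NaCl carried over from annealing

if __name__ == "__main__":
    print(f"template: {template_fmol:.0f} fmol, extract: {extract_ug:.0f} µg, GTP: {gtp_pmol:.0f} pmol")
    print(f"extension: {dntp_mM:.2f} mM each dNTP, {mgcl2_mM:.1f} mM MgCl2, {nacl_M:.2f} M NaCl")
```

Under these assumptions a reaction contains about 125 fmol of template and 60 µg of nuclear extract, and the primer extension step runs at roughly 1 mM each deoxynucleoside triphosphate.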
If oligonucleotide competition was performed, concatemerized oligonucleotides were added 5 min prior to addition of the promoter-bearing fragment. The DHFR initiation site oligonucleotides used for competition are concatemers of the sequence
5'-AATTCTGCGATTTCGCGCCAAACTTGACG-3'
    3'-GACGCTAAAGCGCGGTTTGAACTGCTTAA-5'.
DNase I protection assays. DHFR coding strands from pBSpro18 and pBSpro19 were phosphorylated at the EcoRI site with T4 polynucleotide kinase and [γ-32P]ATP. A subsequent digestion with SphI yielded 171- and 210-bp fragments, respectively. The noncoding strand of pBSpro18 was phosphorylated similarly at the SalI site. Subsequent digestion with EcoRI produced a 195-bp band. The pSVSN coding strand was phosphorylated at the HindIII site, and subsequent digestion with NdeI produced a fragment 886 bp in length. These fragments were isolated by polyacrylamide gel electrophoresis followed by elution in an Elutrap (Schleicher & Schuell, Inc.). The pSVSN noncoding strand was labeled by isolating the 165-bp HindIII-SacI fragment and filling in the 5' overhang with the Klenow fragment of DNA polymerase and [α-32P]dATP. All phosphorylating reactions were performed as described previously (29).
DNase I footprinting reaction mixtures contained 60 µg of nuclear extract, 1 ng of 32P-labeled DNA, 3 µg of poly(dI-dC)-poly(dI-dC) or poly(dA-dT)-poly(dA-dT), 24 mM Tris (pH 7.4), 12% (vol/vol) glycerol, 60 mM KCl, 1.2 mM EDTA, 0.3 mM dithiothreitol, and 6 mM MgCl2, in a total volume of 20 µl. The reactions were incubated for 10 min at 24°C. DNase I (0.25 to 2 µg) was then added, and the samples were returned to 24°C for another 60 s. The reactions were immediately terminated by the addition of 4 µl of 0.25 M EDTA-1% sodium dodecyl sulfate (wt/vol), diluted to 75 µl, phenol extracted, and ethanol precipitated. Electrophoresis was carried out on an 8 M urea-6% or 8% polyacrylamide gel.
RESULTS
Delimitation of the DHFR minimal promoter region. The promoter region of the DHFR gene does not contain the CCAAT or TATA boxes that are commonly used as RNA polymerase II transcription initiation signals (31). Instead, the region directly upstream of the transcription initiation site consists of four copies of a 48-bp repeat, each of which contains a GC box that binds the transcription factor Sp1 (15). We have developed an in vitro transcription system for the DHFR promoter by using HeLa cell nuclear extract (18) and have now used this system to define the minimal region of genomic DNA necessary for accurate DHFR transcription initiation (Fig. 1). We had previously shown that a template with a 5' end extending to -87 retained transcriptional activity (18) but that a 5' deletion to nucleotide -50 inactivated the DHFR promoter. A template extending to nucleotide -65 is also active, as compared with the template extending to nucleotide -50, which does not support initiation from the DHFR start site (Fig. 1). These results correspond well to results of deletion studies of the hamster DHFR gene that found that the 5' limit of the hamster promoter was 48 bp 5' of the major transcription start site (10). The DHFR templates used in previous studies extended several hundred base pairs downstream of the transcription initiation site. We have now compared 3' promoter deletions
FIG. 1. Delimitation of the DHFR minimal promoter. (A) Transcription reactions were performed with DHFR promoter fragments and HeLa nuclear extract. The extent of DHFR sequences on each template is indicated above the lanes. The arrowheads indicate the expected size of the runoff transcript initiating at the DHFR major start site. No product of the correct size is seen when the -50/+52 or the -356/-30 template was used. The strong signals of 444 and 552 nucleotides seen in the -50/+52 and -270/+15 lanes, respectively, are due to end-to-end transcription of the template DNA by RNA polymerase. Similarly, the 800-nucleotide band in the -356/-30 lane is due to end-to-end transcription, whereas the three bands between the 311 and 444 markers are transcripts arising from minor start sites. The sizes of the molecular size markers (in base pairs) are indicated to the right of the figure. (B) Schematic of the DHFR promoter region. All sequences are numbered relative to +1 (the major DHFR transcription initiation site). Other start sites corresponding to RNAs transcribed from the opposite strand are also indicated (16, 41). The deletions that define the DHFR minimal promoter (-65 to +15) are shown. The small boxes below the line represent Sp1 consensus binding sites. The four DHFR-proximal Sp1 consensus sites are in the opposite orientation to the six upstream sites. The open boxes above the line represent the four 48-bp repeats.
and have found that a template with a 3' boundary of +15 can initiate accurately, but that further 3' deletion to -30 inactivates the promoter. Transcription does not initiate the correct distance downstream from the DHFR-proximal GC box on the template containing nucleotides -356 to -30, even though all four 48-bp repeats are retained. Thus, the boundaries of the region defined as the DHFR minimal promoter extend from nucleotides -65 to +15 and contain one binding site for the transcription factor Sp1 and the transcription initiation site, but no other consensus sequences for previously identified factors.
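The logic of the deletion analysis can be summarized in a short sketch: the minimal promoter is the interval shared by every template that initiates accurately, and every inactive template should lack part of that interval. The activity calls are taken from the text and the Fig. 1 legend; treating the -356/+61 template as active is an assumption based on its use as the wild-type reference. The data structure and function names are illustrative.

```python
# (5' boundary, 3' boundary, supports accurate initiation?) for each template.
TEMPLATES = {
    "-356/+61": (-356, 61, True),   # wild-type reference (activity assumed)
    "-87/+52": (-87, 52, True),
    "-65/+52": (-65, 52, True),
    "-50/+52": (-50, 52, False),
    "-270/+15": (-270, 15, True),
    "-356/-30": (-356, -30, False),
}


def minimal_region(templates):
    """Interval common to all active templates: innermost 5' and 3' boundaries."""
    active = [(start, end) for start, end, ok in templates.values() if ok]
    return max(start for start, _ in active), min(end for _, end in active)


def contains(template, region):
    """True if the template spans the whole region."""
    return template[0] <= region[0] and template[1] >= region[1]


region = minimal_region(TEMPLATES)
consistent = all(
    contains((start, end), region) == ok for start, end, ok in TEMPLATES.values()
)

if __name__ == "__main__":
    print("region shared by active templates:", region)
    print("all templates consistent with that region:", consistent)
```

With these six templates the shared interval is -65 to +15. The interval is an upper bound on the required sequence, since finer deletions within it were not tested here.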
Proteins bind to two sites in the DHFR minimal promoter. We have examined the protein-binding sites in the DHFR minimal promoter region. A DNA probe that was 5' end labeled at nucleotide -87 was used in DNase I protection assays with HeLa nuclear extract. Only two regions of this probe were protected from DNase I cleavage. One protected region spans the GC box, protecting nucleotides -60 through -40 in the absence of polyethylene glycol and -60 through -33 in the presence of 3% polyethylene glycol (Fig. 2A, lanes 2 to 4). The other protected region spans the transcription initiation site, protecting nucleotides -11 through +9. Addition of volume excluders such as polyethylene glycol or polyvinyl alcohol may increase the detection of protein-DNA interactions. However, reactions containing up to 3% polyethylene glycol (lane 4) or 4.5% polyvinyl alcohol (data not shown) did not reveal any other protein-binding sites within the minimal promoter. The two binding sites detected correspond to regions required for transcription in vitro (Fig. 1). Deletion from nucleotides -65 to -50 removes half of the GC box and inactivates the promoter. Deletion from +15 to -30 abolishes correctly positioned initiation, resulting in spurious initiation sites throughout the template.
To determine whether binding at the transcription initiation site was dependent upon formation of a functional transcription complex, we assayed a transcriptionally inactive DHFR promoter construct by using DNase I protection. Because this fragment, containing DHFR sequences from nucleotides -50 to +52, lacks most of the GC box, binding
CCTTGGTOGGGGCGGGGCCTMGCTGCGCMGTGGTACAGAGCTCAGGGCTGCGATTICGCGcCMACTTGACGGC GGMOCCACCCCCGCCCOGGATTCACGCGUCACCATGTGTCGAGTCCCGACCTAMGCGCCGGTTTGMCTGCC
FIG. 2. DNase I protection of the DHFR promoter region. (A) The coding strand of pBSpro19, containing DHFR sequences from -87 to +52, was 5' end labeled and digested with DNase I in the absence (lane 1) and presence (lanes 2 to 4) of 60 µg of HeLa nuclear extract and in the presence of 0% (lane 2), 1.5% (lane 3), or 3% (lane 4) polyethylene glycol (PEG). Lane 5 shows the position of G nucleotides in the fragment (30). Symbols mark the regions protected from DNase I cleavage, the position of the transcription initiation site (40), and the Sp1 consensus site. (B) Both the noncoding (lanes 1 to 3) and the coding (lanes 4 to 6) strands of pBSpro18, containing DHFR sequences -50 to +52 (but lacking the Sp1-binding site required for transcription), were digested with DNase I in the presence (lanes 1 and 6) and absence (lanes 2 and 5) of 60 µg of HeLa nuclear extract. Lanes 3 and 4 show the positions of G and A nucleotides in the sequence (30). Symbols mark the regions protected from DNase I cleavage and the position of the transcription initiation site. (C) Sequence of the DHFR minimal promoter. Sequences protected from DNase I cleavage on the coding and noncoding strands are indicated by lines above and below the sequence, respectively. The sequence protected by Sp1 on the noncoding strand was determined by DNase I digestion of a 5'-end-labeled noncoding strand fragment from pBSpro19 (data not shown) and is identical to the protected region described by Dynan et al. (15).
of Sp1 was not observed. However, protein did bind to the initiation site (Fig. 2B). Although the initiation site is in the center of the protected region on the coding strand (Fig. 2B, lane 6), the noncoding (template) strand is protected mainly 5' of the initiation site (Fig. 2B, lane 1). Our results indicate that protein binding to the DHFR initiation site is independent of Sp1 binding and therefore does not require formation of a functional transcription complex.
The location of the DHFR initiation site determines the position of transcription initiation. We have shown that a deletion of DHFR sequences from nucleotides -30 to +15, including the protected region spanning the initiation site, abolishes correctly initiated transcription (Fig. 1). For many promoters, the sequence at the initiation site is less important than the TATA box located 30 nucleotides upstream. The TATA box can direct RNA polymerase II to initiate 30 nucleotides downstream even after replacement of the initiation site with random sequence. To assess the relative
FIG. 3. The location of the DHFR initiation site determines the position of transcription initiation. (A) Primer extension analysis of in vitro transcriptions from templates diagrammed in panel C. Lane 1 contains molecular size markers (in base pairs). Arrowheads correspond to the initiation sites marked with the identical arrowheads in panel C. (B) Primer extension analysis of in vitro transcriptions from the GC template (lane 3) and the GCDI template (lane 2), both diagrammed in panel C. Arrowheads correspond to the initiation sites marked with the identical arrowheads in panel C. (C) Partial sequence of the coding strands of the templates used in panels A and B. The wild-type template, from pST410, includes DHFR sequences from -356 to +61. The STU template, from pSTUmp19, is identical to the wild-type promoter, except that the G at -17 has been changed to a C. STU+10, STU+14, and STU+28 (from pSTU+10mp19, pSTU+14mp19, and pSTU+28mp19, respectively) have linkers of 10, 14, and 28 bp inserted at the StuI site of the STU template. The GC template contains a GC box cloned into pUC19. The GCDI template contains a GC box and the DHFR initiation site cloned into pUC19. Underlined nucleotides indicate the Sp1 and HIP1 consensus sites. Nucleotides that do not correspond to DHFR sequences are in lowercase letters. Arrowheads indicate the sites of transcription initiation that were determined by the reactions in panels A and B.
contribution of the transcription initiation site and the -30 region, we performed the following experiments.
To determine whether the binding of protein to the DHFR initiation site influenced the position of transcription initiation, we tested the effect of moving the initiation site farther from the upstream elements. We created a StuI restriction site by a single-base-pair substitution at the 5' boundary of the protected region spanning the transcription initiation site. We then inserted oligonucleotides 10, 14, and 28 bp in length into this StuI site and assayed for in vitro transcriptional activity by primer extension of transcription products (Fig. 3A). If a factor binding upstream of the protected region specifies the position of transcription initiation by directing RNA polymerase to initiate a fixed distance downstream, analogous to TFIID, the lengths of the RNAs from the insertion mutants would be increased by 10, 14, and 28 bases. If the protein binding to the initiation site specifies the position of initiation, then transcription would initiate at this site in any location and the RNAs would be the same length regardless of the insert size. The creation of the StuI site did not significantly change the level or site of initiation (Fig. 3A, lane 3). Analysis of the different insertion templates indicated that transcription initiated within the protein binding site, despite its greater distance from upstream elements (Fig. 3A, lanes 4 to 6). However, with the 10-bp insertion, transcription initiated equally often at the normal initiating nucleotide and 5 bp upstream, at the G of the GCCA element. This is a minor site of initiation in the DHFR promoter both in vivo (J. Flatt, unpublished data) and in vitro (Fig. 3A, lanes 2 and 3). The 14- and 28-bp insertion mutations initiated predominantly at this G and, less frequently, at the A of the wild-type promoter. The level of transcription from every insertion template was lower than the level from the wild-type and STU templates, suggesting that there is an optimal distance between the transcription initiation site and upstream elements for efficient transcription.
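The two competing predictions for the insertion mutants can be written out explicitly. In the sketch below the baseline product length of 100 nucleotides is a hypothetical value chosen for illustration (the actual lengths are read from the gel in Fig. 3A), and the model names are labels introduced here. The observed start sites are expressed as offsets from the wild-type product length, following the description in the text.

```python
def predicted_length(baseline_nt: int, insert_bp: int, model: str) -> int:
    """Primer extension product length expected under each positioning model."""
    if model == "upstream_fixed_distance":
        # The start site stays a fixed distance from the upstream element, so it
        # moves away from the downstream primer by the size of the insert.
        return baseline_nt + insert_bp
    if model == "initiator":
        # The start site travels with the initiator element; length is unchanged.
        return baseline_nt
    raise ValueError(f"unknown model: {model}")


BASELINE_NT = 100      # hypothetical wild-type product length, not from the paper
MINOR_SITE_OFFSET = 5  # the G of the GCCA element lies 5 bp upstream of the major start

# Start sites reported in the text, as extra nucleotides relative to wild type.
OBSERVED_OFFSETS = {
    10: {0, MINOR_SITE_OFFSET},
    14: {0, MINOR_SITE_OFFSET},
    28: {0, MINOR_SITE_OFFSET},
}

# The fixed-distance model would require an offset equal to the insert size.
fixed_distance_supported = any(
    insert_bp in offsets for insert_bp, offsets in OBSERVED_OFFSETS.items()
)

if __name__ == "__main__":
    for insert_bp, offsets in OBSERVED_OFFSETS.items():
        fixed = predicted_length(BASELINE_NT, insert_bp, "upstream_fixed_distance")
        observed = sorted(BASELINE_NT + off for off in offsets)
        print(f"+{insert_bp} bp: fixed-distance model predicts {fixed}, observed {observed}")
    print("fixed-distance model supported:", fixed_distance_supported)
```

None of the reported start sites shifts by the size of the insert, which is the pattern expected if the element at the initiation site, rather than an upstream element, positions initiation.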
Construction of a synthetic promoter. We have shown that removal of either the Sp1 binding site or the region containing the initiation site abolishes transcriptional activity, demonstrating the importance of both elements (Fig. 1). To examine the requirement of the sequences between the Sp1 and the transcription initiation sites, we cloned oligonucleotides containing these sites into the polylinker region of pUC19, separated by the same number of nucleotides as in the DHFR promoter (plasmid GCDI [Fig. 3C]), and assayed this template for transcription in vitro (Fig. 3B). Although insertion of an Sp1-binding site was not sufficient for transcriptional activity (Fig. 3B, lane 3), insertion of both the Sp1 and the DHFR initiation site oligonucleotides resulted in initiation of transcription within the DHFR initiation site oligonucleotide (Fig. 3B, lane 2). Thus, one GC box and the DHFR initiation site oligonucleotide are sufficient for accurate transcription in vitro.
A protein binds to the SV40 late initiation site. Because the DNA sequence at the initiation site of the SV40 late promoter is very similar to the sequence at the DHFR initiation site (see Fig. 5), we examined this region of the SV40 late
promoter for protein-DNA interactions. DNase I footprinting of SV40 sequences from nucleotides 200 to 345 (the major initiation site is at nucleotide 325) revealed binding at two sites, one spanning the initiation site and the other centered approximately 45 bp upstream of the initiation site (Fig. 4, lanes 3 and 6). Examination of this upstream site revealed that it also contained a sequence, TTTCCGCC, similar to that at the DHFR initiation site.
To determine whether the protein that binds to the DHFR transcription initiation site is also involved in SV40 late transcription, we used concatemerized oligonucleotides containing the region from nucleotides -16 to +9 spanning the DHFR initiation site as a competitor of SV40 late promoter in vitro transcription reactions (Fig. 4B). A decrease in transcriptional activity from an SV40 late promoter fragment that is added to the reaction after the competitor DNA would indicate that the protein binding to the DHFR initiation site oligonucleotide was required for SV40 late transcription. We examined the effects of excess DHFR initiation site oligonucleotides on transcription from the SV40 early and late promoters. As discussed above, the SV40 major late initiation site, at nucleotide 325, has homology to the DHFR initiation site. The initiation site at nucleotide 170 is homologous to the 5' half of the DHFR initiation site, having the sequence TTTC. The SV40 early start sites do not have homology to the DHFR initiation site. Concatemerized DHFR initiation site oligonucleotides (200 ng) reduced transcription from both the 170 and 325 start sites of the SV40 late promoter and led to novel upstream initiations (Fig. 4B, lane 4). Excess DHFR initiation site oligonucleotides increased transcription from the SV40 early start sites (Fig. 4B, lanes 5 to 8). Control reactions were performed in which 200 ng of a concatemerized oligonucleotide containing a mutated (nonfunctional) heat shock element was added to transcription reactions before the SV40 late or early promoter fragments. No difference in transcriptional activity from either template was observed (data not shown), demonstrating that the effects on transcription caused by the DHFR initiation site concatemer were not due simply to excess DNA in the reaction.
These results of competition with the DHFR initiation site concatemer suggest that the availability of the protein that binds to the DHFR initiation site may be important for determining the efficiency of transcription in the early (versus the late) direction of SV40 transcription. Initiation from the late start sites requires this protein, whereas initiation from the early start sites occurs more efficiently in its absence.
Other non-TATA box genes have sequence homology to DHFR at their initiation sites. The transcription initiation sites shown in Fig. 5 exhibit homology to the 11-bp sequence immediately preceding the DHFR initiation site. None of these promoters has a TATA box appropriately positioned near its transcription initiation site. In particular, the major initiation site of the SV40 late promoter is strikingly similar to the DHFR sequence. These two genes have the sequence ATTTCNNGCCA. However, transcription initiates at the 3' end of the consensus sequence in the mouse DHFR promoter but at the 5' end of the consensus sequence in the SV40 late promoter and in the hamster and human DHFR genes (Fig. 5). Comparison of the initiation sites listed in Fig. 5 suggests that this sequence may be composed of two elements corresponding to the sequences ATTTC and GCCA, which can be separated by 1 to 19 nucleotides. For each of the genes listed in Fig. 5, transcription initiates at either or both of these elements. Because the protein(s) binding to the DHFR initiation site protects a sequence
FIG. 4. The SV40 late promoter binds protein at its major initiation site. (A) The coding (lanes 1 to 3) and noncoding (lanes 4 to 6) strands of pSVSN were assayed for DNase I cleavage in the presence (lanes 3 and 6) and absence (lanes 2 and 5) of 60 µg of HeLa nuclear extract. Lanes 1 and 4 show the positions of G nucleotides (30). Symbols mark the regions protected from DNase I cleavage and the position of the transcription initiation site. Nucleotides protected from cleavage are indicated on either side. (B) Oligonucleotide competition of transcription from the SV40 early and late promoters. Transcription reactions were preincubated with or without excess concatemerized DHFR initiation site oligonucleotides (DINIT) before addition of the SV40 late promoter pGemHindIII-C-HindIII fragment (lanes 1 to 4) or the SV40 early promoter pSVS-SphI-NdeI fragment (lanes 5 to 8). Arrows indicate the position of correctly initiated RNA. Lane 9 shows the positions of DNA molecular size markers.
common to other housekeeping genes, we refer to the protein(s) as HIP1, for housekeeping initiation protein 1.
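The two-part consensus described above (ATTTC and GCCA separated by 1 to 19 nucleotides) lends itself to a simple pattern search. The sketch below uses a regular expression as an illustration; it is not the search procedure used by the authors, and the function name is hypothetical. It is applied to the upper strand of the DHFR initiation site oligonucleotide given in Materials and Methods.

```python
import re

# ATTTC and GCCA half-sites separated by 1 to 19 nucleotides. The spacer is
# matched lazily so that the nearest downstream GCCA is reported.
CONSENSUS = re.compile(r"ATTTC([ACGT]{1,19}?)GCCA")


def find_initiator_like_sites(seq: str):
    """Return (start index, matched sequence, spacer length) for each non-overlapping match."""
    seq = seq.upper()
    return [
        (m.start(), m.group(0), len(m.group(1))) for m in CONSENSUS.finditer(seq)
    ]


dhfr_oligo = "AATTCATTTCGCGCCAAACTTGACG"  # upper strand of the DHFR initiation site oligonucleotide
sp1_oligo = "GATCGGGGCGGGGC"              # upper strand of the Sp1-binding-site oligonucleotide

if __name__ == "__main__":
    print(find_initiator_like_sites(dhfr_oligo))
    print(find_initiator_like_sites(sp1_oligo))
```

The DHFR oligonucleotide yields a single match, ATTTCGCGCCA, with a two-nucleotide spacer, which is the ATTTCNNGCCA arrangement noted in the text; the Sp1 oligonucleotide yields none. Because `finditer` reports non-overlapping matches only, a genome-scale search would need additional handling of overlapping candidates.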
DISCUSSION
HIP1 binds to the DHFR initiation site. The question of how the site of transcription initiation is specified for genes lacking a recognizable TATA box has generated considerable debate. We have now shown that the sequence at the DHFR initiation site, rather than sequences farther upstream, binds protein and positions RNA polymerase II to initiate transcription at that site.
Binding of a protein other than RNA polymerase to the start site of transcription is not well documented. Although the transcription factor TFIIB may bind in the vicinity of the initiation site (7), it does not appear to bind DNA directly (45), but is positioned by binding to other proteins already bound to the promoter. The herpes simplex virus ICP4 protein binds to its own transcription initiation site in a negative, autoregulatory manner (37). A cellular protein spans from nucleotides -17 to +27 of the human immunodeficiency virus type 1 promoter. However, unlike the DHFR promoter, the human immunodeficiency virus type 1 TATA element appears to be responsible for positioning transcription initiation, since insertion of a DNA fragment downstream of the TATA element causes an upstream shift in the transcription initiation site (25). There are four reasons for our belief that the protein we are detecting is not RNA polymerase.
Fig. 4B lane labels: concatemerized DINIT sites (ng): 0, 20, 100, 200 | 0, 20, 100, 200
ACAGCTCAGGGCTGCGAITCGCCCAAACTT DHFR
GGGCGGGGCGGCCACAA2TICGCGECCAACTT DHFR (human)
GCGCCGGGCGAATGCAAIT=GC~CCf.AACTT DHFR (hamster)
TCCTCTTTCAGAGGTTAIIICAGGCCATGGTG SV40 late
CCGGCAGCGQITTGAGCCATTGC HPRT
CCATCGCGCACTCCGGCTCGAIICGfCAGGCGGCG Ki-RAS
GCGGTGTTCCGCA=~TCAAGCCTCC PGK
TAAACCCCTCCACA ITCTGCAGCCC Osteonectin
AGAVT VTTCGCGGCGCCGCGGACTCGCAGTG
CAGATITTCGGTCCCGGAAGTGTg&AAGATGGC SURF-1
FIG. 5. Housekeeping genes having sequence homologies to the DHFR start site. Genes having initiation sites homologous to the DHFR initiation site were identified by examining a collection of manuscripts concerning non-TATA box promoters and by searching GenBank with the consensus sequence ATTTCN(1-30)GCCA (only the identified consensus sequence homologies that occur at or near transcription initiation sites are shown in this figure). These genes represent those having the best homology to the consensus sequence. The sequences underlined are homologous to the DHFR initiation site. Arrowheads indicate the sites of transcription initiation. References for these sequences are as follows: DHFR (8, 34, 40); SV40 late promoter (20); HPRT (hypoxanthine phosphoribosyltransferase) (33); Ki-RAS (22); PGK (3-phosphoglycerate kinase) (42); osteonectin (32); IRF-1 (interferon regulatory factor 1) (36); SURF-1 (44).
appears that a previously unidentified protein binds to the DHFR transcription initiation site.
The initiation site plays a variety of roles in different promoters. Most yeast genes contain one or more TATA boxes that are required for transcription, yet rely upon sequences surrounding the initiation site to position transcription initiation (9, 21, 38). In contrast, mammalian TATA boxes direct RNA polymerase II to initiate transcription a fixed distance downstream. If the region downstream of the TATA box, including the initiation site, is deleted and replaced with random sequences, RNA polymerase II can still initiate transcription accurately, although the level of transcription may decrease (11-13).
Recently, the region containing the transcription initiation site has been demonstrated to position the start of transcription of two mammalian genes that do not contain TATA boxes. Smale and Baltimore (43) demonstrated that the region around the terminal deoxynucleotidyltransferase (TdT) initiation site is required for transcription. Deletion or mutation of this region eliminates or reduces transcription, respectively. A 17-bp region containing the start site is sufficient to position low levels of initiation in the absence of other elements and higher levels when combined with upstream elements such as a TATA box or GC box. The TdT gene is not a housekeeping gene: its expression is limited to precursor B and T lymphocytes (27). The mechanism of initiation for the TdT gene is distinct from the mechanism used by the DHFR gene. The DHFR initiation site cannot function in the absence of an Sp1-binding site and bears no apparent sequence homology to the TdT initiation site. In addition, Smale and Baltimore could detect no protein binding to the TdT initiation site (43).
Ayer and Dynan (2) showed that substitution of nucleotides around the major late initiation site of SV40 decreased levels of transcription in vitro and changed the site of initiation. Close examination of their results reveals that substitution of the ATTTC at the initiation site with random sequence caused transcription to initiate 10 bp downstream, at the GCCA sequence. Through comparison of the protected sequences and inhibition of transcriptional activity with oligonucleotides derived from the DHFR initiation site, we believe that the protein responsible for determining the initiation site for the SV40 late gene is the same as or similar to the one responsible for this activity in the DHFR gene. A second sequence homologous to the HIP1-binding site occurs 45 bp upstream of the SV40 late initiation site and also binds a protein that may be HIP1. Mutation of this upstream sequence and of sequences between the two HIP1 sequences reduces transcription in vivo and in vitro (2, 4).
First, binding occurs to a template that lacks a The HIPl-binding site in the DHFR promoter positions the GC box and is transcriptionally inactive. Second, the foot- site of transcription initiation. We find that transcription printing reactions contain a 3,000-fold excess of nonspecific initiates at the HIP1 site when 10, 14, or 28 nucleotides are competitor DNA. Proteins that bind DNA in a nonspecific inserted between it and all other upstream sequences. Since manner, such as RNA polymerases that bind to ends of templates with inserts of 14 and 28 bp give similar levels of fragments, are thus outcompeted and do not bind to the transcription, there does not appear to be a preference for labeled DNA. Third, the footprint we detected is smaller one side of the helix. Insertion of 10 bp upstream of the HIP1 than the RNA polymerase II footprint observed on other site decreases transcription more than insertion of 14 or 28 promoters (7, 23). The footprint on the DHFR noncoding, or bp, possibly owing to the high G+C content and palindromic template, strand does not extend far 3' of the initiating nature of the insert. nucleotide. RNA polymerase II protects this strand at least To demonstrate more conclusively that the transcription 15 bp 3' of the initiation site in the adenovirus type 2 major initiation site of the DHFR promoter is a positioning element late promoter. Fourth, addition of an excess of HIPl-binding and not just a preferred initiation sequence for a factor sites increases transcription from the SV40 early promoter in binding elsewhere, oligonucleotides corresponding to an Spl vitro. Since this promoter is transcribed by RNA polymerase and a HIP1 site were cloned the appropriate distance apart in II, we cannot be inhibiting RNA polymerase II, because bacterial sequences of the pUC19 plasmid (Fig. 3C). When transcription would then decline, not increase. Thus, it just an Spl site was cloned into pUC19, no transcription was
observed. When both an Sp1 and a HIP1 consensus site were cloned into pUC19, transcription initiated at the HIP1 site, demonstrating that no DHFR sequences other than the Sp1 site and the HIP1 site are required for accurate initiation.
Within the HIP1 sites of different genes, transcription may initiate primarily from the 5' end or the 3' end of either of the two sequence elements that constitute the HIP1 consensus sequence (Fig. 5). Transcription initiates primarily from the nucleotide next to the 3' end of the HIP1 consensus sequence in the mouse DHFR promoter. However, when the HIP1 site is isolated from surrounding sequences and cloned into a bacterial vector containing an Sp1-binding site, transcription initiates primarily from the 5' end of the HIP1 sequence (at the HIP1 nucleotide that initiates hamster DHFR, SV40 late, and osteonectin transcription) and secondarily from the 3' end of the sequence. Therefore, the choice of initiating nucleotides is not inherent in the HIP1 sequence, but is influenced by the surrounding sequences. Although mouse and human DHFR genes initiate at different nucleotides within the HIP1 element, the mouse initiation site is used by the mouse promoter in transcription extracts prepared from human (Fig. 3) and mouse (18) cells. This indicates that the difference is not due to differences in HIP1 between human and mouse cells. The DNA sequence immediately upstream of the HIP1 consensus sequence is slightly different in the human, hamster, and mouse genes. It is possible that these sequence differences are responsible for the slightly different start sites. We are currently performing mutagenesis of this region and testing whether the binding of other transcription factors influences the exact site of initiation within the HIP1 site.
In summary, our data are consistent with the conclusion that the same or a similar protein(s) binds to the initiation sites of the DHFR and SV40 late genes, and sequence data suggest that this protein(s) may bind to other housekeeping genes. Our results suggest the existence of at least two mechanisms for specifying the site of transcription initiation, one used by many high-expression, tissue-specific promoters and controlled by TFIID, and the other used by several low-expression, housekeeping promoters and controlled by HIP1. To determine the number and variability of HIP1 proteins as well as to assess their specificity and affinity for various promoters, we are beginning experiments to purify and clone the HIP1 protein(s).
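The consensus used for the GenBank search in the legend to Fig. 5, ATTTCN(1-30)GCCA, can be expressed as a regular expression. The following is a minimal sketch in Python, not the search procedure used in this study; the function name `find_consensus` is hypothetical, and the scan covers only the strand supplied.

```python
import re

# Consensus from the Fig. 5 legend: ATTTC, then 1 to 30 nucleotides, then GCCA.
# The lookahead allows overlapping candidates to be reported, and the lazy
# quantifier pairs each ATTTC with the nearest downstream GCCA.
HIP1_CONSENSUS = re.compile(r"(?=(ATTTC[ACGT]{1,30}?GCCA))")

def find_consensus(seq):
    """Return (0-based start, matched text) pairs found on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in HIP1_CONSENSUS.finditer(seq)]

# Top strand of the DHFR initiation site oligonucleotide from Materials and Methods.
dhfr_oligo = "AATTCATTTCGCGCCAAACTTGACG"
print(find_consensus(dhfr_oligo))
```

In the DHFR oligonucleotide the two halves of the consensus are separated by two nucleotides. A fuller search would also scan the reverse complement and restrict hits to positions near mapped initiation sites, as the legend describes.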
ACKNOWLEDGMENTS
We thank Jody Flatt for growing the HeLa cells and for allowing us to refer to unpublished results, Stephanie McMahon for technical assistance in sequencing plasmid clones, Charles Nicolet for the control oligonucleotides, and the laboratory of Janet Mertz for pGemHindIII-C. We are grateful to all the members of the P. J. Farnham and W. M. Sugden laboratories for valuable discussion and to the members of the McArdle Laboratory Tumor Biology Group for helpful comments on the manuscript.
This work was supported by Public Health Service grants CA45240 and CA07175 from the National Institutes of Health. A.L.M. was supported, in part, by training grant CA09135 from the National Institutes of Health.
LITERATURE CITED
1. Angel, P., M. Imagawa, R. Chiu, B. Stein, R. J. Imbra, H. J. Rahmsdorf, C. Jonat, P. Herrlich, and M. Karin. 1987. Phorbol ester-inducible genes contain a common cis element recognized by a TPA-modulated trans-acting factor. Cell 49:729-739.
2. Ayer, D. E., and W. S. Dynan. 1988. Simian virus 40 major late promoter: a novel tripartite structure that includes intragenic sequences. Mol. Cell. Biol. 8:2021-2033.
3. Borelli, M. J., M. A. Mackey, and W. C. Dewey. 1987. A method for freezing synchronous mitotic and G1 cells. Exp. Cell Res. 170:363-368.
4. Brady, J., M. Radonovich, M. Vodkin, V. Natarajan, M. Thoren, G. Das, J. Janik, and N. P. Salzman. 1982. Site-specific base substitution and deletion mutations that enhance or suppress transcription of the SV40 major late RNA. Cell 31:625-633.
5. Breathnach, R., and P. Chambon. 1981. Organization and expression of eucaryotic split genes coding for proteins. Annu. Rev. Biochem. 50:349-383.
6. Briggs, M. R., J. T. Kadonaga, S. P. Bell, and R. Tjian. 1986. Purification and biochemical characterization of the promoter-specific transcription factor, Sp1. Science 234:47-52.
7. Buratowski, S., S. Hahn, L. Guarente, and P. A. Sharp. 1989. Five intermediate complexes in transcription initiation by RNA polymerase II. Cell 56:549-561.
8. Chen, M.-J., T. Shimada, A. D. Moulton, A. Cline, R. K. Humphries, J. Maizel, and A. W. Nienhuis. 1984. The functional human dihydrofolate reductase gene. J. Biol. Chem. 259:3933-3943.
9. Chen, W., and K. Struhl. 1985. Yeast mRNA initiation sites are determined primarily by specific sequences, not by the distance from the TATA element. EMBO J. 4:3273-3280.
10. Ciudad, C. J., G. Urlaub, and L. A. Chasin. 1988. Deletion analysis of the Chinese hamster dihydrofolate reductase gene promoter. J. Biol. Chem. 263:16274-16282.
11. Concino, M. F., R. F. Lee, J. P. Merryweather, and R. Weinmann. 1984. The adenovirus major late promoter TATA box and initiation site are both necessary for transcription in vitro. Nucleic Acids Res. 12:7423-7433.
12. Corden, J., B. Wasylyk, A. Buchwalder, P. Sassone-Corsi, C. Kedinger, and P. Chambon. 1980. Promoter sequences of eukaryotic protein-coding genes. Science 209:1406-1414.
13. Dierks, P., A. van Ooyen, M. D. Cochran, C. Dobkin, J. Reiser, and C. Weissmann. 1983. Three regions upstream from the cap site are required for efficient and accurate transcription of the rabbit β-globin gene in mouse 3T6 cells. Cell 32:695-706.
14. Dignam, J. D., R. M. Lebovitz, and R. Roeder. 1983. Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res. 11:1475-1489.
15. Dynan, W. S., S. Sazer, R. Tjian, and R. T. Schimke. 1986. Transcription factor Sp1 recognizes a DNA sequence in the mouse dihydrofolate reductase promoter. Nature (London) 319:246-248.
16. Farnham, P. J., J. M. Abrams, and R. T. Schimke. 1985. Opposite-strand RNAs from the 5' flanking region of the mouse dihydrofolate reductase gene. Proc. Natl. Acad. Sci. USA 82:3978-3982.
17. Farnham, P. J., and R. T. Schimke. 1985. Transcriptional regulation of mouse dihydrofolate reductase in the cell cycle. J. Biol. Chem. 260:7675-7680.
18. Farnham, P. J., and R. T. Schimke. 1986. In vitro transcription and delimitation of promoter elements of the murine dihydrofolate reductase gene. Mol. Cell. Biol. 6:2392-2401.
19. Fromm, M., and P. Berg. 1982. Deletion mapping of DNA regions required for SV40 early region promoter function in vivo. J. Mol. Appl. Genet. 1:457-481.
20. Ghosh, P. K., V. B. Reddy, J. Swinscoe, P. Lebowitz, and S. M. Weissman. 1978. Heterogeneity and 5'-terminal structures of the late RNAs of simian virus 40. J. Mol. Biol. 126:813-846.
21. Hahn, S., E. T. Hoar, and L. Guarente. 1985. Each of three "TATA elements" specifies a subset of the transcription initiation sites at the CYC-1 promoter of Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 82:8562-8566.
22. Hoffman, E. K., S. P. Trusko, N. A. Freeman, and D. L. George. 1987. Structural and functional characterization of the promoter region of the mouse c-Ki-ras gene. Mol. Cell. Biol. 7:2592-2596.
23. Horikoshi, M., T. Hai, Y.-S. Lin, M. R. Green, and R. G. Roeder. 1988. Transcription factor ATF interacts with the TATA factor to facilitate establishment of a preinitiation complex. Cell 54:1033-1042.
24. Imagawa, M., R. Chiu, and M. Karin. 1987. Transcription factor AP-2 mediates induction by two different signal-transduction pathways: protein kinase C and cAMP. Cell 51:251-260.
25. Jones, K. A., P. A. Luciw, and N. Duchange. 1988. Structural arrangements of transcription control domains within the 5'-untranslated leader regions of the HIV-1 and HIV-2 promoters. Genes Dev. 2:1101-1114.
26. Kadonaga, J. T., A. J. Courey, J. Ladika, and R. Tjian. 1988. Distinct regions of Sp1 modulate DNA binding and transcriptional activation. Science 242:1566-1570.
27. Landau, N. R., T. P. St. John, I. L. Weissman, S. C. Wolf, A. E. Silverstone, and D. Baltimore. 1984. Cloning of terminal transferase cDNA by antibody screening. Proc. Natl. Acad. Sci. USA 81:5836-5840.
28. Lee, W., P. Mitchell, and R. Tjian. 1987. Purified transcription factor AP-1 interacts with TPA-inducible enhancer elements. Cell 49:741-752.
29. Maniatis, T., E. F. Fritsch, and J. Sambrook. 1982. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
30. Maxam, A. M., and W. Gilbert. 1980. Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65:499-560.
31. McGrogan, M., C. C. Simonsen, D. T. Smouse, P. J. Farnham, and R. T. Schimke. 1985. Heterogeneity at the 5' termini of mouse dihydrofolate reductase mRNAs. J. Biol. Chem. 260:2307-2314.
32. McVey, J. H., S. Nomura, P. Kelly, I. J. Mason, and B. L. M. Hogan. 1988. Characterization of the mouse sparc/osteonectin gene. J. Biol. Chem. 263:11111-11116.
33. Melton, D. W., C. McEwan, A. B. McKie, and A. M. Reid. 1986. Expression of the mouse HPRT gene: deletional analysis of the promoter region of an X-chromosome linked housekeeping gene. Cell 44:319-328.
34. Mitchell, P. J., A. M. Carothers, J. H. Han, J. D. Harding, E. Kas, L. Venolia, and L. A. Chasin. 1986. Multiple transcription start sites, DNase I-hypersensitive sites, and an opposite-strand exon in the 5' region of the CHO dhfr gene. Mol. Cell. Biol. 6:425-440.
35. Mitchell, P. J., C. Wang, and R. Tjian. 1987. Positive and negative regulation of transcription in vitro: enhancer-binding protein AP-2 is inhibited by SV40 T antigen. Cell 50:847-861.
36. Miyamoto, M., T. Fujita, Y. Kimura, M. Maruyama, H. Harada, Y. Sudo, T. Miyata, and T. Taniguchi. 1988. Regulated expression of a gene encoding a nuclear factor, IRF-1, that specifically binds to IFN-β gene regulatory elements. Cell 54:903-913.
37. Muller, M. T. 1987. Binding of the herpes simplex virus immediate-early gene product ICP4 to its own transcription start site. J. Virol. 61:858-865.
38. Nagawa, F., and G. R. Fink. 1985. The relationship between the "TATA" sequence and transcription initiation sites at the HIS4 gene of Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 82:8557-8561.
39. Nakajima, N., M. Horikoshi, and R. G. Roeder. 1988. Factors involved in specific transcription by mammalian RNA polymerase II: purification, genetic specificity, and TATA box-promoter interactions of TFIID. Mol. Cell. Biol. 8:4028-4040.
40. Sazer, S., and R. T. Schimke. 1986. A re-examination of the 5' termini of mouse dihydrofolate reductase RNA. J. Biol. Chem. 261:4685-4690.
41. Schilling, L. J., and P. J. Farnham. 1989. Identification of a new promoter upstream of the murine dihydrofolate reductase gene. Mol. Cell. Biol. 9:4568-4570.
42. Singer-Sam, J., D. H. Keith, K. Tani, R. L. Simmer, L. Shively, S. Lindsay, A. Toshida, and A. D. Riggs. 1984. Sequence of the promoter region of the gene for human X-linked 3-phosphoglycerate kinase. Gene 32:409-417.
43. Smale, S. T., and D. Baltimore. 1989. The "initiator" as a transcription control element. Cell 57:103-113.
44. Williams, T. J., and M. Fried. 1986. The MES-1 murine enhancer element is closely associated with the heterogeneous 5' ends of two divergent transcription units. Mol. Cell. Biol. 6:4558-4569.
45. Zheng, X.-M., V. Moncollin, J.-M. Egly, and P. Chambon. 1987. A general transcription factor forms a stable complex with RNA polymerase B (II). Cell 50:361-368.
Interactions between transcription factors and specific DNA sequences within an RNA polymerase II promoter can be grouped into two categories, depending upon how they influence transcription. One class of factors, usually binding at least 50 base pairs (bp) upstream of the initiation site, regulates the efficiency of transcription, presumably by altering the rate or conformation of polymerase attachment. Examples of this class of factors are Sp1 (6, 26), Ap1 (1, 28), and Ap2 (24, 35). Deletion of binding sites for these factors results in a gradual reduction in transcriptional activity until all such sites are removed. The second class of transcription factors specifies the site of initiation. The only previously characterized transcription factor known to influence the site of initiation is TFIID, a protein that binds to an A+T-rich sequence called a TATA box and directs RNA polymerase II to start transcription approximately 30 bp downstream (5, 39). Deletion of the TATA box can result in spurious initiations and a low level of transcription (5). However, not every gene contains a TFIID consensus sequence at the correct distance upstream of the transcription initiation site. In particular, many cellular genes that are expressed at low levels and encode proteins found in all cell types (so-called housekeeping genes) do not have a TFIID consensus sequence. One example is the dihydrofolate reductase (DHFR) gene.
The DHFR gene is expressed throughout the cell cycle in proliferating cells, but its transcription rate increases sevenfold at the G1-S phase boundary (17). We wish to understand the mechanism of this regulation and to characterize the factors required for the transcription of DHFR and other housekeeping genes. Toward this goal, we have identified DHFR promoter deletions that define the 5' and 3' boundaries of the region that is absolutely required for DHFR transcription. We refer to the region defined by these deletions, containing nucleotides -65 to +15, as the DHFR minimal promoter. We have examined the DNA-protein interactions in this minimal promoter; we found that a protein binds to the transcription initiation site of the DHFR gene and specifies the site of initiation, and that binding of this factor and Sp1 are sufficient for accurate transcription initiation.
* Corresponding author.
MATERIALS AND METHODS
Cells and extract. HeLa cells were grown in alpha minimal essential medium plus 5% supplemented calf serum (Hyclone) to a density of 2 × 10^5 to 5 × 10^5 cells per ml. Nuclear extracts (14) were made from approximately 10^9 cells either on the day of cell harvest or from frozen cells (3).
Construction of plasmids. Plasmids containing sequences from the murine DHFR gene are pST410, pBSpro18, pDFX120, pBSpro19, pDMM285, and pSR320, containing DHFR nucleotides -356 to +61, -50 to +52, -65 to +52, -87 to +52, -270 to +15, and -356 to -30, respectively. pST410 was created by insertion of a SmaI-TaqI fragment (nucleotides -356 to +61) into the SmaI and AccI sites of pUC9. pBSpro18 and pBSpro19 were created by insertion of EcoRI-XbaI fragments from pdpro18 (31) and pdpro19 (31) into the EcoRI and XbaI sites of pBSM13+ (Stratagene Inc.). pDMM285 was created by insertion of a MaeI fragment (nucleotides -270 to +15) into the SmaI site of pUC9. pSR320 was created by insertion of a PvuII-RsaI fragment (which includes DHFR sequences from -356 to -30, as well as vector DNA) from pSS625 (17) into the SmaI site of pUC9. pDFX120 was created by insertion of a FokI-XbaI fragment from pBSpro19 into pBSM13-. Note that, unlike previous papers in which the translation start codon was numbered +1, numbering is relative to the transcription start site at +1 (the position of the translation start codon is now +56). pSVS contains the entire simian virus 40 (SV40) genome (19). pGemHindIII-C consists of the HindIII-C fragment of SV40 inserted into the HindIII site of pGemZF and was a gift from the laboratory of J. E. Mertz.
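The change in numbering convention noted above can be made explicit with a short conversion. The sketch below is in Python and rests on one assumption not stated in the text: that neither numbering system has a position 0, so that -1 lies immediately upstream of +1. The function names are hypothetical.

```python
TRANSLATION_START_NEW = 56  # stated in the text: the start codon is now +56

def _to_linear(pos):
    """Map a position in a no-zero numbering onto a continuous integer axis."""
    if pos == 0:
        raise ValueError("position 0 does not exist in this convention")
    return pos if pos > 0 else pos + 1

def _from_linear(idx):
    """Inverse of _to_linear."""
    return idx if idx > 0 else idx - 1

def old_to_new(pos_old):
    """Convert a position numbered from the start codon (+1) to the
    numbering used here, in which the transcription start site is +1."""
    return _from_linear(_to_linear(pos_old) + TRANSLATION_START_NEW - 1)

def span_length(start, end):
    """Number of base pairs from start to end inclusive (assumed no-zero numbering)."""
    return _to_linear(end) - _to_linear(start) + 1

print(old_to_new(1))         # start codon in the new numbering
print(span_length(-65, 15))  # length of the -65 to +15 region
```

Under this assumption, the region from -65 to +15 spans 80 bp.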
pSVSN contains the SphI-NaeI fragment from pSVS (nucleotides 200 to 345) cloned into pBSM13- at the SphI and SmaI sites. pST410mp19 contains the EcoRI-HindIII fragment of pST410 cloned into the corresponding sites of M13mp19. pSTUmp19 was derived from pST410mp19 by site-directed mutagenesis with the Bio-Rad Muta-Gene in vitro mutagenesis kit and contains a single-base-pair substitution at nucleotide -17 which changes the G of the coding strand to a C, thereby creating a StuI restriction site. pSTUmp19 was cut with StuI, and 10- or 14-bp linkers were inserted. pSTU+10mp19 contains one copy of the XhoI linker 5'-CCCTCGAGGG-3'. pSTU+14mp19 contains one XbaI linker 5'-CTAGTCTAGACTAG-3'; pSTU+28mp19 contains two XbaI linkers. pGC was constructed by inserting the Sp1-binding-site oligonucleotide
5'-GATCGGGGCGGGGC-3'
    3'-CCCCGCCCCGCTAG-5'
into the BamHI site of pUC19. pGCDI was constructed from pGC by inserting the DHFR initiation site oligonucleotide
5'-AATTCATTTCGCGCCAAACTTGACG-3'
    3'-GTAAAGCGCGGTTTGAACTGCTTAA-5'
into the EcoRI site, digesting with XmaI, removing the 5' overhang with mung bean nuclease, and inserting the 14-bp XbaI linker shown above. Thus, pGCDI contains an Sp1-binding site and the DHFR initiation site separated by 37 bp of polylinker sequence (see Fig. 3C).
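The 10-, 14-, and 28-bp linker insertions described above move the initiation site to different rotational positions relative to upstream sequences. The following Python sketch estimates that rotation. It assumes a helical repeat of 10.5 bp per turn for B-form DNA, a textbook value that is not given in this paper, and the function name is hypothetical.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-form DNA; not stated in the paper

def rotation_degrees(insert_bp, bp_per_turn=BP_PER_TURN):
    """Angular displacement (0 to 360 degrees) introduced by an insert."""
    return (insert_bp / bp_per_turn * 360.0) % 360.0

for n in (10, 14, 28):
    print(f"{n:2d}-bp insert: {rotation_degrees(n):6.1f} degrees")
```

Under this assumption, the 14- and 28-bp inserts rotate the downstream DNA by roughly one-third and two-thirds of a turn, respectively, whereas the 10-bp insert is close to a full turn.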
In vitro transcriptions. Templates for in vitro transcriptions were prepared as follows: pST410 (-356 to +61), pBSpro19 (-87 to +52), pBSpro18 (-50 to +52), pDFX120 (-65 to +52), pDMM285 (-270 to +15), pSTUmp19, pSTU+10mp19, pSTU+14mp19, and pSTU+28mp19 were cleaved with PvuII; pST410 (-258 to +61) was cleaved with NotI and PvuII; pSR320 (-356 to -30) was cleaved with HaeII; pSVS was cleaved with SphI and NdeI; pGemHindIII-C was cleaved with HindIII; pGC and pGCDI were cleaved with HindIII and NdeI. The promoter-containing fragments were all isolated by polyacrylamide gel electrophoresis followed by electroelution.
In vitro transcription reactions (final volume, 25 µl) were performed as described previously (18), with modifications for primer extension analysis or oligonucleotide competition. For analysis by primer extension, 5 nM DNA was incubated for 15 min at 24°C with 2.4 µg of nuclear extract per µl in 6 mM MgCl2-24 mM Tris hydrochloride (pH 7.4)-12% (vol/vol) glycerol-60 mM KCl-0.12 mM EDTA-0.3 mM dithiothreitol-0.12 mM phenylmethylsulfonyl fluoride. Nucleoside triphosphates were then added to final concentrations of 600 µM GTP, CTP, and UTP and 200 µM ATP. After an additional 15 min at 24°C, the reactions were stopped, and the products were extracted and precipitated (18). The precipitates were suspended in 10 µl containing 100 fmol of 32P-end-labeled primer (29), 0.5 M NaCl, 10 mM Tris (pH 7.5), and 5 mM EDTA. This mixture was heated at 85°C for 5 min and then incubated at 60°C for 60 min. Then 40 µl containing 10 U of avian myeloblastosis virus reverse transcriptase (Life Sciences, Inc.), 20 U of RNasin (Promega Biotech), 10 mM MgCl2, 12.5 mM dithiothreitol, 1.25 mM each deoxynucleoside triphosphate, and 12.5 mM Tris (pH 8.5) was added, and incubation was continued for 45 min at 42°C. The reactions were stopped by addition of 50 µl of 1% sodium dodecyl sulfate (wt/vol) and 20 mM NaCl, precipitated with ethanol, and loaded onto an 8 M urea-8% polyacrylamide gel. The primer used for Fig. 3A anneals to pUC19 nucleotides 455 to 479, and the primer used for Fig. 3B anneals to pUC19 nucleotides 358 to 375.
If oligonucleotide competition was performed, concatemerized oligonucleotides were added 5 min prior to addition of the promoter-bearing fragment. The DHFR initiation site oligonucleotides used for competition are concatemers of the sequence
5'-AATTCTGCGATTTCGCGCCAAACTTGACG-3'
    3'-GACGCTAAAGCGCGGTTTGAACTGCTTAA-5'.
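The paired oligonucleotides given in this section can be checked for complementarity and for the single-stranded ends they leave after annealing. The Python sketch below does this for the three duplexes printed in the text; the function name `duplex_overhangs` is hypothetical, and the check considers only perfect pairing with a 5' extension on each strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def duplex_overhangs(top_5to3, bottom_3to5):
    """Find the register in which the strands pair and return the 5' overhangs.

    top_5to3 is written 5'->3' and bottom_3to5 is written 3'->5', as printed.
    Returns (top overhang, bottom overhang read 5'->3'), or None if the
    strands cannot be paired perfectly.
    """
    paired_bottom = top_5to3.translate(COMPLEMENT)  # base opposite each top base
    for shift in range(len(top_5to3)):
        overlap = len(top_5to3) - shift
        if overlap <= len(bottom_3to5) and paired_bottom[shift:] == bottom_3to5[:overlap]:
            top_overhang = top_5to3[:shift]
            bottom_overhang = bottom_3to5[overlap:][::-1]  # reversed to read 5'->3'
            return top_overhang, bottom_overhang
    return None

duplexes = {
    "Sp1 site": ("GATCGGGGCGGGGC", "CCCCGCCCCGCTAG"),
    "DHFR initiation site": ("AATTCATTTCGCGCCAAACTTGACG", "GTAAAGCGCGGTTTGAACTGCTTAA"),
    "competitor": ("AATTCTGCGATTTCGCGCCAAACTTGACG", "GACGCTAAAGCGCGGTTTGAACTGCTTAA"),
}
for name, (top, bottom) in duplexes.items():
    print(name, duplex_overhangs(top, bottom))
```

The Sp1 duplex carries GATC ends, matching the BamHI site used for cloning, and the initiation site duplexes carry AATT ends, matching the EcoRI site. Self-complementary ends of this kind also allow the competitor duplex to form concatemers.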
DNase I protection assays. DHFR coding strands from pBSpro18 and pBSpro19 were phosphorylated at the EcoRI site with T4 polynucleotide kinase and [γ-32P]ATP. A subsequent digestion with SphI yielded 171- and 210-bp fragments, respectively. The noncoding strand of pBSpro18 was phosphorylated similarly at the SalI site. Subsequent digestion with EcoRI produced a 195-bp band. The pSVSN coding strand was phosphorylated at the HindIII site, and subsequent digestion with NdeI produced a fragment 886 bp in length. These fragments were isolated by polyacrylamide gel electrophoresis followed by elution in an Elutrap (Schleicher & Schuell, Inc.). The pSVSN noncoding strand was labeled by isolating the 165-bp HindIII-SacI fragment and filling in the 5' overhang with the Klenow fragment of DNA polymerase and [α-32P]dATP. All phosphorylating reactions were performed as described previously (29).
DNase I footprinting reaction mixtures contained 60 µg of nuclear extract, 1 ng of 32P-labeled DNA, 3 µg of poly(dI-dC)-poly(dI-dC) or poly(dA-dT)-poly(dA-dT), 24 mM Tris (pH 7.4), 12% (vol/vol) glycerol, 60 mM KCl, 1.2 mM EDTA, 0.3 mM dithiothreitol, and 6 mM MgCl2, in a total volume of 20 µl. The reactions were incubated for 10 min at 24°C. DNase I (0.25 to 2 µg) was then added, and the samples were returned to 24°C for another 60 s. The reactions were immediately terminated by the addition of 4 µl of 0.25 M EDTA-1% sodium dodecyl sulfate (wt/vol), diluted to 75 µl, phenol extracted, and ethanol precipitated. Electrophoresis was carried out on an 8 M urea-6% or 8% polyacrylamide gel.
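The amounts given for the footprinting reaction mixtures fix the ratio of unlabeled competitor to labeled probe. A small Python sketch of that arithmetic follows; the function names are hypothetical, and the ratio is by mass, not by moles.

```python
def fold_excess(competitor_ug, probe_ng):
    """Mass ratio of unlabeled competitor DNA to labeled probe."""
    return competitor_ug * 1000.0 / probe_ng  # convert micrograms to nanograms

def concentration_ug_per_ul(mass_ug, volume_ul):
    """Final concentration of a component in the reaction."""
    return mass_ug / volume_ul

print(fold_excess(competitor_ug=3, probe_ng=1))              # competitor vs. probe
print(concentration_ug_per_ul(mass_ug=60, volume_ul=20))     # nuclear extract
```

With 3 µg of competitor and 1 ng of probe, the competitor is present in 3,000-fold mass excess, and the extract is at 3 µg per µl in the 20-µl reaction.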
RESULTS
Delimitation of the DHFR minimal promoter region. The promoter region of the DHFR gene does not contain the CCAAT or TATA boxes that are commonly used as RNA polymerase II transcription initiation signals (31). Instead, the region directly upstream of the transcription initiation site consists of four copies of a 48-bp repeat, each of which contains a GC box that binds the transcription factor Sp1 (15). We have developed an in vitro transcription system for the DHFR promoter by using HeLa cell nuclear extract (18) and have now used this system to define the minimal region of genomic DNA necessary for accurate DHFR transcription initiation (Fig. 1). We had previously shown that a template with a 5' end extending to -87 retained transcriptional activity (18) but that a 5' deletion to nucleotide -50 inactivated the DHFR promoter. A template extending to nucleotide -65 is also active, as compared with the template extending to nucleotide -50, which does not support initiation from the DHFR start site (Fig. 1). These results correspond well to results of deletion studies of the hamster DHFR gene that found that the 5' limit of the hamster promoter was 48 bp 5' of the major transcription start site (10). The DHFR templates used in previous studies extended several hundred base pairs downstream of the transcription initiation site. We have now compared 3' promoter deletions and have found that a template with a 3' boundary of +15 can initiate accurately, but that further 3' deletion to -30 inactivates the promoter. Transcription does not initiate the correct distance downstream from the DHFR-proximal GC box on the template containing nucleotides -356 to -30, even though all four 48-bp repeats are retained. Thus, the boundaries of the region defined as the DHFR minimal promoter extend from nucleotides -65 to +15 and contain one binding site for the transcription factor Sp1 and the transcription initiation site, but no other consensus sequences for previously identified factors.
FIG. 1. Delimitation of the DHFR minimal promoter. (A) Transcription reactions were performed with DHFR promoter fragments and HeLa nuclear extract. The extent of DHFR sequences on each template is indicated above the lanes. The arrowheads indicate the expected size of the runoff transcript initiating at the DHFR major start site. No product of the correct size is seen when the -50/+52 or the -356/-30 template was used. The strong signals of 444 and 552 nucleotides seen in the -50/+52 and -270/+15 lanes, respectively, are due to end-to-end transcription of the template DNA by RNA polymerase. Similarly, the 800-nucleotide band in the -356/-30 lane is due to end-to-end transcription, whereas the three bands between the 311 and 444 markers are transcripts arising from minor start sites. The sizes of the molecular size markers (in base pairs) are indicated to the right of the figure. (B) Schematic of the DHFR promoter region. All sequences are numbered relative to +1 (the major DHFR transcription initiation site). Other start sites corresponding to RNAs transcribed from the opposite strand are also indicated (16, 41). The deletions that define the DHFR minimal promoter (-65 to +15) are shown. The small boxes below the line represent Sp1 consensus binding sites. The four DHFR-proximal Sp1 consensus sites are in the opposite orientation to the six upstream sites. The open boxes above the line represent the four 48-bp repeats.
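The reasoning used to bracket the minimal promoter from the deletion templates can be written out explicitly. The Python sketch below takes the template endpoints and activities described above and returns the innermost active boundaries together with the nearest deletions that abolish activity. The function name and the tuple layout are illustrative assumptions, not part of the original analysis.

```python
# (5' end, 3' end, supports accurate initiation?) for templates described in the text
TEMPLATES = [
    (-87, 52, True),
    (-65, 52, True),
    (-50, 52, False),
    (-270, 15, True),
    (-356, -30, False),
]

def minimal_region(templates):
    """Bracket the 5' and 3' boundaries of the required region.

    The required region must lie inside every active template, so it is no
    wider than the innermost active 5' and 3' ends. Inactive templates that
    are shortened at one end only give the nearest inactivating deletion.
    """
    active = [(s, e) for s, e, ok in templates if ok]
    inactive = [(s, e) for s, e, ok in templates if not ok]
    five_prime = max(s for s, _ in active)    # shortest 5' extent that is still active
    three_prime = min(e for _, e in active)   # shortest 3' extent that is still active
    five_fail = min(s for s, e in inactive if s > five_prime and e >= three_prime)
    three_fail = max(e for s, e in inactive if e < three_prime and s <= five_prime)
    return (five_prime, three_prime), (five_fail, three_fail)

print(minimal_region(TEMPLATES))
```

For these templates the sketch returns -65 and +15 as the active boundaries, with inactivating deletions at -50 and -30, in agreement with the region defined in the text.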
Proteins bind to two sites in the DHFR minimal promoter. We have examined the protein-binding sites in the DHFR minimal promoter region. A DNA probe that was 5' end labeled at nucleotide -87 was used in DNase I protection assays with HeLa nuclear extract. Only two regions of this probe were protected from DNase I cleavage. One protected region spans the GC box, protecting nucleotides -60 through -40 in the absence of polyethylene glycol and -60 through -33 in the presence of 3% polyethylene glycol (Fig. 2A, lanes 2 to 4). The other protected region spans the transcription initiation site, protecting nucleotides -11 through +9. Addition of volume excluders such as polyethylene glycol or polyvinyl alcohol may increase the detection of protein-DNA interactions. However, reactions containing up to 3% polyethylene glycol (lane 4) or 4.5% polyvinyl alcohol (data not shown) did not reveal any other protein-binding sites within the minimal promoter. The two binding sites detected correspond to regions required for transcription in vitro (Fig. 1). Deletion from nucleotides -65 to -50 removes half of the GC box and inactivates the promoter. Deletion from +15 to -30 abolishes correctly positioned initiation, resulting in spurious initiation sites throughout the template.
To determine whether binding at the transcription initiation site was dependent upon formation of a functional transcription complex, we assayed a transcriptionally inactive DHFR promoter construct by using DNase I protection. Because this fragment, containing DHFR sequences from nucleotides -50 to +52, lacks most of the GC box, binding
[Fig. 2 image: (A, B) DNase I footprint gels of the coding and noncoding strands; (C) double-stranded sequence of the DHFR minimal promoter with the protected regions marked.]
FIG. 2. DNase I protection of the DHFR promoter region. (A) The coding strand of pBSprol9, containing DHFR sequences from -87 to +52, was 5' end labeled and digested with DNase I in the absence (lane 1) and presence (lanes 2 to 4) of 60 µg of HeLa nuclear extract and in the presence of 0% (lane 2), 1.5% (lane 3), or 3% (lane 4) polyethylene glycol (PEG). Lane 5 shows the position of G nucleotides in the fragment (30). Symbols mark the regions protected from DNase I cleavage, the position of the transcription initiation site (40), and the Sp1 consensus site. (B) Both the noncoding (lanes 1 to 3) and the coding (lanes 4 to 6) strands of pBSprol8, containing DHFR sequences -50 to +52 (but lacking the Sp1-binding site required for transcription), were digested with DNase I in the presence (lanes 1 and 6) and absence (lanes 2 and 5) of 60 µg of HeLa nuclear extract. Lanes 3 and 4 show the positions of G and A nucleotides in the sequence (30). Symbols mark the regions protected from DNase I cleavage and the position of the transcription initiation site. (C) Sequence of the DHFR minimal promoter. Sequences protected from DNase I cleavage on the coding and noncoding strands are indicated by lines above and below the sequence, respectively. The sequence protected by Sp1 on the noncoding strand was determined by DNase I digestion of a 5'-end-labeled noncoding strand fragment from pBSprol9 (data not shown) and is identical to the protected region described by Dynan et al. (15).
of Sp1 was not observed. However, protein did bind to the initiation site (Fig. 2B). Although the initiation site is in the center of the protected region on the coding strand (Fig. 2B, lane 6), the noncoding (template) strand is protected mainly 5' of the initiation site (Fig. 2B, lane 1). Our results indicate that protein binding to the DHFR initiation site is independent of Sp1 binding and therefore does not require formation of a functional transcription complex.
The location of the DHFR initiation site determines the position of transcription initiation. We have shown that a deletion of DHFR sequences from nucleotides -30 to +15, including the protected region spanning the initiation site, abolishes correctly initiated transcription (Fig. 1). For many promoters, the sequence at the initiation site is less important than the TATA box located 30 nucleotides upstream. The TATA box can direct RNA polymerase II to initiate 30 nucleotides downstream even after replacement of the initiation site with random sequence. To assess the relative
[Fig. 3 image: (A, B) primer extension gels, with size markers at 127, 101, and 88; (C) partial coding-strand sequences of the templates.]
FIG. 3. The location of the DHFR initiation site determines the position of transcription initiation. (A) Primer extension analysis of in vitro transcriptions from templates diagrammed in panel C. Lane 1 contains molecular size markers (in base pairs). Arrowheads correspond to the initiation sites marked with the identical arrowheads in panel C. (B) Primer extension analysis of in vitro transcriptions from the GC template (lane 3) and the GCDI template (lane 2), both diagrammed in panel C. Arrowheads correspond to the initiation sites marked with the identical arrowheads in panel C. (C) Partial sequence of the coding strands of the templates used in panels A and B. The wild-type template, from pST410, includes DHFR sequences from -356 to +61. The STU template, from pSTUmp19, is identical to the wild-type promoter, except that the G at -17 has been changed to a C. STU+10, STU+14, and STU+28 (from pSTU+10mp19, pSTU+14mp19, and pSTU+28mp19, respectively) have linkers of 10, 14, and 28 bp inserted at the StuI site of the STU template. The GC template contains a GC box cloned into pUC19. The GCDI template contains a GC box and the DHFR initiation site cloned into pUC19. Underlined nucleotides indicate the Sp1 and HIP1 consensus sites. Nucleotides that do not correspond to DHFR sequences are in lowercase letters. Arrowheads indicate the sites of transcription initiation that were determined by the reactions in panels A and B.
contribution of the transcription initiation site and the -30 region, we performed the following experiments.
To determine whether the binding of protein to the DHFR initiation site influenced the position of transcription initiation, we tested the effect of moving the initiation site farther from the upstream elements. We created a StuI restriction site by a single-base-pair substitution at the 5' boundary of the protected region spanning the transcription initiation site. We then inserted oligonucleotides 10, 14, and 28 bp in length into this StuI site and assayed for in vitro transcriptional activity by primer extension of transcription products (Fig. 3A). If a factor binding upstream of the protected region specifies the position of transcription initiation by directing RNA polymerase to initiate a fixed distance downstream, analogous to TFIID, the lengths of the RNAs from the insertion mutants would be increased by 10, 14, and 28 bases. If the protein binding to the initiation site specifies the position of initiation, then transcription would initiate at this site in any location and the RNAs would be the same length regardless of the insert size. The creation of the StuI site did not significantly change the level or site of initiation (Fig. 3A, lane 3). Analysis of the different insertion templates indicated that transcription initiated within the protein binding site, despite its greater distance from upstream elements (Fig. 3A, lanes 4 to 6). However, with the 10-bp insertion, transcription initiated equally often at the normal initiating nucleotide and 5 bp upstream, at the G of the GCCA element. This is a minor site of initiation in the DHFR promoter both in vivo (J. Flatt, unpublished data) and in vitro (Fig. 3A, lanes 2 and 3). The 14- and 28-bp insertion mutations initiated predominantly at this G and, less frequently, at the A of the wild-type promoter. The level of transcription from every insertion template was lower than the level from the wild-type and STU templates, suggesting that there is an optimal distance between the transcription initiation site and upstream elements for efficient transcription.
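The two competing predictions for the insertion templates can be written out as a short calculation. The sketch below is illustrative only: the baseline product length (BASELINE_NT) is a hypothetical placeholder because the actual primer extension product length is not given here, and the helical repeat of 10.5 bp per turn (BP_PER_TURN) is a commonly used value for B-form DNA rather than a figure from this study. The function and model names are likewise invented for the example.

```python
# Sketch only: BASELINE_NT and BP_PER_TURN are illustrative assumptions,
# not values reported in the paper.
BASELINE_NT = 100            # hypothetical product length from the STU template
BP_PER_TURN = 10.5           # assumed helical repeat of B-form DNA
INSERT_SIZES = (10, 14, 28)  # linker lengths inserted at the StuI site


def predicted_length(insert_bp, model):
    """Return the predicted primer extension product length in nucleotides."""
    if model == "upstream_fixed_distance":
        # The start site keeps its distance from the upstream element, so the
        # insert lies between the start site and the downstream primer and
        # lengthens the product by the insert size.
        return BASELINE_NT + insert_bp
    if model == "initiator":
        # The start site travels with the initiator element, so its distance
        # to the downstream primer is unchanged.
        return BASELINE_NT
    raise ValueError(f"unknown model: {model}")


def rotational_offset_degrees(insert_bp):
    """Angle by which the insert rotates the initiator relative to upstream sites."""
    return (insert_bp / BP_PER_TURN * 360.0) % 360.0


if __name__ == "__main__":
    for n in INSERT_SIZES:
        print(
            n,
            predicted_length(n, "upstream_fixed_distance"),
            predicted_length(n, "initiator"),
            round(rotational_offset_degrees(n)),
        )
```

Under the 10.5 bp per turn assumption, the 14- and 28-bp inserts rotate the initiator by roughly 120 and 240 degrees relative to the upstream elements, whereas the 10-bp insert leaves it close to its original face of the helix. The products reported in Fig. 3A correspond to the "initiator" prediction, with the additional minor start 5 bp upstream noted in the text.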
Construction of a synthetic promoter. We have shown that removal of either the Sp1 binding site or the region containing the initiation site abolishes transcriptional activity, demonstrating the importance of both elements (Fig. 1). To examine the requirement of the sequences between the Sp1 and the transcription initiation sites, we cloned oligonucleotides containing these sites into the polylinker region of pUC19, separated by the same number of nucleotides as in the DHFR promoter (plasmid GCDI [Fig. 3C]), and assayed this template for transcription in vitro (Fig. 3B). Although insertion of an Sp1-binding site was not sufficient for transcriptional activity (Fig. 3B, lane 3), insertion of both the Sp1 and the DHFR initiation site oligonucleotides resulted in initiation of transcription within the DHFR initiation site oligonucleotide (Fig. 3B, lane 2). Thus, one GC box and the DHFR initiation site oligonucleotide are sufficient for accurate transcription in vitro.
A protein binds to the SV40 late initiation site. Because the DNA sequence at the initiation site of the SV40 late promoter is very similar to the sequence at the DHFR initiation site (see Fig. 5), we examined this region of the SV40 late
promoter for protein-DNA interactions. DNase I footprinting of SV40 sequences from nucleotides 200 to 345 (the major initiation site is at nucleotide 325) revealed binding at two sites, one spanning the initiation site and the other centered approximately 45 bp upstream of the initiation site (Fig. 4, lanes 3 and 6). Examination of this upstream site revealed that it also contained a sequence, TTTCCGCC, similar to that at the DHFR initiation site.
To determine whether the protein that binds to the DHFR transcription initiation site is also involved in SV40 late transcription, we used concatemerized oligonucleotides containing the region from nucleotides -16 to +9 spanning the DHFR initiation site as a competitor of SV40 late promoter in vitro transcription reactions (Fig. 4B). A decrease in transcriptional activity from an SV40 late promoter fragment that is added to the reaction after the competitor DNA would indicate that the protein binding to the DHFR initiation site oligonucleotide was required for SV40 late transcription. We examined the effects of excess DHFR initiation site oligonucleotides on transcription from the SV40 early and late promoters. As discussed above, the SV40 major late initiation site, at nucleotide 325, has homology to the DHFR initiation site. The initiation site at nucleotide 170 is homologous to the 5' half of the DHFR initiation site, having the sequence TTTC. The SV40 early start sites do not have homology to the DHFR initiation site. Concatemerized DHFR initiation site oligonucleotides (200 ng) reduced transcription from both the 170 and 325 start sites of the SV40 late promoter and led to novel upstream initiations (Fig. 4B, lane 4). Excess DHFR initiation site oligonucleotides increased transcription from the SV40 early start sites (Fig. 4B, lanes 5 to 8). Control reactions were performed in which 200 ng of a concatemerized oligonucleotide containing a mutated (nonfunctional) heat shock element was added to transcription reactions before the SV40 late or early promoter fragments. No difference in transcriptional activity from either template was observed (data not shown), demonstrating that the effects on transcription caused by the DHFR initiation site concatemer were not due simply to excess DNA in the reaction.
These results of competition with the DHFR initiation site concatemer suggest that the availability of the protein that binds to the DHFR initiation site may be important for determining the efficiency of transcription in the early (versus the late) direction of SV40 transcription. Initiation from the late start sites requires this protein, whereas initiation from the early start sites occurs more efficiently in its absence.
Other non-TATA box genes have sequence homology to DHFR at their initiation sites. The transcription initiation sites shown in Fig. 5 exhibit homology to the 11-bp sequence immediately preceding the DHFR initiation site. None of these promoters has a TATA box appropriately positioned near its transcription initiation site. In particular, the major initiation site of the SV40 late promoter is strikingly similar to the DHFR sequence. These two genes have the sequence ATTTCNNGCCA. However, transcription initiates at the 3' end of the consensus sequence in the mouse DHFR promoter but at the 5' end of the consensus sequence in the SV40 late promoter and in the hamster and human DHFR genes (Fig. 5). Comparison of the initiation sites listed in Fig. 5 suggests that this sequence may be composed of two elements corresponding to the sequences ATTTC and GCCA, which can be separated by 1 to 19 nucleotides. For each of the genes listed in Fig. 5, transcription initiates at either or both of these elements. Because the protein(s) binding to the DHFR initiation site protects a sequence
[Fig. 4 image: (A) DNase I footprint gels of the coding and noncoding strands; (B) transcription gel, lanes 1 to 9, with the SV40 early starts marked.]
FIG. 4. The SV40 late promoter binds protein at its major initiation site. (A) The coding (lanes 1 to 3) and noncoding (lanes 4 to 6) strands of pSVSN were assayed for DNase I cleavage in the presence (lanes 3 and 6) and absence (lanes 2 and 5) of 60 µg of HeLa nuclear extract. Lanes 1 and 4 show the positions of G nucleotides (30). Symbols mark the regions protected from DNase I cleavage and the position of the transcription initiation site. Nucleotides protected from cleavage are indicated on either side. (B) Oligonucleotide competition of transcription from the SV40 early and late promoters. Transcription reactions were preincubated with or without excess concatemerized DHFR initiation site oligonucleotides (DINIT) before addition of the SV40 late promoter pGemHindIII-C-HindIII fragment (lanes 1 to 4) or the SV40 early promoter pSVS-SphI-NdeI fragment (lanes 5 to 8). Arrows indicate the position of correctly initiated RNA. Lane 9 shows the positions of DNA molecular size markers.
common to other housekeeping genes, we refer to the protein(s) as HIP1, for housekeeping initiation protein 1.
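The bipartite element described above (ATTTC and GCCA separated by 1 to 19 nucleotides) lends itself to a simple pattern search. The following is a minimal sketch of such a scan, not the search procedure used in the study: the function name, the default spacer limit, and the toy sequence are assumptions made for illustration, and only the strand supplied is scanned.

```python
import re


def find_hip1_like_sites(seq, max_spacer=19):
    """Return (start, matched_text) for each ATTTC...GCCA element in seq.

    The spacer between the two half-sites may be 1 to max_spacer nucleotides.
    A lookahead is used so that overlapping candidates are all reported, and
    the lazy quantifier reports the shortest spacer for each ATTTC start.
    """
    pattern = re.compile(rf"(?=(ATTTC[ACGT]{{1,{max_spacer}}}?GCCA))")
    seq = seq.upper()
    return [(m.start(1), m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence built around the ATTTCNNGCCA form quoted in the text;
    # it is not taken from a database record.
    toy = "GGTTATTTCAGGCCATGG"
    for start, site in find_hip1_like_sites(toy):
        print(start, site)
```

Raising max_spacer to 30 would mirror the looser spacing mentioned for the database search in the Fig. 5 legend. Any hit from a scan like this is only a candidate; the text restricts attention to matches that fall at or near known transcription initiation sites.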
DISCUSSION
HIP1 binds to the DHFR initiation site. The question of how the site of transcription initiation is specified for genes lacking a recognizable TATA box has generated considerable debate. We have now shown that the sequence at the DHFR initiation site, rather than sequences farther upstream, binds protein and positions RNA polymerase II to initiate transcription at that site.
Binding of a protein other than RNA polymerase to the start site of transcription is not well documented. Although the transcription factor TFIIB may bind in the vicinity of the initiation site (7), it does not appear to bind DNA directly (45), but is positioned by binding to other proteins already bound to the promoter. The herpes simplex virus ICP4 protein binds to its own transcription initiation site in a negative, autoregulatory manner (37). A cellular protein spans from nucleotides -17 to +27 of the human immunodeficiency virus type 1 promoter. However, unlike the DHFR promoter, the human immunodeficiency virus type 1 TATA element appears to be responsible for positioning transcription initiation, since insertion of a DNA fragment downstream of the TATA element causes an upstream shift in the transcription initiation site (25). There are four reasons for our belief that the protein we are detecting is not RNA polymerase. First, binding occurs to a template that lacks a GC box and is transcriptionally inactive. Second, the footprinting reactions contain a 3,000-fold excess of nonspecific competitor DNA. Proteins that bind DNA in a nonspecific manner, such as RNA polymerases that bind to ends of fragments, are thus outcompeted and do not bind to the labeled DNA. Third, the footprint we detected is smaller than the RNA polymerase II footprint observed on other promoters (7, 23). The footprint on the DHFR noncoding, or template, strand does not extend far 3' of the initiating nucleotide. RNA polymerase II protects this strand at least 15 bp 3' of the initiation site in the adenovirus type 2 major late promoter. Fourth, addition of an excess of HIP1-binding sites increases transcription from the SV40 early promoter in vitro. Since this promoter is transcribed by RNA polymerase II, we cannot be inhibiting RNA polymerase II, because transcription would then decline, not increase. Thus, it appears that a previously unidentified protein binds to the DHFR transcription initiation site.
[Fig. 4B lane labels: concatemerized DINIT sites (ng): 0, 20, 100, 200 (lanes 1 to 4); 0, 20, 100, 200 (lanes 5 to 8).]
[Fig. 5 image: aligned sequences surrounding the transcription initiation sites of DHFR, DHFR (human), DHFR (hamster), SV40 late, HPRT, Ki-RAS, PGK, osteonectin, IRF-1, and SURF-1.]
FIG. 5. Housekeeping genes having sequence homologies to the DHFR start site. Genes having initiation sites homologous to the DHFR initiation site were identified by examining a collection of manuscripts concerning non-TATA box promoters and by searching GenBank with the consensus sequence ATTTCN(1-30)GCCA (only the identified consensus sequence homologies that occur at or near transcription initiation sites are shown in this figure). These genes represent those having the best homology to the consensus sequence. The sequences underlined are homologous to the DHFR initiation site. Arrowheads indicate the sites of transcription initiation. References for these sequences are as follows: DHFR (8, 34, 40); SV40 late promoter (20); HPRT (hypoxanthine phosphoribosyltransferase) (33); Ki-RAS (22); PGK (3-phosphoglycerate kinase) (42); osteonectin (32); IRF-1 (interferon regulatory factor 1) (36); SURF-1 (44).
The initiation site plays a variety of roles in different promoters. Most yeast genes contain one or more TATA boxes that are required for transcription, yet rely upon sequences surrounding the initiation site to position transcription initiation (9, 21, 38). In contrast, mammalian TATA boxes direct RNA polymerase II to initiate transcription a fixed distance downstream. If the region downstream of the TATA box, including the initiation site, is deleted and replaced with random sequences, RNA polymerase II can still initiate transcription accurately, although the level of transcription may decrease (11-13).
Recently, the region containing the transcription initiation site has been demonstrated to position the start of transcription of two mammalian genes that do not contain TATA boxes. Smale and Baltimore (43) demonstrated that the region around the terminal deoxynucleotidyltransferase (TdT) initiation site is required for transcription. Deletion or mutation of this region eliminates or reduces transcription, respectively. A 17-bp region containing the start site is sufficient to position low levels of initiation in the absence of other elements and higher levels when combined with upstream elements such as a TATA box or GC box. The TdT gene is not a housekeeping gene: its expression is limited to precursor B and T lymphocytes (27). The mechanism of initiation for the TdT gene is distinct from the mechanism used by the DHFR gene. The DHFR initiation site cannot function in the absence of an Sp1-binding site and bears no apparent sequence homology to the TdT initiation site. In addition, Smale and Baltimore could detect no protein binding to the TdT initiation site (43).
Ayer and Dynan (2) showed that substitution of nucleotides around the major late initiation site of SV40 decreased levels of transcription in vitro and changed the site of initiation. Close examination of their results reveals that substitution of the ATTTC at the initiation site with random sequence caused transcription to initiate 10 bp downstream, at the GCCA sequence. Through comparison of the protected sequences and inhibition of transcriptional activity with oligonucleotides derived from the DHFR initiation site, we believe that the protein responsible for determining the initiation site for the SV40 late gene is the same as or similar to the one responsible for this activity in the DHFR gene. A second sequence homologous to the HIP1-binding site occurs 45 bp upstream of the SV40 late initiation site and also binds a protein that may be HIP1. Mutation of this upstream sequence and of sequences between the two HIP1 sequences reduces transcription in vivo and in vitro (2, 4).
The HIP1-binding site in the DHFR promoter positions the site of transcription initiation. We find that transcription initiates at the HIP1 site when 10, 14, or 28 nucleotides are inserted between it and all other upstream sequences. Since templates with inserts of 14 and 28 bp give similar levels of transcription, there does not appear to be a preference for one side of the helix. Insertion of 10 bp upstream of the HIP1 site decreases transcription more than insertion of 14 or 28 bp, possibly owing to the high G+C content and palindromic nature of the insert.
To demonstrate more conclusively that the transcription initiation site of the DHFR promoter is a positioning element and not just a preferred initiation sequence for a factor binding elsewhere, oligonucleotides corresponding to an Sp1 and a HIP1 site were cloned the appropriate distance apart in bacterial sequences of the pUC19 plasmid (Fig. 3C). When just an Sp1 site was cloned into pUC19, no transcription was
observed. When both an Sp1 and a HIP1 consensus site were cloned into pUC19, transcription initiated at the HIP1 site, demonstrating that no DHFR sequences other than the Sp1 site and the HIP1 site are required for accurate initiation.
Within the HIP1 sites of different genes, transcription may initiate primarily from the 5' end or the 3' end of either of the two sequence elements that constitute the HIP1 consensus sequence (Fig. 5). Transcription initiates primarily from the nucleotide next to the 3' end of the HIP1 consensus sequence in the mouse DHFR promoter. However, when the HIP1 site is isolated from surrounding sequences and cloned into a bacterial vector containing an Sp1-binding site, transcription initiates primarily from the 5' end of the HIP1 sequence (at the HIP1 nucleotide that initiates hamster DHFR, SV40 late, and osteonectin transcription) and secondarily from the 3' end of the sequence. Therefore, the choice of initiating nucleotides is not inherent in the HIP1 sequence, but is influenced by the surrounding sequences. Although mouse and human DHFR genes initiate at different nucleotides within the HIP1 element, the mouse initiation site is used by the mouse promoter in transcription extracts prepared from human (Fig. 3) and mouse (18) cells. This indicates that the difference is not due to differences in HIP1 between human and mouse cells. The DNA sequence immediately upstream of the HIP1 consensus sequence is slightly different in the human, hamster, and mouse genes. It is possible that these sequence differences are responsible for the slightly different start sites. We are currently performing mutagenesis of this region and testing whether the binding of other transcription factors influences the exact site of initiation within the HIP1 site.
In summary, our data are consistent with the conclusion that the same or a similar protein(s) binds to the initiation sites of the DHFR and SV40 late genes, and sequence data suggest that this protein(s) may bind to other housekeeping genes. Our results suggest the existence of at least two mechanisms for specifying the site of transcription initiation, one used by many high-expression, tissue-specific promoters and controlled by TFIID, and the other used by several low-expression, housekeeping promoters and controlled by HIP1. To determine the number and variability of HIP1 proteins as well as to assess their specificity and affinity for various promoters, we are beginning experiments to purify and clone the HIP1 protein(s).
ACKNOWLEDGMENTS
We thank Jody Flatt for growing the HeLa cells and for allowing us to refer to unpublished results, Stephanie McMahon for technical assistance in sequencing plasmid clones, Charles Nicolet for the control oligonucleotides, and the laboratory of Janet Mertz for pGemHindIII-C. We are grateful to all the members of the P. J. Farnham and W. M. Sugden laboratories for valuable discussion and to the members of the McArdle Laboratory Tumor Biology Group for helpful comments on the manuscript.
This work was supported by Public Health Service grants CA45240 and CA07175 from the National Institutes of Health. A.L.M. was supported, in part, by training grant CA09135 from the National Institutes of Health.
LITERATURE CITED
1. Angel, P., M. Imagawa, R. Chiu, B. Stein, R. J. Imbra, H. J. Rahmsdorf, C. Jonat, P. Herrlich, and M. Karin. 1987. Phorbol ester-inducible genes contain a common cis element recognized by a TPA-modulated trans-acting factor. Cell 49:729-739.
2. Ayer, D. E., and W. S. Dynan. 1988. Simian virus 40 major late promoter: a novel tripartite structure that includes intragenic sequences. Mol. Cell. Biol. 8:2021-2033.
3. Borelli, M. J., M. A. Mackey, and W. C. Dewey. 1987. A method for freezing synchronous mitotic and G1 cells. Exp. Cell Res. 170:363-368.
4. Brady, J., M. Radonovich, M. Vodkin, V. Natarajan, M. Thoren, G. Das, J. Janik, and N. P. Salzman. 1982. Site-specific base substitution and deletion mutations that enhance or suppress transcription of the SV40 major late RNA. Cell 31:625-633.
5. Breathnach, R., and P. Chambon. 1981. Organization and expression of eucaryotic split genes coding for proteins. Annu. Rev. Biochem. 50:349-383.
6. Briggs, M. R., J. T. Kadonaga, S. P. Bell, and R. Tjian. 1986. Purification and biochemical characterization of the promoter-specific transcription factor, Sp1. Science 234:47-52.
7. Buratowski, S., S. Hahn, L. Guarente, and P. A. Sharp. 1989. Five intermediate complexes in transcription initiation by RNA polymerase II. Cell 56:549-561.
8. Chen, M.-J., T. Shimada, A. D. Moulton, A. Cline, R. K. Humphries, J. Maizel, and A. W. Nienhuis. 1984. The functional human dihydrofolate reductase gene. J. Biol. Chem. 259:3933-3943.
9. Chen, W., and K. Struhl. 1985. Yeast mRNA initiation sites are determined primarily by specific sequences, not by the distance from the TATA element. EMBO J. 4:3273-3280.
10. Ciudad, C. J., G. Urlaub, and L. A. Chasin. 1988. Deletion analysis of the Chinese hamster dihydrofolate reductase gene promoter. J. Biol. Chem. 263:16274-16282.
11. Concino, M. F., R. F. Lee, J. P. Merryweather, and R. Weinmann. 1984. The adenovirus major late promoter TATA box and initiation site are both necessary for transcription in vitro. Nucleic Acids Res. 12:7423-7433.
12. Corden, J., B. Wasylyk, A. Buchwalder, P. Sassone-Corsi, C. Kedinger, and P. Chambon. 1980. Promoter sequences of eukaryotic protein-coding genes. Science 209:1406-1414.
13. Dierks, P., A. van Ooyen, M. D. Cochran, C. Dobkin, J. Reiser, and C. Weissmann. 1983. Three regions upstream from the cap site are required for efficient and accurate transcription of the rabbit β-globin gene in mouse 3T6 cells. Cell 32:695-706.
14. Dignam, J. D., R. M. Lebovitz, and R. Roeder. 1983. Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res. 11:1475-1489.
15. Dynan, W. S., S. Sazer, R. Tjian, and R. T. Schimke. 1986. Transcription factor Sp1 recognizes a DNA sequence in the mouse dihydrofolate reductase promoter. Nature (London) 319:246-248.
16. Farnham, P. J., J. M. Abrams, and R. T. Schimke. 1985. Opposite-strand RNAs from the 5' flanking region of the mouse dihydrofolate reductase gene. Proc. Natl. Acad. Sci. USA 82:3978-3982.
17. Farnham, P. J., and R. T. Schimke. 1985. Transcriptional regulation of mouse dihydrofolate reductase in the cell cycle. J. Biol. Chem. 260:7675-7680.
18. Farnham, P. J., and R. T. Schimke. 1986. In vitro transcription and delimitation of promoter elements of the murine dihydrofolate reductase gene. Mol. Cell. Biol. 6:2392-2401.
19. Fromm, M., and P. Berg. 1982. Deletion mapping of DNA regions required for SV40 early region promoter function in vivo. J. Mol. Appl. Genet. 1:457-481.
20. Ghosh, P. K., V. B. Reddy, J. Swinscoe, P. Lebowitz, and S. M. Weissman. 1978. Heterogeneity and 5'-terminal structures of the late RNAs of simian virus 40. J. Mol. Biol. 126:813-846.
21. Hahn, S., E. T. Hoar, and L. Guarente. 1985. Each of three "TATA elements" specifies a subset of the transcription initiation sites at the CYC-1 promoter of Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 82:8562-8566.
22. Hoffman, E. K., S. P. Trusko, N. A. Freeman, and D. L. George. 1987. Structural and functional characterization of the promoter region of the mouse c-Ki-ras gene. Mol. Cell. Biol. 7:2592-2596.
23. Horikoshi, M., T. Hai, Y.-S. Lin, M. R. Green, and R. G. Roeder. 1988. Transcription factor ATF interacts with the TATA factor to facilitate establishment of a preinitiation complex. Cell 54:1033-1042.
24. Imagawa, M., R. Chiu, and M. Karin. 1987. Transcription factor AP-2 mediates induction by two different signal-transduction pathways: protein kinase C and cAMP. Cell 51:251-260.
25. Jones, K. A., P. A. Luciw, and N. Duchange. 1988. Structural arrangements of transcription control domains within the 5'-untranslated leader regions of the HIV-1 and HIV-2 promoters. Genes Dev. 2:1101-1114.
26. Kadonaga, J. T., A. J. Courey, J. Ladika, and R. Tjian. 1988. Distinct regions of Sp1 modulate DNA binding and transcriptional activation. Science 242:1566-1570.
27. Landau, N. R., T. P. St. John, I. L. Weissman, S. C. Wolf, A. E. Silverstone, and D. Baltimore. 1984. Cloning of terminal transferase cDNA by antibody screening. Proc. Natl. Acad. Sci. USA 81:5836-5840.
28. Lee, W., P. Mitchell, and R. Tjian. 1987. Purified transcription factor AP-1 interacts with TPA-inducible enhancer elements. Cell 49:741-752.
29. Maniatis, T., E. F. Fritsch, and J. Sambrook. 1982. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
30. Maxam, A. M., and W. Gilbert. 1980. Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65:499-560.
31. McGrogan, M., C. C. Simonsen, D. T. Smouse, P. J. Farnham, and R. T. Schimke. 1985. Heterogeneity at the 5' termini of mouse dihydrofolate reductase mRNAs. J. Biol. Chem. 260:2307-2314.
32. McVey, J. H., S. Nomura, P. Kelly, I. J. Mason, and B. L. M. Hogan. 1988. Characterization of the mouse sparc/osteonectin gene. J. Biol. Chem. 263:11111-11116.
33. Melton, D. W., C. McEwan, A. B. McKie, and A. M. Reid. 1986. Expression of the mouse HPRT gene: deletional analysis of the promoter region of an X-chromosome linked housekeeping gene. Cell 44:319-328.
34. Mitchell, P. J., A. M. Carothers, J. H. Han, J. D. Harding, E. Kas, L. Venolia, and L. A. Chasin. 1986. Multiple transcription start sites, DNase I-hypersensitive sites, and an opposite-strand exon in the 5' region of the CHO dhfr gene. Mol. Cell. Biol. 6:425-440.
35. Mitchell, P. J., C. Wang, and R. Tjian. 1987. Positive and negative regulation of transcription in vitro: enhancer-binding protein AP-2 is inhibited by SV40 T antigen. Cell 50:847-861.
36. Miyamoto, M., T. Fujita, Y. Kimura, M. Maruyama, H. Harada, Y. Sudo, T. Miyata, and T. Taniguchi. 1988. Regulated expression of a gene encoding a nuclear factor, IRF-1, that specifically binds to IFN-β gene regulatory elements. Cell 54:903-913.
37. Muller, M. T. 1987. Binding of the herpes simplex virus immediate-early gene product ICP4 to its own transcription start site. J. Virol. 61:858-865.
38. Nagawa, F., and G. R. Fink. 1985. The relationship between the "TATA" sequence and transcription initiation sites at the HIS4 gene of Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 82:8557-8561.
39. Nakajima, N., M. Horikoshi, and R. G. Roeder. 1988. Factors involved in specific transcription by mammalian RNA polymerase II: purification, genetic specificity, and TATA box-promoter interactions of TFIID. Mol. Cell. Biol. 8:4028-4040.
40. Sazer, S., and R. T. Schimke. 1986. A re-examination of the 5' termini of mouse dihydrofolate reductase RNA. J. Biol. Chem. 261:4685-4690.
41. Schilling, L. J., and P. J. Farnham. 1989. Identification of a new promoter upstream of the murine dihydrofolate reductase gene. Mol. Cell. Biol. 9:4568-4570.
42. Singer-Sam, J., D. H. Keith, K. Tani, R. L. Simmer, L. Shively, S. Lindsay, A. Toshida, and A. D. Riggs. 1984. Sequence of the promoter region of the gene for human X-linked 3-phosphoglycerate kinase. Gene 32:409-417.
43. Smale, S. T., and D. Baltimore. 1989. The "initiator" as a transcription control element. Cell 57:103-113.
44. Williams, T. J., and M. Fried. 1986. The MES-1 murine enhancer element is closely associated with the heterogeneous 5' ends of two divergent transcription units. Mol. Cell. Biol. 6:4558-4569.
45. Zheng, X.-M., V. Moncollin, J.-M. Egly, and P. Chambon. 1987. A general transcription factor forms a stable complex with RNA polymerase B (II). Cell 50:361-368.