
VIROLOGY 175, 81-90 (1990)

Transcription Mapping of Mouse Adenovirus Type 1 Early Region 3

CLAYTON W. BEARD, AMY OBERHAUSER BALL, E. HANNAH WOOLEY, AND KATHERINE R. SPINDLER¹

Department of Genetics, University of Georgia, Athens, Georgia 30602

Received September 6, 1989; accepted November 10, 1989

Early region 3 (E3) of mouse adenovirus type 1 was analyzed using S1 nuclease protection and primer extension assays, cDNA sequencing, and genomic sequencing. We present the genomic sequence from 79 to 83 map units of the viral genome, the precise ends and splice sites of the E3 mRNAs, and the predicted protein sequence encoded by the mRNAs. Three major classes of early mRNAs were identified; all were approximately 1 kb long, consisted of three exons, and shared 5' and 3' ends. The three classes had alternative splicing at the junction between the second and third exon. The three proteins predicted by the three mRNAs were slightly similar to the E3 19K glycoprotein of human adenovirus type 3; the longest of the three was the most similar. Open reading frames corresponding to late proteins were also identified in the translated mouse adenovirus type 1 DNA sequence. In mouse adenovirus, as in the human adenoviruses, L4 overlaps E3, and L5 starts just downstream of the E3 region. © 1990 Academic Press, Inc.

INTRODUCTION

Early region 3 (E3) of human adenovirus types 2 and 5 (Ad2/5) is transcribed rightward from 75.9 to 86 map units (m.u.) on the genome (Berk and Sharp, 1978; Chow et al., 1979; Pettersson et al., 1976). E3 transcription is complex, consisting of at least nine overlapping mRNAs generated by alternative splicing (Chow et al., 1979; Cladaras et al., 1985). Heteroduplex mapping indicates considerable sequence divergence in E3, even within subgroups of adenoviruses (Bartok et al., 1974; Belak et al., 1986; Garon et al., 1973). E3 is nonessential for viral replication in vitro (Kelly and Lewis, 1973). However, since the E3 region has been evolutionarily conserved, it is believed that E3 proteins are important for natural infections. It has recently been shown that E3 moderates adenovirus infection of cotton rats (Ginsberg et al., 1989) and hamsters (Morin et al., 1987). Viral mutants deleted for E3 replicate like wild-type virus in cotton rat lungs, but elicit an increased inflammatory response. In addition, studies of E3 proteins and mutants in cell culture suggest that this region is involved in virus-host interactions.

The human E3 region contains at least nine open reading frames (orfs) long enough to encode polypeptides 6K or longer. Proteins have been identified in Ad2/5-infected cells corresponding to six of these orfs (Tollefson and Wold, 1988; Wang et al., 1988; Wold et al., 1984; W. Wold, personal communication). The Ad2/5 19K glycoprotein (gp19K) is the best-characterized of these and it is the most abundant viral protein at early times after infection (Persson et al., 1979; Wold et al., 1985). The gp19K is a transmembrane protein that can bind class I major histocompatibility (MHC) antigens (Kampe et al., 1983; Kvist et al., 1978; Paabo et al., 1983; Severinsson and Peterson, 1985; Signas et al., 1982) and this binding inhibits the glycosylation of the antigens and prevents their efficient transport to the cell surface (Andersson et al., 1985; Burgert and Kvist, 1985; Severinsson and Peterson, 1985). Despite sequence divergence in the E3 regions, the ability of the 19K glycoprotein to inhibit transport of the class I MHC antigen is conserved in all human adenovirus subgroups except subgroup A (Paabo et al., 1986). The E3 10.4K protein has been shown to down-regulate the expression of epidermal growth factor receptors on infected cells (Carlin et al., 1989) and it has been shown that this may involve a complex of the 10.4K protein and the recently identified E3 14.5K protein (W. Wold, personal communication). In cultured cells infected by Ad2/5, an E3 14.7K protein prevents lysis by tumor necrosis factor (Gooding et al., 1988). No function has been identified for the 11.6K and 6.7K proteins found in infected cells. The known biochemical activities of the 19K, 14.7K, 14.5K, and 10.4K Ad2/5 E3 proteins in cell culture infections all involve interactions with host cell components and are probably important for the biology of adenovirus infection.

¹ To whom requests for reprints should be addressed.

The role of E3 in in vivo adenovirus infections has been difficult to study because of the species-specificity of human adenoviruses. Although there has been success in using cotton rats (Ginsberg et al., 1989) and hamsters (Hjorth et al., 1988) to study in vivo pathogenesis, development of a model using an adenovirus in its natural host is desirable. The molecular genetics of mouse adenovirus type 1 (MAV-1) are being characterized, and MAV-1 should provide a useful animal adenovirus model. In this work we have focused on the E3 region of MAV-1. Ball et al. identified early region 1 (E1) at the left end of the MAV-1 genome (Ball et al., 1988, 1989); on the basis of this orientation of MAV-1, we predicted that a region like that of Ad2/5 E3 would be found near 80 m.u. on the MAV-1 genome.

0042-6822/90 $3.00 Copyright © 1990 by Academic Press, Inc. All rights of reproduction in any form reserved.

We present the genomic sequence, transcription mapping, and the predicted proteins of the E3 region of MAV-1. Three major alternatively spliced E3 mRNAs were identified in a region corresponding to 79 to 83 m.u. on the MAV-1 genome, a region approximately 1000 bp in length. Analysis of the predicted polypeptides of these mRNAs revealed one with a slight similarity to human adenovirus E3 gp19K. Other orfs with similarity to the late proteins, 33K phosphoprotein, pVIII, and fiber, were identified; however, no other similarities to human adenovirus E3 proteins or orfs were noted. The significance of these findings and their relevance to the role of E3 in adenovirus pathogenesis are discussed.

MATERIALS AND METHODS

Plasmids and genomic sequencing

The construction of plasmids containing MAV-1 genomic fragments used in this work was described previously (Ball et al., 1989). For sequencing genomic DNA, the HindIII-C fragment was cloned into Bluescribe+ (Stratagene) in both orientations. A series of deletion clones was made using the ExoIII nuclease method (Henikoff, 1984) and the clones were sequenced by the dideoxy chain-termination method (Sanger et al., 1977) using Sequenase enzyme (U.S. Biochemical Corp.). The sequence was compared with that of Raviprakash et al. (1989) and found to be identical. Computer analysis was performed using IntelliGenetics, Inc. and International Biotechnologies, Inc./Pustell software as described (Ball et al., 1988).

mRNA analysis

Mouse L cells in monolayers and MAV-1 were grown as previously described (Ball et al., 1989). Suspension L cells were seeded from monolayers at 2 × 10⁴ cells/ml in α-MEM containing 10% heat-inactivated calf serum, and infected with 1 to 5 PFU/cell at a density of 5 × 10⁵ cells/ml in α-MEM containing 2% heat-inactivated calf serum. Total RNA was isolated from mock- or MAV-1-infected L cell monolayers or suspension L cells as previously described (Ball et al., 1989). Early RNA was collected at 22 hr post infection (p.i.) and late RNA was collected at 45 or 48 hr p.i. (Ball et al., 1989). Northern blots, S1 nuclease protection assays, and primer extension analyses were performed as previously described (Ball et al., 1989).

The cDNA library was generated from 22 hr MAV-1-infected-cell poly(A)+ RNA (Ball et al., 1989) and 2 × 10⁶ plaques were screened with the isolated MAV-1 HindIII-C fragment. The 29 cDNA clones which hybridized were isolated and sequenced by the dideoxy chain-termination method (Sanger et al., 1977) using Sequenase enzyme (U.S. Biochemical Corp.). Oligonucleotide primers used for sequencing cDNAs and for primer extension assays were synthesized by Operon Technologies.

RESULTS

Genomic sequence of E3

The MAV-1 genomic sequence of the HindIII-C fragment (77 to 89 m.u.; nt 1 is at the HindIII site at 77 m.u.) was obtained by sequencing clones from a deletion series as described under Materials and Methods. The transcription analysis below indicated that E3 was transcribed within the segment from 79 to 83 m.u., and only this segment is shown here. The sequence was determined from both strands for a majority of this segment. Where sequence for only one strand was obtained, it was compared and found to be identical to the HindIII-C fragment sequence determined by Raviprakash et al. (1989). Figure 1 presents the genomic sequence from nt 701 to 1780. Important features are indicated in the figure legend, including the TATA box, transcription and translation start sites, splice donor and acceptor sites, stop codons, and polyadenylation signals and sites.

Identification of multiple E3 transcripts

Northern blots of poly(A)+ RNA isolated at various times p.i. were probed with the HindIII-C fragment of MAV-1. Only a single class of mRNAs, approximately 1 kb in length, was detected at early times (Fig. 2, lane 2). The 1-kb mRNAs could first be detected at 14 hr p.i. with a probe made from an E3 cDNA clone (see below) (Fig. 2, lane 7). Smaller probes from this region of the genome were used to locate the 1-kb RNAs more precisely within the HindIII-C fragment (data not shown). Assays with strand-specific probes indicated that at both early and late times after infection mRNAs from this region were transcribed only in the rightward direction (data not shown). Taken together, these experiments identified an mRNA or family of mRNAs transcribed at early times in the rightward direction near 80 m.u., consistent with an MAV-1 E3 transcription unit.

Figure 3 presents the structures of the E3 messages determined in the experiments described below. S1 nuclease protection and primer extension analyses were used to map the 5' and 3' ends and splice sites. Direct sequencing of cDNA clones was used to determine the precise splice sites and the 3' polyadenylation site.

GACACCTTTG ACGCCGCCCT AACAAGCAAC GGAGCGCAAT TAGCTGGAGG GGCGTGGATA AACW ACGGTAGTGT TCGCTACGAA   790
    TATA box
GCGCCCTTGC AGCTGGCCGA GGAACAGGTC GGTGGACCGC TAAACGCCTT TGCTATAAAA CATCAGCTAC AACTAGCAGG AGGAGCTCTT   880
    transcription start sites
TCTGCTTCTDTCCGAAAT GAGCGGGGCG CCCAGAATCC CGCGCAGCGG AGGTATTGGG TCGTGGCAAT TTTCTCGAGA ATTCCCCCCT   970
    start codon; intron #1 donor site
ACTGTTTACC TTAACCCTTT TTCCGGCAGT CCTGACACTT TTCCTCATCA ATTTCTTTCT AACTATGACT CTTTCTCTCA CACGGTGGAC   1060
GGGTATGACBTTCACCGT CCAGATCGGC TGCGCTTCCT GTGCCTGCTT CTACTCGTAT TGGGTTGGTG TTTGCCCGTG ACCGGTCATC   1150
    intron #1 acceptor site; pVIII stop codon
CTCTCAAAGG GGTTCAACCA TCGCAGTGTC AGTGCCCTGC TAGTCCCCCG TGGACTAATT CTTCTGTTAC TTCCTTCGCC CAGAAAACAA   1240
AATGGGAAAA CTCACGGTAT GTACAAGTAA GCCGTACTTA AATTTTTCTC GTGCTATACG TACGTACCTG TGCGGCTCCA AATGCGATAA   1330
    intron #2 donor sites 4 and 4A
CGCTATCTAT TTTACACCCC AGAAAATTGT TATCGAGCTG GTGCAGGAAA AAAAAACCAC TCAGTTACTC CTTTTGCTTG CAGCCAGTAT   1420
    intron #2 acceptor sites 5 and 5A
TGCCCTGTAC CTTC-GTC CTCAACTCGG GGCAAGAATG CTGTTCGAAC TGGTGCAGGC CCGGACGACG AGTGTTTC-CAGCAGCGT   1510
    class 2 stop codon; class 1 stop codon
GGCTGCTGCC CTGTTTGCCT GCGCCGGAGA GGAAATAATC AACCCAGCAA TTTTTCTGTT TCTGCATGTT CTCACACTTG TGATCGTTCT   1600
GGCTATGGCC GCTGAAGTAA TCTATAATCG CTGCCGTCGT ACTACTCGAC CTACTGCACC CCCACCCCCT GTCAACAATG CTGATTTTAA   1690
CCTGGCAGAT GCCTTAGATG AAACTTAC ALUAATAAAAA TTTGCAACAC GTACTCCGGC TCGCCTCCTUTTTTCTTT GCAGAAGGAC   1780
    polyadenylation signals; polyadenylation site; class 3 stop codon; fiber start codon

FIG. 1. MAV-1 genomic DNA sequence from 79 to 83 m.u. Important features are underlined, including transcription and translation start sites, alternate splice donor sites 4 and 4A, alternate splice acceptor sites 5 and 5A, polyadenylation signals, and polyadenylation sites. Stop codons for the class 1, 2, and 3 E3 mRNAs (see text) are indicated. All features are those of E3 messages unless otherwise noted. Nucleotide 1 corresponds to the HindIII site at 77 m.u.; for the complete sequence of the HindIII-C fragment, see Raviprakash et al. (1989).

For 5' end mapping, both the oligonucleotide for primer extension and the S1 nuclease probe were 5' end-labeled at nt 873. The S1 nuclease and primer extension products were compared on a sequencing gel (Fig. 4, lanes 5-7 and 8-10, respectively); both produced fragments which mapped two closely spaced 5' ends to nt 793 and 796 in early RNA. Other larger fragments which extended up to several hundred nucleotides upstream were protected by the S1 nuclease probe in late RNA (data not shown), and were not mapped precisely. A series of bands in late RNA corresponding to 5' ends around nt 770, the predicted early transcription TATA box, was seen in the S1 nuclease analysis (Fig. 4, lane 7). These bands probably arise from S1 nuclease digestion at A-T-rich regions in heteroduplexes (Hansen et al., 1981). The additional primer extension product seen at late times, which would map an end to nt 754 (Fig. 4, lane 10), is probably a reverse transcriptase-induced artifact, since similar 5' ends were not seen in the S1 nuclease analysis. Strong stops were seen near nt 754 when DNA was sequenced with either avian myeloblastosis virus or Moloney murine leukemia virus reverse transcriptase, but were not seen when sequencing was performed using Sequenase (data not shown). The 3' end of E3 mRNAs was mapped to ≈nt 1740 using a probe 3' end-labeled at nt 1450; this probe protected one 3' end in both early and late RNA (Fig. 5 and data not shown).

Two introns were identified using the probes diagramed in Fig. 6B. For the first intron the splice donor site (site 2) was mapped using S1 nuclease analysis and a probe 3' end-labeled at nt 827. This probe protected 100 and 110 nt fragments in early RNA, mapping a splice donor site to approximately nt 930 (Fig. 6A, lane 2). To map the first splice acceptor site (site 3), a probe was 5' end-labeled at nt 1144. This probe protected two fragments, 65 and 70 nt in length, in early RNA (Fig. 6A, lane 5), mapping a splice acceptor site to ≈nt 1080.

FIG. 2. Northern blot analysis of MAV-1 transcription from 79 to 83 m.u. Poly(A)+ RNA was isolated from mock-infected or MAV-1-infected cells from 8 to 45 hr p.i. and analyzed on Northern blots as described under Materials and Methods. Hours p.i. are indicated across the top of the figure. Lanes 1-3 were probed with random primer-labeled HindIII-C fragment of MAV-1. Lanes 4-10 were probed with a random primer-labeled E3 cDNA clone of the class 1 type (see Fig. 3). Numbers between lanes 3 and 4 represent positions of Ad5 DNA HindIII fragments.

The second splice donor site (site 4) was mapped with a probe 3' end-labeled at nt 1144. The probe protected a fragment in early RNA that was 115 bp in length (Fig. 6A, lane 8); this mapped a donor site to ≈nt 1260. To map the second splice acceptor site (Fig. 6B, site 5) a probe was 5' end-labeled at nt 1673. This probe protected a fragment that was 250 nt long (Fig. 6A, lane 11), indicating a splice acceptor site around nt 1420. In additional experiments, products protected from S1 nuclease digestion by this probe were analyzed next to DNA sequencing reactions (Fig. 6C). The 250 nt bands seen in lanes 11 and 12 of Fig. 6A were found to be doublets (Fig. 6C, lanes 15 and 16); the lower band, ≈3-4 nt shorter, was present in lower amounts than the upper band. The ratio of the two bands seen in S1 nuclease analysis was consistent with the proportion of cDNAs isolated corresponding to mRNAs alternatively spliced at acceptor sites 5 or 5A (see below).
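The coordinate arithmetic behind these S1 nuclease estimates can be illustrated with a short sketch. It assumes, as the numbers in the text imply, that a probe 3' end-labeled at a given nucleotide maps a site at roughly the label position plus the protected fragment length, and that a 5' end-labeled probe maps a site at roughly the label position minus the fragment length. The function name `mapped_site` is hypothetical, and the estimates are only as precise as the gel sizing allows.

```python
def mapped_site(label_nt: int, fragment_len: int, labeled_end: str) -> int:
    """Approximate genomic coordinate at which probe protection ends.

    labeled_end is "3prime" when the probe is 3' end-labeled (the protected
    fragment extends toward higher nt numbers) and "5prime" when it is
    5' end-labeled (the fragment extends toward lower nt numbers).
    Inclusive-counting corrections of one nt are ignored (assumption).
    """
    if labeled_end == "3prime":
        return label_nt + fragment_len
    if labeled_end == "5prime":
        return label_nt - fragment_len
    raise ValueError("labeled_end must be '3prime' or '5prime'")


if __name__ == "__main__":
    # (label position, fragment length, labeled end) taken from the text
    experiments = {
        "first donor (site 2)": (827, 100, "3prime"),
        "first acceptor (site 3)": (1144, 65, "5prime"),
        "second donor (site 4)": (1144, 115, "3prime"),
        "second acceptor (site 5)": (1673, 250, "5prime"),
        "3' end": (1450, 290, "3prime"),
    }
    for name, args in experiments.items():
        print(f"{name}: approximately nt {mapped_site(*args)}")
```

With the fragment sizes quoted above, the sketch returns nt 927, 1079, 1259, 1423, and 1740, in line with the approximate positions (930, 1080, 1260, 1420, and 1740) reported from the gels.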

Late transcripts were abundant in this region of the genome, as indicated by the additional bands in late RNAs in Fig. 6A, and were not mapped extensively. The uppermost band corresponded to the full length of the S1 nuclease probe (Fig. 6A, lanes 3, 6, 9, and 12), suggesting that, at late times, unspliced mRNAs were produced from this region. In lane 3, the additional band ≈450 nt in length may correspond to mRNAs that do not have the first intron removed. The intensity of bands corresponding to splicing at sites 4 and 5 increased at late times, suggesting that some late RNAs may have also spliced out the second intron. Because only a single 3' end was seen at late times (Fig. 5), apparently one or more late mRNAs extended through E3 and shared the polyadenylation site.

Determination of splice sites from sequence analysis of viral cDNAs

Twenty-nine cDNA clones isolated from a library made from early MAV-1-infected-cell RNA were sequenced to confirm the nuclease protection and primer extension assays (Fig. 3). The precise splice sites are shown in Figs. 1 and 3. The sequences from these clones gave the exact splice sites and 3' ends. All clones had the same splice donor and acceptor for the first splice, at nt 932 and 1085, respectively, and thus the heterogeneity seen in the S1 nuclease assays at these two splice sites (Fig. 6A, lanes 2 and 5) was probably an artifact of the S1 nuclease protection assay. Differential splicing around the second intron divided the cDNAs into three classes (Fig. 3). Class 1 cDNAs had a splice donor site (site 4) at nt 1256 and an acceptor site at nt 1414 (site 5). Twenty-four of the 29 cDNAs were in this class. Class 2 clones had a splice donor site at nt 1266 (site 4A) and an acceptor at nt 1414 (site 5); two of the cDNA clones were in this class. Two class 3 clones were identified; they had a donor site at nt 1256 (site 4) and an acceptor site at nt 1418 (site 5A).

A fourth class of cDNAs, represented by a single member (Fig. 3), had a 5' end upstream of the major 5' end seen by S1 nuclease protection and primer extension (Fig. 4). The 5' end of this cDNA corresponded to the strong stop seen in primer extension in Fig. 4, lane 10, and probably resulted from premature termination as discussed above. Because of the method of cDNA synthesis, the 5' end of the mRNA corresponding to this cDNA must lie upstream of nt 754 (Gubler and Hoffman, 1983). It is not known whether this cDNA was derived from a very minor class of early mRNAs or from a late mRNA present when the MAV-1-infected-cell RNA was isolated for the cDNA synthesis. In any case, in all of the S1 nuclease analyses of early RNA, no band corresponding to a 5' end at nt 754 was seen (Fig. 4 and data not shown). Ad2 E3 mRNAs spliced like other early E3 mRNAs, but with 5' ends corresponding to transcription from the major late promoter, have been observed in late mRNA preparations (Bhat and Wold, 1986; Chow et al., 1979); the class 4 MAV-1 mRNA may be similar.

Class   Sites (nt)                               Number of cDNAs
1       793/796, 932, 1085, 1256, 1414, 1743     24
2       1266                                     2
3       1418                                     2
4       754                                      1

FIG. 3. E3 mRNAs. The three major classes of E3 mRNAs (classes 1-3) and a fourth mRNA (either minor early or late, see text) are diagrammed. The number of cDNAs which correspond to each class is given on the right. Horizontal lines and carets represent the exons and introns, respectively. Boxes above the mRNAs represent orfs; different shading patterns indicate different reading frames. Start, splice, and polyadenylation sites are indicated in nt numbers below the appropriate site for class 1 mRNAs. For classes 2, 3, and 4, only sites different from class 1 are indicated. The 5' ends found by sequencing cDNAs ranged from nt 806 to 836 for class 1, 2, and 3 mRNAs (data not shown). Because of the cDNA synthesis method, the true 5' ends of the mRNAs lie upstream of the 5' ends of the cDNAs (Gubler and Hoffman, 1983). The dotted line for the single class 4 clone indicates that its true 5' end is believed to be even further upstream (see text).

FIG. 4. 5' end mapping of E3 mRNAs. RNA was isolated from mock-infected L cells or MAV-1-infected L cells 22 hr (early) or 48 hr (late) p.i. and hybridized to the S1 nuclease probe (nt 6-873) or the primer (nt 857-873) at 46 or 37°, respectively. The probe and primer are diagrammed at the bottom of the figure; asterisks indicate the position of the label. For S1 nuclease analysis the hybrids were digested for 60 min at 23° with 50 U of S1 nuclease. Lanes 1-4 show a sequencing ladder generated using the indicated primer with a genomic DNA template. Numbers on the right indicate the nucleotide numbers at the 5' end of protected and extended fragments and are indicated by asterisks on the DNA sequence shown on the left.

FIG. 5. 3' end mapping of E3 mRNAs. RNA was isolated from mock-infected L cells or MAV-1-infected L cells 22 hr (early) or 48 hr (late) p.i. The S1 nuclease probe used, nt 1450-3660, is shown at the bottom of the figure and was hybridized to the RNA at 52°. The hybrids were digested for 60 min at 23° with 50 U of S1 nuclease. The arrow indicates a protected fragment of approximately 290 nt which maps the 3' end to around nt 1740. Lane 4, end-labeled RsaI fragments of φX174 replicative form DNA used as size standards.

FIG. 6. Mapping splice sites of E3 mRNAs. (A) RNA was isolated from L cells mock-infected or infected with MAV-1 for 22 hr (early) or 48 hr (late), and hybridized with the indicated probes. Hybridizations with probes 2, 3, and 4 were performed at 40°, and with probe 5 at 48°. The hybrids were digested for 30 min at 23° with 50 U of S1 nuclease. The sizes of protected fragments and the location of mapped ends are described in the text. Lane 13, size standards as in Fig. 5. (B) Probes used in A are indicated below a consensus mRNA diagram. The dotted caret lines indicate the alternative splices 4A and 5A (see text). (C) Aliquots of samples electrophoresed in lanes 10-12 in part A were analyzed on a sequencing gel (lanes 14-16, respectively). The doublet band (≈250 nt) is discussed in the text. A + G indicates the A + G sequencing reaction of the probe (Maxam and Gilbert, 1980).
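As an illustrative sketch (not part of the original analysis), the coordinates reported for the three major cDNA classes can be used to compute the length of each spliced message and the reading-frame offset downstream of the second splice junction. The sketch assumes that each reported donor coordinate is the last exon nucleotide and each acceptor coordinate is the first exon nucleotide, that the 5' end is nt 793, and that nt 1743 is the last templated nucleotide; poly(A) tails are not counted. All names are hypothetical.

```python
from dataclasses import dataclass

# Coordinates shared by all three major classes (nt, HindIII-C numbering).
FIVE_PRIME_END = 793   # one of the two mapped 5' ends (793/796)
DONOR_1 = 932          # assumed last nt of exon 1
ACCEPTOR_1 = 1085      # assumed first nt of exon 2
POLY_A_SITE = 1743     # assumed last templated nt


@dataclass(frozen=True)
class E3Class:
    name: str
    donor_2: int      # assumed last nt of exon 2
    acceptor_2: int   # assumed first nt of exon 3
    n_cdnas: int      # number of cDNA clones reported for the class


CLASSES = [
    E3Class("class 1", 1256, 1414, 24),
    E3Class("class 2", 1266, 1414, 2),
    E3Class("class 3", 1256, 1418, 2),
]


def spliced_length(c: E3Class) -> int:
    """Length in nt of the spliced message, excluding the poly(A) tail."""
    exon_1 = DONOR_1 - FIVE_PRIME_END + 1
    exon_2 = c.donor_2 - ACCEPTOR_1 + 1
    exon_3 = POLY_A_SITE - c.acceptor_2 + 1
    return exon_1 + exon_2 + exon_3


def frame_offset_vs_class_1(c: E3Class) -> int:
    """Reading-frame offset (0, 1, or 2) in exon 3 relative to class 1.

    The offset is the change in the number of nt that precede a given
    exon 3 position in the spliced message, taken modulo 3.
    """
    ref = CLASSES[0]
    extra_upstream = (c.donor_2 - ref.donor_2) - (c.acceptor_2 - ref.acceptor_2)
    return extra_upstream % 3


if __name__ == "__main__":
    total = sum(c.n_cdnas for c in CLASSES)
    for c in CLASSES:
        print(c.name, spliced_length(c), frame_offset_vs_class_1(c),
              f"{c.n_cdnas}/{total} cDNAs")
```

Under these assumptions the three messages are 642, 652, and 638 nt long before polyadenylation, and the three classes read the third exon in three different frames (offsets 0, 1, and 2), which agrees with the different reading frames indicated in Fig. 3. Acceptor site 5A is used in 2 of the 28 class 1-3 clones.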

Identification of transcription signals

Transcription initiation signals were identified in the DNA sequence. A CCAAT box was identified at nt 624 (sequence not shown; Raviprakash et al., 1989) and a TATA box was identified at nt 764. The E3 transcription unit contains two overlapping polyadenylation signals, AATAAA (Proudfoot and Brownlee, 1976), at nt 1719 and 1723 (Fig. 1); no other perfect signals were identified in the HindIII-C fragment. No mRNAs were found to have 3' ends corresponding to the imperfect AATAAT at nt 1544 (data not shown). All but one of the cDNAs that had poly(A) tail sequence in the clone were polyadenylated at nt 1743; the exception was a class 1 mRNA polyadenylated at nt 1769. Two overlapping GU-rich regions (at nt 1759 and 1761) required for efficient 3' end formation (McLauchlan et al., 1985) were identified downstream of the polyadenylation sites.
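A minimal sketch of how overlapping hexanucleotide signals, such as the two AATAAA motifs described above, can be located in a sequence. A zero-width lookahead is used so that overlapping occurrences are both reported. The function name, the `first_nt` parameter, and the short test sequence are hypothetical illustrations and are not taken from the MAV-1 sequence.

```python
import re


def find_signal(seq: str, motif: str = "AATAAA", first_nt: int = 1) -> list[int]:
    """Return the coordinate of every (possibly overlapping) motif start.

    first_nt is the coordinate assigned to the first character of seq, so
    that positions can be reported in the numbering of a larger fragment.
    """
    # A lookahead matches without consuming characters, so a second motif
    # that begins inside the first one is still found.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() + first_nt for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "GGAATAAATAAAGG"  # hypothetical sequence with two overlapping signals
    print(find_signal(toy))                  # 1-based positions within toy
    print(find_signal(toy, first_nt=1717))   # same hits in shifted numbering
```

In the toy sequence the two signals start 4 nt apart, the same spacing as the signals reported at nt 1719 and 1723.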

Predicted coding regions and similarity to human adenovirus proteins

The translated DNA sequence predicted the MAV-1 proteins shown in Fig. 7 for the three alternatively spliced E3 mRNAs. These and other orfs longer than 60 amino acids (aa) were examined for similarity to known human adenovirus proteins and orfs from the entire genome. Similarities were found to human adenovirus E3 gp19K and to late proteins pVIII, the 33K phosphoprotein, and fiber. The TESTCODE algorithm, which predicts the likelihood that a stretch of DNA is coding or noncoding (Fickett, 1982), predicted that the regions of DNA corresponding to the third E3 exon and late genes contained protein coding sequences.

class 1, 2, 3   MSEMSGAPRI PRSGDRLRFL CLLLLVLGW-C LPV-TGHPLKG VQPSQCQC--PA SPPWTBVT SFAQKTKWEN   70
Ad3 gp19K       ...L CgvLikcGWdC rsVeithnnKt wnntlsttwepg vPqWytvSVr g-pdgsiris   100

class 1         SRQYCPVPSE SSTRGK--NAVR TAGAPDDECF*   100
class 2         SRYVQPVLPC TF*   82
class 3         SRIALYLLSP QLGARM--LFEL VQARTTSVSKSSVAULFAC AGEEIINPAI FLFLHVLTLV IVLAMAAEVI   140
Ad3 gp19K       nntfifsemc dLamfMsrqydL wppskeniva fSiAycLvtC iitaIIcvc1 hllivi...   158

class 3         YNRCRRTTRP TAPPPPVNNA DFNLADALDE TYNK*   174

FIG. 7. Amino acid sequence of the predicted MAV-1 E3 proteins. The predicted coding sequences for the three MAV-1 proteins and part of the Ad3 gp19K protein are shown in one-letter amino acid code. Only the region of the Ad3 gp19K protein which is similar to the MAV-1 predicted proteins is shown. The amino and carboxyl termini of the Ad3 gp19K, which do not have similarity to the MAV-1 E3 protein, are indicated by "...". The amino acids of the Ad3 gp19K which are identical to the class 3 protein are capitalized. The sequence for the three MAV-1 proteins diverges after the second intron; the identical amino terminal portion of the MAV-1 proteins is shown at the top of the figure and the divergent portions are shown below. The mRNA class for each protein is shown on the left and the aa numbers are shown on the right. The amino acids which are adjacent to or span splice sites in the MAV-1 predicted proteins are in bold, and potential glycosylation sites are underlined.
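The screen for orfs longer than 60 aa can be sketched as a simple scan of the three reading frames of one strand. The text does not state how orfs were delimited, so the sketch below rests on assumptions: frames are taken to begin at an ATG and end at the first in-frame stop codon, only the given strand is scanned, and length is counted in codons excluding the stop. The function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_aa: int = 60) -> list[tuple[int, int, int]]:
    """Return (start, end, n_aa) for ATG-initiated ORFs longer than min_aa.

    Coordinates are 1-based and inclusive; end is the last nt of the stop
    codon. Only the three frames of the given strand are scanned.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG in the currently open frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # codons before the stop
                if n_aa > min_aa:
                    orfs.append((start + 1, i + 3, n_aa))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    # Hypothetical sequence: one ATG followed by 61 alanine codons and a stop.
    toy = "CC" + "ATG" + "GCT" * 61 + "TAA" + "G"
    print(find_orfs(toy))
```

A fuller analysis would also scan the complementary strand and could allow frames that are open from stop codon to stop codon; those choices are left out here to keep the sketch short.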

Sequences from Ad2, 3, 5, and 35 (Cladaras and Wold, 1985; Flomenberg et al., 1988; Hérissé et al., 1980; Hérissé and Galibert, 1981; Signas et al., 1986) were examined, and gp19K was found to have similarity with MAV-1 orfs. The most similarity was seen between predicted orfs of the three MAV-1 mRNAs and the gp19K of Ad3. The similarity was found throughout the MAV-1 second exon, aa 13 to 82, shared by all MAV-1 E3 mRNAs (25% identical, and an additional 7% similar aa). The similarity extended into the first 50 aa of the third exon unique to the class 3 mRNAs (21% identical, and 19% similar aa) (Fig. 7). The hydropathy profile of the protein from class 3 mRNAs (data not shown) predicted that the MAV-1 E3 protein could be a membrane-bound protein (Kyte and Doolittle, 1982). A putative N-terminal signal sequence from aa 1 to 37 (von Heijne, 1985) and a hydrophobic (potential transmembrane) region from aa 115 to 140 were observed. Two possible glycosylation sites (N-X-S/T) were present, at aa 56 and 100. The site at 100 is unique to proteins predicted by class 3 mRNAs. There were two possible start codons 4 aa apart; the first was in a slightly better context (Kozak, 1986) but it is not known which of the two codons initiates translation.
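The N-X-S/T motif used to flag possible glycosylation sites can be scanned for with a short regular expression, sketched below. The function name and the toy peptide are hypothetical. The optional exclusion of proline at the X position is a commonly applied refinement and is an assumption here, since the text specifies only N-X-S/T.

```python
import re


def glycosylation_sites(protein: str, exclude_proline: bool = False) -> list[int]:
    """Return 1-based positions of the N in every N-X-S/T sequon."""
    x = "[^P]" if exclude_proline else "."
    # The lookahead lets sequons that overlap one another all be reported.
    pattern = re.compile(f"(?=N{x}[ST])")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MANVSAANPTNGT"  # hypothetical peptide, not an MAV-1 sequence
    print(glycosylation_sites(toy))
    print(glycosylation_sites(toy, exclude_proline=True))
```

A match only marks a potential site; whether a given asparagine is actually glycosylated cannot be decided from the sequence alone.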

Late protein coding sequences were identified within and adjacent to the sequence shown in Fig. 1. An orf with 41% identity to an Ad2 late structural protein, pVIII, was identified extending from nt 425 (sequence not shown; Raviprakash et al., 1989) to a stop codon at nt 1070 (Fig. 1). Fourteen amino acids of this orf are shared by the orf in the first exon of the E3 mRNAs. An orf of 615 aa was identified that had 20% identical amino acids when compared to the amino terminal 300 aa of Ad2 fiber, a late structural protein; the orf extends from nt 1760 to 3629 (Fig. 1; sequence not shown; Raviprakash et al., 1989). A third MAV-1 orf extended from nt 88 to 415 (sequence not shown; Raviprakash et al., 1989) and had 51% similarity to the carboxyl portion of the Ad2 33K phosphoprotein (Oosterom-Dragon and Anderson, 1983).

DNA sequence similarity to Ad2 was detected within the coding region corresponding to pVIII (50% identity). A possible MAV-1 leader sequence was noted by DNA sequence similarity to the Ad2 x-leader (65% identity over 90 nt) from nt 1022 to 1122, but identification of this or other late mRNA exons requires analysis of late mRNAs.

DISCUSSION

We have established the transcription map of E3 of MAV-1 by comparing genomic and cDNA sequences, and by S1 nuclease and primer extension assays. The identification of MAV-1 E3 was initially made by analogy to human adenoviruses, in which E3 is located about m.u. 80 and transcribed at early times in the rightward direction. The data suggested that MAV-1 E3, while it was transcribed rightward from approximately the same map coordinate, differed significantly from the E3 of human adenoviruses.

The entire MAV-1 E3 transcription unit covered only 1000 bp, in contrast to about 3300 bp in Ad5. Three major classes of alternatively spliced early MAV-1 mRNAs were observed; there are 7 to 10 identified early mRNAs in various human adenoviruses (Chow et al., 1979; Cladaras et al., 1985). The MAV-1 E3 mRNAs were 5' and 3' coterminal, while in Ad2/5 the E3 mRNAs share 5' ends but have at least two polyadenylation sites: the E3A poly(A) site at about 83 m.u. and the E3B polyadenylation site at about 86 m.u. (Berk and Sharp, 1978; Chow et al., 1979; Cladaras and Wold, 1985). Furthermore, the MAV-1 E3 mRNAs were also apparently 3' coterminal with late mRNAs, in contrast to Ad2/5 where L4 terminates in the first intron of the E3 mRNAs.

The alternative splicing of the MAV-1 E3 mRNAs could have a functional significance in viral pathogenesis. Many of the DNA tumor viruses exhibit complex splicing in cell culture infections; however, the extent and relevance of RNA processing in natural infections is unknown. Bovine papillomavirus type 4 (BPV-4) is capable of complex splicing, and BPV-4 mRNAs are found which qualitatively and quantitatively vary among the papillomas and BPV-4-transformed cells that have been examined (Campo and Jarrett, 1987). It is not known whether this differential expression of BPV-4 in various cell types is significant for BPV-4 pathogenesis. Alternative splicing in human adenovirus E1A leads to synthesis of two different primary translation products, the 289- and 243-aa proteins (Baker and Ziff, 1981; Perricaudet et al., 1979). The 243-aa protein is dispensable for growth of the virus in HeLa cells, but is required for maximal replication in growth-arrested WI-38 cells (Montell et al., 1984; Spindler et al., 1985). Thus production of alternatively spliced human adenovirus E1A mRNAs enables the virus to expand its host range. Study of MAV-1 E3 should allow analysis of the consequences of alternative splicing in vivo. For example, mutant viruses could be constructed in which a single cDNA sequence replaces the wild-type genomic E3 sequence. The in vivo pathogenesis of these mutants, which should express only one class of E3 mRNA, could be examined after infection of mice.

Predicted protein coding sequences from the MAV-1 mRNAs and other orfs in this region were analyzed and a slight similarity to gp19K was detected. In addition, orfs with similarity to late structural proteins pVIII and fiber were identified just upstream and downstream of the E3 mRNAs, respectively. A predicted orf with similarity to the carboxyl terminus of the 33K late nonstructural protein was also observed. Similarity to other known adenovirus proteins or orfs was not detected; notably, sequences corresponding to the coding regions of the 3' portion of Ad2/5 E3, i.e., E3B, were not found in MAV-1. These coding regions may either have been deleted at some time from the MAV-1 genome or inserted into the human adenovirus genome after evolutionary divergence of these viruses.

The MAV-1 predicted coding region that has similarity to the gp19K is interrupted by two introns, whereas the human coding region is uninterrupted. Also, a sequence of 18 aa, strongly conserved among Ad2, 3, 5, and 35 (Flomenberg et al., 1988), is not observed in the predicted MAV-1 protein, and only one of the conserved cysteines is found. It has been reported that neither disulfide bridges between the 19K glycoprotein and MHC class I antigens nor carbohydrates are important for the gp19K-class I interaction (Burgert and Kvist, 1987). Therefore, it is not clear what level of sequence conservation is important for predicting homologous function of human and mouse adenovirus E3 proteins. In MAV-1 E1A, although protein sequences have diverged considerably from Ad2/5, short conserved regions are present and the E1A transactivating function is conserved (Ball et al., 1988).

In Ad2 the mRNAs which encode the 19K glycoprotein are the most abundant of the early mRNAs (Chow et al., 1979); the gp19K is the major early protein early after infection (Persson et al., 1979; Wold et al., 1985). It was interesting that the MAV-1 class 3 mRNA, which predicted the protein with the most similarity to gp19K, was not the major MAV-1 E3 mRNA. There are several possible explanations for this result. In MAV-1, the protein predicted by the class 3 mRNAs may only be required in small amounts; the translation efficiency of the MAV-1 E3 class 3 mRNA may be relatively high; or the amounts of mRNAs produced in vitro may differ from the amounts found in vivo. Alternatively, the predicted protein may not be homologous to any human E3 protein, and therefore relative amounts of mRNA are irrelevant. In any case, because MAV-1 E3 differs so much from human adenovirus E3, it is difficult to predict the functions of the MAV-1 E3 proteins based on sequence similarity to human adenovirus E3 proteins.

Many of the conserved human adenovirus E3 proteins are not absolutely required for growth either in vitro or in vivo (Ginsberg et al., 1989; Kelly and Lewis, 1973). In vivo evidence supporting this comes from experiments involving infection of cotton rats with human Ad2/5 E3 viral mutants (Ginsberg et al., 1989). Viruses with mutations in gp19K or the 14.7K protein were infectious, although they had qualitatively different phenotypes from wild type and from each other. Even viruses lacking the entire E3 were able to replicate in vivo. Ginsberg et al. (1989) propose that the human adenovirus E3 proteins moderate an adenoviral infection, lessening its severity, and are not essential for a productive infection. If this hypothesis of the function of adenovirus E3 proteins is true in mouse adenovirus, the lack of many genes in MAV-1 corresponding to human adenovirus E3 proteins may explain its highly pathogenic phenotype: MAV-1 produces a virulent, systemic, and fatal infection in newborn mice (Hartley and Rowe, 1960; Heck et al., 1972; van der Veen and Mes, 1973; Ball and Spindler, unpublished).

With the identification of MAV-1 E3, the role played by the MAV-1 E3 proteins in both in vivo and in vitro infections can now be studied with mutant viruses. In addition, it may be possible to study the human adenovirus E3 products which lack cognates in MAV-1 by inserting them into the MAV-1 genome.

ACKNOWLEDGMENTS

We are grateful to Drs. Marshall Horwitz and K. S. Raviprakash for communicating their genomic sequence of the MAV-1 HindIII-C fragment prior to publication. We thank Drs. Bert Semler and Suzanne Thiem for helpful comments on the manuscript. We thank Leslie Teugh Freeman and Elizabeth A. Justin for hybridization analyses. We are grateful to Keena Lowe for editorial assistance. This work was supported by Public Health Service Grant AI 23762 from the National Institutes of Health. K.R.S. is the recipient of an American Cancer Society Junior Faculty Award.

REFERENCES

ANDERSSON, M., PÄÄBO, S., NILSSON, T., and PETERSON, P. A. (1985). Impaired intracellular transport of class I MHC antigens as a possible means for adenoviruses to evade immune surveillance. Cell 43, 215-222.

BAKER, C. C., and ZIFF, E. B. (1981). Promoters and heterogeneous 5’ termini of the messenger RNAs of adenovirus serotype 2. J. Mol. Biol. 149, 189-221.

BALL, A. O., BEARD, C. W., REDICK, S. D., and SPINDLER, K. R. (1989). Genome organization of mouse adenovirus type 1 early region 1: A novel transcription map. Virology 170, 523-536.

BALL, A. O., WILLIAMS, M. E., and SPINDLER, K. R. (1988). Identification of mouse adenovirus type 1 early region 1: DNA sequence and a conserved transactivating function. J. Virol. 62, 3947-3957.

BARTOK, K., GARON, C. F., BERRY, K. W., FRASER, M. J., and ROSE, J. A. (1974). Specific fragmentation of adenovirus heteroduplex DNA molecules with single-strand specific nucleases of Neurospora crassa. J. Mol. Biol. 87, 437-449.

BELAK, S., VIRTANEN, A., ZABIELSKI, J., RUSVAI, M., BERENCSI, G., and PETTERSSON, U. (1986). Subtypes of bovine adenovirus type 2 exhibit major differences in region E3. Virology 153, 262-271.

BERK, A. J., and SHARP, P. A. (1978). Structure of the adenovirus 2 early mRNAs. Cell 14, 695-711.

BHAT, B. M., and WOLD, W. S. M. (1986). Genetic analysis of mRNA synthesis in adenovirus region E3 at different stages of productive infection by RNA-processing mutants. J. Virol. 60, 54-63.

BURGERT, H.-G., and KVIST, S. (1985). An adenovirus type 2 glycoprotein blocks cell surface expression of human histocompatibility class I antigens. Cell 41, 987-997.

BURGERT, H.-G., and KVIST, S. (1987). The E3/19K protein of adenovirus type 2 binds to the domains of histocompatibility antigens required for CTL recognition. EMBO J. 6, 2019-2026.

CAMPO, M. S., and JARRETT, W. F. H. (1987). Papillomaviruses and disease. In “Molecular Basis of Virus Disease” (W. C. Russell and J. W. Almond, Eds.), pp. 215-243. Cambridge Univ. Press, Cambridge.

CARLIN, C. R., TOLLEFSON, A. E., BRADY, H. A., HOFFMAN, B. L., and WOLD, W. S. M. (1989). Epidermal growth factor receptor is down-regulated by a 10,400 MW protein encoded by the E3 region of adenovirus. Cell 57, 135-144.

CHOW, L. T., BROKER, T. R., and LEWIS, J. B. (1979). Complex splicing patterns of RNAs from the early regions of adenovirus-2. J. Mol. Biol. 134, 265-303.

CLADARAS, C., BHAT, B., and WOLD, W. S. M. (1985). Mapping the 5’ ends, 3’ ends, and splice sites of mRNAs from the early E3 transcription unit of adenovirus 5. Virology 140, 44-54.

CLADARAS, C., and WOLD, W. S. M. (1985). DNA sequence of the early E3 transcription unit of adenovirus 5. Virology 140, 28-43.

FICKETT, J. W. (1982). Recognition of protein coding regions in DNA sequences. Nucleic Acids Res. 10, 5303-5318.

FLOMENBERG, P. R., CHEN, M., and HORWITZ, M. S. (1988). Sequence and genetic organization of adenovirus type 35 early region 3. J. Virol. 62, 4431-4437.

GARON, C. F., BERRY, K. W., HIERHOLZER, J. C., and ROSE, J. A. (1973). Mapping of base sequence heterologies between genomes from different adenovirus serotypes. Virology 54, 414-426.

GINSBERG, H. S., LUNDHOLM-BEAUCHAMP, U., HORSWOOD, R. L., PERNIS, B., WOLD, W. S. M., CHANOCK, R. M., and PRINCE, G. A. (1989). Role of early region 3 (E3) in pathogenesis of adenovirus disease. Proc. Natl. Acad. Sci. USA 86, 3823-3827.

GOODING, L. R., ELMORE, L. W., TOLLEFSON, A. E., BRADY, H. A., and WOLD, W. S. M. (1988). A 14,700 MW protein from the E3 region of adenovirus inhibits cytolysis by tumor necrosis factor. Cell 53, 341-346.

GUBLER, U., and HOFFMAN, B. J. (1983). A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.

HANSEN, U., TENEN, D. G., LIVINGSTON, D. M., and SHARP, P. A. (1981). T antigen repression of SV40 early transcription from two promoters. Cell 27, 603-612.

HARTLEY, J. W., and ROWE, W. P. (1960). A new mouse virus apparently related to the adenovirus group. Virology 11, 645-647.

HECK, F. C., JR., SHELDON, W. G., and GLEISER, C. A. (1972). Pathogenesis of experimentally produced mouse adenovirus infection in mice. Amer. J. Vet. Res. 33, 841-846.

HENIKOFF, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28, 351-359.

HÉRISSÉ, J., COURTOIS, G., and GALIBERT, F. (1980). Nucleotide sequence of the EcoRI D fragment of adenovirus 2 genome. Nucleic Acids Res. 8, 2173-2192.

HÉRISSÉ, J., and GALIBERT, F. (1981). Nucleotide sequence of the EcoRI E fragment of adenovirus 2 genome. Nucleic Acids Res. 9, 1229-1240.

HJORTH, R. N., BONDE, G. M., PIERZCHALA, W. A., VERNON, S. K., WIENER, F. P., LEVNER, M. H., LUBECK, M. D., and HUNG, P. P. (1988). A new hamster model for adenoviral vaccination. Arch. Virol. 100, 279-283.

KÄMPE, O., BELLGRAU, D., HAMMERLING, U., LIND, P., PÄÄBO, S., SEVERINSSON, L., and PETERSON, P. A. (1983). Complex formation of class I transplantation antigens and a viral glycoprotein. J. Biol. Chem. 258, 10,594-10,598.

KELLY, T. J., JR., and LEWIS, A. M., JR. (1973). Use of nondefective adenovirus-simian virus 40 hybrids for mapping the simian virus 40 genome. J. Virol. 12, 643-652.

KOZAK, M. (1986). Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44, 283-292.

KVIST, S., ÖSTBERG, L., PERSSON, H., PHILIPSON, L., and PETERSON, P. A. (1978). Molecular association between transplantation antigens and cell surface antigen in adenovirus-transformed cell line. Proc. Natl. Acad. Sci. USA 75, 5674-5678.

KYTE, J., and DOOLITTLE, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105-132.

MAXAM, A. M., and GILBERT, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. In “Methods in Enzymology” (L. Grossman and K. Moldave, Eds.), Vol. 65, pp. 499-560. Academic Press, San Diego.

MCLAUCHLAN, J., GAFFNEY, D., WHITTON, J. L., and CLEMENTS, J. B. (1985). The consensus sequence YGTGTTYY located downstream from the AATAAA signal is required for efficient formation of mRNA 3’ termini. Nucleic Acids Res. 13, 1347-1368.

MONTELL, C., COURTOIS, G., ENG, C., and BERK, A. (1984). Complete transformation by adenovirus 2 requires both E1A proteins. Cell 36, 951-961.

MORIN, J. E., LUBECK, M. D., BARTON, J. E., CONLEY, A. J., DAVIS, A. R., and HUNG, P. P. (1987). Recombinant adenovirus induces antibody response to hepatitis B virus surface antigen in hamsters. Proc. Natl. Acad. Sci. USA 84, 4626-4630.

OOSTEROM-DRAGON, E. A., and ANDERSON, C. W. (1983). Polypeptide structure and encoding location of the adenovirus serotype 2 late, nonstructural 33K protein. J. Virol. 45, 251-263.

PÄÄBO, S., NILSSON, T., and PETERSON, P. A. (1986). Adenoviruses of subgenera B, C, D, and E modulate cell-surface expression of major histocompatibility complex class I antigens. Proc. Natl. Acad. Sci. USA 83, 9665-9669.

PÄÄBO, S., WEBER, F., KÄMPE, O., SCHAFFNER, W., and PETERSON, P. A. (1983). Association between transplantation antigens and a viral membrane protein synthesized from a mammalian expression vector. Cell 35, 445-453.

PERRICAUDET, M., AKUSJÄRVI, G., VIRTANEN, A., and PETTERSSON, U. (1979). Structure of two spliced mRNAs from the transforming region of human subgroup C adenoviruses. Nature (London) 281, 694-696.

PERSSON, H., SIGNÄS, C., and PHILIPSON, L. (1979). Purification and characterization of an early glycoprotein from adenovirus type 2-infected cells. J. Virol. 29, 938-948.

PETTERSSON, U., TIBBETTS, C., and PHILIPSON, L. (1976). Hybridization maps of early and late messenger RNA sequences on the adenovirus type 2 genome. J. Mol. Biol. 101, 479-501.

PROUDFOOT, N. J., and BROWNLEE, G. G. (1976). 3’ non-coding region sequences in eukaryotic messenger RNA. Nature (London) 263, 211-214.

RAVIPRAKASH, K. S., GRUNHAUS, A., EL KHOLY, A., and HORWITZ, M. S. (1989). The mouse adenovirus type 1 contains an unusual E3 region. J. Virol. 63, 5455-5458.

SANGER, F., NICKLEN, S., and COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.

SEVERINSSON, L., and PETERSON, P. A. (1985). Abrogation of cell surface expression of human class I transplantation antigens by an adenovirus protein in Xenopus laevis oocytes. J. Cell Biol. 101, 540-547.

SIGNÄS, C., AKUSJÄRVI, G., and PETTERSSON, U. (1986). Region E3 of human adenoviruses; differences between the oncogenic adenovirus-3 and the non-oncogenic adenovirus-2. Gene 50, 173-184.

SIGNÄS, C., KATZE, M. G., PERSSON, H., and PHILIPSON, L. (1982). An adenovirus glycoprotein binds heavy chains of class I transplantation antigens from man and mouse. Nature (London) 299, 175-178.

SPINDLER, K. R., ENG, C. Y., and BERK, A. J. (1985). An adenovirus early region 1A protein is required for maximal viral DNA replication in growth-arrested human cells. J. Virol. 53, 742-750.

TOLLEFSON, A. E., and WOLD, W. S. M. (1988). Identification and gene mapping of a 14,700-molecular-weight protein encoded by region E3 of group C adenoviruses. J. Virol. 62, 33-39.

VAN DER VEEN, J., and MES, A. (1973). Experimental infection with mouse adenovirus in adult mice. Arch. Gesamte Virusforsch. 42, 235-241.

VON HEIJNE, G. (1985). Signal sequences. The limits of variation. J. Mol. Biol. 184, 99-105.

WANG, E. W., SCOTT, M. O., and RICCIARDI, R. P. (1988). An adenovirus mRNA which encodes a 14,700-Mr protein that maps to the last open reading frame of region E3 is expressed during infection. J. Virol. 62, 1456-1459.

WOLD, W. S. M., CLADARAS, C., DEUTSCHER, S. L., and KAPOOR, Q. S. (1985). The 19-kDa glycoprotein coded by region E3 of adenovirus: Purification, characterization, and structural analysis. J. Biol. Chem. 260, 2424-2431.

WOLD, W. S. M., CLADARAS, C., MAGIE, S. C., and YACOUB, N. (1984). Mapping a new gene that encodes an 11,600-molecular-weight protein in the E3 transcription unit of adenovirus 2. J. Virol. 52, 307-313.
