Eur. J. Biochem. 48, 179-192 (1974)

Structural Studies of Adenovirus Type-2 Hexon Protein

Hans JÖRNVALL, Ulf PETTERSSON, and Lennart PHILIPSON

Kemiska Institutionen I, Karolinska Institutet, Stockholm, and Mikrobiologiska Institutionen, Wallenberglaboratoriet, Uppsala

(Received February 9 / May 22, 1974)

The adenovirus type 2 hexon protein produced in excess during infection was carboxymethylated by reduction and alkylation with iodo[2-14C]acetate in 6 M guanidine hydrochloride. Only one type of polypeptide chain was detected by exclusion chromatography in dissociating buffers. Peptide mapping experiments, sequence analysis of peptides and identification of possible protein terminal residues also suggest that all subunits are similar and probably identical.

Structural analysis of [14C]carboxymethylated hexon and of hexon which was labeled in vivo with [35S]cysteine revealed six unique cysteine derivatives in soluble tryptic peptides. A seventh unique cysteine derivative, only partly carboxymethylated, was present in the tryptic core material and was recovered after chymotryptic digestion. Seven unique cysteine derivatives in the hexon subunit correspond to a minimum molecular weight of about 100 000 for the subunit. This value is consistent with values obtained from exclusion chromatography in guanidine hydrochloride, from mercurial titration of an accessible thiol group in the intact protein and from peptide mapping experiments.

Peptides corresponding to more than one tenth of the total number of residues in the protein chain were characterized and, with the inclusion of other peptides studied, about one quarter of all residues are accounted for. The results exclude extensive regions of identity in primary structure within the hexon subunits and the partial resistance towards attack by proteolytic enzymes is structurally explained.

The adenoviruses are medium-sized icosahedral DNA viruses, some of which are oncogenic. The hexon is the main capsomer, occurring at 240 non-vertex positions in the capsid, and contains one of at least ten [1,2] virion polypeptides. It was the first animal-virus protein to be crystallized [3]. It is soluble in native form and is produced in excess during viral infection. Widely different molecular weights have been reported for the intact hexon as well as for its subunits (mainly within values of 200 000-400 000 and 60 000-120 000, respectively; for a review, see [4]). The hexon subunit thus contains a large polypeptide, the maximum value of which accounts for the coding capacity of about 11% of the entire viral genome.

Recent analyses of sedimentation velocity in the absence or presence of guanidine-HCl, of equilibrium sedimentation, of crystallographic parameters, and of dodecylsulphate-polyacrylamide gel electrophoresis indicate values close to [5] or just above [4] 300 000 for the intact protein and 100 000 [5] to 120 000 [1,6,7] for the subunit. Crystallographic studies also reveal three asymmetric units in the hexon [4,8], which is consistent with results from electron microscopy [9]. Crystallographic data, however, also suggest additional elements of partial two-fold symmetry [10] (and P.-E. Werner, personal communication) and an earlier preliminary chemical analysis indicated [11] that hexon might be hexameric with a subunit size considerably smaller than that estimated by physicochemical methods. In the present work, therefore, further analyses have been performed and the primary structure of the protein studied.

Abbreviations. Tos-PheCH2Cl, L-1-tosylamido-2-phenylethyl chloromethyl ketone; Tos-LysCH2Cl, 1-chloro-3-tosylamido-7-amino-2-heptanone; dansyl, 5-dimethylaminonaphthalene-1-sulphonyl.

Enzymes. Chymotrypsin (EC 3.4.21.1); trypsin (EC 3.4.21.4); thermolysin (EC 3.4.24.4); staphylococcal protease (EC 3.4.99).

Subunit sizes may be judged independently of physicochemical methods by determination of the number of certain unique residues in the protein. This method previously resolved uncertainties in the subunit arrangement of some dehydrogenases [12,13]. Cysteine/half-cystine residues in the hexon protein were therefore characterized in peptides by sequence analysis after modification with 14C-labeled iodoacetate or after labeling of the hexon in vivo with [35S]cysteine. The degree of identity between the subunits of the hexon or the existence of possible repetitive sequences which would influence the interpretation of crystallographically observed symmetries was also estimated. In addition, -SH titration of the native protein with chloromercurinitrophenol [14], exclusion chromatography under denaturing conditions and peptide mapping experiments were performed.

The results suggest that the hexon subunits are similar, probably identical, and the minimum molecular weight is determined. Peptides corresponding to more than one tenth of all residues are characterized. The N-terminal structure is previously known [15] and a probable protein C-terminal part is now identified.

MATERIALS AND METHODS

Purification of Hexon

Adenovirus type 2 was grown in spinner cultures of KB cells [16]. Virions and soluble antigens were obtained as previously described [17,18]. The fraction containing the soluble antigens was separated on DEAE-cellulose [19] and hexons were further purified by preparative polyacrylamide-gel electrophoresis [4], except for some samples for sequence analysis, which were purified by crystallization in order to increase the yields. Crystallization was repeated 3 times in 0.5 M sodium acetate buffer pH 4.6 at 4 °C; the crystals were redissolved in 0.1 M sodium phosphate buffer pH 7.0 and the protein precipitated with ammonium sulfate to 50% saturation in between each crystallization step. This method yielded a hexon preparation which was 95-98% pure based on polyacrylamide-gel electrophoresis with and without dodecylsulphate and produced peptide maps identical to those of the hexon preparations purified by polyacrylamide electrophoresis. [35S]Cysteine-labeled hexon was prepared from cells incubated in a medium containing 10% of the normal cystine concentration but with 1 µCi/ml [35S]cysteine. The isotope was added 16 h after infection and growth continued for an additional period of 24 h [15]. A similar procedure was used to label hexon with [14C]arginine in vivo.

Titration of an Accessible -SH Group with Organic Mercurial

Hexon samples prepared as described above were titrated directly or after pretreatment with dithioerythritol. Hexon prepared from buffers containing dithioerythritol (1 mM) in all solutions during purification was also analyzed after full or partial removal of added thiol excess by dialysis, as indicated. All titrations were performed with the protein (about 1 mg/ml in 3 ml) in 0.1 M triethanolamine, 10 mM NaCl, 1 mM EDTA, pH 7.9 with HCl [14]. 2-Chloromercuri-4-nitrophenol was added in a small volume (2-5 µl, containing 0.1-5 nmol mercurial/µl) directly to the cuvette and the increase in absorbance at 410 nm was measured. The -SH content was calculated from the curve of absorbance increase versus mercurial added [14].

Carboxymethylation

The freeze-dried non-radioactive protein was dissolved (10 mg/ml) in 6 M guanidine-HCl, 0.1 M Tris, 2 mM EDTA pH 8.1, and bubbled with nitrogen. Freshly prepared 0.5 M dithioerythritol solution was added (0.5 µl per mg hexon) and the sample flushed with nitrogen again. After reduction for 3 h at 37 °C, 50 mM iodo[2-14C]acetate solution (30 µl per µl thiol solution) was added in the same way. Alkylation was performed for 2 h at room temperature and reagents were removed by dialysis against 1 mM HCl.

The 35S-labeled protein was similarly carboxymethylated but with non-radioactive iodoacetate instead.
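The reagent volumes given above translate into molar amounts as sketched below. This Python fragment is an illustration only and is not part of the original procedure; the helper name `reagent_amounts` is hypothetical, and the factor of two thiol groups per dithioerythritol molecule is the only value not stated in the protocol.

```python
# Reagent amounts in the carboxymethylation step (values from the protocol above).
DTE_MOLARITY = 0.5                # mol/l, dithioerythritol stock solution
DTE_UL_PER_MG = 0.5               # µl of thiol solution per mg hexon
IODOACETATE_MOLARITY = 0.050      # mol/l, iodo[2-14C]acetate solution
IODOACETATE_UL_PER_UL_DTE = 30.0  # µl iodoacetate solution per µl thiol solution
THIOLS_PER_DTE = 2                # assumption: two -SH groups per dithioerythritol


def reagent_amounts(hexon_mg: float) -> dict:
    """Return reagent amounts in nmol for a given mass of hexon (hypothetical helper)."""
    dte_ul = DTE_UL_PER_MG * hexon_mg
    # µl x mol/l gives µmol; multiply by 1000 to express the amount in nmol
    dte_nmol = dte_ul * DTE_MOLARITY * 1000.0
    iodo_ul = IODOACETATE_UL_PER_UL_DTE * dte_ul
    iodo_nmol = iodo_ul * IODOACETATE_MOLARITY * 1000.0
    thiol_nmol = THIOLS_PER_DTE * dte_nmol
    return {
        "dte_nmol": dte_nmol,
        "iodoacetate_nmol": iodo_nmol,
        "iodoacetate_per_thiol": iodo_nmol / thiol_nmol,
    }


amounts = reagent_amounts(1.0)  # amounts per mg hexon
print(amounts)
```

Per mg of hexon this corresponds to 250 nmol dithioerythritol and 750 nmol iodoacetate, that is, a 1.5-fold molar excess of alkylating reagent over the added thiol groups.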

Performic-Acid Oxidation

Freeze-dried samples were dissolved in cold formic acid, performic acid was added (about 100 µl per mg protein) and oxidation performed as previously described [20]. After 4 h, three volumes of cold distilled water were added and reagents removed by dialysis against distilled water, followed by lyophilization.

Analysis of Protein N-Terminal Residues

Samples containing 1 mg hexon were analyzed for N-terminal groups with 35S-labeled phenylisothiocyanate [21]. Phenylthiohydantoins obtained were mixed with unlabeled markers, separated [22] by thin-layer chromatography or paper electrophoresis in 0.02 M potassium phosphate, pH 6.5, eluted with 95% ethanol and counted in a Beckman scintillation spectrometer.

Amino-Acid Analysis

Samples were hydrolyzed in vacuum at 110 °C with 6 M HCl containing 1‰ mercaptoethanol, and analyzed on a Beckman model 120 B amino-acid analyzer. Tryptophan was determined spectrophotometrically [23].

Chromatography under Dissociating Conditions

Exclusion chromatography was performed on a column of 4% agarose (Sepharose 4B, Pharmacia, Uppsala, Sweden) in 5 M guanidine-HCl, 0.01 M EDTA, 0.02 M LiCl, 0.05 M Tris, pH 8.0 [24,25]. Samples of [14C]carboxymethylated hexon, similarly modified β-galactosidase, bovine serum albumin and hen egg albumin, and colour markers of blue dextran 2000 (Pharmacia, Uppsala, Sweden) and dansyl-alanine were analyzed. Eluted radioactivity was counted in a Packard liquid scintillation spectrometer in Instagel (3 ml with about 50-µl sample).

Peptide Mapping

1-2 mg [14C]carboxymethylated hexon was digested with Tos-PheCH2Cl-trypsin (Worthington) in 0.1 M ammonium bicarbonate at 37 °C. An enzyme to substrate ratio of 1:100 was used and after 4 h the same amount of Tos-PheCH2Cl-trypsin was added again for a second incubation. Samples were then submitted to multidimensional peptide mapping on paper [12,26]. The first dimension was electrophoresis at pH 6.5. The second was electrophoresis at pH 3.5 for acidic peptides, chromatography in n-butanol-acetic acid-water-pyridine (15:3:12:10, by vol.) for basic peptides, and electrophoresis at pH 1.9 followed by chromatography as above for neutral peptides [26]. Dimensions were changed by stitching appropriate areas onto new papers [27] and resolutions were followed by autoradiography after all dimensions. Papers were finally stained with cadmium-ninhydrin [28] or with reagents specific for arginine and tryptophan [29].

Purification and Analysis of Peptides

Samples of [14C]carboxymethylated hexon were digested with Tos-PheCH2Cl-trypsin as described above, or, alternatively, with Tos-LysCH2Cl-chymotrypsin (Worthington) under similar conditions. Peptide mixtures in each case were fractionated on Sephadex G-50 followed by preparative electrophoresis at different pH and chromatography on paper. Peptides were revealed by autoradiography and by staining guide strips with the ninhydrin reagent. Pure peptides were eluted with water. Total compositions were determined after hydrolysis (see above) for 20-24 h, and N-terminal residues were obtained after dansylation [30]. For sequence analysis, peptides were analyzed by the dansyl-Edman method [30,31] and often redigested with proteolytic enzymes, as indicated. Dansyl derivatives were identified by thin-layer chromatography on polyamide sheets as previously described [32,33].

RESULTS

Amino-Acid Analysis

The amino-acid composition of the hexon was calculated from analysis of duplicate samples which were hydrolyzed for 20 and 72 h. Values for threonine and serine were extrapolated to zero time. For valine and isoleucine the 72-h values were taken to be the final figures. Other residues were determined as the mean values. The results obtained have been reported separately [35] but are given in Table 1, together with other previously reported compositions and the mean of all analyses.
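The extrapolation of threonine and serine values to zero hydrolysis time can be illustrated with two time points. The text does not state which extrapolation was applied, so the sketch below shows two common choices, a straight line and first-order decay, as assumptions; the function names are hypothetical and the numbers in the usage lines are synthetic, not taken from the paper.

```python
import math


def zero_time_linear(t1: float, y1: float, t2: float, y2: float) -> float:
    """Extrapolate to t = 0 assuming the recovery falls linearly with time."""
    slope = (y2 - y1) / (t2 - t1)
    return y1 - slope * t1


def zero_time_first_order(t1: float, y1: float, t2: float, y2: float) -> float:
    """Extrapolate to t = 0 assuming first-order destruction of the residue."""
    rate = math.log(y1 / y2) / (t2 - t1)  # apparent rate constant per hour
    return y1 * math.exp(rate * t1)


# Synthetic example: a residue with a true value of 10.0 that loses 0.05 per hour.
print(zero_time_linear(20.0, 9.0, 72.0, 6.4))

# Synthetic example: the same true value decaying with a rate constant of 0.002 per hour.
y20 = 10.0 * math.exp(-0.002 * 20.0)
y72 = 10.0 * math.exp(-0.002 * 72.0)
print(zero_time_first_order(20.0, y20, 72.0, y72))
```

For the slowly released valine and isoleucine no extrapolation is needed, since the 72-h value is taken directly.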

It is evident that especially in the case of the rare residues, different ratios have been reported. For the present investigation this is of little importance but the content of cysteine/half-cystine is critical. It was therefore now determined both as cysteic acid after oxidation and as carboxymethylcysteine after reduction and alkylation, and was in all cases related to each of the four stable residues aspartic acid, glutamic acid, glycine and alanine in order to get a more accurate value. Carboxymethylcysteine values obtained were on the average 6% lower than cysteic acid values, due to incomplete alkylation (below). The cysteic acid values were therefore used in the determination of the total composition (Table 1) and from six different determinations the content of cysteic acid in the oxidized protein was found to be 0.79 ± 0.05 mol/100 mol.

The minimum size of a subunit, containing 7 cysteine/half-cystine residues (below), would thus seem to comprise about 900 residues corresponding to a molecular weight of around 100 000 ± 7000 (Table 1).
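The step from the cysteic acid content to the minimum subunit size is simple proportionality, and can be checked with the short Python sketch below. It uses only values quoted in the text and in Table 1; treating the ± 0.05 spread as a simple upper and lower bound is an assumption of this sketch, and the function name is hypothetical.

```python
# Minimum subunit size from the cysteic acid content (values from the text and Table 1).
CYSTEIC_MOL_PERCENT = 0.79         # mol cysteic acid per 100 mol residues
CYSTEIC_UNCERTAINTY = 0.05         # stated spread of six determinations
CYS_PER_SUBUNIT = 7                # unique cysteine/half-cystine residues
MEAN_RESIDUE_WEIGHT = 99240 / 888  # totals of Table 1 (mol. wt / residues)


def residues_per_subunit(cys_count: int, mol_percent: float) -> float:
    """Chain length that holds `cys_count` residues at the given mol-% content."""
    return cys_count * 100.0 / mol_percent


best = residues_per_subunit(CYS_PER_SUBUNIT, CYSTEIC_MOL_PERCENT)
shortest = residues_per_subunit(CYS_PER_SUBUNIT, CYSTEIC_MOL_PERCENT + CYSTEIC_UNCERTAINTY)
longest = residues_per_subunit(CYS_PER_SUBUNIT, CYSTEIC_MOL_PERCENT - CYSTEIC_UNCERTAINTY)

for label, n in (("best", best), ("shortest", shortest), ("longest", longest)):
    print(f"{label:>8}: {n:6.0f} residues, mol. wt about {n * MEAN_RESIDUE_WEIGHT:8.0f}")
```

The best estimate is about 886 residues (bounds roughly 833 and 946), or a molecular weight near 99 000, in line with the 888 ± 60 residues listed in Table 1.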

Exclusion Chromatography in 5 M Guanidine-HCl

[14C]Carboxymethylated hexon (5 mg) eluted as a single peak when analyzed by exclusion chromatography under dissociating conditions. This is shown in Fig. 1, in which results with 14C-labeled marker proteins are also included.

In order to ascertain that the hexon peak was not composed of incompletely resolved components it was divided into 8 fractions after application of a sample containing 10 mg 14C-labeled hexon. Each fraction was analyzed by peptide mapping. All fractions produced identical peptide patterns and in particular, all peptides with [14C]carboxymethylated cysteine residues (see below) were present in all fractions. This suggests that the hexon protein is composed of subunits that are similar, probably identical in size. From the retention volumes a molecular weight in the order of 100 000 was calculated for the subunits.


Table 1. Amino-acid composition of adenovirus type-2 hexon
A value of 0.79 ± 0.05 mol/100 mol was used for cysteic acid for the calculation of values per 7 cysteic acids; n.d., not determined

                 Composition from (mol/100 mol)    Average        Residues/          Mol. wt/
Amino acid       [35]    [19]    [34]    [7]       (mol/100 mol)  7 cysteic acids    7 cysteic acids

Cysteic acid     0.8     n.d.    n.d.    n.d.      0.8              7.0      7          720
Aspartic acid    14.4    14.4    13.1    15.2      14.1           124.9    125       14 390
Threonine        7.0     7.3     7.5     7.2       7.2             63.8     64        6 470
Serine           6.8     7.2     7.8     6.9       7.1             62.9     63        5 490
Glutamic acid    9.5     9.7     9.9     10.7      9.9             87.7     88       11 360
Proline          6.1     6.5     5.6     6.1       6.0             53.2     53        5 150
Glycine          6.8     7.8     6.8     7.2       7.1             62.9     63        3 590
Alanine          7.1     7.5     7.7     7.9       7.5             66.5     67        4 760
Valine           5.7     5.4     6.0     5.5       5.6             49.6     50        4 960
Methionine       2.7     2.3     1.0     2.1       2.0             17.7     18        2 360
Isoleucine       3.6     3.4     3.8     3.6       3.6             31.9     32        3 620
Leucine          7.7     7.5     7.9     7.6       7.6             67.3     67        7 580
Tyrosine         5.6     5.0     5.9     5.9       5.5             48.7     49        8 000
Phenylalanine    4.4     4.3     4.8     4.9       4.6             40.8     41        6 030
Tryptophan       1.7 a   0.9     1.1     n.d.      1.2             10.6     11        2 050
Lysine           4.2     4.4     5.0     4.0       4.4             39.0     39        5 000
Histidine        1.4     1.7     1.8     1.1       1.5             13.3     13        1 780
Arginine         4.6     4.7     4.0     4.2       4.3             38.1     38        5 930

Total                                              100.0          888 ± 60           99 240 ± 6 500

a Determined spectrophotometrically [23].
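The totals of Table 1 can be re-derived from the integer residue counts. The Python sketch below is a consistency check only: the counts are copied from the table, whereas the residue masses are standard average values (amino acid minus water) supplied here as an assumption, with cysteine used for the cysteic acid row because the tabulated 720 matches the cysteine residue mass. Amides are counted as the corresponding acids, as in the acid hydrolysates.

```python
# Integer residue counts per 7 cysteic acids, copied from Table 1.
RESIDUES_PER_7_CYS = {
    "Cys": 7, "Asp": 125, "Thr": 64, "Ser": 63, "Glu": 88, "Pro": 53,
    "Gly": 63, "Ala": 67, "Val": 50, "Met": 18, "Ile": 32, "Leu": 67,
    "Tyr": 49, "Phe": 41, "Trp": 11, "Lys": 39, "His": 13, "Arg": 38,
}

# Assumption: standard average residue masses (amino acid minus water), in daltons.
RESIDUE_MASS = {
    "Cys": 103.14, "Asp": 115.09, "Thr": 101.10, "Ser": 87.08, "Glu": 129.12,
    "Pro": 97.12, "Gly": 57.05, "Ala": 71.08, "Val": 99.13, "Met": 131.19,
    "Ile": 113.16, "Leu": 113.16, "Tyr": 163.18, "Phe": 147.18, "Trp": 186.21,
    "Lys": 128.17, "His": 137.14, "Arg": 156.19,
}

total_residues = sum(RESIDUES_PER_7_CYS.values())
total_mass = sum(n * RESIDUE_MASS[aa] for aa, n in RESIDUES_PER_7_CYS.items())

print("residues per 7 cysteic acids:", total_residues)
print("summed residue mass:", round(total_mass))
```

The residue sum reproduces the tabulated 888, and the summed mass agrees with the tabulated 99 240 to within the rounding of the individual rows.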

Fig. 1. Exclusion chromatography of [14C]carboxymethylated hexon and marker proteins. Sepharose 4B (1.5 × 35 cm) in 5 M guanidine-HCl, 0.01 M EDTA, 0.02 M LiCl and 0.05 M Tris, pH 8.0. [14C]Carboxymethylated hexon (. . . . . .), bovine serum albumin (----), hen egg albumin (-.-.-) and a mixture of hexon and the two albumins (-). Arrows indicate peak positions of colour markers. β-Galactosidase eluted on this column slightly earlier than the hexon and peaked in fractions corresponding to 24.5 ml. Abscissa: volume (ml)


Table 2. Titration with 2-chloromercuri-4-nitrophenol of accessible -SH groups in different hexon preparations
Excess dithioerythritol in reduced samples was removed by dialysis against the triethanolamine buffer before measurement, and the remaining concentration was determined, independent of protein -SH groups [14], during titration

Concentration of dithioerythritol present during               Protein molecular weight
hexon          hexon reduction        -SH titration            per titratable -SH group
preparation    after preparation

0              -                      0                        430 000
0              -                      0                        620 000
0              -                      0                        790 000

1000           -                      4                        110 000
1000           -                      0                        120 000

0              250                    0                        110 000

Titration of an Accessible -SH Group

Three different samples of native hexon, prepared in the absence of added thiols, revealed almost no protein -SH groups titratable with 2-chloromercuri-4-nitrophenol, as shown in Table 2. When hexon was prepared with 1 mM dithioerythritol during all purification steps, higher titers of mercurial-binding protein -SH groups were detectable (Table 2) after dialysis at 4 °C for 2-5 days to remove or lower the dithioerythritol concentration. The same titer of protein -SH groups was also observed when native hexon, prepared in the absence of thiols, was reduced with 250 nmol dithioerythritol/mg protein for 2 h at 37 °C and then dialyzed as above to remove excess thiol.

These titrations indicate the presence of one type of protein -SH group preferentially accessible to mercurial binding in hexon. It is largely oxidized in ordinary preparations of the protein but apparently readily reduced with thiols (Table 2). The low -SH content decreases the accuracy of the titrations but a protein subunit containing one accessible -SH group would correspond to a molecular weight of about 110 000.
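The relation between the titration endpoint and the molecular weight per -SH group is a direct ratio of protein mass to moles of mercurial bound. The Python sketch below is a hedged illustration, not the procedure of [14]; the function names are hypothetical, and the 3 mg of protein per cuvette follows from the stated conditions (about 1 mg/ml in 3 ml).

```python
def sh_groups_nmol(protein_mg: float, mol_weight_per_sh: float) -> float:
    """nmol of titratable -SH expected in `protein_mg` of protein."""
    return protein_mg * 1e-3 / mol_weight_per_sh * 1e9


def mol_weight_per_sh(protein_mg: float, sh_nmol: float) -> float:
    """Protein molecular weight per titratable -SH group from a titration endpoint."""
    return protein_mg * 1e-3 / (sh_nmol * 1e-9)


CUVETTE_PROTEIN_MG = 3.0  # about 1 mg/ml in 3 ml, as in the Methods section

# Endpoints implied by three of the molecular weights listed in Table 2.
for mw in (110_000, 430_000, 790_000):
    print(mw, round(sh_groups_nmol(CUVETTE_PROTEIN_MG, mw), 1), "nmol -SH")
```

With one accessible -SH group per subunit of 110 000, a 3-mg sample binds only about 27 nmol of mercurial, which illustrates why the low -SH content limits the accuracy of the titration.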

Analysis of Hexon N-Terminal Residues

No labeled amino acids were obtained when three different samples of 1 mg each of hexon were analyzed by the micromodification of the Edman procedure using 35S-labeled phenylisothiocyanate.

In control experiments performed on the same molar amount of human fibrinogen, the expected tyrosine and alanine residues [36] were recovered in amounts 10- to 30-fold greater than those of any residues from the hexon preparation. This suggests that hexon lacks free N-terminal residues, supporting the conclusion that the previously detected acetyl-blocked N-terminus [15] containing the sequence Ac-Ala-Thr-Pro-Ser, which was recovered in a yield of about 50%, is derived from all hexon subunits.

Peptide Mapping

[14C]Carboxymethylated hexon was digested with Tos-PheCH2Cl-trypsin and the peptide mixture separated by multidimensional electrophoresis and chromatography as described in the Methods section. About 25 basic, 15 neutral and 15 acidic peptides were clearly identified and at least 10 more less intensely stained spots were noticed, giving a total of about 65 spots. The exact number was difficult to estimate, especially since many acidic peptides were poorly separated, weakly stained by ninhydrin or even ninhydrin-negative, but visualized by autoradiography or specific staining for arginine and tryptophan. This is common for large acidic peptides containing N-terminal hydrophobic residues but a variable strength of peptide spots may also be due to peptide bonds partially susceptible to hydrolysis. Both of these explanations were found to be true in the structural investigation (below), which also revealed that some tryptic peptides are insoluble and that several lysine and arginine bonds are completely resistant to attack by trypsin due to a following proline residue (below). Visible peptide spots are therefore likely to represent the minimum number of unique peptides. It is therefore concluded that the hexon protein contains at least 65 peptide bonds sensitive to trypsin, corresponding to the minimum number of lysine and arginine residues in the protein subunit.

When hexon prepared from virus cultures grown in a medium containing 14C-labeled arginine was similarly analyzed, a total of about 30 14C-labeled peptide spots were seen, again with some variation in intensity. Hence it is concluded that the protein subunit contains at least 30 arginine residues, or roughly half the number of lysine and arginine obtained above.
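The spot counts can be set against the lysine and arginine contents of Table 1 as a rough consistency check. The comparison below is an illustration added here, not an analysis made in the original text; it only restates numbers quoted above and in Table 1.

```python
# Peptide-map spot counts quoted in the text.
SPOTS = {"basic": 25, "neutral": 15, "acidic": 15, "faint": 10}
ARGININE_SPOTS = 30

# Residues per subunit (per 7 cysteic acids) from Table 1.
LYS_PER_SUBUNIT = 39
ARG_PER_SUBUNIT = 38

total_spots = sum(SPOTS.values())
cleavage_sites = LYS_PER_SUBUNIT + ARG_PER_SUBUNIT

print("tryptic spots:", total_spots, "of", cleavage_sites, "Lys + Arg residues")
print("fraction observed:", round(total_spots / cleavage_sites, 2))
print("arginine spots:", ARGININE_SPOTS, "of", ARG_PER_SUBUNIT, "Arg residues")
print("fraction observed:", round(ARGININE_SPOTS / ARG_PER_SUBUNIT, 2))
```

Both maps therefore account for roughly four fifths of the residues expected for a subunit of about 100 000, which is compatible with a single type of chain once insoluble peptides and trypsin-resistant bonds are allowed for.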

CHARACTERIZATION OF THE NUMBER OF CYSTEINE/HALF-CYSTINE RESIDUES

14C-Labeled Tryptic Peptides from [14C]Carboxymethylated Hexon

[14C]Carboxymethylated hexon in samples of 40-150 mg was digested with Tos-PheCH2Cl-trypsin and the peptides fractionated by chromatography on Sephadex G-50. The recovery of radioactivity from the column was about 95%. All radioactive peptides were purified on paper by repeated steps of electrophoresis and chromatography as indicated in the Methods section. The elution profile and electrophoretic patterns are shown in Fig. 2. A total of nine different radioactive peptides (6 acidic, 2 neutral, 1 basic) were obtained and are designated A to I in Fig. 2. Peptides B, F, G and H (Fig. 2) were of varying strength in different preparations, and B was not always detectable. Properties of all radioactive peptides are given in Table 3. It is evident that each peptide contains one carboxymethylcysteine residue.

Fig. 2. Gel filtration of a tryptic digest of [14C]carboxymethylated hexon. Sephadex G-50 fine (2.5 × 75 cm) in 1% ammonium bicarbonate. Radioactivity refers to samples of 20 µl from each tube after application of 40 mg digest. Eluted material was pooled into four fractions as indicated and submitted to electrophoresis at pH 6.5 and for peptide G/H also to electrophoresis at pH 1.9. Autoradiographs obtained are shown below the elution profile. Abscissa: fraction number

The dansyl-Edman method was used for sequence analysis of the intact peptides, and of smaller fragments, obtained by digestions with chymotrypsin or a staphylococcal protease preferentially cleaving at glutamyl bonds [37,38]. The results of sequence analysis are given in Table 4, in which secondarily produced fragments are also shown. Data for fragments which are necessary to establish the structures of the two large peptides A and B are given in Table 5, and for those from remaining peptides in Table 6. All other peptides obtained, indicated in Table 4, were also analyzed and the results agree with the structures deduced. Amide groups were determined from the electrophoretic mobilities at pH 6.5 [39] of peptides, directly and in some cases after and before Edman steps removing dicarboxylic residues.

The structures show that three of the nine peptides are not unique but are derived from three other fragments due to peptide bonds partially sensitive to tryptic cleavage. Thus, peptide I is a shorter segment of H (due to a lysyl bond split in most but not all substrate molecules), while peptides B and G are shorter fragments of peptides A and F, respectively (due to tyrosyl bonds partially sensitive to tryptic cleavage), as shown in Table 4. The nine peptides A-I thus represent six unique carboxymethylcysteine residues in the hexon protein.
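The bookkeeping behind the step from nine radioactive peptides to six unique residues can be written out explicitly. The mapping below restates only the relations given in the text (B from A, G from F, I from H); the variable names are hypothetical.

```python
# The nine radioactive tryptic peptides, designated A to I.
PEPTIDES = list("ABCDEFGHI")

# Shorter peptides and the parent peptide each one is derived from (as stated in the text).
FRAGMENT_OF = {"B": "A", "G": "F", "I": "H"}

# A peptide marks a unique cysteine only if it is not a fragment of another peptide.
unique = [p for p in PEPTIDES if p not in FRAGMENT_OF]

# Group every peptide under the parent that carries the same cysteine residue.
groups = {p: [p] for p in unique}
for fragment, parent in FRAGMENT_OF.items():
    groups[parent].append(fragment)

print(len(unique), "unique carboxymethylcysteine residues:", groups)
```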

Table 3. Data for 14C-labeled tryptic peptides recovered from [14C]carboxymethylated hexon
Compositions are calculated from values obtained after acid hydrolysis for 24 h, without corrections for impurities, destruction or incomplete liberation of residues. Recovery of peptides A, B and F-I is variable (see text)

Peptide                     A          B          C          D          E          F          G          H          I

No. of purification steps   3          3          5          4          3          5          5          5          5
Recovery (%)                10         4          19         23         21         9          3          3          13
Electrophoretic mobility
  at pH 6.5 [39]            0.38       0.57       0.16       0.39       0.64       -0.21      0          0          0.36

Amino-acid composition
Cm-Cysteine                 0.8 (1)    0.8 (1)    0.8 (1)    0.9 (1)    0.8 (1)    1.0 (1)    0.8 (1)    0.9 (1)    0.9 (1)
Aspartic acid               4.2 (4)    4.0 (4)    2.2 (2)    3.1 (3)    2.2 (2)    -          -          1.1 (1)    1.0 (1)
Threonine                   2.7 (3)    2.6 (3)    -          1.0 (1)    1.1 (1)    1.8 (2)    1.9 (2)    -          -
Serine                      0.4        0.3        -          1.2 (1)    2.1 (2)    0.9 (1)    0.8 (1)    0.4        -
Glutamic acid               4.3 (4)    3.3 (3)    -          1.9 (2)    3.8 (4)    -          -          -          -
Proline                     2.1 (2)    1.9 (2)    0.8 (1)    -          0.8 (1)    2.0 (2)    1.9 (2)    -          -
Glycine                     4.1 (4)    3.9 (4)    2.0 (2)    1.9 (2)    2.2 (2)    1.1 (1)    1.0 (1)    -          -
Alanine                     1.3 (1)    0.3        1.9 (2)    1.1 (1)    1.2 (1)    1.0 (1)    -          -          -
Valine                      1.1 (1)    0.9 (1)    2.0 (3) a  1.7 (2)    -          -          -          -          -
Isoleucine                  2.5 (4) a  1.8 (3) a  0.8 (1)    -          -          -          -          -          -
Leucine                     1.8 (2)    2.1 (2)    1.9 (2)    -          -          -          -          1.0 (1)    0.8 (1)
Tyrosine                    1.9 (2)    2.0 (2)    0.9 (1)    1.0 (1)    -          1.8 (2)    2.0 (2)    -          -
Phenylalanine               0.8 (1)    1.1 (1)    -          -          -          -          -          0.8 (1)    0.8 (1)
Histidine                   0.9 (1)    0.8 (1)    -          -          -          -          -          -          -
Lysine                      1.1 (1)    -          -          0.8 (1)    -          1.0 (1)    1.1 (1)    1.0 (1)    -
Methionine                  -          -          -          -          -          0.8 (1)    0.9 (1)    -          -
Arginine                    -          -          1.1 (1)    -          1.0 (1)    0.9 (1)    -          0.9 (1)    1.0 (1)

Total                       31         27         16         16         15         13         11         6          5

N-terminus                  Isoleucine Isoleucine Valine     Serine     Glycine    (Threonine) (Threonine) Lysine   Phenylalanine

a Sequence analysis reveals that peptides A and B contain an Ile-Ile bond and peptide C a Val-Val bond. Incomplete hydrolysis of these bonds explains the low recoveries of isoleucine and valine, respectively.
Recovery of N-terminal residue in peptides F and G is low (see text).

Peptides A, E and F presented special problems in sequence analysis. Thus, in spite of two tyrosine, one phenylalanine and two leucine residues in peptide A (Table 3) it was difficult to obtain small fragments for sequence analysis by digestion with chymotrypsin. In fact, peptide B (= Ac1) and Ac2 (Table 4) were the main products except after extensive treatment, when peptides Ac3-5 were also obtained. The presence of proline, carboxymethylcysteine and glycine at positions following leucine and aromatic residues in peptide A (Table 4) explains the decreased sensitivity towards chymotryptic digestion. The staphylococcal protease was therefore of great value to produce smaller fragments. Peptide E contains six dicarboxylic residues apart from carboxymethylcysteine and three of these occur consecutively (Table 4). The staphylococcal protease was also in this case useful due to its specificity for glutamyl bonds [37,38]. Peptide F (and G) failed to reveal a distinct N-terminus by the dansyl method, although threonine was obtained in low yield. Subtractive Edman degradation [40] also indicates that the N-terminal residue should be threonine, since this is the only residue lost after one cycle of degradation. The explanation of the unusual behaviour of this terminus is unknown but an alternative residue or side-chain substitution may not be excluded at this position. It may also be noticed that peptides F and G have a lysyl bond resistant to tryptic hydrolysis due to a following proline residue.
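The resistance of a lysyl or arginyl bond followed by proline can be illustrated with a simple in-silico digest. The Python sketch below is an idealized model, not a description of the experimental digest: it assumes complete cleavage at every susceptible bond, the function name is hypothetical, and the one-letter sequences are transcribed from Table 4 with C standing for carboxymethylcysteine.

```python
def tryptic_digest(sequence: str, proline_rule: bool = True) -> list:
    """Split a one-letter sequence after K or R, optionally sparing bonds before P."""
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last:
            if proline_rule and sequence[i + 1] == "P":
                continue  # Lys-Pro and Arg-Pro bonds are left intact
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


PEPTIDE_F = "TTPMKPCYGSYAR"  # (Thr)-Thr-Pro-Met-Lys-Pro-Cys(Cm)-(Tyr-Gly-Ser-Tyr)-Ala-Arg
PEPTIDE_H = "KFLCDR"         # Lys-Phe-Leu-Cys(Cm)-Asp-Arg

print(tryptic_digest(PEPTIDE_F))                      # internal Lys-Pro bond is spared
print(tryptic_digest(PEPTIDE_F, proline_rule=False))  # what a naive rule would predict
print(tryptic_digest(PEPTIDE_H))                      # N-terminal Lys released, leaving peptide I
```

With the proline rule peptide F stays in one piece, whereas peptide H is split into free lysine and the pentapeptide corresponding to peptide I, in agreement with the relations described in the text.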

The elution profile from the Sephadex column varied slightly with different preparations. Four elution profiles of tryptic fragments obtained from [‘“CI- carboxymethylated hexon digests are shown in Fig. 3, together with one curve from a 35S-labeled derivative (cf. below). All profiles, obtained from different preparations and column loads, are similar, with a minute peak (I in Fig. 3) preceeding a complex of three peaks (11, I11 and IV, Fig.3). The main difference involves the ratio of peaks I11 and IV. Similar electro- phoretic pictures (Fig. 2) were, however, obtained from all profiles, in spite of the relative differences in the heights of peaks I11 and IV. Additional radio- activity in fraction IV must therefore be distributed in several fragments, undetectable as distinct peptide


Table 4. Amino-acid sequences of [14C]carboxymethylated tryptic peptides and related secondary fragments
Peptides obtained by redigestions are denoted by addition of c, for chymotrypsin, and p, for staphylococcal protease, to the name of the parent peptide. Cleavages obtained by these enzymes are indicated by arrows, marked C and P, respectively, in which case solid-line arrows indicate high sensitivity towards proteolysis. - Indicates residue determined by the dansyl-Edman method, residues within parentheses were recovered in low yield. 4 Indicates recovery of this derivative even without hydrolysis after dansylation, proving that it is derived from the C-terminus

Peptides F and G and related peptides:
(Thr)-Thr-Pro-Met-Lys-Pro-Cys(Cm)-(Tyr-Gly-Ser-Tyr)-Ala-Arg
Fragments: Fc1 (= peptide G), Fc2

Peptides H and I and related peptides:
Lys-Phe-Leu-Cys(Cm)-Asp-Arg

bands. The relative differences in fractions III and IV reflect variable extents of proteolytic digestions. More extensive digestion yields a decrease in fraction III and increase in fraction IV, since the fragments produced are shorter. Noticeably, samples yielding a large fraction IV also give a small fraction I (below) and produce increased amounts of peptides B and G, indicating more drastic proteolytic digestions.

Peak I is a minor component (Fig. 3), containing considerably less radioactivity than that corresponding to one unique cysteine residue (fraction II, containing the cysteine derivative of peptides A and B) in the


Table 5. Data for fragments from peptide A
Total compositions were determined as in Table 3

Peptide Ac2 Ac4 Ac5 Ap1 Ap2 Ap3 Ap5

Electrophoretic mobility at pH 6.5 [39] -0.45 0.45 0.30 0.52 0.61 -0.33 0.37

Amino-acid composition: Cm-Cysteine, Aspartic acid, Threonine, Serine, Glutamic acid, Proline, Glycine, Alanine, Valine, Isoleucine, Leucine, Tyrosine, Phenylalanine, Lysine, Histidine

- -
0.9 (1) - -
- -
1.0 (1) - -
0.8 (1) - -
- - 1.2 (1) - -
0.8 (1)
1.6 (2) 1.2 (1)
0.4 - - - 1.2 (1) 3.3 (3)
1.0 (1)
1.1 (1)
- -
0.8 (1)
0.8 (1) 0.9 (1) - -
- -
1.2 (1) 1.9 (2)
- -
3.1 (3) 0.3 - 1.1 (1) 0.9 (1)
0.9 (1) - -
- -
- -
- -
0.8 (1) 4.0 (4) 1.9 (2)
- -
- - - - - - - -
1.0 (1) 2.2 (2) 2.1 (2)
1.1 (1)
- -
4.2 (4) - -
- - - -
- - 1.0 (2)a 0.9 (1)
1.8 (2) 0.8 (1) 0.8 (1)
- -
- -
- -
- - - -
1.0 (1) - -
- - 0.9 (1) - -
1.1 (1)
1.1 (1)
- - - -
- -
0.9 (1) - -
1.0 (1) - -
0.9 (1) - -
0.8 (1) 2.3 (2) 1.0 (1)
2.1 (2)
0.4 - 0.3 -
2.8 (3)
0.8 (1) 0.9 (1) 1.7 (2) 0.9 (1) 0.8 (1)
- -
- -
- -

Total 4 13 9 3 22 6 15

N-terminus Glutamine Cm-Cysteine Glycine Isoleucine Aspartic acid Threonine Leucine

a Sequence analysis reveals the peptide to contain an Ile-Ile bond, explaining the low recovery of isoleucine.

Table 6. Data for fragments from peptides C-F
Total compositions were determined as in Table 3

Peptide         Cc1       Cc2       Dc1       Dc2       Dc3       Ec1       Ec2       Ep1       Ep2       Ep3       Fc2

Electrophoretic mobility at pH 6.5 [39]
                0.47      -0.36     0.66      0         -0.48     0.58      0.61      0.70      0.33      0         -0.70

Amino-acid composition
Cm-Cysteine     0.9 (1)   -         -         0.8 (1)   -         0.8 (1)   -         0.8 (1)   -         -         -
Aspartic acid   1.1 (1)   1.0 (1)   1.2 (1)   2.1 (2)   -         1.0 (1)   1.1 (1)   1.2 (1)   1.2 (1)   1.1 (1)   -
Threonine       1.2 (1)   0.9 (1)   -         1.0 (1)   -         1.0 (1)   -         -         -         -         -
Serine          0.3       -         1.0 (1)   0.3       -         1.3 (1)   1.2 (1)   1.0 (1)   1.2 (1)   1.0 (1)   -
Glutamic acid   -         -         1.1 (1)   1.2 (1)   -         1.0 (1)   2.9 (3)   2.1 (2)   2.0 (2)   -         -
Proline         0.8 (1)   -         0.8 (1)   -         -         -         0.9 (1)   -         -         -         -
Glycine         1.1 (1)   1.2 (1)   1.9 (2)   -         -         1.0 (1)   1.2 (1)   1.0 (1)   1.1 (1)   1.0 (1)   -
Alanine         1.0 (1)   1.1 (1)   -         1.2 (1)   -         0.9 (1)   -         0.9 (1)   -         -         1.1 (1)
Valine          2.0 (3)a  -         0.9 (1)   1.0 (1)   -         -         -         -         -         -         -

Methionine - - 0.9 (1) - - - - - - - - - - - - - - - - - - Isoleucine 0.9 (1) 0.9 (1) - - - - - - - - - - - - - - - - - - Leucine

0.9 (1) - - - - - - - - - - - - - - - -

0.9 (1) 1.0 (1) - - - - - - - - - - - - Lysine Arginine - - 0.9 (1) - - - - - - - - 0.9 (1) - - 0.9 (1) 0.9 (1) 1.0 (1)

0.8 (1) 0.8 (1) - - - - - - - - - - - - - - - - - -

Tyrosine 0.8 (1) - - - - - - - -

Total 10 6 7 9 3 7 8 8 7 4 2

N-terminus: Valine, Isoleucine, Serine, Asparagine, Methionine, Glycine, Glutamic acid, Glycine, Glutamine, Aspartic acid, Alanine

a Sequence analysis reveals the peptide to contain a Val-Val bond, explaining the low recovery of valine.


Fig. 3. Elution profiles from gel filtrations of five different tryptic digests of carboxymethylated hexon. Sephadex G-50 (fine) in 1% ammonium bicarbonate. Different column loads and sizes. Radioactivity and volume scales reduced to comparative relative sizes in each case. Recovery of radioactivity from each column is indicated. A, B, D and E: tryptic digests of ordinary hexon carboxymethylated with iodo[2-14C]acetate. (C) Tryptic digest of hexon labeled in vivo with [35S]cysteine and carboxymethylated with non-radioactive iodoacetate

protein. Furthermore, digestion with chymotrypsin of material from fraction I, followed by peptide mapping, reveals several weak radioactive peptide spots with mobilities similar to those obtained from chymotryptic digests of the whole hexon (see below). Fraction I is therefore concluded not to contain a unique cysteine derivative but to contain large incompletely digested material. Consistently, extensive digestions produce a small fraction III and also a small peak in fraction I.

The characterized tryptic peptides, representing six unique cysteine derivatives (Table 4), thus account for the total number of carboxymethylcysteine residues in the soluble 14C-labeled tryptic peptides. In order to estimate the total number of cysteine/half-cystine residues in the protein the existence of insoluble peptides and the extent of [14C]carboxymethylation must be evaluated. Alternative digests of [14C]carboxymethylated hexon were therefore studied in order to avoid uncertainties due to insoluble peptides. Hexon grown in the presence of 35S-labeled cysteine was also analyzed in order to avoid ambiguities due to incomplete carboxymethylation.

Fig. 4. Gel filtration of a chymotrypsin digest of [14C]carboxymethylated hexon. Sephadex G-50 (fine, 5 x 100 cm) in 1% ammonium bicarbonate. Radioactivity (solid line) and absorbance at 280 nm (dotted line). Axes: effluent (ml, 800-1800) and fraction number (1-9)

Chymotryptic Peptides from [14C]Carboxymethylated Hexon

Chymotrypsin and thermolysin both produced completely soluble digests. In each case, however, many peptide bonds were partially sensitive, and complex peptide mixtures containing large numbers of peptides in variable recovery were obtained. The chymotryptic digest was fractionated by chromatography on Sephadex G-50 with complete recovery of radioactivity. The elution profile is shown in Fig. 4. Due to the large number of peptides, all radioactive fragments were not obtained completely free of contaminants after purification. All those recovered could, however, except in one case, be related to different fragments containing the six carboxymethylcysteine residues (Table 4) characterized from the tryptic digests.

The chymotryptic radioactive peptide incompatible with previously known structures was recovered in fraction 6 in Fig. 4, and data for this peptide (CHY 1) are shown in Table 7. Another peptide (CHY 2) in the same fraction had identical electrophoretic mobility at neutral pH but was considerably more acidic at pH 3.5. Data for this peptide are also included in Table 7. CHY 1 and 2 are obviously derived from the same region of the protein. They differ only in the cysteine derivative which is recovered partly as carboxymethylcysteine and partly as unsubstituted but oxidized cysteic acid, respectively (about equal amounts of each form).


Table 7. Data for two chymotryptic and one tryptic peptide
The chymotryptic peptides CHY 1 and 2 contain a cysteine derivative absent in peptides A-I. The tryptic peptide TRY is likely to contain the protein C-terminus. Compositions were determined as in Table 3 and sequences as in Table 4

Peptide                                   CHY 1          CHY 2          TRY
No. of purification steps                 3              3              3
Recovery (%)                              11             8              51
Electrophoretic mobility at pH 6.5 [39]   0.66           0.66           0

Amino-acid composition
Cysteic acid                              -              0.7 (1)        -
Cm-Cysteine                               1.0 (1)        -              -
Aspartic acid                             1.1 (1)        1.1 (1)        1.1 (1)
Threonine                                 -              -              2.8 (3)
Serine                                    1.2 (1)        1.2 (1)        1.0 (1)
Proline                                   -              -              1.1 (1)
Glycine                                   0.3            -              1.1 (1)
Alanine                                   -              -              2.0 (2)
Isoleucine                                0.7 (1)        0.8 (1)        -
Leucine                                   0.9 (1)        1.1 (1)        -
Tyrosine                                  1.0 (1)        0.8 (1)        -
Phenylalanine                             -              -              0.8 (1)

Total                                     6              6              10
N-terminus                                Aspartic acid  Aspartic acid  Threonine

Sequence
CHY 1: Asp-Ser-Ile-Cys(Cm)-Leu-Tyr
CHY 2: Asp-Ser-Ile-Cys(O3H)-Leu-Tyr
TRY: Thr-Pro-Phe-Ser-Ala-Gly-Asn-Ala-Thr-Thr

This partly labeled residue constitutes a seventh unique cysteine derivative in the hexon protein. It is probably inside a large insoluble tryptic fragment, explaining that it was not detected in the fractionated tryptic peptides (Fig. 2). The partial 14C label is also consistent with the fact that almost all radioactivity of the tryptic digest was recovered after the Sephadex fractionation (Fig. 2) although the seventh residue was not revealed.

Analysis of [35S]Cysteine-Labeled Hexon

[35S]Cysteine-labeled hexon (1.5 mg) was mixed with unlabeled hexon (50 mg) and carboxymethylated with non-radioactive iodoacetate. Trypsin was used for proteolytic digestion since tryptic peptide mixtures were least complex (above).

The elution profile of the carboxymethylated 35S-labeled peptide mixture is shown in Fig. 3C. It is similar to the other curves, obtained from [14C]carboxymethylated samples. Electrophoretic separations of the fractions from the [35S]cysteine-labeled hexon digest yielded autoradiographic pictures identical to those from [14C]carboxymethylated digests (Fig. 2). Peptides A to I were all prepared in the 35S-labeled form and no other radioactive bands present in more than trace amounts were seen. Noticeably, peak I (Fig. 3) is also a minor component in the [35S]cysteine-labeled digest.

The precipitate from the tryptic digest of the [35S]cysteine-labeled hexon was digested with chymotrypsin. Electrophoretic separations then revealed the presence of one comparatively strong and several weaker bands. The former had electrophoretic mobilities identical to peptide CHY 1 and therefore confirms the conclusion that a cysteine derivative is inside an insoluble tryptic peptide. The presence of several weak bands suggests that the insoluble tryptic fraction is also composed of partly undigested material containing remaining cysteine residues.

[35S]Cysteine-labeled hexon, therefore, confirms the conclusion that soluble tryptic hexon peptides only contain six unique cysteine derivatives and that a seventh such residue is present in the insoluble core material after tryptic digestion.

Protein C-Terminus

During work with the 14C or 35S-labeled hexon digests many non-radioactive peptides were also isolated. One of these, obtained from tryptic digests of the hexon (present in fractions 2 and 3, Fig. 2) probably contains the protein C-terminus, since it lacks a basic residue and the C-terminus is threonine. The recovery and the amino-acid sequence of this peptide are shown in Table 7 (peptide TRY). Threonyl bonds are usually not cleaved by trypsin and all other purified tryptic peptides had an expected C-terminal arginine or lysine residue or alternatively tyrosine from chymotryptic-like cleavages. Peptide TRY, furthermore, was recovered in an exceptionally high yield. Its properties therefore suggest that it constitutes the C-terminal part of the protein subunits but attempts to verify this by hydrazinolysis of intact hexon were unsuccessful, since no definitive results were obtained.

DISCUSSION

NATURE OF THE HEXON SUBUNITS

Homogeneity

Only one type of subunit was detected by exclusion chromatography of [14C]carboxymethylated hexon under dissociating conditions. Mapping of tryptic peptides obtained from adjacent fractions of the eluate gave identical peptide patterns after ninhydrin staining or autoradiography. DEAE-chromatography in 8 M urea of [14C]carboxymethylated hexon also produced apparently identical peptide patterns from adjacent fractions although a broad elution profile was obtained (not shown) as in a similar analysis of another modified protein [13]. Structural analysis revealed that long regions around each of the labeled carboxymethylcysteine residues were completely homogeneous. The amino-acid sequences of these regions together account for over one tenth of all residues in the subunits. Only one N-terminal peptide previously characterized [15], and another probable C-terminal peptide (TRY, Table 7) were identified and recovered in excellent yield. It is concluded that hexon subunits are highly similar and probably identical, which is consistent with previous data from gel electrophoresis [1,6,7], electron microscopy [9], sedimentation [6,11] and crystallographic [4,5,8] analysis.

Size

The subunit size may be estimated from the number of unique cysteine derivatives in the hexon protein. Nine peptides, containing six different carboxymethylcysteine residues (Table 4), were reproducibly detected in tryptic digests of reduced and [14C]carboxymethylated hexon (Fig. 2 and 3). The same tryptic peptides were also detected from hexon which was labeled in vivo, by [35S]cysteine, establishing that no peptides in the 14C-labeled digests were overlooked due to incomplete carboxymethylation.

A seventh unique cysteine derivative was detected in chymotryptic digests of the [14C]carboxymethylated protein. The cysteine residue in this peptide was, however, only partly carboxymethylated, producing two similar peptides with carboxymethylcysteine and cysteic acid, respectively (CHY 1 and 2, Table 7). It is probably an insoluble core peptide in the tryptic digest since chymotryptic digestion of this precipitate yielded CHY 1. The latter peptide was also obtained in the 35S-labeled form in chymotryptic digests of hexon labeled in vivo, as well as in chymotryptic digests of the tryptic core material from this hexon derivative.

The presence of a seventh labeled residue in a large partly insoluble tryptic peptide is compatible with the incomplete recovery of radioactivity on gel filtration (Fig. 3). The size of this loss is insignificant since it is possible that part of the insoluble tryptic core may also have been cleaved at less sensitive bonds producing smaller fragments in low yield. These fragments would be recovered on gel filtration but not distinguished from background radioactivity in the purification.

Partial enzymatic cleavages and incomplete 14C labeling therefore complicate determinations of peptides with carboxymethylcysteine. The occurrence of an eighth cysteine derivative in the hexon protein (corresponding to a subunit size of 114000, cf. Table 1) can therefore not be excluded. The analysis, however, of tryptic and chymotryptic peptides from [14C]carboxymethylated as well as from [35S]cysteine-labeled hexon revealed no evidence for additional cysteine derivatives.

Seven unique cysteine derivatives correspond to a subunit molecular weight of around 100000 (Table 1). This value is consistent with the results of exclusion chromatography of carboxymethylated hexon under dissociating conditions (Fig. 1), and with the titration in the intact protein of a -SH group, accessible to chloromercurinitrophenol (Table 2). Peptide mapping is a less accurate method of evaluating the subunit size, mainly since many peptide bonds are partly resistant to enzymatic hydrolysis due to particular sequence patterns (see below). This fact probably explains why a previous chemical analysis [11] suggested a considerably smaller subunit size. The size reported in the present study is also in excellent agreement with crystallographically determined values (100000 [5]) and in the same range as values reported from gel electrophoresis and ultracentrifugation (about 120000 [1,6,7]). The chemical analyses of the present investigation thus now reveal a subunit size in the same order of magnitude as that obtained by physicochemical methods, 100000-120000.
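The arithmetic behind this estimate can be sketched as follows. Table 1 is not reproduced here, so the protein weight per cysteine residue is an assumption back-calculated from the figure of 114000 quoted above for eight residues; the constant and function names are illustrative only.

```python
# Assumption: weight of protein per cysteine residue, derived from the
# quoted value of 114000 for eight cysteine derivatives (not from Table 1).
MW_PER_CYSTEINE = 114_000 / 8


def minimum_subunit_mw(n_unique_cysteines):
    """Minimum subunit weight implied by a count of unique cysteine residues."""
    return n_unique_cysteines * MW_PER_CYSTEINE


for n in (7, 8):
    print(n, round(minimum_subunit_mw(n)))
```

Under this assumption seven residues give 99750, in line with the value of about 100000 stated in the text.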

DETAILS OF SUBUNIT STRUCTURE

The possible existence of duplicated regions or other special structures in the hexon protein was estimated by sequence analysis and peptide mapping experiments.

A feature of the hexon structure that seems unusual is the distribution of residues affecting the sensitivity of the protein towards enzymatic hydrolysis. For example, peptide A has two leucine, two tyrosine and one phenylalanine residue (Table 3). In spite of this, only one bond is completely susceptible to chymotryptic hydrolysis (Table 4). This is due to the presence of adjacent proline or other restricting residues at remaining positions which otherwise would be sensitive to cleavage. Furthermore, tryptic peptides contain one trypsin-insensitive Lys-Pro sequence (peptides F/G, Table 4). In fact, half of the proline residues in the peptides shown in Table 4 block the enzymatic hydrolysis of peptide bonds that would otherwise have been split by the enzymes used. From the structures of remaining peptides analyzed, this seems to be a general feature, which leads to certain sequence patterns around many proline residues. These similarities are, however, irregular and very short and are not suggested to indicate any duplicated regions. Instead, they probably reveal an evolution of hexon towards a structure comparatively insensitive to proteolytic enzymes. This must be functionally advantageous for a structural protein.

The carboxymethylated peptides (Tables 4 and 7) contain more than 10% of the residues in the entire hexon subunit and, together with other partially characterized peptides, more than one quarter of the total number of residues in the protein are accounted for. In spite of this, no regions with clearly related amino-acid sequences have been detected. Except for short regions of special structure (cf. above) sequences are widely different. In particular, no two cysteine/half-cystine residues show any evidence of occurring in related regions. A total of 75-80 tryptic peptides is expected from the content of lysine and arginine in the protein (Table 1), if no identical sequences occur, but only about 65 peptide spots were detected. This underestimate of the sum of different lysine and arginine residues is probably due to the presence of peptides which are insoluble, contain more than one lysine/arginine due to the distribution of proline residues, are incompletely resolved (e.g. peptides A and D, Fig. 2, are not resolved in peptide mapping) or are ninhydrin-negative (e.g. peptides A/B and C, due to the N-terminal hydrophobic residues, Table 4).
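The comparison between expected and detected tryptic peptides can be sketched numerically. The counting rule below (one peptide per lysine or arginine, plus a C-terminal peptide without a basic residue, minus bonds blocked by a following proline) is an assumption made for illustration; the lysine and arginine contents of Table 1 are not reproduced here, so only the quoted figures of 75-80 and 65 are used.

```python
def expected_tryptic_peptides(n_lys, n_arg, n_blocked=0):
    """Peptide count for complete cleavage, assuming a non-basic C-terminus."""
    return n_lys + n_arg - n_blocked + 1


def detected_fraction(observed, expected):
    """Share of the expected peptides seen as spots on the peptide map."""
    return observed / expected


# Figures quoted in the text: about 65 spots against 75-80 expected peptides.
low = detected_fraction(65, 80)
high = detected_fraction(65, 75)
print(f"{low:.2f}-{high:.2f} of the expected peptides detected")
```

Under these assumptions roughly four fifths of the expected peptides appear as distinct spots, which leaves room for the insoluble, unresolved and ninhydrin-negative peptides listed above.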

It is concluded from both sequence analysis and peptide mapping that extensive regions of identical structures do not occur in the hexon protein, and that no closely homologous regions have yet been identified.

STRUCTURAL DETERMINATIONS OF THE HEXON POLYPEPTIDE

The general features of the hexon subunit are now characterized and conclusions for further structural work may be obtained. It is evident that many segments are often recovered in several peptides due to partial resistance towards attack by proteolytic enzymes. Extensive digestion, on the other hand, reduces the specificity of the enzymes used, as illustrated in this work by these secondary tryptic cleavages of peptides A (to produce B) and F (to produce G) at unexpected peptide bonds. Due to the multiplicity of peptides and the large size of the protein, some fragments are difficult to purify and the yields are low. It may be noticed that in some tryptic digests the recovery of all peptides, and especially of the least abundant ones, was considerably lower than that given in Table 3.

For this reason specific radioactive labeling is especially useful in structural work on the hexon. Labels in vivo seem to offer many possibilities, since metabolic interconversions during labeling of the viral protein are obviously rare. Thus, no evidence was detected in the present work for 35S label in methionine residues of the hexon prepared from a culture containing [35S]cysteine, as expected since methionine is an essential amino acid. Previous attempts to label with [14C]acetate in vivo [15] also yielded a highly selective label of the terminal acetyl group. It should therefore be possible to prepare other specifically labeled derivatives. Labeling of methionine residues would then seem useful, since this residue occurs at low frequency in the protein (Table 1) and since knowledge of all sequences around methionine residues will permit ordering of fragments obtained by treatment of the protein with cyanogen bromide.

The authors are indebted to Drs A.-C. Ryden and R. Trentham for generous gifts of staphylococcal protease and chloromercurinitrophenol, respectively. Excellent technical assistance by Miss J. Soderling and Mrs E. Hjertson is gratefully acknowledged. This work was supported by a grant from the Swedish Cancer Society.

REFERENCES

1. Maizel, J. V., Jr, White, D. O. & Scharff, M. D. (1968) Virology, 36, 115-125 and 126-136.
2. Everitt, E., Sundquist, B., Pettersson, U. & Philipson, L. (1973) Virology, 52, 130-147.
3. Pereira, H. G., Valentine, R. C. & Russell, W. C. (1968) Nature (Lond.) 219, 946-947.
4. Franklin, R. M., Pettersson, U., Akervall, K., Strandberg, B. & Philipson, L. (1971) J. Mol. Biol. 57, 383-395.
5. Cornick, G., Sigler, P. B. & Ginsberg, H. S. (1973) J. Mol. Biol. 73, 533-537.
6. Horwitz, M. S., Maizel, J. V., Jr & Scharff, M. D. (1970) J. Virol. 6, 569-571.
7. Laver, W. G. (1970) Virology, 41, 488-500.
8. Cornick, G., Sigler, P. B. & Ginsberg, H. S. (1971) J. Mol. Biol. 57, 397-401.
9. Crowther, R. A. & Franklin, R. M. (1972) J. Mol. Biol. 68, 181-184.
10. Franklin, R. M., Harrison, S. C., Pettersson, U., Brändén, C. I., Werner, P. E. & Philipson, L. (1971) Cold Spring Harbor Symp. Quant. Biol. 36, 503-510.
11. Pettersson, U. (1970) Structural Proteins of Adenoviruses (Diss.), Uppsala, Sweden.
12. Harris, J. I. & Perham, R. N. (1965) J. Mol. Biol. 13, 876-884.
13. Jornvall, H. & Harris, J. I. (1970) Eur. J. Biochem. 13, 565-576.
14. McMurray, C. H. & Trentham, D. R. (1969) Biochem. J. 115, 913-921.
15. Jornvall, H., Ohlsson, H. & Philipson, L. (1974) Biochem. Biophys. Res. Commun. 56, 304-310.
16. Pettersson, U. & Hoglund, S. (1969) Virology, 39, 90-106.
17. Green, M. & Piña, M. (1963) Virology, 20, 199-207.
18. Lonberg-Holm, K. & Philipson, L. (1969) J. Virol. 4, 323-338.
19. Pettersson, U., Philipson, L. & Hoglund, S. (1967) Virology, 33, 575-590.
20. Hirs, C. H. W. (1956) J. Biol. Chem. 219, 611-621.
21. Laver, W. G., Suriano, R. J. & Green, M. (1967) J. Virol. 1, 723-728.
22. Edman, P. & Sjoquist, J. (1956) Acta Chem. Scand. 10, 1507-1509.
23. Edelhoch, H. (1967) Biochemistry, 6, 1948-1954.
24. Davison, P. F. (1968) Science (Wash. D.C.) 161, 906-907.
25. Fish, W. W., Mann, K. G. & Tanford, C. (1969) J. Biol. Chem. 244, 4989-4994.
26. Jornvall, H. (1970) Eur. J. Biochem. 16, 41-49.
27. Brown, J. R. & Hartley, B. S. (1966) Biochem. J. 101, 214-228.
28. Heilmann, J., Barrollier, J. & Watzke, E. (1957) Hoppe-Seyler's Z. Physiol. Chem. 309, 219-220.
29. Ambler, R. P. (1963) Biochem. J. 89, 349-378.
30. Gray, W. R. & Hartley, B. S. (1963) Biochem. J. 89, 379-380 and 59P.
31. Gray, W. R. (1967) Methods Enzymol. 11, 469-475 and (1972) Methods Enzymol. 25, 333-344.
32. Hartley, B. S. (1970) Biochem. J. 119, 805-822.
33. Jornvall, H. (1970) Eur. J. Biochem. 14, 521-534.
34. Boulanger, P. A., Flamencourt, P. & Biserte, G. (1969) Eur. J. Biochem. 10, 116-131.
35. Pettersson, U. (1971) Virology, 43, 123-136.
36. Blomback, B. & Yamashina, I. (1958) Arkiv Kemi, 12, 299-319.
37. Houmard, J. & Drapeau, G. R. (1972) Proc. Natl. Acad. Sci. U.S.A. 69, 3506-3509.
38. Ryden, A.-C., Ryden, L. & Philipson, L. (1974) Eur. J. Biochem. 44, 105-114.
39. Offord, R. E. (1966) Nature (Lond.) 211, 591-593.
40. Konigsberg, W. (1972) Methods Enzymol. 25, 326-332.

H. Jornvall, Kemiska Institutionen I, Karolinska Institutet, Solnavagen 1, S-10403 Stockholm 60, Sweden

U. Pettersson and L. Philipson, Mikrobiologiska Institutionen, Wallenberglaboratoriet, Dag Hammerskjolds Vag 21, S-75237 Uppsala, Sweden
