Structural characterization of a recombinant CD4-IgG hybrid molecule



Eur. J. Biochem. 194, 611-620 (1990) © FEBS 1990

Structural characterization of a recombinant CD4-IgG hybrid molecule

Reed J. HARRIS, Karen L. WAGNER and Michael W. SPELLMAN

Department of Medicinal and Analytical Chemistry, Genentech, Inc., South San Francisco, CA, USA

(Received March 39/August 23, 1990) - EJB 90 0303

CD4-IgG is a homodimer of a hybrid polypeptide consisting of the two amino-terminal domains (residues 1-180) of human CD4 fused to the hinge region and the second and third constant-sequence (CH2 and CH3) Fc domains (residues 216-441) of human immunoglobulin G (IgG-1). This antibody-like molecule, termed an immunoadhesin, was produced in an effort to combine the binding specificity of CD4 with several potentially desirable properties of IgG molecules [Capon et al. (1989) Nature 337, 525-531]. The structural characteristics of the molecule have been evaluated to demonstrate that CD4-IgG has the same features as the N-terminal region of soluble CD4, while retaining those expected for the Fc portion of human IgG. Identification of peptides recovered from the tryptic map confirmed 98.8% of the expected structure of CD4-IgG. The detection of glucosamine in peptides containing Asn257 and the retention time shift of this tryptic peptide after deglycosylation confirmed the presence of Asn-linked oligosaccharides at this position. Four pairs of intrachain and two interchain disulfide bonds were also established.

An important event in the pathogenesis of acquired immune deficiency syndrome (AIDS) is the binding of the human immunodeficiency virus type 1 (HIV-1) envelope 120-kDa glycoprotein, gp120, to cells bearing the CD4 surface antigen [1, 2]. This interaction can lead to the destruction of HIV-infected cells by syncytia formation [3, 4] or to attack of cells that have bound and/or processed gp120 by the immune system [5]. There are many strains of HIV-1 with differing gp120 moieties, but all must retain the CD4-binding ability to be infective [6]. The gp120-binding region of CD4 has been localized to the amino-terminal domain that has similarity to the variable domains of immunoglobulins [7-10]. This binding requires the presence of the disulfide bond in this domain [11].

By deletion of the transmembrane and cytoplasmic domains, soluble recombinant forms of CD4 have been constructed which are able to block HIV-1 infectivity in vitro [12-16]. Hybrid polypeptides wherein the amino-terminal gp120-binding region of CD4 has replaced the variable regions of immunoglobulin heavy chains can also inhibit HIV-1 infectivity [17, 18]. These novel hybrid molecules, termed immunoadhesins, also possess Fc binding properties, in vivo pharmacokinetic properties that are superior to soluble CD4, and other clinically useful IgG functions [17].

We sought to confirm the primary structure and to identify the post-translational modifications of a recombinant CD4-IgG hybrid molecule to determine whether or not it retains the appropriate structural features of its constituent CD4 and IgG domains. Soluble CD4 was extensively characterized [19, 20] including the identification of the disulfide bonds in the two N-terminal variable-like domains. The disulfide bonds of the constant-sequence (CH) domains of human IgG have been established [21]. Asn-linked glycosylation of the CH2 domain of the IgG-1 heavy chain was also found to be functionally important [22]. The carboxy-terminal lysine residue of human IgG-1 heavy chain expected from the cDNA sequence was not recovered, suggesting that it is removed post-translationally [23].

This study describes the structural characterization of a CD4-IgG molecule that is purified from stably transfected Chinese hamster ovary cells. CD4-IgG is a homodimer of polypeptide subunits that consist of residues 1-180 of human CD4 fused to residues 216-441 (the Fc domain) of human IgG-1. Each subunit has 10 cysteine residues and one potential Asn-linked glycosylation site.

Correspondence to R. J. Harris, Genentech, Inc., 460 Point San Bruno Blvd., South San Francisco, CA 94080 USA

Abbreviations. HIV-1, human immunodeficiency virus type 1; FAB-MS, fast-atom-bombardment mass spectrometry; RCM, reduced and S-carboxymethylated; AIDS, acquired immune deficiency syndrome.

Enzymes. Trypsin (EC 3.4.21.4); Staphylococcus aureus V8 protease (EC 3.4.21.19); peptide:N-glycosidase (EC 3.5.1.52).

Note. The novel amino acid sequence data published here have been deposited with the EMBL sequence data bank.

EXPERIMENTAL PROCEDURES

Materials

CD4-IgG was secreted from transfected Chinese hamster ovary cells and purified as described in [17].

N- and C-terminal analyses

An aliquot of CD4-IgG was subjected to Edman degradation using an Applied Biosystems model 477A/120A protein sequencer. For C-terminal analysis, CNBr cleavage was used to generate a C-terminal fragment which was purified by reversed-phase HPLC. Desalted CD4-IgG was lyophilized, then reconstituted in 70% (by vol.) formic acid. CNBr (Pierce) was added at a 3:2 (by mass) ratio and the sample was incubated for 8 h under nitrogen at room temperature. A second aliquot was added for overnight incubation under the same conditions. After evaporation under a nitrogen stream, the sample was reconstituted again in 70% (by vol.) formic acid.

A portion of the sample was subjected to HPLC purification. Amino acid analyses of peak fractions were performed using a Beckman model 6300 system with ninhydrin detection. Fast-atom-bombardment mass spectrometry (FAB-MS) was also performed on aliquots of some of the fractions using a Jeol HX110HF/HX110HF system.

Fig. 1. Diagram of the overall structure of CD4-IgG. C-S-S-C are disulfide bonds between cysteine residues. M refers to methionine. N refers to the asparagine-linked glycosylation site. CD4-1 and CD4-2 are the domains from human CD4, while CH2 and CH3 are the constant sequence domains from the Fc portion of human IgG

  1 Lys-Lys-Val-Val-Leu-Gly-Lys-Lys-Gly-Asp-Thr-Val-Glu-Leu-Thr-
 16 Cys-Thr-Ala-Ser-Gln-Lys-Lys-Ser-Ile-Gln-Phe-His-Trp-Lys-Asn-
 31 Ser-Asn-Gln-Ile-Lys-Ile-Leu-Gly-Asn-Gln-Gly-Ser-Phe-Leu-Thr-
 46 Lys-Gly-Pro-Ser-Lys-Leu-Asn-Asp-Arg-Ala-Asp-Ser-Arg-Arg-Ser-
 61 Leu-Trp-Asp-Gln-Gly-Asn-Phe-Pro-Leu-Ile-Ile-Lys-Asn-Leu-Lys-
 76 Ile-Glu-Asp-Ser-Asp-Thr-Tyr-Ile-Cys-Glu-Val-Glu-Asp-Gln-Lys-
 91 Glu-Glu-Val-Gln-Leu-Leu-Val-Phe-Gly-Leu-Thr-Ala-Asn-Ser-Asp-
106 Thr-His-Leu-Leu-Gln-Gly-Gln-Ser-Leu-Thr-Leu-Thr-Leu-Glu-Ser-
121 Pro-Pro-Gly-Ser-Ser-Pro-Ser-Val-Gln-Cys-Arg-Ser-Pro-Arg-Gly-
136 Lys-Asn-Ile-Gln-Gly-Gly-Lys-Thr-Leu-Ser-Val-Ser-Gln-Leu-Glu-
151 Leu-Gln-Asp-Ser-Gly-Thr-Trp-Thr-Cys-Thr-Val-Leu-Gln-Asn-Gln-
166 Lys-Lys-Val-Glu-Phe-Lys-Ile-Asp-Ile-Val-Val-Leu-Ala-Phe-Gln-
181 Asp-Lys-Thr-His-Thr-Cys-Pro-Pro-Cys-Pro-Ala-Pro-Glu-Leu-Leu-
196 Gly-Gly-Pro-Ser-Val-Phe-Leu-Phe-Pro-Pro-Lys-Pro-Lys-Asp-Thr-
211 Leu-Met-Ile-Ser-Arg-Thr-Pro-Glu-Val-Thr-Cys-Val-Val-Val-Asp-
226 Val-Ser-His-Glu-Asp-Pro-Glu-Val-Lys-Phe-Asn-Trp-Tyr-Val-Asp-
241 Gly-Val-Glu-Val-His-Asn-Ala-Lys-Thr-Lys-Pro-Arg-Glu-Glu-Gln-
256 Tyr-Asn-Ser-Thr-Tyr-Arg-Val-Val-Ser-Val-Leu-Thr-Val-Leu-His-
271 Gln-Asp-Trp-Leu-Asn-Gly-Lys-Glu-Tyr-Lys-Cys-Lys-Val-Ser-Asn-
286 Lys-Ala-Leu-Pro-Ala-Pro-Ile-Glu-Lys-Thr-Ile-Ser-Lys-Ala-Lys-
301 Gly-Gln-Pro-Arg-Glu-Pro-Gln-Val-Tyr-Thr-Leu-Pro-Pro-Ser-Arg-
316 Glu-Glu-Met-Thr-Lys-Asn-Gln-Val-Ser-Leu-Thr-Cys-Leu-Val-Lys-
331 Gly-Phe-Tyr-Pro-Ser-Asp-Ile-Ala-Val-Glu-Trp-Glu-Ser-Asn-Gly-
346 Gln-Pro-Glu-Asn-Asn-Tyr-Lys-Thr-Thr-Pro-Pro-Val-Leu-Asp-Ser-
361 Asp-Gly-Ser-Phe-Phe-Leu-Tyr-Ser-Lys-Leu-Thr-Val-Asp-Lys-Ser-
376 Arg-Trp-Gln-Gln-Gly-Asn-Val-Phe-Ser-Cys-Ser-Val-Met-His-Glu-
391 Ala-Leu-His-Asn-His-Tyr-Thr-Gln-Lys-Ser-Leu-Ser-Leu-Ser-Pro-Gly

Fig. 2. The expected amino acid sequence of CD4-IgG. Cysteine residues are in bold, while the glycosylation site is indicated by italics

Tryptic mapping

CD4-IgG was reduced and S-carboxymethylated (RCM), then digested with trypsin as described previously for rCD4 [19]. An aliquot of RCM CD4-IgG was deglycosylated prior to tryptic digestion by incubation for 8 h at 37°C in 250 mM sodium phosphate, 5 mM EDTA, pH 8.6, with peptide:N-glycosidase F (Genzyme) at 12.5 U/mg protein, followed by an equivalent aliquot added for an additional 16 h of treatment.

The digest mixture was separated by reversed-phase HPLC (Fig. 4); aliquots of peak fractions were subjected to hydrolysis in 6 M HCl at 110°C in vacuo for 2 h for amino sugar or 24 h for amino acid analysis. Aliquots of many peak fractions were also subjected to N-terminal sequence analysis.

Disulfide bond assignments

A sample of CD4-IgG was digested with trypsin (treated with tosylphenylalanine chloromethane, 1:100, by mass) for 2 h in 100 mM NH4HCO3, then second and third aliquots were added after 2 h and 4 h, respectively. After a total of 6 h, the trypsin was inactivated by boiling the sample. S. aureus V8 protease (Pierce) was added at a mass ratio of 1:50 for digestion at 37°C. Additional aliquots (1:50, by mass) were added after 2 h and 4 h. After a total of 6 h of V8 protease digestion, the sample was frozen.

An aliquot of this trypsin/V8 digest was incubated with 25 mM dithiothreitol for 4 h at 37°C to allow comparison of the HPLC maps of reduced and nonreduced samples. The peaks that were susceptible to reduction were collected and identified by amino acid analysis and FAB-MS. This technique was used for identifying the CD4 domain and hinge-region disulfide-bonded fragments.

To identify the Fc-domain disulfides, the CNBr fragments containing the CH2 and the CH2 + CH3 regions (CNBr peaks 12 and 14, respectively) were subjected to trypsin digestion. Aliquots of these digestions were also incubated in dithiothreitol to allow the characterizations of the disulfide-containing HPLC fractions by amino acid analysis and FAB-MS.

RESULTS AND DISCUSSION

N-terminal sequence analysis

The amino-terminal sequence of CD4-IgG was determined by 10 cycles of automated Edman degradation. The observed sequence was that predicted for this molecule (Lys-Lys-Val-Val-Leu-Gly-Lys-Lys-Gly-Asp-) after the removal of the CD4 signal peptide. No N-terminal heterogeneity or internal sequences were observed (data not shown).

Fig. 3. Reversed-phase HPLC profile of CNBr cleavage fragments. A Vydac C4 column (4.6 x 250 mm) was equilibrated at 40°C with 0.1% trifluoroacetic acid; 6 min after sample injection, a gradient of 0-50% solvent B (0.1% trifluoroacetic acid in acetonitrile) in 100 min was developed by a Hewlett-Packard 1090M system at 1.0 mL/min. Values were determined as absorbance at 214 nm (× 10³)

Fig. 4. Tryptic map of RCM CD4-IgG. A Vydac C18 column (4.6 x 250 mm) was equilibrated at 35°C with 0.1% trifluoroacetic acid; 3 min after sample injection, a gradient of 0-40% solvent B (0.1% trifluoroacetic acid in acetonitrile) in 80 min was developed by a Hewlett-Packard 1090M system at 1.0 mL/min. Values were measured as absorbance at 214 nm (× 10³)
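The gradients described in the captions of Figs 3 and 4 can be turned into a nominal solvent composition at any retention time. The sketch below is only an illustration: the function name and dictionary keys are hypothetical, and it assumes an ideal hold followed by a linear ramp, ignoring instrument dwell volume and column dead time.

```python
def percent_b(t_min, hold_min, b_start, b_end, ramp_min):
    """Nominal %B at time t_min for a hold followed by a linear ramp.

    Idealized: no dwell volume, no dead time (assumptions, not from the paper).
    """
    if t_min <= hold_min:
        return b_start
    fraction = min((t_min - hold_min) / ramp_min, 1.0)
    return b_start + fraction * (b_end - b_start)


# Gradient parameters taken from the figure captions.
FIG3 = dict(hold_min=6, b_start=0.0, b_end=50.0, ramp_min=100)  # CNBr map
FIG4 = dict(hold_min=3, b_start=0.0, b_end=40.0, ramp_min=80)   # tryptic map

# Nominal composition at 49 min under the Fig. 4 conditions.
print(percent_b(49, **FIG4))
```

Such a value is a rough guide to relative hydrophobicity only; it says nothing about peak identity.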

C-terminal analysis

CNBr cleavage was used to liberate the C-terminal fragment(s) resulting from cleavage after Met388. CNBr cleavage generated several fragments that were purified by reversed-phase HPLC as shown in Fig. 3. Translation of the cDNA for CD4-IgG predicted a C-terminal CNBr fragment with the sequence His-Glu-Ala-Leu-His-Asn-His-Tyr-Thr-Gln-Lys-Ser-Leu-Ser-Leu-Ser-Pro-Gly-Lys (residues 389-407). However, the collected peaks labeled 2-9 all have the amino acid composition of residues 389-406. FAB-MS of the main peaks showed that the peptide heterogeneity in this set was the result of formylation at the N-terminus of the peptide and/or at the side chain of Lys399 (data not shown) during the incubation in 70% formic acid. No peptides containing Lys407 were found, nor were fragments representing any C-termini other than Gly406 observed. These results demonstrate that the C-terminal lysine residue predicted from the cDNA sequence for CD4-IgG has been completely removed. This C-terminal processing is the same as has been observed for human IgG [23].

Fig. 5. Detail of native and deglycosylated RCM CD4-IgG tryptic maps. The HPLC conditions and peak numbers are the same as for Fig. 4. (A) RCM CD4-IgG tryptic map. (B) Deglycosylated RCM CD4-IgG tryptic map
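The comparison between the cDNA-predicted and the observed C-terminal CNBr fragment can be sketched in a few lines. The snippet uses only the tail of the sequence (residues 376-406 from Fig. 2, plus the predicted Lys407); the function and variable names are hypothetical, and CNBr is modeled simply as cleavage after the last methionine.

```python
# Residues 376-407 as predicted from the cDNA (one-letter code).
TAIL_START = 376
TAIL = "RWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK"


def cnbr_c_terminal(seq, first_residue):
    """Return (start, end, peptide) for the fragment after the last Met."""
    last_met = seq.rindex("M")
    start = first_residue + last_met + 1
    peptide = seq[last_met + 1:]
    return start, start + len(peptide) - 1, peptide


predicted = cnbr_c_terminal(TAIL, TAIL_START)       # with Lys407
observed = cnbr_c_terminal(TAIL[:-1], TAIL_START)   # C-terminal Lys removed

print(predicted)
print(observed)
```

The second tuple corresponds to the composition found in peaks 2-9, ending at Gly406.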

Tryptic mapping

CD4-IgG was reduced and S-carboxymethylated to allow complete tryptic digestion of the molecule. The reversed-phase HPLC elution profile of trypsin-digested RCM CD4-IgG is shown in Fig. 4. Peak fractions were collected manually and subjected to amino acid analysis for peptide identifications, which are summarized in Table 1. Many peptide identifications were confirmed by N-terminal sequence analysis; these are indicated by the asterisks in Table 1. All of the predicted tryptic fragments larger than two residues were found by these techniques. The dipeptides T19 (Gly-Lys) and T47 (Ser-Arg) as well as the N-terminal Lys residue did not resolve from the salt peak at the beginning of the HPLC chromatogram. These results confirmed 401 of the 406 residues (98.8%) of the polypeptide sequence.
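The peptide numbering and the coverage figure quoted above can be reproduced with a simple in-silico digest. This is a minimal sketch, not the authors' procedure: the sequence is a one-letter transcription of Fig. 2, the helper name is hypothetical, and trypsin is assumed to cleave after every Lys or Arg with no proline exception, which is the rule that yields the T1-T49 numbering.

```python
import re

# One-letter transcription of the 406-residue sequence in Fig. 2 (15 per row).
SEQ = "".join([
    "KKVVLGKKGDTVELT", "CTASQKKSIQFHWKN", "SNQIKILGNQGSFLT",
    "KGPSKLNDRADSRRS", "LWDQGNFPLIIKNLK", "IEDSDTYICEVEDQK",
    "EEVQLLVFGLTANSD", "THLLQGQSLTLTLES", "PPGSSPSVQCRSPRG",
    "KNIQGGKTLSVSQLE", "LQDSGTWTCTVLQNQ", "KKVEFKIDIVVLAFQ",
    "DKTHTCPPCPAPELL", "GGPSVFLFPPKPKDT", "LMISRTPEVTCVVVD",
    "VSHEDPEVKFNWYVD", "GVEVHNAKTKPREEQ", "YNSTYRVVSVLTVLH",
    "QDWLNGKEYKCKVSN", "KALPAPIEKTISKAK", "GQPREPQVYTLPPSR",
    "EEMTKNQVSLTCLVK", "GFYPSDIAVEWESNG", "QPENNYKTTPPVLDS",
    "DGSFFLYSKLTVDKS", "RWQQGNVFSCSVMHE", "ALHNHYTQKSLSLSPG",
])


def tryptic_peptides(seq):
    """Split after every K or R; the C-terminal remainder is the last peptide."""
    return re.findall(r"[^KR]*[KR]|[^KR]+$", seq)


peptides = tryptic_peptides(SEQ)            # peptides[0] is T1, and so on
not_resolved = [peptides[0], peptides[18], peptides[46]]  # T1, T19, T47
missing = sum(len(p) for p in not_resolved)
coverage = 100.0 * (len(SEQ) - missing) / len(SEQ)

print(len(peptides), not_resolved, round(coverage, 1))
```

Single-residue peptides other than T1 (for example T2 or T13) are counted as confirmed because they appear in Table 1 inside incompletely cleaved peptides such as T2+3 and T13+14.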

Carbohydrate heterogeneity resulted in multiple peaks for the glycopeptide containing the Asn-linked glycosylation site (T32), which appeared in peaks 18, 19 and 20. The presence of glucosamine, which is a component of N-linked oligosaccharides, in hydrolysates of these three peaks (4.1, 3.0 and 2.9 mol/mol peptide for peaks 18, 19 and 20, respectively) demonstrates the glycosylation of this peptide. No non-glycosylated form of T32 was found, thus Asn257 is fully glycosylated. Deglycosylation of RCM CD4-IgG prior to trypsin digestion caused peaks 18, 19 and 20 to condense into a single peak (peak N1 of Fig. 5) with a higher retention time than the glycosylated forms, as would be expected after the removal of these hydrophilic structures. Galactosamine, which is a component of O-linked oligosaccharides, was not recovered from the 2-h hydrolysates of CD4-IgG. Therefore, no O-linked glycosylation is present.

Additional heterogeneity was introduced by the S-carboxymethylation procedure. The peptides that contain -Asn-Gly- sequences (T33 and T44) undergo some deamidation via a cyclic imide under the high pH required for S-carboxymethylation [24]. The imide form hydrolyzes rapidly to either -Asp-Gly- or a -β-Asp-Gly- (isoaspartate) form. The isoaspartyl forms of these peptides elute just ahead of their respective unmodified forms, while the deamidated (-Asp-Gly-) forms elute just after them. The T33 peptide shows all three forms (peaks 53, 54 and 55) while the T44 peptide shows only the isoaspartyl and unmodified forms (peaks 43 and 44). A small amount of S-alkylation of the methionine residues in T27 and T48 generated extra peaks for these peptides (peaks 20 and 37, respectively).
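The sequence-level statements in the preceding paragraphs (one Asn-linked glycosylation site, two -Asn-Gly- sites prone to deamidation, ten cysteines per subunit) can be checked with a short motif scan. The sketch assumes the one-letter transcription of Fig. 2 given below and the usual N-X-S/T sequon rule with X not Pro; variable names are illustrative.

```python
import re

# One-letter transcription of the 406-residue sequence in Fig. 2 (15 per row).
SEQ = "".join([
    "KKVVLGKKGDTVELT", "CTASQKKSIQFHWKN", "SNQIKILGNQGSFLT",
    "KGPSKLNDRADSRRS", "LWDQGNFPLIIKNLK", "IEDSDTYICEVEDQK",
    "EEVQLLVFGLTANSD", "THLLQGQSLTLTLES", "PPGSSPSVQCRSPRG",
    "KNIQGGKTLSVSQLE", "LQDSGTWTCTVLQNQ", "KKVEFKIDIVVLAFQ",
    "DKTHTCPPCPAPELL", "GGPSVFLFPPKPKDT", "LMISRTPEVTCVVVD",
    "VSHEDPEVKFNWYVD", "GVEVHNAKTKPREEQ", "YNSTYRVVSVLTVLH",
    "QDWLNGKEYKCKVSN", "KALPAPIEKTISKAK", "GQPREPQVYTLPPSR",
    "EEMTKNQVSLTCLVK", "GFYPSDIAVEWESNG", "QPENNYKTTPPVLDS",
    "DGSFFLYSKLTVDKS", "RWQQGNVFSCSVMHE", "ALHNHYTQKSLSLSPG",
])

# 1-based positions of Asn in N-X-S/T sequons (lookahead allows overlaps).
sequons = [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", SEQ)]

# 1-based positions of Asn in -Asn-Gly- sequences.
asn_gly = [m.start() + 1 for m in re.finditer(r"NG", SEQ)]

# 1-based positions of all cysteine residues.
cysteines = [i + 1 for i, aa in enumerate(SEQ) if aa == "C"]

print(sequons, asn_gly, cysteines)
```

The two -Asn-Gly- positions fall inside T33 (residues 262-277) and T44 (residues 331-352), matching the peptides named in the text.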

The C-terminal tryptic peptide was recovered in good yield (73%) in peak 30. No other C-terminal fragments were observed, which is consistent with the CNBr cleavage results described above. The peptide containing the junction of the CD4 and Fc domains (T24) was recovered in peak 47. N-terminal sequence analysis of this peptide confirmed the expected sequence at this junction site. Peak 56 contains trace amounts of a mixture of incompletely digested fragments. A

Table 1. Tryptic map peak identifications. Peptides were identified by comparison with the expected peptide amino acid compositions. Peaks labeled with asterisks (*) were confirmed by N-terminal sequence analysis. The starting material consisted of 4.7 nmol trypsin-digested RCM CD4-IgG. C refers to carboxymethylcysteine

Peak Amount Peptide Residues Identification

1

2

3

4

5

6

7

8

9

10

11

12

13

*14

15

*16

*17

18

19

20

21

22

*23

-

4.25

3.57

3.57

2.55

3.87

4.35

3.60

0.72

1.53

4.38

0.54

4.41

0.30

2.70

0.72

4.29

3.15

1.59

3.90

0.90

0.54

0.04

1.32

2.07

1.02

2.94

3.60

0.45

0.41

-

-

T12

T35

T36

T10

T18

T30+31

T40

T40

T34

T39+40

T15

T15

T38

T11

T20

T11

T38

T8

T42

T35+36

T48c

T32

T32

T32

T46

T3

T23

T3

-

-

55-58

281-282

283-286

47-50

132-134

249-252

301-304

301-304

278-280

299-304

73-75

73-75

295-298

51-54

137-142

51-54

295-298

30-35

316-320

281-286

377-381

253-261

253-261

253-261

370-374

3-7

168-171

3-7

salt

mixture of free K, R and dipeptides

ADSR

CK

VSNK

GPSK

SPR

TKPR

GQPR

GQPR

EYK

AKGQPR

NLK

NLK

TISK

LNDR

NIQGGK

LNDR

TISK

NSNQIK

EEMTK

CKVSNK

WQQGN

EEQYNSTYR (glycosylated at N)

EEQYNSTYR

EEQYNSTYR

LTVDK

VVLGK

VEFK

VVLGK

Table 1. (Continuation)

Peak Amount Peptide Residues Identification

38

39

40

41

42

*43

*44

45

46

*47

48

49

31

32

33

*34

*35

36

*37

*24 0.56

0.28

25 4.47

*26 1.35

0.68

27 2.82

28 0.57

29 2.04

30 2.13

3.39

3.24

0.57

0.78

0.30

0.38

0.19

3.21

0.45

0.27

3.93

1.80

2.67

3.21

2.31

0.09

0.23

2.07

0.99

1.71

2.19

0.45

0.90

T2+3

T41c

T22+23

T4+5

T27

T4+5

T5

T37

T37

T49

T27

T41

T41+42

T9+10

T9c

T6+7

T9

T9

T48

T7

T43

T29

T28

T48

T48

T44

T44

T45

T21

T24

T13+14

T45a

2-7

305-309

167-171

8-21

209-215

8-21

9-21

287-294

287-294

400-406

209-215

305-315

305-320

36-49

36-45

22-29

36-46

36-46

377-399

23-29

321-330

235-248

216-234

377-399

377-399

331-352

331-352

353-369

143-166

172-182

59-72

353-367

KVVLGK

EPQVY

KVEFK

KGDTVELTCTASQK

DTLMISR

KGDTVELTCTASQK

GDTVELTCTASQK

ALPAPIEK

ALPAPIEK

SLSLSPG (C-terminal)

DTLMISR

EPQVYTLPPSR

EPQVYTLPPSREEMTK

ILGNQGSFLTKGPSK

ILGNQGSFLT

KSIQFHWK

ILGNQGSFLTK

ILGNQGSFLTK

WQQGNVFSCSVMHEALHNHYTQK

SIQFHWK

NQVSLTCLVK

FNWYVDGVEVHNAK

TPEVTCVVVDVSHEDPEVK

WQQGNVFSCSVMHEALHNHYTQK

WQQGNVFSCSVMHEALHNHYTQK

GFYPSDIAVEWESNGQPENNYK (blocked at -NG-)

GFYPSDIAVEWESNGQPENNYK (-Asn-Gly- peptide)

TTPPVLDSDGSFFLYSK

TLSVSQLELQDSGTWTCTVLQNQK

IDIVVLAFQDK (junction of CD4/IgG at Q-D)

RSLWDQGNFPLIIK

TTPPVLDSDGSFFLY

Table 1. (Continuation)

Peak  Amount  Peptide  Residues  Identification
50    3.27    T14      60-72     SLWDQGNFPLIIK
51    3.78    T25+26   183-208   THTCPPCPAPELLGGPSVFLFPPKPK
52    0.24    T33+34   262-280   VVSVLTVLHQDWLNGKEYK
*53   0.51    T33      262-277   VVSVLTVLHQDWLNGK (blocked at -NG-)
*54   0.93    T33      262-277   VVSVLTVLHQDWLNGK (-Asn-Gly- peptide)
*55   0.42    T33      262-277   VVSVLTVLHQDWLDGK (deamidated -Asp-Gly- peptide)
*56   0.14    T39-?    299-?     AKGQPREPQVYTLPPSREEMTKNQVS-?
      0.10    T40-?    301-?     GQPREPQVYTLPPSREEMTKNQVSLT-?
      0.03    T21-?    143-?     TLSVSQLELQDSG-?
      0.02    T33-?    262-?     VVSVLTVLHQ-?
57    0.90    T16+17   76-131    IEDSDTYICEVEDQKEEVQLLVFGLTANSDTHLLQGQSLTLTLESPPGSSPSVQCR

Fig. 6. Reversed-phase HPLC profile of trypsin/V8 digested CD4-IgG. The HPLC conditions are the same as for Fig. 4. (A) Trypsin/V8 digest mixture after reduction with dithiothreitol. (B) Non-reduced trypsin/V8 digest mixture

minor amount of chymotrypsin-like cleavage was observed (e.g. T41c in Table 1). All of the peptides recovered were consistent with the sequence given in Fig. 2.

Disulfide bond assignments

The CD4-domain and interchain (hinge-region) disulfide bonds were assigned by digestion of CD4-IgG with trypsin followed by S. aureus V8 protease. Three peaks (33, 38 and 50 in Fig. 6) contained peptides that were susceptible to reduction with dithiothreitol. Peak 38 contains the fragment (14-21)-S-S-(76-85), while peak 50 contains the fragment (120-131)-S-S-(151-166). These disulfide bonds (between Cys16 and Cys84 and between Cys130 and Cys159) were confirmed by FAB-MS analysis which showed the presence of ionized masses (MH+) that are consistent with the presence of such bonds in each fragment (Table 2). These assignments are the same as for soluble CD4 [19, 20] and its homologs from mice and sheep [25].

Fig. 7. Reversed-phase HPLC profiles of tryptic digests of CNBr fragments. The HPLC conditions are the same as for Fig. 4. (A) Tryptic digest of fragment from CNBr peak 12 (residues 213-318). (B) Tryptic digest of fragment from CNBr peak 14 (residues 213-388)

The fragment in peak 33 (Fig. 6) contains a disulfide-bonded dimer of (183-193). FAB-MS analysis of this sample showed an MH+ that is consistent with a peptide dimer containing two interchain disulfide bonds (Table 2). The orientation of these disulfides, between identical residues on each subunit (Cys186 to Cys186 and Cys189 to Cys189), was demonstrated by the recoveries of bis(phenylthiohydantoin)-cystine [26] at both cycles 4 and 7 during N-terminal sequence analysis of this peptide dimer (data not shown). Had these disulfide bonds been Cys186 to Cys189, then bis(phenylthiohydantoin)-cystine would have been observed only at cycle 7. The yield of this fragment was low, presumably due to resistance to proteolysis caused by the extraordinary number of proline residues in the vicinity of these two disulfide bonds.
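The reasoning about sequencing cycles can be made explicit with a toy model. It assumes, as the argument above implies, that both chains of the dimer are degraded in parallel and that a cystine is only released as bis(phenylthiohydantoin)-cystine once both of its half-cystines have been cleaved from their chains. All names are hypothetical.

```python
PEPTIDE = "THTCPPCPAPE"   # residues 183-193, one chain of the hinge dimer
FIRST_RESIDUE = 183


def edman_cycle(residue_number):
    """Edman cycle at which a given residue of the peptide is released."""
    return residue_number - FIRST_RESIDUE + 1


def cystine_cycles(bonds):
    """Cycles at which bis-PTH-cystine would appear.

    bonds: (residue on chain A, residue on chain B) pairs joined by a
    disulfide. Each cystine is released at the later of its two cycles.
    """
    return sorted({max(edman_cycle(a), edman_cycle(b)) for a, b in bonds})


parallel = cystine_cycles([(186, 186), (189, 189)])  # observed arrangement
crossed = cystine_cycles([(186, 189), (189, 186)])   # rejected alternative

print(parallel, crossed)
```

Under this model only the parallel arrangement yields a cystine signal at cycle 4, which is the observation used to assign the bonds.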

The Fc-domain intrachain disulfide bonds were assigned after tryptic digestion of CD4-IgG CNBr fragments. CNBr peak 12 (Fig. 3) contains the CH2 domain fragment (residues 213-318), while CNBr peak 14 contains the CH2 and CH3 domains (residues 213-388). Amino acid analysis and FAB-MS of aliquots of peaks collected from HPLC mapping of these digests (Fig. 7, Table 2) demonstrated that peaks CN12C and CN14K each have fragments containing disulfide bonds in patterns identical to those found for human IgG [21]. A peak that elutes at 49 min in the digests of both CNBr peaks 12 and 14 (Fig. 7) contains the T28-S-S-T35 fragment. Peak CN12C gave an MH+ of 2330.6, demonstrating the presence of the disulfide bond between Cys221 and Cys281 in the CH2 domain. A peak that eluted at approximately 59 min (peak CN14K of Fig. 7) contains the T43-S-S-T48 fragment and gave an MH+ of 2458.9, demonstrating the presence of a disulfide bond between Cys327 and Cys385 in the CH3 domain. The disulfide bonding pattern of this molecule is summarized in Fig. 1.

The CD4-IgG molecule presented us with technical challenges due to its size and chimeric nature. We sought to demonstrate that the structure and post-translational modifications of the CD4 region are similar to those found in soluble CD4, and that the Fc domains correspond to those found in human IgG.

In this study, the CD4-IgG molecule was found to have the correct N-terminus without any proteolysis of the polypeptide subunits. Virtually all of the expected amino acid sequence was confirmed by the recovery of tryptic peptides from the HPLC map of S-carboxymethylated CD4-IgG. The C-terminal lysine residue was found to be removed, as is the case for human IgG [23], leaving homogeneous C-termini. The complete glycosylation of the potential N-linked site in the CH2 domain (at Asn257) was demonstrated. The intrachain disulfide bonding pattern is the same as for the two N-terminal domains of soluble CD4, while the CH2 and CH3 domains have disulfide bonds that are identical to human IgG-1. Two interchain disulfide bonds were also observed in the hinge region of the molecule.

This CD4-IgG molecule as well as the immunoadhesin molecules described previously [17] have the desired properties of being able to bind gp120 and thus block HIV-1 infectivity in vitro, while retaining the clinically useful features of an immunoglobulin, such as a longer plasma half-life and Fc-receptor binding. The knowledge that such hybrid molecules also possess the appropriate overall structures in their component domains makes them attractive candidates for the treatment of AIDS.

We thank Dr Steven Chamow and David Peers for providing the CD4-IgG, Dr Patrick Griffin and James Bourell for performing FAB-MS analyses, Dr Timothy Gregory for helpful discussions and Carol Morita for preparing Fig. 1.


Table 2. Amino acid compositions and FAB-MS results for disulfide-containing fragments
Residues/molecule peptide are given. MH+ are the average protonated masses

              Trypsin/V8 peptides in peaks     CNBr/trypsin peptides in peaks
Amino acid    TV8-33    TV8-38    TV8-50       CN12C     CN14K
Asx           0.9       2.1(2)    2.0(2)       2.5(2)    2.0(2)
Thr           2.1(2)    2.9(3)    2.7(3)       2.1(2)    1.0(1)
Ser           0.3       1.8(2)    4.2(5)       1.0(1)    2.5(3)
Glx           1.2(1)    3.0(3)    4.0(4)       3.1(3)    3.0(3)
Pro           4.1(4)    0.1       2.7(3)       1.9(2)    0.1
Gly           0.2       0.0       2.0(2)       0.1       1.0(1)
Ala           1.0(1)    1.0(1)    0.0          0.1       0.1
Cys           1.7(2)    1.8(2)    2.0(2)       1.2(2)    1.6(2)
Val           0.2       0.0       1.9(2)       5.7(6)    4.0(4)
Met           0.0       0.1       0.0          0.0       ++ (Hse)
Ile           0.4       1.9(2)    0.2          0.0       0.0
Leu           0.6       1.2(1)    2.1(2)       0.0       1.9(2)
Tyr           0.2       1.0(1)    0.0          0.0       0.0
Phe           0.2       0.0       0.1          0.0       1.0
His           1.0(1)    0.0       0.0          1.1(1)    0.0
Lys           0.4       1.0(1)    1.0(1)       1.9(2)    1.1(1)
Arg           0.2       0.1       0.9(1)       0.0       0.0

Expected MH+  2301.6    2037.3    3022.3       2330.6    2458.9
Observed MH+  2301.3    2037.1    3022.3       2330.2    2458.8

Structure (disulfide-linked peptides):
TV8-33   THTCPPCPAPE / THTCPPCPAPE (two disulfide bonds)
TV8-38   LTCTASQK / IEDSDTYICE
TV8-50   SPPGSSPSVQCR / LQDSGTWTCTVLQNQK
CN12C    TPEVTCVVVDVSHEDPEVK / CK
CN14K    NQVSLTCLVK / WQQGNVFSCSV(M)
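The expected MH+ values in Table 2 can be approximated from average residue masses. The sketch below is illustrative only. It assumes standard average masses, a loss of two hydrogens per disulfide bond and the addition of one proton. The TV8-38 partner peptide is taken as residues 76-85 (IEDSDTYICE), which is what the Glx and Ile counts in Table 2 imply, and the C-terminal homoserine of the CNBr-derived peptide is written as a lowercase "h". These conventions are assumptions made for the example, not notation from the paper.

```python
# Average residue masses (Da); "h" stands for homoserine (assumed free acid).
AVG = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132, "h": 101.1051,
}
WATER = 18.0153
HYDROGEN = 1.00794
PROTON = 1.00728


def peptide_mass(seq):
    """Average mass of a free linear peptide."""
    return sum(AVG[aa] for aa in seq) + WATER


def mh_plus(peptides, n_disulfides):
    """Average MH+ of peptides joined by n_disulfides disulfide bonds."""
    total = sum(peptide_mass(p) for p in peptides)
    return total - 2 * HYDROGEN * n_disulfides + PROTON


# (peptides, number of disulfides, expected MH+ from Table 2)
FRAGMENTS = {
    "TV8-33": (("THTCPPCPAPE", "THTCPPCPAPE"), 2, 2301.6),
    "TV8-38": (("LTCTASQK", "IEDSDTYICE"), 1, 2037.3),
    "TV8-50": (("SPPGSSPSVQCR", "LQDSGTWTCTVLQNQK"), 1, 3022.3),
    "CN12C": (("TPEVTCVVVDVSHEDPEVK", "CK"), 1, 2330.6),
    "CN14K": (("NQVSLTCLVK", "WQQGNVFSCSVh"), 1, 2458.9),
}

for name, (peps, bonds, expected) in FRAGMENTS.items():
    print(name, round(mh_plus(peps, bonds), 1), expected)
```

Small differences from the tabulated values are to be expected, since the result depends on the mass table and rounding used.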

REFERENCES

1. Sattentau, Q. J. & Weiss, R. A. (1988) Cell 52, 631-633.
2. Fauci, A. S. (1988) Science 239, 617-622.
3. Sodroski, J., Goh, W. C., Rosen, C., Campbell, K. & Haseltine, W. A. (1986) Nature 322, 470-474.
4. Lifson, J. D., Feinberg, M. B., Chakrabarti, S., Moss, B., Wong-Staal, F., Steiner, K. S. & Engleman, E. G. (1986) Nature 323, 725-728.
5. Siliciano, R. F., Lawton, T., Knall, C., Karr, R. W., Berman, P., Gregory, T. G. & Reinherz, E. L. (1988) Cell 54, 561-575.
6. Coffin, J. M. (1986) Cell 46, 1-4.
7. Berger, E. A., Fuerst, T. R. & Moss, B. (1988) Proc. Natl Acad. Sci. USA 85, 2357-2361.
8. Peterson, A. & Seed, B. (1988) Cell 54, 65-72.
9. Landau, N. R., Warton, M. & Littman, D. R. (1988) Nature 334, 159-162.
10. Arthos, J., Deen, K. C., Chaikin, M. A., Fornwald, J. A., Sathe, G., Sattentau, J. A., Clapham, P. R., Weiss, R. A., McDougal, J. S., Peitropaolo, C., Axel, R., Truneh, A., Maddon, P. J. & Sweet, R. W. (1989) Nature 57, 469-481.
11. Richardson, N. E., Brown, N. R., Hussey, R. E., Vaid, A., Matthews, T. J., Bolognesi, D. P. & Reinherz, E. L. (1988) Proc. Natl Acad. Sci. USA 85, 6102-6106.
12. Smith, D. H., Byrn, R. A., Marsters, S. A., Gregory, T., Groopman, J. E. & Capon, D. J. (1987) Science 238, 1704-1707.
13. Fisher, R. A., Bertonis, J. M., Meier, W., Johnson, V. A., Costopoulos, D. S., Liu, T., Tizard, R., Walker, B. D., Hirsch, M. S., Schooley, R. T. & Flavell, R. A. (1988) Nature 331, 76-78.
14. Hussey, R. E., Richardson, N. E., Kowalski, M., Brown, N. R., Chang, H.-S., Siliciano, R. F., Dorfman, T., Walker, B., Sodroski, J. & Reinherz, E. L. (1988) Nature 331, 78-81.
15. Deen, K. C., McDougal, J. S., Inacker, R., Folena-Wasserman, G., Arthos, J., Rosenberg, J., Maddon, P. J., Axel, R. & Sweet, R. W. (1988) Nature 331, 82-84.
16. Traunecker, A., Luke, W. & Karjalainen, K. (1988) Nature 331, 84-86.
17. Capon, D. J., Chamow, S. M., Mordenti, J., Marsters, S. A., Gregory, T., Mitsuya, H., Byrn, R. A., Lucas, C., Wurm, F. M., Groopman, J. E., Broder, S. & Smith, D. H. (1989) Nature 337, 525-531.
18. Traunecker, A., Schneider, J., Kiefer, H. & Karjalainen, K. (1989) Nature 339, 68-70.
19. Harris, R. J., Chamow, S. M., Gregory, T. & Spellman, M. W. (1990) Eur. J. Biochem. 188, 291-300.
20. Carr, S. A., Hemling, M. E., Folena-Wasserman, G., Sweet, R. W., Anumula, K., Barr, J. R., Huddleston, M. J. & Taylor, P. (1989) J. Biol. Chem. 264, 21286-21295.
21. Edelman, G. M., Cunningham, B. A., Gall, W. E., Gottlieb, P. D., Rutishauser, U. & Waxdal, M. J. (1969) Biochemistry 63, 78-85.
22. Takahashi, N., Ishii, I., Ishihara, H., Mori, M., Tejima, S., Jefferis, R., Endo, S. & Arata, Y. (1987) Biochemistry 26, 1137-1144.
23. Ellison, J. W., Berson, B. J. & Hood, L. E. (1982) Nucleic Acids Res. 10, 4071-4079.
24. Bornstein, P. & Balian, G. (1977) Methods Enzymol. 47, 132-145.
25. Classon, B. J., Tsagaratos, J., McKenzie, I. F. C. & Walker, I. D. (1986) Proc. Natl Acad. Sci. USA 83, 4499-4503.
26. Marti, T., Rosselet, S. J., Titani, K. & Walsh, K. A. (1987) Biochemistry 26, 8099-8109.