foldingofthen-terminal,ligand-bindingregionofintegrin ... › content › pnas › 94 › 1 ›...

8
Proc. Natl. Acad. Sci. USA Vol. 94, pp. 65–72, January 1997 Biochemistry This contribution is part of the special series of Inaugural Articles by members of the National Academy of Sciences elected on April 30, 1996. Folding of the N-terminal, ligand-binding region of integrin a-subunits into a b-propeller domain TIMOTHY A. SPRINGER The Center for Blood Research and Harvard Medical School, Department of Pathology, 200 Longwood Avenue, Boston, MA 02115 Contributed by Timothy A. Springer, November 13, 1996 ABSTRACT The N-terminal 440 aa of integrin a sub- units contain seven sequence repeats. These are predicted here to fold into a b-propeller domain. A homologous domain from the enzyme phosphatidylinositol phospholipase D is predicted to have the same fold. The domains contain seven four- stranded b-sheets arranged in a torus around a pseudosym- metry axis. The trimeric G-protein b subunit (G beta) ap- pears to be the most closely related b-propeller. Integrin ligands and a putative Mg 21 ion are predicted to bind to the upper face of the b-propeller. This face binds substrates in b-propeller enzymes and is used by the G protein b subunit to bind the G protein a subunit. The integrin a subunit I domain, which is structurally homologous to the G protein a subunit, is tethered to the top of the b-propeller domain by a hinge that may allow movement of the domains relative to one another. The Ca 21 -binding motifs in integrin a subunits are on the lower face of the b-propeller. Integrins are arguably the most structurally and functionally complex family of cell adhesion molecules (1–4). To date, 16 different integrin a subunits and 8 different integrin b subunits in mammals have been defined, which give rise to 21 different ab complexes. The extracellular domains are large, with up to 1114 aa for a and 678 aa for b. Diverse ligands on the cell surface and in the extracellular matrix are recognized. The adhesiveness of integrins can be rapidly modulated by signals from within the cell, a phenomenon termed ‘‘inside-out sig- naling’’ (5–7). Dynamic regulation of integrin adhesiveness is associated with alterations in affinity for ligand, expression of epitopes recognized by monoclonal antibodies (mAbs), clus- tering on the cell surface, and cytoskeletal association. The divalent cations Mg 21 or Ca 21 are required for ligand binding, and alteration of divalent cation composition in the medium can augment or inhibit adhesion. Regions in both the a and b subunit cytoplasmic domains have been defined that are important for regulation of adhesiveness, but how this is transduced and the nature of the alteration in the extracellular region are unclear (6, 7). Electron microscopy of integrins shows a globular ligand-binding domain, containing the N- terminal portions of the a and b subunits, that is connected to the membrane by two stalks corresponding to the C-terminal portions of the a and b subunit extracellular domains (8). Regions important for ligand binding have been localized to the N-terminal portions of the a and b subunits (9). The N-terminal region of the a subunit is composed of seven repeats of about 60 aa each (10). These contain FG (phenyl- alanyl-glycyl) and GAP (glycyl-alanyl-prolyl) consensus se- quences and will be termed FG-GAP repeats. A putative Ca 21 -binding motif is present in FG-GAP repeat 4 in some integrins and in repeats 5 through 7 in all integrins (11). Ligand binding by certain b1 and b3 integrins has been mapped within the FG-GAP repeats of the a subunits by cross-linking to ligand and by site-directed mutagenesis (7, 9, 12–16). Within the third FG-GAP repeat in some integrins is an inserted (I) domain of 200 residues. The structure of the I domain of the integrins Mac-1 and LFA-1 has recently been solved and has a fold similar to that found in nucleotide-binding enzymes, small G proteins, and heterotrimeric G protein a subunits (17–20). The I domain contains a metal ion-dependent adhe- sion site (MIDAS). Two serines, a threonine, and two water molecules coordinate a Mg 21 , with a sixth coordination posi- tion available to bind an acidic residue in the ligand. Specificity for ligand is provided by residues that surround the MIDAS in the integrin LFA-1, suggesting that this is the ligand-binding site for the I domain-containing integrins (21). A similar MIDAS motif is present within a 250-aa region that is the most conserved portion of integrin b subunits, and this region has been predicted to adopt a fold similar to that of the I domain (17). Insight into the fold of the seven a subunit FG-GAP repeats could substantially advance our ability to interpret current data and design further experiments on the structure and function of integrins. Here I predict the fold of this region of 440 amino acid residues. Conservation of the number of repeats as seven in evolution, symmetry arguments, and prop- erties of the FG-GAP sequence repeats and of protein do- mains in general suggest that the only known fold that the seven repeats can adopt is the b-propeller. Strong support for this fold is provided by secondary structure and threading predictions, model building, and disulfide bond topology. Methods For prediction of secondary structure and disulfide-bonded cysteines, integrins present in Swiss-Prot (references ITA1 RAT, ITA2 HUMAN, ITA3 CRISP, ITA3 HUMAN, ITA4 HUMAN, ITA4 MOUSE, ITA5 HUMAN, ITA6 CHICK, ITA6 HUMAN, ITA8 CHICK, ITAB HUMAN, ITAE HUMAN, ITAL HUMAN, ITAL MOUSE, ITAM HUMAN, ITAM MOUSE, ITAP DROME, ITAV CHICK, ITAV HUMAN, ITAX HUMAN, YMA1 CAEEL, and YMB3 CAEEL) and selected integrins in GenBank (acces- sion nos. A45226, ITA1HU; L25886, ITA2BO; S44142, ITA2MO; D13867, ITA3MO; U12683, ITA5XE; X69902, ITA6MO; L23423, ITA7MO; L36531, ITA8HU; A49459, ITA9HU; U37028, ITADHU; U12236, ITAEMO; S40311, Copyright q 1997 by THE NATIONAL ACADEMY OF SCIENCES OF THE USA 0027-8424y97y9465-8$2.00y0 PNAS is available online at http:yywww.pnas.org. Abbreviations: MIDAS, metal ion-dependent adhesion site; GO, galactose oxidase; PIPLD, phosphatidylinositol phospholipase D. 65 Downloaded by guest on June 13, 2021

Upload: others

Post on 30-Jan-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

  • Proc. Natl. Acad. Sci. USAVol. 94, pp. 65–72, January 1997Biochemistry

    This contribution is part of the special series of Inaugural Articles by members of the National Academy of Scienceselected on April 30, 1996.

    Folding of the N-terminal, ligand-binding region of integrina-subunits into a b-propeller domainTIMOTHY A. SPRINGERThe Center for Blood Research and Harvard Medical School, Department of Pathology, 200 Longwood Avenue, Boston, MA 02115

    Contributed by Timothy A. Springer, November 13, 1996

    ABSTRACT The N-terminal '440 aa of integrin a sub-units contain seven sequence repeats. These are predicted hereto fold into a b-propeller domain. A homologous domain fromthe enzyme phosphatidylinositol phospholipase D is predictedto have the same fold. The domains contain seven four-stranded b-sheets arranged in a torus around a pseudosym-metry axis. The trimeric G-protein b subunit (G beta) ap-pears to be the most closely related b-propeller. Integrinligands and a putative Mg21 ion are predicted to bind to theupper face of the b-propeller. This face binds substrates inb-propeller enzymes and is used by the G protein b subunitto bind the G protein a subunit. The integrin a subunit Idomain, which is structurally homologous to the G protein asubunit, is tethered to the top of the b-propeller domain by ahinge that may allow movement of the domains relative to oneanother. The Ca21-binding motifs in integrin a subunits areon the lower face of the b-propeller.

    Integrins are arguably the most structurally and functionallycomplex family of cell adhesion molecules (1–4). To date, 16different integrin a subunits and 8 different integrin b subunitsin mammals have been defined, which give rise to 21 differentab complexes. The extracellular domains are large, with up to1114 aa for a and 678 aa for b. Diverse ligands on the cellsurface and in the extracellular matrix are recognized. Theadhesiveness of integrins can be rapidly modulated by signalsfrom within the cell, a phenomenon termed ‘‘inside-out sig-naling’’ (5–7). Dynamic regulation of integrin adhesiveness isassociated with alterations in affinity for ligand, expression ofepitopes recognized by monoclonal antibodies (mAbs), clus-tering on the cell surface, and cytoskeletal association. Thedivalent cations Mg21 or Ca21 are required for ligand binding,and alteration of divalent cation composition in the mediumcan augment or inhibit adhesion. Regions in both the a and bsubunit cytoplasmic domains have been defined that areimportant for regulation of adhesiveness, but how this istransduced and the nature of the alteration in the extracellularregion are unclear (6, 7). Electron microscopy of integrinsshows a globular ligand-binding domain, containing the N-terminal portions of the a and b subunits, that is connected tothe membrane by two stalks corresponding to the C-terminalportions of the a and b subunit extracellular domains (8).Regions important for ligand binding have been localized to

    the N-terminal portions of the a and b subunits (9). TheN-terminal region of the a subunit is composed of sevenrepeats of about 60 aa each (10). These contain FG (phenyl-alanyl-glycyl) and GAP (glycyl-alanyl-prolyl) consensus se-quences and will be termed FG-GAP repeats. A putative

    Ca21-binding motif is present in FG-GAP repeat 4 in someintegrins and in repeats 5 through 7 in all integrins (11). Ligandbinding by certain b1 and b3 integrins has been mapped withinthe FG-GAP repeats of the a subunits by cross-linking toligand and by site-directed mutagenesis (7, 9, 12–16). Withinthe third FG-GAP repeat in some integrins is an inserted (I)domain of 200 residues. The structure of the I domain of theintegrins Mac-1 and LFA-1 has recently been solved and hasa fold similar to that found in nucleotide-binding enzymes,small G proteins, and heterotrimeric G protein a subunits(17–20). The I domain contains a metal ion-dependent adhe-sion site (MIDAS). Two serines, a threonine, and two watermolecules coordinate a Mg21, with a sixth coordination posi-tion available to bind an acidic residue in the ligand. Specificityfor ligand is provided by residues that surround the MIDAS inthe integrin LFA-1, suggesting that this is the ligand-bindingsite for the I domain-containing integrins (21). A similarMIDASmotif is present within a 250-aa region that is the mostconserved portion of integrin b subunits, and this region hasbeen predicted to adopt a fold similar to that of the I domain(17).Insight into the fold of the seven a subunit FG-GAP repeats

    could substantially advance our ability to interpret currentdata and design further experiments on the structure andfunction of integrins. Here I predict the fold of this region of'440 amino acid residues. Conservation of the number ofrepeats as seven in evolution, symmetry arguments, and prop-erties of the FG-GAP sequence repeats and of protein do-mains in general suggest that the only known fold that theseven repeats can adopt is the b-propeller. Strong support forthis fold is provided by secondary structure and threadingpredictions, model building, and disulfide bond topology.

    Methods

    For prediction of secondary structure and disulfide-bondedcysteines, integrins present in Swiss-Prot (references ITA1RAT, ITA2 HUMAN, ITA3 CRISP, ITA3 HUMAN,ITA4 HUMAN, ITA4 MOUSE, ITA5 HUMAN, ITA6CHICK, ITA6 HUMAN, ITA8 CHICK, ITAB HUMAN,ITAE HUMAN, ITAL HUMAN, ITAL MOUSE, ITAMHUMAN, ITAM MOUSE, ITAP DROME, ITAV CHICK,ITAV HUMAN, ITAX HUMAN, YMA1 CAEEL, andYMB3 CAEEL) and selected integrins in GenBank (acces-sion nos. A45226, ITA1HU; L25886, ITA2BO; S44142,ITA2MO; D13867, ITA3MO; U12683, ITA5XE; X69902,ITA6MO; L23423, ITA7MO; L36531, ITA8HU; A49459,ITA9HU; U37028, ITADHU; U12236, ITAEMO; S40311,

    Copyright q 1997 by THE NATIONAL ACADEMY OF SCIENCES OF THE USA0027-8424y97y9465-8$2.00y0PNAS is available online at http:yywww.pnas.org.

    Abbreviations: MIDAS, metal ion-dependent adhesion site; GO,galactose oxidase; PIPLD, phosphatidylinositol phospholipase D.

    65

    Dow

    nloa

    ded

    by g

    uest

    on

    June

    13,

    202

    1

  • ITAPS1DR; U14135, ITAVMO; and X81108, ITAVPL) wereused. For secondary structure prediction, the I domain wasremoved at structural boundaries (19), and integrins weretruncated '12 residues after the end of the last FG-GAPrepeat using SEQED in GCG (22). A PIR file was made with TOPIRof GCG, the format was edited with NISUS WRITER (Nisus,Solana Beach, CA), and sequences were submitted to PHD (23)as PIR lists. Insertions of .10 residues cannot be removed byPHD and are misaligned; therefore, ITAE HUMAN and1TAEMO were not used, and additionally ITA3 CRISP,ITA3 HUMAN, ITA3MO, and ITA6 CHICK were not usedfor aM prediction. For phosphatidylinositol phospholipase D(PIPLD), the Swiss-Prot PHLD BOVIN and GenBank hu-man L11701 and L11702 sequences were used. Because thesesequences are highly identical, diversity was increased byseparating the seven repeats and submitting PIR lists with 7 33 5 21 sequences. Lists with a different repeat of PHLD BO-VIN as the target sequence and with two different offsets atthe ends were submitted, since predictions at each residue takeinto account the preceding and following six residues.Models of a4 were made with SEGMOD of LOOK (Molecular

    Applications Group, Palo Alto, CA). Alignments (see Fig. 2)used seven circularly permuted coordinate files for G beta(protomer B of the bg dimer; ref. 24) each beginning withstrand 4 of a different sheet (W). To force cis-proline in the fivea4 integrin Ws with GAP, which appeared required to obtaina strand 1 that extended through the FG sequence that isb-bridged to the GAP motif, the cis-proline in W4 of galactoseoxidase (GO) in the same alignment position and one residueon each side were superimposed and used as the template. Thebest representative for each W was chosen from amongmultiple a4 models; criteria were a well developed b-sheetstructure as determined by DSSP, and F andA residues from theFG and GAP motifs that were buried and in proximity of oneanother. These Ws were used as templates for a final model.The template selected for W2 contained the C81–C85 andC150–C165 disulfides that had spontaneously formed; theC58–C68 strand 2 to strand 3 and C111–C132 strand 3 tostrand 4 disulfides were present in all models as a consequenceof the alignment. Three copies of the Ca21-binding structurefrom galactose-binding protein (25) were aligned with W5,W6, and W7 (see Fig. 2), superimposed with LOOK usingb-strand residues Q142–V144, and residues L135–Q142 andthe Ca21 were used as templates. Residues G577–D586 in GOwere used to model a representative integrin b2 subunitsequence, residues V238–F247, as a polypeptide finger with anaspartic acid at its tip. It was placed three residues lower in thecentral cavity with respect to strand 1 than in GO. Theassociated Cu21wasmodeled asMg21. TheMg21was centeredon the pseudosymmetry axis at the average position of Cacarbons in bridge position 2 in strand 1. This gave a similarspatial relationship between metals and the Ca carbons ofmetal-ligating histidine and aspartic acid residues in GO, themodel, and the integrin I domain (17). The side chains ofresidues in position 2 of strand 1 were adjusted to an orien-tation similar to that of T102, C148, and S189 in G beta.Models were evaluated for b-sheet structure with DSSP (26)and for native-like packing environments of their residues withthe QUACHK module of WHAT IF (27).

    Deduction of a Protein Fold

    Many surface proteins contain tandem, independently foldeddomains, which commonly are unpaired and extend linearlylike beads on a string; however, a number of features suggestthat the seven N-terminal repeats of integrin a subunits are notindependent domains, but rather fold into a single globulardomain. (i) The number of disulfide bonds in FG-GAP repeatsis inadequate for small, autonomous domains. FG-GAP re-peats are 62 6 7 aa. Structurally characterized repeating

    domains of similar size found on cell surfaces include com-plement control repeats of 61 6 4 residues, fibronectin type IIrepeats of 60 6 1 residues, and epidermal growth factorrepeats of 406 3 residues (28). Disulfide bonds are particularlyimportant to stabilize small domains; all domains of '60residues and smaller contain two intradomain disulfides. How-ever, few FG-GAP repeats contain cysteines. Furthermore,some disulfides are interrepeat, rather than intrarepeat (29),which is unlikely for tandem independent domains. (ii) FG-GAP repeats contain two long consensus hydrophobic seg-ments, VVVGAP and VYLF (see Fig. 2), that cannot beaccommodated within a small domain. Small domains havetwo layers, whereas hydrophobic segments require a structurewith three or more layers, so that they can be buried beneathamphipathic segments (30). (iii) Integrins always contain sevenrepeats, whereas for linear tandem repeats, the number ofrepeats is not constrained. Even within families of closelyrelated adhesion molecules, repeat number varies. AmongIgSF members that bind integrins, there are two or five IgSFdomains in intercellular adhesion molecules (ICAMs) andthree to seven in vascular cell adhesionmolecule-1 (VCAM-1).Among selectins, there are two to nine complement controlrepeats (31). Cadherin superfamily members usually containfive domains, but one member contains 34 (32). The numberof FG-GAP repeats is conserved at seven in all known integrina subunits, including those in Caenorhabditis elegans andDrosophila melanogaster and the 16 different a subunits inmammals, despite insertion of the I domain in some of these.(iv) Exchange of the FG-GAP repeats before and after the Idomain between the integrins Mac-1 (aMb2) and p150,95(aXb2) reveals epitopes that require the presence of bothregions (33). Furthermore, although not previously pointedout, mAb HP1y3 to VLA-4 interacts with both FG-GAPrepeat 1 and repeats 5–7 (34).Because this evidence strongly suggests that FG-GAP re-

    peats cannot be autonomously folded units, I surveyed proteinstructures for those in which the FG-GAP repeats could becooperatively folded. Proteins have recently been systemati-cally classified according to their fold—i.e., the way in whichthe b-strand and a-helical secondary structure elements arearranged within the primary structure and folded in threedimensions to form domains (30, 35). Almost without excep-tion, proteins with sequence homology adopt the same fold.The fold is more evolutionarily conserved than sequencehomology and may also arise through convergent evolution, sothat proteins with the same fold may not show statisticallysignificant sequence homology. The similarities in sequencesuggest that each FG-GAP repeat adopts a similar structure;therefore, for the seven units to fold cooperatively, the proteinfold should contain seven pseudosymmetric units. Althoughseveral folds with pseudosymmetry exist, only one can haveseven pseudosymmetric units, an all-b structure known as theb-propeller.

    b-propellers contain four, six, seven, or eight b-sheetsarranged radially and pseudosymmetrically around a centralaxis. Each sheet contains four antiparallel b-strands (see Figs.1 and 3). Strand 1 is closest to and runs parallel to the centralaxis (36). The overall shape of the domain is cylindrical, withstrand 4 on the outside of the cylinder. Because b-sheets have

    FIG. 1. Folding topology of the integrin b-propeller. The Ws areupright. b-strands are arrows, known disulfides in aIIb (29) arehorizontal lines, and boundaries between FG-GAP repeats aremarkedwith vertical dashes.

    66 Biochemistry: Springer Proc. Natl. Acad. Sci. USA 94 (1997)

    Dow

    nloa

    ded

    by g

    uest

    on

    June

    13,

    202

    1

  • an inherent twist, the angle of successive strands increases withrespect to the central axis (36), giving rise to a propellerpattern in which each sheet represents one blade. Strands 1, 2,3, and 4 are connected by successive hairpin turns, and strand4 of one sheet is connected to strand 1 of the next (Fig. 1).Known b-propellers with seven blades are the trimeric Gprotein beta subunit (G beta; refs. 24 and 37), GO (38), andmethylamine dehydrogenase (39). Viral and bacterial neura-minidases contain six blades (40–42).

    Secondary Structure and Fold Prediction

    A previous prediction of secondary structure using a multiplesequence alignment suggested that the FG-GAP repeats eachhave four structural segments (43). One segment was ofuncertain structure, and a second segment was predicted tocontain an amphipathic helix followed by a partly buriedb-strand. Two further segments were predicted to be buriedb-strands. Since this secondary structure prediction, which wasexcellent for its time, further methods with improved accuracyhave been developed, and a greater number of integrin asubunits are present in the data base, which allows furtherimprovement in accuracy. Since secondary structure is highlyconserved within a family, prediction accuracy is improvedwhen information from the aligned sequences of the family istaken into account.

    Predictions were made for well-characterized a subunitsassociated with three different b subunits; the b3-associatedaIIb (CD41) subunit, the b1-associated a4 (CD49d) subunit,and the b2-associated aM (CD11b) subunit. The N-terminalsegments containing the seven FG-GAP repeats, with the Idomain removed where required, of each a subunit as thetarget sequence, together with corresponding regions from$30 different integrin a subunits were submitted to PHD (23).In contrast to the previous prediction in which all seven repeatsfrom 16 a subunits were aligned together with one another toyield one consensus prediction for a repeat (43), the sevenrepeats were aligned along their entire length, and thus theprediction for each repeat is independent. Remarkably, out ofthe 28 positions in which b-strand is expected for a b-propeller(four strands3 seven sheets), b-strand was predicted in 25, 25,and 26 instances for aIIb, a4, and aM, respectively (Fig. 2).The congruence between a b-propeller fold and secondary

    structure prediction was further tested with PIPLD. TheC-terminal portion of this enzyme contains seven FG-GAPrepeats with Ca21-binding motifs in repeats 1, 2, 3, and 6 (refs.43 and 46; Fig. 2). Indeed, the sequence homology between theseven repeats in integrins and PIPLD is highly significant (P,1026 with BLASTP), showing duplication from a commonancestral gene and strongly suggesting adoption of the sameprotein fold. b-Strands were predicted by PHD in 27 of the 28positions expected for a b-propeller in PIPLD (Fig. 2).

    FIG. 2. Alignments, secondary structure, and disulfide bonds. Residues in human integrin a-subunits and bovine phosphatidylinositolphospholipase D (PIPLD) predicted by PHD to be b-strand and a-helix are highlighted in gold and magenta, respectively. The known secondarystructure elements of G beta and GO b-propeller domains are highlighted with the same colors. Residues present in b-strands in the a4 modelas determined by DSSP (26) are underlined. Cysteines known in aIIb (29) or predicted in a4 and aM to be disulfide bonded together are codedwith the same color. Equivalent ladder positions in each strand are numbered at the top for G beta; G beta and GO are structurally aligned. Theposition of framework residues is marked with a line below the consensus sequence for G beta and GO. Residues in G beta that contact the switchII region of the G a-subunit (37, 44) are shown in red. The alignment unit is the sheet (W) rather than the sequence repeat; note that the N-terminalsegment is in strand 4, W7. For structural alignments, the seven b-sheets (Ws) of GO (38) and G beta (24, 37, 44) were cut out and structurallysuperimposed on one another, and the intact b-propellers were also superimposed using the program 3DMALIGN (45). Sequence alignments wereprepared with PILEUP in GCG (22) and manually adjusted with MEGALIGN (DNAstar, Madison, WI). All gaps except those between strands wereremoved. Consensus sequences were calculated with PRETTY in GCG using a plurality of 3y7 of the total number of W.

    Biochemistry: Springer Proc. Natl. Acad. Sci. USA 94 (1997) 67

    Dow

    nloa

    ded

    by g

    uest

    on

    June

    13,

    202

    1

  • Automatic fold prediction algorithms confirmed a b-pro-peller. Input of the seven repeat region into the sequencethreading program THREADER (47) showed that all the top,significant hits (Z scores , 23.0) were b-propeller structures:viral and bacterial neuraminidases and methylamine dehydro-genase. The secondary structure and accessibility prediction-based threading program TOPITS (48) yielded the b-propellerdomain within GO as the top hit for all three integrins, witha significant score for aIIb (Z 5 23.68).

    The Relation Between b-Sheets and FG-GAP Repeats

    Using a common convention for b-propellers, I will call eachsheet a W, with each leg of the W corresponding to oneb-strand. In the conventional view from the side with the Wsin the propeller upright, the strand 1 to 2 (1-2) loops and thestrand 3 to 4 (3-4) loops are on the bottom, and the strand 2to 3 (2-3) loops and the connections between strand 4 of oneW and strand 1 of the next W (4-1 loop) are on the top (Fig.1). For brevity, 4-1 loops are named after the followingW—e.g., the loop between W1 and W2 is the 4-1 loop of W2.The FG-GAP repeats and the Ws could be offset relative toone another, since the amino acid sequence threads throughthe propeller in a circular fashion, with the N and C terminiadjacent in the structure. In most b-propellers, the N- andC-terminal segments are tied together by having both contrib-ute strands to W7 (49).The four segments in FG-GAP repeats can be identified

    with particular strands in the Ws by their hydrophobicity,which is well defined in b-propellers. b-Propellers contain acentral cavity that is lined with strand 1 and that contains watermolecules. Therefore, b-strand 1 is partially exposed to sol-vent. b-Strand 4 on the exterior of the propeller is fully exposedto solvent and is the least hydrophobic strand. b-Strands 2 and3 are completely buried and therefore are the most hydropho-bic. Hence, the first segment of each FG-GAP repeat, whichis the least hydrophobic and least conserved between repeats,corresponds to the outer strand 4, the second segment, whichis partially hydrophobic, corresponds to the inner strand 1, andthe third and fourth segments, which are hydrophobic, corre-spond to the buried strands 2 and 3 (Figs. 1 and 2). Thistopology, with W7 containing strand 4 contributed by theN-terminus and strands 1-3 contributed by the C terminus, isshared by most b-propellers (24, 37, 38, 40–42, 49).

    Disulfide Bonds

    Three disulfide bonds have been chemically defined within theFG-GAP repeats of aIIb (29). The interrepeat C56–C65disulfide links strand 3 and strand 4 in W1 (Figs. 1 and 2) andthus is intrasheet. The C107–C130 disulfide links strand 2 andstrand 3 of W2. The C146–C167 disulfide is in the 4-1 loop ofW3. Disulfide bonds between strands predict their ladderrelationship. In b-sheets, each strand is bridged to the next byhydrogen bonds between the peptide carbonyl and amidegroups; the ladder positions of residues in all strands of thesheet are thus defined. The residues in each strand at the sameladder or bridge position in the b-sheets of G beta are giventhe same number at the top of Fig. 2. Disulfide bonds withinb-sheets occur between cysteines at the same ladder positionon neighboring strands; the Ca carbons of residues offset byone or more position are too far apart. The influenza neur-aminidase b-propeller contains five such intra-sheet disulfides(40, 41). Integrins were aligned with G beta and GO, such thatdisulfide bonds would occur between residues in equivalentbridge positions. This defined the ladder relationships betweenintegrin b-strands 2 and 3 in allW, since the alignment betweenthe cysteine and non-cysteine-containing strands is excellent;and between strands 3 and 4 in only those W containing adisulfide, since strand 4 is poorly conserved.

    The disulfide bridges in integrins other than aIIb can bepredicted from homology to cysteines in aIIb and frommultiple alignments. Cysteines at two different alignmentpositions that are shared by the same subset of integrins arepredicted to be linked. All predicted disulfides are compatiblewith the b-propeller fold. For example, C478 and C490 in W5of aM are predicted to form a disulfide between b-strands 3and 4 at the same position as the disulfide in W1 (Fig. 2). C446of aM and C278 of a4 do not have a match, and are predictednot to form a disulfide. Indeed, C278 in a4 has been demon-strated to be free (50). In W5 of the C. elegans integrin YMA1(Fig. 2), a disulfide is predicted between cysteines that replacethe F and the A of the FG-GAP motif. The F and the A arealigned in ladder position 1 of strands 1 and 2, providingfurther support for the alignment shown in Fig. 2.

    Sequence Alignment

    Among b-propellers, only G beta andGO resemble integrins andPIPLD in number of blades and topology in W7. A structure-based sequence alignment between the Ws of G beta and GO(Fig. 2) reveals the well knownWD40 repeats in G beta (51) andsequence repeats within GO as previously mentioned (36). In-tegrins were aligned with G beta and GO according to thepositions of the predicted and known b-strands, and the ladderpositions defined by disulfides; sequence similarities were thennoted that confirmed or fine-tuned the alignment, particularlywith G beta (Fig. 2). For example, the residue before thebeginning of b-strand 1 in G beta (‘‘0’’ position at top of Fig. 2)aligns with similar hydrophilic and bulky residues in integrins andPIPLD. At position 1 of strand 1, the consensus F or L residuein integrins and PIPLD aligns with a buried, hydrophobic con-sensus I residue in G beta. A b-bulge follows, in which the G ofthe FG repeat is inserted. There is a precedent for insertion inevolution of a glycine into a b-bulge (52). The subsequent,b-bulged residue in G beta (‘‘b’’ position, Fig. 2) is located wherethe polypeptide turns from running across the top of the propellerto down the inner lining of the central cavity. The residue in theb position is often too bulky to be accommodated at position 2,where theb-strands are very closely packed and small residues arefound. At positions 2 and 3 in strand 1, consensus SV and SLresidues in integrins and PIPLD align with consensus SL residuesinG beta. Other similarities in strands 2, 3, and 4 are present (Fig.2), including alignment in strand 2 of the consensus GA inintegrins and PIPLD with the consensus GS in G beta.

    Alternative Splicing

    The b-propeller model is compatible with alternative splicingof the Drosophila PS2, and the mammalian a6, a7, and a3integrin subunits (53–55). In all cases, the alternatively splicedor deleted region corresponds to b-strands 3 and 4 of W3. Ofany strands in a b-propeller, the loss or replacement ofb-strands 3 and 4 would have the least deleterious effect,because they constitute the external half of the sheet and leavethe packing in the interior of the domain intact.

    Integrin b-Propeller Models

    Models were built to further test the b-propeller fold, and tomake predictions about integrin structure and function. Mod-els made using GO or G beta as templates and the alignmentin Fig. 2 yielded packing quality scores in the range 21.4 to21.7 (27), which suggests that they were threaded correctly—i.e., had the proper fold. Scores were better with G beta thanGO as template, in agreement with greater similarity tointegrins in sequence and in length of the predicted b-strands(Fig. 2). b-strand 1 extended through the bulge position in themajority of Ws, and in the remaining Ws it had a similar

    68 Biochemistry: Springer Proc. Natl. Acad. Sci. USA 94 (1997)

    Dow

    nloa

    ded

    by g

    uest

    on

    June

    13,

    202

    1

  • polypeptide chain path (Figs. 2 and 3). Furthermore, alldisulfide bonds were properly formed (Fig. 3, black).

    The upper surface of the integrin b-propeller models has apocket resembling a slightly oval amphitheater, with tiers of

    FIG. 3. Ribbon diagrams (65) of the model for the integrin a4-subunit b-propeller domain. Views are from the top (A) and side (B). Each Wis shown in a different color. A hypothetical polypeptide finger in the central cavity is gray. Cysteines in disulfides are black. Side chains in b-strand1 in positions 0, b, and 2 are shown in gold, lavender, and rose, respectively; their oxygens and nitrogens are red and blue, respectively. Ca21 ionsand a hypothetical Mg21 ion are gold and silver spheres, respectively.

    Biochemistry: Springer Proc. Natl. Acad. Sci. USA 94 (1997) 69

    Dow

    nloa

    ded

    by g

    uest

    on

    June

    13,

    202

    1

  • residues surrounding the central cavity. The residues in the lastpart of the 4-1 loop and in the bulge region of strand 1 aresuccessively lower as they approach the turn into the centralcavity; strand 2 exits further away from the cavity and parallelsstrand 1 as residues are successively higher and further fromthe cavity. Among the residues pointing up into a putativeligand-binding region at the center of the pocket, the residuesat the b position in strand 1 (Fig. 3, lavender) line the upperrim of the central cavity and form the inner tier. The next tieris formed by the last residue of the 4-1 loop (0 position ofstrand 1; Fig. 3, gold). These residues line an inner pocketaround the top of the central cavity. Tyrosine, which isfrequently present in hapten-binding pockets in antibodies, isabundant at both the 0 and b positions in integrins (Figs. 2 and3). In PIPLD, tryptophan occurs three times in these positions,and six of the seven residues in the 0 position are basic (Fig.2), forming a ring of positive charge around a putative activesite. The residue equivalent to the P of GAP, and the nextusually hydrophilic residue in the 2-3 loop, form an overlappingand more distal tier. Residues in these positions in strand 2 andin the 0 and b positions in strand 1 in G beta (Fig. 2, shown inred) make contact with the switch II region of the G proteina subunit (37, 44).The b-propeller enzymes all have their active sites at the top

    of the b-propeller, near the pseudosymmetry axis. It is gen-erally true that enzymes with the same fold have their activesites in the same location and that these sites are present whereadjacent loops run in opposite directions (30). The top of theb-propeller is richly endowed with such loops, the alternating2-3 and 4-1 loops. These often vary in length, providingexcursions for substrate and ligand binding and make extensivecontacts with one another, particularly toward the axis. Bycontrast, the 1-2 and 3-4 loops on the bottom of the b-propellerare simple hairpins that are more uniform in length and makefew contacts with one another. Fold is likely to be a reliablepredictor of binding site for those adhesion molecules withdivalent cations in their recognition sites, because similarspecializations are required for binding divalent cations andsubstrates. Indeed, in the case of the nucleotide-binding ordouble-twisted fold that is shared between enzymes, small Gproteins and regulatory G protein a subunits, and integrin Idomains (17, 18), the same site that binds divalent cation andsubstrate in the enzymes andG proteins bindsMg21 and ligandin integrin I domains.The prediction based on the protein fold of a ligand-binding

    site on the upper surface of the integrin b-propeller domain issupported by mutational studies. Irie et al. (14) surveyed theregion of the integrin a4 subunit to which function-blockingmAb mapped by mutating 50 residues in repeats 1-4. Mutationof three residues, Y187, W188, and G190, in loop 2-3 of W3(Fig. 2), specifically reduced binding to VCAM-1 and fibronec-tin. Mutation of three residues in the same location in theintegrin a5 subunit (14) and in the integrin aIIb subunit (16)similarly reduced or abolished ligand binding. Mutationalstudies on the a3 subunit have shown that substitution ofresidues 153–165 with the corresponding sequence from a4abolishes binding to laminin (56). This region corresponds tothe 4-1 loop of W3. In complete agreement with the b-pro-peller model, these regions in a4 and a3 integrins are not onlypresent on the top of the domain, but the 4-1 loop of W3 andthe 2-3 loop of W3 are adjacent in the structure.Previous studies have emphasized the importance of the

    FG-GAP repeats, but they have conceptualized them asseparate domains and mapped ligand binding to differentlocations (9). Ligand cross-linking sites correspond to W5 (12)and to the region from W3 to W6 (13). The ligand specificityof aIIb can be transferred with a region spanning from strand4 of W7 to strand 3 of W5, but not with a region spanning fromstrand 4 of W3 to strand 3 of W7 (15). mAbs to a4 that blockligand binding map to W3 (34, 57). Ligand-mimetic antibodies

    to the fibrinogen-binding site on aIIb map to the 2-3 loop ofW3 (16). By contrast, eight different mAb to the a3 subunitthat block adhesion to laminin and homotypic aggregationmapto a region extending from strand 4 of W7 to the 1-2 loop ofW2 (56). These results are difficult to interpret with a modelin which the FG-GAP repeats represent tandem, independentdomains, but are highly consistent with a b-propeller model inwhich all FG-GAP repeats contribute to a common surface.Since almost all integrins require Mg21 for ligand binding

    and their ligands contain functionally important glutamic oraspartic acid residues that may coordinate a divalent cation(1–4, 9), it is reasonable to inspect models to determinewhether a Mg21-binding site might be present in the putativeligand-binding site on the upper surface. A substantial pro-portion of b-propellers bind cations, and the sites are locatedon the upper surface in the 1-4 and 2-3 loops or in the centralcavity (38, 41, 58–61). In integrin b-propeller models, residueswith side chains containing oxygen are abundant near theupper end of the central cavity, although few of these residuesare acidic (Fig. 3). The only type of divalent cation-bindingregion that is apparent would be similar to the MIDAS in Idomains (17). The coordination chemistry in the MIDAS isunusual for Mg21 and never seen for Ca21. Three oxygensfrom serine and threonine side chains and one water moleculedonate coordinations to a Mg21 ion in an xy plane. The lack ofcharged residues in the primary coordination sphere in theMIDAS has significance for the strength of binding to an acidicresidue in the ligand. Serines and threonines are abundant inan xy plane at the top of the central cavity in integrinb-propeller models (Fig. 3, rose residues), in position 2 ofstrand 1 (Fig. 2). This is at the narrowest aperture of the centralchannel. Moving around the circumference of this aperture,each group of three adjacent serine and threonine residues—i.e., those in adjacent b-strands—is in a remarkably similardisposition to the serines and threonine in theMIDAS. The Cacarbons of the adjacent residues are 5.21 6 0.23 Å apart,compared with 5.55 Å in the MIDAS, and the next mostadjacent are 9.37 6 0.37 Å apart, compared with 8.4 Å in theMIDAS. Only one Mg21 could be bound at a time, but thereare as many as seven different possible binding sites distributedaround the pseudosymmetry axis, which could have importantimplications for regulation of ligand affinity and specificity.The fourth position in the xy plane might be occupied by awater molecule; the residues are too far apart to form fourcoordinations to an ion, as occurs in four-bladed b-propellers(59–61). An acidic residue to hold a water in position tocoordinate the Mg21 in the 2z position as found in the Idomain MIDAS appears to be missing. However, one mightimagine a polypeptide finger from another domain—e.g., theb subunit—protruding up into the central cavity. Such a fingercould have a residue at its tip that coordinates with the cationfrom below, as occurs in the central cavity of GO. Indeed, thecavities of all seven- and eight-bladed propellers except G betaare filled with a polypeptide finger or prosthetic group, and Gbeta may not be a real exception, because it might bind to loopsfrom seven transmembrane receptors (37). There is no evi-dence for such a finger in integrins; however, it was includedin models to demonstrate that there would be room for one,particularly when placed lower in the central cavity than inGO,as appropriate for the predicted lower position of the cation.The central cavity of G beta is narrower, but more conical, thanthat of GO. The remaining 1z coordination position for ahypothetical Mg21 would be above it in the center of the uppersurface of the b-propeller, in an ideal position for coordinationto a bound ligand.The putative Ca21-binding motifs in integrins located in the

    1–2 loops were modeled on the Ca21-binding motif in galac-tose-binding protein (25), since, in contrast to the EF-handmotif, in galactose-binding protein one of the coordinatingresidues is located in a b-strand, as predicted in integrins (Fig.

    70 Biochemistry: Springer Proc. Natl. Acad. Sci. USA 94 (1997)

    Dow

    nloa

    ded

    by g

    uest

    on

    June

    13,

    202

    1

  • 2). The Ca21-binding sites are predicted to be near one anotheron the lower surface of the b-propeller domain (Fig. 3). TheN terminus of integrins is also predicted to be on the lowersurface and has highly conserved Asn and Asp residues atpositions 2 and 4 that might ligate a cation. Since removal ofCa21 activates ligand binding by many integrins and destabi-lizes association with the b subunit in several integrins (4, 6, 9),the lower surface of the b-propeller domain in the a subunitmay be involved in interactions with the b subunit and inconformational changes that regulate ligand binding.The I domain is inserted into the b-propeller domain in the

    4-1 loop between W2 and W3. Integrins with I domainsuniquely contain cysteines in the 2-3 loop of W2 and in the 4-1loop, just before the beginning of the I domain, that arepredicted to be disulfide-bonded (aM C97 and C128; Fig. 2).The two polypeptide chain connections and the disulfide bondare all predicted to be close together on the upper rim of theb-propeller domain. The connections to the b-propeller arealso close together in the I domain structure (17), and ahydrophilic linker segment is present on the C terminal side ofthe I domain. These features suggest a hinge between the Idomain and the b-propeller domain.The I domain is structurally homologous to the G protein

    a-subunit (17, 19), and when the integrin I domain andb-propeller domain are superimposed on the trimeric Gprotein structure (37, 44), the N and C termini of the I domainare very close to a 4-1 loop on the rim of the b-propellerdomain, as proposed for the hinge connection (Fig. 4). Thesuperimposition illustrates the relative sizes of the b-propellerand I domains, and one possible orientation between them.Since the I domain can be expressed independently of theintegrin b-propeller domain (17), in an intact integrin, move-

    ments at the hinge might allow reversible binding of the Idomain to the upper surface of the b-propeller domain, as ininteractions between G protein a and b subunits. Conforma-tional changes in the b-propeller domain similar to those thatregulate ligand binding might regulate I domain binding to theb-propeller domain and, in turn, regulate ligand binding by theI domain. Specifically, a conformational change in the Idomain MIDAS that correlates with binding of a surrogateligand and that is linked to a substantial change in position ofthe C-terminal a-helix (17–20) would be expected to be linkedto movements of the I domain relative to the b-propellerdomain.

    Conclusion

    The convergence of multiple, independent lines of evidenceprovides strong support for the conclusion that the FG-GAPrepeats of integrins fold into a b-propeller domain. Indeed,GO, G beta, integrins, and PIPLD appear so similar that it islikely that they evolved from a common ancestral b-propellerdomain, as previously suggested for GO and G beta (24). It isinteresting that integrins have statistically significant sequencehomology to the enzyme PIPLD, which, like integrins, ispredicted here to have a b-propeller domain. This demon-strates an explicit evolutionary link between enzymatic andnonenzymatic proteins with b-propeller folds. It is also inter-esting that like integrins and PIPLD, most b-propeller en-zymes are extracellular—i.e., membrane-bound (40, 41), se-creted (38, 42, 60, 61), or periplasmic (39, 49, 58). Theb-propeller domain appears to be well suited to many differenttypes of functions. Indeed, because there are so many differentmembers of the integrin (1–4) and G protein b-subunit orWD40 families (51), members of the b-propeller family in-volved in ligand binding, regulation, and signaling may bemorenumerous than the enzymes.Previous fold predictions have been reviewed (62, 63). The

    prediction that the integrin I domain had the same fold as thesmall G protein ras—i.e., the double-twisted, nucleotide-binding fold (64) is most relevant. This was an outstanding andcorrect prediction (62), although the subsequent structurerevealed a different variant of this fold, in which the order oftwo of the b-strands in the sheet is reversed (17). Among foldpredictions, a b-propeller is auspicious. These are among thelargest protein domains known, and therefore the relativedisposition of a large number of amino acid residues can bepredicted. Furthermore, the b-propeller fold is highly stereo-typed (36). In most protein folds, variation can occur in thenumber of secondary structure elements and in the order inwhich the elements occur in the amino acid sequence. Bycontrast, all known b-propellers contain four b-strands persheet and an invariant pattern for threading the sequencethrough the three-dimensional structure.Three-dimensional models of the integrin b-propeller do-

    main are therefore likely to be more accurate than usuallyexpected for models based on templates that lack statisticallysignificant sequence homology. Needless to say, there will bediscrepancies with the actual structures, and accuracy will varyas function of position in the fold. The strands and nearbypositions in loops with clear correspondence to the templatewill be predicted the most accurately. It is propitious that thesepositions include residues thought to be important in bindingligand and possibly a Mg21 ion, on the upper surface of theb-propeller domain. Some 4-1 and 2-3 loops in integrins areparticularly long and are not predictable; in b-propellers thesemay go up in the general direction of the pseudosymmetry axis,extend in a radial direction, or even drape over the outercylindrical surface. However, the adjacency of loops is welldefined in b-propellers as shown by inspection of G beta andGO. On the upper surface, most loops contact other loops inthe same and two adjacent W; there are few more distant

    FIG. 4. Superimposition of the aM integrin subunit I domain andb-propeller domain on a trimeric G protein. (A) The aM I domain(magenta; ref. 17), b-propeller domain (purple), and polypeptidefinger (white) are shown with Ca traces. Mg21 and Ca21 ions are greenand grey spheres, respectively. Residues at the ends of the domains thatmay be connected by linker segments are shown in yellow. Accordingto the superimposition, the a carbons of C128 and D132 are 7.4 Åapart, and K315 and S329 are 11 Å apart. (B) The trimeric G proteina-subunit (magenta) and bound GDP (space-filling), b-subunit (pur-ple), and g-subunit (cyan; ref. 44) are shown in the same orientation.The residues in yellow are separated by 5–7 residues from lipid-modified termini that bind to the inner face of the plasma membrane(37, 44). These interactions with the plasma membrane increaseassociation between the a-subunit and the bg-subunit dimer and tosome extent may act like the covalent linkage between the I domainand b-propeller domain in integrin a-subunits. The aM b-propellerdomain was modeled on a4, and superimposed on G beta with thealignment in Fig. 2 and a 2y7 rotation on the pseudosymmetryaxis—i.e. with aM W1 on G beta W6. A total of 49 residues in aM Idomain elements bA, a1, bB, bD, bE, bF, and a7 (17) was superim-posed on Ga b1, a1, b3, b4, b5, b6, and a5 (44), respectively, with anRMS of 1.56 Å using LOOK. The figure was made with LOOK.

    Biochemistry: Springer Proc. Natl. Acad. Sci. USA 94 (1997) 71

    Dow

    nloa

    ded

    by g

    uest

    on

    June

    13,

    202

    1

  • contacts. On the lower surface, loops are more regular and theinteractions more circumscribed. The 1-2 loops only interactwith 1-2 loops in neighboringW and with 3-4 loops in the sameand in the following W; 3-4 loops do not interact withneighboring 3-4 loops. One prediction of adjacencies by theb-propeller model has already been confirmed; the 1-2 loop ofW5 is in close proximity to the 3-4 loop of W6 in the aMintegrin subunit (C. Oxvig and T.A.S., unpublished).

    I thank members of my family and laboratory for their understand-ing during this project. Biocomputing resources on the internet,particularly at European Molecular Biology Laboratory (EMBL),were essential for this work. I am grateful to the C. Sander group andespecially to B. Rost for his support of the PIR list submission formatfor PHD; G. Vriend for WHAT IF; D. Jones for THREADER; A. Sáli forMODELLER and advice; J.-H. Wang for advice; S. Sprang and J. Sondekfor G protein coordinates; R. Liddington for I domain coordinates; Y.Takata for a preprint; T. Huynh and C. Huang for installation and helpwith programs; C. Oxvig for help with editing sequences; J. Casasnovasfor help with RIBBONS; and S. Chen for help with spreadsheets forcoordinate transformations and analysis of b-propellers. This work wassupported by National Institutes of Health Grants CA31798 andCA31799.

    1. Springer, T. A. (1990) Nature (London) 346, 425–433.2. Hemler, M. E. (1990) Annu. Rev. Immunol. 8, 365–400.3. Hynes, R. O. (1992) Cell 69, 11–25.4. Smyth, S. S., Joneckis, C. C. & Parise, L. V. (1993) Blood 81,

    2827–2843.5. Dustin, M. L. & Springer, T. A. (1989) Nature (London) 341,

    619–624.6. Diamond, M. S. & Springer, T. A. (1994) Curr. Biol. 4, 506–517.7. Ginsberg, M. H. (1995) Biochem. Soc. Trans. 23, 439–446.8. Weisel, J. W., Nagaswami, C., Vilaire, G. & Bennett, J. S. (1992)

    J. Biol. Chem. 267, 16637–16643.9. Loftus, J. C., Smith, J. W. & Ginsberg, M. H. (1994) J. Biol.

    Chem. 269, 25235–25238.10. Corbi, A. L., Miller, L. J., O’Connor, K., Larson, R. S. &

    Springer, T. A. (1987) EMBO J. 6, 4023–4028.11. Tuckwell, D. S., Brass, A. & Humphries, M. J. (1992) Biochem. J.

    285, 325–331.12. D’Souza, S. E., Ginsberg, M. H., Burke, T. A. & Plow, E. F.

    (1990) J. Biol. Chem. 265, 3440–3446.13. Smith, J. W. & Cheresh, D. A. (1990) J. Biol. Chem. 265, 2168–

    2172.14. Irie, A., Kamata, T., Puzon-McLaughlin, W. & Takada, Y. (1995)

    EMBO J. 14, 5550–5556.15. Loftus, J. C., Halloran, C. E., Ginsberg, M. H., Feigen, L. P.,

    Zablocki, J. A. & Smith, J. W. (1996) J. Biol. Chem. 271, 2033–2039.

    16. Kamata, T., Irie, A., Tokuhira, M. & Takada, Y. (1996) J. Biol.Chem. 271, 18610–18615.

    17. Lee, J.-O., Rieu, P., Arnaout, M. A. & Liddington, R. (1995) Cell80, 631–638.

    18. Qu, A. & Leahy, D. J. (1995) Proc. Natl. Acad. Sci. USA 92,10277–10281.

    19. Lee, J.-O., Bankston, L. A., Arnaout, M. A. & Liddington, R. C.(1995) Structure (London) 3, 1333–1340.

    20. Qu, A. & Leahy, D. J. (1996) Structure (London) 4, 931–942.21. Huang, C. & Springer, T. A. (1995) J. Biol. Chem. 270, 19008–

    19016.22. Devereux, J., Haeberli, P. & Smithies, O. (1984) Nucleic Acids

    Res. 12, 387–395.23. Rost, B. (1996) Methods Enzymol. 266, 525–539.24. Sondek, J., Bohm, A., Lambright, D. G., Hamm, H. E. & Sigler,

    P. B. (1996) Nature (London) 379, 369–374.25. Vyas, N. K., Vyas, M. N. & Quiocho, F. A. (1987) Nature (Lon-

    don) 327, 635–638.26. Kabsch, W. & Sander, C. (1983) Biopolymers 22, 2577–2637.27. Vriend, G. (1990) J. Mol. Graphics 8, 52–56.28. Barclay, A. N., Birkeland, M. L., Brown, M. H., Beyers, A. D.,

    Davis, S. J., Somoza, C. & Williams, A. F. (1993) The LeucocyteAntigen Facts Book (Academic, London).

    29. Calvete, J. J., Henschen, A. & Gonzalez-Rodriguez, J. (1989)Biochem. J. 261, 561–568.

    30. Branden, C. & Tooze, J. (1991) Introduction to Protein Structure(Garland, New York), pp. 1–302.

    31. Springer, T. A. (1994) Cell 76, 301–314.32. Mahoney, P. A., Weber, U., Onofrechuk, P., Beissmann, H.,

    Bryant, P. J. & Goodman, C. S. (1991) Cell 67, 853–868.33. Diamond, M. S., Garcia-Aguilar, J., Bickford, J. K., Corbi, A. L.

    & Springer, T. A. (1993) J. Cell Biol. 120, 1031–1043.34. Schiffer, S. G., Hemler, M. E., Lobb, R. R., Tizard, R. & Osborn,

    L. (1995) J. Biol. Chem. 270, 14270–14273.35. Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia, C. (1995)

    J. Mol. Biol. 247, 536–540.36. Murzin, A. G. (1992) Proteins 14, 191–201.37. Wall, M. A., Coleman, D. E., Lee, E., Iniguez-Lluhi, J. A., Pos-

    ner, B. A., Gilman, A. G. & Sprang, S. R. (1995) Cell 83, 1047–1058.

    38. Ito, N., Phillips, S. E. V., Stevens, C., Ogel, Z. B., McPherson,M. J., Keen, J. N., Yadav, K. D. S. &Knowles, P. F. (1991)Nature(London) 350, 87–90.

    39. Vellieux, F. M. D., Huitema, F., Groendijk, H., Kalk, K. H., Jzn,J. F., Jongejan, J. A., Duine, J. A., Petratos, K., Drenth, J. & Hol,W. G. J. (1989) EMBO J. 8, 2171–2178.

    40. Colman, P. M., Varghese, J. N. & Laver, W. G. (1983) Nature(London) 303, 41–44.

    41. Burmeister, W. P., Ruigrok, R. W. H. &Cusack, S. (1992)EMBOJ. 11, 49–56.

    42. Crennell, S. J., Garman, E. F., Laver, W. G., Vimr, E. R. &Taylor, G. L. (1993) Proc. Natl. Acad. Sci. USA 90, 9852–9856.

    43. Tuckwell, D. S., Humphries, M. J. & Brass, A. (1994) Cell Adhes.Commun. 2, 385–402.

    44. Lambright, D. G., Sondek, J., Bohm, A., Skiba, N. P., Hamm,H. E. & Sigler, P. B. (1996) Nature (London) 379, 311–319.

    45. Sáli, A. & Blundell, T. L. (1993) J. Mol. Biol. 234, 779–815.46. Scallon, B. J., Fung, W.-J. C., Tsang, T. C., Li, S., Kado-Fong, H.,

    Huang, K.-S. & Kochan, J. P. (1991) Science 252, 446–448.47. Jones, D. T., Taylor, W. R. & Thornton, J. M. (1992) Nature

    (London) 358, 86–89.48. Rost, B. (1995) in Proceedings of the Third International Confer-

    ence on Intelligent Systems for Molecular Biology: Cambridge, U.K.,ed. Rawlings, C. (AAAI, Menlo Park, CA), pp. 314–321.

    49. Xia, Z.-x., Dai, W.-w., Xiong, J.-p., Hao, Z.-p., Davidson, V. L.,White, S. & Mathews, F. S. (1992) J. Biol. Chem. 267, 22289–22297.

    50. Pujades, C., Teixido, J., Bazzoni, G. & Hemler, M. E. (1996)Biochem. J. 313, 899–908.

    51. Neer, E. J., Schmidt, C. J., Nambudripad, R. & Smith, T. F.(1994) Nature (London) 371, 297–300.

    52. Chan, A. W. E., Hutchinson, E. G., Harris, D. & Thornton, J. M.(1993) Protein Sci. 2, 1574–1590.

    53. Brown, N. H., King, D. L., Wilcox, M. & Kafatos, F. C. (1989)Cell 59, 185–195.

    54. Ziober, B. L., Vu, M. P., Waleh, N., Crawford, J., Lin, C.-S. &Kramer, R. H. (1993) J. Biol. Chem. 268, 26773–26783.

    55. Delwel, G. O., Kuikman, I. & Sonnenberg, A. (1995) Cell Adhes.Commun. 3, 143–161.

    56. Zhang, X.-P., Puzon-McLaughlin, W., Irie, A., Kovach, N.,Prokopishyn, N. L., Laferte, S., Takeuchi, K. I., Tsuji, T. &Takada, Y. (1997) J. Biol. Chem., in press.

    57. Kamata, T., Puzon, W. & Takada, Y. (1995) Biochem. J. 305,945–951.

    58. Xia, Z.-x., Dai, W.-w., Zhang, Y.-f., White, S. A., Boyd, G. D. &Mathews, F. S. (1996) J. Mol. Biol. 259, 480–501.

    59. Faber, H. R., Groom, C. R., Baker, H. M., Morgan, W. T., Smith,A. & Baker, E. N. (1995) Structure (London) 3, 551–559.

    60. Li, J., Brick, P., O’Hare, M. C., Skarzynski, T., Lloyd, L. F.,Curry, V. A., Clark, I. M., Bigg, H. F., Hazleman, B. L., Cawston,T. E. & Blow, D. M. (1995) Structure (London) 3, 541–549.

    61. Libson, A. M., Gittis, A. G., Collier, I. E., Marmer, B. L., Gold-berg, G. I. & Lattman, E. E. (1995) Nat. Struct. Biol. 2, 938–942.

    62. Russell, R. B. & Sternberg, M. J. E. (1995)Curr. Biol. 5, 488–490.63. Sippl, M. J. & Flockner, H. (1996) Structure (London) 4, 15–19.64. Edwards, Y. J. K. & Perkins, S. J. (1995) FEBS Lett. 358,

    283–286.65. Carson, M. (1991) J. Appl. Crystallogr. 24, 958–961.

    72 Biochemistry: Springer Proc. Natl. Acad. Sci. USA 94 (1997)

    Dow

    nloa

    ded

    by g

    uest

    on

    June

    13,

    202

    1