protein structures - carey.pdf

Upload: soundarya-chandramouleeswaran

Post on 03-Apr-2018

  • 7/28/2019 Protein structures - carey.pdf

    1/20

    Previously published in

    Biophysical Society On-line Textbook

    PROTEINS

    CHAPTER 1. PROTEIN STRUCTURE

    Section 1.

    Primary structure, secondary motifs,

    tertiary architecture, and quaternary organization

    Jannette Carey* and Vanessa Hanley^

    *Department of Chemistry

    ^Department of Chemical Engineering

    Princeton University

    Princeton, NJ 08544-1009

    *corresponding author

    (609) 258-1631 phone

    (609) 258-6746 FAX

    [email protected]

    1. The amino acid building blocks

    Proteins are polymeric chains that are built from monomers called amino

    acids. All structural and functional properties of proteins derive from the chemical

    properties of the polypeptide chain. There are four levels of protein structural

    organization: primary (1°), secondary (2°), tertiary (3°), and quaternary (4°). Primary

    structure is defined as the linear sequence of amino acids in a polypeptide chain.

    The secondary structure refers to certain regular geometric figures of the chain.

    Tertiary structure results from long-range contacts within the chain. The quaternary

    structure is the organization of protein subunits, or two or more independent

    polypeptide chains.

    Amino acids are the chemical constituents of proteins, and are characterized

    by a central alpha carbon atom. The alpha indicates the priority position from which

    the numbering follows for all subordinate groups. Four substituents are connected

    to this Cα: one substituent is the alpha proton -H, another is the side chain -R that

    gives rise to the chemical variety of the amino acids, the third is the carboxylic acid

    functional group (-COOH), and the fourth is the amino functional group (-NH2). The

    α carbon is the asymmetric center of the molecule for all 20 amino acids except

    glycine, which has only a proton as its side chain. The configuration about the

    α carbon center must be the L-isomer for proteins synthesized on the ribosome. This is

    probably an accident of chemical evolution where the L-isomer happens to be the

    one chosen for early prebiotic systems and fixed into evolutionary history.

    figure 1

    The side chains have a wide chemical variety which is vital for the unique

    functions of biological proteins (Figure 1). These side chains can be grouped into

    three categories: nonpolar, uncharged polar, and charged polar. The simplest amino

    acid is glycine. Alanine, valine, leucine, isoleucine, and proline are amino acids

    whose side chains are entirely aliphatic. Alanine has a methyl group as its side

    chain. Valine has two methyl groups connected to the β carbon, and this residue is

    said to be β-branched. Leucine has one more carbon atom in the side chain than

    valine, so that two methyl groups are attached to Cγ. Leucine and isoleucine are

    isomers whose only difference in structure is the position of the methyl groups.

    Isoleucine is a β-branched amino acid and has a second asymmetric center at the

    β carbon. Proline contains an aliphatic side chain that is covalently bonded to the

    nitrogen atom of the α-amino group, forming an imide bond and leading to a

    constrained 5-membered ring.

    Side chains that are generally nonpolar have low solubility in water because

    they can form only van der Waals interactions with water molecules. On the other

    hand, the rest of the amino acids contain heteroatoms in their side chains, opening

    many bonding possibilities. The uncharged members of this group include: serine,

    threonine, asparagine, glutamine, tyrosine, and tryptophan. Serine, threonine, and

    tyrosine contain hydroxyl groups so they can function as both hydrogen bond

    donors and acceptors, and threonine also has a methyl group, making it β-branched.

    The benzene ring of tyrosine permits stabilization of the anionic phenolate form

    upon loss of the hydroxyl proton, which has a pKa near 10. Serine and threonine

    cannot be deprotonated at ordinary pH values. Asparagine and glutamine side

    chains are relatively polar in that they can both donate and accept hydrogen bonds.

    The nitrogen and proton of the tryptophan indole side chain can also participate in

    hydrogen bond interactions.

    Another group of polar residues can bear a full, formal charge depending on

    pH, but their pKa values are such that the charged form is largely populated near

    neutral pH. These include lysine, arginine, histidine, aspartic acid, glutamic acid,

    and cysteine. Lysine and arginine are two basic amino acids that can bear a positive

    charge at the end of their side chain. The lysine ε-amino group has a pKa value near

    10, while the arginine guanidino group has a pKa value of ~12. Histidine is another

    basic residue with its side chain organized into a closed ring structure that contains 2

    nitrogen atoms. One of these nitrogens already has a proton on it, but the other one

    has an available position that can take up an extra proton and form a positively

    charged histidine group with a pKa of about 6. Aspartic acid and glutamic acid differ

    only in the number of methylene (-CH2-) groups in the side chain, with one and two

    methylene groups, respectively. Their carboxylate groups are extremely polar and

    can both donate and accept hydrogen bonds, and have pKa values near 4.5.

    The sulfhydryl (thiol) group of cysteine can ionize at slightly alkaline pH

    values, with a pKa near 9. The thiolate form can react with a second sulfhydryl to

    form a disulfide bond that is reversible by reduction. Methionine has a long alkyl

    side chain that also contains a sulfur heteroatom but is hydrophobic. This sulfur

    atom is relatively inert as a hydrogen bond acceptor.

    A group of three amino acids that all have aromatic side chains are

    phenylalanine, tryptophan, and tyrosine. The aromatic ring of phenylalanine is like

    that of benzene or toluene. It is very hydrophobic and chemically reactive only

    under extreme conditions, though its ring electrons are readily polarized. The side

    chain of histidine is arguably aromatic; it meets the 4n+2 π-electron rule in one

    of its protonation states, but does not have the characteristic strong near-UV

    absorption of the other three aromatic amino acids. The UV spectra of the three

    aromatic groups are distinctive, as are their extinction coefficients, and these

    properties are reflected in the electronic spectra of polypeptides.

    Finally, the α-amino and α-carboxylate groups of amino acids can also ionize,

    with pKa's of 6.8-7.9 and 3.5-4.3, respectively, for the aliphatic amino acids; nearby

    charged side chains can alter the pKa's of these groups. Each amino acid

    incorporated into a polypeptide chain is referred to as a residue. Thus, only the

    amino- and carboxyl-terminal residues possess available α-amino and

    α-carboxylate groups, respectively.
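
The pKa values quoted in this section can be turned into approximate charge states with the Henderson-Hasselbalch relation. The following is only a minimal sketch: it uses the rounded side-chain pKa values given in the text, treats each group as isolated (no neighboring charges), and the names `SIDE_CHAIN_PKA`, `fraction_deprotonated`, and `mean_charge` are illustrative choices, not part of the original chapter.

```python
# Approximate side-chain pKa values as quoted in the text above.
# "acid" groups are neutral when protonated and negative when deprotonated;
# "base" groups are positive when protonated and neutral when deprotonated.
SIDE_CHAIN_PKA = {
    "Asp": (4.5, "acid"),
    "Glu": (4.5, "acid"),
    "His": (6.0, "base"),
    "Cys": (9.0, "acid"),
    "Tyr": (10.0, "acid"),
    "Lys": (10.0, "base"),
    "Arg": (12.0, "base"),
}


def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of the group that has lost its proton."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def mean_charge(residue: str, ph: float) -> float:
    """Average charge of an isolated side chain at the given pH."""
    pka, kind = SIDE_CHAIN_PKA[residue]
    f = fraction_deprotonated(pka, ph)
    # Acids carry -1 once deprotonated; bases carry +1 while still protonated.
    return -f if kind == "acid" else 1.0 - f


if __name__ == "__main__":
    for name in SIDE_CHAIN_PKA:
        print(f"{name}: mean charge at pH 7.0 = {mean_charge(name, 7.0):+.3f}")
```

Under these assumptions, groups whose pKa lies far from 7 (Asp, Glu, Lys, Arg) come out almost fully charged at neutral pH, while groups with a pKa closer to 7 are only partly ionized. In a real protein the local environment can shift these values considerably.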

    2. The polypeptide chain

    In order to form the amino acid monomers into a polymeric chain, amino

    acids are condensed with one another through dehydration synthesis. This reaction

    occurs when water is lost between the carboxylic functional group of one amino acid

    and the amino functional group of the next to form a C-N bond. These

    polymerization reactions are not spontaneous; however, they can be arranged to

    occur through the energy-driven action of the ribosome. Ribosomes are complexes

    of proteins and RNA that translate a gene sequence in the form of mRNA into a

    protein sequence. The 20 amino acids listed above are encoded by the genes and

    incorporated by the ribosomal machinery during protein synthesis. Other minor

    amino acids are not incorporated by ribosomes, but are derived by post-translational

    modifications.
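
The bookkeeping of the condensation reaction described above can be sketched in a few lines: each peptide bond formed releases one water molecule, so a chain of n residues weighs the sum of its free amino acids minus (n - 1) waters. The molar masses below are standard reference values rather than numbers from this chapter, and the function name `peptide_mass` is a hypothetical helper chosen for illustration.

```python
WATER = 18.015  # g/mol, one water is lost per peptide bond formed

# Molar masses of the free amino acids in g/mol (reference values, not from the text).
FREE_AMINO_ACID_MASS = {
    "Gly": 75.067,
    "Ala": 89.094,
    "Ser": 105.093,
}


def peptide_mass(sequence):
    """Molar mass of a linear peptide built by dehydration synthesis."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    total = sum(FREE_AMINO_ACID_MASS[residue] for residue in sequence)
    # n residues are joined by n - 1 peptide bonds, each releasing one water.
    return total - WATER * (len(sequence) - 1)


if __name__ == "__main__":
    print(f"Gly-Ala: {peptide_mass(['Gly', 'Ala']):.3f} g/mol")
    print(f"Gly-Ala-Ser: {peptide_mass(['Gly', 'Ala', 'Ser']):.3f} g/mol")
```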

    The reverse reaction, involving hydrolysis of the peptide bond, does not proceed

    at an appreciable rate on its own either. It can be accomplished chemically, but only under very

    vigorous conditions. For example, treatment with very strong acid (1 molar HCl)

    and boiling at 100 °C overnight can hydrolyze the peptide bonds. So, the reverse

    hydrolysis reaction actually happens very slowly under normal conditions. Thus,

    proteins are chemically and biologically stable unless they are deliberately

    depolymerized. The decomposition of a polypeptide chain into individual amino

    acids can also be facilitated by hydrolytic enzymes.

    Most proteins are heteropolymeric (i.e., they contain most or all the different

    amino acids). Only rarely do regions of proteins consist of sequences composed of

    just a few amino acids. Any region of a typical protein will therefore have a

    chemically heterogeneous environment. This heterogeneity is further amplified by

    the higher levels of protein structure, as we will see.

    3. The peptide bond

    The peptide bond between two amino acids is a special case of an amide bond

    flanked on both sides by α-carbon atoms. Peptide bond angles and lengths are

    well-known from many direct observations of protein and peptide structures. The

    peptide bond (C-N) length is observed to be 1.33 Å (Figure 2A). This is considerably

    shorter than the adjacent (nonpeptide) Cα-N bond length of 1.45 Å, but longer than

    the C=O bond length of 1.23 Å.

    Figure 2A

    Figure 2B

    These bond lengths and angles reflect the distribution of electrons between

    atoms due to differences in polarity of the atoms, and the hybridization of their

    bonding orbitals. The two more electronegative atoms, O and N, can bear partial

    negative charges, and the two less electronegative atoms, C and H, can bear partial

    positive charges. The peptide group consisting of these four atoms can be thought of

    as a resonance structure. (Figure 2B) Thus, the peptide bond has partial double bond

    character, accounting for its intermediate bond length.

    Like any double bond, the peptide bond resists rotation,

    with an energy barrier of 3 kcal/mole between cis and trans forms. These two

    isomers are defined by the path of the polypeptide chain across the bond (Figure 3).

    Successive α carbons in the chain (i, i+1) are on the same side of the bond in the cis

    isomer as opposed to the staggered conformation of the trans isomer. For all amino

    acids but proline, the cis configuration is greatly disfavored because of steric

    hindrance between adjacent side chains. Ring closure in the proline side chain

    draws the carbon away from the preceding residue, leading to lower steric

    hindrance across the X-Pro peptide bond. In most residues, the trans to cis

    distribution about this bond is about 90:10, but with proline, the trans to cis

    distribution is about 70:30.
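
The trans to cis distributions quoted above (about 90 to 10, and about 70 to 30 for proline) can be converted into free-energy differences if one assumes the two isomers are in equilibrium and Boltzmann-distributed. This is only an illustrative sketch: the temperature of 298.15 K and the function name `delta_g_trans_to_cis` are assumptions, and the result is an equilibrium free-energy difference between the isomers, not the rotational barrier.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def delta_g_trans_to_cis(percent_trans, percent_cis, temperature_k=298.15):
    """Free energy of the cis isomer relative to trans, in kcal/mol.

    Uses delta G = -RT ln K with K = [cis]/[trans], which is the same as
    RT ln([trans]/[cis]).
    """
    return R_KCAL * temperature_k * math.log(percent_trans / percent_cis)


if __name__ == "__main__":
    for label, trans, cis in [("typical residue", 90, 10), ("X-Pro", 70, 30)]:
        dg = delta_g_trans_to_cis(trans, cis)
        print(f"{label}: cis lies about {dg:.2f} kcal/mol above trans")
```

With these inputs the cis form comes out roughly 1.3 kcal/mol above trans for the 90 to 10 case and roughly 0.5 kcal/mol for the proline case, which makes the smaller penalty for X-Pro bonds explicit.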

    Figure 3

    Also like any other double bond, certain atoms are confined to a single plane

    about the peptide bond. The group of six atoms between successive α-carbon atoms,

    inclusive, lies in one plane exactly as do the six atoms of ethene. These six atoms are

    shown in Figure 3. In the trans configuration of the peptide bond, the combined

    effects of polarity and planarity result in a permanent small dipole moment across

    the peptide bond, with its negative end on the side of the carbonyl oxygen. The

    planarity of the peptide bond has additional profound consequences for polypeptide

    structure, as we shall see.

    4. Restrictions on bond rotations

    While there is restricted rotation about the peptide bond, there is free rotation

    about the four bonds to the α-carbon of each residue. Two of these rotations are of

    particular relevance for the structure of the polypeptide backbone. To fully

    appreciate these rotations, we must shift our perspective from the peptide-bond-

    centered view of Figure 3 to the Cα-centered view of Figure 4A. The bond from the

    α carbon to the carbonyl carbon of that residue is given the name ψ. Similarly, the

    bond from the α carbon to the amino group of that residue is given the name φ.

    Because Cα is one of the six planar atoms of the peptide group, rotation about φ or

    ψ flanking Cα rotates the entire plane of the peptide group (Figure 4B).
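
In practice the two backbone rotation angles are measured as dihedral (torsion) angles over four consecutive backbone atoms: by the usual convention, φ over C(i-1), N, Cα, C and ψ over N, Cα, C, N(i+1). The sketch below computes a signed dihedral from four points using the standard atan2 formulation; the helper names and the toy coordinates are illustrative assumptions, not real protein coordinates or anything from the original chapter.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond.

    Zero means p0 and p3 eclipse each other (cis); a magnitude of 180 means
    they are on opposite sides (trans).
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Toy points: the central bond runs along z, the outer atoms stick out sideways.
    a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
    for d in [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (-1.0, 0.0, 1.0)]:
        print(d, round(dihedral_degrees(a, b, c, d), 1))
```

Applied to the four atoms that define φ or ψ for every residue of a structure, a function of this kind yields the pairs of angles that are plotted against each other to map out the accessible conformations.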

    Figure 4A Figure 4B

    Since the entire plane rotates on either side of Cα, certain values of the angles

    φ and ψ cannot be achieved due to steric occlusion. The allowed regions of φ,ψ space

    differ for each amino acid because of the restriction due to Cβ and its substituents.

    However, even for glycine, some angles are not allowed.

    Figure 5A

    Figure 5B

    The allowed regions of φ,ψ space for each amino acid are displayed on

    Ramachandran plots. The allowed regions can be defined in terms of the energetic

    cost that must be paid to enter a disallowed region (Figure 5B), or in terms of the

    limiting so-called hard-sphere boundary when atoms clash (Figure 5A). For

    β-branched residues the restrictions are severe, and only a small fraction of φ,ψ space is

    allowed. Valine and isoleucine have access to only about 5% of all φ,ψ space.

    However, all residues have access to at least part of the most favorable regions of φ,ψ

    space in the upper and lower left of the plot. As we will see shortly, it turns

    out that these two regions correspond to combinations of φ and ψ angles that

    characterize the two common regular secondary structures that can be adopted by

    the polypeptide backbone, the α-helix and β-strand.

    Note that there is an energy barrier between the α-helical region of φ,ψ space

    and the β-strand region. Thus, direct conversion between α- and β-structures is

    restricted even though most residues are allowed in both regions. Two conclusions

    from recent structural analysis of proteins and peptides are relevant to this point.

    First, the peptide bond deviates slightly from planarity in a surprisingly large

    fraction of cases (10). Presumably, the observed range of peptide bond angles has the

    effect of slightly enlarging the allowed φ,ψ space, and perhaps reducing the α/β

    barrier. Second, in protein structures certain residues are overrepresented outside

    the allowed regions, and these tend to be the small polar residues (11). Presumably,

    these can form favorable local interactions that compensate for the energetic penalty

    in those φ,ψ regions.

    5. Secondary structures

    Since the restrictions on φ,ψ space arise in part from steric hindrance between

    side chain and backbone, this same steric hindrance is the origin of α and β

    secondary structures. There is no sequence dependence on the steric restrictions of

    the φ and ψ space because φ,ψ restrictions arise within each residue rather than

    between residues. However, a sequence of residues that all have similar allowed φ,ψ

    space can give rise to a chain segment that forms α or β structures. Thus, these

    secondary structures owe their formation to both backbone and side chain steric

    restrictions. This analysis provides an important insight into the origins of protein

    secondary structures: these structures are intrinsically favorable for the chain under

    all conditions, independent of considerations about bonding.

    The α helix structure looks like a spring. The most common shape is a right-

    handed helix defined by the repeat length of 3.6 amino acid residues and a rise of

    5.4 Å per turn. Thus residues (i+3) and (i+4) are closest to residue (i) in the helix

    (Figure 6A). The pitch and dimensions of the helix also bring the peptide dipole

    moments of successive residues into proximity such that their opposite charges

    neutralize each other substantially in the middle of a helix (Figure 6B). At the ends,

    the peptide dipole cannot be neutralized by this mechanism, resulting in a net helix

    macrodipole of approximately one-half unit of charge at each end. This charge may

    be neutralized by nearby side chains.
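
The helix dimensions given above (3.6 residues and 5.4 Å per turn) fix the position of every residue relative to residue (i). The short sketch below works this out for an idealized helix; it assumes perfectly regular geometry, and the name `helix_offset` is an illustrative helper rather than anything defined in the chapter.

```python
RESIDUES_PER_TURN = 3.6
PITCH_ANGSTROM = 5.4  # rise along the helix axis per full turn


def helix_offset(k):
    """Position of residue i+k relative to residue i in an ideal helix.

    Returns (rise along the axis in angstroms, rotation about the axis in
    degrees, folded into the range -180 to 180).
    """
    rise = k * PITCH_ANGSTROM / RESIDUES_PER_TURN
    angle = (k * 360.0 / RESIDUES_PER_TURN) % 360.0
    if angle > 180.0:
        angle -= 360.0
    return rise, angle


if __name__ == "__main__":
    for k in range(1, 6):
        rise, angle = helix_offset(k)
        print(f"i+{k}: rise {rise:.1f} A, rotation {angle:+.0f} degrees")
```

Each residue advances 1.5 Å along the axis and turns by 100 degrees, so residues (i+3) and (i+4) end up only 60 and 40 degrees away from residue (i) around the axis, on the same face of the helix and about one turn above it.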

    Figure 6A

    Figure 6B

    The pitch and dimensions of the helix also bring the amide proton of residue

    (i+3) or (i+4) into proximity to the carbonyl oxygen of residue (i) such that a

    hydrogen bond can form. All peptide group hydrogen bond donors and acceptors are

    satisfied in the central part of the helical segment, but not at the ends. While

    structural evidence clearly indicates that these hydrogen bonds are highly populated

    in helical segments of proteins, their contribution to helix stability is less clear since

    donors and acceptors would be satisfied by hydrogen bonding to water in nonhelical

    structures. However, φ,ψ restrictions can have the effect of preorganizing the chain

    into a helical conformation, which may favor hydrogen bonding by enhancing the

    local concentration of donors and acceptors.

    β strands are the other regular secondary structure that proteins form (Figure

    7A). These are extended structures in which successive peptide dipole moments

    alternate direction along the chain. Because it is an extended structure, φ,ψ steric

    hindrance is reduced in the β strand, and the β region of φ,ψ space is larger than the

    α region. Two or more β strand segments can pair by hydrogen bonding and dipolar

    interactions to form a β-sheet. Unlike helical segments, all peptide group hydrogen

    bond donors and acceptors are satisfied not within but between β-strand segments;

    thus individual β-strands do not have an independent existence.

    Also unlike a helical segment, adjacent strands of a sheet can come from

    sequentially distant segments of the chain; rarely, this can occur even within one

    strand of a sheet. β sheets can consist of either parallel or antiparallel strands, or a

    mixture of the two. In purely antiparallel sheets, segments that are sequentially next

    to each other in the primary structure often form adjacent strands.

    Figure 7A

    Figure 7B

    However, even when forming a hairpin from contiguous chain segments, linearly

    distant residues are brought into proximity at the N- and C-terminal ends of the

    hairpin (Figure 7B). Thus, while a β-strand is a secondary structure element because

    of its geometrically regular features, a β-sheet can be thought of as a tertiary

    structural feature because it is intrinsically nonlocal. This example illustrates that

    the distinctions between secondary and tertiary structural features are not entirely

    clear.

    So-called turn structures are also classified as secondary structural elements,

    but unlike helices and strands, they do not have a repeating, regular geometry.

    Rather, they can have well-defined spatial dispositions defined by certain values of

    φ and ψ angles that often require specific residue types and/or sequences, as well as

    fixed hydrogen bonding patterns. Most turns are local in the primary structure, but

    omega loops (12) can have a large number of intervening residues lacking defined

    geometries, with the turn being defined by the conformations of residues that form

    the constriction that gives this turn its name (Ω).

    Turns are essential for allowing the polypeptide chain to fold back upon itself

    to form tertiary interactions. Such interactions are generally long-range, and result

    in compaction of the protein into a globular, often approximately spherical, form.

    The turn regions are thus generally located on the outside of the globular structure,

    with helices and/or sheets forming its core. Turns on the surfaces of proteins have a

    wide range of dynamics, from quite mobile in cases where they form few

    interactions with the underlying protein surface to quite fixed due to extensive

    tertiary contacts. Thus, turns are also ambiguously classified as secondary structure

    elements.

    6. Tertiary structures

    The side chains project outward from both α-helical (Figure 6A) and β-strand

    (Figure 7A) structures, and are therefore available for interactions with other

    surfaces through hydrophobic contacts and various kinds of bonding interactions to

    form the tertiary structure. In a helix, the side chains project radially outward, and

    in a β strand successive side chains project alternately up and down. Rotation about

    bonds in the side chain is also restricted, however. The same steric hindrance that

    limits the backbone conformation also limits the side chain conformation about the

    Cα-Cβ bond to preferred rotamers defined by rotation angle χ1. Rotation angles

    beyond χ1 are restricted by side chain packing in the tertiary structure.

    If secondary structural elements result from steric restrictions in φ,ψ space, it

    is less obvious why tertiary structures form. Proteins with highly organized tertiary

    structures generally have a well-developed core of hydrophobic residues contributed

    from most or all of the secondary structure elements in the chain. Thus, secondary

    and tertiary structures are in general intimately and explicitly interconnected. These

    buried residues do not form merely a liquid-like oily interior, but rather are usually

    well-packed, with extensive rotamer restrictions. In aqueous solvents, the

    hydrophobic effect drives the chain toward compaction to relieve unfavorable

    solvation of these exposed side chains, but compaction and internal organization are

    entropically costly due to loss of chain flexibility, and it is likely that these competing

    effects nearly cancel each other energetically.

    On the other hand, upon compaction, bonding interactions with solvent

    molecules are replaced by intramolecular partners, with a likely net gain in

    favorable energetic contributions due to several effects, including lower dielectric

    constant in the incipient interior. Hydrogen bonding is favored within secondary

    structures because these are partially preorganized by φ,ψ restrictions into

    configurations that permit bonding at little additional entropic cost. In the case of

    β-sheet formation, an additional favorable effect may result when two β-strands are

    brought into register, much like DNA duplex formation.

    The view developed in these paragraphs suggests that protein secondary and

    tertiary structures are not independent of each other, but rather interdependent. It

    seems likely that this interdependence is the molecular origin of the extraordinary

    cooperativity of protein structural stability, which is reflected in the observation that

    protein secondary and tertiary structures are lost concomitantly and in an all-or-

    none manner upon changes in environment that disfavor the folded state, such as

    higher temperature or solvent additives.

    7. Quaternary structure

    The highest level of protein structural organization is the quaternary structure. The

    subunits that associate may be identical or not, and their organization may or may

    not be symmetric. In general, quaternary structure results from association of

    independent tertiary structural units through surface interactions, such as

    formation of the hemoglobin tetramer from myoglobin-like monomers. However,

    an increasing number of examples illustrates that tertiary structure can also be

    formed concomitantly with quaternary association in some cases. A notable example

    is the tryptophan repressor protein, which forms a highly intertwined dimer in

    which essentially all tertiary contacts are satisfied only across the subunit interface,

    rather than within each polypeptide chain (13). Thus, subunit assembly is

    necessarily a step in tertiary structure formation. Another example is the cyclin/Cdk

    inhibitor, which like Trp repressor has a well-formed secondary structure but no

    intramolecular tertiary structure; rather, all tertiary interactions are formed through

    its contacts to the binary cyclin/Cdk complex (14). These examples show that the co-

    dependence of tertiary and quaternary structures parallels the co-dependence

    between secondary and tertiary structures, and suggest that the distinctions among

    these levels of the protein structure organizational hierarchy are blurry at best, and

    perhaps even misleading for our understanding of protein structural stability and

    folding.

    8. Literature cited

    1. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H.

    Freeman and Company, New York. pg. 3.

    2. ibid., pg. 5.

    3. Cantor, C. R., and Schimmel, P. R. (1980) in Biophysical Chemistry, W.H.

    Freeman and Company, New York. pg. 41.

    4. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H.

    Freeman and Company, New York. pg. 174.

    5. Cantor, C. R., and Schimmel, P. R. (1980) in Biophysical Chemistry, W.H.

    Freeman and Company, New York. pg. 165.

    6. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H.

    Freeman and Company, New York. pg. 7.

    7. ibid., pg. 160.

    8. Cantor, C. R., and Schimmel, P. R. (1980) in Biophysical Chemistry, W.H.

    Freeman and Company, New York. pg. 256.

    9. Creighton, T. E. (1983) in Proteins: Structures and Molecular Properties, W.H.

    Freeman and Company, New York. pg. 167.

    10. MacArthur, M.W., and Thornton, J.M. (1996) J. Mol. Biol. 264, 1180-1195.

    11. Gunasekaran, K., Ramakrishnan, C., and Balaram, P. (1996) J. Mol. Biol. 264, 191-198.

    12. Fetrow, J. S. (1995) FASEB J. 9, 708-717.

    13. Schevitz, R.W., Otwinowski, Z., Joachimiak, A., Lawson, C.L., and Sigler, P.B.

    (1985) Nature 317, 782-786.

    14. Russo, A.A., Jeffrey, P.D., Patten, A.K., Massague, J., and Pavletich, N.P. (1996) Nature

    382, 325-331.