
Upload: lehanh

Post on 16-Jun-2018


CHAPTER 4

Proteins are the embodiment of the transition from the one-dimensional world of DNA sequences to the three-dimensional world of molecules capable of diverse activities. DNA encodes the sequence of amino acids that constitute the protein. The amino acid sequence is called the primary structure. Functioning proteins, however, are not simply long polymers of amino acids. These polymers fold to form discrete three-dimensional structures with specific biochemical functions. Three-dimensional structure resulting from a regular pattern of hydrogen bonds between the NH and the CO components of the amino acids in the polypeptide chain is called secondary structure. The three-dimensional structure becomes more complex when the R groups of amino acids far apart in the primary structure bond with one another. This level of structure is called tertiary structure and is the highest level of structure that an individual polypeptide can attain. However, many proteins require more than one chain to function. Such proteins display quaternary structure, which can be as simple as a functional protein consisting of two identical polypeptide chains or as complex as one consisting of dozens of different polypeptide chains. Remarkably, the final three-dimensional structure of a protein is determined simply by the amino acid sequence of the protein.

Protein Three-Dimensional Structure

4.1 Primary Structure: Amino Acids Are Linked by Peptide Bonds to Form Polypeptide Chains

4.2 Secondary Structure: Polypeptide Chains Can Fold into Regular Structures

4.3 Tertiary Structure: Water-Soluble Proteins Fold into Compact Structures

4.4 Quaternary Structure: Multiple Polypeptide Chains Can Assemble into a Single Protein

4.5 The Amino Acid Sequence of a Protein Determines Its Three-Dimensional Structure

A spider’s web is a device built by the spider to trap prey. Spider silk, a protein, is the main component of the web. Silk is composed largely of β sheets, a fundamental unit of protein structure. Many proteins have β sheets; silk is unique in being composed almost entirely of β sheets. [ra-photos/Stockphoto.]


In this chapter, we will examine the properties of the various levels of protein structure. Then, we will investigate how primary structure determines the final three-dimensional structure.

4.1 Primary Structure: Amino Acids Are Linked by Peptide Bonds to Form Polypeptide Chains

Proteins are complicated three-dimensional molecules, but their three-dimensional structure depends simply on their primary structure: the linear polymers formed by linking the α-carboxyl group of one amino acid to the α-amino group of another amino acid. The linkage joining amino acids in a protein is called a peptide bond (also called an amide bond). The formation of a dipeptide from two amino acids is accompanied by the loss of a water molecule (Figure 4.1). The equilibrium of this reaction lies on the side of hydrolysis rather than synthesis under most conditions. Hence, the biosynthesis of peptide bonds requires an input of free energy. Nonetheless, peptide bonds are quite stable kinetically because the rate of hydrolysis is extremely slow; the lifetime of a peptide bond in aqueous solution in the absence of a catalyst approaches 1000 years.

Figure 4.1 Peptide-bond formation. The linking of two amino acids is accompanied by the loss of a molecule of water.

Figure 4.2 Amino acid sequences have direction. This illustration of the pentapeptide Tyr-Gly-Gly-Phe-Leu (YGGFL) shows the sequence from the amino terminus to the carboxyl terminus. This pentapeptide, Leu-enkephalin, is an opioid peptide that modulates the perception of pain.

A series of amino acids joined by peptide bonds forms a polypeptide chain, and each amino acid unit in a polypeptide is called a residue. A polypeptide chain has polarity because its ends are different: an α-amino group is present at one end and an α-carboxyl group at the other. By convention, the amino end is taken to be the beginning of a polypeptide chain, and so the sequence of amino acids in a polypeptide chain is written starting with the amino-terminal residue. Thus, in the pentapeptide Tyr-Gly-Gly-Phe-Leu (YGGFL), tyrosine is the amino-terminal (N-terminal) residue and leucine is the carboxyl-terminal (C-terminal) residue (Figure 4.2). The reverse sequence, Leu-Phe-Gly-Gly-Tyr (LFGGY), is a different pentapeptide, with different chemical properties. Note that the two peptides in question have the same amino acid composition but differ in primary structure.
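The distinction between composition and primary structure can be made concrete with a short sketch. The helper names (`one_letter`, `describe`) and the small code table are illustrative assumptions, not part of the text; only the two pentapeptides come from the chapter.

```python
from collections import Counter

# One-letter codes for the four residue types that occur in these peptides.
THREE_TO_ONE = {"Tyr": "Y", "Gly": "G", "Phe": "F", "Leu": "L"}


def one_letter(peptide: str) -> str:
    """Convert 'Tyr-Gly-...' (written amino terminus first) to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in peptide.split("-"))


def describe(peptide: str) -> dict:
    """Summarize direction-dependent and direction-independent properties."""
    residues = peptide.split("-")
    return {
        "sequence": one_letter(peptide),   # primary structure (ordered)
        "n_terminal": residues[0],         # first residue written
        "c_terminal": residues[-1],        # last residue written
        "composition": Counter(residues),  # unordered residue counts
    }


enkephalin = describe("Tyr-Gly-Gly-Phe-Leu")
reverse = describe("Leu-Phe-Gly-Gly-Tyr")

print(enkephalin["sequence"], reverse["sequence"])
print("same composition:", enkephalin["composition"] == reverse["composition"])
print("same primary structure:", enkephalin["sequence"] == reverse["sequence"])
```

The two peptides compare equal as unordered residue counts but not as ordered sequences, which is the point made in the paragraph above.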

Figure 4.4 Cross-links. The formation of a disulfide bond from two cysteine residues is an oxidation reaction. [Reaction shown: cysteine + cysteine ⇌ cystine + 2 H+ + 2 e–; the forward direction is oxidation, the reverse is reduction.]

A polypeptide chain consists of a regularly repeating part, called the main chain or backbone, and a variable part, comprising the distinctive side chains (Figure 4.3). The polypeptide backbone is rich in hydrogen-bonding potential. Each residue contains a carbonyl group (C=O), which is a good hydrogen-bond acceptor, and, with the exception of proline, an NH group (N–H), which is a good hydrogen-bond donor. These groups interact with each other and with functional groups from side chains to stabilize particular structures.

Most natural polypeptide chains contain between 50 and 2000 amino acid residues and are commonly referred to as proteins. The largest protein known is the muscle protein titin, which serves as a scaffold for the assembly of the contractile proteins of muscle. Titin consists of almost 27,000 amino acids. Peptides made of small numbers of amino acids are called oligopeptides or simply peptides. The mean molecular weight of an amino acid residue is about 110 g mol⁻¹, and so the molecular weights of most proteins are between 5500 and 220,000 g mol⁻¹. We can also refer to the mass of a protein, which is expressed in units of daltons; a dalton is a unit of mass very nearly equal to that of a hydrogen atom. A protein with a molecular weight of 50,000 g mol⁻¹ has a mass of 50,000 daltons, or 50 kd (kilodaltons).
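The mass range quoted above follows from multiplying the residue count by the mean residue mass. The sketch below simply reproduces that arithmetic; the function names are hypothetical, and the 110 g mol⁻¹ figure is the approximate mean from the text, so the results are estimates rather than exact masses.

```python
MEAN_RESIDUE_MASS = 110  # g/mol per residue: the approximate mean quoted in the text


def estimated_mass(n_residues: int) -> int:
    """Rough molecular weight (g/mol, numerically equal to daltons)."""
    return n_residues * MEAN_RESIDUE_MASS


def estimated_residues(mass: float) -> float:
    """Rough residue count for a protein of the given mass."""
    return mass / MEAN_RESIDUE_MASS


for n in (50, 2000):
    mass = estimated_mass(n)
    print(f"{n} residues: about {mass} g/mol ({mass / 1000:g} kd)")

# A 50-kd protein corresponds to roughly this many residues.
print(f"50,000 g/mol: about {estimated_residues(50_000):.0f} residues")
```

The same two functions can be used to check answers to estimation problems of the "how heavy is a protein of N residues" kind.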

In some proteins, the linear polypeptide chain is covalently cross-linked. The most common cross-links are disulfide bonds, formed by the oxidation of a pair of cysteine residues (Figure 4.4). The resulting unit of two linked cysteines is called cystine. Disulfide bonds can form between cysteine residues in the same polypeptide chain or they can link two separate chains together. Rarely, nondisulfide cross-links derived from other side chains are present in proteins.

Figure 4.3 Components of a polypeptide chain. A polypeptide chain consists of a constant backbone (shown in black) and variable side chains (shown in green).

Proteins Have Unique Amino Acid Sequences Specified by Genes

In 1953, Frederick Sanger determined the amino acid sequence of insulin, a protein hormone (Figure 4.5). This work is a landmark in biochemistry because it showed for the first time that a protein has a precisely defined amino acid sequence consisting only of L amino acids linked by peptide bonds. Sanger’s accomplishment stimulated other scientists to carry out sequence studies of a wide variety of proteins. Currently, the complete amino acid sequences of more than 2 million proteins are known. The striking fact is that each protein has a unique, precisely defined amino acid sequence.

Figure 4.5 Amino acid sequence of bovine insulin.
A chain (residues 1-21): Gly-Ile-Val-Glu-Gln-Cys-Cys-Ala-Ser-Val-Cys-Ser-Leu-Tyr-Gln-Leu-Glu-Asn-Tyr-Cys-Asn
B chain (residues 1-30): Phe-Val-Asn-Gln-His-Leu-Cys-Gly-Ser-His-Leu-Val-Glu-Ala-Leu-Tyr-Leu-Val-Cys-Gly-Glu-Arg-Gly-Phe-Phe-Tyr-Thr-Pro-Lys-Ala
[The figure also marks the disulfide (S–S) bonds.]

Knowing amino acid sequences is important for several reasons. First, amino acid sequences determine the three-dimensional structures of proteins. Second, knowledge of the sequence of a protein is usually essential to elucidating its mechanism of action (e.g., the catalytic mechanism of an enzyme). Third, sequence determination is a component of molecular pathology, a rapidly growing area of medicine. Alterations in amino acid sequence can produce abnormal function and disease. Severe and sometimes fatal diseases, such as sickle-cell anemia (Chapter 8) and cystic fibrosis, can result from a change in a single amino acid within a protein. Fourth, the sequence of a protein reveals much about its evolutionary history. Proteins resemble one another in amino acid sequence only if they have a common ancestor. Consequently, molecular events in evolution can be traced from amino acid sequences; molecular paleontology is a flourishing area of research.

QUICK QUIZ 1 (a) What is the amino terminus of the tripeptide Gly-Ala-Asp? (b) What is the approximate molecular weight of a protein composed of 300 amino acids? (c) Approximately how many amino acids are required to form a protein with a molecular weight of 110,000?

Polypeptide Chains Are Flexible Yet Conformationally Restricted

Primary structure determines the three-dimensional structure of a protein, and the three-dimensional structure determines the protein’s function. What are the rules governing the relation between an amino acid sequence and the three-dimensional structure of a protein? This question is very difficult to answer, but we know that certain characteristics of the peptide bond itself are important. First, the peptide bond is essentially planar (Figure 4.6). Thus, for a pair of amino acids linked by a peptide bond, six atoms lie in the same plane: the α-carbon atom and CO group of the first amino acid and the NH group and α-carbon atom of the second amino acid. Second, the peptide bond has considerable double-bond character owing to resonance structures: the electrons resonate between a pure single bond and a pure double bond.

Figure 4.6 Peptide bonds are planar. In a pair of linked amino acids, six atoms (Cα, C, O, N, H, and Cα) lie in a plane. Side chains are shown as green balls.

This partial double-bond character prevents rotation about this bond and thus constrains the conformation of the peptide backbone. The double-bond character is also expressed in the length of the bond between the CO and the NH groups. The C–N distance in a peptide bond is typically 1.32 Å (Figure 4.7), which is between the values expected for a C–N single bond (1.49 Å) and a C=N double bond (1.27 Å). Finally, the peptide bond is uncharged, allowing polymers of amino acids linked by peptide bonds to form tightly packed globular structures that would otherwise be inhibited by charge repulsion.

Figure 4.7 Typical bond lengths within a peptide unit. The peptide unit is shown in the trans configuration. [Bond lengths shown: Cα–C 1.51 Å, C=O 1.24 Å, C–N 1.32 Å, N–H 1.0 Å, N–Cα 1.45 Å.]

Two configurations are possible for a planar peptide bond. In the trans configuration, the two α-carbon atoms are on opposite sides of the peptide bond. In the cis configuration, these groups are on the same side of the peptide bond. Almost all peptide bonds in proteins are trans. This preference for trans over cis can be explained by the fact that steric clashes occur between R groups in the cis configuration but not in the trans configuration (Figure 4.8).

Figure 4.10 A Ramachandran diagram showing the values of φ and ψ. Not all φ and ψ values are possible without collisions between atoms. The most-favorable regions are shown in dark green on the graph; borderline regions are shown in light green. The structure on the right (φ = 90°, ψ = −90°) is disfavored because of steric clashes. [Both axes run from −180 to +180 degrees.]

In contrast with the peptide bond, the bonds between the amino group and the α-carbon atom and between the α-carbon atom and the carbonyl group are pure single bonds. The two adjacent rigid peptide units may rotate about these bonds, taking on various orientations. This freedom of rotation about two bonds of each amino acid allows proteins to fold in many different ways. The rotations about these bonds can be specified by torsion angles (Figure 4.9). The angle of rotation about the bond between the nitrogen atom and the α-carbon atom is called phi (φ). The angle of rotation about the bond between the α-carbon atom and the carbonyl carbon atom is called psi (ψ). A clockwise rotation about either bond as viewed toward the α-carbon atom corresponds to a positive value. The φ and ψ angles determine the path of the polypeptide chain.
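A torsion angle can be computed from the coordinates of four consecutive atoms. The sketch below uses the common atan2 formulation of the dihedral angle; it is an illustration, not the text's own procedure. The atom choices for φ (C of the preceding residue, N, Cα, C) and ψ (N, Cα, C, N of the following residue) follow the usual convention and are an assumption here, as are the function names and the toy coordinates.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, between -180 and +180."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane containing p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane containing p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Rotation about the N-C(alpha) bond (assumed atom order)."""
    return torsion_angle(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Rotation about the C(alpha)-C bond (assumed atom order)."""
    return torsion_angle(n, ca, c, n_next)


# Toy coordinates (not from a real protein): the central bond lies along z.
first, second, third = (1, 0, 0), (0, 0, 0), (0, 0, 1)
print(torsion_angle(first, second, third, (1, 0, 1)))        # eclipsed (cis-like): 0
print(abs(torsion_angle(first, second, third, (-1, 0, 1))))  # anti (trans-like): 180
print(torsion_angle(first, second, third, (0, 1, 1)))        # quarter turn: 90
```

Applied to every residue of a real structure, the resulting (φ, ψ) pairs are the points plotted in a Ramachandran diagram.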

Figure 4.9 Rotation about bonds in a polypeptide. The structure of each amino acid in a polypeptide can be adjusted by rotation about two single bonds. (A) Phi (φ) is the angle of rotation about the bond between the nitrogen and the α-carbon atoms, whereas psi (ψ) is the angle of rotation about the bond between the α-carbon and the carbonyl carbon atoms. (B) A view down the bond between the nitrogen and the α-carbon atoms, showing how φ is measured. (C) A view down the bond between the α-carbon and the carbonyl carbon atoms, showing how ψ is measured.

Figure 4.8 Trans and cis peptide bonds. The trans form is strongly favored because of steric clashes that occur in the cis form.

A measure of rotation about a bond, torsion angle is usually taken to lie between −180 and +180 degrees. Torsion angles are sometimes called dihedral angles.

Are all combinations of φ and ψ possible? Gopalasamudram Ramachandran recognized that many combinations are not found in nature, because of steric clashes between atoms. The φ and ψ values of possible conformations can be visualized on a two-dimensional plot called a Ramachandran diagram (Figure 4.10).

Three-quarters of the possible (φ, ψ) combinations are excluded simply by local steric clashes. Steric exclusion, the fact that two atoms cannot be in the same place at the same time, restricts the number of possible peptide conformations and is thus a powerful organizing principle.

4.2 Secondary Structure: Polypeptide Chains Can Fold into Regular Structures

Can a polypeptide chain fold into a regularly repeating structure? In 1951, Linus Pauling and Robert Corey proposed that certain polypeptide chains have the ability to fold into two periodic structures called the α helix (alpha helix) and the β pleated sheet (beta pleated sheet). Subsequently, other structures such as turns and loops were identified. Alpha helices, β pleated sheets, and turns are formed by a regular pattern of hydrogen bonds between the peptide NH and CO groups of amino acids that are often near one another in the linear sequence, or primary structure. Such regular folded segments are called secondary structure.

The Alpha Helix Is a Coiled Structure Stabilized by Intrachain Hydrogen Bonds

The first of Pauling and Corey’s proposed secondary structures was the α helix, a rodlike structure with a tightly coiled backbone. The side chains of the amino acids composing the structure extend outward in a helical array (Figure 4.11).

Figure 4.11 The structure of the α helix. (A) A ribbon depiction shows the α-carbon atoms and side chains (green). (B) A side view of a ball-and-stick version depicts the hydrogen bonds (dashed lines) between NH and CO groups. (C) An end view shows the coiled backbone as the inside of the helix and the side chains (green) projecting outward. (D) A space-filling view of part C shows the tightly packed interior core of the helix.

Figure 4.12 The hydrogen-bonding scheme for an α helix. In the α helix, the CO group of residue i forms a hydrogen bond with the NH group of residue i + 4.

Figure 4.13 Schematic views of α helices. (A) A ribbon depiction. (B) A cylindrical depiction.

Figure 4.14 A largely α-helical protein. Ferritin, an iron-storage protein, is built from a bundle of α helices. [Drawn from 1AEW.pdb.]

Figure 4.15 The structure of a β strand. The side chains (green) are alternately above and below the plane of the strand. The bar (7 Å) shows the distance between two residues.

The α helix is stabilized by hydrogen bonds between the NH and CO groups of the main chain. The CO group of each amino acid forms a hydrogen bond with the NH group of the amino acid that is situated four residues ahead in the sequence (Figure 4.12). Thus, except for amino acids near the ends of an α helix, all the main-chain CO and NH groups are hydrogen bonded. Each residue is related to the next one by a rise, also called translation, of 1.5 Å along the helix axis and a rotation of 100 degrees, which gives 3.6 amino acid residues per turn of helix. Thus, amino acids spaced three and four apart in the sequence are spatially quite close to one another in an α helix. In contrast, amino acids spaced two apart in the sequence are situated on opposite sides of the helix and so are unlikely to make contact. The pitch of the α helix is the length of one complete turn along the helix axis and is equal to the product of the translation (1.5 Å) and the number of residues per turn (3.6), or 5.4 Å. The screw sense of a helix can be right-handed (clockwise) or left-handed (counterclockwise). Right-handed helices are energetically more favorable because there are fewer steric clashes between the side chains and the backbone. Essentially all α helices found in proteins are right-handed. In schematic representations of proteins, α helices are depicted as twisted ribbons or rods (Figure 4.13).
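The helix parameters in this paragraph are related by simple arithmetic, which the following sketch works through. The constant and function names are illustrative; the two input values (1.5 Å rise and 100 degrees of rotation per residue) are the ones given in the text.

```python
RISE_PER_RESIDUE = 1.5      # Å along the helix axis (from the text)
ROTATION_PER_RESIDUE = 100  # degrees of rotation per residue (from the text)

residues_per_turn = 360 / ROTATION_PER_RESIDUE  # 3.6 residues per turn
pitch = RISE_PER_RESIDUE * residues_per_turn    # about 5.4 Å per turn


def angular_separation(offset: int) -> int:
    """Smallest angle, in degrees, between residues i and i + offset
    when the helix is viewed down its axis."""
    angle = (offset * ROTATION_PER_RESIDUE) % 360
    return min(angle, 360 - angle)


print(f"residues per turn: {residues_per_turn:.1f}")
print(f"pitch: {pitch:.1f} Å")
for offset in (1, 2, 3, 4):
    print(f"i and i+{offset}: {angular_separation(offset)} degrees apart, "
          f"{offset * RISE_PER_RESIDUE:.1f} Å along the axis")
```

Residues three and four apart come out 60 and 40 degrees apart around the axis, whereas residues two apart are 160 degrees apart, nearly on opposite faces of the helix, consistent with the contacts described above.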

Not all amino acids can be readily accommodated in an α helix. Branching at the β-carbon atom, as in valine, threonine, and isoleucine, tends to destabilize α helices because of steric clashes. Serine, aspartate, and asparagine also tend to disrupt α helices because their side chains contain hydrogen-bond donors or acceptors in close proximity to the main chain, where they compete for main-chain NH and CO groups. Proline also is a helix breaker because it lacks an NH group and because its ring structure prevents it from assuming the φ value to fit into an α helix.

The α-helical content of proteins ranges widely, from none to almost 100%. For example, about 75% of the residues in ferritin, an iron-storage protein, are in α helices (Figure 4.14). Indeed, about 25% of all soluble proteins are composed of α helices connected by loops and turns of the polypeptide chain. Single α helices are usually less than 45 Å long. Many proteins that span biological membranes also contain α helices.

Beta Sheets Are Stabilized by Hydrogen Bonding Between Polypeptide Strands

Pauling and Corey named their other proposed periodic structural motif the β pleated sheet (β because it was the second structure that they elucidated). The β pleated sheet (more simply, the β sheet) differs markedly from the rodlike α helix in appearance and bond structure.

Instead of a single polypeptide strand, the β sheet is composed of two or more polypeptide chains called β strands. A β strand is almost fully extended rather than being tightly coiled as in the α helix. The distance between adjacent amino acids along a β strand is approximately 3.5 Å, in contrast with a distance of 1.5 Å along an α helix. The side chains of adjacent amino acids point in opposite directions (Figure 4.15).

Screw sense refers to the direction in which a helical structure rotates with respect to its axis. If, viewed down the axis of a helix, the chain turns in a clockwise direction, it has a right-handed screw sense. If turning is counterclockwise, the screw sense is left-handed.

A β sheet is formed by linking two or more β strands lying next to one another through hydrogen bonds. Adjacent chains in a β sheet can run in opposite directions (antiparallel β sheet) or in the same direction (parallel β sheet) (Figure 4.16). Many strands, typically 4 or 5 but as many as 10 or more, can come together in β sheets.

Figure 4.16 Antiparallel and parallel β sheets. (A) Adjacent β strands run in opposite directions. Hydrogen bonds between NH and CO groups connect each amino acid to a single amino acid on an adjacent strand, stabilizing the structure. (B) Adjacent β strands run in the same direction. Hydrogen bonds connect each amino acid on one strand with two different amino acids on the adjacent strand.

Figure 4.17 The structure of a mixed β sheet.

Such β sheets can be purely antiparallel, purely parallel, or mixed (Figure 4.17). Unlike α helices, β sheets can consist of sections of a polypeptide that are not near one another. That is, in two β strands that lie next to each other, the last amino acid of one strand and the first amino acid of the adjacent strand are not necessarily neighbors in the amino acid sequence.

In schematic representations, β strands are usually depicted by broad arrows pointing in the direction of the carboxyl-terminal end to indicate the type of β sheet formed (parallel or antiparallel). Beta sheets can be almost flat, but most adopt a somewhat twisted shape (Figure 4.18). The β sheet is an important structural element in many proteins. For example, fatty acid-binding proteins, which are important for lipid metabolism, are built almost entirely from β sheets (Figure 4.19).

Figure 4.19 A protein rich in β sheets. The structure of a fatty acid-binding protein. [Drawn from 1FTP.pdb.]

Polypeptide Chains Can Change Direction by Making Reverse Turns and Loops

Most proteins have compact, globular shapes, requiring reversals in the direction of their polypeptide chains. Many of these reversals are accomplished by common structural elements called reverse turns and loops (Figure 4.20). Turns and loops invariably lie on the surfaces of proteins and thus often participate in interactions with other proteins and the environment. Loops exposed to an aqueous environment are usually composed of amino acids with hydrophilic R groups.

Figure 4.18 A twisted β sheet. (A) A schematic model. (B) The schematic view rotated by 90 degrees to illustrate the twist more clearly.

Figure 4.20 The structure of a reverse turn. (A) The CO group of residue i of the polypeptide chain is hydrogen bonded to the NH group of residue i + 3 to stabilize the turn. (B) A part of an antibody molecule has surface loops (shown in red). [Drawn from 7FTP.pdb.]

Figure 4.21 An α-helical coiled coil. (A) Space-filling model. (B) Ribbon diagram. The two helices wind around each other to form a superhelix. Such structures are found in many proteins, including keratin in hair, quills, claws, and horns. [Drawn from 1CIG.pdb.]

Fibrous Proteins Provide Structural Support for Cells and Tissues

Special types of helices are present in two common proteins, α-keratin and collagen. These proteins form long fibers that serve a structural role. α-Keratin, which is the primary component of wool and hair, consists of two right-handed α helices intertwined to form a type of left-handed superhelix called an α coiled coil. α-Keratin is a member of a superfamily of proteins referred to as coiled-coil proteins (Figure 4.21). In these proteins, two or more α helices can entwine to form a very stable structure that can have a length of 1000 Å (100 nm) or more. Human beings have approximately 60 members of this family, including intermediate filaments (proteins that contribute to the cell cytoskeleton) and the muscle proteins myosin and tropomyosin. The two helices in α-keratin are cross-linked by weak interactions such as van der Waals forces and ionic interactions. In addition, the two helices may be linked by disulfide bonds formed by neighboring cysteine residues.

A different type of helix is present in collagen, the most abundant mammalian protein. Collagen is the main fibrous component of skin, bone, tendon, cartilage, and teeth. It contains three helical polypeptide chains, each nearly 1000 residues long. Glycine appears at every third residue in the amino acid sequence, and the sequence glycine-proline-proline recurs frequently (Figure 4.22).

Hydrogen bonds within each peptide chain are absent in this type of helix. Instead, the helices are stabilized by steric repulsion of the pyrrolidine rings of the proline residues (Figure 4.23). The pyrrolidine rings keep out of each other’s way when the polypeptide chain assumes its helical form, which has about three residues per turn. Three strands wind around each other to form a superhelical cable that is stabilized by hydrogen bonds between strands. The hydrogen bonds form between the peptide NH groups of glycine residues and the CO groups of residues on the other chains. The inside of the triple-stranded helical cable is very crowded and explains why glycine has to be present at every third position on each strand: the only residue that can fit in an interior position is glycine (Figure 4.24A). The amino acid residue on either side of glycine is located on the outside of the cable, where there is room for the bulky rings of proline residues (Figure 4.24B).
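The every-third-residue rule can be checked directly against the collagen fragment listed in Figure 4.22 (residues 13 through 66). The sketch below is only an illustration: the variable names are hypothetical, and the sequence is transcribed from that figure.

```python
from collections import Counter

# Residues 13-66 of a collagen chain, as listed in Figure 4.22.
FRAGMENT = (
    "Gly-Pro-Met-Gly-Pro-Ser-Gly-Pro-Arg-"
    "Gly-Leu-Hyp-Gly-Pro-Hyp-Gly-Ala-Hyp-"
    "Gly-Pro-Gln-Gly-Phe-Gln-Gly-Pro-Hyp-"
    "Gly-Glu-Hyp-Gly-Glu-Hyp-Gly-Ala-Ser-"
    "Gly-Pro-Met-Gly-Pro-Arg-Gly-Pro-Hyp-"
    "Gly-Pro-Hyp-Gly-Lys-Asn-Gly-Asp-Asp-"
)

# Split on hyphens and drop the empty string left by the trailing hyphen.
residues = [res for res in FRAGMENT.split("-") if res]
counts = Counter(residues)

# Glycine should occupy every third position and no other position.
gly_every_third = all(res == "Gly" for res in residues[::3])
gly_only_there = all(res != "Gly"
                     for i, res in enumerate(residues) if i % 3 != 0)

print("residues in fragment:", len(residues))
print("Gly at every third position:", gly_every_third)
print("Gly nowhere else:", gly_only_there)
print("Gly, Pro, Hyp counts:", counts["Gly"], counts["Pro"], counts["Hyp"])
```

For this fragment, glycine accounts for exactly one-third of the residues, and proline plus hydroxyproline together make up another third.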

Figure 4.22 The amino acid sequence of a part of a collagen chain. Every third residue is glycine. Proline and hydroxyproline also are abundant.

13 -Gly-Pro-Met-Gly-Pro-Ser-Gly-Pro-Arg-
22 -Gly-Leu-Hyp-Gly-Pro-Hyp-Gly-Ala-Hyp-
31 -Gly-Pro-Gln-Gly-Phe-Gln-Gly-Pro-Hyp-
40 -Gly-Glu-Hyp-Gly-Glu-Hyp-Gly-Ala-Ser-
49 -Gly-Pro-Met-Gly-Pro-Arg-Gly-Pro-Hyp-
58 -Gly-Pro-Hyp-Gly-Lys-Asn-Gly-Asp-Asp-

Figure 4.23 The conformation of a single strand of a collagen triple helix.

Figure 4.24 The structure of the protein collagen. (A) Space-filling model of collagen. Each strand is shown in a different color. (B) Cross section of a model of collagen. Each strand is hydrogen bonded to the other two strands. The α-carbon atom of a glycine residue is identified by the letter G. Every third residue must be glycine because there is no space in the center of the helix. Notice that the pyrrolidine rings are on the outside.

Clinical Insight

Vitamin C Deficiency Causes Scurvy

As we have seen, proline residues are important in creating the coiled-coil structure of collagen. Hydroxyproline is a modified version of proline with a hydroxyl group replacing a hydrogen in the pyrrolidine ring. It is a common element of collagen, appearing in the glycine-proline-proline sequence as the second proline. Hydroxyproline is essential for stabilizing collagen, and its formation illustrates our dependence on vitamin C.

Vitamin C. Human beings are among the few mammals unable to synthesize vitamin C. Citrus products are the most common source of this vitamin. Vitamin C functions as a general antioxidant to reduce the presence of reactive oxygen species throughout the body. In addition, it serves as a specific antioxidant by maintaining metals, required by certain enzymes such as the enzyme that synthesizes hydroxyproline, in the reduced state.

[Don Farrell/Digital Vision/Getty Images.]

Figure 4.25 The three-dimensional structure of myoglobin. (A) A ribbon diagram shows that the protein consists largely of α helices. (B) A space-filling model in the same orientation shows how tightly packed the folded protein is. Notice that the heme group is nestled into a crevice in the compact protein with only an edge exposed. One helix is blue to allow comparison of the two structural depictions. [Drawn from 1A6N.pdb.]

Vitamin C is required for the formation of stable collagen fibers because it assists in the formation of hydroxyproline from proline. Less-stable collagen results in scurvy. The symptoms of scurvy include skin lesions and blood-vessel fragility. Most notable are bleeding gums, the loss of teeth, and periodontal infections. Gums are especially sensitive to a lack of vitamin C because the collagen in gums turns over rapidly. Vitamin C is required for the continued activity of prolyl hydroxylase, which synthesizes hydroxyproline. This reaction requires an Fe²⁺ ion to activate O₂. This iron ion, embedded in prolyl hydroxylase, is susceptible to oxidation, which inactivates the enzyme. How is the enzyme made active again? Ascorbate (vitamin C) comes to the rescue by reducing the Fe³⁺ of the inactivated enzyme. Thus, ascorbate serves here as a specific antioxidant. ■

4.3 Tertiary Structure: Water-Soluble Proteins Fold into Compact Structures

As already discussed, primary structure is the sequence of amino acids, and secondary structure is the simple repeating structures formed by hydrogen bonds between hydrogen and oxygen atoms of the peptide backbone. Another level of structure, tertiary structure, refers to the spatial arrangement of amino acid residues that are far apart in the sequence and to the pattern of disulfide bonds. This level of structure is the result of interactions between the R groups of the peptide chain. To explore the principles of tertiary structure, we will examine myoglobin, the first protein to be seen in atomic detail.

Myoglobin Illustrates the Principles of Tertiary Structure

Myoglobin is an example of a globular protein (Figure 4.25). In contrast with fibrous proteins such as keratin, globular proteins have a compact three-dimensional structure and are water soluble. Globular proteins, with their more intricate three-dimensional structure, perform most of the chemical transactions in the cell.

Myoglobin, a single polypeptide chain of 153 amino acids, is an oxygen-binding protein found predominantly in heart and skeletal muscle; it appears to serve as an “oxygen buffer” to maintain constant intracellular oxygen concentration under varying degrees of aerobic metabolism. The capacity of myoglobin to bind oxygen depends on the presence of heme, a prosthetic (helper) group containing an iron atom. Myoglobin is an extremely compact molecule. Its overall dimensions are 45 × 35 × 25 Å, an order of magnitude less than if it were fully stretched out. About 70% of the main chain is folded into eight α helices, and much of the rest of the chain forms turns and loops between helices.
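The "order of magnitude" comparison can be checked with a short calculation. The sketch below assumes a rise of about 3.5 Å per residue for a fully extended chain, a value that is not stated in this passage and is used here only for illustration; the residue count and the folded dimension are taken from the text.

```python
# Rough check of the "order of magnitude" claim for myoglobin.
RESIDUES = 153                # chain length given in the text
FOLDED_LONGEST_AXIS = 45.0    # Å, longest folded dimension given in the text
RISE_EXTENDED = 3.5           # Å per residue; ASSUMED value for a fully extended chain


def extended_length(n_residues: int, rise: float = RISE_EXTENDED) -> float:
    """Approximate end-to-end length (Å) of a fully extended chain."""
    return n_residues * rise


stretched = extended_length(RESIDUES)
ratio = stretched / FOLDED_LONGEST_AXIS
print(f"extended: about {stretched:.0f} Å, folded: {FOLDED_LONGEST_AXIS:.0f} Å")
print(f"ratio: about {ratio:.0f}-fold")
```

Under that assumption the stretched chain is somewhat more than ten times as long as the folded molecule's longest axis, which is consistent with the statement above.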

Myoglobin, like most other proteins, is asymmetric because of the complex folding of its main chain. A unifying principle emerges from the distribution of side chains. The striking fact is that the interior consists almost entirely of nonpolar residues (Figure 4.26). The only polar residues on the interior are two histidine residues, which play critical roles in binding the heme iron and oxygen. The outside of myoglobin, on the other hand, consists of both polar and nonpolar residues, which can interact with water and thus render the molecule water soluble. The space-filling model shows that there is very little empty space inside.

Figure 4.26 The distribution of amino acids in myoglobin. (A) A space-filling model of myoglobin, with hydrophobic amino acids shown in yellow, charged amino acids shown in blue, and others shown in white. Notice that the surface of the molecule has many charged amino acids, as well as some hydrophobic amino acids. (B) In this cross-sectional view, notice that mostly hydrophobic amino acids are found on the inside of the structure, whereas the charged amino acids are found on the protein surface. [Drawn from 1MBD.pdb.]

This contrasting distribution of polar and nonpolar residues reveals a key facet of protein architecture. In an aqueous environment such as the interior of a cell, protein folding is driven by the hydrophobic effect: the strong tendency of hydrophobic residues to avoid contact with water. The polypeptide chain therefore folds so that its hydrophobic side chains are buried and its polar, charged chains are on the surface. Similarly, an unpaired peptide NH or CO group of the main chain markedly prefers water to a nonpolar milieu. The only way to bury a segment of main chain in a hydrophobic environment is to pair all the NH and CO groups by hydrogen bonding. This pairing is neatly accomplished in an α helix or β sheet. Van der Waals interactions between tightly packed hydrocarbon side chains also contribute to the stability of proteins. We can now understand why the set of 20 amino acids contains several that differ subtly in size and shape. They provide a palette of shapes that can fit together tightly to fill the interior of a protein neatly and thereby maximize van der Waals interactions, which require intimate contact.

Some proteins that span biological membranes are “the exceptions that prove the rule” because they have the reverse distribution of hydrophobic and hydrophilic amino acids. For example, consider porins, proteins found in the outer membranes of many bacteria. Membranes are built largely of the hydrophobic hydrocarbon chains of lipids (p. 157). Thus, porins are covered on the outside largely with hydrophobic residues that interact with the hydrophobic environment. In contrast, the center of the protein contains many charged and polar amino acids that surround a water-filled channel running through the middle of the protein. Thus, because porins function in hydrophobic environments, they are “inside out” relative to proteins that function in aqueous solution.

Figure 4.27 The helix-turn-helix motif, a supersecondary structural element. Helix-turn-helix motifs are found in many DNA-binding proteins. [Drawn from 1LMB.pdb.]

The Tertiary Structure of Many Proteins Can Be Divided into Structural and Functional Units

Certain combinations of secondary structure are present in many proteins and frequently exhibit similar functions. These combinations are called motifs or supersecondary structures. For example, an α helix separated from another α helix by a turn, called a helix-turn-helix unit, is found in many proteins that bind DNA (Figure 4.27).

Some polypeptide chains fold into two or more compact regions that may be connected by a flexible segment of polypeptide chain, rather like pearls on a string. These compact globular units, called domains, range in size from about 30 to 400 amino acid residues. For example, the extracellular part of CD4, a cell-surface protein on certain cells of the immune system, comprises four similar domains of approximately 100 amino acids each (Figure 4.28). Different proteins may have domains in common even if their overall tertiary structures are different.

Figure 4.28 Protein domains. The cell-surface protein CD4 consists of four similar domains. [Drawn from 1WIO.pdb.]

Figure 4.29 Quaternary structure. The Cro protein of bacteriophage λ is a dimer of identical subunits. [Drawn from 5CRO.pdb.]

4.4 Quaternary Structure: Multiple Polypeptide Chains Can Assemble into a Single Protein

Many proteins consist of more than one polypeptide chain in their functional states. Each polypeptide chain in such a protein is called a subunit. Quaternary structure refers to the arrangement of subunits and the nature of their interactions. The simplest sort of quaternary structure is a dimer consisting of two identical subunits. This organization is present in Cro, a DNA-binding protein found in a bacterial virus called λ (Figure 4.29). Quaternary structure can be as simple as two identical subunits or as complex as dozens of different polypeptide chains. More than one type of subunit can be present, often in variable numbers. For example, human hemoglobin, the oxygen-carrying protein in blood, consists of two subunits of one type (designated α) and two subunits of another type (designated β), as illustrated in Figure 4.30. Thus, the hemoglobin molecule exists as an α₂β₂ tetramer.

4.5 The Amino Acid Sequence of a Protein Determines Its Three-Dimensional Structure

How is the elaborate three-dimensional structure of proteins attained? The classic work of Christian Anfinsen in the 1950s on the enzyme ribonuclease revealed the relation between the amino acid sequence of a protein and its conformation. Ribonuclease is a single polypeptide chain consisting of 124 amino acid residues cross-linked by four disulfide bonds (Figure 4.31). Anfinsen’s plan was to destroy the three-dimensional structure of the enzyme and then to determine the conditions required to restore the tertiary structure. Chaotropic agents such as urea effectively disrupt a protein’s noncovalent bonds, such as hydrogen bonds and van der Waals interactions. The disulfide bonds can be cleaved reversibly with a sulfhydryl reagent such as β-mercaptoethanol (Figure 4.32). In the presence of a large excess of β-mercaptoethanol, the disulfides (cystines) are fully converted into sulfhydryls (cysteines).

When ribonuclease was treated with β-mercaptoethanol in 8 M urea, the product was a randomly coiled polypeptide chain devoid of enzymatic activity. When a protein is converted into a randomly coiled peptide without its normal activity, it is said to be denatured (Figure 4.33).

Figure 4.30 The α₂β₂ tetramer of human hemoglobin. The structure of the two identical α subunits (red) and the two identical β subunits (yellow). (A) The ribbon diagram shows that they are composed mainly of α helices. (B) The space-filling model illustrates the close packing of the atoms and shows that the heme groups (gray) occupy crevices in the protein. [Drawn from 1A3N.pdb.]

[Margin structure: urea, H₂N–CO–NH₂.]

Figure 4.31 Amino acid sequence of bovine ribonuclease. The four disulfide bonds are shown in color. [After C. H. W. Hirs, S. Moore, and W. H. Stein, J. Biol. Chem. 235(1960):633–647.]

Figure 4.32 The role of β-mercaptoethanol in reducing disulfide bonds. Notice that, as the disulfides are reduced, the β-mercaptoethanol is oxidized and forms dimers.

Figure 4.34 Typing-monkey analogy. A monkey randomly poking a typewriter could write a line from Shakespeare’s Hamlet, provided that correct keystrokes were retained. In the two computer simulations shown, the cumulative number of keystrokes is given at the left of each line.

Anfinsen then made the critical observation that the denatured ribonuclease, freed of urea and β-mercaptoethanol by dialysis, slowly regained enzymatic activity. He immediately perceived the significance of this chance finding: the enzyme spontaneously refolded into a catalytically active form with all of the correct disulfide bonds re-forming. All the measured physical and chemical properties of the refolded enzyme were virtually identical with those of the native enzyme. These experiments showed that the information needed to specify the catalytically active three-dimensional structure of ribonuclease is contained in its amino acid sequence. Subsequent studies have established the generality of this central principle of biochemistry: sequence specifies conformation. The dependence of conformation on sequence is especially significant because conformation determines function.
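To see why the re-formation of all the correct disulfide bonds is striking, it helps to count how many ways the cysteines could pair up. The sketch below is an illustration that is not part of the original experiment description: it counts the complete pairings of the eight cysteines implied by four disulfide bonds. It also assumes, purely for illustration, that every pairing is equally likely when no folding information guides the process.

```python
from math import prod


def count_pairings(n_cysteines: int) -> int:
    """Number of ways to pair all cysteines into disulfide bonds.

    For an even number n this is (n - 1) * (n - 3) * ... * 3 * 1.
    """
    if n_cysteines % 2:
        raise ValueError("an odd number of cysteines cannot be fully paired")
    return prod(range(n_cysteines - 1, 0, -2))


n_cys = 2 * 4                     # four disulfide bonds, as stated in the text
total = count_pairings(n_cys)
print("possible complete pairings:", total)
print("chance of the native pairing if all were equally likely:", 1 / total)
```

Only one of these pairings is the native one, so recovering it reproducibly suggests that the pairing is directed by the folded structure rather than found by chance.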

Similar refolding experiments have been performed on many other proteins. In many cases, the native structure can be generated under suitable conditions. For other proteins, however, refolding does not proceed efficiently. In these cases, the unfolded protein molecules usually become tangled up with one another to form aggregates. Inside cells, proteins called chaperones block such illicit interactions.

Proteins Fold by the Progressive Stabilization of Intermediates Rather Than by Random Search

How does a protein make the transition from an unfolded structure to a unique conformation in the native form? One possibility is that all possible conformations are tried out to find the energetically most favorable one. How long would such a random search take? Cyrus Levinthal calculated that, if each residue of a 100-residue protein can assume three different conformations, the total number of structures would be 3¹⁰⁰, which is equal to 5 × 10⁴⁷. If the conversion of one structure into another were to take 10⁻¹³ seconds (s), the total search time would be 5 × 10⁴⁷ × 10⁻¹³ s, which is equal to 5 × 10³⁴ s, or 1.6 × 10²⁷ years. Clearly, it would take much too long for even a small protein to fold properly by randomly trying out all possible conformations. Moreover, Anfinsen’s experiments showed that proteins do fold on a much more limited time scale. The enormous difference between calculated and actual folding times is called Levinthal’s paradox. Levinthal’s paradox and Anfinsen’s results suggest that proteins do not fold by trying every possible conformation; rather, they must follow at least a partly defined folding pathway consisting of intermediates between the fully denatured protein and its native structure.
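Levinthal's estimate can be reproduced in a few lines. The sketch below uses only the numbers given in the text (100 residues, three conformations per residue, 10⁻¹³ s per conversion); the function name and the year length of 365.25 days are assumptions made for this illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # ASSUMED year length for the conversion


def levinthal_search_time(n_residues=100, states_per_residue=3, step_seconds=1e-13):
    """Return (conformations, seconds, years) for an exhaustive random search."""
    conformations = states_per_residue ** n_residues   # exact integer
    seconds = conformations * step_seconds
    years = seconds / SECONDS_PER_YEAR
    return conformations, seconds, years


conformations, seconds, years = levinthal_search_time()
print(f"{conformations:.1e} conformations")
print(f"{seconds:.1e} s, or about {years:.1e} years")
```

Changing `n_residues` shows how quickly the search time grows: each additional residue multiplies the number of conformations by three.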

The way out of this paradox is to recognize the power of cumulative selection. Richard Dawkins, in The Blind Watchmaker, asked how long it would take a monkey poking randomly at a typewriter to reproduce Hamlet’s remark to Polonius, “Methinks it is like a weasel” (Figure 4.34). An astronomically large number of keystrokes, of the order of 10⁴⁰, would be required. However, suppose that we preserved each correct character and allowed the monkey to retype only the wrong ones. In this case, only a few thousand keystrokes, on average, would be needed. The crucial difference between these cases is that the first employs a completely random search, whereas, in the second, partly correct intermediates are retained.
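The effect of retaining correct characters can be explored with a small simulation. The sketch below is not the simulation shown in Figure 4.34. It assumes a 27-key typewriter (26 capital letters plus a space) and counts only the characters that are actually typed or retyped; the function name is made up for this example.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "    # ASSUMED 27-key typewriter


def type_with_retention(target: str, rng: random.Random) -> int:
    """Retype only the wrong characters each round; return total keystrokes."""
    line = [rng.choice(ALPHABET) for _ in target]
    keystrokes = len(target)               # the first full attempt
    while "".join(line) != target:
        for i, wanted in enumerate(target):
            if line[i] != wanted:          # correct characters are preserved
                line[i] = rng.choice(ALPHABET)
                keystrokes += 1
    return keystrokes


rng = random.Random(0)                     # fixed seed so the run is repeatable
runs = [type_with_retention(TARGET, rng) for _ in range(200)]
search_space = len(ALPHABET) ** len(TARGET)
print("mean keystrokes with retention:", sum(runs) / len(runs))
print(f"distinct lines a purely random search draws from: {search_space:.1e}")
```

The exact keystroke count depends on how keystrokes are tallied, but under any such accounting it is vanishingly small next to the roughly 10⁴⁰ possible lines that a search without retention must sample from.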

Figure 4.33 The reduction and denaturation of ribonuclease. [Panel labels: native ribonuclease; 8 M urea and β-mercaptoethanol; denatured reduced ribonuclease.]

The essence of protein folding is the tendency to retain partly correct intermediates because they are slightly more stable than unfolded regions. However, the protein-folding problem is much more difficult than the one presented to our simian Shakespeare. First, the criterion of correctness is not a residue-by-residue scrutiny of conformation by an omniscient observer but rather the total free energy of the folding intermediate. Second, even correctly folded proteins are only marginally stable. The free-energy difference between the folded and the unfolded states of a typical 100-residue protein is 42 kJ mol⁻¹ (10 kcal mol⁻¹); thus, each residue contributes on average only 0.42 kJ mol⁻¹ (0.1 kcal mol⁻¹) of energy to maintain the folded state. This amount is less than the amount of thermal energy, which is 2.5 kJ mol⁻¹ (0.6 kcal mol⁻¹) at room temperature. This meager stabilization energy means that correct intermediates, especially those formed early in folding, can be lost. Nonetheless, the interactions that lead to folding can stabilize intermediates as structure builds up. The analogy is that the monkey would be somewhat free to undo its correct keystrokes.
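The comparison between per-residue stabilization and thermal energy is simple arithmetic. The sketch below takes the thermal energy to be RT, which reproduces the value quoted in the text, and assumes 298 K for room temperature; both choices are assumptions made for this illustration.

```python
R = 8.314             # gas constant, J mol^-1 K^-1
T_ROOM = 298.0        # K; ASSUMED value for "room temperature"
KJ_PER_KCAL = 4.184   # conversion factor

delta_g_total = 42.0  # kJ/mol, folded versus unfolded, from the text
n_residues = 100      # typical protein size used in the text

per_residue = delta_g_total / n_residues   # kJ/mol contributed per residue
thermal = R * T_ROOM / 1000.0              # RT in kJ/mol

print(f"per residue: {per_residue:.2f} kJ/mol ({per_residue / KJ_PER_KCAL:.2f} kcal/mol)")
print(f"RT at {T_ROOM:.0f} K: {thermal:.2f} kJ/mol ({thermal / KJ_PER_KCAL:.2f} kcal/mol)")
print(f"RT is roughly {thermal / per_residue:.0f} times the per-residue stabilization")
```

On these numbers the average contribution of a single residue is several times smaller than RT, which is why early, small intermediates are easily lost to thermal motion.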

The folding of proteins is sometimes visualized as a folding funnel, or energy landscape (Figure 4.35). The breadth of the funnel represents all possible conformations of the unfolded protein. The depth of the funnel represents the energy difference between the unfolded and the native protein. Each point on the surface represents a possible three-dimensional structure and its energy value. The funnel suggests that there are alternative pathways to the native structure.

One model pathway postulates that local interactions take place first (in other words, secondary structure forms) and that these secondary structures facilitate the long-range interactions leading to tertiary-structure formation. Another model pathway proposes that the hydrophobic effect brings together hydrophobic amino acids that are far apart in the amino acid sequence. The drawing together of hydrophobic amino acids in the interior leads to the formation of a globular structure. Because the hydrophobic interactions are presumed to be dynamic, allowing the protein to form progressively more stable interactions, the structure is called a molten globule. Another, more general model, called the nucleation–condensation model, is essentially a combination of the two preceding models. In the nucleation–condensation model, both local and long-range interactions take place to lead to the formation of the native state.

Figure 4.35 Folding funnel. The folding funnel depicts the thermodynamics of protein folding. The top of the funnel represents all possible denatured conformations (that is, maximal conformational entropy). Depressions on the sides of the funnel represent semistable intermediates that may facilitate or hinder the formation of the native structure, depending on their depth. Secondary structures, such as helices, form and collapse onto one another to initiate folding. [Funnel labels: entropy; energy; percentage of residues of protein in native conformation, 0 to 100; beginning of helix formation and collapse; molten globule states; discrete folding intermediates; native structure.] [After D. L. Nelson and M. M. Cox, Lehninger Principles of Biochemistry, 5th ed. (W. H. Freeman and Company, 2008), p. 143.]

Clinical Insight

Protein Misfolding and Aggregation Are Associated with Some Neurological Diseases

Understanding protein folding and misfolding is of more than academic interest. A host of diseases, including Alzheimer disease, Parkinson disease, Huntington disease, and transmissible spongiform encephalopathies (prion disease), are associated with improperly folded proteins. All of these diseases result in the deposition of protein aggregates, called amyloid fibrils or plaques (Figure 4.36). These diseases are consequently referred to as amyloidoses. A common feature of amyloidoses is that normally soluble proteins are converted into insoluble fibrils rich in β sheets. The correctly folded protein is only marginally more stable than the incorrect form. But the incorrect form aggregates, pulling more correct forms into the incorrect form. We will focus on the transmissible spongiform encephalopathies.

One of the great surprises in modern medicine was that certain infectious neurological diseases were found to be transmitted by agents that were similar in size to viruses but consisted only of protein. These diseases include bovine spongiform encephalopathy (commonly referred to as mad cow disease) and the analogous diseases in other organisms, including Creutzfeldt–Jakob disease (CJD) in human beings and scrapie in sheep. The agents causing these diseases are termed prions. Prions are composed largely or completely of a cellular protein called PrP, which is normally present in the brain. The prions are aggregated forms of the PrP protein termed PrPSC.

The structure of the normal protein PrP contains extensive regions of α helix and relatively little β-strand structure. The structure of a mammalian PrPSC has not yet been determined, because of challenges posed by its insoluble and heterogeneous nature. However, a variety of evidence indicates that some parts of the protein that had been in α-helical or turn conformations have been converted into β-strand conformations. This conversion suggests that the PrP is only slightly more stable than the β-strand-rich PrPSC; however, after the PrPSC has formed, the β strands of one protein link with those of another to form β sheets, joining the two proteins and leading to the formation of aggregates, or amyloid fibrils.

Figure 4.36 Alzheimer disease. Colored positron emission tomography (PET) scans of the brain of a normal person (left) and that of a patient who has Alzheimer disease (right). Color coding: high brain activity (red and yellow); low activity (blue and black). The Alzheimer patient’s scan shows severe deterioration of brain activity. [Dr. Robert Friedland/Photo Researchers.]

Figure 4.37 The protein-only model for prion-disease transmission. A nucleus consisting of proteins in an abnormal conformation grows by the addition of proteins from the normal pool. [Labels: normal PrP pool; PrPSC nucleus.]

With the realization that the infectious agent in prion diseases is an aggregated form of a protein that is already present in the brain, a model for disease transmission emerges (Figure 4.37). Protein aggregates built of abnormal forms of PrP act as nuclei to which other PrP molecules attach. Prion diseases can thus be transferred from one individual organism to another through the transfer of an aggregated nucleus, as likely happened in the mad cow disease outbreak in the United Kingdom in the 1990s. Cattle given animal feed containing material from diseased cows developed the disease in turn. Amyloid fibers are also seen in the brains of patients with certain noninfectious neurodegenerative diseases such as Alzheimer and Parkinson diseases. How such aggregates lead to the death of the cells that harbor them is an active area of research. ■

SUMMARY

4.1 Primary Structure: Amino Acids Are Linked by Peptide Bonds to Form Polypeptide Chains

The amino acids in a polypeptide are linked by amide bonds formed between the carboxyl group of one amino acid and the amino group of the next. This linkage, called a peptide bond, has several important properties. First, it is resistant to hydrolysis, and so proteins are remarkably stable kinetically. Second, each peptide bond has both a hydrogen-bond donor (the NH group) and a hydrogen-bond acceptor (the CO group). Because they are linear polymers, proteins can be described as sequences of amino acids. Such sequences are written from the amino to the carboxyl terminus.

4.2 Secondary Structure: Polypeptide Chains Can Fold into Regular Structures

Two major elements of secondary structure are the α helix and the β strand. In the α helix, the polypeptide chain twists into a tightly packed rod. Within the helix, the CO group of each amino acid is hydrogen bonded to the NH group of the amino acid four residues farther along the polypeptide chain. In the β strand, the polypeptide chain is nearly fully extended. Two or more β strands connected by NH-to-CO hydrogen bonds come together to form β sheets. The strands in β sheets can be antiparallel, parallel, or mixed.

4.3 Tertiary Structure: Water-Soluble Proteins Fold into Compact Structures

The compact, asymmetric structure that individual polypeptides attain is called tertiary structure. The tertiary structures of water-soluble proteins have features in common: (1) an interior formed of amino acids with hydrophobic side chains and (2) a surface formed largely of hydrophilic amino acids that interact with the aqueous environment. The driving force for the formation of the tertiary structure of water-soluble proteins is the hydrophobic interactions between the interior residues. Some proteins that exist in a hydrophobic environment, in membranes, display the inverse distribution of hydrophobic and hydrophilic amino acids. In these proteins, the hydrophobic amino acids are on the surface to interact with the environment, whereas the hydrophilic groups are shielded from the environment in the interior of the protein.

4.4 Quaternary Structure: Multiple Polypeptide Chains Can Assemble into a Single Protein

Proteins consisting of more than one polypeptide chain display quaternary structure; each individual polypeptide chain is called a subunit. Quaternary structure can be as simple as two identical subunits or as complex as dozens of different subunits. In most cases, the subunits are held together by noncovalent bonds.


4.5 The Amino Acid Sequence of a Protein Determines Its Three-Dimensional Structure

The amino acid sequence completely determines the three-dimensional structure and, hence, all other properties of a protein. Some proteins can be unfolded completely yet refold efficiently when placed under conditions in which the folded form is stable. The amino acid sequence of a protein is determined by the sequences of bases in a DNA molecule. This one-dimensional sequence information is extended into the three-dimensional world by the ability of proteins to fold spontaneously.

Key Terms

primary structure (p. 43)
peptide (amide) bond (p. 43)
disulfide bond (p. 44)
phi (φ) angle (p. 46)
psi (ψ) angle (p. 46)
Ramachandran diagram (p. 46)
α helix (p. 47)
β pleated sheet (p. 47)
secondary structure (p. 47)
rise (translation) (p. 48)
β strand (p. 48)
coiled coil (p. 50)
tertiary structure (p. 52)
motif (supersecondary structure) (p. 54)
domain (p. 54)
subunit (p. 54)
quaternary structure (p. 54)
folding funnel (p. 57)
molten globule (p. 57)
prion (p. 58)

Answer to QUICK QUIZ

(a) Glycine is the amino terminus. (b) The average molecular weight of amino acids is 110. Therefore, a protein consisting of 300 amino acids has a molecular weight of approximately 33,000. (c) A protein with a molecular weight of 110,000 consists of approximately 1000 amino acids.

Problems

1. Matters of stability. Proteins are quite stable. The lifetime of a peptide bond in aqueous solution is nearly 1000 years. However, the free energy of hydrolysis of proteins is negative and quite large. How can you account for the stability of the peptide bond in light of the fact that hydrolysis releases much energy?

2. Name those components. Examine the segment of a protein shown here.

[Figure: structural formula of a peptide segment.]

(a) What three amino acids are present?
(b) Of the three, which is the N-terminal amino acid?
(c) Identify the peptide bonds.
(d) Identify the α-carbon atoms.

3. Who’s charged? Draw the structure of the dipeptide Gly-His. What is the charge on the peptide at pH 5.5? pH 7.5?

4. Alphabet soup. How many different polypeptides of 50 amino acids in length can be made from the 20 common amino acids?

5. Sweet tooth, but calorie conscious. Aspartame (NutraSweet), an artificial sweetener, is a dipeptide composed of Asp-Phe in which the carboxyl terminus is modified by the attachment of a methyl group. Draw the structure of Aspartame at pH 7.

6. Vertebrate proteins? What is meant by the term polypeptide backbone?

7. Not a sidecar. Define the term side chain in the context of amino acid or protein structure.

8. One from many. Differentiate between amino acid composition and amino acid sequence.

9. Shape and dimension. Tropomyosin, a 70-kd muscle protein, is a two-stranded α-helical coiled coil. Estimate the length of the molecule.

10. Contrasting isomers. Poly-L-leucine in an organic solvent such as dioxane is α helical, whereas poly-L-isoleucine is not. Why do these amino acids with the same number and kinds of atoms have different helix-forming tendencies?

11. Active again. A mutation that changes an alanine residue in the interior of a protein into valine is found to lead to a loss of activity. However, activity is regained when a second mutation at a different position changes an isoleucine residue into glycine. How might this second mutation lead to a restoration of activity?

12. Scrambled ribonuclease. When performing his experiments on protein refolding, Christian Anfinsen obtained a quite different result when reduced ribonuclease was reoxidized while it was still in 8 M urea and the preparation was then dialyzed to remove the urea. Ribonuclease reoxidized in this way had only 1% of the enzymatic activity of the native protein. Why were the outcomes so different when reduced ribonuclease was reoxidized in the presence and absence of urea?

13. A little help. Anfinsen found that scrambled ribonuclease spontaneously converted into fully active, native ribonuclease when trace amounts of β-mercaptoethanol were added to an aqueous solution of the protein. Explain these results.

[Figure: scrambled ribonuclease is converted into native ribonuclease by a trace of β-mercaptoethanol.]

14. Shuffle test. An enzyme called protein disulfide isomerase (PDI) catalyzes disulfide–sulfhydryl exchange reactions. PDI rapidly converts inactive scrambled ribonuclease into enzymatically active ribonuclease. In contrast, insulin is rapidly inactivated by PDI. What does this important observation imply about the relation between the amino acid sequence of insulin and its three-dimensional structure?

15. Stretching a target. A protease is an enzyme that catalyzes the hydrolysis of the peptide bonds of target proteins. How might a protease bind a target protein so that its main chain becomes fully extended in the vicinity of the vulnerable peptide bond?

16. Often irreplaceable. Glycine is a highly conserved amino acid residue in the evolution of proteins. Why?

17. Potential partners. Identify the groups in a protein that can form hydrogen bonds or electrostatic bonds with an arginine side chain at pH 7.

18. Permanent waves. The shape of hair is determined in part by the pattern of disulfide bonds in keratin, its major protein. How can curls be induced?

19. Location is everything 1. Most proteins have hydrophilic exteriors and hydrophobic interiors. Would you expect this structure to apply to proteins embedded in the hydrophobic interior of a membrane? Explain.

20. Location is everything 2. Proteins that span biological membranes often contain α helices. Given that the insides of membranes are highly hydrophobic, predict what type of amino acids would be in such a helix. Why is an α helix particularly suitable for existence in the hydrophobic environment of the interior of a membrane?

21. Who goes first? Would you expect Pro–X peptide bonds to tend to have cis conformations like those of X–Pro bonds? Why or why not?

Selected readings for this chapter can be found online at www.whfreeman.com/Tymoczko