2. Molecular Genetics: Proteins

2.1. Amino Acids. Proteins are long molecules composed of a string of amino acids. There are 20 commonly seen amino acids. These are given in table 1 with their full names and with one- and three-letter abbreviations. The capital letters in each name give a hint on how to remember the one-letter code.

To a molecular biologist, each of these amino acids has its own personality in terms of shape and chemical properties. Often a property can be given a numerical value based on experimental measurements. We give just one in figure 1, the hydropathy index, which measures how much the amino acid dislikes dissolving in water. An amino acid with a high hydropathy index, isoleucine, for example, can be thought of as not mixing with water, or being oily. Some classifications simply divide the amino acids into two categories, hydrophilic (with low hydropathy index) and hydrophobic (with high hydropathy index).
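The two-category classification can be sketched in a few lines of Python. The values below are the widely used Kyte-Doolittle hydropathy scale, which is an assumption here: the scale plotted in figure 1 may differ. The threshold of 0 and the function names are also illustrative choices, not part of the original notes.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter code.
# Assumption: this is one common scale; figure 1 may use another.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def classify(code, threshold=0.0):
    """Label an amino acid hydrophobic if its index exceeds the threshold."""
    return "hydrophobic" if KYTE_DOOLITTLE[code] > threshold else "hydrophilic"


def mean_hydropathy(sequence):
    """Average hydropathy index over a one-letter amino acid sequence."""
    return sum(KYTE_DOOLITTLE[code] for code in sequence) / len(sequence)


if __name__ == "__main__":
    for code in "IKG":
        print(code, KYTE_DOOLITTLE[code], classify(code))
```

Where the cut between the two categories falls is a modeling choice; amino acids with an index near zero, such as glycine, are classified differently by different authors.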

Name             Three-letter   One-letter
Alanine          Ala            A
Cysteine         Cys            C
Aspartic AciD    Asp            D
Glutamic Acid    Glu            E
Phenylalanine    Phe            F
Glycine          Gly            G
Histidine        His            H
Isoleucine       Ile            I
Lysine           Lys            K
Leucine          Leu            L
Methionine       Met            M
AsparagiNe       Asn            N
Proline          Pro            P
Glutamine        Gln            Q
ARginine         Arg            R
Serine           Ser            S
Threonine        Thr            T
Valine           Val            V
Tryptophan       Trp            W
TYrosine         Tyr            Y

Table 1. Amino acids and their abbreviations.
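As a small illustration, table 1 can be stored as a lookup table and used to convert between the two abbreviation styles. This is a minimal sketch; the function names and the hyphen-separated input format are assumptions made for the example.

```python
# Three-letter to one-letter abbreviations, copied from table 1.
THREE_TO_ONE = {
    "Ala": "A", "Cys": "C", "Asp": "D", "Glu": "E", "Phe": "F",
    "Gly": "G", "His": "H", "Ile": "I", "Lys": "K", "Leu": "L",
    "Met": "M", "Asn": "N", "Pro": "P", "Gln": "Q", "Arg": "R",
    "Ser": "S", "Thr": "T", "Val": "V", "Trp": "W", "Tyr": "Y",
}
# Reverse lookup: one-letter to three-letter.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(three_letter_sequence):
    """Convert a hyphen-separated sequence such as 'Met-Cys-Lys' to 'MCK'."""
    return "".join(THREE_TO_ONE[r] for r in three_letter_sequence.split("-"))


def to_three_letter(one_letter_sequence):
    """Convert a one-letter sequence such as 'MCK' to 'Met-Cys-Lys'."""
    return "-".join(ONE_TO_THREE[code] for code in one_letter_sequence)


if __name__ == "__main__":
    print(to_one_letter("Met-Cys-Lys"))
    print(to_three_letter("MCK"))
```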

Figure 1. Figure from Introduction to Protein Architecture, by Arthur Lesk, showing a hydrophobicity scale for amino acids.

2.2. The genetic code. Based on the discovery of the structure of DNA as a long word in a four-letter alphabet, the key to genetics was found to be a code. A sequence of three letters is a code for one of the 20 amino acids. A string of 3n letters codes for a protein with n amino acids and gives the sequence in which the amino acids are strung together.

Attempts were made to discover the code by logical reasoning, but the code was found by experiments expressing proteins from manufactured sequences of DNA. The genetic code is given by the following table. Looking at the table we see, for example, that TGC codes for the amino acid cysteine. Note also that TAA codes for STOP, which means that the string of amino acids stops and the protein is complete.

First                 Second Position                 Third
Position     U(T)     C        A        G             Position
--------------------------------------------------------------
U(T)         Phe      Ser      Tyr      Cys           U(T)
             Phe      Ser      Tyr      Cys           C
             Leu      Ser      STOP     STOP          A
             Leu      Ser      STOP     Trp           G

C            Leu      Pro      His      Arg           U(T)
             Leu      Pro      His      Arg           C
             Leu      Pro      Gln      Arg           A
             Leu      Pro      Gln      Arg           G

A            Ile      Thr      Asn      Ser           U(T)
             Ile      Thr      Asn      Ser           C
             Ile      Thr      Lys      Arg           A
             Met      Thr      Lys      Arg           G

G            Val      Ala      Asp      Gly           U(T)
             Val      Ala      Asp      Gly           C
             Val      Ala      Glu      Gly           A
             Val      Ala      Glu      Gly           G
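Reading a protein off a DNA string with this table can be sketched as follows. The 64-character string lists the one-letter amino acid codes in the same order as the table (first, second, then third position, each running through T, C, A, G), with `*` standing for STOP. The names `CODON_TABLE` and `translate` are illustrative, and the sketch assumes translation starts at the first letter and ignores any incomplete trailing codon.

```python
from itertools import product

BASES = "TCAG"  # row and column order of the table, writing T for U(T)
# One letter per codon, '*' for STOP, ordered by first, second, third position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first position T
    "LLLLPPPPHHQQRRRR"  # first position C
    "IIIMTTTTNNKKSSRR"  # first position A
    "VVVVAAAADDEEGGGG"  # first position G
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA (or RNA) string into one-letter amino acid codes.

    Reads three letters at a time from the start and stops at the first
    STOP codon or at the end of the string.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGTGCTAA"))  # ATG = Met, TGC = Cys, TAA = STOP
```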

2.3. Amino acid template. We are interested in the 3D structure of proteins. Proteins are composed of amino acids bound together, so first we look at amino acid structures. All amino acids have a COOH carboxyl part and an NHH amino part. The part which distinguishes different amino acids is called the side chain or residue. See figure 2.

Figure 2. Template for amino acid. R denotes the side chain, or residue. The NHH is called an amino group and the COOH a carboxyl group.

The structure of the amino acids can be learned by first learning their side chain topology. The topology tells only how the atoms are connected; more information is needed before you know the 3D structure. The additional information consists of other parameters called torsion angles. Figure 3 (from a paper of Ponder and Richards) gives the topology of the amino acids along with information on how the atoms and torsion angles are labeled. The figure also indicates how many torsion angles are needed to determine the structure. We will discuss this in more detail in a later chapter and refer back to the figure often.
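A torsion angle is defined by four consecutively bonded atoms: it is the angle between the plane through the first three and the plane through the last three, measured about the middle bond. The following is a minimal sketch of that computation from coordinates; the function names and the test points are made up for illustration and do not come from figure 3.

```python
import math


def sub(a, b):
    """Vector difference a - b."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the bond from p1 to p2."""
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b0, b1)  # normal to the plane through p0, p1, p2
    n2 = cross(b1, b2)  # normal to the plane through p1, p2, p3
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates: four atoms in one plane, arranged cis and trans.
    print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0)))
    print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0)))
```

Using `atan2` rather than `acos` keeps the sign of the angle, which matters because a rotation of +60 degrees and one of -60 degrees give different 3D structures.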

Figure 3. Side chain topology.

2.4. Tetrahedral geometry. The geometry of the amino acids is partially determined by the tetrahedral geometry of the carbon bond. The bond directions for carbon are approximately the same as from the centroid of a regular tetrahedron to the vertices.

To get an idea of the geometry of a tetrahedron, a regular four-sided solid, you can construct one in Maple. Here is a Maple file to construct a tetrahedron. The tetrahedron can be rotated with the mouse. The bond angles at a carbon bonded to four atoms are all approximately 109.5 degrees, as if the carbon is the center of a tetrahedron and the bonded atoms are at the vertices.
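The tetrahedral angle can be checked numerically. The sketch below is in Python rather than Maple and is not the Maple file mentioned above. It places the vertices of a regular tetrahedron at alternate corners of a cube centered at the origin, so the centroid is the origin, and measures the angle between each pair of centroid-to-vertex directions.

```python
import math

# Alternate corners of a cube centered at the origin form a regular
# tetrahedron whose centroid is the origin.
VERTICES = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]


def angle_between(u, v):
    """Angle in degrees between two vectors from the origin."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))


# One angle for each of the six pairs of vertices.
angles = [
    angle_between(VERTICES[i], VERTICES[j])
    for i in range(4)
    for j in range(i + 1, 4)
]

if __name__ == "__main__":
    for angle in angles:
        print(round(angle, 2))
```

Each pair of directions has cosine -1/3, so all six angles are equal to arccos(-1/3), about 109.47 degrees.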

2.5. Amino acid structure. Figure 4 shows a typical structure of the amino acid leucine. Configurations of side chains are sometimes called rotamers because the tetrahedral geometry at the carbon bonds stays the same and the only degree of freedom is rotation about the carbon bonds.

Figure 4. Structure of leucine shown in a computer graphics stick model. Hydrogens are white, carbons black, oxygens red, and nitrogens blue.

2.6. The peptide bond. To form a protein, amino acids are bonded together in sequence, making a long chain. The bond between adjacent amino acids is called the peptide bond. The carboxyl group of one amino acid and the amino group of the subsequent amino acid lose an oxygen and two hydrogens, i.e., water (figure 5).

Figure 5. Peptide bond. When two amino acids bond together in forming a protein, they give off one molecule of water.
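The bookkeeping of the reaction can be illustrated by counting atoms: the formula of a chain is the sum of the formulas of its amino acids minus one water per peptide bond. The sketch below uses the molecular formulas of glycine and alanine; the function name `peptide_formula` is an illustrative choice.

```python
from collections import Counter

# Molecular formulas as atom counts.
GLYCINE = Counter({"C": 2, "H": 5, "N": 1, "O": 2})
ALANINE = Counter({"C": 3, "H": 7, "N": 1, "O": 2})
WATER = Counter({"H": 2, "O": 1})


def peptide_formula(amino_acids):
    """Return (atom counts of the chain, number of peptide bonds formed)."""
    total = Counter()
    for amino_acid in amino_acids:
        total += amino_acid
    bonds = max(len(amino_acids) - 1, 0)
    for _ in range(bonds):
        total -= WATER  # each peptide bond releases one H2O
    return total, bonds


if __name__ == "__main__":
    formula, bonds = peptide_formula([GLYCINE, ALANINE])
    print(dict(formula), bonds)
```

A chain of n amino acids has n - 1 peptide bonds, so it gives off n - 1 molecules of water in forming.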

The bond is approximately planar; the six atoms involved in the bond lie in a plane, called the peptide plane. The electrons associated with these atoms form a cloud called the π orbital. There is a special geometry associated with the peptide plane, shown in figure 6.

Figure 6. Peptide plane geometry. a) shows the distribution of electrons in the bond. b) shows the bond angles and distances as determined from crystallographic studies. This information is often used when modeling proteins.

2.7. Protein structure. As amino acids are bonded together they form into a specific shape called the fold. The structure of a protein is hard to see because of the number of atoms involved.

Before the era of computer graphics, only an artist could render an understandable picture of a protein. One such artist was Irving Geis. Here (figure 7) is his painting of sperm whale myoglobin. There is a website devoted to artistic renditions of molecules by Irving Geis and others. These renditions have been replaced by computer graphics.

Figure 7. Painting by Irving Geis of a stick model of sperm whale myoglobin.

2.8. Secondary structure. The organization of the atoms of a protein is complex, but certain regular features appear. The most common are the alpha helix and the beta sheet (figures 8 and 9). These are referred to as secondary structures and can be visualized using ribbon diagrams created by computer programs called protein viewers. These can be found online at the Protein Data Bank website. See figures 10 and 11. The spiraling ribbons are alpha helices and the straight ribbons are beta sheets.

Figures 8 and 9 give a detailed view of the alpha helix and the beta sheet. The structures are distinguished by their hydrogen bonding patterns. In an alpha helix the hydrogen bonds join atoms nearby in the chain; in a beta sheet the hydrogen bonds join atoms between two different parts of the chain.

Figure 8. Stick diagram of the backbone of an alpha helix, showing the hydrogen bonds in pink. Carbons are black, nitrogens blue, oxygens red, and hydrogens white.

Figure 9. Diagram showing, as dotted lines, hydrogen bonds between protein strands in a beta sheet.

Figure 10. Ribbon diagram for the protein myoglobin. The curled ribbons indicate alpha helices. There are no beta sheets in this structure. The strings are called loops and have no particular structure.

Figure 11. Part of the protein carboxypeptidase A containing both an alpha helix and a beta sheet. The uncurled ribbons indicate strands of beta sheets.