bmmb597e protein evolution protein classification 1
![Page 1: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/1.jpg)
BMMB597E Protein Evolution
Protein classification
1
![Page 2: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/2.jpg)
2
Protein families
• The first protein structures determined by X-ray crystallography, myoglobin and haemoglobin, were solved (in 1959-60) before the amino acid sequences were determined
• It came as a surprise that the structures were quite similar
• Soon it became clear, on the basis of both sequences and structures, that there were families of proteins
![Page 3: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/3.jpg)
myoglobin haemoglobin
3
![Page 4: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/4.jpg)
4
50 years earlier, there were some hints …
• E.T. Reichert & A.P. Brown. The differentiation and specificity of corresponding proteins and other vital substances in relation to biological classification and organic evolution: the crystallography of hemoglobins. (Carnegie Institution of Washington, 1909)
• Crystallography 3 years before discovery of X-ray diffraction?
![Page 5: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/5.jpg)
5
Reichert and Brown studied interfacial angles in haemoglobin crystals
• Steno’s law (1669): different crystals of the same substance may have different sizes and shapes, but the angles between faces are constant for each substance
• They found that the angles differed from species to species
• Similarities in values of interfacial angles were consistent with classical taxonomic tree
• They even found differences between oxy- and deoxyhaemoglobin
![Page 6: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/6.jpg)
6
Most premature scientific result ever?
• These results implied:
  – That proteins adopted (or at least could adopt) unique structures, to form a crystal
  – That protein structures varied between species
  – That this variation was parallel with the evolution of the species
  – That proteins could change structure as a result of changes in state of ligation
• In 1909!
![Page 7: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/7.jpg)
7
M.O. Dayhoff
• Pioneer of bioinformatics
• Collected protein sequences
• First curated ‘database’
• Recognized that proteins form families, on the basis of amino acid sequences
• Computational sequence alignments
• First evolutionary tree
• First amino-acid substitution matrix (later replaced by BLOSUM)
![Page 8: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/8.jpg)
8
Can relationships among proteins be extended beyond families?
• Families = sets of proteins with such obvious similarities that we assume that they are related
• One question: how much similarity do we need to believe in a relationship?
• How far can evolution go?
• Convergent evolution?
• Cautionary tale: chymotrypsin / subtilisin
![Page 9: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/9.jpg)
9
Chymotrypsin-subtilisin
• Both proteolytic enzymes
  – Chymotrypsin: mammalian
  – Subtilisin: from B. subtilis
• Both have catalytic triads
• Same function – same mechanism
• Sequences 12% similar (near noise level)
• However, structures show them to be unrelated
![Page 10: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/10.jpg)
10
Chymotrypsin / Subtilisin
![Page 11: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/11.jpg)
Catalytic triad in serine proteinases
11
![Page 12: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/12.jpg)
12
Chymotrypsin and subtilisin have similar catalytic triads
![Page 13: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/13.jpg)
13
How can we classify proteins that belong to families?
• Align sequences
• Calculate phylogenetic tree (there are various ways to do this; all depend on the sequence alignment)
• Usually, the phylogenetic tree of homologous proteins from different species follows the phylogenetic tree based on classical taxonomy
• That is reassuring
• But what happens as divergence proceeds?
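As a rough illustration of the first two steps, the sketch below computes percent identity between aligned sequences and turns it into a pairwise distance table of the kind a tree-building method would take as input. The function names and the toy sequences are invented for this example; real analyses use substitution-model distances rather than raw identity.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def distance_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise distances (1 - fractional identity), keyed by sorted name pair."""
    return {
        (p, q): 1.0 - percent_identity(aligned[p], aligned[q]) / 100.0
        for p, q in combinations(sorted(aligned), 2)
    }


# Toy alignment: invented sequences, for illustration only.
toy = {
    "seqA": "MKV-LSAG",
    "seqB": "MKVALSAG",
    "seqC": "MRV-LTAG",
}
dist = distance_matrix(toy)
for pair, d in dist.items():
    print(pair, round(d, 3))
```

A distance-based method such as UPGMA or neighbour joining would then build the tree from `dist`. As divergence proceeds, identity falls toward the noise level and these distances stop being informative.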
![Page 14: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/14.jpg)
14
How can we classify proteins that do not obviously belong to families?
• Base this on structure rather than sequence
• Structural similarities are maintained as divergence proceeds, better than sequence similarities
• For closely related proteins, expect no difference between sequence-based and structure-based classification
• How far can classification be extended?
![Page 15: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/15.jpg)
15
SCOP Structural Classification of Proteins
• Idea of A.G. Murzin, based on old work by C. Chothia and M. Levitt
• Even if two proteins are not obviously homologous, they may share structural features, to a greater or lesser degree.
• For instance, the secondary structures of some proteins are only α-helices
• Others have β-sheets but no α-helices
![Page 16: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/16.jpg)
16
SCOP
• SCOP is a database that gives a hierarchical classification of all protein domains
• Recall that a domain is a compact subunit of a protein structure that ‘looks as if’ it would have independent stability
Fragment of fibronectin
![Page 17: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/17.jpg)
17
Dissection of structure into domains
• It is not always quite so obvious how to divide a protein into domains
• There is some (not a lot of) room for argument
• Note that sometimes the chain passes back and forth between domains
• In these cases one or both domains do not consist entirely of a consecutive set of residues
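One way to represent a domain whose chain passes back and forth is as a list of residue ranges rather than a single range. The sketch below is a minimal data structure under that assumption; the class name, field names and residue numbers are hypothetical and not taken from any database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    # Inclusive (start, end) residue ranges; assumed non-overlapping and
    # already merged, so adjacent ranges are never listed separately.
    segments: tuple[tuple[int, int], ...]

    def contains(self, residue: int) -> bool:
        """True if the residue number falls in any segment of this domain."""
        return any(start <= residue <= end for start, end in self.segments)

    def is_contiguous(self) -> bool:
        """True if the domain is a single consecutive run of residues."""
        return len(self.segments) == 1


# Hypothetical two-domain chain; residue numbers are invented for illustration.
d1 = Domain("domain 1", ((1, 90), (250, 330)))  # chain leaves and returns
d2 = Domain("domain 2", ((91, 249),))           # one consecutive stretch

print(d1.is_contiguous(), d2.is_contiguous())
print(d1.contains(300), d1.contains(100))
```

Here `d1` is the kind of domain described above: it is made of two separate stretches of the chain, with `d2` inserted between them.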
![Page 18: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/18.jpg)
18
lactoferrin
![Page 19: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/19.jpg)
19
SCOP, CATH and the DALI Database classify protein structures
• SCOP (Structural Classification of Proteins)
• CATH (Class, Architecture, Topology, Homologous superfamily)
• DALI Database
• These web sites have many useful features:
  – information-retrieval engines, including search by keyword or sequence
  – presentation of structure pictures
  – links to other related sites including bibliographical databases.
![Page 20: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/20.jpg)
20
SCOP http://www.scop.mrc-lmb.cam.ac.uk
• SCOP organizes protein structures in a hierarchy according to evolutionary origin and structural similarity.
• Domains are extracted from the Protein Data Bank entries.
• Sets of domains are grouped into families: sets of domains for which similarities in structure, function and sequence imply a common evolutionary origin.
![Page 21: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/21.jpg)
21
The SCOP hierarchy
• Families that share a common structure, or even a common structure and a common function, but lack adequate sequence similarity – so that the evidence for evolutionary relationship is suggestive but not compelling – are grouped into superfamilies
• Superfamilies that share a common folding topology, for at least a large central portion of the structure, are grouped as folds.
• Finally, each fold group falls into one of the general classes.
![Page 22: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/22.jpg)
22
Major classes in SCOP
• α – secondary structure all helical
• β – secondary structure all sheet
• α/β – contain β-α-β supersecondary structure
• α+β – helices and sheets, but in different parts of structure
• ‘small proteins’ – which often have little secondary structure and are held together by disulphide bridges or ligands (for instance, wheat-germ agglutinin)
![Page 23: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/23.jpg)
23
Summary of SCOP hierarchy
• Class
• Fold
• Superfamily
• Family
• Domain
![Page 24: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/24.jpg)
24
SCOP classification of flavodoxin
Protein: Flavodoxin from Clostridium beijerinckii [TaxId: 1520]
Lineage:
Root: scop
Class: Alpha and beta proteins (a/b) [51349]
  Mainly parallel beta sheets (beta-alpha-beta units)
Fold: Flavodoxin-like [52171]
  3 layers, a/b/a; parallel beta-sheet of 5 strand, order 21345
Superfamily: Flavoproteins [52218]
Family: Flavodoxin-related [52219]
  binds FMN
Protein: Flavodoxin [52220]
Species: Clostridium beijerinckii [TaxId: 1520] [52226]
PDB Entry Domains:
5nul complexed with fmn; mutant chain a [31191]
2fax complexed with fmn; mutant chain a [31194]
… many others
![Page 25: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/25.jpg)
25
Clostridium beijerinckii Flavodoxin (stereo pair)
![Page 26: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/26.jpg)
26
Flavodoxin NADPH-cytochrome P450 reductase
same superfamily, different family
![Page 27: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/27.jpg)
27
Flavodoxin CHEY
same fold, different superfamily
![Page 28: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/28.jpg)
28
Flavodoxin Spinach ferredoxin reductase
same class, different folds
![Page 29: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/29.jpg)
29
Flavodoxin in the SCOP hierarchy
• To give some idea of the nature of the similarities expressed by the different levels of the hierarchy:
• Flavodoxin from Clostridium beijerinckii and NADPH-cytochrome P450 reductase are in the same superfamily, but different families.
• Flavodoxin and the signal transduction protein CHEY are in the same fold category, but different superfamilies.
• Flavodoxin and Spinach ferredoxin reductase are in the same class – α/β – but have different folds.
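The three comparisons above amount to asking for the deepest level at which two lineages still agree. A minimal sketch of that comparison follows. The flavodoxin labels come from the lineage listing on the earlier slide; every label marked "placeholder" is invented here only to encode "different from flavodoxin at this level" and is not a real SCOP name.

```python
from typing import Optional

LEVELS = ("class", "fold", "superfamily", "family")


def deepest_shared_level(a: dict, b: dict) -> Optional[str]:
    """Return the lowest level of the hierarchy at which two lineages agree."""
    shared = None
    for level in LEVELS:
        if a[level] != b[level]:
            break
        shared = level
    return shared


flavodoxin = {
    "class": "a/b",
    "fold": "Flavodoxin-like",
    "superfamily": "Flavoproteins",
    "family": "Flavodoxin-related",
}
# Placeholder labels below are hypothetical, not SCOP names.
p450_reductase = dict(flavodoxin, family="placeholder family X")
chey = dict(flavodoxin, superfamily="placeholder superfamily Y",
            family="placeholder family Y1")
ferredoxin_reductase = dict(flavodoxin, fold="placeholder fold Z",
                            superfamily="placeholder superfamily Z",
                            family="placeholder family Z1")

for name, other in [("P450 reductase", p450_reductase),
                    ("CHEY", chey),
                    ("ferredoxin reductase", ferredoxin_reductase)]:
    print(name, "->", deepest_shared_level(flavodoxin, other))
```

The comparison stops at the first level that differs, because agreement at a lower level is meaningless once a higher level disagrees.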
![Page 30: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/30.jpg)
30
CATH presents a classification scheme similar to that of SCOP
• CATH = Class, Architecture, Topology, Homologous superfamily, the levels of its hierarchy.
• In CATH, proteins with very similar structures, sequences and functions are grouped into sequence families.
• A homologous superfamily contains proteins for which similarity of sequence and structure gives evidence of common ancestry
• A topology or fold family comprises sets of homologous superfamilies that share the spatial arrangement and connectivity of helices and strands
• Architectures are groups of proteins with similar arrangements of helices and sheets, but with different connectivity. For instance, different four-α-helix bundles with different connectivities would share the same architecture but not the same topology in CATH
• General classes of architectures in CATH are: α, β, α-β (subsuming the α/β and α+β classes of SCOP), and domains of low secondary structure content.
![Page 31: BMMB597E Protein Evolution Protein classification 1](https://reader034.vdocuments.net/reader034/viewer/2022050909/56649ec45503460f94bcdcef/html5/thumbnails/31.jpg)
31
Do different classification schemes agree?
• To classify protein structures (or any other set of objects) you need to be able to measure the similarities among them.
• The measure of similarity induces a tree-like representation of the relationships.
• CATH, SCOP, DALI and the others agree, for the most part, on what is similar, and the tree structures of their classifications are therefore also similar.
• However, even an objective measure of similarity does not specify how to define the different levels of the hierarchy.
• These are interpretative decisions, and any apparent differences in the names and distinctions between the levels disguise the underlying general agreement about what is similar and what is different.
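The point that one similarity measure can support differently drawn levels can be seen with a toy clustering. The sketch below groups items by single linkage at a chosen distance cutoff. The item names and distances are invented, and single linkage is only one possible choice; it is not the method used by SCOP, CATH or DALI.

```python
def cluster(names, dist, cutoff):
    """Single-linkage groups: items connected by a chain of pairs with distance <= cutoff."""
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in dist.items():
        if d <= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Invented distances between four hypothetical structures.
names = ["P", "Q", "R", "S"]
dist = {
    ("P", "Q"): 0.10, ("P", "R"): 0.40, ("Q", "R"): 0.45,
    ("P", "S"): 0.80, ("Q", "S"): 0.85, ("R", "S"): 0.90,
}

strict = cluster(names, dist, 0.2)  # a 'family'-like cutoff
loose = cluster(names, dist, 0.5)   # a 'superfamily'-like cutoff
print(strict)
print(loose)
```

Both groupings come from the same distances. Only the cutoff, which is an interpretative choice, decides whether R is grouped with P and Q.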