![Page 1: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/1.jpg)
From Structure to Function
Janet Thornton
European Bioinformatics Institute
![Page 2: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/2.jpg)
From Structure to Functional Annotation
![Page 3: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/3.jpg)
Mid-West Center for Structural Genomics (MCSG)
University of Toronto: Aled Edwards
Argonne National Laboratory: Andrzej Joachimiak
Northwestern University: Wayne Anderson
Washington University in St. Louis: Daved Fremont
UT Southwestern Medical Center: Zbyszek Otwinowski
University of Virginia: Wladek Minor
EBI / University College London: Janet Thornton, Christine Orengo
![Page 4: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/4.jpg)
![Page 5: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/5.jpg)
60 structures solved to date
ylxR hypothetical cytosolic protein
Hypothetical protein (EC4030_F)
Hypothetical protein (MTH1)
ygbM hypothetical protein (EC1530)
Conserved hypothetical protein (MT777)
cutA protein implicated in Cu homeostasis (TM1056)
Some examples …
~30% are ‘hypothetical proteins’
![Page 6: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/6.jpg)
TIM barrel enzymes – 18 different homologous families
>60 different E.C. numbers
EC Wheel of TIM barrels
Structure of TIM barrel: Triose phosphate isomerase
![Page 7: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/7.jpg)
Pairwise sequence identity and conservation of enzyme function (Todd et al 2001)
• Single-domain proteins: >81,000 homologous enzyme / enzyme and enzyme / non-enzyme pairs
[Figure: chart of conserved vs. unconserved function; x-axis: sequence identity (%) in bins 0-10, 11-20, ..., 91-100; y-axis: fractional percentage, 0% to 100%; legend: Unconserved, Conserved]
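The binning behind a plot of this kind can be sketched as follows. This is a minimal illustration, not the procedure of Todd et al.: the helper names, the toy aligned sequences, and the rule that "conserved" means identical E.C. numbers are all assumptions made for the example.

```python
import math
from collections import defaultdict

def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def identity_bin(pid):
    """Map a percent identity onto the bins 0-10, 11-20, ..., 91-100."""
    if pid <= 10:
        return "0-10"
    k = math.ceil(pid / 10)
    return f"{(k - 1) * 10 + 1}-{k * 10}"

def conservation_by_bin(pairs):
    """pairs: iterable of (aligned_a, aligned_b, ec_a, ec_b).

    Returns, per identity bin, the fraction of pairs whose E.C. numbers
    are identical (an assumed, simplified definition of 'conserved').
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [conserved, total]
    for a, b, ec_a, ec_b in pairs:
        key = identity_bin(percent_identity(a, b))
        counts[key][0] += int(ec_a == ec_b)
        counts[key][1] += 1
    return {k: conserved / total for k, (conserved, total) in counts.items()}

# Hypothetical toy alignments, invented for illustration only.
toy_pairs = [
    ("ACDEFGHIKL", "ACDEFGHIKL", "2.6.1.1", "2.6.1.1"),
    ("ACDEFGHIKL", "ACDQFGHIKV", "2.6.1.1", "2.6.1.1"),
    ("ACDEFGHIKL", "MCDWYGPQRS", "2.6.1.1", "4.1.1.17"),
]
fractions = conservation_by_bin(toy_pairs)
```

With real data, each bin would hold many pairs and the conserved fraction would give the height of the "Conserved" portion of the bar.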
![Page 8: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/8.jpg)
From Structure To Biochemical Function
Gene Protein 3D Structure Function
Given a protein structure:
• Where is the functional site?
• What is the multimeric state of the protein?
– PQS – Hannes Ponstingl (this morning)
• Which ligands bind to the protein?
• What is biochemical function?
![Page 9: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/9.jpg)
Automated Structure Comparison
• The most powerful method for assigning function from structure is global or partial 3D structure comparison (e.g. Dali, SSAP, SSM)
• Hidden Markov Models derived from structural domains can often recognise distant relatives from sequence
– Christine Orengo (tomorrow)
![Page 10: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/10.jpg)
Aspartate Amino Transferase Superfamily
Aspartate Aminotransferase
2,2-Dialkylglycine Decarboxylase
Tyrosine Phenol-lyase
Ornithine Decarboxylase
![Page 11: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/11.jpg)
Aspartate Amino Transferase Superfamily
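Aspartate Aminotransferase (E.C. 2.6.1.1)
2,2-Dialkylglycine Decarboxylase (E.C. 4.1.1.64)
Tyrosine Phenol-lyase (E.C. 4.1.99.2)
Ornithine Decarboxylase (E.C. 4.1.1.17)
[Figure: the four structures annotated with pairwise numeric labels: 77, 76, 77, 76, 73, 79 and 11, 106, 9, 7, 7]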
![Page 12: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/12.jpg)
Aspartate Amino Transferase Family
Aspartate Aminotransferase (E.C. 2.6.1.1)
2,2-Dialkylglycine Decarboxylase (E.C. 4.1.1.64)
Tyrosine Phenol-lyase (E.C. 4.1.99.2)
Ornithine Decarboxylase (E.C. 4.1.1.17)
all bind Pyridoxal 5’ Phosphate (PLP) co-factor
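How far these four functions have diverged can be read from the E.C. hierarchy itself. The sketch below counts how many leading levels two E.C. numbers share; the function name `shared_ec_levels` is hypothetical, and the E.C. assignments are the standard ones for these four enzymes.

```python
def shared_ec_levels(ec_a, ec_b):
    """Count how many leading E.C. levels (0 to 4) two E.C. numbers share."""
    levels = 0
    for x, y in zip(ec_a.split("."), ec_b.split(".")):
        if x != y:
            break
        levels += 1
    return levels

family = {
    "Aspartate aminotransferase": "2.6.1.1",
    "2,2-Dialkylglycine decarboxylase": "4.1.1.64",
    "Ornithine decarboxylase": "4.1.1.17",
    "Tyrosine phenol-lyase": "4.1.99.2",
}

# Compare every member against aspartate aminotransferase.
reference = family["Aspartate aminotransferase"]
against_reference = {
    name: shared_ec_levels(reference, ec) for name, ec in family.items()
}
```

The two decarboxylases share three levels (4.1.1), whereas the aminotransferase differs from the other three at the first level (transferase versus lyase), even though all four bind the same co-factor.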
![Page 13: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/13.jpg)
Number of enzyme functions
[Figure: chart; x-axis: superfamilies; y-axis: number of enzyme functions, 0 to 60; series: structural data, structural and sequence data; labelled superfamilies: α/β hydrolases, type I PLP-dependent enzymes, TIM barrel glycosyl hydrolases]
![Page 14: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/14.jpg)
Convergent and Divergent Evolution
• Unrelated proteins can perform the same function (convergent evolution), sometimes using the same mechanism – sometimes using different mechanisms
• Related proteins can perform different functions – divergent evolution
![Page 15: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/15.jpg)
Active site convergence
Trypsin Subtilisin
![Page 16: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/16.jpg)
Alpha/beta hydrolase
Trypsin
Subtilisin
Brain platelet activating factor acetylhydrolase
CheB methylesterase
Clp protease
![Page 17: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/17.jpg)
Predicting Binding Site
Binding-site analysis: cutA
Most likely binding site
Surface clefts
Residue conservation
Conserved surface patches
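One generic way to score residue conservation is a per-column Shannon entropy over a multiple sequence alignment. The sketch below is a stand-in for illustration only; the slide does not say which conservation measure was used for cutA, and the toy alignment is invented.

```python
import math
from collections import Counter

def column_conservation(column):
    """1 minus the normalised Shannon entropy of one alignment column.

    Gaps ('-') are ignored. 1.0 means fully conserved; lower values mean
    more variable. Normalisation by log2(20) assumes a 20-letter alphabet.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    n = len(residues)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 1.0 - entropy / math.log2(20)

def conservation_profile(alignment):
    """Score every column of an alignment given as equal-length strings."""
    return [column_conservation("".join(col)) for col in zip(*alignment)]

# Hypothetical toy alignment: column 1 is invariant, column 3 is variable.
toy_alignment = ["GSA", "GTC", "GSD", "GSE"]
profile = conservation_profile(toy_alignment)
```

Mapping high-scoring columns onto surface residues, and intersecting them with the largest clefts, is the kind of combination the slide's three panels suggest.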
![Page 18: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/18.jpg)
Identifying Binding Site Function Using Motifs
- 3D enzyme active site structural motifs (Craig Porter)
- Catalytic Site Atlas - Identification of catalytic residues (Gail Bartlett, Alex Gutteridge)
- Metal binding sites (Malcolm MacArthur)
- Binding site features (Gareth Stockwell)
- Automatically generated templates of ligand-binding and
- DNA binding motifs (Sue Jones, Hugh Shanahan)
- “Reverse” templates (Roman Laskowski)
JESS – fast template search algorithm (Jonathan Barker)
PINTS - Searches for similar clusters (Aloy, Russell … – EMBL Heidelberg)
![Page 19: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/19.jpg)
Catalytic Site Atlas
Enzyme reports from primary literature information
β-lactamase Class A
– EC: 3.5.2.6
– PDB: 1btl
– Reaction: β-lactam + H2O → β-amino acid
– Active site residues: S70, K73, S130, E166
– Plausible mechanism:
[Figure: reaction mechanism scheme showing the substrate with the active-site Ser, Lys, Ser and Glu residues]
![Page 20: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/20.jpg)
3-D templates
• Use 3D templates to describe the active site of the enzyme
– analogous to 1-D sequence motifs such as PROSITE, but in 3-D
• Sequence position independent
• Captures essence of functional site in protein
![Page 21: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/21.jpg)
TEmplate Search and Superposition (TESS)
• defines a functional site as a sequence-independent set of atoms in 3-D space
• search a new structure for a functional site
• search a database of structures for similar clusters
Wallace et al., 1997
e.g. serine proteinase, catalytic triad
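The idea of a sequence-independent set of atoms can be illustrated with a brute-force search. This is not the TESS algorithm: it simply tries every labelled atom combination and compares inter-atomic distances (a distance RMSD, which needs no superposition but cannot tell a site from its mirror image). Atom labels, coordinates and function names are hypothetical.

```python
import math
from itertools import combinations, permutations

def pairwise_distances(coords):
    """All inter-atomic distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]

def distance_rmsd(template_xyz, candidate_xyz):
    """RMS difference between corresponding inter-atomic distances."""
    dt = pairwise_distances(template_xyz)
    dc = pairwise_distances(candidate_xyz)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(dt, dc)) / len(dt))

def find_sites(template, atoms, cutoff):
    """Return (score, atoms) hits whose labels match the template in order.

    template, atoms: lists of (label, (x, y, z)). Brute force, so only
    suitable for small examples.
    """
    labels = [label for label, _ in template]
    template_xyz = [xyz for _, xyz in template]
    hits = []
    for combo in permutations(atoms, len(template)):
        if [label for label, _ in combo] != labels:
            continue
        score = distance_rmsd(template_xyz, [xyz for _, xyz in combo])
        if score <= cutoff:
            hits.append((score, combo))
    return sorted(hits, key=lambda hit: hit[0])

# Hypothetical three-atom template and a toy "structure" containing a
# translated copy of it plus one distant decoy atom.
template = [
    ("SER OG", (0.0, 0.0, 0.0)),
    ("HIS NE2", (3.0, 0.0, 0.0)),
    ("ASP OD1", (3.0, 4.0, 0.0)),
]
atoms = [
    ("SER OG", (10.0, 10.0, 10.0)),
    ("HIS NE2", (13.0, 10.0, 10.0)),
    ("ASP OD1", (13.0, 14.0, 10.0)),
    ("ASP OD1", (30.0, 30.0, 30.0)),
]
hits = find_sites(template, atoms, cutoff=1.2)
```

Because only atom labels and geometry are compared, the residues may occur anywhere in the sequence, and in any order, which is the sense in which such a template is position independent.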
![Page 22: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/22.jpg)
Pepsin
![Page 23: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/23.jpg)
Eukaryotic & Fungal Aspartic Proteinases: all-atom DTG-DTG Template
Aspartic Proteinase - Active Site residues - [DTG]x2
![Page 24: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/24.jpg)
A template of 8 atoms is sufficient to identify all Aspartic Proteinases
[Figure: template atoms labelled Asp CO2, Gly C, Gly C, Asp O, Thr/Ser O, Thr O]
Aspartic Proteases: Active Site Template
![Page 25: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/25.jpg)
Aspartic Protease Template Search against all PDB
green = true, red = false
![Page 26: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/26.jpg)
3D Templates to Characterise Functional Sites
Template searches
(189 enzyme active site templates)
(~600 Metal binding site templates)
![Page 27: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/27.jpg)
Database of enzyme active site templates: 189 templates
GARTfase
Cholesterol oxidase
IIAglc histidine kinase
Carbamoylsarcosine amidohydrolase
Dihydrofolate reductase
Ser-His-Asp catalytic triad
…
![Page 28: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/28.jpg)
MCSG structure
BioH – protein of unknown function involved in biotin synthesis in E. coli
An example
Structure: Rossmann fold, hence many structural homologues
Expected to be an enzyme
Sequence contains two Gly-X-Ser-X-Gly motifs typical of acyltransferases and thioesterases
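A Gly-X-Ser-X-Gly motif is easy to locate with a regular expression. The sketch below uses a lookahead so that overlapping occurrences are reported; the sequence is a made-up string, not the BioH sequence, and the function name is hypothetical.

```python
import re

# G, any residue, S, any residue, G. The lookahead lets matches overlap.
GXSXG = re.compile(r"(?=(G.S.G))")

def find_gxsxg(sequence):
    """Return (1-based start position, matched pentapeptide) for each motif."""
    return [(m.start() + 1, m.group(1)) for m in GXSXG.finditer(sequence)]

# Hypothetical toy sequence containing two motifs.
toy_sequence = "MKGHSAGLLVGWSMGG"
motifs = find_gxsxg(toy_sequence)
```

A sequence motif like this only suggests a functional class; on the slides it is the 3D template match that pins down the catalytic residues.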
![Page 29: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/29.jpg)
CSA template search: one very strong hit
Ser-His-Asp catalytic triad of the lipases with rmsd = 0.28 Å
(template cut-off is 1.2 Å)
Experimentally confirmed by hydrolase assays
Novel carboxylesterase acting on short acyl chain substrates
![Page 30: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/30.jpg)
Templates of Active Sites
• Catalytic cluster conserved – simple template
– e.g. Aspartic Proteinase (DTG)x2
• Order and geometry of catalytic residues varies – multiple templates
– e.g. Polymerases
• Same catalytic cluster used in many different enzyme functions – one template identifies multiple active sites in unrelated structures
– e.g. Asp/His/Ser catalytic triad is well conserved in structure
![Page 31: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/31.jpg)
Instances of convergence
• Ser-His-Asp triads
• Cys-His-Asp triads
• Ribonuclease T1s
• Malic enzyme and isocitrate dehydrogenase
• Haloperoxidases
• Creatinase and carboxypeptidase G2
• Glycosidases
• Class II extradiol-type dioxygenase and class III extradiol-type dioxygenase
• Receptor tyrosine phosphatase and low-molecular weight tyrosine phosphatase
• Pyridoxal 5' phosphate enzymes
James Torrance
![Page 32: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/32.jpg)
Template databases
• HAND CURATED
– Enzyme active sites (PROCAT) – 189 templates
• Currently being extended
– Metal-binding sites – 600 templates
• AUTOMATED
– Ligand-binding sites – 10,000 templates
– DNA-binding sites – 800 templates
![Page 33: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/33.jpg)
Another example of convergent evolution: The DNA HTH Binding Motif
[Figure: HTH motifs from PDB entries 1jhg, 1hcr, 1b9m, 1eto, 1lmb, 1ais, 1orc]
Sue Jones
![Page 34: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/34.jpg)
ProFunc – function from 3D structure
Homologous sequences of known function
Binding site identification and analysis
Homologous structures of known function
Functional sequence motifs: Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC]
Enzyme active site 3D-templates
HTH-motifs
Electrostatics
Surface comparison
… etc
DNA-, ligand- binding and “reverse” templates
Residue conservation analysis
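The functional sequence motif quoted on this slide is written in PROSITE pattern syntax. A minimal sketch of translating such a pattern into a Python regular expression follows; it handles only single residues, `x`, `[...]`, `{...}` and repeat counts, and is not how ProFunc itself scans for motifs. The test sequence is invented.

```python
import re

ELEMENT = re.compile(r"([A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")

def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        if core == "x":                 # x means any residue
            core = "."
        elif core.startswith("{"):      # {ABC} means any residue except A, B, C
            core = "[^" + core[1:-1] + "]"
        if low and high:                # x(2,4) style range
            core += "{%s,%s}" % (low, high)
        elif low:                       # x(3) style exact repeat
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)

regex = prosite_to_regex("Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC]")
# Hypothetical sequence built to contain one occurrence of the motif.
match = re.search(regex, "MMQAAAGACYAASMM")
```

The generated expression can be reused with `re.finditer` to scan a whole sequence database one entry at a time.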
![Page 35: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/35.jpg)
Three MCSG Examples (James Watson)
Three examples show the varying levels of information that can be retrieved from structures:
1. Almost full functional information. GOOD
•APC 1040
2. General information. NOT SO GOOD
•APC 012
3. Little or no information obtained. UGLY
•APC 078
![Page 36: From Structure to Function Janet Thornton European Bioinformatics Institute](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649e715503460f94b6fda8/html5/thumbnails/36.jpg)
Acknowledgements
• Roman Laskowski, James Watson, Richard Morris, Rafael Najmanovich, Fabian Glaser - EBI
• Christine Orengo, Annabel Todd, James Bray, Russell Marsden – University College, London
• MCSG members – Andrzej Joachimiak, Al Edwards etc
• Funding: NIH - PSI; EU - SPINE; DoE – DNA Motifs; UK BBSRC LINK