Databanks + new tools = new insights
![Page 1: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/1.jpg)
THE AXIOM: Databanks + New tools = New insights
Simple Atom Depth Index Calculator
protein fold barcoding: CATH-ADAPT…
![Page 2: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/2.jpg)
Digging inside objects to discover their origins
Birth of the Earth
protein folding
SADIC: a new tool to analyze atom depth
![Page 3: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/3.jpg)
atom depth, 2D (HEWL 4lzt)
atom depth calculated as the distance to:
the closest external water*
the closest dot of the water accessible surface*
the closest surface exposed atom*
* Chakravarty S, Varadarajan R. Residue depth: a novel parameter for the analysis of protein structure and stability. Structure Fold Des. 1999 7:723-732.
* Pintar A, Carugo O, Pongor S. Atom depth as a descriptor of the protein interior. Biophys J. 2003 84:2553-2561.
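The distance-based definitions of atom depth listed on this slide all reduce to a nearest-neighbour search. A minimal sketch, assuming the reference points (external waters, surface dots or surface exposed atoms) have already been identified by some other tool; the function name and the coordinates are hypothetical and only illustrate the idea:

```python
import math

def depth_to_nearest(atom_xyz, reference_points):
    """Distance (in Å) from one atom to the closest reference point.

    reference_points can hold external water positions, dots of the
    water accessible surface, or surface exposed atoms.
    """
    return min(math.dist(atom_xyz, p) for p in reference_points)

# Hypothetical coordinates (Å), for illustration only.
buried_atom = (0.0, 0.0, 0.0)
external_waters = [(6.0, 0.0, 0.0), (0.0, 8.0, 0.0), (3.0, 4.0, 0.0)]

depth = depth_to_nearest(buried_atom, external_waters)
print(depth)
```

Whichever reference set is chosen, the result is a single distance per atom.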
![Page 4: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/4.jpg)
atom depth, 2D and 3D (HEWL 4lzt)
Calculation of exposed volumes
Daniele Varrazzo, Andrea Bernini, Ottavia Spiga, Arianna Ciutti, Stefano Chiellini, Vincenzo Venditti, Luisa Bracci and Neri Niccolai. Three-dimensional Computation of Atomic Depth in Complex Molecular Structures. Bioinformatics 2005 21:2856-2860
![Page 5: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/5.jpg)
atom depth, 3D (HEWL 4lzt)
Calculation of exposed volumes
![Page 6: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/6.jpg)
atom depth, 3D (HEWL 4lzt)
Calculation of exposed volumes
Depth index: $D_{i,r} = 2\,V_{i,r} / V_{0,r}$
where $V_{i,r}$ is the exposed volume of a sphere of radius r centered on atom i of the molecule and $V_{0,r}$ is the exposed volume of the same sphere when centered on an isolated atom
the sphere radius r should have the biggest value which makes $V_{i,r} = 0$ for the most buried atom
(Varrazzo et al., Bioinformatics 2005 21:2856-2860)
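The depth index definition above can be sketched numerically. The code below is only an illustration under stated assumptions, not the SADIC implementation: the molecule is modelled as a union of spheres with user-supplied radii, the exposed volume is estimated by counting points of a regular grid, and the names `exposed_volume`, `depth_index` and `step` are hypothetical.

```python
import math
from itertools import product

def exposed_volume(center, r, atoms, step=0.5):
    """Grid estimate of the part of the sphere (center, r) outside every atom.

    atoms is a list of ((x, y, z), radius) pairs; treating the molecule as
    a union of such spheres is an assumption of this sketch.
    """
    n = int(r / step)
    offsets = [k * step for k in range(-n, n + 1)]
    exposed = 0
    for dx, dy, dz in product(offsets, repeat=3):
        if dx * dx + dy * dy + dz * dz > r * r:
            continue  # grid point lies outside the probe sphere
        p = (center[0] + dx, center[1] + dy, center[2] + dz)
        if all(math.dist(p, xyz) > rad for xyz, rad in atoms):
            exposed += 1
    return exposed * step ** 3

def depth_index(atoms, i, r, step=0.5):
    """D_{i,r} = 2 V_{i,r} / V_{0,r}; r must exceed the radius of atom i."""
    xyz, rad = atoms[i]
    v_i = exposed_volume(xyz, r, atoms, step)          # atom inside the molecule
    v_0 = exposed_volume(xyz, r, [(xyz, rad)], step)   # same atom, isolated
    return 2.0 * v_i / v_0

# Hypothetical two-atom "molecule" (coordinates and radii in Å).
isolated = [((0.0, 0.0, 0.0), 1.8)]
pair = isolated + [((3.0, 0.0, 0.0), 1.8)]
print(depth_index(isolated, 0, 4.0), depth_index(pair, 0, 4.0))
```

With this definition an isolated atom has an index of 2, and the index decreases as neighbouring atoms fill the probe sphere. A finer `step` gives a better volume estimate at a cubic cost in time.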
![Page 7: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/7.jpg)
[Figure: Di,r plotted against the sphere radius r (Å)]
![Page 8: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/8.jpg)
atom depth: 3D vs 2D (HEWL 4lzt)
Thr 47 α carbon: Di,9 = 1.59
Ile 58 α carbon: Di,9 = 0.13
Trp 28 α carbon: Di,9 = 0.03
![Page 9: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/9.jpg)
3D atom depth analysis
[Figure: Di, from PDB ID 1UBQ]
http://www.sbl.unisi.it/prococoa/
![Page 10: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/10.jpg)
SBL Bioinformatics Projects
Projects SADIC correlated:
1. fold dependent aa compositions of protein cores;
2. towards i-SADIC.
Projects SADIC uncorrelated:
1. systematic analysis of PPI
![Page 11: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/11.jpg)
Di analysis of protein atoms
defining structural layers in protein 3D structures
each structural layer includes atoms with similar Di's
fast and accurate analysis of aa content of structural layers
![Page 12: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/12.jpg)
Di analysis of protein atoms
3VTR (chitinolytic enzyme, 572 aa)

| Ln | Di | color |
|---|---|---|
| L6 | > 1.2 | red |
| L5 | 1.0 – 1.2 | orange |
| L4 | 0.8 – 1.0 | yellow |
| L3 | 0.6 – 0.8 | green |
| L2 | 0.4 – 0.6 | blue |
| L1 | 0.2 – 0.4 | indigo |
| L0 | < 0.2 | violet |
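The layer table can be read as a simple binning rule on Di. A minimal sketch, assuming each bin includes its lower bound (the slide does not say how values falling exactly on a boundary are handled); the names `BOUNDS`, `COLORS` and `layer_of` are illustrative:

```python
from bisect import bisect_right

# Upper bounds of layers L0..L5; anything above the last bound is L6.
BOUNDS = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
COLORS = ["violet", "indigo", "blue", "green", "yellow", "orange", "red"]

def layer_of(di):
    """Return (layer name, color) for a depth index value."""
    n = bisect_right(BOUNDS, di)  # number of bounds that are <= di
    return f"L{n}", COLORS[n]

for di in (0.05, 0.50, 1.29):
    print(di, layer_of(di))
```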
![Page 13: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/13.jpg)
3D atom depth analysis, from PDB ID 1UBQ
http://www.sbl.unisi.it/prococoa/
Di per atom, with Dimax marked for each residue:
K63: N 0.19, CA 0.30, C 0.25, O 0.23, CB 0.50, CG 0.68, CD 0.91, CE 1.11, NZ 1.29
E24: N 0.38, CA 0.52, C 0.50, O 0.52, CB 0.76, CG 0.95, CD 1.17, OE1 1.24, OE2 1.24
L43: N 0.10, CA 0.05, C 0.11, O 0.18, CB 0.02, CG 0.02, CD1 0.02, CD2 0.00
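The per-atom values on this slide can be turned into one value per residue. The sketch below assumes that Dimax is the largest Di among the atoms of a residue, which the slide suggests but does not state, and applies the Dimax < 0.2 core criterion used elsewhere in these slides; the variable names are illustrative.

```python
# Di values of three ubiquitin residues, copied from the slide (PDB ID 1UBQ).
DI_1UBQ = {
    "K63": {"N": 0.19, "CA": 0.30, "C": 0.25, "O": 0.23, "CB": 0.50,
            "CG": 0.68, "CD": 0.91, "CE": 1.11, "NZ": 1.29},
    "E24": {"N": 0.38, "CA": 0.52, "C": 0.50, "O": 0.52, "CB": 0.76,
            "CG": 0.95, "CD": 1.17, "OE1": 1.24, "OE2": 1.24},
    "L43": {"N": 0.10, "CA": 0.05, "C": 0.11, "O": 0.18, "CB": 0.02,
            "CG": 0.02, "CD1": 0.02, "CD2": 0.00},
}
CORE_THRESHOLD = 0.2  # core aa if Dimax < 0.2

# Assumption: Dimax is the maximum Di over the atoms of the residue.
dimax = {res: max(atoms.values()) for res, atoms in DI_1UBQ.items()}
core = [res for res, d in dimax.items() if d < CORE_THRESHOLD]

print(dimax)
print(core)
```

Under this reading only L43 falls in the core, while the two charged residues reach the outermost layer through their side-chain ends.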
![Page 14: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/14.jpg)
Dimax analysis of protein residues
defining aa occupancy in protein structural layers
each structural layer includes residues with similar Dimax's
fast and accurate analysis of aa distribution in protein structures
![Page 15: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/15.jpg)
Dimax analysis of protein singles
quite a few proteins like to stay single (at least in the crystalline state)
Bioinformatiha 2, Florence, 18 October
![Page 16: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/16.jpg)
DOOPS: a database of protein singles
Experimental Method: X-RAY (79,770)
Chain Type: Protein (74,456)
Only 1 chain in asym. unit: (28,803)
Oligomeric state: 1 (21,193)
Number of Entities: 1 (3,517)
Homologue Removal @ 95% identity (2,410)
2,410 proteins in the dataset
4,657,574 atoms
589,383 residues
[Figure: histogram; axis names not recoverable]
![Page 17: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/17.jpg)
DOOPS: a database of protein singles
2,410 proteins in the dataset
4,657,574 atoms
589,383 residues
Swiss-Prot: 540,958 proteins in the dataset (192 Maa)
[Figure: histogram; axis names not recoverable]
![Page 18: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/18.jpg)
Dimax analysis of protein cores
DOOPS: 2,410 proteins; 4,657,574 atoms; 589,383 residues
calculation of % amino acid content in L0: the first quantitative analysis of a large array of protein cores!
core aa if Dimax < 0.2
ΣDOOPS aa(L0) = 106,088 (from 2410 proteins)
~20 % of total molecular volume

| aa | % in L0 |
|---|---|
| Alanine | 11.51 |
| Cysteine | 2.63 |
| Aspartate | 1.77 |
| Glutamate | 1.20 |
| Phenylalanine | 6.36 |
| Glycine | 10.81 |
| Histidine | 1.32 |
| Isoleucine | 11.74 |
| Lysine | 0.58 |
| Leucine | 16.27 |
| Methionine | 2.49 |
| Asparagine | 1.70 |
| Proline | 2.45 |
| Glutamine | 1.21 |
| Arginine | 0.83 |
| Serine | 4.85 |
| Threonine | 4.65 |
| Valine | 13.70 |
| Tryptophan | 1.43 |
| Tyrosine | 2.50 |
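A percentage composition of the L0 layer can be obtained by counting the residues that satisfy the core criterion. A minimal sketch, assuming the input is a list of (residue name, Dimax) pairs; the toy residues below are hypothetical and are not taken from DOOPS:

```python
from collections import Counter

def core_composition(residues, threshold=0.2):
    """Percentage of each amino acid among residues with Dimax < threshold."""
    core = [name for name, dimax in residues if dimax < threshold]
    if not core:
        return {}
    counts = Counter(core)
    return {name: 100.0 * n / len(core) for name, n in counts.items()}

# Hypothetical (residue, Dimax) pairs, for illustration only.
residues = [("LEU", 0.05), ("LEU", 0.10), ("VAL", 0.15),
            ("ALA", 0.19), ("LYS", 1.29), ("GLU", 1.24)]

composition = core_composition(residues)
print(composition)
```

Pooling the residues of all proteins in a dataset before calling the function gives one overall table; calling it per fold group gives one column per group.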
![Page 19: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/19.jpg)
Di analysis of protein cores: folding clues from aa core composition?

| Class | Architectures | Topology | Homologous superfamily | Domains |
|---|---|---|---|---|
| 1 (mainly α) | 5 | 386 | 875 | 37,038 |
| 2 (mainly β) | 20 | 229 | 520 | 43,881 |
| 3 (α & β) | 14 | 594 | 1113 | 90,029 |
| 4 (few sec. str.) | 1 | 104 | 118 | 2,588 |
| Total | 40 | 1313 | 2626 | 173,536 |
![Page 20: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/20.jpg)
Di analysis of protein cores: folding clues from aa core composition?
DOOPS + CATH: selected Architectures with ≥ 10 PDB files

| Architecture | # Proteins (mono domain) |
|---|---|
| 1.10 | 213 (84) |
| 1.20 | 84 (40) |
| 1.25 | 19 (17) |
| 1.50 | 10 (3) |
| 2.10 | 17 (13) |
| 2.30 | 57 (37) |
| 2.40 | 94 (73) |
| 2.60 | 134 (110) |
| 2.80 | 12 (12) |
| 3.10 | 84 (73) |
| 3.20 | 52 (44) |
| 3.30 | 139 (106) |
| 3.40 | 218 (203) |
| 3.60 | 10 (8) |
| 3.90 | 49 (49) |
| total | 1,190 (872) |
![Page 21: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/21.jpg)
Towards protein folding barcodes
CATH-ADAPT: CATH - atom depth assisted protein tomography
Legend: aa % average value (av); av + σ; av + 2σ; av - σ; av - 2σ
Examples:
Ala: PDB ID 3CKC(A02), alpha horseshoe
Cys: PDB ID 1UZK(A01), ribbon
Leu, Phe: PDB ID 1RG8(A00), trefoil
Val: PDB ID 2IMH(A01), four layer sandwich

| % L0 | 1.10 | 1.20 | 1.25 | 1.50 | 2.10 | 2.30 | 2.40 | 2.60 | 2.80 | 3.10 | 3.20 | 3.30 | 3.40 | 3.60 | 3.90 | overall |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ALA | 13.28 | 10.32 | 21.46 | 12.74 | 9.26 | 10.05 | 8.43 | 9.32 | 5.5 | 10.69 | 10.08 | 12.58 | 11.88 | 14.95 | 12.01 | 11.51 |
| ARG | 0.6 | 1.28 | 0.24 | 1.39 | 0 | 0.64 | 1.72 | 0.75 | 0 | 0.55 | 1.11 | 1.75 | 0.3 | 0.47 | 0.95 | 0.83 |
| ASN | 0.67 | 2.62 | 0.73 | 2.77 | 1.85 | 2.04 | 1.77 | 1.36 | 0 | 2.1 | 2.9 | 0.96 | 1.52 | 2.8 | 2.1 | 1.70 |
| ASP | 1.61 | 2.62 | 0.24 | 2.91 | 1.23 | 1.27 | 2.03 | 1.79 | 0 | 2.1 | 2.9 | 3.02 | 1.77 | 2.34 | 0.95 | 1.77 |
| CYS | 3.35 | 2.99 | 5.37 | 0.83 | 22.84 | 2.04 | 1.46 | 4.42 | 0.92 | 2.83 | 2.1 | 1.49 | 1.86 | 1.4 | 3.05 | 2.63 |
| GLN | 0.6 | 1.5 | 0.24 | 1.11 | 1.23 | 1.15 | 1.81 | 1.69 | 0 | 0.46 | 1.56 | 2.15 | 0.99 | 1.4 | 1.33 | 1.21 |
| GLU | 1.48 | 1.44 | 0.73 | 1.52 | 0 | 1.15 | 1.19 | 1.04 | 0 | 0.91 | 2.59 | 2.41 | 1.08 | 0.93 | 0.67 | 1.20 |
| GLY | 8.05 | 8.72 | 9.76 | 13.85 | 16.05 | 9.92 | 16.2 | 10.82 | 9.17 | 8.78 | 11.81 | 11.35 | 12.64 | 13.08 | 9.91 | 10.81 |
| HIS | 1.01 | 1.6 | 2.44 | 1.11 | 0.62 | 0.76 | 0.79 | 0.56 | 0 | 2.65 | 1.96 | 3.02 | 1.91 | 0.47 | 2.48 | 1.32 |
| ILE | 12.68 | 9.95 | 10.73 | 8.59 | 6.79 | 13.61 | 10.68 | 10.78 | 13.76 | 12.8 | 11.77 | 12.53 | 11.53 | 7.01 | 11.34 | 11.74 |
| LEU | 23.88 | 18.34 | 22.44 | 11.77 | 8.02 | 17.18 | 12.97 | 13.98 | 33.94 | 16.54 | 11.9 | 14.33 | 14.22 | 15.42 | 13.63 | 16.27 |
| LYS | 0.67 | 0.91 | 0 | 1.11 | 0 | 0.38 | 0.49 | 0.56 | 0 | 0.09 | 0.62 | 1.36 | 0.55 | 0 | 0.67 | 0.58 |
| MET | 2.62 | 4.17 | 1.71 | 4.99 | 0 | 2.8 | 2.65 | 3.15 | 1.83 | 2.93 | 2.76 | 2.41 | 2.39 | 3.27 | 1.91 | 2.49 |
| PHE | 6.44 | 6.79 | 2.93 | 4.57 | 4.32 | 7.12 | 7.06 | 6.73 | 15.6 | 7.22 | 4.95 | 6.18 | 6.07 | 4.21 | 6.01 | 6.36 |
| PRO | 1.34 | 2.46 | 3.41 | 2.63 | 3.09 | 3.31 | 3 | 2.78 | 0 | 3.29 | 2.9 | 1.84 | 2.25 | 1.4 | 1.81 | 2.45 |
| SER | 3.49 | 4.55 | 3.66 | 5.96 | 3.09 | 5.34 | 5.56 | 5.13 | 2.75 | 2.83 | 5.35 | 4.43 | 4.23 | 6.07 | 5.34 | 4.85 |
| THR | 2.28 | 4.81 | 4.15 | 7.2 | 5.56 | 3.31 | 5.12 | 4.47 | 0.92 | 3.2 | 5.22 | 4.25 | 4.94 | 5.14 | 5.91 | 4.65 |
| TRP | 1.01 | 1.55 | 0 | 2.77 | 3.7 | 0.38 | 1.63 | 2.78 | 2.75 | 2.19 | 1.52 | 0.66 | 1.26 | 0.47 | 2.1 | 1.43 |
| TYR | 2.62 | 3.69 | 0.24 | 4.57 | 2.47 | 1.27 | 2.69 | 4.38 | 0.92 | 3.29 | 3.12 | 1.58 | 2.32 | 0 | 2.29 | 2.50 |
| VAL | 12.34 | 9.68 | 9.51 | 7.62 | 9.88 | 16.28 | 12.75 | 13.51 | 11.93 | 14.53 | 12.88 | 11.7 | 16.29 | 19.16 | 15.54 | 13.7 |
| # PDB | 213 (84) | 84 (40) | 19 (17) | 10 (3) | 17 (13) | 57 (37) | 94 (73) | 134 (110) | 12 (12) | 84 (73) | 52 (44) | 139 (106) | 218 (203) | 10 (8) | 49 (49) | 2,410 |

Di of 173,536 CATH domains: 28 h 5' (average comp. time 1.72 s/domain)
Calculations performed on a 6-core 990X CPU based computer
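The av, av ± σ and av ± 2σ legend suggests that each cell of the composition table is classified by how far it lies from an average. The sketch below is one possible reading and not necessarily the one used for the slides: it takes the mean and the population standard deviation over the 15 architectures for a single amino acid (the Cys L0 row of the table) and assigns each architecture a band from -2 to +2. The names `sigma_band` and `barcode` are illustrative.

```python
from statistics import mean, pstdev

ARCHITECTURES = ["1.10", "1.20", "1.25", "1.50", "2.10", "2.30", "2.40", "2.60",
                 "2.80", "3.10", "3.20", "3.30", "3.40", "3.60", "3.90"]
# % Cys in L0 per architecture, copied from the table.
CYS_L0 = [3.35, 2.99, 5.37, 0.83, 22.84, 2.04, 1.46, 4.42,
          0.92, 2.83, 2.1, 1.49, 1.86, 1.4, 3.05]

def sigma_band(value, av, sd):
    """Band of a value: +2 (>= av + 2σ), +1, 0, -1 or -2 (<= av - 2σ)."""
    if sd == 0:
        return 0
    z = (value - av) / sd
    if z >= 2:
        return 2
    if z >= 1:
        return 1
    if z <= -2:
        return -2
    if z <= -1:
        return -1
    return 0

def barcode(values):
    """One band per value, relative to the mean and σ of the same row."""
    av, sd = mean(values), pstdev(values)
    return [sigma_band(v, av, sd) for v in values]

cys_barcode = dict(zip(ARCHITECTURES, barcode(CYS_L0)))
print(cys_barcode)
```

Under these assumptions the ribbon architecture (2.10) is the only one whose core Cys content exceeds av + 2σ, in line with the Cys example shown for PDB ID 1UZK.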
![Page 22: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/22.jpg)
Towards protein folding barcodes
Putting the protein universe in order
![Page 24: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/24.jpg)
towards i-SADIC (implemented SADIC)
![Page 25: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/25.jpg)
towards i-SADIC (implemented SADIC)
H/D exchange rate profiles
![Page 30: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/30.jpg)
2D atom depth or 3D atom depth
H/D exchange rate profiles
data from Pedersen TG, Thomsen NK, Andersen KV, Madsen JC, Poulsen FM. Determination of the rate constants k1 and k2 of the Linderstrom-Lang model for protein amide hydrogen exchange. A study of the individual amides in hen egg-white lysozyme. J Mol Biol. 1993 230(2):651-660.
dnwi: atom distance to the nearest water molecule
Di,9: atom depth index with a probe of radius 9 Å
![Page 31: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/31.jpg)
iSADIC atom depth, 3D atom depth
H/D exchange rate profiles (data from Pedersen et al., J Mol Biol. 1993)
Di,9: atom depth index with a probe of radius 9 Å
iDi,9 = aDi,9 + bASAi cDi,9 + dDnwi
![Page 33: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/33.jpg)
protein-protein interface analysis
biological vs crystallographic interfaces
![Page 34: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/34.jpg)
crystallographic dimers
biological dimers
![Page 37: D atabanks + New tools = New insights](https://reader033.vdocuments.net/reader033/viewer/2022061405/5681634a550346895dd3dab8/html5/thumbnails/37.jpg)
[Figure: per-atom panels, ARG vs LYS]
ARG atoms: N, CA, C, O, CB, CG, CD, NE, CZ, NH1, NH2, H, HA, HB2, HB3, HG2, HG3, HD2, HD3, HE, HH11, HH12, HH21, HH22
LYS atoms: N, CA, C, O, CB, CG, CD, CE, NZ, H, HA, HB2, HB3, HG2, HG3, HD2, HD3, HE2, HE3, HZ1, HZ2, HZ3