Development of classification methods to predict new 14-3-3-binding proteins and phosphopeptides
Fábio M. Marques Madeira
Supervisor: Professor Geoff Barton
7th May 2013
14-3-3s dock onto pairs of tandem phosphoSer/Thr
Hundreds of structurally and functionally diverse targets
[Figure labels: P, P; Kinase 1, Kinase 2; 14-3-3; 2R-ohnologue families]
The binding specificity of 14-3-3s is determined by overall steric fit and the sequence flanking the phosphoSer/Thr site
Mode I: RSX(pS/T)XP
Mode II: RX(F/Y)X(pS)XP
Mode III: C-terminal X(pS/T)
Johnson et al. (2011) Molecular & Cellular Proteomics 10, M110.005751.
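A minimal sketch of how the Mode I and Mode II consensus motifs could be turned into a candidate-site scan over a plain protein sequence. The pattern table, function name and example sequence are hypothetical. A plain sequence carries no phosphorylation information, so the hits are only potential sites. Mode III is left out because it depends on the position of the C-terminus.

```python
import re

# Candidate-site patterns read off the Mode I and Mode II consensus motifs.
# X (any residue) becomes "."; the central S/T is only a *potential*
# phosphosite, since a plain sequence has no phosphorylation annotation.
# Lookaheads are used so that overlapping motifs are all reported.
MODE_PATTERNS = {
    "Mode I": (re.compile(r"(?=(RS.[ST].P))"), 3),   # offset of S/T in motif
    "Mode II": (re.compile(r"(?=(R.[FY].S.P))"), 4),  # offset of S in motif
}

def find_candidate_sites(sequence):
    """Return (mode, 1-based position of the S/T, matched motif) tuples."""
    hits = []
    for mode, (pattern, offset) in MODE_PATTERNS.items():
        for match in pattern.finditer(sequence):
            hits.append((mode, match.start(1) + offset + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical toy sequence, not a real protein.
print(find_candidate_sites("AARSASAPGGRAFASAPKK"))
```

Matching a consensus pattern like this is an all-or-nothing test, which is one reason scoring-based classifiers are attractive for sites that deviate from the canonical motifs.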
ANIA: ANnotation and Integrated Analysis of the 14-3-3 interactome
Development and evaluation of three new classifiers
Position-specific scoring matrix (PSSM)
Artificial Neural Network (ANN)
Support Vector Machines (SVM)
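To illustrate the simplest of the three classifiers, here is a sketch of a log-odds PSSM built from aligned positive peptides. It is not the implementation used in this work: the uniform background, the pseudocount value, the function names and the toy peptides are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(peptides, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length peptides.

    Assumes a uniform background (1/20 per residue); a real model would
    more likely use proteome-derived background frequencies.
    """
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for position in range(length):
        counts = Counter(p[position] for p in peptides)
        column = {}
        for aa in AMINO_ACIDS:
            frequency = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(frequency / background)
        pssm.append(column)
    return pssm

def score_peptide(pssm, peptide):
    """Sum the per-position log-odds; unknown residues contribute 0."""
    return sum(column.get(aa, 0.0) for column, aa in zip(pssm, peptide))

# Hypothetical toy peptides, not taken from the training datasets.
toy_pssm = build_pssm(["RSASAP", "RSGSLP", "RSATEP"])
print(score_peptide(toy_pssm, "RSASAP"), score_peptide(toy_pssm, "GGGGGG"))
```

A peptide is then called a binder when its score exceeds a chosen threshold; varying that threshold traces out the curve used for AUC evaluation.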
Defining positive and negative examples for training and testing
Training datasets:
Previous: 76 Pos, 76 Neg
Current: 273 Pos, 93 Neg
1,192 Likely Neg
[Figure labels: 72 proteins; pS/T, pS/T; C-, -N]
Blind datasets:
Previous: 17 Pos, 17 Neg
Current: 38 Pos, 38 Neg
Sequence redundancy thresholds: 60%, 50% and 40%
Different motif regions/lengths: [-3:3], [-5:5], [-7:7], [-9:9], [-11:11]
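The motif regions can be read as windows of residues around the candidate pS/T site. A minimal sketch of extracting such a window follows. The function name, the 1-based site convention and the "-" padding character for positions beyond the protein termini are assumptions, not details given in the slides.

```python
def extract_window(sequence, site, upstream=5, downstream=5, pad="-"):
    """Return residues [-upstream:+downstream] around a 1-based site.

    Positions falling outside the protein are filled with `pad`, so every
    window has length upstream + downstream + 1.
    """
    if not 1 <= site <= len(sequence):
        raise ValueError("site is outside the sequence")
    centre = site - 1  # convert to a 0-based index
    window = []
    for index in range(centre - upstream, centre + downstream + 1):
        window.append(sequence[index] if 0 <= index < len(sequence) else pad)
    return "".join(window)

# Hypothetical toy sequence with a candidate site at position 6.
toy = "MKRSASAPLLE"
print(extract_window(toy, 6, 3, 3))   # a [-3:3] region
print(extract_window(toy, 6))         # a [-5:5] region
```

Because upstream and downstream are separate parameters, the same helper also covers non-symmetrical regions.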
Development and evaluation of three new classifiers
The area under the curve (AUC) was tested by jackknife
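A sketch of what a jackknife estimate of the AUC might look like, assuming that "jackknife" here means leave-one-out testing and that the curve is the ROC curve. The `train` and `score` callables stand in for any of the three classifiers and are hypothetical placeholders.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(positives) * len(negatives))

def jackknife_auc(examples, labels, train, score):
    """Leave each example out in turn, train on the rest, score the one left out."""
    held_out_scores = []
    for i in range(len(examples)):
        training_x = examples[:i] + examples[i + 1:]
        training_y = labels[:i] + labels[i + 1:]
        model = train(training_x, training_y)
        held_out_scores.append(score(model, examples[i]))
    return auc(held_out_scores, labels)

# Toy one-dimensional "classifier": the model is the mean of the training
# values and the score is the distance above that mean (illustration only).
toy_x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
toy_y = [0, 0, 0, 1, 1, 1]
print(jackknife_auc(toy_x, toy_y,
                    train=lambda xs, ys: sum(xs) / len(xs),
                    score=lambda model, x: x - model))
```

Each example is scored by a model that never saw it, so the resulting AUC is less optimistic than one computed on the training data itself.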
Development and evaluation of three new classifiers
Q - Accuracy
MCC - Matthews Correlation Coefficient
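Both measures follow from the confusion-matrix counts (true/false positives and negatives). The sketch below uses the standard definitions; the function names and the toy counts are illustrative and are not results from this work.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Q: fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; returns 0.0 if a marginal is empty."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Toy counts for illustration only.
print(accuracy(8, 6, 4, 2), mcc(8, 6, 4, 2))
```

Unlike Q, the MCC stays informative when positives and negatives are unbalanced, which matters for datasets such as 273 Pos versus 93 Neg.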
Amino acid alphabet reduction reduces accuracy
Li et al., 2003; Livingstone and Barton, 1993
Grouping the 20 amino acids into 10 physicochemical classes
Overall, alphabet reduction led to lower classification performance, suggesting that some sequence features that influence 14-3-3 binding were lost in the reduction.
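A sketch of how such a reduction can be applied to a peptide. The 10-class grouping below is a hypothetical example chosen for illustration. It is not the grouping from Li et al. or Livingstone and Barton, nor necessarily the one used in this work.

```python
# Illustrative 10-class grouping (an assumption, NOT the published schemes).
REDUCED_CLASSES = {
    "a": "AG",    # small
    "c": "C",     # cysteine
    "d": "DE",    # acidic
    "f": "FWY",   # aromatic
    "h": "H",     # histidine
    "i": "ILMV",  # aliphatic
    "k": "KR",    # basic
    "n": "NQ",    # amide
    "p": "P",     # proline
    "s": "ST",    # hydroxyl
}
RESIDUE_TO_CLASS = {
    residue: symbol
    for symbol, residues in REDUCED_CLASSES.items()
    for residue in residues
}

def reduce_alphabet(peptide, unknown="x"):
    """Map each residue to its class symbol; unmapped characters become `unknown`."""
    return "".join(RESIDUE_TO_CLASS.get(residue, unknown) for residue in peptide)

# Two different toy peptides collapse to the same reduced string.
print(reduce_alphabet("RSASAP"), reduce_alphabet("KTGTGP"))
```

The collapse of distinct peptides onto one reduced string shows how residue-level distinctions, for example Arg versus Lys, become invisible to a classifier trained on the reduced alphabet.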
Protein secondary structure, disorder and conservation do not improve the performance of the ANN
Sequence conservation
Protein secondary structure by Jpred
Protein disorder by IUPred, DisEMBL and GlobPlot
P – Positives; N – Negatives (true + likely neg); L – Likely neg only; R – Random neg
Blind testing shows that the PSSM is the best overall predictor
80% Overall Accuracy
Prediction of new 14-3-3-binding sites using the PSSM
Human Proteome
Scansite includes a set of predictions based on the Mode I 14-3-3-binding motif: RSX(pS/T)XP
The PSSM predictor outperforms Scansite in terms of accuracy
[Figure panels: PSSM, Scansite]
Conclusions
New strategy to map negative datasets
Performance improvement (AUC from ~0.80 to 0.88) and 80% accuracy for the PSSM model (60% redundancy threshold, [-5:5] motif region)
Large-scale prediction of the human 14-3-3-binding proteome
The PSSM classifier outperforms Scansite in terms of accuracy
Future work
1. Test training of the classifiers using non-symmetrical motif regions, e.g. [-6:3]
2. Investigate new machine learning algorithms such as Bayesian classifiers
3. Use the PSSM classifier to predict the 14-3-3-binding proteome of model organisms such as Arabidopsis thaliana
4. Integrate predictions in ANIA and investigate if the candidate sites are lynchpin sites conserved across 2R-ohnologue family members
Acknowledgements
Geoff Barton
Chris Cole
All members in the Computational Biology group
Carol MacKintosh and Michele Tinti