A network-based representation of protein fold space
Spencer Bliven
Qualifying Examination 6/6/2011
Overview
1. Background & Motivation
2. Preliminary Research
3. Proposed Future Research
Fold Space
What protein folds are possible?
Discrete or Continuous? Both? Neither?
What portion of fold space is utilized by nature?
Long debated questions. Why?
Understanding of structure-function relationship
Protein design/engineering
Protein evolution
Classification
Previous Work
Orengo, Flores, Taylor, Thornton. Protein Eng (1993) vol. 6 (5) pp. 485-500
Holm and Sander. J Mol Biol (1993) vol. 233 (1) pp. 123-38
Holm and Sander. Science (1996) vol. 273 (5275) pp. 595-603
Shindyalov and Bourne. Proteins (2000) vol. 38 (3) pp. 247-60
Hou, Sims, Zhang, Kim. PNAS (2003) vol. 100 (5) pp. 2386-90
Taylor. Curr Opin Struct Biol (2007) vol. 17 (3) pp. 354-61
Sadreyev et al. Curr Opin Struct Biol (2009) vol. 19 (3) pp. 321-8
Figure labels: α, α+β, β, α/β
Why can we do better?
More structures
73503 (http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=total&seqid=100, accessed 5/31/2011)
Sampling of globular folds “saturated”
Few novel folds being discovered
Geometric arguments for saturation of small protein folds
Recent all-vs-all computation
Cluster sequence to 40% identity
17,852 representatives (updated weekly)
189 million FATCAT rigid-body alignments
Structural Similarity Graph
Nodes: PDB chains, non-redundant to 40%
Edges: FATCAT-rigid alignments
“Significant” edges: p<0.001, Length > 25, Coverage > 50
Hierarchically cluster to reduce complexity in visualization
Figure legend: a, b, a/b, a+b, Multi, Membrane, Small
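The edge criteria on this slide can be sketched as a filter over alignment records. This is a minimal illustration rather than the pipeline used for the talk: the `Alignment` record, its field names, the reading of "Coverage > 50" as a percentage, and the toy values (chains named A to D) are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one pairwise FATCAT-rigid alignment."""
    chain_a: str
    chain_b: str
    p_value: float
    length: int       # number of aligned residues
    coverage: float   # assumed to be a percentage (0-100)


def is_significant(aln, max_p=0.001, min_length=25, min_coverage=50.0):
    """Apply the slide's thresholds: p < 0.001, length > 25, coverage > 50."""
    return (aln.p_value < max_p
            and aln.length > min_length
            and aln.coverage > min_coverage)


def build_graph(alignments):
    """Return an undirected adjacency map containing only significant edges."""
    graph = defaultdict(set)
    for aln in alignments:
        if is_significant(aln):
            graph[aln.chain_a].add(aln.chain_b)
            graph[aln.chain_b].add(aln.chain_a)
    return dict(graph)


# Invented values, for illustration only.
alignments = [
    Alignment("A", "B", 1e-4, 120, 75.0),  # passes all three thresholds
    Alignment("A", "C", 0.2, 80, 60.0),    # p-value too high
    Alignment("B", "D", 1e-5, 20, 90.0),   # alignment too short
]
graph = build_graph(alignments)
```

Only the A-B pair survives the filter here, so chains C and D do not appear in the graph at all. Whether isolated chains should be kept as unconnected nodes is a design choice the slide does not settle.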
Agreement with SCOP
Class p<10^-6
Fold p<10^-7
Superfamily p<10^-10
Continuity
Grishin. J Struct Biol (2001) vol. 134 (2-3) pp. 167-85
Skolnick claims ≤ 7 intermediates between any proteins
We observe network diameter=15
Can find interesting paths
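The diameter and the paths mentioned above can both be computed with breadth-first search on the unweighted similarity graph. The sketch below assumes the graph is stored as an adjacency map (node to set of neighbours); the function names and the toy graph are illustrative and not taken from the talk. Note that a path of k edges passes through k - 1 intermediate nodes.

```python
from collections import deque


def bfs_distances(graph, source):
    """Edge-count distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def diameter(graph):
    """Longest shortest path between any two mutually reachable nodes."""
    return max(max(bfs_distances(graph, n).values()) for n in graph)


def shortest_path(graph, start, goal):
    """One shortest path from start to goal, or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in graph.get(node, ()):
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None


# Toy graph: a chain A-B-C-D with an extra node E attached to B.
toy = {
    "A": {"B"},
    "B": {"A", "C", "E"},
    "C": {"B", "D"},
    "D": {"C"},
    "E": {"B"},
}
path = shortest_path(toy, "A", "D")
intermediates = len(path) - 2
```

Running one BFS per node costs O(V·(V+E)) in total, which should be manageable for a graph of a few tens of thousands of nodes, though a sparse-matrix library may be preferable at larger scale.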
Symmetry
Beta Propellers: C4, C5, C6, C7
Symmetry
Functionally important
Protein evolution (e.g. beta-trefoil)
DNA binding
Allosteric regulation
Cooperativity
Widespread (~20% of proteins)
Focus of algorithmic work
FGF-1: Lee & Blaber. PNAS 2011
TATA Binding Protein: 1TGH
Hemoglobin: 4HHB
Cross-class example
3GP6.A: PagP, modifies lipid A; f.4.1 (transmembrane beta-barrel)
1KT6.A: Retinol-binding protein; b.60.1 (Lipocalins)
Summary of Preliminary Research
Calculated all-vs-all alignment
Prlić A, Bliven S, Rose PW, Bluhm WF, Bizon C, Godzik A, Bourne PE. Pre-calculated protein structure alignments at the RCSB PDB website. Bioinformatics (2010) vol. 26 (23) pp. 2983-2985
Built network of significant alignments
Approximately matches SCOP classifications
Improved structural alignment algorithms
Identify symmetry, circular permutations, topology-independent alignments
Discussed more in report
Future Research
Improve the network
1. Improve all-vs-all comparison algorithm
2. Tune parameters during graph generation
Annotate the network & draw biological inferences
3. Annotate nodes with functional information
4. Compare with other networks
Create new networks
5. Enhance structural comparison algorithms
1. Improve all-vs-all comparison algorithm
Need domain decomposition
Use Combinatorial Extension (CE)
2. Tune parameters during graph generation
Don’t use p-values
Shouldn’t compare p-values, statistically*
Not normalized by secondary structure
Not accurate due to multiple testing problem
Use TM-score
RMSD, normalized to the alignment length
Determine optimal thresholds for determining “significance”
For instance, train an SVM
* Technically ok here, since one-to-one with the FATCAT score
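To make the contrast between the two measures concrete, here is a small sketch of RMSD and of the commonly used TM-score definition, both computed from the distances between aligned residue pairs after superposition. One caveat relative to the slide's wording: the standard TM-score is normalized by the length of the reference (target) protein rather than by the alignment length, and it down-weights large distances instead of averaging their squares. The function names, the 0.5 Å floor on d0, and the rejection of very short targets are assumptions of this sketch.

```python
import math


def rmsd(distances):
    """Root-mean-square deviation over the aligned residue pairs (Å)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score normalized by target_length.

    distances: per-pair distances (Å) for aligned residues after superposition.
    target_length: number of residues in the reference protein.
    """
    if target_length <= 15:
        # The length-dependent scale d0 is not defined for such short chains.
        raise ValueError("target_length must be greater than 15")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    d0 = max(d0, 0.5)  # floor for short targets (assumption of this sketch)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


perfect = tm_score([0.0] * 100, 100)      # every residue aligned exactly
half_covered = tm_score([0.0] * 50, 100)  # only half the target aligned
```

Because unaligned residues contribute nothing to the sum, a perfect superposition of half the target scores 0.5, whereas RMSD over the same 50 pairs would be 0 and hide the missing coverage.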
FATCAT p-value by Class
Performs poorly on all-alpha in “twilight zone”
Terrible on membrane proteins
Probably reflects non-structural considerations in SCOP assignment
3. Annotate nodes with functional information
SCOP/CATH classifications
GO terms
Metal binding
Ligand binding
Symmetry
Figure legend: a, b, a/b, a+b, Multi, Membrane, Small
4. Compare with other networks
Define other types of network over the set of protein representatives
Protein-protein interactions
Co-expression
Correlate to the structural similarities
Figure panels: Structural similarity, Protein-protein interaction
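One simple way to correlate two networks defined over the same set of representatives is to measure how many edges they share. The sketch below uses the Jaccard index of the two edge sets. The slides do not specify a comparison method, so the measure, the function names, and the toy protein labels P1 to P3 are assumptions chosen for illustration.

```python
def edge_set(graph):
    """Undirected edges as frozensets, so (a, b) and (b, a) coincide."""
    return {
        frozenset((a, b))
        for a, neighbours in graph.items()
        for b in neighbours
        if a != b
    }


def edge_jaccard(graph_a, graph_b):
    """Shared edges divided by all edges present in either network."""
    edges_a, edges_b = edge_set(graph_a), edge_set(graph_b)
    union = edges_a | edges_b
    if not union:
        return 0.0
    return len(edges_a & edges_b) / len(union)


# Toy networks over the same three hypothetical representatives.
structural = {"P1": {"P2", "P3"}, "P2": {"P1"}, "P3": {"P1"}}
interaction = {"P1": {"P2"}, "P2": {"P1", "P3"}, "P3": {"P2"}}
overlap = edge_jaccard(structural, interaction)
```

Here the networks share one edge (P1-P2) out of three distinct edges, giving an overlap of one third. In practice the raw overlap would need to be compared against a null model, such as degree-preserving shuffles, before drawing any biological inference.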
5. Enhance structural comparison algorithms
Improve automated pseudo-symmetry detection
Find topology-independent relationships
Figure label: C3
Summary
Fold space as network
Improve network creation
Annotate network with functional information
Improve structural similarity detection
Acknowledgments
Bourne Lab
Philip Bourne
Andreas Prlić
Lab & PDB members
Qualifying Exam Committee
Ruben Abagyan
Patricia Jennings
Andy McCammon
Collaborators
Philippe Youkharibache
Jean-Pierre Changeux
Rotation Advisors
Pavel Pevzner
Philip Bourne
José Onuchic & Pat Jennings
Mike MacCoss
Virgil Woods