Post on 19-Dec-2015


Page 1: Protein Functional Site Prediction The identification of protein regions responsible for stability and function is an especially important post-genomic

Protein Functional Site Prediction

• The identification of protein regions responsible for stability and function is an especially important post-genomic problem.

• With the explosion of genomic data from recent sequencing efforts, predicting protein functional sites from sequence alone is an increasingly important bioinformatics task.

Page 2

What is a “Functional Site”?

• Defining what constitutes a “functional site” is not trivial

• Residues that carry out a known function, and residues that cluster around them, are clear candidates for functional sites

• We define a functional site as catalytic residues, binding sites, and the regions that cluster around them.

Page 3

Protein

Page 4

Protein + Ligand

Page 5

Functional Sites (FS)

Page 6

Regions that Cluster Around FS

Page 7

Phylogenetic motifs

• Phylogenetic motifs (PMs) are short sequence fragments that conserve the overall familial phylogeny

• Are they functional?

• How do we detect them?

Page 8

Phylogenetic motifs

• PMs are short sequence fragments that conserve the overall familial phylogeny

• Are they functional?

• How do we detect them?

• First we design a simple heuristic to find them

• Then we see if the detected sites are functional

Page 9

Phylogenetic Motif Identification

• Compare every windowed tree with the whole-sequence tree and record the partition metric score for each window

• Normalize all partition metric scores by calculating z-scores

• Call these normalized scores Phylogenetic Similarity Z-scores (PSZ)

• Set a PSZ threshold for identifying windows that represent phylogenetic motifs
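The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' implementation: each tree is represented as a set of bipartitions (splits) of the taxa, the partition metric is taken to be the count of splits found in one tree but not the other, low scores are treated as "more similar to the whole tree", and the threshold of -1.5 is a placeholder value. The function names, the toy trees, and the threshold are all hypothetical.

```python
from statistics import mean, pstdev


def partition_metric(splits_a, splits_b):
    """Number of bipartitions present in exactly one of the two trees.

    Assumption: each tree is a frozenset of splits, and each split is a
    frozenset holding the taxa on one (consistently chosen) side.
    """
    return len(splits_a ^ splits_b)


def psz_scores(window_trees, whole_tree):
    """Z-score normalize the partition metric of every windowed tree."""
    raw = [partition_metric(tree, whole_tree) for tree in window_trees]
    mu, sigma = mean(raw), pstdev(raw)
    if sigma == 0:
        # All windows are equally distant, so nothing stands out.
        return [0.0 for _ in raw]
    return [(score - mu) / sigma for score in raw]


def motif_windows(psz, threshold=-1.5):
    """Indices of windows whose PSZ is at or below the threshold.

    Assumption: a strongly negative PSZ means the windowed tree is
    unusually close to the whole tree. The default is a placeholder.
    """
    return [i for i, z in enumerate(psz) if z <= threshold]


def split(*taxa):
    """Helper for writing toy splits (hypothetical, for this example only)."""
    return frozenset(taxa)


# Toy five-taxon example (A-E); each split lists the side without taxon E.
whole = frozenset({split("A", "B"), split("A", "B", "C")})
near = frozenset({split("A", "C"), split("A", "B", "C")})
far = frozenset({split("A", "D"), split("B", "C")})

windows = [far, far, whole, far, far, near, far, far]
psz = psz_scores(windows, whole)
hits = motif_windows(psz)
print(hits)
```

In this toy input, only the window whose tree matches the whole tree exactly falls below the placeholder threshold; the "near" window is closer than average but not enough to be flagged. In practice the trees would come from a phylogeny program and the threshold would be chosen per family, as the next slide suggests.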

Page 10

Set PSZ Threshold

Page 11

Regions of PMs

Page 12

TIM (triosephosphate isomerase)

[Figure: plot labeled Phylogenetic Similarity and False Positive Expectation]

Page 13

TIM

[Figure: plot labeled Phylogenetic Similarity and False Positive Expectation]

Page 14

Cytochrome P450

[Figure: plot labeled Phylogenetic Similarity and False Positive Expectation]

Page 15

Cytochrome P450

[Figure: plot labeled Phylogenetic Similarity and False Positive Expectation]

Page 16

Enolase

[Figure: plot labeled Phylogenetic Similarity and False Positive Expectation]

Page 17

Glycerol Kinase

[Figure: plot labeled Phylogenetic Similarity and False Positive Expectation]

Page 18

Glycerol Kinase

[Figure: plot labeled Phylogenetic Similarity and False Positive Expectation]

Page 19

Myoglobin

[Figure: plot labeled Phylogenetic Similarity and False Positive Expectation]

Page 20

Myoglobin

[Figure: plot labeled Phylogenetic Similarity and False Positive Expectation]