Subtypes of Associated Protein-DNA (Transcription Factor-Transcription Factor Binding Site) Patterns

Subtypes of Associated Protein-DNA (TF-TFBS) Patterns

Prepared by: Cyrus Tak-Ming Chan

Tak-Ming Chan, Kwong-Sak Leung, Kin-Hong Lee, Man-Hon Wong, Chi-Kong Lau, Stephen Kwok-Wing Tsui, Subtypes of Associated Protein-DNA (Transcription Factor-Transcription Factor Binding Site) Patterns, Nucleic Acids Research, 2012, doi: 10.1093/nar/gks749.

17/Sep/2012 Version 1.2 (Typos corrected on P12)


http://www.cse.cuhk.edu.hk/~tmchan/subtypes/


Page 2

Introduction

Proteins bind to DNA fragments to regulate genes, i.e. transcription factors (TFs) bind to transcription factor binding sites (TFBSs)

Finding the binding cores (only several residues long) is fundamental

Page 3

Motivations

Finding patterns/motifs on one side only is challenging, e.g. TFBS motif discovery: noise, variations through mutations, and unknown locations leave only weak signals to be recovered

[Figure legend: Prediction; True TFBS]


Tak-Ming Chan et al, IEEE Transactions on Evolutionary Computation, 2012 / BMC Bioinformatics, 2009, 10: 321 / Bioinformatics, 2007, 24(3)

Page 4

Introduction

Finding associated patterns on both sides is shown to be promising when many diverse binding sequences are available (e.g. TRANSFAC)

Associated TF-TFBS patterns found from sequences…

7664 TF sequences in TRANSFAC; 408 AAs on average

26786 bound TFBSs, 1225 matrices in TRANSFAC; 25 bp on average

Associated pattern discovery

…NRIAA… …TGACA…

…NRAAA… …TGACA…

…NREAA… …TGTGA…

Tak-Ming Chan et al, Discovering approximate-associated sequence patterns for protein-DNA interactions. Bioinformatics, 2011, 27(4)
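The idea of associating a TF-side pattern with a TFBS-side pattern can be illustrated with a toy co-occurrence count. This is only a minimal sketch and not the algorithm of the cited paper: the function names, the k-mer widths, the one-mismatch tolerance and the example sequences are all assumptions made for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def associated_kmer_support(bound_pairs, k_tf=5, k_dna=5):
    """Count in how many bound (TF, TFBS) sequence pairs each
    (TF k-mer, DNA k-mer) combination co-occurs."""
    support = Counter()
    for tf_seq, tfbs_seq in bound_pairs:
        for p in kmers(tf_seq, k_tf):
            for d in kmers(tfbs_seq, k_dna):
                support[(p, d)] += 1
    return support


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def approximate_support(support, tf_pattern, dna_pattern, max_mismatch=1):
    """Sum the support of all TF k-mers within max_mismatch substitutions
    of tf_pattern that are paired with exactly dna_pattern."""
    return sum(
        n for (p, d), n in support.items()
        if d == dna_pattern
        and len(p) == len(tf_pattern)
        and hamming(p, tf_pattern) <= max_mismatch
    )


# Hypothetical toy data built around the patterns shown on the slide.
toy_pairs = [
    ("MKNRIAAQ", "CCTGACAGG"),
    ("LNRAAAKK", "ATGACATT"),
    ("QNREAAR", "GGTGTGAC"),
]
toy_support = associated_kmer_support(toy_pairs)
print(approximate_support(toy_support, "NRIAA", "TGACA"))
```

In this toy data, NRIAA and NRAAA both co-occur with TGACA, so allowing one substitution on the TF side pools them into a single approximate pattern, while NREAA pairs with TGTGA instead.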

Page 5

Introduction

Finding associated patterns on both sides is shown to be promising when many diverse binding sequences are available (e.g. TRANSFAC)

Associated TF-TFBS patterns found from sequences are verified on 3D structures to be binding cores

…NRIAA… …TGACA…

…NRAAA… …TGACA…

…NREAA… …TGTGA…

Verified on 3D structures (binding cores within 3.5 Å)

40222 binding pairs from 1290 PDB protein-DNA complexes
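A distance cutoff such as 3.5 Å can be applied to atom coordinates to list residue-nucleotide binding pairs. The sketch below assumes the atoms have already been parsed from a structure file into (label, (x, y, z)) tuples. The function name, the input layout and the toy coordinates are hypothetical, and the slides do not state how the distance was actually measured.

```python
import math

CUTOFF = 3.5  # angstroms, the threshold quoted on the slide


def binding_pairs(protein_atoms, dna_atoms, cutoff=CUTOFF):
    """Return sorted (residue, nucleotide) pairs that have at least one
    protein atom and one DNA atom closer than the cutoff.

    Each atom is a (label, (x, y, z)) tuple, where label names the residue
    or nucleotide the atom belongs to.
    """
    pairs = set()
    for res_id, p_xyz in protein_atoms:
        for nuc_id, d_xyz in dna_atoms:
            if math.dist(p_xyz, d_xyz) < cutoff:
                pairs.add((res_id, nuc_id))
    return sorted(pairs)


# Hypothetical toy coordinates, not taken from any PDB entry.
toy_protein = [
    ("ARG12", (0.0, 0.0, 0.0)),
    ("ARG12", (1.0, 0.0, 0.0)),
    ("ALA13", (10.0, 0.0, 0.0)),
]
toy_dna = [
    ("G5", (3.0, 0.0, 0.0)),
    ("C6", (12.0, 0.0, 0.0)),
]
print(binding_pairs(toy_protein, toy_dna))
```

The all-against-all loop is quadratic in the number of atoms, which is fine for a sketch. A spatial index would be the usual choice for whole complexes.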

Tak-Ming Chan et al, Discovering approximate-associated sequence patterns for protein-DNA interactions. Bioinformatics, 2011, 27(4)

Page 6

Introduction—Motivations

We can go further with these promising associated TF-TFBS patterns

Discovering and analyzing the binding variances (subtypes)

…NRIAA… …TGACA…

…NRAAA… …TGACA…

…NREAA… …TGTGA…

Subtypes may

• Lead to changed binding preferences

• Distinguish conserved from flexible binding residues

• Reveal novel binding mechanisms

Page 7

Methods & Materials

Page 8

Methods & Materials

Both the L2 distance and the p-value of a chi-squared test are used to shortlist subtypes (3rd position: G-C; 4th position: G/C-G)
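The two shortlisting criteria can be sketched for a single TFBS position as follows. This is a minimal illustration under assumptions, not the published procedure: the thresholds `min_l2` and `alpha`, the function names and the toy sites are hypothetical, and the chi-squared test here is a plain test of homogeneity on the nucleotide counts of two candidate subtypes.

```python
import math

NUCLEOTIDES = "ACGT"


def column_counts(sites, pos):
    """Count A/C/G/T at one position across equal-length DNA sites."""
    return [sum(1 for s in sites if s[pos] == n) for n in NUCLEOTIDES]


def l2_distance(counts_a, counts_b):
    """Euclidean distance between the two relative-frequency vectors."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    return math.sqrt(sum((a / total_a - b / total_b) ** 2
                         for a, b in zip(counts_a, counts_b)))


def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution (df 1 to 3),
    using closed forms so that only the standard library is needed."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("df must be 1, 2 or 3")


def chi2_p_value(counts_a, counts_b):
    """Chi-squared test of homogeneity on a 2 x k table of counts.
    Nucleotides absent from both groups are dropped before testing."""
    cols = [(a, b) for a, b in zip(counts_a, counts_b) if a + b > 0]
    if len(cols) < 2:
        return 1.0  # only one nucleotide observed, nothing to compare
    total_a = sum(a for a, _ in cols)
    total_b = sum(b for _, b in cols)
    total = total_a + total_b
    stat = 0.0
    for a, b in cols:
        exp_a = (a + b) * total_a / total
        exp_b = (a + b) * total_b / total
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return chi2_sf(stat, len(cols) - 1)


def position_differs(sites_a, sites_b, pos, min_l2=0.5, alpha=0.05):
    """Shortlist a position when both criteria agree (assumed thresholds)."""
    ca, cb = column_counts(sites_a, pos), column_counts(sites_b, pos)
    return l2_distance(ca, cb) >= min_l2 and chi2_p_value(ca, cb) <= alpha


# Hypothetical toy groups: two candidate subtypes of bound sites.
group_a = ["TGACA"] * 10
group_b = ["TGTGA"] * 10
print([pos for pos in range(5) if position_differs(group_a, group_b, pos)])
```

Requiring both criteria guards against two failure modes: a large frequency difference that rests on very few sites (large L2, weak p-value) and a statistically significant but tiny shift in a large group (small L2, strong p-value).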

Page 9

Results

Sample results from http://www.cse.cuhk.edu.hk/~tmchan/subtypes/

Page 10

Results

Subtypes with evidence of changed binding preferences

More than 70% of subtypes (and pairs) reflect changed binding preferences according to PDB structure evidence.

Page 11

Results

Subtype clusters show that the more conserved (invariant) residues are important for protein-DNA interactions, while the variant residues show specific properties

Page 12

Results

A case study shows subtypes that are potentially critical for regulation through dimerization and thus TF-TFBS binding

PKVEIL-CAGCTG: myogenic regulatory factor (MRF) family, PDB 1MDY. PKVEIL appears in TFs of MRF4, Myf-5, Myf-6, MyoD… in TRANSFAC

PKVVIL-CACGTG: Myc family (oncogene), PDB 1NKP. PKVVIL appears in TFs of c/L/v-Myc in TRANSFAC

• The subtypes are discovered without family information, yet they reflect strong family specificity

• Wet-lab literature supports that if V is mutated to an amino acid similar to E (MycV394D), the dimerization of Myc-Max will be abolished (Miz1 binding deficient)

Page 13

Discussion

Further applications

• Application to TFBS (motif) matching by adding TF-associated subtype information

• Extension of the method to high-throughput data (e.g. ChIP-Seq, protein binding microarrays)

• Integration of other information to enhance TF-TFBS prediction

• Incorporation of 3D homology modeling to better model protein-DNA interactions

• Analysis of regulatory mechanisms with other data, e.g. allele-specific mRNA data, to reveal more detailed regulatory mechanisms
