mRNA protein DNA Activation Repression Translation Localization Stability Pol II 3’UTR: Transcriptional and post-transcriptional regulation of gene expression

Post on 19-Dec-2015


Page 1

Transcriptional and post-transcriptional regulation of gene expression

[Diagram labels: DNA, Pol II, mRNA, 3’UTR, protein; Activation, Repression; Translation, Localization, Stability]

Page 2

Using gene expression to identify regulatory elements

[Figure label: 5’ upstream]

Page 3

Gene expression

Feb 2007: ~110,000 arrays in NCBI GEO

Feb 2008: ~202,000 arrays

Allen Institute Mouse Brain Gene expression Atlas (in situ hybridization, ~23,000 genes)

Page 4

Roth et al., 1998, Tavazoie et al., 1999: co-expressed genes often share the same regulatory elements

[Figure labels: Expression; 5’ upstream]

How do you identify regulatory elements from gene expression?

Motif finding programs:

- AlignACE (Hughes et al, 2000)

- MEME

- REDUCE (Bussemaker et al, 2001)

and many others …

Page 5

Problems with current motif finding approaches

Different approaches for different types of expression data

- Single microarray (e.g. log-ratios): REDUCE

- Co-expression clusters (C0, C1, C2, C3, C4): AlignACE

Page 6

Problems with current motif finding approaches

Many unrealistic assumptions, e.g., zeroth order background sequence model in AlignACE

1kb upstream region of Plasmodium falciparum PF11_0108:

TTTAAAAAAAAAAAAAAAAAGAGAAAAACCATATTTATATGGATATAATATTTTTAAAGTATAGAAAAAATAATATATATTTATATACATTTATATTAATGAAAAAGCAAACAGCTAAATTACAAAAAAAAAAAAAAATTAGATTATCTCAATTAAAAGAACAATATATAAATAATTAATCCATGCTATTTTTTGATATATATAAGAATTTAATGCCTTATATTATAAATAGAGAAATAAATAAATAAATAAATATATAAACATATATATTATATATATATATATATATATAGTTATACATTATGATTTTGAAAAAATAGATATATACTATTAATTGTATATGTTTATACATAAAGCATATTTTTATTAATTGTAATATATAGATTTTTTATTATAATAATATTATATATATATATATATATATATATATTTTTTTTTTTTTGTTAAATAGCGAAATAAAAATACCTGACCTTTGTAATCTTTATTTGATTACTTCCTTCTTCATTCCTTCTTTGTTTGTTTGTTTGTTTCCCTTTTTTTTTTTTTTTTTTTTTTTTTTAGTTAATTCTTTTATATGTATAATAATATTATAAGACAATTGGACAATGATTACAAAAAGGTAAAAGTAATAATTTTCTAAAGTATAATATAATATTATAATAATATAATATAAATTTTTAATAAAATTTAATAAAAAAGTTTATAAATACTTATCGACCATAAGTCGTTTAAGAAAAAAAAAAAAAAGAAAAAAAAAAAAAAATATTACAAAAATATTATAGTGTATTATATTTTATCATATCATCTTTTTTTTATTTTTTTTTATATTTTTGTTACGGCACATCAAGCAACTATAAATATTTAAGATCAACCACCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACATTATTTATGGTATTTTAAA
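One way to see why a zeroth-order background fits such a sequence poorly is to compare each observed dinucleotide frequency with the value independent bases would predict. The sketch below is illustrative only: the function name `dinucleotide_ratios` and the choice of excerpt are assumptions made here, and nothing in it reproduces AlignACE itself.

```python
from collections import Counter

def dinucleotide_ratios(seq):
    """Observed dinucleotide frequency divided by the frequency expected
    if bases were drawn independently (a zeroth-order model)."""
    seq = seq.upper()
    n = len(seq)
    base_freq = {b: c / n for b, c in Counter(seq).items()}
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    total = n - 1
    return {
        p: (c / total) / (base_freq[p[0]] * base_freq[p[1]])
        for p, c in pairs.items()
    }

# Opening bases of the PF11_0108 upstream region quoted on the slide.
excerpt = "TTTAAAAAAAAAAAAAAAAAGAGAAAAACCATATTTATATGGATATAATATTTTTAAAGT"
ratios = dinucleotide_ratios(excerpt)
# Ratios far from 1 indicate dependence between neighboring bases.
for pair, ratio in sorted(ratios.items(), key=lambda kv: -kv[1])[:5]:
    print(pair, round(ratio, 2))
```

A ratio near 1 for every pair would be consistent with independent bases; long homopolymer and dinucleotide runs push the ratios away from 1.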

Page 7

Problems with current motif finding approaches

Elevated false positive rate

[Diagram: AlignACE returns many motifs for k-means clustered gene expression, and also many motifs for randomly clustered gene expression]

Page 8

Re-thinking motif discovery from expression data

• One approach for all types of expression data

• Make as few a priori assumptions as possible

• Very low false positive rate

• Scale to complex metazoan and plant genomes

Noam Slonim

Page 9

[Figure: expression matrix of all genes on the array across microarray conditions, partitioned into clusters of co-expressed genes; cluster index 0 to 4]

Page 10

Our approach: look for motifs whose profile of presence/absence is informative about the expression profile

[Figure: 5’ upstream regions, each labeled with the cluster index (0, 1, or 2) of its gene; annotations mark the genes that belong to cluster 0, cluster 1, cluster 2, ...]

The correlation is quantified using the mutual information:

$$I(\text{motif};\text{expression}) = \sum_{i=1}^{2}\sum_{j=1}^{3} P(i,j)\,\log\frac{P(i,j)}{P(i)\,P(j)}$$

Joint probabilities P(i, j) of motif presence (rows) and expression cluster index (columns):

Motif     Cluster 0   Cluster 1   Cluster 2
Absent    0.27        0.07        0.33
Present   0.07        0.27        0.00
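The mutual information on this slide can be computed directly from a table of counts. The sketch below is a minimal illustration: it assumes base-2 logarithms (the slide does not state the base), and the counts are chosen here so that count/15 reproduces the rounded probabilities on the slide (0.27, 0.07, 0.33 and 0.07, 0.27, 0.00). Which row is "absent" and which is "present" does not affect the result.

```python
import math

def mutual_information(counts):
    """Mutual information (in bits) of a 2-D contingency table of counts."""
    total = sum(sum(row) for row in counts)
    row_p = [sum(row) / total for row in counts]
    col_p = [sum(col) / total for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c:  # a zero cell contributes nothing (0 * log 0 is taken as 0)
                p = c / total
                mi += p * math.log2(p / (row_p[i] * col_p[j]))
    return mi

# Rows: motif absent / present; columns: clusters 0, 1, 2 (15 genes in total).
table = [[4, 1, 5],
         [1, 4, 0]]
print(round(mutual_information(table), 3))
```

A table whose rows are proportional to each other (motif presence independent of cluster) gives 0; the more the motif's presence tracks the cluster labels, the larger the value.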

Page 11

Continuous expression variables (e.g. microarray log-ratios)

[Figure: 5’ upstream regions, each paired with the log-ratio of its gene (values from -8.90 to 6.45)]

Page 12

[Figure: two panels, "5’ upstream regions" (DNA) and "3’UTRs" (RNA), each shown next to the expression cluster index (0, 1, or 2) of its gene]

Page 13

Finding all informative DNA and RNA motifs

How do we do it?

Page 14

Possible motif representations

- Words (k-mers), e.g. GCGATGAG: search space small, accuracy acceptable

- Degenerate code, e.g. [AC]CGATGAG[TC]: search space large, accuracy good

- Weight matrices: search space very large, accuracy very good

Page 15

Motif Search Algorithm

k-mer    MI
CTCATCG  0.0618   (highly informative)
TCATCGC  0.0485
AAAATTT  0.0438
GATGAGC  0.0434
AAAAATT  0.0383
ATGAGCT  0.0334
TTGCCAC  0.0322
TGCCACC  0.0298
ATCTCAT  0.0265
...
ACGCGCG  0.0018
CGACGCG  0.0012
TACGCTA  0.0011
ACCCCCT  0.0010
CCACGGC  0.0009
TTCAAAA  0.0005
AGACGCG  0.0004
CGAGAGC  0.0003
CTTATTA  0.0002   (not informative)

[Figure: optimized motifs with MI=0.081, MI=0.045, MI=0.040, ...]
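The ranking of k-mers by mutual information can be sketched as follows. This is a simplified illustration, not the FIRE implementation: the function names are made up here, base-2 logarithms are assumed, only the given strand is scanned, and the toy sequences are invented.

```python
import math
from collections import Counter

def mi_presence(present, clusters):
    """MI (bits) between a presence/absence vector and cluster labels."""
    n = len(clusters)
    joint = Counter(zip(present, clusters))
    p_m, p_c = Counter(present), Counter(clusters)
    return sum(
        (c / n) * math.log2(c * n / (p_m[m] * p_c[k]))
        for (m, k), c in joint.items()
    )

def rank_kmers(seqs, clusters, k=7):
    """Score every k-mer that occurs in seqs by its MI with the clusters,
    most informative first."""
    kmers = {s[i:i + k] for s in seqs for i in range(len(s) - k + 1)}
    scores = {
        kmer: mi_presence([kmer in s for s in seqs], clusters)
        for kmer in kmers
    }
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy data (hypothetical): GCGATGAG is planted in the two cluster-1 genes.
toy_seqs = ["AAGCGATGAGTT", "CCGCGATGAGAA", "TTTTTTTTTTTT", "ACACACACACAC"]
toy_clusters = [1, 1, 0, 0]
for kmer, score in rank_kmers(toy_seqs, toy_clusters)[:3]:
    print(kmer, round(score, 4))
```

On real data the set of all 7-mers is small enough to enumerate exhaustively, which is what makes words a practical starting representation.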

Page 16

Optimizing k-mers into more informative degenerate motifs

ATCCGTACA

ATCC[C/G]TACA

Which character increases the mutual information by the largest amount?

Candidates for the position being optimized (G): A/G, T/G, C/G, A/C/G, A/T/G, C/G/T

[Figure: 5’ upstream regions with their cluster indices (0, 1, or 2)]

Page 17

Optimizing k-mers into more informative degenerate motifs

ATCC[C/G]TACA

Candidates for the next position being optimized (C): A/C, T/C, C/G, A/C/G, A/T/C, C/G/T

[Figure: 5’ upstream regions with their cluster indices (0, 1, or 2)]

...
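The optimization step can be sketched as a greedy search. The version below is a simplification under stated assumptions: it adds one nucleotide to one position per round (the slides also list three-letter candidates), it uses base-2 logarithms, and all names and the toy data are hypothetical rather than taken from FIRE.

```python
import math
import re
from collections import Counter

def mi_presence(present, clusters):
    """MI (bits) between a presence/absence vector and cluster labels."""
    n = len(clusters)
    joint = Counter(zip(present, clusters))
    p_m, p_c = Counter(present), Counter(clusters)
    return sum(
        (c / n) * math.log2(c * n / (p_m[m] * p_c[k]))
        for (m, k), c in joint.items()
    )

def motif_mi(motif, seqs, clusters):
    """motif is a list of allowed-character strings, e.g. ['A', 'T', 'CG']."""
    pattern = re.compile("".join("[%s]" % chars for chars in motif))
    return mi_presence([bool(pattern.search(s)) for s in seqs], clusters)

def optimize(seed, seqs, clusters):
    """Greedy sketch: widen the single position whose extra nucleotide
    raises the MI the most; stop when no change improves it."""
    motif = list(seed)
    best = motif_mi(motif, seqs, clusters)
    while True:
        trials = [
            motif[:pos] + [chars + base] + motif[pos + 1:]
            for pos, chars in enumerate(motif)
            for base in "ACGT" if base not in chars
        ]
        scored = [(motif_mi(t, seqs, clusters), t) for t in trials]
        if not scored:
            return motif, best
        score, trial = max(scored, key=lambda st: st[0])
        if score <= best:
            return motif, best
        motif, best = trial, score

# Toy data (hypothetical): cluster 1 carries ATCCCTACA or ATCCGTACA.
toy_seqs = ["TTATCCCTACATT", "GGATCCGTACAGG", "TTTTTTTTTTTTT", "GGGGGGGGGGGGG"]
toy_clusters = [1, 1, 0, 0]
motif, score = optimize("ATCCGTACA", toy_seqs, toy_clusters)
print("".join("[%s]" % c if len(c) > 1 else c for c in motif), round(score, 3))
```

On the toy data the seed matches only one of the two cluster-1 sequences, and widening the fifth position to C/G lets the motif match both.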

Page 18

[Figure: mutual information, motif conservation with S. bayanus, and similarity to the ChIP-chip RAP1 motif, plotted against change; reference motif: RAP1 binding site (ChIP-chip)]

Page 19

Highly informative k-mers:

k-mer    MI
CTCATCG  0.0618
TCATCGC  0.0485
AAAATTT  0.0438
GCTCATC  0.0434
AAAAATT  0.0383
ATGAGCT  0.0334
TTGCCAC  0.0322
TGCCACC  0.0298
ATCTCAT  0.0265
...

Motifs optimized so far: MI=0.081, MI=0.045

Optimize the next k-mer?

Only optimize k-mer if I(k-mer;expression | motif) is large enough (for all motifs optimized so far)

Conditional mutual information I(X;Y|Z)
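Conditional mutual information can be computed as a weighted average of ordinary mutual information within each value of Z. The sketch below shows that identity only; the threshold for "large enough" is not given on the slide and is left out, the function names are chosen here, and base-2 logarithms are an assumption.

```python
import math
from collections import Counter

def mutual_info(xs, ys):
    """I(X;Y) in bits from two equal-length lists of discrete values."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

def conditional_mi(xs, ys, zs):
    """I(X;Y|Z) = sum over z of P(z) * I(X;Y | Z=z), in bits."""
    n = len(zs)
    total = 0.0
    for z, count in Counter(zs).items():
        idx = [i for i in range(n) if zs[i] == z]
        total += (count / n) * mutual_info(
            [xs[i] for i in idx], [ys[i] for i in idx]
        )
    return total

# X: k-mer presence, Y: expression cluster, Z: presence of an optimized motif.
kmer = [1, 1, 0, 0]
expression = [1, 1, 0, 0]
print(conditional_mi(kmer, expression, zs=kmer))          # k-mer redundant with motif
print(conditional_mi(kmer, expression, zs=[0, 0, 0, 0]))  # motif tells us nothing
```

When the k-mer occurs in exactly the same genes as an already optimized motif, the conditional value is zero, so the k-mer adds no new information and need not be optimized.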

Page 20

Each motif is subjected to a stringent statistical significance test

Real mutual information value

Maximum of 10,000 expression-shuffled mutual information values
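The test compares the real mutual information with the largest value obtained after shuffling the expression labels. A minimal sketch follows, assuming the motif passes only when the real value exceeds that maximum (the slide shows the two quantities but not the exact decision rule); the function names and the fixed random seed are choices made here.

```python
import math
import random
from collections import Counter

def mutual_info(xs, ys):
    """I(X;Y) in bits from two equal-length lists of discrete values."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

def passes_shuffle_test(present, clusters, n_shuffles=10000, seed=0):
    """True if the real MI is larger than the maximum MI seen over
    n_shuffles random permutations of the expression labels."""
    rng = random.Random(seed)
    real = mutual_info(present, clusters)
    shuffled = list(clusters)
    max_random = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        max_random = max(max_random, mutual_info(present, shuffled))
    return real > max_random

# Toy example (hypothetical): motif present in exactly the cluster-1 genes.
present = [1] * 20 + [0] * 20
clusters = [1] * 20 + [0] * 20
print(passes_shuffle_test(present, clusters, n_shuffles=1000))
```

Shuffling the labels keeps the cluster sizes and the motif counts fixed, so the shuffled values estimate how much information arises by chance alone.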

Page 21

The regulation of gene expression is highly combinatorial

[Diagram: DNA and Pol II, with different regulatory element combinations giving Expression pattern 1, Expression pattern 2, Expression pattern 3]

Page 22

The regulation of gene expression is highly combinatorial

Can we group our predicted motifs into modules of combinatorially acting regulatory elements?

Page 23

Predicting combinatorial regulation using mutual information

Is the presence of motif 1 informative about the presence of motif 2?

[Figure: 5’ upstream regions with occurrences of motif 1 and motif 2]

$$I(\text{motif 1};\text{motif 2}) = \sum_{i=1}^{2}\sum_{j=1}^{2} P(i,j)\,\log\frac{P(i,j)}{P(i)\,P(j)}$$

Joint probabilities P(i, j) of motif 1 presence (rows) and motif 2 presence (columns):

                  Motif 2 absent   Motif 2 present
Motif 1 absent    0.43             0.07
Motif 1 present   0.07             0.43

Page 24

Discovering modules of combinatorially acting motifs

[Figure: motifs grouped into modules]

Page 25

Results

Yeast

P. falciparum (malaria parasite)

Human

Page 26

Yeast stress gene expression program (Gasch et al, 2000)

• 173 microarray conditions

• ~ 5,500 genes

• 80 co-expression clusters

• Runtime ~ 1h (standard PC)

Page 27

[Figure: predicted motifs by expression clusters]

17 motifs in 5’ upstream regions, 6 motifs in 3’UTRs

0 “motifs” when shuffling the gene labels of the clustering partition

1129 motifs when applying AlignACE (with default parameters) to each cluster independently

880 “motifs” when applying AlignACE to the same shuffled clusters as above

Page 28

Predicted Motifs

13 modules of co-occurring motifs

All 23 motifs are highly conserved with S. bayanus

PAC

RRPE

PUF4

PUF3

MSN2/4

RAP1

RPN4

REB1

MBP1

HAP4

XBP1

BAS1

CBF1

SWI4

14 previously known motifs

Expression Clusters

Page 29

Predicted cooperation between DNA and RNA motifs

[Figure: motifs by expression clusters, colored from over-representation to under-representation. Motifs shown: PAC (5’), RRPE (5’), Puf4 (3’UTR)]

PAC is under-represented in cluster 13 (p<1e-5)

PAC is highly over-represented in cluster 66 (p<1e-20)

Page 30

Predicted Motifs

Expression Clusters

PAC

RRPE

PUF4

PUF3

MSN2/4

RAP1

RPN4

REB1

MBP1

HAP4

XBP1

BAS1

CBF1

SWI4

Page 31

Mitochondrial ribosome, p<1e-33

Mitochondrial ribosome, p<1e-29

Puf3

Cytosolic ribosome, p<1e-18

Page 32

Functional enrichments

Proteasome complex (p<1e-44)

DNA replication (p<1e-7)

Oxidative phosphorylation (p<1e-17)

Rpn4

Mbp1

Novel motif

Page 33

We also use mutual information to discover …

- Non-random spatial distribution

- Orientation preferences

- Co-localization

Beer and Tavazoie, 2004; Elemento and Tavazoie, 2005

Page 34

[Figure: 5’ upstream regions with their cluster indices (0, 1, or 2); motif positions are binned as Close, Far, or Very far]

Distance to TSS is informative about expression

Distance between two motifs is informative about expression

Page 35

[Figure: predicted motifs by clusters, with "Y" marks next to some motifs]

> 50% of our predicted motifs have a non-random spatial distribution

Page 36

PAC has a non-random spatial distribution

[Figure: motif positions between -600bp and the ATG, shown for clusters where the motif is over-represented and for clusters where the motif is NOT over-represented]

Page 37

RAP1 motif has a different kind of non-random spatial distribution

Unique cluster where the motif is over-represented

Clusters where the motif is NOT over-represented

RAP1 motif also has a strong orientation preference

Page 38

PAC and RRPE tend to co-localize on the DNA

[Figure: motif positions between -600bp and the ATG, shown for clusters where the TWO motifs are both over-represented and for clusters where the motifs are NOT over-represented]

Page 39

PAC and the Msn2/4 binding site tend to avoid being in the same promoters

[Figure axis: -600bp to ATG]

Page 40

Single array analysis

H2O2 treatment in ΔMsn2/ΔMsn4 background

[Figure: Cy3/Cy5 expression log-ratios, from down-regulated to up-regulated; motifs shown: PAC, Rpn4, Yap1, Puf3]

Page 41

P. falciparum intra-erythrocytic developmental cycle

Bozdech, Llinás, et al., PLoS Biol, 2003

[Figure: ~2,700 periodically expressed genes; time axis from 0h to 48h]

Associate a “phase” to each gene, which reflects the timing of maximal expression

Page 42

Discovering motifs that are informative about the expression phase

[Figure: 5’ upstream regions, each paired with the expression phase of its gene (values from -3.14 to 3.14)]

Page 43

21 motifs in 5’ upstream regions, 0 motifs in 3’UTRs

0 “motifs” when shuffling the gene labels of the phase profile

71% highly conserved with P. yoelii

[Figure: motifs by phase, from -π to +π. Functional enrichments: DNA replication, p<1e-4; plastid, p<0.01; ribosome, p<0.001]

Page 44

Independent biochemical validation

- Purified 3/26 predicted TF in P. falciparum

- Identified DNA-binding specificities using protein binding microarrays

Bulyk lab, Harvard

Llinás lab, Princeton University, submitted

Page 45

Motifs Match Predictions

[Figure: motifs from the Protein Binding Microarray compared with the FIRE Prediction; column labels: GST AP2 GST AP2 AP2 GST AP2]

Page 46

-π Phase +π

Independent biochemical validation for 3/21 motifs

More TFs being purified ...

bound by MAL6P1.44

bound by PF11_0404

bound by PF14_0633

Page 47

Human gene expression atlas (Su et al, 2004, PNAS)

• 79 human tissues

• >17,000 genes

• 120 co-expression clusters

• Runtime ~ 24h (standard PC)

Page 48

73 DNA motifs, 42 RNA motifs, 71 modules

DNA motifs matching known binding sites: ELK4, Sp1, AhR, bZIP911, NF-Y, E2F1, TCF11-MafG, Pax2, E2F, v-Myb, TEAD, Dof2, CHOP-C/EBPalpha, HAND1-TCF3, GBP, Skn-1, HFH-3, Sox17

RNA motifs matching microRNAs:

- miR-499/miR-505/miR-200a/miR-141
- miR-525/miR-518f*/miR-526c/miR-526a/miR-520a*
- miR-380-3p/miR-215/miR-485-3p/hsa-let-7g/miR-610/hsa-let-7i/hsa-let-7b/hsa-let-7a
- miR-30d/miR-30c/miR-30a-5p/miR-30e-5p/miR-30B
- miR-200b/miR-429/miR-200c
- miR-663

Page 49

NF-Y

novel

M phase (p<1e-43)

Page 50

TCF11-MafG

novel

Olfactory receptor activity (p<1e-43)

Page 51

miR-525, miR-518f*, miR-526c

Sp1

Page 52

(NF-Y binding site)

Page 53

let-7b over-expression in human fibroblasts

let-7 microRNAs are up-regulated when fibroblasts enter quiescence

Let-7 target genes ?

A. Lagesse-Miller, O. Elemento, …, and H. Coller, submitted

Page 54

let-7b over-expression in human fibroblasts

[Figure: ~14,000 genes grouped into clusters C1 to C5; time points 0h, 12h, 24h, 36h, 48h]

A. Lagesse-Miller, O. Elemento, …, and H. Coller, submitted

Motif paired with the let-7b seed:

               UACCUC
               ||||||
uugguguguuggaugAUGGAGu-5’  let-7b
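The pairing on this slide can be checked by taking the reverse complement of the miRNA seed. The sketch below assumes the seed is miRNA positions 2 to 7 counted from the 5’ end, which is consistent with the six paired bases shown; the function name is chosen here for illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed_match_site(mirna_5to3, start=2, end=7):
    """Return the mRNA site (written 5'->3') that pairs with miRNA
    positions start..end (1-based, inclusive, counted from the 5' end)."""
    seed = mirna_5to3.upper()[start - 1:end]
    return "".join(COMPLEMENT[base] for base in reversed(seed))

# The slide writes let-7b in 3'->5' order; reverse it to get 5'->3'.
let7b_3to5 = "uugguguguuggaugAUGGAGu"
let7b = let7b_3to5[::-1]

print(let7b.upper())           # UGAGGUAGUAGGUUGUGUGGUU
print(seed_match_site(let7b))  # UACCUC
```

The result is the 6-mer drawn above the miRNA on the slide, so a 3’UTR motif of this form is what a let-7 target site would look like at the sequence level.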

Page 55

Arabidopsis thaliana

~22,300 genes on Affy chip

[Figure axis: -1000bp to TSS]

Experimental testing:

Ken Birnbaum (NYU)

Phil Benfey (Duke)

Page 56

Biological insights

• Importance of RNA motifs in shaping transcriptomes: ~30% of yeast, worm, human, Arabidopsis motifs are RNA motifs

• In worm/human/mouse, many RNA motifs match miRNA targets

• “Cooperation” between DNA and RNA motifs

[Figure labels: Yeast Puf4 motif; Novel worm 3’UTR motif]

                UGUGAU
                ||||||
cgaguaguuucgaccgACACUAu

Page 57

Biological insights

• Avoidance of joint-presence for certain motifs

• Under-representation of certain motifs

Page 58

FIRE (Finding Informative Regulatory Elements)

It works with any type of gene expression data:

- Single microarray (e.g. log-ratios)

- Clustered microarrays (C0, C1, C2, C3, C4)

- Gene expression phase

Elemento, Slonim and Tavazoie, 2007, Molecular Cell

Page 59

FIRE (Finding Informative Regulatory Elements)

It looks for both DNA and RNA motifs

[Figure: predicted motifs, each labeled 5’ or 3’UTR]

Elemento, Slonim and Tavazoie, 2007, Molecular Cell

Page 60

FIRE (Finding Informative Regulatory Elements)

It is fast and scales well to large metazoan and plant genomes

Elemento, Slonim and Tavazoie, 2007, Molecular Cell

Page 61

FIRE (Finding Informative Regulatory Elements)

It yields few or no false positives

(Human tissue microarray dataset)

- Real clustered gene expression: 115 motifs

- Randomly clustered gene expression: 0 motifs

Page 62

FIRE (Finding Informative Regulatory Elements)

It automatically evaluates:

- Functional coherence, e.g. Defense response (p<1e-32)

- Inter-species conservation

- Spatial and orientation biases

- Comparison to known motifs (JASPAR)

- Cooperativity and co-localization

Page 63

FIRE (Finding Informative Regulatory Elements)

Usage:

fire --expfile=human_clusters.txt --exptype=discrete --species=human

Expression file (--expfile):

NM_000030 0
NM_000040 0
NM_000042 0
NM_000045 0
NM_000046 1
NM_000053 1
NM_000065 1
NM_000066 1
...

Expression type (--exptype):

- discrete
- continuous

Species (--species):

- human
- mouse
- arabidopsis
- drosophila
- worm
- plasmodium
- budding yeast
- fission yeast
- sea squirt

http://tavazoielab.princeton.edu/FIRE/

~250 downloads since Nov 2007
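Before running the tool it can help to sanity-check the expression file. The sketch below parses rows of the form shown on the slide and counts genes per cluster. It is not part of FIRE: the function name is hypothetical, whitespace-separated columns are assumed, and because the slide does not show whether a header line is expected, rows whose second field is not numeric are simply skipped.

```python
from collections import Counter

def read_expression(lines):
    """Parse 'gene_id value' rows into a dict. Rows that do not have exactly
    two fields, or whose second field is not numeric, are skipped."""
    values = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            values[fields[0]] = float(fields[1])
        except ValueError:
            continue
    return values

# The rows shown on the slide (the real file continues past these).
example = """NM_000030 0
NM_000040 0
NM_000042 0
NM_000045 0
NM_000046 1
NM_000053 1
NM_000065 1
NM_000066 1""".splitlines()

expression = read_expression(example)
cluster_sizes = Counter(int(v) for v in expression.values())
print(len(expression), dict(cluster_sizes))
```

For a discrete file the values are cluster indices, so very small clusters or unexpected labels show up immediately in the counts.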

Page 64

Bambi Tsui

http://tavazoielab.princeton.edu/FIRE/

~1500 queries since Nov 2007

Page 65

Acknowledgements

Saeed Tavazoie Noam Slonim

Sasan Amini

Chang Chan

Gordon Freckleton

Hany Girgis

Yir-Chung Liu

Ilias Tagkopoulos

Tiffany Vora

Scott Breunig

Anand Dharan

Hani Goodarzi

Danny Lieber

Yael Marshall

Kellen Olszewski

Bambi Tsui

Eric Wieschaus

Manuel Llinás

Hilary Coller

Aster Lagesse-Miller

Xuemin Lu

Erandi De Silva

Collaborators at

Princeton

Page 66

Page 67

Additional slides

Page 68

…Chan*, Elemento*, Tavazoie, PLoS Computational Biology, 2005

Page 69

k-mer    MI      Test
CTCATCG  0.0618  PASS         (most informative)
TCATCGC  0.0485  PASS
AAAATTT  0.0438  PASS
GATGAGC  0.0434  PASS
AAAAATT  0.0383  PASS
ATGAGCT  0.0334  PASS
TTGCCAC  0.0322  PASS
TGCCACC  0.0298  PASS
CATCGCA  0.0293  PASS
AGATGAG  0.0288  PASS
TTTTTCA  0.0280  PASS
ATCTCAT  0.0265  PASS
...
ACGCGCG  0.0168  PASS
CGACGCG  0.0167  DON’T PASS
TACGCTA  0.0167  DON’T PASS
ACCCCCT  0.0167  DON’T PASS
CCACGGC  0.0164  DON’T PASS
TTCAAAA  0.0163  DON’T PASS
AGACGCG  0.0163  DON’T PASS
CGAGAGC  0.0163  DON’T PASS
GATAGAG  0.0155  DON’T PASS
GTAGCTC  0.0143  DON’T PASS
CTTATTA  0.0142  DON’T PASS   (less informative)
...

10 consecutive “don’t pass”

Optimize “seeds” into more degenerate motifs
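The table suggests a simple way to choose seeds: walk down the ranked list and stop once ten k-mers in a row fail the test. The sketch below encodes that reading, which is an interpretation of the slide rather than a confirmed description of FIRE; `select_seeds` and the `passes_test` callback are hypothetical names.

```python
def select_seeds(ranked_kmers, passes_test, max_failures=10):
    """Walk k-mers from most to least informative, keep those that pass
    the significance test, and stop after max_failures failures in a row."""
    seeds = []
    failures = 0
    for kmer in ranked_kmers:
        if passes_test(kmer):
            seeds.append(kmer)
            failures = 0
        else:
            failures += 1
            if failures == max_failures:
                break
    return seeds

# Toy example: items 0-4 and item 8 pass; everything else fails.
toy_ranked = list(range(40))
print(select_seeds(toy_ranked, lambda k: k < 5 or k == 8))
```

Stopping early avoids running the costly shuffle test on the long tail of k-mers that carry almost no information.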

Page 70

Optimizing seeds into more informative degenerate motifs

Nucleotides: A, T, C, G

Two-nucleotide codes: S=C/G, M=A/C, W=A/T, R=A/G, K=T/G, Y=T/C

Three-nucleotide codes: V=A/C/G, H=A/C/T, D=A/G/T, B=C/G/T

Any nucleotide: N=A/C/G/T

TCC[C/G]TAC matches TCCCTAC and TCCGTAC
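The degenerate code on this slide maps directly onto regular-expression character classes. The sketch below is one convenient way to match such motifs in Python; the helper name `iupac_to_regex` is chosen here and is not taken from FIRE.

```python
import re

# Degenerate nucleotide codes as listed on the slide.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG", "M": "AC", "W": "AT", "R": "AG", "K": "GT", "Y": "CT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "N": "ACGT",
}

def iupac_to_regex(motif):
    """Compile a degenerate motif such as 'TCCSTAC' into a regex."""
    return re.compile("".join("[%s]" % IUPAC[ch] for ch in motif.upper()))

pattern = iupac_to_regex("TCCSTAC")  # S = C/G, i.e. TCC[C/G]TAC
for site in ("TCCCTAC", "TCCGTAC", "TCCATAC"):
    print(site, bool(pattern.fullmatch(site)))
```

Using `pattern.search(sequence)` instead of `fullmatch` finds the motif anywhere inside a longer upstream region.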

Page 71

Predicted Motifs

Clusters

PAC

RRPE

PUF4

PUF3

MSN2/4

RAP1

RPN4

REB1

MBP1

HAP4

XBP1

BAS1

CBF1

SWI4

Another example of predicted cooperation between DNA and RNA motifs

[Figure rows: RAP1 (5’), Novel (3’UTR), Novel (3’UTR)]

Page 72

A gene expression map of Arabidopsis thaliana development (Schmid et al, 2005)

• 79 different tissue samples

• 22,300 genes

• 140 clusters

Schmid et al, 2005, Nature Genetics

Page 73

-Log(p) over-rep

Log(p) under-rep

114 motifs in 5’ upstream regions 66 motifs in 3’UTRs

0 “motifs” when shuffling the gene labels of the clustering partition

Page 74

-Log(p) over-rep

Log(p) under-rep

telo-box

ABRE-like

W-box

I-box

DRE-core

Few of these motifs are known

Page 75

[Figure: motifs by clusters; color scale from -Log(p) over-rep to Log(p) under-rep; "Y" marks next to some motifs]

Many have a non-random spatial distribution

Page 76

Motif has a non-random spatial distribution

[Figure: motif positions between -1000bp and the TSS, shown for clusters where the motif is over-represented and for clusters where the motif is NOT over-represented]

Page 77

Functional enrichments

Defense response (p<1e-32)

Ribosome (p<1e-84)

Localized to chloroplast (p<1e-35)

3’UTR

5’

3’UTR

Page 78

-Log(p) over-rep

Log(p) under-rep

These two motifs are predicted to co-localize extensively

Page 79

Clusters where the motifs are over-represented

Clusters where the motifs are NOT over-represented

-1000bp

TSS

These two motifs co-localize on the DNA

Page 80

Examples of tissue-specific motifs

... Collaborations with Phil Benfey (Duke), Ken Birnbaum (NYU)

Page 81

Other data-types

• ChIP-chip, e.g. HNF6 (in human islet cells)

[Figure: p-values ranging from strong binding to no binding]

• Enhancers (Drosophila)

20 Bicoid-bound vs 100 non-bound enhancers

20 Dorsal-bound vs 100 non-bound enhancers

Page 82

Binary expression variables (e.g. tissue specific expression)

[Figure: 5’ upstream regions labeled by tissue-specific expression: "Yes" for tissue-specific genes, "No" for all other genes]