lecture 11

http://cs273a.stanford.edu [Bejerano Fall10/11] 1

Upload: tirzah

Post on 02-Feb-2016



TRANSCRIPT

Page 1: Lecture 11


Page 2: Lecture 11


Lecture 11

HW1 Feedback (ours)

(Upcoming Project – discuss Wed)

Non-Coding RNAs

Halfway Feedback (yours)

Page 3: Lecture 11


“non coding” RNAs

Page 4: Lecture 11


Central Dogma of Biology:

Page 5: Lecture 11


RNA is an Active Player:

reverse transcription

long ncRNA

Page 6: Lecture 11


What is ncRNA?

• Non-coding RNA (ncRNA) is an RNA that functions without being translated to a protein.

• Known roles for ncRNAs:
– RNA catalyzes excision/ligation in introns.
– RNA catalyzes the maturation of tRNA.
– RNA catalyzes peptide bond formation.
– RNA is a required subunit in telomerase.
– RNA plays roles in immunity and development (RNAi).
– RNA plays a role in dosage compensation.
– RNA plays a role in carbon storage.
– RNA is a major subunit in the SRP, which is important in protein trafficking.
– RNA guides RNA modification.

• It is thought that life began in an RNA World, where RNA was both the information carrier and the active molecule.

Page 7: Lecture 11


AAUUGCGGGAAAGGGGUCAACAGCCGUUCAGUACCAAGUCUCAGGGGAAACUUUGAGAUGGCCUUGCAAAGGGUAUGGUAAUAAGCUGACGGACAUGGUCCUAACCACGCAGCCAAGUCCUAAGUCAACAGAUCUUCUGUUGAUAUGGAUGCAGUUCA

RNA Folds into (Secondary and) 3D Structures

[Figure: secondary structure of the sequence above, with paired regions labeled P4, P5, P5a, P5b, P5c, P6, P6a and P6b and nucleotide positions 120 to 260 marked.]

Cate, et al. (Cech & Doudna).(1996) Science 273:1678.

Waring & Davies. (1984) Gene 28: 277.

We would like to predict them from sequence.

Page 8: Lecture 11

RNA structure rules

• Canonical basepairs:
– Watson-Crick basepairs:
  • G - C
  • A - U
– Wobble basepair:
  • G - U

• Stacks: continuous nested basepairs (energetically favorable).

• Non-basepaired loops:
– Hairpin loop.
– Bulge.
– Internal loop.
– Multiloop.

• Pseudo-knots
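The canonical basepair rules above can be checked mechanically. The sketch below is illustrative only: it assumes the structure is written in dot-bracket notation (a convention not used on the slides), and the function names `pairs_from_dot_bracket` and `non_canonical_pairs` are hypothetical.

```python
# Canonical pairs from the slide: Watson-Crick (G-C, A-U) plus the G-U wobble.
CANONICAL = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def pairs_from_dot_bracket(structure):
    """Return the sorted (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def non_canonical_pairs(sequence, structure):
    """List the pairs in the structure that are neither Watson-Crick nor wobble."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    seq = sequence.upper()
    return [(i, j) for i, j in pairs_from_dot_bracket(structure)
            if (seq[i], seq[j]) not in CANONICAL]


if __name__ == "__main__":
    # A toy hairpin: a stack of three pairs closing a three-base loop.
    print(pairs_from_dot_bracket("(((...)))"))
    print(non_canonical_pairs("GGGAAAUCC", "(((...)))"))
```

Plain dot-bracket strings can only express nested pairs, so pseudo-knots would need a richer notation than the one assumed here.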

Page 9: Lecture 11

Bafna 1

RNA structure: Basics

• Key: RNA is single-stranded. Think of a string over 4 letters: A, C, G, and U.

• The complementary bases form pairs.

• Base-pairing defines a secondary structure. The base-pairing is usually non-crossing.
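To make "non-crossing" concrete: two pairs (i, j) and (k, l) cross when i < k < j < l. The following is a minimal sketch of that test over a list of index pairs; the function name `is_non_crossing` is hypothetical and the quadratic loop is chosen for clarity, not speed.

```python
def is_non_crossing(pairs):
    """Return True if no two base pairs (i, j), (k, l) interleave as i < k < j < l."""
    # Normalise each pair so the smaller index comes first, then sort by it.
    ordered = sorted((min(p), max(p)) for p in pairs)
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return False
    return True


if __name__ == "__main__":
    print(is_non_crossing([(0, 8), (1, 7), (2, 6)]))  # nested stack
    print(is_non_crossing([(0, 3), (4, 7)]))          # two hairpins side by side
    print(is_non_crossing([(0, 5), (3, 8)]))          # interleaved, as in a pseudoknot
```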

Page 10: Lecture 11

Ab initio structure prediction: lots of Dynamic Programming

• Maximizing the number of base pairs (Nussinov et al, 1978)

simple model: each base pair (i, j) scores 1
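As a rough illustration of base-pair maximization by dynamic programming, here is a minimal sketch in the spirit of Nussinov et al. It assumes every canonical pair scores 1 and fills a table over subsequences from short to long. The `min_loop` parameter (minimum number of unpaired bases enclosed by a pair) and the function name `nussinov` are assumptions added for illustration, not taken from the slides.

```python
CANONICAL = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=0):
    """Maximum number of nested canonical base pairs in seq (O(n^3) time)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the best score for the subsequence seq[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Case 1 and 2: leave base i or base j unpaired.
            score = max(best[i + 1][j], best[i][j - 1])
            # Case 3: pair i with j if allowed, plus the best score inside.
            if j - i > min_loop and (seq[i], seq[j]) in CANONICAL:
                inner = best[i + 1][j - 1] if i + 1 <= j - 1 else 0
                score = max(score, inner + 1)
            # Case 4: split into two independent substructures.
            for k in range(i + 1, j):
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(nussinov("GGGAAAUCC"))
```

This sketch returns only the score; recovering the structure itself would need a traceback through the table, which is omitted here.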

Page 11: Lecture 11

Pseudoknots drastically increase computational complexity

Page 12: Lecture 11


Nearest Neighbor Model for RNA Secondary Structure Free Energy at 37 °C:

[Figure: example hairpin annotated with stacking energies of -2.0, -2.1, -0.9, -0.9 and -1.8 kcal/mol, a terminal mismatch of -1.6 kcal/mol, and a loop initiation of +5.0 kcal/mol.]

ΔG°helix = ΔG°(CG/GC) + ΔG°(GU/CA) + 2 × ΔG°(UU/AA) + ΔG°(UG/AC)
= -2.0 kcal/mol - 2.1 kcal/mol + 2 × (-0.9) kcal/mol - 1.8 kcal/mol = -7.7 kcal/mol

ΔG°hairpin loop = ΔG°initiation (6 nucleotides) + ΔG°mismatch(GG/CA)
= 5.0 kcal/mol - 1.6 kcal/mol = 3.4 kcal/mol

ΔG°total = ΔG°hairpin loop + ΔG°helix = 3.4 kcal/mol - 7.7 kcal/mol = -4.3 kcal/mol

Mathews, Disney, Childs, Schroeder, Zuker, & Turner. 2004. PNAS 101: 7287.
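The arithmetic on this slide can be reproduced in a few lines. The sketch below simply adds up the seven energy terms shown on the slide; it is not a general nearest-neighbor implementation, and the names `STACKS`, `LOOP_INITIATION_6NT`, `TERMINAL_MISMATCH` and `hairpin_free_energy` are hypothetical.

```python
# Energy terms in kcal/mol, copied from the slide's worked example.
STACKS = [-2.0, -2.1, -0.9, -0.9, -1.8]  # CG/GC, GU/CA, UU/AA (twice), UG/AC
LOOP_INITIATION_6NT = 5.0                # hairpin loop of 6 nucleotides
TERMINAL_MISMATCH = -1.6                 # GG/CA mismatch closing the loop


def hairpin_free_energy(stacks, loop_initiation, terminal_mismatch):
    """Return (helix, hairpin_loop, total) free energies, rounded to 0.1 kcal/mol."""
    helix = sum(stacks)
    hairpin_loop = loop_initiation + terminal_mismatch
    total = helix + hairpin_loop
    return round(helix, 1), round(hairpin_loop, 1), round(total, 1)


if __name__ == "__main__":
    helix, loop, total = hairpin_free_energy(
        STACKS, LOOP_INITIATION_6NT, TERMINAL_MISMATCH)
    print("helix:", helix, "hairpin loop:", loop, "total:", total)
```

The stabilizing helix terms are negative and the loop penalty is positive, so the structure is predicted to be stable only because the helix outweighs the cost of closing the loop.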

Page 13: Lecture 11

Zuker’s algorithm MFOLD: computing loop dependent energies

Page 14: Lecture 11


Energy Landscape of Real & Inferred Structures

Page 15: Lecture 11


Unfortunately…

– Random DNA (with high GC content) often folds into low-energy structures.

– What other signals determine non-coding genes?

Page 16: Lecture 11


Evolution to the Rescue

Page 17: Lecture 11


Page 18: Lecture 11

Stochastic context-free grammar (SCFG)

S → aSu    L → aL
S → uSa    L → cL
S → gSc    L → a
S → cSg    L → c
S → L

• Each derivation tree corresponds to a structure.

[Figure: derivation tree with S and L nodes over an example sequence of a, c, g and u bases.]

Page 19: Lecture 11

Stochastic context-free grammar (cont’)

1. A CFG

S → aSu
S → cSg
S → gSc
S → uSa
S → a
S → c
S → g
S → u
S → SS

2. A derivation of “accuggacccuuagaggu”

S ⇒ aSu
⇒ acSgu
⇒ accSggu
⇒ accuSaggu
⇒ accuSSaggu
⇒ accugScSaggu
⇒ accuggSccSaggu
⇒ accuggaccSaggu
⇒ accuggacccSgaggu
⇒ accuggacccuSagaggu
⇒ accuggacccuuagaggu

3. Corresponding structure
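The derivation on this slide can be replayed step by step. The sketch below assumes each step rewrites the leftmost S, which matches the sequence of steps shown, and builds a dot-bracket string in parallel so the structure implied by the pairing rules becomes visible. Dot-bracket notation and the names `derive` and `SLIDE_DERIVATION` are assumptions for illustration.

```python
PAIR_RULES = {"aSu", "cSg", "gSc", "uSa"}  # S -> x S y emits a base pair
SINGLE_RULES = {"a", "c", "g", "u"}        # S -> x emits one unpaired base
BRANCH_RULE = "SS"                         # S -> S S splits into two substructures

# Right-hand sides applied to the leftmost S, in the order shown on the slide.
SLIDE_DERIVATION = ["aSu", "cSg", "cSg", "uSa", "SS",
                    "gSc", "gSc", "a", "cSg", "uSa", "u"]


def derive(productions):
    """Apply each right-hand side to the leftmost S; return (sequence, structure)."""
    seq, struct = "S", "S"
    for rhs in productions:
        pos = seq.find("S")
        if pos < 0:
            raise ValueError("no nonterminal left to expand")
        if rhs in PAIR_RULES:
            shape = "(S)"
        elif rhs in SINGLE_RULES:
            shape = "."
        elif rhs == BRANCH_RULE:
            shape = "SS"
        else:
            raise ValueError("unknown production: S -> " + rhs)
        # Both strings keep S at the same index, so one position serves both.
        seq = seq[:pos] + rhs + seq[pos + 1:]
        struct = struct[:pos] + shape + struct[pos + 1:]
    return seq, struct


if __name__ == "__main__":
    sequence, structure = derive(SLIDE_DERIVATION)
    print(sequence)
    print(structure)
```

Under these assumptions the result is a four-pair stem that branches into two short hairpins, which is what the single S → SS step in the derivation implies.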

Page 20: Lecture 11


Page 21: Lecture 11


MicroRNA

Page 22: Lecture 11

Genomic context

known miRNAs in human

intergenic intronic

polycistronic

monocistronic

Page 23: Lecture 11

tRNA

Page 24: Lecture 11

tRNA Activity

Page 25: Lecture 11


Page 26: Lecture 11


Page 27: Lecture 11


Human specific accelerated evolution

Chimp

Human

rapid change

conserved

Page 28: Lecture 11

Human Accelerated Regions

Human-specific substitutions in conserved sequences

[Pollard, K. et al., Nature, 2006] [Beniaminov, A. et al., RNA, 2008]

HAR1:
• Novel ncRNA
• Co-expressed in Cajal-Retzius cells with reelin.
• Similar expression in human, chimp, rhesus.
• 18 unique human substitutions leading to novel conformation.
• All weak (AT) to strong (GC).

[Figure labels: Human (derived), Chimp (ancestral), rapid change, conserved]
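The "weak (AT) to strong (GC)" observation can be expressed as a simple count over an alignment. The sketch below assumes two pre-aligned DNA strings of equal length, one ancestral and one derived. The function name `classify_substitutions` is hypothetical, and the sequences in the usage example are made-up toy data, not HAR1.

```python
WEAK = set("AT")    # bases that form two hydrogen bonds
STRONG = set("GC")  # bases that form three hydrogen bonds


def classify_substitutions(ancestral, derived):
    """Count weak-to-strong, strong-to-weak and other substitutions per column."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"weak_to_strong": 0, "strong_to_weak": 0, "other": 0}
    for anc, der in zip(ancestral.upper(), derived.upper()):
        if anc == der or "-" in (anc, der):
            continue  # skip identical columns and alignment gaps
        if anc in WEAK and der in STRONG:
            counts["weak_to_strong"] += 1
        elif anc in STRONG and der in WEAK:
            counts["strong_to_weak"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(classify_substitutions("ATATGC", "GCATGC"))
```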

Page 29: Lecture 11


Other Non Coding Transcripts

Page 30: Lecture 11


Page 31: Lecture 11


mRNA

Page 32: Lecture 11


EST

Page 33: Lecture 11

lincRNAs (long intergenic non coding RNAs)


Page 34: Lecture 11

X chromosome inactivation in mammals

X X X Y

X

Dosage compensation

Page 35: Lecture 11

Xist – X inactive-specific transcript

Avner and Heard, Nat. Rev. Genetics 2001 2(1):59-67

Page 36: Lecture 11


Microarrays, Next Gen(eration) Sequencing etc.

Page 37: Lecture 11


End Results

Page 38: Lecture 11


Page 39: Lecture 11


Page 40: Lecture 11


Transcripts, transcripts everywhere

Human Genome

Transcribed (Tx)

Tx from both strands

Leaky tx?

Functional?

Page 41: Lecture 11

Or are they?


Page 42: Lecture 11


Halfway Feedback