11/02/05 d dobbs isu - bcb 444/544x: rna structure prediction1 11/2/05 rna structure prediction
TRANSCRIPT
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 1
11/2/05
RNA Structure Prediction
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 2
Announcements
Seminar
12:10 PM Fri BCB Faculty Seminar in E164 Lago How to do sequence alignments on parallel computers Srinivas Aluru, ECprE & Chair, BCB Program http://www.bcb.iastate.edu/courses/BCB691F2005.html
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 3
Announcements
BCB 544 Projects - Important Dates:
Nov 2 Wed noon - Project proposals due to David/Drena
Nov 4 Fri 10A - Approvals/responses to students
Dec 2 Fri noon - Written project reports due
Dec 5,7,8,9 class/lab - Oral Presentations (20')
(Dec 15 Thurs = Final Exam)
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 4
RNA Structure & Function Prediction
Mon Review - promoter prediction
RNA structure & function
Wed RNA structure prediction2' & 3' structure predictionmiRNA & target prediction - perhaps..
RNA function prediction? Won't have time to cover this…
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 5
Reading Assignment (for Mon/Wed)
Mount Bioinformatics• Chp 8 Prediction of RNA Secondary Structure • pp. 327-355• Ck Errata:
http://www.bioinformaticsonline.org/help/errata2.html
Cates (Online) RNA Secondary Structure Prediction Module• http://cnx.rice.edu/content/m11065/latest/
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 6
Review last lecture:
RNA Structure & Function
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 7
RNA Structure & Function
• RNA structure• Levels of organization• Energetics (more about this on Wed)
• RNA types & functions• Genomic information storage/transfer• Structural • Catalytic• Regulatory
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 8
Fig 6.2Baxevanis & Ouellette 2005
Covalent & non-covalent bonds in RNA
Primary: Covalent bonds
Secondary/Tertiary Non-covalent bonds
• H-bonds (base-pairing)• Base stacking
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 9
1)G-C, A-U, G-U ("wobble") & variants U can form base-pairs with both A & G
2)Nucleotides in RNA are frequently modified this is not very common in DNA
These features & flexible "single-stranded" RNAbackbone allow for many potential base-pairs
Base-pairing in RNA
See: IMB Image Library of Biological Molecules
Modified bases are especially important) in tRNA: e.g., pseudo-Uridine, rD, 5-CH3-C6-isopentenyl-A
7-CH3-G, many others…
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 10
Fig 6.2Baxevanis & Ouellette 2005
Common structural motifs in RNA
Helices
Loops• Hairpin • Internal • Bulge • Multibranch
Pseudoknots
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 11
RNA functions
Storage/transfer of genetic information
• Genomes • many viruses have RNA genomes
single-stranded (ssRNA)e.g., retroviruses (HIV)
double-stranded (dsRNA)
• Transfer of genetic information • mRNA = "coding RNA" - encodes proteins
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 12
RNA functions
Structural • e.g., rRNA, which is major structural component of ribosomes
BUT - its role is not just structural, also:
Catalytic RNA in ribosome has peptidyltransferase activity
• Enzymatic activity responsible for peptide bond formation between amino acids in growing peptide chain• Also, many small RNAs are enzymes
"ribozymes"
(Gloria Culver, ISU)
(W Allen Miller, ISU)
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 13
RNA functionsRegulatory
Recently discovered important new roles for RNAs In normal cells:
• in "defense" - esp. in plants• in normal development
e.g., siRNAs, miRNAAs tools:
• for gene therapy or to modify gene expression
• RNAi (used by many at ISU: Diane Bassham, Thomas Baum, Jeff Essner, Kristen
Johansen, Jo Anne Powell-Coffman, Roger Wise, etc.)• RNA aptamers (Marit Nilsen-Hamilton, ISU)
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 14
RNA types & functions Types of RNAs Primary Function(s)
mRNA - messenger translation (protein synthesis) regulatory
rRNA - ribosomal translation (protein synthesis) <catalytic>
t-RNA - transfer translation (protein synthesis)
hnRNA - heterogeneous nuclear
precursors & intermediates of mature mRNAs & other RNAs
scRNA - small cytoplasmic signal recognition particle (SRP)tRNA processing <catalytic>
snRNA - small nuclear snoRNA - small nucleolar
mRNA processing, poly A addition <catalytic>rRNA processing/maturation/methylation
regulatory RNAs (siRNA, miRNA, etc.)
regulation of transcription and translation, other??
L Samaraweera 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 15
Thanks to Chris Burge, MITfor following slides
Slightly modified from: Gene Regulation and MicroRNAs
Session introduction presented at ISMB 2005, Detroit, MI
Chris Burge [email protected]
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 16
Expression of a Typical Eukaryotic Gene
DNA
…
Transcription
Protein
TranslationmRNA
Splicing
exon intron
AAAAAAAAA
Polyadenylation
Protein Coding Gene
Folding, Modification,Transport, Complex Assembly
Protein Complex
Degradation
Degradation
primary transcript / pre-mRNA
Export
For each of these processes, there is a ‘code’
(set of default recognition rules)
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 17
Gene Expression Challenges for Computational Biology
• Understand the ‘code’ for each step in gene expression(set of default recognition rules), e.g., the ‘splicing code’
• Understand the rules for sequence-specific recognition of nucleic acids by protein and ribonucleoprotein (RNP) factors
• Understand the regulatory events that occur at each step and the biological consequences of regulation
Lots of data
Genomes, structures, transcripts, microarrays, ChIP-Chip, etc.
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 18
Sequence-specific Transcription Factors
• have modular organization
» Understand DNA-binding specificity
Yan (ISU) A computational method to identify amino acid residues involved in protein-DNA interactions
ATF-2/c-Jun/IRF-3 DNA complex
Panne et al. EMBO J. 2004
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 19
Early Steps in Pre-mRNA Splicing
Matlin, Clark & Smith Nature Mol Cell Biol 2005
• Formation of exon-spanning complex
• Subsequent rearrangement to form intron-spanning spliceosomes which catalyze intron excision and exon ligation
hnRNP proteins
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 20
Alternative Splicing
Matlin, Clark & Smith Nature Mol Cell Biol 2005
Wang (ISU) Genome-wide Comparative Analysis of Alternative Splicing in Plants
> 50% of human genes
undergo alternative splicing
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 21
Splicing Regulation
Matlin, Clark & Smith Nature Mol Cell Biol 2005
ESE/ESS = Exonic Splicing Enhancers/Silencers
ISE/ISS = Intronic Splicing Enhancers/Silencers
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 22
C. elegans lin-4 Small Regulatory RNA
We now know that there are hundreds of microRNA genes
(Ambros, Bartel, Carrington, Ruvkun, Tuschl, others)
lin-4 precursor
lin-4 RNA
“Translational repression”
V. Ambros lablin-4 RNA
target mRNA
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 23
MicroRNA Biogenesis
N. Kim Nature Rev Mol Cell Biol 2005
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 24
miRNA and RNAi pathways
RISC
Dicerprecursor
miRNA siRNAs
Dicer
“translational repression”and/or mRNA degradation
mRNA cleavage, degradation
RNAi pathway
microRNA pathway
MicroRNA primary transcript Exogenous dsRNA, transposon, etc.
target mRNA
Drosha
RISCRISC
C Burge 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 25
miRNA Challenges for Computational Biology
• Find the genes encoding microRNAs
• Predict their regulatory targets
• Integrate miRNAs into gene regulatory pathways & networks
Computational Prediction of MicroRNA Genes & Targets
C Burge 2005
Need to modify traditional paradigm of "transcriptional control" primarily by protein-DNA interactions to include miRNA regulatory mechanisms!
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 26
New Today:
RNA Structure Prediction
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 27
RNA structure prediction strategies
1)Energy minimization (thermodynamics)
2) Comparative sequence analysis (co-variation)
3) Combined experimental & computational
Secondary structure prediction
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 28
Secondary structure prediction strategies
1)Energy minimization (thermodynamics)
• Algorithm: Dynamic programming to find high probability pairs(also, some Genetic algorithms)
• Software: Mfold - ZukerVienna RNA Package -
Hofacker RNAstructure - MathewsSfold - Ding & Lawrence
R Knight 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 29
Secondary structure prediction strategies
2) Comparative sequence analysis (co-variation)• Algorithm:
Mutual informationContext-free grammars
• Software: ConStructAlifoldPfoldFOLDALIGNDynalign
R Knight 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 30
Secondary structure prediction strategies
3) Combined experimental & computational
• Experiment:Map single-stranded vs double-stranded regions in folded RNA
• How?Enzymes: S1 nuclease, T1 RNaseChemicals: kethoxal, DMS
R Knight 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 31
Experimental RNA structure determination?
• X-ray crystallography
• NMR spectroscopy
• Enzymatic/chemical mapping
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 32
1) Energy minimization method
What are the assumptions?
Native tertiary structure or "fold" of an RNA molecule is (one of) its "lowest" free energy configuration(s)
Gibbs free energy = G in kcal/mol at 37C
= equilibrium stability of structure lower values (negative) are more favorableIs this assumption valid?
in vivo? - this may not hold, but we don't really know
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 33
Free energy minimization
What are the rules?
A UA U
A=UA=U
Basepair
G = -1.2 kcal/mole
A UU A
A=UU=A
G = -1.6 kcal/mole
Basepair
What gives here?
C Staben 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 34
Energy minimization calculations:Base-stacking is critical
AAUU
-1.2 CGGC
-3.0
AU or UAUA AU
-1.6 GCCG
-4.3
AG, AC, CA, GAUC, UG, GU, CU
-2.1 GUUG
-0.3
CCGG
-4.8 XG, GXYU, UY
0
- Tinocco et al.
C Staben 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 35
Nearest-neighbor parameters
Most methods for free energy minimizationuse nearest-neighbor parameters (derived from experiment) for predicting stability of an RNA secondary structure (in terms of G at 37C)
& most available software packages use the same set of parameters:
Mathews, Sabina, Zuker & Turner, 1999
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 36
Energy minimization - calculations:
Total free energy of a specific conformation for a specific RNA molecule = sum of incremental energy terms for:
• helical stacking (sequence dependent)• loop initiation• unpaired stacking
(favorable "increments" are < 0)
Fig 6.3Baxevanis & Ouellette 2005
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 37
But how many possible conformations for a single RNA molecule?
Huge number:Zuker estimates (1.8)N possible secondary
structures for a sequence of N nucleotides
for 100 nts (small RNA…) = 3 X 1025 structures!
Solution? Not exhaustive enumeration… Dynamic programming
O(N3) in time O(N2) in
space/storage iff pseudoknots
excluded, otherwise:O(N6 ), timeO(N4 ), space
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 38
2) Comparative sequence analysis (co-variation)
Two basic approaches:
• Algorithms constrained by initial alignment
Much faster, but not as robust as unconstrained
Base-pairing probabilities determined by a partition function
• Algorithms not constrained by initial alignment
Genetic algorithms often used for finding an alignment & set of structures
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 39
RNA Secondary structure prediction: Performance?
How evaluate? • Not many experimentally determined structures
currently, ~ 50% are rRNA structures so "Gold Standard" (in absence of tertiary structure):
compare with predicted RNA secondary structure with that determined by
comparative sequence analysis (!!??) using Benchmark Datasets
NOTE: Base-pairs predicted by comparative sequence analysis for large & small subunit rRNAs are 97% accurate when compared with high resolution crystal structures!
- Gutell, Pace
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 40
RNA Secondary structure prediction: Performance?
1)Energy minimization (via dynamic programming) 73% avg. prediction accuracy - single
sequence2) Comparative sequence analysis
97% avg. prediction accuracy - multiple sequences (e.g., highly conserved rRNAs)much lower if sequence conservation is lower &/or fewer sequences are available for alignment
3) Combined - recent developments: combine thermodynamics & co-variation
& experimental constraints? IMPROVED RESULTS
11/02/05 D Dobbs ISU - BCB 444/544X: RNA Structure Prediction 41
RNA structure prediction strategies
Requires "craft" & significant user input & insight
1) Extensive comparative sequence analysis to predict tertiary contacts (co-variation)
e.g., MANIP - Westhof2) Use experimental data to constrain model
building e.g., MC-CYM - Major
3) Homology modeling using sequence alignment & reference tertiary structure (not many of these!)
4) Low resolution molecular mechanics e.g., yammp - Harvey
Tertiary structure prediction