TRANSCRIPT
Alternative Splicing
[Figure: a genomic DNA sequence (exons separated by introns) is transcribed into pre-mRNA; RNA processing then yields alternative mRNAs built from different exon combinations, each drawn with a Gm cap and an AAAAA tail.]
Alternative Splicing Data Sources are Large and Growing
Microarray detection
Direct or indirect alternative splicing detection
Hu et al. (2001) Genome Res 11:1237-45
Yeakley et al. (2002) Nat Biotech 20:353-9
Curated databases
SWISS-PROT and RefSeq both support annotation of experimentally supported alternative splicing
cDNA Sequencing Projects
RIKEN sequenced >21000 full length mouse cDNAs
Many other projects underway (human, fly, plants,…)
Shinagawa et al. (2001) Nature 409:685-90
Public EST data sources (dbEST)
>4.5 million human EST sequences
>12 million total EST sequences
About 1000 new sequences per day
Boguski et al. (1993) Nat Gen 4:332-3
Nonsense-Mediated mRNA Decay
Termination codon is on the last exon (not premature)
[Figure: genomic DNA (exon, intron, exon, intron, exon) -> pre-mRNA -> mRNA with Gm cap, exon junction complexes, and AAAAAAAAA tail.]
Mitchell and Tollervey (2001) Curr Opin Cell Biol 13:320-5
Nagy and Maquat (1998) TIBS 23:198-9
Le Hir et al. (2000) Genes & Dev 14:1098-1108
Lykke-Andersen et al. (2001) Science 293:1836-9
Kim et al. (2001) EMBO J 20:2062-68
Ishigaki et al. (2001) Cell 106:607-17
Leeds et al. (1991) Genes Dev 5:2303-14
Nonsense-Mediated mRNA Decay
Termination codon >50 nt before last exon junction (Premature Termination Codon)
Decapping and degradation
[Figure: mRNA with Gm cap and AAAAAAAAA tail.]
Interaction between EJC and release factors triggers NMD
Nonsense-Mediated mRNA Decay
[Figure: two mRNAs, each with Gm cap, ORF, and AAAAAAAAA tail. One is translated normally; the other, marked ">50 nt", is degraded by NMD.]
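The 50-nt rule from the preceding slides can be written as a small check. This is only a sketch under stated assumptions: the function name, the use of spliced-mRNA coordinates, and the choice to measure from the end of the stop codon to the last exon-exon junction are illustrative conventions, not taken from the slides.

```python
from typing import List


def is_nmd_candidate(stop_codon_end: int,
                     exon_lengths: List[int],
                     threshold: int = 50) -> bool:
    """Apply the ">50 nt before the last exon junction" rule.

    stop_codon_end: 0-based mRNA coordinate just past the stop codon
                    (assumed convention, hypothetical parameter).
    exon_lengths:   lengths of the exons in 5'->3' order in the spliced mRNA.
    """
    if len(exon_lengths) < 2:
        # A single-exon transcript has no exon-exon junction.
        return False
    # Position of the last exon-exon junction in mRNA coordinates.
    last_junction = sum(exon_lengths[:-1])
    # A stop codon in the last exon gives a negative distance, so it
    # is never flagged as premature.
    return last_junction - stop_codon_end > threshold


# Exons of 100, 200 and 150 nt: the last junction sits at position 300.
print(is_nmd_candidate(400, [100, 200, 150]))  # stop in last exon
print(is_nmd_candidate(200, [100, 200, 150]))  # stop 100 nt upstream
```

Under these conventions a stop codon exactly 50 nt upstream is not flagged, since the rule as stated is strictly ">50 nt".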
NMD is Pervasive
1498 of 1500 genes surveyed from fungi, plants, insects and vertebrates obey the PTC rule
Nagy and Maquat (1998) TIBS 23:198-9
“NMD is a critical process in normal cellular development”
Wagner and Lykke-Andersen (2002) J Cell Sci 115:3033-8
Wang et al. (2002) J Biol Chem 277:18489-93
Renders recessive many otherwise dominant mutations
Cali and Anderson (1998) Mol Gen Genet 260:176-84
V(D)J recombination
4.3% of reviewed RefSeqs have PTCs
34% have start codon after first exon
Transcriptional Regulation and RUST
[Figure: gene locus -> (transcription) -> pre-mRNA -> (productive splicing) -> productive mRNA -> (translation) -> protein. The pathway is drawn for both transcriptional regulation and RUST.]
Alternative Splicing Can Yield Isoforms Differentially Subjected to NMD
[Figure: in the nucleus, DNA -> pre-mRNA -> mRNA, shown twice. One mRNA carries a premature termination codon and is subjected to NMD.]
SC35 Auto-regulation
[Figure: SC35 locus -> (transcription) -> SC35 pre-mRNA -> (splicing) -> productive SC35 mRNA -> (translation) -> SC35 protein, with an alternative splicing branch from the pre-mRNA.]
Sureau et al. (2001) EMBO J 20:1785-96
SC35 Auto-regulation
Alternative splicing coupled with nonsense-mediated decay
[Figure: SC35 locus -> SC35 pre-mRNA -> productive SC35 mRNA (ORF, AAAAA tail) -> SC35 protein. A second path, drawn with SC35 protein on the SC35 pre-mRNA, leads to SC35 mRNA with a premature termination codon.]
Sureau et al. (2001) EMBO J 20:1785-96
EST-inferred human isoforms
[Bar chart, axis 0 to 10000]
All isoforms, including canonical: 8820
Alternative isoforms: 5693
NMD candidates: 1989 (35% of 5693)
Canonical Splice Forms
Inputs: Genomic Contigs (Lander et al. (2001) Nature 409:860-921); Refseq mRNAs (Pruitt, K.D. et al. (2001) NAR 29:137-40)
Extract coding regions from Refseq mRNAs: Coding Refseqs
Association via LocusLink: Refseq-Contig Pairs
Align w/ Spidey, ≥98% id, no gaps (Wheelan et al. (2001) Gen Res 11:1952-7)
Construct genes from aligned Refseq exons & intervening genomic introns (on overlap, choose mRNA w/ largest number of exons)
Result: Refseq-coding genes
[Figure: an mRNA (Exon 1, Exon 2, Exon 3, Exon 4) aligned to the genomic DNA sequence to give a Refseq-coding gene.]
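The filtering and overlap rule in the pipeline above can be sketched in a few lines. Everything here is an illustrative assumption rather than the actual pipeline: the `Alignment` record, its field names, the greedy overlap resolution, and the tie-breaking (input order) are hypothetical, and the sketch does not call Spidey itself; it only post-processes alignment results that are assumed to be already parsed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alignment:
    """Hypothetical record for one Refseq-to-contig alignment."""
    refseq_id: str
    contig_id: str
    percent_identity: float
    has_gaps: bool
    exons: List[Tuple[int, int]]  # contig coordinates, half-open, sorted


def gene_span(aln: Alignment) -> Tuple[int, int]:
    """Genomic extent from the first exon start to the last exon end."""
    return aln.exons[0][0], aln.exons[-1][1]


def introns(aln: Alignment) -> List[Tuple[int, int]]:
    """Intervening genomic introns between consecutive aligned exons."""
    return [(aln.exons[i][1], aln.exons[i + 1][0])
            for i in range(len(aln.exons) - 1)]


def select_canonical(alignments: List[Alignment],
                     min_identity: float = 98.0) -> List[Alignment]:
    """Keep >=98% identity, gap-free alignments; where genes overlap on a
    contig, keep the mRNA with the largest number of exons."""
    kept = [a for a in alignments
            if a.exons and a.percent_identity >= min_identity
            and not a.has_gaps]
    # Most exons first; the sort is stable, so ties keep input order.
    kept.sort(key=lambda a: len(a.exons), reverse=True)
    chosen: List[Alignment] = []
    for aln in kept:
        start, end = gene_span(aln)
        clashes = any(
            c.contig_id == aln.contig_id
            and start < gene_span(c)[1] and gene_span(c)[0] < end
            for c in chosen)
        if not clashes:
            chosen.append(aln)
    return chosen
```

A greedy pass is the simplest reading of "choose mRNA w/ largest number of exons"; a real pipeline might resolve chains of overlapping genes differently.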
Identification of Alternative Isoforms
Inputs: Refseq-coding genes; ESTs from dbEST (Boguski et al. (1993) Nat Genet 4:332-3)
Cluster ESTs w/ WU-BLAST2, ≥92% id, allow gaps (Gish (2002) Wash. Univ.)
Align ESTs w/ sim4 (Florea et al. (1998) Gen Res 8:967-74)
Use TAP to infer alternative mRNAs (Kan et al. (2001) Gen Res 11:889-900)
Result: Alternative Isoforms of Refseq-coding genes
>92% identity, gaps allowed
Aligned EST 5’ end does not indicate reading frame
Previous and new RUST targets
Class: Splicing Factors
Experimental Evidence: AUF1, SC35; SRP20, SRP30b (in C. elegans)
Sureau et al. (2001) EMBO J 20:1785-96
Wilson et al. (1999) Mol Cell Bio 19:4056-64
Morrison et al. (1997) PNAS 94:9782-9785
Among Our Results: AUF1, *10 new
Class: Ribosomal Proteins
Experimental Evidence: L3, L7a, L10a, L12 (in C. elegans); L30, S14B (in S. cerevisiae)
Mitrovich & Anderson (2000) Genes Dev 14:2173-84
Among Our Results: L3, L7a, L10a, L12, *11 new
Alternative Splicing
Recruitment of Sequence.
Deletion of Sequence.
*Frameshift and Truncation (not integer # codons).
Premature Stop Codons
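The "not integer # codons" condition is a divisibility test on the length change that a splice event introduces. A minimal sketch, assuming the change is given as a signed nucleotide count (positive for recruited sequence, negative for deleted sequence); the function name and return labels are hypothetical:

```python
def frame_effect(length_change: int) -> str:
    """Classify a splice event by its effect on the reading frame.

    length_change: nucleotides recruited (positive) or deleted
                   (negative) within the coding region.
    """
    # A change that is a whole number of codons keeps the frame;
    # Python's % returns a non-negative result for negative inputs,
    # so deletions are handled by the same test.
    if length_change % 3 == 0:
        return "in-frame"
    return "frameshift"


print(frame_effect(96))   # recruited exon of 32 codons
print(frame_effect(-40))  # deleted sequence, not a multiple of 3
```

An in-frame recruitment can still introduce a premature stop codon if the recruited sequence itself contains one, which is why premature stop codons are listed separately from frameshifts.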
EST Limitations
Single pass sequencing errors
Incompletely processed transcripts
3’ end bias
Library contamination
Thanaraj (1999) NAR 27:2627-37
Alternative Splicing EST Analysis
From data in Brett et al. (2000) FEBS Lett 474:83-6
[Bar chart: number of EST splice forms (axis 0 to 2000) in three categories: did not affect reading frame; inserted stop codon; changed reading frame.]
Alternative Isoform Inference from Splice Pairs
Alternative Splice Pairs, by Mode
Splice Pairs Generating Premature Stops
EST coverage and premature stops
For 76% of isoforms with premature stops: ESTs cover a PTC & splice junction downstream
Alternative polyadenylation signals are biased against recovery
In 80% of these isoforms, there is a PTC in every reading frame:
[Figure: RefSeq mRNA aligned with an alternatively spliced EST shown in reading frames 0, 1, and 2.]
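Because the 5’ end of an aligned EST does not indicate its reading frame, one way to test for "a PTC in every reading frame" is to scan all three frames for a stop codon. The sketch below is an assumption about how such a scan could look, not the method used for the 80% figure: the function names are hypothetical, frames are numbered from the first base of the sequence, and it only finds stop codons without applying the >50 nt junction rule.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq: str, frame: int) -> Optional[int]:
    """Return the 0-based index of the first stop codon in the given
    frame (0, 1 or 2), or None if that frame is open throughout."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def stop_in_every_frame(seq: str) -> bool:
    """True if all three forward reading frames contain a stop codon."""
    return all(first_stop(seq, frame) is not None for frame in range(3))


# Toy sequence with TAA in frame 0, TAG in frame 1 and TGA in frame 2.
print(stop_in_every_frame("TAACTAGCTGA"))
```

Combining the returned stop positions with the junction coordinates would be the next step to decide whether each stop is premature.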