![Page 1: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/1.jpg)
Regulation of Alternative Splicing
Jihye Kim
Oral Preliminary Exam (May 7, 2007)
![Page 2: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/2.jpg)
Outline
• Alternative Splicing Overview
• Goal : Investigate “regulation” of AS
• Method : Association Rule Mining
• Part I : Finding association rules of cis-regulatory elements involved in alternative splicing
• Part II : Cis-regulatory Motif Combinations Associated with Tissue-specific Alternative Splicing
• Summary
• Future Work
![Page 3: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/3.jpg)
Splicing
• Introns are removed and flanking exons are concatenated
• Spliceosome
- snRNPs and other proteins
[image from http://fig.cox.miami.edu/~cmallery/150/gene/c7.17.11.spliceosome.jpg]
![Page 4: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/4.jpg)
Splice Sites
• Recognized by spliceosome
• Splice-site signals alone are too weak to predict intron location accurately
[image from http://web-books.com/MoBio/Free/Ch5A4.htm]
5’ 3’
![Page 5: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/5.jpg)
Splicing Factors and Binding Sites
• Assist spliceosome to identify splice sites
• Splicing factors
– SR (serine/arginine-rich) proteins
• Exonic and intronic enhancers and silencers (cis-acting)
– ESE (A/G rich motifs), ESS (hnRNP), ISE (G triples, UGCAUG), ISS
[Source from Katherina Kechris in Rocky’05 Conference]
Exon Exon 2
![Page 6: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/6.jpg)
Alternative Splicing
• Over 70% of human genes are alternatively spliced
• Major mechanism to generate protein diversity
• Highly relevant to disease
– 15% of disease-causing mutations affect splicing [Krawczak 1992]
[Krawczak 1992] Krawczak, M., Reiss, J., and Cooper, D.N. 1992 Hum. Genet. 90: 41-54
protein
Pre-mRNA
mRNA
![Page 7: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/7.jpg)
Types of Alternative Splicing
[Source from Cartegni et al. 2002]
Cassette Exon
![Page 8: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/8.jpg)
Investigating Alternative Splicing
• Traditionally, align ESTs and mRNAs to genomic sequences
• Recently, microarray technology (splice arrays)
– Exon skipping is measured
– Hard to measure other types of AS
![Page 9: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/9.jpg)
Previous Work on AS Regulation
• Most methods
– use only sequence data
– focus on the effect of individual motifs
• Brain-specific exon skipping [Brudno 2001]
– 25 brain-specific cassette exons from literature
– Over-representation of UGCAUG in downstream intron
• RESCUE-ESE [Fairbrother 2002]
– Frequent hexamers in exons with weak splice sites
– 10 ESE motifs show enhancer activity in experiment
[Brudno 2001] Brudno M., Gelfand M.S., et al., 2001 NAR 29(11):2338-2348
[Fairbrother 2002] Fairbrother WG., et al., 2002 Science 297(5583):1007-13
![Page 10: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/10.jpg)
What We Have Done So Far
• Investigate cis-regulatory motifs that influence amount of AS or tissue-specific AS
[Jihye Kim, Sihui Zhao, Steffen Heber, “Finding association rules of cis-regulatory elements involved in alternative splicing”, Proceedings of the 45th annual southeast regional conference (ACM-SE) pp. 232 – 237]
[Jihye Kim, Sihui Zhao, Steffen Heber, “Cis-regulatory Motif Combinations Associated with Tissue-specific Alternative Splicing”, 7th workshop on Algorithms in Bioinformatics (WABI 2007) (submitted)]
– Use mouse splice array data
– Apply Association Rule Mining
– Investigate motif combinations involved in tissue-specific AS
![Page 11: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/11.jpg)
AS Datasets in Mouse
• Dataset
– Splice Array [Pan 2004] with 6 probes
– 3126 exon skipping genes in mouse
– %ASex : percentage of exon skipping in 10 tissues
[Pan 2004] Pan, Q., et al., 2004 Mol Cell 16(6):929-942
Aim I-I : representing data context
![Page 12: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/12.jpg)
Association Rule Mining
• By Agrawal et al. in 1993
• Initially used for Market Basket Analysis
• An association rule is a pattern that states when X occurs, Y occurs with certain probability
• X : antecedent (left-hand-side, lhs), Y : consequent (right-hand-side, rhs)
• Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf)
X => Y
![Page 13: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/13.jpg)
Rule Strength Measures
• Given a rule X => Y,
– Support = Pr(X∧Y)
– Confidence = Pr(Y | X)
– Lift = Pr(X∧Y) / (Pr(X) Pr(Y))
• Dependency of lhs and rhs
• Generally, lhs and rhs have positive dependency if lift > 1.0
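As a minimal sketch of the three measures, the Python below evaluates them for the rule Beer => Diaper on the five carts used in the ARM Example slides of this deck. The function names (`support`, `confidence`, `lift`) are illustrative choices, not part of the original slides.

```python
from fractions import Fraction

# The five carts from the ARM Example slides.
carts = [
    {"Milk", "Bread", "Diaper", "Beer", "Jam", "Banana"},
    {"Beer", "Nuts", "Tissue", "Diaper"},
    {"Apple", "Beer"},
    {"Jam", "Beer", "Diaper"},
    {"Bread", "Butter", "Tissue", "Jam"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return Fraction(sum(itemset <= t for t in transactions), len(transactions))

def confidence(lhs, rhs, transactions):
    """Pr(rhs | lhs) = support(lhs and rhs) / support(lhs)."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

def lift(lhs, rhs, transactions):
    """Pr(lhs and rhs) / (Pr(lhs) * Pr(rhs)); above 1 suggests positive dependency."""
    joint = support(set(lhs) | set(rhs), transactions)
    return joint / (support(lhs, transactions) * support(rhs, transactions))

s = support({"Beer", "Diaper"}, carts)
c = confidence({"Beer"}, {"Diaper"}, carts)
l = lift({"Beer"}, {"Diaper"}, carts)
print(float(s), float(c), float(l))  # 0.6 0.75 1.25
```

Exact fractions are used here only to avoid floating-point noise when comparing against thresholds.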
![Page 14: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/14.jpg)
ARM Example
Cart 1 : Milk, Bread, Diaper, Beer, Jam, Banana
Cart 2 : Beer, Nuts, Tissue, Diaper
Cart 3 : Apple, Beer
Cart 4 : Jam, Beer, Diaper
Cart 5 : Bread, Butter, Tissue, Jam
![Page 15: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/15.jpg)
ARM Example
Min supp = 0.5 Min conf = 0.7
Frequent Itemset = itemset whose support ≥ 0.5
![Page 17: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/17.jpg)
ARM Example
Min supp = 0.5 Min conf = 0.7
Frequent Itemsets (support)
Bread (2/5 < 0.5) is not frequent
![Page 18: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/18.jpg)
ARM Example
Min supp = 0.5 Min conf = 0.7
Frequent Itemsets (support)
Beer (0.8), Jam (0.6), Diaper (0.6)
{Beer, Diaper} (0.6)
![Page 20: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/20.jpg)
ARM Example
Min supp = 0.5 Min conf = 0.7
Frequent Itemsets
Beer (0.8), Jam (0.6),
Diaper (0.6)
{Beer, Diaper} (0.6)
Association Rules (confidence)
Beer => Jam (2/4 < 0.7)
![Page 21: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/21.jpg)
ARM Example
Min supp = 0.5 Min conf = 0.7
Frequent Itemsets
Beer (0.8), Jam (0.6),
Diaper (0.6)
{Beer, Diaper} (0.6)
Association Rules (confidence)
Beer => Diaper (0.75)
![Page 22: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/22.jpg)
Apriori Algorithm
• Most popular algorithm
• Two steps:
– Find all itemsets that satisfy min_supp (frequent itemsets)
• any subset of a frequent itemset is also frequent
• Find all 1-item frequent itemsets; then all 2-item frequent itemsets, and so on.
– Generate Rules
• A => B is an association rule if Confidence(A => B) ≥ min_conf
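The two steps can be sketched in Python as follows. This is a small, unoptimized illustration of the level-wise idea, not the implementation used in this work. The function names `apriori` and `generate_rules` are assumptions, and the data are the five carts from the ARM Example slides.

```python
from fractions import Fraction
from itertools import combinations

def apriori(transactions, min_supp):
    """Step 1: return {itemset: support} for all itemsets with support >= min_supp."""
    n = len(transactions)

    def supp(items):
        return Fraction(sum(items <= t for t in transactions), n)

    # Level 1: frequent single items.
    level = {frozenset([i]) for t in transactions for i in t}
    level = {s for s in level if supp(s) >= min_supp}
    frequent, k = {}, 1
    while level:
        for s in level:
            frequent[s] = supp(s)
        k += 1
        # Candidates of size k are unions of frequent (k-1)-itemsets; prune any
        # candidate that has an infrequent (k-1)-subset before counting support.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = {c for c in candidates
                 if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
                 and supp(c) >= min_supp}
    return frequent

def generate_rules(frequent, min_conf):
    """Step 2: split each frequent itemset into lhs => rhs and keep confident rules."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = s / frequent[lhs]  # lhs is frequent because itemset is
                if conf >= min_conf:
                    rules.append((lhs, itemset - lhs, conf))
    return rules

carts = [
    {"Milk", "Bread", "Diaper", "Beer", "Jam", "Banana"},
    {"Beer", "Nuts", "Tissue", "Diaper"},
    {"Apple", "Beer"},
    {"Jam", "Beer", "Diaper"},
    {"Bread", "Butter", "Tissue", "Jam"},
]
frequent = apriori(carts, min_supp=0.5)
for lhs, rhs, conf in generate_rules(frequent, min_conf=0.7):
    print(sorted(lhs), "=>", sorted(rhs), float(conf))
```

With Min supp = 0.5 and Min conf = 0.7 this yields the frequent itemsets Beer, Jam, Diaper and {Beer, Diaper}, and the rules Beer => Diaper (0.75) and Diaper => Beer (1.0).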
![Page 23: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/23.jpg)
Part I : Finding association rules of cis-regulatory elements involved in alternative splicing
[Proceedings of the 45th annual southeast regional conference (ACM-SE) Winston-Salem, North Carolina pp. 232 – 237]
![Page 24: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/24.jpg)
K-mers Around Cassette Exon (items)
• Pre-mRNA sequences
– Transcripts from NCBI
– BLAT to align transcripts to mouse genome
– 200 bps from 7 regions around cassette exon
– 2565 genes in total
• Items (6mers) : AAAAAA to TTTTTT in region 1 … 7
Aim I-I : representing data context
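One way to picture the item representation is the sketch below, which turns the sequences of the regions around a cassette exon into region-tagged 6-mer items such as `7_TGAAGA`. The function names and the toy sequences are hypothetical. Real regions would be 200 bp long, and with 4^6 = 4096 hexamers per region there are up to 7 × 4096 distinct items.

```python
def kmers(seq, k=6):
    """All distinct k-mers that occur in `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def gene_items(region_seqs, k=6):
    """Map {region number: sequence} to a set of '<region>_<kmer>' items.

    One gene becomes one transaction; each item records which k-mer
    occurs in which region, not how often or where exactly.
    """
    items = set()
    for region, seq in region_seqs.items():
        for km in kmers(seq.upper(), k):
            items.add(f"{region}_{km}")
    return items

# Hypothetical toy gene with very short sequences for regions 4 and 5.
toy_gene = {4: "AGGATAC", 5: "ttagct"}
print(sorted(gene_items(toy_gene)))
# ['4_AGGATA', '4_GGATAC', '5_TTAGCT']
```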
![Page 25: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/25.jpg)
ARM in Finding AS Motif Rule
• Items : all possible hexamers (motifs)
• Transactions : 2565 AS genes
• Goal : finding motif association rules in AS genes (e.g., AGGATA => TTAGCT)
• By Apriori algorithm [Agrawal 1993]
Find All Frequent Hexamers
Generate Hexamer Rules
[Agrawal 1993] Agrawal R., Imielinski T., Swami AN., 1993 SIGMOD 22(2):207-216
![Page 26: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/26.jpg)
ARM Example
[Example]
Seq 1 : ACGATTAGG
Seq 2 : GAATAGG
Seq 3 : TGCAGG
Seq 4 : GGATTAGG
Seq 5 : CAGAT
Min support = 0.5
Min confidence = 0.7
![Page 28: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/28.jpg)
ARM Example
- Frequent 3-mer sets (support)
AGG (0.8), GAT (0.6), TAG (0.6), {AGG,TAG} (0.6)
![Page 29: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/29.jpg)
ARM Example
- Frequent 3-mer sets (support)
AGG (0.8), GAT (0.6), TAG (0.6), {AGG,TAG} (0.6)
- Rules (confidence)
AGG => GAT : conf = 2 / 4 = 0.5 < minconf
![Page 30: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/30.jpg)
ARM Example
- Frequent 3-mer sets (support)
AGG (0.8), GAT (0.6), TAG (0.6), {AGG,TAG} (0.6)
- Rules (confidence)
AGG => TAG (0.75)
TAG => AGG (1.0)
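The numbers in this toy example can be reproduced with a short brute-force check in Python. It enumerates single 3-mers and pairs directly rather than running a full Apriori search, and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

seqs = ["ACGATTAGG", "GAATAGG", "TGCAGG", "GGATTAGG", "CAGAT"]
K, MIN_SUPP, MIN_CONF = 3, 0.5, 0.7

# Each sequence is one transaction: the set of 3-mers it contains.
transactions = [{s[i:i + K] for i in range(len(s) - K + 1)} for s in seqs]

def supp(items):
    items = set(items)
    return Fraction(sum(items <= t for t in transactions), len(transactions))

all_kmers = sorted({km for t in transactions for km in t})
freq1 = [km for km in all_kmers if supp([km]) >= MIN_SUPP]
freq2 = [pair for pair in combinations(freq1, 2) if supp(pair) >= MIN_SUPP]

# Try both directions of every frequent pair and keep confident rules.
found = {}
for a, b in freq2:
    for lhs, rhs in ((a, b), (b, a)):
        conf = supp([lhs, rhs]) / supp([lhs])
        if conf >= MIN_CONF:
            found[(lhs, rhs)] = conf

print(freq1)  # ['AGG', 'GAT', 'TAG']
print(freq2)  # [('AGG', 'TAG')]
for (lhs, rhs), conf in found.items():
    print(lhs, "=>", rhs, float(conf))
```

AGG and GAT co-occur in only 2 of 5 sequences, so {AGG, GAT} is not frequent and AGG => GAT (confidence 0.5) is not reported.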
![Page 31: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/31.jpg)
Motif Association Rules from AS Genes
1 2 3 4 5 6 7
Frequent 6-mers
Minsup = 0.05 (129 genes)
- 7_TGAAGA, 7_GAAGAA (ASF/SF2, SRp55)
- 6_TTTTCT, 6_AATAAA, …
- Among 6,000 6-mers, 1/3 are in AEDB
- Candidates of regulatory motifs
Association Rules
Minconf = 0.4
- 7_AAAAAT => 7_TGAAGA, 7_AAAGGA => 7_AGAAGA,
- 7_GAAAAA => 7_AAGAAG, 7_CTGCCT => 7_CTGGAG,
- 7_AGGAAA => 7_AAGAAG, 7_AATAAA => 7_AAGAAG
- Candidates of regulatory combinations for AS
Aim I-II : finding motif association rules for all AS genes
![Page 32: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/32.jpg)
Clustering by AS Pattern in 10 Tissues
• Hypothesis : Motif combinations “cause” AS profile
• Cluster genes based on AS profile. We use
– Euclidean distance / Correlation
– Average linkage clustering
• Frequent 6-mers in cluster are motif candidates
Aim I-III : finding motif association rules for cluster
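To make the clustering step concrete, here is a small pure-Python sketch of average linkage clustering with either Euclidean or correlation distance. It is a naive illustration, not the tool used for the results in this deck. The four profiles are hypothetical and use 3 tissues instead of 10 for brevity, and the correlation distance assumes no profile is constant.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def correlation_distance(p, q):
    """1 - Pearson correlation; small when two profiles have the same shape."""
    n = len(p)
    mp, mq = sum(p) / n, sum(q) / n
    cov = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    sp = math.sqrt(sum((a - mp) ** 2 for a in p))
    sq = math.sqrt(sum((b - mq) ** 2 for b in q))
    return 1 - cov / (sp * sq)

def average_linkage(c1, c2, dist):
    """Mean pairwise distance between the members of two clusters."""
    return sum(dist(p, q) for p in c1 for q in c2) / (len(c1) * len(c2))

def cluster(profiles, dist, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in profiles]
    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: average_linkage(
            clusters[ij[0]], clusters[ij[1]], dist))
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters

# Hypothetical %ASex profiles over three tissues.
g1, g2 = (10, 50, 90), (5, 25, 45)   # same shape, different scale
g3, g4 = (90, 50, 10), (80, 45, 12)  # opposite trend
for c in cluster([g1, g2, g3, g4], correlation_distance, 2):
    print(c)
```

Correlation distance groups g1 with g2 because their profiles rise in the same way across tissues, even though their absolute skipping levels differ.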
![Page 33: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/33.jpg)
Association Rules from Clusters
1 2 3 4 5 6 7
• Lift (X => Y) > 2.0
• Comparison with outside the cluster (p-value < 2.13e-10)
• Association rules are candidates of motif combinations for the corresponding AS pattern
Correlation based clusters
Aim I-III : finding motif association rules for cluster
![Page 34: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/34.jpg)
Part II : Cis-regulatory Motif Combinations Associated with Tissue-specific Alternative Splicing
[7th workshop on Algorithms in Bioinformatics (WABI 2007) (submitted)]
![Page 35: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/35.jpg)
Finding Motifs Involved in Tissue-Specific AS
• Items :
– hexamers in gene regions and
– exon skipping rate in tissues
• Transactions :
– 2565 genes from Pan’s data set
• Goal : find associations, e.g. AGGATA in cassette exon => High exon skipping in Brain
• We focus on complex rules, e.g. {AGGATA in cassette exon, CCTGCG in downstream intron} => High exon skipping in Brain
Aim II-I : finding motif association rules for tissue-specific AS
![Page 36: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/36.jpg)
AS profile items
• Use quartile to convert numeric %ASexes to character AS profile items
– BrainLow : The first %ASex quartile in Brain
– BrainHigh : The last %ASex quartile in Brain
BrainLow BrainHigh
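The quartile conversion could look roughly like the Python sketch below. The %ASex values are hypothetical, the function name is illustrative, and the handling of values that fall exactly on a quartile boundary (`<=` and `>=`) is an assumption, since the slide does not specify it.

```python
import statistics

def as_profile_items(pct_asex, tissue):
    """Convert {gene: %ASex} for one tissue into Low/High items.

    Genes in the first quartile get '<tissue>Low', genes in the last
    quartile get '<tissue>High'; genes in between get no item.
    """
    values = list(pct_asex.values())
    q1, _median, q3 = statistics.quantiles(values, n=4)
    items = {}
    for gene, value in pct_asex.items():
        if value <= q1:
            items[gene] = f"{tissue}Low"
        elif value >= q3:
            items[gene] = f"{tissue}High"
    return items

# Hypothetical %ASex values in brain for eight genes.
brain = {"g1": 5, "g2": 10, "g3": 20, "g4": 30,
         "g5": 40, "g6": 60, "g7": 80, "g8": 95}
print(as_profile_items(brain, "Brain"))
```

By construction each of the two items covers about a quarter of the genes, so a rule with such an item on the rhs has a baseline probability of roughly 0.25.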
![Page 37: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/37.jpg)
Motif Combination ARM Example
[Sequence]              [AS profile]
Seq 1 : ACGATTAGG   +   BH, HH
Seq 2 : GAATAGG     +   BH, HL
Seq 3 : TGCAGG      +   BH, HH
Seq 4 : GGATTAGG    +   BL, HH
Seq 5 : CAGAT       +   BH, HL
Min support = 0.5
Min confidence = 0.7
BH : BrainHigh, BL : BrainLow, HH : HeartHigh, HL : HeartLow
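As a rough sketch of how the sequence items and AS profile items combine, the Python below builds one transaction per sequence from its 3-mers plus its AS labels, then searches for rules whose lhs is one or two 3-mers and whose rhs is a single AS profile item. The brute-force enumeration and the variable names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

seqs = ["ACGATTAGG", "GAATAGG", "TGCAGG", "GGATTAGG", "CAGAT"]
profiles = [{"BH", "HH"}, {"BH", "HL"}, {"BH", "HH"}, {"BL", "HH"}, {"BH", "HL"}]
K, MIN_SUPP, MIN_CONF = 3, 0.5, 0.7

kmer_sets = [{s[i:i + K] for i in range(len(s) - K + 1)} for s in seqs]
# One transaction = 3-mers of the sequence plus its AS profile items.
transactions = [kms | prof for kms, prof in zip(kmer_sets, profiles)]

def supp(items):
    items = set(items)
    return Fraction(sum(items <= t for t in transactions), len(transactions))

kmer_items = sorted(set().union(*kmer_sets))
as_items = sorted(set().union(*profiles))

rules = {}
for size in (1, 2):  # lhs of one or two motifs
    for lhs in combinations(kmer_items, size):
        if supp(lhs) < MIN_SUPP:
            continue
        for y in as_items:  # rhs restricted to AS profile items
            s = supp(set(lhs) | {y})
            if s < MIN_SUPP:
                continue
            conf = s / supp(lhs)
            if conf >= MIN_CONF:
                rules[(lhs, y)] = (s, conf, conf / supp({y}))

for (lhs, y), (s, conf, lift) in rules.items():
    print(set(lhs), "=>", y, float(s), float(conf), float(lift))
```

On this toy data only AGG => BH and AGG => HH pass both thresholds, each with confidence 0.75. Their lifts differ (0.9375 and 1.25), which illustrates why a lift threshold is useful: BH is so common that AGG does not raise its probability.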
![Page 38: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/38.jpg)
Motif Combination ARM Example
![Page 39: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/39.jpg)
Tissue-Specific AS Motif Combinations
• With strict thresholds
– Min_supp = 0.01, Min_conf = 0.5, Min_lift = 1.2
– MinLen of lhs = 2 (for complex rule)
• Rule appearance
– lhs : hexamers, rhs : AS profile items
• 197 association rules are found in total
• 27 complex rules are found
– lhs : combinations of 34 frequent hexamers
– rhs : AS profile items in tissues
– All rules have >1.9 lift
– 23 rules show motif combinations in different regions
Aim II-I : finding motif association rules for tissue-specific AS
![Page 40: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/40.jpg)
Antecedent Consequent Support Confidence Lift
{X4_GCTGGA, X4_TGCTGG} {IntestineLow} 0.016 0.519 2.006
{X4_GCTGGA, X4_TGCTGG} {LungLow} 0.016 0.506 1.961
{X4_TGCTGG, X4_CTGGAG} {IntestineLow} 0.011 0.539 2.083
{X4_TGCTGG, X4_CTGGAG} {LungLow} 0.010 0.5 1.937
{X5_TTTTTA, X7_AGAGGA} {HeartHigh} 0.010 0.510 2.043
{X1_AGCAGC, X5_TTTTTA} {MuscleHigh} 0.010 0.54 2.220
{X1_GAGCAG, X3_TTTTAA} {MuscleHigh} 0.010 0.510 2.096
{X1_GAGCAG, X3_TTCTTT} {LiverHigh} 0.013 0.508 2.048
{X4_AGAAGA, X5_TTATTT} {SalivaryLow} 0.011 0.528 2.066
{X4_AGAAGA, X5_TTATTT} {HeartLow} 0.011 0.528 2.075
{X4_AGAAGA, X5_TTATTT} {KidneyLow} 0.011 0.528 2.023
{X4_AGAAGA, X5_TTATTT} {LiverLow} 0.011 0.528 2.041
{X3_ATTTTT, X6_TTCCTG} {SalivaryHigh} 0.011 0.509 2.031
{X3_TTGTTT, X6_TGTCTC} {LiverHigh} 0.011 0.5 2.017
{X2_GCCTGG, X3_CCTCTG} {LiverLow} 0.011 0.542 2.092
{X2_GTGGGG, X5_TTGTTT} {MuscleHigh} 0.013 0.516 2.120
{X5_ATTTTA, X6_TGCTGT} {SalivaryHigh} 0.010 0.510 2.034
{X5_TCTTTT, X6_TTGTCT} {SalivaryHigh} 0.010 0.634 2.530
{X3_TCTGTT, X6_TTGTCT} {HeartHigh} 0.012 0.527 2.110
{X5_TTTTTA, X6_TTGTCT} {HeartHigh} 0.014 0.507 2.032
{X3_CTCTTT, X5_TTAAAA} {KidneyHigh} 0.010 0.5 2.042
{X2_GGGTGG, X5_TTATTT} {SalivaryHigh} 0.011 0.510 2.032
{X5_TCTTTT, X6_TTTTCA} {IntestineHigh} 0.011 0.5 2.007
{X3_TTTATT, X6_TTTCCT} {IntestineHigh} 0.014 0.522 2.094
{X5_TCTTTT, X5_TTATTT, X5_TTTTTA} {HeartHigh} 0.010 0.5 2.004
{X5_TTCTTT, X5_TATTTT, X5_TTTTCT} {SalivaryHigh} 0.011 0.527 2.104
{X3_TATTTT, X3_ATTTTT, X5_TTGTTT} {BrainHigh} 0.011 0.510 2.084
1 2 3 4 5 6 7
Aim II-I : finding motif association rules for tissue-specific AS
{5_TTTTTA, 7_AGAGGA} => {HeartHigh}
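A quick consistency check on the table follows from the definitions: since lift = confidence / Pr(rhs), the ratio confidence / lift recovers the baseline frequency of the rhs item. Because the AS profile items are quartile-based, that ratio should be close to 0.25. The Python sketch below applies this to four rows copied from the table and re-checks the stated thresholds. The row selection and variable names are illustrative.

```python
# (lhs items, rhs item, support, confidence, lift), copied from the table.
rows = [
    (("X4_GCTGGA", "X4_TGCTGG"), "IntestineLow", 0.016, 0.519, 2.006),
    (("X5_TTTTTA", "X7_AGAGGA"), "HeartHigh", 0.010, 0.510, 2.043),
    (("X5_TCTTTT", "X6_TTGTCT"), "SalivaryHigh", 0.010, 0.634, 2.530),
    (("X3_TATTTT", "X3_ATTTTT", "X5_TTGTTT"), "BrainHigh", 0.011, 0.510, 2.084),
]
MIN_SUPP, MIN_CONF, MIN_LIFT, MIN_LHS_LEN = 0.01, 0.5, 1.2, 2

baseline = {}
for lhs, rhs, supp, conf, lift in rows:
    # Thresholds from the "Tissue-Specific AS Motif Combinations" slide.
    passes = (supp >= MIN_SUPP and conf >= MIN_CONF
              and lift >= MIN_LIFT and len(lhs) >= MIN_LHS_LEN)
    # lift = conf / Pr(rhs)  =>  Pr(rhs) = conf / lift
    baseline[rhs] = conf / lift
    print(f"{rhs:13s} passes={passes} Pr(rhs)~{baseline[rhs]:.3f}")
```

All four rows give a baseline between 0.24 and 0.26, so a confidence of about 0.5 means that genes carrying the motif combination are roughly twice as likely as an average gene to fall in that quartile.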
![Page 41: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/41.jpg)
AS Profile of Motif Combinations
Aim II- II : analyzing motif combination
1 2 3 4 5 6 7
![Page 42: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/42.jpg)
Summary of Graphs
• In some cases, genes with one motif do not show any different AS profile from all AS genes
• However, genes containing all motifs of a combination often show significantly changed exon skipping levels
• Combination of cis-regulatory motifs can influence AS profile in tissues
![Page 43: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/43.jpg)
Comparison with AEDB
• AEDB in EBI
– Transcript regulatory sequences from literature
– 292 enhancers and silencers
• >60% of extracted frequent hexamers are part of AEDB motifs
• >97% of hexamers involved in complex rules are part of AEDB motifs
![Page 44: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/44.jpg)
Summary
• Association rule mining (ARM) applied
• Finding motif association rules for AS
• Finding motif association rules for AS clusters
• Finding motif combinations for tissue-specific AS
![Page 45: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/45.jpg)
Future Work
Improve method
• Improve motif representation, e.g.
– variable motif length, gapped k-mers
– results from motif finding tools
• Improve AS profile representation
• Add more features, e.g.
– position and distance between motifs
– splice site
– exon / intron length
– conservation, gene information
• Statistical analysis
– Thresholds
– Multiple testing
![Page 46: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/46.jpg)
Future Work
• Systematic analysis of simple & complex motifs
• Other data sources
– Human splice array [Johnson 2003]
– ESTs
• Investigate discovered motifs
– Apply motif discovery tools
– Analyze genome occurrence
– Analyze gene and protein structure
• Build predictive model and apply it (if time permits)
• Experimental verification
[Johnson 2003] Science. 2003 Dec 19;302(5653):2141-4
![Page 47: Regulation of Alternative Splicing Jihye Kim Oral Preliminary Exam (May 7, 2007)](https://reader036.vdocuments.net/reader036/viewer/2022070411/56649f435503460f94c62b81/html5/thumbnails/47.jpg)
Acknowledgements
• Dr. Steffen Heber
• Dr. Eric A. Stone
• Dr. Zhao-Bang Zeng
• Dr. Barbara Sherry
• Sihui Zhao
• Li Zhang
• Hyunmin Kim
THANK YOU