20091110 technical seminar chip-seq data analysis
Tools and challenges for ChIP-seq data analysis
Alba Jené Sanz
Biomedical Genomics Lab (UPF)
Overview
1. ChIP-seq – The basics
2. Typical pipeline
3. Challenges in ChIP-seq data analysis
4. To take into account
5. Available tools
6. Analysis example
7. Future Challenges
8. Where to look for help
1. ChIP-seq – The Basics
ChIP-on-chip
ChIP-seq
Bioinformatics
[Figure: 500 bp fragment, 35 bp reads]
2. Typical pipeline
Bowtie
MACS
CEAS
Mapping…
Peak calling…
Unique / multiple locations
Allowing mismatches – seed sequence
Balance accuracy / performance
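The mapping trade-offs listed above can be illustrated with a toy sketch. This is not how Bowtie works internally; it only shows how a mismatch limit on the seed (the first bases of the read) changes whether a read maps to a unique location or to several. The function names, the seed length and the candidate sequences are all hypothetical.

```python
def seed_mismatches(read, ref_window, seed_len):
    """Count mismatching bases within the first seed_len positions."""
    return sum(1 for r, g in zip(read[:seed_len], ref_window[:seed_len]) if r != g)


def unique_hit(read, candidates, seed_len=20, max_seed_mismatches=2):
    """Return the only acceptable position, or None if there are zero or several."""
    accepted = [
        pos
        for pos, ref_window in candidates.items()
        if seed_mismatches(read, ref_window, seed_len) <= max_seed_mismatches
    ]
    return accepted[0] if len(accepted) == 1 else None


# Toy data: one read and three made-up genomic windows (position -> sequence).
read = "ACGTACGTAC"
candidates = {
    100: "ACGTACGTAC",  # perfect match
    500: "ACGAACGTAC",  # one mismatch in the seed
    900: "TTTTACGTAC",  # three mismatches in the seed
}

# Strict setting: only the perfect match is accepted, so the read is unique.
print(unique_hit(read, candidates, seed_len=8, max_seed_mismatches=0))
# Lenient setting: two locations are accepted, so the read is ambiguous.
print(unique_hit(read, candidates, seed_len=8, max_seed_mismatches=1))
```

Allowing more mismatches recovers more reads but also turns some unique reads into multi-mapping ones, which is the accuracy/performance balance mentioned above.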
3. Challenges in ChIP-seq data analysis
Millions of short reads that need fast mapping to the genome (allowing
mismatches or gaps; performance issues)
Peak detection – find the exact binding site
Data normalization – compare results, background noise
Visualization – thousands of enriched regions. UCSC, JBrowse…
4. To take into account
Transcription Factors vs Nucleosomes / Histone modifications
Control available?
Sequencing depth bias in Control vs IP
Different alignment methods produce different peak-calling results, but the difference is
smaller than the one caused by using a different peak caller or replicate
Many differences between peak callers can be explained by the different thresholds used
Some peak callers may be specific to some data types
Consistency between replicates may be used to set the threshold when replicates are available
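As a rough illustration of the last point, the sketch below measures how many peaks from one replicate are also found in the other at a given score threshold. It is only one possible way to quantify consistency; the peak tuples, scores and function names are made up for the example.

```python
def overlaps(a, b):
    """True if two (chrom, start, end, score) peaks share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_reproduced(peaks_a, peaks_b, min_score):
    """Fraction of replicate A peaks scoring >= min_score that overlap a replicate B peak."""
    kept = [p for p in peaks_a if p[3] >= min_score]
    if not kept:
        return 0.0
    hits = sum(1 for p in kept if any(overlaps(p, q) for q in peaks_b))
    return hits / float(len(kept))


# Hypothetical peaks: (chrom, start, end, score)
rep_a = [("chr1", 100, 200, 80.0), ("chr1", 500, 600, 55.0), ("chr2", 100, 200, 52.0)]
rep_b = [("chr1", 150, 250, 70.0), ("chr2", 300, 400, 60.0)]

for threshold in (50, 60):
    print(threshold, fraction_reproduced(rep_a, rep_b, threshold))
```

Scanning thresholds and picking the lowest one at which most peaks are reproduced is one simple way to let replicates guide the cutoff.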
There are many tools for the analysis of ChIP-seq data, but no standards yet
5. Available tools
Uses regional averaging to mitigate sample fluctuations in the control library
Uses the control to model the read distribution across the genome with a Poisson distribution (background, BG). After identifying candidate peaks significantly enriched over the BG, a local lambda is estimated using windows around each peak to eliminate local biases
Open-source, open to contributions (Artistic License) and being actively improved
Easy to use and fast-responding developers
Compares very well to other methods
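The Poisson test with a local lambda described above can be sketched in a few lines. This is a simplified illustration, not the tool's actual code: the window sizes, the read counts and the choice of taking the maximum of the background and window rates are assumptions made for the example.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed directly over the upper tail."""
    if k <= 0:
        return 1.0
    # First term P(X = k), computed in log space to avoid overflow.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while term > total * 1e-16:
        total += term
        i += 1
        term *= lam / i
    return min(total, 1.0)


def local_lambda(lambda_bg, peak_width, control_counts):
    """Expected reads in the peak: the largest of the genome-wide and local rates.

    control_counts maps a window size in bp to the control reads seen in that
    window around the peak (hypothetical input layout).
    """
    rates = [count * peak_width / float(window)
             for window, count in control_counts.items()]
    return max([lambda_bg] + rates)


# Hypothetical candidate peak: 12 IP reads in a 200 bp region.
lam = local_lambda(lambda_bg=1.0, peak_width=200,
                   control_counts={1000: 10, 5000: 20})
print("local lambda:", lam)
print("p-value with global background:", poisson_upper_tail(12, 1.0))
print("p-value with local lambda:", poisson_upper_tail(12, lam))
```

A region with a locally high control signal gets a larger lambda and therefore a less significant p-value, which is how local biases are kept from producing false peaks.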
lane5_SNAIL_F9_qseq.txt
SOLEXA 90320 5 1 0 476 0 1 .ACGGGGGAGGG.C...CAAC..A...C............ BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
SOLEXA 90320 5 1 0 1222 0 1 .AATTGAAAAAT.A..TTTAA..G...A............ DO[[XVX[BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
SOLEXA 90320 5 1 0 133 0 1 .CCAGTCTATTAATT.TTGCC..GA..C............ DPXXXYYYYYYBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
SOLEXA 90320 5 1 0 145 0 1 .ATTGTTTCTGACTA.TTGAT..GC..T............ BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
SOLEXA 90320 5 1 0 153 0 1 .ACCGCTATCAGTAC.TAGCT..GT..A............ DMUYUVWBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
SOLEXA 90320 5 1 0 215 0 1 .TGTTGCCATTGCTA.AGGCA..GT..T............ DOVWBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
SOLEXA 90320 5 1 1 827 0 1 AGGAGATCGGCCGGTTGATGAGCCGAGTG........... \Z__U_]PXYXTGRZ]QXBBBBBBBBBBBBBBBBBBBBBB 0
SOLEXA 90320 5 1 1 56 0 1 AAAAATCGACGCTCAAGTCAGAGGTGGCG........... ababba`\_aabaab`U\ba\BBBBBBBBBBBBBBBBBBB 1
SOLEXA 90320 5 1 1 925 0 1 TGCAGCACTGGGGCCAGATGGTAAGCCCT........... _Z_\T]`\]M]OLP^^\[`WBBBBBBBBBBBBBBBBBBBB 1
SOLEXA 90320 5 1 2 1637 0 1 GGGCTTCTGCCCCGGTGGGTACATGAGTA........... aaa`a`a`aa`aX_`^^\^[``BBBBBBBBBBBBBBBBBB 1
QSEQ files (Solexa's FASTQ with ASCII Phred64, the 3rd FASTQ type)
@3_1_3_89
CACAGTGTCCTCCAGGTTCATCCC................
+3_1_3_89
abbab^aaaaaaa\VUVbBBBBBBBBBBBBBBBBBBBBBB
@3_1_3_762
GCAAACAAATGGCGGAAAGCGGCG................
+3_1_3_762
aabba`b`a`]]TXOY`BBBBBBBBBBBBBBBBBBBBBBB
@3_1_90_512
GCTGAGGCAGGAGAATTGCTTGAACTGGGAAGGCAGAGGT
+3_1_90_512
ab`a_a`X``WTGW]T]S\Z]T[aXa_T^]XP\]\H_VXY
@3_1_90_1028
GTTACGGCTTATCCTGCACATTACGACCGTTTGCGTAACG
+3_1_90_1028
`bba`X_^ab_aWS`_\b[`aa_]TZ^VY\a`VW^`^b`a
@3_1_90_1651
TAATTTTAGATTTTATCCTTGACATTGTAAATATTACATT
+3_1_90_1651
aUVF[aa\`VU_`aaU[__aaa\YV^aP`aQU\a`_^\_a
@3_1_90_1670
@FC30C11AAXX:8:1:1649:1790
GAAAAGTATTTGCAATTTGTTGCCTCTCATCCAAGAATGAAATTCCTATTG
+
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<:::::6776666666566663
@FC30C11AAXX:8:1:1655:1811
GAAGCAGAAGCCTATATACCCTGTAGAACTGGGAGCCAATTAAACCTCTTT
+
<<<<<<<<<<<<<<<<<<5<<<<><<:>:<<;<:;3665537.6+6.33.+
@FC30C11AAXX:8:1:1609:1848
GATGTGGTTTCACATAAATTGACATATATAGTTCCAGGCTGTAAATGTTGT
+
<<;<<;;+<<7:<<<<<<7::7:<<<<<<:7,777402-4.-+*20+0-%-
@FC30C11AAXX:8:1:1667:1880
GTTTTATACAAATCAAAACCATAGTGAGATACCATCTCACACTAGTCAGAA
+
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<:::::6767366767357566
@FC30C11AAXX:8:1:1577:1853
GTGATGGGAGGAAAGCTAGGGGGCTATAATGTCTATTACAAGGCTCAGTAG
+
<6<<<<<<<<<<<<<<<<<<<<<<<<<<<<:99:9646066,6+0604044
6. Analysis example
Filter qualities and parse
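A minimal sketch of what this step might look like, judging only from the sample files in this example: reads are named lane_tile_x_y, trailing uncalled bases ('.') are trimmed, and Phred64 qualities are shifted to Phred33 (subtracting 31 from each character code). Dropping reads whose last column is not 1 is an assumption, and the function name is hypothetical; the script actually used in the seminar is not shown.

```python
PHRED64_TO_PHRED33 = 64 - 33  # shift each quality character down by 31


def qseq_to_fastq(line, keep_failed=False):
    """Convert one qseq line to a FASTQ record, or return None if it is dropped."""
    fields = line.split()
    lane, tile, x, y = fields[2:6]
    seq, qual, passed = fields[8], fields[9], fields[10]
    # Assumption: the last column is the quality filter flag (1 = passed).
    if passed != "1" and not keep_failed:
        return None
    seq = seq.rstrip(".")  # '.' marks an uncalled base; only the tail is trimmed
    if not seq:
        return None
    qual = "".join(chr(ord(c) - PHRED64_TO_PHRED33) for c in qual[:len(seq)])
    name = "_".join((lane, tile, x, y))
    return "@{0}\n{1}\n+\n{2}\n".format(name, seq, qual)


# One line taken from lane5_SNAIL_F9_qseq.txt above.
line = (r"SOLEXA 90320 5 1 2 1637 0 1 "
        r"GGGCTTCTGCCCCGGTGGGTACATGAGTA........... "
        r"aaa`a`a`aa`aX_`^^\^[``BBBBBBBBBBBBBBBBBB 1")
print(qseq_to_fastq(line))
```

For this read the converted quality string is the same one that appears for 5_1_2_1637 in the Bowtie output of this example.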
BOWTIE
SNAIL_F9.bwt
5_1_0_1409 + gi|51511750|ref|NC_000021.7|NC_000021 34604194 AGTTGCACCTTTAACAATTTCCCAT %/6::9::;;;;7279######### 0 17:G>T,24:G>T
5_1_0_811 + gi|89161218|ref|NC_000023.9|NC_000023 77246408 TTCTGCAAGCCTCCGGAGCGCACGTG BBB@5<?=9<9@>96/:0######## 0 25:C>G
5_1_1_1665 + gi|89161199|ref|NC_000002.10|NC_000002 201785208 GCCCAGCTGTCACTGTGGTTTTGATTTGC BBCCCBBBCBBB@BABBBACCA####### 0
5_1_2_1637 + gi|51511731|ref|NC_000015.8|NC_000015 92942360 GGGCTTCTGCCCCGGTGGGTACATGAGTA BBBABABABBAB9@A??=?<AA####### 0
5_1_2_1359 + gi|89161205|ref|NC_000003.10|NC_000003 101351498 CAATTCCCTCCTTGAAAGGCTCCTCCACC BCCBBBBAAAABA9@B?59@ABA###### 0
5_1_2_730 - gi|51511721|ref|NC_000005.8|NC_000005 1314600 GGACTTCCATGCAAACAAGCTGCTTTCCA ########BB>9@B@;@B<;??ABCBABB 0
5_1_2_1118 - gi|89161213|ref|NC_000007.12|NC_000007 157199758 CATCTTTGATGAGTTACTACCTGTGGGGT ########@B@?=B@;8@659@@BAABAB 0
5_1_3_920 + gi|51511727|ref|NC_000011.8|NC_000011 133317176 GGTAGACTCACAAAACTACCAAAGTCCTCTAC ABABAABCBBBBCBCBCBBBCCA>@@###### 0
5_1_3_971 + gi|89161190|ref|NC_000012.10|NC_000012 7497006 TTTTCATGCAGCCCGAGACATCAAGCTAGCAG B@86646330/250################## 0 31:T>G
SNAIL_F9.bwt.bed
chr21 34604194 34604219 5_1_0_1409 . +
chr23 77246408 77246434 5_1_0_811 . +
chr02 201785208 201785237 5_1_1_1665 . +
chr15 92942360 92942389 5_1_2_1637 . +
chr03 101351498 101351527 5_1_2_1359 . +
chr05 1314600 1314629 5_1_2_730 . -
chr07 157199758 157199787 5_1_2_1118 . -
chr11 133317176 133317208 5_1_3_920 . +
chr12 7497006 7497038 5_1_3_971 . +
chr01 201404048 201404081 5_1_3_1986 . +
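The conversion from the Bowtie output to BED shown above can be sketched as follows. The rules are inferred from the two sample files only: the chromosome label is built from the last two digits of the RefSeq accession (NC_000021 becomes chr21), the start is the reported offset and the end is the start plus the read length. The function name is hypothetical and the original parsing script may differ.

```python
def bowtie_to_bed(line):
    """Turn one line of Bowtie's default output into a 6-column BED line."""
    fields = line.split()
    name, strand, ref, offset, seq = fields[:5]
    accession = ref.split("|")[-1]   # e.g. "NC_000021"
    chrom = "chr" + accession[-2:]   # naming convention assumed from the sample
    start = int(offset)
    end = start + len(seq)
    return "\t".join([chrom, str(start), str(end), name, ".", strand])


# First line of SNAIL_F9.bwt above.
line = ("5_1_0_1409 + gi|51511750|ref|NC_000021.7|NC_000021 34604194 "
        "AGTTGCACCTTTAACAATTTCCCAT %/6::9::;;;;7279######### 0 17:G>T,24:G>T")
print(bowtie_to_bed(line))
```

Mapping accessions to chromosome names this way only works for the numbered accessions seen here; a lookup table would be safer for a real genome.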
Parsing
MACS
MACS pipeline
Output:
- Peak locations in BED and XLS format (genome browser)
- Tag count in wiggle format (genome browser)
- Bimodal model in R scripts
[Figure panels: PolII, H3K27me3]
snail_mfold_15_MACS.wig
track type=wiggle_0 name="MACS_counts_after_shifting" description="Shifted Merged MACS tag counts for every 10 bp"
variableStep chrom=chr10 span=10
85171 1
85181 1
85191 1
85201 1
85211 1
85221 1
85231 2
85371 2
snail_mfold_15_tsize41_newbwt_peaks.bed
track name="MACS peaks for snail_mfold_15_tsize41_newbwt"
chr1 559644 559924 MACS_peak_1 79.29
chr1 2435221 2435542 MACS_peak_2 51.58
chr1 14624217 14624571 MACS_peak_3 66.12
chr1 15610639 15611000 MACS_peak_4 56.69
chr1 16822564 16822753 MACS_peak_5 52.84
chr1 18411948 18412187 MACS_peak_6 82.46
chr1 22857612 22857985 MACS_peak_7 88.74
chr1 27541904 27542134 MACS_peak_8 69.47
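To work with the wiggle file outside a genome browser, the variableStep records can be turned into intervals. The sketch below assumes the track line sits on a single line and relies on the wiggle convention that positions are 1-based; it yields 0-based, half-open intervals as used in BED. The function name is hypothetical.

```python
def parse_variable_step(lines):
    """Yield (chrom, start, end, value) from variableStep wiggle lines."""
    chrom, span = None, 1
    for line in lines:
        line = line.strip()
        if not line or line.startswith("track"):
            continue
        if line.startswith("variableStep"):
            # Declaration line, e.g. "variableStep chrom=chr10 span=10"
            params = dict(item.split("=", 1) for item in line.split()[1:])
            chrom = params["chrom"]
            span = int(params.get("span", 1))
            continue
        position, value = line.split()
        start = int(position) - 1  # wiggle is 1-based, BED is 0-based
        yield chrom, start, start + span, float(value)


wig = [
    "variableStep chrom=chr10 span=10",
    "85171 1",
    "85231 2",
    "85371 2",
]
for interval in parse_variable_step(wig):
    print(interval)
```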
CEAS
Input:
- BED format peak locations
- Optional signal profile in wiggle format
- BED format extra regions of interest
CEAS output
7. Future challenges
Re-analyze data with new algorithms – sequences remain the same
ChIP-seq combined with Chromatin Conformation Capture (3C) – long-range physical interactions
Technical improvements: RNA-seq will benefit from longer reads
Integrated computational analyses – integration of TF, histone marks, methylation, polymerase loading to predict regulatory output
8. Where to look for help...
Seqanswers.com
Google groups, mailing lists of each project: MACS, CEAS, FindPeaks
Lab mates!