ChIP-Seq: TB Example
James Galagan
Acknowledgements
Broad, BU (Seq and Analysis)
Brian Weiner, Matt Petersen, Desmond Lun
SBRI (ChIP)
Kyle Minch, Tige Rustad
David Sherman
ChIP-Seq
Immuno-precipitation (IP)
Control (no IP)
Sequence Fragments
Align to Genome
Look for Enrichment
Target Site
IP
Control
(ChIP-chip) (ChIP-Seq)
Dormancy Regulon
Genes induced >2-fold after a 2-h shift from ambient to 0.2% O2.
~100 genes modulated
47 upregulated = dormancy regulon
Sherman et al., 2001
DosR Confirmation
Hypoxic Response Depends on DosR
ORF          Gene   Rv            ΔdosR       Gene product
*Rv0079             22.2 ± 6.9    0.9 ± 0.1   HP
Rv0080              7.8 ± 1.7     0.9 ± 0.1   HP
Rv0081              4.9 ± 1.2     0.5 ± 0.1   Transc. regulator
Rv0082              3.1 ± 0.4     0.6 ± 0.1   Prob. oxidored. sub.
Rv0083              2.1 ± 0.4     1.1 ± 0.2   Prob. oxidored. sub.
Rv0569              9.0 ± 4.3     0.9 ± 0.1   CHP
Rv0570       nrdZ   5.5 ± 3.0     1.1 ± 0.1   Ribonuc. red. cl. II
*Rv0571c            1.7 ± 0.5     1 ± 0.2     CHP
Rv0572c             3.0 ± 0.8     1 ± 0.1     HP
*Rv0574c            2.0 ± 0.5     1.1 ± 0.1   CHP
MT0639              2.0 ± 0.4     1 ± 0.2     HP
***Rv1733c          4.1 ± 1.3     ND          Poss. mem. prot.
*Rv1734c            1.5 ± 0.1     1 ± 0.2     HP
Rv1736c      narX   3.7 ± 0.7     1 ± 0.2     Fused nitrate red.
***Rv1737c   narK2  8.5 ± 2.0     1.1 ± 0.2   Nitrite extrus. prot.
***Rv1738           22.8 ± 9.7    1.3 ± 0.1   CHP
Rv1812c             3.6 ± 0.6     1 ± 0.1     HP
*Rv1813c            11.4 ± 3.0    0.8 ± 0.2   CHP
*Rv1996             7.9 ± 4.6     0.7         CHP
Rv1997       ctpF   4.3 ± 2.2     0.8 ± 0.1   Cation trans. ATPase
Rv2003c             2.3 ± 0.6     0.9 ± 0.1   CHP
Rv2004c             4.2 ± 1.4     1.2 ± 0.2   HP
*Rv2005c            7.3 ± 3.7     1.3 ± 0.5   CHP
*Rv2006      otsB   2.2 ± 0.9     0.8 ± 0.1   Trehalose phos.
*Rv2007c     fdxA   25.9 ± 3.3    0.8 ± 0     Ferredoxin
Rv2028c             6.0 ± 1.7     0.9 ± 0.2   CHP
Rv2029c      pfkB   13.3 ± 5.7    0.9 ± 0.1   Phosphofruct. II
Rv2030c             27.3 ± 6.3    ND          CHP
**Rv2031c    acr    27.9 ± 7.6    ND          α-Crystallin
**Rv2032            15.1 ± 5.0    ND          CHP
Rv2623              18.8 ± 4.1    ND          CHP
Rv2624c             3.9 ± 1.3     0.6         CHP
Rv2625c             3.0 ± 1.1     1.3 ± 0.1   CHP
**Rv2626c           24.5 ± 4.6    1.2 ± 0.1   CHP
**Rv2627c           12.4 ± 4.9    0.8         CHP
**Rv2628            13.6 ± 10.8   0.8 ± 0     HP
Rv2629              7.6 ± 7.4     1.4 ± 0.1   HP
Rv2630              6.5 ± 4.6     2 ± 0.5     HP
Rv2631              3.4 ± 2.1     1.4 ± 0.3   CHP
Rv2830c             2.6 ± 0.7     1.2 ± 0.1   HP
Rv3126c             1.7 ± 0.7     0.8 ± 0     HP
**Rv3127            17.4 ± 2.4    0.8 ± 0.1   CHP
Rv3128c             1.5 ± 0.5     0.8 ± 0.1   CHP
Rv3129              2.7 ± 1.3     0.6 ± 0.1   CHP
*Rv3130c            25.5 ± 9.4    ND          CHP
*Rv3131             34.1 ± 6.4    ND          CHP
Rv3132c             5.7 ± 1.1     0.8 ± 0.1   Sensor hist. kinase
Rv3133c      dosR   9.1 ± 3.3     1.1 ± 0.2   Two-comp. resp. reg.
**Rv3134c           22.2 ± 17.9   1.2 ± 0.2   CHP
Rv3841       bfrB   5.2 ± 1.9     2.0 ± 1.3   Bacterioferritin

The presence of one, two, or three motif sequences (matrix score > 9.5) upstream of a gene is indicated by *, **, or ***, respectively.
26 of 27 most induced genes depend on DosR
Park et al., 2003
DosR Binding Motif
• Computational identification
  – YMF to search promoters of hypoxic response genes
  – 5’-TTSGGGACTWWAGTCCCSAA-3’
• Experimental validation
  – Binds both copies of motif in acr promoter
  – Mutation abolishes binding and induction
Park et al., 2003
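
The consensus above uses IUPAC ambiguity codes (S = C or G, W = A or T). A minimal sketch of scanning a sequence for exact consensus matches is given below. The function names are illustrative, not from the slides, and this is a plain pattern search rather than the matrix scoring (score > 9.5) used by Park et al. Because the consensus equals its own reverse complement, a scan of one strand also covers the other.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

DOSR_CONSENSUS = "TTSGGGACTWWAGTCCCSAA"  # consensus quoted on the slide


def reverse_complement(motif):
    """Reverse complement of a sequence written with IUPAC codes."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def iupac_to_regex(motif):
    """Translate an IUPAC motif into a regular expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence, motif=DOSR_CONSENSUS):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead is used so that overlapping matches are also reported.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]
```

For example, `find_motif(upstream_sequence)` would list candidate site starts in a hypothetical `upstream_sequence` string; real sites are degenerate, so an exact consensus scan will miss weaker matches.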
DosR ChIP-Seq
• Native antibody to DosR
  – No tag
• DosR Control – no target for antibody
  – Control ChIP-Seq
• Runs at 2, 4, and 8 hours
DosR ChIP-Seq Replicates
ChIP-Seq for TB
DosR transcription factor binding at 4 hours post hypoxia
IP Enrichment
Known DosR Regulated Genes
hspX
Rv1733c
Desmond Lun, Kyle Minch
DosR Binding (2 hours-IP channel)
[Figure: forward and reverse reads at Rv1733c, Rv1737c, and Rv1738]
Window Enrichment Analysis
1. Divide genome into non-overlapping bins
2. Take reads from ChIP and control libraries
3. Calculate log-likelihood ratio for independence (based on chi-square)
[Figure: genome divided into consecutive bins numbered 1, 2, 3, …, 12, …]
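
The three steps above can be sketched as follows. This is a minimal illustration, not the analysis code used for the DosR data: it assumes reads are given as 0-based start positions, builds a 2x2 table per bin (reads inside vs. outside the bin, ChIP vs. control), and computes the G statistic, a log-likelihood ratio for independence that is approximately chi-square distributed with one degree of freedom. The function names and the signed-score convention are assumptions made for the example.

```python
import math


def bin_counts(positions, genome_length, bin_size):
    """Count reads per non-overlapping bin (positions are 0-based)."""
    n_bins = -(-genome_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        counts[pos // bin_size] += 1
    return counts


def g_statistic(chip_in, chip_total, ctrl_in, ctrl_total):
    """Log-likelihood ratio (G) test of independence on a 2x2 table."""
    observed = [[chip_in, chip_total - chip_in],
                [ctrl_in, ctrl_total - ctrl_in]]
    row_sums = [sum(row) for row in observed]
    col_sums = [observed[0][j] + observed[1][j] for j in range(2)]
    n = sum(row_sums)
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = row_sums[i] * col_sums[j] / n
                g += o * math.log(o / expected)
    return 2.0 * g


def window_enrichment(chip_positions, ctrl_positions, genome_length, bin_size):
    """Signed G score per bin: positive when the ChIP library is enriched."""
    chip = bin_counts(chip_positions, genome_length, bin_size)
    ctrl = bin_counts(ctrl_positions, genome_length, bin_size)
    chip_total, ctrl_total = sum(chip), sum(ctrl)
    scores = []
    for c, k in zip(chip, ctrl):
        g = g_statistic(c, chip_total, k, ctrl_total)
        enriched = c * ctrl_total > k * chip_total  # compare read fractions
        scores.append(g if enriched else -g)
    return scores
```

Comparing read fractions rather than raw counts keeps the score insensitive to differences in sequencing depth between the two libraries.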
DosR ROC Curve
[Figure: ROC curve. x-axis: False Positive Rate (0 to 1); y-axis: True Positive Rate (0 to 1)]
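
A ROC curve of this kind can be computed from per-bin enrichment scores and a set of positive labels. The sketch below assumes each bin carries a score and a 0/1 label (for example, whether it overlaps a previously known DosR target); the slide does not state how positives were defined, so that labeling is an assumption, as are the function names.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points.

    Thresholds are swept from the highest score down; tied scores are
    processed together so they produce a single point.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for _, y in pairs if y)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```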
Combined DosR Network
Matt Petersen, Brian Weiner
Previously known (from Park et al.)
  Green dashed: Park, CLR, and ChIP-Seq
  Green: CLR and Park et al.
  Black dashed: ChIP-Seq only
  Black: Not CLR or ChIP-Seq
New predictions (not in Park)
  Red dashed: CLR and ChIP-Seq
  Red: CLR
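
The legend amounts to a small decision table over three evidence sources. A minimal sketch of that mapping, with a hypothetical function name and boolean flags chosen for the example:

```python
def edge_style(in_park, in_clr, in_chipseq):
    """Map evidence for a DosR-target edge to the legend's line style.

    Returns None for combinations the legend does not cover
    (not in Park et al. and not predicted by CLR).
    """
    if in_park:
        color = "green" if in_clr else "black"
    elif in_clr:
        color = "red"
    else:
        return None
    # In every legend entry, a dashed line marks ChIP-Seq support.
    return color + " dashed" if in_chipseq else color
```

Read this way, color encodes Park et al. and CLR support, and dashing encodes ChIP-Seq support.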
IP Coverage vs Induction in Park et al.
[Figure: scatter plot. x-axis: Park et al. Differential Expression (0 to 40); y-axis: Peak Height (Max Coverage) (0 to 1400)]
IP Coverage vs Motif Match
Peak Amplitude and Motif Match
[Figure: scatter plot. x-axis: ChIP-Seq Peak Amplitude (Enrichment over Control) (0 to 60); y-axis: Motif Fit to Logo (0 to 9), Increasing Match to Logo (Negative LL MEME)]
Location, Location, Location…
Valouev et al. (2008) Nature Methods
DosR Binding (2 hours-IP channel)
[Figure: forward and reverse reads at Rv1733c, Rv1737c, and Rv1738]
ChIP-Seq Blind Deconvolution
Impulse Function
Deconvolve
3 binding sites
Desmond Lun, Brian Weiner
single binding site
Binding Site Resolution
Desmond Lun
Fit enrichment curve to peak from putative single site
Fit other peaks to single-site enrichment curve
Refit peaks for 2, 3, etc. sites
Re-estimate enrichment curve from all predicted sites
Repeat
Reconstruct Promoter Architecture
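
The refitting step can be illustrated with a deliberately simplified sketch. It is not CSDeconv: the single-site enrichment curve is assumed known and Gaussian (the `kernel` function and its `width` are hypothetical), candidate positions come from a fixed grid, only one or two sites are fitted, and amplitudes are found by ordinary least squares without a non-negativity constraint. The real procedure also re-estimates the enrichment curve, which is omitted here.

```python
import math
from itertools import combinations


def kernel(offset, width=20.0):
    """Assumed single-site enrichment curve (Gaussian shape, for illustration)."""
    return math.exp(-0.5 * (offset / width) ** 2)


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_amplitudes(xs, ys, positions):
    """Least-squares amplitudes for one or two fixed site positions."""
    cols = [[kernel(x - p) for x in xs] for p in positions]
    if len(cols) == 1:
        return [_dot(cols[0], ys) / _dot(cols[0], cols[0])]
    k1, k2 = cols
    a11, a12, a22 = _dot(k1, k1), _dot(k1, k2), _dot(k2, k2)
    b1, b2 = _dot(k1, ys), _dot(k2, ys)
    det = a11 * a22 - a12 * a12  # 2x2 normal equations, Cramer's rule
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


def residual(xs, ys, positions, amps):
    """Sum of squared errors between the coverage and the fitted model."""
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(a * kernel(x - p) for p, a in zip(positions, amps))
        total += (y - pred) ** 2
    return total


def best_fit(xs, ys, candidates, n_sites):
    """Try every combination of n_sites (1 or 2) candidate positions."""
    best = None
    for positions in combinations(candidates, n_sites):
        amps = fit_amplitudes(xs, ys, positions)
        err = residual(xs, ys, positions, amps)
        if best is None or err < best[0]:
            best = (err, positions, amps)
    return best
```

Comparing the residual of `best_fit(..., 1)` with that of `best_fit(..., 2)` gives a crude indication of whether a broad peak is better explained by two nearby sites than by one.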
CSDeconv on DosR data
Peak ID | Position | Amplitude | Position of motif match | Difference | Absolute difference | Location
1 88094.9 2.5 88124.5 -29.6 29.6 Upstream of Rv0079
2 665845.8 7.4 665858.5 -12.7 12.7 In Rv0573c
3 668495.8 2.7 668499.5 -3.7 3.7 Upstream of Rv0574c
4 1639601.5 7.4 1639626.5 -25.0 25.0 In Rv1453
5 1960517.8 13.1 1960519.5 -1.7 1.7 Upstream of Rv1733c
6 1960611.5 25.5 1960623.5 -12.0 12.0 Upstream of Rv1733c
7 1960692.3 10.4 1960692.5 -0.2 0.2 Upstream of Rv1733c
8 1965458.4 10.5 1965470.5 -12.1 12.1 Upstream of Rv1737c
9 1965543.4 14.4 1965532.5 10.9 10.9 Upstream of Rv1737c
10 2056358.2 3.1 2056374.5 -16.3 16.3 Upstream of Rv1813c, Rv1814
11 2238942.3 9.5 2238937.5 4.8 4.8 Upstream of Rv1996
12 2256458.2 12.9 2256495.5 -37.3 37.3 Upstream of Rv2007c
13 2278996.3 39.4 2279004.5 -8.2 8.2 Upstream of Rv2031c, Rv2032
14 2279046.9 22.8 2279061.5 -14.6 14.6 Upstream of Rv2031c, Rv2032
15 2949477.7 6.6 2949471.5 6.2 6.2 Upstream of Rv2623
16 2953045.1 8.3 2953073.5 -28.4 28.4 Upstream of Rv2626c
17 2954749.8 5.2 2954791.5 -41.7 41.7 Upstream of Rv2627c, Rv2628
18 2955065.3 9.5 2955030.5 34.8 34.8 Upstream of Rv2627c, Rv2628
19 2955479.4 9.4 2955475.5 3.9 3.9 Upstream of Rv2629
20 3492068.8 12.5 3492091.5 -22.7 22.7 In Rv3126c
21 3496439.2 50.6 3496450.5 -11.3 11.3 Upstream of Rv3130c, Rv3131
22 3500822.9 3.2 3500831.5 -8.6 8.6 Upstream of Rv3134c
• Identifies a total of 22 binding sites
• All have sequences that match a motif resembling that previously identified by Park et al.
• Motif recovered:
Park et al. (2003) Mol Microbiol. Rv3133c/dosR is a transcription factor that mediates the hypoxic response of Mycobacterium tuberculosis.
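
The Difference column in the CSDeconv table is the estimated peak position minus the position of the motif match. A short sketch that recomputes it from the 22 rows and summarizes the offsets is given below; the variable names are illustrative, and the summary statistics are a convenience for the reader rather than figures quoted on the slides.

```python
from statistics import median

# (estimated peak position, position of motif match) for peaks 1 to 22,
# copied from the CSDeconv table.
PEAKS = [
    (88094.9, 88124.5), (665845.8, 665858.5), (668495.8, 668499.5),
    (1639601.5, 1639626.5), (1960517.8, 1960519.5), (1960611.5, 1960623.5),
    (1960692.3, 1960692.5), (1965458.4, 1965470.5), (1965543.4, 1965532.5),
    (2056358.2, 2056374.5), (2238942.3, 2238937.5), (2256458.2, 2256495.5),
    (2278996.3, 2279004.5), (2279046.9, 2279061.5), (2949477.7, 2949471.5),
    (2953045.1, 2953073.5), (2954749.8, 2954791.5), (2955065.3, 2955030.5),
    (2955479.4, 2955475.5), (3492068.8, 3492091.5), (3496439.2, 3496450.5),
    (3500822.9, 3500831.5),
]

# Difference = peak position minus motif position, rounded to 0.1 bp as in the table.
offsets = [round(peak - motif, 1) for peak, motif in PEAKS]
abs_offsets = [abs(d) for d in offsets]

median_abs_offset = median(abs_offsets)            # typical localization error
max_abs_offset = max(abs_offsets)                  # worst case among the 22 peaks
n_within_20bp = sum(1 for d in abs_offsets if d <= 20.0)
```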
Novel dosR Binding Site?
[Figure labels: Rv2627c, Rv2628, Rv2629, dosR Binding Site]
Getting Induction – Inducible Promoter System
Tagged Tet-Promoter Construct
• Named pEXNF-xxxx
• EX = expression vector
• NF = N-terminal FLAG tag
• xxxx = Rv number for gene of interest
• pEXNF-3133c used in subsequent slides
Gateway recombination sites
Tet Operator
FLAG tag
Gene of Interest
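
The naming convention is regular enough to handle programmatically, for example when tracking many constructs. A small sketch follows; the helper names and the accepted Rv-number pattern (four digits plus an optional letter suffix) are assumptions for illustration.

```python
import re

# Assumed pattern: "pEXNF-" followed by four digits and an optional letter.
_CONSTRUCT = re.compile(r"^pEXNF-(\d{4}[A-Za-z]?)$")


def construct_name(rv_number):
    """Build a construct name from an Rv number such as 'Rv3133c'."""
    suffix = rv_number[2:] if rv_number.startswith("Rv") else rv_number
    name = "pEXNF-" + suffix
    if not _CONSTRUCT.match(name):
        raise ValueError("unexpected Rv number: " + rv_number)
    return name


def gene_of_interest(name):
    """Recover the Rv number from a construct name such as 'pEXNF-3133c'."""
    match = _CONSTRUCT.match(name)
    if match is None:
        raise ValueError("not a pEXNF construct name: " + name)
    return "Rv" + match.group(1)
```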
Episomal vs KO Background
Transcription Factor Gene
Tet Promoter
Epitope Tag
Genome
X
Transcription Factor Protein
Target Promoter Region
Tet
Induction Profiles – dosR background
Episomal vs WT Background
Transcription Factor Gene
Tet Promoter
Epitope Tag
Genome
Transcription Factor Protein
Target Promoter Region
??
Questions
• To drive or not to drive
  – Tet can drive
  – TF X – may not know the condition
• Can we know
  • Drugs
  • Lipids
• Induce
  – qPCR – RNA to transcriptomics (SBRI)
  – Crosslink/save lysate for western debug if necessary
  – ChIP
  – Quantify that we have DNA and QC – SBRI/BU
  – Library prep (at BU – Chris Mahwinney)
  – Multiplex (10x) Solexa
Induction Profiles – WT background
Assaying for Tag/Untagged Ratio
Transcription Factor Gene
Tet Promoter
Epitope Tag
Genome
Transcription Factor Protein
Target Promoter Region
tagged mRNA
untagged mRNA
qPCR mRNA levels
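
One common way to turn qPCR readings into a tagged/untagged ratio is to compare threshold cycles (Ct). The sketch below assumes separate primer sets for the tagged and untagged transcripts, equal amplification efficiency for both, and perfect doubling per cycle by default; the slides do not specify the quantification method, so this is an illustration only.

```python
def tagged_untagged_ratio(ct_tagged, ct_untagged, efficiency=2.0):
    """Estimate tagged/untagged mRNA ratio from qPCR threshold cycles.

    A lower Ct means more template, so the ratio is
    efficiency ** (ct_untagged - ct_tagged). Assumes both primer sets
    amplify with the same efficiency (2.0 = perfect doubling per cycle).
    """
    return efficiency ** (ct_untagged - ct_tagged)
```

For instance, a tagged transcript that crosses threshold three cycles earlier than the untagged one would be estimated at eight times its abundance under these assumptions.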