![Page 1: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/1.jpg)
Signal processing
![Page 2: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/2.jpg)
Example data – ChIP-Seq
T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross, "Four histone variants mark the boundaries of polycistronic transcription units in Trypanosoma brucei" , Genes Dev. 23 (2009) 1063-1076.
![Page 3: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/3.jpg)
α-factor release
Example Data: Time-Resolved ChIP-chip
Chromosome 16
M.D. Sekedat, D. Fenyö, R.S. Rogers, A.J. Tackett, J.D. Aitchison, B.T. Chait, "GINS motion reveals replication fork progression is remarkably uniform throughout the yeast genome", Mol Syst Biol. 6 (2010) 353.
![Page 4: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/4.jpg)
Example data – MALDI-TOF
[Figure: five MALDI-TOF spectra (intensity vs m/z) shown over the m/z ranges 1000 to 4500, 2280 to 2400, 1300 to 1460, 1444.0 to 1458.0, and 2378.0 to 2394.0]
Peptide intensity vs m/z
![Page 5: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/5.jpg)
Example data – ESI-LC-MS/MS
Peptide intensity vs m/z vs time
Fragment intensity vs m/z
[Figure: LC-MS map (time vs m/z) and an MS/MS spectrum of an [M+2H]2+ precursor; axes: % relative abundance (0 to 100) vs m/z (250 to 1000); labeled peaks at m/z 260, 292, 389, 405, 504, 534, 633, 663, 762, 778, 875, 907, 1020, 1022, 1080]
![Page 6: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/6.jpg)
Example Data: Super-Resolution Microscopy
Dylan Reid and Eli Rothenberg
![Page 7: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/7.jpg)
Sine
[Figure: sine wave annotated with amplitude and wavelength; right triangle with sides a, b and hypotenuse c]
$$\sin(\theta) = a/c$$
![Page 8: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/8.jpg)
Sine and Cosine
[Figure: right triangle with sides a, b and hypotenuse c]
$$\sin(\theta) = a/c$$
$$\cos(\theta) = b/c$$
![Page 9: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/9.jpg)
Two Frequencies
![Page 10: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/10.jpg)
Fourier Transform
$$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x \xi}\, dx$$
$$e^{2\pi i \theta} = \cos(2\pi\theta) + i \sin(2\pi\theta)$$
![Page 11: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/11.jpg)
Fourier Transform
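    from numpy import *

    # Two sine waves: a strong one and a weaker one at ten times the frequency
    x = 2.0*pi*arange(1000.0)/100000.0
    sin1 = sin(1000.0*x)
    sin2 = 0.2*sin(10000.0*x)
    sin12 = sin1 + sin2

    # Fourier transform of the real-valued signal
    fft12 = fft.rfft(sin12)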
Frequency
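To read the transform above as a spectrum, each output bin needs a frequency. The sketch below is one way to do that with `numpy.fft.rfftfreq`; the amplitude scaling (`2/n`) and the variable names `amplitude`, `freqs`, and `top2` are choices made for this illustration and are not taken from the slides.

```python
import numpy as np

n = 1000
x = 2.0 * np.pi * np.arange(n) / 100000.0
sin12 = np.sin(1000.0 * x) + 0.2 * np.sin(10000.0 * x)

# Amplitude spectrum, scaled so that a sine of amplitude A gives a bin value of A.
amplitude = 2.0 * np.abs(np.fft.rfft(sin12)) / n

# Frequency of each bin in cycles per sample (d=1.0 means one sample per step).
freqs = np.fft.rfftfreq(n, d=1.0)

# The two largest bins correspond to the two sine components.
top2 = np.sort(np.argsort(amplitude)[-2:])
print(top2, freqs[top2], amplitude[top2])
```

Because both sines complete a whole number of cycles in the 1000 samples, each one falls into a single bin; with a non-integer number of cycles the energy would leak into neighboring bins.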
![Page 12: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/12.jpg)
Inverse Fourier Transform
$$f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi)\, e^{2\pi i x \xi}\, d\xi$$
Frequency
![Page 13: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/13.jpg)
Inverse Fourier Transform
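    from numpy import *

    x = 2.0*pi*arange(1000.0)/100000.0
    sin1 = sin(1000.0*x)
    sin2 = 0.2*sin(10000.0*x)
    sin12 = sin1 + sin2
    fft12 = fft.rfft(sin12)

    # Inverse transform; the length is passed so that the original number of points is restored
    sin12_ = fft.irfft(fft12, len(sin12))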
Frequency
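A quick way to confirm that the forward and inverse transforms are consistent is to compare the reconstructed signal with the original. This is a minimal check; the names `sin12_back` and `max_error` are introduced here for illustration only.

```python
import numpy as np

x = 2.0 * np.pi * np.arange(1000.0) / 100000.0
sin12 = np.sin(1000.0 * x) + 0.2 * np.sin(10000.0 * x)

fft12 = np.fft.rfft(sin12)
# Passing n explicitly matters for odd-length signals, where the default would return n-1 points.
sin12_back = np.fft.irfft(fft12, n=len(sin12))

# Largest absolute difference between the original and the round-trip signal.
max_error = np.max(np.abs(sin12 - sin12_back))
print(max_error)
```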
![Page 14: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/14.jpg)
Inverse Fourier Transform
Frequency
![Page 15: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/15.jpg)
A Peak
[Figure: a peak (intensity) annotated with its centroid, full width at half maximum (FWHM), area, height, and maximum; descriptors listed: mean, variance, skewness, kurtosis]
![Page 16: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/16.jpg)
Mean and variance
A peak is defined by $f(x)$ and $\sum_x f(x) = 1$
Mean
$$\mu = \sum_x x\, f(x)$$
Variance
$$\sigma^2 = \sum_x (x - \mu)^2 f(x)$$
![Page 17: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/17.jpg)
Skewness and kurtosis
Skewness
$$\mathrm{skewness} = \sum_x (x - \mu)^3 f(x) / \sigma^3$$
Kurtosis
$$\mathrm{kurtosis} = \sum_x (x - \mu)^4 f(x) / \sigma^4 - 3$$
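The four moments can be computed directly from sampled peak data. The sketch below assumes the peak is given as intensities `f` on a grid `x` and normalizes `f` so that it sums to 1; the function name `peak_moments` and the test peak are hypothetical.

```python
import numpy as np

def peak_moments(x, f):
    """Mean, variance, skewness, and (excess) kurtosis of a sampled peak."""
    f = np.asarray(f, dtype=float)
    f = f / f.sum()  # normalize so that sum f(x) = 1
    mean = np.sum(x * f)
    var = np.sum((x - mean) ** 2 * f)
    sigma = np.sqrt(var)
    skewness = np.sum((x - mean) ** 3 * f) / sigma ** 3
    kurtosis = np.sum((x - mean) ** 4 * f) / sigma ** 4 - 3.0
    return mean, var, skewness, kurtosis

# Example: a Gaussian peak centered at 0.1 with standard deviation 0.05.
x = np.linspace(-1, 1, 1001)
f = np.exp(-(x - 0.1) ** 2 / (2 * 0.05 ** 2))
mean, var, skewness, kurtosis = peak_moments(x, f)
print(mean, var, skewness, kurtosis)
```

For a Gaussian the skewness and the kurtosis (with the 3 subtracted) both come out close to zero, so nonzero values indicate asymmetry or heavier tails.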
![Page 18: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/18.jpg)
A Gaussian Peak
    def gaussian(x, x0, s):
        # Unnormalized Gaussian with center x0, standard deviation s, and height 1
        return exp(-(x-x0)**2/(2*s**2))

    x = linspace(-1, 1, 1000)
    y = gaussian(x, 0, 0.1)
    ffty = fft.rfft(y)
Frequency
![Page 19: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/19.jpg)
A Gaussian Peak
Skewness = 0
Kurtosis = 0
$$\mathrm{FWHM} = 2\sqrt{2 \log 2}\,\sigma$$
$$\mathrm{area} = \mathrm{height} \cdot \sigma \sqrt{2\pi}$$
Frequency
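The FWHM and area relations for a Gaussian can be checked numerically. The sketch below measures both on a fine grid and compares them with the closed-form expressions; the grid size and the variable names are assumptions made for this example.

```python
import numpy as np

def gaussian(x, x0, s):
    return np.exp(-(x - x0) ** 2 / (2 * s ** 2))

s = 0.1
x = np.linspace(-1, 1, 20001)
y = gaussian(x, 0.0, s)

# FWHM measured on the grid: width of the region where y is at least half the height.
above = x[y >= y.max() / 2]
fwhm_measured = above[-1] - above[0]
fwhm_formula = 2 * np.sqrt(2 * np.log(2)) * s

# Area measured as the sum of the samples times the grid spacing.
dx = x[1] - x[0]
area_measured = y.sum() * dx
area_formula = y.max() * s * np.sqrt(2 * np.pi)

print(fwhm_measured, fwhm_formula)
print(area_measured, area_formula)
```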
![Page 20: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/20.jpg)
Peak with a longer tail
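$$f(x) = \frac{1}{\pi\gamma}\, \frac{1}{1 + \left( \frac{x - x_0}{\gamma} \right)^2}$$
$$\mathrm{FWHM} = 2\gamma$$
$$\mathrm{area} = \mathrm{height} \cdot \pi\gamma$$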
Frequency
![Page 21: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/21.jpg)
A skewed peak
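    from scipy.special import erf

    def pdf(x):
        # Standard normal probability density
        return 1/sqrt(2*pi) * exp(-x**2/2)

    def cdf(x):
        # Standard normal cumulative distribution
        return (1 + erf(x/sqrt(2))) / 2

    def skew(x, e=0, w=1, a=0):
        # Skew-normal density with location e, scale w, and shape a (a=0 gives a normal)
        t = (x-e) / w
        return 2 / w * pdf(t) * cdf(a*t)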
Frequency
![Page 22: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/22.jpg)
Normal noise
    x = linspace(-1, 1, 1000)
    y = 0.2*random.normal(size=len(x))
If the noise is not normally distributed, try to find a transform that makes it normal
Frequency
![Page 23: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/23.jpg)
Lognormal noise
    x = linspace(-1, 1, 1000)
    y = 0.2*random.lognormal(size=len(x))
Frequency
![Page 24: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/24.jpg)
Skewed noise
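    # Rejection sampling: keep uniform draws that fall under the normalized skewed curve
    x_test = random.uniform(-1.0, 1.0, size=10*len(x))
    y = random.uniform(0.0, 1.0, size=10*len(x))
    yskew = skew(x_test, -0.1, 0.2, 10)
    yskew = yskew / max(yskew)
    yn_skew = x_test[y < yskew][:len(x)]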
Frequency
![Page 25: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/25.jpg)
Gaussian peak with normal noise
Frequency
Frequency
Frequency
![Page 26: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/26.jpg)
Removing High Frequencies
Frequency
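One simple way to remove high frequencies is to zero the upper part of the spectrum and transform back. The sketch below applies this to a Gaussian peak with normal noise; the cutoff bin (`cutoff = 20`) is an arbitrary value chosen for this example, and an abrupt cutoff like this is only one of several possible low-pass filters.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 1000)
clean = np.exp(-x ** 2 / (2 * 0.1 ** 2))
noisy = clean + 0.2 * rng.normal(size=len(x))

spectrum = np.fft.rfft(noisy)
cutoff = 20               # hypothetical cutoff bin; everything above it is discarded
spectrum[cutoff:] = 0.0
filtered = np.fft.irfft(spectrum, n=len(noisy))

# Root-mean-square error relative to the noise-free peak, before and after filtering.
rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((filtered - clean) ** 2))
print(rms_before, rms_after)
```

The cutoff has to stay above the frequencies that make up the peak itself; a narrower peak has a broader spectrum and needs a higher cutoff.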
![Page 27: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/27.jpg)
Convolution
http://en.wikipedia.org/wiki/Convolution
$$(f * g)(t) = \sum_{\tau} f(\tau)\, g(t - \tau)$$
Describes the response of a linear and time-invariant system to an input signal
The inverse Fourier transform of the pointwise product in frequency space
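The equivalence between convolution and the pointwise product in frequency space can be verified on small arrays. In this sketch both inputs are zero-padded to the full output length so that the FFT product gives the linear (not circular) convolution; the array sizes and names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=64)
g = rng.normal(size=16)

# Direct linear convolution; the result has len(f) + len(g) - 1 points.
direct = np.convolve(f, g)

# Same result via the Fourier transform: pad to length n, multiply, transform back.
n = len(f) + len(g) - 1
via_fft = np.fft.irfft(np.fft.rfft(f, n) * np.fft.rfft(g, n), n)

print(np.max(np.abs(direct - via_fft)))
```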
![Page 28: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/28.jpg)
Smoothing by convolution
![Page 29: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/29.jpg)
Smoothing
    # Moving-average kernel covering 2*width+1 points
    w = ones(2*width+1, 'd')
    y_smooth = convolve(w/w.sum(), y, 'valid')
[Figure: three panels; axis labels: intensity, frequency]
![Page 30: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/30.jpg)
Smoothing
![Page 31: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/31.jpg)
Smoothing
![Page 32: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/32.jpg)
Adaptive Background Correction (unsharp masking)
$$I'(l, w, d) = I(l) - \frac{d}{2w + 1} \sum_{k=l-w}^{l+w} I(k)$$
Unsharp masking
Original
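    # Kernel weights fall off as 1/(2|k|+1) with distance k from the center
    wi = linspace(1, window_len, window_len)
    w = 1 / (2*r_[wi[::-1], 0, wi] + 1)
    # 'same' keeps the output the same length as x so that the subtraction is valid
    x_ = x - d*convolve(w/w.sum(), x, 'same')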
![Page 33: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/33.jpg)
Adaptive Background Correction
![Page 34: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/34.jpg)
Smoothing and Adaptive Background Correction
![Page 35: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/35.jpg)
Savitzky-Golay smoothing
[Figure: grid of smoothed traces; columns: polynomial order = 3, 5, 7; rows: bin size = 25, 75, 150]
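Savitzky-Golay smoothing replaces each point by the value of a polynomial fitted by least squares to the points in a window around it. The sketch below builds the filter weights with NumPy only; the function name `savgol_smooth` and its parameters are hypothetical, and SciPy's `scipy.signal.savgol_filter` provides a full implementation with edge handling.

```python
import numpy as np

def savgol_smooth(y, half_width, order):
    """Savitzky-Golay smoothing by local polynomial least squares (interior points only)."""
    k = np.arange(-half_width, half_width + 1, dtype=float)
    # Design matrix with columns k^0, k^1, ..., k^order.
    A = np.vander(k, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the weights that evaluate the fit at k = 0.
    weights = np.linalg.pinv(A)[0]
    # np.convolve flips its second argument, so reverse the weights to apply them as written.
    return np.convolve(y, weights[::-1], mode="valid")

# A cubic is reproduced exactly by a fit of polynomial order 3.
t = np.linspace(-1, 1, 200)
cubic = 0.5 * t ** 3 - t + 2.0
smoothed = savgol_smooth(cubic, half_width=12, order=3)
print(np.max(np.abs(smoothed - cubic[12:-12])))
```

With `mode="valid"` the output is shorter than the input by `2*half_width` points, which here corresponds to a bin size of 25.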
![Page 36: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/36.jpg)
Background
Frequency
Frequency
![Page 37: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/37.jpg)
Background Subtraction Using Smoothing
[Figure: columns: bin size = 100, 200, 300; top row: smoothing; bottom row: background subtraction]
![Page 38: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/38.jpg)
Root Mean Square Deviation (RMSD)
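$$\mathrm{RMSD}(l)^2 = \frac{1}{w} \sum_{|k - l| \le w/2} \left( I(k) - \bar{I} \right)^2$$
where $\bar{I}$ is the mean intensity in the window of size $w$ centered at $l$.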
The Root Mean Square Deviation (RMSD) is often constant for the noise and larger for the peak if the window size is approximately the size of the peak.
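A moving RMSD can be computed from two moving averages, using the identity variance = mean of squares minus square of the mean. The sketch below is one possible implementation; the function name `moving_rmsd`, the window size, and the simulated trace are assumptions for illustration.

```python
import numpy as np

def moving_rmsd(intensity, w):
    """RMSD of the intensity around its local mean in a sliding window of w points."""
    kernel = np.ones(w) / w
    # 'same' keeps the length; points near the edges are biased by the implicit zero padding.
    mean = np.convolve(intensity, kernel, mode="same")
    mean_sq = np.convolve(intensity ** 2, kernel, mode="same")
    # Clip at zero to guard against small negative values from rounding.
    return np.sqrt(np.clip(mean_sq - mean ** 2, 0.0, None))

# Example: normal noise plus one Gaussian peak (standard deviation of 10 points) at index 500.
rng = np.random.default_rng(4)
n = np.arange(1000)
intensity = 0.05 * rng.normal(size=1000) + np.exp(-(n - 500) ** 2 / (2 * 10.0 ** 2))
rmsd = moving_rmsd(intensity, w=41)
print(np.median(rmsd), rmsd.max(), np.argmax(rmsd))
```

The median of the RMSD is a rough estimate of the noise level, and points where the RMSD is far above it mark the flanks of the peak.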
![Page 39: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/39.jpg)
Background Subtraction using RMSD
[Figure: columns: bin size = 100, 200, 300; rows: RMSD and intensity]
![Page 40: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/40.jpg)
Convolution, Cross-correlation, and Autocorrelation
http://en.wikipedia.org/wiki/Convolution
Convolution describes the response of a linear and time-invariant system to an input signal.
It is the inverse Fourier transform of the pointwise product in frequency space.
Cross-correlation is a measure of similarity of two signals.
It can be used for finding a shift between two signals.
Auto-correlation is the cross-correlation of a signal with itself.
It can be used for finding periodic signals obscured by noise.
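Finding a shift between two signals amounts to locating the maximum of their cross-correlation. The sketch below uses `numpy.correlate` on two noisy copies of the same peak; the peak shape, noise level, and names such as `true_shift` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
t = np.arange(n)

def peak(center):
    # Gaussian peak with a standard deviation of 10 points.
    return np.exp(-(t - center) ** 2 / (2 * 10.0 ** 2))

true_shift = 37
signal = peak(200) + 0.02 * rng.normal(size=n)
shifted = peak(200 + true_shift) + 0.02 * rng.normal(size=n)

# Full cross-correlation; output index i corresponds to lag i - (n - 1).
xcorr = np.correlate(shifted, signal, mode="full")
lags = np.arange(-(n - 1), n)
estimated_shift = lags[np.argmax(xcorr)]
print(estimated_shift)
```

Calling `np.correlate(signal, signal, mode="full")` in the same way gives the autocorrelation, whose secondary maxima reveal periodicity.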
![Page 41: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/41.jpg)
Cross-correlation and autocorrelation
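Cross-correlation (for real-valued signals):
$$(f \star g)(t) = \sum_{\tau} f(\tau)\, g(t + \tau)$$
Autocorrelation:
$$(f \star f)(t) = \sum_{\tau} f(\tau)\, f(t + \tau)$$
http://en.wikipedia.org/wiki/Convolution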
![Page 42: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/42.jpg)
Autocorrelation
Autocorrelation
Signal
Same signal
![Page 43: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/43.jpg)
Cross-correlation
Cross-correlation
Signal
Shifted signal
![Page 44: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/44.jpg)
Cross-correlation
Cross-correlation
Signal
Half of the peaks shifted
![Page 45: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/45.jpg)
How similar are two signals?
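Dot product
$$A = (a_1, a_2, \ldots, a_n)$$
$$B = (b_1, b_2, \ldots, b_n)$$
$$A \cdot B = \sum_i a_i b_i = |A|\,|B| \cos\theta$$
Identical vectors (normalized to unit length): $\theta = 0$, $A \cdot B = 1$
Perpendicular vectors: $\theta = \pi/2$, $A \cdot B = 0$
The dot product is the same as the cross-correlation at zero:
$$(f \star g)(0) = \sum_{\tau} f(\tau)\, g(\tau)$$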
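The normalized dot product (the cosine of the angle between two vectors) gives a similarity score that does not depend on overall intensity. The sketch below computes it and also checks that the plain dot product equals the cross-correlation at zero lag; the function name `cosine_similarity` and the small test vectors are chosen for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

same = cosine_similarity([1, 2, 3], [1, 2, 3])      # identical vectors
perpendicular = cosine_similarity([1, 0], [0, 1])   # perpendicular vectors

# For equal-length signals, 'valid' cross-correlation returns only the zero-lag term.
f = np.array([0.0, 1.0, 2.0, 1.0])
g = np.array([1.0, 2.0, 1.0, 0.0])
zero_lag = np.correlate(f, g, mode="valid")[0]
print(same, perpendicular, zero_lag, np.dot(f, g))
```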
![Page 46: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/46.jpg)
What are the characteristics of the dot product?
[Figure: dot product for signal+noise compared with noise alone; columns: S/N = 10, 3, 1, 0.3, 0.1; rows: dimensions = 10, 100, 1000]
![Page 47: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/47.jpg)
Autocorrelation
Autocorrelation
Signal
Shifted signal
Sum of signal and shifted signal
![Page 48: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/48.jpg)
Coincidence – enhances the signal
The signal-to-noise ratio can be increased dramatically by measuring several independent signals of the same phenomenon and combining them.
Ideal signal
Product of the four measurements
Four measurements
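The effect of combining coincident measurements can be simulated by multiplying several noisy copies of the same peak. In the sketch below, the noise level, the baseline region, and the definition of signal-to-noise (value at the peak center divided by the standard deviation of the baseline) are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(-1, 1, 1000)
ideal = np.exp(-x ** 2 / (2 * 0.05 ** 2))

# Four independent measurements of the same ideal signal.
measurements = ideal + 0.1 * rng.normal(size=(4, len(x)))
product = np.prod(measurements, axis=0)

center = np.argmax(ideal)
baseline = np.abs(x) > 0.5   # region far from the peak, containing only noise

def signal_to_noise(trace):
    return trace[center] / np.std(trace[baseline])

snr_single = signal_to_noise(measurements[0])
snr_product = signal_to_noise(product)
print(snr_single, snr_product)
```

In the baseline the product of four small noise values is much smaller than any one of them, while at the peak the product stays near the fourth power of the height. The noise in the product is no longer normally distributed.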
![Page 49: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/49.jpg)
Coincidence – suppresses and transforms the noise
Original noise
Noise in product
![Page 50: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/50.jpg)
Coincidence – suppresses interference
Ideal signal
Product of the four measurements
Four measurements with interference
![Page 51: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/51.jpg)
Peak Finding
The derivative of a function is zero at its minima and maxima.
The second derivative is negative at maxima and positive at minima.
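For sampled data the derivative can be approximated by the difference between neighboring points, and a maximum is a point where that difference changes from positive to non-positive. The sketch below applies this to a noise-free trace with two peaks; the function name `local_maxima` is hypothetical, and on noisy data the trace would normally be smoothed first.

```python
import numpy as np

def local_maxima(y):
    """Indices where the first difference changes from positive to non-positive."""
    dy = np.diff(y)
    # dy[i] = y[i+1] - y[i]; a sign change between dy[i] and dy[i+1] puts the maximum at i+1.
    return np.where((dy[:-1] > 0) & (dy[1:] <= 0))[0] + 1

# Two Gaussian peaks centered at indices 60 and 140.
t = np.arange(200)
y = np.exp(-(t - 60) ** 2 / (2 * 5.0 ** 2)) + np.exp(-(t - 140) ** 2 / (2 * 5.0 ** 2))
maxima = local_maxima(y)
print(maxima)
```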
![Page 52: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/52.jpg)
Detection of steps
Motivation: To demonstrate a general strategy for separating signal from noise:
1. Characterize the signal and the noise
2. Make a model of the data
3. Select detection method
4. Select parameters using simulations
[Figure: y-axis: intensity]
![Page 53: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/53.jpg)
Detection of steps: Characterization of noise
Remove signal by subtracting a moving average
![Page 54: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/54.jpg)
Detection of steps: Model of data
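    # Step of height `signal` on top of normal noise with standard deviation `noise`
    points = 1000
    x = linspace(-1, 1, points)
    y = noise*random.normal(size=len(x))
    y[points//2:] += signal  # integer division: slice indices must be integers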
S/N=0.75 S/N=1 S/N=2
![Page 55: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/55.jpg)
Detection of steps: Detection method
Steps can be converted into peaks by calculating the difference between the moving average in two windows
S/N=0.75 S/N=1 S/N=2
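The conversion of a step into a peak can be sketched as the difference between the averages of two adjacent windows. In the example below the function name `step_score`, the window size, and the S/N value are choices made for illustration and are not taken from the slides.

```python
import numpy as np

def step_score(y, w):
    """Average of the w points to the right of a boundary minus the w points to its left."""
    # With 'valid', avg[i] is the mean of y[i:i+w].
    avg = np.convolve(y, np.ones(w) / w, mode="valid")
    # score[i] compares y[i+w:i+2w] with y[i:i+w]; the boundary sits at index i + w.
    return avg[w:] - avg[:-w]

rng = np.random.default_rng(3)
points = 1000
noise, signal = 1.0, 2.0          # S/N = 2
y = noise * rng.normal(size=points)
y[points // 2:] += signal

w = 100
score = step_score(y, w)
step_estimate = np.argmax(score) + w
print(step_estimate)
```

A larger window averages out more noise but merges steps that are closer together than the window size.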
![Page 56: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/56.jpg)
Detection of steps: Detection method
[Figure: average intensity after applying the detection method; columns: S/N = 0.75, 1, 2; rows: bin size = 10, 30, 100]
![Page 57: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/57.jpg)
Detection of steps: Simulations - peak location
S/N=0.05 S/N=0.25 S/N=1
Bin size = 10
Bin size = 30
Bin size = 100
![Page 58: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/58.jpg)
Detection of steps: Simulations – correct peak
[Figure: frequency vs score; columns: S/N = 0.05, 0.25, 1; rows: bin size = 10, 30, 100]
![Page 59: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/59.jpg)
Detection of steps: Simulations - FDR and FNR
[Figure: false rate vs threshold, showing the false discovery rate and the false negative rate; columns: S/N = 0.05, 0.25, 1; rows: bin size = 10, 30, 100]
![Page 60: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/60.jpg)
Peak Finding
1. Characterize the signal and the noise
2. Make a model of the data
3. Select detection method
4. Select parameters using simulations
[Figure: y-axis: intensity]
![Page 61: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/61.jpg)
Peak Finding: Characterizing the noise
Let’s first try without removing the peaks
[Figure: y-axis: intensity]
![Page 62: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/62.jpg)
Peak Finding: Characterizing the noise
Removing the peaks by looking for outliers in the root mean square deviation (RMSD)
[Figure: axes labeled intensity and RMSD]
![Page 63: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/63.jpg)
Peak Finding: Characterizing the peaks
[Figure: y-axis: intensity]
![Page 64: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/64.jpg)
Peak Finding: Model of data
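    # Gaussian peak of height `signal` on top of normal noise with standard deviation `noise`
    points = 1000
    x = linspace(-1, 1, points)
    y = noise*random.normal(size=len(x))
    y += signal*gaussian(x, 0, 0.01)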
S/N=1 S/N=2 S/N=4
![Page 65: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/65.jpg)
Peak Finding: Detection method
S/N=1 S/N=2 S/N=4
Peaks can be detected by finding maxima in the moving average with a window size similar to the peak width
$$S(l) = \sum_{k=l-w}^{l+w} I(k)$$
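The moving sum over a window of 2w+1 points can be computed with a convolution, and the peak is then taken as its maximum. The sketch below uses a simulated peak of the kind described in the model of the data; the S/N of 4, the window half-width `w=5`, and the function name `moving_sum` are assumptions for this example.

```python
import numpy as np

def gaussian(x, x0, s):
    return np.exp(-(x - x0) ** 2 / (2 * s ** 2))

def moving_sum(intensity, w):
    """S(l): sum of I(k) for k from l-w to l+w (zero padded at the edges)."""
    return np.convolve(intensity, np.ones(2 * w + 1), mode="same")

rng = np.random.default_rng(6)
points = 1000
noise, signal = 1.0, 4.0           # S/N = 4
x = np.linspace(-1, 1, points)
y = noise * rng.normal(size=points) + signal * gaussian(x, 0, 0.01)

# The peak has a standard deviation of about 5 points, so w=5 roughly matches its width.
score = moving_sum(y, w=5)
peak_position = x[np.argmax(score)]
print(peak_position)
```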
![Page 66: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/66.jpg)
Peak Finding: Detection method – moving average
S/N=1
S/N=2
S/N=4
Bin size = 5 Bin size = 20 Bin size = 80 Signal
![Page 67: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/67.jpg)
Peak Finding: Detection method – RMSD
S/N=1
S/N=2
S/N=4
Bin size = 5 Bin size = 20 Bin size = 80 Signal
![Page 68: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/68.jpg)
Peak Finding: Information about the Peak
[Figure: a peak (intensity) annotated with its centroid (mean), full width at half maximum (FWHM), area, height, and maximum; descriptors listed: mean, variance, skewness, kurtosis]
![Page 69: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/69.jpg)
Information about a Peak
A peak is defined by $f(x)$
Centroid or mean
$$\mathrm{centroid} = \frac{\sum_x x\, f(x)}{\sum_x f(x)}$$
$$\mathrm{area} = \sum_x f(x)$$
$$\mathrm{height} = \max(f(x))$$
To calculate any of these measures we need to know where the peak starts and ends.
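Once the start and end of a peak are known, the centroid, area, and height follow directly from the sums above. In the sketch below the peak limits are chosen by a simple rule (intensity above 1% of the maximum), which is an assumption made only for this example; the function name `peak_summary` and the m/z values are likewise hypothetical.

```python
import numpy as np

def peak_summary(x, f, start, end):
    """Centroid, area (sum of intensities), and height of the peak in f[start:end]."""
    xs, fs = x[start:end], f[start:end]
    area = fs.sum()
    centroid = (xs * fs).sum() / area
    height = fs.max()
    return centroid, area, height

# Example: a Gaussian peak at m/z 1450.0 with a standard deviation of 0.5.
x = np.linspace(1444.0, 1458.0, 701)
f = np.exp(-(x - 1450.0) ** 2 / (2 * 0.5 ** 2))

# Hypothetical rule for the peak limits: points above 1% of the maximum.
inside = np.where(f > 0.01 * f.max())[0]
start, end = inside[0], inside[-1] + 1

centroid, area, height = peak_summary(x, f, start, end)
dx = x[1] - x[0]
print(centroid, area * dx, height)
```

Multiplying the summed area by the grid spacing `dx` converts it to the integral of the peak, which makes areas comparable between spectra sampled at different densities.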
![Page 70: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/70.jpg)
Where does a peak start and end?
![Page 71: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/71.jpg)
Estimating peptide quantity
Peak height, curve fitting, peak area
[Figure: peaks annotated with peak height, curve fitting, and peak area; axes: intensity vs m/z]
![Page 72: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/72.jpg)
Time dimension
[Figure: intensity vs m/z with an added time axis; map of time vs m/z]
![Page 73: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/73.jpg)
Sampling
[Figure: intensity vs retention time]
![Page 74: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/74.jpg)
Sampling
[Figure: thresholds (90%) vs # of points; x-axis: 1 to 10 points; y-axis: 0.5 to 1.1]
![Page 75: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/75.jpg)
What is the best way to estimate quantity?
Peak height
- resistant to interference
- poor statistics
Peak area
- better statistics
- more sensitive to interference
Curve fitting
- better statistics
- needs to know the peak shape
- slow
![Page 76: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/76.jpg)
Homework: Background Subtraction Using Smoothing
![Page 77: Signal processing. Example data – ChIP-Seq T.N. Siegel, D.R. Hekstra, L.E. Kemp, L.M. Figueiredo, J.E. Lowell, D. Fenyö, X. Wang, S. Dewell, G.A. Cross,](https://reader036.vdocuments.net/reader036/viewer/2022070414/5697bffa1a28abf838cc09d8/html5/thumbnails/77.jpg)
Summary
Fourier transform - transformation to frequency space and back
Signal – how do we detect and characterize signals?
Noise – how do we characterize noise?
Modeling signal and noise
Simulation to select thresholds and parameters
Filters – filtering by low-pass (i.e. smoothing) and high-pass filters (e.g. adaptive background correction)
Detection methods based on moving average and RMSD
Convolution - describes the response of a linear and time-invariant system to an input signal
Cross-correlation is a measure of similarity of two signals
Autocorrelation can be used for finding periodic signals obscured by noise
The dot product can be used to determine how similar two signals are
Coincidence measurements enhance the signal and suppress noise
The quantity associated with a peak – height and area
Sampling – how often do we need to sample a peak to get a good estimate of its area?