Speech Denoising by Adaptive Weighted Average
Filtering in the EMD framework
Kais KHALDI and Monia TURKI-HADJ ALOUANE
Unite Signaux et Systemes, ENIT
BP 37, Le Belvedere 1002 Tunis, Tunisia
Email: [email protected], [email protected]
Abdel-Ouahab BOUDRAA
IRENav, Ecole Navale / E3I2 (EA3876), ENSIETA
Groupe ASM, Lanveoc Poulmic
BP600, 29240 Brest-Armees, France
Email: [email protected]
Abstract: This paper introduces a new speech enhancement method that combines the Adaptive Center Weighted Average (ACWA) filter with Empirical Mode Decomposition (EMD). Both ACWA and EMD operate in the time domain. The ACWA filter is advantageous because it operates adaptively in the time domain and requires neither stationarity of the signals nor whiteness of the noise. Thanks to the data-driven decomposition performed by the EMD, applying the ACWA filter to the Intrinsic Mode Functions (IMFs) gives better results than ACWA filtering of the noisy signal directly. The proposed EMD-ACWA denoising method is applied to noisy speech signals at different noise levels, and the results are compared with those obtained by two other denoising methods: wavelet thresholding and ACWA filtering. The EMD-ACWA method is shown to be significantly superior to the other two in white noise as well as in correlated noise.
I. INTRODUCTION
Recently, a new temporal signal decomposition method, called
Empirical Mode Decomposition (EMD), has been introduced by Huang
et al. [1] for analyzing data from non-stationary and nonlinear
processes. The major advantage of the EMD is that the basis functions
are derived from the signal itself. Hence, the analysis is adaptive,
in contrast to traditional methods such as wavelets, where the basis
functions are fixed. The EMD has received considerable attention in
terms of applications [2]-[3], interpretation [4]-[5], and
improvement [6]-[7]. The EMD is also used in speech denoising [8].
Speech signal noise reduction is a well-known problem in signal
processing. In particular, linear methods such as Wiener filtering [9]
are widely used because linear filters are easy to design and to
implement. However, these methods are not effective when the noise
cannot be estimated or when the noise is colored. To overcome these
difficulties, nonlinear methods have been proposed, especially those
based on wavelet thresholding [10]-[11]. A limitation of the wavelet
approach is that the basis functions are fixed and thus do not
necessarily match all real signals.
To overcome the drawbacks of the wavelet method, two EMD-based
strategies for noise reduction have been proposed in [8]: EMD
associated with filtering, which is efficient at relatively low noise
levels, and EMD associated with thresholding, which is attractive in
particular at relatively high noise levels. However, in [8], only
signals corrupted by additive white Gaussian noise are considered.
In this paper, an adaptive denoising scheme associating the EMD
with the ACWA filter is proposed. The ACWA filter [12] and related
variants are mainly used in the image enhancement domain [13]. This
filter operates adaptively in the time domain, which fits the EMD
framework, and it requires neither stationarity of the signals nor
whiteness of the noise. The effectiveness of the ACWA filter can be
improved when it is associated with the EMD. Indeed, the Intrinsic
Mode Functions (IMFs) produced by the EMD are less noisy than the
noisy speech. The proposed denoising method benefits from the
advantages of the EMD and from the attractive properties of the ACWA
filter, which is adaptive and easy to implement, to obtain good
performance in the presence of white as well as colored noise.
II. EMD ALGORITHM
The EMD decomposes a given signal x(t) into a series of IMFs, each
with a distinct time scale, through an iterative process called
sifting [1]. The decomposition is based on the local time scale of
x(t) and yields adaptive basis functions. The EMD can be seen as a
type of wavelet decomposition whose subbands are built up as needed
to separate the different components of x(t). Each IMF plays the role
of the signal's detail at a certain scale or frequency band [4]. At
each stage, the EMD picks out the highest-frequency oscillation that
remains in x(t). By definition, an IMF satisfies two conditions:
1) the number of extrema and the number of zero crossings may differ
by no more than one;
2) the average value of the envelope defined by the local maxima and
the envelope defined by the local minima is zero.
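Condition 1) can be checked numerically on a sampled signal. The following is a minimal sketch in Python with NumPy, not code from the paper; the function names are hypothetical, and condition 2) is not tested here because it requires the envelopes that are computed during sifting.

```python
import numpy as np

def count_extrema_and_zero_crossings(h):
    """Count interior local extrema and sign changes of a sampled signal."""
    h = np.asarray(h, dtype=float)
    d = np.diff(h)
    # A local extremum is a sample where the slope changes sign.
    n_extrema = int(np.sum(d[:-1] * d[1:] < 0))
    # A zero crossing is a pair of consecutive samples with opposite signs.
    n_zero_crossings = int(np.sum(h[:-1] * h[1:] < 0))
    return n_extrema, n_zero_crossings

def satisfies_condition_1(h):
    """True if the two counts differ by no more than one."""
    n_extrema, n_zero_crossings = count_extrema_and_zero_crossings(h)
    return abs(n_extrema - n_zero_crossings) <= 1

t = np.linspace(0.0, 1.0, 1000, endpoint=False)
tone = np.sin(2 * np.pi * 5 * t + 0.3)
print(satisfies_condition_1(tone))        # zero-mean oscillation
print(satisfies_condition_1(tone + 2.0))  # same tone with an offset: no zero crossings
```

The offset tone illustrates why a signal with a non-zero local mean cannot be an IMF: it still has extrema, but it never crosses zero.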
Thus, locally, each IMF contains lower-frequency oscillations than
the one extracted just before it. The EMD does not use a
pre-determined filter or wavelet function and is a fully
data-driven method [1]. To be successfully decomposed into IMFs,
the signal x(t) must have at least two extrema, one minimum and one
maximum. The sifting involves the following steps:
Step 1: Fix the threshold $\epsilon$ and set $j \leftarrow 1$ (index of the $j$-th IMF)
Step 2: $r_{j-1}(t) \leftarrow x(t)$ (residual)
2008 International Conference on Signals, Circuits and Systems
978-1-4244-2628-7/08/$25.00 ©2008 IEEE -1-
Step 3: Extract the $j$-th IMF:
(a) $h_{j,i-1}(t) \leftarrow r_{j-1}(t)$, $i \leftarrow 1$ ($i$ is the sifting iteration index)
(b) Extract the local maxima and minima of $h_{j,i-1}(t)$
(c) Compute the upper and lower envelopes $U_{j,i-1}(t)$ and $L_{j,i-1}(t)$ by interpolating, using cubic splines, the local maxima and the local minima of $h_{j,i-1}(t)$, respectively
(d) Compute the mean of the envelopes: $\mu_{j,i-1}(t) = (U_{j,i-1}(t) + L_{j,i-1}(t))/2$
(e) Update: $h_{j,i}(t) := h_{j,i-1}(t) - \mu_{j,i-1}(t)$, $i := i + 1$
(f) Calculate the stopping criterion:
$$SD(i) = \sum_{t=1}^{T} \frac{|h_{j,i-1}(t) - h_{j,i}(t)|^2}{(h_{j,i-1}(t))^2}$$
(g) Repeat Steps (b)-(f) until $SD(i) < \epsilon$, and then set $\mathrm{IMF}_j(t) \leftarrow h_{j,i}(t)$ (the $j$-th IMF)
Step 4: Update the residual: $r_j(t) := r_{j-1}(t) - \mathrm{IMF}_j(t)$.
Step 5: Repeat Step 3 with $j := j + 1$ until the number of extrema in $r_j(t)$ is $\le 2$.
Here, $T$ is the duration of x(t). The sifting is repeated several
times (indexed by $i$) in order to obtain a true IMF $h$ that fulfills
conditions 1) and 2). The result of the sifting is that x(t) is
decomposed into a sum of $C$ IMFs and a residual $r_C(t)$:
$$x(t) = \sum_{j=1}^{C} \mathrm{IMF}_j(t) + r_C(t) \tag{1}$$
The value of $C$ is determined automatically using SD (Step 3(f)). The
sifting has two effects: (a) it eliminates riding waves, and
(b) it smooths uneven amplitudes. To guarantee that the IMF
components retain enough physical sense of both amplitude and
frequency modulation, a value of SD has to be fixed for the sifting.
This is accomplished by limiting the size of the standard deviation
SD, computed from two consecutive sifting results. Usually, SD
(or $\epsilon$) is set between 0.2 and 0.3 [1].
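The sifting steps above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the envelopes are piecewise-linear (`np.interp`) rather than cubic splines, the end samples are used as extra envelope knots, a small constant guards the division in SD, and the caps `max_sift` and `max_imfs` are hypothetical safeguards that the paper does not mention.

```python
import numpy as np

def envelope_mean(h):
    """Mean of the upper and lower envelopes, or None if h has at most 2 extrema."""
    d = np.diff(h)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1   # Step 3(b)
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    if len(maxima) + len(minima) <= 2:
        return None
    t = np.arange(len(h))
    last = len(h) - 1
    # ASSUMPTION: linear interpolation stands in for the cubic splines of
    # Step 3(c); the first and last samples are added as knots.
    up = np.concatenate(([0], maxima, [last]))
    lo = np.concatenate(([0], minima, [last]))
    upper = np.interp(t, up, h[up])
    lower = np.interp(t, lo, h[lo])
    return (upper + lower) / 2.0                            # Step 3(d)

def emd(x, eps=0.25, max_sift=50, max_imfs=12):
    """Decompose x into IMFs and a residual, following Steps 1-5."""
    x = np.asarray(x, dtype=float)
    residual = x.copy()                                     # Step 2
    imfs = []
    # Step 5: stop when the residual has at most 2 extrema (or the cap is hit).
    while len(imfs) < max_imfs and envelope_mean(residual) is not None:
        h = residual.copy()                                 # Step 3(a)
        for _ in range(max_sift):
            mu = envelope_mean(h)
            if mu is None:
                break
            h_new = h - mu                                  # Step 3(e)
            # Step 3(f); the 1e-12 term is an assumption to avoid dividing by zero.
            sd = np.sum((h - h_new) ** 2 / (h ** 2 + 1e-12))
            h = h_new
            if sd < eps:                                    # Step 3(g)
                break
        imfs.append(h)
        residual = residual - h                             # Step 4
    return np.array(imfs), residual

t = np.linspace(0.0, 1.0, 1000)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
imfs, residual = emd(x)
print(len(imfs), np.allclose(imfs.sum(axis=0) + residual, x))
```

Whatever interpolation is used, the reconstruction in (1) holds by construction, because each IMF is subtracted from the running residual.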
III. THE EMD-ACWA DENOISING APPROACH
The proposed denoising method is shown in figure 1.

Fig. 1. Denoising scheme. [Block diagram: the noisy signal y(t) is decomposed by the EMD into IMF 1, IMF 2, IMF 3, ..., IMF C and a residual; each IMF is passed through an ACWA filter, and the filtered IMFs and the residual are summed to give the estimate of x(t).]

The noisy signal y(t), described by an additive model, is given by:
$$y(t) = x(t) + b(t), \tag{2}$$
where x(t) corresponds to the clean speech signal and b(t) denotes
the noise signal.
The noisy signal is decomposed into a sum of IMFs as follows:
$$y(t) = \sum_{j=1}^{C} \mathrm{IMF}_j(t) + r_C(t). \tag{3}$$
The extracted IMFs include the noise, since each IMF, indexed
by $j$, can be approximated as follows:
$$\mathrm{IMF}_j(t) = f_j(t) + b_j(t), \tag{4}$$
where $\mathrm{IMF}_j$ is a noisy version of the data $f_j$.
An estimate $\hat{f}_j(t)$ of $f_j(t)$ based on the noisy observation
$\mathrm{IMF}_j(t)$ is given by
$$\hat{f}_j(t) = \Gamma[\mathrm{IMF}_j(t)], \tag{5}$$
where $\Gamma[\mathrm{IMF}_j(t)]$ is a temporal processing using the ACWA filter.
Finally, the estimated signal, $\hat{x}(t)$, is given by:
$$\hat{x}(t) = \sum_{j=1}^{C} \hat{f}_j(t) + r_C(t) \tag{6}$$
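The structure of (5) and (6) can be illustrated with a short Python sketch. It assumes the IMFs and the residual are already available as NumPy arrays; the function name `reconstruct` and the toy data are hypothetical, and `gamma` is any per-IMF operator standing in for the ACWA filter.

```python
import numpy as np

def reconstruct(imfs, residual, gamma):
    """Eq. (6): sum the processed IMFs and add the unprocessed residual."""
    return sum(gamma(imf) for imf in imfs) + residual

# Toy data standing in for an actual decomposition (illustration only).
rng = np.random.default_rng(0)
imfs = rng.standard_normal((3, 8))
residual = rng.standard_normal(8)

# With the identity operator, (6) reduces to the decomposition (3).
y = imfs.sum(axis=0) + residual
print(np.allclose(reconstruct(imfs, residual, lambda v: v), y))
```

Note that the residual is added back unfiltered; only the IMFs go through the operator.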
The denoising of the IMF by the ACWA filter is given as
follows [12]:
$$\hat{f}_j(t) = \begin{cases} F_{mean} + K_j\,(\mathrm{IMF}_j(t) - F_{mean}), & \text{if } F_{var} \ge \sigma_j^2 \\ F_{mean}, & \text{otherwise} \end{cases} \tag{7}$$
$$K_j = 1 - \frac{\sigma_j^2}{F_{var}}, \tag{8}$$
where $F_{mean}$ and $F_{var}$ denote respectively the average and
the variance of the IMF computed over a sliding window of
length $L$, and $\sigma_j^2$ designates the variance of the noise
contained in the IMF indexed by $j$.
The noise level $\sigma_j$ is estimated as in [14], [15], [16]:
$$\sigma_j = 1.4826 \times \mathrm{Median}\{|\mathrm{IMF}_j(t) - \mathrm{Median}\{\mathrm{IMF}_j(t)\}|\}. \tag{9}$$
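Equations (7) to (9) translate fairly directly into array operations. The sketch below, in Python with NumPy, is one possible reading and not the authors' code: the window is assumed to be centered with an odd length, the borders are handled by reflect padding, and the function names are hypothetical.

```python
import numpy as np

def mad_sigma(imf):
    """Noise level of Eq. (9): 1.4826 times the median absolute deviation."""
    imf = np.asarray(imf, dtype=float)
    return 1.4826 * np.median(np.abs(imf - np.median(imf)))

def sliding_mean(v, L):
    """Centered moving average over an odd window length L."""
    if L % 2 == 0:
        raise ValueError("L must be odd")
    # ASSUMPTION: reflect padding at the borders; the paper does not specify this.
    padded = np.pad(v, L // 2, mode="reflect")
    return np.convolve(padded, np.ones(L) / L, mode="valid")

def acwa_filter(imf, L=511, sigma=None):
    """Apply Eqs. (7)-(8) to every sample using local statistics."""
    imf = np.asarray(imf, dtype=float)
    noise_var = (mad_sigma(imf) if sigma is None else sigma) ** 2
    f_mean = sliding_mean(imf, L)
    # Local variance as E[v^2] - E[v]^2, clipped at zero against rounding error.
    f_var = np.maximum(sliding_mean(imf ** 2, L) - f_mean ** 2, 0.0)
    safe_var = np.where(f_var > 0, f_var, 1.0)  # avoids division by zero
    # K_j of Eq. (8) where F_var >= sigma_j^2, otherwise the output is F_mean.
    gain = np.where(f_var >= noise_var, 1.0 - noise_var / safe_var, 0.0)
    return f_mean + gain * (imf - f_mean)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 2000)
clean = np.sin(2 * np.pi * 2 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)
filtered = acwa_filter(noisy, L=31, sigma=0.5)
print(np.mean((noisy - clean) ** 2), np.mean((filtered - clean) ** 2))
```

In the example the noise level is passed explicitly, because the median-based estimate (9) is meant for an IMF dominated by noise rather than for a signal that contains a strong low-frequency component.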
Classically, the ACWA filter has been used in image enhancement
applications. It can also be interesting and effective in the context
of audio signal enhancement. As shown by (7), this filter operates in
the time domain, which corresponds well to the EMD framework. In
contrast to classical filters such as the Wiener filter, all the
parameters are computed in the time domain, and hence no
transformation to the frequency domain is needed. Besides, as the
signal is enhanced sample by sample, the hypotheses of signal
stationarity and noise whiteness are relaxed. In particular, this
filter can perform in general noisy contexts: white as well as colored
noise, high as well as low noise levels.
IV. EXPERIMENTAL RESULTS
The proposed noise reduction method is tested on a speech
signal corrupted by different noises whose levels are fixed
through the input Signal to Noise Ratio (SNR),
$$\mathrm{SNR}_{in} = 10 \log_{10} \frac{\sum_{t=1}^{T} (x(t))^2}{\sum_{t=1}^{T} (y(t) - x(t))^2}, \tag{10}$$
where x(t) and y(t) are respectively the clean and the noisy
signals. The results obtained by the proposed method are
compared with those of the wavelet approach (Daubechies 8) and the
ACWA filter. We use the ACWA filter as a comparison method because
it gives better results than the MMSE filter [17]. As an
objective criterion to evaluate the performance of the denoising
method, we use the output Signal to Noise Ratio:
$$\mathrm{SNR}_{out} = 10 \log_{10} \frac{\sum_{t=1}^{T} (x(t))^2}{\sum_{t=1}^{T} (x(t) - \hat{x}(t))^2}, \tag{11}$$
where $\hat{x}(t)$ is the reconstructed signal.
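Both ratios (10) and (11) have the same form, so a single function covers them. The following Python sketch with NumPy also shows one way to fix the noise level through the input SNR; the helper `mix_at_snr` is a hypothetical name, and scaling the noise to reach the target is an assumption about how the levels were set.

```python
import numpy as np

def snr_db(clean, observed):
    """Eqs. (10)-(11): signal energy over error energy, in dB."""
    clean = np.asarray(clean, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum((observed - clean) ** 2))

def mix_at_snr(clean, noise, snr_in_db):
    """Hypothetical helper: scale `noise` so that SNR_in equals snr_in_db."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    target_ratio = 10.0 ** (snr_in_db / 10.0)
    scale = np.sqrt(np.sum(clean ** 2) / (np.sum(noise ** 2) * target_ratio))
    return clean + scale * noise

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 3 * np.linspace(0.0, 1.0, 4000))
noise = rng.standard_normal(clean.size)
noisy = mix_at_snr(clean, noise, -2.0)
print(snr_db(clean, noisy))  # SNR_in of the mixture, in dB
```

Passing a denoised signal as `observed` gives the output ratio (11), so the SNR gain is the difference between the two calls.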
We take as examples two speech signals, "a" and "b". These
signals are corrupted by a colored noise, "f16", with the SNR
fixed to -2 dB. The original signals and their corresponding
noisy versions are depicted in figure 2. The size $L$ of the
ACWA filter window is fixed to 511. This choice is justified
by the results shown in figure 3, which displays the variations
of the $\mathrm{SNR}_{out}$ versus $L$ for two values of
$\mathrm{SNR}_{in}$: -2 dB and 0 dB. Figure 3 shows that, for
$L = 511$, the $\mathrm{SNR}_{out}$ remains almost constant.
The denoised versions of signals "a" and "b" obtained by the
EMD-ACWA, wavelet thresholding (db8), and the ACWA filter are
shown in figures 4 and 5, respectively. The $\mathrm{SNR}_{in}$ is
fixed to -2 dB. We choose db8 with a hard threshold as the
comparison tool because it gives good results compared with other
wavelets.
A careful comparative examination of the signals in figures 4
and 5 shows that the EMD-ACWA performs better than the wavelet
(db8) method and the ACWA filter in terms of noise reduction.
Figures 6, 7 and 8 show the variations of the $\mathrm{SNR}_{out}$
versus the $\mathrm{SNR}_{in}$ for the denoising of signal "a" when
corrupted by white Gaussian noise, the colored f16 noise, and the
colored factory noise, respectively. These results demonstrate the
effectiveness of the proposed method. Indeed, the improvement in SNR
provided by the EMD-ACWA is much higher than those given by the
wavelet method and the ACWA filter. Besides, a significant SNR
improvement, varying from 4.2 dB to 17.4 dB, is achieved by the
EMD-ACWA method. In fact, even for very low SNR values, we can still
observe the effectiveness of the proposed method in removing the
noise components, as the gain in SNR can go up to 14 dB.
Note that, when listening to the enhanced speech, the EMD-ACWA
produces lower residual noise and noticeably less speech distortion
than the wavelet (db8) method and the ACWA filter.
V. CONCLUSION
In this paper, a new speech enhancement method that effectively
removes the noise components is presented. We have combined two
powerful adaptive methods: the EMD and ACWA filtering. The results
obtained for speech signals contaminated with different noises, with
SNR values ranging from -10 dB to 10 dB, show that the proposed
method performs better than the wavelet approach and the ACWA
filter. In addition, the reported results demonstrate that the
EMD-ACWA denoising method is effective for noise removal and confirm
that it is a very attractive method to use in general noisy contexts.
REFERENCES
[1] N.E. Huang et al. The empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Royal Society, 454(1971):903–995, 1998.
[2] F. Salzenstein, A.O. Boudraa, J.C. Cexus, and L. Guillon. IF estimation using empirical mode decomposition and nonlinear Teager energy operator. Proc. IEEE ISCCSP, pages 45–48, Hammamet, 2004.
[3] A.O. Boudraa, S. Benramdane, J.C. Cexus, and J.A. Astolfi. Transient turbulent pressure signal processing using empirical mode decomposition. Proc. Physics in Signal and Image Processing, Mulhouse, 2007.
[4] P. Flandrin, G. Rilling, and P. Goncalves. Empirical mode decomposition as a filter bank. IEEE Sig. Proc. Lett., 11(2):112–114, 2004.
[5] Z. Wu and N.E. Huang. A study of the characteristics of white noise using the empirical mode decomposition method. Proc. Roy. Soc. London A, 460:1597–1611, 2004.
[6] B. Weng and K.E. Barner. Optimal and bidirectional optimal empirical mode decomposition. Proc. IEEE ICASSP, 3:1501–1504, Toulouse, 2007.
[7] R. Deering and J.F. Kaiser. The use of a masking signal to improve empirical mode decomposition. Proc. IEEE ICASSP, 4:485–488, Philadelphia, 2005.
[8] K. Khaldi, A.O. Boudraa, A. Bouchiki, M. Turki-Hadj Alouane, and E. Samba Diop. Speech signal noise reduction by EMD. In Proc. IEEE ISCCSP, Malta, March 2008.
[9] J.G. Proakis and D.G. Manolakis. Digital Signal Processing: Principles, Algorithms, and Applications, volume 1. Prentice-Hall, 3rd edition, 1996.
[10] D.L. Donoho. De-noising by soft-thresholding. IEEE Trans. Inform. Theory, 41(3):613–627, 1995.
[11] D.L. Donoho and I.M. Johnstone. Ideal spatial adaptation via wavelet shrinkage. Biometrika, 81:425–455, 1994.
[12] J.S. Lee. Digital image enhancement and noise filtering by using local statistics. IEEE Trans. Pattern Anal. Mach. Intell., 2(4):165–168, March 1980.
[13] Masayuki Meguro, Akira Taguchi, and Nozomu Hamada. Data-dependent weighted average filtering for image sequence restoration. Electronics and Communications in Japan, Part III: Fundamental Electronics Science, 84(4):1–10, 2000.
[14] A.O. Boudraa and J.C. Cexus. Denoising via empirical mode decomposition. In Proc. IEEE ISCCSP, Marrakech, Morocco, 2006.
[15] A.O. Boudraa, J.C. Cexus, and Z. Saidi. EMD-based signal noise reduction. Int. J. Sig. Process., 1(1):33–37, 2004. ISSN: 1304-4494.
[16] William H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery. Numerical Recipes in C: The Art of Scientific Computing, volume 1. Cambridge University Press, 2nd edition, 1992.
[17] M. Turki-Hadj Alouane, K. Khaldi, and A.O. Boudraa. Voiced speech enhancement based on adaptive filtering of selected intrinsic mode functions. Advances in Adaptive Data Analysis (AADA), 2008 (submitted).
Fig. 2. The original signals ("a" and "b") and their noisy versions (f16 noise with SNR = -2 dB). [Four panels: a, b, noisy a, noisy b; axes: Amplitude versus Time.]
Fig. 3. The variation of the SNRout relating to the noisy signal "a" versus L, the size of the ACWA filter window (f16 noise with SNR = -2 dB and SNR = 0 dB). [Axes: SNR gain [dB] versus (L) size of the ACWA filter window; curves: initial SNR = -2 dB and initial SNR = 0 dB.]
Fig. 4. Denoised version of the signal "a" obtained by the EMD-ACWA, the Wavelet (db8) and the ACWA filter (f16 noise with SNR = -2 dB). [Three panels: Wavelet (db8), ACWA filter, EMD-ACWA; horizontal axis: Time.]
Fig. 5. Denoised version of the signal "b" obtained by the EMD-ACWA, the Wavelet (db8) and the ACWA filter (f16 noise with SNR = -2 dB). [Three panels: Wavelet (db8), ACWA filter, EMD-ACWA; horizontal axis: Time.]
Fig. 6. Variation of the SNRout versus the SNRin relating to the denoising of the signal "a" corrupted by a white Gaussian noise. [Axes: SNR output [dB] versus initial SNR [dB]; curves: EMD-ACWA, Wavelet (Daubechies 8), ACWA filter.]
Fig. 7. Variation of the SNRout versus the SNRin relating to the denoising of the signal "a" corrupted by the f16 noise. [Axes: SNR output [dB] versus initial SNR [dB]; curves: EMD-ACWA, Wavelet (Daubechies 8), ACWA filter.]
Fig. 8. Variation of the SNRout versus the SNRin relating to the denoising of the signal "a" corrupted by the factory noise. [Axes: SNR output [dB] versus initial SNR [dB]; curves: EMD-ACWA, Wavelet (Daubechies 8), ACWA filter.]