
Speech Denoising by Adaptive Weighted Average Filtering in the EMD framework

Kais KHALDI and Monia TURKI-HADJ ALOUANE
Unité Signaux et Systèmes, ENIT
BP 37, Le Belvédère 1002 Tunis, Tunisia
Email: [email protected], [email protected]

Abdel-Ouahab BOUDRAA
IRENav, Ecole Navale / E3I2 (EA3876), ENSIETA
Groupe ASM, Lanvéoc Poulmic
BP600, 29240 Brest-Armées, France
Email: [email protected]

Abstract—This paper introduces a new speech enhancement method that combines the Adaptive Center Weighted Average (ACWA) filter with Empirical Mode Decomposition (EMD). Both ACWA and EMD operate in the time domain. The ACWA filter is advantageous because it operates adaptively in the time domain and requires neither stationarity of the signals nor whiteness of the noise. Thanks to the data-driven decomposition of the EMD, applying the ACWA filter to the Intrinsic Mode Functions (IMFs) gives better results than ACWA filtering of the noisy signal itself. The proposed EMD-ACWA denoising method is applied to noisy speech signals with different noise levels, and the results are compared to those obtained by two other denoising methods: wavelet thresholding and ACWA filtering. The EMD-ACWA method clearly outperforms both in white-noise as well as correlated-noise contexts.

I. INTRODUCTION

Recently, a new temporal signal decomposition method, called Empirical Mode Decomposition (EMD), has been introduced by Huang et al. [1] for analyzing data from non-stationary and nonlinear processes. The major advantage of the EMD is that the basis functions are derived from the signal itself. Hence, the analysis is adaptive, in contrast to traditional methods such as wavelets, where the basis functions are fixed. The EMD has received growing attention in terms of applications [2]-[3], interpretation [4]-[5], and improvement [6]-[7]. The EMD is also used in speech denoising [8]. Speech signal noise reduction is a well-known problem in signal processing. In particular, linear methods such as Wiener filtering [9] are widely used because linear filters are easy to design and implement. However, these methods are not effective when the noise cannot be estimated or when the noise is colored. To overcome these difficulties, nonlinear methods have been proposed, especially those based on wavelet thresholding [10]-[11]. A limitation of the wavelet approach is that the basis functions are fixed and thus do not necessarily match all real signals.

To overcome the drawbacks of the wavelet method, two strategies for noise reduction have been proposed in [8]: EMD associated with filtering, which is efficient for relatively low noise levels, and EMD associated with thresholding, which is attractive in particular for relatively high noise levels. However, only signals corrupted by additive white Gaussian noise are considered in [8].

In this paper, an adaptive denoising scheme associating EMD with the ACWA filter is proposed. The ACWA filter [12] and related variants are mainly used in image enhancement [13]. This filter operates adaptively in the time domain, which fits the EMD framework, and it requires neither stationarity of the signals nor whiteness of the noise. The effectiveness of the ACWA filter can be improved when it is combined with the EMD, since the IMFs are less noisy than the noisy speech. The proposed denoising method benefits from the advantages of the EMD and from the attractive properties of the ACWA filter, which is adaptive and easy to implement, to obtain good performance in the presence of white as well as colored noise.

II. EMD ALGORITHM

The EMD decomposes a given signal $x(t)$ into a series of IMFs, each with a distinct time scale, through an iterative process called sifting [1]. The decomposition is based on the local time scale of $x(t)$ and yields adaptive basis functions. The EMD can be seen as a type of wavelet decomposition whose subbands are built up as needed to separate the different components of $x(t)$. Each IMF captures the signal's detail at a certain scale or frequency band [4]. At each stage, the EMD picks out the highest-frequency oscillation that remains in $x(t)$. By definition, an IMF satisfies two conditions:

1) the number of extrema and the number of zero crossings may differ by no more than one;

2) the average value of the envelope defined by the local maxima and the envelope defined by the local minima is zero.
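Condition 1) can be checked directly on a sampled signal. The following is a minimal sketch, assuming strict local extrema on discrete samples and counting a zero crossing as a sign change between consecutive nonzero samples; the function names are illustrative and do not come from the paper.

```python
import math

def count_extrema(h):
    """Count strict interior local maxima and minima of a sampled signal."""
    n = 0
    for k in range(1, len(h) - 1):
        is_max = h[k] > h[k - 1] and h[k] > h[k + 1]
        is_min = h[k] < h[k - 1] and h[k] < h[k + 1]
        if is_max or is_min:
            n += 1
    return n

def count_zero_crossings(h):
    """Count sign changes between consecutive nonzero samples."""
    signs = [1 if v > 0 else -1 for v in h if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def satisfies_condition_1(h):
    """IMF condition 1: extrema and zero-crossing counts differ by at most one."""
    return abs(count_extrema(h) - count_zero_crossings(h)) <= 1

# A pure tone oscillates around zero; adding an offset removes every zero crossing.
tone = [math.sin(2 * math.pi * 5 * k / 200 + 0.3) for k in range(200)]
shifted = [v + 2.0 for v in tone]
print(satisfies_condition_1(tone), satisfies_condition_1(shifted))
```

The shifted tone keeps all its extrema but never crosses zero, which is the kind of riding wave that sifting is meant to remove.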

Thus, locally, each IMF contains lower-frequency oscillations than the one extracted just before it. The EMD does not use a predetermined filter or a wavelet function and is a fully data-driven method [1]. To be successfully decomposed into IMFs, the signal $x(t)$ must have at least two extrema, one minimum and one maximum. The sifting involves the following steps:

Step 1: Fix the threshold $\epsilon$ and set $j \leftarrow 1$ ($j$th IMF)

Step 2: $r_{j-1}(t) \leftarrow x(t)$ (residual)

2008 International Conference on Signals, Circuits and Systems

978-1-4244-2628-7/08/$25.00 ©2008 IEEE -1-


Step 3: Extract the $j$th IMF:

(a): $h_{j,i-1}(t) \leftarrow r_{j-1}(t)$, $i \leftarrow 1$ ($i$ is the number of sifts)

(b): Extract the local maxima/minima of $h_{j,i-1}(t)$

(c): Compute the upper and lower envelopes $U_{j,i-1}(t)$ and $L_{j,i-1}(t)$ by interpolating, using cubic splines, the local maxima and minima of $h_{j,i-1}(t)$, respectively

(d): Compute the mean of the envelopes: $\mu_{j,i-1}(t) = (U_{j,i-1}(t) + L_{j,i-1}(t))/2$

(e): Update: $h_{j,i}(t) := h_{j,i-1}(t) - \mu_{j,i-1}(t)$, $i := i + 1$

(f): Calculate the stopping criterion:

$$SD(i) = \sum_{t=1}^{T} \frac{|h_{j,i-1}(t) - h_{j,i}(t)|^2}{(h_{j,i-1}(t))^2}$$

(g): Repeat Steps (b)-(f) until $SD(i) < \epsilon$ and then put $\mathrm{IMF}_j(t) \leftarrow h_{j,i}(t)$ ($j$th IMF)

Step 4: Update the residual: $r_j(t) := r_{j-1}(t) - \mathrm{IMF}_j(t)$.

Step 5: Repeat Step 3 with $j := j + 1$ until the number of extrema in $r_j(t)$ is $\leq 2$.

Here $T$ is the duration of $x(t)$. The sifting is repeated several times ($i$) so that $h$ becomes a true IMF fulfilling conditions 1) and 2). As a result of the sifting, $x(t)$ is decomposed into a sum of $C$ IMFs and a residual $r_C(t)$:

$$x(t) = \sum_{j=1}^{C} \mathrm{IMF}_j(t) + r_C(t) \tag{1}$$

The value of $C$ is determined automatically using SD (Step 3(f)). The sifting has two effects: (a) it eliminates riding waves, and (b) it smooths uneven amplitudes. To guarantee that the IMF components retain enough physical sense of both amplitude and frequency modulation, a stopping value has to be set for the sifting. This is accomplished by limiting the size of the standard deviation SD computed from two consecutive sifting results. Usually, the threshold $\epsilon$ on SD is set between 0.2 and 0.3 [1].
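The sifting procedure of this section can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the envelopes use piecewise-linear interpolation instead of the cubic splines of Step 3(c) to stay within the standard library, a small constant guards the division in SD, and the caps `max_sifts` and `max_imfs` are hypothetical safeguards (the pointwise ratio in SD can become large near zero crossings).

```python
import math

def local_extrema(h):
    """Return index lists of strict interior local maxima and minima."""
    maxima, minima = [], []
    for k in range(1, len(h) - 1):
        if h[k] > h[k - 1] and h[k] > h[k + 1]:
            maxima.append(k)
        elif h[k] < h[k - 1] and h[k] < h[k + 1]:
            minima.append(k)
    return maxima, minima

def envelope(h, idx):
    """Piecewise-linear envelope through h at idx, held flat at both ends.
    Assumption: linear interpolation stands in for the cubic spline."""
    env = [h[idx[0]] if k <= idx[0] else h[idx[-1]] for k in range(len(h))]
    for a, b in zip(idx, idx[1:]):
        for k in range(a, b + 1):
            w = (k - a) / (b - a)
            env[k] = (1 - w) * h[a] + w * h[b]
    return env

def sift_one_imf(r, eps=0.25, max_sifts=50):
    """Step 3: sift the residual r until SD < eps (or max_sifts is reached)."""
    h_prev = list(r)
    for _ in range(max_sifts):
        maxima, minima = local_extrema(h_prev)
        if not maxima or not minima:
            break
        upper = envelope(h_prev, maxima)
        lower = envelope(h_prev, minima)
        h_new = [h - (u + l) / 2 for h, u, l in zip(h_prev, upper, lower)]
        # Stopping criterion SD(i); 1e-12 avoids division by zero (assumption)
        sd = sum((a - b) ** 2 / (a * a + 1e-12) for a, b in zip(h_prev, h_new))
        h_prev = h_new
        if sd < eps:
            break
    return h_prev

def emd(x, eps=0.25, max_imfs=10):
    """Steps 1-5: return (list of IMFs, final residual)."""
    imfs, r = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(r)
        if len(maxima) + len(minima) <= 2:  # Step 5 stopping rule
            break
        imf = sift_one_imf(r, eps)
        imfs.append(imf)
        r = [a - b for a, b in zip(r, imf)]  # Step 4
    return imfs, r

# Usage: a slow tone plus a faster, weaker tone
x = [math.sin(2 * math.pi * 3 * k / 400) + 0.5 * math.sin(2 * math.pi * 40 * k / 400)
     for k in range(400)]
imfs, residual = emd(x)
recon = [sum(imf[k] for imf in imfs) + residual[k] for k in range(len(x))]
max_err = max(abs(a - b) for a, b in zip(x, recon))  # checks Eq. (1)
```

Because each IMF is subtracted from the running residual, the sum of the IMFs and the final residual reproduces the input up to rounding error, whatever interpolation is used for the envelopes.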

III. THE EMD-ACWA DENOISING APPROACH

The proposed denoising method is shown in figure 1. The noisy signal $y(t)$, described by an additive model, is given by:

$$y(t) = x(t) + b(t), \tag{2}$$

where $x(t)$ is the clean speech signal and $b(t)$ denotes the noise signal. The noisy signal is decomposed into a sum of IMFs as follows:

$$y(t) = \sum_{j=1}^{C} \mathrm{IMF}_j(t) + r_C(t). \tag{3}$$

The extracted IMFs include the noise, since each IMF, indexed by $j$, can be approximated as follows:

$$\mathrm{IMF}_j(t) = f_j(t) + b_j(t), \tag{4}$$

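[Figure: block diagram. The noisy signal y(t) is decomposed by the EMD into IMF 1, IMF 2, IMF 3, ..., IMF C and a residual; each IMF passes through an ACWA filter, and the filtered IMFs and the residual are summed to give the estimate of x(t).]

Fig. 1. Denoising scheme.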

where $\mathrm{IMF}_j$ is a noisy version of the data $f_j$. An estimate $\hat{f}_j(t)$ of $f_j(t)$ based on the noisy observation $\mathrm{IMF}_j(t)$ is given by

$$\hat{f}_j(t) = \Gamma[\mathrm{IMF}_j(t)], \tag{5}$$

where $\Gamma[\mathrm{IMF}_j(t)]$ is a temporal processing using the ACWA filter. Finally, the estimated signal $\hat{x}(t)$ is given by:

$$\hat{x}(t) = \sum_{j=1}^{C} \hat{f}_j(t) + r_C(t) \tag{6}$$

The denoising of the IMF by the ACWA filter is given as follows [12]:

$$\hat{f}_j(t) = \begin{cases} F_{mean} + K_j\,(\mathrm{IMF}_j(t) - F_{mean}), & \text{if } F_{var} \geq \sigma_j^2 \\ F_{mean}, & \text{otherwise} \end{cases} \tag{7}$$

$$K_j = 1 - \frac{\sigma_j^2}{F_{var}}, \tag{8}$$

where $F_{mean}$ and $F_{var}$ denote respectively the average and the variance of the IMF computed over a sliding window of length $L$, and $\sigma_j^2$ designates the variance of the noise contained in the IMF indexed by $j$.
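Equations (7) and (8) translate almost line by line into code. The sketch below is one possible reading, assuming a window centered on the current sample, truncated at the signal edges, and a population (divide-by-N) variance; the paper does not state these details, and the function name `acwa_filter` is illustrative.

```python
def acwa_filter(imf, noise_var, window=511):
    """Apply Eqs. (7)-(8) sample by sample over a centered sliding window.
    Assumptions: window truncated at the edges, population variance."""
    half = window // 2
    n = len(imf)
    out = []
    for t in range(n):
        seg = imf[max(0, t - half): min(n, t + half + 1)]
        f_mean = sum(seg) / len(seg)
        f_var = sum((v - f_mean) ** 2 for v in seg) / len(seg)
        if f_var >= noise_var and f_var > 0:
            k = 1.0 - noise_var / f_var           # Eq. (8)
            out.append(f_mean + k * (imf[t] - f_mean))
        else:
            out.append(f_mean)                    # local variance below noise level
    return out

# Two limiting cases on a short ramp with a 3-sample window
ramp = [1.0, 2.0, 3.0, 4.0, 5.0]
unchanged = acwa_filter(ramp, noise_var=0.0, window=3)   # K_j = 1 everywhere
smoothed = acwa_filter(ramp, noise_var=100.0, window=3)  # falls back to the local mean
```

With zero noise variance the gain $K_j$ equals one and the sample passes through unchanged; when the noise variance exceeds the local variance, the filter reduces to a moving average. This direct form costs on the order of $L$ operations per sample, so running sums would be preferable for a window of 511 on long signals.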

The noise level $\sigma_j$ is estimated as in [14], [15], [16]:

$$\sigma_j = 1.4826 \times \mathrm{Median}\{|\mathrm{IMF}_j(t) - \mathrm{Median}\{\mathrm{IMF}_j(t)\}|\}. \tag{9}$$
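Equation (9) is the scaled median absolute deviation, which is robust to a few large samples. A minimal sketch using only the Python standard library follows; the function name `noise_level` is illustrative.

```python
import statistics

def noise_level(imf):
    """Estimate sigma_j as 1.4826 times the median absolute deviation, Eq. (9)."""
    med = statistics.median(imf)
    return 1.4826 * statistics.median(abs(v - med) for v in imf)

# One large outlier barely affects the estimate
sigma = noise_level([1.0, 2.0, 3.0, 4.0, 100.0])
```

Here the median is 3, the absolute deviations are 2, 1, 0, 1 and 97, and their median is 1, so the estimate is 1.4826 despite the outlier at 100. Squaring this value gives the $\sigma_j^2$ used in (7) and (8).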

Classically, the ACWA filter has been used in image enhancement applications. It can also be interesting and effective for audio signal enhancement. As shown by (7), this filter operates in the time domain, which corresponds well to the EMD framework. In contrast to classical filters such as the Wiener filter, all the parameters are computed in the time domain, and hence no transformation to the frequency domain is needed. Besides, as the signal is enhanced sample by sample, the hypotheses of signal stationarity and noise whiteness are relaxed. In particular, this filter can perform in general noisy contexts: white as well as colored noise, high as well as low noise levels.

IV. EXPERIMENTAL RESULTS

The proposed noise reduction method is tested on a speech signal corrupted by different noises whose levels are fixed through the input Signal to Noise Ratio (SNR),

$$\mathrm{SNR}_{in} = 10 \log_{10} \frac{\sum_{t=1}^{T} (x(t))^2}{\sum_{t=1}^{T} (y(t) - x(t))^2}, \tag{10}$$

where $x(t)$ and $y(t)$ are respectively the clean and the noisy signals. The results obtained by the proposed method are compared to those of the wavelet approach (Daubechies 8) and the ACWA filter. We use the ACWA filter as a comparison method because it gives better results than the MMSE filter [17]. As an objective criterion to evaluate the performance of the denoising method, we use the output Signal to Noise Ratio:

$$\mathrm{SNR}_{out} = 10 \log_{10} \frac{\sum_{t=1}^{T} (x(t))^2}{\sum_{t=1}^{T} (x(t) - \hat{x}(t))^2}, \tag{11}$$

where $\hat{x}(t)$ is the reconstructed signal.
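Equations (10) and (11) share the same form, so a single function covers both. The sketch below also shows one way to scale a noise recording so that the mixture reaches a prescribed input SNR; the helper `mix_at_snr` and the toy signals are assumptions for illustration, since the paper does not describe how the noisy signals were generated.

```python
import math

def snr_db(clean, other):
    """10*log10(sum clean^2 / sum (other - clean)^2), the form of Eqs. (10)-(11)."""
    signal_energy = sum(c * c for c in clean)
    error_energy = sum((o - c) ** 2 for c, o in zip(clean, other))
    return 10.0 * math.log10(signal_energy / error_energy)

def mix_at_snr(clean, noise, target_snr_db):
    """Scale noise so that clean + gain*noise has the requested input SNR."""
    signal_energy = sum(c * c for c in clean)
    noise_energy = sum(b * b for b in noise)
    gain = math.sqrt(signal_energy / (noise_energy * 10.0 ** (target_snr_db / 10.0)))
    return [c + gain * b for c, b in zip(clean, noise)]

# Toy signals standing in for speech and noise (hypothetical data)
clean = [math.sin(0.1 * k) for k in range(500)]
noise = [((k * 37) % 11 - 5) / 5.0 for k in range(500)]
noisy = mix_at_snr(clean, noise, -2.0)
snr_in = snr_db(clean, noisy)  # input SNR of the mixture, Eq. (10)
```

Passing a denoised estimate as the second argument of `snr_db` gives the output SNR of (11), and the difference between the two values is the SNR gain.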

We take as examples two speech signals, "a" and "b". These signals are corrupted by the colored noise "f16" with the SNR fixed to -2 dB. The original signals and their noisy versions are depicted in figure 2. The size $L$ of the ACWA filter window is fixed to 511. This choice is justified by the results shown in figure 3, which displays the variation of the SNRout versus $L$ for two values of SNRin: -2 dB and 0 dB. Figure 3 shows that around $L = 511$ the SNRout remains almost constant.

The denoised versions of signals "a" and "b" obtained by the EMD-ACWA, the wavelet thresholding (db8), and the ACWA filter are shown in figures 4 and 5, respectively. The SNRin is fixed to -2 dB. We choose db8 with a hard threshold as the comparison tool because it gives good results compared with other wavelets. A careful comparative examination of the signals in figures 4 and 5 shows that the EMD-ACWA performs better than the wavelet (db8) and the ACWA filter in terms of noise reduction.

Figures 6, 7 and 8 show the variation of the SNRout versus the SNRin for the denoising of signal "a" when corrupted respectively by white Gaussian noise, the colored f16 noise, and the colored factory noise. These results demonstrate the effectiveness of the proposed method. Indeed, the improvement in SNR provided by the EMD-ACWA is much higher than that given by the wavelet method and the ACWA filter. A significant SNR improvement, varying from 4.2 dB to 17.4 dB, is achieved by the EMD-ACWA method. Even for very low SNR values, the proposed method remains effective in removing the noise components, as the gain in SNR can reach 14 dB. Note that when listening to the enhanced speech, the EMD-ACWA produces lower residual noise and noticeably less speech distortion than the wavelet (db8) method and the ACWA filter.

V. CONCLUSION

In this paper, a new speech enhancement method that effectively removes the noise components is presented. We have combined two powerful adaptive methods: the EMD and ACWA filtering. The results obtained for speech signals contaminated with different noises, at SNR values ranging from -10 dB to 10 dB, show that the proposed method performs better than the wavelet approach and the ACWA filter. In addition, the reported results demonstrate that the EMD-ACWA denoising method is effective for noise removal and confirm that it is a very attractive method to use in general noisy contexts.

REFERENCES

[1] N.E. Huang et al. The empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Royal Society, 454(1971):903–995, 1998.

[2] F. Salzenstein, A.O. Boudraa, J.C. Cexus, and L. Guillon. IF estimation using empirical mode decomposition and nonlinear Teager energy operator. Proc. IEEE ISCCSP, pages 45–48, Hammamet, 2004.

[3] A.O. Boudraa, S. Benramdane, J.C. Cexus, and J.A. Astolfi. Transient turbulent pressure signal processing using empirical mode decomposition. Proc. Physics in Signal and Image Processing, Mulhouse, 2007.

[4] P. Flandrin, G. Rilling, and P. Goncalves. Empirical mode decomposition as a filter bank. IEEE Sig. Proc. Lett., 11(2):112–114, 2004.

[5] Z. Wu and N.E. Huang. A study of the characteristics of white noise using the empirical mode decomposition method. Proc. Roy. Soc. London A, 460:1597–1611, 2004.

[6] B. Weng and K.E. Barner. Optimal and bidirectional optimal empirical mode decomposition. Proc. IEEE ICASSP, 3:1501–1504, Toulouse, 2007.

[7] R. Deering and J.F. Kaiser. The use of a masking signal to improve empirical mode decomposition. Proc. IEEE ICASSP, 4:485–488, Philadelphia, 2005.

[8] K. Khaldi, A.O. Boudraa, A. Bouchiki, M. Turki-Hadj Alouane, and E. Samba Diop. Speech signal noise reduction by EMD. In Proc. IEEE ISCCSP, Malta, March 2008.

[9] J.G. Proakis and D.G. Manolakis. Digital Signal Processing: Principles, Algorithms, and Applications, volume 1. Prentice-Hall, 3rd edition, 1996.

[10] D.L. Donoho. De-noising by soft-thresholding. IEEE Trans. Inform. Theory, 41(3):613–627, 1995.

[11] D.L. Donoho and I.M. Johnstone. Ideal spatial adaptation via wavelet shrinkage. Biometrika, 81:425–455, 1994.

[12] J.S. Lee. Digital image enhancement and noise filtering by using local statistics. IEEE Trans. Pattern Anal. Mach. Intell., 2(4):165–168, March 1980.

[13] Masayuki Meguro, Akira Taguchi, and Nozomu Hamada. Data-dependent weighted average filtering for image sequence restoration. Electronics and Communications in Japan, Part III: Fundamental Electronics Science, 84(4):1–10, 2000.

[14] A.O. Boudraa and J.C. Cexus. Denoising via empirical mode decomposition. In Proc. IEEE ISCCSP, Marrakech, Morocco, 2006.

[15] A.O. Boudraa, J.C. Cexus, and Z. Saidi. EMD-based signal noise reduction. Int. J. Sig. Process., 1(1):33–37, 2004. ISSN: 1304-4494.


[16] William H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery. Numerical Recipes in C: The Art of Scientific Computing, volume 1. Cambridge University Press, 2nd edition, 1992.

[17] M. Turki-Hadj Alouane, K. Khaldi, and A.O. Boudraa. Voiced speech enhancement based on adaptive filtering of selected intrinsic mode functions. Advances in Adaptive Data Analysis (AADA), 2008 (submitted).

[Figure: four stacked panels labeled "a", "b", "noisy a" and "noisy b"; x-axis: Time (0 to 4.5 × 10^4); y-axis: Amplitude (-1 to 1).]

Fig. 2. The original signals ("a" and "b") and their noisy versions (f16 noise with SNR = -2 dB).

[Figure: two curves, one for initial SNR = -2 dB and one for initial SNR = 0 dB; x-axis: (L) size of the ACWA filter window (0 to 1200); y-axis: SNR gain [dB] (6 to 11).]

Fig. 3. The variation of the SNRout for the noisy signal "a" versus L, the size of the ACWA filter window (f16 noise with SNR = -2 dB and SNR = 0 dB).

[Figure: three stacked panels labeled "Wavelet (db8)", "ACWA filter" and "EMD-ACWA"; x-axis: Time (0 to 4.5 × 10^4); y-axis: amplitude (-1 to 1).]

Fig. 4. Denoised version of the signal "a" obtained by the EMD-ACWA, the wavelet (db8) and the ACWA filter (f16 noise with SNR = -2 dB).

[Figure: three stacked panels labeled "Wavelet (db8)", "ACWA filter" and "EMD-ACWA"; x-axis: Time (0 to 4.5 × 10^4); y-axis: amplitude (-1 to 1).]

Fig. 5. Denoised version of the signal "b" obtained by the EMD-ACWA, the wavelet (db8) and the ACWA filter (f16 noise with SNR = -2 dB).

[Figure: three curves labeled "EMD-ACWA", "Wavelet (Daubechies 8)" and "ACWA filter"; x-axis: Initial SNR [dB] (-10 to 10); y-axis: SNR output [dB] (0 to 18).]

Fig. 6. Variation of the SNRout versus the SNRin for the denoising of the signal "a" corrupted by white Gaussian noise.


[Figure: three curves labeled "EMD-ACWA", "Wavelet (Daubechies 8)" and "ACWA filter"; x-axis: Initial SNR [dB] (-10 to 10); y-axis: SNR output [dB] (0 to 18).]

Fig. 7. Variation of the SNRout versus the SNRin for the denoising of the signal "a" corrupted by the f16 noise.

[Figure: three curves labeled "EMD-ACWA", "Wavelet (Daubechies 8)" and "ACWA filter"; x-axis: Initial SNR [dB] (-10 to 10); y-axis: SNR output [dB] (0 to 18).]

Fig. 8. Variation of the SNRout versus the SNRin for the denoising of the signal "a" corrupted by the factory noise.
