
Signal Processing Algorithms for Wireless Acoustic Sensor Networks

Alexander Bertrand

Electrical Engineering Department (ESAT), Katholieke Universiteit Leuven

06-07-2010, University of Oldenburg, MEDI-AKU-SIGNAL Kolloquium

Outline

1. Introduction

2. Multi-channel Wiener filter (MWF)

3. Example: distributed MWF in binaural hearing aids

4. DANSE in fully connected WASN

5. Tree-DANSE

6. Multi-speaker VAD

7. Subset selection

8. Conclusions

Noise reduction (MWF, DANSE); tracking of speech power (multi-speaker VAD)


Traditional sensor array DSP

centralized processing

known / fixed sensor positions

Sensor array DSP

Long distance (SNR drops by about 6 dB for each doubling of distance)

Sharp angle

#microphones is limited
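The 6 dB figure follows from spherical spreading. As a rough sketch, assuming a point source in free field and a noise floor that does not depend on the microphone position (both are assumptions, not stated on the slide), the SNR loss between two distances can be computed as:

```python
import math

def snr_drop_db(d_near: float, d_far: float) -> float:
    """SNR loss in dB when moving from d_near to d_far (free-field point source)."""
    # Sound pressure decays as 1/d, so power decays as 1/d^2.
    return 20.0 * math.log10(d_far / d_near)

print(round(snr_drop_db(1.0, 2.0), 2))   # one doubling of distance -> 6.02
print(round(snr_drop_db(0.5, 4.0), 2))   # three doublings -> 18.06
```

The function name `snr_drop_db` is hypothetical. In a reverberant room the loss per doubling is typically smaller, so this is an upper-bound style illustration only.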


Distributed sensor arrays

Wireless acoustic sensor network (WASN)

• More spatial information

• More sensors

• Subset: high SNR recordings

Distributed sensor arrays

• Challenges

1) Unknown/changing positions, link failure → ADAPTIVE

2) Bandwidth efficiency

3) Distributed processing

4) Subset selection



Multi-channel Wiener Filtering (MWF)

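$\min_{\mathbf{w}(\omega)} E\{ | d_1(\omega) - \mathbf{w}(\omega)^H \mathbf{y}(\omega) |^2 \}$

- Goal: estimate speech component $d_1(\omega)$ in 1 of the N microphones

- Output = sum of filtered microphone signals, $\mathbf{w}(\omega)^H \mathbf{y}(\omega)$

[Figure: microphone signals, e.g. $y_1(\omega) = d_1(\omega) + n_1(\omega)$, pass through filters W1 to W4 and are summed (+) to give the clean speech estimate]

Signal model:

$\mathbf{y}(\omega) = \mathbf{d}(\omega) + \mathbf{n}(\omega)$

Solution:

$\mathbf{w}(\omega) = \mathbf{R}_{yy}(\omega)^{-1} \mathbf{r}_{yd}(\omega)$

$\mathbf{R}_{yy}(\omega) = E\{ \mathbf{y}(\omega) \mathbf{y}(\omega)^H \}$

$\mathbf{r}_{yd}(\omega) = E\{ \mathbf{y}(\omega) d_1^*(\omega) \} = E\{ \mathbf{d}(\omega) d_1^*(\omega) \} = \mathbf{R}_{dd}(\omega) [1 \; 0 \; \dots \; 0]^T$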




- Needs:
  - N x N noise+speech correlation matrix Ryy
  - N x 1 clean speech correlation (column of Rdd)

- Rdd can be estimated as Rdd = Ryy - Rnn, using a voice activity detection (VAD) mechanism to identify the noise-only frames from which Rnn is computed


RECAP

- Given: N microphone signals

- Choose one (arbitrary) reference microphone

- MWF computes optimal filters such that the sum of the filter outputs is as close as possible (in the mean-square sense) to the speech component in the reference microphone
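The recap can be sketched in a few lines of NumPy. This is a minimal illustration under several assumptions that are not on the slides: real-valued signals in a single frequency bin, a perfect (oracle) VAD, and a synthetic scene with made-up gains `a` and `b`. The function name `mwf_filter` is hypothetical.

```python
import numpy as np

def mwf_filter(Y, vad, ref=0):
    """Estimate the MWF for one frequency bin.

    Y   : (T, N) array, T frames of N microphone signals (real-valued here)
    vad : (T,) boolean array, True for frames that contain speech
    ref : index of the reference microphone
    """
    Ryy = Y[vad].T @ Y[vad] / vad.sum()          # speech+noise correlation
    Rnn = Y[~vad].T @ Y[~vad] / (~vad).sum()     # noise-only correlation
    Rdd = Ryy - Rnn                              # speech correlation estimate
    return np.linalg.solve(Ryy, Rdd[:, ref])     # w = Ryy^{-1} Rdd e_ref

# Synthetic scene (all values are illustrative assumptions).
rng = np.random.default_rng(0)
T = 20000
vad = (np.arange(T) // 500) % 2 == 0             # speech on/off in blocks
s = rng.standard_normal(T) * vad                 # on/off "speech" source
v = rng.standard_normal(T)                       # localized noise source
a = np.array([1.0, 0.8, 0.6, 0.4])               # speech gains per microphone
b = np.array([0.5, -0.7, 0.9, -0.3])             # noise gains per microphone
D = np.outer(s, a)                               # speech components d
Y = D + np.outer(v, b) + 0.1 * rng.standard_normal((T, 4))

w = mwf_filter(Y, vad)
err_in = np.mean((Y[vad, 0] - D[vad, 0]) ** 2)   # error of the raw reference mic
err_out = np.mean((Y[vad] @ w - D[vad, 0]) ** 2) # error of the MWF output
print(err_in, err_out)
```

Comparing `err_in` and `err_out` shows how much closer the filtered sum is to the speech component of the reference microphone than the unprocessed microphone signal.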

Noise frame: destructive interference

Speech frame: constructive interference

[Figure: filter outputs F1 to F4 are summed (+); noise = electro music]


Example: binaural hearing aids

MWF left MWF right

Binaural link

large bandwidth needed

full matrix inversion

= 2-node WASN

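[Figure: the left hearing aid applies w11 to its own microphones and g12 to the signal received over the binaural link; the right hearing aid applies w22 and g21; each output is a sum (+)]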

Converges to optimum if single desired source

(Doclo et al., 2007)


Motivation for DANSE

• > 2 nodes, e.g. supporting external sensor nodes or multiple hearing aid users.

• Multiple desired sources, e.g. conversation monitoring.


DANSE

• The previous cases require a more general framework: distributed adaptive node-specific signal estimation (DANSE)

• Allows for multiple nodes (fully connected topology)

• Allows for multiple target sources: estimating K sources requires communication of K-channel signals (DANSE_K)

DANSE

Considered here:

• Fully connected WSN

• Multi-channel sensor signal observations

• Goal: each node estimates a node-specific signal, but all these signals share a common latent signal subspace (dimension = # targets)

3 nodes, fully connected

Binaural hearing aids (revisited)

[Figure: two-node scheme with local filters w11 and w22, gains g12 and g21, and summations (+), connected by the binaural link]

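J=2, DANSE_2 (K=2)

[Figure: each node has two filter sets: w11(1), g12(1) and w11(2), g12(2) at node 1; w22(1), g21(1) and w22(2), g21(2) at node 2. The additional set produces auxiliary channels that capture the signal space. The nodes are connected by the binaural link.]

Converges to optimum if #desired sources ≤ 2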

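J=2, DANSE_K

[Figure: the two nodes exchange K-channel signals z1 and z2 over the binaural link; node 1 applies W11 and G12, node 2 applies G21 and W22, giving the K-channel outputs d1 and d2]

Converges to optimum if K = # desired sources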
Sequential updating

Sequential round-robin update
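A compact way to see the round-robin scheme is a batch simulation of DANSE with K=1. The sketch below makes several simplifying assumptions that are not in the slides: real-valued instantaneous mixtures instead of frequency-domain signals, a single desired source, and an oracle desired signal in place of the VAD-based estimate of Rdd. All function names are hypothetical.

```python
import numpy as np

def ls_filter(X, d):
    """Sample-based Wiener filter: minimizes mean((d - X @ w)**2)."""
    return np.linalg.solve(X.T @ X, X.T @ d)

def mse(X, w, d):
    return float(np.mean((d - X @ w) ** 2))

def danse1_sequential(Y, D, n_rounds=10):
    """Y[k]: (T, M_k) sensor signals of node k, D[k]: (T,) desired signal of node k.

    Returns the MSE of each node, measured right after its last update.
    """
    J = len(Y)
    # Filter that each node applies to its own sensors before broadcasting.
    # Initial choice (assumption): pass the first microphone through.
    W = [np.eye(Y[k].shape[1])[:, 0] for k in range(J)]
    err = [None] * J
    for _ in range(n_rounds):
        for k in range(J):                       # round-robin: one node at a time
            # Single-channel signals broadcast by the other nodes.
            Z = np.column_stack([Y[q] @ W[q] for q in range(J) if q != k])
            X = np.column_stack([Y[k], Z])       # own sensors + received signals
            w = ls_filter(X, D[k])               # local MWF at node k
            W[k] = w[: Y[k].shape[1]]            # part acting on own sensors
            err[k] = mse(X, w, D[k])
    return err

# Synthetic 3-node network (all values are illustrative assumptions).
rng = np.random.default_rng(1)
T, J, M = 5000, 3, 3
s = rng.standard_normal(T)                       # single desired source
v = rng.standard_normal(T)                       # single localized noise source
Y, D = [], []
for k in range(J):
    d = np.outer(s, rng.uniform(0.5, 1.5, M))    # desired component at node k
    n = np.outer(v, rng.uniform(-1.0, 1.0, M)) + 0.3 * rng.standard_normal((T, M))
    Y.append(d + n)
    D.append(d[:, 0])                            # speech in the node's first mic

Y_all = np.column_stack(Y)
err_central = [mse(Y_all, ls_filter(Y_all, D[k]), D[k]) for k in range(J)]
err_local = [mse(Y[k], ls_filter(Y[k], D[k]), D[k]) for k in range(J)]
err_danse = danse1_sequential(Y, D)
print(err_local, err_danse, err_central)
```

Each node only receives one channel per neighbour, yet its error can be compared against both the local-only filter (`err_local`) and the centralized MWF that has access to all sensors (`err_central`).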


DANSE with simultaneous updating

- Simultaneous updating: parallel computing

- Sometimes convergence to optimal solution, but not always

- Solution: relaxation yields convergence and optimality:

$\mathbf{W}^{i+1} = (1-\alpha) \mathbf{W}^{i} + \alpha \mathbf{W}^{new}$
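The effect of relaxation can be illustrated on a scalar toy problem. The update rule below is an assumption chosen for illustration only and is not DANSE itself: its fixed point is x = 1, but applying it without relaxation makes the iterate flip between two values, similar in spirit to nodes that update simultaneously and overshoot.

```python
def relaxed_iteration(update, x0, alpha, n_iter=50):
    """x <- (1 - alpha) * x + alpha * update(x); alpha = 1 means no relaxation."""
    x = x0
    for _ in range(n_iter):
        x = (1.0 - alpha) * x + alpha * update(x)
    return x

def toy_update(x):
    # Hypothetical update rule with fixed point x = 1.
    return 2.0 - x

print(relaxed_iteration(toy_update, x0=0.0, alpha=1.0))   # keeps flipping: 0.0
print(relaxed_iteration(toy_update, x0=0.0, alpha=0.5))   # settles at 1.0
```

With `alpha = 1.0` the iterate alternates between 0 and 2 forever, while `alpha = 0.5` averages the old and new values and lands on the fixed point.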


Without relaxation (S-DANSE)

4 nodes, 3-6 sensors/node



With relaxation (rS-DANSE)

4 nodes, 3-6 sensors/node



DANSE audio demo (tracking omitted)

Unfiltered

rS-DANSE

Centralized MWF


Robust DANSE

- Theory: DANSE == centralized MWF, but…

- Numerical errors due to:

  - Estimation errors in Rdd (especially at low SNR nodes) → ripple effect

  - Reference microphones are close to each other → ill-conditioned basis for signal subspace

- Solution: estimate speech component in communicated signals, preferably from high SNR nodes (= Robust DANSE or R-DANSE)

- Convergence is proven under certain dependency conditions



What if not fully connected?


Nodes must pass on information from other nodes

1) Nodes act as relays (virtually fully connected):
   - huge increase in bandwidth if limited connections
   - routing problem

2) Nodes broadcast the sum of all filtered inputs:
   - no increase in bandwidth
   - no routing problem (?)


FEEDBACK !!



- Intuition

- Theoretical analysis

- Conclusion: feedback causes major problems

- Direct feedback (one edge) vs. indirect feedback (loops)

Direct feedback cancellation

• Transmitter feedback cancellation

• Receiver feedback cancellation

- Prune to tree topology → T-DANSE (= still optimal output!!)
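The slides do not say how the tree is selected. As one possible sketch, a breadth-first spanning tree removes all loops from a connected network while keeping every node reachable; the function name `prune_to_tree` and the example network are hypothetical.

```python
from collections import deque

def prune_to_tree(adjacency, root):
    """Return the edges of a breadth-first spanning tree of a connected graph."""
    visited = {root}
    tree_edges = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency[node]):
            if neighbour not in visited:
                visited.add(neighbour)
                tree_edges.append((node, neighbour))
                queue.append(neighbour)
    return tree_edges

# Hypothetical 5-node network with loops (1-2-3 and 2-3-4 form cycles).
network = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3, 5}, 5: {4}}
print(prune_to_tree(network, root=1))   # [(1, 2), (1, 3), (2, 4), (4, 5)]
```

The result has one edge fewer than the number of nodes, so no loops remain and only direct feedback (over a single edge) has to be cancelled.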


Multi-speaker VAD

- Goal: Track individual speech power of multiple simultaneous speakers or other non-stationary sources (VAD)

- Exploit spatial diversity from WASN

[Figure: speakers and microphones in an ad-hoc arrangement]

• Ad-hoc microphone array

• Assumptions:

1. Speakers in near-field
2. Speakers are independent
3. Limited noise/reverberance
4. Sources to track are well-grounded (= they attain zero-values)

• Advantages:

• Array geometry may be unknown

• Speaker positions may be unknown

• Energy-based → low data rate, synchronization not crucial

→ WASNs!

Data model

Non-negative blind source separation

- Theorem (Plumbley, 2002):

“An orthogonal mixture of non-negative, well-grounded source signals, that preserves non-negativity, is a permutation of the original signals.”

Exploiting non-negativity and well-groundedness (J=N=2 example)

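[Figure: scatter plots in the (s1, s2) plane of the sources, of the mixture y = As, and of the whitened measurements; the remaining rotation is unresolved (?)]

Orthogonal transformation preserves uncorrelatedness → simple decorrelation (whitening) of measurements gives original up to a rotation

- Well-grounded source signals

[Figure: the same sequence of scatter plots (sources, y = As, whitened) for well-grounded sources; the rotation that restores non-negativity is resolved (!)]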

Non-negative blind source separation


- Two different techniques:

1. - Whitening, ignoring non-negativity constraints (=easy)

- Search for rotation matrix that restores non-negativity (=hard)

2. Whitening with non-negativity constraints (=hard)

- 1st approach (Oja & Plumbley) = NPCA (Non-negative principal component analysis)

- 2nd approach (Bertrand & Moonen) = MNICA (Multiplicative non-negative independent component analysis)
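For the J=N=2 case, the first technique (whiten, then rotate) can be sketched as follows. Note that the brute-force angle search below is a stand-in chosen for brevity; it is neither the NPCA learning rule nor MNICA. The synthetic power tracks and the mixing matrix `A` are assumptions, and all function names are hypothetical.

```python
import numpy as np

def whiten(Y):
    """Decorrelate the columns of Y (T, N) without removing the mean."""
    C = np.cov(Y, rowvar=False)                  # covariance uses centered data
    vals, vecs = np.linalg.eigh(C)
    V = vecs @ np.diag(vals ** -0.5) @ vecs.T    # C^{-1/2}
    return Y @ V.T                               # the data itself stays uncentered

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def restore_nonnegativity(Z, n_angles=720):
    """Brute-force search for the rotation with the least negative energy."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    cost = [np.mean(np.minimum(Z @ rotation(t).T, 0.0) ** 2) for t in angles]
    return Z @ rotation(angles[int(np.argmin(cost))]).T

rng = np.random.default_rng(0)
T = 10000
# Two independent, non-negative, well-grounded "speech power" tracks:
# each speaker is silent (exactly zero) about half of the time.
S = rng.exponential(1.0, (T, 2)) * (rng.random((T, 2)) < 0.5)
A = np.array([[1.0, 0.4], [0.3, 1.0]])           # hypothetical mixing matrix
Y = S @ A.T                                      # y = A s for every frame
S_hat = restore_nonnegativity(whiten(Y))
corr = np.corrcoef(S_hat.T, S.T)[:2, 2:]         # rows: estimates, columns: sources
print(np.round(corr, 2))
```

Each row of `corr` should have one entry close to 1, meaning every recovered track matches one source up to scaling and ordering, in line with the permutation statement of the theorem. A grid search only scales to two sources; for larger J the rotation search becomes the hard part, which is what motivates the dedicated algorithms.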

MNICA: results
