Post on 04-Jan-2016

Signal Processing Algorithms for Wireless Acoustic Sensor Networks

Alexander Bertrand

Electrical Engineering Department (ESAT), Katholieke Universiteit Leuven

06-07-2010, University of Oldenburg, MEDI-AKU-SIGNAL Kolloquium

Outline

1. Introduction

2. Multi-channel Wiener filter (MWF)

3. Example: distributed MWF in binaural hearing aids

4. DANSE in fully connected WASN

5. Tree-DANSE

6. Multi-speaker VAD

Tracking of speech power

Noise reduction

Traditional sensor array DSP

centralized processing

known / fixed sensor positions

Sensor array DSP

Long distance (SNR drops 6 dB for each doubling of distance)

Sharp angle

#microphones is limited
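The 6 dB figure follows from spherical spreading of the direct sound. As a worked check, assuming a point source in free field (no reverberation) and a noise floor that does not depend on the microphone position, the sound pressure p at distance r gives:

```latex
p(r) \propto \frac{1}{r}
\quad\Rightarrow\quad
\Delta L = 20 \log_{10}\frac{r_2}{r_1}
         = 20 \log_{10} 2 \approx 6.02\ \text{dB}
\qquad (r_2 = 2 r_1)
```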

Distributed sensor arrays

Wireless acoustic sensor network (WASN)

• More spatial information

• More sensors

• Subset: high SNR recordings

Distributed sensor arrays

• Challenges

1) Unknown/changing positions, link failure → ADAPTIVE

2) Bandwidth efficiency

3) Distributed processing

4) Subset selection

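Multi-channel Wiener Filtering (MWF)

- Goal: estimate speech component in 1 of the N microphones

- Output = sum of filtered microphone signals:

$\min_{\mathbf{w}(\omega)} E\{\, |d_1(\omega) - \mathbf{w}(\omega)^H \mathbf{y}(\omega)|^2 \,\}$

- Signal model: $\mathbf{y}(\omega) = \mathbf{d}(\omega) + \mathbf{n}(\omega)$, with reference microphone $y_1(\omega) = d_1(\omega) + n_1(\omega)$

[Figure: microphone signals are filtered by W1, W2, W3, W4 and summed (+) to give the clean speech estimate]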

Multi-channel Wiener Filtering (MWF)

- Solution: $\mathbf{w}(\omega) = \mathbf{R}_{yy}(\omega)^{-1}\, \mathbf{r}_{yd}(\omega)$

- $\mathbf{R}_{yy}(\omega) = E\{\mathbf{y}(\omega)\, \mathbf{y}(\omega)^H\}$

- $\mathbf{r}_{yd}(\omega) = E\{\mathbf{y}(\omega)\, d_1(\omega)^*\} = E\{\mathbf{d}(\omega)\, d_1(\omega)^*\} = \mathbf{R}_{dd}(\omega)\, [1\ 0\ \dots\ 0]^T$

Multi-channel Wiener Filtering (MWF)

- Needs:
  - N x N noise+speech correlation matrix Ryy
  - N x 1 clean speech correlation (column of Rdd)

- Rdd can be estimated as Rdd = Ryy - Rnn, with Ryy measured in speech+noise frames and Rnn in noise-only frames, as labeled by a voice activity detection (VAD) mechanism

Multi-channel Wiener Filtering (MWF)

RECAP

- Given: N microphone signals

- Choose one (arbitrary) reference microphone

- MWF computes the optimal filters such that the sum of their outputs is as close as possible (in the mean-squared-error sense) to the speech component in the reference microphone
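The recap can be sketched numerically. The snippet below is a minimal illustration, not the speaker's implementation: it assumes a single frequency bin, synthetic complex data, an oracle VAD, and a hypothetical steering vector `a`; the function name `mwf_filter` is also made up for this example.

```python
import numpy as np

def mwf_filter(Y, vad, ref=0):
    """MWF for one frequency bin.

    Y   : (N, T) complex microphone coefficients
    vad : (T,) bool, True where speech is active (oracle VAD, an assumption)
    ref : index of the reference microphone
    """
    Ys, Yn = Y[:, vad], Y[:, ~vad]
    Ryy = Ys @ Ys.conj().T / Ys.shape[1]      # speech+noise correlation matrix
    Rnn = Yn @ Yn.conj().T / Yn.shape[1]      # noise-only correlation matrix
    Rdd = Ryy - Rnn                           # clean speech correlation estimate
    return np.linalg.solve(Ryy, Rdd[:, ref])  # w = Ryy^{-1} * (column of Rdd)

rng = np.random.default_rng(0)
N, T = 4, 20000
a = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # hypothetical steering vector
vad = rng.random(T) < 0.5
s = (rng.standard_normal(T) + 1j * rng.standard_normal(T)) * vad
D = np.outer(a, s)                                          # speech components d
Y = D + rng.standard_normal((N, T)) + 1j * rng.standard_normal((N, T))

w = mwf_filter(Y, vad)
d_hat = w.conj() @ Y                                        # output = sum of filtered microphone signals

# Error with respect to the speech component in the reference microphone
mse_in = np.mean(np.abs(Y[0, vad] - D[0, vad]) ** 2)
mse_out = np.mean(np.abs(d_hat[vad] - D[0, vad]) ** 2)
print(f"MSE at reference microphone: {mse_in:.3f}, after MWF: {mse_out:.3f}")
```

In a real system the correlation matrices would be tracked adaptively per frequency bin and the VAD would itself be an estimate.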

Noise frame: destructive interference

Speech frame: constructive interference

[Figure: outputs of filters F1, F2, F3, F4 are summed (+); noise = electro music in both cases]

Outline

1. Introduction

2. Multi-channel Wiener filter (MWF)

3. Example: distributed MWF in binaural hearing aids

4. DANSE in fully connected WASN

5. Tree-DANSE

6. Multi-speaker VAD

7. Subset selection

8. Conclusions

Example: binaural hearing aids

MWF left MWF right

Binaural link

large bandwidth needed

full matrix inversion

= 2-node WASN

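Example: binaural hearing aids

[Figure: left device with filter w11 and link coefficient g12, right device with filter w22 and link coefficient g21; each device sums (+) its own filtered signals with the signal received over the binaural link]

Converges to optimum if single desired source (Doclo et al., 2007)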

Motivation for DANSE

• > 2 nodes? E.g. supporting external sensor nodes or multiple hearing aid users.

• Multiple desired sources, e.g. conversation monitoring.

DANSE

• The previous cases require a more general framework: distributed adaptive node-specific signal estimation (DANSE)

• Allows for multiple nodes (fully connected topology)

• Allows for multiple target sources: estimating K sources requires communication of K-channel signals (DANSE_K)

DANSE

Considered here:

• Fully connected WSN

• Multi-channel sensor signal observations

• Goal: each node estimates its own node-specific signal, but all node-specific signals share a common latent signal subspace (dimension = # targets)

3 nodes, fully connected

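Binaural hearing aids (revisited)

[Figure: left device with filter w11 and link coefficient g12, right device with filter w22 and link coefficient g21, connected by the binaural link]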

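Binaural hearing aids (revisited)

J=2, DANSE_2 (K=2)

[Figure: each device uses two filter sets, w11(1), w11(2) with g12(1), g12(2) on one side and w22(1), w22(2) with g21(1), g21(2) on the other, exchanged over the binaural link]

Auxiliary channels (capture signal space)

Converges to optimum if #desired sources ≤ 2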

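Binaural hearing aids (revisited)

J=2, DANSE_K

[Figure: the nodes exchange the K-channel signals $\mathbf{z}_1$ and $\mathbf{z}_2$ over the binaural link; the filters $\mathbf{W}_{11}$, $\mathbf{G}_{12}$ and $\mathbf{G}_{21}$, $\mathbf{W}_{22}$ produce the node-specific estimates $\mathbf{d}_1$ and $\mathbf{d}_2$]

Converges to optimum if K = # desired sources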

Sequential updating

Sequential round-robin update
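A rough sketch of the sequential round-robin scheme for K=1 follows. It is an illustration under stated assumptions rather than the published algorithm as implemented by the author: exact covariance matrices replace adaptive estimates, there is a single desired source, the steering vectors `a` and `b` are random, and all function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
J, Mk = 3, 3                           # nodes, microphones per node
M = J * Mk
blocks = [slice(k * Mk, (k + 1) * Mk) for k in range(J)]

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

a, b = crandn(M), crandn(M)            # hypothetical steering vectors: desired source, interferer
Rdd = np.outer(a, a.conj())            # rank-1 speech covariance (single desired source)
Rnn = np.outer(b, b.conj()) + 0.5 * np.eye(M)
Ryy = Rdd + Rnn

def cost(w, ref):
    """MSE E|d_ref - w^H y|^2 of a network-wide filter w."""
    return (Rdd[ref, ref] - 2 * (w.conj() @ Rdd[:, ref]) + w.conj() @ Ryy @ w).real

def compression(k, W):
    """C such that C^H y = [y_k ; z_q for q != k], with z_q = w_qq^H y_q."""
    cols = [np.eye(M)[:, blocks[k]]]
    for q in range(J):
        if q != k:
            c = np.zeros((M, 1), dtype=complex)
            c[blocks[q], 0] = W[q]
            cols.append(c)
    return np.hstack(cols)

W = [crandn(Mk) for _ in range(J)]     # random initial local filters w_kk
history = []                           # cost at node 0 right after each of its updates
for rnd in range(30):
    for k in range(J):                 # sequential round-robin: one node updates at a time
        ref = blocks[k].start          # node k's reference microphone
        C = compression(k, W)
        # Local MWF on node k's own signals plus the received single-channel signals
        w_tilde = np.linalg.solve(C.conj().T @ Ryy @ C, C.conj().T @ Rdd[:, ref])
        W[k] = w_tilde[:Mk]            # only the local part is used for broadcasting
        if k == 0:
            history.append(cost(C @ w_tilde, ref))

central = cost(np.linalg.solve(Ryy, Rdd[:, 0]), 0)   # centralized MWF at node 0
print("gap to centralized MWF after first / last round:",
      history[0] - central, history[-1] - central)
```

Each node only ever solves a problem of size Mk + J - 1 instead of the full M-channel problem, which is the point of the distributed scheme.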

DANSE with simultaneous updating

- Simultaneous updating: parallel computing

- Sometimes convergence to optimal solution, but not always

- Solution: relaxation yields convergence and optimality:

$W^{i+1} = (1-\alpha)\, W^{i} + \alpha\, W^{\text{new}}$, with relaxation factor $\alpha \in (0, 1]$

DANSE with simultaneous updating

[Figure: without relaxation (S-DANSE); 4 nodes, 3-6 sensors/node]

[Figure: with relaxation (rS-DANSE); 4 nodes, 3-6 sensors/node]

DANSE audio demo (tracking omitted)

Unfiltered

rS-DANSE

Centralized MWF

Robust DANSE

- Theory: DANSE == centralized MWF, but…

- Numerical errors due to:

- Estimation errors in Rdd (especially at low SNR nodes) → ripple effect

- Reference microphones are close to each other → ill-conditioned basis for signal subspace

- Solution: estimate speech component in communicated signals, preferably from high SNR nodes (= Robust DANSE or R-DANSE)

- Convergence is proven under certain dependency conditions

What if not fully connected?

Nodes must pass on information from other nodes

1) Nodes act as relays (virtually fully connected):
   - huge increase in bandwidth if limited connections
   - routing problem

2) Nodes broadcast the sum of all filtered inputs:
   - no increase in bandwidth
   - no routing problem (?)

FEEDBACK !!

What if not fully connected?

- Intuition

- Theoretical analysis

- Conclusion: feedback causes major problems

- Direct feedback (one edge) vs. indirect feedback (loops)

Direct feedback cancellation

• Transmitter feedback cancellation

• Receiver feedback cancellation

What if not fully connected?

- Prune to tree topology → T-DANSE (= still optimal output!!)
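How the network is pruned is not specified on the slides. One simple possibility, shown here purely as an assumption, is a breadth-first spanning tree: it keeps every node reachable while removing all loops, and therefore all indirect feedback paths. The function name and the example topology are hypothetical.

```python
from collections import deque

def prune_to_tree(adjacency, root):
    """Return the edges of a BFS spanning tree of a connected, undirected graph.

    adjacency : dict mapping each node to the set of its neighbours
    root      : node where the search starts
    """
    parent = {root: None}
    queue = deque([root])
    tree_edges = []
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency[node]):
            if neighbour not in parent:          # first visit: keep this link
                parent[neighbour] = node
                tree_edges.append((node, neighbour))
                queue.append(neighbour)
    if len(parent) != len(adjacency):
        raise ValueError("graph is not connected")
    return tree_edges

# Hypothetical 5-node WASN with two loops (1-2-3 and 3-4-5)
links = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
tree = prune_to_tree(links, root=1)
print(tree)  # [(1, 2), (1, 3), (3, 4), (3, 5)]
```

A tree over n nodes always has n - 1 links, so here the links 2-3 and 4-5 are dropped. In practice the choice of root and of which links to keep would likely also depend on link quality, which this sketch ignores.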

Multi-speaker VAD

- Goal: Track individual speech power of multiple simultaneous speakers or other non-stationary sources (VAD)

- Exploit spatial diversity from WASN

[Figure: speakers and microphones in the WASN]

Multi-speaker VAD

• Ad-hoc microphone array

• Assumptions:

1. Speakers in near-field

2. Speakers are independent

3. Limited noise/reverberation

4. Sources to track are well-grounded (= they attain zero values)

• Advantages:

• Array geometry unknown

• Speaker positions unknown

• Energy-based → low data rate, synchronization not crucial

→ WASNs!

Data model

Non-negative blind source separation

- Theorem (Plumbley, 2002):

“An orthogonal mixture of non-negative, well-grounded source signals, that preserves non-negativity, is a permutation of the original signals.”

Exploiting non-negativity and well-groundedness (J=N=2 example)

[Figure: sources (s1, s2) and their mixture y=As]

Orthogonal transformation preserves uncorrelatedness → simple decorrelation (whitening) of measurements gives original up to a rotation

[Figure: after whitening, the sources (s1, s2) are recovered up to an unknown rotation (?)]

- Well-grounded source signals

[Figure: well-grounded sources (s1, s2) and their mixture y=As; after whitening, the remaining rotation can be identified (!)]

Non-negative blind source separation

- Two different techniques:

1. Whitening, ignoring non-negativity constraints (= easy), followed by a search for the rotation matrix that restores non-negativity (= hard)

2. Whitening with non-negativity constraints (= hard)

- 1st approach (Oja & Plumbley) = NPCA (non-negative principal component analysis)

- 2nd approach (Bertrand & Moonen) = MNICA (multiplicative non-negative independent component analysis)
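The "easy" whitening step of the first technique can be checked on toy data. This is only a sketch under assumptions: the sources are synthetic (exponential bursts that are often exactly zero, hence non-negative and well-grounded), the 2x2 mixing matrix `A` is made up, and the "hard" rotation search of NPCA and the MNICA updates are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50_000

# Non-negative, well-grounded, independent toy sources (J = 2)
s = rng.exponential(1.0, size=(2, T)) * (rng.random((2, T)) < 0.5)
s /= s.std(axis=1, ddof=1, keepdims=True)      # unit variance per source

A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                      # hypothetical mixing matrix (N = 2)
y = A @ s                                       # data model y = As

# Whitening, ignoring non-negativity: V = Lambda^{-1/2} E^T from cov(y) = E Lambda E^T
eigval, eigvec = np.linalg.eigh(np.cov(y))
V = np.diag(eigval ** -0.5) @ eigvec.T
z = V @ y                                       # whitened measurements

U = V @ A                                       # overall transform: z = U s
print("cov(z) =\n", np.round(np.cov(z), 3))
print("U U^T =\n", np.round(U @ U.T, 3))        # close to identity: U is approximately orthogonal
```

Because the sources are uncorrelated with unit variance, the remaining transform U is orthogonal up to estimation error, so by the theorem it only remains to find the orthogonal transform that makes the output non-negative again.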

MNICA: results
