Signal Processing Algorithms for Wireless Acoustic Sensor Networks

Alexander Bertrand

Electrical Engineering Department (ESAT), Katholieke Universiteit Leuven

06-07-2010, University of Oldenburg, MEDI-AKU-SIGNAL Kolloquium

Outline

1. Introduction

2. Multi-channel Wiener filter (MWF)

3. Example: distributed MWF in binaural hearing aids

4. DANSE in fully connected WASN

5. Tree-DANSE

6. Multi-speaker VAD

Noise reduction; tracking of speech power

Traditional sensor array DSP

- Centralized processing

- Known / fixed sensor positions

Sensor array DSP

- Long distance (SNR drops 6 dB for each doubling of distance)

- Sharp angle

- Number of microphones is limited
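
The 6 dB figure follows from the free-field 1/r decay of sound pressure. A minimal check, assuming a point source and a noise level that does not depend on the microphone position (the function name `snr_drop_db` is only illustrative):

```python
import math

def snr_drop_db(distance_ratio: float) -> float:
    """SNR loss in dB when the source-microphone distance grows by distance_ratio.

    Assumes free-field 1/r amplitude decay and distance-independent noise.
    """
    return 20.0 * math.log10(distance_ratio)

print(round(snr_drop_db(2.0), 2))  # doubling the distance -> 6.02
```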

Distributed sensor arrays

Wireless acoustic sensor network (WASN)

• More spatial information

• More sensors

• Subset: high SNR recordings

• Challenges

1) Unknown/changing positions, link failure → ADAPTIVE

2) Bandwidth efficiency

3) Distributed processing

4) Subset selection

Multi-channel Wiener Filtering (MWF)

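- Given: N microphone signals, each the sum of a clean speech component and a noise component:

$\mathbf{y}(\omega) = \mathbf{d}(\omega) + \mathbf{n}(\omega)$

- Choose one (arbitrary) reference microphone (microphone 1 in the formulas below)

- Goal: estimate the speech component in 1 of the N microphones (the reference microphone)

- Output = sum of filtered microphone signals. MWF computes the optimal filters such that the sum of outputs is as close as possible to the speech component in the reference microphone:

$\min_{\mathbf{w}(\omega)} E\{ |d_1(\omega) - \mathbf{w}^H(\omega)\,\mathbf{y}(\omega)|^2 \}$

- Solution:

$\mathbf{w}(\omega) = \mathbf{R}_{yy}^{-1}(\omega)\,\mathbf{r}_{yd}(\omega)$

with

$\mathbf{R}_{yy}(\omega) = E\{\mathbf{y}(\omega)\,\mathbf{y}^H(\omega)\}$

$\mathbf{r}_{yd}(\omega) = E\{\mathbf{y}(\omega)\,d_1^*(\omega)\} = E\{\mathbf{d}(\omega)\,d_1^*(\omega)\} = \mathbf{R}_{dd}(\omega)\,[1\ 0\ \dots\ 0]^T$

(second equality: speech and noise assumed uncorrelated)

- Needs:

  - N x N noise+speech correlation matrix Ryy

  - N x 1 clean speech correlation (column of Rdd)

- Rdd can be estimated as Rdd = Ryy − Rnn, using a voice activity detection (VAD) mechanism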
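
A minimal numerical sketch of the MWF recipe, under simplifying assumptions that are not from the slides: real-valued signals in a single frequency bin, one on/off speech source with hypothetical gains `a`, white sensor noise, and an oracle VAD. Ryy is estimated in speech+noise frames, Rnn in noise-only frames, and Rdd = Ryy − Rnn.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 4, 20000                        # microphones, samples

# Toy signal model (assumption): one on/off speech source plus white sensor noise
a = np.array([1.0, 0.8, 0.6, 0.9])     # hypothetical gains from source to microphones
vad = np.arange(T) % 2000 < 1000       # oracle VAD: True where speech is active
s = rng.normal(size=T) * vad           # speech source, zero when inactive
d = np.outer(a, s)                     # speech component in each microphone
n = rng.normal(size=(N, T))            # noise component in each microphone
y = d + n                              # microphone signals

# Correlation matrices from speech+noise frames and from noise-only frames
Ryy = y[:, vad] @ y[:, vad].T / vad.sum()
Rnn = y[:, ~vad] @ y[:, ~vad].T / (~vad).sum()
Rdd = Ryy - Rnn

# MWF for reference microphone 1: w = Ryy^{-1} Rdd e1
e1 = np.zeros(N)
e1[0] = 1.0
w = np.linalg.solve(Ryy, Rdd @ e1)

d1_hat = w @ y                         # output = sum of filtered microphone signals
mse_in = np.mean((y[0, vad] - d[0, vad]) ** 2)
mse_out = np.mean((d1_hat[vad] - d[0, vad]) ** 2)
print(f"MSE unfiltered: {mse_in:.3f}, MSE after MWF: {mse_out:.3f}")
```

In this toy setting the MWF output should be clearly closer to the speech component of microphone 1 than the unfiltered microphone signal.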

Noise frame: destructive interference

Noise = electro music

Speech frame: constructive interference

Outline

7. Subset selection

8. Conclusions

Example: binaural hearing aids

[Figure: MWF left and MWF right, connected by a binaural link]

- Large bandwidth needed

- Full matrix inversion

= 2-node WASN

Example: binaural hearing aids

[Figure: binaural link between the two hearing aids; filter labels g21, w22]

Converges to optimum if single desired source

(Doclo et al., 2007)

Motivation for DANSE

• > 2 nodes, e.g. supporting external sensor nodes or multiple hearing aid users.

• Multiple desired sources, e.g. conversation monitoring.

• The previous cases require a more general framework: distributed adaptive node-specific signal estimation (DANSE)

• Allows for multiple nodes (fully connected topology)

• Allows for multiple target sources: estimating K sources requires communication of K-channel signals (DANSE_K)

Considered here:

• Fully connected WSN

• Multi-channel sensor signal observations

• Goal: each node estimates a node-specific signal, but the node-specific signals share a common latent signal subspace (dimension = # targets)

3 nodes, fully connected

Binaural hearing aids (revisited)

[Figure: binaural link; filter labels g21, w22]

[Figure: binaural link; filter labels w11(1), g12(1), w11(2), g12(2), w22(1), g21(1), w22(2), g21(2)]

J = 2, DANSE_2 (K = 2): auxiliary channels (capture signal space)

Converges to optimum if # desired sources ≤ 2

[Figure: binaural link; filter labels W11, G12, G21, W22]

J = 2, DANSE_K

Converges to optimum if K = # desired sources

Sequential updating

Sequential round-robin update
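
The sequential round-robin scheme can be sketched for the single-source case (K = 1). This is an illustrative toy, not the presenter's implementation. It assumes real-valued signals, exactly known correlation matrices, one desired source with hypothetical gains `a`, one interfering source with gains `b`, and white sensor noise. Each node runs an MWF on its own sensors plus one fused signal broadcast by every other node, and only the node whose turn it is changes its broadcast filter.

```python
import numpy as np

J, M = 3, 3                       # nodes, sensors per node (hypothetical sizes)
N = J * M
blocks = [slice(k * M, (k + 1) * M) for k in range(J)]

rng = np.random.default_rng(0)
a = rng.uniform(0.5, 1.5, N)      # gains of the single desired source (assumption)
b = rng.normal(size=N)            # gains of one interfering source (assumption)
Ryy = np.outer(a, a) + np.outer(b, b) + 0.5 * np.eye(N)  # unit-power sources + white noise

refs = [k * M for k in range(J)]                  # reference sensor of each node
Ryd = np.column_stack([a * a[r] for r in refs])   # E{y d_k} for node-specific targets d_k
Pd = a[refs] ** 2                                 # E{d_k^2}

def mse(w, k):
    """Mean squared error of the network-wide filter w for node k's desired signal."""
    return Pd[k] - 2 * w @ Ryd[:, k] + w @ Ryy @ w

def local_update(k, w_own):
    """MWF at node k on its own M signals plus one fused signal per other node."""
    C = np.zeros((N, M + J - 1))          # maps all N signals to node k's inputs
    C[blocks[k], :M] = np.eye(M)
    others = [q for q in range(J) if q != k]
    for col, q in enumerate(others, start=M):
        C[blocks[q], col] = w_own[q]      # z_q = w_own[q]^T y_q, broadcast by node q
    v = np.linalg.solve(C.T @ Ryy @ C, C.T @ Ryd[:, k])
    return C @ v, v[:M]                   # equivalent network-wide filter, new own filter

w_own = [np.ones(M) for _ in range(J)]    # arbitrary non-zero initialisation
for _ in range(50):                       # sequential round-robin updates
    for k in range(J):
        _, w_own[k] = local_update(k, w_own)

mse_danse = np.array([mse(local_update(k, w_own)[0], k) for k in range(J)])
W_central = np.linalg.solve(Ryy, Ryd)     # centralized MWF, one column per node
mse_central = np.array([mse(W_central[:, k], k) for k in range(J)])
print(mse_danse, mse_central)
```

Each node only ever receives J − 1 fused channels instead of all N − M remote sensor signals, which is where the bandwidth saving comes from.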

DANSE with simultaneous updating

- Simultaneous updating: parallel computing

- Sometimes convergence to optimal solution, but not always

- Solution: relaxation yields convergence and optimality:

$\mathbf{W}^{i+1} = (1-\alpha)\,\mathbf{W}^{i} + \alpha\,\mathbf{W}^{\text{new}}$
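
The relaxed update is a convex combination of the previous filter and the newly computed one. The sketch below applies it to a toy fixed-point map, not to DANSE itself. The map `F(x) = c + A x`, the matrix `A` and the value `alpha = 0.3` are assumptions chosen so that the plain simultaneous iteration diverges while the relaxed one converges.

```python
import numpy as np

def relaxed_update(W_old, W_new, alpha):
    """Convex combination of the previous filter and the newly computed one."""
    return (1.0 - alpha) * W_old + alpha * W_new

# Toy stand-in for "everyone updates at once": fixed-point map F(x) = c + A x
A = np.array([[0.0, -1.5],
              [1.5,  0.0]])
c = np.array([1.0, 1.0])
x_star = np.linalg.solve(np.eye(2) - A, c)    # fixed point of F

def run(alpha, iters=100):
    """Distance to the fixed point after iterating with relaxation parameter alpha."""
    x = np.zeros(2)
    for _ in range(iters):
        x = relaxed_update(x, c + A @ x, alpha)
    return np.linalg.norm(x - x_star)

err_plain = run(alpha=1.0)      # no relaxation: the error grows
err_relaxed = run(alpha=0.3)    # with relaxation: the error shrinks
print(err_plain, err_relaxed)
```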

Without relaxation (S-DANSE)

4 nodes, 3-6 sensors/node

With relaxation (rS-DANSE)

4 nodes, 3-6 sensors/node

DANSE audio demo (tracking omitted)

Unfiltered

rS-DANSE

Centralized MWF

Robust DANSE

- Theory: DANSE == centralized MWF, but…

- Numerical errors due to:

  - Estimation errors in Rdd (especially at low SNR nodes) → ripple effect

  - Reference microphones are close to each other → ill-conditioned basis for signal subspace

- Solution: estimate speech component in communicated signals, preferably from high SNR nodes (= Robust DANSE or R-DANSE)

- Convergence is proven under certain dependency conditions

What if not fully connected?

Nodes must pass on information from other nodes

1) Nodes act as relays (virtually fully connected):

  - huge increase in bandwidth if limited connections

  - routing problem

2) Nodes broadcast the sum of all filtered inputs:

  - no increase in bandwidth

  - no routing problem (?)

FEEDBACK !!

- Intuition

- Theoretical analysis

- Conclusion: feedback causes major problems

- Direct feedback (one edge) vs. indirect feedback (loops)

Direct feedback cancellation

• Transmitter feedback cancellation

• Receiver feedback cancellation

- Remove loops: prune to a tree topology → T-DANSE (= still optimal output!!)
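
The slides do not say how the tree is obtained. One simple option, shown here purely as an assumption, is a breadth-first spanning tree of the connectivity graph. The 5-node network below is hypothetical.

```python
from collections import deque

def spanning_tree(adjacency, root):
    """Breadth-first spanning tree; returns the kept links as (parent, child) pairs."""
    parent = {root: None}
    queue = deque([root])
    edges = []
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in parent:          # first time v is reached: keep this link
                parent[v] = u
                edges.append((u, v))
                queue.append(v)
    return edges

# Hypothetical 5-node network containing loops (e.g. 0-1-2 and 1-3-4-2)
adjacency = {
    0: [1, 2],
    1: [0, 2, 3],
    2: [0, 1, 4],
    3: [1, 4],
    4: [2, 3],
}
tree = spanning_tree(adjacency, root=0)
print(tree)  # → [(0, 1), (0, 2), (1, 3), (2, 4)]
```

The pruned network keeps all nodes connected with one link fewer than the number of nodes, so no loops remain.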

Multi-speaker VAD

- Goal: Track individual speech power of multiple simultaneous speakers or other non-stationary sources (VAD)

- Exploit spatial diversity from WASN

[Figure: speakers and microphones]

• Ad-hoc microphone array

• Assumptions:

1. Speakers in near-field

2. Speakers are independent

3. Limited noise/reverberation

4. Sources to track are well-grounded (= they attain zero values)

• Advantages:

  • Array geometry may be unknown

  • Speaker positions may be unknown

  • Energy-based → low data rate, synchronization not crucial

→ WASNs!
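
To illustrate why an energy-based method needs little bandwidth: each microphone only has to share one short-term power value per frame instead of every sample. A minimal sketch, where the sampling rate, the frame length and the function name `frame_powers` are all assumptions:

```python
import numpy as np

def frame_powers(x, frame_len):
    """Average power of consecutive non-overlapping frames of a 1-D signal."""
    n_frames = len(x) // frame_len
    frames = np.reshape(x[: n_frames * frame_len], (n_frames, frame_len))
    return np.mean(frames ** 2, axis=1)

fs = 16000                                     # hypothetical sampling rate in Hz
x = np.random.default_rng(0).normal(size=fs)   # one second of stand-in microphone signal
p = frame_powers(x, frame_len=512)             # one value per 32 ms frame
print(len(x), "samples ->", len(p), "power values")
```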

Data model

Non-negative blind source separation

- Theorem (Plumbley, 2002):

“An orthogonal mixture of non-negative, well-grounded source signals, that preserves non-negativity, is a permutation of the original signals.”

Exploiting non-negativity and well-groundedness (J=N=2 example)

Orthogonal transformation preserves uncorrelatedness → simple decorrelation (whitening) of the measurements gives the original signals up to a rotation

[Figure: measurements before and after whitening, for well-grounded source signals]
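
A toy version of the J = N = 2 case: whiten the measurements, then look for the rotation that makes the result non-negative again. The brute-force angle search below is only an illustration and is neither NPCA nor MNICA. The linear mixing model `x = A s`, the matrix `A` and the on/off exponential sources are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 5000
# Two independent, non-negative, well-grounded sources (on/off power envelopes)
s = (rng.random((2, T)) < 0.5) * rng.exponential(1.0, size=(2, T))
A = np.array([[1.0, 0.3],
              [0.4, 1.0]])               # hypothetical non-negative mixing matrix
x = A @ s                                # measurements at the two microphones

# Whitening: decorrelate with the covariance, without subtracting the mean
evals, E = np.linalg.eigh(np.cov(x))
z = E @ np.diag(evals ** -0.5) @ E.T @ x

def rotation(theta):
    """2 x 2 rotation matrix for angle theta (radians)."""
    c, sn = np.cos(theta), np.sin(theta)
    return np.array([[c, -sn], [sn, c]])

def negativity(u):
    """Mean squared negative part; zero only if every sample is non-negative."""
    return np.mean(np.minimum(u, 0.0) ** 2)

# Grid search for the rotation that restores non-negativity
thetas = np.linspace(0.0, 2.0 * np.pi, 720, endpoint=False)
best = min(thetas, key=lambda t: negativity(rotation(t) @ z))
s_hat = rotation(best) @ z               # sources up to scaling and permutation

corr = np.corrcoef(np.vstack([s_hat, s]))[:2, 2:]
print(np.round(corr, 2))                 # rows: recovered signals, columns: true sources
```

Each recovered signal should correlate strongly with exactly one of the original sources, in line with the permutation statement of the theorem.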

- Two different techniques:

1. Two steps:

  - Whitening, ignoring non-negativity constraints (= easy)

  - Search for rotation matrix that restores non-negativity (= hard)

2. Whitening with non-negativity constraints (= hard)

- 1st approach (Oja & Plumbley) = NPCA (Non-negative principal component analysis)

- 2nd approach (Bertrand & Moonen) = MNICA (Multiplicative non-negative independent component analysis)

MNICA: results
