Upload: jptown0

Post on 09-Jun-2015

Page 1: MSEE Defense

www.vis.uky.edu | Dedicated to Research, Education and Industrial Outreach | 859.257.1257

Enhancements to the Generalized Sidelobe Canceller for Audio Beamforming in an Immersive Environment

Phil Townsend

MSEE Candidate

University of Kentucky

Page 2: MSEE Defense

Overview

1) Introduction

- Adaptive Beamforming and the GSC

2) Amplitude Scaling Improvements

- 1/r Model, Acoustic Physics, Statistical

3) Automatic Target Alignment

- Thresholded Cross Correlation using PHAT-β

4) Array Geometry Analysis

- Volumetric Beamfield Plots

- Monte Carlo Test of Geometric Parameters

5) Final Conclusions and Questions

Page 3: MSEE Defense

Part 1: Introduction

• What's beamforming?

• A spatial filter that enhances sound based on its spatial position through the coherent processing of signals from distributed microphones.

– Reduce room noise/effects

– Suppress interfering speakers

Page 4: MSEE Defense

Adaptive Beamforming

• Optimization of Generalized Filter Coefficients

– Often requires minimizing output energy while keeping target component unchanged

• Estimate statistics on the fly

– Input Correlation Matrix unknown/changing

• Gradient Descent Toward Optimal Taps

– Constrained lowest-energy output is the unique minimum of a bowl-shaped surface

$y[n] = W_{opt}^{T}[n]\,X[n]$
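A minimal sketch of the "bowl-shaped surface" idea, assuming the simpler unconstrained case: a quadratic cost J(w) = wᵀRw − 2pᵀw has a single minimum at w = R⁻¹p, and plain gradient descent walks down to it. The matrix `R`, vector `p`, step size `mu`, and function name are illustrative values chosen here, not taken from the slides.

```python
import numpy as np

def steepest_descent(R, p, mu=0.05, steps=500):
    """Minimize J(w) = w^T R w - 2 p^T w by plain gradient descent."""
    w = np.zeros_like(p, dtype=float)
    for _ in range(steps):
        grad = 2.0 * (R @ w - p)   # gradient of the quadratic bowl
        w = w - mu * grad          # step downhill
    return w

# Illustrative (made-up) correlation matrix and cross-correlation vector
R = np.array([[2.0, 0.5], [0.5, 1.0]])
p = np.array([1.0, 0.3])

w_gd = steepest_descent(R, p)
w_opt = np.linalg.solve(R, p)      # closed-form minimum for comparison
```

In an adaptive beamformer `R` is not known in advance, which is why the statistics have to be estimated on the fly from the incoming samples.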

Page 5: MSEE Defense

Visualization of Gradient Descent

From http://en.wikipedia.org/wiki/Gradient_descent; Image in Public Domain

Page 6: MSEE Defense

Generalized Sidelobe Canceller (GSC)

• Simplifies Frost's constrained adaptation into two stages

– A fixed Delay-Sum Beamformer (DSB)

– A Blocking Matrix (BM) whose output tracks are adaptively filtered and subtracted

– Adaptation can be any algorithm; we use NLMS here

– Simplification comes mostly from enforcing distortionless response

Page 7: MSEE Defense

Page 8: MSEE Defense

GSC (cont'd)

• Upper branch DSB result

• Lower branch BM tracks are

where traditional Blocking Matrix is
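The equations on this slide did not survive extraction. As a stand-in, here is the usual textbook (Griffiths-Jim) form of the two branches. The notation is an assumption and may differ from the slide's: M microphones, steering delays τ_i, steered channels x_i[n − τ_i].

```latex
% Upper branch: delay-sum beamformer over M steered microphone signals
y_{DSB}[n] = \frac{1}{M} \sum_{i=1}^{M} x_i[n - \tau_i]

% Lower branch: M-1 blocking-matrix tracks from adjacent steered channels
z_k[n] = x_k[n - \tau_k] - x_{k+1}[n - \tau_{k+1}], \quad k = 1, \dots, M-1

% Same thing in matrix form, Z[n] = B \bar{X}[n], with the (M-1) x M blocking matrix
B = \begin{bmatrix}
1 & -1 & 0 & \cdots & 0 \\
0 & 1 & -1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & 1 & -1
\end{bmatrix}
```

Each row of B sums to zero, so a target that is identical in every steered channel is blocked from the lower branch.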

Page 9: MSEE Defense

GSC (cont'd)

• Final output is

• Adaptation algorithm for each BM track is (NLMS, much faster than constrained)
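The output and update equations are missing from the transcript. The sketch below shows one common way to wire the two branches together, under stated assumptions: the channels are already time-aligned on the target, the BM is the pairwise-difference form, and the NLMS step is normalized by the total energy in all track buffers (a choice made here for stability, not confirmed by the slides). The function name, `num_taps`, `mu`, and the toy signals are hypothetical.

```python
import numpy as np

def gsc_nlms(x_aligned, num_taps=16, mu=0.1, eps=1e-8):
    """Griffiths-Jim style GSC on already time-aligned mic signals.

    x_aligned: array of shape (M, N), one row per microphone.
    Returns the beamformer output y of length N.
    """
    M, N = x_aligned.shape
    y_dsb = x_aligned.mean(axis=0)          # upper branch: delay-sum
    z = x_aligned[:-1] - x_aligned[1:]      # lower branch: BM tracks, shape (M-1, N)
    a = np.zeros((M - 1, num_taps))         # one adaptive FIR filter per BM track
    y = np.zeros(N)
    for n in range(N):
        # Latest num_taps samples of each track, newest first, zero-padded at start
        lo = max(0, n - num_taps + 1)
        seg = z[:, lo:n + 1][:, ::-1]
        buf = np.zeros((M - 1, num_taps))
        buf[:, :seg.shape[1]] = seg
        y[n] = y_dsb[n] - np.sum(a * buf)   # subtract filtered noise estimate
        a += mu * y[n] * buf / (np.sum(buf * buf) + eps)   # NLMS update
    return y

# Toy scenario (made up): target identical in all channels, interferer with per-mic gains
rng = np.random.default_rng(0)
N = 5000
s = rng.standard_normal(N)
v = rng.standard_normal(N)
gains = np.array([1.0, 0.6, 0.3, 0.1])
x = s[None, :] + gains[:, None] * v[None, :]

y = gsc_nlms(x, num_taps=4, mu=0.05)
err_dsb = np.mean((x.mean(axis=0)[-1000:] - s[-1000:]) ** 2)
err_gsc = np.mean((y[-1000:] - s[-1000:]) ** 2)
```

In this idealized case the target cancels exactly in the BM tracks, so the adaptive filters only ever see the interferer; the later slides are about what happens when that assumption fails.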

Page 10: MSEE Defense

Limitations of Current Models and Methods

• Blocking Matrix Leakage

– Farfield assumption not valid for immersive microphone arrays

– Target steering might be incorrect

• Most research limited to equispaced linear arrays

– Hard to construct

– Limited useful frequency range

– Want to explore other geometries and find the best

Page 11: MSEE Defense

Part 2: Amplitude Correction

• Nearfield acoustics means target component has different amplitude in each microphone

• Propose and test a few models to correct cancellation

– 1/r Model

– Sound propagation filtering

– Statistical filtering

Page 12: MSEE Defense

Simple 1/r Model

• The acoustic wave equation is solved by a function inversely proportional to r

• So build a BM using that fact (keep tracks in distance order)
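The slide's BM expression is missing, so the following is only a guess at the idea: if the target arrives at mic k with amplitude A/r_k, weighting adjacent channels by their distance ratio makes the target cancel in each track. The function name and the normalization by r_{k+1} are assumptions made for this sketch.

```python
import numpy as np

def one_over_r_blocking_matrix(r):
    """Pairwise-difference BM whose rows null a 1/r-scaled target.

    r: target-to-mic distances, assumed sorted in ascending order.
    Row k holds r_k / r_{k+1} in column k and -1 in column k+1, so a target
    arriving with amplitude A / r_k at mic k cancels in every row.
    """
    r = np.asarray(r, dtype=float)
    M = r.size
    B = np.zeros((M - 1, M))
    for k in range(M - 1):
        B[k, k] = r[k] / r[k + 1]
        B[k, k + 1] = -1.0
    return B

r = np.array([1.0, 1.5, 2.2])          # example distances in metres (made up)
B = one_over_r_blocking_matrix(r)
leak = B @ (1.0 / r)                   # target amplitudes pushed through the BM
```

With equal distances every ratio is 1 and this reduces to the traditional [1, −1] blocking matrix.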

Page 13: MSEE Defense

ISO Acoustic Physics Model

• Fluid dynamics can be taken into account to design a filter based on distance, temperature, humidity, and pressure (ISO standard 9613)

• Might allow us to add easily-obtainable information to enhance beamforming

Page 14: MSEE Defense

Statistical Amplitude Scaling

• Lump all corruptive effects together and minimize energy of difference of tracks

• Carry out as a function of frequency to get
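The expression that followed "to get" was lost in extraction. One plausible reading, sketched here as an assumption, is a least-squares gain per frequency bin: choose a real α(f) that minimizes the energy of X1(f) − α(f)·X2(f) over a set of frames. The function name, frame length, and the choice of a real-valued gain are hypothetical.

```python
import numpy as np

def per_bin_scale(x1, x2, frame_len=256):
    """Least-squares real gain per frequency bin so alpha[f] * X2[f] best matches X1[f]."""
    n_frames = min(x1.size, x2.size) // frame_len
    n = n_frames * frame_len
    # Non-overlapping frames, one FFT per frame
    X1 = np.fft.rfft(x1[:n].reshape(n_frames, frame_len), axis=1)
    X2 = np.fft.rfft(x2[:n].reshape(n_frames, frame_len), axis=1)
    num = np.sum(np.real(X1 * np.conj(X2)), axis=0)   # cross energy per bin
    den = np.sum(np.abs(X2) ** 2, axis=0) + 1e-12     # reference energy per bin
    return num / den

rng = np.random.default_rng(0)
x2 = rng.standard_normal(4096)
alpha = per_bin_scale(0.5 * x2, x2)    # track 1 is track 2 at half amplitude
```

Because the estimate is built from the noisy recordings themselves, its quality depends directly on the SNR.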

Page 15: MSEE Defense

ISO and Statistical BM's

• ISO Model (Frequency Domain)

• Statistical Scaling (Frequency Domain)

Page 16: MSEE Defense

A Perfect Blocking Matrix

• Audio Cage data was collected with targets and speakers separate, so a perfect BM can be simulated

• Shows upper bound on possible improvement

Page 17: MSEE Defense

Page 18: MSEE Defense

Experimental Evaluation of Methods

• Set initial intelligibility to around 0.3

• Beamform for many target and noise scenarios

• Find mean correlation coefficient of BM tracks (want as low as possible) and overall output (want as large as possible)
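For reference, the correlation coefficient used as a score here can be computed as below. This is the ordinary Pearson coefficient; which reference signal each track is compared against is not spelled out on this slide, so the function simply takes two signals.

```python
import numpy as np

def correlation_coefficient(a, b):
    """Pearson correlation coefficient between two equal-length signals."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A value near 1 for the output means it closely follows the reference; a value near 0 for a BM track means little of the reference has leaked into it.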

Page 19: MSEE Defense

Results

• Most real methods make little difference

– Statistical scaling a little worse b/c of bad SNR

– ISO filtering a little better b/c of more info

– 1/r model made no difference

• Perfect BM made slight improvement, but array geometry was most important!

• Listen to some examples...

Page 20: MSEE Defense

Output Correlation Chart

Page 21: MSEE Defense

BM Correlation Chart

Page 22: MSEE Defense

Part 3: Automatic Steering

• If steering delays aren't right then target signal leakage occurs and DSB is weaker.

• Cross correlation is a highly robust technique for finding similarities between signals, so use it to fine-tune delays

• Apply window and correlation strength thresholds to try to improve performance in poor SNR environment

Page 23: MSEE Defense

GCC and PHAT-β

• Find the cross correlation between tracks over only a small window of possible movements and whiten to make the spike stand out
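A rough sketch of windowed GCC with PHAT-β whitening, assuming the usual definition in which the cross-spectrum is divided by its magnitude raised to the power β. The function name, the default `beta`, and the sign convention (a positive lag means `x1` is delayed relative to `x2`) are choices made for this example.

```python
import numpy as np

def gcc_phat_beta(x1, x2, max_lag, beta=0.8, eps=1e-12):
    """Return (lag, cc): best lag in samples within [-max_lag, max_lag] and the
    windowed correlation. Positive lag means x1 is delayed relative to x2."""
    if max_lag < 1:
        raise ValueError("max_lag must be at least 1")
    n = x1.size + x2.size                 # zero-pad so circular correlation acts linear
    X1 = np.fft.rfft(x1, n)
    X2 = np.fft.rfft(x2, n)
    G = X1 * np.conj(X2)                  # cross-spectrum
    G = G / (np.abs(G) + eps) ** beta     # PHAT-beta whitening sharpens the peak
    cc = np.fft.irfft(G, n)
    # Keep only the search window, reordered so index 0 is lag -max_lag
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lag = int(np.argmax(cc)) - max_lag
    return lag, cc

rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
x2 = s
x1 = np.concatenate((np.zeros(5), s[:-5]))   # x1 is s delayed by 5 samples
lag, cc = gcc_phat_beta(x1, x2, max_lag=20)
```

Setting β = 0 gives plain cross correlation and β = 1 gives full PHAT; values in between trade peak sharpness against noise sensitivity.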

Page 24: MSEE Defense

Correlation Coefficient Threshold

• Since environment is noisy and speaker might go silent, update only if max correlation is sufficiently strong
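The gating rule can be sketched as follows. For brevity this uses a plain time-domain correlation coefficient over a small lag window rather than PHAT-β, and a circular shift instead of proper buffering; both are simplifications made here, and the function and parameter names are hypothetical.

```python
import numpy as np

def thresholded_lag_update(ref, track, current_lag, max_shift, threshold):
    """Return an updated integer lag for `track`, or current_lag if no
    candidate within +/- max_shift correlates strongly enough with `ref`."""
    best_lag, best_rho = current_lag, -1.0
    for lag in range(current_lag - max_shift, current_lag + max_shift + 1):
        shifted = np.roll(track, -lag)            # undo a delay of `lag` samples (circular)
        rho = np.corrcoef(ref, shifted)[0, 1]
        if rho > best_rho:
            best_lag, best_rho = lag, rho
    # Only accept the new lag when the peak clears the threshold
    return best_lag if best_rho >= threshold else current_lag

rng = np.random.default_rng(0)
s = rng.standard_normal(2000)
new_lag = thresholded_lag_update(s, np.roll(s, 3), 0, 5, 0.5)              # clear peak
held_lag = thresholded_lag_update(s, rng.standard_normal(2000), 0, 5, 0.5)  # no peak
```

When the speaker is silent or buried in noise the peak stays below the threshold and the previous delay is kept.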

Page 25: MSEE Defense

Experimental Evaluation

• Same setup as before

– Initial intelligibility ~0.3

– Find output correlation with closest mic

• Vary correlation threshold from 0.1 to 0.9

Page 26: MSEE Defense

Results

• Tighter threshold better but updates never help vs original GSC

– Low threshold: erratic focal point movement

– High threshold: can't recover from bad updates

– Low SNR makes good estimates very difficult

• Retrace of lags (multilateration) shows search window D should be tighter

• Array geometry still more important

• Listen to some more examples...

Page 27: MSEE Defense

Output Correlation Chart

Normal GSC Performance for Comparison

Page 28: MSEE Defense

Page 29: MSEE Defense

Page 30: MSEE Defense

Part 4: Array Geometry

• Since array geometry is the most important factor, we need to find what the best layouts are and why

• Start by generating beamfields to visualize array performance and look for patterns qualitatively

• Then propose parameters and run computer simulations quantitatively

Page 31: MSEE Defense

Volumetric Beamfield Plots

• GSC beamfield changes over time, but DSB is the root of the system and its performance is constant.

• Need to see performance in three dimensions

• Use layered approach with colors to indicate intensity and transparency to see features inside the space
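A beamfield of this kind can be approximated numerically. The sketch below evaluates a single-frequency delay-sum response at arbitrary 3D points, ignoring 1/r amplitude loss and room reflections; those simplifications, the single frequency, the speed of sound `c`, and all names are assumptions for illustration rather than the method behind the plots.

```python
import numpy as np

def dsb_beamfield(mics, focus, points, freq, c=343.0):
    """Narrowband delay-sum response magnitude at each point, for an array
    steered to `focus`. Phase-only model: 1.0 at the focus, <= 1.0 elsewhere."""
    mics = np.asarray(mics, dtype=float)        # (M, 3)
    focus = np.asarray(focus, dtype=float)      # (3,)
    points = np.asarray(points, dtype=float)    # (P, 3)
    d_focus = np.linalg.norm(mics - focus, axis=1)                          # (M,)
    d_pts = np.linalg.norm(points[:, None, :] - mics[None, :, :], axis=2)   # (P, M)
    # Residual phase after steering: zero for every mic when the point is the focus
    phase = 2.0 * np.pi * freq * (d_pts - d_focus) / c
    return np.abs(np.exp(-1j * phase).mean(axis=1))

mics = [[0, 0, 2], [1, 0, 2], [0, 1, 2], [1, 1, 2]]     # made-up ceiling positions (m)
focus = [0.5, 0.5, 1.0]
xs = np.linspace(0.0, 1.0, 5)
grid = np.array([[x, y, z] for x in xs for y in xs for z in (0.5, 1.0, 1.5)])
points = np.vstack(([focus], grid))
resp = dsb_beamfield(mics, focus, points, freq=1000.0)
```

Evaluating `resp` on stacked horizontal slices gives the layers that can then be coloured by intensity.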

Page 32: MSEE Defense

Linear Array

• Generally good performance

– Office too small for sidelobes to appear

• Mainlobe elongated toward array

Page 33: MSEE Defense

Page 34: MSEE Defense

Perimeter Array

• Also generally good

– Very tight mainlobe

• No height resolution

– Not a problem in an office though

– Motivation for ceiling arrays

Page 35: MSEE Defense

Page 36: MSEE Defense

Random Arrays

• Performance highly variable

– One best of the lot, one very bad

• Need to find ways to describe and select best random arrays (coming soon)

Page 37: MSEE Defense

Page 38: MSEE Defense

Page 39: MSEE Defense

A Monte Carlo Experiment for Analysis of Geometry

• Propose the following parameters for describing array geometry in 2D and evaluate array performance for many randomly-chosen geometries:

– Centroid

• Array center of gravity (mean position)

– Dispersion

• Mic spread (standard deviation of positions)
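Both parameters are quick to compute from a list of 2D mic positions. The slide does not say how the standard deviation of 2D positions is collapsed to one number; the sketch assumes the RMS distance of the mics from the centroid.

```python
import numpy as np

def centroid_and_dispersion(mics_xy):
    """Centroid (mean position) and dispersion (RMS distance from the centroid)."""
    mics_xy = np.asarray(mics_xy, dtype=float)          # (M, 2)
    centroid = mics_xy.mean(axis=0)
    dispersion = np.sqrt(np.mean(np.sum((mics_xy - centroid) ** 2, axis=1)))
    return centroid, float(dispersion)

# Four mics on the corners of a 2 m square centred on the origin
square = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
c, d = centroid_and_dispersion(square)
```

For the square above the centroid is the origin and the dispersion is √2 m.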

Page 40: MSEE Defense

Parameter Examples

Page 41: MSEE Defense

Monte Carlo (cont'd)

• For a given centroid and dispersion, evaluate the array based on:

– PSR – Peak to Side lobe Ratio

• Worst-case interference

– MLW – Main Lobe Width

• Tightness of enhancement area

• Redefined in 2D to use x and y 3dB widths

$w_{3dB} = \sqrt{x_{3dB}^{2} + y_{3dB}^{2}}$

Page 42: MSEE Defense

Monte Carlo Simulation

• Test variation of one parameter while holding the other constant.

• Generate random positions from an 8x8m square and target a sound source 1m below center

• Choose 120 random geometries for each run (a “class” of arrays)

• Compare to rectangular array
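One hypothetical way to draw such a class of arrays: sample positions uniformly in the square, then shift and rescale each array so it has exactly the requested centroid and dispersion. The slides do not describe how the parameters were held fixed, so the re-centering step, the RMS definition of dispersion, the mic count `n_mics`, and the fact that rescaled mics may land outside the square are all assumptions of this sketch.

```python
import numpy as np

def random_array_class(n_arrays=120, n_mics=16, side=8.0,
                       centroid=(0.0, 0.0), dispersion=2.0, seed=0):
    """Draw n_arrays random 2D geometries sharing one centroid and dispersion."""
    rng = np.random.default_rng(seed)
    xy = rng.uniform(-side / 2, side / 2, size=(n_arrays, n_mics, 2))
    xy -= xy.mean(axis=1, keepdims=True)                   # move each centroid to origin
    rms = np.sqrt((xy ** 2).sum(axis=2).mean(axis=1))      # current dispersion per array
    xy *= (dispersion / rms)[:, None, None]                # rescale to target dispersion
    xy += np.asarray(centroid, dtype=float)                # shift to target centroid
    return xy

arrays = random_array_class()
```

Sweeping `centroid` while fixing `dispersion`, or the reverse, gives the one-parameter-at-a-time runs described above.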

Page 43: MSEE Defense

Layout

Page 44: MSEE Defense

Centroid Displacement

Page 45: MSEE Defense

Dispersion

Page 46: MSEE Defense

Results

• Centroid centered over target always best

– Irregular arrays more robust when centroid shifts

• Dispersion a classic tradeoff

– Tightly-packed array: tight mainlobe but strong sidelobes

– Widely-spread array: wide mainlobe but weak sidelobes

Page 47: MSEE Defense

Part 5: Final Conclusions & Future Work

• Statistical methods for improving GSC ineffective

– Low SNR introduces large error

• Introducing separate, concrete info helped

– ISO model gave a tiny improvement

– More accurate target position (laser, SSL) always best for steering

• Array geometry is most important to improving performance

– Linear array good, but random arrays have potential to do better

– Found that a ceiling array should be centered over its intended target, but...

– Open question: how does one describe the best array for beamforming on human speech?

Page 48: MSEE Defense

Special Thanks

• Advisor

– Dr. Kevin Donohue

• Thesis Committee Members

– Dr. Jens Hannemann

– Dr. Samson Cheung

• Everyone at the UK Vis Center

Page 49: MSEE Defense

Questions?

Page 50: MSEE Defense

Extra Slides

Page 51: MSEE Defense

Frost Algorithm

• Solution to the constrained optimization

subject to the constraint (C a selection matrix)

The constraint vector dictates the sum of column weights, often F = [1 0 0 0...]

• Solution (P and F constant matrices):
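The formulas on this slide were lost in extraction. For reference, the constrained LMS algorithm as usually stated from Frost's 1972 paper is given below. The symbols are assumptions: here the script letter 𝓕 denotes the constraint vector (which the slide appears to call F), while F and P are the two constant quantities in the update.

```latex
% Constrained minimisation of output power
\min_{W} \; W^{T} R_{XX} W \quad \text{subject to} \quad C^{T} W = \mathcal{F}

% Constrained LMS update, started from W[0] = F
W[n+1] = P \left( W[n] - \mu \, y[n] \, X[n] \right) + F

% Constant projection matrix and constant vector
P = I - C \left( C^{T} C \right)^{-1} C^{T}, \qquad
F = C \left( C^{T} C \right)^{-1} \mathcal{F}
```

Every iteration projects the weights back onto the constraint plane, which is the extra work the GSC avoids by splitting the problem into a fixed branch and an unconstrained adaptive branch.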

Page 52: MSEE Defense
