MSEE Defense
1. Enhancements to the Generalized Sidelobe Canceller
for Audio Beamforming in an Immersive Environment
Phil Townsend
MSEE Candidate
University of Kentucky
www.vis.uky.edu | Dedicated to Research, Education and Industrial Outreach | 859.257.1257
2. Overview
1) Introduction
- Adaptive Beamforming and the GSC
2) Amplitude Scaling Improvements
- 1/r Model, Acoustic Physics, Statistical
3) Automatic Target Alignment
- Thresholded Cross Correlation using PHAT-β
4) Array Geometry Analysis
- Volumetric Beamfield Plots
- Monte Carlo Test of Geometric Parameters
5) Final Conclusions and Questions
3. Part 1: Introduction
• What's beamforming?
• A spatial filter that enhances sound
based on its spatial position through the
coherent processing of signals from
distributed microphones.
– Reduce room noise/effects
– Suppress interfering speakers
4. Adaptive Beamforming
• Optimization of Generalized Filter
Coefficients
$y[n] = W_{opt}^{T}[n]\, X[n]$
– Often requires minimizing output energy
while keeping target component unchanged
• Estimate statistics on the fly
– Input Correlation Matrix unknown/changing
• Gradient Descent Toward Optimal Taps
– Constrained minimum-energy output is the
unique minimum of a bowl-shaped surface
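The descent toward a unique minimum can be illustrated on a small quadratic cost. This is a minimal sketch of the unconstrained case only, not the constrained problem from the slide: `R` and `p` are hypothetical stand-ins for an input correlation matrix and a cross-correlation vector, and the step size `mu` is picked by hand.

```python
import numpy as np

def gradient_descent_quadratic(R, p, mu=0.1, steps=200):
    """Minimize J(w) = w^T R w - 2 p^T w by steepest descent."""
    w = np.zeros_like(p, dtype=float)
    for _ in range(steps):
        grad = 2.0 * (R @ w - p)   # gradient of the quadratic cost
        w = w - mu * grad          # step downhill on the bowl
    return w

# Hypothetical statistics, chosen only so that R is positive definite.
R = np.array([[2.0, 0.5], [0.5, 1.0]])
p = np.array([1.0, 0.3])

w_hat = gradient_descent_quadratic(R, p)
w_opt = np.linalg.solve(R, p)      # closed-form minimum for comparison
```

Because `R` is positive definite the surface has a single minimum, so the iterate `w_hat` approaches `w_opt` for any sufficiently small step size.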
5. Visualization of Gradient Descent
From http://en.wikipedia.org/wiki/Gradient_descent; Image in Public Domain
6. Generalized Sidelobe Canceller
(GSC)
• Simplifies Frost's constrained adaptation
into two stages
– A fixed, Delay-Sum Beamformer
– A Blocking Matrix branch whose outputs are
adaptively filtered and subtracted from the
Delay-Sum output
– Adaptation can be any algorithm; we use
NLMS here
– Simplification comes mostly from enforcing
distortionless response
8. GSC (cont'd)
• Upper branch: the DSB result
• Lower branch: the BM tracks, formed with
the traditional Blocking Matrix
9. GSC (cont'd)
• Final output is the DSB result minus the
adaptively filtered BM tracks
• Adaptation algorithm for each BM track is
NLMS (much faster than constrained adaptation)
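The two-branch structure can be sketched in a few lines. This is a minimal illustration under assumptions not stated on the slides: the microphone signals are already steered (time-aligned) toward the target, the Blocking Matrix is the adjacent-difference form, and `num_taps`, `mu`, and `eps` are hypothetical parameter choices.

```python
import numpy as np

def gsc_nlms(x, num_taps=8, mu=0.1, eps=1e-8):
    """Sketch of a time-domain GSC.

    x: (M, N) array of already time-aligned microphone signals.
    Returns the output signal and the final adaptive taps.
    """
    M, N = x.shape
    dsb = x.mean(axis=0)               # upper branch: delay-sum output
    bm = x[:-1] - x[1:]                # lower branch: adjacent differences, (M-1, N)
    w = np.zeros((M - 1, num_taps))    # one adaptive FIR filter per BM track
    y = np.zeros(N)
    for n in range(num_taps - 1, N):
        # Most recent num_taps samples of each BM track, newest first.
        u = bm[:, n - num_taps + 1:n + 1][:, ::-1]
        y[n] = dsb[n] - np.sum(w * u)  # subtract the filtered noise references
        norm = np.sum(u * u, axis=1, keepdims=True) + eps
        w += mu * y[n] * u / norm      # NLMS update, normalized per track
    return y, w
```

When the target is identical in every aligned channel, the difference tracks are zero, nothing is subtracted, and the output equals the delay-sum result; that is the distortionless property the structure is built around.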
10. Limitations of Current Models and
Methods
• Blocking Matrix Leakage
– Farfield assumption not valid for immersive
microphone arrays
– Target steering might be incorrect
• Most research limited to equispaced linear arrays
– Hard to construct
– Limited useful frequency range
– Want to explore other geometries and find the best
11. Part 2: Amplitude Correction
• Nearfield acoustics means target
component has different amplitude in
each microphone
• Propose and test a few models to correct
cancellation
– 1/r Model
– Sound propagation filtering
– Statistical filtering
12. Simple 1/r Model
• The acoustic wave equation is solved by
a function inversely proportional to r
• So build a BM that uses this fact (keep
tracks in distance order)
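One way to build such a matrix is sketched below. It assumes the target amplitude at each microphone is exactly proportional to 1/r and that each row differences two neighbors in distance order; the function name and the example distances are hypothetical.

```python
import numpy as np

def one_over_r_blocking_matrix(distances):
    """Adjacent-difference blocking matrix with 1/r amplitude correction."""
    d = np.asarray(distances, dtype=float)
    order = np.argsort(d)              # keep tracks in distance order
    r = d[order]
    M = r.size
    B = np.zeros((M - 1, M))
    for i in range(M - 1):
        B[i, i] = 1.0
        # Scale the farther microphone up so its target amplitude
        # matches the nearer one before subtracting.
        B[i, i + 1] = -r[i + 1] / r[i]
    return order, B

distances = [2.0, 1.0, 3.5, 1.5]       # hypothetical target-to-mic distances (m)
order, B = one_over_r_blocking_matrix(distances)
target_gain = 1.0 / np.sort(distances) # modeled target amplitude per sorted mic
residual = B @ target_gain             # target leakage under the 1/r model
```

Under the model, `residual` is zero, so the target is fully blocked. The matrix would be applied to the aligned signals as `B @ x[order]`.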
13. ISO Acoustic Physics Model
• Fluid dynamics can be taken into
account to design a filter based on
distance, temperature, humidity, and
pressure (ISO standard 9613)
• Might allow us to add easily-obtainable
information to enhance beamforming
14. Statistical Amplitude Scaling
• Lump all corruptive effects together and
minimize energy of difference of tracks
• Carry out as a function of frequency to
get a scale factor per frequency bin
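A least-squares version of this idea is sketched below. It is an assumption about the form of the estimator, not the formula from the slide: for each frequency bin it picks the complex scale that minimizes the energy of the difference between two tracks, averaged over frames. `frame_len` and `eps` are hypothetical choices.

```python
import numpy as np

def statistical_scale(x_a, x_b, frame_len=256, eps=1e-12):
    """Per-bin scale alpha minimizing sum |A(f) - alpha(f) * B(f)|^2 over frames."""
    n_frames = min(len(x_a), len(x_b)) // frame_len
    n = n_frames * frame_len
    A = np.fft.rfft(np.reshape(x_a[:n], (n_frames, frame_len)), axis=1)
    Bf = np.fft.rfft(np.reshape(x_b[:n], (n_frames, frame_len)), axis=1)
    num = np.sum(A * np.conj(Bf), axis=0)       # cross-spectrum estimate
    den = np.sum(np.abs(Bf) ** 2, axis=0) + eps # power-spectrum estimate
    return num / den
```

The blocking track for the pair is then `A - alpha * B` in each bin. The estimate is built from noisy data, so it degrades as SNR drops.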
15. ISO and Statistical BMs
• ISO Model (Frequency Domain)
• Statistical Scaling (Frequency Domain)
16. A Perfect Blocking Matrix
• Audio Cage data was collected with
targets and speakers separate, so a
perfect BM can be simulated
• Shows upper bound on possible
improvement
18. Experimental Evaluation of
Methods
• Set initial intelligibility to around 0.3
• Beamform for many target and noise
scenarios
• Find mean correlation coefficient of BM
tracks (want as low as possible) and
overall output (want as large as possible)
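The two scores can be computed as sketched below. This assumes that both correlations are taken against a clean target recording, which the slide does not state explicitly; the function names are hypothetical.

```python
import numpy as np

def corr_coef(a, b):
    """Pearson correlation coefficient between two equal-length signals."""
    return float(np.corrcoef(a, b)[0, 1])

def mean_bm_correlation(bm_tracks, target):
    """Mean absolute correlation of each BM track with the target (lower is better)."""
    return float(np.mean([abs(corr_coef(t, target)) for t in bm_tracks]))

def output_correlation(output, target):
    """Correlation of the beamformer output with the target (higher is better)."""
    return corr_coef(output, target)
```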
19. Results
• Most real methods make little difference
– Statistical scaling a little worse b/c of bad
SNR
– ISO filtering a little better b/c of more info
– 1/r model made no difference
• Perfect BM made slight improvement,
but array geometry was most important!
• Listen to some examples...
22. Part 3: Automatic Steering
• If steering delays aren't right then target
signal leakage occurs and DSB is
weaker.
• Cross correlation is a highly robust
technique for finding similarities between
signals, so use it to fine-tune delays
• Apply window and correlation strength
thresholds to try to improve performance
in poor SNR environment
23. GCC and PHAT-β
• Find the cross correlation between tracks
over only a small window of possible movements
and whiten to make the spike stand out
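A windowed, partially whitened cross-correlation can be sketched as follows. In PHAT-β the cross-spectrum is divided by its magnitude raised to the power β, so β = 1 is the full PHAT and β = 0 is the plain cross-correlation. The default `beta`, the window `max_lag`, and the function name are hypothetical choices.

```python
import numpy as np

def gcc_phat_beta(x1, x2, beta=0.8, max_lag=16, eps=1e-12):
    """Cross-correlation of x1 against x2 with PHAT-beta whitening.

    Returns (lags, cc) restricted to lags in [-max_lag, max_lag].
    A positive peak lag means x1 is delayed relative to x2.
    """
    n = len(x1) + len(x2)                    # zero-pad to avoid circular wrap
    X1 = np.fft.rfft(x1, n)
    X2 = np.fft.rfft(x2, n)
    cross = X1 * np.conj(X2)
    cross /= (np.abs(cross) + eps) ** beta   # partial whitening sharpens the peak
    cc = np.fft.irfft(cross, n)
    # Negative lags sit at the end of the circular result; reorder them.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, cc
```

The delay estimate is `lags[np.argmax(cc)]`. Restricting the search to a small window keeps the estimate near the current steering delay.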
24. Correlation Coefficient Threshold
• Since environment is noisy and speaker
might go silent, update only if max
correlation is sufficiently strong
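The gating rule can be sketched as below. For simplicity this uses a plain normalized cross-correlation rather than the whitened PHAT-β version, and the function name, `max_lag`, and the default `threshold` are hypothetical.

```python
import numpy as np

def thresholded_lag_update(current_lag, x_ref, x_mic, max_lag=16, threshold=0.5):
    """Return (lag, peak_rho); the lag changes only if the peak is strong enough."""
    a = x_mic - np.mean(x_mic)
    v = x_ref - np.mean(x_ref)
    cc = np.correlate(a, v, mode="full")         # cc[k] = sum_n a[n+k] * v[n]
    lags = np.arange(-(len(v) - 1), len(a))
    rho = cc / (np.linalg.norm(a) * np.linalg.norm(v) + 1e-12)
    window = np.abs(lags - current_lag) <= max_lag   # search near the current lag
    k = np.argmax(rho[window])
    peak_lag, peak_rho = lags[window][k], rho[window][k]
    if peak_rho >= threshold:
        return int(peak_lag), float(peak_rho)    # accept the new delay
    return int(current_lag), float(peak_rho)     # hold the previous delay
```

A weak peak, as when the speaker is silent or the frame is mostly noise, leaves the steering delay unchanged.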
25. Experimental Evaluation
• Same setup as before
– Initial intelligibility ~0.3
– Find output correlation with closest mic
• Vary correlation threshold from 0.1 to 0.9
26. Results
• Tighter threshold better but updates never help
vs original GSC
– Low threshold: erratic focal point movement
– High threshold: can't recover from bad
updates
– Low SNR makes good estimates very
difficult
• Retrace of lags (multilateration) shows search
window D should be tighter
• Array geometry still more important
• Listen to some more examples...
27. Output Correlation Chart
Normal GSC Performance for Comparison
30. Part 4: Array Geometry
• Since array geometry is the most
important factor, we need to find what
the best layouts are and why
• Start by generating beamfields to
visualize array performance and look for
patterns qualitatively
• Then propose parameters and run
computer simulations quantitatively
31. Volumetric Beamfield Plots
• GSC beamfield changes over time, but
DSB is root of the system and
performance is constant.
• Need to see performance in three
dimensions
• Use layered approach with colors to
indicate intensity and transparency to
see features inside the space
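A simplified beamfield calculation is sketched below. It is a narrowband, free-field approximation under assumptions not taken from the slides: a single frequency `freq`, a speed of sound `c` of 343 m/s, no amplitude decay, and no room reflections. The function name is hypothetical.

```python
import numpy as np

def dsb_beamfield(mics, focus, points, freq=1000.0, c=343.0):
    """Normalized delay-sum power at each point for an array steered to `focus`.

    mics: (M, 3) microphone positions, focus: (3,), points: (P, 3), in meters.
    """
    k = 2.0 * np.pi * freq / c                        # wavenumber
    d_focus = np.linalg.norm(mics - focus, axis=1)    # (M,) steering distances
    d_pts = np.linalg.norm(points[:, None, :] - mics[None, :, :], axis=2)  # (P, M)
    # Residual phase per mic after steering delays are applied.
    phase = np.exp(-1j * k * (d_pts - d_focus[None, :]))
    return np.abs(phase.mean(axis=1)) ** 2            # 1.0 at the focal point
```

Evaluating this on a 3D grid of points gives the volume of values that the layered color and transparency plot would display.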
32. Linear Array
• Generally good performance
– Office too small for sidelobes to appear
• Mainlobe elongated toward array
34. Perimeter Array
• Also generally good
– Very tight mainlobe
• No height resolution
– Not a problem in an office though
– Motivation for ceiling arrays
36. Random Arrays
• Performance highly variable
– One best of the lot, one very bad
• Need to find ways to describe and select
best random arrays (coming soon)
39. A Monte Carlo Experiment for
Analysis of Geometry
• Propose the following parameters for
describing array geometry in 2D and
evaluate array performance for many
randomly-chosen geometries:
– Centroid
• Array center of gravity (mean position)
– Dispersion
• Mic spread (standard deviation of positions)
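The two descriptors can be computed as sketched below. Dispersion is taken here as the RMS distance of the microphones from the centroid, which is one reasonable reading of "standard deviation of positions" and is an assumption; the function name is hypothetical.

```python
import numpy as np

def centroid_and_dispersion(positions):
    """positions: (M, 2) microphone coordinates in meters."""
    positions = np.asarray(positions, dtype=float)
    centroid = positions.mean(axis=0)                 # mean position
    offsets = positions - centroid
    dispersion = np.sqrt(np.mean(np.sum(offsets ** 2, axis=1)))  # RMS spread
    return centroid, dispersion

# Hypothetical four-microphone square, 2 m on a side.
square = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
c, s = centroid_and_dispersion(square)
```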
41. Monte Carlo (cont'd)
• For a given centroid and dispersion,
evaluate the array based on:
– PSR – Peak to Side lobe Ratio
• Worst-case interference
– MLW – Main Lobe Width
• Tightness of enhancement area
• Redefined in 2D to use x and y 3dB widths
$w_{3dB} = \sqrt{x_{3dB}^{2} + y_{3dB}^{2}}$
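A 2D main lobe width can be measured from two cuts through the beamfield peak, as sketched below. It assumes the x and y 3 dB widths are combined as the root of the sum of their squares and that each width is the contiguous half-power region around the peak; the function names are hypothetical.

```python
import numpy as np

def width_3db(axis, power):
    """Width of the contiguous region around the peak with power >= half the peak."""
    power = np.asarray(power, dtype=float)
    peak = int(np.argmax(power))
    half = power[peak] / 2.0
    lo = peak
    while lo > 0 and power[lo - 1] >= half:
        lo -= 1                        # walk left while still above half power
    hi = peak
    while hi < len(power) - 1 and power[hi + 1] >= half:
        hi += 1                        # walk right while still above half power
    return axis[hi] - axis[lo]

def main_lobe_width(x_axis, x_cut, y_axis, y_cut):
    """Combine the x and y 3 dB widths into a single 2D width."""
    return float(np.hypot(width_3db(x_axis, x_cut), width_3db(y_axis, y_cut)))
```

The result is quantized to the grid spacing of the cuts, so a finer grid gives a more accurate width.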
42. Monte Carlo Simulation
• Test variation of one parameter while
holding the other constant.
• Generate random positions from an
8x8m square and target a sound source
1m below center
• Choose 120 random geometries for each
run (a “class” of arrays)
• Compare to rectangular array
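Generating one class of arrays can be sketched as below. This is one possible procedure under stated assumptions, not necessarily the one used in the experiment: positions are drawn uniformly in the square, shifted to the requested centroid, and rescaled to the requested dispersion (RMS distance from the centroid). The microphone count of 16 and the centroid and dispersion values are hypothetical.

```python
import numpy as np

def random_geometry(rng, n_mics, centroid, dispersion, half_side=4.0, max_tries=1000):
    """Random 2D layout inside an 8x8 m square with a fixed centroid and dispersion."""
    for _ in range(max_tries):
        p = rng.uniform(-half_side, half_side, size=(n_mics, 2))
        p -= p.mean(axis=0)                               # move centroid to origin
        p *= dispersion / np.sqrt(np.mean(np.sum(p ** 2, axis=1)))  # set the spread
        p += centroid                                     # move to requested centroid
        if np.all(np.abs(p) <= half_side):                # keep layouts inside the square
            return p
    raise RuntimeError("no geometry found inside the square")

rng = np.random.default_rng(0)
# One "class": 120 geometries sharing the same centroid and dispersion.
array_class = [random_geometry(rng, 16, np.zeros(2), 2.0) for _ in range(120)]
```

Sweeping `centroid` with `dispersion` fixed, or the reverse, gives the one-parameter-at-a-time comparison described above.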
43. Layout
46. Results
• Centroid centered over target always best
– Irregular arrays more robust when centroid
shifts
• Dispersion a classic tradeoff
– Tightly-packed array: tight mainlobe but strong
sidelobes
– Widely-spread array: wide mainlobe but weak
sidelobes
47. Part 5: Final Conclusions & Future Work
• Statistical methods for improving GSC ineffective
– Low SNR introduces large error
• Introducing separate, concrete info helped
– ISO model gave a tiny improvement
– More accurate target position (laser, SSL) always best
for steering
• Array geometry is most important to improving performance
– Linear array good, but random arrays have potential to
do better
– Found that a ceiling array should be centered over its
intended target, but...
– Open question: how does one describe the best array
for beamforming on human speech?
48. Special Thanks
• Advisor
– Dr. Kevin Donohue
• Thesis Committee Members
– Dr. Jens Hannemann
– Dr. Samson Cheung
• Everyone at the UK Vis Center
51. Frost Algorithm
• Solution to the constrained optimization
subject to the constraint (C a selection
matrix)
The constraint vector dictates the sum of
column weights, often F = [1 0 0 0...]
• Solution (P and F constant matrices):
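For reference, the standard textbook statement of Frost's constrained LMS problem and its update is sketched below. The notation is assumed and may differ from the slide's: $R_{XX}$ is the input correlation matrix, $\mu$ the step size, and the constraint vector is written $\mathcal{F}$ to keep it apart from the constant $F$ in the update.

```latex
% Constrained minimization of output power
\min_{W}\; W^{T} R_{XX} W
\quad \text{subject to} \quad C^{T} W = \mathcal{F}

% Frost's iterative solution, with y[n] = W^{T}[n] X[n]
W[n+1] = P\,\bigl(W[n] - \mu\, y[n]\, X[n]\bigr) + F,
\qquad P = I - C\,(C^{T}C)^{-1}C^{T},
\qquad F = C\,(C^{T}C)^{-1}\mathcal{F}
```

Both $P$ and $F$ depend only on the constraint, so they can be computed once before adaptation starts.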