University of Surrey


Multimodal blind source separation with a circular microphone array and robust beamforming

Naqvi, SM, Khan, MS, Chambers, JA, Liu, Qingju and Wang, Wenwu (2011) Multimodal blind source separation with a circular microphone array and robust beamforming. In: 19th European Signal Processing Conference (EUSIPCO-2011), 29 Aug - 02 Sep 2011, Barcelona, Spain.

NaqviKLWC_EUSIPCO_2011.pdf - Submitted version (pre-print)



A novel multimodal (audio-visual) approach to blind source separation (BSS) is evaluated in room environments. BSS in realistic environments faces two main challenges: 1) the sources move in complex ways, and 2) the room impulse responses are long. For moving sources, the unmixing filters that separate the audio signals are difficult to estimate from the statistical information available in a limited number of audio samples. For physically stationary sources in rooms with long impulse responses, the performance of audio-only BSS methods is limited. The visual modality is therefore used to facilitate the separation. The movement of the sources is detected with a 3-D tracker based on a Markov chain Monte Carlo particle filter (MCMC-PF), and the direction of arrival of each source at the microphone array is estimated. A robust least squares frequency invariant data independent (RLSFIDI) beamformer is implemented to perform real-time speech enhancement. Uncertainties in source localization and direction of arrival are handled by a convex optimization approach in the beamformer design. A 16-element circular array configuration is used. Simulation studies based on objective and subjective measures confirm the advantage of beamforming-based processing over conventional BSS methods. © 2011 EURASIP.
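
The geometric step described in the abstract, turning a tracked 3-D source position into a direction of arrival and steering a 16-element circular array toward it, can be illustrated with a minimal sketch. This is not the paper's RLSFIDI design: a plain delay-and-sum beamformer stands in for it, and no robustness constraint or convex optimization is included. The array radius (0.1 m), the speed of sound, the far-field plane-wave model and all function names are assumptions made for illustration only.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def doa_from_position(source_xyz, array_centre_xyz):
    """Azimuth and elevation (radians) of a source seen from the array centre."""
    dx, dy, dz = (s - c for s, c in zip(source_xyz, array_centre_xyz))
    azimuth = math.atan2(dy, dx)
    elevation = math.atan2(dz, math.hypot(dx, dy))
    return azimuth, elevation


def circular_array_positions(num_mics=16, radius_m=0.1):
    """(x, y) microphone coordinates on a uniform circle (radius is assumed)."""
    return [
        (radius_m * math.cos(2 * math.pi * m / num_mics),
         radius_m * math.sin(2 * math.pi * m / num_mics))
        for m in range(num_mics)
    ]


def steering_vector(azimuth, elevation, freq_hz, mic_xy):
    """Far-field plane-wave steering vector for a horizontal planar array."""
    k = 2 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    # Horizontal components of the unit vector pointing toward the source.
    ux = math.cos(elevation) * math.cos(azimuth)
    uy = math.cos(elevation) * math.sin(azimuth)
    return [cmath.exp(1j * k * (x * ux + y * uy)) for x, y in mic_xy]


def delay_and_sum_gain(look_az, test_az, freq_hz, mic_xy, elevation=0.0):
    """Magnitude response toward test_az of a beamformer steered to look_az."""
    a_look = steering_vector(look_az, elevation, freq_hz, mic_xy)
    a_test = steering_vector(test_az, elevation, freq_hz, mic_xy)
    # Delay-and-sum weights are the look-direction steering vector divided by M.
    response = sum(w.conjugate() * a for w, a in zip(a_look, a_test)) / len(mic_xy)
    return abs(response)


if __name__ == "__main__":
    # Hypothetical tracker output: source position and array centre in metres.
    az, el = doa_from_position((1.0, 1.0, 1.0), (0.0, 0.0, 1.0))
    mics = circular_array_positions()
    print("azimuth (deg):", math.degrees(az))
    print("gain toward source:", delay_and_sum_gain(az, az, 1000.0, mics))
    print("gain from behind:", delay_and_sum_gain(az, az + math.pi, 1000.0, mics))
```

In this sketch the beamformer has unit gain in the look direction and a lower gain elsewhere. A frequency-invariant, robust design such as the one the abstract describes would instead choose the weights by constrained optimization over a range of frequencies and direction errors.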

Item Type: Conference or Workshop Item (Conference Paper)
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing
Authors :
Naqvi, SM
Khan, MS
Chambers, JA
Liu, Qingju
Wang, Wenwu
Date : 2011
Depositing User : Symplectic Elements
Date Deposited : 17 Dec 2013 17:07
Last Modified : 16 Jan 2019 16:49
