Audio-visual Convolutive Blind Source Separation
Liu, Q, Wang, W and Jackson, P (2010) Audio-visual Convolutive Blind Source Separation. In: Sensor Signal Processing for Defence, 2010-09-29 - 2010-09-30, London, UK.
Available under License : See the attached licence file.
We present a novel method for separating speech from audio mixtures using audio-visual coherence. It consists of two stages: in the off-line training stage, we use a Gaussian mixture model to statistically characterise the audio-visual coherence with features obtained from the training set; at the separation stage, likelihood maximization is performed on the spectral components separated by independent component analysis (ICA). To address the permutation and scaling indeterminacies of frequency-domain blind source separation (BSS), a new sorting and rescaling scheme using the bimodal coherence is proposed. We tested our algorithm on the XM2VTS database, and the results show that it can address the permutation problem with high accuracy and mitigate the scaling problem effectively.
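The sorting idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the one-dimensional features, the two-component mixture parameters, and the function names `gmm_log_likelihood` and `best_permutation` are all assumptions made for illustration. It shows only how a trained audio-visual mixture model could score each possible assignment of ICA outputs to speakers in a single frequency bin and keep the most likely one; the rescaling step is not covered.

```python
import itertools
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-density of vector x under a diagonal-covariance Gaussian mixture."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def best_permutation(audio_feats, visual_feats, gmm):
    """Return perm such that ICA output perm[j] is assigned to speaker j.

    audio_feats[i]  : audio feature vector of ICA output i (one frequency bin)
    visual_feats[j] : visual feature vector of speaker j
    gmm             : (weights, means, variances) of the joint audio-visual model
    """
    n = len(audio_feats)
    best, best_score = None, -math.inf
    for perm in itertools.permutations(range(n)):
        # Joint feature = audio features followed by visual features.
        score = sum(
            gmm_log_likelihood(audio_feats[perm[j]] + visual_feats[j], *gmm)
            for j in range(n)
        )
        if score > best_score:
            best, best_score = perm, score
    return best


# Hypothetical toy model: audio and visual features are coherent when they
# share the same sign (both near +1 or both near -1).
toy_gmm = (
    [0.5, 0.5],                    # mixture weights
    [[1.0, 1.0], [-1.0, -1.0]],    # component means (audio, visual)
    [[0.1, 0.1], [0.1, 0.1]],      # diagonal variances
)
visual = [[1.0], [-1.0]]           # speaker 0, speaker 1
audio = [[-0.9], [1.1]]            # ICA outputs arrive in swapped order

perm = best_permutation(audio, visual, toy_gmm)
print(perm)
```

The exhaustive search over permutations is only practical for a handful of sources, which is the usual setting for this kind of separation problem. In a frequency-domain system the same decision would be repeated for every bin.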
|Item Type:||Conference or Workshop Item (Conference Paper)|
|Divisions :||Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing|
|Date :||29 September 2010|
|Identification Number :||10.1049/ic.2010.0225|
|Additional Information :||© 2010 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.|
|Depositing User :||Symplectic Elements|
|Date Deposited :||12 Dec 2012 15:32|
|Last Modified :||09 Jun 2014 13:17|