University of Surrey


Audio-visual Convolutive Blind Source Separation

Liu, Q, Wang, W and Jackson, PJB (2010) Audio-visual Convolutive Blind Source Separation


Abstract

We present a novel method for separating speech sources from their audio mixtures using audio-visual coherence. It consists of two stages: in the off-line training stage, we use a Gaussian mixture model to characterise statistically the audio-visual coherence with features obtained from the training set; at the separation stage, likelihood maximization is performed on the independent component analysis (ICA)-separated spectral components. To address the permutation and scaling indeterminacies of frequency-domain blind source separation (BSS), a new sorting and rescaling scheme using the bimodal coherence is proposed. We tested our algorithm on the XM2VTS database, and the results show that it can address the permutation problem with high accuracy and mitigate the scaling problem effectively.
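
The permutation-sorting idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the scalar per-bin audio feature, the diagonal-covariance GMM parameters, and the function names `gmm_log_likelihood` and `resolve_permutations` are all assumptions made for illustration. Given a GMM already trained on joint audio-visual features, each frequency bin is assigned the ordering of ICA outputs whose joint features are most likely under that model.

```python
import itertools
import numpy as np

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one vector x under a diagonal-covariance GMM."""
    diff = x - means  # shape (K, D): offset from each component mean
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=1))
    m = log_comp.max()  # log-sum-exp for numerical stability
    return m + np.log(np.exp(log_comp - m).sum())

def resolve_permutations(audio, visual, weights, means, variances):
    """Pick, for every frequency bin, the output ordering that maximises
    the summed audio-visual log-likelihood.

    audio:  (n_bins, n_src) hypothetical scalar feature per ICA output
    visual: (n_src, Dv) visual feature of each speaker
    Returns one tuple per bin; entry i is the ICA output assigned to speaker i.
    """
    n_bins, n_src = audio.shape
    perms = []
    for f in range(n_bins):
        def score(p):
            return sum(
                gmm_log_likelihood(
                    np.concatenate(([audio[f, p[i]]], visual[i])),
                    weights, means, variances)
                for i in range(n_src))
        perms.append(max(itertools.permutations(range(n_src)), key=score))
    return perms

# Toy model (assumed, not trained): audio and visual features move together.
weights = np.array([0.5, 0.5])
means = np.array([[1.0, 1.0], [-1.0, -1.0]])   # columns: [audio, visual]
variances = np.full((2, 2), 0.25)

visual = np.array([[1.0], [-1.0]])             # speaker 0, speaker 1
audio = np.array([[1.0, -1.0],                 # bin 0: outputs in order
                  [-1.0, 1.0]])                # bin 1: outputs swapped

perms = resolve_permutations(audio, visual, weights, means, variances)
print(perms)  # [(0, 1), (1, 0)]
```

In this toy case the second bin is detected as swapped and would be reordered before the spectral components are recombined. Exhaustive search over permutations is only practical for a small number of sources, and the rescaling step mentioned in the abstract is not covered here.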

Item Type: Conference or Workshop Item (Paper)
Divisions: Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing
Depositing User: Symplectic Elements
Date Deposited: 09 Dec 2011 15:55
Last Modified: 23 Sep 2013 18:50
URI: http://epubs.surrey.ac.uk/id/eprint/7726

