University of Surrey

Bimodal Coherence based Scale Ambiguity Cancellation for Target Speech Extraction and Enhancement

Liu, Q, Wang, W and Jackson, PJB (2010) Bimodal Coherence based Scale Ambiguity Cancellation for Target Speech Extraction and Enhancement. In: 11th Annual Conference of the International Speech Communication Association 2010, 2010-09-26 - 2010-09-30, Makuhari, Japan.

PDF: LiuWangJackson_IS10.pdf (493kB)
Available under licence: see the attached licence file.
Plain Text (licence): licence.txt (1kB)

Abstract

We present a novel method for extracting target speech from auditory mixtures using bimodal coherence, which is statistically characterised by a Gaussian mixture model (GMM) in an offline training process, using robust features obtained from audio-visual speech. We then adjust the spectral components separated by independent component analysis (ICA) using the bimodal coherence in the time-frequency domain, to mitigate the scale ambiguities in different frequency bins. We tested our algorithm on the XM2VTS database, and the results show that the proposed algorithm improves performance in terms of signal-to-interference ratio (SIR) measurements.
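
The scale ambiguity mentioned in the abstract can be illustrated with a small numerical sketch. It is not the authors' algorithm and does not implement the bimodal coherence or the GMM; it only shows, for a single frequency bin, why ICA cannot determine the scale of each separated component on its own. The 2x2 mixing matrix, the source values, the scale factors, and the helper names `mix` and `rescale` are all made up for illustration.

```python
# Illustrative sketch only: all names and numbers are hypothetical,
# not taken from the paper.

def mix(A, s):
    """Multiply a 2x2 complex mixing matrix A by a 2-element source vector s."""
    return [A[0][0] * s[0] + A[0][1] * s[1],
            A[1][0] * s[0] + A[1][1] * s[1]]

def rescale(A, s, d):
    """Scale source i by d[i] and column i of A by 1/d[i]."""
    A2 = [[A[r][c] / d[c] for c in range(2)] for r in range(2)]
    s2 = [s[i] * d[i] for i in range(2)]
    return A2, s2

# One frequency bin: mixing matrix and source spectra (made-up values).
A = [[1.0 + 0.5j, 0.3 - 0.2j],
     [0.4 + 0.1j, 0.9 - 0.3j]]
s = [2.0 + 1.0j, -0.5 + 0.7j]

# Arbitrary nonzero complex scales, a different one for each source.
d = [3.0 - 1.0j, 0.25j]

x_original = mix(A, s)
A_scaled, s_scaled = rescale(A, s, d)
x_rescaled = mix(A_scaled, s_scaled)

# Both factorisations explain the same observed mixture, so the
# mixture alone cannot reveal which scaling of the sources is correct.
max_diff = max(abs(a - b) for a, b in zip(x_original, x_rescaled))
print(max_diff < 1e-12)
```

Because the arbitrary scale can differ from one frequency bin to the next, the recovered spectrum is distorted unless some additional cue fixes the per-bin scale; in the abstract, that cue is the bimodal (audio-visual) coherence.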

Item Type: Conference or Workshop Item (Paper)
Divisions: Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing
Depositing User: Symplectic Elements
Date Deposited: 09 Dec 2011 11:58
Last Modified: 23 Sep 2013 18:50
URI: http://epubs.surrey.ac.uk/id/eprint/7725

© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800