Bimodal Coherence based Scale Ambiguity Cancellation for Target Speech Extraction and Enhancement
Liu, Q, Wang, W and Jackson, PJB (2010) Bimodal Coherence based Scale Ambiguity Cancellation for Target Speech Extraction and Enhancement. In: 11th Annual Conference of the International Speech Communication Association 2010, 2010-09-26 - 2010-09-30, Makuhari, Japan.
We present a novel method for extracting target speech from auditory mixtures using bimodal coherence. The coherence is statistically characterised by a Gaussian mixture model (GMM) during offline training, using robust features obtained from audio-visual speech. We then use the bimodal coherence to adjust the spectral components separated by independent component analysis (ICA) in the time-frequency domain, mitigating the scale ambiguities across frequency bins. We tested the algorithm on the XM2VTS database, and the results show that it improves performance in terms of signal-to-interference ratio (SIR) measurements.
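For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how an SIR value in decibels is commonly computed from a target component and an interference component. It uses the generic definition (ratio of signal powers on a logarithmic scale). The abstract does not state the exact evaluation procedure used in the paper, so the function name `sir_db` and the toy signals are assumptions made purely for illustration.

```python
import math


def sir_db(target, interference):
    """Generic signal-to-interference ratio in dB (illustrative helper,
    not taken from the paper).

    Both arguments are equal-length sequences of real-valued samples:
    the target-speech part and the residual-interference part of a
    separated output.
    """
    if len(target) != len(interference):
        raise ValueError("target and interference must have equal length")
    # Signal energy is the sum of squared samples; the common length
    # cancels in the ratio, so energies can stand in for powers.
    p_target = sum(x * x for x in target)
    p_interf = sum(x * x for x in interference)
    if p_interf == 0.0:
        return math.inf   # no interference left at all
    if p_target == 0.0:
        return -math.inf  # target entirely absent
    return 10.0 * math.log10(p_target / p_interf)


# Toy signals (assumed values): the target has 100 times the energy of
# the interference, which corresponds to roughly 20 dB.
toy_target = [5.0, -5.0, 5.0, -5.0]
toy_interference = [0.5, 0.5, -0.5, -0.5]
toy_sir = sir_db(toy_target, toy_interference)
```

A higher value means less residual interference in the extracted speech, which is the sense in which an SIR gain indicates improvement.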
|Item Type:|Conference or Workshop Item (Paper)|
|Divisions:|Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing|
|Date:|26 September 2010|
|Depositing User:|Symplectic Elements|
|Date Deposited:|09 Dec 2011 11:58|
|Last Modified:|23 Sep 2013 18:50|