Bimodal Coherence based Scale Ambiguity Cancellation for Target Speech Extraction and Enhancement
Liu, Q, Wang, W and Jackson, PJB (2010) Bimodal Coherence based Scale Ambiguity Cancellation for Target Speech Extraction and Enhancement. In: 11th Annual Conference of the International Speech Communication Association 2010, 2010-09-26 - 2010-09-30, Makuhari, Japan.
Abstract
We present a novel method for extracting target speech from auditory mixtures using bimodal coherence, which is statistically characterised by a Gaussian mixture model (GMM) in an offline training process, using robust features obtained from audio-visual speech. We then adjust the spectral components separated by independent component analysis (ICA) using the bimodal coherence in the time-frequency domain, to mitigate the scale ambiguities in different frequency bins. We tested our algorithm on the XM2VTS database, and the results show a performance improvement with the proposed algorithm in terms of signal-to-interference ratio (SIR) measurements.
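The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a diagonal-covariance GMM over a joint feature made of one log-magnitude audio value and one scalar visual value per frame, and it picks each frequency bin's gain from a small candidate grid by maximising the mean joint log-likelihood. The function names `gmm_log_likelihood` and `rescale_bins`, the feature choice, and the grid search are all hypothetical.

```python
import numpy as np

def gmm_log_likelihood(x, weights, means, variances):
    """Per-row log-density of x under a diagonal-covariance GMM.

    x: (n, d); weights: (k,); means, variances: (k, d).
    """
    x = np.atleast_2d(x)
    diff = x[:, None, :] - means[None, :, :]            # (n, k, d)
    log_comp = -0.5 * np.sum(
        diff**2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )                                                   # (n, k)
    log_comp = log_comp + np.log(weights)
    # Log-sum-exp over components for numerical stability.
    m = log_comp.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_comp - m).sum(axis=1, keepdims=True))).ravel()

def rescale_bins(spec_mag, visual, weights, means, variances, candidates):
    """Choose one gain per frequency bin (assumed grid search).

    spec_mag: (n_bins, n_frames) magnitudes of the separated source.
    visual:   (n_frames,) scalar visual feature per frame (assumption).
    Returns the candidate gain per bin that maximises the mean
    audio-visual log-likelihood under the trained GMM.
    """
    n_bins = spec_mag.shape[0]
    gains = np.empty(n_bins)
    for f in range(n_bins):
        scores = []
        for g in candidates:
            audio_feat = np.log(g * spec_mag[f] + 1e-12)
            joint = np.column_stack([audio_feat, visual])  # (n_frames, 2)
            ll = gmm_log_likelihood(joint, weights, means, variances)
            scores.append(ll.mean())
        gains[f] = candidates[int(np.argmax(scores))]
    return gains
```

In this sketch the GMM parameters would come from offline training on clean audio-visual data; multiplying each bin of the separated spectrum by its returned gain is the assumed correction step.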
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Divisions: | Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing |
| ID Code: | 7725 |
| Deposited By: | Symplectic Elements |
| Deposited On: | 09 Dec 2011 11:58 |
| Last Modified: | 08 Jun 2013 16:19 |