Integrating binaural cues and blind source separation method for separating reverberant speech mixtures
Alinaghi, A, Wang, W and Jackson, PJB (2011) Integrating binaural cues and blind source separation method for separating reverberant speech mixtures. IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp. 209-212.
AlinaghiWangJackson_ICASSP11.pdf - Accepted Version
Available under License : See the attached licence file.
This paper presents a new method for reverberant speech separation, based on the combination of binaural cues and blind source separation (BSS) for the automatic classification of the time-frequency (T-F) units of the speech mixture spectrogram. The main idea is to model the interaural phase difference, the interaural level difference and the frequency bin-wise mixing vectors with Gaussian mixture models for each source, then evaluate that model at each T-F point and assign the units with high probability to that source. The model parameters and the assigned regions are refined iteratively using the Expectation-Maximization (EM) algorithm. The proposed method also addresses the permutation problem of frequency-domain BSS by initializing the mixing vectors for each frequency channel. The EM algorithm starts with the binaural cues, and after a few iterations the estimated probabilistic mask is used to initialize and re-estimate the mixing vector model parameters. We performed experiments on speech mixtures and showed an average improvement of about 0.8 dB in signal-to-distortion ratio (SDR) over the binaural-only baseline.
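The core idea of the abstract, fitting one Gaussian per source to binaural cues and refining soft T-F assignments with EM, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `binaural_cues` and `em_soft_masks`, the diagonal-covariance model, and the quantile-based initialization are all assumptions made for illustration. The sketch treats the interaural phase difference as an ordinary real value (ignoring phase wrapping and frequency dependence) and omits the mixing-vector stage entirely.

```python
import numpy as np

def binaural_cues(left, right, eps=1e-12):
    """Return ILD (dB) and IPD (radians) from two complex STFT arrays of equal shape."""
    ild = 20.0 * np.log10((np.abs(left) + eps) / (np.abs(right) + eps))
    ipd = np.angle(left * np.conj(right))
    return ild, ipd

def em_soft_masks(features, n_sources=2, n_iter=20):
    """Fit a diagonal-covariance GMM with EM (illustrative assumption, not the paper's model).

    features: array of shape (N, D), one row of cues per T-F unit.
    Returns posteriors of shape (N, n_sources); each row sums to 1 and
    plays the role of a probabilistic mask value for that T-F unit.
    """
    n, _ = features.shape
    # Assumed initialization: pick points at evenly spaced quantiles of the first feature.
    order = np.argsort(features[:, 0])
    idx = order[((np.arange(n_sources) + 0.5) / n_sources * n).astype(int)]
    means = features[idx].copy()
    variances = np.tile(features.var(axis=0) + 1e-6, (n_sources, 1))
    weights = np.full(n_sources, 1.0 / n_sources)

    for _ in range(n_iter):
        # E-step: posterior probability of each source for each T-F unit.
        diff = features[:, None, :] - means[None, :, :]
        log_p = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=2)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        post = np.exp(log_p)
        post /= post.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights, means and variances from the soft assignments.
        nk = post.sum(axis=0) + 1e-12
        weights = nk / n
        means = (post.T @ features) / nk[:, None]
        diff = features[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", post, diff**2) / nk[:, None] + 1e-6

    return post
```

In use, one would stack the flattened ILD and IPD values into an `(N, 2)` feature array, call `em_soft_masks`, and reshape each posterior column back to the spectrogram shape to obtain one soft mask per source. A fuller model along the lines described in the abstract would also need a wrapped, frequency-dependent phase model and the bin-wise mixing vectors.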
Copyright 2011 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
|Divisions:||Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing|
|Depositing User:||Symplectic Elements|
|Date Deposited:||25 Nov 2011 15:06|
|Last Modified:||23 Sep 2013 18:50|