University of Surrey


Joint Mixing Vector and Binaural Model Based Stereo Source Separation

Alinaghi, A, Jackson, PJB, Liu, Q and Wang, W (2014) Joint Mixing Vector and Binaural Model Based Stereo Source Separation. IEEE Transactions on Audio, Speech, & Language Processing, 22 (9). pp. 1434-1448.

Text: AlinaghiJW_TALSP_2014.pdf (713kB) - Accepted version (post-print)
Restricted to Repository staff only
Available under licence: see the attached licence file.

Text (licence): SRI_deposit_agreement.pdf (33kB)
Restricted to Repository staff only
Available under licence: see the attached licence file.

Abstract

In this paper, the mixing vector (MV) in the statistical mixing model is compared to the binaural cues represented by interaural level and phase differences (ILD and IPD). It is shown that the MV distributions are quite distinct, whereas the binaural models overlap when the sources are close to each other. On the other hand, the binaural cues are more robust to high reverberation than the MV models. Exploiting this complementary behavior, we introduce a new robust algorithm for stereo speech separation which considers both additive and convolutive noise signals to model the MV and binaural cues in parallel and estimate probabilistic time-frequency masks. The contribution of each cue to the final decision is also adjusted by weighting the log-likelihoods of the cues empirically. Furthermore, the permutation problem of frequency-domain blind source separation (BSS) is addressed by initializing the MVs based on binaural cues. Experiments are performed systematically on determined and underdetermined speech mixtures in five rooms with various acoustic properties, including anechoic, highly reverberant, and spatially diffuse noise conditions. The results, in terms of signal-to-distortion ratio (SDR), confirm the benefits of integrating the MV and binaural cues, as compared with two state-of-the-art baseline algorithms which use only the MV or only the binaural cues.
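
As a rough illustration of two ideas mentioned in the abstract, the sketch below computes ILD and IPD from a pair of complex stereo spectrograms and turns weighted per-source log-likelihoods into probabilistic time-frequency masks. It is a simplified sketch and not the algorithm from the paper: the function names, the array layout (sources, frequencies, frames) and the default weights are assumptions made for the example, and the log-likelihoods themselves are taken as given rather than estimated from fitted MV and binaural models.

```python
import numpy as np


def interaural_cues(left, right, eps=1e-12):
    """Return ILD (dB) and IPD (radians) per time-frequency bin.

    `left` and `right` are complex STFT arrays of equal shape.
    `eps` is a small offset (an assumption of this sketch) that avoids
    division by zero in silent bins.
    """
    ratio = (left + eps) / (right + eps)
    ild = 20.0 * np.log10(np.abs(ratio))  # level difference in dB
    ipd = np.angle(ratio)                 # phase difference in (-pi, pi]
    return ild, ipd


def combine_masks(loglik_mv, loglik_bin, weight_mv=0.5, weight_bin=0.5):
    """Soft masks from weighted log-likelihoods of the two cue types.

    Both inputs have shape (n_sources, n_freq, n_frames). The weights are
    hypothetical placeholders; the abstract only states that the weighting
    is set empirically.
    """
    score = weight_mv * loglik_mv + weight_bin * loglik_bin
    # Subtract the per-bin maximum before exponentiating, for stability.
    score = score - score.max(axis=0, keepdims=True)
    posterior = np.exp(score)
    # Normalise across sources so the masks sum to one in every bin.
    return posterior / posterior.sum(axis=0, keepdims=True)
```

Each resulting mask could then be multiplied with the mixture spectrogram to obtain an estimate of the corresponding source; raising one weight relative to the other shifts the decision toward that cue.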

Item Type: Article
Authors:
Name            Email          ORCID
Alinaghi, A     UNSPECIFIED    UNSPECIFIED
Jackson, PJB    UNSPECIFIED    UNSPECIFIED
Liu, Q          UNSPECIFIED    UNSPECIFIED
Wang, W         UNSPECIFIED    UNSPECIFIED
Date: 1 September 2014
Identification Number: 10.1109/TASLP.2014.2320637
Depositing User: Symplectic Elements
Date Deposited: 28 Mar 2017 13:11
Last Modified: 31 Oct 2017 16:57
URI: http://epubs.surrey.ac.uk/id/eprint/806073


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800