University of Surrey


Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks

Zermini, Alfredo, Kong, Qiuqiang, Xu, Yong, Plumbley, Mark D. and Wang, Wenwu (2018) Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks. In: Latent Variable Analysis and Signal Separation. LVA/ICA 2018. Lecture Notes in Computer Science. Springer, pp. 361-371. ISBN 978-3-319-93763-2; Online ISBN 978-3-319-93764-9

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks.pdf - Accepted version Manuscript

Abstract

Given binaural features as input, such as interaural level difference (ILD) and interaural phase difference (IPD), Deep Neural Networks (DNNs) have recently been used to localize sound sources in a mixture of speech signals and/or noise, and to create time-frequency masks for estimating the sound sources in reverberant rooms. Here, we explore a more advanced system in which the feed-forward DNNs are replaced by Convolutional Neural Networks (CNNs). In addition, the frames adjacent to each time frame (occurring before and after it) are used to exploit contextual information, thus improving the localization and separation of each source. The quality of the separation results is evaluated in terms of Signal to Distortion Ratio (SDR).
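
The quantities named in the abstract can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names (binaural_cues, stack_context, simple_sdr), the context width, the edge padding at the signal boundaries, and the plain energy-ratio form of SDR are all assumptions made for illustration, and the paper may define its features and its SDR evaluation differently.

```python
import numpy as np


def binaural_cues(left_stft, right_stft, eps=1e-8):
    """Return (ILD in dB, IPD in radians) for each time-frequency bin.

    Both inputs are complex STFT arrays of identical shape.
    eps is an assumed small constant that avoids division by zero.
    """
    ild = 20.0 * np.log10((np.abs(left_stft) + eps) / (np.abs(right_stft) + eps))
    # Phase of L * conj(R) is the phase difference wrapped to (-pi, pi].
    ipd = np.angle(left_stft * np.conj(right_stft))
    return ild, ipd


def stack_context(features, context=2):
    """Append `context` frames before and after each frame.

    features: array of shape (frames, dims).
    Returns an array of shape (frames, (2 * context + 1) * dims).
    Boundary frames are padded by repeating the first/last frame
    (an assumption; other padding schemes are equally plausible).
    """
    n_frames = features.shape[0]
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    windows = [padded[k:k + n_frames] for k in range(2 * context + 1)]
    return np.concatenate(windows, axis=1)


def simple_sdr(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over error energy.

    This is a plain energy ratio, not a full evaluation toolkit metric.
    """
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / (np.sum(error ** 2) + eps))
```

With a context of 2, each input vector covers five consecutive frames, which is one way to give a network the temporal context the abstract describes; a CNN would more typically receive the same window as a 2-D (frames by features) patch rather than a flattened vector.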

Item Type: Book Section
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Name | Email | ORCID
Zermini, Alfredo | alfredo.zermini@surrey.ac.uk |
Kong, Qiuqiang | q.kong@surrey.ac.uk |
Xu, Yong | yong.xu@surrey.ac.uk |
Plumbley, Mark D. | m.plumbley@surrey.ac.uk |
Wang, Wenwu | W.Wang@surrey.ac.uk |
Editors :
Name | Email | ORCID
Deville, Y | |
Gannot, S | |
Mason, R | |
Plumbley, Mark | m.plumbley@surrey.ac.uk |
Ward, D | |
Date : 6 June 2018
Identification Number : 10.1007/978-3-319-93764-9_34
Copyright Disclaimer : © 2018 Springer.
Uncontrolled Keywords : Convolutional neural networks; Binaural cues; Reverberant rooms; Speech separation; Contextual information
Related URLs :
Depositing User : Clive Harris
Date Deposited : 17 May 2018 14:03
Last Modified : 06 Nov 2018 14:47
URI: http://epubs.surrey.ac.uk/id/eprint/846449
