University of Surrey


A Performance Evaluation of Several Deep Neural Networks for Reverberant Speech Separation

Liu, Qingju, Wang, Wenwu, Jackson, Philip and Safavi, Saeid (2018) A Performance Evaluation of Several Deep Neural Networks for Reverberant Speech Separation. In: 52nd Asilomar Conference on Signals, Systems and Computers, 28-31 October 2018, Pacific Grove, CA.

Asilomar2018.pdf - Accepted version Manuscript


In this paper, we compare different deep neural networks (DNNs) in extracting speech signals from competing speakers in room environments, including the conventional fully connected multilayer perceptron (MLP) network, convolutional neural network (CNN), recurrent neural network (RNN), and the recently proposed capsule network (CapsNet). Each DNN takes as input both spectral features and converted spatial features that are robust to position mismatch, and outputs the separation mask for target source estimation. In addition, a psychoacoustically motivated objective function is integrated in each DNN, which exploits the perceptual importance of each time-frequency (TF) unit in the training process. Objective evaluations are performed on the separated sounds using the converged models, in terms of perceptual evaluation of speech quality (PESQ), signal-to-distortion ratio (SDR) and short-time objective intelligibility (STOI). Overall, all the implemented DNNs greatly improve the quality and speech intelligibility of the embedded target source compared to the original recordings. In particular, the bidirectional RNN, applied either along the temporal direction or along the frequency bins, outperforms the other DNN structures with consistent improvement.
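As a rough illustration of the mask-based separation and perceptually weighted training described in the abstract, the sketch below applies a TF mask to a mixture spectrogram and computes a weighted mean squared error between a predicted and a reference mask. The abstract does not give the form of the objective function or the mask, so the ratio mask, the per-unit `weights` array, and the function names `apply_mask` and `weighted_mask_loss` are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np


def apply_mask(mixture_stft, mask):
    """Estimate the target by scaling each TF unit of the mixture by the mask."""
    return np.asarray(mask) * np.asarray(mixture_stft)


def weighted_mask_loss(pred_mask, ref_mask, weights):
    """Weighted mean squared error over TF units.

    `weights` is a hypothetical non-negative per-unit importance map;
    the weighting used in the paper is not specified in the abstract.
    """
    weights = np.asarray(weights, dtype=float)
    sq_err = (np.asarray(pred_mask) - np.asarray(ref_mask)) ** 2
    return float(np.sum(weights * sq_err) / np.sum(weights))


# Toy complex spectrograms (frequency bins x frames) standing in for real STFTs.
rng = np.random.default_rng(0)
shape = (4, 3)
target = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
interferer = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mixture = target + interferer

# A ratio mask, one common choice of training target (assumed here).
ref_mask = np.abs(target) ** 2 / (np.abs(target) ** 2 + np.abs(interferer) ** 2)
estimate = apply_mask(mixture, ref_mask)

# With uniform weights the loss reduces to the ordinary mean squared error.
uniform = np.ones(shape)
loss_perfect = weighted_mask_loss(ref_mask, ref_mask, uniform)

# Putting all the weight on one TF unit makes only that unit's error count.
focused = np.zeros(shape)
focused[0, 0] = 1.0
loss_focused = weighted_mask_loss(np.zeros(shape), ref_mask, focused)
```

In a training loop, a function like `weighted_mask_loss` would be evaluated on the network's output mask, so that errors in TF units judged perceptually important contribute more to the gradient than errors elsewhere.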

Item Type: Conference or Workshop Item (Conference Paper)
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors : Liu, Qingju; Wang, Wenwu; Jackson, Philip; Safavi, Saeid
Date : 2018
Funders : EPSRC
DOI : 10.1109/ACSSC.2018.8645219
Copyright Disclaimer : © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Additional Information : Electronic ISBN: 978-1-5386-9218-9; CD-ROM ISBN: 978-1-5386-9216-5; USB ISBN: 978-1-5386-9217-2; Print on Demand (PoD) ISBN: 978-1-5386-9219-6
Depositing User : Melanie Hughes
Date Deposited : 25 Sep 2018 13:37
Last Modified : 08 May 2019 11:28
