University of Surrey


Two-Stage Monaural Source Separation in Reverberant Room Environments using Deep Neural Networks

Sun, Yang, Wang, Wenwu, Chambers, Jonathon and Naqvi, Syed Mohsen (2018) Two-Stage Monaural Source Separation in Reverberant Room Environments using Deep Neural Networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27 (1). pp. 125-139.

Two-Stage Monaural Source Separation in Reverberant Room Environments using Deep Neural Networks.pdf - Accepted version Manuscript


Abstract

Deep neural networks (DNNs) have been used for dereverberation and separation in the monaural source separation problem. However, the performance of current state-of-the-art methods is limited, particularly when applied in highly reverberant room environments. In this paper, we propose a two-stage approach with two DNN-based methods to address this problem. In the first stage, the dereverberation of the speech mixture is achieved with the proposed dereverberation mask (DM). In the second stage, the dereverberated speech mixture is separated with the ideal ratio mask (IRM). To realize this two-stage approach, in the first DNN-based method, the DM is integrated with the IRM to generate an enhanced time-frequency (T-F) mask, namely the ideal enhanced mask (IEM), as the training target for a single DNN. In the second DNN-based method, the DM and the IRM are predicted with two individual DNNs. The IEEE and the TIMIT corpora with real room impulse responses (RIRs) and noise from the NOISEX dataset are used to generate speech mixtures for evaluations. The proposed methods outperform the state of the art, particularly in highly reverberant room environments.
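
The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the mask formulas, so the sketch assumes the DM is the ratio of the anechoic mixture magnitude to the reverberant mixture magnitude, that source magnitudes add in each T-F bin, and that the IEM is the element-wise product of the DM and the IRM. It also uses oracle masks computed from known signals where the paper would use DNN predictions. All function names are hypothetical.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero in silent T-F bins


def ideal_ratio_mask(target_mag, interferer_mag):
    """IRM: the target's share of the magnitude in each T-F bin."""
    return target_mag / (target_mag + interferer_mag + EPS)


def dereverberation_mask(clean_mix_mag, reverb_mix_mag):
    """Assumed DM: anechoic mixture magnitude over reverberant mixture magnitude."""
    return clean_mix_mag / (reverb_mix_mag + EPS)


def enhanced_mask(dm, irm):
    """Assumed IEM: element-wise product of the DM and the IRM."""
    return dm * irm


def two_stage_separate(reverb_mix_mag, dm, irm):
    """Stage 1 removes reverberation from the mixture; stage 2 extracts the target."""
    dereverberated = dm * reverb_mix_mag
    return irm * dereverberated


# Toy magnitude spectrograms (frequency bins x frames), purely synthetic.
rng = np.random.default_rng(0)
target = rng.uniform(0.1, 1.0, size=(4, 5))
interferer = rng.uniform(0.1, 1.0, size=(4, 5))
reverb_tail = rng.uniform(0.1, 1.0, size=(4, 5))

clean_mix = target + interferer        # assumption: magnitudes add
reverb_mix = clean_mix + reverb_tail   # assumption: reverberation adds energy

dm = dereverberation_mask(clean_mix, reverb_mix)
irm = ideal_ratio_mask(target, interferer)

# Applying two masks in sequence, or one combined mask, gives the same estimate.
estimate_two_masks = two_stage_separate(reverb_mix, dm, irm)
estimate_one_mask = enhanced_mask(dm, irm) * reverb_mix
```

Under these assumptions both estimates recover the target magnitude up to the small EPS term, which mirrors the abstract's two variants: one DNN trained on a combined mask, or two DNNs that each predict one mask.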

Item Type: Article
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Sun, Yang
Wang, Wenwu (W.Wang@surrey.ac.uk)
Chambers, Jonathon
Naqvi, Syed Mohsen
Date : 17 October 2018
DOI : 10.1109/TASLP.2018.2874708
Copyright Disclaimer : © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Uncontrolled Keywords : Deep neural networks; Monaural source separation; Dereverberation mask; Highly reverberant room environments
Depositing User : Clive Harris
Date Deposited : 10 Oct 2018 08:39
Last Modified : 11 Dec 2018 17:09
URI: http://epubs.surrey.ac.uk/id/eprint/849629

