University of Surrey

Monaural Source Separation in Complex Domain with Long Short-Term Memory Neural Network

Sun, Yang, Xian, Yang, Wang, Wenwu and Naqvi, Syed Mohsen (2019) Monaural Source Separation in Complex Domain with Long Short-Term Memory Neural Network. IEEE Journal of Selected Topics in Signal Processing.

Monaural Source Separation in Complex Domain.pdf - Accepted version Manuscript

Abstract

In recent research, deep neural networks (DNNs) have been used to solve the monaural source separation problem. According to their training objectives, DNN-based monaural speech separation methods fall into three categories: masking, mapping, and signal approximation (SA) based techniques. However, the performance of the traditional methods is not robust to variations in real-world environments. Moreover, vanilla DNN-based methods cannot fully exploit temporal information. Therefore, in this paper, a long short-term memory (LSTM) neural network is applied to exploit long-term speech contexts. We then propose complex signal approximation (cSA), which operates in the complex domain and uses the phase information of the desired speech signal to improve separation performance. The IEEE and TIMIT corpora are used to generate mixtures with noise and speech interference to evaluate the efficacy of the proposed method. The experimental results demonstrate the advantages of the proposed cSA-based LSTM RNN method in terms of several objective performance measures.
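To illustrate why working in the complex domain can help, the following is a minimal NumPy sketch contrasting a magnitude-domain SA objective with a complex-domain one. It is not taken from the paper: the function names (`sa_loss`, `csa_loss`, `ideal_complex_mask`), the mean-squared-error form of both objectives, and the random stand-in spectrograms are all assumptions made for illustration, and the paper's exact cSA formulation may differ.

```python
import numpy as np


def sa_loss(mask, mix_stft, clean_stft):
    """Magnitude-domain SA (assumed form): the real-valued mask scales the
    mixture magnitude, and the result is compared with the clean magnitude."""
    est_mag = mask * np.abs(mix_stft)
    return float(np.mean((est_mag - np.abs(clean_stft)) ** 2))


def csa_loss(mask_real, mask_imag, mix_stft, clean_stft):
    """Complex-domain SA (assumed form): a complex mask multiplies the complex
    mixture, so both magnitude and phase errors are penalised."""
    est = (mask_real + 1j * mask_imag) * mix_stft
    return float(np.mean(np.abs(est - clean_stft) ** 2))


def ideal_complex_mask(mix_stft, clean_stft, eps=1e-8):
    """Complex mask that maps the mixture onto the clean signal (S / Y),
    written with a small eps to avoid division by zero."""
    return clean_stft * np.conj(mix_stft) / (np.abs(mix_stft) ** 2 + eps)


def demo(seed=0, shape=(4, 5)):
    """Compare the two objectives on random stand-in spectrograms."""
    rng = np.random.default_rng(seed)
    clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    mix = clean + noise

    # An ideal magnitude mask drives the magnitude-domain loss to zero ...
    mag_mask = np.abs(clean) / np.abs(mix)
    loss_sa = sa_loss(mag_mask, mix, clean)
    # ... yet the reconstruction still carries the mixture phase, so an
    # error remains when it is measured in the complex domain.
    residual = float(np.mean(np.abs(mag_mask * mix - clean) ** 2))

    # An ideal complex mask corrects the phase as well.
    cmask = ideal_complex_mask(mix, clean)
    loss_csa = csa_loss(cmask.real, cmask.imag, mix, clean)
    return loss_sa, residual, loss_csa


if __name__ == "__main__":
    loss_sa, residual, loss_csa = demo()
    print("magnitude SA loss with ideal magnitude mask:", loss_sa)
    print("complex error left by the mixture phase:   ", residual)
    print("complex SA loss with ideal complex mask:    ", loss_csa)
```

In this sketch a perfect magnitude mask still leaves a complex-domain error because the estimate reuses the mixture phase, whereas a complex mask can remove it. In a trained system the mask components would be network outputs (for example from an LSTM) rather than the ideal masks computed here.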

Item Type: Article
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Sun, Yang
Xian, Yang
Wang, Wenwu (W.Wang@surrey.ac.uk)
Naqvi, Syed Mohsen
Date : 2019
Copyright Disclaimer : © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Uncontrolled Keywords : Deep neural networks; Monaural speech separation; Long short-term memory; Complex signal approximation
Depositing User : Clive Harris
Date Deposited : 09 May 2019 13:42
Last Modified : 09 May 2019 13:42
URI: http://epubs.surrey.ac.uk/id/eprint/851777


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800