University of Surrey


Two Stage Single Channel Audio Source Separation using Deep Neural Networks

Grais, Emad M., Roma, Gerard, Simpson, Andrew J. R. and Plumbley, Mark (2017) Two Stage Single Channel Audio Source Separation using Deep Neural Networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25 (9). pp. 1469-1479.

Two Stage Single Channel Audio Source Separation using Deep Neural Networks.pdf - Accepted version Manuscript


Abstract

Most single channel audio source separation (SCASS) approaches produce separated sources accompanied by interference from other sources and other distortions. To tackle this problem, we propose to separate the sources in two stages. In the first stage, the sources are separated from the mixed signal. In the second stage, the interference between the separated sources and the distortions are reduced using deep neural networks (DNNs). We propose two methods that use DNNs to improve the quality of the separated sources in the second stage. In the first method, each separated source is improved individually using its own trained DNN, while in the second method all the separated sources are improved together using a single DNN. To further improve the quality of the separated sources, the DNNs in the second stage are trained discriminatively to further decrease the interference and the distortions of the separated sources. Our experimental results show that using two stages of separation improves the quality of the separated signals by decreasing the interference between the separated sources and distortions compared to separating the sources using a single stage of separation.
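The two-stage structure described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: `first_stage` stands in for whatever initial separator is used, `per_source_enhancers` and `joint_enhancer` stand in for the trained second-stage DNNs (one network per source, or a single network for all sources), and spectra are toy lists of magnitudes. The final soft-masking step is an assumption of this sketch, not something the abstract states; it is included only so that the refined estimates still sum to the mixture.

```python
from typing import Callable, List, Optional, Sequence

Spectrum = List[float]    # toy magnitude spectrum for a single frame
Sources = List[Spectrum]  # one spectrum per separated source


def soft_mask(mixture: Spectrum, estimates: Sources, eps: float = 1e-12) -> Sources:
    """Split each mixture bin in proportion to the per-source estimates."""
    separated: Sources = []
    for est in estimates:
        separated.append([
            mix_bin * est[k] / (sum(e[k] for e in estimates) + eps)
            for k, mix_bin in enumerate(mixture)
        ])
    return separated


def two_stage_separate(
    mixture: Spectrum,
    first_stage: Callable[[Spectrum], Sources],
    per_source_enhancers: Optional[Sequence[Callable[[Spectrum], Spectrum]]] = None,
    joint_enhancer: Optional[Callable[[Sources], Sources]] = None,
) -> Sources:
    """Stage 1: rough separation. Stage 2: refine the rough estimates."""
    # Stage 1: any separator that returns one rough estimate per source.
    rough = first_stage(mixture)

    # Stage 2, second variant: one model refines all sources together.
    if joint_enhancer is not None:
        refined = joint_enhancer(rough)
    # Stage 2, first variant: each source has its own model.
    elif per_source_enhancers is not None:
        refined = [enhance(src) for enhance, src in zip(per_source_enhancers, rough)]
    # No second stage: fall back to the stage-1 estimates.
    else:
        refined = rough

    # Assumed final step: re-mask the mixture with the refined estimates.
    return soft_mask(mixture, refined)


if __name__ == "__main__":
    # Stand-ins for trained models: a fixed rough split and simple scalings.
    rough_split = lambda mix: [[3.0, 1.0], [1.0, 1.0]]
    enhancers = [lambda s: [2.0 * x for x in s], lambda s: list(s)]
    print(two_stage_separate([4.0, 2.0], rough_split, per_source_enhancers=enhancers))
```

In a real system the callables would be trained networks operating on spectrogram frames; the sketch only shows how the two second-stage variants slot into the same pipeline.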

Item Type: Article
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Name | Email | ORCID
Grais, Emad M. | UNSPECIFIED | UNSPECIFIED
Roma, Gerard | UNSPECIFIED | UNSPECIFIED
Simpson, Andrew J. R. | UNSPECIFIED | UNSPECIFIED
Plumbley, Mark | m.plumbley@surrey.ac.uk | UNSPECIFIED
Date : September 2017
Funders : Engineering and Physical Sciences Research Council (EPSRC)
Identification Number : 10.1109/TASLP.2017.2716443
Copyright Disclaimer : © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. The final, published version of this paper is available at: E. M. Grais; G. Roma; A. J. R. Simpson; M. D. Plumbley, "Two Stage Single Channel Audio Source Separation using Deep Neural Networks," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.PP, no.99, pp.1-1 doi: 10.1109/TASLP.2017.2716443
Uncontrolled Keywords : Matrix decomposition; Speech; Source separation; Interference; Distortion; Training; Training data; Single channel audio source separation; Deep neural networks; Audio enhancement; Discriminative learning
Depositing User : Clive Harris
Date Deposited : 19 Jun 2017 14:59
Last Modified : 17 Jul 2017 12:20
URI: http://epubs.surrey.ac.uk/id/eprint/841432


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800