University of Surrey


Source separation with weakly labelled data: An approach to computational auditory scene analysis

Kong, Qiuqiang, Wang, Yuxuan, Song, Xuchen, Cao, Yin, Wang, Wenwu and Plumbley, Mark D. (2020) Source separation with weakly labelled data: An approach to computational auditory scene analysis. In: 45th International Conference on Acoustics, Speech, and Signal Processing, May 4 to 8, 2020, Barcelona, Spain.

Source separation with weakly labelled data An approach to computational auditory scene analysis_v1.1.pdf - Accepted version Manuscript



Source separation is the task of separating an audio recording into individual sound sources, and it is fundamental to computational auditory scene analysis. Previous work on source separation has focused on separating particular sound classes such as speech and music, and much of it requires pairs of mixtures and clean sources for training. In this work, we propose a source separation framework trained with weakly labelled data. Weakly labelled data contains only the tags of an audio clip, without the occurrence times of the sound events. We first train a sound event detection system with AudioSet. The trained sound event detection system is used to detect the segments that are most likely to contain a target sound event. A regression is then learnt from a mixture of two randomly selected segments to a target segment, conditioned on the audio tagging prediction of the target segment. Our proposed system can separate 527 sound classes from AudioSet within a single system. A U-Net is adopted for the separation system and achieves an average SDR of 5.67 dB over the 527 sound classes in AudioSet.
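The training recipe described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the sliding-window rule for picking a segment, and the plain energy-ratio form of SDR are all assumptions made for illustration, and the paper may define each of these differently. The sound event detection and audio tagging outputs are taken as given inputs here rather than computed.

```python
import numpy as np


def select_anchor_segment(framewise_prob, segment_frames):
    """Pick the start frame of the window most likely to contain the target event.

    framewise_prob: 1-D array of per-frame probabilities for one sound class,
    assumed to come from a trained sound event detection system.
    The sliding-window mean used here is an assumed selection rule.
    """
    window = np.ones(segment_frames) / segment_frames
    scores = np.convolve(framewise_prob, window, mode="valid")
    return int(np.argmax(scores))


def make_training_example(segment_a, segment_b, tag_prob_a):
    """Build one (mixture, condition, target) triple from two segments.

    The mixture is the sum of both segments, the target is segment_a, and the
    condition is the audio tagging prediction of segment_a (hypothetical input).
    """
    mixture = segment_a + segment_b
    return mixture, tag_prob_a, segment_a


def sdr_db(reference, estimate, eps=1e-10):
    """Signal-to-distortion ratio in dB, using a simple energy-ratio definition."""
    distortion = reference - estimate
    return 10.0 * np.log10(
        (np.sum(reference ** 2) + eps) / (np.sum(distortion ** 2) + eps)
    )


# Toy usage with synthetic data (no real audio or model involved).
prob = np.array([0.1, 0.1, 0.9, 0.8, 0.1])
start = select_anchor_segment(prob, segment_frames=2)

seg_a = np.array([1.0, 0.0, -1.0, 0.0])
seg_b = np.array([0.5, 0.5, 0.5, 0.5])
cond = np.array([0.9, 0.05, 0.05])  # made-up tagging probabilities
mixture, condition, target = make_training_example(seg_a, seg_b, cond)

score = sdr_db(target, 0.9 * target)
```

In a full system, a separation network such as the U-Net mentioned in the abstract would take `mixture` and `condition` as input and be trained to output `target`; the sketch stops at data preparation and evaluation.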

Item Type: Conference or Workshop Item (Conference Paper)
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Kong, Qiuqiang
Wang, Yuxuan
Song, Xuchen
Plumbley, Mark
Date : 24 January 2020
Funders : EPSRC
Grant Title : "Making Sense of Sound"
Uncontrolled Keywords : Source separation, weakly labelled data, computational auditory scene analysis, AudioSet.
Depositing User : James Marshall
Date Deposited : 19 Feb 2020 09:58
Last Modified : 19 Feb 2020 16:04


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800