University of Surrey


General-purpose audio tagging from noisy labels using convolutional neural networks

Iqbal, Turab, Kong, Qiuqiang, Plumbley, Mark D and Wang, Wenwu (2018) General-purpose audio tagging from noisy labels using convolutional neural networks. In: DCASE2018 Workshop on Detection and Classification of Acoustic Scenes and Events, 19-20 November 2018, Surrey, UK.

DCASE2018Workshop_Iqbal_151.pdf - Accepted version Manuscript


Abstract

General-purpose audio tagging refers to classifying sounds that are of a diverse nature, and is relevant in many applications where domain-specific information cannot be exploited. The DCASE 2018 challenge introduces Task 2 for this very problem. In this task, there are a large number of classes and the audio clips vary in duration. Moreover, a subset of the labels is noisy. In this paper, we propose a system to address these challenges. The basis of our system is an ensemble of convolutional neural networks trained on log-scaled mel spectrograms. We use preprocessing and data augmentation methods to improve the performance further. To reduce the effects of label noise, two techniques are proposed: loss function weighting and pseudo-labeling. Experiments on the private test set of this task show that our system achieves state-of-the-art performance with a mean average precision score of 0.951.
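
As a rough illustration of the loss function weighting idea mentioned in the abstract, the sketch below down-weights the cross-entropy contribution of clips whose labels have not been verified. It is not taken from the paper: the function name `weighted_cross_entropy`, the `verified` flags, the `noisy_weight` value of 0.5, and the normalization by the sum of weights are all assumptions made for this example.

```python
import math


def weighted_cross_entropy(probs, targets, verified, noisy_weight=0.5):
    """Mean cross-entropy in which unverified labels count less.

    probs:        per-clip lists of predicted class probabilities
    targets:      integer class index for each clip
    verified:     True where the label is trusted, False where it may be noisy
    noisy_weight: hypothetical weight in [0, 1] applied to unverified labels
    """
    total = 0.0
    weight_sum = 0.0
    for p, t, v in zip(probs, targets, verified):
        w = 1.0 if v else noisy_weight
        # Clamp the probability so that log(0) cannot occur.
        total += -w * math.log(max(p[t], 1e-12))
        weight_sum += w
    # Normalize by the total weight (an assumption of this sketch).
    return total / weight_sum if weight_sum else 0.0


# Toy data: three clips, two classes, the last label is unverified.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
targets = [0, 1, 1]
verified = [True, True, False]
loss = weighted_cross_entropy(probs, targets, verified)
```

With `noisy_weight=1.0` this reduces to the ordinary mean cross-entropy, and with `noisy_weight=0.0` unverified clips are ignored entirely, so the parameter interpolates between trusting and discarding the noisy labels.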

Item Type: Conference or Workshop Item (Conference Paper)
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Name                Email                      ORCID
Iqbal, Turab        t.iqbal@surrey.ac.uk
Kong, Qiuqiang      q.kong@surrey.ac.uk
Plumbley, Mark D    m.plumbley@surrey.ac.uk
Wang, Wenwu         W.Wang@surrey.ac.uk
Editors :
Name                Email                      ORCID
Plumbley, Mark D    m.plumbley@surrey.ac.uk
Kroos, Christian    c.kroos@surrey.ac.uk
Bello, JP
Richard, G
Ellis, DPW
Mesaros, A
Date : November 2018
Funders : EPSRC
Copyright Disclaimer : Copyright 2018 The Author(s). This is an Open Access publication
Uncontrolled Keywords : Audio classification, convolutional network, recurrent network, deep learning, data augmentation, label noise
Depositing User : Melanie Hughes
Date Deposited : 23 Nov 2018 09:34
Last Modified : 11 Dec 2018 11:24
URI: http://epubs.surrey.ac.uk/id/eprint/849923

© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800