University of Surrey


Surrey-CVSSP system for DCASE2017 challenge task 4

Xu, Yong, Kong, Qiuqiang, Wang, Wenwu and Plumbley, Mark (2017) Surrey-CVSSP system for DCASE2017 challenge task 4. In: Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), 16-17 Nov 2017, Munich, Germany.

Surrey-CVSSP system for DCASE2017 challenge task4.pdf - Accepted version Manuscript


Abstract

In this technical report, we present a set of methods for task 4 of the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge. This task evaluates systems for the large-scale detection of sound events using weakly labeled training data. The data are YouTube video excerpts focusing on transportation and warnings due to their industry applications. There are two subtasks: audio tagging and sound event detection from weakly labeled data. A convolutional neural network (CNN) and a gated recurrent unit (GRU) based recurrent neural network (RNN) are adopted as our basic framework. We propose a learnable gating activation function for selecting informative local features. An attention-based scheme is used for localizing the specific events in a weakly supervised mode. A new batch-level balancing strategy is also proposed to tackle the data imbalance problem. Fusion of posteriors from different systems is found to be effective in improving performance. In summary, we obtain a 61% F-value for the audio tagging subtask and an error rate (ER) of 0.73 for the sound event detection subtask on the development set, whereas the official multilayer perceptron (MLP) based baseline obtained only a 13.1% F-value for audio tagging and an ER of 1.02 for sound event detection.
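The abstract names three mechanisms without giving formulas: a gating activation, attention-based localization from weak labels, and fusion of posteriors. The sketch below illustrates one common reading of each in plain Python. It is not the authors' implementation. The sigmoid-gated product, the softmax attention pooling over frames, the simple averaging of posteriors, and all function names are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_activation(linear_out, gate_out):
    """Assumed gating form: scale each linear output by a sigmoid gate.

    Gates near 0 suppress a feature, gates near 1 pass it through.
    """
    return [a * sigmoid(g) for a, g in zip(linear_out, gate_out)]


def attention_pool(frame_probs, attention_scores):
    """Assumed attention pooling for one event class.

    Softmax over time turns scores into weights that sum to 1; the
    clip-level probability is the weighted average of frame probabilities.
    The weights indicate which frames the clip-level decision relies on.
    """
    peak = max(attention_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in attention_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    clip_prob = sum(w * p for w, p in zip(weights, frame_probs))
    return clip_prob, weights


def fuse_posteriors(system_posteriors):
    """Assumed fusion rule: average per-class posteriors across systems."""
    n_systems = len(system_posteriors)
    return [sum(per_class) / n_systems for per_class in zip(*system_posteriors)]


if __name__ == "__main__":
    # Hypothetical values for four frames of one clip and one event class.
    clip_prob, weights = attention_pool([0.1, 0.9, 0.8, 0.2], [0.0, 2.0, 2.0, 0.0])
    print("clip-level probability:", round(clip_prob, 3))
    print("attention weights:", [round(w, 3) for w in weights])
    print("fused posteriors:", fuse_posteriors([[0.2, 0.8], [0.4, 0.6]]))
```

In this reading, only clip-level labels are needed for training, because the loss is applied to the pooled clip probability while the per-frame probabilities and attention weights provide the temporal localization.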

Item Type: Conference or Workshop Item (Conference Paper)
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Name | Email | ORCID
Xu, Yong | yong.xu@surrey.ac.uk |
Kong, Qiuqiang | q.kong@surrey.ac.uk |
Wang, Wenwu | W.Wang@surrey.ac.uk |
Plumbley, Mark | m.plumbley@surrey.ac.uk |
Date : 6 November 2017
Funders : Engineering and Physical Sciences Research Council (EPSRC)
Copyright Disclaimer : This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
Uncontrolled Keywords : DCASE2017; Convolutional neural network; Attention; Audio tagging; Sound event detection; Weakly labelled data
Related URLs :
Depositing User : Clive Harris
Date Deposited : 30 Nov 2017 15:12
Last Modified : 19 Feb 2018 10:35
URI: http://epubs.surrey.ac.uk/id/eprint/845082

