University of Surrey

Attention-based Convolutional Neural Networks for Acoustic Scene Classification

Ren, Zhao, Kong, Qiuqiang, Qian, Kun, Plumbley, Mark D and Schuller, Björn W (2018) Attention-based Convolutional Neural Networks for Acoustic Scene Classification. In: 3rd Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2018 Workshop), 19 - 20 November 2018, Surrey, UK.

__homes.surrey.ac.uk_home_.System_Desktop_ATTENTION-BASED CONVOLUTIONAL NEURAL NETWORKS FOR ACOUSTIC SCENE CLASSIFICATION (1).pdf - Accepted version Manuscript

Abstract

We propose a convolutional neural network (CNN) model based on an attention pooling method to classify ten different acoustic scenes, participating in the acoustic scene classification task of the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE 2018), which includes data from one device (subtask A) and data from three different devices (subtask B). The log mel spectrogram images of the audio waves are first forwarded to convolutional layers, and then fed into an attention pooling layer to reduce the feature dimension and achieve classification. From an attention perspective, we build a weighted evaluation of the features, instead of simple max pooling or average pooling. On the official development set of the challenge, the best accuracy of subtask A is 72.6%, which is an improvement of 12.9% when compared with the official baseline (p < .001 in a one-tailed z-test). For subtask B, the best result of our attention-based CNN is a significant improvement over the baseline as well, in which the accuracies are 71.8%, 58.3%, and 58.3% for the three devices A to C (p < .001 for device A, p < .01 for device B, and p < .05 for device C).
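
The contrast the abstract draws between attention pooling and plain average pooling can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the weight matrices `w_cla` and `w_att`, the softmax normalisations, and the feature shape (20 time-frequency positions, 16 channels) are all assumptions chosen for illustration, and random numbers stand in for the output of the convolutional layers.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(features, w_cla, w_att):
    """Pool (positions, channels) features into one score per class.

    Hypothetical formulation: each position yields class probabilities,
    and a learned attention weight decides how much each position counts.
    """
    cla = softmax(features @ w_cla, axis=1)  # class probabilities per position
    att = softmax(features @ w_att, axis=0)  # weights over positions, sum to 1 per class
    return (att * cla).sum(axis=0)           # weighted evaluation, shape (classes,)

def average_pool(features, w_cla):
    # Baseline: every position contributes equally.
    return softmax(features @ w_cla, axis=1).mean(axis=0)

rng = np.random.default_rng(0)
features = rng.normal(size=(20, 16))  # stand-in for CNN output (assumed shape)
w_cla = rng.normal(size=(16, 10))     # ten acoustic scene classes
w_att = rng.normal(size=(16, 10))

scores = attention_pool(features, w_cla, w_att)
predicted_scene = int(scores.argmax())
```

With all attention weights equal (for example `w_att` set to zeros), this sketch reduces to average pooling, which is one way to see attention pooling as a generalisation of it.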

Item Type: Conference or Workshop Item (Conference Paper)
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Name                 Email                     ORCID
Ren, Zhao
Kong, Qiuqiang       q.kong@surrey.ac.uk
Qian, Kun
Plumbley, Mark D     m.plumbley@surrey.ac.uk
Schuller, Björn W
Date : 2018
Funders : H2020; SF7; EPSRC
Uncontrolled Keywords : Acoustic Scene Classification, Convolutional Neural Network, Attention Pooling, Log Mel Spectrogram
Related URLs :
Depositing User : Melanie Hughes
Date Deposited : 19 Sep 2018 14:49
Last Modified : 11 Dec 2018 11:24
URI: http://epubs.surrey.ac.uk/id/eprint/849365

© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800