University of Surrey

Modeling Label Dependencies for Audio Tagging with Graph Convolutional Network

Wang, Helin, Zou, Yuexian, Chong, Dading and Wang, Wenwu (2020) Modeling Label Dependencies for Audio Tagging with Graph Convolutional Network. IEEE Signal Processing Letters.

WangZCW_SPL_2020.pdf - Accepted version Manuscript


As a multi-label classification task, audio tagging aims to predict the presence or absence of certain sound events in an audio recording. Existing works in audio tagging do not explicitly consider the co-occurrence probabilities between sound events, which are termed label dependencies in this study. To address this issue, we propose to model the label dependencies with a graph-based method, where each node of the graph represents a label. An adjacency matrix is constructed by mining the statistical relations between labels to represent the graph structure, and a graph convolutional network (GCN) is employed to learn node representations by propagating information between neighboring nodes according to the adjacency matrix, which implicitly models the label dependencies. The generated node representations are then applied to the acoustic representations for classification. Experiments on AudioSet show that our method achieves a state-of-the-art mean average precision (mAP) of 0.434.
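
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the conditional-probability estimate with a threshold, the row normalization, the single ReLU layer and the dot-product scoring are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import numpy as np


def cooccurrence_adjacency(labels, threshold=0.1):
    """Build an adjacency matrix from a binary (num_clips, num_labels) matrix.

    adj[i, j] estimates P(label j present | label i present).
    Entries below `threshold` are set to zero (an assumed pruning step).
    """
    labels = labels.astype(np.float64)
    counts = labels.T @ labels              # counts[i, j]: clips with both i and j
    occurrences = np.diag(counts).copy()    # clips containing label i
    cond = counts / np.maximum(occurrences[:, None], 1.0)
    adj = np.where(cond >= threshold, cond, 0.0)
    np.fill_diagonal(adj, 1.0)              # self-loops keep each node's own features
    return adj


def row_normalize(adj):
    """Scale each row to sum to one (assumed normalization)."""
    return adj / adj.sum(axis=1, keepdims=True)


def gcn_layer(a_hat, node_features, weight):
    """One graph convolution: mix neighbor features, project, apply ReLU."""
    return np.maximum(a_hat @ node_features @ weight, 0.0)


def tag_scores(acoustic, node_repr):
    """Apply node representations to acoustic embeddings.

    acoustic:  (batch, dim) embeddings from any audio encoder (hypothetical).
    node_repr: (num_labels, dim) output of the GCN.
    Returns per-label probabilities of shape (batch, num_labels).
    """
    logits = acoustic @ node_repr.T
    return 1.0 / (1.0 + np.exp(-logits))
```

As a usage example, with a training label matrix `labels`, one could compute `a_hat = row_normalize(cooccurrence_adjacency(labels))`, pass one-hot node features `np.eye(num_labels)` and a weight matrix through `gcn_layer`, and feed the result to `tag_scores` together with a batch of acoustic embeddings. In a trained system the weight matrix and the acoustic encoder would be learned jointly; here they are placeholders.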

Item Type: Article
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Wang, Helin
Zou, Yuexian
Chong, Dading
Wang, Wenwu
Date : 16 August 2020
Additional Information : Embargo OK Metadata Pending
Depositing User : James Marshall
Date Deposited : 20 Aug 2020 09:56
Last Modified : 20 Aug 2020 09:56


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800