Cao, Yin, Kong, Qiuqiang, Iqbal, Turab, An, Fengyan, Wang, Wenwu and Plumbley, Mark D.
(2019)
Polyphonic sound event detection and localization using a two-stage strategy
In: Detection and Classification of Acoustic Scenes and Events (DCASE 2019) Workshop, 25-26 Oct 2019, New York University's Tandon School of Engineering, Brooklyn, NY, USA.
Text: Polyphonic sound event detection and localization using a two-stage strategy.pdf (Accepted version Manuscript, 535kB)
Abstract
Sound event detection (SED) and localization refer to recognizing
sound events and estimating their spatial and temporal locations.
Using neural networks has become the prevailing method for SED.
In the area of sound localization, which is usually performed by estimating
the direction of arrival (DOA), learning-based methods have
recently been developed. In this paper, it is experimentally shown
that the trained SED model is able to contribute to the direction
of arrival estimation (DOAE). However, joint training of SED and
DOAE degrades the performance of both. Based on these results, a
two-stage polyphonic sound event detection and localization method
is proposed. The method learns SED first, after which the learned
feature layers are transferred for DOAE. It then uses the SED ground
truth as a mask to train DOAE. The proposed method is evaluated on
the DCASE 2019 Task 3 dataset, which contains different overlapping
sound events in different environments. Experimental results
show that the proposed method is able to improve the performance
of both SED and DOAE, and also performs significantly better than
the baseline method.
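The masking step mentioned in the abstract, where the SED ground truth gates DOAE training, can be illustrated with a minimal sketch. This record does not give the paper's loss function, network architecture, or tensor layout, so everything below is an assumption made for illustration: the function name `masked_doa_loss`, the `(frames, classes, 2)` azimuth/elevation layout, and the choice of mean absolute error are all hypothetical. The sketch also ignores angle wraparound.

```python
import numpy as np


def masked_doa_loss(doa_pred, doa_true, sed_true):
    """Mean absolute DOA error over active (frame, class) cells only.

    Hypothetical layout, not taken from the paper:
      doa_pred, doa_true: (frames, classes, 2) azimuth/elevation per class
      sed_true:           (frames, classes) binary ground-truth activity
    """
    # Expand the SED ground truth so it broadcasts over the angle axis.
    mask = sed_true[..., None].astype(doa_pred.dtype)
    # Errors in cells where no event is active are zeroed out.
    abs_err = np.abs(doa_pred - doa_true) * mask
    # Number of angle values that belong to active cells.
    n_active = mask.sum() * doa_pred.shape[-1]
    # Guard against clips that contain no active events.
    return abs_err.sum() / max(n_active, 1.0)


# Toy data: 3 frames, 2 event classes.
sed_true = np.array([[1, 0], [0, 0], [1, 1]])
doa_true = np.zeros((3, 2, 2))
doa_pred = np.full((3, 2, 2), 10.0)
doa_pred[1] = 999.0  # frame with no active events; excluded by the mask
loss = masked_doa_loss(doa_pred, doa_true, sed_true)
```

Because inactive cells are masked out, the large errors in the silent frame do not affect `loss`. Under the assumptions above, this is one way a DOA branch could be trained only on frames and classes where an event is actually present.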
Item Type: Conference or Workshop Item (Conference Poster)
Divisions: Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors: Cao, Yin (yin.cao@surrey.ac.uk); Kong, Qiuqiang (q.kong@surrey.ac.uk); Iqbal, Turab (t.iqbal@surrey.ac.uk); An, Fengyan; Wang, Wenwu (W.Wang@surrey.ac.uk); Plumbley, Mark D. (m.plumbley@surrey.ac.uk)
Editors: Mandel, Michael; Salamon, Justin; Ellis, Daniel P. W.
Date: 25 October 2019
Funders: Engineering and Physical Sciences Research Council (EPSRC), European Union's Horizon 2020
DOI: 10.33682/4jhy-bj81
Grant Title: Making Sense of Sounds
Copyright Disclaimer: Copyright 2019 The Authors. This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit: http://creativecommons.org/licenses/by/4.0/
Uncontrolled Keywords: Sound event detection; Source localization; Direction of arrival; Convolutional recurrent neural networks
Additional Information: Proceedings DOI: https://doi.org/10.33682/1syg-dy60
Depositing User: Clive Harris
Date Deposited: 05 Nov 2019 13:54
Last Modified: 14 Nov 2019 13:45
URI: http://epubs.surrey.ac.uk/id/eprint/853042