University of Surrey


Chime-home: A dataset for sound source recognition in a domestic environment.

Foster, P, Sigtia, S, Krstulovic, S, Barker, J and Plumbley, MD (2015) Chime-home: A dataset for sound source recognition in a domestic environment. In: 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 18-21 Oct 2015, New Paltz, NY.

Text: chime_home_waspaa_2015_camera_ready_3.pdf (178kB)
Available under License : See the attached licence file.

Text (licence): SRI_deposit_agreement.pdf (33kB)
Available under License : See the attached licence file.

Abstract

For the task of sound source recognition, we introduce a novel dataset based on 6.8 hours of domestic environment audio recordings. We describe our approach to obtaining annotations for the recordings. Further, we quantify agreement between the obtained annotations. Finally, we report baseline results for sound source recognition using the obtained dataset. Our annotation approach associates each 4-second excerpt from the audio recordings with multiple labels, drawn from a set of 7 labels associated with sound sources in the acoustic environment. With the aid of 3 human annotators, we obtain 3 sets of multi-label annotations for 4378 4-second audio excerpts. We evaluate agreement between annotators by computing Jaccard indices between sets of label assignments. Observing varying levels of agreement across labels, and with a view to obtaining a representation of ‘ground truth’ in annotations, we refine our dataset to obtain a set of multi-label annotations for 1946 audio excerpts. For the set of 1946 annotated audio excerpts, we predict binary label assignments using Gaussian mixture models estimated on MFCCs. Evaluated using the area under receiver operating characteristic curves, across the considered labels we observe performance scores in the range 0.76 to 0.98.
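The two evaluation measures named in the abstract, the Jaccard index between annotators' label sets and the area under the ROC curve for a binary label, can be illustrated with a minimal sketch. This is not the authors' code. The annotator names, label names, and classifier scores are hypothetical, and computing the Jaccard index per excerpt and averaging over annotator pairs is an assumption, since the abstract does not state the exact variant used.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| of two label sets.

    Assumption: two empty sets count as perfect agreement (1.0).
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def mean_pairwise_jaccard(annotations):
    """Average the per-excerpt Jaccard index over all annotator pairs.

    annotations maps an annotator name to a list of label sets,
    one set per audio excerpt, in the same excerpt order.
    """
    scores = []
    for first, second in combinations(sorted(annotations), 2):
        for labels_a, labels_b in zip(annotations[first], annotations[second]):
            scores.append(jaccard(labels_a, labels_b))
    return sum(scores) / len(scores)


def roc_auc(targets, scores):
    """Area under the ROC curve for one binary label.

    Computed as the probability that a randomly chosen positive
    excerpt scores higher than a randomly chosen negative one,
    with ties counted as one half.
    """
    positives = [s for t, s in zip(targets, scores) if t == 1]
    negatives = [s for t, s in zip(targets, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical annotations: 3 annotators, 2 excerpts, labels "x", "y", "z".
example_annotations = {
    "annotator_1": [{"x", "y"}, {"z"}],
    "annotator_2": [{"x"}, {"z"}],
    "annotator_3": [{"x", "y"}, {"y", "z"}],
}
agreement = mean_pairwise_jaccard(example_annotations)

# Hypothetical ground truth and classifier scores for a single label.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.4, 0.35, 0.1])
```

In this sketch `agreement` summarises how closely the three hypothetical annotators agree, and `auc` would be computed once per label to obtain per-label scores comparable in kind to the 0.76 to 0.98 range reported above.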

Item Type: Conference or Workshop Item (Conference Paper)
Subjects : Signal processing
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing
Authors :
Authors          Email        ORCID
Foster, P        UNSPECIFIED  UNSPECIFIED
Sigtia, S        UNSPECIFIED  UNSPECIFIED
Krstulovic, S    UNSPECIFIED  UNSPECIFIED
Barker, J        UNSPECIFIED  UNSPECIFIED
Plumbley, MD     UNSPECIFIED  UNSPECIFIED
Date : 2015
Identification Number : 10.1109/WASPAA.2015.7336899
Contributors :
Contribution     Name         Email        ORCID
Publisher        IEEE         UNSPECIFIED  UNSPECIFIED
Additional Information : © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Depositing User : Symplectic Elements
Date Deposited : 02 Feb 2016 09:23
Last Modified : 02 Feb 2016 09:25
URI: http://epubs.surrey.ac.uk/id/eprint/809719


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800