University of Surrey


CHiME-home: A dataset for sound source recognition in a domestic environment.

Foster, P, Sigtia, S, Krstulovic, S, Barker, J and Plumbley, MD (2015) CHiME-home: A dataset for sound source recognition in a domestic environment. In: 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 18-21 Oct 2015, New Paltz, NY.



For the task of sound source recognition, we introduce a novel dataset based on 6.8 hours of domestic environment audio recordings. We describe our approach to obtaining annotations for the recordings. Further, we quantify agreement between the obtained annotations. Finally, we report baseline results for sound source recognition using the obtained dataset. Our annotation approach associates each 4-second excerpt from the audio recordings with multiple labels, drawn from a set of 7 labels associated with sound sources in the acoustic environment. With the aid of 3 human annotators, we obtain 3 sets of multi-label annotations for 4378 4-second audio excerpts. We evaluate agreement between annotators by computing Jaccard indices between sets of label assignments. Observing varying levels of agreement across labels, and with a view to obtaining a representation of ‘ground truth’ in annotations, we refine our dataset to obtain a set of multi-label annotations for 1946 audio excerpts. For the set of 1946 annotated audio excerpts, we predict binary label assignments using Gaussian mixture models estimated on MFCCs. Evaluated using the area under receiver operating characteristic curves, we observe performance scores in the range 0.76 to 0.98 across the considered labels.
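
As an illustration of the agreement measure named in the abstract, the following minimal sketch computes the Jaccard index between two sets of labels and averages it over all pairs of annotators. The function names, the placeholder label names, and the convention of returning 1.0 when both sets are empty are assumptions made for this example; the abstract does not specify how the sets are formed or how indices are aggregated, so this is not necessarily the procedure used in the paper.

```python
from itertools import combinations


def jaccard_index(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two label sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets count as full agreement.
        return 1.0
    return len(a & b) / len(union)


def mean_pairwise_jaccard(annotations):
    """Average Jaccard index over all unordered pairs of annotators."""
    pairs = list(combinations(annotations, 2))
    return sum(jaccard_index(a, b) for a, b in pairs) / len(pairs)


# Hypothetical label sets from 3 annotators for one 4-second excerpt.
# "label_a" and "label_b" are placeholders, not labels from the dataset.
excerpt_annotations = [
    {"label_a", "label_b"},
    {"label_a"},
    {"label_a", "label_b"},
]

agreement = mean_pairwise_jaccard(excerpt_annotations)
print(f"mean pairwise Jaccard index: {agreement:.3f}")
```

A value of 1 indicates identical label sets and 0 indicates no labels in common, so lower averages point to excerpts or labels on which annotators disagree.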

Item Type: Conference or Workshop Item (Conference Paper)
Subjects : Signal processing
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing
Date : 2015
Identification Number : 10.1109/WASPAA.2015.7336899
Additional Information : © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Depositing User : Symplectic Elements
Date Deposited : 02 Feb 2016 09:23
Last Modified : 06 Dec 2017 12:03

