University of Surrey

Automatic environmental sound recognition: Performance versus computational cost

Sigtia, S, Stark, AM, Krstulovic, S and Plumbley, Mark (2016) Automatic environmental sound recognition: Performance versus computational cost. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24 (11). pp. 2096-2107.

SigtiaStarkKrstulovicPlumbley16-aslp_accepted.pdf - Accepted version Manuscript
Available under License : See the attached licence file.

In the context of the Internet of Things (IoT), sound sensing applications are required to run on embedded platforms where product pricing and form factor impose hard constraints on the available computing power. Whereas Automatic Environmental Sound Recognition (AESR) algorithms are most often developed with limited consideration for computational cost, this article investigates which AESR algorithm can make the most of a limited amount of computing power by comparing sound classification performance as a function of computational cost. Results suggest that Deep Neural Networks yield the best ratio of sound classification accuracy to computational cost across a range of computational costs, while Gaussian Mixture Models offer reasonable accuracy at a consistently small cost, and Support Vector Machines fall between the two in their trade-off between accuracy and computational cost.

Item Type: Article
Subjects : Electronic Engineering
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Sigtia, S
Stark, AM
Krstulovic, S
Plumbley, Mark
Date : November 2016
Identification Number : 10.1109/TASLP.2016.2592698
Copyright Disclaimer : (c) 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Uncontrolled Keywords : Automatic environmental sound recognition, computational auditory scene analysis, machine learning, deep learning.
Depositing User : Symplectic Elements
Date Deposited : 03 Aug 2016 15:55
Last Modified : 26 Jul 2017 16:01
