Robust bird species recognition: making it work for dawn chorus audio archives
Stowell, D and Plumbley, MD (2014) Robust bird species recognition: making it work for dawn chorus audio archives. In: Ecology and acoustics: emergent properties from community to landscape, 16-18 Jun 2014, Paris, France. (Unpublished)
The recent (2013) bird species recognition challenges organised by the SABIOD project attracted some strong performances from automatic classifiers applied to short audio excerpts from passive acoustic monitoring stations. Can such strong results be achieved for dawn chorus field recordings in audio archives? The question is important because archives (such as the British Library Sound Archive) hold thousands of such recordings, covering many decades and many countries, but they are mostly unlabelled. Automatic labelling holds the potential to unlock their value to ecological studies. Audio in such archives is quite different from passive acoustic monitoring data: importantly, the recording conditions vary randomly (and are usually unknown), making the scenario a "cross-condition" rather than "single-condition" train/test task. Dawn chorus recordings are generally long, and the annotations often indicate which birds are present in a 20-minute recording but not the 5-second segments in which they are active. Further, the amount of annotation available is very small. We report on experiments to evaluate a variety of classifier configurations for automatic multilabel species annotation in dawn chorus archive recordings. The audio data is an order of magnitude larger than that of the SABIOD challenges, but the ground-truth data is an order of magnitude smaller. We report some surprising findings, including clear variation in the benefits of some analysis choices (audio features, pooling techniques, noise-robustness techniques) as we move to handle the specific multi-condition case relevant for audio archives.
|Item Type:||Conference or Workshop Item (Lecture)|
|Divisions :||Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing|
|Date :||June 2014|
|Additional Information :||Full text not available from this repository.|
|Depositing User :||Symplectic Elements|
|Date Deposited :||08 Dec 2015 16:12|
|Last Modified :||08 Dec 2015 16:12|