University of Surrey


Audio Assisted Robust Visual Tracking With Adaptive Particle Filtering

Kilic, V, Barnard, M, Wang, W and Kittler, J (2015) Audio Assisted Robust Visual Tracking With Adaptive Particle Filtering. IEEE Transactions on Multimedia, 17 (2). pp. 186-200.

Text: KilicBWK_TMM_2014_postprint.pdf - Accepted version (post-print). Available under licence: see the attached licence file. Download (9MB).
PDF (licence): SRI_deposit_agreement.pdf. Available under licence: see the attached licence file. Download (33kB).

Abstract

The problem of tracking multiple moving speakers in indoor environments has received much attention. Earlier techniques were based purely on a single modality, e.g., vision. Recently, the fusion of multi-modal information has been shown to be instrumental in improving tracking performance, as well as robustness in challenging situations such as occlusions (caused by the limited field of view of cameras or by other speakers). However, data fusion algorithms often suffer from noise corrupting the sensor measurements, which causes non-negligible detection errors. Here, a novel approach to combining audio and visual data is proposed. We employ the direction-of-arrival (DOA) angles of the audio sources to reshape the typical Gaussian noise distribution of particles in the propagation step and to weight the observation model in the measurement step. This approach is further improved by addressing a typical problem of the particle filter (PF), whose efficiency and accuracy usually depend on the number of particles and the noise variance used in state estimation and particle propagation. Both parameters are specified beforehand and kept fixed in the regular PF implementation, which makes the tracker unstable in practice. To address this, we design an algorithm that adapts both the number of particles and the noise variance based on the tracking error and the area occupied by the particles in the image. Experiments on the AV16.3 dataset show the advantage of the proposed methods over the baseline PF and an existing adaptive PF algorithm for tracking occluded speakers with a significantly reduced number of particles.
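As a rough illustration of the two ideas the abstract describes — shifting the particle proposal toward an audio DOA-derived target during propagation, and adapting the particle count and noise variance to the tracking error — the following Python sketch shows a toy version. All function names, thresholds, and scaling factors here are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def propagate(particles, doa_target, sigma, alpha=0.5):
    """Shift each particle part-way toward the audio DOA-derived target
    position before adding Gaussian diffusion noise -- a crude stand-in
    for the paper's reshaped proposal distribution (alpha is assumed)."""
    shifted = particles + alpha * (doa_target - particles)
    return shifted + rng.normal(0.0, sigma, size=particles.shape)

def adapt(n_particles, sigma, tracking_error,
          err_lo=2.0, err_hi=10.0,
          n_min=20, n_max=500, sigma_min=1.0, sigma_max=20.0):
    """Grow the particle set and noise variance when the tracking error
    is large, shrink them when it is small (thresholds and factors are
    illustrative, not taken from the paper)."""
    if tracking_error > err_hi:
        n_particles = min(n_max, int(n_particles * 1.5))
        sigma = min(sigma_max, sigma * 1.5)
    elif tracking_error < err_lo:
        n_particles = max(n_min, int(n_particles * 0.7))
        sigma = max(sigma_min, sigma * 0.7)
    return n_particles, sigma

# Example: large error grows the set, small error shrinks it.
print(adapt(100, 5.0, 15.0))  # -> (150, 7.5)
print(adapt(100, 5.0, 1.0))   # -> (70, 3.5)
```

The point of the adaptation step is that the regular PF fixes both parameters in advance; coupling them to a feedback signal (here, a scalar tracking error) lets the tracker spend particles only when the state estimate is poor.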

Item Type: Article
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing
Authors : Kilic, V; Barnard, M; Wang, W; Kittler, J (email and ORCID unspecified)
Date : 1 February 2015
Identification Number (DOI) : 10.1109/TMM.2014.2377515
Uncontrolled Keywords : Science & Technology, Technology, Computer Science, Information Systems, Computer Science, Software Engineering, Telecommunications, Computer Science, Adaptive particle filter, audio-visual speaker tracking, particle filter, SOURCE LOCALIZATION, MULTIPLE SPEAKERS, OBJECT TRACKING, FUSION, SEGMENTATION, NUMBER
Related URLs :
Additional Information : © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Depositing User : Symplectic Elements
Date Deposited : 28 Oct 2015 11:22
Last Modified : 28 Oct 2015 11:22
URI: http://epubs.surrey.ac.uk/id/eprint/809042

