University of Surrey


Particle Flow PHD Filtering for Audio-Visual Multi-Speaker Tracking

Liu, Yang (2019) Particle Flow PHD Filtering for Audio-Visual Multi-Speaker Tracking Doctoral thesis, University of Surrey.

Yang_thesis.pdf - Version of Record
Available under License Creative Commons Attribution Non-commercial Share Alike.

Tracking an unknown and time-varying number of targets (e.g., speakers) in indoor environments using audio-visual (AV) modalities has received increasing interest in numerous fields, including video conferencing, individual speaker discrimination, and human-computer interaction. The audio-visual sequential Monte Carlo probability hypothesis density (AV-SMC-PHD) filter is a popular baseline for multi-target tracking, offering an elegant framework for fusing audio-visual information and dealing with a varying number of speakers. However, the performance of this filter can be adversely affected by the weight degeneracy problem, where, as the algorithm iterates, the weights of most particles become very small while only a few remain significant. To address this issue, this thesis extends the AV-SMC-PHD filter by incorporating particle flows defined in terms of an ordinary differential equation and the Fokker-Planck equation. The thesis considers both zero and non-zero diffusion particle flows (ZPF/NPF) and develops two new algorithms, AV-ZPF-SMC-PHD and AV-NPF-SMC-PHD, in which the speaker states from the previous frames are also considered for particle relocation. The particle flow migrates particles from the prior distribution to the posterior distribution, using a homotopy function which defines the flow in synthetic time. The proposed methods can mitigate the particle degeneracy of the AV-SMC-PHD filter and improve tracking accuracy. Another issue is that the performance of multi-speaker tracking algorithms is often degraded by mis-detection and clutter in the measurements. To address this issue, this thesis proposes an intensity particle flow (IPF) SMC-PHD filter based on the intensity function derived from the measurements, informed by the clutter density and the detection probability.
The IPF-SMC-PHD filter improves tracking accuracy but incurs a high computational overhead, because it requires computing the sum of the likelihood intensity functions and the third-order derivative of the likelihood density. As a result, the computational complexity of IPF is proportional to the cube of the number of measurements. To address this problem, this thesis proposes a labelled particle flow (LPF) algorithm in which particle labels are estimated from the measurements of multiple sensors and then used to update particles and estimate speaker states. Since the LPF uses only the first-order derivative of the likelihood density and replaces the clustering step with a sum of particle states, it offers higher computational efficiency than other particle flow methods, which often rely on a clustering method to estimate the target states. All the proposed methods are extensively evaluated on several datasets, including AV16.3, AVDIAR and CLEAR. The results show that the proposed methods mitigate the weight degeneracy problem and offer higher tracking accuracy than the baseline methods in a variety of scenarios, such as occlusion and rapid movements of the speakers.
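
The weight degeneracy and particle flow ideas mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's AV-ZPF-SMC-PHD algorithm: it is a scalar linear-Gaussian toy case (measurement z = x + noise) that uses the well-known exact zero-diffusion flow of Daum and Huang, integrated with forward Euler steps over synthetic time. The function names `effective_sample_size` and `exact_flow_1d`, and all numerical values, are assumptions made for illustration only.

```python
import math
import random


def effective_sample_size(weights):
    """Return 1 / sum(w_i^2) for normalised weights; small values signal degeneracy."""
    total = sum(weights)
    normalised = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalised)


def exact_flow_1d(x, prior_mean, prior_var, z, meas_var, steps=500):
    """Move one particle from prior to posterior for z = x + noise (scalar case).

    Integrates dx/dlam = a(lam) * x + b(lam) with forward Euler,
    where lam runs over synthetic time from 0 to 1.
    """
    h = 1.0 / steps
    for k in range(steps):
        lam = k * h
        a = -0.5 * prior_var / (lam * prior_var + meas_var)
        b = (1.0 + 2.0 * lam * a) * (
            (1.0 + lam * a) * prior_var * z / meas_var + a * prior_mean
        )
        x += h * (a * x + b)
    return x


# Hypothetical toy numbers, chosen only for illustration.
prior_mean, prior_var, z, meas_var = 0.0, 4.0, 3.0, 1.0
rng = random.Random(0)
particles = [rng.gauss(prior_mean, math.sqrt(prior_var)) for _ in range(1000)]

# Importance weights from the Gaussian likelihood: the mass concentrates
# on the few prior particles that happen to lie near the measurement.
weights = [math.exp(-0.5 * (z - x) ** 2 / meas_var) for x in particles]
print("ESS without flow:", effective_sample_size(weights))

# The flow relocates every particle towards the posterior instead of
# re-weighting it, so no particle is left with a negligible weight.
moved = [exact_flow_1d(x, prior_mean, prior_var, z, meas_var) for x in particles]
print("mean after flow:", sum(moved) / len(moved))
```

In this toy setting the effective sample size of the likelihood-weighted prior particles falls well below the particle count, while the flowed particles cluster around the Kalman posterior mean and can keep equal weights. The filters in the thesis operate on PHD intensities with audio-visual measurements, which this sketch does not attempt to reproduce.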

Item Type: Thesis (Doctoral)
Divisions : Theses
Authors :
Liu, Yang
Date : December 2019
Funders : EPSRC Programme Grant S3A: Future Spatial Audio for an Immersive Listener Experience at Home, BBC as part of the BBC Audio Research Partnership, China Scholarship Council (CSC)
DOI : 10.15126/thesis.00853216
Depositing User : Yang Liu
Date Deposited : 03 Jan 2020 15:13
Last Modified : 03 Jan 2020 15:13

© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800