University of Surrey


Local ordinal contrast pattern histograms for spatiotemporal, lip-based speaker authentication

Chan, CH, Goswami, B, Kittler, J and Christmas, W (2012) Local ordinal contrast pattern histograms for spatiotemporal, lip-based speaker authentication. IEEE Transactions on Information Forensics and Security, 7 (2). pp. 602-612. ISSN 1556-6013


Abstract

Lip region deformation during speech contains biometric information and is termed visual speech. This biometric information can be interpreted as genetic or behavioral, depending on whether static or dynamic features are extracted. In this paper, we use a texture descriptor called the local ordinal contrast pattern (LOCP), together with a dynamic texture representation called three orthogonal planes, to represent both the appearance and the dynamics features observed in visual speech. This feature representation, when used in standard speaker verification engines, is shown to improve the performance of the lip-biometric trait compared with the state of the art. The best baseline state-of-the-art performance was a half total error rate (HTER) of 13.35% on the XM2VTS database; we obtained an HTER of less than 1%. The resilience of the LOCP texture descriptor to random image noise is also investigated. Finally, an analysis of the effect of the amount of video information on speaker verification performance suggests that, with the proposed approach, speaker identity can be verified with a much shorter biometric trait record than the length normally required for voice-based biometrics. In summary, the performance obtained is remarkable and suggests that there is enough discriminative information in the mouth region to enable its use as a primary biometric trait.
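
For readers unfamiliar with the metric, the HTER quoted in the abstract is conventionally the mean of the false acceptance rate (FAR) and the false rejection rate (FRR) at a chosen decision threshold. The following is a minimal sketch of that calculation only; the function name, the toy scores, and the threshold are illustrative assumptions and are not taken from the paper or from the XM2VTS protocol.

```python
def half_total_error_rate(impostor_scores, client_scores, threshold):
    """Return (FAR, FRR, HTER) for a verifier that accepts when score >= threshold.

    Illustrative helper, not the authors' code. Scores are assumed to be
    similarity scores, so a higher score means "more likely the claimed speaker".
    """
    if not impostor_scores or not client_scores:
        raise ValueError("both score lists must be non-empty")

    # Impostor attempts that were wrongly accepted.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    # Genuine (client) attempts that were wrongly rejected.
    false_rejects = sum(1 for s in client_scores if s < threshold)

    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(client_scores)
    hter = (far + frr) / 2.0
    return far, frr, hter


# Toy scores, invented purely for illustration.
impostors = [0.1, 0.2, 0.35, 0.6]
clients = [0.4, 0.7, 0.8, 0.9]

far, frr, hter = half_total_error_rate(impostors, clients, threshold=0.5)
print(f"FAR={far:.2%}  FRR={frr:.2%}  HTER={hter:.2%}")
```

In evaluation protocols of this kind the threshold is normally fixed on a separate evaluation set and then applied to the test set, so the two error rates are not tuned on the data they are reported for; the sketch above leaves that choice to the caller.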

Item Type: Article
Additional Information: Copyright 2012 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Divisions: Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing
Depositing User: Symplectic Elements
Date Deposited: 21 Jun 2012 09:02
Last Modified: 23 Sep 2013 19:23
URI: http://epubs.surrey.ac.uk/id/eprint/486737
