A visual voice activity detection method with adaboosting
Liu, Q, Wang, W and Jackson, PJB (2011) A visual voice activity detection method with adaboosting. In: Sensor Signal Processing for Defence (SSPD 2011), 2011-09-27 - 2011-09-29, London, UK.
Available under License : See the attached licence file.
Spontaneous speech in videos capturing the speaker's mouth provides bimodal information. Exploiting the relationship between the audio and visual streams, we propose a new visual voice activity detection (VAD) algorithm, to overcome the vulnerability of conventional audio VAD techniques in the presence of background interference. First, a novel lip extraction algorithm combining rotational templates and prior shape constraints with active contours is introduced. The visual features are then obtained from the extracted lip region. Second, with the audio voice activity vector used in training, adaboosting is applied to the visual features, to generate a strong final voice activity classifier by boosting a set of weak classifiers. We have tested our lip extraction algorithm on the XM2VTS database (with higher resolution) and some video clips from YouTube (with lower resolution). The visual VAD was shown to offer low error rates.
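The boosting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weak classifiers or the visual features, so the threshold "stumps", the two per-frame features (mouth width and mouth height) and the toy values below are all assumptions chosen for illustration. Labels stand in for the audio voice activity vector used in training (+1 for speech, -1 for silence).

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best threshold stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted(set(row[j] for row in X)):
            for polarity in (1, -1):
                # Weighted error of the rule "predict polarity if x[j] > thr".
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (polarity if xi[j] > thr else -polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, x):
    _, j, thr, polarity = stump
    return polarity if x[j] > thr else -polarity

def adaboost_train(X, y, n_rounds=5):
    """Discrete AdaBoost: reweight frames so later stumps focus on earlier mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = train_stump(X, y, w)
        # Clamp the error to keep the logarithm finite on separable data.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of the weak classifiers."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical per-frame lip features [mouth_width, mouth_height] (toy values).
X = [[30, 2], [31, 3], [29, 2], [32, 10], [30, 12], [33, 9]]
# Stand-in for the audio VAD vector: +1 = speech, -1 = silence.
y = [-1, -1, -1, 1, 1, 1]

ensemble = adaboost_train(X, y)
predictions = [adaboost_predict(ensemble, x) for x in X]
```

Once trained, the classifier needs only the visual features, so `adaboost_predict` can label frames whose audio is corrupted by background interference.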
|Item Type:||Conference or Workshop Item (Conference Paper)|
|Divisions :||Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing|
|Identification Number :||10.1049/ic.2011.0145|
|Additional Information :||This paper is a postprint of a paper submitted to and accepted for publication in IET Seminar Digest and is subject to Institution of Engineering and Technology Copyright. The copy of record is available at IET Digital Library|
|Depositing User :||Symplectic Elements|
|Date Deposited :||28 Sep 2012 09:50|
|Last Modified :||23 Sep 2013 19:28|