Model-based synthesis of visual speech movements from 3D video
Edge, JD, Hilton, A and Jackson, PJB (2009) Model-based synthesis of visual speech movements from 3D video. EURASIP Journal on Audio, Speech, and Music Processing, 2009. 12 - 12. ISSN 1687-4714
Official URL: http://dx.doi.org/10.1155/2009/597267
In this paper we describe a method for the synthesis of visual speech movements using a hybrid unit selection/model-based approach. Speech lip movements are captured using a 3D stereo face capture system, and split up into phonetic units. A dynamic parameterisation of this data is constructed which maintains the relationship between lip shapes and velocities; within this parameterisation a model of how lips move is built and is used in the animation of visual speech movements from speech audio input. The mapping from audio parameters to lip movements is disambiguated by selecting only the most similar stored phonetic units to the target utterance during synthesis. By combining properties of model-based synthesis (e.g. HMMs, neural nets) with unit selection we improve the quality of our speech synthesis.
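The unit-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Unit` layout, the finite-difference velocities, the Euclidean distance on a single audio feature vector per unit, and the function names are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Unit:
    """A stored phonetic unit (hypothetical layout, not taken from the paper)."""
    phone: str               # phonetic label, e.g. "p"
    audio: List[float]       # one audio feature vector summarising the unit
    lips: List[List[float]]  # lip-shape parameters, one vector per frame


def with_velocities(lips: List[List[float]]) -> List[List[float]]:
    """Append a finite-difference velocity to each lip-shape frame.

    Assumption: pairing each shape with its velocity stands in for the
    paper's dynamic parameterisation, whose exact form is not given here.
    """
    out = []
    for i, frame in enumerate(lips):
        prev = lips[i - 1] if i > 0 else frame  # zero velocity at the first frame
        velocity = [x - p for x, p in zip(frame, prev)]
        out.append(list(frame) + velocity)
    return out


def select_units(phone: str, target_audio: Sequence[float],
                 units: List[Unit], k: int = 3) -> List[Unit]:
    """Return up to k stored units of the given phone closest to the target audio."""
    candidates = [u for u in units if u.phone == phone]
    # Assumed similarity measure: Euclidean distance between audio features.
    candidates.sort(key=lambda u: math.dist(u.audio, target_audio))
    return candidates[:k]
```

In a pipeline along these lines, a motion model would then be fitted only to the selected units, which is one way of narrowing the otherwise ambiguous mapping from audio to lip movement.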
|Additional Information:||This article is distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution and reproduction in any medium, provided that the original work is properly cited.|
|Divisions:||Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing|
|Deposited By:||Symplectic Elements|
|Deposited On:||14 Dec 2011 12:43|
|Last Modified:||19 May 2013 02:34|