Speech-driven facial animation system
Abstract
This thesis is concerned with the problem of synthesizing an animated face driven by a new audio sequence, one not present in the previously recorded database. The main focus of the thesis is on exploring an efficient mapping of features from the speech domain to the video domain. The mapping algorithms consist of two parts: building a model that fits the training data set, and predicting visual motion from novel audio stimuli. The motivation was to construct a direct mapping from low-level acoustic signals to visual frames. Unlike previous efforts that operate at higher acoustic levels (phonemes or words), the current approach skips the speech recognition phase, in which it is difficult to obtain high accuracy due to speaker and language variability.
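The abstract does not specify the mapping model, so the following is only a minimal sketch of the fit-then-predict pattern it describes, assuming a ridge-regularized linear map from per-frame acoustic features (e.g., MFCCs) to visual shape parameters. All names, dimensions, and the choice of linear regression here are illustrative assumptions, not the thesis's actual method.

```python
import numpy as np

def fit_audio_to_visual(A, V, reg=1e-3):
    """Fit a ridge-regularized linear map W so that A @ W approximates V.

    A: (n_frames, d_audio)  low-level acoustic features per frame (assumed MFCC-like)
    V: (n_frames, d_visual) visual motion parameters per frame (assumed shape coefficients)
    """
    d = A.shape[1]
    # Closed-form ridge solution: W = (A^T A + reg*I)^{-1} A^T V
    return np.linalg.solve(A.T @ A + reg * np.eye(d), A.T @ V)

def predict_visual(A_new, W):
    """Predict visual motion parameters for a novel audio sequence."""
    return A_new @ W

# Toy usage with random stand-in data (no real database is used here)
rng = np.random.default_rng(0)
A_train = rng.normal(size=(500, 13))   # e.g., 13 MFCCs per audio frame
V_train = rng.normal(size=(500, 6))    # e.g., 6 visual shape parameters
W = fit_audio_to_visual(A_train, V_train)
V_pred = predict_visual(rng.normal(size=(10, 13)), W)  # novel audio -> predicted motion
```

Because the map operates directly on frame-level acoustic features, prediction requires no intermediate phoneme or word recognition step, which is the property the abstract emphasizes.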