Date of Award

5-1-2004

Document Type

Thesis (Undergraduate)

Department or Program

Department of Computer Science

First Advisor

Lorie Loeb

Second Advisor

Hany Farid

Third Advisor

Stephen Linder

Abstract

This thesis aims to create an automated lip-synchronization system for real-time applications. Specifically, the system is required to be fast, consist of a limited number of keyframes with small memory requirements, and create fluid and believable animations that synchronize with text-to-speech engines as well as raw voice data. The algorithms utilize traditional keyframe animation and a novel method of keyframe selection. Additionally, phoneme-to-keyframe mapping, synchronization, and simple blending rules are employed. The algorithms provide blending between keyframe images, borrow information from neighboring phonemes, accentuate phonemes b, p and m, differentiate between keyframes for phonemes with allophonic variations, and provide prosodromic variation by including emotion while speaking. The lip-sync animation synchronizes with multiple synthesized voices and human speech. A fast and versatile online real-time java chat interface is created to exhibit vivid facial animation. Results show that the animation algorithms are fast and show accurate lip-synchronization. Additionally, surveys showed that the animations are visually pleasing and improve speech understandability 96% of the time. Applications for this project include internet chat capabilities, interactive teaching of foreign languages, animated news broadcasting, enhanced game technology, and cell phone messaging.

Comments

Originally posted in the Dartmouth College Computer Science Technical Report Series, number TR2004-501.

COinS