Dissertations, Theses, and Capstone Projects

Date of Degree

6-2016

Document Type

Dissertation

Degree Name

Ph.D.

Program

Computer Science

Advisor

Matt Huenerfauth

Committee Members

Raquel Benbunan-Fich

Andrew Rosenberg

Vicki Hanson

Subject Categories

Artificial Intelligence and Robotics | Computational Linguistics | Graphics and Human Computer Interfaces

Keywords

Accessibility, Technology for People who are Deaf, Facial Expression Modeling, User Study, Eye Tracking, MPEG-4

Abstract

Technology to automatically synthesize linguistically accurate and natural-looking animations of American Sign Language (ASL) would make it easier to add ASL content to websites and media, thereby increasing information accessibility for many people who are deaf and have low English literacy skills. State-of-the-art sign language animation tools focus mostly on the accuracy of manual signs rather than on facial expressions. We are investigating the synthesis of syntactic ASL facial expressions, which are grammatically required and essential to the meaning of sentences. In this thesis, we propose to: (1) explore the methodological aspects of evaluating sign language animations with facial expressions, and (2) examine data-driven modeling of facial expressions from multiple recordings of ASL signers. In Part I of this thesis, we propose to conduct rigorous methodological research on how experiment design affects study outcomes when evaluating sign language animations with facial expressions. Our research questions involve: (i) stimuli design, (ii) the effect of using videos as an upper baseline and for presenting comprehension questions, and (iii) eye tracking as an alternative to recording question responses from participants. In Part II of this thesis, we propose to use generative models to automatically uncover the underlying trace of ASL syntactic facial expressions from multiple recordings of ASL signers, and to apply these facial expressions to manual signs in novel animated sentences. We hypothesize that an annotated sign language corpus, including both manual and non-manual signs, can be used to model and generate linguistically meaningful facial expressions, if it is combined with facial feature extraction techniques, statistical machine learning, and an animation platform with detailed facial parameterization. To further improve sign language animation technology, we will assess the quality of the animations generated by our approach with ASL signers, using the rigorous evaluation methodologies described in Part I.
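
As a rough illustration of the data-driven modeling described in Part II, the sketch below shows one simple way to recover a shared "trace" from multiple recordings of the same syntactic facial expression: each recording is assumed to be already reduced to an MPEG-4 facial animation parameter trajectory, which is time-normalized to a common length and then averaged. The file-free synthetic data, the choice of parameters, and the plain averaging model are illustrative assumptions, not the specific method used in the thesis.

import numpy as np

def time_normalize(trajectory: np.ndarray, num_frames: int = 100) -> np.ndarray:
    """Resample a (frames x parameters) trajectory to a fixed number of frames."""
    src = np.linspace(0.0, 1.0, len(trajectory))
    dst = np.linspace(0.0, 1.0, num_frames)
    # Interpolate each facial parameter column independently.
    return np.stack(
        [np.interp(dst, src, trajectory[:, p]) for p in range(trajectory.shape[1])],
        axis=1,
    )

def canonical_trace(recordings: list[np.ndarray], num_frames: int = 100) -> np.ndarray:
    """Average several time-normalized recordings into one representative trace."""
    normalized = [time_normalize(r, num_frames) for r in recordings]
    return np.mean(normalized, axis=0)

if __name__ == "__main__":
    # Hypothetical data: three recordings of the same expression (e.g., a
    # yes/no-question brow raise) with different durations, each tracking
    # four eyebrow/eyelid-related parameters.
    rng = np.random.default_rng(0)
    recordings = [rng.standard_normal((n, 4)).cumsum(axis=0) for n in (80, 95, 110)]
    trace = canonical_trace(recordings)
    print(trace.shape)  # (100, 4): one frame-by-frame value per parameter

In practice, a trace like this could be retargeted onto the face of an animated signer during the manual signs of a novel sentence; evaluating whether such animations are understandable to ASL signers is the subject of the methodological work in Part I.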
