Dissertations, Theses, and Capstone Projects
Date of Degree
9-2021
Document Type
Dissertation
Degree Name
Ph.D.
Program
Speech-Language-Hearing Sciences
Advisor
Douglas H. Whalen
Committee Members
Hosung Nam
Wei-rong Chen
Mark K. Tiede
Christina Hagedorn
Subject Categories
Phonetics and Phonology
Keywords
vowel variability, speaking rate, the uncontrolled manifold, UCM, normalizing flow, invertible neural networks
Abstract
Variability is intrinsic to human speech production. One approach to understanding variability in speech is to decompose it into task-irrelevant (“good”) and task-relevant (“bad”) parts with respect to speech tasks. Based on the uncontrolled manifold (UCM) approach, this dissertation investigates how vowel token-to-token variability in articulation and acoustics can be decomposed into “good” and “bad” parts, and how speaking rate changes the pattern of these two parts, using data from the Haskins IEEE rate comparison database. It further examines whether the “good” part of variability, or flexibility, can be modeled directly from speech data using the flow-based invertible neural network (FlowINN) framework. The application of the UCM analysis and the FlowINN modeling method is discussed, with a particular focus on how the “good” part of variability in speech can be useful rather than being disregarded as noise.
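The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes a linear task mapping given by a hypothetical Jacobian `J` from elemental variables (for example, articulator positions) to task variables (for example, formants), and it normalizes each variance component per degree of freedom and per token, which is one common convention among several. The function name `ucm_decompose` and its arguments are illustrative only.

```python
import numpy as np

def ucm_decompose(X, J, tol=1e-10):
    """Split token-to-token variance into UCM ("good") and orthogonal ("bad") parts.

    X : (n_tokens, n_elem) array of elemental variables, one row per token.
    J : (n_task, n_elem) Jacobian of an assumed linear task mapping.
    Returns (v_ucm, v_ort), each normalized per degree of freedom and per token.
    Assumes the null space of J is non-empty (n_elem > rank of J).
    """
    X = np.asarray(X, dtype=float)
    J = np.asarray(J, dtype=float)

    # Deviation of each token from the mean configuration.
    dev = X - X.mean(axis=0)

    # Rows of Vt form an orthonormal basis of the elemental space:
    # the first `rank` rows span the row space of J (task-relevant directions),
    # the remaining rows span its null space (the uncontrolled manifold).
    _, s, Vt = np.linalg.svd(J, full_matrices=True)
    rank = int(np.sum(s > tol))
    ort_basis = Vt[:rank].T
    ucm_basis = Vt[rank:].T

    n_tokens = X.shape[0]
    # Project deviations onto each subspace and average the squared lengths.
    v_ucm = np.sum((dev @ ucm_basis) ** 2) / (n_tokens * ucm_basis.shape[1])
    v_ort = np.sum((dev @ ort_basis) ** 2) / (n_tokens * rank)
    return v_ucm, v_ort

# Toy example: the task is the sum of two elemental variables, and the tokens
# vary only along the direction that leaves the sum unchanged.
J_demo = np.array([[1.0, 1.0]])
X_demo = np.array([[1.0, -1.0], [-1.0, 1.0], [2.0, -2.0], [-2.0, 2.0]])
v_ucm_demo, v_ort_demo = ucm_decompose(X_demo, J_demo)
```

In this toy case all variance falls within the null space of `J_demo`, so the task-relevant component is zero. With real speech data the mapping from articulation to acoustics is nonlinear, so a Jacobian would have to be estimated (for instance, by local linearization around the mean configuration), which this sketch does not attempt.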
Recommended Citation
Kang, Jaekoo, "The Effect of Speaking Rate on Vowel Variability Based on the Uncontrolled Manifold Approach and Flow-Based Invertible Neural Network Modeling" (2021). CUNY Academic Works.
https://academicworks.cuny.edu/gc_etds/4547