Date of Degree

2-2023

Document Type

Master's Thesis

Degree Name

Master of Arts

Program

Linguistics

Advisor

Kyle Gorman

Subject Categories

Linguistics

Keywords

Japanese, homograph, TTS

Abstract

Japanese writing is a complex system, and much of that complexity comes from the use of kanji. A single kanji character in modern Japanese may have multiple pronunciations, either as native vocabulary or as words borrowed from Chinese. This poses a problem for text-to-speech synthesis (TTS), because the system has to predict which pronunciation of each kanji character is appropriate in context. This problem is called homograph disambiguation, and choosing the right reading is a central challenge for Japanese TTS. To address it, this research provides a new annotated data set of Japanese single-kanji pronunciations and describes an experiment using a logistic regression (LR) classifier. A baseline is computed for comparison with the LR classifier's accuracy, and the LR classifier improves on the baseline by 16%. This is the first experimental study of homograph disambiguation for single kanji characters in Japanese. The annotated data is freely released to the public to support further work.
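
The comparison described in the abstract, a baseline versus a logistic regression classifier that picks a reading from context, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the fragments are hand-written toy examples and not the thesis data set, the features (the characters immediately before and after the kanji) are not necessarily those used in the thesis, the majority-reading baseline is one common choice and may differ from the thesis baseline, and the kanji 人 is reduced to two readings (hito, jin) although it also has the reading nin.

```python
import math
from collections import Counter

TARGET = "人"  # illustrative homograph: read "hito" (native) or "jin" (Sino-Japanese)

# Toy, hand-written fragments (NOT the thesis data set).
TRAIN = [
    ("あの人は", "hito"),
    ("若い人が", "hito"),
    ("その人を", "hito"),
    ("日本人は", "jin"),
    ("外国人が", "jin"),
]
TEST = [
    ("この人は", "hito"),
    ("中国人が", "jin"),
]


def features(text):
    """Bias plus the characters immediately before and after the target kanji."""
    i = text.index(TARGET)
    prev_ch = text[i - 1] if i > 0 else "<s>"
    next_ch = text[i + 1] if i + 1 < len(text) else "</s>"
    return ["bias", "prev=" + prev_ch, "next=" + next_ch]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(data, positive, epochs=500, lr=0.1):
    """Binary logistic regression fitted by full-batch gradient descent."""
    weights = Counter()
    for _ in range(epochs):
        grad = Counter()
        for text, reading in data:
            feats = features(text)
            p = sigmoid(sum(weights[f] for f in feats))
            err = p - (1.0 if reading == positive else 0.0)
            for f in feats:
                grad[f] += err
        for f, g in grad.items():
            weights[f] -= lr * g
    return weights


def predict(weights, text, positive, negative):
    """Return the positive reading when the linear score is above zero."""
    score = sum(weights[f] for f in features(text))
    return positive if score > 0 else negative


def accuracy(predictions, data):
    correct = sum(pred == gold for pred, (_, gold) in zip(predictions, data))
    return correct / len(data)


# Baseline: always predict the most frequent reading seen in training.
majority = Counter(reading for _, reading in TRAIN).most_common(1)[0][0]
baseline_acc = accuracy([majority for _ in TEST], TEST)

# Logistic regression over the context features.
weights = train_lr(TRAIN, positive="jin")
lr_preds = [predict(weights, text, "jin", "hito") for text, _ in TEST]
lr_acc = accuracy(lr_preds, TEST)

print("baseline accuracy:", baseline_acc)
print("LR accuracy:", lr_acc)
```

On this toy data the preceding character alone separates the two readings, so the numbers say nothing about the results reported in the thesis; the sketch only shows the shape of the comparison.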

Recommended Citation

Zhang, Wen, "Pronunciation Ambiguities in Japanese Kanji" (2023). CUNY Academic Works.
https://academicworks.cuny.edu/gc_etds/5243

Included in

Linguistics Commons

Dissertations, Theses, and Capstone Projects

Pronunciation Ambiguities in Japanese Kanji
