Date of Degree

6-2022

Document Type

Thesis

Degree Name

M.A.

Program

Linguistics

Advisor

Kyle Gorman

Subject Categories

Linguistics

Keywords

Japanese, English, loanwords, transliteration, reversed

Abstract

We introduce the problem of gairaigo hanran ‘loanword flood’ in Japanese: English speakers who also communicate in Japanese often have difficulty understanding loanwords written in katakana, which motivates converting those loanwords back to their English source words, that is, reverse transliteration. We analyze the issues this task raises and propose computational methods to address them. We build our own katakana-English loanword dictionary as data and apply three computational models to the reverse transliteration task: a pair n-gram model, an LSTM model, and a transformer model. We also augment each of the three models with an English lexicon filter. The resulting six models are applied with two approaches: direct transliteration, where we transliterate directly from katakana or romaji to English; and indirect transliteration, where we use the English word’s pronunciation as an intermediate representation between the Japanese characters and the English spelling. The models with a lexicon filter outperformed the models without one, and the lowest word error rate was achieved by the pair n-gram model with a lexicon filter in direct transliteration from romaji to English.
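The lexicon filter and the word error rate mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names `lexicon_filter` and `word_error_rate`, the best-first candidate list, the fallback to the top candidate when no candidate is in the lexicon, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Set


def lexicon_filter(candidates: Sequence[str], lexicon: Set[str]) -> str:
    """Pick the highest-ranked candidate that is an attested English word.

    `candidates` is assumed to be sorted best-first, as an n-best list
    from a transliteration model would be. Falling back to the top
    candidate when none is in the lexicon is an assumption of this sketch.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    for candidate in candidates:
        if candidate.lower() in lexicon:
            return candidate
    return candidates[0]


def word_error_rate(predictions: Sequence[str], golds: Sequence[str]) -> float:
    """Fraction of words whose predicted spelling differs from the gold one."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not golds:
        raise ValueError("need at least one word to score")
    errors = sum(1 for p, g in zip(predictions, golds) if p != g)
    return errors / len(golds)


# Toy data, invented for illustration only.
toy_lexicon = {"computer", "table"}
toy_nbest = ["conputer", "computer", "komputer"]  # hypothetical model output
best = lexicon_filter(toy_nbest, toy_lexicon)
```

In this toy run, the unfiltered model would return its top hypothesis `conputer`, while the filter promotes the second hypothesis because it appears in the lexicon. Scoring filtered and unfiltered outputs with `word_error_rate` against the same gold spellings is one way to compare the two configurations.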

This work is embargoed and will be available for download on Friday, December 09, 2022


Included in

Linguistics Commons
