Date of Degree

9-2020

Document Type

Thesis

Degree Name

M.A.

Program

Linguistics

Advisor

Kyle Gorman

Subject Categories

Computational Linguistics

Keywords

natural language processing, machine learning

Abstract

This thesis presents experiments that use representation learning to explore how neural networks learn. Neural networks that take text as input create internal representations of the text during training. Recent work has found that these representations can be used to perform downstream linguistic tasks, such as part-of-speech (POS) tagging. This demonstrates that the networks learn linguistic information and store it in the representations. We focus on the representations created by neural machine translation (NMT) models and whether they can be used for POS tagging. We train five NMT models, including an auto-encoder. We extract the encoder from each model and use the representations it produces to train a hand-crafted Encoder-Tagger (ET) model to perform POS tagging. We explore the impact of various features, including NMT target language, NMT BLEU score, encoder depth, sequence length, token frequency, and percentage of out-of-vocabulary (OOV) tokens in a sequence. We find that NMT encoder representations contain sufficient linguistic information to perform POS tagging and that there are correlations between several features, which helps us better understand the inner workings of neural networks.

Recommended Citation

Campbell, Emily, "Does the Word 'Chien' Bark? Representation Learning in Neural Machine Translation Encoders" (2020). CUNY Academic Works.
https://academicworks.cuny.edu/gc_etds/4055

Included in

Computational Linguistics Commons

CUNY Academic Works

Dissertations, Theses, and Capstone Projects

Does the Word "Chien" Bark? Representation Learning in Neural Machine Translation Encoders
