Dissertations, Theses, and Capstone Projects

Date of Degree

9-2022

Document Type

Thesis

Degree Name

M.A.

Program

Linguistics

Advisor

Kyle Gorman

Subject Categories

Computational Linguistics

Keywords

BERT, discourse analysis, pragmatics, transformers, deep learning, classification, relation extraction, transfer learning

Abstract

Transfer learning has attracted considerable research attention in recent years. Pretrained language models have proven especially effective at producing high-quality neural networks capable of accurate inference after being transferred across domains. This study builds on the domain transfer work of Ferracane et al. (2019), which explores neural methods for transferring discourse parsing from a news source domain to a medical target domain, specifically from news articles to PubMed medical journal articles. The transfer learning experiments in the current work expand to three domains: Wall Street Journal articles previously annotated with Rhetorical Structure Theory (RST) relations, PubMed abstracts, and earnings calls transcripts. A BERT model pretrained on scientific text, SciBERT (Beltagy et al., 2019), is used. Experiments are conducted to fine-tune SciBERT on Wall Street Journal articles and earnings calls transcripts. The transcripts are annotated with the rstWeb tool (Zeldes, 2016) using a subset of RST labels marking relations between clauses. Results demonstrate that transfer learning between distinct domains remains extremely challenging. A novel BERT model pretrained on earnings calls data is introduced. In-domain training, in which the pretraining domain matches the domain of the fine-tuning data, yielded better results, and multiple avenues for innovation and improvement remain to explore.
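As a rough illustration of the kind of setup the abstract describes, the sketch below fine-tunes a SciBERT checkpoint as a clause-pair relation classifier with the Hugging Face transformers library. The checkpoint name, the label subset, and the example clauses are assumptions for illustration only; this is not the thesis implementation.

```python
# Minimal sketch (assumed setup, not the thesis code): SciBERT as a
# discourse-relation classifier over clause pairs via sequence-pair input.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "allenai/scibert_scivocab_uncased"  # public SciBERT checkpoint (Beltagy et al., 2019)
RELATIONS = ["elaboration", "attribution", "joint", "contrast"]  # hypothetical RST label subset

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=len(RELATIONS)
)

# Encode two clauses as a single sequence pair; the classification head
# scores the discourse relation holding between them.
clause_a = "Revenue grew 12 percent in the quarter,"
clause_b = "driven primarily by strong demand in the cloud segment."
inputs = tokenizer(clause_a, clause_b, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
print(RELATIONS[int(logits.argmax(dim=-1))])
```

In practice the classification head would be trained on the annotated clause pairs (e.g. from the Wall Street Journal or earnings calls annotations) before any such prediction is meaningful; the untrained head above produces arbitrary output.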
