Dissertations, Theses, and Capstone Projects

Date of Degree

6-2016

Document Type

Thesis

Degree Name

M.A.

Program

Linguistics

Advisor

William Sakas

Subject Categories

Computational Linguistics | Discourse and Text Linguistics | Semantics and Pragmatics

Keywords

event extraction, semantic parsing

Abstract

While event extraction and automatic summarization have taken great strides in the realm of news stories, fictional narratives like fairy tales have not been so fortunate. A number of challenges arise from the literary elements present in fairy tales that are not found in more straightforward corpora of natural language, such as archaic expressions and sentence structures. To aid in summarization of fictional texts, I created a class - a template for a digital object, in this case a semantic and story event - that captures elements predicted to help classify events as important for inclusion. I wrote a processor to run over a new corpus of fairy tales and parse them into events, which were then to be manually annotated for importance and clustered to create a classifier that could be used on novel texts. These latter two steps could not be completed before publication of this paper, but my proposed approach is covered extensively, as is future work along the same lines. Supplementary materials are included with this paper: the event class, the processor code, the corpus in full text, and the corpus as un-annotated event data.

StoryTime_Code.zip (1135 kB)
Event class, parser code, and readme file

StoryTime_Output.zip (7106 kB)
