William Sakas

Computational Linguistics | Discourse and Text Linguistics | Semantics and Pragmatics


event extraction, semantic parsing


While event extraction and automatic summarization have made great strides on news stories, fictional narratives such as fairy tales have not seen comparable progress. Literary elements in fairy tales, such as archaic expressions and sentence structures, pose challenges that do not arise in more straightforward natural-language corpora. To aid in the summarization of fictional texts, I created a class (a template for a digital object, in this case a semantic and story event) that captures elements predicted to help classify events as important enough to include in a summary. I wrote a processor that runs over a new corpus of fairy tales and parses them into events; these events were then to be manually annotated for importance and clustered to create a classifier that could be applied to novel texts. These last two steps could not be completed before publication of this paper, but my proposed approach is covered extensively, as is future work along the same lines. Supplementary materials accompanying this paper include the event class, the processor code, the full-text corpus, and the corpus as un-annotated event data.
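
As a rough illustration of what an event class and a processor of this kind might look like, the following Python sketch defines a minimal event record and a naive routine that turns a story into one event per sentence. Everything here is an assumption made for illustration: the names `StoryEvent` and `sentences_to_events`, the chosen fields, and the punctuation-based sentence splitter are hypothetical and are not taken from the supplementary materials, which use their own class and parsing method.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StoryEvent:
    """Hypothetical event record; the fields are illustrative assumptions."""
    sentence_index: int                      # position of the source sentence in the tale
    text: str                                # raw sentence text the event came from
    predicate: Optional[str] = None          # main verb, if a parser supplies one
    participants: List[str] = field(default_factory=list)  # entities involved
    important: Optional[bool] = None         # left empty until manual annotation


def sentences_to_events(story: str) -> List[StoryEvent]:
    """Naively create one un-annotated event per sentence.

    Splits on whitespace that follows '.', '!' or '?'. A real processor
    would use a semantic parser instead of this simple rule.
    """
    parts = re.split(r"(?<=[.!?])\s+", story.strip())
    return [StoryEvent(sentence_index=i, text=s) for i, s in enumerate(parts) if s]


if __name__ == "__main__":
    tale = "Once there was a king. He had three sons! Who would rule?"
    for event in sentences_to_events(tale):
        print(event.sentence_index, event.text, event.important)
```

The `important` field is left as `None` to mirror the un-annotated event data described above; an annotator would set it later, and the annotated events could then feed a clustering or classification step.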

