Date of Degree

6-2022

Document Type

Thesis

Degree Name

M.A.

Program

Linguistics

Advisor

Kyle Gorman

Subject Categories

Computational Linguistics

Keywords

machine learning, sarcasm detection, logistic regression, svm, news headlines

Abstract

Humans routinely produce and recognize sarcasm and indirect language, but machines find both difficult to detect. While artificial intelligence can accurately analyze sentiment and emotion in speech and text, it may struggle with insincere and sardonic content, although a machine can be trained to identify spoken and written sarcasm. This paper aims to detect sarcasm using logistic regression and a support vector machine (SVM) and to compare their results to a baseline.

The models are trained on a Kaggle dataset of headlines from the satirical news website The Onion and the serious news website HuffPost (formerly The Huffington Post). The headlines cover politics, pop culture, and local news. Our findings indicate that logistic regression and the support vector classifier perform far better than the baseline dummy classifier.
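
The comparison described above can be illustrated with a minimal sketch. It assumes scikit-learn, TF-IDF features, and a linear kernel for the SVM, none of which are stated in this abstract, and it uses a handful of invented headlines in place of the Kaggle dataset. The function names `build_models` and `evaluate` are hypothetical.

```python
from sklearn.dummy import DummyClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Invented toy headlines for illustration only (1 = sarcastic, 0 = serious).
# They are NOT taken from the Kaggle dataset used in the thesis.
train_texts = [
    "area man heroically finishes entire sandwich",
    "nation shocked to learn monday follows sunday",
    "local cat declares itself mayor of living room",
    "study finds sleeping helps people feel less tired",
    "senate passes budget bill after lengthy debate",
    "city council approves funding for new library",
    "storm causes power outages across the region",
    "company reports quarterly earnings above forecast",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

test_texts = [
    "area man shocked to learn cat is mayor",
    "nation declares sandwich a hero",
    "senate approves funding after storm",
    "city council reports budget forecast",
]
test_labels = [1, 1, 0, 0]


def build_models():
    """Return the three classifiers, each behind the same TF-IDF features.

    The feature choice and hyperparameters are assumptions for this sketch.
    """
    return {
        "baseline": make_pipeline(
            TfidfVectorizer(), DummyClassifier(strategy="most_frequent")
        ),
        "logistic_regression": make_pipeline(
            TfidfVectorizer(), LogisticRegression(max_iter=1000)
        ),
        "svm": make_pipeline(TfidfVectorizer(), SVC(kernel="linear")),
    }


def evaluate(models, x_train, y_train, x_test, y_test):
    """Fit each model on the training split and return held-out accuracy."""
    scores = {}
    for name, model in models.items():
        model.fit(x_train, y_train)
        scores[name] = model.score(x_test, y_test)
    return scores


scores = evaluate(build_models(), train_texts, train_labels, test_texts, test_labels)
for name, accuracy in scores.items():
    print(f"{name}: accuracy = {accuracy:.2f}")
```

Scores on a toy sample this small say nothing about the thesis results. The sketch only shows the shape of the comparison, with each model scored on the same held-out headlines as the dummy baseline.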
