Dissertations, Theses, and Capstone Projects
Date of Degree
6-2022
Document Type
Thesis
Degree Name
M.A.
Program
Linguistics
Advisor
Kyle Gorman
Subject Categories
Computational Linguistics
Keywords
machine learning, sarcasm detection, logistic regression, svm, news headlines
Abstract
Humans produce and recognize sarcasm and indirect language with ease, but machines find them difficult to detect. Artificial intelligence can accurately analyze sentiment and emotion in speech and text, yet it may struggle with insincere and sardonic content. It is nonetheless possible to train a machine to identify spoken and written sarcasm. This paper aims to detect sarcasm using logistic regression and a support vector machine (SVM) and to compare their results to a baseline.
The models are trained on a Kaggle dataset of headlines from the satirical news website The Onion and the serious news website HuffPost (formerly The Huffington Post). The headlines cover politics, pop culture, and local news. Our findings indicate that logistic regression and the support vector classifier perform far better than the dummy classifier.
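The comparison described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the thesis's implementation, which is not given here. The headlines below are invented toy examples rather than rows from the Kaggle dataset, and the helper names (`featurize`, `train_linear`, `run_comparison`), the binary bag-of-words features, and the hyperparameters are all assumptions made for illustration. Both models are linear classifiers fit by stochastic gradient descent: logistic loss stands in for logistic regression, and hinge loss stands in for a linear SVM. The baseline always predicts the majority class of the training set, which is one common way to configure a dummy classifier.

```python
import math
from collections import Counter

# Toy headlines invented for this sketch; they are NOT taken from the Kaggle dataset.
# Label 1 = sarcastic, 0 = serious.
TRAIN = [
    ("area man heroically finishes entire sandwich", 1),
    ("nation shocked to learn monday follows sunday", 1),
    ("local cat declares itself mayor of living room", 1),
    ("area dog unsure why it barked", 1),
    ("senate passes budget bill after long debate", 0),
    ("city council approves new transit funding", 0),
    ("study finds exercise improves heart health", 0),
    ("governor signs education funding bill", 0),
    ("court upholds ruling on state election law", 0),
]
TEST = [
    ("area man heroically declares itself mayor", 1),
    ("senate approves new education bill", 0),
]

def featurize(text, vocab):
    """Binary bag-of-words vector over a fixed vocabulary."""
    tokens = set(text.split())
    return [1.0 if word in tokens else 0.0 for word in vocab]

def train_linear(X, y, loss, epochs=200, lr=0.1, l2=0.01):
    """Fit a linear classifier by SGD; loss is 'logistic' or 'hinge'."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            t = 1.0 if label == 1 else -1.0  # map {0, 1} to {-1, +1}
            margin = t * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if loss == "logistic":
                # Derivative of log(1 + exp(-t * z)) with respect to the score z.
                g = -t / (1.0 + math.exp(margin))
            else:
                # Subgradient of max(0, 1 - t * z) with respect to the score z.
                g = -t if margin < 1.0 else 0.0
            # L2 penalty applies to the weights only, not the bias.
            w = [wi - lr * (g * xi + l2 * wi) for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def accuracy(preds, gold):
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)

def run_comparison():
    vocab = sorted({tok for text, _ in TRAIN for tok in text.split()})
    X_train = [featurize(text, vocab) for text, _ in TRAIN]
    y_train = [label for _, label in TRAIN]
    X_test = [featurize(text, vocab) for text, _ in TEST]
    y_test = [label for _, label in TEST]

    # Baseline: always predict the most frequent training label.
    majority = Counter(y_train).most_common(1)[0][0]
    scores = {"baseline": accuracy([majority] * len(y_test), y_test)}

    for name, loss in [("logistic_regression", "logistic"), ("linear_svm", "hinge")]:
        model = train_linear(X_train, y_train, loss)
        scores[name] = accuracy([predict(model, x) for x in X_test], y_test)
    return scores

scores = run_comparison()
for name, acc in scores.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

With so few toy examples, the numbers printed here say nothing about performance on the actual dataset. The sketch only shows the shape of the comparison: two trained linear models evaluated against a baseline that ignores the text entirely.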
Recommended Citation
Novic, Lara I., "A Machine Learning Approach to Text-Based Sarcasm Detection" (2022). CUNY Academic Works.
https://academicworks.cuny.edu/gc_etds/4856