Dissertations, Theses, and Capstone Projects
Date of Degree
9-2025
Document Type
Doctoral Dissertation
Degree Name
Doctor of Philosophy
Program
Computer Science
Advisor
Shweta Jain
Committee Members
Sarah Ita Levitan
Liang Zhao
Adam Tashman
Subject Categories
Computer Engineering
Keywords
bias, propaganda, Large Language Models, LLM
Abstract
In the digital age, the swift consumption of news and the widespread dissemination of propaganda significantly threaten democratic principles and informed decision-making. News organizations, often politically aligned, tailor their content to appeal to specific audiences, thereby perpetuating biases. Despite advances in detecting propaganda through various computational techniques, including Large Language Models (LLMs) and BERT-based approaches, significant challenges persist concerning data availability and the effectiveness of detection tools. This thesis explores the landscape of text-based propaganda and reviews related research in textual analysis. It introduces a novel method for collecting a social-media-based dataset, which includes analysis of journalists' tweets to discern biases and propagandistic content. This method led to the creation of "Journalist Media Bias on X" (JMBX), a publicly available dataset designed to explore the relationship between propagandist tone in text and the bias of news outlets. Additionally, the thesis presents "Propasafe," a BERT-based browser extension designed to detect and alert users to propagandistic content in news articles. By flagging potential propaganda in real time, Propasafe enables users to critically assess news sources, fostering more objective and informed interactions with media. This tool also prioritizes user privacy while combating manipulative content. With the growing use of LLMs in annotation tasks, the thesis investigates their role and evaluates their performance on subjective tasks such as propaganda detection. These evaluations help illuminate the strengths and limitations of LLM-driven annotations, especially in complex, nuanced tasks that typically require human judgment. Through these contributions, this thesis offers a holistic approach to text-based propaganda detection, supporting researchers, end-users, and annotators in the development of efficient, privacy-preserving, and transparent systems.
Recommended Citation
Sharma, Vivek, "Text-Based Propaganda Detection" (2025). CUNY Academic Works.
https://academicworks.cuny.edu/gc_etds/6462
