Dissertations, Theses, and Capstone Projects

Date of Degree


Document Type

Capstone Project

Degree Name



Program

Data Analysis & Visualization


Advisor

Timothy Shortell

Subject Categories

Categorical Data Analysis | Data Science | Geographic Information Sciences | Graphic Communications | Social and Cultural Anthropology | Social Influence and Political Communication | Social Media


Keywords

topic modeling, sentiment analysis, geographic analysis, data visualization, Twitter, New York City, Los Angeles


Abstract

As a source of social data, Twitter has been used to measure quality of life through sentiment analysis. This capstone project explores another methodological technique: querying Twitter data for a specific keyword to determine dominant topics, word patterns, and sentiment leanings in a geographical area. Focusing on New York City and Los Angeles for comparative analysis, the project uses the keyword “why” to build a Python analysis based on topic modeling and sentiment analysis. This approach reveals social and cultural differences, the overall sentiment of tweets, and the subjects of interest to tweeters.

GitHub repository for all project files: https://github.com/shewilliams/whynyc
Website: https://shewilliams.github.io/whynyc/


Online component: https://shewilliams.github.io/whynyc/

whynyc.zip (209024 kB)
File contains Python and website files

sheryl-williams-2022-capstone-project-20220531185505.warc (7824 kB)
Archived version of project website
