Date of Degree


Document Type

Capstone Project

Degree Name



Liberal Studies


Matthew Gold

Subject Categories

Data Science


Jeopardy, Topic Modeling, Data Visualization, Data Science, Trivia


The game show Jeopardy! has aired in its current iteration, hosted by Alex Trebek, since 1984. Over that time, it has accumulated a large body of data on clues, contestants, and possible winning strategies. Using a crowd-sourced archive called J! Archive, this project seeks to identify trends in the topics the game covers and to take a closer look at the performance of its contestants. It employs topic modeling, a text-analysis method, to organize the hundreds of thousands of archived clues, and statistical analysis to rate the performance of contestants by gender. Web-based visualization tools present the data in an interactive and understandable way. The main goals of this project are to explore the data, analyze a series of data points, create a robust database for the show, and connect Jeopardy! to a larger cultural, social, and political context. This groundwork will allow for further analysis and visualization in the future.


Online component:

jeopardy-20200906022924.warc (593 kB)
Website WARC file

jeopardy.sql (71990 kB)
Database backup

(18 kB)
Python scripts for scraping J! Archive

(202 kB)
HTML/CSS/JavaScript files for project website

Included in

Data Science Commons