Date of Degree


Document Type

Capstone Project

Degree Name



Data Analysis & Visualization


Matthew Gold

Subject Categories

Computational Linguistics | Digital Humanities | Library and Information Science | Science and Technology Policy | Science and Technology Studies


Small Business Innovation Research, grants data, dictionary-based keyword extraction, economic geography, geospatial analysis, scientometrics


The Public Innovations Explorer is a web-based tool built with Node.js, D3.js and Leaflet.js for investigating awards made between 2008 and 2018 by Federal agencies and departments participating in the Small Business Innovation Research (SBIR) and Small Business Technology Transfer (STTR) grant-making programs. By geocoding publicly available grants data, the Public Innovations Explorer allows users to identify companies performing publicly funded innovative research in each congressional district and to obtain dynamic district-level summaries of funding activity by agency and year. Applying spatial clustering techniques to districts' employment levels across major economic sectors gives users a way to examine patterns in the underlying economic activity of districts alongside the Federally funded innovation research taking place there. Finally, mathematical and dictionary-based text-mining techniques are used to derive district-level keyword details and give users access to basic keyword statistics for each district. Among other sources, the Explorer uses vocabularies from the European Commission, the United Nations and the Leibniz Information Centre for Economics, and builds on the National Institutes of Health Office of Portfolio Analysis’s NLPre pipeline, available on GitHub, to index keywords extracted from the text of grant records. The project seeks to contribute to work in research fields such as scientometrics and economic geography, and in the nonprofit and philanthropy sector, by developing and documenting data processing techniques and a user interface fit for exploring geographic and thematic trends across grant datasets.

public-innovations-explorer-20210514140239.warc (44107 kB)
Archived website as a WARC file, created using – web archive player available at
(391617 kB)
Contents of the GitHub repository containing all scripts, project data and artifacts as of May 14, 2021