Dissertations, Theses, and Capstone Projects
Date of Degree
2-2017
Document Type
Thesis
Degree Name
M.A.
Program
Linguistics
Advisor
William Sakas
Subject Categories
Computational Linguistics
Keywords
semantic search, information retrieval
Abstract
Many modern information retrieval systems rely on keyword search: they locate documents in an inverted index by matching the terms in a user’s query against the terms each document contains. While highly effective for many use cases, simple keyword-based search has one notable drawback: the contextual knowledge surrounding the user’s underlying information need may be lost, particularly if the query terms are ambiguous or have multiple meanings. Research in the field of semantic search aims to address this loss of context. One methodology in particular, explicit semantic analysis, models a document not only as the set of unique terms it contains but also as a set of concepts that describe it; these concepts are derived from an authoritative or curated source and assigned to each document in a collection. This paper presents a prototype information retrieval system called “ES-ESA”, which borrows the principles of explicit semantic analysis and implements them using the Elasticsearch framework. The ES-ESA system is qualitatively evaluated using a corpus of academic research abstracts.
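The idea of describing a document by both its terms and its assigned concepts can be illustrated with a minimal Python sketch. This is not the ES-ESA implementation and does not use Elasticsearch: the `Document` class, the `score` and `search` functions, the `concept_weight` parameter, the sample documents, and the concept labels are all hypothetical, and the scoring is a plain overlap count chosen only to show how concepts can separate two senses of an ambiguous query term.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document modeled as raw text plus a set of assigned concepts."""
    doc_id: str
    text: str
    concepts: set = field(default_factory=set)  # hypothetical concept labels

    def terms(self) -> set:
        # Lowercase, whitespace-split terms with surrounding punctuation removed.
        cleaned = {t.strip(".,;:!?\"'()").lower() for t in self.text.split()}
        return cleaned - {""}


def score(doc, query_terms, query_concepts, concept_weight=2.0):
    """Toy score: term overlap plus weighted concept overlap (an assumption)."""
    term_hits = len(doc.terms() & query_terms)
    concept_hits = len(doc.concepts & query_concepts)
    return term_hits + concept_weight * concept_hits


def search(docs, query, query_concepts=frozenset()):
    """Return ids of matching documents, best score first, ties by id."""
    query_terms = {t.lower() for t in query.split()}
    scored = [(score(d, query_terms, set(query_concepts)), d.doc_id) for d in docs]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in ranked if s > 0]


# Two invented documents that share the ambiguous term "python".
docs = [
    Document("d1", "Python is a popular programming language.",
             {"Programming language"}),
    Document("d2", "The python is a large constricting snake.",
             {"Snake"}),
]

keyword_only = search(docs, "python")
with_concepts = search(docs, "python", {"Snake"})
```

With keywords alone, both documents match "python" equally and the tie is broken only by id. Once the query also carries the hypothetical concept "Snake", the second document ranks first, which is the kind of disambiguation the abstract attributes to concept-based modeling.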
Recommended Citation
Sloan, Brian D., "ES-ESA: An Information Retrieval Prototype Using Explicit Semantic Analysis and Elasticsearch" (2017). CUNY Academic Works.
https://academicworks.cuny.edu/gc_etds/1869
Code repository for ES-ESA