Date of Award
Spring 6-2026
Document Type
Thesis
Degree Name
Master of Science (MS)
Department/Program
Digital Forensics and Cybersecurity
Language
English
First Advisor or Mentor
Hunter Johnson
Second Reader
Fatma Najar
Third Advisor
Shweta Jain
Abstract
The rapid adoption of Large Language Models (LLMs) in software development has transformed coding practices by enabling automated code generation, completion, and optimization. Despite these advantages, concerns persist regarding the security and reliability of LLM-generated code. This study evaluates both the functional correctness and the security of code produced by three prominent LLMs as of early 2026. A total of 4,800 code snippets were generated from 100 security-focused programming prompts derived from the OWASP Top 10:2025, each rendered in eight natural languages and two phrasing styles (literal and natural, developer-oriented). To assess performance, a multi-stage experimental framework was developed, incorporating automated code generation, syntax auditing, and cross-model evaluation. Each code sample was scored for correctness and security by LLM-based evaluators, and its vulnerabilities were further classified by OWASP risk category and Common Weakness Enumeration (CWE) root cause. The findings reveal a consistent and significant “security-correctness gap,” in which LLMs frequently produce functionally correct code that contains exploitable vulnerabilities. Statistical analysis indicates only a moderate correlation between correctness and security, reinforcing that functional success does not imply safety.
Recommended Citation
Gonzalez Ayala, Christopher Brian, "The Security of LLM-Generated Code" (2026). CUNY Academic Works.
https://academicworks.cuny.edu/jj_etds/398
Included in
Applied Linguistics Commons, Artificial Intelligence and Robotics Commons, Computational Linguistics Commons, Cybersecurity Commons, Data Science Commons, Information Security Commons, Programming Languages and Compilers Commons, Science and Technology Policy Commons, Science and Technology Studies Commons, Software Engineering Commons
