Dissertations, Theses, and Capstone Projects

Date of Degree


Document Type

Capstone Project

Degree Name



Data Analysis & Visualization


Howard T. Everson

Subject Categories

Data Science | Educational Assessment, Evaluation, and Research | Educational Methods


machine learning, math proficiency, multinomial logistic regression, SPSS Modeler


A principal goal of this project was to compare several machine learning (ML) algorithms to explore and validate math proficiency classifications based on standardized test scores. The data used in these analyses came from the 6th-grade mathematics assessment records of the New York State Education Department’s Testing Program (NYSTP). Our approach was to test a number of competing ML algorithms for classifying students as proficient based on their test scores and other demographic information. Our samples were drawn from the 2016 test-taking cohort of 6th-grade students (N=156,800). Five classifiers were used to establish the best predictive model: multinomial logistic regression (MLR), XGBoost, Tree-AS, Lagrangian support vector machine (LSVM), and the C5.0 decision tree algorithm. Experimental results demonstrated that multinomial logistic regression performed better than the other ML algorithms.

Capstone_Project.zip (27824 kB)
Zip file of the GitHub repository for the capstone project