Publications and Research

Publication Date

Spring 4-1-2024


This dataset accompanies a study of performance outcomes among students enrolled in two sections of an introductory statistics course at a community college in New York. The study, titled "Examining Differences in Performance Outcomes between Statistics Classes using High-coding vs. Low-coding Statistical Software Packages," examines how the choice of statistical software package (R, the high-coding option, or SPSS, the low-coding option) relates to student performance and motivation. The dataset contains participants' responses to five instruments: the Mathematics Motivation Questionnaire, Reading Comprehension Assessment, Algebra Assessment, Statistics Assessment, and Coding Assessment. One section used R for statistical analysis and the other used SPSS. Responses were collected in May 2022 from 14 students in the SPSS section and 4 students in the R section. The study found no evidence of a mean difference in statistical comprehension between the sections, higher motivation among students in the R section, slightly better performance on the Coding Assessment among students in the SPSS section, and a strong association between algebra knowledge and statistical comprehension. The dataset supports further analysis of how instructional approach and software choice relate to student outcomes in statistics and data science education.