Date of Degree


Document Type


Degree Name



Educational Psychology


Jay Verkuilen

Committee Members

Irvin Schonfeld

Renzo Bianchi

Howard Everson

Claire Wladis

Subject Categories

Applied Statistics | Occupational Health and Industrial Hygiene | Quantitative Psychology | Social Statistics | Statistical Methodology | Statistical Models


ordinality, monotonicity, clinical assessment, item response theory, PHQ-9, depression


Improper scale usage in psychological and clinical assessment is an important problem. If respondents do not use the scales in a consistent manner, the reliability of a composite is likely to be attenuated. This is especially problematic when specific items are singled out for special treatment or when subscales, not just a total score, are of interest. This study used both non-parametric and parametric item response theory (IRT) methods to gain further insight into the validity of the PHQ-9, a dual-purpose instrument that assesses the severity of depressive symptoms using nine Likert-scale items and allows the investigator to establish provisional diagnoses of depressive disorders. The data were collected by Bianchi et al. (2015) across three separate cross-cultural samples of teachers. The analysis indicated that scale monotonicity was preserved, that violations of ordinality occurred among a subset of items, resulting in inconsistent scale usage across the samples, and that language differences in the test administration primarily accounted for the differences in scale usage.