Student Theses
Date of Award
Spring 5-28-2026
Document Type
Thesis
Language
English
First Advisor
Arthur J. O'Connor
Abstract
Forecasting ridership for new transit infrastructure is difficult in the absence of observed outcomes, particularly under domain shift between an existing system and a proposed corridor. This study develops a station-level direct demand modeling (DDM) framework to forecast average weekday ridership for the proposed Interborough Express (IBX) in New York City — a 14-mile circumferential rapid transit corridor connecting Brooklyn and Queens. The approach pairs unsupervised learning with supervised estimation in a common feature space defined by transit service, accessibility, and built-environment characteristics. K-means clustering identifies latent station typologies (node–place regimes), and IBX stations are projected into this topology to define a corridor-relevant training subset, a strategy termed cluster-based domain focusing. Forecasting models — ordinary least squares, regularized linear estimators (Ridge, Lasso, Elastic Net), and tree-based ensembles (Random Forest, Gradient Boosting, XGBoost) — are evaluated through holdout testing and cross-validation. Regularized linear models outperform nonlinear alternatives in both predictive accuracy and stability, with Ridge regression selected as the preferred specification (out-of-sample R² ≈ 0.72 in the IBX-restricted domain). The resulting baseline forecast is approximately 61,000 average weekday entries, substantially below the MTA’s official projection of 115,000. Cross-model agreement is high among regularized estimators, indicating that the dominant predictive signal is approximately linear in log-transformed space and that divergence from official projections arises primarily from scenario assumptions — induced demand, long-run growth, and full network reconfiguration — rather than model misspecification. The results demonstrate that feature-space–based domain restriction provides a principled, empirically grounded framework for forecasting transit ridership under missing outcomes.
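The cluster-based domain focusing strategy described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the data are synthetic, the function names (`kmeans`, `assign`, `ridge_fit`, `domain_focused_forecast`), the number of clusters, the penalty `alpha`, and the plain log transform of the target are all assumptions made for illustration, and NumPy is used here in place of whatever libraries the author actually used.

```python
import numpy as np

def assign(X, centroids):
    """Return the index of the nearest centroid for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm (illustrative, not tuned)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Keep the old centroid if a cluster happens to be empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def domain_focused_forecast(X_existing, y_existing, X_new, k=4, alpha=1.0):
    """Cluster existing stations, keep only the clusters that the proposed
    stations fall into, and fit ridge on log ridership within that subset."""
    mu, sigma = X_existing.mean(axis=0), X_existing.std(axis=0)
    Z_old, Z_new = (X_existing - mu) / sigma, (X_new - mu) / sigma
    centroids = kmeans(Z_old, k)
    old_labels = assign(Z_old, centroids)
    new_labels = assign(Z_new, centroids)  # project proposed stations
    mask = np.isin(old_labels, np.unique(new_labels))
    w, b = ridge_fit(Z_old[mask], np.log(y_existing[mask]), alpha)
    return np.exp(Z_new @ w + b), mask

# Synthetic stand-in data: 200 existing stations, 10 proposed, 3 features.
rng = np.random.default_rng(42)
X_existing = rng.normal(size=(200, 3))
y_existing = np.exp(
    7 + X_existing @ np.array([0.5, 0.3, -0.2]) + rng.normal(scale=0.1, size=200)
)
X_new = rng.normal(size=(10, 3))

station_forecasts, mask = domain_focused_forecast(X_existing, y_existing, X_new)
corridor_total = station_forecasts.sum()
```

Here `mask` marks the existing stations retained as the corridor-relevant training subset, and `corridor_total` plays the role of the corridor-level sum of station forecasts. The numbers it produces come from random data and have no relation to the figures reported in the abstract.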
Recommended Citation
Kassoh, Fomba, "Forecasting Interborough Express Ridership Using Network-Based Station Typologies and Direct Demand Models" (2026). CUNY Academic Works.
https://academicworks.cuny.edu/sps_etds/4

Comments
Master's Research Project (Capstone), M.S. Data Science Program, City University of New York.
Course: DATA 698 - Master's Research Project.
Faculty Advisor: Professor Arthur J. O'Connor
April 2026