Date Approved
2025
Degree Type
Open Access Senior Honors Thesis
Department or School
Mathematics and Statistics
First Advisor
Andrew Ross, Ph.D.
Second Advisor
Mary-Elizabeth Murphy, Ph.D.
Third Advisor
Ann R. Eisenberg, Ph.D.
Abstract
Early initiation of substance use during adolescence poses significant risks to long-term health, educational attainment, and social outcomes, making early identification a critical public health priority. Machine learning models are increasingly used to predict substance use risk; however, many such models do not explicitly examine whether predictive performance differs across demographic groups. This senior project examines the fairness of logistic regression models used to predict first-time alcohol use among adolescents. Using nationally representative survey data from the Youth Risk Behavior Surveillance System (YRBSS), pooled across the 2017, 2019, 2021, and 2023 survey cycles, the study develops logistic regression models to estimate the likelihood of first-time alcohol use among middle school students. Model performance is evaluated using standard classification metrics, and fairness is assessed through group-level comparisons of predicted outcomes and error rates across demographic groups. The results indicate that although the baseline models achieve acceptable overall predictive performance, their predictions and error rates differ across demographic groups. These findings underscore the importance of evaluating fairness alongside accuracy when applying machine learning models to sensitive public health contexts such as adolescent substance use prediction.
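The group-level comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code: the function name `group_error_rates`, the group labels "A" and "B", and the toy labels and predictions are all hypothetical and do not come from YRBSS data. For each group it computes the share of positive predictions (selection rate), the false positive rate, and the false negative rate, which are one common way to compare predicted outcomes and error rates across groups.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return per-group selection rate, false positive rate, and false negative rate.

    y_true and y_pred are sequences of 0/1 labels; groups holds one group label
    per observation. All names here are illustrative assumptions.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            key = "tp" if pred == 1 else "fn"
        else:
            key = "fp" if pred == 1 else "tn"
        counts[group][key] += 1

    report = {}
    for group, c in counts.items():
        total = sum(c.values())
        positives = c["tp"] + c["fn"]  # observations with a true label of 1
        negatives = c["fp"] + c["tn"]  # observations with a true label of 0
        report[group] = {
            "selection_rate": (c["tp"] + c["fp"]) / total,
            "fpr": c["fp"] / negatives if negatives else float("nan"),
            "fnr": c["fn"] / positives if positives else float("nan"),
        }
    return report


# Hypothetical toy data: two groups of four observations each.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

report = group_error_rates(y_true, y_pred, groups)
for name, metrics in sorted(report.items()):
    print(name, metrics)
```

In this toy example the classifier misses half of the true positives in group A while flagging every member of group B, so an aggregate accuracy figure alone would hide the kind of disparity the abstract refers to.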
Recommended Citation
Nworgu, Stephanie, "Fairness-aware and culturally adaptive machine learning for predicting adolescent substance use" (2025). Senior Honors Theses and Projects. 880.
https://commons.emich.edu/honors/880