Abstract:
Regional comprehensive universities offer accessible and diverse undergraduate programs while grappling with funding cuts and affordability concerns. Findings for the study’s first research question underscore the enduring importance of student characteristics, pre-college characteristics, and financial circumstances. In particular, high school GPA (HS GPA) plays a pivotal role in academic performance: higher HS GPAs correlate with stronger academic performance and a higher likelihood of retention, whereas lower HS GPAs are associated with academic struggles and a higher likelihood of departure. HS curriculum variables also affect academic performance, notably in the extreme gradient boosting (XGBoost) models. The second research question centers on the algorithms’ predictive power. XGBoost and random forest models consistently outperform the other models in predicting GPAs. For retention, where area under the curve (AUC) values are prioritized, the XGBoost and random forest models are statistically comparable, although both show low specificity rates. The upsampled ensemble learning models yielded only slight improvements in prediction.
Implications for practice underscore the importance of targeted interventions that leverage data science techniques and machine learning algorithms to identify at-risk students and allocate support resources to them. This research contributes significantly to the discussion on student success in higher education by providing practical insights and guiding evidence-based practices. As education evolves, integrating data science into strategic planning becomes pivotal for shaping the trajectory of student success initiatives.