Algorithmic Fairness in Student On-Time Graduation Prediction


Ayman Alfahid

Abstract

This study builds a fair and accurate algorithm to predict student on-time graduation. We examined the predictive power and fairness of three data sources: Admission, Academic, and a combination of the two. The results showed that the Academic data was the most effective predictor, while the Admission data recorded very poor performance with notable gender bias. The combined dataset produced results similar to the Academic data alone, indicating that the Admission data was largely redundant. Of the three models investigated (Logistic Regression, Random Forest, and XGBoost), Logistic Regression was selected because it performed comparably to the other models while offering simplicity, efficiency, and interpretability. To improve fairness, we implemented two separate strategies: "fairness through unawareness" and "fairness through awareness". The seemingly intuitive "fairness through unawareness" approach, which removes the sensitive feature (gender) from the model, not only failed to improve fairness but inadvertently exacerbated bias. In contrast, the "fairness through awareness" approach, implemented through threshold adjustments, significantly improved fairness without sacrificing model accuracy, challenging the long-held belief of an inherent trade-off between fairness and accuracy.
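As a rough illustration of the "fairness through awareness" strategy described above, the sketch below adjusts decision thresholds per gender group on top of a logistic regression classifier. It uses synthetic data, scikit-learn, and an equal-opportunity-style criterion (matching true-positive rates across groups); the feature names, target rate, and group-specific threshold search are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch of "fairness through awareness" via group-specific
# decision thresholds. Data, features, and the TPR target are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic "academic" features plus a binary sensitive attribute (gender).
n = 4000
gender = rng.integers(0, 2, size=n)              # 0 / 1, hypothetical encoding
gpa = rng.normal(3.0 - 0.1 * gender, 0.5, n)     # deliberately biased signal
credits = rng.normal(30, 5, n)
logits = 1.5 * (gpa - 3.0) + 0.05 * (credits - 30)
y = (logits + rng.normal(0, 0.5, n) > 0).astype(int)   # on-time graduation
X = np.column_stack([gpa, credits])

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, gender, test_size=0.3, random_state=0)

# Gender is visible to the decision rule ("awareness") but is not a model
# input here; only the classification threshold is group-specific.
clf = LogisticRegression().fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

def tpr(y_true, y_pred):
    pos = y_true == 1
    return y_pred[pos].mean() if pos.any() else 0.0

# Pick a per-group threshold so each group's true-positive rate lands as
# close as possible to a common target (equal-opportunity-style adjustment).
# In practice this search would use a held-out validation split.
target_tpr = 0.80
thresholds = {}
for grp in (0, 1):
    mask = g_te == grp
    best_t, best_gap = 0.5, np.inf
    for t in np.linspace(0.05, 0.95, 91):
        gap = abs(tpr(y_te[mask], (scores[mask] >= t).astype(int)) - target_tpr)
        if gap < best_gap:
            best_t, best_gap = t, gap
    thresholds[grp] = best_t

# Apply each record's group-specific threshold and report per-group TPR.
thr = np.where(g_te == 0, thresholds[0], thresholds[1])
y_hat = (scores >= thr).astype(int)
for grp in (0, 1):
    mask = g_te == grp
    print(f"group {grp}: threshold={thresholds[grp]:.2f}, "
          f"TPR={tpr(y_te[mask], y_hat[mask]):.2f}")
```

Because only the thresholds change, the underlying logistic regression scores (and hence overall ranking quality) are untouched, which is consistent with the abstract's finding that fairness improved without sacrificing accuracy.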
