Kikunda, Philippe Boribo and Kasongo, Issa Tasho and Nsabimana, Thierry and Ndikumagenge, Jérémie and Ndayisaba, Longin and Mushengezi, Elie Zihindula and Kala, Jules Raymond (2025) Predicting First-Year Student Performance with SMOTE-Enhanced Stacking Ensemble and Association Rule Mining for University Success Profiling. Journal of Computing Theories and Applications, 3 (2). pp. 132-144. ISSN 3024-9104
14043-Article Text-51292-1-10-20250930.pdf - Published Version
Available under License Creative Commons Attribution.
Download (999kB) | Preview
Abstract
This study examines the application of Educational Data Mining (EDM) to predict the academic per-formance of first-year students at the Catholic University of Bukavu and the Higher Institute of Edu-cation (ISP) in the Democratic Republic of Congo. The primary objective is to develop a model that can identify at-risk students early, providing the university with a tool to enhance student support and academic guidance. To address the challenges posed by data imbalance (where successful cases outnumber failures), the study adopts a hybrid methodological approach. First, the SMOTE algorithm was applied to balance the dataset. Then, a stacking classification model was developed to combine the predictive power of multiple algorithms. The variables used for prediction include the National Exam score (PEx), the secondary school track (Humanities), and the type of prior institution (public, private, or religious-affiliated schools), as well as age and sex. The results demonstrate that this approach is highly effective. The model is not only capable of predicting success or failure but also of forecasting students' performance levels (e.g., honors or distinctions). Moreover, the use of the Apriori association rule mining algorithm allowed the identification of faculty-specific success profiles, transforming prediction into an interpretable decision-support tool. This research makes several significant contributions. Practically, it provides the University of Bukavu with a tool for student orientation and early risk detection. Methodologically, it illustrates the effectiveness of a combined approach to EDM in an African context. However, the study acknowledges certain limitations, including the non-public nature of the data and the geographical specificity of the sample. It therefore proposes avenues for future research, such as the integration of Explainable AI (XAI) techniques for more refined and transparent analysis of the results.
Item Type: | Article |
---|---|
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Depositing User: | dl fts |
Date Deposited: | 30 Sep 2025 15:07 |
Last Modified: | 30 Sep 2025 15:07 |
URI: | https://dl.futuretechsci.org/id/eprint/130 |