Prescriptive Learning Analytics for Student Dropout: Integrating Temporal Velocity and Counterfactual Explanations in Longitudinal Data

Hidayat, Nurul and Afuan, Lasmedi and Jannah, Helmi Roichatul (2026) Prescriptive Learning Analytics for Student Dropout: Integrating Temporal Velocity and Counterfactual Explanations in Longitudinal Data. Journal of Computing Theories and Applications, 3 (4). pp. 627-646. ISSN 3024-9104

15920-Article Text-57136-1-10-20260521.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Student dropout in higher education remains a persistent socioeconomic challenge, yet many predictive models reported in the literature are methodologically compromised by randomized cross-validation schemes that introduce temporal data leakage and artificially inflate predictive performance. This study proposes a longitudinal prescriptive learning analytics framework integrating three complementary methodological components: a Leave-One-Cohort-Out (LOCO) temporal validation protocol, a hybrid class-balancing strategy that combines the Synthetic Minority Oversampling Technique with Edited Nearest Neighbors (SMOTE-ENN), and temporal velocity feature engineering derived from Learning Management System (LMS) behavioral trajectories. The framework was evaluated on a longitudinal dataset comprising 464,739 enrollment records and 77 features. Five predictive algorithms (XGBoost, LightGBM, CatBoost, Random Forest, and Logistic Regression) were comparatively assessed on a strictly isolated blind holdout cohort (2022). CatBoost emerged as the champion estimator, achieving a PR-AUC of 0.8859, a Macro F1-Score of 0.9143, and the lowest Brier Score (0.0221), thereby demonstrating superior calibration and discriminative capability under severe class imbalance (93:7 ratio). Comprehensive ablation analysis revealed that temporal velocity features function not merely as additive predictors but as a structural prerequisite that enables SMOTE-ENN to generate high-quality synthetic boundary instances; removing these features reduced minority-class precision from 0.8302 to 0.6721. To operationalize predictive outputs into actionable intervention pathways, Diverse Counterfactual Explanations (DiCE) were implemented under a three-tier causal constraint architecture on 96 borderline high-risk students, generating 384 feasible intervention scenarios that exclusively target forward-looking behavioral velocity metrics without constraint violations. Collectively, these findings advance the paradigm of prescriptive learning analytics by providing educational institutions with interpretable risk diagnostics and operationally feasible intervention guidance grounded in empirically validated behavioral and temporal dynamics.
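
As a rough illustration of two ideas named in the abstract, the sketch below shows how a Leave-One-Cohort-Out split with a blind holdout cohort might be generated, and how a Brier score is computed. It is not the authors' implementation: the function names `loco_folds` and `brier_score`, the toy cohort labels, and the choice to train on every remaining cohort (rather than only on earlier ones) are assumptions made for this example.

```python
from collections import defaultdict


def loco_folds(cohorts, holdout=None):
    """Yield (left_out_cohort, train_idx, val_idx) for each development cohort.

    cohorts: one cohort label per record (for example, enrollment year).
    holdout: cohort label excluded from every fold as a blind test set.
    Assumption: all remaining cohorts are used for training in each fold.
    """
    by_cohort = defaultdict(list)
    for i, cohort in enumerate(cohorts):
        by_cohort[cohort].append(i)
    dev = sorted(c for c in by_cohort if c != holdout)
    for left_out in dev:
        train = [i for c in dev if c != left_out for i in by_cohort[c]]
        yield left_out, train, by_cohort[left_out]


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: seven records across four cohorts; 2022 is held out.
cohorts = [2019, 2019, 2020, 2020, 2021, 2022, 2022]
folds = list(loco_folds(cohorts, holdout=2022))
for left_out, train_idx, val_idx in folds:
    print(left_out, train_idx, val_idx)
```

Because every record of a cohort lands on the same side of the split, no fold mixes records from the validated cohort into training, and the holdout cohort never appears in any fold. A lower Brier score indicates better-calibrated probabilities, with 0 as the best possible value.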

Item Type: Article
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Depositing User: dl fts
Date Deposited: 22 May 2026 00:18
Last Modified: 22 May 2026 00:18
URI: https://dl.futuretechsci.org/id/eprint/185
