Log-Transformed Regime-Based Prediction of Cloud Job Length Using Machine Learning

Pujiyanta, Ardi and Robiin, Bambang and Rahani, Faisal Fajri (2026) Log-Transformed Regime-Based Prediction of Cloud Job Length Using Machine Learning. Journal of Computing Theories and Applications, 3 (4). pp. 487-501. ISSN 3024-9104

15866-Article Text-55970-1-10-20260421.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Cloud job-length prediction remains challenging when the target distribution is highly skewed and contains rare extreme values. This study proposes a log-transformed, regime-based machine learning framework for robust prediction of cloud job length, represented in million instructions (MI). The approach integrates sequential feature engineering, logarithmic target transformation, weighted learning, and regime-aware modeling to distinguish between normal and extreme job-length behavior. Using an ordered GoCJ-derived cloud job-length sequence of 1000 jobs, the dataset exhibits a heavy-tailed distribution, with a mean of 129,662 MI, a median of 93,000 MI, a 95th percentile of 525,000 MI, a 99th percentile of 900,000 MI, and a skewness of 3.695. The proposed model is evaluated against sequential baselines and stronger machine learning baselines, including Naive_Last, RollingMean_5, Global_Log_ExtraTrees, RandomForest, GradientBoosting, and MLP_Log. On the main test split, the proposed Regime_Log_ExtraTrees achieved the best RMSE of 206,255.66 and the least negative R² of −0.01062, while Global_Log_ExtraTrees remained competitive in terms of MAE, MedAE, and RMSLE. Additional walk-forward validation confirms that the regime-aware model consistently achieves the best mean RMSE and mean R² across temporal folds. Ablation results further show that regime-aware learning is the primary contributor to robustness, although accurate prediction of extreme jobs remains challenging. These findings indicate that log-transformed, regime-based learning provides a practical and more robust strategy for cloud job-length prediction under heavy-tailed workload conditions.
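The abstract names several building blocks without defining them: a logarithmic target transformation, the Naive_Last and RollingMean_5 sequential baselines, a split between normal and extreme regimes, and the RMSE, MAE, MedAE, RMSLE and R² metrics. The sketch below shows one plausible reading of each, using NumPy only. It is illustrative and not the authors' implementation: the function names, the use of log1p/expm1, and the 95th-percentile regime threshold are assumptions, since the abstract does not state how the regimes or the transform are defined.

```python
import numpy as np

def to_log(y):
    """Compress a heavy-tailed target; log1p is an assumed choice of transform."""
    return np.log1p(np.asarray(y, dtype=float))

def from_log(z):
    """Invert to_log so predictions are reported in the original MI scale."""
    return np.expm1(np.asarray(z, dtype=float))

def naive_last(y):
    """Baseline: predict each job as the previous job (targets are y[1:])."""
    return np.asarray(y, dtype=float)[:-1]

def rolling_mean(y, window=5):
    """Baseline: predict y[t] as the mean of the previous `window` jobs
    (targets are y[window:])."""
    y = np.asarray(y, dtype=float)
    csum = np.cumsum(np.insert(y, 0, 0.0))
    means = (csum[window:] - csum[:-window]) / window
    return means[:-1]  # the final window has no following job to predict

def extreme_mask(y_train, y, q=0.95):
    """Flag jobs above the q-quantile of the training data as 'extreme'.
    The threshold q is a hypothetical value chosen for illustration."""
    threshold = np.quantile(np.asarray(y_train, dtype=float), q)
    return np.asarray(y, dtype=float) > threshold

def regression_metrics(y_true, y_pred):
    """Standard definitions of the metrics named in the abstract.
    RMSLE assumes non-negative targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    log_err = np.log1p(y_pred) - np.log1p(y_true)
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        "MedAE": float(np.median(np.abs(err))),
        "RMSLE": float(np.sqrt(np.mean(log_err ** 2))),
        "R2": float(1.0 - ss_res / ss_tot),
    }
```

Under these definitions, a model that always predicts the test-set mean scores an R² of exactly 0, so a slightly negative value such as the reported −0.01062 corresponds to a squared error marginally larger than that of the constant-mean predictor. Because RMSE squares each error, a few extreme jobs can dominate it while leaving MAE and MedAE largely unaffected, which is consistent with different models leading on different metrics.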

Item Type: Article
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Depositing User: dl fts
Date Deposited: 21 Apr 2026 06:10
Last Modified: 21 Apr 2026 06:10
URI: https://dl.futuretechsci.org/id/eprint/176
