Abstract
In modern cloud environments, efficient management of computational resources is a critical challenge because of the growing demand for scalable, high-performance applications. Horizontal scaling in Kubernetes (K8s) clusters is essential for dynamically adjusting resources to match workload demands. However, traditional autoscaling methods such as the K8s Horizontal Pod Autoscaler (HPA) are reactive, which often leads to inefficiencies under variable loads. This study therefore introduces ANN-HS, an adaptive approach to horizontal scaling in K8s clusters that uses Artificial Neural Networks (ANNs) for load forecasting. The proposed method aims to improve the efficiency of resource consumption and to optimize replica allocation compared with the standard HPA. By leveraging pre-trained regression models, ANN-HS dynamically adjusts resources to meet varying demands, adhering to latency requirements and improving overall system performance. Experimental results show that ANN-HS outperforms the HPA in resource utilization: it reduces CPU consumption by approximately 50% while maintaining Service Level Agreement (SLA) compliance with an average violation rate of less than 10%, and it reduces the number of replicas needed by 66.67% under varying load conditions. The approach offers a scalable and flexible solution for managing microservices in cloud environments and contributes to the advancement of intelligent resource management in cluster computing.
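The contrast between reactive and forecast-driven scaling can be illustrated with a minimal sketch. The first function follows the replica formula given in the Kubernetes HPA documentation, simplified here by omitting the tolerance band and stabilization window. The second function is a hypothetical proactive rule, not the ANN-HS implementation: it assumes that some forecasting model (for example, a pre-trained regression ANN) has already produced `predicted_load_rps`, and that `capacity_per_replica_rps`, `min_replicas`, and `max_replicas` are known. All of these names are illustrative assumptions.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    """Reactive rule from the Kubernetes HPA documentation (simplified):
    desired = ceil(current_replicas * current_metric / target_metric).
    It reacts to the metric that has already been observed."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(1, desired)


def forecast_desired_replicas(predicted_load_rps: float,
                              capacity_per_replica_rps: float,
                              min_replicas: int = 1,
                              max_replicas: int = 10) -> int:
    """Hypothetical proactive rule (an assumption for illustration):
    size the deployment for the load a forecasting model predicts,
    then clamp the result to the allowed replica range."""
    needed = math.ceil(predicted_load_rps / capacity_per_replica_rps)
    return min(max_replicas, max(min_replicas, needed))


if __name__ == "__main__":
    # Reactive: 2 replicas at 90% CPU with a 50% target.
    print(hpa_desired_replicas(2, 90.0, 50.0))
    # Proactive: forecast of 450 req/s, assuming 100 req/s per replica.
    print(forecast_desired_replicas(450.0, 100.0))
```

The reactive rule can only respond after the load has changed, whereas the forecast-based rule can provision replicas ahead of the predicted demand; how accurate that provisioning is depends entirely on the quality of the forecast.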













Data availability
The datasets generated and/or analysed during the current study are available in the Mendeley Data repository, https://data.mendeley.com/datasets/ks9vbv5pb2/1.
Materials availability
Not applicable
Code availability
Not applicable
Acknowledgements
The authors would like to express their gratitude to the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for providing financial support. This research was also made possible with the support of the InovAI Laboratory at UFRN and the Santa Cruz Campus of IFRN.
Funding
Not applicable
Author information
Contributions
All authors contributed, to varying degrees, to the quality of this work: L.M.D.d.S., P.V.A.A., S.N.S., and M.A.C.F. conceived the idea and experiments; L.M.D.d.S., P.V.A.A., S.N.S., and M.A.C.F. designed and performed the experiments; L.M.D.d.S., P.V.A.A., S.N.S., and M.A.C.F. analyzed the data; L.M.D.d.S., S.N.S., and M.A.C.F. wrote the paper; and L.M.D.d.S. and M.A.C.F. coordinated the project. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical approval
Not applicable
Consent to participate
Not applicable
Consent for publication
All authors agreed with the content and gave explicit consent to submit.
About this article
Cite this article
da Silva, L.M.D., Alves, P.V.A., Silva, S.N. et al. Adaptive horizontal scaling in kubernetes clusters with ANN-based load forecasting. Cluster Comput 28, 176 (2025). https://doi.org/10.1007/s10586-024-04887-5
DOI: https://doi.org/10.1007/s10586-024-04887-5