ML Algorithms that Dynamically Allocate CPU, Memory, and I/O Resources

Authors

  • Nagireddy Karri, Senior IT Administrator Database, Sherwin-Williams, USA

DOI:

https://doi.org/10.63282/3050-9416.IJAIBDCMS-V5I1P115

Keywords:

Reinforcement Learning, Predict-Then-Optimize, Kubernetes/cgroups, Tail Latency, Canary Rollout

Abstract

Modern cloud and edge platforms must allocate CPU, memory, and I/O under volatile, multi-tenant workloads without violating service-level objectives (SLOs), budgets, or energy constraints. This paper introduces a safety-first architecture that integrates telemetry-based prediction with policy-based decision making. We instrument hosts and services with fine-grained signals (CPU runnable depth and throttling, page-fault rate and cache residency, disk/NVMe queue depth and tail latency, NIC retransmits) and engineer cross-resource features to capture coupling effects. Building on this instrumentation, we consider three classes of methods: (i) supervised forecasters (gradient boosting, temporal CNN/Transformer) that predict short-horizon demand and SLO risk; (ii) constrained reinforcement learning that makes joint, continuous adjustments to CPU shares/quotas, memory limits/placement, and I/O priorities; and (iii) hybrid predict-then-optimize controllers with model-predictive guardrails. Canary-first rollout with bounded step sizes and automatic rollback ensures production stability. In trace-driven experiments on OLTP microservices, streaming analytics, storage-heavy phases, and ML inference pipelines, our approach reduces p99 latency and SLO misses by 20-45% relative to well-tuned rule/PID baselines, while raising throughput and slightly lowering cost. We discuss design choices such as uncertainty-aware headroom, fairness guarantees, and oscillation-damping parameters that turn raw gains into dependable results. Finally, we outline limitations (domain shift, data heterogeneity) and future directions in causal, carbon-aware, and geo-distributed allocation. The findings suggest that ML-driven, policy-bounded allocators can serve as practical control planes for dynamic, multi-resource management.
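As a rough illustration of the predict-then-optimize loop with guardrails described in the abstract, the following minimal sketch handles a single resource (a CPU quota in cores). It is not the paper's implementation: the names `Guardrails`, `forecast_demand`, `next_quota`, and `canary_decision`, the mean-plus-spread forecaster, and the 5% canary tolerance are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Guardrails:
    """Hypothetical policy bounds for one resource (CPU quota, in cores)."""
    min_quota: float        # lowest quota the policy allows
    max_quota: float        # highest quota the policy allows
    max_step: float         # largest change permitted per control interval
    headroom_sigmas: float  # multiples of forecast spread added as headroom


def forecast_demand(recent_usage):
    """Toy forecaster: mean and population std-dev of the recent window.

    Stands in for the supervised forecasters named in the abstract;
    it is an assumption, not the model the paper uses.
    """
    return mean(recent_usage), pstdev(recent_usage)


def next_quota(current_quota, recent_usage, g):
    """Predict demand, add uncertainty-aware headroom, then bound the move."""
    predicted, spread = forecast_demand(recent_usage)
    target = predicted + g.headroom_sigmas * spread
    # Limit the per-interval change so the allocation cannot jump or oscillate widely.
    step = max(-g.max_step, min(g.max_step, target - current_quota))
    # Clamp the result to the policy's absolute bounds.
    return max(g.min_quota, min(g.max_quota, current_quota + step))


def canary_decision(baseline_p99_ms, canary_p99_ms, tolerance=0.05):
    """Promote the new allocation only if canary p99 stays within tolerance."""
    if canary_p99_ms <= baseline_p99_ms * (1 + tolerance):
        return "promote"
    return "rollback"
```

With `Guardrails(min_quota=0.5, max_quota=8.0, max_step=0.5, headroom_sigmas=2.0)` and a recent window of `[2.0, 2.0, 4.0, 4.0]`, the headroom-adjusted target is 5.0 cores, but a service currently at 2.0 cores moves only to 2.5 in one interval. The step limit is what keeps a noisy forecast from translating directly into a large production change.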

Published

2024-03-30

Section

Articles

How to Cite

1. Karri N. ML Algorithms that Dynamically Allocate CPU, Memory, and I/O Resources. IJAIBDCMS [Internet]. 2024 Mar. 30 [cited 2025 Oct. 29];5(1):145-58. Available from: https://ijaibdcms.org/index.php/ijaibdcms/article/view/276