Predictive Database Performance Optimization Using Machine Learning-Driven Query Workload Modeling
DOI:
https://doi.org/10.63282/3050-9416.IJAIBDCMS-V2I4P111
Keywords:
Predictive Analytics, Database Performance Optimization, Machine Learning, Query Workload Modeling, Resource Allocation, OLTP, OLAP Systems
Abstract
Modern database systems must run highly dynamic and heterogeneous workloads, under which traditional rule-based and cost-based optimization methods often fail to sustain steady performance. This paper proposes a predictive database performance optimization framework built on machine learning-based query workload modeling. The method uses historical execution records, structural characteristics of queries, and runtime resource characteristics to build forecasting models that predict query latency and system resource consumption. The framework combines feature extraction, workload classification, temporal workload pattern learning, and supervised machine learning models to predict performance bottlenecks before they occur. By capturing complex nonlinear relationships between query properties and runtime performance, the system supports proactive optimization techniques such as adaptive indexing, memory configuration, and execution plan optimization. Hybrid modeling methods that combine plan-level and operator-level modeling further improve prediction accuracy and generalization across workloads. Experimental evaluation on benchmark workloads measures the improvements in prediction accuracy and query execution performance relative to traditional cost-based optimizers. The results show reduced query latency, higher throughput, and better resource efficiency under dynamic workload conditions. The proposed framework is a step toward self-optimizing database systems that can adapt continuously in cloud and distributed environments.
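The core idea of the abstract, predicting latency from structural query features and historical execution records, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the feature set (join, predicate, and aggregate counts plus a grouping/sorting flag), the choice of k-nearest-neighbor regression as the supervised model, the hypothetical execution log `HISTORY`, and the 50 ms threshold used to flag a query for proactive tuning.

```python
import math
import re


def extract_features(sql):
    """Structural features (assumed set): joins, predicates, aggregates, GROUP/ORDER BY flag."""
    s = sql.upper()
    n_joins = len(re.findall(r"\bJOIN\b", s))
    n_predicates = len(re.findall(r"\b(?:WHERE|AND|OR)\b", s))
    n_aggregates = len(re.findall(r"\b(?:COUNT|SUM|AVG|MIN|MAX)\s*\(", s))
    group_or_sort = 1.0 if re.search(r"\b(?:GROUP|ORDER)\s+BY\b", s) else 0.0
    return [float(n_joins), float(n_predicates), float(n_aggregates), group_or_sort]


def predict_latency(history, sql, k=2):
    """k-NN regression: mean latency of the k historical queries closest in feature space."""
    target = extract_features(sql)
    ranked = sorted(history, key=lambda rec: math.dist(extract_features(rec[0]), target))
    nearest = ranked[:k]
    return sum(latency for _, latency in nearest) / len(nearest)


def needs_tuning(history, sql, threshold_ms=50.0):
    """Flag a query for proactive optimization if its predicted latency exceeds the threshold."""
    return predict_latency(history, sql) > threshold_ms


# Hypothetical execution log: (query text, observed latency in milliseconds).
HISTORY = [
    ("SELECT * FROM orders WHERE id = 1", 2.0),
    ("SELECT * FROM users WHERE email = 'a' AND active = 1", 3.0),
    ("SELECT c.id, SUM(o.total) FROM customers c "
     "JOIN orders o ON o.cid = c.id GROUP BY c.id", 120.0),
    ("SELECT p.name, COUNT(*) FROM products p JOIN items i ON i.pid = p.id "
     "JOIN orders o ON o.id = i.oid WHERE o.year = 2021 "
     "GROUP BY p.name ORDER BY p.name", 200.0),
]

if __name__ == "__main__":
    point_lookup = "SELECT * FROM accounts WHERE id = 7"
    report = ("SELECT r.name, AVG(s.amount) FROM regions r "
              "JOIN sales s ON s.rid = r.id GROUP BY r.name")
    for query in (point_lookup, report):
        print(predict_latency(HISTORY, query), needs_tuning(HISTORY, query))
```

A production system would likely replace the regular-expression features with features taken from the optimizer's execution plan and use a model that captures nonlinear interactions, but the flow is the same: extract features, predict from history, and act before the bottleneck occurs.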