Self-Healing Infrastructure - Predictive Automation for High-Availability Banking Systems
DOI:
https://doi.org/10.63282/3050-9416.IJAIBDCMS-V7I1P120
Keywords:
Self-Healing Infrastructure, Predictive Automation, Artificial Intelligence, Banking and Financial Systems, Cloud Governance, Operational Resilience, Security, Compliance
Abstract
Modern banking platforms rely increasingly on intricate, cloud-native architectures to provide real-time, always-on financial services. Infrastructure reliability has improved significantly, but operational downtime caused by unsafe changes, configuration drift, and human error remains a leading cause of service disruption. This paper presents a prevention-first, self-healing infrastructure paradigm based on predictive automation, built-in governance, and analytics-driven use of AI. Rather than concentrating on reactive remediation, the approach focuses on proactively averting high-risk system states before they materialize and affect customers.