Explainable Machine Learning for Loan Default Prediction: Enhancing Transparency in Banking

Authors

  • Amit Taneja, Lead Data Engineer, Mitchell Martin, USA

DOI:

https://doi.org/10.63282/3050-9416.IJAIBDCMS-V2I1P106

Keywords:

Explainable AI, Loan Default Prediction, Credit Risk, SHAP, LIME, XGBoost, Machine Learning, Financial Transparency, Banking Regulation

Abstract

Traditional credit risk assessment frameworks have changed considerably in recent years with the integration of Machine Learning (ML) into financial services, and predictive models have substantially improved the accuracy of loan default prediction. Nonetheless, the opacity of many ML models raises concerns about transparency, interpretability, and regulatory compliance. This paper reviews the application of Explainable Machine Learning (XML) techniques to loan default prediction, with the goal of balancing predictive power against explainability. Black-box models (XGBoost, Random Forest, and deep learning) are contrasted with inherently interpretable models such as Decision Trees and with post-hoc explanation methods, namely SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations). Model performance is analyzed on a real-world financial dataset from LendingClub (pre-2021). The paper highlights the importance of trust and transparency in the banking system, examines the challenges facing financial institutions, and discusses how explainable AI can foster customer trust, support the ethical training of financial institutions' AI systems, and ensure regulatory compliance. Visualizations, including heatmaps and feature-importance graphs, are used to draw clearer conclusions. We find that explainable models, although marginally less accurate, are crucial for understanding borrower behavior and the financial risk that borrowers represent. The findings therefore argue for incorporating XML into credit scoring pipelines to support responsible and ethical AI implementation.
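
To make the compared pipeline concrete, the sketch below trains an XGBoost classifier on a LendingClub-style feature table and produces SHAP explanations of its predictions. It is a minimal illustration only: the file name (lendingclub_pre2021_clean.csv), the feature columns, and the hyperparameters are assumptions for the sketch, not the paper's actual code, and a LIME explanation could be attached to the same fitted model in an analogous way.

```python
# Minimal sketch: black-box model (XGBoost) plus post-hoc SHAP explanations
# on a LendingClub-style loan table. File name, columns, and hyperparameters
# are illustrative placeholders, not the paper's exact pipeline.
import pandas as pd
import shap
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical preprocessed dataset: numeric features plus a binary
# default flag (1 = charged off / defaulted, 0 = fully paid).
df = pd.read_csv("lendingclub_pre2021_clean.csv")            # assumed file
features = ["loan_amnt", "int_rate", "annual_inc", "dti",
            "fico_range_low", "revol_util", "term_months"]   # illustrative
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["default"], test_size=0.2,
    stratify=df["default"], random_state=42)

# Black-box model with class weighting for the rare default class.
model = xgb.XGBClassifier(
    n_estimators=400, max_depth=4, learning_rate=0.05,
    scale_pos_weight=(y_train == 0).sum() / (y_train == 1).sum(),
    eval_metric="auc")
model.fit(X_train, y_train)
print("Test AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# Post-hoc explanation: TreeExplainer gives per-feature SHAP contributions
# for each applicant; summary_plot shows the global importance view.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)                       # global view
shap.force_plot(explainer.expected_value, shap_values[0],
                X_test.iloc[0], matplotlib=True)             # one applicant
```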

References

1. Thomas, L., Crook, J., & Edelman, D. (2017). Credit scoring and its applications. Society for Industrial and Applied Mathematics.

2. Hand, D. J., & Henley, W. E. (1997). Statistical classification methods in consumer credit scoring: a review. Journal of the Royal Statistical Society: Series A (Statistics in Society), 160(3), 523-541.

3. Abdou, H. A., & Pointon, J. (2011). Credit scoring, statistical techniques and evaluation criteria: a review of the literature. Intelligent Systems in Accounting, Finance and Management, 18(2-3), 59-88.

4. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016, August). "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135-1144).

5. Malekipirbazari, M., & Aksakalli, V. (2015). Risk assessment in social lending via random forests. Expert Systems with Applications, 42(10), 4621-4631.

6. Khandani, A. E., Kim, A. J., & Lo, A. W. (2010). Consumer credit-risk models via machine-learning algorithms. Journal of Banking & Finance, 34(11), 2767-2787.

7. Heaton, J. B., Polson, N. G., & Witte, J. H. (2016). Deep learning in finance. arXiv preprint arXiv:1602.06561.

8. Sirignano, J., & Cont, R. (2021). Universal features of price formation in financial markets: perspectives from deep learning. In Machine learning and AI in finance (pp. 5-15). Routledge.

9. Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30.

10. Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

11. Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., ... & Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82-115.

12. Das, A., & Rad, P. (2020). Opportunities and challenges in explainable artificial intelligence (XAI): A survey. arXiv preprint arXiv:2006.11371.

13. Gültekin, B., & Erdoğdu Şakar, B. (2018, July). Variable importance analysis in default prediction using machine learning techniques. In Proceedings of the 7th International Conference on Data Science, Technology and Applications (pp. 56-62).

14. Sofaer, H. R., Hoeting, J. A., & Jarnevich, C. S. (2019). The area under the precision‐recall curve is a performance metric for rare binary events. Methods in Ecology and Evolution, 10(4), 565-577.

15. Xu, J., Lu, Z., & Xie, Y. (2021). Loan default prediction of the Chinese P2P market: a machine learning methodology. Scientific Reports, 11(1), 18759.

16. Tiwari, A. K. (2018). Machine learning application in loan default prediction. JournalNX, 4(05), 1-5.

17. Lai, L. (2020, August). Loan default prediction with machine learning techniques. In 2020 International Conference on Computer Communication and Network Security (CCNS) (pp. 5-9). IEEE.

18. Pérez-Sánchez, B., Fontenla-Romero, O., & Guijarro-Berdiñas, B. (2018). A review of adaptive online learning for artificial neural networks. Artificial Intelligence Review, 49, 281-299.

19. Ribeiro, M. T., Singh, S., & Guestrin, C. (2018, April). Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI conference on artificial intelligence (Vol. 32, No. 1).

20. Cerda, P., Varoquaux, G., & Kégl, B. (2018). Similarity encoding for learning with dirty categorical variables. Machine Learning, 107(8), 1477-1494.

Published

2021-03-30

Issue

Vol. 2 No. 1 (2021)

Section

Articles

How to Cite

Taneja A. Explainable Machine Learning for Loan Default Prediction: Enhancing Transparency in Banking. IJAIBDCMS [Internet]. 2021 Mar. 30;2(1):57-65. Available from: https://ijaibdcms.org/index.php/ijaibdcms/article/view/200