Serverless AI Architectures for Scalable and Cost-Efficient E-Commerce Platforms
DOI: https://doi.org/10.63282/3050-9416.IJAIBDCMS-V6I4P128

Keywords: Serverless Computing, Function-as-a-Service (FaaS), AI/ML Inference, E-commerce Architecture, Scalability, Cost Efficiency, Cold Start Mitigation, Composable Commerce, Denial-of-Wallet (DoW)

Abstract
The contemporary e-commerce landscape necessitates real-time adaptability and massive scalability, driving the migration from restrictive monolithic systems to modern, composable architectures. Serverless computing, particularly Function-as-a-Service (FaaS), provides the foundational paradigm for this transformation, enabling horizontal scaling and consumption-based billing for dynamic workloads. This paper rigorously examines the efficacy of serverless architectures for deploying resource-intensive Artificial Intelligence (AI) and Machine Learning (ML) inference tasks, which are crucial to customer satisfaction and market competitiveness in e-commerce. Architectural analysis reveals that the core benefit of serverless AI is conditional upon functional decomposition: transforming monolithic ML processes into parallel functions can reduce execution time by over 95% at comparable cost. This approach nevertheless introduces persistent challenges, including critical cost-performance trade-offs and degradation caused by cold start latency, particularly for large language models (LLMs). Advanced mitigation frameworks such as TIDAL address these challenges, reducing cold start latency by a factor of 1.79x to 2.11x for GPU-based LLMs. Finally, the paper discusses the specialized security protocols required to guard against financial risks unique to serverless systems, such as Denial-of-Wallet (DoW) attacks. The synthesized findings establish that serverless AI is a superior model for scalable e-commerce, provided organizations strategically address these architectural and operational complexities.
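To make the functional decomposition pattern concrete, the sketch below illustrates one way a monolithic batch-inference job can be fanned out across parallel FaaS invocations. It is a minimal illustration rather than the architecture evaluated in this paper: the function name product-recommender-infer, the shard size, and the choice of AWS Lambda via boto3 are assumptions made purely for the example.

    # Illustrative sketch: decomposing a monolithic batch-inference job into
    # parallel serverless invocations. The function name and shard size are
    # hypothetical; any FaaS provider with a synchronous invoke API would work.
    import json
    from concurrent.futures import ThreadPoolExecutor

    import boto3  # AWS SDK; used here only as one example of a FaaS client

    lambda_client = boto3.client("lambda")
    FUNCTION_NAME = "product-recommender-infer"  # hypothetical function name


    def invoke_shard(shard):
        """Synchronously invoke one function instance on a shard of records."""
        response = lambda_client.invoke(
            FunctionName=FUNCTION_NAME,
            InvocationType="RequestResponse",
            Payload=json.dumps({"records": shard}).encode("utf-8"),
        )
        return json.loads(response["Payload"].read())


    def parallel_inference(records, shard_size=100, max_workers=32):
        """Fan a large batch out across many short-lived function instances."""
        shards = [records[i:i + shard_size]
                  for i in range(0, len(records), shard_size)]
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(invoke_shard, shards))

Because each invocation scores only a small shard, wall-clock time approaches the duration of a single shard while total billed compute remains close to that of the original monolithic run; this is the mechanism behind the execution-time reductions reported above.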
References
1. Athreya, S., Kurian, S., Dange, A., & Bhatsangave, S. (2022). Implementation of Serverless E-Commerce Mobile Application. 2022 2nd International Conference on Intelligent Technologies (CONIT), IEEE.
2. Cui, W., Xu, Z., Zhao, H., Chen, Q., et al. (2025). Efficient Function-as-a-Service for Large Language Models with TIDAL. arXiv preprint, DOI: 10.48550/arXiv.2503.06421.
3. Daraojimba, A. I., Ogeawuchi, J. C., Abayomi, A. A., et al. (2021). Systematic Review of Serverless Architectures and Business Process Optimization. Iconic Research And Engineering Journals, 5(4), 284-309.
4. Mahmoudi, N., Lin, C., Khazaei, H., & Litoiu, M. (2019). Optimizing serverless computing: Introducing an adaptive function placement algorithm. Proceedings of the 29th Annual International Conference on Computer Science and Software Engineering (pp. 203-213).
5. Pinnapareddy, N. R. (2023). (Unspecified Title on FaaS Optimization). The American Journal of Engineering and Technology, 5(5).
6. Rane, N. L., Choudhary, S. P., & Rane, J. (2024). Artificial Intelligence and Machine Learning in Business Intelligence, Finance, and E-Commerce: A Review. SSRN AI Research Series.
7. Samea, F., Azam, F., Rashid, M., Anwar, M.W., Haider Butt, W., & Muzaffar, A.W. (2020). A model-driven framework for data-driven applications in serverless cloud computing.
8. Shafiei, H., Khonsari, A., & Mousavi, P. (2022). Serverless computing: a survey of opportunities, challenges, and applications. ACM Computing Surveys, 54(11s), 1-32.
9. (2025). Revolutionizing E-commerce with Serverless and Composable Architecture. European American Journals (EJCSIT), 13(26).
10. Proceedings of the 14th ACM International Conference on Distributed and Event-based Systems.
11. Software Architecture Patterns for Serverless Systems.
12. (2025). (Unspecified Title on Serverless Distributed Data Frame). arXiv preprint.
13. (Unspecified Authors). (2025). Scalable and Cost-Efficient ML Inference: Parallel Batch Processing with Serverless Functions. arXiv preprint.
14. (Unspecified Authors). (2020). QUART: Latency-Aware FaaS System for Pipelining Large Model Inference. Article, July 2020.
15. Zhang, C., Yu, M., Wang, W., & Yan, F. (2019). MArk: Exploiting cloud services for cost-effective, SLO-aware machine learning inference serving. 2019 USENIX Annual Technical Conference (USENIX ATC 19).
16. (Unspecified Authors). (2022). (Unspecified Title on Serverless ML Deployment). WJAETS.
17. (Unspecified Authors). (2025). (Unspecified Title on Anomaly Detection in Serverless Systems). arXiv preprint.
18. Assessing the Performance and Cost-Efficiency of Serverless Computing for Deploying and Scaling AI and ML Workloads in the Cloud. ResearchGate.
19. Ade, M., & Sheriffdeen, K. (2024). Evaluating the Trade-Offs: Cost vs. Performance in Serverless Computing for AI and ML Workload Deployment. ResearchGate.
20. Tari, M., Ghobaei-Arani, M., Pouramini, J., & Ghorbian, M. (2024). Auto-scaling mechanisms in serverless computing: A comprehensive review. Computer Science Review, 53, p.100650.
21. Tütüncüoğlu, F., & Dán, G. (2023). Joint resource management and pricing for task offloading in serverless edge computing. IEEE Transactions on Mobile Computing.
22. Vahidinia, P., Farahani, B., & Aliee, F.S. (2023). Mitigating cold start problem in serverless computing: A reinforcement learning approach. IEEE Internet of Things Journal, 10(5), 3917-3927.
23. Wu, S., Tao, Z., Fan, H., Huang, Z., Zhang, X., Jin, H., Yu, C., & Cao, C. (2022). Container lifecycle‐aware scheduling for serverless computing. Software: Practice and Experience, 52(2), 337-352.