Real-Time Data Engineering for Large-Scale Supply Chain Network Optimization: A Framework for Petabyte-Scale Analytics

Authors

  • Uday Dhembare, Data Engineering Manager, Supply Chain Analytics, Bellevue, WA, USA

DOI:

https://doi.org/10.63282/3050-9416.IJAIBDCMS-V7I1P128

Keywords:

Data Engineering, Supply Chain Analytics, Network Optimization, Real-Time Processing, Big Data, Transportation Networks, Fulfillment Systems, Operations Research, Distributed Computing, Supply Chain Optimization, Data Pipeline Architecture, ETL Processes, Scalable Systems, Data Infrastructure, Decision Support Systems

Abstract

Modern supply chain networks generate unprecedented volumes of operational data that, when properly analyzed, can reveal significant optimization opportunities across diverse industry sectors. This paper presents a comprehensive framework for implementing real-time data engineering solutions that enable large-scale supply chain network optimization, capable of processing petabyte-scale datasets within operational time constraints. The proposed three-layer architecture integrates data processing, candidate analysis, and aggregation components to enable data-driven network design decisions across retail, manufacturing, healthcare, and logistics industries. Through systematic implementation of distributed computing architectures, real-time processing paradigms, and advanced analytics methodologies, the framework demonstrates measurable improvements including 15-35% reduction in transportation costs, 20-40% improvement in service levels, and achievement of 90% automation in previously manual processes. The solution addresses critical challenges in supply chain visibility, real-time decision-making, and network reassignment analysis while maintaining scalability across diverse operational environments. This research contributes to the growing body of knowledge in supply chain analytics by bridging the gap between theoretical optimization models and practical implementation at enterprise scale, establishing design principles for scalable supply chain optimization systems that transform traditional reactive approaches into proactive, evidence-driven strategies.
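The three-layer flow named in the abstract (data processing, candidate analysis, aggregation) can be illustrated with a minimal single-process sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Shipment` record, the `LANE_COST` table, the facility and destination codes, and the function names are all hypothetical, and a production system at petabyte scale would run each layer on a distributed engine rather than on in-memory generators.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Shipment:
    shipment_id: str
    origin: str        # facility currently serving the shipment
    destination: str
    units: int


# Hypothetical per-unit transport cost for each (facility, destination) lane.
LANE_COST = {
    ("FC1", "SEA"): 4.0,
    ("FC2", "SEA"): 2.5,
    ("FC1", "PDX"): 3.0,
    ("FC2", "PDX"): 3.5,
}


def process(raw_records):
    """Layer 1 (data processing): drop malformed rows and build typed records."""
    for rec in raw_records:
        lane = (rec.get("origin"), rec.get("destination"))
        if rec.get("units", 0) > 0 and lane in LANE_COST:
            yield Shipment(rec["id"], rec["origin"], rec["destination"], rec["units"])


def analyze_candidates(shipments, facilities):
    """Layer 2 (candidate analysis): saving from moving each shipment to another facility."""
    for s in shipments:
        current = LANE_COST[(s.origin, s.destination)]
        for f in facilities:
            if f == s.origin or (f, s.destination) not in LANE_COST:
                continue
            saving = (current - LANE_COST[(f, s.destination)]) * s.units
            yield s.origin, f, s.destination, saving


def aggregate(candidates):
    """Layer 3 (aggregation): total saving per (from, to, destination) reassignment."""
    totals = defaultdict(float)
    for origin, facility, destination, saving in candidates:
        totals[(origin, facility, destination)] += saving
    return dict(totals)


raw = [
    {"id": "s1", "origin": "FC1", "destination": "SEA", "units": 10},
    {"id": "s2", "origin": "FC1", "destination": "SEA", "units": 4},
    {"id": "s3", "origin": "FC2", "destination": "PDX", "units": 6},
    {"id": "s4", "origin": "FC1", "destination": "SEA", "units": 0},  # dropped: no units
]
result = aggregate(analyze_candidates(process(raw), ["FC1", "FC2"]))
```

Because each layer is a generator that consumes the previous one, records stream through without being materialized in full, which loosely mirrors how the layers could be chained in a streaming pipeline. A positive total in `result` marks a reassignment that would lower transport cost under the assumed lane costs.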

Published

2026-02-27

How to Cite

1. Dhembare U. Real-Time Data Engineering for Large-Scale Supply Chain Network Optimization: A Framework for Petabyte-Scale Analytics. IJAIBDCMS [Internet]. 2026 Feb. 27 [cited 2026 Mar. 15];7(1):179-85. Available from: https://ijaibdcms.org/index.php/ijaibdcms/article/view/469