Comparative Analysis of Public Cloud Providers for Big Data Analytics: AWS, Azure, and Google Cloud

Authors

  • Naga Surya Teja Thallam, Senior Software Engineer at Salesforce

DOI:

https://doi.org/10.63282/3050-9416.IJAIBDCMS-V4I3P103

Keywords:

Big Data Analytics, Cloud Computing, Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), Performance Benchmarking, Cost Optimization, Security and Compliance, Multi-Cloud Strategies, Artificial Intelligence, Enterprise Data Processing

Abstract

The exponential growth of data in the digital era has pushed organizations to adopt cloud-based big data analytics solutions that offer scalable, cost-effective, and flexible computing infrastructure. Among the leading cloud service providers, Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP) each offer a broad range of big data analytics tools suited to different business and research needs. This study presents a comprehensive comparative analysis of AWS, Azure, and Google Cloud from a big data analytics perspective, examining their feature offerings, performance, pricing, security, and scalability. It combines qualitative and quantitative research methods, including literature reviews, experimental benchmarking, and case studies of real-world cloud adoption by large enterprises such as Netflix (AWS), BMW (Azure), and Spotify (Google Cloud). The findings indicate that each provider has distinct strengths: AWS excels at large-scale data processing and enterprise integration, Azure offers strong integration with Microsoft products and rich compliance frameworks, and Google Cloud leads in real-time data processing and AI-powered analytics. Framed this way, the research offers organizations practical insights for optimizing their cloud adoption strategy according to workload demands, cost efficiency, and security. It also highlights emerging trends such as hybrid and multi-cloud strategies, the sustainability of cloud computing, and AI-based security monitoring. The study concludes with recommendations to help enterprises, policymakers, and researchers choose the most appropriate cloud platform for big data analytics, and outlines future directions for improving cloud performance and cost efficiency.

Published

2023-09-16

Section

Articles

How to Cite

1. Teja Thallam NS. Comparative Analysis of Public Cloud Providers for Big Data Analytics: AWS, Azure, and Google Cloud. IJAIBDCMS [Internet]. 2023 Sep. 16 [cited 2025 Oct. 3];4(3):18-29. Available from: https://ijaibdcms.org/index.php/ijaibdcms/article/view/72