Using Oracle’s AI Vector Search to Enable Concept-Based Querying across Structured and Unstructured Data

Authors

  • Nagireddy Karri, Senior IT Administrator Database, Sherwin-Williams, USA
  • Partha Sarathi Reddy Pedda Muntala, Software Developer, Cisco Systems, Inc., USA

DOI:

https://doi.org/10.63282/3050-9416.IJAIBDCMS-V5I3P115

Keywords:

Oracle AI, Vector Search, Concept-Based Querying, Structured Data, Unstructured Data, Embeddings, Semantic Search, Knowledge Management

Abstract

The rapid growth of structured and unstructured data in enterprises has made it increasingly difficult to access information and manage knowledge. Traditional search systems that rely on keyword matching often fail to capture the semantic connections between different types of data, which leads to incomplete or irrelevant results. To address this shortcoming, Oracle has built an AI-driven vector search engine that uses deep learning embeddings to support concept-based querying. This paper discusses how Oracle AI Vector Search can be used in enterprise data ecosystems to improve retrieval accuracy, deepen semantic understanding, and connect structured and unstructured data. We present the underlying architecture, methodologies, and implementation strategies, and report experimental results showing improvement over traditional approaches. The approach can be applied to semantic search, similarity detection, and recommendation systems through richer semantic representations.

Published

2024-10-30

Section

Articles

How to Cite

1. Karri N, Pedda Muntala PSR. Using Oracle’s AI Vector Search to Enable Concept-Based Querying across Structured and Unstructured Data. IJAIBDCMS [Internet]. 2024 Oct. 30 [cited 2025 Oct. 29];5(3):145-54. Available from: https://ijaibdcms.org/index.php/ijaibdcms/article/view/280