Querying DSpace: An AI-Powered Conversation Application using RAG with LangChain
Authors/Creators
1. University of Oklahoma, United States of America
2. University of Oregon, United States of America
Description
AI has the potential to significantly impact the open access repository development landscape in various ways, such as enabling better search and content recommendation, identifying new patterns in scholarly content, and promoting openness in datasets and content. Large language models (LLMs) have emerged as crucial and widely used resources in natural language processing, a subfield of artificial intelligence (AI) that shares common ground with machine learning (ML). LLMs allow computers to comprehend and produce text in a manner that resembles human communication.
Our goal during the experiment was to create a conversational application that integrates OpenAI to query DSpace using natural language processing (NLP). We explored technologies such as LLMs, the OpenAI API, LangChain, embeddings, and vector stores. LLMs are deep learning models trained on large datasets. The OpenAI API provides a cloud interface for accessing OpenAI's machine learning models. LangChain is an AI framework for building language-based applications. Embeddings encode information in high-dimensional vector spaces. Vector stores are databases that store vector embeddings of non-numerical data. To produce better responses, we used retrieval-augmented generation (RAG) to incorporate additional, real-time data from DSpace, allowing the application to answer questions against the most up-to-date content in the repository.
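The sketch below illustrates the kind of RAG pipeline described above: repository records are embedded into a vector store, and a LangChain retrieval chain supplies the most relevant records to an OpenAI model as context. It is a minimal example, not the presenters' implementation; the fetch_dspace_items helper, the DSPACE_URL endpoint, and the exact JSON paths into the DSpace REST response are assumptions that may differ by DSpace and LangChain version.

```python
# Minimal RAG sketch with LangChain + OpenAI over DSpace records.
# Assumes: langchain, langchain-openai, langchain-community, faiss-cpu, requests,
# and an OPENAI_API_KEY in the environment.
import requests

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA

# Hypothetical DSpace 7 search endpoint; adjust to your repository.
DSPACE_URL = "https://demo.dspace.org/server/api/discover/search/objects"


def fetch_dspace_items(query: str, page_size: int = 50) -> list[str]:
    """Illustrative helper: pull titles and abstracts from the DSpace REST API.

    The response structure shown here follows DSpace 7 conventions but may
    vary between versions, so treat the JSON paths as an assumption.
    """
    resp = requests.get(DSPACE_URL, params={"query": query, "size": page_size})
    resp.raise_for_status()
    objects = resp.json()["_embedded"]["searchResult"]["_embedded"]["objects"]
    texts = []
    for obj in objects:
        metadata = obj["_embedded"]["indexableObject"]["metadata"]
        title = metadata.get("dc.title", [{"value": ""}])[0]["value"]
        abstract = metadata.get("dc.description.abstract", [{"value": ""}])[0]["value"]
        texts.append(f"{title}\n{abstract}")
    return texts


# 1. Retrieve current repository content and embed it into a vector store.
texts = fetch_dspace_items("machine learning")
vector_store = FAISS.from_texts(texts, OpenAIEmbeddings())

# 2. Build a retrieval-augmented QA chain: records are retrieved by vector
#    similarity and passed to the LLM as grounding context.
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    retriever=vector_store.as_retriever(search_kwargs={"k": 4}),
)

# 3. Ask a natural-language question answered from the repository's own data.
print(qa_chain.invoke({"query": "Which recent items discuss machine learning?"}))
```

Because the retrieval step runs against content pulled from the repository at query time, the generated answers reflect what is currently in DSpace rather than only what the model saw during training.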
Files
146_Zhang_QueryingDSpace.pdf (1.4 MB, md5:63292708a9f60e1cf770f370d70396d9)