What's RaaG?

Ever wanted to have a chat with an LLM (Large Language Model) like GPT-4o, but with the ability to ask it questions about your own data? RaaG is the answer to your needs!

RaaG is the acronym for "RAG as a GPT".

Why do we need dRAGon RaaG?

You can already use LangChain or LlamaIndex to query your data, so why do you need RaaG?

As you know, a typical RAG application is composed of two main components, each with several sub-tasks (a minimal, framework-free sketch follows this list):

  1. Indexing: This involves the offline process of ingesting data from a source and indexing it.

    1. Load: The data is loaded using Document Loaders.

    2. Split: The loaded data is divided into smaller chunks using Document Splitters.

    3. Store: The chunks are stored and indexed for future searching, typically utilizing a VectorStore and an Embeddings model.

  2. Retrieval and Generation: This is the core RAG chain that operates during runtime, handling user queries.

    1. Retrieve: Relevant data chunks are fetched from the storage in response to a user query using a Retriever.

    2. Generate: A ChatModel/LLM generates an answer by incorporating the user query and the retrieved data into a prompt.
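
To make the division of labor concrete, here is a minimal, framework-free sketch of both phases. The toy bag-of-words "embedding", the hard-coded document, and the naive sentence splitter are stand-ins for a real embeddings model, document loader, and splitter:

import math
from collections import Counter

def embed(text):
    # Toy embedding: a word-count vector (a real pipeline calls an embeddings model)
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Indexing (offline): load, split, store
document = "dRAGon serves RAG pipelines. A farm groups your documents. RaaG exposes the pipeline as a GPT."
chunks = [c.strip() for c in document.split(".") if c.strip()]  # naive splitter
store = [(chunk, embed(chunk)) for chunk in chunks]             # in-memory "vector store"

# 2. Retrieval and Generation (runtime): retrieve the best chunk, then prompt the LLM
query = "What's dRAGon?"
context, _ = max(store, key=lambda item: cosine(item[1], embed(query)))
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
# A real pipeline would now send `prompt` to a ChatModel/LLM
print(prompt)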

Let's say you want to build such a RAG pipeline using LangChain: you would need to write code to handle each of these sub-tasks, which can be quite complex and time-consuming.

How does dRAGon RaaG simplify the process?

So instead of writing application code like this:

import os

from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_postgres.vectorstores import PGVector
from langchain_text_splitters import RecursiveCharacterTextSplitter

os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"

# Load: fetch the source page (the trailing comma makes web_paths a tuple)
loader = WebBaseLoader(
    web_paths=("https://dragon.okinawa",),
)
docs = loader.load()

# Split: divide the loaded documents into smaller chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
splits = text_splitter.split_documents(docs)

# Store: embed the chunks and index them in PGVector
connection = "postgresql+psycopg://user:password@server:5433/vectors"
collection_name = "my_rag_chunks"
embeddings = OpenAIEmbeddings()

vectorstore = PGVector(
    embeddings=embeddings,
    collection_name=collection_name,
    connection=connection,
    use_jsonb=True,
)
vectorstore.add_documents(splits)

# Retrieve: expose the vector store as a retriever and pull a reference RAG prompt
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Generate: build the chain that stuffs the retrieved context and the question into the prompt
llm = ChatOpenAI(model="gpt-4o")

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke("What's dRAGon?")

with a dRAGon-powered RAG pipeline you can simply use the following code:

from langchain_openai import OpenAI

llm = OpenAI(
    model_name="your-dragon-raag-farm-name",
    openai_api_base="http://your.dragon.host:1985/api/raag/v1",
    openai_api_key="unused",  # the client requires a value even if your server ignores it
)

prompt = "What's dRAGon?"
llm.invoke(prompt)

dRAGon simplifies the process of building a RAG pipeline by managing all of these components for you and exposing a single OpenAI API compliant endpoint to interact with your RAG pipeline.
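
Because the endpoint is OpenAI API compliant, it is not tied to LangChain: any OpenAI client should be able to talk to it. Here is a minimal sketch using the official openai Python package, reusing the base URL and farm name from the example above; whether your deployment actually checks the API key is an assumption to verify:

from openai import OpenAI

# Point the standard OpenAI client at the dRAGon RaaG endpoint
client = OpenAI(
    base_url="http://your.dragon.host:1985/api/raag/v1",
    api_key="unused",  # placeholder; the client requires a value
)

# The farm name plays the role of the model name
completion = client.completions.create(
    model="your-dragon-raag-farm-name",
    prompt="What's dRAGon?",
)
print(completion.choices[0].text)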
