Using RAG Studio
Integrate RAG Studio with your existing data infrastructure and workflows while maintaining control over your data.
- Access the Studio Application UI, if granted permission via the project settings
- Create chat interactions with the available large language models, asking questions and receiving answers from the model
- Ensure that a knowledge base is used to ground the large language model's answers with direct quotes and inputs sourced from the documents in the knowledge store
- Provide feedback on each answer, which is collated in the Analytics tab and stored in MLflow for model evaluation or training purposes
- In the Cloudera console, click the Cloudera AI tile.
The Cloudera AI Workbenches page displays.
- Click the name of the workbench.
The workbench's Home page displays.
- Click Projects, and then click New Project to create a new project.
In the left navigation pane, the new AI Studios option is displayed.
- Click AI Studios and select RAG Studio to enter the application.
Using Retrieval-Augmented Generation with Knowledge Bases
The RAG Studio chat function with a knowledge base is a more advanced version of the basic chat function. It uses the knowledge base to access and retrieve information, providing more accurate and up-to-date answers. When you send a message, the chatbot searches the knowledge base for relevant information and provides a response based on it.
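The retrieve-then-answer flow described above can be sketched in a few lines. This is a toy illustration only: RAG Studio's internals are not exposed, so the function names are hypothetical and the word-overlap scoring is a stand-in for real embedding-based retrieval.

```python
# Toy sketch of retrieval-augmented generation: the chatbot first
# retrieves the most relevant chunks, then grounds the answer in them.

def retrieve(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Score each chunk by shared-word count with the query (a stand-in
    for vector similarity) and return the top_k most relevant chunks."""
    query_words = set(query.lower().split())
    scored = sorted(
        knowledge_base,
        key=lambda chunk: len(query_words & set(chunk.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def answer(query: str, knowledge_base: list[str]) -> str:
    """Ground the response in retrieved chunks; a real deployment passes
    the chunks to the large language model as context."""
    context = retrieve(query, knowledge_base)
    return "Based on: " + " | ".join(context)

kb = [
    "The warranty covers parts for two years.",
    "Returns are accepted within 30 days.",
    "Shipping is free for orders over 50 dollars.",
]
print(answer("How long does the warranty cover parts?", kb))
```

Because only retrieved chunks reach the model, its answers stay grounded in the knowledge base rather than in whatever it memorized during training.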
- Click Knowledge Bases.
The Create Knowledge Base button is displayed.
Once created and connected to a chat, the information in the knowledge base becomes the only knowledge the large language model is allowed to reference, grounding its answers in your enterprise context.
Fill in the required fields for the Knowledge Base:
- Name - the name of the Knowledge Base
- Chunk size - the amount of data that is processed and written to the database in a single operation. Long documents are divided into smaller chunks for referencing, with the chunk size determining the size of these pieces. If unsure, keep the default value of 512.
- Embedding Model - the selected model "reads" the provided documents and transforms them into the vectors it references later, in a process called embedding.
- Chunk overlap - controls how much of the previous chunk's data is included in the next chunk, so that information at the boundaries between the pieces of the larger document is not lost.
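The interaction between chunk size and chunk overlap is easiest to see in a small example. The sketch below counts characters for simplicity, whereas RAG Studio's settings operate on tokens; the function name is illustrative.

```python
# Toy illustration of chunk size and chunk overlap: a window of
# chunk_size slides across the text, stepping by chunk_size - overlap,
# so each chunk repeats the tail of the previous one.

def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "abcdefghijklmnopqrstuvwxyz"
for chunk in chunk_text(doc, chunk_size=10, overlap=3):
    print(chunk)
```

With these values, every chunk starts with the last three characters of the previous chunk, which is exactly what keeps boundary-spanning sentences retrievable.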
- Populate the Knowledge Base by uploading documents.
Supported file types include:
- .txt, .md, .csv
- .pdf, .docx, .pptx, .pptm, .ppt
- .jpg, .jpeg, .png
- .json
- If advanced document processing is enabled, images and charts contained within PDFs are also ingested
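A simple pre-upload check against the supported extensions above can save failed ingestions. The helper name is hypothetical, not part of RAG Studio's API; the `.json` entry assumes the truncated list item above refers to JSON files.

```python
# Hypothetical pre-upload filter based on the supported file types
# listed above. Extensions are matched case-insensitively.
from pathlib import Path

SUPPORTED = {".txt", ".md", ".csv", ".pdf", ".docx", ".pptx",
             ".pptm", ".ppt", ".jpg", ".jpeg", ".png", ".json"}

def uploadable(files: list[str]) -> list[str]:
    """Return only the files whose extension RAG Studio can ingest."""
    return [f for f in files if Path(f).suffix.lower() in SUPPORTED]

print(uploadable(["report.pdf", "notes.TXT", "archive.zip"]))
# archive.zip is filtered out; notes.TXT passes despite the uppercase extension
```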
- To begin a RAG-enabled chat, click Chat in the top-left corner. The Start chatting with an existing Knowledge Base field is displayed.
- Select the Knowledge Base you would like to use for your chat from the drop-down list in the Start chatting with an existing Knowledge Base field.
- Click the icon.
The main Chat window with a chat field is displayed.
You can enable usage of the Knowledge Base with the icon.
- Optional: Configure Chat Settings if required.
You can configure the following:
- Knowledge Base: Select the required Knowledge Base.
- Name: Provide a name for the chat.
- Response synthesizer model: Select the model you want to use to write the final answer.
- Reranking model: Select the model that decides which documents and snippets are the most important to reference. This feature is not available with OpenAI.
- Number of documents: Select how many document chunks you want the answer to reference and incorporate. The number is set to 10 by default.
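Taken together, the settings above can be pictured as a single configuration object. The field names and model names below are illustrative only and do not reflect RAG Studio's actual API.

```python
# Hypothetical configuration mirroring the Chat Settings described
# above; keys and model names are illustrative, not RAG Studio's API.
chat_settings = {
    "knowledge_base": "support-docs",            # Knowledge Base to ground answers in
    "name": "warranty-questions",                # display name of the chat
    "response_synthesizer_model": "model-a",     # writes the final answer
    "reranking_model": "model-b",                # orders retrieved chunks by importance
    "number_of_documents": 10,                   # chunks referenced per answer (default 10)
}
print(chat_settings["number_of_documents"])
```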
- Write your question into the chat field, send it, and wait for a reply.
- Check the answer you received from RAG Studio.
You can also spot-check RAG Studio's automatic evaluation of the answer against the available text in the Knowledge Base:
: marks the level of relevancy, which measures whether the response and source nodes match the query. Does the question/answer pair make sense?
: marks faithfulness, which measures whether the response from a query engine matches any source nodes. Does the provided answer match the source documents well?
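The two checks differ only in what the answer is compared against: relevancy compares it to the question, faithfulness compares it to the retrieved source chunks. RAG Studio uses model-based judges for this; the sketch below substitutes a simple word-overlap score just to make the distinction concrete, and all names are hypothetical.

```python
# Toy versions of the two automatic evaluations. Word-overlap (Jaccard)
# similarity stands in for the model-based judging RAG Studio performs.

def _overlap(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two strings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def relevancy(question: str, answer: str) -> float:
    """Does the question/answer pair make sense together?"""
    return _overlap(question, answer)

def faithfulness(answer: str, sources: list[str]) -> float:
    """Does the answer match any of the retrieved source chunks?"""
    return max(_overlap(answer, s) for s in sources) if sources else 0.0
```

An answer can score high on relevancy yet low on faithfulness (it addresses the question but is not supported by the sources), which is why both checks are reported separately.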
- Evaluate the answer with the help of the icons.
Providing Feedback - Human in the Loop
Anyone using the chat can optionally provide feedback on each answer, which can be used to systematically evaluate the performance of the chatbot.
- View the evaluations, summary, and analytics by selecting Analytics in the top navigation bar.
The available analytics are:
- App Metrics
- Session Metrics
- Feedback Metrics
- Auto evaluation metric averages
- Chunk relevance over time
All the underlying data is also stored in the local MLflow instance for machine learning scientists to use as needed.