Configuring RAG Studio
After you launch RAG Studio, an initial configuration is required; optional settings are available for additional customization. Without this configuration, the Studio cannot access any AI models.
- In the Cloudera console, click the Cloudera AI tile.
  The Cloudera AI Workbenches page displays.
- Click the name of the workbench.
  The workbench's Home page displays.
- Click Projects, and then select the required Project.
  The new AI Studios option is displayed in the left navigation pane.
- Click AI Studios and select RAG Studio.
  The RAG Studio page is displayed.
The Settings page is displayed when you open RAG Studio for the first time. If the Settings page is not displayed, click Settings in the top-right corner. The following settings are available:
- Optional: Enable the Enhanced PDF Processing option for better text extraction. Note that this feature requires at least one GPU and at least 16 GB of RAM, and can significantly slow down document parsing.
- Select one of the following Metadata Database storage options:
  - Embedded H2 database (default): Use an embedded H2 database for storing metadata information.
  - External PostgreSQL database: Store metadata in an external PostgreSQL database.
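RAG Studio collects the PostgreSQL connection details in the UI, so nothing has to be coded, but it can save a round trip to first verify that the database is reachable from the workbench. A minimal sketch with psycopg2, using placeholder connection values:

```python
# Sanity-check connectivity to the external PostgreSQL database before
# pointing RAG Studio at it. Host, database, and credential values are
# placeholders -- substitute your own.
import psycopg2

conn = psycopg2.connect(
    host="postgres.example.com",   # hypothetical host
    port=5432,
    dbname="rag_studio_metadata",  # hypothetical database name
    user="rag_studio",
    password="...",                # use a secrets manager in practice
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])
conn.close()
```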
- Select one of the following File Storage options:
  - Project Filesystem (default): Use the Cloudera AI Project filesystem for file storage.
  - AWS S3: Select an existing Amazon S3 bucket and provide the bucket name and prefix to be used for all S3 paths.
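If you choose AWS S3, it is worth confirming that the bucket and prefix are accessible with the credentials available to the workbench. A short sketch with boto3, using placeholder bucket and prefix names:

```python
# Verify that the S3 bucket and prefix intended for RAG Studio file
# storage exist and are readable. Bucket and prefix are placeholders.
import boto3

s3 = boto3.client("s3")  # uses your ambient AWS credentials
bucket, prefix = "my-rag-bucket", "rag-studio/"  # hypothetical values

s3.head_bucket(Bucket=bucket)  # raises if missing or inaccessible
resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=5)
for obj in resp.get("Contents", []):
    print(obj["Key"])
```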
Select one of the following Vector Database
options.
- Embedded Qdrant database (Default): Use Qdrant locally as a vector store database.
- Cloudera Semantic Search: Configure an existing CSS host by providing the host details, namespace and optionally, a username and password for authentication.
- ChromaDB database: To use it as a vector
store, you can either run it locally by setting the host to
localhostor provide external endpoint details for remote access.
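For the ChromaDB option, you can confirm the server is reachable with the chromadb client before entering the endpoint details. A minimal sketch, assuming a local server on the default port and an illustrative collection name:

```python
# Connect to a ChromaDB instance of the kind RAG Studio can point at.
# For a remote deployment, substitute your endpoint details.
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
print(client.heartbeat())  # basic liveness check

collection = client.get_or_create_collection("rag_studio_smoke_test")
print(collection.count())
```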
- Select one of the following Model Provider options:
  - Cloudera AI: No authentication is required, but you need to obtain the domain name of the Cloudera AI Inference service. Note that the domain must reside within the same Cloudera environment as the Cloudera AI Workbench where the studio is operating. For more details, see Preparing to interact with the Cloudera AI Inference service API.
    - The system retrieves the JWT stored at /tmp/jwt to obtain an authentication token for all interactions with the Cloudera AI Inference service. Alternatively, you can set the CDP_TOKEN_OVERRIDE environment variable, which is then used for authentication with the Cloudera AI Inference service.
    - When models are required, the system interacts with the Cloudera AI Inference service APIs to identify available models. It uses the endpoint's task attribute to determine the model type: TEXT_GENERATION or TEXT_TO_TEXT_GENERATION for inference, EMBED for embedding, or RANK for reranking.
    - You can also provide the CDP Auth Token in the Settings page.
    - All endpoints used must adhere to the OpenAI API standard.
  - AWS Bedrock: Requires authentication:
    - AWS Region: Choose the AWS region to use.
    - AWS Access Key ID: Provide the Access Key ID for authentication.
    - AWS Secret Access Key: Provide the Secret Access Key for authentication.
  - Azure OpenAI: Requires configuration and authentication:
    - Azure OpenAI Endpoint: Find the endpoint of the Azure OpenAI service in the Azure portal.
    - API Version: Find the API version of the Azure OpenAI service in the Azure portal.
    - Azure OpenAI Key: Provide the Azure OpenAI Key for authentication.
  - OpenAI:
    - API Key: Provide your OpenAI API Key for authentication.
    - Base URL: Optionally, provide a custom base URL for OpenAI-compatible endpoints.
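Because every endpoint the studio talks to must follow the OpenAI API standard, you can sanity-check a Cloudera AI Inference endpoint outside the studio with any OpenAI-compatible client. The sketch below assumes a hypothetical service domain and model name; the /tmp/jwt path and CDP_TOKEN_OVERRIDE variable are the ones described above, though the exact on-disk token format can vary by deployment.

```python
# Minimal smoke test of a Cloudera AI Inference endpoint via the
# OpenAI-compatible API. The base_url domain and model name are hypothetical.
import json
import os

from openai import OpenAI

def load_token() -> str:
    """Prefer CDP_TOKEN_OVERRIDE, falling back to the JWT at /tmp/jwt."""
    override = os.environ.get("CDP_TOKEN_OVERRIDE")
    if override:
        return override
    raw = open("/tmp/jwt").read().strip()
    try:
        # Some deployments wrap the token in JSON (assumption).
        return json.loads(raw).get("access_token", raw)
    except json.JSONDecodeError:
        return raw

client = OpenAI(
    base_url="https://ai-inference.example.cloudera.site/v1",  # hypothetical domain
    api_key=load_token(),
)
resp = client.chat.completions.create(
    model="llama-3-8b-instruct",  # hypothetical model endpoint name
    messages=[{"role": "user", "content": "Reply with one word: pong"}],
)
print(resp.choices[0].message.content)
```

The equivalent checks for the other providers use boto3's bedrock-runtime client for AWS Bedrock, or the openai package's AzureOpenAI class with the endpoint, API version, and key described above.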
- Optional: Click Settings in the top-right corner and select Model Configuration.
  The models available to you are listed here. You can check whether your models are available and functioning as expected by selecting the button next to the model. You can choose from the following options:
  - Embedding Models
  - Inference Models
  - Reranking Models
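If you prefer to cross-check from code, an OpenAI-compatible provider can enumerate the models it exposes, which is roughly what the studio does (for Cloudera AI it additionally inspects each endpoint's task attribute, as described earlier). A sketch, reusing the hypothetical endpoint from the previous example:

```python
# List the models an OpenAI-compatible provider exposes, to compare with
# what the Model Configuration page shows.
from openai import OpenAI

client = OpenAI(
    base_url="https://ai-inference.example.cloudera.site/v1",  # hypothetical
    api_key="...",  # token from /tmp/jwt or CDP_TOKEN_OVERRIDE, as above
)
for model in client.models.list():
    print(model.id)
```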
- Optional: Click Tools in the top navigation bar.
  In this tab you can view and manage the tools available for use in chat sessions. You can add Model Context Protocol (MCP) servers to access tools that run locally within RAG Studio, or provide the URL of tools hosted externally.
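As an illustration of the kind of locally running tool an MCP entry can point at, here is a minimal server built with the MCP Python SDK. The server name and tool are hypothetical, and how RAG Studio launches or addresses the server is configured in the Tools tab, not in this code.

```python
# Minimal MCP tool server (hypothetical example) using the MCP Python SDK.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")  # hypothetical server name

@mcp.tool()
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```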
After the initial configuration, the RAG Studio application restarts in order to access the model providers. This restart must complete, after which the studio is ready to use.
