Configuring RAG Studio
After you launch RAG Studio, an initial configuration is required; optional settings are available for additional customization. Without this configuration, the Studio cannot access any AI models.
- In the Cloudera console, click the Cloudera AI tile.
  The Cloudera AI Workbenches page displays.
- Click the name of the workbench.
  The workbench's Home page displays.
- Click Projects, and then select the required Project.
  The new AI Studios option is displayed in the left navigation pane.
- Click AI Studios and select RAG Studio.
  The RAG Studio page is displayed.
The Settings page is displayed when you open RAG Studio for the first time. If the Settings page is not displayed, click Settings in the top-right corner. The following settings are available:
- Optional: Enable the Enhanced PDF Processing option for better text extraction. Note that this feature requires at least one GPU and at least 16 GB of RAM, and can significantly slow down document parsing.
- Select one of the following Metadata Database storage options:
  - Embedded H2 database (default): Use an embedded H2 database for storing metadata information.
  - External PostgreSQL database: Store metadata in an external PostgreSQL database.
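RAG Studio collects the PostgreSQL connection details in the UI, so nothing has to be coded, but it can save a round trip to first verify that the database is reachable from the workbench. A minimal sketch with psycopg2, using placeholder connection values:

```python
# Sanity-check connectivity to the external PostgreSQL database before
# pointing RAG Studio at it. Host, database, and credential values are
# placeholders -- substitute your own.
import psycopg2

conn = psycopg2.connect(
    host="postgres.example.com",   # hypothetical host
    port=5432,
    dbname="rag_studio_metadata",  # hypothetical database name
    user="rag_studio",
    password="...",                # use a secrets manager in practice
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])
conn.close()
```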
- Select one of the following File Storage options:
  - Project Filesystem (default): Use the Cloudera AI Project filesystem for file storage.
  - AWS S3: Select an existing Amazon S3 bucket and provide the bucket name and prefix to be used for all S3 paths.
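If you choose AWS S3, it is worth confirming that the bucket and prefix are accessible with the credentials available to the workbench. A short sketch with boto3, using placeholder bucket and prefix names:

```python
# Verify that the S3 bucket and prefix intended for RAG Studio file
# storage exist and are readable. Bucket and prefix are placeholders.
import boto3

s3 = boto3.client("s3")  # uses your ambient AWS credentials
bucket, prefix = "my-rag-bucket", "rag-studio/"  # hypothetical values

s3.head_bucket(Bucket=bucket)  # raises if missing or inaccessible
resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=5)
for obj in resp.get("Contents", []):
    print(obj["Key"])
```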
Select one of the following Vector Database
options.
- Embedded Qdrant database (Default): Use Qdrant locally as a vector store database.
- Cloudera Semantic Search: Configure an existing CSS host by providing the host details, namespace and optionally, a username and password for authentication.
- ChromaDB database: To use it as a vector
store, you can either run it locally by setting the host to
localhostor provide external endpoint details for remote access.
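For the ChromaDB option, you can confirm the server is reachable with the chromadb client before entering the endpoint details. A minimal sketch, assuming a local server on the default port and an illustrative collection name:

```python
# Connect to a ChromaDB instance of the kind RAG Studio can point at.
# For a remote deployment, substitute your endpoint details.
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
print(client.heartbeat())  # basic liveness check

collection = client.get_or_create_collection("rag_studio_smoke_test")
print(collection.count())
```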
- Select one of the following Model Provider options:
  - Cloudera AI: No authentication is required, but you need to obtain the domain name of the Cloudera AI Inference service. Note that the domain must reside within the same Cloudera environment as the Cloudera AI Workbench where the studio is operating. For more details, see Preparing to interact with the Cloudera AI Inference service API.
    - The system retrieves the JWT stored at /tmp/jwt to obtain an authentication token for all interactions with the Cloudera AI Inference service. Alternatively, you can set the CDP_TOKEN_OVERRIDE environment variable, which is then used for authentication with the Cloudera AI Inference service.
    - When models are required, the system interacts with the Cloudera AI Inference service APIs to identify available models. It uses the endpoint's task attribute to determine the model type: TEXT_GENERATION or TEXT_TO_TEXT_GENERATION for inference, EMBED for embedding, or RANK for reranking.
    - You can also provide the CDP Auth Token in the Settings page.
    - All endpoints used must adhere to the OpenAI API standard.
  - AWS Bedrock: Requires authentication:
    - AWS Region: Choose the AWS region to use.
    - AWS Access Key ID: Provide the Access Key ID for authentication.
    - AWS Secret Access Key: Provide the Secret Access Key for authentication.
  - Azure OpenAI: Requires configuration and authentication:
    - Azure OpenAI Endpoint: Find the endpoint of the Azure OpenAI service in the Azure portal.
    - API Version: Find the API version of the Azure OpenAI service in the Azure portal.
    - Azure OpenAI Key: Provide the Azure OpenAI Key for authentication.
  - OpenAI:
    - API Key: Provide your OpenAI API Key for authentication.
    - Base URL: Optionally, provide a custom base URL for OpenAI-compatible endpoints.
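Because every endpoint the studio talks to must follow the OpenAI API standard, you can sanity-check a Cloudera AI Inference endpoint outside the studio with any OpenAI-compatible client. The sketch below assumes a hypothetical service domain and model name; the /tmp/jwt path and CDP_TOKEN_OVERRIDE variable are the ones described above, though the exact on-disk token format can vary by deployment.

```python
# Minimal smoke test of a Cloudera AI Inference endpoint via the
# OpenAI-compatible API. The base_url domain and model name are hypothetical.
import json
import os

from openai import OpenAI

def load_token() -> str:
    """Prefer CDP_TOKEN_OVERRIDE, falling back to the JWT at /tmp/jwt."""
    override = os.environ.get("CDP_TOKEN_OVERRIDE")
    if override:
        return override
    raw = open("/tmp/jwt").read().strip()
    try:
        # Some deployments wrap the token in JSON (assumption).
        return json.loads(raw).get("access_token", raw)
    except json.JSONDecodeError:
        return raw

client = OpenAI(
    base_url="https://ai-inference.example.cloudera.site/v1",  # hypothetical domain
    api_key=load_token(),
)
resp = client.chat.completions.create(
    model="llama-3-8b-instruct",  # hypothetical model endpoint name
    messages=[{"role": "user", "content": "Reply with one word: pong"}],
)
print(resp.choices[0].message.content)
```

The equivalent checks for the other providers use boto3's bedrock-runtime client for AWS Bedrock, or the openai package's AzureOpenAI class with the endpoint, API version, and key described above.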
- Optional: Click Settings in the top-right corner and select Model Configuration.
  The models available to you are listed here. You can check whether your models are available and functioning as expected by selecting the button next to the model. You can choose from the following options:
  - Embedding Models
  - Inference Models
  - Reranking Models
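If you prefer to cross-check from code, an OpenAI-compatible provider can enumerate the models it exposes, which is roughly what the studio does (for Cloudera AI it additionally inspects each endpoint's task attribute, as described earlier). A sketch, reusing the hypothetical endpoint from the previous example:

```python
# List the models an OpenAI-compatible provider exposes, to compare with
# what the Model Configuration page shows.
from openai import OpenAI

client = OpenAI(
    base_url="https://ai-inference.example.cloudera.site/v1",  # hypothetical
    api_key="...",  # token from /tmp/jwt or CDP_TOKEN_OVERRIDE, as above
)
for model in client.models.list():
    print(model.id)
```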
- Optional: Click Tools in the top navigation bar.
  In this tab you can view and manage the tools available for use in chat sessions. You can add Model Context Protocol (MCP) servers to access tools that run locally within RAG Studio, or provide the URL of tools hosted externally.
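As an illustration of the kind of locally running tool an MCP entry can point at, here is a minimal server built with the MCP Python SDK. The server name and tool are hypothetical, and how RAG Studio launches or addresses the server is configured in the Tools tab, not in this code.

```python
# Minimal MCP tool server (hypothetical example) using the MCP Python SDK.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")  # hypothetical server name

@mcp.tool()
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```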
After the initial configuration, the RAG Studio application restarts in order to access the model providers. This restart must complete, after which the studio is ready to use.
