Configuring RAG Studio
After you launch RAG Studio, you must complete an initial configuration. Optional settings are available for additional customization. Without this configuration, the studio cannot access any AI models.
- In the Cloudera console, click the Cloudera AI tile.
The Cloudera AI Workbenches page displays.
- Click the name of the workbench.
The workbench's Home page displays.
- Click Projects, and then select the required Project.
In the left navigation pane, the new AI Studios option is displayed.
- Click AI Studios and select RAG Studio.
The RAG Studio page is displayed.
The Settings page is displayed the first time you open RAG Studio. If the Settings page is not displayed, click Settings in the top-right corner. The following settings are available:
- Optional: Enable the Enhanced PDF Processing option for better text extraction. Note that this feature requires at least one GPU and at least 16 GB of RAM, and can significantly slow down document parsing.
- Choose from the following File Storage options:
  - Project Filesystem: Use the Cloudera AI Project filesystem for file storage.
-
- Select one of the following Model Providers:
  - Cloudera AI: No authentication is required, but you need to obtain the domain name of the Cloudera AI Inference service. Note that the domain must be in the same Cloudera environment as the Cloudera AI Workbench that the studio is running in. For more details, see Preparing to interact with the Cloudera AI Inference service API.
    - The system retrieves the JWT stored at /tmp/jwt to obtain an authentication token for all interactions with the Cloudera AI Inference service. Alternatively, you can set the CDP_TOKEN_OVERRIDE environment variable, which is then used for authentication with the Cloudera AI Inference service.
    - When models are required, the system calls the Cloudera AI Inference service APIs to identify available models. It uses the endpoint's task attribute to determine the model type: TEXT_GENERATION for inference, EMBED for embedding, or RANK for reranking.
    - You can also provide the CDP Auth Token in the Settings page.
    - All endpoints used must adhere to the OpenAI API standard.
- Optional: Click Settings in the top-right corner and select Models.
  The models available to the user are listed here. You can check whether your models are available and function as expected by selecting the button next to the model.
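The token lookup described for the Cloudera AI model provider can be sketched roughly as follows. This is an illustration only, not the studio's actual implementation. It assumes that CDP_TOKEN_OVERRIDE takes precedence over the file when it is set, and that /tmp/jwt holds either a bare token or a JSON object with an `access_token` field. The text above does not state the file format, and the function name `resolve_token` is hypothetical.

```python
import json
import os
from pathlib import Path

JWT_PATH = "/tmp/jwt"                # location named in the text above
OVERRIDE_VAR = "CDP_TOKEN_OVERRIDE"  # variable named in the text above


def resolve_token(env=None, jwt_path=JWT_PATH):
    """Return a token, preferring the override variable (assumed precedence)."""
    env = os.environ if env is None else env

    # If the override variable is set and non-empty, use it directly.
    override = env.get(OVERRIDE_VAR, "").strip()
    if override:
        return override

    # Otherwise fall back to the JWT file.
    raw = Path(jwt_path).read_text(encoding="utf-8").strip()

    # Assumption: the file is either a bare token or JSON containing an
    # "access_token" field. The documentation does not specify the format.
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return raw
    if isinstance(parsed, dict) and "access_token" in parsed:
        return str(parsed["access_token"])
    return raw
```

Inside a workbench session, calling `resolve_token()` with no arguments reads the real environment and the default path. The `env` and `jwt_path` parameters exist only so the lookup can be exercised outside a workbench.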
After the initial configuration, the RAG Studio application restarts in order to access the model providers. The studio is ready to use once this restart completes.
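As an illustration of how the endpoint task attribute maps to model types under the Cloudera AI model provider, here is a minimal sketch. The endpoint records are hypothetical dictionaries with `name` and `task` keys; the real response shape of the Cloudera AI Inference service APIs is not given in this section. Skipping endpoints with an unrecognized task is also an assumption.

```python
from collections import defaultdict

# Task values and model types as listed in the text above.
TASK_TO_MODEL_TYPE = {
    "TEXT_GENERATION": "inference",
    "EMBED": "embedding",
    "RANK": "reranking",
}


def group_endpoints_by_type(endpoints):
    """Group endpoint names by model type, based on each endpoint's task."""
    grouped = defaultdict(list)
    for endpoint in endpoints:
        model_type = TASK_TO_MODEL_TYPE.get(endpoint.get("task"))
        if model_type is None:
            # Assumption: endpoints with other task values are ignored.
            continue
        grouped[model_type].append(endpoint["name"])
    return dict(grouped)


# Hypothetical endpoint records, for illustration only.
sample_endpoints = [
    {"name": "llm-a", "task": "TEXT_GENERATION"},
    {"name": "embedder-b", "task": "EMBED"},
    {"name": "reranker-c", "task": "RANK"},
    {"name": "other-d", "task": "SOMETHING_ELSE"},
]

models_by_type = group_endpoints_by_type(sample_endpoints)
```

With the sample records, `models_by_type` holds one list of endpoint names for each of the three model types, and `other-d` is left out.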