Managing AI settings

Cloudera Data Visualization provides site-wide configuration options that users with administrative privileges can use to manage AI features. You can enable the AI visual, select the AI engine, and specify additional settings.

The AI Settings option is included in the Site Settings menu, which is only available to users with administrative privileges.

Click on the main navigation bar to open the Administration menu and select Site Settings.

The Site Settings page opens on the Data menu.
Open the AI Settings menu from the left navigation and choose from the available options.

General

In this AI Settings section, you can manage your general AI configurations.

Enable AI features

Before enabling AI features in Cloudera Data Visualization, review the legal terms and conditions for using the AI visual, which are available in the Info tooltip.

When you enable AI features for the first time, a modal window prompts you to accept these terms and conditions. The system records the date and username of the person who accepts the terms. To ensure compliance, users are required to confirm acceptance of the terms each time they enable the AI features. This is important because the user enabling the feature may differ from the original user who enabled the feature earlier.

Selecting Enable AI features makes the Enable AI visual and Enable AI summary options available so that you can enable them individually as needed.

important
If the AI visual feature is not enabled, you cannot view AI visuals on dashboards created by other users.
Redact AI visual logs

You can enable redaction for AI visual logs when at least one AI feature is enabled. When redaction is turned on, audit logs exclude any data sent by AI visuals, helping protect sensitive information and support privacy and compliance requirements.

Embeddings

In this AI Settings section, you can manage your embedding profiles. You can create new profiles, view the list of existing ones, and perform actions such as copying, editing, or deleting existing profiles, and setting a new default profile.

Creating a new embeddings profile

Click CREATE PROFILE.

The Create Profile modal opens.
Provide a name for the profile.
Select the AI engine you want to use for the embeddings.
You have the following options available:
- Cloudera AI Hosted MiniLM
- Cloudera AI Inference
- OpenAI
- OpenAI Azure
- Amazon Bedrock
- Other
Specify the settings for the selected AI engine.
Access Key: Required to access the external third-party AI service provider

Authorization API Key: Required to access the external third-party AI service provider

Service URL: URL for the embedding service used by the AI engine

Service input parameter name: Name for the embedding service input

Service response parameter name: Name for the embedding service response parameter

Maximum tokens: Maximum number of input tokens that can be processed to generate an embedding

The default value is 1000.
note
You can disable this limit by specifying zero or a negative value.
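As a rough illustration of how the Service input parameter name and Service response parameter name settings above could shape the exchange with the embedding service, the following sketch wraps the text under the input parameter name and reads the vector back from the response parameter. The function names, the `inputs` and `embedding` values, and the JSON payload shape are assumptions for illustration, not the product's actual request format.

```python
import json

def build_request_body(text, input_param_name):
    """Wrap the text to embed under the configured input parameter name."""
    return json.dumps({input_param_name: text})

def extract_embedding(response_body, response_param_name):
    """Read the embedding from the configured response parameter."""
    payload = json.loads(response_body)
    if response_param_name not in payload:
        raise KeyError(f"response has no '{response_param_name}' field")
    return payload[response_param_name]

# Hypothetical values for the two settings.
body = build_request_body("quarterly revenue by region", "inputs")
vector = extract_embedding('{"embedding": [0.12, -0.03, 0.57]}', "embedding")
```

If the two names do not match what the service expects, a lookup like the one in `extract_embedding` fails, which is the kind of mismatch the TEST button is meant to surface before you save.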
Authentication mode: Required to access the external third-party AI service provider

The available options for authentication are:

JWT

Select this option to authenticate using JSON Web Token (JWT). Authentication is automatically handled with the JWT token and no further configuration is required.

note
This authentication option is available only in Cloudera AI environments. JWT authentication can be used only when the Cloudera AI Inference model and Cloudera Data Visualization are in the same environment.

API Key

Select this option to authenticate using an API Key. You must provide a valid API key to access the external service.
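The difference between the two authentication modes can be pictured with a minimal sketch. It assumes that both modes send the credential as a bearer token in the `Authorization` header; the function name and the way the JWT is supplied are hypothetical and only illustrate that JWT mode needs no user-entered secret while API Key mode does.

```python
def build_auth_headers(mode, api_key=None, jwt_token=None):
    """Return HTTP headers for the selected authentication mode.

    Assumption: both modes pass the credential as a bearer token.
    """
    if mode == "JWT":
        # In JWT mode the token comes from the environment, not from the user.
        if not jwt_token:
            raise ValueError("JWT mode needs a token supplied by the environment")
        return {"Authorization": f"Bearer {jwt_token}"}
    if mode == "API Key":
        # In API Key mode the administrator must enter a valid key.
        if not api_key:
            raise ValueError("API Key mode requires a valid API key")
        return {"Authorization": f"Bearer {api_key}"}
    raise ValueError(f"unknown authentication mode: {mode}")
```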

Service URL: URL for the embedding service used by the AI engine

Maximum tokens: Maximum number of input tokens that can be processed to generate an embedding

The default value is 1000.
note
You can disable this limit by specifying zero or a negative value.

Model: Used for the embeddings
note
If the embedding model requires the input_type parameter, add the -query suffix to the model name. For more information, see the NVIDIA documentation.
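The suffix rule in the note above amounts to a small string operation. The helper below is a hypothetical convenience, and `my-embedding-model` is a placeholder name rather than a real model.

```python
QUERY_SUFFIX = "-query"

def with_query_suffix(model_name):
    """Append the -query suffix unless the name already ends with it."""
    if model_name.endswith(QUERY_SUFFIX):
        return model_name
    return model_name + QUERY_SUFFIX

# Placeholder model name, for illustration only.
model_field_value = with_query_suffix("my-embedding-model")
```

The resulting value is what you would enter in the Model field for a model that requires the input_type parameter.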
API Key: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Model: Used for the embeddings

Maximum tokens: Maximum number of input tokens that can be processed to generate an embedding

The default value is 1000.
note
You can disable this limit by specifying zero or a negative value.
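The Maximum tokens rule (default of 1000, disabled by zero or a negative value) can be expressed as a short check. This is a sketch of the rule as stated, with hypothetical function names; it does not show how the product reacts when an input exceeds the limit.

```python
DEFAULT_MAX_TOKENS = 1000  # default stated for embeddings profiles

def limit_enabled(max_tokens):
    """Zero or a negative value disables the limit."""
    return max_tokens > 0

def within_limit(token_count, max_tokens=DEFAULT_MAX_TOKENS):
    """Return True if the input fits, or if the limit is disabled."""
    if not limit_enabled(max_tokens):
        return True
    return token_count <= max_tokens
```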
API Key: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Service URL: URL for the embedding service used by the AI engine

Maximum tokens: Maximum number of input tokens that can be processed to generate an embedding

The default value is 1000.
note
You can disable this limit by specifying zero or a negative value.
AWS Credentials: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Model: The embedding model for processing queries and generating vector representations:

amazon.titan-embed-text-v1 (default)

amazon.titan-embed-text-v2:0

Custom – Allows specifying a different model.

Maximum tokens: Maximum number of input tokens that can be processed to generate an embedding

The default value is 1000.
note
You can disable this limit by specifying zero or a negative value.
API Key: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Service URL: URL for the embedding service used by the AI engine

Custom Request/Response transformers: Use this textbox to provide your custom request and response transformer functions in valid Python.

Maximum tokens: Maximum number of input tokens that can be processed to generate an embedding

The default value is 1000.
note
You can disable this limit by specifying zero or a negative value.
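To give a sense of what a pair of custom transformers might look like, here is a hypothetical sketch. The function names, their signatures, and the payload shapes are assumptions, because the expected contract is not described here; check the product's requirements before adapting it.

```python
def transform_request(text):
    """Hypothetical request transformer: wrap the text in the payload
    shape that the target service expects."""
    return {"input": [text], "encoding_format": "float"}

def transform_response(response):
    """Hypothetical response transformer: pull the first embedding
    vector out of the service's JSON response."""
    return response["data"][0]["embedding"]
```

The request transformer adapts the text to the service's input format, and the response transformer reduces the service's reply to the embedding vector alone.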
[Optional] Click TEST to validate the model using the specified settings.
Click SAVE.
After creating a new profile, save the Site Settings page to apply your changes.

Synchronizing embeddings profiles (available only for Cloudera AI environments supporting auto-discovery)

Synchronizing models is available in Cloudera AI environments that support model syncing.

Click SYNC CAII MODELS to synchronize models with Cloudera AI Inference.
After syncing, save the Site Settings page to apply your changes.

If a previously synced profile no longer exists in Cloudera AI Inference or its state is neither loaded nor pending, it gets deleted in Cloudera Data Visualization during the synchronization process.
If there is a new AI Inference model in Cloudera AI Inference, it is added in Cloudera Data Visualization during the synchronization process.
If an existing model has changed in Cloudera AI, synchronizing updates it in Cloudera Data Visualization.
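The three synchronization rules above can be summarized as a reconciliation step. The sketch below is an illustration under assumed data structures (dictionaries keyed by a model id, with a `state` field); it is not the product's implementation.

```python
def sync_profiles(local, remote):
    """Reconcile previously synced profiles with the models reported remotely.

    local:  dict of model id -> profile dict (previously synced profiles)
    remote: dict of model id -> model dict containing a "state" key
    """
    keep_states = {"loaded", "pending"}
    result = {}
    for model_id, model in remote.items():
        if model["state"] not in keep_states:
            continue  # state is neither loaded nor pending: profile is dropped
        profile = dict(local.get(model_id, {}))  # unknown id: start a new profile
        profile.update(model)                    # changed model: refresh its fields
        result[model_id] = profile
    return result  # ids absent from remote are not carried over (deleted)

local = {
    "a": {"name": "A", "state": "loaded", "version": 1},
    "b": {"name": "B", "state": "loaded", "version": 1},
    "c": {"name": "C", "state": "loaded", "version": 1},
}
remote = {
    "a": {"state": "loaded", "version": 2},   # changed
    "c": {"state": "failed", "version": 1},   # neither loaded nor pending
    "d": {"state": "pending", "version": 1},  # new
}
synced = sync_profiles(local, remote)
```

In this example, `a` is updated, `b` and `c` are removed, and `d` is added.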

Managing embeddings profiles

The Site Settings page for embeddings profiles displays the existing profiles, including their names, engines, and status information. For Cloudera AI environments supporting auto-discovery, the type of the profile is also shown. See more about profile types below.

Profile status: The statuses available across all environments are Default and Disabled.

note
Enabled (active) profiles do not display a separate status label.

Profile type (available only for Cloudera AI environments supporting auto-discovery)

The Type column indicates how an embeddings profile was created and its current state:

Manual – Profile was created manually, all settings are editable.
Synced – Profile was created automatically by syncing with Cloudera AI Inference. For synced embeddings profiles, authentication mode is always JSON Web Token (JWT), so this feature is available only if JWT is available.
Synced profiles can have the following states:
- Loaded – The AI Inference model is registered and available. Most fields are read-only and updated automatically through synchronization. The following fields can be edited manually:
  - Name
  - Token limit
  - Temperature
  - Enable/disable streaming
- Pending – The AI Inference model is not yet registered in Cloudera AI Inference.
  - Pending profiles are disabled by default in Cloudera Data Visualization because they are not yet usable. They become active once successfully synchronized.
  - A pending profile can automatically change to Loaded when sufficient resources become available, without manual action in Cloudera AI Inference.
  - In Cloudera Data Visualization, click SYNC CAII MODELS to refresh the status. If the model has been registered since the last sync, the profile updates to Loaded.

The list of profiles is sorted by status: default first, enabled profiles next, and disabled profiles last. For Cloudera AI environments supporting auto-discovery, profiles within the enabled and disabled categories are further sorted by type — manual first, loaded second, and pending last.
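The ordering described above corresponds to a two-level sort key. The field names and sample profiles below are hypothetical.

```python
STATUS_ORDER = {"default": 0, "enabled": 1, "disabled": 2}
TYPE_ORDER = {"manual": 0, "loaded": 1, "pending": 2}

def sort_key(profile):
    """Sort by status first, then by type within the same status."""
    return (STATUS_ORDER[profile["status"]], TYPE_ORDER[profile["type"]])

profiles = [
    {"name": "p1", "status": "disabled", "type": "manual"},
    {"name": "p2", "status": "enabled", "type": "loaded"},
    {"name": "p3", "status": "default", "type": "manual"},
    {"name": "p4", "status": "disabled", "type": "pending"},
    {"name": "p5", "status": "enabled", "type": "manual"},
]
ordered_names = [p["name"] for p in sorted(profiles, key=sort_key)]
```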

Action buttons for each profile can be accessed from a dropdown menu by clicking at the end of the profile row.

Setting a profile as default

The Status column shows whether an embeddings profile is set as the default.
The current default profile for new AI visuals is marked with the Default label.
To change the default, find the profile you want to set as default, click at the end of the profile row, and select Set as default.
note
After modifying a profile, save the Site Settings page to apply your changes.

Editing a profile

Click on the profile row and select Edit Profile from the dropdown menu.
Make the necessary changes.
[Optional] Click TEST to validate the model with the new settings.
Click SAVE.
After modifying a profile, save the Site Settings page to apply your changes.

Duplicating an existing profile

You can duplicate an existing embeddings profile to create a new profile instance.

Find the profile you want to copy and click on the profile row.
Select Create Profile from the dropdown menu. All details are pre-filled from the original profile, except for the Name field.
Enter a new name for the profile.
Click SAVE.
After creating a profile, save the Site Settings page to apply your changes.

Disabling a profile

The Status column shows whether a profile is enabled or disabled.
Disabled profiles are marked with the Disabled badge.
To disable an active profile, click at the end of the row and select Disable.
note
After modifying a profile, save the Site Settings page to apply your changes.
When a profile is disabled, you can only re-enable it. No other actions are available.
Disabling the default profile auto-assigns a new default, which is a random active profile.
If all profiles are disabled, the AI Assistant and Annotation features display warning messages, and the AI Summary is not shown.
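The default reassignment rule can be sketched as follows. The data structure and function name are assumptions for illustration; the sketch only mirrors the stated behavior that a random active profile becomes the new default.

```python
import random

def disable_profile(profiles, name, rng=random):
    """Disable a profile; if it was the default, pick a random active one."""
    target = profiles[name]
    was_default = target["default"]
    target["enabled"] = False
    target["default"] = False
    if was_default:
        active = [p for p in profiles.values() if p["enabled"]]
        if active:  # with no active profiles left, there is no default
            rng.choice(active)["default"] = True
    return profiles

profiles = {
    "a": {"enabled": True, "default": True},
    "b": {"enabled": True, "default": False},
    "c": {"enabled": False, "default": False},
}
disable_profile(profiles, "a")
```

Because the new default is chosen at random when several active profiles exist, set the default explicitly afterwards if a specific profile should be used.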

Deleting a profile

Click on the profile row and select Delete from the dropdown menu.
Click YES to confirm the deletion.
After deleting a profile, save the Site Settings page to apply your changes.

Completion

In this AI Settings section, you can manage completion profiles. You can create new profiles, view the list of existing ones, and perform actions such as copying, editing, deleting existing profiles, or setting a new default profile.

Creating a new completion profile

Click CREATE PROFILE.

The Create Profile modal window opens.
Provide a name for the profile.
Select the AI engine you want to use for the completion.
You have the following options available:
- Cloudera AI Hosted Llama
- Cloudera AI Inference
- OpenAI
- Anthropic
- OpenAI Azure
- Amazon Bedrock
- Other
Specify the settings for the selected AI engine.
Access Key: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Authorization API Key: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Service URL: URL for the completion service used by the AI engine

Service input parameter name: Name for the completion service input

Service response parameter name: Name for the completion service response parameter

Maximum tokens: Maximum number of tokens to generate in the response

The total token count includes both input and output tokens, and the default value is 4000.
note
You can disable this limit by specifying zero or a negative value.

Extra arguments: Any additional arguments for the completion query
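One way to picture how the Maximum tokens and Extra arguments settings could combine into a completion request is sketched below. The payload keys, the dictionary form of the extra arguments, and the function name are assumptions; the actual request format depends on the AI engine.

```python
DEFAULT_MAX_TOKENS = 4000  # default stated for completion profiles

def build_completion_payload(prompt, max_tokens=DEFAULT_MAX_TOKENS, extra_args=None):
    """Assemble a hypothetical completion request body."""
    payload = dict(extra_args or {})  # extra arguments are passed through as-is
    payload["prompt"] = prompt        # configured fields take precedence
    if max_tokens > 0:                # zero or negative disables the limit
        payload["max_tokens"] = max_tokens
    return payload

limited = build_completion_payload("Summarize sales by region", extra_args={"top_p": 0.9})
unlimited = build_completion_payload("Summarize sales by region", max_tokens=0)
```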
Authentication mode: Required to access the external third-party AI service provider

The available options for authentication are:

JWT

Select this option to authenticate using JSON Web Token (JWT). Authentication is automatically handled with the JWT token and no further configuration is required.

note
This authentication option is available only in Cloudera AI environments. JWT authentication can be used only when the Cloudera AI Inference model and Cloudera Data Visualization are in the same environment.

API Key

Select this option to authenticate using an API Key. You must provide a valid API key to access the external service.

Service URL: URL for the completion service used by the AI engine

Maximum tokens: Maximum number of tokens to generate in the response

The total token count includes both input and output tokens, and the default value is 4000.
note
You can disable this limit by specifying zero or a negative value.

Model: Specific model name or version to use for the completion

Temperature: Controls randomness in the generated output on a scale from 0 to 2

Lower values make results more focused and deterministic, while higher values introduce greater randomness and produce more varied outputs. The default setting is 1.

important
Only a fixed temperature value of 1.0 is supported for Cloudera AI Inference models.

Enable Streaming Responses: When enabled, the model’s output appears in real time as it is generated instead of waiting for the full response

This means that you start receiving query results immediately (in real time) while the system continues processing data and refining additional details.
API Key: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Model: Specific model name or version to use for the completion

For example: gpt-5, gpt-5-mini, or o3

Maximum tokens: Maximum number of tokens to generate in the response

The total token count includes both input and output tokens, and the default value is 4000.
note
You can disable this limit by specifying zero or a negative value.

Temperature: Controls randomness in the generated output on a scale from 0 to 2

Lower values make results more focused and deterministic, while higher values introduce greater randomness and produce more varied outputs. The default setting is 1.

important
This engine only supports a fixed temperature value of 1.0 for the following models:

gpt-5

gpt-5-mini

gpt-5-nano

gpt-5-chat-latest

gpt-5-codex

gpt-5-pro

o3-pro

o3

o3-deep-research

Enable Streaming Responses: When enabled, the model’s output appears in real time as it is generated instead of waiting for the full response

This means that you start receiving query results immediately (in real time) while the system continues processing data and refining additional details.
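The Temperature rules for this engine (a scale from 0 to 2, with a fixed value of 1.0 for the listed models) can be sketched as a small validation helper. The function name is hypothetical, and silently falling back to 1.0 for the fixed-temperature models is an assumption; the product may handle such a setting differently.

```python
FIXED_TEMPERATURE_MODELS = {
    "gpt-5", "gpt-5-mini", "gpt-5-nano", "gpt-5-chat-latest",
    "gpt-5-codex", "gpt-5-pro", "o3-pro", "o3", "o3-deep-research",
}

def resolve_temperature(model, requested=1.0):
    """Validate the requested temperature and apply the fixed-value rule."""
    if not 0 <= requested <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if model in FIXED_TEMPERATURE_MODELS:
        return 1.0  # assumption: fall back to the only supported value
    return requested
```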
API Key: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Model: Specific model name or version to use for the completion

For example: claude-3

Maximum tokens: Maximum number of tokens to generate in the response

The total token count includes both input and output tokens, and the default value is 4000.
note
You can disable this limit by specifying zero or a negative value.

Temperature: Controls randomness in the generated output on a scale from 0 to 2

Lower values make results more focused and deterministic, while higher values introduce greater randomness and produce more varied outputs. The default setting is 1.

Enable Streaming Responses: When enabled, the model’s output appears in real time as it is generated instead of waiting for the full response

This means that you start receiving query results immediately (in real time) while the system continues processing data and refining additional details.
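The effect of Enable Streaming Responses can be illustrated with a generator that stands in for a streaming model. Everything in this sketch is a hypothetical stand-in: no real service is called, and the chunk size is arbitrary.

```python
def fake_model_stream(text, chunk_size=8):
    """Stand-in for a streaming completion: yields the text piece by piece."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]

def collect(stream, on_chunk=None):
    """Consume a stream, handing each chunk to a callback as it arrives."""
    parts = []
    for chunk in stream:
        if on_chunk is not None:
            on_chunk(chunk)  # with streaming, output can be shown here
        parts.append(chunk)
    return "".join(parts)    # without streaming, only this is available

answer = "Sales rose 12% in Q3, led by the EMEA region."
shown = []
full = collect(fake_model_stream(answer), on_chunk=shown.append)
```

With streaming enabled, the interface can display each chunk as it arrives; with streaming disabled, nothing is shown until the complete text is available.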
API Key: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Service URL: URL for the completion service used by the AI engine

Maximum tokens: Maximum number of tokens to generate in the response

The total token count includes both input and output tokens, and the default value is 4000.
note
You can disable this limit by specifying zero or a negative value.

Enable Streaming Responses: When enabled, the model’s output appears in real time as it is generated instead of waiting for the full response

This means that you start receiving query results immediately (in real time) while the system continues processing data and refining additional details.
AWS Credentials: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Model: The completion model for generating AI-driven responses:

amazon.titan-text-lite-v1 (default)

amazon.titan-text-express-v1

us.anthropic.claude-3-7-sonnet-20250219-v1:0

anthropic.claude-3-5-sonnet-20241022-v2:0

anthropic.claude-3-5-haiku-20241022-v1:0

anthropic.claude-3-opus-20240229-v1:0

Custom – Allows specifying a different model.

Maximum tokens: Maximum number of tokens to generate in the response

The total token count includes both input and output tokens, and the default value is 4000.
note
You can disable this limit by specifying zero or a negative value.

Enable Streaming Responses: When enabled, the model’s output appears in real time as it is generated instead of waiting for the full response

This means that you start receiving query results immediately (in real time) while the system continues processing data and refining additional details.
API Key: Required to access the external third-party AI service provider

Ensure you review the legal terms and conditions provided in the Info tooltip before proceeding.

Service URL: URL for the completion service used by the AI engine

Custom Request/Response transformers: Use this textbox to provide your custom request and response transformer functions in valid Python.

Maximum tokens: Maximum number of tokens to generate in the response

The total token count includes both input and output tokens, and the default value is 4000.
note
You can disable this limit by specifying zero or a negative value.
Optional: Click TEST to validate the model using the specified settings.

If you do not want to save the new profile, click CLOSE to close the dialog without saving changes.
Click SAVE.
After creating a new profile, save the Site Settings page to apply your changes.

Synchronizing completion profiles (available only for Cloudera AI environments supporting auto-discovery)

Model synchronization is available in Cloudera AI environments that support automatic model discovery. Synchronizing ensures that completion profiles in Cloudera Data Visualization stay aligned with the models available in the Cloudera AI Inference service.

Click SYNC CAII MODELS to synchronize models with the Cloudera AI Inference service.
After synchronization completes, click Save on the Site Settings page to apply your changes.
- If a previously synchronized profile no longer exists in Cloudera AI Inference service or its state is neither loaded nor pending, it gets automatically deleted in Cloudera Data Visualization during the synchronization process.
- If there is a new AI Inference model in Cloudera AI Inference service, it is automatically added in Cloudera Data Visualization during the synchronization process.
- If an existing model has changed in Cloudera AI Inference service, those changes are reflected and updated in Cloudera Data Visualization.

Managing completion profiles

The Site Settings page for completion profiles displays the existing profiles, including their names, engines, and status information. For Cloudera AI environments supporting auto-discovery, the type of the profile is also shown.

Profile type (available only for Cloudera AI environments supporting auto-discovery)

The Type column indicates how a completion profile was created and its current state:

Manual – Profile was created manually, all settings are editable.
Synced – Profile was created automatically by synchronizing with Cloudera AI Inference. For synced completion profiles, authentication mode is always JSON Web Token (JWT), so this feature is available only if JWT is available.
Synced profiles can have the following states:
- Loaded – The AI Inference model is registered and available. Most fields are read-only and updated automatically through synchronization. The following fields can be edited manually:
  - Name
  - Token limit
  - Temperature
  - Enable/disable streaming
- Pending – The AI Inference model is not yet registered in Cloudera AI Inference.
  - Pending profiles are disabled by default in Cloudera Data Visualization because they are not yet usable. They become active once successfully synced.
  - A pending profile can automatically change to Loaded when sufficient resources become available, without manual action in Cloudera AI Inference.
  - In Cloudera Data Visualization, click SYNC CAII MODELS to refresh the status. If the model has been registered since the last sync, the profile updates to Loaded.
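The editing rules for the profile types above can be summarized in a short sketch. The field keys and function name are hypothetical, and treating pending profiles as not editable is an assumption, since only the editable fields of loaded profiles are listed.

```python
EDITABLE_WHEN_LOADED = {"name", "token_limit", "temperature", "streaming"}

def apply_edit(profile, field, value):
    """Return an updated copy if the profile's type and state allow the edit."""
    if profile["type"] == "manual":
        allowed = True  # manual profiles: all settings are editable
    elif profile["type"] == "synced" and profile["state"] == "loaded":
        allowed = field in EDITABLE_WHEN_LOADED  # other fields come from sync
    else:
        allowed = False  # assumption: pending profiles are not edited
    if not allowed:
        raise PermissionError(f"'{field}' is read-only for this profile")
    updated = dict(profile)
    updated[field] = value
    return updated
```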

Profile status

The statuses available across all environments are Default and Disabled.

The list of profiles is sorted by status: default first, enabled profiles next, and disabled profiles last. Disabled profiles are displayed greyed out to distinguish them from active ones. For Cloudera AI environments supporting auto-discovery, profiles within the enabled and disabled categories are further sorted by type — manual first, loaded second, and pending last.