Creating an AI visual

You can create an AI visual in Cloudera Data Visualization to let users query data through natural language. Before creating the visual, you must configure the required AI engine and data settings, and ensure the dataset meets the requirements for the chosen AI approach (SQL query or similarity search).

Before you start creating the AI visual, make sure the following prerequisites are met.

Configuration requirements:
Data requirements:

There are no special data requirements if you are using the SQL query AI approach.

  1. On the main navigation bar, click VISUALS.
    • To add the AI visual to an existing dashboard: open the dashboard and click EDIT.
    • To start fresh: click Create New > Dashboard to open a new, blank dashboard.
  2. In the Dashboard Designer interface, open the Visuals menu from the side menu bar and click NEW VISUAL.

    The Visual Builder appears. By default, a table visual is created. For more information, see Visual Builder overview.

  3. In the VISUALS menu, click the AI visual icon.

    A blank AI visual is added to the dashboard.

    The screen also displays the current AI approach (SQL query by default) and the default completion model. You can adjust the AI approach and other settings in the VISUAL > Settings panel.

  4. Click Settings > AI Approach from the right-side VISUAL menu and choose how user queries will be answered.
    The following options are available:
    • Use previous data and conversation – Uses the earlier context of the conversation to refine or expand responses, so that answers are more contextual. This option is enabled by default.

    • Use SQL query – Generates SQL queries against the connected dataset and summarizes the results. This option is enabled by default. When enabled, similarity search is disabled automatically.

    • Enable similarity search – Performs a vector-based search to find the most relevant records to answer the query. When enabled, SQL query is disabled automatically. This option is only available for SQLite, Solr 9+, and CSS datasets.

    To use the AI visual, you must select either SQL query or similarity search. With both options, you can also include previous data and conversation context.

    If you are using SQL query, continue with Step 5.

    If you are using similarity search, first populate the shelves of the AI visual:

    1. Switch back to the Build menu and populate the AI visual shelves from the fields listed in the DATA pane.
      • Embeddings: Add vector fields containing embeddings that you want the AI model to analyze in a semantic way.

      • Context Dimensions: Add fields to group or segment embedding results into categories.

      • Context Aggregates: Add aggregated fields that summarize data for similar records in the AI visual. This allows the AI visual to better answer questions related to quantities, distributions, and comparisons.

      • Tooltip: Define the source information to be included in the response tooltip. This information appears when the user hovers over the response's Info icon, but it is not sent to the completion service.

      • Limit: Define the maximum number of data rows retrieved from the vector database and analyzed.

    2. Click REFRESH VISUAL.

    3. Optional: Configure the embedding and completion related settings for similarity search. Click Settings > Vector Search from the right-side VISUAL menu to:
      • Set the maximum number of tokens. You can use 0 or a negative value to disable this setting.
      • Change the default profile.
      • Provide a template for vector search responses.
  5. Optional: Click Settings > Completion from the right-side VISUAL menu to configure completion.
    You can:
    • Change the selected profile.

    • Set the maximum number of tokens. If exceeded, the request is not sent to the service and an error message is displayed.

    • Choose a context overflow policy that defines how to handle situations where the length of generated tokens exceeds the context window size:
      • Throw an error: Raises an error when the conversation length exceeds the defined context window.
      • Remove old conversation first: Automatically manages conversation length by removing old tokens.
      • Truncate embedding context first: Truncates the embedding context, retaining the latest tokens.
    • Specify the context placeholder text that is used in the search results when data is missing or empty. If set, the text replaces empty values. If left blank, empty values are removed.

    • Specify the question prompt, which is the formatting of the user question.

    • If you are using SQL query, you can also provide custom instructions for formatting SQL query results. These instructions guide how data analysis results are presented to users.
  6. Optional: Click Settings > Display from the right-side VISUAL menu to configure how the visual is displayed.
    • Customize the chat welcome message, which is displayed as the first chat message from the AI visual.
    • Set the display format for the conversation: plain text or markdown.
  7. Click SAVE.
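
The context overflow policies listed under Settings > Completion are easier to compare side by side in code. The following Python sketch is illustrative only and does not show how Cloudera Data Visualization implements them. The names OverflowPolicy, count_tokens, and fit_to_window are hypothetical, and whitespace-separated words stand in for real model tokens.

```python
from enum import Enum


class OverflowPolicy(Enum):
    # Hypothetical names mirroring the three policies in the Completion settings.
    THROW_ERROR = "throw_error"
    REMOVE_OLD_CONVERSATION = "remove_old_conversation"
    TRUNCATE_EMBEDDING_CONTEXT = "truncate_embedding_context"


def count_tokens(text: str) -> int:
    # Assumption: one whitespace-separated word counts as one token.
    return len(text.split())


def fit_to_window(history, context, question, window, policy):
    """Return (history, context) trimmed so the request fits in `window` tokens."""

    def total(messages, ctx):
        return (
            sum(count_tokens(m) for m in messages)
            + count_tokens(ctx)
            + count_tokens(question)
        )

    if total(history, context) <= window:
        return list(history), context

    if policy is OverflowPolicy.THROW_ERROR:
        raise ValueError("conversation exceeds the context window")

    if policy is OverflowPolicy.REMOVE_OLD_CONVERSATION:
        trimmed = list(history)
        # Drop the oldest messages first until the request fits.
        while trimmed and total(trimmed, context) > window:
            trimmed.pop(0)
        if total(trimmed, context) > window:
            # Assumption: fail if dropping all history is still not enough.
            raise ValueError("context and question alone exceed the window")
        return trimmed, context

    # TRUNCATE_EMBEDDING_CONTEXT: keep only the latest tokens of the context.
    budget = window - total(history, "")
    if budget < 0:
        # Assumption: fail if history and question alone exceed the window.
        raise ValueError("history and question alone exceed the window")
    words = context.split()
    kept = words[len(words) - budget:] if budget > 0 else []
    return list(history), " ".join(kept)
```

For example, with a 9-token window and the "remove old conversation first" policy, the sketch drops the oldest message and leaves the embedding context untouched, whereas the "truncate embedding context first" policy keeps the full conversation and shortens the context instead.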