Using profiling

The Profiling tab provides statistics and visualizations for selected fields in a dataset. You can explore dimensions and measures, view data types, refresh results, and manage cards to better understand the dataset.

Before you begin, make sure that data profiling is enabled in Site Settings > Data > Enable Data Profiling. For more information, see Managing data-related site settings.

1. Click DATA on the main navigation bar.
   The Data view appears, open on the Datasets tab.
2. Find the dataset that you want to profile, either by scrolling through the list or by using the search function.
3. Click the dataset you want to examine.
   The dataset side navigation pane opens for the selected dataset, displaying the Dataset Detail page.
4. Click Dataset Profiling in the left navigation panel.
   The Dataset Profiling view opens on the Profiling tab, where you can select fields from the dimensions and measures of the dataset.
5. Select the fields that you want to examine and click Explore.
   Cards appear with statistics and visualizations for each selected field. If a query runs too long, you can cancel it by clicking Stop. To rerun a query and get updated results, click Refresh.
Tip:
- The data type of each dimension or measure is shown in the top-right corner of its card.
- If many cards are displayed, you can click a selected data field pill to jump directly to its card.
- To remove a card, click ✕ on the corresponding data field pill in the top row. This also stops any running query for that field.
The following information is shown for the selected dataset fields.
Dimension values:

Most frequent

The string that appears the most often in the dataset, helping to identify the most common value.

Least frequent

The string that appears the least often in the dataset, helping to identify rare or unusual values.

Number of Empty string

The count of cells that contain an empty string, that is, cells that are not null but contain no characters.

Empty cells

The number of cells that are completely empty, including both null values and empty strings.

Record count

The total number of records in the dataset.

Duplicated rows

The number of rows that are exactly identical, helping to identify redundant data. A high number of duplicates often signifies low variability in the dataset.

Distinct count

The number of unique values in the dataset, showing how varied the data is.

Longest word

The string with the highest number of characters in the dataset, providing insight into the potential size of values.

Shortest word

The string with the fewest characters in the dataset, indicating the shortest data entry.

MIN

The minimum value in the dataset based on alphabetical order, useful for sorting or range analysis.

MAX

The maximum value in the dataset based on alphabetical order, helping to define the upper range of the data.
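
To make the dimension definitions above concrete, here is a minimal Python sketch that computes comparable statistics for a single column of strings. It is not the product's implementation: the function name `profile_dimension`, the use of `None` for a null cell, and the choice to leave empty strings out of the frequency, length, and ordering statistics are all assumptions made for illustration. Duplicated rows are omitted because they depend on whole rows rather than on one field.

```python
from collections import Counter

def profile_dimension(values):
    """Summarize one column of strings; None stands in for a null cell."""
    non_null = [v for v in values if v is not None]
    # Assumption: empty strings are excluded from the value-based statistics.
    non_empty = [v for v in non_null if v != ""]
    ranked = Counter(non_empty).most_common()  # highest count first
    return {
        "most_frequent": ranked[0][0] if ranked else None,
        "least_frequent": ranked[-1][0] if ranked else None,
        "empty_strings": len(non_null) - len(non_empty),
        "empty_cells": len(values) - len(non_empty),  # nulls plus empty strings
        "record_count": len(values),
        "distinct_count": len(ranked),
        "longest": max(non_empty, key=len, default=None),
        "shortest": min(non_empty, key=len, default=None),
        "min": min(non_empty, default=None),  # first in alphabetical order
        "max": max(non_empty, default=None),  # last in alphabetical order
    }

cities = ["Paris", "Oslo", "Paris", "", None, "Lima", "Paris", "Oslo"]
profile = profile_dimension(cities)
print(profile)
```

For this sample, "Paris" is the most frequent value, "Lima" is the least frequent, and two of the eight cells count as empty (one null and one empty string).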

The count histogram for dimensions shows the number of occurrences of each category (dimension value). This visualization is useful for understanding the quality, distribution, and structure of your dataset.
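
A count histogram of this kind can be sketched in a few lines of Python. The function name `count_histogram`, the text-based bars, and the decision to skip null and empty cells are illustrative assumptions, not a description of how the Profiling tab renders its chart.

```python
from collections import Counter

def count_histogram(values, width=20):
    """Return one text bar per category, most frequent category first."""
    # Assumption: null (None) and empty-string cells are not plotted.
    counts = Counter(v for v in values if v not in (None, ""))
    top = max(counts.values(), default=0)
    lines = []
    for category, n in counts.most_common():
        bar = "#" * round(width * n / top)  # scale bars to the largest count
        lines.append(f"{category:<8} {bar} {n}")
    return lines

statuses = ["open", "closed", "open", None, "pending", "open", "closed", ""]
bars = count_histogram(statuses)
for line in bars:
    print(line)
```

Each line pairs a category with a bar proportional to its count, so dominant and rare categories are visible at a glance.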
Measure values:

MIN

The smallest numeric value in the dataset, showing the lower bound of your data.

MAX

The largest numeric value in the dataset, showing the upper bound of your data.

Avg (Average)

The sum of all numeric values divided by the total number of records, giving an overall sense of the central value.

Standard Dev

The standard deviation, which shows how spread out the numbers are in the dataset. A higher value indicates more variability in the data, and a lower value indicates that the data points are closer to the average.

Number of Zeroes

The count of records where the value is exactly zero, which may indicate gaps or null-equivalents in the data.

Empty cells

The number of cells that are completely empty, including both null values and missing values.

Record count

The total number of records in the dataset.

Duplicated rows

The number of rows where all values are identical, helping to identify redundant data. A high number of duplicates often signifies low variability in the dataset.

Distinct count

The number of unique values in the dataset, showing how varied the data is.

Most frequent

The numeric value that appears the most often in the dataset, helping to identify the most common value.

Least frequent

The numeric value that appears the least often in the dataset, helping to identify rare or unusual values.
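
The measure definitions above can be illustrated with a short Python sketch built on the standard library. The function name `profile_measure` is hypothetical, and two choices are assumptions rather than documented product behavior: null cells (`None`) are left out of the average, and the standard deviation is the population form (`statistics.pstdev`). Duplicated rows are omitted because they depend on whole rows rather than on one field.

```python
import statistics
from collections import Counter

def profile_measure(values):
    """Summarize one numeric column; None stands in for a null cell."""
    present = [v for v in values if v is not None]
    ranked = Counter(present).most_common()  # highest count first
    return {
        "min": min(present, default=None),
        "max": max(present, default=None),
        # Assumption: nulls are excluded from the average and deviation.
        "avg": statistics.fmean(present) if present else None,
        "std_dev": statistics.pstdev(present) if present else None,
        "zeroes": sum(1 for v in present if v == 0),
        "empty_cells": len(values) - len(present),
        "record_count": len(values),
        "distinct_count": len(ranked),
        "most_frequent": ranked[0][0] if ranked else None,
        # When several values share the lowest count, any of them may be returned.
        "least_frequent": ranked[-1][0] if ranked else None,
    }

prices = [10, 0, 20, 10, None, 40, 10, 0, 30]
summary = profile_measure(prices)
print(summary)
```

In this sample the two zeroes are counted separately from the single empty cell, which mirrors the distinction the card draws between Number of Zeroes and Empty cells.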

The frequency histogram for measures shows the distribution of numeric values (measures) within specific ranges (buckets), giving insight into their spread and common ranges.
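
As a rough illustration of bucketing, the following Python sketch splits the range between the smallest and largest value into equal-width buckets and counts the values in each. The function name `bucket_histogram`, the default of four buckets, and the equal-width strategy are assumptions; the product may choose bucket boundaries differently.

```python
def bucket_histogram(values, buckets=4):
    """Count values per equal-width bucket; returns (start, end, count) tuples."""
    present = [v for v in values if v is not None]  # skip null cells
    if not present:
        return []
    lo, hi = min(present), max(present)
    width = (hi - lo) / buckets or 1  # fall back to 1 when all values are equal
    counts = [0] * buckets
    for v in present:
        # The maximum value would land one past the end, so clamp it
        # into the last bucket.
        index = min(int((v - lo) / width), buckets - 1)
        counts[index] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i])
            for i in range(buckets)]

prices = [10, 0, 20, 10, None, 40, 10, 0, 30]
histogram = bucket_histogram(prices)
for start, end, count in histogram:
    print(f"{start:>5.1f} to {end:>5.1f}: {count}")
```

Each bucket includes its lower bound and excludes its upper bound, except the last bucket, which also includes the maximum value.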