Field statistics

Cloudera Data Visualization enables you to create Field statistics visuals. Field statistics visuals describe the dataset in statistical terms. They work across all data connections, and on datasets that contain table joins.

  1. Start a new visual based on the US County Population dataset.
    For instructions, see Creating a visual.
  2. Select Field Statistics in the VISUALS menu.
    The visual appears as a table that shows the following statistical information for each field in the dataset.
    • Actions column contains some widgets for making changes to the appearance of the table.
    • Field column indicates the name of the dataset field, and provides an option to include only a limited number of top values, when applicable.
    • Type column indicates the data type of the dataset field.
    • Distinct Values column reports the number of distinct values for this field. This is typically a good indication of the field's suitability as a dimension.
    • Minimum column reports the smallest (lowest) value in the dataset field. For strings, this would be the 'least' in alphabetical sort.

      In this example, for field stname (state name), this would be Alabama, and for field ctyname (county name), this would be Abbeville County.

    • Maximum column reports the largest (highest) value in the dataset field. For strings, this would be the 'greatest' in alphabetical sort.

      In this example, for field stname (state name), this would be Wyoming, and for field ctyname (county name), this would be Zieback County.

    • Detail column holds a graphic that is either a bar chart of top K values, or a distribution histogram of these values, for each column.
  3. You can make changes to the visual through the following actions:
    1. In the Actions column, click the (double up) icon to move the field to the very top of the table.
    2. In the Actions column, click the (up) icon to move the field up one row in the table.
    3. In the Actions column, click the (down) icon to move the field down one row in the table.
    4. In the Actions column, click the (trash) icon to remove the field from the table.
    5. In the Field column, toggle between:
      • Change to top k to change the Detail column to show a bar representation of the top 20 values (default) for the dataset field
      • Change to histogram to change the Detail column to show a distribution histogram for the dataset field
  4. Change the title to US County Populations - Field Statistics.
    [Optional] You can also add a brief description of the visual as a subtitle below the title of the visualization.
  5. Click SAVE at the top left corner of the Dashboard Designer.

In View mode, the Field statistics visual does not have actions, or it does not contain the Actions column.