Understanding the Cloudera Observability Metastore Analytics UI elements

Learn about the Cloudera Observability Metastore Analytics UI elements that display the Hive Metastore (HMS) metadata information about your tables.

About the Cloudera Observability data temperatures

In Cloudera Observability, Hot and Cold describe how often queries access a table. The color and its depth represent the number of times queries accessed the table relative to all the other tables in your system:
  • Hot tables (red) are tables that were frequently accessed during the selected time period.
  • Cold tables (blue) are tables that were infrequently accessed during the selected time period. This includes tables that no queries accessed during the selected time period, which by definition are considered the coldest tables.
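
The idea of a temperature that is relative to all other tables can be sketched in a few lines of Python. This is only an illustration: the linear scaling, the function name relative_temperature, and the sample table names and counts are assumptions, not the formula or data that Cloudera Observability uses.

```python
def relative_temperature(access_counts):
    """Map each table's query count to a value between 0.0 and 1.0.

    0.0 is the coldest table in the set and 1.0 is the hottest.
    The linear scaling is an illustrative assumption only.
    """
    if not access_counts:
        return {}
    lowest = min(access_counts.values())
    highest = max(access_counts.values())
    spread = highest - lowest
    if spread == 0:
        # Every table was accessed equally often, so none is hotter than another.
        return {name: 0.0 for name in access_counts}
    return {
        name: (count - lowest) / spread
        for name, count in access_counts.items()
    }


# Hypothetical query counts for the selected time period.
counts = {"sales.orders": 120, "sales.returns": 30, "archive.logs_2019": 0}
temps = relative_temperature(counts)
for name, value in temps.items():
    print(f"{name}: {value:.2f}")
```

In this sketch a table with zero queries always lands at the cold end of the scale whenever at least one other table was queried, which matches the description of the coldest tables above.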

About the data temperature charts

The Cloudera Observability Metastore Analytics feature has several UI elements that describe your table data.

The following charts display the data temperature information:
  • On the environment's Cluster Summary page, the Data Temperature chart automatically displays the 25 most frequently queried tables and the 25 least frequently queried tables across both the Hive and Impala engines, in the Hot Tables and Cold Tables chart widgets respectively.
  • On the Hive and Impala engine summary pages, the Data Temperature chart automatically displays the tables that were most frequently queried by that engine in the Hot Tables chart widget.
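
As a rough sketch of how a top 25 and bottom 25 selection could be derived from per-table query counts, consider the following Python example. The function name hot_and_cold, the limit parameter, and the sample counts are hypothetical, and the product's own selection and tie-breaking rules are not documented here.

```python
def hot_and_cold(access_counts, limit=25):
    """Return the most and least frequently queried tables.

    access_counts maps a table name to its query count for the
    selected time period. Both result lists hold (name, count) pairs.
    """
    # Hot tables: highest query counts first.
    hot = sorted(access_counts.items(), key=lambda item: item[1], reverse=True)[:limit]
    # Cold tables: lowest query counts first, so unqueried tables lead the list.
    cold = sorted(access_counts.items(), key=lambda item: item[1])[:limit]
    return hot, cold


# Hypothetical counts; a small limit keeps the output readable.
sample = {"a": 50, "b": 40, "c": 10, "d": 0, "e": 5}
hot, cold = hot_and_cold(sample, limit=2)
print("Hot:", hot)
print("Cold:", cold)
```

When a system has fewer tables than twice the limit, the same table can appear in both lists in this sketch.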

Hovering over a Hot or Cold table with your mouse pointer displays general information, such as the number of queries that accessed the table, the total table size in gibibytes, the number of partitions that comprise the table, the number of files that make up the table, and whether statistics were enabled on the table's rows.

Clicking the name of a table in either the Hot Tables or Cold Tables chart widget, or in the HMS Tables view in the HIVE METASTORE category of your environment's cluster, opens the table's Overview Details side drawer panel, which displays more information about the table.

About the Overview Details side drawer panel

The Overview Details side drawer panel provides more information about the table. The information displayed varies with the table’s HMS metadata, such as the table's schema, database location, partitions, structure, and relationships. The panel also describes the table's columns, such as the column names and their data types, and the table's metadata properties, which include user-defined and predefined key-value pairs.

You open the panel by clicking the name of a table in either the Hot Tables or Cold Tables chart widget, or from the HMS Tables view, which you display by selecting the Tables tab in the HIVE METASTORE category of your environment's cluster.

The information collected from your table's HMS metadata is divided into subcategories and displayed in the following tabs:
  • Details
  • Columns
  • Properties
Each tab also displays the following general table values:
  • Volume, which displays the total table size in Kilobytes.
  • Rows, which displays the number of records in the table.
  • Partitions, which, if applicable, displays the number of segments that comprise the table.
  • Total Files, which displays the number of files that make up the table.
If your table contains partitions, the Distribution Across Partitions section is also displayed, which contains the following distribution cards:
  • DATA SIZE, which displays the total data size of the table selected and the distribution across its partitions.
  • NUMBER OF FILES, which displays the total number of files within the table selected and the distribution across its partitions.
  • NUMBER OF ROWS, which displays the total number of rows within the table selected and the distribution across its partitions.
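
To illustrate what a total and its distribution across partitions mean for each card, the following Python sketch computes the total of one metric and each partition's share of it. The partition names, the metric keys (data_size, files, rows), and the function partition_distribution are hypothetical; the UI derives these values from HMS metadata.

```python
def partition_distribution(partitions, metric):
    """Return the table total for a metric and each partition's share of it.

    partitions maps a partition name to a dict of metric values.
    Shares are fractions between 0.0 and 1.0.
    """
    total = sum(values[metric] for values in partitions.values())
    shares = {
        name: (values[metric] / total if total else 0.0)
        for name, values in partitions.items()
    }
    return total, shares


# Hypothetical partition statistics for a table partitioned by date.
partitions = {
    "dt=2023-06-24": {"data_size": 300, "files": 3, "rows": 1000},
    "dt=2023-06-25": {"data_size": 100, "files": 1, "rows": 3000},
}
total_size, size_shares = partition_distribution(partitions, "data_size")
total_rows, row_shares = partition_distribution(partitions, "rows")
print(total_size, size_shares)
print(total_rows, row_shares)
```

A partition that holds most of the data size but few of the rows, as in this sample, is the kind of skew the distribution cards make visible.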

The HMS metadata that is displayed in each tab depends on the underlying data on which the table is built. The following tables describe the most common parameters displayed in the Details, Columns, and Properties tabs:

Table 1. Details
Parameter Description
Historical Trend chart Displays the historical values for the Rows, Data Volume, and Partitions.
Database name The database in which the table resides.
Compressed Displays a True or False value depending on whether data compression has been applied.
Location The table’s location in HDFS.
Partition Keys The names of the partition keys that are responsible for data distribution across the nodes.
Raw Data Size The raw data size of the table, in the nearest byte unit.
Storage Format The table’s storage format, such as but not limited to:
  • JDBC
  • LazySimple
  • Orc
  • Parquet
Stats Enabled Displays a True or False value depending on whether statistics were enabled.
Table Type The table’s type, such as but not limited to:
  • EXTERNAL_TABLE, which defines a table whose data is stored in the location specified during table creation.
  • MANAGED_TABLE, which defines a table whose data is stored in the warehouse directory.
  • VIRTUAL_TABLE, which defines a table that is the result of a query which has not materialized and whose data is not stored.
Transactional Displays a True or False value depending on whether the table contains one or more ACID semantic properties.
Created The date when the table was created, using the MM-DD-YYYY date format. For example, 06-25-2023.
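
If you copy a Created value out of the UI for use in a script, the MM-DD-YYYY format can be parsed with Python's standard datetime module. This is a minimal sketch. The helper name parse_created is hypothetical, and the input is the example value from the table above.

```python
from datetime import date, datetime


def parse_created(value: str) -> date:
    """Parse a date written as MM-DD-YYYY, for example 06-25-2023."""
    # %m is the month, %d the day, and %Y the four-digit year.
    return datetime.strptime(value, "%m-%d-%Y").date()


created = parse_created("06-25-2023")
print(created.isoformat())
```

Converting to an ISO 8601 string avoids confusing the month-first format with day-first formats when the value is shared.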
Table 2. Columns
Parameter Description
Column Name Lists the column field names.
Type The Hive data type, as one of the following:
  • bigint
  • binary
  • boolean
  • char
  • date
  • decimal
  • double
  • float
  • int
  • smallint
  • string
  • timestamp
  • tinyint
  • varchar
Comment An informative note about the column that was added during table creation.
Table 3. Properties
Parameter Sections Description
Table Properties Predefined and user-defined metadata key-value pair properties.
SerDe Properties Serialization and deserialization properties.
Storage Descriptor Properties Metadata that describes the physical storage properties of the data residing in the table.