Understanding the Hive Metastore category

Learn about the Cloudera Observability Hive Metastore category, which lists details about each table in your system and visually displays the current state and activity of your tables in the selected environment.

For users with a Hive Metastore deployment, Cloudera Observability captures the Hive Metastore (HMS) metadata about your tables and displays it in cards and views. These are found in the HMS Summary and HMS Tables views, which display the current state and activity of all your tables and list details about each table in your system, regardless of whether it has been queried.

The metric results displayed depend on your tables’ schemas and their HMS properties and parameters.

About the HMS Summary view

The Hive Metastore (HMS) Summary view visually displays information about the current state and activity of all your tables in the selected Environment.

It contains three sections:
  • Overview
  • Table Insights
  • Table Statistics

Overview

The Overview section displays general information about your tables and the number of databases in which they reside.

It displays the following cards:
Table 1. Overview cards
Card | Description
DATABASES | The number of databases in which your tables reside.
TABLES | The number of tables and the percentage of tables that are External and Managed.
VIEWS | The number of views and the percentage of views that are Materialized and Virtual.
PARTITIONS | The number of partitions.
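
As a rough illustration of how values like those on the DATABASES, TABLES, and PARTITIONS cards can be derived from table metadata, the following Python sketch summarizes a small list of table records. The record layout, field names, and the `overview` function are hypothetical and are not part of Cloudera Observability; views are left out for brevity.

```python
from collections import Counter

# Hypothetical table records. The field names are assumptions made for
# this illustration, not the Cloudera Observability data model.
tables = [
    {"database": "sales", "name": "orders", "type": "MANAGED_TABLE", "partitions": 12},
    {"database": "sales", "name": "returns", "type": "EXTERNAL_TABLE", "partitions": 0},
    {"database": "web", "name": "clicks", "type": "EXTERNAL_TABLE", "partitions": 30},
    {"database": "web", "name": "sessions", "type": "EXTERNAL_TABLE", "partitions": 8},
]


def overview(records):
    """Summarize records into overview-style counts and percentages."""
    total = len(records)
    type_counts = Counter(r["type"] for r in records)
    # Percentage of tables per table type, rounded to one decimal place.
    type_percent = {
        table_type: round(100 * count / total, 1)
        for table_type, count in type_counts.items()
    }
    return {
        "databases": len({r["database"] for r in records}),
        "tables": total,
        "type_percent": type_percent,
        "partitions": sum(r["partitions"] for r in records),
    }


summary = overview(tables)
print(summary)
```

With the sample records above, one of the four tables is Managed and three are External, so the type percentages are 25.0 and 75.0.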

Table Insights

This section displays the physical structure of your tables using the base table metrics. These cards help you assess how well your tables are structured for performance.

It displays the following cards:
Table 2. Table Insights cards
Card | Description
NUMBER OF PARTITIONS | The partition distribution across all tables.
PARTITION KEY SIZE | The partition key size distribution across all tables.
COLUMN SIZE | The column size distribution across all tables.
NUMBER OF BUCKET COLUMNS SIZE | The bucket column size distribution across all tables.
BUCKETED TABLES | The number of bucketed tables and the percentage across all tables.
COMPRESSED TABLES | The number of compressed tables and the percentage across all tables.
NON PARTITIONED TABLES | The number of non-partitioned tables and the percentage across all tables.
PARTITIONED TABLES | The number of partitioned tables and the percentage across all tables.
TABLES WITH ARRAY COLUMNS | The number of tables with array columns and the percentage across all tables.
TABLES WITH BINARY COLUMNS | The number of tables with binary columns and the percentage across all tables.
TABLES WITH MAP COLUMNS | The number of tables with map columns and the percentage across all tables.
TABLES WITH STRUCT COLUMNS | The number of tables with struct columns and the percentage across all tables.
TEMPORARY TABLES | The number of temporary tables and the percentage across all tables.
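
To make the idea of a distribution across all tables more concrete, the following Python sketch groups tables into ranges by partition count and computes the partitioned and non-partitioned shares. The sample counts, the range boundaries, and all function names are hypothetical; the actual cards may group values differently.

```python
from collections import Counter

# Hypothetical partition counts per table, made up for illustration.
partition_counts = {"orders": 12, "returns": 0, "clicks": 30, "sessions": 8, "users": 0}

# Hypothetical ranges as (inclusive upper bound, label) pairs.
# The real card may use different ranges.
RANGES = [(0, "0"), (10, "1-10"), (100, "11-100")]


def range_label(count):
    """Return the label of the first range whose upper bound covers count."""
    for upper, label in RANGES:
        if count <= upper:
            return label
    return "over 100"


def distribution(counts):
    """Count how many tables fall into each partition-count range."""
    return Counter(range_label(c) for c in counts.values())


def share(counts, predicate):
    """Return (number of matching tables, percentage across all tables)."""
    total = len(counts)
    matching = sum(1 for c in counts.values() if predicate(c))
    percent = round(100 * matching / total, 1) if total else 0.0
    return matching, percent


dist = distribution(partition_counts)
partitioned = share(partition_counts, lambda c: c > 0)
non_partitioned = share(partition_counts, lambda c: c == 0)
print(dict(dist), partitioned, non_partitioned)
```

The same pattern of a count plus a percentage across all tables applies to the other cards in this section, such as BUCKETED TABLES or TEMPORARY TABLES, with a different predicate.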

Table Statistics

This section displays physical metrics for tables that have statistics enabled, such as the volume of data, the number of rows and files, and how these values are distributed. Table statistics help the query engine optimize queries for better performance. Knowing the size and volume of a table helps the engine organize the workload appropriately, such as for a join or insert operation.

It displays the following cards:
Table 3. Table Statistics cards
Card | Description
STATISTICS ENABLED | The number of tables with statistics enabled and the percentage across all tables.
DATA VOLUME | The total size of tables with statistics enabled and the distribution across all tables.
NUMBER OF FILES (displayed by distribution) | The total number of files in tables with statistics enabled and the distribution across all tables.
NUMBER OF ROWS | The total number of rows in tables with statistics enabled and the distribution across all tables.
TOTAL DATA VOLUME | The total size of your tables with statistics enabled and the size of each storage format.
NUMBER OF FILES (displayed by type) | The total number of files that form the tables with statistics enabled, with each storage format shown as a percentage of the whole.
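
The following Python sketch shows one way such totals could be aggregated from table parameters. The parameter names `numRows`, `numFiles`, and `totalSize` are modeled on the basic statistics that Hive stores as table parameters; treating a table as having statistics enabled when all three are present is an assumption for this example, not a documented Cloudera Observability rule. The records and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical table records. Parameter values are strings, as is typical
# for table parameters; the data itself is made up for illustration.
tables = [
    {"name": "orders", "format": "Orc",
     "params": {"numRows": "1000", "numFiles": "4", "totalSize": "4096"}},
    {"name": "clicks", "format": "Parquet",
     "params": {"numRows": "5000", "numFiles": "10", "totalSize": "12288"}},
    {"name": "raw_logs", "format": "LazySimple", "params": {}},
]

STAT_KEYS = ("numRows", "numFiles", "totalSize")


def has_stats(table):
    # Assumption: statistics count as enabled when all three keys exist.
    return all(key in table["params"] for key in STAT_KEYS)


def statistics_summary(records):
    """Aggregate rows, files, and volume over tables that have statistics."""
    with_stats = [t for t in records if has_stats(t)]
    volume_by_format = defaultdict(int)
    for t in with_stats:
        volume_by_format[t["format"]] += int(t["params"]["totalSize"])
    percent = round(100 * len(with_stats) / len(records), 1) if records else 0.0
    return {
        "enabled": len(with_stats),
        "enabled_percent": percent,
        "rows": sum(int(t["params"]["numRows"]) for t in with_stats),
        "files": sum(int(t["params"]["numFiles"]) for t in with_stats),
        "volume": sum(volume_by_format.values()),
        "volume_by_format": dict(volume_by_format),
    }


stats = statistics_summary(tables)
print(stats)
```

In this sample, `raw_logs` has no statistics parameters, so it is excluded from the row, file, and volume totals and only affects the enabled percentage.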

About the HMS Tables view

The HMS Extract, which is updated daily, is displayed in the Hive Metastore (HMS) Tables view. It lists details about each table in your system, regardless of whether it has been queried.

The Tables view contains the following columns:
Column Name | Description
Table | The name of the table.
Database | The database in which the table resides.
Partitions | The number of partitions.
Volume | The total table size in bytes.
Rows | The number of records in the table.
Files | The number of files that make up the table.
Frequency of Access | The number of times queries have accessed the table.
Storage Format | The table’s storage format, including but not limited to:
  • JDBC
  • LazySimple
  • Orc
  • Parquet
Table Type | The table’s type, including but not limited to:
  • EXTERNAL_TABLE, which defines a table whose data is stored in the location specified during table creation.
  • MANAGED_TABLE, which defines a table whose data is stored in the warehouse directory.
  • VIRTUAL_TABLE, which defines a table that is the result of a query that has not been materialized and whose data is not stored.
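
As an example of how the Volume and Frequency of Access columns can be read together, the following Python sketch formats byte counts for readability and picks out large tables that are rarely queried. The row layout, thresholds, and function names are hypothetical and only mirror the column meanings described above; they do not represent a Cloudera Observability export format or API.

```python
def format_bytes(size):
    """Convert a byte count into a human-readable string (binary units)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(size)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


def large_cold_tables(rows, min_volume, max_access):
    """Return rows at least min_volume bytes in size that were accessed
    at most max_access times, largest first."""
    selected = [
        r for r in rows
        if r["volume"] >= min_volume and r["access"] <= max_access
    ]
    return sorted(selected, key=lambda r: r["volume"], reverse=True)


# Hypothetical rows; the values are made up for illustration.
rows = [
    {"table": "orders", "volume": 5 * 1024**3, "access": 120},
    {"table": "archive_2019", "volume": 40 * 1024**3, "access": 0},
    {"table": "tmp_export", "volume": 2048, "access": 1},
    {"table": "raw_logs", "volume": 12 * 1024**3, "access": 2},
]

cold = large_cold_tables(rows, min_volume=1024**3, max_access=5)
for row in cold:
    print(row["table"], format_bytes(row["volume"]), row["access"])
```

The thresholds here (1 GiB and five accesses) are arbitrary; suitable values depend on your environment.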