Analyzing your tables
Minimize costs and maximize query performance by gaining more insight into your tables, including which tables are frequently or infrequently accessed, with the Cloudera Observability Metastore Analytics feature. Understanding your tables and their metadata, such as a table’s data volume or how often queries access a table, helps you troubleshoot, make informed decisions about your data, and ensure that your table data complies with your Storage Policy.
The Cloudera Observability Metastore Analytics feature collects and filters the Hive Metastore
(HMS) metadata into meaningful views of your tables, including which tables are hot (high
frequency of access) and which tables are cold (little or no frequency of access).
The Cloudera Observability Metastore Analytics feature displays information that enables you
to:
- Identify and track sudden table changes, such as changes in a table’s data size, number of partitions, or number of rows, that may impact the processing of your queries. The HMS Extract, which is updated daily, lists details about each table available in your system, regardless of whether it has been queried. It includes the table's configurations and, if enabled, the table's statistics, as well as size-related information, such as the table's volume, number of partitions, and number of rows.
- View and analyze the most frequently accessed tables from the Data Temperature's Hot Tables chart widget on the environment's Cluster Summary page. With this information, you can decide which tables should be moved to performance-efficient storage, such as an SSD, whose fast processing speed can improve query performance, especially for queries that access large amounts of data.
- View and analyze the least frequently accessed tables from the Data Temperature's Cold Tables chart widget on the environment's Cluster Summary page. With this information, you can decide which tables should be purged or moved to cost-efficient storage, which reduces platform costs.
- Analyze and troubleshoot inefficiencies within your tables, such as the wrong table type or storage format. The HMS Tables view and the Table Details panel display details about each table within your system, such as the table’s location, database, column names, and properties.
- Identify tables that contain huge amounts of data. With this information, you can decide whether a table needs partitioning or further partitioning, which improves query performance and reduces costs by limiting the amount of data that must be retrieved, manipulated, and output, and also makes your tables easier to manage.
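
To make the hot, cold, and partitioning decisions above concrete, the following minimal Python sketch applies the same reasoning to a handful of table records. It is an illustration only and does not use any Cloudera Observability API: the `TableStats` record, the function names, the sample data, and all thresholds are hypothetical, and the criteria that the product uses to classify tables are not described here.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical per-table record, loosely modeled on the metadata described above."""
    name: str
    access_count: int  # number of queries that accessed the table in the review period
    size_bytes: int    # table data volume
    partitions: int    # number of partitions (0 means unpartitioned)


def classify_temperature(tables, hot_threshold=100, cold_threshold=1):
    """Return (hot, cold) table names using assumed access-count thresholds."""
    hot = sorted(
        (t for t in tables if t.access_count >= hot_threshold),
        key=lambda t: t.access_count,
        reverse=True,  # most frequently accessed first
    )
    cold = sorted(
        (t for t in tables if t.access_count <= cold_threshold),
        key=lambda t: t.access_count,  # least frequently accessed first
    )
    return [t.name for t in hot], [t.name for t in cold]


def partitioning_candidates(tables, min_size_bytes=1 << 40):
    """Return names of large, unpartitioned tables (the 1 TiB default is an assumption)."""
    return [
        t.name
        for t in tables
        if t.size_bytes >= min_size_bytes and t.partitions == 0
    ]


# Made-up sample data for illustration only.
sample = [
    TableStats("sales", access_count=540, size_bytes=2 << 40, partitions=0),
    TableStats("clicks", access_count=120, size_bytes=50 << 30, partitions=365),
    TableStats("audit_2019", access_count=0, size_bytes=3 << 40, partitions=12),
    TableStats("tmp_load", access_count=1, size_bytes=10 << 20, partitions=0),
]

hot_tables, cold_tables = classify_temperature(sample)
large_unpartitioned = partitioning_candidates(sample)
```

In this sketch, hot tables are candidates for performance-efficient storage, cold tables are candidates for purging or cost-efficient storage, and large unpartitioned tables are candidates for partitioning. Choose thresholds that reflect your own workload and Storage Policy.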