Change compactor configuration for Hive Virtual Warehouses on Cloudera Data Warehouse Public Cloud
To enhance performance, the compactor is a set of background processes that compact delta files, which are created as a by-product of data modifications. When it runs, it incurs additional load on the Hive Virtual Warehouse assigned as the compactor in Cloudera Data Warehouse (CDW) Public Cloud. You can change which Hive warehouse performs compaction to load-balance this workload as necessary.
One Hive Virtual Warehouse must be the compactor for all the Virtual Warehouses associated with a particular Database Catalog. This Hive Virtual Warehouse compactor runs all of the compaction queries for all Virtual Warehouses that use one particular Database Catalog, including Impala Virtual Warehouses. However, Impala Virtual Warehouses cannot be designated as the compactor Virtual Warehouse for a Database Catalog. Compaction tasks can only be assigned to a Hive Virtual Warehouse. The first Hive Virtual Warehouse you create against a Database Catalog is automatically set as the compactor. If you decide you do not want that particular warehouse to take on the compaction workload, you can set another Hive Virtual Warehouse to perform the compaction workload by following these steps:
- Log in to the CDP web interface and navigate to the Data Warehouse service.
- On the Overview page, select the Hive Virtual Warehouse that you want to set as the compactor, and click the options menu .
- In the options menu, select Set Compactor.