How compaction interacts with the Data Lake

In the Data Lake on Cloudera, the initiator and cleaner processes run in the metastore, as they do in Cloudera Data Warehouse on cloud. However, the worker process runs in HiveServer (HS2), which equates to a Hive Virtual Warehouse.

In Cloudera Data Warehouse, the initiator and cleaner processes run in the Database Catalog, which equates to the metastore. The Database Catalog maintains a connection with the Data Lake, and compaction jobs run in parallel with it. The worker process in Hive Virtual Warehouses executes queries to perform compaction.
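
The placement of the three compaction processes described above can be summarized as a small lookup table. The following is only an illustrative sketch in Python: the dictionary keys and the helper name host_for are hypothetical and are not Cloudera identifiers or configuration properties.

```python
# Where each compaction process runs, as described in this topic.
# The keys and the helper name are illustrative assumptions, not Cloudera settings.
COMPACTION_HOSTS = {
    "data_lake": {
        "initiator": "metastore",
        "cleaner": "metastore",
        "worker": "HiveServer (HS2)",
    },
    "cloudera_data_warehouse": {
        "initiator": "Database Catalog",
        "cleaner": "Database Catalog",
        "worker": "Hive Virtual Warehouse",
    },
}


def host_for(deployment: str, process: str) -> str:
    """Return the component that runs the given compaction process."""
    return COMPACTION_HOSTS[deployment][process]


print(host_for("data_lake", "worker"))
print(host_for("cloudera_data_warehouse", "initiator"))
```

Reading the table row by row shows the equivalence the text draws: the Database Catalog plays the role of the metastore, and a Hive Virtual Warehouse plays the role of HiveServer.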