Replicating Data to Impala Clusters
Replicating Impala Metadata
Impala metadata replication is performed as part of Hive replication. Impala replication is supported only between two CDH 5 clusters, and the Impala and Hive services must be running on both clusters.
To enable Impala metadata replication, perform the following tasks:
- Schedule Hive replication as described in Configuring Replication of Hive/Impala Data.
- Confirm that the Replicate Impala Metadata option is set to Yes on the Advanced tab in the Create Hive Replication dialog.
Refreshing Impala Metadata
For Impala clusters that do not use LDAP authentication, you can configure an Advanced Configuration Snippet to automatically refresh Impala metadata with the INVALIDATE METADATA statement after Hive replication completes.
Alternatively, you can run the INVALIDATE METADATA statement manually for replicated tables. For more information about the statement, see INVALIDATE METADATA Statement.
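For the manual route, a small script can generate one INVALIDATE METADATA statement per replicated table. The following is a minimal sketch, not part of the product: the function name `invalidate_statements`, the sample database and table names, and the conservative identifier check are all illustrative assumptions.

```python
import re

# Conservative identifier check (an assumption for this sketch): letters,
# digits, and underscores, starting with a letter. Names outside this
# pattern are rejected rather than quoted.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def invalidate_statements(tables):
    """Return one INVALIDATE METADATA statement per (database, table) pair."""
    statements = []
    for database, table in tables:
        for name in (database, table):
            if not _IDENTIFIER.match(name):
                raise ValueError(f"unexpected identifier: {name!r}")
        statements.append(f"INVALIDATE METADATA {database}.{table};")
    return statements


if __name__ == "__main__":
    # Hypothetical list of tables covered by the Hive replication schedule.
    replicated = [("sales", "orders"), ("sales", "customers")]
    print("\n".join(invalidate_statements(replicated)))
```

The generated statements can be saved to a file and run on the destination cluster with whichever Impala client you normally use.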
To set the Advanced Configuration Snippet, perform the following steps:
- Navigate to the Configuration page for the Hive node you want to replicate.
- Under Category, select Advanced.
- In the Hive Replication Environment Advanced Configuration Snippet (Safety Valve) field, add the following parameter: RUN_INVALIDATE_METADATA=true.
- Save the changes.
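Before pasting a snippet into the field, it can help to check it for typos. The sketch below assumes the field takes one KEY=VALUE pair per line. The helpers `parse_env_snippet` and `refresh_enabled` are hypothetical and not part of Cloudera Manager, and skipping blank lines and `#` comments is a convenience of this sketch, not documented product behavior.

```python
def parse_env_snippet(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"not a KEY=VALUE line: {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


def refresh_enabled(text):
    """Return True only if RUN_INVALIDATE_METADATA is set to exactly 'true'."""
    return parse_env_snippet(text).get("RUN_INVALIDATE_METADATA") == "true"


if __name__ == "__main__":
    snippet = "RUN_INVALIDATE_METADATA=true"
    print(refresh_enabled(snippet))
```

The check is deliberately strict. It matches the exact spelling given in the step above, so a misspelled key or a value such as `yes` is reported as not enabled.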