To enable table-level replication, you must specify the list of tables to be replicated in a given replication policy. Table-level replication enables you to replicate just the critical tables. It also helps you to speed-up the replication process and also reduces network bandwidth utilization.
You can define table level replication policy using regular expressions, for example, db.marketing_*. You can dynamically add or remove tables to the list by manually changing the replication policy during run time.
Hive supports database level replication policy of the format
<db_name>.*. In the real-time world, the policy format is
<db_name>.(t1, t3, …). The tables list can be specified
using Java supported regular expressions in the replication policy of format:
The replication policy has three parts separated with a DOT (.). The first part
is the database name, the second part is a single regex to represent the included tables
list, and the third part is a single regex to represent the tables that need to be
excluded from the list even if it matches the
The following examples illustrate the different ways to specify the tables list in the policy:
<db_name>- Full database replication which is currently supported.
<db_name>.'.*?'-Full database replication.
<db_name>.'t1|t3'- Database replication with static list of tables t1 and t3 included.
<db_name>.'(t1*)|t2'.'t100'- Database replication with all tables having prefix t1, includes table t2 which does not have prefix t1, and excludes t100 which has the prefix t1.
- If a table is dynamically added for replication due to changes in regular expression or added to the include list, the tables' data may not be point-in-time consistent with other tables which are already replicated incrementally. However, this inconsistency is seen for a very small duration until the completion of the next incremental replication after tables are added in the bootstrapped manner.
- Hive does not support overlapping replication policies such as db., db.[t1], and *. to same target database but works as expected if the target databases are different.