Supported scheduled query operations

When you configure the CDP cluster, you can change the replication policy using the Hive scheduled queries that CDP supports.

CDP supports the following scheduled queries that you use to change the scheduled query that replicates the HDP workload to CDP.
  • Modify the replication policy schedule:
    alter scheduled query repl_policyname cron '2 2 * * *'; 
  • Run the replication policy:
    alter scheduled query repl_policyname execute; 
  • Delete the replication policy:
    drop scheduled query repl_policyname; 
  • Pause running the replication policy job:
     alter scheduled query repl_policyname disabled; 
  • Resume running the replication policy job:
    alter scheduled query repl_policyname enabled;
  • List scheduled replication policies:
    select * from sys.scheduled_queries; 
  • List all scheduled runs for all policies:
    select * from sys.scheduled_executions;
  • Track the error for the last job run for a replication policy:
    select s.error_message from sys.scheduled_executions s join sys.scheduled_queries t where s.scheduled_query_id = 7
    t.scheduled_query_id and t.schedule_name = '<policy-name>' order by s.scheduled_execution_id limit 1; 
  • Get the metrics from the Hive scheduler for a replication policy:
    select * from sys.replication_metrics where policy_name=repl<policy name> order by scheduled_execution_id desc limit 1; 
When you get metrics from the Hive scheduler, in case of irrecoverable errors, the status shows FAILED_ADMIN and the error log path appears in the metrics. Manually check the error and delete the error file to resume replication after you fix the error. For more information, see Troubleshooting.

The following sample snippet shows the Hive scheduler metrics on the target CDP cluster:

+---------------------------------------------+----------------------------------+----------------------------------------+----------------------------------------------------+----------------------------------------------------+
| replication_metrics.scheduled_execution_id  | replication_metrics.policy_name  | replication_metrics.dump_execution_id  |replication_metrics.metadata                        |replication_metrics.progress            
|
+---------------------------------------------+----------------------------------+----------------------------------------+----------------------------------------------------+----------------------------------------------------+
| 2380                                        | repl_db1                         | 0                                      |{"dbName":"db1_r","replicationType":"BOOTSTRAP","stagingDir":"hdfs://ns1/
user/hive/replDir/d1/ZGIx/fddc6e66-4ec4-431b-92ab-826c94bba5/hive","lastReplId":12082} 
                                                                                                                                                                               |{"status":"SUCCESS","stages":[{"name":"REPL_LOAD",
"status":"SUCCESS","startTime":1617669672695,"endTime":1617669675068,"metrics":[{"name":"FUNCTIONS","currentCount":0,"totalCount":0},{"name":"TABLES","currentCount":2,"totalCount":2}],"errorLogPath":null}]} |