Troubleshooting DAS Installation
Also available as:
PDF

Replication failure in the DAS Event Processor

DAS uses replication as a way to copy database and table metadata information from Hive to DAS Postgres database. If the replication fails, then you may not be able to see database or table information in DAS.

The Hive metadata replication process occurs in the following two phases:
  • Bootstrap dump
  • Incremental dump

When the DAS Event Processor is started for the first time, the entire Hive database and the table metadata is copied into DAS. This is known as the bootstrap dump. After this phase, only the differences are copied into DAS at one-minute intervals, from the time the last successful dump was run. This is known as an incremental dump.

If the bootstrap dump never succeeded, then you may not see any database or table information in DAS. If the bootstrap dump fails, then the information regarding the failure is captured in the most recent Event Processor log.

If an incremental dump fails, then you may not see any new changes to the databases and tables in DAS. The incremental dump relies on events stored in the Hive metastore because these events take up a lot of space and are only used for replicating data. The events are removed from Hive metastore daily, which can affect DAS.

Fixing incremental dump failures

Note
Note
You must be an admin user to complete this task.
If you see the message “Notification events are missing in the meta store”, then reset the Postgres database using the following command:
curl -H 'X-Requested-By: das' -H 'Cookie: JSESSIONID=<session id cookie>' http(s)://<hostname>:<port>/api/replicationDump/reset
Where:
  • session id cookie is the cookie value which you have to get for an admin user on the DAS UI, from the browser
  • hostname is the DAS Webapp hostname

  • port is the DAS Webapp port

Note
Note

The error message "Notification events are missing in the meta store" is a symptom and not the cause of an issue, at most times. You can usually fix this by resetting the database. However, to determine the actual cause of the failure, we need to look at the first repl dump failure. Older Event Processor logs may contain information about the actual cause, but they are purged from the system at a set frequency. Therefore, if you hit this issue, we recommend that you first copy all the Event Processor logs that are available; else it might be too late to diagnose the actual issue.

If the first exception that you hit is an SQLException, then it is a Hive-side failure. Save the HiveServer and the Hive Metastore logs for the time when the exception occurred.

File a bug with Cloudera Support along with the above-mentioned logs.