Duplicate entry errors in Hue Query Processor

Learn how to address duplicate entry errors in the Query Processor that occur due to timezone mismatches.

Condition

When you use a MySQL database for the Query Processor, the service might fail or stop processing events. You might see a SQLIntegrityConstraintViolationException error in the logs, and query entries do not appear in the Hive Queries tab of the Job browser.

Cause

This issue occurs because of a timezone mismatch between the Java Virtual Machine (JVM) and the MySQL server. If these timezones are not synchronized, the system incorrectly reads date values from the database. This mismatch causes the file-scanning pipeline to attempt to insert duplicate records into the file_status table for data that already exists.

Solution

  1. Log in to Cloudera Manager as Administrator.
  2. Navigate to the Query Processor configuration.
  3. Add the following flag in the Additional Query Processor Java Options (query_processor_java_opts) property:
    -Duser.timezone=UTC
  4. Locate the Database Name (query_processor_database_name) property and append the timezone parameter to the connection string:
    ?connectionTimeZone=UTC
  5. Restart the Query Processor service to apply the changes.