Query fails with "Counters limit exceeded" error message

Condition

After upgrading to CDP Private Cloud Base, you may notice that some Hive queries fail with a "Counters limit exceeded: Too many counters: 10001 max=10000" or a "Counters limit exceeded: Too many counter groups: 3001 max=3000" error.

Cause

This issue occurs because the limits configured for Apache Tez task counters (tez.counters.max) and counter groups (tez.counters.max.groups) are too low for the query, which prevents Tez from executing the DAG. If you are running long queries, increase the values for Tez counters and counter groups.
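To tell which of the two limits a failed query hit, the error text can be matched against the two messages quoted in this article. The following is a minimal sketch, not part of Hive or Tez: the helper name limit_to_raise is hypothetical, and the patterns assume the message wording shown above.

```python
import re

# Hypothetical helper: maps the error text quoted in this article
# to the Tez property whose limit was exceeded.
_PATTERNS = [
    (re.compile(r"Too many counter groups: (\d+) max=(\d+)"), "tez.counters.max.groups"),
    (re.compile(r"Too many counters: (\d+) max=(\d+)"), "tez.counters.max"),
]


def limit_to_raise(log_line):
    """Return (property, current_max) for a 'Counters limit exceeded' line, else None."""
    for pattern, prop in _PATTERNS:
        match = pattern.search(log_line)
        if match:
            # Group 2 is the configured maximum reported in the message.
            return prop, int(match.group(2))
    return None


if __name__ == "__main__":
    print(limit_to_raise("Counters limit exceeded: Too many counters: 10001 max=10000"))
```

The returned maximum is the limit currently in effect, which gives a lower bound for the new value to configure.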

It is recommended that you set the value of tez.counters.max at the session level. To allow this, also add the property to the allowlist.

Solution

  1. Log in to Cloudera Manager as an administrator.
  2. Go to Clusters > Hive on Tez > Configuration and search for 'HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml'.
  3. Add the property key: hive.security.authorization.sqlstd.confwhitelist.append.
  4. Provide the property value, or values, to allowlist, for example: tez\.counters\..* or tez\.counters\.max|tez\.counters\.max\.groups.
    This action appends the parameters to the allowlist.
  5. Save the changes and restart the Hive on Tez service.
  6. From the Beeline shell, set the tez.counters.max property to a higher value, for example:
    set tez.counters.max=50000;
    Run the Hive query. If it continues to fail, perform the following steps.
  7. In Cloudera Manager, go to Clusters > Tez > Configuration and search for the tez.counters.max property.
  8. Modify the value to 50000, save changes, and refresh the Tez service.
  9. Restart the Hive on Tez service and run the Hive query again.
    If you encounter the following error, increase the value of tez.counters.max.groups.
    ERROR : Counters limit exceeded: org.apache.tez.common.counters.LimitExceededException: Too many counter groups: 3001 max=3000
  10. In Cloudera Manager, go to Clusters > Tez > Configuration and search for the tez.counters.max.groups property.
  11. Modify the value to 10000, save changes, and refresh the Tez service.
  12. Restart the Hive on Tez service and run the Hive query again.
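Before saving the allowlist value from step 4, it can help to confirm that the pattern covers the property names you intend to set. The following is a rough sketch under two assumptions: that Hive requires the pattern to match the whole property name, and that Python's re module behaves like Java regular expressions for these simple patterns. The function name is_allowed is illustrative only.

```python
import re

# Patterns taken from step 4. Hive evaluates them as Java regular
# expressions; Python's re module is used here only as an approximation.
ALLOWLIST_PATTERNS = [
    r"tez\.counters\..*",
    r"tez\.counters\.max|tez\.counters\.max\.groups",
]


def is_allowed(pattern, prop):
    """Return True if the pattern matches the entire property name (assumed semantics)."""
    return re.fullmatch(pattern, prop) is not None


if __name__ == "__main__":
    for pattern in ALLOWLIST_PATTERNS:
        for prop in ("tez.counters.max", "tez.counters.max.groups"):
            print(pattern, prop, is_allowed(pattern, prop))
```

Under these assumptions, both example patterns cover tez.counters.max and tez.counters.max.groups. The first is broader and would also cover any other property that starts with tez.counters.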