Fixed Issues in CDH 6.3.3

HDFS Snapshot corruption

A fix to HDFS snapshot functionality caused a regression in the affected CDH releases. When a snapshot is deleted, an internal data structure in the NameNode can become inconsistent, and the checkpoint operation on the Standby NameNode can fail.

Products affected: HDFS

Releases affected:
  • CDH 5.4.0 - 5.15.1, 5.16.0
  • CDH 6.0.0 - 6.2.1, 6.3.0, 6.3.1, 6.3.2
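
The affected-release list above can be checked mechanically. The following is a minimal sketch, assuming plain three-part version strings such as "6.3.2"; the helper names parse_version and is_affected are hypothetical and not part of any Cloudera tool.

```python
def parse_version(text):
    """Turn '6.3.2' into (6, 3, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Inclusive ranges copied from the "Releases affected" list.
# Single releases are written as a range with identical bounds.
AFFECTED_RANGES = [
    ("5.4.0", "5.15.1"),
    ("5.16.0", "5.16.0"),
    ("6.0.0", "6.2.1"),
    ("6.3.0", "6.3.0"),
    ("6.3.1", "6.3.1"),
    ("6.3.2", "6.3.2"),
]


def is_affected(cdh_version):
    """Return True if the given CDH version falls in any affected range."""
    version = parse_version(cdh_version)
    return any(
        parse_version(low) <= version <= parse_version(high)
        for low, high in AFFECTED_RANGES
    )


if __name__ == "__main__":
    for candidate in ("5.15.1", "5.15.2", "6.3.2", "6.3.3"):
        print(candidate, is_affected(candidate))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "5.4.0" as newer than "5.15.1".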

Users affected: Any clusters with HDFS Snapshots enabled

Impact: When the Standby NameNode detects the inconsistent snapshot data structure, it shuts itself down. To recover from this situation, the fsimage must be repaired and put back into the fsimage directories of both NameNodes so that the Standby NameNode can start normally. The Active NameNode stays up. However, no fsimage checkpoint is performed because the Standby NameNode is down.

This problem can also prevent snapshots from being deleted or files within snapshots being listed. The following is an example of a typical error:
hdfs dfs -deleteSnapshot /path snapshot_123
deleteSnapshot: java.lang.IllegalStateException

The recovery of the corrupt fsimage can result in the loss of snapshots.

Immediate action required:
  • Upgrade: Update to a version of CDH containing the fix.
  • Workaround: Alternatively, avoid using snapshots. Cloudera BDR uses snapshots automatically when the relevant directories are snapshottable. Hence, we strongly recommend avoiding the upgrade to the affected releases if you are using BDR. For information and instructions, see Enabling and Disabling HDFS Snapshots.

Addressed in release/refresh/patch: CDH 6.3.3

Knowledge article: For the latest update on this issue, see the corresponding Knowledge article: TSB 2020-390: HDFS Snapshot corruption

Processing UpdateRequest with delegation token throws NullPointerException

When the Spark Crunch Indexer or another client application that uses the SolrJ API sends Solr update requests with delegation token authentication, the server-side processing of the request might fail with a NullPointerException.

Affected Versions: CDH 6.0.0, 6.0.1, 6.1.0, 6.1.1, 6.2.0, 6.2.1, 6.3.0, 6.3.1, 6.3.2

Fixed Version: CDH 6.3.3

Apache Issue: SOLR-13921

Cloudera Issue: CDH-82599

Solr service with no added collections causes the upgrade process to fail

If no collections are present in Solr, the CDH 5.x to CDH 6.x upgrade fails while performing the bootstrap collections step of the solr-upgrade.sh script, with the following error message:
Failed to execute command Bootstrap Solr Collections on service Solr

Workaround: If the Solr service has no collections, remove it from your cluster before you start the upgrade.
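
Whether the workaround applies depends on whether the Solr service holds any collections. The following is a minimal sketch of that check using the Solr Collections API LIST action. The base URL is a placeholder, the function names are hypothetical, and the fetch helper assumes an unsecured endpoint; a Kerberos-enabled cluster would need an authenticated HTTP client instead.

```python
import json
from urllib.request import urlopen


def collection_names(list_response_text):
    """Extract collection names from a Collections API LIST response body."""
    payload = json.loads(list_response_text)
    return list(payload.get("collections", []))


def fetch_collection_names(solr_base_url):
    """Query a Solr node for its collections (assumes no authentication).

    solr_base_url is a placeholder such as 'http://solr-host:8983/solr'.
    """
    url = solr_base_url.rstrip("/") + "/admin/collections?action=LIST&wt=json"
    with urlopen(url) as response:
        return collection_names(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline illustration with a hand-written response body.
    sample = '{"responseHeader": {"status": 0}, "collections": []}'
    if not collection_names(sample):
        print("No collections found: the upgrade workaround applies.")
```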

Affected Versions: CDH 6.0.0, 6.0.1, 6.1.0, 6.1.1, 6.2.0, 6.2.1, 6.3.0, 6.3.1, 6.3.2

Fixed Version: CDH 6.3.3

Cloudera Issue: CDH-82042

HBase Lily indexer might fail to write role log files

In certain scenarios, the HBase Lily Indexer (Key-Value Store Indexer) fails to write its role log files.

Workaround: None

Affected Versions: CDH 6.0.0, 6.0.1, 6.1.0, 6.1.1, 6.2.0, 6.2.1, 6.3.0, 6.3.1, 6.3.2

Fixed Version: CDH 6.3.3

Cloudera Issue: CDH-82342

Adding a new indexer instance to HBase Lily Indexer fails with GSSException

When Kerberos authentication is enabled, adding a new indexer instance to the HBase Lily Indexer (Key-Value Store Indexer) might fail to authenticate while Lily communicates with the HBase Master process, throwing an exception similar to the following:

javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

Workaround: Ensure that the Lily indexer has a Sentry dependency configured by following these steps:
  1. Go to Cloudera Manager > Key-Value Store Indexer > Configuration.
  2. Make sure the Sentry Service configuration option points to a Sentry service instance instead of none.
The workaround does not require defining any Sentry roles or privileges. It only triggers a code execution path that authenticates the HBase service user.

Affected Versions: CDH 6.0.0, 6.0.1, 6.1.0, 6.1.1, 6.2.0, 6.2.1, 6.3.0, 6.3.1, 6.3.2

Fixed Version: CDH 6.3.3

Cloudera Issue: CDH-82566

The SentryKafkaAuthorizer throws an exception when describing ACLs via Kafka AdminClient

If Sentry contains Kafka authorization policies for any ConsumerGroup resource, Kafka authorization policies cannot be described or manipulated via Kafka AdminClient. This is due to a conversion error in Sentry. The SentryKafkaAuthorizer throws the following exception when converting the ConsumerGroup resource type between Sentry and Kafka libraries:

kafka.common.KafkaException: CONSUMERGROUP not a valid resourceType name. The valid names are Topic,Group,Cluster,TransactionalId,DelegationToken

This issue impacts any application that uses the ACL manipulation methods of KafkaAdminClient in Sentry-enabled environments.
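
The kind of name mismatch described above can be illustrated with a small sketch. This is not the Sentry or Kafka code: the function names and the alias table are hypothetical, the valid names are taken from the exception text, and the mapping of CONSUMERGROUP to Group is an assumption based on Kafka naming its consumer group resource type "Group".

```python
# Valid Kafka resource type names, as listed in the exception message.
KAFKA_RESOURCE_TYPES = (
    "Topic", "Group", "Cluster", "TransactionalId", "DelegationToken",
)

# Hypothetical alias table: Sentry's name for a consumer group differs
# from the name Kafka expects (assumption, see lead-in).
SENTRY_TO_KAFKA_ALIASES = {"CONSUMERGROUP": "Group"}


def naive_convert(sentry_name):
    """Look up a resource type by name only; fails for CONSUMERGROUP."""
    for kafka_name in KAFKA_RESOURCE_TYPES:
        if kafka_name.upper() == sentry_name.upper():
            return kafka_name
    raise ValueError(
        f"{sentry_name} not a valid resourceType name. "
        f"The valid names are {','.join(KAFKA_RESOURCE_TYPES)}"
    )


def convert_with_alias(sentry_name):
    """Translate known Sentry-only names before the by-name lookup."""
    translated = SENTRY_TO_KAFKA_ALIASES.get(sentry_name.upper(), sentry_name)
    return naive_convert(translated)


if __name__ == "__main__":
    print(convert_with_alias("CONSUMERGROUP"))
    try:
        naive_convert("CONSUMERGROUP")
    except ValueError as error:
        print(error)
```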

Workaround: Remove authorization policies for Kafka consumer groups in Sentry.

Affected Versions: CDH 5.x, CDH 6.3.0, 6.3.1, 6.3.2

Fixed Version: CDH 6.3.3

Apache Issue: SENTRY-2535

Cloudera Issue: CDH-82457

Upstream Issues Fixed

The following upstream issues are fixed in CDH 6.3.3:

Apache Accumulo

There are no notable fixed issues in this release.

Apache Avro

There are no notable fixed issues in this release.

Apache Crunch

There are no notable fixed issues in this release.

Apache Flume

There are no notable fixed issues in this release.

Apache Hadoop

The following issues are fixed in CDH 6.3.3:

  • HADOOP-15169 - The hadoop.ssl.enabled.protocols property should be considered in HttpServer2
  • HADOOP-15812 - ABFS: Improve the AbfsRestOperationException format to ensure that the entire message can be displayed on the console
  • HADOOP-15846 - ABFS: Fix mask related bugs in setAcl, modifyAclEntries, and removeAclEntries
  • HADOOP-15872 - ABFS: Update to target the latest REST version for ADLS Gen 2
  • HADOOP-15940 - ABFS: For HNS account, avoid unnecessary get call when performing a rename operation.
  • HADOOP-15948 - Inconsistency in get and put syntax if the name of a file or a directory contains spaces
  • HADOOP-15968 - ABFS: Add try and catch for UGI failure when initializing ABFS
  • HADOOP-15969 - ABFS: getNamespaceEnabled can fail, blocking user access through ACLs
  • HADOOP-15972 - ABFS: Reduce the list page size to 500
  • HADOOP-15975 - ABFS: Remove timeout check for DELETE and RENAME
  • HADOOP-16048 - ABFS: Fix Date format parser
  • HADOOP-16461 - Regression: FileSystem cache lock parses XML within the lock
  • HADOOP-16578 - Avoid FileSystem API calls when the FileSystem already exists
  • HADOOP-16587 - Make ABFS AAD-endpoints configurable

HDFS

The following issues are fixed in CDH 6.3.3:

  • HDFS-13193 - Various improvements for BlockTokenSecretManager
  • HDFS-13941 - Make storageId in BlockPoolTokenSecretManager.checkAccess optional
  • HDFS-14026 - Overload BlockPoolTokenSecretManager.checkAccess to make storageId and storageType optional
  • HDFS-14366 - Improve HDFS append performance

MapReduce 2

There are no notable fixed issues in this release.

YARN

The following issues are fixed in CDH 6.3.3:

  • YARN-9217 - NodeManager fails to start if the GPU is misconfigured on the node or GPU drivers are missing
  • YARN-9235 - If the Linux container executor is not set for a GPU cluster, GpuResourceHandlerImpl is not initialized and an NPE is thrown
  • YARN-9337 - Addendum to fix compilation error due to mockito spy call
  • YARN-9337 - GPU auto-discovery script runs even when the resource is given by hand

Apache HBase

The following issues are fixed in CDH 6.3.3:

  • HBASE-21991 - [Addendum] Mark LossCounting as Private
  • HBASE-22380 - Break circle replication when doing bulkload
  • HBASE-23046 - Remove compatibility case from truncate command

Apache Hive

The following issues are fixed in CDH 6.3.3:

  • HIVE-21999 - Add sensitive ABFS configuration properties to HiveConf hidden list
  • HIVE-22236 - Fail to create View selecting View containing NOT IN subquery

Hue

The following issues are fixed in CDH 6.3.3:

  • HUE-8946 - [core] Add back name as argument to import LDAP group or user commands
  • HUE-8946 - [useradmin] Fix argument as list in import_ldap_user and import_ldap_group
  • HUE-9011 - [hive] Fix invalid delimiters in create Hive table
  • HUE-9019 - [core] Fix concurrent_user_session_limit failed after Django upgrade
  • HUE-9025 - [editor] Fix multi query statement with invalidate metadata
  • HUE-9027 - [editor] Fix erratic behaviour of the horizontal result scrollbar

Apache Impala

The following issues are fixed in CDH 6.3.3:

  • IMPALA-6159 - Enabled TCP Keepalive packets for all outbound connections to ensure that stale TCP connections in an idle cluster are detected and closed within a time bound and a new connection is created on the next use
  • IMPALA-7802 - Impala now closes connections of idle client sessions to allow the service threads to be freed up
  • IMPALA-8333 - Removed a benign Impala Shell warning message at start-up
  • IMPALA-8612 - Fixed a sporadic NullPointerException when dropping an authorized table
  • IMPALA-8673 - Added the DEFAULT_HINTS_INSERT_STATEMENT query option that sets the default hints for the INSERT statements with no optimizer hint specified
  • IMPALA-8790 - Fixed an error for queries containing GROUP BY expressions of aggregations
  • IMPALA-8851 - Fixed an issue where the DROP TABLE IF EXISTS statement on a non-existing table threw an authorization exception when authorization is enabled
  • IMPALA-8969 - Fixed an issue where the grouping aggregator could cause a segmentation fault when performing multiple aggregations
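
As general background for IMPALA-6159 in the list above, the following is a minimal sketch of how TCP keepalive is typically enabled on an outbound socket. It is not Impala's implementation; the function name and the 60-second idle value are hypothetical, and the TCP_KEEPIDLE option is applied only on platforms where Python exposes it.

```python
import socket


def enable_keepalive(sock, idle_seconds=60):
    """Turn on TCP keepalive probes for an existing socket."""
    # Ask the kernel to probe the peer when the connection is idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Idle time before the first probe; this option is platform specific.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock


if __name__ == "__main__":
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(client)
    print(client.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    client.close()
```

With keepalive enabled, a peer that has silently gone away is eventually detected by the kernel, so the stale connection can be closed and replaced on next use.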

Apache Kafka

There are no notable fixed issues in this release.

Apache Kite

There are no notable fixed issues in this release.

Apache Kudu

The following issues are fixed in CDH 6.3.3:

  • KUDU-3014 - Fixed a bug in the connection negotiation code of the Kudu Java client. Now the Kudu Java client verifies the channel binding information while negotiating connections to Kudu servers
  • KUDU-2980 - Fixed an issue where a fault-tolerant scan operation failed for a projection with key columns specified in an order other than the table schema’s order
  • KUDU-2871 - Fixed RPC negotiation failure in the case when TLS v1.3 is supported at both the client and the server side. This is a temporary workaround before the connection negotiation code is properly updated to support 1.5-RTT handshake used in TLS v1.3. The issue affected Linux distributions shipped or updated with OpenSSL version 1.0.2 and newer
  • KUDU-2989 - Fixed an issue with connection negotiation using the SASL mechanism when the server FQDN is longer than 64 characters
  • Squeasel now supports ECC ciphers such as ECDH, based on the prime256v1 curve

Apache Oozie

The following issues are fixed in CDH 6.3.3:

Apache Parquet

There are no notable fixed issues in this release.

Apache Pig

There are no notable fixed issues in this release.

Cloudera Search

The following issues are fixed in CDH 6.3.3:

  • SOLR-13532 - Unable to start core recovery due to timeout in ping request
  • SOLR-13921 - Processing UpdateRequest with delegation token throws NullPointerException

Apache Sentry

The following issues are fixed in CDH 6.3.3:

  • SENTRY-2535 - SentryKafkaAuthorizer throws Exception when describing ACLs

Apache Spark

The following issues are fixed in CDH 6.3.3:

  • SPARK-24621 - [WEBUI] Show secure URLs on web pages
  • SPARK-27453 - Pass partitionBy as options in DataFrameWriter
  • SPARK-27621 - [ML] Linear Regression - validate training related params such as loss only during fitting phase
  • SPARK-29082 - [CORE] Skip delegation token generation if no credentials are available
  • SPARK-29105 - [CORE] Keep driver log file size up to date in HDFS

Apache Sqoop

There are no notable fixed issues in this release.

Apache ZooKeeper

The following issues are fixed in CDH 6.3.3:

  • ZOOKEEPER-2251 - Add Client side packet response timeout to avoid infinite wait