Fixed Issues in CDH 6.0.0
See below for issues fixed in CDH 6.0.0, grouped by component:
Apache Accumulo
Running Apache Accumulo on top of a CDH 6.0.0 cluster is not currently supported. If you try to upgrade to CDH 6.0.0, you will be asked to remove the Accumulo service from your cluster. Running Accumulo on top of CDH 6 will be supported in a future release.
Apache Avro
There are no notable fixed issues in this release.
Apache Crunch
There are no notable fixed issues in this release.
Apache Flume
There are no notable fixed issues in this release.
Apache Hadoop
HDFS
- HADOOP-12267 - DistCp to S3a fails due to integer overflow in retry timer.
MapReduce 2 and YARN
- YARN-4212 (CDH-31358) - Jobs in pool with DRF policy will not run if root pool is FAIR.
- MAPREDUCE-6638 (CDH-37412) - Jobs with encrypted spills do not recover if the Application Master goes down.
- YARN-1558 - Moving jobs between queues not persistent after restart.
Apache HBase
In CDH 6.0.0, the default values for properties that are required to enable cell-level ACLs have changed. Previously, you needed to modify the properties to enable cell-level ACLs. In CDH 6.0.0, you do not need to modify them. The properties and their new default values are listed below:
- hbase.security.exec.permission.checks => true
- hbase.security.access.early_out => false
- hfile.format.version => 3
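For reference, setting these values explicitly in hbase-site.xml is now redundant, but a minimal fragment would look like the following sketch (shown for illustration only):

  <!-- Illustration only: these are already the defaults in CDH 6.0.0 -->
  <property>
    <name>hbase.security.exec.permission.checks</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.security.access.early_out</name>
    <value>false</value>
  </property>
  <property>
    <name>hfile.format.version</name>
    <value>3</value>
  </property>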
For information about the upstream fixes, see the Apache HBase JIRAs.
The following JIRA has also been closed:
- HBASE-7621 - RemoteHTable now supports binary row keys with any character or byte by properly encoding request URLs.
Apache Hive / HCatalog / Hive on Spark
In CDH 6.0, fixed issues in Hive resulted in new features and incompatible changes. For details on these fixed issues, see the CDH 6.0 Release Notes for Hive.
Hue
There are no notable fixed issues in this release.
Apache Impala
There are no notable fixed issues in this release.
Apache Kafka
Authenticated Kafka clients may impersonate other users
Authenticated Kafka clients may impersonate any other user via a manually crafted protocol message with SASL/PLAIN or SASL/SCRAM authentication when using the built-in PLAIN or SCRAM server implementations in Apache Kafka.
Note that the SASL authentication mechanisms that apply to this issue are neither recommended nor supported by Cloudera. In Cloudera Manager (CM), there are four choices for the Kafka security protocol: PLAINTEXT, SSL, SASL_PLAINTEXT, and SASL_SSL. The SASL/PLAIN mechanism described in this issue is not the same as the SASL_PLAINTEXT option in CM; that option uses Kerberos and is not affected. As a result, it is highly unlikely that Kafka is susceptible to this issue when managed by CM, unless the authentication protocol is overridden by an Advanced Configuration Snippet (Safety Valve).
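To make the distinction concrete, the vulnerable setup corresponds to a broker whose server.properties enables the built-in PLAIN mechanism directly, roughly as in the following sketch (the listener address is an illustrative assumption; this is not a CM-generated configuration):

  # Illustration only: a broker using the built-in SASL/PLAIN mechanism
  listeners=SASL_PLAINTEXT://0.0.0.0:9092
  security.inter.broker.protocol=SASL_PLAINTEXT
  sasl.enabled.mechanisms=PLAIN
  sasl.mechanism.inter.broker.protocol=PLAIN
  # A CM-managed SASL_PLAINTEXT listener uses Kerberos (GSSAPI) instead,
  # which is not affected:
  # sasl.enabled.mechanisms=GSSAPI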
Products affected: CDK Powered by Apache Kafka
Releases affected: CDK 2.1.0 to 2.2.0, CDK 3.0.0
Users affected: All users
Detected by: Rajini Sivaram (rsivaram@apache.org)
Severity (Low/Medium/High): 8.3 (High) (CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:H/A:H)
Impact: Privilege escalation.
CVE: CVE-2017-12610
Immediate action required: Upgrade to a newer version of CDK Powered by Apache Kafka where the issue has been fixed.
Addressed in release/refresh/patch: CDK 3.1, CDH 6.0 and higher
Knowledge article: For the latest update on this issue, see the corresponding Knowledge article: TSB 2018-332: Two Kafka Security Vulnerabilities: Authenticated Kafka clients may impersonate other users and may interfere with data replication
Authenticated clients may interfere with data replication
Authenticated Kafka users may perform an action reserved for the broker via a manually crafted fetch request that interferes with data replication, resulting in data loss.
Products affected: CDK Powered by Apache Kafka
Releases affected: CDK 2.0.0 to 2.2.0, CDK 3.0.0
Users affected: All users
Detected by: Rajini Sivaram (rsivaram@apache.org)
Severity (Low/Medium/High): 6.3 (Medium) (CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:L)
Impact: Potential data loss due to improper replication.
CVE: CVE-2018-1288
Immediate action required: Upgrade to a newer version of CDK Powered by Apache Kafka where the issue has been fixed.
Addressed in release/refresh/patch: CDK 3.1, CDH 6.0 and higher
Knowledge article: For the latest update on this issue, see the corresponding Knowledge article: TSB 2018-332: Two Kafka Security Vulnerabilities: Authenticated Kafka clients may impersonate other users and may interfere with data replication
Upstream Issues Fixed
Metrics
- KAFKA-6252 - A metric named 'XX' already exists, can't register another one.
- KAFKA-5987 - Kafka metrics templates (MetricNameTemplate and the Metric.toHtmlTable) used in document generation should maintain order of tags.
- KAFKA-5968 - Remove all broker metrics during shutdown.
- KAFKA-5746 - New metrics to support health checks, including:
Broker-side metrics
- Error rates
- Message conversion rate and time
- Request size and temporary memory size
- Authentication success and failure rates
- ZooKeeper status and latency
Client-side metrics
- Client versions exposed as a metric
- KAFKA-5738 - Add cumulative count attribute for all Kafka rate metrics to make Kafka metrics more compatible with other metrics such as Yammer.
- KAFKA-5597 - Auto-generate Producer sender metrics.
- KAFKA-5461 - KIP-168: New metric ("GlobalTopicCount") tracks the total topic count per cluster.
- KAFKA-5341 - New metrics ("UnderMinIsrPartitionCount" and per-partition "UnderMinIsr") track the number of partitions whose in-sync replica count is below the configured minimum (min.insync.replicas).
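As a sketch of how the new broker metrics can be consumed, the following Java fragment reads the UnderMinIsrPartitionCount gauge over JMX; the broker host and JMX port are illustrative assumptions (the broker must be started with remote JMX enabled), while the MBean name follows the standard kafka.server naming:

  import javax.management.MBeanServerConnection;
  import javax.management.ObjectName;
  import javax.management.remote.JMXConnector;
  import javax.management.remote.JMXConnectorFactory;
  import javax.management.remote.JMXServiceURL;

  public class UnderMinIsrCheck {
      public static void main(String[] args) throws Exception {
          // Assumes the broker exposes remote JMX on port 9999.
          JMXServiceURL url = new JMXServiceURL(
                  "service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi");
          try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
              MBeanServerConnection mbs = connector.getMBeanServerConnection();
              // Gauge added by KAFKA-5341: partitions below min.insync.replicas.
              ObjectName gauge = new ObjectName(
                      "kafka.server:type=ReplicaManager,name=UnderMinIsrPartitionCount");
              System.out.println("Partitions under min ISR: "
                      + mbs.getAttribute(gauge, "Value"));
          }
      }
  }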
Security Supportability
- KAFKA-6258 - SSLTransportLayer should keep reading from the socket until either the buffer is full or the socket has no more data.
- KAFKA-5920 - Handle SSL authentication failures as non-retriable exceptions in clients.
- KAFKA-5854 - KIP-152: Handle SASL authentication failures as non-retriable exceptions in clients.
- KAFKA-5783 - KIP-189: Implement KafkaPrincipalBuilder interface with support for SASL.
- KAFKA-5720 - In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with org.apache.kafka.common.errors.TimeoutException.
- KAFKA-5417 - Clients get inconsistent connection states when SASL/SSL connection is marked CONNECTED and DISCONNECTED at the same time.
- KAFKA-4764 - KIP-152: Improve diagnostics for SASL authentication failures.
Kafka Client
- KAFKA-6287 - Inconsistent protocol type for empty consumer groups.
- KAFKA-5856 - KIP-195: Increase the number of partitions of a topic using AdminClient.createPartitions() (see the sketch after this list).
- KAFKA-5763 - Refactor NetworkClient to use LogContext.
- KAFKA-5762 - Refactor AdminClient to use LogContext.
- KAFKA-5755 - Refactor KafkaProducer to use LogContext.
- KAFKA-5737 - KafkaAdminClient thread should be a daemon.
- KAFKA-5726 - Add a KafkaConsumer subscribe overload that takes just a Pattern without a ConsumerRebalanceListener.
- KAFKA-5629 - ConsoleConsumer overrides the auto.offset.reset property when it is provided on the command line, without warning about it.
- KAFKA-5556 - The KafkaConsumer method commitSync throws the exception IllegalStateException: Attempt to retrieve exception from future which hasn't failed.
- KAFKA-5534 - The KafkaConsumer method offsetsForTimes should include partitions in result even if no offset could be found.
- KAFKA-5512 - KafkaConsumer: High memory allocation rate when idle.
- KAFKA-4856 - Calling KafkaProducer.close() from multiple threads may cause spurious error.
- KAFKA-4767 - KafkaProducer is not joining its IO thread properly.
- KAFKA-4669 - KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws an exception.
- KAFKA-2105 - NullPointerException in client on metadataRequest.
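To illustrate the KIP-195 addition from KAFKA-5856 referenced above, the following is a minimal sketch that grows a topic to six partitions with the AdminClient; the bootstrap address and topic name are illustrative assumptions:

  import java.util.Collections;
  import java.util.Properties;
  import org.apache.kafka.clients.admin.AdminClient;
  import org.apache.kafka.clients.admin.AdminClientConfig;
  import org.apache.kafka.clients.admin.NewPartitions;

  public class GrowTopic {
      public static void main(String[] args) throws Exception {
          Properties props = new Properties();
          props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-host:9092");
          try (AdminClient admin = AdminClient.create(props)) {
              // Increase the total partition count of topic "events" to 6.
              // createPartitions() can only grow a topic, never shrink it.
              admin.createPartitions(Collections.singletonMap(
                      "events", NewPartitions.increaseTo(6)))
                   .all().get();
          }
      }
  }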
Apache Kudu
There are no notable fixed issues in this release.
Apache Oozie
There are no notable fixed issues in this release.
Apache Parquet
- PARQUET-357 - Parquet-thrift generates wrong schema for Thrift binary fields.
- PARQUET-686 - Clarifications about min-max stats.
- PARQUET-753 - Fixed GroupType.union() to handle original type.
- PARQUET-765 - Upgrade Avro to 1.8.1.
- PARQUET-783 - Close the underlying stream when an H2SeekableInputStream is closed.
- PARQUET-791 - Add missing column support for UserDefinedPredicate.
- PARQUET-806 - Parquet-tools silently suppresses error messages.
- PARQUET-825 - Static analyzer findings (NPEs, resource leaks).
- PARQUET-1005 - Fix DumpCommand parsing to allow column projection.
- PARQUET-1064 - Deprecate type-defined sort ordering for INTERVAL type.
- PARQUET-1065 - Deprecate type-defined sort ordering for INT96 type.
- PARQUET-1133 - Add int96 support by returning bytearray, and skip originalType comparison for map types when originalType is null.
- PARQUET-1141 - Fix field ID handling.
- PARQUET-1152 - Parquet-thrift doesn't compile with Thrift 0.9.3.
- PARQUET-1153 - Parquet-thrift doesn't compile with Thrift 0.10.0.
- PARQUET-1185 - TestBinary#testBinary unit test fails after PARQUET-1141.
- PARQUET-1191 - Type.hashCode() takes originalType into account but Type.equals() does not.
- PARQUET-1208 - Occasional endless loop in unit test.
- PARQUET-1217 - Incorrect handling of missing values in Statistics.
- PARQUET-1246 - Parquet ignores float/double statistics in case of NaN.
Apache Pig
There are no notable fixed issues in this release of Apache Pig.
Cloudera Search
Cloudera Search in CDH 6.0 is rebased on Apache Solr 7.0, which fixes many issues present in the Solr 4.10 base used in recent CDH 5 releases.
For information on the fixes, see the upstream Apache Solr release notes.
Apache Sentry
There are no notable fixed issues in this release.
Apache Spark
CDH 6.0.0 uses CDS 2.2 Release 2 Powered By Apache Spark. The fixed issues listed in the release notes for CDS 2.2 Release 2 have been incorporated in CDH 6.0.0.
Apache Sqoop
- SQOOP-3273: Removing com.cloudera.sqoop packages
- SQOOP-3275: HBase test cases should start mini DFS cluster as well
- SQOOP-3255: Sqoop ignores metastore properties defined in sqoop-site.xml
- SQOOP-3241: ImportAllTablesTool uses the same SqoopOptions object for every table import
- SQOOP-3153: Sqoop export with --as-<spec_file_format> error message could be more verbose
- SQOOP-3266: Update 3rd party and manual test running related info in COMPILING.txt
- SQOOP-3233: SqoopHCatImportHelper.convertNumberTypes checks for Varchar instead of Char
- SQOOP-3257: Sqoop must not log database passwords
- SQOOP-3243: Importing BLOB data causes 'Stream closed' error on encrypted HDFS
- SQOOP-3229: Document how to run third party tests manually with databases running in docker
- SQOOP-3216: Expanded Metastore support for MySql, Oracle, Postgresql, MSSql, and DB2
- SQOOP-3014: Sqoop with HCatalog import loses precision for large numbers that do not fit into a double
- SQOOP-3232: Remove Sqoop dependency on deprecated HBase APIs
- SQOOP-3222: Test HBase kerberized connectivity
- SQOOP-3195: SQLServerDatatypeImportDelimitedFileTest can fail in some environments
- SQOOP-3196: Modify MySQLAuthTest to use configurable test database parameters
- SQOOP-3139: Sqoop tries to re-execute the select query during import in case of a connection reset error, causing many duplicate records from the source
- SQOOP-3218: Make sure the original ClassLoader is restored when running HCatalog tests
- SQOOP-3178: Incremental Merging for Parquet File Format
- SQOOP-3206: Make sqoop fail if user uses --direct connector and tries to encode a null value when using a MySQL database
Excluded JIRA
SQOOP-3149 is part of Sqoop 1.4.7, but because it contains a serious bug, Cloudera has excluded it from the CDH 6.0 release.
Apache ZooKeeper
There are no notable fixed issues in this release.