Fixed Issues in CDH 6.1.0

CDH 6.1.0 fixes the following issues:

Hive Jobs Are Submitted to a Single Queue When Sentry is Deployed
Hadoop LdapGroupsMapping does not support LDAPS for self-signed LDAP server
ZooKeeper JMX did not support TLS when managed by Cloudera Manager
Spark Streaming jobs loop if missing Kafka topic
Long-running Spark applications on a secure cluster might fail if driver is restarted
Kafka May Be Stuck with Under-replicated Partitions after ZooKeeper Session Expires
Upstream Issues Fixed

Hive Jobs Are Submitted to a Single Queue When Sentry is Deployed

Hive jobs are not submitted into the correct YARN queue when Hive is using Sentry because Hive does not use the YARN API to resolve the user or group of the job's original submitter. This causes the job to be placed in a queue using the placement rules based on the Hive user. The HiveServer2 fair scheduler queue mapping used for "non-impersonation" mode does not handle the primary-secondary queue mappings correctly.

Workaround: If you are a Hive and Sentry user, do not upgrade to CDH 6.0.0. This issue will be fixed as soon as possible. If you must use Hive and Sentry in CDH 6.0.0, see YARN Dynamic Resource Pools Do Not Work with Hive When Sentry Is Enabled for additional workarounds.

Affected Version: CDH 6.0.0

Fixed Versions: CDH 6.0.1, CDH 6.1.0 and later

Cloudera Issue: CDH-51596

Hadoop LdapGroupsMapping does not support LDAPS for self-signed LDAP server

Hadoop LdapGroupsMapping does not work with LDAP over SSL (LDAPS) if the LDAP server certificate is self-signed. This use case is currently not supported even if the Hadoop User Group Mapping LDAP TLS/SSL Enabled, Hadoop User Group Mapping LDAP TLS/SSL Truststore, and Hadoop User Group Mapping LDAP TLS/SSL Truststore Password properties are set correctly.

Affected Versions: CDH 5.x and 6.0.x versions

Fixed Versions: CDH 6.1.0

Apache Issue: HADOOP-12862

Cloudera Issue: CDH-37926

ZooKeeper JMX did not support TLS when managed by Cloudera Manager

Technical Service Bulletin 2019-310 (TSB)

The ZooKeeper service optionally exposes a JMX port used for reporting and metrics. By default, Cloudera Manager enables this port, but prior to Cloudera Manager 6.1.0, it did not support mutual TLS authentication on this connection. While JMX has a password-based authentication mechanism that Cloudera Manager enables by default, weaknesses have been found in the authentication mechanism, and Oracle now advises JMX connections to enable mutual TLS authentication in addition to password-based authentication. A successful attack may leak data, cause denial of service, or even allow arbitrary code execution on the Java process that exposes a JMX port. Beginning in Cloudera Manager 6.1.0, it is possible to configure mutual TLS authentication on ZooKeeper’s JMX port.

Products affected: ZooKeeper

Releases affected: Cloudera Manager 6.1.0 and lower, Cloudera Manager 5.16 and lower

Users affected: All

Date/time of detection: June 7, 2018

Severity (Low/Medium/High): 9.8 High (CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)

Impact: Remote code execution

CVE: CVE-2018-11744

Immediate action required: Upgrade to Cloudera Manager 6.1.0 and enable TLS for the ZooKeeper JMX port by turning on the configuration settings “Enable TLS/SSL for ZooKeeper JMX” and “Enable TLS client authentication for JMX port” on the ZooKeeper service and configuring the appropriate TLS settings. Alternatively, disable the ZooKeeper JMX port via the configuration setting “Enable JMX Agent” on the ZooKeeper service.

Addressed in release/refresh/patch: Cloudera Manager 6.1.0
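
To show how the 9.8 severity rating follows from the vector string in the Severity field, here is a minimal sketch of the CVSS v3.0 base score calculation. The function and variable names are illustrative and not part of any Cloudera tooling. The weights and formulas are taken from the public CVSS v3.0 specification, and the sketch covers base metrics only.

```python
import math

# CVSS v3.0 base-metric weights, per the public specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# The Privileges Required weight depends on whether the scope changes.
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place, as the v3.0 specification defines."""
    return math.ceil(value * 10) / 10


def base_score(vector):
    """Compute the base score for a vector such as 'CVSS:3.0/AV:N/...'."""
    # Skip the leading "CVSS:3.0" element and map metric name to value.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    scope = metrics["S"]

    # Impact sub-score.
    iss = 1 - (
        (1 - WEIGHTS["C"][metrics["C"]])
        * (1 - WEIGHTS["I"][metrics["I"]])
        * (1 - WEIGHTS["A"][metrics["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    # Exploitability sub-score.
    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * PR_WEIGHTS[scope][metrics["PR"]]
        * WEIGHTS["UI"][metrics["UI"]]
    )

    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


print(base_score("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score is about 3.89. Their sum, about 9.76, rounds up to 9.8, which falls in the High band quoted in the bulletin.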

Spark Streaming jobs loop if missing Kafka topic

Spark jobs can loop endlessly if the Kafka topic is deleted while a Kafka streaming job (which uses KafkaSource) is in progress.

Cloudera Issue: CDH-57903, CDH-64513

Long-running Spark applications on a secure cluster might fail if driver is restarted

If you submit a long-running app on a secure cluster using the --principal and --keytab options in cluster mode, and a failure causes the driver to restart after 7 days (the default maximum HDFS delegation token lifetime), the new driver fails with an error similar to the following:

Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token <token_info> can't be found in cache
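
The timing condition described above can be sketched as a small check. This is an illustration only: the helper name is hypothetical, and the constant assumes the stock HDFS default for dfs.namenode.delegation.token.max-lifetime (604800000 ms, that is, 7 days), a property name that does not appear in the original text. If your cluster overrides that property, pass the configured value instead.

```python
from datetime import datetime, timedelta

# Assumed default of dfs.namenode.delegation.token.max-lifetime: 7 days in ms.
DEFAULT_MAX_LIFETIME_MS = 7 * 24 * 60 * 60 * 1000


def restart_exceeds_token_lifetime(submitted_at, restarted_at,
                                   max_lifetime_ms=DEFAULT_MAX_LIFETIME_MS):
    """Return True if a driver restart happens after the original HDFS
    delegation tokens have passed their maximum lifetime (hypothetical helper)."""
    age = restarted_at - submitted_at
    return age > timedelta(milliseconds=max_lifetime_ms)


submitted = datetime(2018, 11, 1, 9, 0)
# A restart on day 3 is still within the token lifetime.
print(restart_exceeds_token_lifetime(submitted, submitted + timedelta(days=3)))
# A restart on day 8 is past it; on affected versions the new driver fails.
print(restart_exceeds_token_lifetime(submitted, submitted + timedelta(days=8)))
```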

Apache Issue: SPARK-23361

Cloudera Issue: CDH-64865

Kafka May Be Stuck with Under-replicated Partitions after ZooKeeper Session Expires

This problem can occur when your Kafka cluster includes a large number of under-replicated Kafka partitions. One or more broker logs include messages such as the following:

[2016-01-17 03:36:00,888] INFO Partition [__samza_checkpoint_event-creation_1,3] on broker 3: Shrinking ISR for partition [__samza_checkpoint_event-creation_1,3] from 6,5 to 5 (kafka.cluster.Partition)
[2016-01-17 03:36:00,891] INFO Partition [__samza_checkpoint_event-creation_1,3] on broker 3: Cached zkVersion [66] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)

There will also be an indication of the ZooKeeper session expiring in one or more Kafka broker logs around the same time as the previous errors:

INFO zookeeper state changed (Expired) (org.I0Itec.zkclient.ZkClient)

The log is typically in /var/log/kafka on each host where a Kafka broker is running. The location is set by the property kafka.log4j.dir in Cloudera Manager. The log name is kafka-broker-hostname.log. In diagnostic bundles, the log is under logs/hostname-ip-address/.
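
To check a broker log for both symptoms, a scan along the following lines may help. This is a minimal sketch, not a Cloudera tool: the function names are hypothetical, and the patterns are derived only from the sample messages shown above, so they may need adjusting if your Kafka version words the messages differently.

```python
import re

# Patterns taken from the sample log messages in this section.
ISR_UPDATE_SKIPPED = re.compile(
    r"Cached zkVersion \[\d+\] not equal to that in zookeeper, skip updating ISR"
)
ZK_SESSION_EXPIRED = re.compile(r"zookeeper state changed \(Expired\)")


def scan_broker_log(lines):
    """Count occurrences of the two symptoms in an iterable of log lines."""
    counts = {"isr_update_skipped": 0, "zk_session_expired": 0}
    for line in lines:
        if ISR_UPDATE_SKIPPED.search(line):
            counts["isr_update_skipped"] += 1
        elif ZK_SESSION_EXPIRED.search(line):
            counts["zk_session_expired"] += 1
    return counts


def looks_affected(counts):
    """Both symptoms present suggests the broker may have hit this issue."""
    return counts["isr_update_skipped"] > 0 and counts["zk_session_expired"] > 0


sample = [
    "[2016-01-17 03:36:00,891] INFO Partition [__samza_checkpoint_event-creation_1,3] "
    "on broker 3: Cached zkVersion [66] not equal to that in zookeeper, "
    "skip updating ISR (kafka.cluster.Partition)",
    "INFO zookeeper state changed (Expired) (org.I0Itec.zkclient.ZkClient)",
]
result = scan_broker_log(sample)
print(result, looks_affected(result))
```

To scan a real log, pass an open file object instead of the sample list, for example `scan_broker_log(open(path))`, where `path` points at the broker log on that host.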

Workaround: To move forward after seeing this problem, restart the affected Kafka brokers. You can restart individual brokers from the Instances tab in the Kafka service page in Cloudera Manager.

- Reduce the potential for long garbage collection pauses by brokers:
  - Use a better garbage collection mechanism in the JVM, such as G1GC. You can do this by adding -XX:+UseG1GC in the broker_java_opts.
  - Increase broker heap size if it is too small (broker_max_heap_size). Be careful that you don’t choose a heap size that can cause out-of-memory problems given all the services running on the node.
- Increase the ZooKeeper session timeout configuration on brokers (zookeeper.session.timeout.ms), to reduce the likelihood that sessions expire.
- Ensure ZooKeeper itself is well resourced and not overwhelmed so it can respond. For example, it is highly recommended to locate the ZooKeeper log directory on its own disk.

Affected Versions: CDK 1.4.x, 2.0.x, 2.1.x, 2.2.x

Fixed Versions:

Full Fix: CDH 6.1.0
Partial Fix: CDH 6.0.0. Kafka implementations with CDH 6.0.0 are less likely to encounter this issue.

Apache Issue: KAFKA-2729

Cloudera Issue: CDH-42514

Upstream Issues Fixed

The following upstream issues are fixed in CDH 6.1.0:

Apache Accumulo
Apache Avro
Apache Crunch
Flume
Hadoop
HBase
Hive
Hue
Impala
Kafka
Kudu
Oozie
Parquet
Apache Pig
Cloudera Search
Sentry
Spark
Sqoop
ZooKeeper

Apache Accumulo

There are no notable fixed issues in this release.

Apache Avro

There are no notable fixed issues in this release.

Apache Crunch

There are no notable fixed issues in this release.

Apache Flume

The following issues are fixed in CDH 6.1.0:

FLUME-2442 - Need an alternative to providing clear text passwords in flume config
FLUME-2973 - Deadlock in hdfs sink
FLUME-2977 - Upgrade RAT to 0.12
FLUME-3050 - add counters for error conditions and expose to monitor URL
FLUME-3182 - add support for SSL/TLS for syslog (tcp) sources
FLUME-3222 - Fix for NoSuchFileException thrown when files are being deleted
FLUME-3223 - Flume HDFS Sink should retry close prior recover lease
FLUME-3227 - Add Rate Limiter to stresssource
FLUME-3239 - Do not rename files in SpoolDirectorySource
FLUME-3246 - Validate flume configuration to prevent larger source batchsize than
FLUME-3269 - Support JSSE keystore/truststore -D system properties
FLUME-3278 - Handling -D keystore parameters in Kafka components

Apache Hadoop

HDFS

The following issues are fixed in CDH 6.1.0:

HADOOP-9214 - Enhance the hadoop fs touchz command so that it can now modify atime and mtime.
HADOOP-12502 - Fixed an issue where setting the replication of a HDFS folder recursively can run out of memory.
HADOOP-13649 - s3guard: Implement time-based (TTL) expiry for LocalMetadataStore.
HADOOP-13761 - S3Guard: Implement retries for DDB failures and throttling; translate exceptions.
HADOOP-14212 - Expose SecurityEnabled boolean field in JMX for other services besides NameNode.
HADOOP-14507 - Extend per-bucket secret key config with explicit getPassword() on fs.s3a.$bucket.secret.key.
HADOOP-14758 - Improve S3GuardTool.prune to handle UnsupportedOperationException.
HADOOP-14759 - Improve S3GuardTool.prune to prune specific bucket entries.
HADOOP-14913 - Implement sticky bit for rename() operation in Azure WASB.
HADOOP-14935 - Fix an issue where Azure POSIX permissions are taking effect in access() method even when authorization is enabled.
HADOOP-14965 - Change the S3a input stream "normal" fadvise mode to be adaptive.
HADOOP-15054 - Upgrade hadoop dependency on commons-codec to 1.11.
HADOOP-15086 - Fix an issue where the NativeAzureFileSystem file rename is not atomic.
HADOOP-15121 - Fix a NullPointerException when using DecayRpcScheduler.
HADOOP-15141 - Support IAM Assumed roles in S3A.
HADOOP-15143 - Fix an NPE due to Invalid KerberosTicket in UGI.
HADOOP-15151 - Fix an issue where the MapFile.fix creates a wrong index file in case of block-compressed data file.
HADOOP-15176 - Enhance IAM Assumed Role support in S3A client.
HADOOP-15206 - Fix an issue where BZip2 drops and duplicates records when input split size is small.
HADOOP-15209 - Enhance DistCp to eliminate needless deletion of files under already deleted directories.
HADOOP-15212 - Add independent secret manager method for logging expired tokens.
HADOOP-15215 - Enhance s3guard set-capacity command to fail on read/write of 0.
HADOOP-15217 - Enhance FsUrlConnection to handle paths with spaces.
HADOOP-15250 - Fix an issue where a multiHomed server network cluster Network IPC Client binds the wrong address.
HADOOP-15267 - S3A multipart upload fails when SSE-C encryption is enabled.
HADOOP-15391 - Add missing CSS file in hadoop-aws, hadoop-aliyun, hadoop-azure and hadoop-azure-datalake modules.
HADOOP-15423 - Merge fileCache and dirCache into one single cache in LocalMetadataStore.
HADOOP-15441 - Log kms url and token service at debug level.
HADOOP-15446 - WASB: PageBlobInputStream.skip breaks HBASE replication.
HADOOP-15449 - Increase default timeout of ZK session to avoid frequent NameNode failover.
HADOOP-15469 - Fix an issue where the S3A directory committer commit job fails if _temporary directory created under destination.
HADOOP-15478 - Fix an issue with WASB that caused an hflush() and hsync() regression.
HADOOP-15541 - Fix an issue where the AWS SDK can mistake stream timeouts for EOF and throw SdkClientExceptions.
HADOOP-15598 - Fix an issue where the DataChecksum calculate checksum experiences contention on hashtable synchronization.
HADOOP-15612 - Improve exception when tfile fails to load LzoCodec.
HADOOP-15633 - Fix an issue where fs.TrashPolicyDefault cannot create trash directory.
HADOOP-15679 - Enhance ShutdownHookManager shutdown time to be configurable & extended.
HADOOP-15684 - Fix an issue where triggerActiveLogRoll stuck on dead NameNode when ConnectTimeoutException happens.
HADOOP-15719 - Fail-fast when using OAuth over http.
HADOOP-15850 - Enhance CopyCommitter#concatFileChunks to check that the blocks per chunk is not 0.
HADOOP-15861 - Move DelegationTokenIssuer to the correct path.
HDFS-9049 - Make Datanode Netty reverse proxy port configurable.
HDFS-10183 - Prevent race condition during class initialization.
HDFS-11701 - Fix an issue where NPE from Unresolved Host causes permanent DFSInputStream failures.
HDFS-11719 - Fix the wrong Arrays.fill() index in BlockSender.readChecksum() exception handling.
HDFS-11900 - Fix an issue where hedged reads thread pool creation is not synchronized.
HDFS-12070 - Fix an issue where failed block recovery leaves files open indefinitely and at risk for data loss.
HDFS-12574 - Add CryptoInputStream to WebHdfsFileSystem read call.
HDFS-12907 - Allow read-only access to reserved raw for non-superusers.
HDFS-12978 - Add fine-grained locking while consuming journal stream.
HDFS-13027 - Handle possible NPEs due to deleted blocks in race condition.
HDFS-13048 - Fix an issue where the LowRedundancyReplicatedBlocks metric can be negative.
HDFS-13052 - Add support for snapshot diff with WebHDFS.
HDFS-13060 - Add a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver.
HDFS-13081 - Allow SASL and privileged HTTP with Datanode#checkSecureConfig.
HDFS-13087 - Make snapshotted encryption zone information immutable.
HDFS-13145 - Fix an issue where an SBN crash occurs when transitioning to ANN with in-progress edit tailing enabled.
HDFS-13225 - Fix an issue where StripeReader#checkMissingBlocks()'s IOException info is incomplete.
HDFS-13280 - Fix NPE in get snapshottable directory list call.
HDFS-13330 - Fix an issue where ShortCircuitCache#fetchOrCreate never retries.
HDFS-13448 - Ignore locality for First Block Replica.
HDFS-13493 - Reduce the HttpServer2 thread count on DataNodes.
HDFS-13641 - Add metrics for edit log tailing.
HDFS-13658 - Expose HighestPriorityLowRedundancy blocks statistics.
HDFS-13668 - FSPermissionChecker may throw ArrayIndexOutOfBoundsException when checking inode permission.
HDFS-13686 - Add overall metrics for FSNamesystemLock.
HDFS-13728 - Fix an issue where the Disk Balancer fails if volume usage is greater than capacity.
HDFS-13731 - Fix an issue where ReencryptionUpdater fails with ConcurrentModificationException during processCheckpoints.
HDFS-13738 - Fix an issue where fsck -list-corruptfileblocks encounters an infinite loop if the user is not privileged.
HDFS-13758 - Enhance DatanodeManager to throw exception if it has BlockRecoveryCommand but the block is not under construction.
HDFS-13820 - Add an ability to disable CacheReplicationMonitor.
HDFS-13830 - Add support for getting snapshottable directory list.
HDFS-13831 - Make block increment deletion number configurable.
HDFS-13833 - Improve BlockPlacementPolicyDefault's consider load logic.
HDFS-13838 - Fix an issue where WebHdfsFileSystem.getFileStatus() does not return correct "snapshot enabled" status.
HDFS-13846 - Fix an issue where safe blocks counter is not decremented correctly if the block is striped.
HDFS-13868 - Fix an NPE with the GETSNAPSHOTDIFF API when the parameter "snapshotname" is given but "oldsnapshotname" is not.
HDFS-13876 - Implement ALLOWSNAPSHOT/DISALLOWSNAPSHOT for HttpFS.
HDFS-13877 - Implement GETSNAPSHOTDIFF for HttpFS.
HDFS-13878 - Implement GETSNAPSHOTTABLEDIRECTORYLIST for HttpFS.
HDFS-13882 - Set a maximum delay for retrying locateFollowingBlock.
HDFS-13885 - Add debug logs in dfsclient around decrypting EDEK.
HDFS-13886 - Fix an issue where HttpFSFileSystem.getFileStatus() doesn't return "snapshot enabled" bit.
HDFS-14009 - Fix an issue where FileStatus#setSnapShotEnabledFlag throws InvocationTargetException when attribute set is emptySet.

MapReduce 2

The following issues are fixed in CDH 6.1.0:

MAPREDUCE-6861 - Add metrics tags for ShuffleClientMetrics.
MAPREDUCE-7150 - Optimize collections used by MR JHS to reduce its memory.

YARN

The following issues are fixed in CDH 6.1.0:

YARN-7159 - Normalize unit of resource objects in ResourceManager to avoid unit conversion in critical path.
YARN-7237 - Cleanup usages of ResourceProfiles.
YARN-7728 - Expose container preemptions related information in Capacity Scheduler queue metrics.
YARN-7738 - CapacityScheduler: Support refresh maximum allocation for multiple resource types.
YARN-7948 - Enable fair scheduler to refresh maximum allocation for multiple resource types.
YARN-8338 - Fixed an issue where TimelineService V1.5 does not come up after HADOOP-15406.
YARN-8566 - Add diagnostic message for unschedulable containers.
YARN-8842 - Expose metrics for custom resource types in QueueMetrics.
YARN-8990 - Fix fair scheduler race condition in app submit and queue cleanup.

Apache HBase

The following issues are fixed in CDH 6.1.0:

HBASE-18451 - PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request, fix logging
HBASE-18549 - Add metrics for failed replication queue recovery
HBASE-19418 - configurable range of delay in PeriodicMemstoreFlusher
HBASE-20193 - Basic Replication Web UI - Regionserver
HBASE-20375 - Remove use of getCurrentUserCredentials in hbase-spark module
HBASE-20469 - Directory used for sidelining old recovered edits files should be made configurable
HBASE-20732 - Shutdown scan pool when master is stopped
HBASE-20734 - Colocate recovered edits directory with hbase.wal.dir
HBASE-20741 - Split of a region with replicas creates all daughter regions
HBASE-20792 - info:servername and info:sn inconsistent for OPEN region
HBASE-20808 - (Addendum) Remove duplicate calls for cancelling of chores
HBASE-20846 - Restore procedure locks when master restarts
HBASE-20857 - balancer status tag in jmx metrics
HBASE-20865 - CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state
HBASE-20892 - [UI] Start / End keys are empty on table.jsp
HBASE-20942 - Revert "Fix Array Index Out Of Bounds Exception for RpcServer TRACE logging"
HBASE-20965 - Separate region server report requests to new handlers
HBASE-20985 - add two attributes when we do normalization
HBASE-20986 - Separate the config of block size when we do log splitting and write Hlog
HBASE-21001 - ReplicationObserver fails to load in HBase 2.0.0
HBASE-21023 - Added bypassProcedure() API to HbckService
HBASE-21032 - ScanResponses contain only one cell each
HBASE-21055 - NullPointerException when balanceOverall() but server balance info is null
HBASE-21072 - Addendum do not write lock file when running TestHBaseFsckReplication
HBASE-21073 - Redo concept of maintenance mode
HBASE-21095 - The timeout retry logic for several procedures are broken after master restarts
HBASE-21125 - 'HBASE-20942 Improve RpcServer TRACE logging' to branch-2.1
HBASE-21126 - "Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes" to branch-2.1
HBASE-21127 - TableRecordReader need to handle cursor result too
HBASE-21132 - return wrong result in rest multiget
HBASE-21144 - AssignmentManager.waitForAssignment is not stable
HBASE-21155 - Save on a few log strings and some churn in wal splitter by skipping out early if no logs in dir
HBASE-21156 - [hbck2] Queue an assign of hbase:meta and bulk assign/unassign
HBASE-21158 - Empty qualifier cell is always returned when using QualifierFilter
HBASE-21164 - reportForDuty should do backoff rather than retry
HBASE-21171 - [amv2] Tool to parse a directory of MasterProcWALs standalone
HBASE-21172 - Reimplement the retry backoff logic for ReopenTableRegionsProcedure
HBASE-21174 - [REST] Failed to parse empty qualifier in TableResource#getScanResource
HBASE-21179 - Fix the number of actions in responseTooSlow log
HBASE-21181 - Use the same filesystem for wal archive directory and wal directory
HBASE-21182 - Failed to execute start-hbase.sh
HBASE-21185 - WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes
HBASE-21190 - Log files and count of entries in each as we load from the MasterProcWAL store
HBASE-21191 - Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).
HBASE-21196 - HTableMultiplexer clears the meta cache after every put operation
HBASE-21200 - Memstore flush doesn't finish because of seekToPreviousRow() in memstore scanner.
HBASE-21204 - NPE when scan raw DELETE_FAMILY_VERSION and codec is not set
HBASE-21206 - Scan with batch size may return incomplete cells
HBASE-21207 - Add client side sorting functionality in master web UI for table and region server details
HBASE-21208 - Bytes#toShort doesn't work without unsafe
HBASE-21212 - Wrong flush time when update flush metric
HBASE-21214 - [hbck2] setTableState just sets hbase:meta state, not in-memory state
HBASE-21223 - [amv2] Remove abort_procedure from shell
HBASE-21228 - Memory leak since AbstractFSWAL caches Thread object and never clean later
HBASE-21232 - Show table state in Tables view on Master home page
HBASE-21233 - Allow the procedure implementation to skip persistence of the state after an execution
HBASE-21242 - Revert "[amv2] Miscellaneous minor log and assign procedure create improvements; ADDENDUM Fix TestHRegionInfo"
HBASE-21248 - Implement exponential backoff when retrying for ModifyPeerProcedure
HBASE-21249 - Add jitter for ProcedureUtil.getBackoffTimeMs
HBASE-21250 - Addendum remove unused modification in hbase-server module
HBASE-21250 - Refactor WALProcedureStore and add more comments for better understanding the implementation
HBASE-21254 - Need to find a way to limit the number of proc wal files
HBASE-21259 - [amv2] Revived deadservers; recreated serverstatenode
HBASE-21260 - The whole balancer plans might be aborted if there are more than one plans to move a same region
HBASE-21263 - Mention compression algorithm along with other storefile details
HBASE-21266 - Not running balancer because processing dead regionservers, but empty dead rs list
HBASE-21280 - Add anchors for each heading in UI
HBASE-21287 - Allow configuring test master initialization wait time.
HBASE-21288 - HostingServer in UnassignProcedure is not accurate
HBASE-21292 - IdLock.getLockEntry() may hang if interrupted
HBASE-21299 - List counts of actual region states in master UI tables section
HBASE-21303 - [shell] clear_deadservers with no args fails
HBASE-21323 - Revert "Should not skip force updating for a sub procedure even if"
HBASE-21425 - 2.1.1 fails to start over 1.x data; namespace not assigned

Apache Hive

The following issues are fixed in CDH 6.1.0:

Code Changes Might Be Required

The following fixes might require code changes for the CDH 6.1.0 release of Apache Hive:

HIVE-14388 - Add number of rows inserted message after insert command in Beeline
HIVE-17799 - Add Ellipsis For Truncated Query In Hive Lock
HIVE-19344 - Change default value of msck.repair.batch.size

Code Changes Should Not Be Required

The following fixes should not require code changes, but they contain improvements that might enhance your deployment:

HIVE-6980 - Drop table by using direct SQL
HIVE-10296 - Cast exception observed when hive runs a multi-join query on metastore (postgres), since postgres pushes the filter into the join, and ignores the condition before applying cast
HIVE-13900 - HiveStatement.executeAsync() may not work properly when hive.server2.async.exec.async.compile is turned on
HIVE-14162 - Allow disabling of a long-running job on Hive On Spark On YARN
HIVE-14560 - Support exchange partition between s3 and HDFS tables
HIVE-14690 - Query fail when hive.exec.parallel=true, with conflicting session dir
HIVE-14984 - Hive-WebUI access results in Request is a replay (34) attack
HIVE-15104 - Hive on Spark generate more shuffle data than hive on mr
HIVE-15180 - Extend JSONMessageFactory to store additional information about metadata objects on different table events
HIVE-15250 - Reuse partitions info generated in MoveTask to its subscribers (StatsTask)
HIVE-15712 - New HiveConf in SQLOperation.getSerDe() impacts CPU on Hiveserver2
HIVE-15995 - Syncing metastore table with serde schema
HIVE-16071 - HoS RPCServer misuses the timeout in its RPC handshake
HIVE-16143 - Improve msck repair batching
HIVE-16172 - Switch to a fairness lock to synchronize HS2 thrift client
HIVE-16219 - Metastore notification_log contains serialized message with non-functional fields
HIVE-16285 - Servlet for dynamically configuring log levels
HIVE-16346 - inheritPerms should be conditional based on the target filesystem
HIVE-16348 - HoS query is canceled but error message shows RPC is closed
HIVE-16431 - Support Parquet StatsNoJobTask for Spark & Tez engine
HIVE-16607 - ColumnStatsAutoGatherContext regenerates HiveConf.HIVEQUERYID
HIVE-16664 - Add join related Hive blobstore tests
HIVE-16736 - General Improvements to BufferedRows
HIVE-17300 - WebUI query plan graphs
HIVE-17401 - Hive session idle timeout doesn't function properly
HIVE-17747 - HMS DropTableMessage should include the full table object
HIVE-18031 - Support replication for Alter Database operation
HIVE-18118 - Explain Extended should indicate if a file being read is an EC file
HIVE-18652 - Print Spark metrics on console
HIVE-18690 - Integrate with Spark OutputMetrics
HIVE-18696 - The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs
HIVE-18705 - Improve HiveMetaStoreClient.dropDatabase
HIVE-18743 - CREATE TABLE on S3 data can be extremely slow. DO_NOT_UPDATE_STATS workaround is buggy
HIVE-18766 - Race condition during shutdown of RemoteDriver, error messages aren't always sent
HIVE-18778 - Needs to capture input/output entities in explain
HIVE-18906 - Lower Logging for "Using direct SQL".
HIVE-18916 - SparkClientImpl doesn't error out if spark-submit fails.
HIVE-19008 - Improve Spark session id logging
HIVE-19053 - RemoteSparkJobStatus#getSparkJobInfo treats all exceptions as timeout errors
HIVE-19079 - Add extended query string to Spark job description
HIVE-19370 - Issue: ADD Months function on timestamp datatype fields in Hive
HIVE-19371 - Add table ownerType to HMS thrift API
HIVE-19372 - Add table ownerType to JDO/SQL and ObjectStore
HIVE-19374 - Parse and process ALTER TABLE SET OWNER command syntax
HIVE-19477 - Hiveserver2 in HTTP mode not emitting metric default.General.open_connections
HIVE-19486 - Discrepancy in HikariCP config naming
HIVE-19508 - SparkJobMonitor getReport doesn't print stage progress in order
HIVE-19525 - Spark task logs print PLAN PATH excessive number of times
HIVE-19559 - SparkClientImpl shouldn't name redirector thread RemoteDriver
HIVE-19718 - Adding partitions in bulk also fetches table for each partition
HIVE-19733 - RemoteSparkJobStatus#getSparkStageProgress inefficient implementation
HIVE-19766 - Show the number of rows inserted when execution engine is Spark
HIVE-19783 - Retrieve only locations in HiveMetaStore.dropPartitionsAndGetLocations
HIVE-19786 - RpcServer cancelTask log message is incorrect
HIVE-19787 - Log message when spark-submit has completed
HIVE-19814 - RPC Server port is always random for spark
HIVE-19899 - Support stored as JsonFile
HIVE-19937 - Intern fields in MapWork on deserialization
HIVE-19942 - Hive Notification: All events for indexes should have table name
HIVE-19986 - Add logging of runtime statistics indicating when Hdfs Erasure Coding is used by MR
HIVE-20032 - Don't serialize hashCode for repartitionAndSortWithinPartitions
HIVE-20056 - SparkPartitionPruner shouldn't be triggered by Spark tasks
HIVE-20098 - Statistics: NPE when getting Date column partition statistics
HIVE-20212 - Hiveserver2 in http mode emitting metric default.General.open_connections incorrectly
HIVE-20374 - Write Hive version information to Parquet footer
HIVE-20466 - Improve org.apache.hadoop.hive.ql.exec.FunctionTask Experience
HIVE-20505 - upgrade org.openjdk.jmh:jmh-core to 1.21
HIVE-20544 - TOpenSessionReq logs password and username
HIVE-20545 - Exclude parameters that can have potentially large size from HMS notification message JSON
HIVE-20601 - EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener
HIVE-20603 - "Wrong FS" error when inserting to partition after changing table location filesystem
HIVE-20678 - HiveHBaseTableOutputFormat should implement HiveOutputFormat to ensure compatibility
HIVE-20695 - HoS Query fails with hive.exec.parallel=true.
HIVE-20711 - Race Condition when Multi-Threading in SessionState.createRootHDFSDir
HIVE-20742 - SparkSessionManagerImpl maintenance thread only cleans up session once

Hue

The following issues are fixed in CDH 6.1.0:

HUE-7407 - [useradmin] Added superuser group priv to useradmin
HUE-7698 - [oozie] Added warning when there is a space in the shell action
HUE-7698 - [oozie] Files of a Shell document action in a workflow are not being generated in the XML
HUE-7860 - [core] Update greenlet from 0.4.12 to 0.4.15
HUE-7860 - [core] Add monotonic 1.5
HUE-7860 - [core] Update Gunicorn from 19.7.1 to 19.9.0
HUE-7860 - [core] Update eventlet from 0.21.0 to 0.24.1
HUE-7860 - [core] Add dnspython 1.15.0
HUE-8139 - [core] Fix django-debug-toolbar 1.9.1 to work with django_debug_panel
HUE-8140 - [editor] Automatically continue execution after DDL statements in batch mode
HUE-8330 - [cluster] Keep only external cluster configs in [[clusters]]
HUE-8330 - [core] API should not check for remote cloud clusters if they are not configured
HUE-8339 - [impala] Fix typo in smart pooling ini configuration
HUE-8391 - [importer] Improve Create table from File UX when loading data from parent directory not readable by hive/impala
HUE-8488 - [fb] Disable drag&drop when show_upload_button=false
HUE-8507 - [editor] Add types to sqlalchemy results.
HUE-8507 - [editor] SQL alchemy result set column headers are missing.
HUE-8509 - [oozie] Schedule repetitive remote jobs
HUE-8509 - [oozie] Support sending a SQL query to a remote cluster
HUE-8509 - [jb] Clean-up of the listing of remote jobs
HUE-8509 - [oozie] Properly set the capture output flag of shell document action
HUE-8509 - [oozie] Remote job action
HUE-8509 - [kafka] Do not break left panel
HUE-8514 - [core] Log metrics when calling is_alive
HUE-8516 - [cluster] List more namespaces and filter out invalid ones
HUE-8518 - [editor] Fix sample Kudu
HUE-8519 - [jb] Impala API can now directly return json
HUE-8521 - [auth] Protect against empty LDAP login username
HUE-8522 - [jb] Make paused tasks more obvious. Add queued state to Impala
HUE-8523 - [jb] Display Impala backends & instances
HUE-8524 - [impala] Provide the root cause of INVALIDATE METADATA failures
HUE-8527 - [editor] Fix concatenation type exception in namespace call
HUE-8528 - [frontend] Temporarily disable namespace caching
HUE-8529 - [frontend] Create a context selector component
HUE-8531 - [sqoop] Properly name the table import job
HUE-8532 - [core] Fix database migration test.
HUE-8533 - [importer] Properly display failed import progress bar as red and not orange
HUE-8534 - [jb] Django url name does not exist and breaks page
HUE-8535 - [sqoop] Use the proper engine name and not the connection nice name as jdbc prefix
HUE-8536 - [sqoop] Include hive-site.xml automatically when importing data to hive
HUE-8537 - [sqoop] List the proper column type when importing to a hive table
HUE-8538 - [sqoop] Allow table preview from manual input not JDBC
HUE-8538 - [importer] Automatically fill-up the db driver list when selecting sqoop
HUE-8539 - [importer] Clean-up configuration and turn sqoop and solr imports to on by default
HUE-8540 - [sqoop] Add ability to set default jdbc driver path for any sqoop job
HUE-8541 - [oozie] Workflow rerun does not restart polling for job status
HUE-8542 - [frontend] Add a custom left nav for multi cluster mode
HUE-8542 - [frontend] Polish cloud cluster and require multi cluster mode to be on
HUE-8544 - [importer] Support sending file data into a kafka topic
HUE-8545 - [search] Fix filtering in the index selection dropdown
HUE-8546 - [assist] Limit assist refresh to the active namespace for DDL statement executions
HUE-8546 - [assist] Make sure the assist gets refreshed after multiple DDL statement executions
HUE-8547 - [jb] Fix navigation from create schedule to view schedule.
HUE-8547 - [jb] Fix refresh on coordinator page.
HUE-8548 - [jb] Fix invalid date in workflow task
HUE-8549 - [autocomplete] Improve CTE alias suggestions when there's a trailing ";"
HUE-8550 - [jb] Use the context selector component in the job browser
HUE-8550 - [frontend] Make last selected compute and namespace sticky
HUE-8550 - [jb] Default to the last selected type of compute in the job browser
HUE-8550 - [jb] Refresh job browser tabs on compute selection
HUE-8551 - [importer] Support setting basic Flume configs
HUE-8553 - [kafka] Link create topic API to the UI
HUE-8553 - [kafka] Add a workaround API for creating a topic
HUE-8554 - [indexer] Protect against empty sample data that can be null
HUE-8554 - [importer] Support latest Spark version 2 natively
HUE-8554 - [manager] Adding a check if service is installed API
HUE-8554 - [cluster] Create data warehouse cluster skeleton
HUE-8554 - [core] Support dist Spark installed when running envelope via shell
HUE-8554 - [cluster] Avoid double escaping of data warehouse results
HUE-8554 - [cluster] Rename analytic cluster API command to dataware
HUE-8555 - [cluster] Do not submit remote coordinator jobs by default
HUE-8555 - [jb] Refactor job browser preview to support multi cluster
HUE-8555 - [jb] Support killing data warehouse cluster
HUE-8555 - [jb] Sort clusters with the most recent first
HUE-8555 - [jb] List data warehouse clusters
HUE-8555 - [jb] Auto select the first cluster if possible at init
HUE-8556 - [fb] Overuse of trash folder checking
HUE-8557 - [sqoop] DB name and table names variables were already present
HUE-8557 - [sqoop] Offer to rename the table or select a different existing Hive database
HUE-8558 - [jb] Add tracking URL to Spark Jobs and remove url and killUrl
HUE-8559 - [jb] Hue shows incorrect color for failed oozie jobs
HUE-8560 - [tb] Make sure the default DB is opened by default in the Table Browser
HUE-8560 - [tb] Stick to the same view when switching namespaces in the Table Browser
HUE-8561 - [editor] Don't show databases for spark editor
HUE-8562 - [frontend] Make sure the context popover is shown above the jobs panel
HUE-8564 - [useradmin] Fix last activity update for notebook/api/check_status
HUE-8564 - [useradmin] Fix last activity update for jobbrowser/api/jobs requests
HUE-8565 - [fb] Parent directory should not be selectable
HUE-8565 - [fb] Current directory should not be deletable.
HUE-8566 - [useradmin] Update message for duplicate user creation.
HUE-8567 - [jb] Fix id max length in mini jb
HUE-8568 - [jb] Prevent mini jb actions from taking content width
HUE-8568 - [jb] Activate smart file links from the logs by also checking for prefixes
HUE-8570 - [assist] Extract a separate column sample component
HUE-8570 - [frontend] Right align the Hue dropdown when rendered outside the window
HUE-8570 - [assist] Add distinct as an option for column samples in the context popover
HUE-8570 - [editor] Enable click to insert from sample popover to SQL variables
HUE-8570 - [assist] Add inline autocomplete for column samples
HUE-8570 - [editor] Enable optional operation on the sample API endpoints
HUE-8570 - [assist] Limit context popover sample operations to Impala and Hive
HUE-8570 - [assist] Add min and max to column sample popover
HUE-8571 - [sentry] navigator_api ERROR for PRIVILEGE_HIERARCHY[hierarchy[server][SENTRY_PRIVILEGE_KEY]['action']]
HUE-8572 - [cluster] Bubble up authentication errors on remote clusters
HUE-8572 - [tb] Fix JS exception when clearing table browser selection via pubsub
HUE-8572 - [tb] Add compute and namespace to DROP table endpoint
HUE-8572 - [tb] Fix log overflow in history panel
HUE-8573 - [sqoop] Out of the box import of a MySQL table
HUE-8573 - [sqoop] Avoid unrelated casting error when testing the connection
HUE-8574 - [importer] Adding Flume flows
HUE-8574 - [flume] Support updating Flume agent config
HUE-8574 - [importer] Nav Kafka stream import to Solr and Kudu part 1
HUE-8574 - [importer] Allow audit logs to be sent to Solr
HUE-8574 - [importer] Automatically set up Flume to grab Hue HTTPD logs and put them into the sample collection
HUE-8574 - [importer] Feature flag for showing the Field Editor
HUE-8574 - [cluster] Auto scaling data warehouse cluster API skeleton
HUE-8574 - [importer] Button caret to call for getting the job config
HUE-8575 - [importer] Add external multi table support
HUE-8575 - [importer] Fix file to table import.
HUE-8576 - [editor] Add backticked suggestion to the syntax checker for reserved keywords
HUE-8577 - [autocomplete] Add all currently reserved keywords for Impala
HUE-8577 - [editor] Rebuild Ace with updated dependencies
HUE-8577 - [autocomplete] Add support for Impala SHOW GRANT ROLE/USER statements
HUE-8577 - [autocomplete] Add support for Impala ALTER TABLE/VIEW SET OWNER
HUE-8577 - [autocomplete] Add Impala METHOD to reserved keywords
HUE-8577 - [autocomplete] Make previously non-reserved keywords reserved for Impala
HUE-8577 - [autocomplete] Fix issue where the statement type location is added twice
HUE-8578 - [importer] Auto select id column if present in Kudu tables
HUE-8578 - [importer] Implement Flume output
HUE-8578 - [importer] Get basic Flume ingest step integrated
HUE-8578 - [manager] Restrict API calls to admin
HUE-8579 - [core] Blacklisting certain apps like filebrowser and oozie can fail
HUE-8580 - [editor] Fix jdbc assist.
HUE-8580 - [importer] Improve usability of table import
HUE-8580 - [importer] Fix RDBMS support for sqoop configured import.
HUE-8581 - [importer] Fix timing related JS exceptions
HUE-8581 - [importer] Improve query type selection layout for the field editor
HUE-8581 - [importer] Fix JS error on target namespace selection and improve layout for table import
HUE-8581 - [importer] Improve the stream import form layout
HUE-8581 - [importer] Allow typed paths in the hivechooser binding
HUE-8581 - [importer] Fix JS error for field query editor in importer
HUE-8582 - [jb] Make back button from editing a file more obvious
HUE-8583 - [fb] Surface too many buckets error
HUE-8588 - [core] Fix PAM backend has conflict with timer metrics
HUE-8589 - [core] Split cluster listing to its own API
HUE-8589 - [jb] Switch from compute to the cluster API endpoint in the job browser
HUE-8591 - [cluster] Integration skeleton for Data Warehouse v2 API
HUE-8591 - [impala] Properly pickup the selected compute cluster
HUE-8591 - [cluster] Remove extra debug info
HUE-8591 - [cluster] Step of logic simplification of the multi cluster configuration
HUE-8591 - [cluster] Display impalad hostname
HUE-8591 - [impala] Properly point to the selected cluster hostname
HUE-8591 - [cluster] Protect against override of cluster name
HUE-8591 - [core] Showing up S3 browser by default in cloud mode
HUE-8591 - [cluster] Move port to 21050
HUE-8591 - [cluster] Safeguard against localhost
HUE-8591 - [cluster] Properly use the correct cluster hostname in the editor
HUE-8591 - [cluster] Add hostname check in the cluster hostname log trace
HUE-8591 - [cluster] Split cluster template between static and dynamic clusters
HUE-8591 - [cluster] Clear the compute cache on namespace refresh from left assist
HUE-8591 - [cluster] Avoid failing when cluster is None
HUE-8591 - [cluster] Wire in API for listing and creating k8 clusters
HUE-8591 - [cluster] Plug in the list of clusters
HUE-8591 - [cluster] Adding cluster resize capabilities on the cluster page
HUE-8591 - [cluster] Add Thrift client used for the specific query server
HUE-8591 - [cluster] Refresh the context selector when namespaces are refreshed
HUE-8591 - [cluster] Hook in remote Impala coordinator URL of selected cluster
HUE-8591 - [cluster] Use default port if not in a selected remote cluster
HUE-8591 - [cluster] Add impalad link to cluster page
HUE-8591 - [cluster] Add logic to get the corresponding Impalad name
HUE-8591 - [cluster] Add some progress bar color and effect on cluster resize
HUE-8591 - [cluster] Add proper cluster page
HUE-8591 - [cluster] Use name as clusterName throughout the calls
HUE-8591 - [cluster] Move API url to a config property
HUE-8591 - [cluster] Prevent red error popups
HUE-8591 - [cluster] Fix name of default cluster
HUE-8591 - [cluster] Use properly Impala Thrift Client on remote Impala cluster direct connection
HUE-8592 - [frontend] Enable default click to navigate for catalog entries table
HUE-8592 - [frontend] Add option to automatically refresh samples in the catalog entries table
HUE-8592 - [frontend] Create a polling catalog entries list component that waits until an entity exists
HUE-8594 - [editor] Avoid js error when lastSelectedCompute does not exist
HUE-8595 - [flume] Collect and ingest Hue balancer logs out of the box
HUE-8597 - [frontend] Use the default SQL interpreter as source type in the global search results
HUE-8599 - [frontend] Add pubSub to force clear the context catalog from the job browser
HUE-8599 - [frontend] Improve stability of the context selector
HUE-8600 - [tb] Limit Table Browser namespace selection to namespaces with active computes
HUE-8601 - [jb] Fix issue where context selector in mini jb is hidden behind expand text
HUE-8602 - [sentry] Remove ALTER and DROP table privileges for now
HUE-8602 - [sentry] Remove ALTER and DROP in the Hive section
HUE-8603 - [editor] Always show the query compatibility check results
HUE-8604 - [frontend] Use the latest opened database by default throughout
HUE-8606 - [s3] Opening S3 browser makes a call to HDFS
HUE-8607 - [tb] Include namespace when querying a table from the table browser
HUE-8607 - [tb] Fix broken drop table action in the table browser
HUE-8607 - [tb] Fix query and view table actions in the table browser
HUE-8609 - [tb] Fix exception in describe table call from the Table Browser
HUE-8610 - [tb] Make sure the created notebook for samples requests has the provided compute
HUE-8610 - [tb] Include compute in stats and describe table calls from the table browser
HUE-8610 - [core] Always send the full cluster instead of id to the APIs
HUE-8610 - [tb] Include compute when fetching samples from the table browser
HUE-8611 - [assist] Send cluster parameter with the invalidate calls
HUE-8612 - [editor] Improve the editor shortcut search to show results from all categories
HUE-8612 - [editor] Add missing keyboard shortcuts to the editor help
HUE-8613 - [tb] Send cluster when dropping databases from the table browser
HUE-8614 - [tb] Fix the create new database action in the Table Browser
HUE-8615 - [frontend] Make sure namespaces and computes always have a name in the context selector
HUE-8617 - [frontend] Add pubSub to the context selector for setting cluster/compute/namespace
HUE-8618 - [editor] Prevent js exception when typing while the context is loading
HUE-8619 - [tb] Include cluster in the partitions API call
HUE-8619 - [tb] Switch to POST for partitions API call
HUE-8621 - [editor] Add a custom Ace mode for the dark theme
HUE-8621 - [editor] Add keyboard shortcut to toggle dark mode
HUE-8621 - [editor] Add ace option to toggle dark mode
HUE-8621 - [editor] Add dark mode keyboard shortcut to the editor help
HUE-8623 - [frontend] Send cluster when checking if a table or database exists in the importer
HUE-8624 - [beeswax] Fix tests on create database to redirect on a v4 page
HUE-8625 - [editor] Prevent js exception when dragging from top search to the editor after visiting the importer
HUE-8626 - [security] Fix navigation issues after visiting the security app
HUE-8627 - [frontend] Add partition result view to the top search
HUE-8628 - [assist] Indicate context in the left assist filter placeholder
HUE-8629 - [assist] Don't show a database icon in the breadcrumb of non sql type assist panels
HUE-8629 - [assist] Customise the assist icons for streams
HUE-8629 - [assist] Add a dedicated streams assist panel
HUE-8629 - [assist] Make sure entries are loaded in left assist for non sql types
HUE-8629 - [assist] Improve assist context menu for kafka
HUE-8630 - [core] Fix TestMetastoreWithHadoop.test_basic_flow _get_apps
HUE-8630 - Fix TestRdbmsIndexer missing RdbmsIndexer
HUE-8630 - [fb] Fix TestFileBrowserWithHadoop.test_index home_directory
HUE-8634 - HUE-8111 [core] Perform 4.3 release
HUE-8635 - [editor] Add the correct styles to the language reference context popover
HUE-8639 - [metadata] Do not do Sentry filtering when Sentry is not configured
HUE-8639 - [metadata] Include the docstring into the configuration
HUE-8650 - [importer] Fix make_notebook default namespace & compute
HUE-8652 - [frontend] Fix JS exception in jquery.hiveautocomplete when no namespaces are returned
HUE-8654 - [editor] Prevent setting empty object for namespace and compute
HUE-8654 - [editor] Guarantee a namespace and compute is set in single cluster mode
HUE-8655 - [editor] Have the location handler wait for a compute and namespace to be set
HUE-8656 - [tb] Make sure a compute is always set in the table browser
HUE-8660 - [assist] Fix file preview in left assist for files with # in the name
HUE-8660 - [core] Fix page routing issues with file browser paths containing #
HUE-8660 - [assist] Support multiple # in file names for assist preview
HUE-8662 - [core] Fix missing static URLs

Apache Impala

The following issues are fixed in CDH 6.1.0:

IMPALA-6202 - The mod() function now behaves the same as the % operator.
IMPALA-6373 - Allow primitive type widening on Parquet tables. Impala supports only the following conversions, which involve no loss of precision:
- TINYINT (INT32) -> SMALLINT (INT32), INT (INT32), BIGINT (INT64), DOUBLE
- SMALLINT (INT32) -> INT (INT32), BIGINT (INT64), DOUBLE
- INT (INT32) -> BIGINT (INT64), DOUBLE
- FLOAT -> DOUBLE
IMPALA-6442 - Fixed the incorrectly reported Parquet file offset in error messages.
IMPALA-6568 - The Query Compilation section was added to profile outputs.
IMPALA-6844 - Impala now correctly handles a possible null pointer in the to_date() function.
IMPALA-7272 - Fixed potential crash when a min-max runtime filter is generated for a string value.
IMPALA-7449 - Fixed the network throughput calculation by measuring the network throughput of each individual RPC and using a summary counter to track the avg/min/max of network throughputs.
IMPALA-7585 - Now Impala always explicitly sets user credentials after creating RPC proxy.
IMPALA-7668 - Now Impala closes URLClassLoader instances and cleans up any open temporary jar files to avoid file descriptor leaks and disk space issues.
IMPALA-7824 - Running INVALIDATE METADATA with authorization enabled no longer causes a hang when Sentry is unavailable.
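The widening rules listed under IMPALA-6373 can be summarized as a small lookup table. The following is a minimal sketch for illustration only, not Impala's implementation; the names WIDENING and can_widen are hypothetical, and treating identical types as compatible is an assumption added for completeness.

```python
# Allowed widenings, copied from the IMPALA-6373 entry above.
# WIDENING and can_widen are hypothetical names used only in this sketch.
WIDENING = {
    "TINYINT": {"SMALLINT", "INT", "BIGINT", "DOUBLE"},
    "SMALLINT": {"INT", "BIGINT", "DOUBLE"},
    "INT": {"BIGINT", "DOUBLE"},
    "FLOAT": {"DOUBLE"},
}


def can_widen(file_type: str, table_type: str) -> bool:
    """Return True if a column stored as file_type may be read as table_type.

    Identical types are treated as compatible (an assumption of this sketch).
    """
    src, dst = file_type.upper(), table_type.upper()
    return src == dst or dst in WIDENING.get(src, set())
```

For example, can_widen("INT", "BIGINT") is accepted, while the narrowing direction can_widen("BIGINT", "INT") is rejected.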

Apache Kafka

The following issues are fixed in CDH 6.1.0:

KAFKA-2983 - Remove Scala consumers and related code
KAFKA-3702 - Change log level of SSL close_notify failure
KAFKA-4950 - Fix ConcurrentModificationException on assigned-partitions metric update
KAFKA-5098 - KafkaProducer should reject sends to invalid topics
KAFKA-5588 - Remove deprecated --new-consumer tools option
KAFKA-5697 - Use nonblocking poll in Streams
KAFKA-5891 - Proper handling of LogicalTypes in Cast
KAFKA-5919 - Adding checks on "version" field for tools using it
KAFKA-6054 - Add 'version probing' to Kafka Streams rebalance
KAFKA-6264 - Split log segments as needed if offsets overflow the indexes
KAFKA-6538 - Changes to enhance ByteStore exceptions thrown from RocksDBStore with more human readable info
KAFKA-6546 - Use LISTENER_NOT_FOUND error for missing listener
KAFKA-6562 - Make jackson-databind an optional clients dependency
KAFKA-6648 - Fetcher.getTopicMetadata() should return all partitions for each requested topic
KAFKA-6697 - Broker should not die if getCanonicalPath fails
KAFKA-6704 - InvalidStateStoreException from IQ when StreamThread closes store
KAFKA-6711 - GlobalStateManagerImpl should not write offsets of in-memory stores in checkpoint file
KAFKA-6726 - Fine Grained ACL for CreateTopics (KIP-277)
KAFKA-6730 - Simplify State Store Recovery
KAFKA-6743 - ConsumerPerformance fails to consume all messages [KIP-281]
KAFKA-6749 - Fixed TopologyTestDriver to process stream processing guarantee as exactly once
KAFKA-6750 - Add listener name to authentication context (KIP-282)
KAFKA-6760 - Fix response logging in the Controller
KAFKA-6782 - solved the bug of restoration of aborted messages for GlobalStateStore and KGlobalTable
KAFKA-6805 - Enable broker configs to be stored in ZK before broker start
KAFKA-6809 - Count inbound connections in the connection-creation metric
KAFKA-6813 - return to double-counting for count topology names
KAFKA-6841 - Support Prefixed ACLs (KIP-290)
KAFKA-6859 - Do not send LeaderEpochRequest for undefined leader epochs
KAFKA-6860 - Fix NPE in Kafka Streams with EOS enabled
KAFKA-6884 - Consumer group command should use new admin client
KAFKA-6897 - Prevent KafkaProducer.send from blocking when producer is closed
KAFKA-6906 - MINOR: code cleanup follow up for
KAFKA-6906 - Fixed to commit transactions if data is produced via wall clock punctuation
KAFKA-6927 - Chunked down-conversion to prevent out of memory errors on broker [KIP-283]
KAFKA-6935 - Add config for allowing optional optimization
KAFKA-6936 - Implicit materialized for aggregate, count and reduce
KAFKA-6944 - Add system tests testing the new throttling behavior using older clients/brokers
KAFKA-6946 - Keep the session id for incremental fetch when fetch responses are throttled
KAFKA-6949 - alterReplicaLogDirs() should grab partition lock when accessing log of the future replica
KAFKA-6955 - Use Java AdminClient in DeleteRecordsCommand
KAFKA-6967 - TopologyTestDriver does not allow pre-populating state stores that have change logging
KAFKA-6973 - Validate topic config message.timestamp.type
KAFKA-6975 - Fix replica fetching from non-batch-aligned log start offset
KAFKA-6979 - Add `default.api.timeout.ms` to KafkaConsumer (KIP-266)
KAFKA-6981 - Move the error handling configuration properties into the ConnectorConfig and SinkConnectorConfig classes
KAFKA-6986 - Export Admin Client metrics through Stream Threads
KAFKA-6991 - Fix ServiceLoader issue with PluginClassLoader
KAFKA-6997 - Exclude test-sources.jar when $INCLUDE_TEST_JARS is FALSE
KAFKA-7001 - Rename errors.allowed.max property in Connect to errors.tolerance
KAFKA-7002 - Add a config property for DLQ topic's replication factor
KAFKA-7003 - Set error context in message headers
KAFKA-7005 - Remove duplicate resource class.
KAFKA-7006 - remove duplicate Scala ResourceNameType in preference to...
KAFKA-7007 - Use JSON for /kafka-acl-extended-changes path
KAFKA-7010 - Rename ResourceNameType to PatternType
KAFKA-7011 - Remove ResourceNameType field from Java Resource class.
KAFKA-7012 - Don't process SSL channels without data to process
KAFKA-7019 - Make reading metadata lock-free by maintaining an atomically-updated read snapshot
KAFKA-7021 - Reuse source based on config
KAFKA-7021 - Update upgrade guide section for reusing source topic
KAFKA-7023 - Move prepareForBulkLoad() call after customized RocksDBConfigSetter
KAFKA-7028 - Properly authorize custom principal objects
KAFKA-7029 - Update ReplicaVerificationTool not to use SimpleConsumer
KAFKA-7030 - Add configuration to disable message down-conversion (KIP-283)
KAFKA-7031 - Connect API shouldn't depend on Jersey
KAFKA-7032 - The TimeUnit is neglected by KafkaConsumer#close(long, Tim...
KAFKA-7039 - Create an instance of the plugin only it's a Versioned Plugin
KAFKA-7043 - Modified plugin isolation whitelist with recently added converters
KAFKA-7044 - Fix Fetcher.fetchOffsetsByTimes and NPE in describe consumer group
KAFKA-7047 - Added SimpleHeaderConverter to plugin isolation whitelist
KAFKA-7048 - NPE when creating connector
KAFKA-7050 - Decrease default consumer request timeout to 30s
KAFKA-7055 - Update InternalTopologyBuilder to throw TopologyException if a processor or sink is added with no upstream node attached
KAFKA-7056 - Moved Connects new numeric converters to runtime
KAFKA-7058 - Comparing schema default values using Objects#deepEquals()
KAFKA-7066 - added better logging in case of Serialisation issue
KAFKA-7068 - Handle null config values during transform
KAFKA-7076 - Skip rebuilding producer state when using old message format
KAFKA-7080 - pass segmentInterval to CachingWindowStore
KAFKA-7082 - Concurrent create topics may throw NodeExistsException
KAFKA-7091 - AdminClient should handle FindCoordinatorResponse errors
KAFKA-7097 - HOTFIX: Set create time default to -1L in VerifiableProducer
KAFKA-7097 - VerifiableProducer does not work properly with --message-create-time argument
KAFKA-7104 - More consistent leader's state in fetch response
KAFKA-7111 - Log error connecting to node at a higher log level
KAFKA-7112 - MINOR: Only resume restoration if state is still PARTITIONS_ASSIGNED after poll
KAFKA-7119 - Handle transient Kerberos errors as non-fatal exceptions
KAFKA-7119 - Handle transient Kerberos errors on server side
KAFKA-7126 - Reduce number of rebalance for large consumer group after a topic is created
KAFKA-7128 - Follower has to catch up to offset within current leader epoch to join ISR
KAFKA-7136 - Avoid deadlocks in synchronized metrics reporters
KAFKA-7144 - Fix task assignment to be even
KAFKA-7147 - ReassignPartitionsCommand should be able to connect to broker over SSL
KAFKA-7164 - Follower should truncate after every missed leader epoch change
KAFKA-7168 - Treat connection close during SSL handshake as retriable
KAFKA-7182 - SASL/OAUTHBEARER client response missing %x01 seps
KAFKA-7185 - Allow empty resource name when matching ACLs
KAFKA-7192 - Follow-up: update checkpoint to the reset beginning offset
KAFKA-7192 - Wipe out if EOS is turned on and checkpoint file does not exist
KAFKA-7194 - Fix buffer underflow if onJoinComplete is retried after failure
KAFKA-7216 - Ignore unknown ResourceTypes while loading acl cache
KAFKA-7228 - Set errorHandlingMetrics for dead letter queue
KAFKA-7231 - Ensure NetworkClient uses overridden request timeout
KAFKA-7242 - Reverse xform configs before saving
KAFKA-7250 - switch scala transform to TransformSupplier
KAFKA-7250 - fix transform function in scala DSL to accept TransformerSupplier
KAFKA-7255 - Fix timing issue with create/update in SimpleAclAuthorizer
KAFKA-7261 - Record 1.0 for total metric when Count stat is used for rate
KAFKA-7278 - replaceSegments() should not call asyncDeleteSegment() for segments which have been removed from segments list
KAFKA-7280 - Synchronize consumer fetch request/response handling
KAFKA-7284 - streams should unwrap fenced exception
KAFKA-7285 - Create new producer on each rebalance if EOS enabled
KAFKA-7286 - Avoid getting stuck loading large metadata records
KAFKA-7287 - Set open ACL for old consumer znode path
KAFKA-7296 - Handle coordinator loading error in TxnOffsetCommit
KAFKA-7298 - Raise UnknownProducerIdException if next sequence number is unknown
KAFKA-7301 - Fix streams Scala join ambiguous overload
KAFKA-7316 - Fix Streams Scala filter recursive call #5538
KAFKA-7322 - Fix race condition between log cleaner thread and log retention thread when topic cleanup policy is updated
KAFKA-7347 - Return not leader error for OffsetsForLeaderEpoch requests to non-replicas
KAFKA-7353 - Connect logs 'this' for anonymous inner classes
KAFKA-7354 - Fix IdlePercent and NetworkProcessorAvgIdlePercent metric
KAFKA-7369 - Handle retriable errors in AdminClient list groups API
KAFKA-7385 - Fix log cleaner behavior when only empty batches are retained
KAFKA-7386 - streams-scala should not cache serdes
KAFKA-7388 - equal sign in property value for password
KAFKA-7414 - Out of range errors should never be fatal for follower
KAFKA-7434 - Fix NPE in DeadLetterQueueReporter
KAFKA-7453 - Expire registered channels not selected within idle timeout
KAFKA-7454 - Use lazy allocation for SslTransportLayer buffers and null them on close
KAFKA-7459 - Use thread-safe Pool for RequestMetrics.requestRateInternal
KAFKA-7460 - Fix Connect Values converter date format pattern
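Two of the entries above concern consumer timeouts: KAFKA-6979 adds the default.api.timeout.ms setting and KAFKA-7050 lowers the default consumer request timeout to 30 seconds. The following is a minimal sketch of how a client configuration might set both explicitly. The host name, group ID, and the 60000 ms value are assumptions chosen for illustration, and to_properties is a hypothetical helper, not part of any Kafka API.

```python
# Illustrative consumer settings; host, group, and the 60000 ms value
# are assumptions for this sketch, not recommendations.
consumer_config = {
    "bootstrap.servers": "broker1.example.com:9092",  # hypothetical broker
    "group.id": "example-group",                      # hypothetical group
    "request.timeout.ms": 30000,      # 30s, matching the default noted in KAFKA-7050
    "default.api.timeout.ms": 60000,  # setting added by KAFKA-6979; value assumed
}


def to_properties(config: dict) -> str:
    """Render a config mapping as sorted key=value lines (hypothetical helper)."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


properties_text = to_properties(consumer_config)
```

The rendered text can be saved as a client properties file. Verify both values against the Kafka documentation for the version you run before relying on them.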

Apache Kudu

The following issues are fixed in CDH 6.1.0:

KUDU-844 - [webui]and other /tablet-rowsetlayout-svg improvements
KUDU-972 - Fixed an issue where Kudu’s block cache memory tracking (as seen on the /mem-trackers web UI page) wasn’t accounting for all of the overhead of the cache itself.
KUDU-1038 - When a tablet is deleted, its write-ahead log recovery directory is also deleted, if it exists.
KUDU-2179 - Fixed an issue where kudu cluster ksck running a snapshot checksum scan would use a single snapshot timestamp for all tablets. This caused the checksum process to fail if the checksum process took a long time and the number of tablets was sufficiently large. The tool should now be able to checksum tables even if the process takes many hours.
KUDU-2260 - Fixed a rare issue where system failure could leave unexpected null bytes at the end of metadata files, causing Kudu to be unable to restart.
KUDU-2293 - Fixed an issue with failed tablet copies that would cause subsequent tablet copies to crash the tablet server.
KUDU-2322 - Fixed a bug where the leader logged excessively when the followers fell behind.
KUDU-2324 - Add gflags to disable individual maintenance ops.
KUDU-2335 - Fixed reporting of leader health during lifecycle transitions.
KUDU-2364 - When a tablet server was wiped and recreated with the same RPC address, ksck listed it twice, both as healthy, even though only one of them was there. This bug is now fixed by verifying the UUID of the server.
KUDU-2406 - Fixed an issue preventing Kudu from starting when using Vormetric’s encrypted filesystem (secfs2) on ext4.
Note: Use of Vormetric encryption for Kudu is considered experimental. We recommend that you experiment with Vormetric encryption with Kudu in a development environment.
KUDU-2414 - Fixed an issue where the C++ client would fail to reopen an expired scanner; instead, the client would retry in a tight loop and eventually time out.
KUDU-2437 - Split a tablet into primary key ranges by size.
KUDU-2443 - Fixed moving single-replica tablets.
KUDU-2447 - Fixed a tablet server crash when a tablet is scanned with two predicates on its primary key and the predicates do not overlap.
KUDU-2463 - Fixed a bug in which incorrect results would be returned in scans following a server restart.
KUDU-2509 - Fixed use-after-free in case of WAL replay error.
KUDU-2510 - Fixed symmetric difference logging.
KUDU-2521 - Java Implementation for BloomFilter.
KUDU-2525 - Fixed an issue where the Kudu MapReduce connector’s KuduTableInputFormat may exhaust its scan too early.
KUDU-2531 - (part 1) Ignore invalid tablet metadata files.
KUDU-2531 - (part 2) Add -nobackup flag to pbc edit tool.
KUDU-2540 - Fixed a bug causing a tablet server crash when a write batch request from a client failed coarse-grained authorization.
KUDU-2580 - Fixed authentication token reacquisition in the C++ client.
KUDU-2601 - Correctly print newly created files.
Fixed an error that would cause the Kudu CLI tool to unexpectedly exit when the connection to the master or tserver was abruptly closed.
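KUDU-2447 above concerns a scan with two primary key predicates that do not overlap. As a rough illustration of that case, and not a description of Kudu's internals, the sketch below intersects two half-open key ranges and returns None when no key can satisfy both. The Range alias and the intersect function are hypothetical names.

```python
from typing import Optional, Tuple

# Half-open [lower, upper) bounds on an integer key; hypothetical alias.
Range = Tuple[int, int]


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the overlap of two key ranges, or None if they do not overlap."""
    lower = max(a[0], b[0])
    upper = min(a[1], b[1])
    return (lower, upper) if lower < upper else None
```

A scan whose predicates intersect to None matches no rows, which is the situation the fix addresses.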

Apache Oozie

The following issues are fixed in CDH 6.1.0:

OOZIE-2427 - [Kerberos] Authentication failure for the javascript resources under /ext-2.2
OOZIE-2791 - ShareLib installation may fail on busy Hadoop clusters
OOZIE-2867 - [Coordinators] Emphasize Region/City timezone format
OOZIE-2883 - ProxyUserService: invalid configuration error message is misleading
OOZIE-2914 - Consolidate trim calls
OOZIE-2934 - [sharelib/spark] Fix Findbugs error
OOZIE-2967 - TestStatusTransitService.testBundleStatusCoordSubmitFails fails intermittently in Apache Oozie Core 5.0.0-SNAPSHOT
OOZIE-2968 - TestJavaActionExecutor.testCredentialsSkip fails intermittently
OOZIE-3134 - Potential inconsistency between the in-memory SLA map and the Oozie database
OOZIE-3155 - [ui] Job DAG is not refreshed when a job is finished
OOZIE-3217 - Enable definition of admin users using oozie-site.xml
OOZIE-3221 - Rename DEFAULT_LAUNCHER_MAX_ATTEMPS
OOZIE-3224 - Upgrade Jetty to 9.3
OOZIE-3229 - [client] [ui] Improved SLA filtering options
OOZIE-3229 - [build] test-patch-30-distro improvement
OOZIE-3233 - Remove DST shift from the coordinator job's end time
OOZIE-3235 - Upgrade ActiveMQ to 5.15.3
OOZIE-3251 - Disable JMX for ActiveMQ in the tests
OOZIE-3257 - TestHiveActionExecutor#testHiveAction still fails
OOZIE-3260 - [sla] Remove stale item above max retries on JPA related errors from in-memory SLA map
OOZIE-3298 - [MapReduce action] External ID is not filled properly and failing MR job is treated as SUCCEEDED
OOZIE-3303 - Oozie UI does not work after Jetty 9.3 upgrade
OOZIE-3309 - Runtime error during /v2/sla filtering for bundle
OOZIE-3310 - SQL error during /v2/sla filtering
OOZIE-3348 - [Hive action] Remove dependency hive-contrib
OOZIE-3354 - [core] [SSH action] SSH action gets hung
OOZIE-3369 - [core] Upgrade guru.nidi:graphviz-java to 0.7.0
OOZIE-3370 - Property filtering is not consistent across job submission
OOZIE-3376 - [tests] TestGraphGenerator should assume JDK8 minor version at least 1.8.0_u40
OOZIE-3378 - Coordinator action's status is SUBMITTED after E1003 error

Apache Parquet

The following issues are fixed in CDH 6.1.0:

PARQUET-952 - Avro union with single type fails with 'is not a group'
PARQUET-1417 - BINARY_AS_SIGNED_INTEGER_COMPARATOR fails with IOBE for the same arrays with the different length

Apache Pig

There are no notable fixed issues in this release.

Cloudera Search

The following issues are fixed in CDH 6.1.0:

SOLR-12541 - Metrics handler throws an error if there are transient cores.
SOLR-12594 - MetricsHistoryHandler.getOverseerLeader fails when hostname contains hyphen.
SOLR-12683 - HashQuery will throw an exception if more than 4 partitionKeys are specified.
SOLR-12704 - Guard AddSchemaFieldsUpdateProcessorFactory against null field names and field values.
SOLR-12750 - Migrate API should lock the collection instead of shard.
SOLR-12765 - Incorrect format of JMX cache stats.
SOLR-12836 - ZkController creates a cloud solr client with no connection or read timeouts.

For more information on the fixes, see the upstream release notes.

Apache Sentry

The following issues are fixed in CDH 6.1.0:

SENTRY-853 - Handle show grant on auth failure correctly
SENTRY-1572 - SentryMain() shouldn't dynamically load tool class
SENTRY-1896 - Optimize retrieving entities by other entity types
SENTRY-1944 - Optimize DelegateSentryStore.getGroupsByRoles() and update SentryGenericPolicyProcessor to retrieve roles to group mapping in a single transaction
SENTRY-2085 - Sentry error handling exposes SentryGroupNotFoundException externally.
SENTRY-2092 - Drop Role log message shows "Creating role"
SENTRY-2115 - Incorrect behavior of HMSFollower when HDFSSync feature is disabled.
SENTRY-2127 - Fix unstable unit test TestColumnEndToEnd.testCrossDbTableOperations
SENTRY-2141 - Sentry Privilege TimeStamp is not converted to grantTime in HivePrivilegeInfo correctly
SENTRY-2143 - Table renames should synchronize with Sentry
SENTRY-2168 - Altering table will not update sentry permissions when HDFS sync is disabled
SENTRY-2194 - Upgrade Sentry hadoop-version dependency to 2.7.5
SENTRY-2198 - Update to Kafka 1.0.0.
SENTRY-2199 - Bump Hive version from 2.3.2 to 2.3.3
SENTRY-2200 - Migrate 3.x Datanucleus unsupported configurations to 4.1 Datanucleus
SENTRY-2209 - Incorrect class in SentryHdfsMetricsUtil.java.
SENTRY-2210 - AUTHZ_PATH should have index on the foreign key AUTHZ_OBJ_ID
SENTRY-2213 - Increase schema version from 2.0.0 to 2.1.0
SENTRY-2214 - Sentry should not allow URI grants to EMPTY or NULL locations
SENTRY-2224 - Support SHOW GRANT on HIVE_OBJECT
SENTRY-2231 - Fix URI check on List Privileges by Provider in SentryStore
SENTRY-2238 - Explicitly set Database on SentryHivePrivilegeObjectDesc
SENTRY-2244 - Alter sentry role or user at granting privilege can avoid extra query to database
SENTRY-2245 - Remove privileges that do not associate with a role or a user
SENTRY-2251 - Update user privileges based on changes to authorizables
SENTRY-2252 - Normalize the Sentry store API's to handle both user/role privileges
SENTRY-2255 - alter table set owner command can be executed only by user with proper privilege
SENTRY-2258 - Remove user when it is not associated with other objects
SENTRY-2259 - SQL CONSTRAINT name is too long for Oracle 11.2
SENTRY-2261 - Implement JSONAlterDatabaseMessage to write HMS alter database events
SENTRY-2262 - Sentry client is not compatible when connecting to Sentry 2.0
SENTRY-2264 - It is possible to elevate privileges from DROP using alter table rename
SENTRY-2270 - Illegal privileges on columns can be granted on Hive
SENTRY-2271 - Wrong log messages/method names in SentrySchema related classes.
SENTRY-2273 - Create the SHOW GRANT USER task for Hive
SENTRY-2280 - The request received in SentryPolicyStoreProcessor.sentry_notify_hms_event is null
SENTRY-2281 - list_privileges_by_user() fails with a JDODetachedFieldAccessException
SENTRY-2293 - Fix logging parameters on SentryHDFSServiceProcessor
SENTRY-2294 - Add requestorUsername to client.notifyHmsEvent() method
SENTRY-2295 - Owner privileges should not be granted to sentry admin users
SENTRY-2307 - Avoid HMS event synchronization while sentry is fetching full snapshot
SENTRY-2309 - Catch NPE thrown when fetching Partitions with no corresponding SDS entry
SENTRY-2310 - Sentry is not able to fetch full update subsequently when there is an HMS restart in the snapshot process.
SENTRY-2312 - Update owner privileges for table when owner is changed.
SENTRY-2313 - alter database set owner command can be executed only by user with proper privilege
SENTRY-2315 - The grant all operation is not dropping the create/alter/drop/index/lock privileges
SENTRY-2324 - Allow sentry to fetch configurable notifications from HMS
SENTRY-2332 - Load hadoop default configuration when starting sentry service
SENTRY-2333 - Create index AUTHZ_PATH_FK_IDX at table AUTHZ_PATH for Postgres only when it does not exist
SENTRY-2352 - User roles with ALTER on a table can not show or describe the table on which they have ALTER
SENTRY-2359 - Object owner is unable to grant privileges: SentryAccessDeniedException
SENTRY-2373 - Incorrect WARN message when processing add partition messages
SENTRY-2376 - Bump Jackson libraries versions to 1.9.13 and 2.9.6
SENTRY-2392 - Add metrics statistics to list_user_privileges and list_role_privileges API
SENTRY-2395 - ALTER VIEW AS SELECT is asking for CREATE privileges instead of ALTER
SENTRY-2403 - Incorrect naming in RollingFileWithoutDeleteAppender
SENTRY-2406 - Make sure inputHierarchy and outputHierarchy have unique values
SENTRY-2409 - ALTER TABLE SET OWNER does not allow to change the table if using only the table name
SENTRY-2417 - LocalGroupMappingService class docs do not accurately reflect required INI format
SENTRY-2419 - Log where sentry stands in the process of persisting the snapshot
SENTRY-2423 - Increase the allocation size for auto-increment of id's for Snapshot tables.
SENTRY-2427 - Use Hadoop KerberosName class to derive shortName
SENTRY-2429 - Transfer database owner drops table owner
SENTRY-2432 - The case of a username is ignored when determining object ownership
SENTRY-2433 - Dropping object privileges does not include update of dropping user privileges

Apache Spark

The following issues are fixed in CDH 6.1.0:

SPARK-4502 - [SQL] Rename to spark.sql.optimizer.nestedSchemaPruning.enabled
SPARK-19355 - Revert [SPARK-25352]
SPARK-19724 - [SQL] allowCreatingManagedTableUsingNonemptyLocation should have legacy prefix
SPARK-20327 - [YARN] Follow up: fix resource request tests on Hadoop 3.
SPARK-20327 - [CORE][YARN] Add CLI support for YARN custom resources, like GPUs
SPARK-20360 - [PYTHON] reprs for interpreters
SPARK-20594 - Adjust fix for CDH version of Hive.
SPARK-21318 - [SQL] Improve exception message thrown by `lookupFunction`
SPARK-21402 - [SQL] Fix java array of structs deserialization
SPARK-22666 - [ML][FOLLOW-UP] Improve testcase to tolerate different schema representation
SPARK-23401 - [PYTHON][TESTS] Add more data types for PandasUDFTests
SPARK-23429 - [CORE] Add executor memory metrics to heartbeat and expose in executors REST API
SPARK-23549 - [SQL] Rename config spark.sql.legacy.compareDateTimestampInTimestamp
SPARK-23715 - Revert "[SQL] the input of to/from_utc_timestamp can not have timezone"
SPARK-23907 - [SQL] Revert regr_* functions entirely
SPARK-23972 - Revert "[BUILD][SQL] Update Parquet to 1.10.0."
SPARK-24157 - [SS][FOLLOWUP] Rename to spark.sql.streaming.noDataMicroBatches.enabled
SPARK-24324 - [PYTHON][FOLLOW-UP] Rename the Conf to spark.sql.legacy.execution.pandas.groupedMap.assignColumnsByName
SPARK-24518 - Revert "[CORE] Using Hadoop credential provider API to store password"
SPARK-24519 - [CORE] Compute SHUFFLE_MIN_NUM_PARTS_TO_HIGHLY_COMPRESS only once
SPARK-24709 - [SQL][FOLLOW-UP] Make schema_of_json's input json as literal only
SPARK-24709 - [SQL][2.4] use str instead of basestring in isinstance
SPARK-24777 - [SQL] Add write benchmark for AVRO
SPARK-24787 - [CORE] Revert hsync in EventLoggingListener and make FsHistoryProvider to read lastBlockBeingWritten data for logs
SPARK-24918 - [CORE] Executor Plugin API
SPARK-25021 - [K8S][BACKPORT] Add spark.executor.pyspark.memory limit for K8S
SPARK-25044 - [FOLLOW-UP] Change ScalaUDF constructor signature
SPARK-25314 - [SQL] Fix Python UDF accessing attributes from both side of join in join conditions
SPARK-25318 - Add exception handling when wrapping the input stream during the fetch or stage retry in response to a corrupted block
SPARK-25321 - [ML] Fix local LDA model constructor
SPARK-25384 - [SQL] Clarify fromJsonForceNullableSchema will be removed in Spark 3.0
SPARK-25416 - [SQL] ArrayPosition function may return incorrect result when right expression is implicitly down casted
SPARK-25417 - [SQL] ArrayContains function may return incorrect result when right expression is implicitly down casted
SPARK-25422 - [CORE] Don't memory map blocks streamed to disk.
SPARK-25425 - [SQL][BACKPORT-2.4] Extra options should override session options in DataSource V2
SPARK-25450 - [SQL] PushProjectThroughUnion rule uses the same exprId for project expressions in each Union child, causing mistakes in constant propagation
SPARK-25454 - [SQL] add a new config for picking minimum precision for integral literals
SPARK-25460 - [BRANCH-2.4][SS] DataSourceV2: SS sources do not respect SessionConfigSupport
SPARK-25468 - [WEBUI] Highlight current page index in the spark UI
SPARK-25469 - [SQL] Eval methods of Concat, Reverse and ElementAt should use pattern matching only once
SPARK-25495 - [SS] FetchedData.reset should reset all fields
SPARK-25502 - [CORE][WEBUI] Empty Page when page number exceeds the retainedTask size.
SPARK-25503 - [CORE][WEBUI] Total task message in stage page is ambiguous
SPARK-25505 - [SQL] The output order of grouping columns in Pivot is different from the input order
SPARK-25505 - [SQL][FOLLOWUP] Fix for attributes cosmetically different in Pivot clause
SPARK-25509 - [CORE] Windows doesn't support POSIX permissions
SPARK-25519 - [SQL] ArrayRemove function may return incorrect result when right expression is implicitly downcasted.
SPARK-25521 - [SQL] Job id showing null in the logs when insert into command Job is finished.
SPARK-25522 - [SQL] Improve type promotion for input arguments of elementAt function
SPARK-25533 - [CORE][WEBUI] AppSummary should hold the information about succeeded Jobs and completed stages only
SPARK-25535 - [CORE] Work around bad error handling in commons-crypto.
SPARK-25536 - [CORE] metric value for METRIC_OUTPUT_RECORDS_WRITTEN is incorrect
SPARK-25538 - [SQL] Zero-out all bytes when writing decimal
SPARK-25543 - [K8S] Print debug message iff execIdsRemovedInThisRound is not empty.
SPARK-25546 - [CORE] Don't cache value of EVENT_LOG_CALLSITE_LONG_FORM.
SPARK-25568 - [CORE] Continue to update the remaining accumulators when failing to update one accumulator
SPARK-25579 - [SQL] Use quoted attribute names if needed in pushed ORC predicates
SPARK-25591 - [PYSPARK][SQL] Avoid overwriting deserialized accumulator
SPARK-25601 - [PYTHON] Register Grouped aggregate UDF Vectorized UDFs for SQL Statement
SPARK-25602 - [SQL] SparkPlan.getByteArrayRdd should not consume the input when not necessary
SPARK-25636 - [CORE] spark-submit cuts off the failure reason when there is an error connecting to master
SPARK-25644 - [SS] Fix java foreachBatch in DataStreamWriter
SPARK-25660 - [SQL] Fix for the backward slash as CSV fields delimiter
SPARK-25669 - [SQL] Check CSV header only when it exists
SPARK-25673 - [BUILD] Remove Travis CI which enables Java lint check
SPARK-25674 - [SQL] If the records are incremented by more than 1 at a time, the number of bytes might rarely ever get updated
SPARK-25674 - [FOLLOW-UP] Update the stats for each ColumnarBatch
SPARK-25690 - [SQL] Analyzer rule HandleNullInputsForUDF does not stabilize and can be applied infinitely
SPARK-25697 - [CORE] When zstd compression enabled, InProgress application is throwing Error in the history webui
SPARK-25704 - [CORE] Allocate a bit less than Int.MaxValue
SPARK-25708 - [SQL] HAVING without GROUP BY means global aggregate
SPARK-25714 - Fix Null Handling in the Optimizer rule BooleanSimplification
SPARK-25718 - [SQL] Detect recursive reference in Avro schema and throw exception
SPARK-25727 - [SQL] Add outputOrdering to otherCopyArgs in InMemoryRelation
SPARK-25738 - [SQL] Fix LOAD DATA INPATH for hdfs port
SPARK-25741 - [WEBUI] Long URLs are not rendered properly in web UI
SPARK-25768 - [SQL] fix constant argument expecting UDAFs
SPARK-25776 - [CORE] The disk write buffer size must be greater than 12
SPARK-25793 - [ML] call SaveLoadV2_0.load for classNameV2_0
SPARK-25816 - [SQL] Fix attribute resolution in nested extractors
SPARK-25822 - [PYSPARK] Fix a race condition when releasing a Python worker
SPARK-25827 - [CORE] Avoid converting incoming encrypted blocks to byte buffers
SPARK-25840 - [BUILD] `make-distribution.sh` should not fail due to missing LICENSE-binary
SPARK-25842 - [SQL] Deprecate rangeBetween APIs introduced in SPARK-21608
SPARK-25854 - [BUILD] fix `build/mvn` not to fail during Zinc server shutdown
SPARK-25855 - [CORE] Don't use erasure coding for event logs by default
SPARK-25871 - [STREAMING] Don't use EC for streaming WAL
SPARK-25904 - [CORE] Allocate arrays smaller than Int.MaxValue
SPARK-25918 - [SQL] LOAD DATA LOCAL INPATH should handle a relative path

Apache Sqoop

The following issues are fixed in CDH 6.1.0:

SQOOP-2567 - SQOOP import for Oracle fails with invalid precision/scale for decimal
SQOOP-2949 - SQL Syntax error when split-by column is of character type and min or max value has single quote inside it
SQOOP-3042 - Sqoop does not clear compile directory under /tmp/sqoop-username/compile automatically
SQOOP-3052 - Introduce Gradle based build for Sqoop to make it more developer friendly / open
SQOOP-3082 - Sqoop import fails after TCP connection reset if split by datetime column
SQOOP-3224 - Mainframe FTP transfer should have an option to use binary mode for transfer
SQOOP-3225 - Mainframe module FTP listing parser should cater for larger datasets on disk
SQOOP-3267 - Incremental import to HBase deletes only last version of column
SQOOP-3288 - Changing OracleManager to use CURRENT_TIMESTAMP instead of
SQOOP-3300 - Implement JDBC and Kerberos tools for HiveServer2 support
SQOOP-3309 - Implement HiveServer2 client
SQOOP-3326 - Mainframe FTP listing for GDG should filter out non-GDG datasets in a heterogeneous listing
SQOOP-3327 - Mainframe FTP needs to Include "Migrated" datasets when parsing the FTP list
SQOOP-3328 - Implement an alternative solution for Parquet reading and writing
SQOOP-3330 - Sqoop --append does not work with -Dmapreduce.output.basename
SQOOP-3331 - Add Mainframe FTP integration test for GDG dataset.
SQOOP-3333 - Change default behavior of the MS SQL connector to non-resilient.
SQOOP-3335 - Add Hive support to the new Parquet writing implementation
SQOOP-3353 - Sqoop should not check incremental constraints for HBase imports
SQOOP-3378 - Error during direct Netezza import/export can interrupt process in uncontrolled ways

Apache ZooKeeper

The following issues are fixed in CDH 6.1.0:

ZOOKEEPER-706 - Large numbers of watches can cause session re-establishment to fail
ZOOKEEPER-1382 - Zookeeper server holds onto dead/expired session ids in the watch data structures
