Issues Fixed in CDH 5.7.x
The following topics describe issues fixed in CDH 5.7.x, from newest to oldest release. You can also review What's New In CDH 5.7.x or Known Issues in CDH 5.
Issues Fixed in CDH 5.7.6
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.7.6:
- AVRO-1943 - Flaky test: TestNettyServerWithCompression.testConnectionsCount.
- FLUME-2908 - NetcatSource - SocketChannel not closed when session is broken.
- FLUME-2997 - Fix flaky test in SpillableMemoryChannel.
- FLUME-3002 - Fix tests in TestBucketWriter.
- FLUME-3003 - Fix flaky testSourceCounter in TestSyslogUdpSource.
- FLUME-3049 - Make HDFS sink rotate more reliably in secure mode.
- HADOOP-7930 - Kerberos relogin interval in UserGroupInformation should be configurable.
- HADOOP-11031 - Design Document for Credential Provider API.
- HADOOP-11619 - FTPFileSystem should override getDefaultPort.
- HADOOP-12453 - Support decoding KMS Delegation Token with its own Identifier.
- HADOOP-12537 - S3A to support Amazon STS temporary credentials.
- HADOOP-12655 - TestHttpServer.testBindAddress bind port range is wider than expected.
- HADOOP-12723 - S3A: Add ability to plug in any AWSCredentialsProvider.
- HADOOP-13034 - Log message about input options in distcp lacks some items.
- HADOOP-13433 - Race in UGI.reloginFromKeytab.
- HADOOP-13590 - Retry until TGT expires even if the UGI renewal thread encountered exception.
- HADOOP-13627 - Have an explicit KerberosAuthException for UGI to throw, text from public constants.
- HADOOP-13641 - Update UGI#spawnAutoRenewalThreadForUserCreds to reduce indentation.
- HADOOP-13838 - KMSTokenRenewer should close providers.
- HADOOP-13953 - Make FTPFileSystem's data connection mode and transfer mode configurable.
- HADOOP-14003 - Make additional KMS tomcat settings configurable.
- HDFS-9428 - Fix intermittent failure of TestDNFencing.testQueueingWithAppend.
- HDFS-9630 - DistCp minor refactoring and clean up.
- HDFS-9638 - Improve DistCp Help and documentation.
- HDFS-9764 - DistCp doesn't print value for several arguments including -numListstatusThreads.
- HDFS-9804 - Allow long-running Balancer to login with keytab.
- HDFS-9820 - Improve distcp to support efficient restore to an earlier snapshot.
- HDFS-9888 - Allow resetting KerberosName in unit tests.
- HDFS-10216 - Distcp -diff throws exception when handling relative path.
- HDFS-10271 - Extra bytes are getting released from reservedSpace for append.
- HDFS-10298 - Document the usage of distcp -diff option.
- HDFS-10313 - Distcp needs to enforce the order of snapshot names passed to -diff.
- HDFS-10336 - TestBalancer failing intermittently because of not resetting UserGroupInformation completely.
- HDFS-10397 - Distcp should ignore -delete option if -diff option is provided instead of exiting.
- HDFS-10556 - DistCpOptions should be validated automatically.
- HDFS-10763 - Open files can leak permanently due to inconsistent lease update.
- HDFS-11040 - Add documentation for HDFS-9820 distcp improvement.
- HDFS-11056 - Concurrent append and read operations lead to checksum error.
- HDFS-11160 - VolumeScanner reports write-in-progress replicas as corrupt incorrectly.
- HDFS-11229 - HDFS-11056 failed to close meta file.
- HDFS-11275 - Check groupEntryIndex and throw a helpful exception on failures when removing ACL.
- HDFS-11292 - Log lastWrittenTxId and related info in logSyncAll.
- HDFS-11306 - Print remaining edit logs from buffer if edit log can't be rolled.
- MAPREDUCE-6571 - JobEndNotification info logs are missing in AM container syslog.
- MAPREDUCE-6763 - Shuffle server listen queue is too small.
- MAPREDUCE-6798 - Fix intermittent failure of TestJobHistoryParsing.testJobHistoryMethods.
- MAPREDUCE-6801 - Fix flaky TestKill.testKillJob.
- MAPREDUCE-6817 - The format of job start time in JHS is different from those of submit and finish time.
- MAPREDUCE-6831 - Flaky test TestJobImpl.testKilledDuringKillAbort
- YARN-2306 - Add test for leakage of reservation metrics in fair scheduler.
- YARN-3554 - Default value for maximum nodemanager connect wait time is too high.
- YARN-4363 - In TestFairScheduler, testcase should not create FairScheduler redundantly.
- YARN-4555 - TestDefaultContainerExecutor#testContainerLaunchError fails on non-English locale environment.
- YARN-5752 - TestLocalResourcesTrackerImpl#testLocalResourceCache times out.
- YARN-5837 - NPE when getting node status of a decommissioned node after an RM restart.
- YARN-5859 - TestResourceLocalizationService#testParallelDownloadAttemptsForPublicResource sometimes fails
- YARN-5862 - TestDiskFailures.testLocalDirsFailures failed.
- YARN-5890 - FairScheduler should log information about AM-resource-usage and max-AM-share for queues.
- YARN-5920 - Fix deadlock in TestRMHA.testTransitionedToStandbyShouldNotHang.
- HBASE-15324 - Jitter may cause desiredMaxFileSize overflow in ConstantSizeRegionSplitPolicy and trigger unexpected split.
- HBASE-15430 - Failed taking snapshot - Manifest proto-message too large.
- HBASE-16172 - Unify the retry logic in ScannerCallableWithReplicas and RpcRetryingCallerWithReadReplicas.
- HBASE-16270 - Handle duplicate clearing of snapshot in region replicas.
- HBASE-16345 - RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions.
- HBASE-16824 - Writer.flush() can be called on already closed streams in WAL roll.
- HBASE-16841 - Data loss in MOB files after cloning a snapshot and deleting that snapshot.
- HBASE-17058 - Lower epsilon used for jitter verification from HBASE-15324.
- HBASE-17241 - Avoid compacting already compacted mob files with _del files.
- HBASE-17452 - Failed taking snapshot - region Manifest proto-message too large.
- HBASE-17522 - Handle JVM throwing runtime exceptions when we ask for details on heap usage the same as a correctly returned 'undefined'.
- HIVE-10965 - direct SQL for stats fails in 0-column case.
- HIVE-11849 - NPE in HiveHBaseTableSnapshotInputFormat in query with just count(*).
- HIVE-12083 - HIVE-10965 introduces thrift error if partNames or colNames are empty.
- HIVE-12465 - Hive might produce wrong results when (outer) joins are merged.
- HIVE-12619 - Switching the field order within an array of structs causes the query to fail.
- HIVE-12780 - Fix the output of the history command in Beeline.
- HIVE-12789 - Fix output twice in the history command of Beeline.
- HIVE-12891 - Hive fails when java.io.tmpdir is set to a relative location.
- HIVE-12976 - MetaStoreDirectSql doesn't batch IN lists in all cases.
- HIVE-13129 - CliService leaks HMS connection.
- HIVE-13149 - Remove some unnecessary HMS connections from HS2.
- HIVE-13240 - GroupByOperator: Drop the hash aggregates when closing operator.
- HIVE-13539 - HiveHFileOutputFormat searching the wrong directory for HFiles.
- HIVE-13696 - Modify FairSchedulerShim to dynamically reload changes to fair-scheduler.xml.
- HIVE-13866 - flatten callstack for directSQL errors.
- HIVE-14173 - NPE was thrown after enabling directsql in the middle of session.
- HIVE-14764 - Enabling "hive.metastore.metrics.enabled" throws OOM in HiveMetastore.
- HIVE-14820 - RPC server for spark inside HS2 is not getting server address properly.
- HIVE-15054 - Hive insertion query execution fails on Hive on Spark.
- HIVE-15090 - Temporary DB failure can stop ExpiredTokenRemover thread.
- HIVE-15338 - Wrong result from non-vectorized DATEDIFF with scalar parameter of type DATE/TIMESTAMP.
- HIVE-15410 - WebHCat supports get/set table property with its name containing period and hyphen.
- HIVE-15551 - memory leak in directsql for mysql+bonecp specific initialization.
- HUE-4662 - [security] fixing Hue - Wildcard Certificates not supported.
- HUE-4466 - [security] deliver csrftoken cookie with secure bit set if possible.
- HUE-5163 - [security] Speed up initial page rendering.
- HUE-4916 - [core] Truncate last name to 30 chars on ldap import.
- HUE-5050 - [core] Logout fails for local login when multiple backends are used.
- HUE-5042 - [core.backend] Unable to kill jobs after Resource Manager failover.
- HUE-4201 - [editor] Add warning about max limit of cells before truncation.
- HUE-4968 - [oozie] Remove access to /oozie/import_workflow when v2 is enabled.
- HUE-5218 - [search] Validate dashboard sharing works.
- IMPALA-2864 - Ensure that client connections are closed after a failed Open()
- IMPALA-3167 - Fix assignment of WHERE conjunct through grouping agg + OJ.
- IMPALA-3552 - Make incremental stats max serialized size configurable
- IMPALA-3698 - Fix Isilon permissions test
- IMPALA-3861 - Replace BetweenPredicates with their equivalent CompoundPredicate.
- IMPALA-3875 - Thrift threaded server hang in some cases
- IMPALA-3983/IMPALA-3974 - Delete function jar resources after load
- IMPALA-4153 - Return valid non-NULL pointer for 0-byte allocations
- IMPALA-4223 - Handle truncated file read from HDFS cache
- IMPALA-4336 - Cast exprs after unnesting union operands.
- IMPALA-4363 - Add Parquet timestamp validation
- IMPALA-4423 - Correct but conservative implementation of Subquery.equals().
- IMPALA-4433 - Always generate testdata using the same time zone setting
- IMPALA-4449 - Revisit table locking pattern in the catalog
- IMPALA-4488 - HS2 GetOperationStatus() should keep session alive
- IMPALA-4550 - Fix CastExpr analysis for substituted slots
- IMPALA-4579 - SHOW CREATE VIEW fails for view containing a subquery
- IMPALA-4765 - Avoid using several loading threads on one table.
- OOZIE-2194 - oozie job -kill doesn't work with spark action.
- OOZIE-2243 - Kill Command does not kill the child job for java action.
- OOZIE-2584 - Eliminate Thread.sleep() calls in TestMemoryLocks.
- OOZIE-2678 - oozie job -kill doesn't work with tez jobs.
- OOZIE-2742 - Unable to kill applications based on tag.
- PIG-5025 - Fix flaky test failures in TestLoad.java.
- SENTRY-1260 - Improve error handling - ArrayIndexOutOfBoundsException in PathsUpdate.parsePath can cause MetastoreCacheInitializer initialization to fail.
- SENTRY-1270 - Improve error handling - Database with malformed URI causes NPE in HMS plugin during DDL.
- SENTRY-1520 - Provide mechanism for triggering HMS full snapshot.
- SENTRY-1564 - Improve error detection and reporting in MetastoreCacheInitializer.java.
- SOLR-9284 - The HDFS BlockDirectoryCache should not let its keysToRelease or names maps grow indefinitely.
- SOLR-9330 - Fix AlreadyClosedException on admin/mbeans?stats=true.
- SOLR-10031 - Validation of filename params in ReplicationHandler.
- SPARK-12241 - [YARN] Improve failure reporting in Yarn client obtainTokenForHBase().
- SPARK-12523 - [YARN] Support long-running of the Spark On HBase and hive meta store.
- SPARK-12966 - [SQL] ArrayType(DecimalType) support in Postgres JDBC.
- SPARK-13566 - [CORE] Avoid deadlock between BlockManager and Executor Thread.
- SPARK-13958 - Executor OOM due to unbounded growth of pointer array in…
- SPARK-14204 - [SQL] register driverClass rather than user-specified class.
- SPARK-16044 - [SQL] Backport input_file_name() for data source based on NewHadoopRDD to branch 1.6.
- SPARK-17245 - [SQL][BRANCH-1.6] Do not rely on Hive's session state to retrieve HiveConf.
- SPARK-17465 - [SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak.
- SPARK-18750 - [YARN] Follow up: move test to correct directory in 2.1 branch.
- SPARK-18750 - [YARN] Avoid using "mapValues" when allocating containers.
- SQOOP-2349 - Add command line option for setting transaction isolation levels for metadata queries.
- SQOOP-2880 - Provide argument for overriding temporary directory.
- SQOOP-2884 - Document --temporary-rootdir.
- SQOOP-2915 - Fixing Oracle related unit tests.
- SQOOP-2983 - OraOop export has degraded performance with wide tables.
- SQOOP-3013 - Configuration "tmpjars" is not checked for empty strings before passing to MR.
- SQOOP-3028 - Include stack trace in the logging of exceptions in ExportTool.
- SQOOP-3034 - HBase import should fail fast if using anything other than as-textfile.
- SQOOP-3053 - Create a cmd line argument for sqoop.throwOnError and use it through SqoopOptions.
- SQOOP-3055 - Fixing MySQL tests failing due to ignored test inputs/configuration.
- SQOOP-3057 - Fixing 3rd party Oracle tests failing due to invalid case of column names.
- SQOOP-3071 - Fix OracleManager to apply localTimeZone correctly in case of Date objects too.
- SQOOP-3124 - Fix ordering in column list query of PostgreSQL connector to reflect the logical order instead of adhoc ordering.
Issues Fixed in CDH 5.7.5
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.7.5:
- HADOOP-10300 - Allowed deferred sending of call responses
- HADOOP-12483 - Maintain wrapped SASL ordering for postponed IPC responses
- HADOOP-13317 - Add logs to KMS server-side to improve supportability
- HADOOP-13558 - UserGroupInformation created from a Subject incorrectly tries to renew the Kerberos ticket
- HADOOP-13638 - KMS should set UGI's Configuration object properly
- HADOOP-13669 - KMS Server should log exceptions before throwing
- HADOOP-13693 - Remove the message about HTTP OPTIONS in SPNEGO initialization message from kms audit log
- HDFS-4176 - EditLogTailer should call rollEdits with a timeout
- HDFS-6962 - ACLs inheritance conflict with umaskmode
- HDFS-7413 - Some unit tests should use NameNodeProtocols instead of FSNameSystem
- HDFS-7964 - Add support for async edit logging
- HDFS-8224 - Schedule a block for scanning if its metadata file is corrupt
- HDFS-8709 - Clarify automatic sync in FSEditLog#logEdit
- HDFS-9038 - DFS reserved space is erroneously counted towards non-DFS used
- HDFS-10178 - Permanent write failures can happen if pipeline recoveries occur for the first packet
- HDFS-10609 - Uncaught InvalidEncryptionKeyException during pipeline recovery can abort downstream applications
- HDFS-10641 - TestBlockManager#testBlockReportQueueing fails intermittently
- HDFS-10722 - Fix race condition in TestEditLog#testBatchedSyncWithClosedLogs
- HDFS-10760 - DataXceiver#run() should not log InvalidToken exception as an error
- HDFS-10879 - TestEncryptionZonesWithKMS#testReadWrite fails intermittently
- HDFS-10962 - TestRequestHedgingProxyProvider fails intermittently
- HDFS-11012 - Unnecessary INFO logging on DFSClients for InvalidToken
- MAPREDUCE-6633 - AM should retry map attempts if the reduce task encounters compression related errors
- MAPREDUCE-6718 - Add progress log to JHS during startup
- MAPREDUCE-6728 - Give fetchers hint when ShuffleHandler rejects a shuffling connection
- MAPREDUCE-6771 - RMContainerAllocator sends container diagnostics event after corresponding completion event
- YARN-4004 - container-executor should print output of Docker logs if the Docker container exits with non-0 exit status
- YARN-4017 - container-executor overuses PATH_MAX
- YARN-4245 - Generalize config file handling in container-executor
- YARN-4255 - container-executor does not clean up Docker operation command files
- YARN-4723 - NodesListManager$UnknownNodeId ClassCastException
- YARN-4940 - yarn node -list -all fails if RM starts with decommissioned node
- YARN-5704 - Provide configuration knobs to control enabling/disabling new/work in progress features in container-executor
- HBASE-16294 - hbck reporting "No HDFS region dir found" for replicas
- HBASE-16699 - Overflows in AverageIntervalRateLimiter's refill() and getWaitInterval()
- HBASE-16767 - Mob compaction needs to clean up files in /hbase/mobdir/.tmp and /hbase/mobdir/.tmp/.bulkload when running into IO exceptions
- HIVE-10384 - Backport: RetryingMetaStoreClient does not retry wrapped TTransportExceptions
- HIVE-12077 - MSCK Repair table should fix partitions in batches
- HIVE-12475 - Parquet schema evolution within array<struct<>> does not work
- HIVE-12785 - View with union type and UDF to the struct is broken
- HIVE-13058 - Add session and operation_log directory deletion messages
- HIVE-13198 - Authorization issues with cascading views
- HIVE-13237 - Select parquet struct field with upper case throws NPE
- HIVE-13429 - Tool to remove dangling scratch dir
- HIVE-13997 - Insert overwrite directory does not overwrite existing files
- HIVE-14313 - Test failure TestMetaStoreMetrics.testConnections
- HIVE-14421 - FS.deleteOnExit holds references to _tmp_space.db files
- HIVE-14762 - Add logging while removing scratch space
- HIVE-14784 - Operation logs are disabled automatically if the parent directory does not exist
- HIVE-14799 - Query operations are not thread safe during cancellation
- HIVE-14805 - Subquery inside a view will have the object in the subquery as the direct input
- HIVE-14810 - Fix failing test: TestMetaStoreMetrics.testMetaDataCounts
- HIVE-14817 - Shutdown the SessionManager timeoutChecker thread properly upon shutdown
- HIVE-14839 - Improve the stability of TestSessionManagerMetrics
- HUE-3860 - Fix unittest beeswax.tests.test_hiveserver2_jdbc_url
- HUE-3905 - Reset beeswax.conf params in beeswax.tests:test_hiveserver2_jdbc_url
- HUE-4201 - Add warning about max limit of cells before truncation in the download query result
- HUE-4662 - Fixed: Wildcard Certificates not supported
- HUE-4739 - Fixed Jobbrowser tests which were failing after resource manager pool change
- HUE-4916 - Truncate last name to 30 chars on ldap import
- HUE-4968 - Remove access to /oozie/import_workflow when v2 is enabled
- HUE-5042 - Unable to kill jobs after Resource Manager failover
- HUE-5050 - Logout fails for local login when multiple backends are used
- HUE-5161 - Speed up roles rendering
- HUE-5163 - Speed up initial page rendering
- IMPALA-1619 - Support 64-bit allocations
- IMPALA-1740 - Add support for skip.header.line.count
- IMPALA-3458 - Fix table creation to test insert with header lines
- IMPALA-3949 - Log the error message in FileSystemUtil.copyToLocal()
- IMPALA-4037 - Fix locking during query cancellation
- IMPALA-4076 - Fix runtime filter sort compare method
- IMPALA-4099 - Fix the error message while loading UDFs with no JARs
- IMPALA-4120 - Incorrect results with LEAD() analytic function
- IMPALA-4135 - Thrift threaded server times-out connections during high load
- IMPALA-4170 - Fix identifier quoting in COMPUTE INCREMENTAL STATS
- IMPALA-4196 - Cross compile bit-byte functions
- IMPALA-4237 - Fix materialization of 4 byte decimals in data source scan node
- IMPALA-4246 - SleepForMs() utility function has undefined behavior for > 1s
- OOZIE-1814 - Oozie should mask any passwords in logs and REST interfaces
- SOLR-9310 - PeerSync fails on a node restart due to IndexFingerPrint mismatch
- SPARK-12009 - Avoid re-allocating YARN container when driver wants to stop all Executors
- SPARK-12392 - Optimize a location order of broadcast blocks by considering preferred local hosts
- SPARK-12941 - Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype mapping
- SPARK-13328 - Poor read performance for broadcast variables with dynamic resource allocation
- SPARK-16625 - General data types to be mapped to Oracle
- SPARK-16711 - YarnShuffleService does not re-init properly on YARN rolling upgrade
- SPARK-17171 - DAG lists all partitions in the graph
- SPARK-17433 - YarnShuffleService does not handle moving credentials levelDb
- SPARK-17611 - Make shuffle service test really test authentication
- SPARK-17644 - Do not add failedStages when abortStage for fetch failure
- SPARK-17696 - Partial backport of to branch-1.6.
- SQOOP-2952 - Row key not added into column family using --hbase-bulkload
- SQOOP-2986 - Add validation check for --hive-import and --incremental lastmodified
- SQOOP-3021 - ClassWriter fails if a column name contains a backslash character
Issues Fixed in CDH 5.7.4
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.7.4:
- FLUME-2797 - Use SourceCounter for SyslogTcpSource
- FLUME-2844 - SpillableMemoryChannel must start ChannelCounter
- HADOOP-8436 - NPE in getLocalPathForWrite(path, conf) when the required context item is not configured
- HADOOP-8437 - getLocalPathForWrite should throw IOException for invalid paths
- HADOOP-10048 - LocalDirAllocator should avoid holding locks while accessing the filesystem
- HADOOP-11469 - KMS should skip default.key.acl and whitelist.key.acl when loading key acl.
- HADOOP-12252 - LocalDirAllocator should not throw NPE with empty string configuration
- HADOOP-12548 - Read s3a credentials from a Credential Provider
- HADOOP-12609 - Fix intermittent failure of TestDecayRpcScheduler.
- HADOOP-13270 - BZip2CompressionInputStream finds the same compression marker twice in corner case, causing duplicate data blocks
- HADOOP-13353 - LdapGroupsMapping getPassword should not return null when IOException is thrown
- HADOOP-13437 - KMS should reload whitelist and default key ACLs when hot-reloading
- HADOOP-13487 - Hadoop KMS should load old delegation tokens from Zookeeper on startup
- HADOOP-13526 - Add detailed logging in KMS for the authentication failure of proxy user
- HADOOP-13579 - Fix source-level compatibility after HADOOP-11252
- HDFS-4210 - Throw helpful exception when DNS entry for JournalNode cannot be resolved
- HDFS-7415 - Move FSNameSystem.resolvePath() to FSDirectory
- HDFS-7420 - Delegate permission checks to FSDirectory
- HDFS-7463 - Simplify FSNamesystem#getBlockLocationsUpdateTimes
- HDFS-7478 - Move org.apache.hadoop.hdfs.server.namenode.NNConf to FSNamesystem
- HDFS-7517 - Remove redundant non-null checks in FSNamesystem#getBlockLocations
- HDFS-8269 - getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
- HDFS-8897 - Balancer should handle fs.defaultFS trailing slash in HA
- HDFS-9198 - Coalesce IBR processing in the NameNode.
- HDFS-9781 - FsDatasetImpl#getBlockReports can occasionally throw NullPointerException
- HDFS-9906 - Remove unhelpful log entries when restarting a datanode
- HDFS-9958 - BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages
- HDFS-10270 - TestJMXGet:testNameNode() fails
- HDFS-10457 - DataNode should not auto-format block pool directory if VERSION is missing
- HDFS-10544 - Balancer does not work with IPFailoverProxyProvider
- HDFS-10643 - Namenode should use loginUser(hdfs) to generateEncryptedKey
- HDFS-10822 - Log DataNodes in the write pipeline
- MAPREDUCE-4784 - TestRecovery occasionally fails
- MAPREDUCE-6359 - In RM HA setup, Cluster tab links populated with AM hostname instead of RM
- MAPREDUCE-6514 - Fixed MapReduce ApplicationMaster to properly update the resource ask after ramping down all reducers, avoiding job hangs
- MAPREDUCE-6628 - Potential memory leak in CryptoOutputStream
- MAPREDUCE-6670 - TestJobListCache#testEviction sometimes fails on Windows with timeout
- MAPREDUCE-6680 - JHS UserLogDir scan algorithm sometimes could skip directory with update in CloudFS (Azure FileSystem, S3, and so on)
- MAPREDUCE-6684 - High contention on scanning of user directory under immediate_done in Job History Server
- MAPREDUCE-6738 - TestJobListCache.testAddExisting failed intermittently in slow VM testbed
- MAPREDUCE-6761 - Regression when handling providers - invalid configuration ServiceConfiguration causes Cluster initialization failure
- YARN-2977 - Fixed intermittent TestNMClient failure
- YARN-4989 - TestWorkPreservingRMRestart#testCapacitySchedulerRecovery fails intermittently
- YARN-5608 - TestAMRMClient.setup() fails with ArrayOutOfBoundsException
- HBASE-15856 - Fix UnknownHostException import in MetaTableLocator
- HBASE-15856 - Do not cache unresolved addresses for connections
- HBASE-16194 - Should count in MSLAB chunk allocation into heap size change when adding duplicate cells
- HBASE-16195 - Should not add chunk into chunkQueue if not using chunk pool in HeapMemStoreLAB
- HBASE-16284 - Unauthorized client can shut down the cluster
- HBASE-16317 - Revert all ESAPI changes
- HBASE-16318 - Fail build while rendering velocity template if dependency license is not in whitelist
- HBASE-16318 - Consistently use the correct name for "Apache License, Version 2.0"
- HBASE-16321 - Ensure no findbugs-jsr305
- HBASE-16340 - Exclude Xerces implementation jars from coming in transitively
- HBASE-16360 - TableMapReduceUtil addHBaseDependencyJars has the wrong class name for PrefixTreeCodec
- HIVE-9570 - Investigate test failure on union_view.q
- HIVE-10007 - Support qualified table name in analyze table compute statistics for columns
- HIVE-10728 - Deprecate unix_timestamp(void) and make it deterministic
- HIVE-11901 - StorageBasedAuthorizationProvider requires write permission on table for SELECT statements
- HIVE-12556 - Ctrl-C in Beeline does not kill Tez query on HS2
- HIVE-13160 - HS2 unable to load UDFs on startup when HMS is not ready
- HIVE-13620 - Merge llap branch work to master
- HIVE-13645 - Beeline needs null-guard around hiveVars and hiveConfVars read
- HIVE-14296 - Session count is not decremented when HS2 clients do not shutdown cleanly
- HIVE-14436 - Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException Error"
- HIVE-14519 - Multi insert query bug
- HIVE-14538 - Beeline throws exceptions with parsing Hive configuration when using !sh statement
- HIVE-14715 - Hive throws NumberFormatException with query with Null value
- HIVE-14743 - ArrayIndexOutOfBoundsException - HBASE-backed views query with JOINs
- HUE-2689 - Sub-workflow submitted from coordinator gets parent workflow graph
- HUE-4541 - Fixing Hue job browser - Kerberos mutual authentication error in Hue
- HUE-4635 - Fix duration on jobs page for running jobs
- HUE-4804 - Download function of HTML widget breaks the display
- HUE-4808 - Do not show the edit link for sub-workflows when submitted outside Hue
- HUE-4809 - Add truststore parameters only if SSL is turned on
- HUE-4809 - Only add truststore paths when they actually exist
- IMPALA-3081 - Increase memory limit for TestWideRow
- IMPALA-3311 - Fix string data coming out of aggs in subplans
- IMPALA-3575 - Add retry to back end connection request and rpc timeout
- IMPALA-3678 - Fix migration of predicates into union operands with an order by + limit.
- IMPALA-3682 - Do not retry unrecoverable socket creation errors
- IMPALA-3687 - Fix test failure introduced by backporting
- IMPALA-3687 - Prefer Avro field name during schema reconciliation
- IMPALA-3820 - Handle linkage errors while loading Java UDFs in Catalog
- IMPALA-3930 - Fix shuffle insert hint with constant partition exprs
- IMPALA-3940 - Fix getting column stats through views
- IMPALA-4020 - Handle external conflicting changes to HMS gracefully
- IMPALA-4049 - Fix empty batch handling NLJ build side
- OOZIE-2068 - Configuration as part of sharelib
- OOZIE-2347 - Remove unnecessary new Configuration()/new jobConf() calls from Oozie
- OOZIE-2555 - Oozie SSL enable setup does not return port for admin -servers
- OOZIE-2567 - HCat connection is not closed while getting hcat credentials
- OOZIE-2589 - CompletedActionXCommand is hardcoded to wrong priority
- OOZIE-2649 - Cannot override sub-workflow configuration property if defined in parent workflow XML
- PIG-3807 - Pig creates wrong schema after dereferencing nested tuple fields with sorts
- SPARK-8428 - Fix integer overflows in TimSort
- SPARK-12339 - Added a null check that was removed in
- SPARK-13242 - Codegen fallback when there are many branches
Issues Fixed in CDH 5.7.3
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.7.3:
- FLUME-2821 - KafkaSourceUtil can log passwords at INFO level; remove logging of security-related data in older releases.
- FLUME-2913 - Don't strip SLF4J from imported classpaths.
- FLUME-2922 - Sync SequenceFile.Writer before calling hflush
- HADOOP-8751 - NPE in Token.toString() when Token is constructed using null identifier.
- HADOOP-11361 - Fix a race condition in MetricsSourceAdapter.updateJmxCache.
- HADOOP-11901 - BytesWritable fails to support 2G chunks due to integer overflow.
- HADOOP-12659 - Incorrect usage of configuration parameters in token manager of KMS.
- HADOOP-13263 - Reload cached groups in background after expiry.
- HADOOP-13381 - KMS clients should use KMS Delegation Tokens from current UGI.
- HADOOP-13457 - Remove hardcoded absolute path for shell executable.
- HDFS-6434 - Default permission for creating file should be 644 for WebHdfs/HttpFS.
- HDFS-7597 - DelegationTokenIdentifier should cache the TokenIdentifier to UGI mapping.
- HDFS-8008 - Support client-side back off when the datanodes are congested.
- HDFS-9276 - Failed to Update HDFS Delegation Token for long running application in HA mode.
- HDFS-9466 - TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky.
- HDFS-9939 - Increase DecompressorStream skip buffer size.
- HDFS-10512 - VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks.
- MAPREDUCE-6442 - Stack trace is missing when error occurs in client protocol provider's constructor.
- MAPREDUCE-6473 - Job submission can take a long time during Cluster initialization.
- MAPREDUCE-6675 - TestJobImpl.testUnusableNode failed
- YARN-4459 - container-executor should only kill process groups.
- YARN-4784 - FairScheduler: defaultQueueSchedulingPolicy should not accept FIFO.
- YARN-4866 - FairScheduler: AMs can consume all vcores leading to a livelock when using FAIR policy.
- YARN-4878 - Expose scheduling policy and max running apps over JMX for Yarn queues.
- YARN-5077 - Fix FSLeafQueue#getFairShare() for queues with zero fairshare.
- YARN-5272 - Handle queue names consistently in FairScheduler.
- HBASE-14963 - Remove use of Guava Stopwatch from HBase client code.
- HBASE-15621 - Suppress HBase SnapshotHFile cleaner error messages when a snapshot is in progress.
- HBASE-15808 - Reduce potential bulk load intermediate space usage and waste.
- HBASE-16135 - PeerClusterZnode under rs of removed peer may never be deleted.
- HBASE-16207 - Can't restore snapshot without "Admin" permission.
- HBASE-16227 - [Shell] Column value formatter not working in scans.
- HBASE-16288 - HFile intermediate block level indexes might recurse forever creating multi TB files.
- HBASE-16319 - Fix TestCacheOnWrite after HBASE-16288.
- HIVE-11432 - Hive macro gives same result for different arguments.
- HIVE-11487 - Add getNumPartitionsByFilter api in metastore api.
- HIVE-11980 - Follow up on HIVE-11696: an exception is thrown by CTAS when the source table's table-level SerDe is Parquet while its partition-level SerDe is JSON.
- HIVE-12277 - Hive macro results on macro_duplicate.q different after adding ORDER BY.
- HIVE-12635 - Hive should return the latest HBase cell timestamp as the row timestamp value.
- HIVE-13043 - Reload function has no impact to function registry.
- HIVE-13090 - Hive metastore crashes on NPE with ZooKeeperTokenStore.
- HIVE-13372 - Hive Macro overwritten when multiple macros are used in one column.
- HIVE-13704 - Call DistCp.run() instead of DistCp.execute().
- HIVE-13749 - Memory leak in Hive Metastore.
- HIVE-13884 - Disallow queries in HMS fetching more than a configured number of partitions
- HIVE-14055 - directSql - getting the number of partitions is broken.
- HIVE-14187 - JDOPersistenceManager objects remain cached if MetaStoreClient#close is not called.
- HIVE-14209 - Add some logging info for session and operation management.
- HIVE-14298 - NPE could be thrown in HMS when an ExpressionTree could not be made from a filter.
- HIVE-14359 - Hive on Spark might fail in HS2 with LDAP authentication in a kerberized cluster.
- HIVE-14457 - Partitions in encryption zone are still trashed though an exception is returned.
- HUE-3481 - [assist] Do not sort the columns by name, instead use the creation order.
- HUE-3842 - [core] HTTP 500 while emptying Hue 3.9 trash directory.
- HUE-3845 - [sentry] Sometimes see group as editable on role section.
- HUE-3880 - [core] Add importlib directly for Python 2.6.
- HUE-3988 - [search] Support schemaless collections.
- HUE-3999 - [oozie] list_oozie_workflow page should not break in case of bad json from oozie.
- HUE-4265 - [beeswax] Bring back show preview in the assist.
- HUE-4300 - [fb] Avoid double file listing call on folder search.
- HUE-4333 - [core] Properly reset API_CACHE on failover.
- HUE-4477 - [security] Select All is not filtering out the non visible roles from the selection.
- HUE-4493 - [oozie] Fix sync-workflow action when Workflow includes sub-workflow.
- HUE-4515 - [oozie] Remove oozie.bundle.application.path from properties when rerunning workflow.
- IMPALA-3711 - Remove unnecessary privilege checks in getDbsMetadata().
- IMPALA-3915 - Register privilege and audit requests when analyzing resolved table refs.
- OOZIE-2391 - spark-opts value in workflow.xml is not parsed properly.
- OOZIE-2537 - SqoopMain does not set up log4j properly.
- SOLR-7280 - Backport: Load cores in sorted order and tweak coreLoadThread counts to improve cluster stability on restarts.
- SOLR-9236 - AutoAddReplicas will append an extra /tlog to the update log location on replica failover.
- SPARK-14963 - [YARN] Using recoveryPath if NM recovery is enabled.
- SPARK-16505 - [YARN] Optionally propagate error during shuffle service startup.
- SQOOP-2561 - Special Character removal from Column name as avro data results in duplicate column and fails the import.
- SQOOP-2906 - Optimization of AvroUtil.toAvroIdentifier.
- SQOOP-2971 - OraOop does not close connections properly.
- SQOOP-2995 - Backward incompatibility introduced by Custom Tool options.
Issues Fixed in CDH 5.7.2
CDH 5.7.2 fixes the following issues.
Kerberized HS2 with LDAP authentication fails in a multi-domain LDAP case
In CDH 5.7, Hive introduced a feature to support HS2 with Kerberos plus LDAP authentication, but it broke compatibility with multi-domain LDAP cases on CDH 5.7.x and CDH 5.8.x versions.
Affected Versions: CDH 5.7.1, CDH 5.8.0, and CDH 5.8.1
Fixed in Versions: CDH 5.7.2 and higher, CDH 5.8.2 and higher
Bug: HIVE-13590.
Workaround: None.
Oozie
Oozie Web Console returns 500 error when Oozie server runs on JDK 8u75 or higher
Bug: OOZIE-2533
Cloudera Bug: CDH-40362
The Oozie Web Console returns a 500 error when the Oozie server is running on JDK 8u75 or higher. The Oozie server still functions, and you can use the Oozie command line, REST API, Java API, or the Hue Oozie Dashboard to review the status of jobs.
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.7.2:
- FLUME-1899 - Make SpoolDir work with subdirectories
- FLUME-2910 - AsyncHBaseSink: Failure callbacks should log the exception that caused them
- FLUME-2918 - Speed up TaildirSource on directories with many files
- HADOOP-8934 - Shell command ls should include sort options
- HADOOP-10971 - Add -C flag to make `hadoop fs -ls` print filenames only
- HADOOP-11409 - FileContext.getFileContext can stack overflow if default fs misconfigured
- HADOOP-11432 - Fix SymlinkBaseTest#testCreateLinkUsingPartQualPath2.
- HADOOP-12787 - KMS SPNEGO sequence does not work with WebHDFS
- HADOOP-12841 - Update s3-related properties in core-default.xml.
- HADOOP-12901 - Add warning log when KMSClientProvider cannot create a connection to the KMS server.
- HADOOP-12963 - Allow using path style addressing for accessing the S3 endpoint.
- HADOOP-13079 - Add -q option to Ls to print ? instead of non-printable characters
- HADOOP-13132 - Handle ClassCastException on AuthenticationException in LoadBalancingKMSClientProvider
- HADOOP-13155 - Implement TokenRenewer to renew and cancel delegation tokens in KMS
- HADOOP-13251 - Authenticate with Kerberos credentials when renewing KMS delegation token
- HADOOP-13255 - KMSClientProvider should check and renew TGT when doing delegation token operations
- HDFS-8581 - ContentSummary on / skips further counts on yielding lock
- HDFS-8829 - Make SO_RCVBUF and SO_SNDBUF size configurable for DataTransferProtocol sockets and allow configuring auto-tuning
- HDFS-9085 - Show renewer information in DelegationTokenIdentifier#toString
- HDFS-9259 - Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario.
- HDFS-9365 - Balancer does not work with the HDFS-6376 HA setup.
- HDFS-9405 - Warmup NameNode EDEK caches in background thread
- HDFS-9700 - DFSClient and DFSOutputStream should set TCP_NODELAY on sockets for DataTransferProtocol
- HDFS-9732 - Improve DelegationTokenIdentifier.toString() for better logging
- HDFS-9805 - Add server-side configuration for enabling TCP_NODELAY for DataTransferProtocol and default it to true
- HDFS-10360 - DataNode may format directory and lose blocks if current/VERSION is missing
- HDFS-10381 - DataStreamer DataNode exclusion log message should be warning
- HDFS-10396 - Using -diff option with DistCp may get "Comparison method violates its general contract" exception
- HDFS-10481 - HTTPFS server should correctly impersonate as end user to open file
- HDFS-10516 - Fix bug when warming up EDEK cache of more than one encryption zone
- HDFS-10525 - Fix NPE in CacheReplicationMonitor#rescanCachedBlockMap
- MAPREDUCE-6558 - multibyte delimiters with compressed input files generate duplicate records
- MAPREDUCE-6577 - MR AM unable to load native library without MR_AM_ADMIN_USER_ENV set
- MAPREDUCE-6635 - Unsafe long to int conversion in UncompressedSplitLineReader and IndexOutOfBoundsException
- MAPREDUCE-6701 - Application Master log unavailable when clicking JobHistory's AM logs link
- YARN-2605 - [RM HA] Rest API endpoints doing redirect incorrectly.
- YARN-4812 - TestFairScheduler#testContinuousScheduling fails intermittently.
- YARN-4916 - Revert "TestNMProxy.tesNMProxyRPCRetry fails"
- YARN-5048 - DelegationTokenRenewer#skipTokenRenewal may throw NPE
- HBASE-11625 - Reading datablock throws "Invalid HFile block magic" and can not switch to hdfs checksum
- HBASE-13532 - Make UnknownScannerException less scary by giving more information in the exception string.
- HBASE-14644 - Region in transition metric is broken
- HBASE-14818 - user_permission does not list namespace permissions
- HBASE-15236 - Inconsistent cell reads over multiple bulk-loaded HFiles
- HBASE-15439 - getMaximumAllowedTimeBetweenRuns in ScheduledChore ignores the TimeUnit
- HBASE-15465 - userPermission returned by getUserPermission() for the selected namespace does not have namespace set
- HBASE-15496 - Throw RowTooBigException only for user scan/get
- HBASE-15698 - Increment TimeRange not serialized to server
- HBASE-15746 - Remove extra RegionCoprocessor preClose() in RSRpcServices#closeRegion
- HBASE-15791 - Improve javadoc around ScheduledChore
- HBASE-15811 - Batch Get after batch Put does not fetch all cells
- HBASE-15872 - Split TestWALProcedureStore
- HBASE-15873 - ACL for snapshot restore / clone is not enforced
- HBASE-15925 - provide default values for hadoop compat module related properties that match default hadoop profile.
- HBASE-16034 - Fix ProcedureTestingUtility#LoadCounter.setMaxProcId()
- HBASE-16056 - Procedure v2 - fix master crash for FileNotFound
- HBASE-16093 - Fix splits failed before creating daughter regions leave meta inconsistent
- HIVE-7443 - Fix HiveConnection to communicate with Kerberized Hive JDBC server and alternative JDKs
- HIVE-9486 - Use session classloader instead of application loader
- HIVE-9499 - hive.limit.query.max.table.partition makes queries fail on non-partitioned tables
- HIVE-10685 - Alter table concatenate operator will cause duplicate data
- HIVE-10925 - Non-static threadlocals in metastore code can potentially cause memory leak
- HIVE-11031 - ORC concatenation of old files can fail while merging column statistics
- HIVE-11243 - Changing log level in Utilities.getBaseWork
- HIVE-11747 - Unnecessary error log is shown when executing a "INSERT OVERWRITE LOCAL DIRECTORY" cmd in the embedded mode
- HIVE-11827 - STORED AS AVRO fails SELECT COUNT(*) when empty
- HIVE-12742 - NULL table comparison within CASE does not work as previous hive versions
- HIVE-12958 - Make embedded Jetty server more configurable
- HIVE-13285 - ORC concatenation may drop old files from moving to final path
- HIVE-13462 - HiveResultSetMetaData.getPrecision() fails for NULL columns
- HIVE-13590 - Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
- HIVE-13736 - View's input/output formats are TEXT by default.
- HIVE-13932 - Hive SMB Map Join with small set of LIMIT failed with NPE
- HIVE-13953 - Issues in HiveLockObject equals method
- HIVE-13991 - Union All on view fail with no valid permission on underneath table
- HIVE-14006 - Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException.
- HIVE-14015 - SMB MapJoin failed for Hive on Spark when kerberized
- HIVE-14098 - Logging task properties and environment variables might contain passwords
- HIVE-14118 - Make the alter partition exception more meaningful
- HUE-2678 - [jobbrowser] Read Spark job data from Spark History Server API
- HUE-3197 - [oozie] Decision node support in external Workflow graph
- HUE-3520 - [jb] Use impersonation to access JHS if security is enabled
- HUE-3521 - [core] Provide a force_username_uppercase option
- HUE-3526 - [useradmin] Fix LDAP tests for force_username_uppercase
- HUE-3688 - [oozie] Fix TestEditor.test_workflow_dependencies unit test
- HUE-3700 - [core] Support force_username_lowercase and ignore_username_case for all Auth backends
- HUE-3802 - [oozie] Fix HS2 action on SSL enabled cluster
- HUE-3805 - [oozie] Add support for oozie schema 0.4 in dashboard graph for external workflows
- HUE-3808 - [core] Offer to live turn on/off debug level
- HUE-3821 - [pig] Logs are never returned on running script
- HUE-3822 - [pig] Display logs when found
- HUE-3861 - [core] Upgrade Django Axes to 1.5
- HUE-3866 - [core] Hue CPU reaches ~100% usage while uploading files with SSL to HTTPFS/WebHDFS
- HUE-3908 - [useradmin] Ignore (objectclass=*) filter when searching for LDAP users
- HUE-3923 - [core] Simplify force debug logic option
- HUE-4005 - [oozie] Remove oozie.coord.application.path from properties when rerunning workflow
- HUE-4006 - [oozie] Create new deployment directory when coordinator or bundle is copied
- HUE-4007 - [oozie] Fix deployement_dir for the bundle in oozie example fixtures
- HUE-4021 - [libsolr] Allow customization of the Solr path in ZooKeeper
- HUE-4023 - [useradmin] update AuthenticationForm to allow activated users to login
- HUE-4061 - [jb] Job attempt logs not appearing for running jobs
- HUE-4087 - [jobbrowser] Unable to kill jobs with Resource Manager HA enabled
- HUE-4092 - [security] Can't type any / in the HDFS ACLs path input
- HUE-4113 - [Pig] Hue breaks when user has only access to pig app
- HUE-4134 - [liboozie] Avoid logging truststore credentials
- HUE-4202 - [jb] Enable offset param for fetching jobbrowser logs
- HUE-4215 - [yarn] Reset API_CACHE on logout
- HUE-4227 - [yarn] Fix unittest for MR API Cache
- HUE-4238 - [doc2] Ignore history docs in find_jobs_with_no_doc during sync documents
- HUE-4252 - [core] Handle 307 redirect from YARN upon standby failover
- HUE-4258 - [jb] Close and pool Spark History Server connections
- IMPALA-1928 - Fix Thrift client transport wrapping order
- IMPALA-2660 - Respect auth_to_local configs from hdfs configs
- IMPALA-3276 - Consistently handle pin failure in BTS::PrepareForRead()
- IMPALA-3369 - Add ALTER TABLE SET COLUMN STATS statement.
- IMPALA-3441 - Impala should not crash for invalid avro serialized data
- IMPALA-3499 - Split catalog update
- IMPALA-3502 - Fix race in the coordinator while updating filter routing table
- IMPALA-3633 - Cancel fragment if coordinator is gone
- IMPALA-3732 - Handle string length overflow in Avro files
- IMPALA-3745 - Corrupt encoded values in parquet files can cause crashes
- IMPALA-3751 - Fix clang build errors and warnings
- IMPALA-3754 - Fix TestParquet.test_corrupt_rle_counts flakiness
- OOZIE-2314 - Unable to kill old instance child job by workflow or coord rerun by Launcher
- OOZIE-2329 - Make handling yarn restarts configurable
- OOZIE-2330 - Spark action should take the global jobTracker and nameNode configs by default and allow file and archive elements
- OOZIE-2345 - Parallel job submission for forked actions
- OOZIE-2436 - Fork/join workflow fails with oozie.action.yarn.tag must not be null
- OOZIE-2481 - Add YARN_CONF_DIR in the Shell action
- OOZIE-2504 - Create a log4j.properties under HADOOP_CONF_DIR in Shell Action
- OOZIE-2511 - SubWorkflow missing variable set from option if config-default is present in parent workflow
- OOZIE-2533 - Oozie Web UI gives Error 500 with Java 8u91
- SENTRY-1175 - Improve usability of URI privileges when granting URIs
- SENTRY-1201 - Sentry ignores database prefix for MSCK statement
- SENTRY-1252 - grantServerPrivilege and revokeServerPrivilege should treat "*" and "ALL" as synonyms when action is not explicitly specified
- SENTRY-1265 - Sentry service should not require a TGT as it is not talking to other kerberos services as a client
- SENTRY-1292 - Reorder DBModelAction EnumSet
- SENTRY-1293 - Avoid converting string permission to Privilege object
- SENTRY-1311 - Improve usability of URI privileges by supporting mixed use of URIs with and without scheme
- SENTRY-1320 - truncate table db_name.table_name fails
- SOLR-7178 - OverseerAutoReplicaFailoverThread compares Integer objects using ==
- SOLR-8451 - We should not call method.abort in HttpSolrClient and HttpSolrCall#remoteQuery should not close streams
- SOLR-8497 - Merge indexes should mark its directories as done rather than keep them around in the directory cache.
- SOLR-8691 - Cache index fingerprints per searcher
- SOLR-9053 - Upgrade commons-fileupload to 1.3.1, fixing a potential vulnerability
- SPARK-13278 - [CORE] Launcher fails to start with JDK 9 EA
- SPARK-14391 - [LAUNCHER] Fix launcher communication test
- SPARK-15067 - [YARN] YARN executors are launched with fixed perm gen size
- SPARK-15165 - [SPARK-15205] [SQL] Introduce place holder for comments in generated code
- SQOOP-2846 - Sqoop Export with update-key failing for avro data file
- SQOOP-2864 - ClassWriter chokes on column names containing double quotes
- SQOOP-2920 - Sqoop performance deteriorates significantly on wide datasets; sqoop 100% on CPU
Issues Fixed in CDH 5.7.1
CDH 5.7.1 fixes the following issues.
Apache HBase
The ReplicationCleaner process can abort if its connection to ZooKeeper is inconsistent
Bug: HBASE-15234
If the connection with ZooKeeper is inconsistent, the ReplicationCleaner may abort, and the following event is logged by the HMaster:
WARN org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Aborting ReplicationLogCleaner because Failed to get list of replicators
Unprocessed WALs accumulate.
The seekBefore() method calculates the size of the previous data block by assuming that data blocks are contiguous, and HFile v2 and higher store Bloom blocks and leaf-level INode blocks with the data. As a result, reverse scans do not work when Bloom blocks or leaf-level INode blocks are present when HFile v2 or higher is used.
Workaround: Restart the HMaster occasionally. The ReplicationCleaner restarts if necessary and processes the unprocessed WALs.
Upstream Issues Fixed
The following upstream issues are fixed in CDH 5.7.1:
- AVRO-1781 - Schema.parse is not thread safe
- FLUME-2781 - Kafka Channel with parseAsFlumeEvent=true should write data as is, not as flume events
- FLUME-2891 - Revert FLUME-2712 and FLUME-2886
- FLUME-2897 - AsyncHBase sink NPE when Channel.getTransaction() fails
- HADOOP-7139 - Allow appending to existing SequenceFiles
- HADOOP-7817 - RawLocalFileSystem.append() should give FSDataOutputStream with accurate .getPos()
- HADOOP-11321 - copyToLocal cannot save a file to an SMB share unless the user has Full Control permissions
- HADOOP-11687 - Ignore x-* and response headers when copying an Amazon S3 object
- HADOOP-12668 - Support excluding weak Ciphers in HttpServer2 through ssl-server.conf
- HADOOP-12825 - Log slow name resolutions
- HADOOP-12954 - Add a way to change hadoop.security.token.service.use_ip
- HADOOP-12972 - Lz4Compressor#getLibraryName returns the wrong version number
- HDFS-3519 - Checkpoint upload may interfere with a concurrent saveNamespace
- HDFS-6520 - hdfs fsck passes invalid length value when creating BlockReader
- HDFS-7600 - Refine hdfs admin classes to reuse common code
- HDFS-8142 - DistributedFileSystem encryption zone commands should resolve relative paths
- HDFS-8211 - DataNode UUID is always null in the JMX counter.
- HDFS-8496 - Calling stopWriter() with FSDatasetImpl lock held may block other threads
- HDFS-8855 - Webhdfs client leaks active NameNode connections
- HDFS-9549 - TestCacheDirectives#testExceedsCapacity is flaky
- HDFS-9589 - Block files which have been hardlinked should be duplicated before the DataNode appends to the them
- HDFS-9949 - Add a test case to ensure that the DataNode does not regenerate its UUID when a storage directory is cleared
- HDFS-10223 - peerFromSocketAndKey performs SASL exchange before setting connection timeouts
- HDFS-10267 - Extra "synchronized" on FsDatasetImpl#recoverAppend and FsDatasetImpl#recoverClose
- HDFS-10324 - Trash directory in an encryption zone should be pre-created with correct permissions
- HDFS-10344 - DistributedFileSystem#getTrashRoots should skip encryption zone that does not have .Trash
- MAPREDUCE-4785 - TestMRApp occasionally fails
- MAPREDUCE-6297 - Task Id of the failed task in diagnostics should link to the task page
- MAPREDUCE-6333 - TestEvents,TestAMWebServicesTasks,TestAppController are broken due to MAPREDUCE-6297
- MAPREDUCE-6384 - Add the last reporting reducer info for too many fetch failure diagnostics
- MAPREDUCE-6388 - Remove deprecation warnings from JobHistoryServer classes
- MAPREDUCE-6485 - Create a new task attempt with failed map task priority if in-progress attempts are unassigned
- MAPREDUCE-6513 - MR job got hanged forever when one NM unstable for some time
- MAPREDUCE-6535 - TaskID default constructor results in NPE on toString()
- MAPREDUCE-6580 - Test failure: TestMRJobsWithProfiler
- YARN-2871 - TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
- YARN-3104 - Fixed RM to not generate new AMRM tokens on every heartbeat between rolling and activation
- YARN-3493 - RM fails to come up with error "Failed to load/recover state" when mem settings are changed
- YARN-3695 - ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception
- YARN-4168 - Fixed a failing test TestLogAggregationService.testLocalFileDeletionOnDiskFull
- YARN-4414 - Nodemanager connection errors are retried at multiple levels
- YARN-4579 - Allow DefaultContainerExecutor container log directory permissions to be configurable
- YARN-4629 - Distributed shell breaks under strong security
- YARN-4639 - Remove dead code in TestDelegationTokenRenewer added in YARN-3055
- YARN-4717 - TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IllegalArgumentException from cleanup
- YARN-4795 - ContainerMetrics drops records
- YARN-4916 - TestNMProxy.tesNMProxyRPCRetry fails
- HBASE-15234 - Don't abort ReplicationLogCleaner on ZooKeeper errors
- HBASE-15271 - Spark bulk load should write to temporary location and then rename on success
- HBASE-15349 - Update surefire version to 2.19.1
- HBASE-15405 - Fix PE logging and wrong defaults in help message
- HBASE-15456 - CreateTableProcedure/ModifyTableProcedure needs to fail when there is no family in table descriptor
- HBASE-15479 - No more garbage or beware of autoboxing
- HBASE-15582 - SnapshotManifestV1 too verbose when there are no regions
- HBASE-15591 - ServerCrashProcedure not yielding
- HBASE-15592 - Print Procedure WAL content
- HBASE-15622 - Superusers does not consider the keytab credentials
- HBASE-15673 - Fix latency metrics for multiGet; also fixes some help text
- HBASE-15707 - ImportTSV bulk output does not support tags with hfile.format.version=3
- HIVE-6099 - Multi insert does not work properly with distinct count
- HIVE-10303 - HIVE-9471 broke forward compatibility of ORC files
- HIVE-10313 - Literal Decimal ExprNodeConstantDesc should contain value of HiveDecimal instead of String
- HIVE-10396 - decimal_precision2.q test is failing on trunk
- HIVE-10636 - CASE comparison operator rotation optimization
- HIVE-11054 - Handle varchar/char partition columns in vectorization
- HIVE-11097 - HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
- HIVE-11369 - Mapjoins in HiveServer2 fail when jmxremote is used
- HIVE-11408 - HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used due to constructor caching in Hadoop ReflectionUtils
- HIVE-11427 - Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079
- HIVE-11590 - AvroDeserializer is very chatty
- HIVE-11919 - Hive Union Type Mismatch
- HIVE-12481 - Occasionally "Request is a replay" will be thrown from HS2
- HIVE-12506 - SHOW CREATE TABLE command creates a table that does not work for RCFile format
- HIVE-12568 - Provide an option to specify network interface used by Spark remote client [Spark Branch]
- HIVE-12616 - NullPointerException when spark session is reused to run a mapjoin
- HIVE-12706 - Incorrect output from from_utc_timestamp()/to_utc_timestamp when local timezone has DST
- HIVE-12941 - Unexpected result when using MIN() on struct with NULL in first field
- HIVE-13082 - Enable constant propagation optimization in query with left semi join
- HIVE-13115 - MetaStore Direct SQL getPartitions call fail when the columns schemas for a partition are null
- HIVE-13200 - Aggregation functions returning empty rows on partitioned columns
- HIVE-13217 - Replication for HoS mapjoin small file needs to respect dfs.replication.max
- HIVE-13243 - Hive drop table on encryption zone fails for external tables
- HIVE-13251 - hive can't read the decimal in AVRO file generated from previous version
- HIVE-13286 - Query ID is being reused across queries
- HIVE-13295 - Improvement to LDAP search queries in HS2 LDAP Authenticator
- HIVE-13300 - Hive on spark throws exception for multi-insert with join
- HIVE-13302 - direct SQL: cast to date doesn't work on Oracle
- HIVE-13376 - HoS emits too many logs with application state
- HIVE-13401 - Kerberized HS2 with LDAP auth enabled fails kerberos/delegation token authentication
- HIVE-13410 - PerfLog metrics scopes not closed if there are exceptions on HS2
- HIVE-13500 - Fix OOM with explain output being logged
- HIVE-13527 - Using deprecated APIs in HBase client causes zookeeper connection leaks
- HIVE-13530 - Hive on Spark throws Kryo exception in some cases
- HIVE-13570 - Some queries with Union all fail when CBO is off
- HIVE-13585 - Add counter metric for direct sql failures
- HIVE-13632 - Hive failing on insert empty array into parquet table (Cloudera Bug: CDH-39911)
- HIVE-13657 - Spark driver stderr logs should appear in hive client logs
- HUE-3171 - Fix vertical resize handle for queries with long descriptions
- HUE-3171 - Long descriptions don't wrap and table headers follow with horizontal scroll
- HUE-3221 - Styling on column stats popup leaks on the tables page
- HUE-3293 - Prevent document matching query error when going one home 1
- HUE-3293 - Fix mis-switching to new home page when new editor is on
- HUE-3293 - Move new editor flag to desktop
- HUE-3303 - PostgreSQL requires data update and alter table operations in separate transactions
- HUE-3310 - Prevent browsing job designs by API
- HUE-3334 - Update test, now we send empty query instead of error
- HUE-3334 - Skip checking for multi queries if there is no semi colon
- HUE-3350 - Reverse browsing link to use the correct version of the editor
- HUE-3398 - Filter out sessions with empty guid or secret key
- HUE-3434 - Logs of finished Oozie workflow are not displayed
- HUE-3436 - Retain old dependencies when saving a workflow
- HUE-3437 - PamBackend does not honor ignore_username_case
- HUE-3459 - Put stat popover on top
- HUE-3459 - Fixed Flexbox for IE10
- HUE-3459 - Use fixed positioning for assist panel
- HUE-3459 - Revert sticky assist
- HUE-3459 - Fix issue with single panel in metastore and new editor
- HUE-3459 - Clear the height interval on update
- HUE-3459 - Assist doesn't stretch to the end of the page in the old editors
- HUE-3471 - Set the assist database on design update
- HUE-3471 - Assist does not show the DB from the saved query
- HUE-3476 - Clear any running intervals after closing the stats popover
- HUE-3480 - Impala refresh pop-over won't close after assist action while open
- HUE-3506 - Limit length of comments on table page
- HUE-3511 - Reduce flickering of action icons when moving the pointer across several entries
- HUE-3523 - Modify find_jobs_with_no_doc method to exclude jobs with no name
- HUE-3528 - Call correct metrics api to avoid 500 error
- HUE-3543 - Timeout prevents refreshing of the Assist tables/dbs
- HUE-3594 - Smarter DOM based XSS filter on hashes
- HUE-3601 - Spinner not positioned correctly
- HUE-3613 - Empty div elements are added when scrolling the DB assist panel
- HUE-3614 - Scrolling on assist in old editor also scrolls the editor
- HUE-3637 - Avoid decode errors on attribute values
- HUE-3650 - Notify of caught errors in the watch logs process
- HUE-3651 - Upgrade Moment.js
- HUE-3704 - Force enable notebook permissions
- HUE-3716 - Add gen-py paths to hue.pth
- HUE-3725 - 'SparkJob' object has no attribute 'amHostHttpAddress'
- HUE-3731 - Send database on Impala refresh with invalidate
- HUE-3741 - Display field validation errors on create table wizard
- HUE-3800 - Job attempt logs not appearing for some Oozie jobs
- HUE-3819 - Make the upload and create icons not disappear under 1180px
- IMPALA-2076 - Correct execution time tracking for DataStreamSender
- IMPALA-2502 - Don't redundantly repartition grouping aggregations
- IMPALA-2892 - Buffered-tuple-stream-ir.cc is not cross-compiled
- IMPALA-3133 - Wrong privileges after a REVOKE ALL ON SERVER statement
- IMPALA-3139 - Fix drop table statement to not drop views and vice versa
- IMPALA-3141 - Send dummy filters when filter production is disabled
- IMPALA-3194 - Allow queries materializing scalar type columns in RC/sequence files
- IMPALA-3220 - Skip logging empty ScannerContext's stream in parse error
- IMPALA-3236 - Increase timeout for runtime filter tests
- IMPALA-3238 - Avoid log spam for very large hash tables
- IMPALA-3245, IMPALA-3305 - Fix crash with global filters when NUM_NODES=1
- IMPALA-3269 - Remove authz checks on default table location in CTAS queries
- IMPALA-3285 - Fix ASAN failure in webserver-test
- IMPALA-3317 - Fix crash in sorter when spilling zero-length strings
- IMPALA-3334 - Fix some bugs in query options' parsing.
- IMPALA-3367 - Ensure runtime filters tests run on 3 nodes
- IMPALA-3378, IMPALA-3379 - Fix various JNI issues
- IMPALA-3385 - Fix crashes on accessing error_log
- IMPALA-3395 - Old HT filter code uses wrong expr type
- IMPALA-3396 - Fix ConcurrentTimerCounter unit test "TimerCounterTest" failure.
- IMPALA-3412 - Fix CHAR codegen crash in tuple comparator
- IMPALA-3420 - Set IMPALA_THRIFT_VERSION patch level to +4
- KITE-1108 - Add optional retry feature to loadSolr morphline command
- KITE-1114 - Kite CLI json-import HDFS temp file path not multiuser safe
- OOZIE-2429 - TestEventGeneration test is flakey
- OOZIE-2466 - Repeated failure of TestMetricsInstrumentation.testSamplers
- OOZIE-2486 - TestSLAEventsGetForFilterJPAExecutor is flakey
- OOZIE-2490 - Oozie can't set hadoop.security.token.service.use_ip
- SENTRY-922 - INSERT OVERWRITE DIRECTORY permission not working correctly
- SENTRY-1112 - Change default value of "sentry.hive.server" to empty string
- SENTRY-1164 - testCaseSensitivity test failure on a real cluster, and minor improvements to testConcurrentClients to run locally.
- SENTRY-1169 - MetastorePlugin#renameAuthzObject log message prints oldpathname as newpathname
- SENTRY-1184 - Clean up HMSPaths.renameAuthzObject
- SENTRY-1190 - IMPORT TABLE silently fails if Sentry is enabled
- SOLR-6631 - DistributedQueue spinning on calling zookeeper getChildren()
- SOLR-6879 - Have an option to disable autoAddReplicas temporarily for all collections.
- SOLR-7493 - Requests aren't distributed evenly if the collection isn't present locally. Merges r1683946 and r1683948 from trunk.
- SOLR-8551 - Make collection deletion more robust.
- SOLR-8683 - Tune down stream closed logging
- SOLR-8720 - ZkController#publishAndWaitForDownStates should use #publishNodeAsDown.
- SOLR-8771 - Multi-threaded core shutdown creates executor per core
- SOLR-8855 - The HDFS BlockDirectory should not clean up its cache on shutdown.
- SOLR-8856 - Do not cache merge or 'read once' contexts in the hdfs block cache.
- SOLR-8857 - HdfsUpdateLog does not use configured or new default number of version buckets and is hard coded to 256.
- SOLR-8869 - Optionally disable printing field cache entries in SolrFieldCacheMBean
- SPARK-4452 - Shuffle data structures can starve others on the same thread for memory
- SPARK-12614 - Don't throw non fatal exception from ask
- SPARK-13622 - Issue creating level db for YARN shuffle service
- SPARK-14242 - Avoid copy in compositeBuffer for frame decoder
- SPARK-14290 - Avoid significant memory copy in Netty's tran…
- SPARK-14363 - Fix executor OOM due to memory leak in the Sorter
- SPARK-14477 - Allow custom mirrors for downloading artifacts in build/mvn
- SPARK-14679 - Fix UI DAG visualization OOM.
- SQOOP-2847 - Sqoop --incremental + missing parent --target-dir reports success with no data
Issues Fixed in CDH 5.7.0
CDH 5.7.0 fixes the following issues.
Apache Flume
TailDirSource throws FileNotFound Exception if ~/.flume directory is not created already
Bug: FLUME-2773
This fix ensures that any missing parent directories in the positionFile path (either default or user given input) are always created.
flume_env script should handle JVM parameters like -javaagent -agentpath -agentlib
Bug: FLUME-2763
This fix enables the flume_env script to handle JVM parameters such as -javaagent, -agentpath, and -agentlib.
Kafka channel timeout property is overridden by default value
Bug: FLUME-2734
When the Kafka channel timeout property is passed to the Kafka consumer internally, it does not work as expected. It is overridden by the default value or the value specified by the .timeout property, which is undocumented. Now the kafka.consumer.timeout.ms value specified in the configuration takes effect like other Kafka consumer properties.
Apache Hadoop
ReplicationMonitor can infinitely loop in BlockPlacementPolicyDefault#chooseRandom()
Bug: HDFS-4937
Cloudera Bug: CDH-34043
When a large number of nodes are removed by refreshing node lists, the network topology is updated and the replication monitor thread may get stuck in the while loop of chooseRandom().
Clean up temporary files after fsimage transfer failures
Bug: HDFS-7373
Cloudera Bug: CDH-19177
When an fsimage (or checkpoint) transfer fails, a temporary file is left in each storage directory. If the namespace is large, these files can take up a large amount of space.
Lease recovery should return true if the lease can be released and the file can be closed
Bug: HDFS-8576
Cloudera Bug: CDH-37212
FSNamesystem#recoverLease should return true when a lease is recovered both explicitly and implicitly—that is, when a lease recovery is successful and the file is closed, and also when a file is closed and the lease is released without a recovery.
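Because a true return now consistently means that the file is closed, a client can poll on the result. The following is a minimal sketch of such a loop, not Hadoop code: the BooleanSupplier is a hypothetical stand-in for a call such as DistributedFileSystem#recoverLease(Path), and the class name, method name, attempt limit, and sleep interval are all assumptions made for illustration.

```java
import java.util.function.BooleanSupplier;

/** Illustrative polling loop around a recoverLease-style call (hypothetical helper, not part of Hadoop). */
final class LeaseRecoveryPoll {

    /**
     * Calls the supplied check until it returns true or the attempts run out.
     *
     * @return the 1-based attempt on which the check returned true,
     *         or -1 if it never did (or the thread was interrupted)
     */
    static int pollUntilClosed(BooleanSupplier recoverLease, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // true means the lease was released and the file is closed
            if (recoverLease.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt status and give up
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for a real file system call: reports false twice, then true
        int[] calls = {0};
        int attempt = pollUntilClosed(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println("closed after attempt " + attempt);
    }
}
```

In a real client, the supplier would wrap the file system call and handle IOException. The loop itself depends only on the return value, which is the contract this fix clarifies.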
fsck does not list correct file path when bad replicas or blocks are in a snapshot
Bug: HDFS-9231
Cloudera Bug: CDH-32221
When blocks are corrupt in a snapshot, the fsck command lists the original directory and not the snapshot directory. This happens even when the original file is deleted. The specific commands are fsck -list-corruptfileblocks and fsck -list-corruptfileblocks -includeSnapshots.
Make DataStreamer#block thread safe and verify generationStamp in commitBlock
Bug: HDFS-9289
Cloudera Bug: CDH-33723
When the client calls updatePipeline, a block might commit with an old generationStamp, causing replicas to look corrupt.
Delayed heartbeat processing causes storm of subsequent heartbeats
Bug: HDFS-9305
Cloudera Bug: CDH-33589
The NameNode usually handles DataNode heartbeats quickly, but can be delayed for various reasons, such as a long garbage collection or lock contention. After the NameNode recovers, the DataNode sends a storm of heartbeat messages in a tight loop which, in a big cluster, can overload the NameNode and make cluster recovery difficult.
FSImage may get corrupted after deleting snapshot
Bug: HDFS-9406
Cloudera Bug: CDH-33224
When deleting a snapshot that contains the last record of a given INode, the fsimage may become corrupt because the create list of the snapshot diff in the previous snapshot and the child list of the parent INodeDirectory are not cleaned.
Apache HBase
See also Known Issues In CDH 5.7.0.
Potential data loss after a RegionServerAbortedException
Bug: HBASE-13895
If the master attempts to assign a region while handling a RegionServer abort, the returned RegionServerAbortedException is handled as though the region had been cleanly taken offline, so the new assignment is allowed to proceed. If the region is opened in its new location before WAL replay has completed, the replayed edits are ignored, or are later played back on top of new edits that happened after the region was opened. In either case, data can be lost.
Workaround: None.
Data loss can occur if a table has more than 2,147,483,647 columns
Bug: HBASE-15133
Data loss can occur if a table has more than 2,147,483,647 (Integer.MAX_VALUE) columns, because some key variables are declared as int rather than long.
Workaround: Adjust your schema to use fewer than Integer.MAX_VALUE columns.
Delete operations that occur during a region merge may be eclipsed by new Put operations
Bug: HBASE-13938
The master's timestamp is not used when sending hbase:meta edits on region merges, so correct ordering of new region additions and old region deletes is not assured and data loss can occur if edits are applied in the wrong order.
Workaround: None.
RPC handler / task monitoring seems to be broken after 0.98
Bug: HBASE-14674
After the pluggable RPC scheduler was introduced, the way tasks work for the handlers changed. Idle RPC handlers are no longer listed in the tasks; instead, they are registered dynamically with TaskMonitor through CallRunner. However, the IPC readers are still registered the old way, so idle readers are listed as tasks but idle handlers are not.
According to the Javadoc of MonitoredRPCHandlerImpl, allocation of the MonitoredTask is no longer optimized; instead, one is allocated for every RPC call, which breaks the earlier pattern.
Conflicts between HBase Balancer and hbase:meta reassignment
Bug: HBASE-14536
If hbase:meta is assigned to a RegionServer that becomes unavailable, and the HBase balancer has scheduled but not completed a plan to move hbase:meta to a different RegionServer, the hbase:meta becomes unassigned.
Workaround: None.
Regions can fail to transition in a write-heavy cluster with a small number of read handlers
Bug: HBASE-13635
On a write-heavy cluster configured with a small number of read handlers, all requests that are not mutations are sent to the read handlers, including ReportRegionInTransition requests. If these requests time out, the RegionServer is assumed to be unavailable, and the regions cannot transition correctly.
Workaround: None.
In a secured environment, when a RegionServer is stopped, znodes may not be cleaned up correctly
Bug: HBASE-14581
When a RegionServer process is stopped, the zkcli command is invoked to delete its znodes. In a secure cluster, the zkcli command does not authenticate to ZooKeeper and the deletion fails. This problem occurs because the REGIONSERVER_OPTS environment variable is not correctly passed when invoking the zkcli command.
Workaround: None.
Delays in RegionServer responses can cause a region to be closed indefinitely
Bug: HBASE-14407
Handling of region assignment by the master has a flaw when RegionServer responses are delayed due to network delays, system load, or other reasons. This flaw can cause the master to close a region indefinitely.
Workaround: Restart the RegionServer to force the region to be reassigned.
When a RegionServer crashes, replication peers can crash due to inode exhaustion from old WALs
Bug: HBASE-14621
The fix for HBASE-12865 ensures that loadWALsFromQueues attempts a retry when the replication source version is changed while loading the replication queue. However, the fix introduced a bug in ReplicationLogCleaner that causes an infinite loop when a RegionServer crashes. As a result, old WALs are not cleaned up. In a cluster under high load, the inode limit on the replication peer RegionServer can be exhausted, causing the RegionServer to crash.
Workaround: None.
When a RegionServer crashes, cell-level visibility tags may be lost during WAL replay
Bug: HBASE-15218
When reading cells after a RegionServer crash, the KeyValueCodec and the WALCellCodec both use NoTagsKeyValue, which does not preserve visibility tags.
Workaround: None.
Column is not deleted if you do not pass the visibility label
Bug: HBASE-14761
If a column was created or modified with a visibility label, and you attempt to delete it without passing the visibility label, the column is not deleted. It is not visible using a Scan operation, but is visible using a raw Scan, and is marked with deleteColumn.
Workaround: None.
If multiple users are configured with the role hbase.superuser, an attempt to connect to a secure ZooKeeper instance fails
Bug: HBASE-14425
The hbase.superuser configuration option is a comma-separated list of users. A bug in the code to connect to a secure ZooKeeper causes the list to be evaluated as a single value, so a list of multiple users fails because no username matches the comma-separated list.
Workaround: Only specify a single user in the hbase.superuser configuration option.
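The difference between evaluating the list as a single value and as separate names can be sketched as follows. This is a minimal Python illustration of the described behavior, not HBase code; the function names and the sample user names are hypothetical.

```python
def is_superuser_buggy(user, configured):
    # Treats the whole comma-separated string as one user name,
    # so no individual user ever matches a multi-user list.
    return user == configured


def is_superuser_fixed(user, configured):
    # Splits the list and compares against each trimmed entry.
    names = [name.strip() for name in configured.split(",") if name.strip()]
    return user in names


superusers = "hbase,admin"  # hypothetical hbase.superuser value
print(is_superuser_buggy("hbase", superusers))
print(is_superuser_fixed("hbase", superusers))
```

With a single configured user, both checks agree, which is why the workaround of specifying one user avoids the problem.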
Region split request audits are performed against the hbase user instead of the requesting user
Bug: HBASE-14475
When checking whether the requesting user has permission to perform a region split, the hbase user's permissions are checked instead of those of the requesting user. Because of this bug, the CREATE permission is sufficient for the split, rather than CREATE and ADMIN. Because CREATE permissions are also sufficient for flushes and compactions, this issue is not severe in most environments.
Workaround: None.
Incorrect timestamp checking causes unpredictable deletes with VisibilityScanDeleteChecker
Bug: HBASE-13635
Incorrect timestamp checking when VisibilityScanDeleteChecker is used causes unpredictable results when deleting cells. In some cases, the timestamp is deleted but the cell contents are not deleted. In other cases, a request to delete an entire row or to delete a version results in only a single cell being deleted.
Workaround: None.
A BulkLoad of an HFile with tags that requires splits does not preserve the tags
Bug: HBASE-15035
When an HFile is created with cell tags and is imported into HBase using a bulk load, the tags are present as expected when the HFile is loaded into a single region. However, if the bulk load spans multiple regions, the original HFile is automatically split into a set of HFiles corresponding to each of the regions the original HFile covers. Tags, including ACLs, TTLs, and MOB pointers, are not copied to the split files.
Workaround: None.
Restoring a snapshot from a table in a user-defined namespace causes a URISyntaxException
Bug: HBASE-14578
A table in a user-defined namespace uses a colon between the namespace and the table name (for instance, example_ns:users). This colon is interpreted incorrectly when restoring from a snapshot.
Workaround: None.
The list_snapshots HBase shell command shows all snapshots, regardless of the user's permission to view them
Bug: HBASE-12552
A user with no privileges to interact with a snapshot can list the snapshot using the list_snapshots HBase shell command.
Workaround: None.
ExportSnapshot or DistCp operations may fail on the Amazon s3a:// protocol
Bug: None.
ExportSnapshot or DistCp operations may fail on AWS when using certain JDK 8 versions, due to an incompatibility between AWS Java SDK 1.9.x and the joda-time date-parsing module.
Workaround: Use joda-time 2.8.1 or higher, which is included in AWS Java SDK 1.10.1 or higher.
If HDFS is restarted while HBase is running, WALs being replicated may not close correctly
Bug: HBASE-15019
The RegionServer receiving the replicated WALs has no mechanism to be notified to perform a recovery if HDFS is restarted on the source cluster.
Workaround: Restart the RegionServer to force the master to trigger the lease recovery during WAL splitting.
The permissions of the .top/ directory are not explicitly set during LoadIncrementalHFiles operations
Bug: HBASE-14005
Permissions are not explicitly set on the .top/ directory created during LoadIncrementalHFiles. The permissions should be set the same as the .bottom/ and .tmp/ directories.
Workaround: None.
Nonfatal errors in the FSHLog subsystem are incorrectly logged as fatal errors
Bug: HBASE-14042
If an IOException causes a log roll to be requested, it is logged as a fatal event, although it should be logged as a warning.
Workaround: None.
FuzzyRowFilter may omit some rows if multiple fuzzy keys are present
Bug: HBASE-14269
If you use the FuzzyRowFilter for Scan operations, and you have multiple fuzzy keys, some rows may be omitted from the RowTracker.
Workaround: None.
The prefix-tree module is not automatically included in MapReduce jobs
Bug: HBASE-15152
JARs for the prefix-tree module are not automatically included in YarnChildren processes. This causes a ClassNotFoundException.
Workaround: Manually add the prefix-tree JARs to the classpath if needed.
Values of some metrics may appear to be negative
Bug: HBASE-12961
Some metric values are stored as integers, which cannot accommodate real-world values. As a result, these metric values can appear to be negative.
Workaround: None.
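The following sketch shows why a counter held in a 32-bit signed integer can appear negative once it passes 2,147,483,647. It is a Python model of two's-complement wraparound, assuming the metrics were stored in 32-bit signed integers; the helper name as_int32 and the sample counter are hypothetical.

```python
INT32_MAX = 2**31 - 1  # 2,147,483,647


def as_int32(value):
    # Keep only the low 32 bits, then reinterpret them as a signed
    # two's-complement number, as a 32-bit integer field would.
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT32_MAX else value


request_count = INT32_MAX + 1  # hypothetical counter, one past the limit
print(as_int32(INT32_MAX))
print(as_int32(request_count))
```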
The HBase Shell cannot handle Scan filters which contain non-UTF8 characters
Bug: HBASE-15032
The HBase Shell incorrectly handles filter strings which contain non-UTF8 characters.
Workaround: None.
Reverse scans do not work when Bloom blocks or leaf-level inode blocks are present
Bug: HBASE-14283
The seekBefore() method calculates the size of the previous data block by assuming that data blocks are contiguous. Because HFile v2 and higher store Bloom blocks and leaf-level inode blocks with the data, reverse scans do not work on such HFiles when those blocks are present.
Workaround: None.
Apache Hive
Fix regression in bind and search logic for Hive external LDAP authentication
Bug: HIVE-12885
Cloudera Bug: CDH-35075
Fixes a regression in LDAP bind and search authentication from CDH 5.5.0.
Some queries using LEFT SEMI JOIN fail with IndexOutOfBoundsException
Bug: HIVE-13082
Cloudera Bug: CDH-37515
Some queries using LEFT SEMI JOIN fail with IndexOutOfBoundsException. Constant propagation optimization for these queries is now enabled.
BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet tables
Bug: HIVE-13039
Cloudera Bug: CDH-37322
When predicate pushdown (PPD) is enabled for Parquet tables, BETWEEN is evaluated as exclusive rather than inclusive, which can produce incorrect results.
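The effect of an exclusive BETWEEN can be sketched in a few lines. This is a Python illustration of the two semantics, not Hive or Parquet code; the sample values are hypothetical, and the sketch assumes that both bounds were dropped by the faulty filter.

```python
values = [1, 5, 7, 10, 15]  # hypothetical column values


def between_inclusive(rows, low, high):
    # Standard SQL BETWEEN: both bounds are part of the result.
    return [v for v in rows if low <= v <= high]


def between_exclusive(rows, low, high):
    # Assumed faulty behavior: rows equal to either bound are lost.
    return [v for v in rows if low < v < high]


print(between_inclusive(values, 5, 10))
print(between_exclusive(values, 5, 10))
```

Rows whose value equals either bound are the ones missing from the second result.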
Performance degradation when running Hive queries against wide tables with Sentry enabled
Bug: SENTRY-1007
Cloudera Bug: CDH-35908
Fixes a performance regression due to inefficient authorization checks in the Sentry Hive binding for Hive tables that are wide (more than 100 columns).
Optionally cancel queries after configurable timeout waiting on compilation lock
Bug: HIVE-12431
Cloudera Bug: CDH-34693
Adds a new configuration option, hive.server2.compile.lock.timeout, that cancels queries if they are waiting for the compile lock for more than this amount of time. This applies only to queries waiting on compilation and does not cancel queries that are being compiled. By default, the timeout is unlimited.
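The behavior of a bounded wait on a lock can be sketched as follows. This is a Python illustration only: threading.Lock stands in for the compile lock, and the function name run_query and the timeout values are hypothetical, not part of HiveServer2.

```python
import threading

compile_lock = threading.Lock()  # stands in for the compile lock


def run_query(timeout_seconds=None):
    # None models the default of no limit: wait until the lock is free.
    wait = -1 if timeout_seconds is None else timeout_seconds
    if not compile_lock.acquire(timeout=wait):
        return "cancelled"  # gave up while waiting, before compiling
    try:
        return "compiled"  # once compilation starts, it is not cancelled
    finally:
        compile_lock.release()


compile_lock.acquire()    # another query is compiling
print(run_query(0.1))     # waits 0.1 seconds, then gives up
compile_lock.release()
print(run_query(0.1))     # lock is free, so compilation proceeds
```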
Hue
The Hive Sample Table, customer, Cannot be Queried
Apache Impala
For the list of Impala fixed issues, see Issues Fixed in Impala for CDH 5.7.0.
See also Apache Impala Known Issues for issues that are known but not resolved yet.
MapReduce
MapReduce Rolling Upgrades To and From CDH 5.6.0 Fail
Cloudera Bug: CDH-38587
Users can now safely use rolling upgrade from releases CDH 5.6.0 and lower to CDH 5.7.0.
Cloudera Search
Reordered updates cause leaders and replicas to become out of sync
Solr relied on checking leader/replica document synchronization by comparing the last 100 updates on the leader and replica for significant overlap, and then applying any missing updates from the leader. In certain cases, document updates could be significantly reordered, resulting in mismatches in the index, even when the last 100 documents matched. Solr now implements hashing over the versions of all the documents to check for synchronization, eliminating a class of errors in which replicas and leaders could become out of sync.
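The difference between the two checks can be sketched as follows. This is a simplified Python model, not Solr's implementation: the window check uses exact equality rather than "significant overlap", SHA-256 is an arbitrary choice of hash, and all names and sample data are hypothetical.

```python
import hashlib


def recent_overlap(leader, replica, window=100):
    # Older style of check: compare only the most recent versions.
    return set(leader[-window:]) == set(replica[-window:])


def fingerprint(versions):
    # Order-independent hash over the versions of all documents.
    digest = hashlib.sha256()
    for version in sorted(versions):
        digest.update(str(version).encode("utf-8") + b"\n")
    return digest.hexdigest()


leader = list(range(1, 501))
replica = [v for v in leader if v != 42]  # replica missed an old update

print(recent_overlap(leader, replica))              # window check passes
print(fingerprint(leader) == fingerprint(replica))  # full hash differs
```

In this model the missing update lies outside the last 100 versions, so only the hash over all versions detects the mismatch.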
Apache Sentry
Fixed Sentry Oracle upgrade script
Bug: SENTRY-1066
This fixes previous Sentry upgrade issues with Oracle (ORA-00955).
Tables with non-HDFS locations break Hive Metastore startup
Bug: SENTRY-1044
Tables with non-HDFS locations cause the HDFS/Sentry plugin to enter an invalid state and fail with the exception, Could not create Initial AuthzPaths or HMSHandler !!.
URI check is now case-sensitive
Bug: SENTRY-968
Sentry no longer ignores case when validating privileges for URIs.
TRUNCATE on empty partitioned table in Hive fails
Bug: SENTRY-826
PathsUpdate.parsePath(path) will throw an NPE when parsing relative paths
Bug: SENTRY-1002
Sentry now skips relative paths (that is, paths without a fully qualified scheme) rather than failing with an NPE.
The Sentry Server should not be updated if the CREATE/DROP operations fail
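Skipping paths that lack a scheme can be sketched as follows. This is a Python illustration using urllib.parse, not the Sentry PathsUpdate code; the function name parse_paths and the sample paths are hypothetical.

```python
from urllib.parse import urlsplit


def parse_paths(paths):
    parsed = []
    for path in paths:
        parts = urlsplit(path)
        if not parts.scheme:
            # Relative path without a scheme: skip it instead of failing.
            continue
        parsed.append(parts.path.strip("/").split("/"))
    return parsed


sample = [
    "hdfs://namenode:8020/user/hive/warehouse/t1",
    "warehouse/t2",  # relative, so it is skipped
]
print(parse_paths(sample))
```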
Bug: SENTRY-1008
Previously, even if a CREATE TABLE operation failed, the Sentry Server would still be updated with a path to the table. This has been fixed.
SimpleDBProviderBackend should retry the authorization process
Bug: SENTRY-902
This fix includes corrections to the retry logic to remove recursive calls and include a wait time between retries when authorization fails.
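An iterative retry loop with a pause between attempts can be sketched as follows. This is a generic Python illustration of the pattern, not the SimpleDBProviderBackend code; the function name, the attempt limit, the wait time, and the use of ConnectionError as the failure type are all assumptions.

```python
import time


def authorize_with_retry(check, max_attempts=3, wait_seconds=0.01):
    # A loop rather than a recursive call, so retries do not grow the stack.
    for attempt in range(1, max_attempts + 1):
        try:
            return check()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure
            time.sleep(wait_seconds)  # wait before the next attempt


calls = []


def flaky_check():
    # Hypothetical check that fails twice, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("backend unavailable")
    return True


print(authorize_with_retry(flaky_check))
```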
Support Hive RELOAD by updating the classpath for Sentry
Bug: SENTRY-1003
When Hive issues the RELOAD command, Sentry now gets the updated auxiliary JAR path from the hive.reloadable.aux.jars.path property.
RealTimeGet with explicit ids can bypass document-level authorization
Bug: SENTRY-989
Users can no longer bypass security by guessing the document IDs for the RealTimeGet command.
Updated Apache Shiro dependency
Bug: SENTRY-1054
External partitions referenced by more than one table can cause some unexpected behavior with Sentry HDFS sync
Bug: SENTRY-953
INSERT INTO command no longer requires URI privilege on partition location under table
Bug: SENTRY-1095
The checks on the Hive INSERT INTO command have been relaxed. The INSERT INTO command adds location information to the partition description. Although this requires a check on URI privileges, in this case location information can be generated even if the partition is under the table directory.
Improvement to the SentryAuthFilter error message when authentication fails
Bug: SENTRY-1060
Avoid logging all DataNucleus queries when debug logging is enabled
Bug: SENTRY-945
Logging DataNucleus queries when debugging can fill up 2 GB of log file space in less than five minutes.
getGroup and getUser should always return original HDFS values for paths that are not managed by Sentry
Bug: SENTRY-936
Paths that do not correspond to Hive metastore objects should not be affected by HDFS/Sentry sync.
Exceptions in MetastoreCacheInitializer should not prevent HMS from starting up
Bug: SENTRY-957
Instead of only throwing a runtime exception, this fix ensures failed tasks are first retried.
Set maximum message size for Thrift messages
Bug: SENTRY-904
This ensures that security scans and unformatted messages do not bring down the Sentry server by going out of bounds.
Allow SentryAuthorization setter path to always fall through and update HDFS
Bug: SENTRY-988
Setting HDFS rules on Sentry-managed HDFS paths should not affect original HDFS rules
Bug: SENTRY-944
Removing and modifying ACLs on Sentry-managed paths that correspond to Hive metastore objects should not affect the original HDFS access rules.
Fix inconsistency in column-level privileges
Bug: SENTRY-847
If you have column-level privileges for a table, the SHOW columns operation should not require extra table-level privileges.
Performance Improvements
Improved performance for filtering Hive SHOW commands
Bug: SENTRY-565
HiveAuthzBinding has been improved to reduce the number of RPC calls when filtering SHOW commands.
Apache Spark
Spark SQL does not support the char type
Spark SQL does not support the char type (fixed-length strings). Like unions, tables with such fields cannot be created from or read by Spark.
Spark SQL statements that can result in table partition metadata changes may fail
Because Spark does not have access to Sentry data, it may not recognize that a user has permission to execute an operation, and may fail the operation instead. SQL statements that can result in table partition metadata changes, for example, "ALTER TABLE" or "INSERT", may fail.
Cloudera Bug: CDH-33446
Certain Spark MLlib features not supported
- spark.ml
- ML pipeline APIs
Streaming incompatibility between Spark 1.2 and 1.3
Applications built as a JAR with dependencies ("uber JAR") must be built for the specific version of Spark running on the cluster.
Cloudera Bug: CDH-26527
Workaround: Rebuild the JAR with the Spark dependencies in pom.xml pointing to the specific version of Spark running on the target cluster.
Spark SQL cannot retrieve data from a partitioned Hive table
When reading from a partitioned Hive table, Spark SQL cannot identify the column delimiter used and reads the full record as the first column entry.
Cloudera Bug: CDH-37189
Workaround: Contact Cloudera Support for information on how to deploy a patch to resolve the issue.
Tasks that fail due to YARN preemption can cause job failure
Bug: SPARK-8167
Tasks that are running on preempted executors will count as FAILED with an ExecutorLostFailure.
Apache Sqoop
Oracle connector not working with lowercase columns
Bug: SQOOP-2723
The Oracle connector now works with lowercase columns.
Run only one map task attempt during export
Bug: SQOOP-2712
In most scenarios, running multiple map task attempts by default when performing an export is not required. The default is now one map task attempt during export operations.
Do not dump data on error in TextExportMapper by default
Bug: SQOOP-2651
Dumping data in the TextExportMapper might unintentionally leak sensitive information to logs. The enableDataDumpOnError key is set to false by default. A user can set the value to true to intentionally write the data to the log.
Support of glob paths during export
Bug: SQOOP-1281
Glob paths are now supported for export.
Sqoop should support importing from table with column names containing some special characters
Bug: SQOOP-2387
Sqoop supports some special characters in column names. The specific list of characters depends on those supported for a particular database.
Avro export ignores --columns option
Bug: SQOOP-1369
AvroExportMapper now supports the --columns option to restrict the columns to export.
JDK
Java 8 (updates 60 and higher) has problems with joda-time and S3 requests
Bug: SPARK-11413
Cloudera Bug: CDH-31245
Java 1.8 update 60 and higher (jdk1.8.0_60 and later) causes S3 requests to fail because joda-time cannot format time zones.