Known Issues
Summary of known issues for this release.
Cloudera Bug ID | Apache JIRA | Apache component | Summary |
---|---|---|---|
BUG-121483 | N/A | Atlas | Description of the problem or behavior Atlas entity PUT API. Associated error message Error code 200 is displayed instead of error code 403. Workaround There is no workaround. Customers must request a hotfix and apply the patch available in ATLAS-3550. |
|||
N/A | N/A | Hive | Description of the problem or behavior In PostgreSQL 9.6 and lower, hash indexes are prone to corruption. Hive Metastore uses hash indexes for PostgreSQL, and these corruptions cause issues. Affected indexes: TC_TXNID_INDEX and HL_TXNID_INDEX. Workaround Reindex the corrupted indexes. For example, in PostgreSQL, run `reindex index tc_txnid_index`. Also, if you use PostgreSQL as the backend database, a supported version later than 9.6 is recommended. |
|||
BUG-122325 | N/A | Atlas | Description of the problem or behavior When you rename a Hive table, duplicate audit events (ENTITY_UPDATE) are created for both the table and its columns. Associated error message No error message is displayed. Workaround There is currently no workaround. |
|||
BUG-122149 | N/A | Atlas | Description of the problem or behavior If an HBase column family was previously deleted, importing HBase entities with the import HBase script fails. Associated error message
Workaround There is no workaround. ATLAS-3551 has been created to fix this issue. |
|||
BUG-122430 | N/A | Teradata Connector |
Description of the problem or behavior Teradata Database does not create mappers equivalent to AMP instances. Workaround No workaround needed. |
|||
BUG-123169 | N/A | Ambari and Oozie | Description of the problem or behavior Oozie Server does not start after “Install Packages” when upgrading from HDP-3.1.x to HDP-3.1.4 or later because of an incompatible Tomcat version. The start fails with the following error:
Workaround
|
|||
BUG-79238 | N/A | Documentation, HBase, HDFS, Hive, MapReduce, Zookeeper | Description of the problem or behavior SSL is deprecated and its use in production is not recommended. Use TLS. Workaround In Ambari: Use ssl.enabled.protocols=TLSv1|TLSv1.1|TLSv1.2 and security.server.disabled.protocols=SSL|SSLv2|SSLv3. For help configuring TLS for other components, contact customer support. Documentation will be provided in a future release. |
|||
BUG-106494 | N/A | Documentation, Hive | Description of the problem or behavior When you partition on a Hive column of type double and the column value is 0.0, the actual partition directory is created as "0". An ArrayIndexOutOfBounds (AIOB) exception occurs. Associated error message
Workaround Do not partition columns of type double. |
|||
BUG-106379 | N/A | Documentation, Hive | Description of the Problem The upgrade process fails to perform necessary compaction of ACID tables and can cause permanent data loss. Workaround If you have ACID tables in your Hive metastore, enable ACID operations in Ambari or set Hive configuration properties to enable ACID. If ACID operations are disabled, the upgrade process does not convert ACID tables. This causes permanent loss of data; you cannot recover data in your ACID tables later. |
|||
BUG-106286 | N/A | Documentation, Hive | Description of the Problem The upgrade process might fail to make a backup of the Hive metastore, which is critically important. Workaround Make a manual backup of your Hive metastore database before upgrading. Making a backup is especially important if you did not use Ambari to install Hive and create the metastore database, but it is highly recommended in all cases. Ambari might not have the necessary permissions to perform the backup automatically. The upgrade can succeed even if the backup fails, so having a backup is critically important. |
|||
BUG-101082 | N/A | Documentation, Hive | Description of the problem or behavior When running Beeline in batch mode, queries killed by the Workload Management process can on rare occasions mistakenly return success on the command line. Workaround There is currently no workaround. |
|||
BUG-103495 | HBASE-20634, HBASE-20680, HBASE-20700 | HBase | Description of the problem or behavior Because region assignment was refactored in HBase, there are issues, not yet fully understood, that may affect the stability of the RegionServer Groups feature. If you rely on the RegionServer Groups feature, we recommend that you wait until a future HDP 3.x release, which will restore the stability this feature had in HBase 1.x/HDP 2.x releases. Workaround There is currently no workaround. |
|||
BUG-98727 | N/A | HBase | Description of the problem or behavior Because region assignment was refactored in HBase, there are issues, not yet fully understood, that may affect the stability of the Region replication feature. If you rely on the Region replication feature, we recommend that you wait until a future HDP 3.x release, which will restore the stability this feature had in HBase 1.x/HDP 2.x releases. Workaround There is currently no workaround. |
|||
BUG-105983 | N/A | HBase |
Description of the problem or behavior An HBase service (Master or RegionServer) stops participating with the rest of the HBase cluster. Associated error message The service's log contains stack traces that contain "Kerberos principal name does NOT have the expected hostname part..." Workaround Retrying the connection solves the problem. |
|||
BUG-96402 | HIVE-18687 | Hive | Description of the problem or behavior When HiveServer2 is running in HA (high-availability) mode in HDP 3.0.0, resource plans are loaded in-memory by all HiveServer2 instances. If a client makes changes to a resource plan, the changes are reflected (pushed) only in the HiveServer2 instance to which the client is connected. Workaround For the resource plan changes to be reflected on all HiveServer2 instances, all HiveServer2 instances must be restarted so that they can reload the resource plan from the metastore. |
|||
BUG-88614 | N/A | Hive | Description of the problem or behavior The RDBMS schema for the Hive metastore contains an index HL_TXNID_INDEX defined as `CREATE INDEX HL_TXNID_INDEX ON HIVE_LOCKS USING hash (HL_TXNID);` Hash indexes are not recommended by PostgreSQL. For more information, see https://www.postgresql.org/docs/9.4/static/indexes-types.html Workaround It's recommended that this index is changed to type
|
|||
BUG-60904 | KNOX-823 | Knox | Description of the problem or behavior When Ambari is being proxied by Apache Knox, the QuickLinks are not rewritten to go back through the gateway. If all access to Ambari is through Knox in the deployment, the new Ambari QuickLink profile may be used to hide or change URLs so that they go through Knox permanently. A future release will make these links reflect the gateway appropriately. Workaround There is currently no workaround. |
|||
BUG-107399 | N/A | Knox | Description of the problem or behavior After an upgrade from previous HDP versions, certain topology deployments may return a 503 error. This includes, but may not be limited to, knoxsso.xml for the KnoxSSO-enabled services. Workaround When this occurs, making a minor change (even whitespace) through Ambari to the knoxsso topology (or any other topology with this issue) and restarting the Knox gateway server should eliminate the issue. |
|||
BUG-110463 | KNOX-1434 | Knox | Description of the problem or behavior Visiting the Knox Admin UI in any browser (for example, Firefox or Chrome) sets the HTTP Strict Transport Security (HSTS) header for the host where Knox is running. Any subsequent request over HTTP to another service on the same host (for example, Grafana or Ranger) is redirected to HTTPS because of this header. Note that this HSTS header is disabled in all Knox topologies by default. For more information, see https://knox.apache.org/books/knox-1-1-0/user-guide.html#HTTP+Strict+Transport+Security Impact All non-SSL requests to other services are redirected automatically to HTTPS and result in SSL errors such as SSL_ERROR_RX_RECORD_TOO_LONG or some other error. Workaround Use the manager.xml topology and remove the setting from the WebAppSec provider. You can do this using the Knox Admin UI. After you have removed the setting, close your browser or clear the cookies. |
|||
BUG-106266 | OOZIE-2769, OOZIE-3085, OOZIE-3156, OOZIE-3183 | Oozie | Description of the problem or behavior When the check() method of SshActionExecutor is invoked, Oozie executes the command "ssh <host-ip> ps -p <pid>" to determine whether the SSH action has completed. However, if the connection to the host fails during the action status check, the command returns an error code, but the action status is determined to be OK, which may not be correct. Associated error message The SSH command exits with the exit status of the remote command, or with 255 if an error occurred. Workaround Retrying the connection solves the problem. |
|||
BUG-121014 | N/A | Oozie | Description of the problem or behavior If you are using a non-RPM-based Linux distribution (for example, Debian or Ubuntu), Oozie cannot start after the upgrade because of an incorrect Apache Tomcat server version present in your operating system. Workaround Install Apache Tomcat 7 or later manually after you finish the upgrade. |
|||
BUG-95909 | RANGER-1960 | Ranger | Description of problem or behavior The delete snapshot operation fails even if the user has Administrator privilege because the namespace is not considered in the authorization flow for the HBase Ranger plugin. Associated error message ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions for user '<username>' (action=admin) Workaround For the delete snapshot operation to succeed, you need system-wide Administrator privileges. |
|||
BUG-89714 | N/A | Ranger | Description of the problem or behavior Sudden increase in Login Session audit events from Ranger Usersync and Ranger Tagsync. Associated error message If the policy storage DB size increases suddenly, periodically back up and purge the 'x_auth_sess' table. Workaround Take a backup of the policy DB store and purge the 'x_auth_sess' table from the Ranger DB schema. |
|||
BUG-101227 | N/A | Spark | Description of the problem or behavior When Spark Thriftserver has to run several queries concurrently, some of them can fail with a timeout exception when performing a broadcast join. Associated error message
Workaround You can resolve this issue by increasing the spark.sql.broadcastTimeout value. |
|||
BUG-109979 | N/A | Spark | Description of the problem or behavior YARN NodeManagers fail to start after a Spark patch upgrade because of a YarnShuffleService class-not-found (CNF) error. Workaround To resolve this problem, replace "{{spark2_version}}" with "${hdp.version}" in the "yarn.nodemanager.aux-services.spark2_shuffle.classpath" property value. For example, old value "{{stack_root}}/{{spark2_version}}/spark2/aux/*" -> new value "{{stack_root}}/${hdp.version}/spark2/aux/*" |
|||
BUG-65977 | SPARK-14922 | Spark |
Description of the problem or behavior Since Spark 2.0.0, `DROP PARTITION BY RANGE` is not supported grammatically. In other words, only '=' is supported, while '<', '>', '<=', and '>=' are not. Associated error message
Workaround To drop a partition, use an exact match with '='.
|
|||
N/A | N/A | Spark |
Description of the problem or behavior A Spark Structured Streaming job intermittently reads from the beginning instead of from the last offset maintained by the checkpoint. Associated error message
Workaround Upgrade to the Cloudera Data Platform (CDP). |
|||
BUG-114383 | N/A | Storm |
Description of the problem or behavior Submitting a topology to Storm fails with an error. Associated error message The following error message is displayed when submitting a topology with the stack trace org.apache.storm.hack:
Workaround Using the Ambari user interface, check whether `client.jartransformer.class` is present in the Storm configuration. If it is present, set it to ' ' and restart the Storm service for the change to take effect.
|
|||
BUG-106917 | N/A | Sqoop | Description of the problem or behavior In HDP 3, managed Hive tables must be transactional
( Associated error message
Workaround When using --hive-import with --as-parquetfile, users must also provide --external-table-dir with a fully qualified location of the table:
|
|||
BUG-102672 | N/A | Sqoop | Description of the problem or behavior In HDP 3, managed Hive tables must be transactional (hive.strict.managed.tables=true). Writing to transactional tables with HCatalog is not supported by Hive. This leads to errors during HCatalog Sqoop imports if the specified Hive table does not exist or is not external. Associated error message Store into a transactional table db.table from Pig/Mapreduce is not supported Workaround Before running the HCatalog import with Sqoop, the user must create the external table in Hive. The --create-hcatalog-table option does not support creating external tables. |
|||
BUG-109607 | N/A | YARN |
Description of the problem or behavior When wire encryption is enabled and Spark runs containerized on YARN with Docker, Spark submit fails in "cluster" deployment mode. Spark submit in "client" deployment mode works successfully. Workaround There is currently no workaround. |
|||
BUG-110192 | N/A | YARN | Description of the problem or behavior When YARN is installed and configured with Knox SSO alone, the Application Timeline Server web endpoint blocks remote REST calls from the YARN UI and displays a 401 Unauthorized error. Associated error message 401 Unauthorized error. Workaround The administrator needs to configure the Knox authentication handler for Timeline Server and the existing Hadoop-level configuration. The administrator needs to tune the following cluster-specific configurations. Values for the last two properties are in the hadoop.authentication.* properties file.
|
|||
BUG-123606 | YARN-10070 | YARN | Description of the problem or behavior If the application-tag-based-placement property is enabled and the hive.doAs property is set to false, the ResourceManager may crash because of the issue listed in YARN-10070. In some scenarios, setting this mapping rule for the proxy user solved the issue. However, in some scenarios, setting this property did not resolve the issue. Workaround There is currently no workaround for this. |
|||
RMP-11408 | ZEPPELIN-2170 | Zeppelin | Description of the problem or behavior Zeppelin does not show all WARN messages thrown by spark-shell at the Zeppelin notebook level. Workaround There is currently no workaround for this. |
|||
N/A | N/A | N/A |
Description of the problem or behavior OpenJDK 8u242 is not supported because it causes Kerberos failures. Workaround Use a different version of OpenJDK. |
|||
N/A | N/A | HBase and Phoenix | Description of the problem or behavior If you are upgrading from HDP 3.0.0 or later to HDP 3.1.5 and Phoenix is part of your HDP cluster, you must apply HBASE-20781, HBASE-23044, and HBASE-25459 (this addresses PHOENIX-5250 as well). For more information, see TSB-494. Workaround Perform a rolling restart of HBase if the number of znodes under hbase-secure/splitWAL in ZooKeeper is greater than 8000. Upgrade (recommended): Upgrade to the latest version of CDP containing the fix. |
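
The BUG-109979 workaround listed in the table above is a plain string substitution in the `yarn.nodemanager.aux-services.spark2_shuffle.classpath` property value. The following is a minimal sketch of that substitution only. The helper name `fix_shuffle_classpath` is hypothetical and not part of any HDP or Ambari tooling, and saving the new value back through Ambari is outside the sketch.

```python
# Placeholder strings taken from the BUG-109979 workaround text.
OLD_TOKEN = "{{spark2_version}}"
NEW_TOKEN = "${hdp.version}"


def fix_shuffle_classpath(value: str) -> str:
    """Return the property value with the Spark version placeholder replaced.

    Hypothetical helper for illustration; values that do not contain the
    old placeholder are returned unchanged.
    """
    return value.replace(OLD_TOKEN, NEW_TOKEN)


if __name__ == "__main__":
    old_value = "{{stack_root}}/{{spark2_version}}/spark2/aux/*"
    # Prints the new value given in the workaround example.
    print(fix_shuffle_classpath(old_value))
```

Running the function on a value that has already been corrected leaves it untouched, so the check is safe to repeat.
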
Technical Service Bulletin | Apache JIRA | Apache component | Summary |
---|---|---|---|
TSB-405 | N/A | N/A | Impact of LDAP Channel Binding and LDAP signing changes in Microsoft Active Directory Microsoft has introduced changes in LDAP Signing and LDAP Channel Binding to increase the security of communications between LDAP clients and Active Directory domain controllers. These optional changes affect how third-party products integrate with Active Directory using the LDAP protocol. Workaround Disable the LDAP Signing and LDAP Channel Binding features in Microsoft Active Directory if they are enabled. For the latest update on this issue see the corresponding Knowledge article: TSB 2021-405: Impact of LDAP Channel Binding and LDAP signing changes in Microsoft Active Directory |
TSB-406 | N/A | HDFS | CVE-2020-9492 Hadoop filesystem bindings (ie: webhdfs) allows credential stealing WebHDFS clients might send the SPNEGO authorization header to a remote URL without proper verification. A maliciously crafted request can trigger services to send server credentials to a webhdfs path (ie: webhdfs://…), which allows the service principal to be captured. For the latest update on this issue see the corresponding Knowledge article: TSB 2021-406: CVE-2020-9492 Hadoop filesystem bindings (ie: webhdfs) allows credential stealing |
TSB-453 | HBASE-25206 | HBase | HBASE-25206: snapshot and cloned table corruption when original table is deleted HBASE-25206 can cause data loss either through corrupting an existing HBase snapshot or destroying data that backs a clone of a previous snapshot. For the latest update on this issue see the corresponding Knowledge article: TSB 2021-453: HBASE-25206 "snapshot and cloned table corruption when original table is deleted" |
TSB-458 | N/A | HDFS | Possible HDFS Erasure Coded (EC) Data Files Corruption in EC Reconstruction Cloudera has detected two bugs that can cause corruption of HDFS Erasure Coded (EC) files during the data reconstruction process. For the latest update on this issue see the corresponding Knowledge article: Cloudera Customer Advisory: Possible HDFS Erasure Coded (EC) Data Files Corruption in EC Reconstruction |
TSB-463 | N/A | HBase | HBase Performance Issue The HDFS short-circuit setting dfs.client.read.shortcircuit is overwritten to disabled by hbase-default.xml. HDFS short-circuit reads bypass access to data in HDFS by using a domain socket (file) instead of a network socket. This alleviates the overhead of TCP to read data from HDFS which can have a meaningful improvement on HBase performance (as high as 30-40%). For the latest update on this issue see the corresponding Knowledge article: TSB 2021-463: HBase Performance Issue |
TSB-471 | HBASE-23044 | HBase | Upgrading to HDP 3.1.5 can cause silent data loss on HBase due to HBASE-23044 The HBase version shipped in HDP 3.1.5 does not include HBASE-23044. Without that fix, HBase can experience silent data loss due to regions being deleted by the HBase master cleaning incorrect entries in the hbase:meta catalog. For the latest update on this issue see the corresponding Knowledge article: TSB 2021-471: Upgrading to HDP 3.1.5 can cause silent data loss on HBase due to HBASE-23044 |
TSB-480/2 | HIVE-24224 | Hive | Hive ignores the property to skip a header or footer in a compressed file Incorrect results can occur when running SELECT queries if the count value is greater than 0. For the latest update on this issue see the corresponding Knowledge article: TSB 2021-480.2: Hive ignores the property to skip a header or footer in a compressed file |
TSB-494 | | HBase | Accumulated WAL Files Cannot be Cleaned up When Using Phoenix Secondary Global Indexes The write-ahead log (WAL) files for Phoenix tables that have secondary global indexes defined on them cannot be automatically cleaned up by HBase, leading to excess storage usage and possible errors when the storage fills up. Workaround Perform a rolling restart of HBase if the number of znodes under hbase-secure/splitWAL in ZooKeeper is greater than 8000. For the latest update on this issue see the corresponding Knowledge article: TSB 2021-494: Accumulated WAL Files Cannot be Cleaned up When Using Phoenix Secondary Global Indexes |
TSB-508 | HIVE-25159 | Hive | Removing support for Order By statement from HWC LLAP mode The Order By statement is no longer supported for HWC LLAP mode. Workaround
|
TSB-497 | N/A | Solr | CVE-2021-27905: Apache Solr SSRF vulnerability with the Replication handler The Apache Solr ReplicationHandler (normally registered at "/replication" under a Solr core) has a "masterUrl" (also "leaderUrl" alias) parameter. The "masterUrl" parameter is used to designate another ReplicationHandler on another Solr core to replicate index data into the local core. To help prevent the CVE-2021-27905 SSRF vulnerability, Solr should check these parameters against a similar configuration used for the "shards" parameter. For the latest update on this issue, see the corresponding Knowledge article: TSB 2021-497: CVE-2021-27905: Apache Solr SSRF vulnerability with the Replication handler |
TSB-512 | N/A | HBase | HBase MOB data loss HBase tables with the MOB feature enabled may encounter problems which result in data loss. For the latest update on this issue, see the corresponding Knowledge article: TSB 2021-512: HBase MOB data loss |
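
The TSB-494 workaround above depends on a simple threshold: a rolling restart of HBase is called for when the number of znodes under hbase-secure/splitWAL is greater than 8000. The following is a minimal sketch of that decision only. It assumes the list of child znode names has already been fetched from ZooKeeper by some client of your choice. The function name `needs_rolling_restart` and the sample znode names are hypothetical.

```python
# Threshold stated in the TSB-494 workaround.
SPLIT_WAL_ZNODE_THRESHOLD = 8000


def needs_rolling_restart(split_wal_children, threshold=SPLIT_WAL_ZNODE_THRESHOLD):
    """Return True when the number of child znodes is greater than the threshold.

    `split_wal_children` is assumed to be the list of child names found
    under hbase-secure/splitWAL; fetching it is outside this sketch.
    """
    return len(split_wal_children) > threshold


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real ZooKeeper listing.
    sample_children = [f"wal-{i}" for i in range(8001)]
    print(needs_rolling_restart(sample_children))
```

The comparison is strict, matching the "greater than 8000" wording, so a count of exactly 8000 does not trigger a restart.
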