Known Issues in Apache Hadoop
Learn about the known issues in Hadoop, the impact or changes to the functionality, and the workaround.
- CDPD-10352: Hive on Tez cannot run certain queries on tables stored in encryption zones. This occurs when the KMS connection is SSL-encrypted and uses a self-signed certificate. In this case, you may see an SSLHandshakeException in the Hive logs.
- There are two workarounds: 1. Install the self-signed SSL certificate into the cacerts file on all hosts. 2. Copy ssl-client.xml to a directory that is available on all hosts, then set the tez.aux.uris=path-to-ssl-client.xml property in the Hive on Tez advanced configuration.
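The two workarounds can be sketched as a small Python helper. This is a minimal illustration, not a supported tool: the function names, the `kms-self-signed` alias, and the example paths are hypothetical. The `keytool` options used (`-importcert`, `-alias`, `-file`, `-keystore`, `-storepass`, `-noprompt`) are standard JDK options, and `changeit` is the stock default password of the JDK cacerts file, which may have been changed on your hosts. The location of cacerts depends on the JDK version and installation.

```python
import subprocess


def build_import_cmd(cert_path, truststore_path,
                     alias="kms-self-signed", storepass="changeit"):
    """Build the keytool command that imports a certificate into a truststore.

    alias and storepass are assumptions; adjust them for your environment.
    """
    return [
        "keytool", "-importcert",
        "-alias", alias,
        "-file", cert_path,
        "-keystore", truststore_path,
        "-storepass", storepass,
        "-noprompt",  # do not ask for interactive confirmation
    ]


def import_certificate(cert_path, truststore_path):
    """Workaround 1: run the import on the local host (repeat on every host)."""
    subprocess.run(build_import_cmd(cert_path, truststore_path), check=True)


def tez_aux_uris_setting(ssl_client_xml_path):
    """Workaround 2: the property line to add to the Hive on Tez configuration."""
    return f"tez.aux.uris={ssl_client_xml_path}"
```

Only one of the two workarounds is needed. For the first, `import_certificate` would have to be run on every host with the path to that host's cacerts file. For the second, `tez_aux_uris_setting("path-to-ssl-client.xml")` returns the property line quoted in the workaround text.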
Technical Service Bulletins
- TSB 2021-434: KMS Load Balancing Provider Fails to invalidate Cache on Key Delete
- The KMS Load Balancing Provider has not been correctly invalidating the cache on key delete operations. As a result, data can be leaked from the framework for a short period of time, bounded by the value of the hadoop.kms.current.key.cache.timeout.ms property. Its default value is 30,000 ms. When the KMS is deployed in an HA pattern, the KMSLoadBalancingProvider class sends the delete operation to only one KMS role instance, chosen in a round-robin fashion. The code lacks a call to invalidate the cache across all instances, so key information, including the metadata and the stored (deleted) key, can remain in the cache on one or more KMS instances until the key cache timeout expires.
- Upstream JIRA
- Knowledge article
- For the latest update on this issue, see the corresponding Knowledge article: TSB 2020-434: KMS Load Balancing Provider Fails to invalidate Cache on Key Delete
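The behavior described in this bulletin can be illustrated with a toy model. The following Python sketch is not Hadoop code: the `KmsInstance` and `RoundRobinProvider` classes, their methods, and the `invalidate_all` flag are hypothetical names introduced only to show why a delete sent to a single instance leaves stale cache entries on the others, and what invalidating every instance would change. Only the 30,000 ms default is taken from the text above.

```python
import itertools

DEFAULT_TIMEOUT_MS = 30_000  # default of hadoop.kms.current.key.cache.timeout.ms


class KmsInstance:
    """Toy stand-in for one KMS role instance with its own current-key cache."""

    def __init__(self, name, store, timeout_ms=DEFAULT_TIMEOUT_MS):
        self.name = name
        self.store = store        # key store shared by all instances
        self.timeout_ms = timeout_ms
        self.cache = {}           # key name -> (material, cached_at_ms)

    def get_current_key(self, key_name, now_ms):
        entry = self.cache.get(key_name)
        if entry is not None and now_ms - entry[1] < self.timeout_ms:
            return entry[0]       # served from cache, even if deleted in the store
        self.cache.pop(key_name, None)
        material = self.store.get(key_name)
        if material is not None:
            self.cache[key_name] = (material, now_ms)
        return material

    def delete_key(self, key_name):
        self.store.pop(key_name, None)
        self.cache.pop(key_name, None)  # clears only this instance's cache


class RoundRobinProvider:
    """Toy load-balancing provider that picks one instance per call."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._next = itertools.cycle(self.instances)

    def delete_key(self, key_name, invalidate_all=False):
        next(self._next).delete_key(key_name)   # delete reaches one instance
        if invalidate_all:                      # hypothetical mitigation
            for instance in self.instances:
                instance.cache.pop(key_name, None)
```

In this model, if two instances have both cached a key and the provider then deletes it with `invalidate_all=False`, the instance that did not receive the delete keeps returning the deleted key material until 30,000 ms have passed since it cached the key. With `invalidate_all=True`, every instance stops returning the key immediately.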