Known Issues and Workarounds in Key Trustee KMS

CDH upgrade failure

When upgrading to Key Trustee KMS 5.15.0 or 5.15.1 from Key Trustee KMS 5.14.0 or lower, and performing a rolling restart (instead of a full restart), the first Key Trustee KMS instance to restart may fail to come up and present the error:
"Unable to verify private key match between KMS hosts. If the system has been recently upgraded, DO NOT TAKE FURTHER ACTION and contact your support representative as soon as possible. If this is a new installation, verify private key files have been synched between all KMS hosts. Aborting to prevent data inconsistency."

Affected Versions: 5.15.0, 5.15.1

Fixed in Version: 5.16.0

Cloudera Bug: KT-6499

Workaround: If possible, perform a full restart instead of a rolling restart.

If you cannot execute a full restart, then add the following line to the /var/lib/kms-keytrustee/keytrustee/.keytrustee/keytrustee.conf file on all Key Trustee KMS instances, and then restart the Key Trustee KMS that failed:
"FINGERPRINT_VALIDATED": "True"
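
The edit above can be scripted when several Key Trustee KMS instances need the same change. The following is a minimal sketch only. It assumes that keytrustee.conf holds a single JSON object, which the key/value syntax of the line above suggests but this document does not confirm. The function names and the .bak backup suffix are illustrative and not part of the product.

```python
import json
import shutil
from pathlib import Path

# Path taken from the workaround text above.
CONF_PATH = Path("/var/lib/kms-keytrustee/keytrustee/.keytrustee/keytrustee.conf")


def mark_fingerprint_validated(conf_text):
    """Return the configuration text with "FINGERPRINT_VALIDATED": "True" set.

    Assumption: the file content is one JSON object. Existing keys are kept.
    """
    conf = json.loads(conf_text)
    if not isinstance(conf, dict):
        raise ValueError("expected a JSON object at the top level")
    conf["FINGERPRINT_VALIDATED"] = "True"
    return json.dumps(conf, indent=4) + "\n"


def update_conf_file(path=CONF_PATH):
    """Back up the file, then rewrite it with the flag set (hypothetical helper)."""
    backup = path.with_name(path.name + ".bak")  # illustrative backup name
    shutil.copy2(path, backup)
    path.write_text(mark_fingerprint_validated(path.read_text()))
```

Calling update_conf_file() on each host, followed by a restart of the failed instance, would mirror the manual steps. Note that the sketch re-serializes the file, so whitespace and indentation may change.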

Error when enabling Key Trustee KMS HA mode

When a new Key Trustee KMS instance is added to an existing Key Trustee KMS service, the new instance may initially fail to start up and return the error message:
Unable to verify private key match between KMS hosts.
This error will be corrected automatically after the GPG private keys are synchronized and the KMS service is restarted to pick up the new configuration. No user intervention is required.

Affected Versions: 5.15.0

Fixed in Version: 5.16.0

Cloudera Bug: KT-6471

Workaround: None.

Validation fails if hostname command returns shortname

If the hostname command on the OS returns the short hostname, and the core-site.xml of the KMS process has a hadoop.security.key.provider.path that uses fully qualified domain names (FQDNs), then the znodes are created with the short hostname. Consequently, when KMS 1 checks the fingerprint of KMS 2, it expects a znode named after the FQDN, does not find it, and fails the validation.
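
To check whether a host is exposed to this mismatch, the local hostname can be compared with the hosts listed in the provider path. The sketch below is illustrative only. It assumes the provider path has the form kms://https@host1;host2:16000/kms, and the function names are hypothetical.

```python
import socket
from typing import List


def provider_hosts(provider_path: str) -> List[str]:
    """Extract host names from a provider path such as
    kms://https@host1;host2:16000/kms (assumed format)."""
    authority = provider_path.split("://", 1)[1].split("/", 1)[0]
    hosts_part = authority.split("@", 1)[-1]       # drop the "https@" prefix
    if ":" in hosts_part:
        hosts_part = hosts_part.rsplit(":", 1)[0]  # drop the trailing port
    return [host for host in hosts_part.split(";") if host]


def is_short_name(hostname: str) -> bool:
    """A name without a dot is treated as a short hostname."""
    return "." not in hostname


def has_znode_name_mismatch(local_hostname: str, provider_path: str) -> bool:
    """True when the host reports a short name but the provider path uses FQDNs."""
    uses_fqdn = any(not is_short_name(h) for h in provider_hosts(provider_path))
    return is_short_name(local_hostname) and uses_fqdn
```

On a KMS host, has_znode_name_mismatch(socket.gethostname(), path) would take the path value from core-site.xml. socket.gethostname() normally returns the same name as the hostname command, but that is worth confirming on the target OS.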

Affected Versions: 5.15.0

Fixed in Version: 5.15.1

Cloudera Bug: KT-6412

Workaround: In /var/lib/kms-keytrustee/keytrustee/.keytrustee/keytrustee.conf, change the value of FINGERPRINT_VALIDATED from False to True.

Incorrect default KMS ACL values allow remote access to purge and undelete API calls on encryption zone keys

The Navigator Key Trustee KMS includes two API calls in addition to those in Apache Hadoop KMS: purge and undelete. The KMS ACL values for these commands are keytrustee.kms.acl.PURGE and keytrustee.kms.acl.UNDELETE, respectively. The default value for the ACLs in Key Trustee KMS 5.12.0 and 5.13.0 is "*", which allows anyone who knows the name of an encryption zone key and has network access to the Key Trustee KMS to make those calls against known encryption zone keys. The UNDELETE command will result in the recovery of a previously deleted, but not purged, key. The key will still be protected by normal ACLs and is not exposed. The PURGE command will permanently delete a key in active use, resulting in loss of access to encrypted HDFS data.

Affected Versions:
  • Navigator Key Trustee KMS 5.12.0, 5.13.0
  • Cloudera Manager 5.12.0, 5.12.1, 5.12.2
  • Cloudera Manager 5.13.0, 5.13.1

Fixed in Version: 5.14.0

Cloudera Bug: DOCS-2910

Workaround: Use Cloudera Manager to set the values of the ACLs for purge (keytrustee.kms.acl.PURGE) and undelete (keytrustee.kms.acl.UNDELETE) to be empty; this denies access to all calls to those functions.

To set the values:

  1. Log in to Cloudera Manager as an administrative user.
  2. Click on the Key Trustee KMS service.
  3. Click on Configuration.
  4. In the Search box, type “ACL”.
  5. At the bottom of the section Key Management Server Proxy Advanced Configuration Snippet (Safety Valve) for kms-acls.xml add a new ACL by clicking the + icon.
  6. In the Name field add “keytrustee.kms.acl.PURGE” and leave the Value field empty.
  7. Repeat step 5, and in the Name field add “keytrustee.kms.acl.UNDELETE”, again leaving the Value field empty.
  8. Scroll to the bottom of the page and click Save.
  9. Use Cloudera Manager to perform a rolling restart of the Navigator Key Trustee KMS service.
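
After the restart, the resulting kms-acls.xml can be inspected to confirm that both ACLs are present and empty. The following is a rough sketch. It assumes the file uses the standard Hadoop configuration layout of property elements with name and value children. The function name is hypothetical, and a missing entry is treated as open because the default on affected versions is "*".

```python
import xml.etree.ElementTree as ET

RESTRICTED_ACLS = ("keytrustee.kms.acl.PURGE", "keytrustee.kms.acl.UNDELETE")


def unrestricted_acls(kms_acls_xml):
    """Return the purge/undelete ACL names that are missing or non-empty."""
    root = ET.fromstring(kms_acls_xml)
    values = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            values[name.strip()] = (prop.findtext("value") or "").strip()
    # A missing ACL falls back to the default, assumed here to be "*".
    return [acl for acl in RESTRICTED_ACLS if values.get(acl, "*") != ""]
```

An empty list returned for the deployed file would indicate that both ACLs are set to the empty value described in the workaround.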

Cannot re-encrypt an encryption zone if a previous re-encryption on it was canceled

When canceling a re-encryption on an encryption zone, the status of the re-encryption may continue to show "Processing". When this occurs, future re-encrypt commands for this encryption zone will fail inside the NameNode, and the re-encryption will never complete.

Affected Version: 5.13.0

Fixed in Version: 5.13.1

Cloudera Bug: CDH-59073

Workaround: To clear the "Processing" status for the encryption zone, re-issue the cancel re-encryption command on the encryption zone. If a new re-encryption command is required for this encryption zone, restart the NameNode before issuing the command.
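
The cancel command can be re-issued from a script. The sketch below builds the argument lists for hdfs crypto -reencryptZone -cancel and hdfs crypto -listReencryptionStatus. It assumes the hdfs client is on the PATH and that the caller has HDFS superuser privileges. The helper names and the example zone path are illustrative.

```python
import subprocess


def cancel_reencryption_command(zone_path):
    """Build the command that cancels re-encryption on an encryption zone."""
    if not zone_path.startswith("/"):
        raise ValueError("zone_path must be an absolute HDFS path")
    return ["hdfs", "crypto", "-reencryptZone", "-cancel", "-path", zone_path]


def list_status_command():
    """Build the command that lists re-encryption status for all zones."""
    return ["hdfs", "crypto", "-listReencryptionStatus"]


def run(command):
    """Run a command and return its standard output (raises on failure)."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, run(cancel_reencryption_command("/zone")) followed by run(list_status_command()) would re-issue the cancel and then show whether the zone still reports "Processing".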

Adding Key Trustee KMS 5.4 to Cloudera Manager 5.5 displays warning

Adding the Key Trustee KMS service to a CDH 5.4 cluster managed by Cloudera Manager 5.5 displays the following message, even if Key Trustee KMS is installed:

"The following selected services cannot be used due to missing components: keytrustee-keyprovider. Are you sure you wish to continue with them?"

Affected Version: 5.4

Workaround: Verify that the Key Trustee KMS parcel or package is installed and click OK to continue adding the service.

The Key Trustee KMS service fails to start if the Trust Store is configured without also configuring the Keystore

If you configure the Key Trustee KMS service Key Management Server Proxy TLS/SSL Certificate Trust Store File and Key Management Server Proxy TLS/SSL Certificate Trust Store Password parameters without also configuring the Key Management Server Proxy TLS/SSL Server JKS Keystore File Location and Key Management Server Proxy TLS/SSL Server JKS Keystore File Password parameters, the Key Trustee KMS service does not start.

Workaround: Configure all Trust Store and Keystore parameters.
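
The dependency between the two parameter groups can be expressed as a simple pre-flight check. This is only a sketch. It keys a dictionary by the display names used above, which is an illustrative choice and not how Cloudera Manager stores these settings, and the function name is hypothetical.

```python
TRUST_STORE_PARAMS = (
    "Key Management Server Proxy TLS/SSL Certificate Trust Store File",
    "Key Management Server Proxy TLS/SSL Certificate Trust Store Password",
)
KEYSTORE_PARAMS = (
    "Key Management Server Proxy TLS/SSL Server JKS Keystore File Location",
    "Key Management Server Proxy TLS/SSL Server JKS Keystore File Password",
)


def missing_keystore_params(config):
    """Return the Keystore parameters that are unset while a Trust Store
    parameter is set. An empty list means the combination is acceptable."""
    trust_store_set = any(config.get(name) for name in TRUST_STORE_PARAMS)
    if not trust_store_set:
        return []
    return [name for name in KEYSTORE_PARAMS if not config.get(name)]
```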

Key Trustee KMS backup script fails if PostgreSQL versions lower than 9.3 are installed

If PostgreSQL versions lower than 9.3 are installed on the Key Trustee KMS host, the ktbackup.sh script fails with an error similar to the following:

pg_dump: server version: 9.3.11; pg_dump version: 9.2.14
pg_dump: aborting because of server version mismatch

Workaround: Uninstall the lower PostgreSQL version.
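
The error above reports both versions, so a log line can be checked for this condition. The sketch below parses a message of that shape and reports whether the pg_dump client is older than the server. It assumes the two-part major version scheme used by PostgreSQL 9.x (for example 9.2 and 9.3), and the function names are hypothetical.

```python
import re

VERSION_LINE = re.compile(
    r"server version:\s*(\d+(?:\.\d+)*);\s*pg_dump version:\s*(\d+(?:\.\d+)*)"
)


def parse_versions(message):
    """Return ((server parts), (pg_dump parts)) as integer tuples, or None."""
    match = VERSION_LINE.search(message)
    if match is None:
        return None
    return tuple(
        tuple(int(part) for part in version.split("."))
        for version in match.groups()
    )


def pg_dump_too_old(message):
    """True when the pg_dump major version (first two parts) is lower
    than the server major version."""
    versions = parse_versions(message)
    if versions is None:
        return False
    server, client = versions
    return client[:2] < server[:2]
```

Applied to the first line of the error shown above, pg_dump_too_old returns True, because the pg_dump major version 9.2 is lower than the server major version 9.3.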