Fixed Issues in Cloudera Manager 7.11.3

Fixed issues in Cloudera Manager 7.11.3.

OPSAPS-65324: The default value of the Cloudera Manager redaction policy configuration CORE_SETTINGS Log and Query Redaction Policy (parameter name: redaction_policy) is modified.

The Credit Card numbers (with separator) and Social Security numbers (with separator) entries are modified by adding \b (word boundary) symbols before and after the regular expression in the Search field, to prevent unintended matches against HDFS block identifiers in the DataNode logs.

Cloudera Manager applies this change in the default value only to CDP Runtime versions 7.1.9 and higher.

OPSAPS-47937: Errors while collecting host statistics data because the collectHostStatistics command frequently times out

While collecting host statistics data, the collectHostStatistics command aborts after 150 seconds when the /var/log/messages file is too large. As a result, the host statistics data is missing from the diagnostic bundle.

This issue is fixed now by limiting the amount of data taken from the /var/log/messages file: the collectHostStatistics command now collects only the latest 300 MB of data from the file, which avoids the timeouts.

OPSAPS-68044: Certain Cloudera Runtime services (such as HDFS) might fail to start on RHEL 8.8 with FIPS mode enabled.

While configuring a Cloudera Manager cluster for installation or upgrade on RHEL 8.8 with FIPS mode enabled, certain Cloudera Runtime services such as HDFS might fail to start and throw the following error: OpenSSL internal error: FATAL FIPS SELFTEST FAILURE

This issue is fixed now.

OPSAPS-66052: Cloudera Manager is unable to execute certain operations when you enable the noexec option for the /tmp directory

When you enable the noexec option for the /tmp directory on the cluster hosts, Cloudera Manager cannot complete some operations, most notably the Add Hosts workflow and TLS certificate generation.

This issue is now resolved, and Cloudera Manager functions normally when you enable the noexec option for the /tmp directory on cluster hosts.

OPSAPS-67942: Installation failed due to schematool error
Setting the hive.hook.proto.base-directory for Hive Metastore (HMS) in hive-site.xml is causing sys.db creation to fail because of incompatibility issues between Cloudera Manager 7.11.3 and CDH 7.1.7 SP1/SP2. This patch addresses the issue and sets the above configuration only if the CDH version of Hive is at least CDH 7.1.8.
OPSAPS-67968: An issue while upgrading to Cloudera Manager 7.11.3 without upgrading Cloudera Runtime 7.1.7 SP2 or lower versions
With this fix, you can now upgrade to the Python 3 compliant Cloudera Manager 7.11.3 without upgrading the Cloudera Runtime version (Cloudera Runtime 7.1.7 SP2 and lower versions, which are not Python 3 compliant). Cloudera Manager 7.11.3 now supports Cloudera Runtime 7.1.7 SP2 and lower versions.

Cloudera Manager 7.11.3 typically supports Python 3.8. It also supports Python 3.9 when running on the RHEL 9.1 operating system.

OPSAPS-63724
By default, snapshot diff-based (incremental) HDFS to HDFS replication falls back to bootstrap (full file listing, FFL) replication when there are unexpected target-side changes. When this workaround is enabled, certain target-side changes are tolerated by incremental replication without falling back to FFL. Note that when source-side HDFS moves are expected to be synchronized, it is also recommended to activate the workaround described in OPSAPS-66197.
Activating this workaround:
  • Set "com.cloudera.enterprise.distcp.check-for-safe-to-merge-target-side-changes.enabled" to "true" in the "YARN Service Advanced Configuration Snippet (Safety Valve) for core-site.xml" on the destination side, and then restart the stale services / redeploy client configuration. (Note that enabling OPSAPS-66197 uses a different advanced configuration snippet).
  • In an incremental replication run, check the stderr log of the first "Run Pre-Filelisting Check" step and make sure that the following message appears: INFO distcp.PreCopyListingCheck: Check for safe to ignore (merge) target side changes is enabled.
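For reference, such a safety valve entry is a plain XML property block. The following is a minimal sketch of what you might paste into the destination-side YARN Service Advanced Configuration Snippet (Safety Valve) for core-site.xml; the property name is taken from this note and the surrounding layout is illustrative only:

  <property>
    <name>com.cloudera.enterprise.distcp.check-for-safe-to-merge-target-side-changes.enabled</name>
    <value>true</value>
  </property>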
Usage notes:
  • When a safe-to-ignore target change is found, "Run Pre-Filelisting Check" prints the following messages to its stderr log:
    INFO util.DistCpUtils: There are changes on target, falling back to regular distcp
    INFO distcp.PreCopyListingCheck: The changes on target are safe to ignore.
    INFO distcp.PreCopyListingCheck: Note that it is up to the downstream processing steps if it falls back to full file listing or continue with snapshot diff execution
    INFO distcp.PreCopyListingCheck: Changes to target: true
    INFO distcp.PreCopyListingCheck: Changes to target are safe to ignore: true
    
  • When target-side changes are not found to be safe-to-ignore, details about the reason appear in the messages:
    INFO distcp.PreCopyListingCheck: Changes to target: true
    INFO distcp.PreCopyListingCheck: Changes to target are safe to ignore: false
    
Allowed changes:
The following destination side changes (snapshot diff entries) are considered safe-to-ignore when this workaround is enabled:
  • Additions ( + ): only if they are empty directories or contain only directories, all of which are present on the source as directories.
  • Deletions ( - ): only if the source-side path is also missing.
  • Modifications (M): must have an immediate, allowed ( + ) or ( - ) child path.
OPSAPS-63930
By default, snapshot diff-based (incremental) HDFS to HDFS replication uses a temporary directory, created in the parent of the replication destination directory, to synchronize source-side rename and delete operations: deleted and renamed paths are first moved into this temporary directory, then the renamed paths are moved to their targets, and finally the temporary directory is deleted (thus deleting the paths scheduled for deletion). Note that OPSAPS-63759 provides an optional behavior that executes individual deletes without these moves.

This behavior of incremental replication leads to failure and fallback to bootstrap (full file listing) replication when the replication process cannot create this temporary directory (due to restrictive HDFS permissions) or when the replication destination contains one or more HDFS encryption zones (because HDFS moves cannot cross encryption zone boundaries).

This optional workaround solves these problems by executing rename operations in place when possible, and otherwise using the best possible temporary rename operations without the need for the above-mentioned common temporary directory. Note that this workaround can be considered a superset of OPSAPS-63759; when both are enabled, this one is applied.

Activating this workaround:
  • In the HDFS service core-site.xml advanced configuration snippet (on the destination side), set "com.cloudera.enterprise.distcp.direct-rename-and-delete.enabled" to "true" (see the example snippet below).
  • In an incremental replication run, check the stderr log of the last "Trigger a HDFS replication job on one of the available HDFS roles." step, and make sure that the following message is displayed: INFO distcp.DistCpSync: Will use direct rename and delete (for non cloud target) when using snapshot diff based sync. Temp directory creation on the target will be skipped.
Adjusting delete logging: By default, progress is logged after every 100000 direct delete operations executed by this workaround. This is useful for following the synchronization of large source-side deletes. The default interval can be overridden by setting the "com.cloudera.enterprise.distcp.direct-delete.log-interval" advanced configuration snippet to an integer value greater than 0. Note that this advanced configuration snippet is shared with a workaround in OPSAPS-63759.
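As an illustration only, the destination-side HDFS core-site.xml safety valve entries for this workaround might look like the following; the property names are taken from this note, the interval value is an arbitrary example, and the second property is optional:

  <property>
    <name>com.cloudera.enterprise.distcp.direct-rename-and-delete.enabled</name>
    <value>true</value>
  </property>
  <!-- Optional: log direct delete progress every 50000 operations instead of the default 100000 -->
  <property>
    <name>com.cloudera.enterprise.distcp.direct-delete.log-interval</name>
    <value>50000</value>
  </property>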
Usage notes: There can be conflicting source-side renames and rename-delete interactions whose destination-side replay needs to use temporary renames (for example, a name swap between two paths using three renames). In these cases, the temporary rename destination will typically be next to the final rename destination (sharing the same parent path), avoiding both of the above-mentioned failure scenarios. Such temporary renames are logged during execution, for example:
distcp.DistCpSync: Executing a temp rename: /test-repl-target/test-repl-source/file2 -> /test-repl-target/test-repl-source/file2748016654
After execution, the number of operations is also logged, for example:
INFO distcp.DistCpSync: Synced 0 through-tmp/cloud rename(s) and 0 through-tmp delete(s) to target.
INFO distcp.DistCpSync: Synced 2 direct delete(s) to target.
INFO distcp.DistCpSync: Synced 2 direct rename(s) to target.
INFO distcp.DistCpSync: Used 2 additional temporary rename(s) during syncing.
OPSAPS-66197
Snapshot diff-based (incremental) HDFS to HDFS replication might corrupt destination directory structure when:
  • there is a source side HDFS move/rename operation.
  • the (move/rename) target on the replication destination is an existing unexpected directory.

OPSAPS-63724 introduced an optional workaround where target-side directory creations are ignored. When a colliding source-side move is expected, it is recommended to activate both workarounds.

Workaround:
  • In the HDFS service core-site.xml advanced configuration snippet (also called safety valve) on the destination side, set "com.cloudera.enterprise.distcp.overwrite-merge-existing-rename-targets.enabled" to "true" (see the example snippet after this list). (Note that enabling the workaround in OPSAPS-63724 uses a different advanced configuration snippet.)
  • In an incremental replication run, check the stderr log of the last "Trigger a HDFS replication job on one of the available HDFS roles." step and make sure that the following message is displayed: INFO distcp.DistCpSync: Overwrite merge of already existing move targets is enabled.
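As with the other workarounds, this is a plain XML property block; a minimal sketch of the destination-side HDFS core-site.xml safety valve entry, using the property name from this note, might look like the following:

  <property>
    <name>com.cloudera.enterprise.distcp.overwrite-merge-existing-rename-targets.enabled</name>
    <value>true</value>
  </property>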
Usage notes:
  • When there is a conflicting replicated source-side move/rename operation where, on the destination side, the target exists, there is a merge attempt:
    • When the source-side moved path is a directory and the conflicting destination-side path is also a directory, their contents are merged.
    • When the conflicting destination-side path is a file, it is overwritten by the replicated move.
    • When the source-side moved path is a file, the conflicting destination-side path is overwritten by the replicated move.
  • In case of other failures, replication is expected to fall back to a bootstrap (full file listing) run.

Details of the merge activity (when there is a conflicting path) are logged in the same stderr log with messages containing INFO distcp.DistCpSync$OverwriteMergeRenameBehavior.

OPSAPS-63558
Previously, DistCp did not correctly report renames and deletes for snapshot diff-based HDFS replications. This change extends DistCp's output report to contain counters related to snapshot diff-based replications, in addition to the counters already reported. These counters are added to the following group: com.cloudera.enterprise.distcp.DistCpSyncCounter.
The following new counters are added:
  • FILES_MOVED_TO_COMMON_TEMP_DIR: Number of files and directories moved to a common temporary directory to be renamed or deleted later in the process. This counter is the sum of FILES_DELETED_VIA_COMMON_TEMP_DIR and FILES_RENAMED_VIA_COMMON_TEMP_DIR.
  • FILES_DELETED_VIA_COMMON_TEMP_DIR: Number of files moved to a common temporary directory to be deleted later.
  • FILES_RENAMED_VIA_COMMON_TEMP_DIR: Number of files moved to a common temporary directory first, then moved to their final place.
  • FILES_DIRECT_DELETED: Number of files deleted directly. This is a feature introduced in OPSAPS-63759.
  • FILES_DIRECT_RENAMED: Number of files renamed directly, without moving to an intermediate temporary directory. This is a feature introduced in OPSAPS-63930.
  • FILES_DIRECT_RENAMED_VIA_TEMP_LOCATION: Number of files moved to an intermediate temporary directory and then renamed. This intermediate temporary directory is different from the common temporary directory referenced in the FILES_RENAMED_VIA_COMMON_TEMP_DIR counter's description. This is also related to OPSAPS-63930.
The common temporary directory is a sibling of the replication target directory.
The values of FILES_DELETED_VIA_COMMON_TEMP_DIR and FILES_DIRECT_DELETED are also aggregated in the replication result as the number of files deleted.
OPSAPS-65831: DistCp job deletes files in multiple threads for bootstrap replication
Performance of bootstrap or FFL (full file listing) replication for the destination-side deletion of paths missing from the source is improved with the following optional behaviors.
  • FFL replication schedules all the missing paths for deletion regardless of parent relationships. When the com.cloudera.enterprise.distcp.parent-only-delete.enabled safety valve is set to "true", only the topmost deleted paths are scheduled for deletion and their descendants or children are not accessed (see the example snippet after this list). This behavior is optional and turned off by default, which preserves the previous behavior.
  • Delete requests can be issued from multiple threads concurrently to improve performance, and can be enabled and configured using the following safety valves:
    • com.cloudera.enterprise.distcp.parallel-ffl-delete.enabled. Default is "false".
    • com.cloudera.enterprise.distcp.parallel-ffl-delete.threads. Default is 20.
    • com.cloudera.enterprise.distcp.parallel-ffl-delete.max-queue-size. Default is 10000.
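As an illustrative sketch only, enabling both optional behaviors through core-site.xml safety valve entries might look like the following; the property names are taken from this note, and the thread and queue values simply restate the defaults listed above:

  <property>
    <name>com.cloudera.enterprise.distcp.parent-only-delete.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>com.cloudera.enterprise.distcp.parallel-ffl-delete.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>com.cloudera.enterprise.distcp.parallel-ffl-delete.threads</name>
    <value>20</value>
  </property>
  <property>
    <name>com.cloudera.enterprise.distcp.parallel-ffl-delete.max-queue-size</name>
    <value>10000</value>
  </property>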
OPSAPS-65823
Added periodic progress logging during copy listing. Optionally, performance statistics of the file system operations (min/max/avg/total time since the last log entry and since the beginning of copy listing) are also printed.

When the bootstrap (full file listing) run launches target-side copy listing (to handle deletions), the reducer log also contains the log messages of the reducer activity. An overview of this activity (reducer step status; listed path count) is also logged by the main DistCp process. Job counters containing the reducer timing measurements and the listed target-side path count are also added.

By default, the copy listing logging interval is 10 seconds; it can be adjusted by setting the com.cloudera.enterprise.distcp.copy-listing.progress-log.interval.seconds configuration parameter in the HDFS replication core-site.xml configuration.

Detailed logging is enabled by setting the com.cloudera.enterprise.distcp.copy-listing.detailed-progress-log.enabled configuration parameter to "true".

Progress logging is disabled by setting the com.cloudera.enterprise.distcp.copy-listing.basic-progress-log.enabled configuration parameter to "false".

For testing purposes, the poll interval at which the main DistCp process checks the progress of the MR job can be set with the com.cloudera.enterprise.distcp.job-poll-interval.seconds configuration parameter.
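For illustration, an HDFS replication core-site.xml configuration that shortens the logging interval and turns on the detailed statistics might look like the following; the parameter names are taken from this note and the interval value is an arbitrary example:

  <property>
    <name>com.cloudera.enterprise.distcp.copy-listing.progress-log.interval.seconds</name>
    <value>30</value>
  </property>
  <property>
    <name>com.cloudera.enterprise.distcp.copy-listing.detailed-progress-log.enabled</name>
    <value>true</value>
  </property>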

OPSAPS-65104
Importing table column statistics for Hive replication is thread-safe but causes a performance regression.

To resolve this issue, perform the following steps:

  1. Go to the Cloudera Manager > Clusters > [***Hive Service***] > Configuration tab.
  2. Locate the hive_replication_env_safety_valve property.
  3. Add only one of the following key-value pairs, depending on your requirement:
    • COLUMN_STATS_IMPORT_MULTI_THREADED=true

      This ensures that the column statistics import operation is multi-threaded for Hive replication.

    • SKIP_COLUMN_STATS_IMPORT=true

      This ensures that the column statistics import is skipped entirely.

OPSAPS-66517: Changing password from Home > username > Change Password bypasses validation

In Cloudera Manager, while changing the password for the current user from Home > username > Change Password, password validations are completely bypassed. This issue is now fixed, and the password is validated before the new password is saved.

OPSAPS-67490: Cloudera Manager unable to deploy the Hadoop User Group Mapping LDAP Bind User Password configuration completely

Fixed an issue where Cloudera Manager was unable to deploy the complete configuration from Core Configurations (CORE_SETTINGS-1) to the client configuration JCEKS file under the local /etc directory.

OPSAPS-65267

Cross-site sessions were prohibited in the latest browsers because the SameSite header was set to Lax by default. This issue is now fixed by adding SameSite=None with the Secure attribute to the session cookies that are created after login, so that cross-site secure cookies are supported.

The secure attribute works only with TLS-configured clusters. You must have a TLS-enabled cluster for cross-site sessions to work.

OPSAPS-66080: Optimize pattern.compile in CspUtils.java

While Cloudera Manager is running, compiling the regex pattern for CSP multiple times causes other threads to wait, which results in slowness of Cloudera Manager. This issue is fixed now.

OPSAPS-66198: On Cloudera Manager UI, the Install Oozie ShareLib command fails to install shared libraries for the Oozie service

On the Cloudera Manager UI, the Install Oozie ShareLib command fails to install shared libraries for the Oozie service if you configure the Kerberos krb_krb5_conf_path file path to a non-default location. This issue is fixed now.

OPSAPS-67152: Cloudera Manager does not allow you to update some configuration parameters.

Cloudera Manager does not allow you to set the dfs_access_time_precision and dfs_namenode_accesstime_precision configuration parameters to "0".

If you try to enter "0" in these configuration input fields, the field is cleared and a validation error appears: This field is required. This issue is fixed now.

OPSAPS-66023: Error message about an unsupported ciphersuite while upgrading or installing a cluster with the latest FIPS compliance

When upgrading or installing a FIPS enabled cluster, Cloudera Manager is unable to download the new CDP parcel from the Cloudera parcel archive.

Cloudera Manager displays the following error message:

HTTP ERROR 400 java.net.ConnectException: Unsupported ciphersuite TLS_EDH_RSA_WITH_3DES_EDE_CBC_SHA

This issue is fixed now by correcting the incorrect ciphersuite selection.
OPSAPS-67897, OPSAPS-68023
Ozone replication policies do not fail, and the files on the target cluster are deleted successfully, when you set the Advanced Options > Delete Policy option to 'Delete to Trash' or 'Delete Permanently' during the Ozone replication policy creation process in the CDP Private Cloud Base Replication Manager UI, or when you set the "removeMissingFiles" parameter to 'true' while creating the Ozone replication policy using Cloudera Manager REST APIs.