Fixed Issues

Review the list of Cloudera Manager issues that are resolved in Cloudera Manager 7.13.1 and its cumulative hotfixes.

Cloudera Manager 7.13.1 CHF7 (7.13.1.700)

OPSAPS-71617: Kafka related data is not collected during Diagnostic Bundle creation: 7.13.1.700; Fixed a bug where some Kafka-related data was not collected during Diagnostic Bundle creation. This fix is only available for clusters running Cloudera Runtime 7.3.1 or higher.
OPSAPS-76032: Livy fails to start due to misconfiguration of the filesystem scheme: 7.13.1.700; Cloudera Manager prepended the default filesystem URL, resulting in an invalid URL with double slashes, preventing Livy from creating the recovery directory. This issue is now fixed, and Cloudera Manager truncates double slashes in the fs.defaultFS value.
DMX-4598, DMX-4558: Transfer step fails for Iceberg replication policies: 7.13.1.700; Previously, when you replicated the Iceberg tables using Iceberg replication policies from CDP Private Cloud Base 7.1.9 to Cloudera on premises 7.3.1, the transfer step failed while finding and copying the catalog.json file. This issue is now fixed.
DMX-4599, DMX-4572: The “catalog.json” file source staging location: 7.13.1.700; Previously, the catalog.json file was in the source staging location. To use this file during the Iceberg replication policy jobs, the file location has now been moved to the source warehouse location.
DMX-4600, DMX-4556: SyncCLI step fails to parse the “catalog.json” file: 7.13.1.700; Previously, the syncCLI step during Iceberg replication policy job runs failed to parse the catalog.json file when replicating from a CDP Private Cloud Base 7.1.9 source cluster to a Cloudera on premises 7.3.1 target cluster. This issue is now fixed.
DMX-4601, DMX-4580: The “catalog.json” file is not generated based on the Apache Iceberg version: 7.13.1.700; Previously, the catalog.json file generated during the Iceberg replication policy runs defaulted to Apache Iceberg version v2. This issue is now fixed by introducing the -cv flag to specify the correct Apache Iceberg version during the generation process. The -cv flag can be set to v1 or v2 to match the Apache Iceberg version in use.
DMX-4602, DMX-4584: Iceberg replication policy fails to copy “catalog.json” file when the transfer step runs on the target cluster: 7.13.1.700; Previously, when you replicated the Iceberg tables between cloud storage buckets using Iceberg replication policies and the transfer step ran on the target cluster, the catalog.json file was not copied from the source warehouse to the target staging location. This issue is now fixed.
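
The double-slash problem described for OPSAPS-76032 above can be illustrated with a minimal Python sketch. It assumes an fs.defaultFS value of the form scheme://authority with an optional trailing slash. The function name join_fs_path and the example values are hypothetical and are not taken from Cloudera Manager; the sketch only shows why plain concatenation produces an invalid URL and one way to avoid it.

```python
def join_fs_path(default_fs: str, path: str) -> str:
    """Join a default filesystem URL and an absolute path with exactly one '/' between them."""
    # Assumption: default_fs looks like 'hdfs://ns1' or 'hdfs://ns1/'.
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


default_fs = "hdfs://ns1/"            # hypothetical fs.defaultFS value
recovery_dir = "/user/livy/recovery"  # hypothetical Livy recovery directory

naive = default_fs + recovery_dir     # plain concatenation keeps both slashes
fixed = join_fs_path(default_fs, recovery_dir)
```

Here naive contains a double slash after the authority part, while fixed contains a single slash.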

Cloudera Manager 7.13.1 CHF6 (7.13.1.600)

OPSAPS-75123: The KdcLoginMonitor failed to clean up temporary Kerberos Ticket Granting Tickets (TGTs) after use, and the system lacked control over the automatic Key Version Number (KVNO) validation process

7.13.1.600

In Cloudera Manager 7.13.1 CHF5 (and lower versions), the fetch_kvno.sh script (executed every 10 minutes by the KdcLoginMonitor) created Kerberos TGTs using kinit but failed to clean them up.

This occurred because the script ended with exec kvno, which replaced the script process and prevented the necessary cleanup steps from running. This lack of cleanup caused several issues:

Persistence of TGTs: Created persistent, growing Kerberos ticket cache files (for example, /tmp/krb5cc_*) for each service user.
Interference: Caused possible interference with other Kerberos commands.
Raised customer concerns about Cloudera Manager automatically creating TGTs without authorization or cleanup.

This issue is fixed now and the TGTs are isolated and destroyed automatically.

However, if you wish to disable the automatic KVNO validation feature entirely, you can set the following environment variable:

export CMF_FF_KERBEROS_KVNO_VALIDATION="false"

The fix addresses the issue in three ways: isolation, cleanup, and control:

Isolation: TGTs now go to a separate, dedicated directory: /tmp/cm-check-kvno/.
Cleanup: The script was updated to run kdestroy, ensuring TGTs are removed immediately after use.
Control: A new feature flag was added to enable/disable automatic KVNO validation.
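
The isolation and cleanup ideas above can be sketched in Python as follows. This is not the fetch_kvno.sh script. It is a minimal illustration that assumes the standard MIT Kerberos kinit, kvno, and kdestroy commands and the KRB5CCNAME environment variable. The function name check_kvno, its parameters, and the temporary directory prefix are hypothetical.

```python
import os
import shutil
import subprocess
import tempfile


def check_kvno(principal, keytab, run=subprocess.run, base_dir=None):
    """Run kinit and kvno against an isolated ticket cache, then always clean up."""
    # Isolation: a dedicated directory instead of the shared /tmp/krb5cc_* files.
    cache_dir = tempfile.mkdtemp(prefix="kvno-check-", dir=base_dir)
    env = dict(os.environ, KRB5CCNAME="FILE:" + os.path.join(cache_dir, "cc"))
    try:
        run(["kinit", "-k", "-t", keytab, principal], env=env, check=True)
        return run(["kvno", principal], env=env, check=True)
    finally:
        # Cleanup: control returns here even if kinit or kvno fails, which is
        # not the case when a shell script hands over its process with exec.
        run(["kdestroy"], env=env, check=False)
        shutil.rmtree(cache_dir, ignore_errors=True)
```

The run parameter exists only so that the command runner can be replaced in tests; in normal use the default subprocess.run is called.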

You can encounter this problem if:

You are running on Cloudera Manager 7.13.1 CHF5 (and lower versions).
Kerberos is enabled in your cluster.
The KVNO validation script is running every 10 minutes, resulting in growing Kerberos ticket cache files under /tmp.

OPSAPS-74673: The YARN Poller fails with a NullPointerException if the logAggregationStatus is null.

7.13.1.600

This issue is now resolved by defaulting a null logAggregationStatus to DISABLED. This change prevents log export for non-compliant applications while maintaining metadata processing functionality.
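
The null-defaulting approach can be sketched as below. The dictionary layout and the function name are hypothetical and only illustrate the idea of replacing a missing logAggregationStatus with DISABLED before further processing; the actual YARN Poller code is not shown in this document.

```python
def effective_log_aggregation_status(app: dict) -> str:
    """Return the application's log aggregation status, treating a missing value as DISABLED."""
    status = app.get("logAggregationStatus")
    return "DISABLED" if status is None else status


# Hypothetical application records as a poller might receive them.
apps = [
    {"id": "application_1", "logAggregationStatus": "SUCCEEDED"},
    {"id": "application_2", "logAggregationStatus": None},
    {"id": "application_3"},
]
statuses = [effective_log_aggregation_status(app) for app in apps]
```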

OPSAPS-74669: File uploads to Databus using presigned S3 URLs fail with a java.net.SocketException: Broken pipe (Write failed) error. This issue occurs when the file size is greater than 5 GB, which is the limit for an Amazon S3 single-part upload. S3 rejects the request and closes the socket while Telemetry Publisher (TP) is still trying to write the file to the closed socket.

7.13.1.600

This issue is now resolved by enabling the expect-continue handshake for S3 uploads. This update prevents broken pipe errors and retry loops if a request is rejected.

OPSAPS-74460: Previously, Spark extractions did not fetch YARN application metadata. The Spark jobs could not fetch accurate queue information and did not produce an auxiliary-files/YARN/appInfo.json file in the extraction output.

7.13.1.600

This issue is now resolved.

OPSAPS-75291: Service Monitor (SMON) cannot run Filesystem tasks if Isilon is used.

7.13.1.600

The Service Monitor (SMON) failed to perform filesystem tasks, specifically failing to collect the Yarn Container Usage Metric Collection, when using PowerScale (Isilon) DFS.

This failure occurred because, for PowerScale, the configuration property hadoop.security.token.service.use_ip is typically set to false (recommended for PowerScale/Isilon).

A user would see this problem only when they enable the Yarn Container Usage Metric Collection metric. When enabled, SMON failed to process the request, and the respective error was logged in the SMON logs.

This issue is fixed now. Service Monitor now uses the hadoop.security.token.service.use_ip configuration if it is present (defaulting to true otherwise) when it performs the filesystem tasks.

OPSAPS-74288: Alert publisher cannot send email alerts due to missing JAR

7.13.1.600

Alert Publisher could not send email alerts because the camel-attachments-3.14.9.jar file was missing from Cloudera Manager.

This issue is fixed now.

OPSAPS-71581: Cloudera Manager Agent's append_properties function fails with the realpath: invalid option -- 'u' error when executed from service control scripts.

7.13.1.600

Errors appear on the standard error (stderr) log of Cloudera Data Platform (CDP) services when you are attempting to trigger the cloudera-config.sh script. The error log contains the following message: realpath: invalid option -- 'u'. This is caused by an incorrectly placed command-line flag in the script, which prevents some service configurations from loading correctly.

This issue is fixed now.

OPSAPS-74668: ozone.snapshot.deep.cleaning.enabled and ozone.snapshot.ordered.deletion.enabled configurations are missing with Cloudera 7.1.9 SP1 CHF and Cloudera Manager 7.13.1

7.13.1.600

Previously, two Ozone Manager configurations ozone.snapshot.deep.cleaning.enabled and ozone.snapshot.ordered.deletion.enabled were missing while using Cloudera 7.1.9 SP1 CHF after upgrading Cloudera Manager version from 7.11.3 to 7.13.1.400. This issue is fixed now.

OPSAPS-75290, OPSAPS-74994: The yarn_enable_container_usage_aggregation job is failing with “Null real user” error on Service Monitor.

7.13.1.600

The yarn_enable_container_usage_aggregation job failed with the "Null real user" error on Service Monitor when the YARN service was running on a compute cluster with Stub DFS, and when the PowerScale service was running in the cluster with the PowerScale DFS provider instead of HDFS.

To mitigate this error, Cloudera introduced the “DFS User to Impersonate (template name: dfs_user_to_impersonate)” configuration.

You must set the “DFS User to Impersonate” configuration to “hdfs” (recommended) or the respective File System user to resolve the impersonation user issue in Service Monitor.

OPSAPS-75100: Fixing Spark upgrade

7.13.1.600

Fixed the Spark log directory upgrade handler that used an incorrect URI scheme in Spark settings.

CDPD-87548, OPSAPS-74346, OPSAPS-74316: Enhancing Spark security

7.13.1.600

The spark.yarn.historyServer.address value is now generated with the HTTPS address when SSL/TLS is enabled. A new configuration property, spark3.network.crypto.enabled, is now available to enable AES-based encryption.

OPSAPS-74479: Allow override of the Cloudera Manager-supplied PYTHONPATH in Livy

7.13.1.600

Added an Override PYTHONPATH field in the Cloudera Manager configuration for Livy. Users can set any PYTHONPATH value (including an empty string) to retain compatibility with older Python versions.

OPSAPS-74139, OPSAPS-74184: Base cluster installation fails at Hive service startup

7.13.1.600

Cloudera 7.3.1 base cluster installation failed during the Hive Metastore setup when the underlying database contained an older schema. The installation process incorrectly tried to initialize (-initSchema) an existing schema instead of upgrading it, resulting in a schema validation failure.

This issue is now resolved by changing the Hive Metastore command to use -initOrUpgradeSchema. This ensures that the installation can correctly upgrade an existing schema, preventing validation failures on first run.

OPSAPS-74370: Knox's Save Alias - IDBroker command fails due to missing variable declaration

7.13.1.600

Users encountered issues when creating IDBroker aliases through the Cloudera Manager UI in Cloudera Manager 7.13.1 while using CDP 7.1.9. This issue is fixed now.

OPSAPS-72673, OPSAPS-75346: Allow users to disable fallback to system truststore for AM proxy

7.13.1.600

There was an issue when passing the truststore location and password to the Resource Manager (RM) from the YARN configurations page. This configuration issue is now resolved and RM now properly uses the values set for the ssl.client.truststore.location and ssl.client.truststore.password configurations.

OPSAPS-74900: Issue with files or directories containing the % character

7.13.1.600

Previously, on RHEL 9.5, files or directories containing the % character might fail to open or copy due to Apache HTTPD version 2.4.62. This issue is now resolved by ensuring proper handling of special characters in file and directory names. To prevent similar issues, avoid the use of double-encoded special characters in the Apache Load Balancer configuration.
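
To illustrate what double encoding means for a name that contains the % character, the following Python sketch uses the standard urllib.parse module. The file name is a hypothetical example. The sketch shows generic URL percent-encoding behavior, not the Apache HTTPD or Cloudera Manager code paths.

```python
from urllib.parse import quote, unquote

name = "100%_done.txt"               # hypothetical file name containing '%'

encoded_once = quote(name)           # '%' becomes '%25'
encoded_twice = quote(encoded_once)  # the '%' of '%25' is encoded again

# A single decode of the double-encoded form does not give back the original name.
decoded = unquote(encoded_twice)
```

A component that decodes once receives 100%25_done.txt instead of 100%_done.txt when the name has been encoded twice, which is why double-encoded values in a proxy or load balancer configuration can lead to lookups for names that do not exist.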

OPSAPS-70403: Custom Kerberos configuration not passed to gen_tgt.sh

7.13.1.600

Previously, the custom Kerberos configuration file location specified by the krb_krb5_conf_path parameter was not passed to security/gen_tgt.sh from SecurityUtils::generateTgt. As a result, security/gen_tgt.sh defaulted to /etc/krb5.conf, even when a custom path was configured.

This issue is now resolved. security/gen_tgt.sh correctly uses the custom Kerberos configuration file specified by the krb_krb5_conf_path parameter, which means that if a custom path is configured then security/gen_tgt.sh will use the custom /[***custom_path***]/etc/krb5.conf file. If no custom path is set, security/gen_tgt.sh defaults to using the /etc/krb5.conf file.
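
The fallback behavior can be sketched as follows. This is an illustration only: it assumes that the configured value is the full path to the krb5.conf file and that the path is handed to Kerberos tools through the standard KRB5_CONFIG environment variable. The function name kerberos_env and the example path are hypothetical.

```python
import os

DEFAULT_KRB5_CONF = "/etc/krb5.conf"


def kerberos_env(krb_krb5_conf_path=None, base_env=None):
    """Build an environment for a Kerberos command, preferring a custom krb5.conf path."""
    env = dict(os.environ if base_env is None else base_env)
    # Use the custom path when one is configured, otherwise fall back to the default.
    env["KRB5_CONFIG"] = krb_krb5_conf_path or DEFAULT_KRB5_CONF
    return env


custom = kerberos_env("/opt/custom/etc/krb5.conf", base_env={})
fallback = kerberos_env(None, base_env={})
```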

OPSAPS-71544, OPSAPS-75166, OPSAPS-75182: Ranger replication policies failed for custom username

7.13.1.600

Previously, when you used a custom username or Kerberos principal in the Ranger replication policy, the policy failed during the transformation step if the custom Ranger process user was set in Cloudera Manager. This issue is now fixed.

OPSAPS-72453, OPSAPS-74753: Admin role for machine user was not verified for incremental Ozone replication policies

7.13.1.600

To run incremental Ozone replication policies, the machine user of the Cloudera Manager peer must have the admin role. Previously, replication would fail with an exception, if you added a peer without selecting the Create User With Admin Role option in the Cloudera Manager > Replication > Peers > Add Peer modal window. This issue is resolved for any new replication requests.

For any existing replication where the source peer was created without selecting the Create User With Admin Role option, the exception is now handled with a message stating that the Create User With Admin Role option must be selected for the peer.

OPSAPS-74314, OPSAPS-74636: HBase snapshot export always runs with the default client configuration

7.13.1.600

Previously, when multiple HBase services existed in a cluster, the HBase export process used the default client configuration. This issue is now resolved because the export process prioritizes the correct HBase replication client configurations based on the set CLASSPATH value in the snapshot-hbase.sh file.

OPSAPS-75136, OPSAPS-75187, OPSAPS-75245, OPSAPS-75449: Kerberos ticket validation fails during HDFS replication

7.13.1.600

Previously, Kerberos ticket validation failed during the HDFS replication policy run. This issue is now fixed because Kerberos ticket validation now checks the current cached tickets by utilizing the Kerby Credential Cache. This improvement also prevents a round-trip authentication request to the Key Distribution Center (KDC).

OPSAPS-74748, OPSAPS-74666: Snapshot replication policy, on-demand snapshots, and HBase snapshot export do not work on JDK17

7.13.1.600

Previously, snapshot replication policy, on-demand snapshots, and the HBase snapshot export process did not support JDK17. This issue is now resolved.

OPSAPS-74691, OPSAPS-74853: Hive external replication fails with UnrecognizedPropertyException

7.13.1.600

Previously, the Hive external replication failed due to the UnrecognizedPropertyException error. This issue appeared for the older version source clusters that did not recognize the atlasClientAdvanceConfigs field. This issue is now fixed.

OPSAPS-75214: Hive3 replication remote command failed during Hive ACID replication policy run

7.13.1.600

Previously, the Hive3 replication remote command failed during the Hive ACID replication policy run. This issue is now fixed.

OPSAPS-73217, OPSAPS-75303, OPSAPS-75444: Snapshot retention after incremental Ozone replication dry run

7.13.1.600

Previously, the dry run process for the incremental Ozone replication policy did not delete the snapshot it created after the replication process was complete. This issue is now fixed. For information about this issue, see the corresponding Knowledge article: Technical Service Bulletin 2025-835: Dry run of incremental Ozone replication can cause failure to replicate some changes in Cloudera Replication Manager.

Cloudera Manager 7.13.1 CHF5 (7.13.1.500)

DMX-3659, DMX-3670: Updated row is replicated as inserted row

Previously, during an Iceberg replication policy run, an updated row on the source cluster was replicated as an inserted row. As a result, the number of rows for the table in the source and target clusters did not match after the replication job completed. This issue is now resolved.

DMX-3637: Incorrect report of Iceberg replication

Previously, the number of files transformed during the Iceberg replication policy run was incorrect when multiple tables existed in the Iceberg replication policy. This issue is now resolved.

DMX-3627: Iceberg replication policies failed with exception

Previously, some Iceberg replication policies failed with the java.lang.ClassNotFoundException: com.cloudera.repl.iceberg.PathUtil exception. This issue is now resolved.

CDPD-91410: Upgraded external_versions jetty to 9.4.58 for Replication Manager

Upgraded the external_versions jetty to 9.4.58 to fix CVE-2025-5115 for Replication Manager.

OPSAPS-74212: Atlas rolling upgrade related to Zero Downtime Upgrade (ZDU) fails from 7.1.7 SP3 to 7.3.1.500

The issue causing ZDU failures during upgrades from Cloudera Runtime 7.1.7 SP3 to 7.3.1.500 has been resolved. Previously, the Atlas rolling upgrade was failing because the RoleState for Atlas was not checked, and the upgradeCommand was not set correctly. The updated logic now includes a check for the RoleState of Atlas. Based on this state, the appropriate upgradeCommand is determined and set, ensuring a smooth and reliable rolling upgrade path in ZDU-enabled scenarios.

OPSAPS-74091: Migrate Navigator Encrypt keys to Ranger KMS from KTS configured with HSM

Exporting Navigator Encrypt keys from KTS to Ranger KMS is already available. However, if HSM is configured with KTS, the export does not work because the key content does not contain the actual key material; the key material must be fetched from HSM first.

A condition has been added that checks for an HSM setup and logs a warning stating that Navigator Encrypt keys with HSM cannot be migrated, along with a link to the documented migration steps.

OPSAPS-74341: NodeManagers might fail to start during the cluster restart after the Cloudera Manager 7.13.1.x upgrade

Cgroup v2 support is enabled in CDP 7.1.9 SP1 CHF5 and higher versions. However, if the user upgrades from Cloudera Manager 7.11.3.x to Cloudera Manager 7.13.1.x, and the environment is using cgroup v2, the NodeManagers might fail to start during the cluster restart after the Cloudera Manager 7.13.1.x upgrade.

This issue is fixed now.

OPSAPS-73038: False-positive port conflict error message displayed in Cloudera Manager

This issue is fixed now. A new health port has been added as a configuration to the Knox configuration. The health topology port can be set with topology port mapping. By setting the new configuration, the checkDeployment script will use the new health port.

OPSAPS-73498: Backport Cloudera Manager side Ranger-Trino integration changes

Trino plugin support in Ranger has been added.

OPSAPS-74276: RocksDB JNI library is loaded from the same place to multiple Ozone components

By default, each Ozone role now defines a separate directory from which to load the RocksDB shared library, and each role cleans up its directory independently of the other roles on the same host, unless the environment already defines the ROCKSDB_SHAREDLIB_DIR variable through a safety valve as suggested in the workaround for OPSAPS-67650. After this change, that workaround becomes obsolete. The new directories reside within the directories that the Cloudera Manager agent uses to manage the Ozone-related processes.

OPSAPS-73628: Impala query profile export to Telemetry Publisher failed due to a 5MB string length limit introduced in Jackson 2.15.0.

The Jackson string length limit was increased to allow exporting large Impala query profiles. Specifically, maxStringLength was set to Integer.MAX_VALUE using StreamReadConstraints, resolving the export failure.

OPSAPS-73709: When Ranger is enabled, Telemetry Publisher fails to export Hive payloads from Data Hub due to missing Ranger client dependencies in the Telemetry Publisher classpath.

This issue has been resolved by adding the necessary dependencies to the classpath.

OPSAPS-73912: In Cloudera Manager 7.13.1, the proxy settings for Telemetry Publisher (TP) did not function as expected, which could affect Telemetry Publisher's ability to communicate through a configured proxy.

This issue is resolved by updating the cdp-sdk-java JAR to a version that supports proxy settings.

OPSAPS-74392: When creating a compressed archive of an input directory, an open input stream was not closed before a file was deleted. This could lead to filesystem errors, such as the creation of .nfs files.

The issue is now resolved by ensuring the input stream for each file is closed when adding it to the archive.
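
The close-before-delete ordering can be sketched in Python with the standard tarfile module. This is a generic illustration, not the Cloudera Manager code; the function name archive_and_delete is hypothetical, and the sketch assumes the archive is written outside the input directory.

```python
import os
import tarfile


def archive_and_delete(input_dir: str, archive_path: str) -> None:
    """Add each regular file in input_dir to a gzip-compressed tar archive, then delete it."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(input_dir)):
            path = os.path.join(input_dir, name)
            if not os.path.isfile(path):
                continue
            with open(path, "rb") as stream:
                info = tar.gettarinfo(arcname=name, fileobj=stream)
                tar.addfile(info, stream)
            # The with block has closed the stream, so no open handle remains
            # when the file is removed. Deleting a file that is still open on
            # NFS is what leaves .nfs placeholder files behind.
            os.remove(path)
```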

OPSAPS-73188: Stable Hive queries in a JDK17 environment

Hive queries failed in JDK17 environments because of an InaccessibleObjectException. This was caused by outdated Java options that were not compatible with the new JDK version.

This issue is now resolved by an upgrade handler for Tez that automatically adds the required JDK17 configuration flags. This change automates the manual workaround, ensuring that Hive queries run successfully in JDK17 environments.

OPSAPS-73359: Snappy native library loading failure

Snappy native library loading failed in certain cluster configurations. This occurred because Snappy attempted to locate its .so files in /var/lib/hive.

This issue is now fixed.

OPSAPS-72446, OPSAPS-71565, OPSAPS-71566, OPSAPS-73405, OPSAPS-72860: Replication policy runs when the source or target cluster becomes available after it recovers from temporary node failures

Hive replication policies and HBase replication policies can now recover from a temporary node failure on the source or target clusters to continue the replication policy job run. Alternatively, you can also rerun the failed or aborted policies manually.

To ensure that the RemoteCmdWork daemon continues to poll even in case of network failures or if Cloudera Manager goes down, you can set the remote_cmd_network_failure_max_poll_count=[*** ENTER REMOTE EXECUTOR MAX POLL COUNT***] parameter on the target Cloudera Manager > Administration > Settings page. The actual timeout is a step (piecewise constant) function of this value, with the following breakpoints: 1 through 11 is 5 seconds, 12 through 17 is 1 minute, 18 through 35 is 2 minutes, 36 through 53 is 5 minutes, 54 through 74 is 8 minutes, 75 through 104 is 15 minutes, and so on. Therefore, when you enter 1, the polling continues for 5 seconds after Cloudera Manager goes down or after a network failure. Similarly, when you set it to 75, the polling continues for 15 minutes.
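
The breakpoints listed above can be written as a small lookup, shown here as a Python sketch. Only the ranges stated in this section are encoded; values above 104 are covered by "and so on" in the text and are deliberately left undefined. The function name poll_timeout_seconds is hypothetical.

```python
import bisect

# (upper bound of the poll-count range, timeout in seconds), as listed in the text.
_BREAKPOINTS = [(11, 5), (17, 60), (35, 120), (53, 300), (74, 480), (104, 900)]
_UPPER_BOUNDS = [upper for upper, _ in _BREAKPOINTS]


def poll_timeout_seconds(max_poll_count: int):
    """Map a remote_cmd_network_failure_max_poll_count value to its documented timeout."""
    if max_poll_count < 1:
        raise ValueError("max_poll_count must be at least 1")
    index = bisect.bisect_left(_UPPER_BOUNDS, max_poll_count)
    if index == len(_BREAKPOINTS):
        return None  # beyond the ranges documented in this section
    return _BREAKPOINTS[index][1]
```

For example, poll_timeout_seconds(1) gives 5 and poll_timeout_seconds(75) gives 900, matching the 5-second and 15-minute cases described above.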

To have Replication Manager attempt to recover the RemoteCmdWork daemon on the target cluster, set the retry value in the target Cloudera Manager > Administration > Settings > remote_cmd_max_recovery_count parameter, or set it to 0 to turn off the feature. By default, Replication Manager attempts to recover the command twice after the target cluster goes down temporarily.

This issue is now fixed.

OPSAPS-71459: Commands continue to run after Cloudera Manager restart

Previously, some remote replication commands continued to run endlessly even after a Cloudera Manager restart operation. This issue is now fixed.

OPSAPS-72439, OPSAPS-74278, OPSAPS-74265: HDFS and Hive external tables replication policies failed when using custom krb5.conf files

The issue appeared because the custom krb5.conf file was not propagated to the required files. This issue can now be fixed by using a custom Kerberos path before running the replication policies as described in step 13 in Using a custom Kerberos configuration path.

OPSAPS-73645, OPSAPS-73847: Ozone bucket browser does not show the volume buckets

Previously, when you clicked on Next Page on the Cloudera Manager > Clusters > Ozone > Bucket Browser page, and then on a volume name, the volume buckets did not appear if the number of volumes exceeded 26. This issue is now fixed.

OPSAPS-73780: Existing Iceberg replication policies fail after Cloudera Manager is upgraded

When you upgrade the source and target clusters to Cloudera Manager 7.13.1.x but continue to use CDP Private Cloud Base 7.1.9.x version, existing Iceberg replication policies fail because of compatibility issues. After you upgrade to Cloudera Manager 7.13.1.x, you must also upgrade to Cloudera on premises 7.3.1.x to avoid compatibility issues.

Always ensure that the Cloudera on premises and Cloudera Manager versions are compatible before you run replication policies.

OPSAPS-73906, OPSAPS-73737, OPSAPS-73655, OPSAPS-74061: Cloud replication no longer fails after the delegation token is issued

Previously, the replication policies were failing during incremental replication job runs if you chose the Advanced Setting > Delete Policy > Delete permanently option during the replication policy creation process.

You can now configure com.cloudera.enterprise.distcp.skip-delegation-token-on-cloud-replication to false in the Cloudera Manager > Clusters > HDFS service > Configuration > HDFS Replication Advanced Configuration Snippet (Safety Valve) for core-site.xml advanced configuration snippet to ensure that the HDFS and Hive external table replication policies replicating from an on-premises cluster to cloud do not fail.

When the advanced configuration snippet is set to false, the MapReduce client process obtains the delegation tokens explicitly before it submits the MapReduce job for the replication policy. By default, the advanced configuration snippet is set to true.

OPSAPS-74026: Error appears when you disable Iceberg replication policies

Previously, the Invalid Peer Exception error appeared when you disabled or enabled an Iceberg replication policy on the Cloudera Manager > Replication > Replication Policies page. This issue is now resolved.

OPSAPS-74040, OPSAPS-74058: Ozone OBS replication fails due to pre-filelisting check failure

Previously, during OBS-to-OBS Ozone replication, if the source bucket was a linked bucket, the replication failed during the Run Pre-Filelisting Check step with the error message "Source bucket is a linked bucket, however the bucket it points to is also a link", even when the source bucket linked directly to a regular, non-linked bucket. This issue is now fixed.

The Ozone OBS-to-OBS replication no longer fails when the source or the target bucket is a linked bucket because the linked bucket resides in the s3v volume and refers to another bucket in s3v or any other volume.

OPSAPS-73158, OPSAPS-73903, OPSAPS-74206: HDFS replication policies fail when the policies pre-fetch the expired Kerberos ticket from the 'sourceTicketCache' file

Previously, Replication Manager pre-fetched the Kerberos ticket from the sourceTicketCache file for the replication policies. Issues appeared when the file contained an expired Kerberos ticket.

To ensure that the replication policies do not pre-fetch the expired Kerberos ticket from the sourceTicketCache file before the replication policy run, Replication Manager now determines whether the sourceTicketCache is current and valid. If it is not valid, Replication Manager continues to fetch tickets until a valid ticket is identified.
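
The validate-then-fetch logic can be sketched as follows. The ticket representation (a dictionary with an end_time in seconds since the epoch), the fetch_ticket callback, and the max_attempts bound are assumptions made for this illustration; the text above does not state how Replication Manager represents tickets or whether it limits the number of attempts.

```python
import time


def is_valid(ticket, now):
    """A ticket is usable if it exists and has not yet expired."""
    return ticket is not None and ticket["end_time"] > now


def get_valid_ticket(cached_ticket, fetch_ticket, now=None, max_attempts=3):
    """Reuse the cached ticket when it is still valid, otherwise fetch until one is."""
    now = time.time() if now is None else now
    if is_valid(cached_ticket, now):
        return cached_ticket
    for _ in range(max_attempts):
        ticket = fetch_ticket()
        if is_valid(ticket, now):
            return ticket
    raise RuntimeError("no valid Kerberos ticket obtained")
```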

If you do not want Replication Manager to perform this step and always fetch a new ticket, you can add USE_SOURCE_PREFETCHED_KERBEROS_PRINCIPAL = false in the Cloudera Manager > Clusters > HDFS service > Configuration > HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) advanced configuration snippet.

OPSAPS-73602, OPSAPS-74353: HDFS replication policies to cloud failed with HTTP 400 error

Previously, the HDFS replication policies to cloud were failing after you edited the replication policies in the Cloudera Manager > Replication Manager UI. This issue is now fixed.

OPSAPS-74950: Ozone replication policies failed for Cloudera Private Cloud Base 7.1.9 SP1 CHF11 clusters using Cloudera Manager 7.13.1.400

Previously, Ozone replication policies for Ozone linked buckets failed when the Cloudera Private Cloud Base 7.1.9 SP1 CHF11 source or target clusters used Cloudera Manager 7.13.1.400.

To mitigate this issue, use Cloudera Private Cloud Base 7.1.9 SP1 CHF11 source or target clusters with Cloudera Manager 7.13.1.500.

OPSAPS-73572: Add HBase client TLS parameters to the HBase server and Gateway roles

New client-side TLS configuration parameters are added to Cloudera Manager for both the HBase server and Gateway roles. You can now control the TLS client-side configuration settings directly in Cloudera Manager.

Cloudera Manager 7.13.1 CHF4 (7.13.1.400)

OPSAPS-73370: Enhance code to merge compressed Spark event log files

Fixes an issue with unreported metrics in Cloudera Observability when the spark.eventLog.compress property was set to true.

The spark.eventLog.compress property is set to false by default, but enabling it no longer causes encoded event logs to fail when they are processed in Cloudera Observability.

OPSAPS-60642: Host header injection issue on /j_spring_security_check internal endpoint

The /j_spring_security_check internal endpoint is vulnerable to Host header injection. This issue occurs if the user disables the PREVENT_HOST_HEADER_INJECTION feature flag.

Host header injection: In an incoming HTTP request, web servers often dispatch the request to the target virtual host based on the value supplied in the Host header. Without proper validation of the header value, the attacker can supply invalid input to cause the web server to:

Dispatch requests to the first virtual host on the list
Redirect to an attacker-controlled domain
Perform web cache poisoning
Manipulate password reset functionality

This issue is now resolved by adding the PREVENT_HOST_HEADER_INJECTION feature flag, which prevents the Host header injection vulnerability on the /j_spring_security_check internal endpoint. The feature flag is enabled by default, and it enables additional logic to block potential Host header injection attacks targeting the /j_spring_security_check endpoint in Cloudera Manager.
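
As a general illustration of Host header validation, the following Python sketch checks an incoming Host value against an allow-list before a request is processed. It shows the generic technique only; it is not the logic that Cloudera Manager applies when PREVENT_HOST_HEADER_INJECTION is enabled, and the function name and host names are hypothetical.

```python
def is_allowed_host(host_header, allowed_hosts):
    """Return True only if the Host header names one of the expected hosts."""
    if not host_header:
        return False
    host = host_header.strip().lower()
    if host.startswith("["):
        # Bracketed IPv6 literal, optionally followed by :port.
        end = host.find("]")
        if end == -1:
            return False
        hostname = host[: end + 1]
    else:
        # Drop an optional :port suffix.
        hostname = host.split(":", 1)[0]
    return hostname in {allowed.lower() for allowed in allowed_hosts}


allowed = ["cm.example.com"]  # hypothetical expected host name
```

A request whose Host header fails this check would be rejected instead of being dispatched or used to build redirect URLs.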

OPSAPS-74019/OPSAPS-72739: Query execution stability with temporary directories

Queries previously failed with an execution error when using a compression library. Although /tmp is a default temporary folder, its use for script execution was blocked due to security restrictions, causing queries to fail.

This issue was resolved by configuring Hive to use a different default temporary folder, /var/lib/hive, instead of /tmp.

OPSAPS-74141: Hive service setup on reused databases

During 7.3.1 base cluster installations, the Hive service setup failed when attempting to validate the Hive Metastore Schema. This happened specifically when the new cluster used a database that had been previously used by an older installation, causing the schema validation to fail due to a version mismatch with the newer Hive components.

This issue was resolved by modifying the Hive script in Cloudera Manager to use the -initOrUpgradeSchema argument instead of -initSchema for the hive_metastore_create_tables command. This change allows the Hive Metastore schema to be properly initialized or upgraded even when connecting to a database that was previously used by an older installation.

OPSAPS-73011: Wrong parameter in the /etc/default/cloudera-scm-server file

When Cloudera Manager is installed in High Availability mode (two or more nodes), the CMF_SERVER_ARGS parameter in the /etc/default/cloudera-scm-server file was missing the export keyword (the file contained only CMF_SERVER_ARGS= instead of export CMF_SERVER_ARGS=), so the parameter could not be used correctly.

This issue is fixed now.
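
The effect of the missing export keyword in OPSAPS-73011 can be demonstrated with a short Python sketch that runs a POSIX shell. It assumes that the file is sourced by a shell which then starts a child process, and that an sh binary is available on the PATH. The value --demo and the helper name value_seen_by_child are hypothetical.

```python
import os
import subprocess

# Child command that prints the variable as a child process sees it.
CHILD = """sh -c 'printf "%s" "${CMF_SERVER_ARGS:-unset}"'"""


def value_seen_by_child(assignment_line: str) -> str:
    """Run the assignment in a shell, then report what a child process inherits."""
    script = assignment_line + "\n" + CHILD
    env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
    result = subprocess.run(
        ["sh", "-c", script], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


without_export = value_seen_by_child('CMF_SERVER_ARGS="--demo"')
with_export = value_seen_by_child('export CMF_SERVER_ARGS="--demo"')
```

Without export, the variable stays local to the shell that reads the file and the child process does not see it; with export, the child process inherits the value.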

OPSAPS-72756: The runOzoneCommand API endpoint fails during the Ozone replication policy run

The /clusters/{clusterName}/runOzoneCommand Cloudera Manager API endpoint fails when the API is called with the getOzoneBucketInfo command. In this scenario, the Ozone replication policy runs also fail if the following conditions are true:

The source Cloudera Manager version is 7.11.3 CHF11 or 7.11.3 CHF12.
The target Cloudera Manager is version 7.11.3 through 7.11.3 CHF10 or 7.13.0.0 or later where the feature flag API_OZONE_REPLICATION_USING_PROXY_USER is disabled.

This issue is fixed now.

OPSAPS-72710: Marking the snapshots created by incremental replication policies differently

In the Ozone bucket browser, the snapshots created by an Ozone replication are marked. When the snapshots are deleted, a confirmation modal window appears before the deletion. The restore bucket modal window now displays information about how the restore operation is implemented in Ozone and how this operation affects Ozone replications.

OPSAPS-72447, CDPD-76705: Ozone incremental replication fails to copy renamed directory

Ozone incremental replication using Ozone replication policies succeeded but could fail to sync nested renames for FSO buckets.

When a directory and its contents were renamed between replication runs, the outer-level rename was synced, but the contents under the previous name were not.

This issue is fixed now.

OPSAPS-71046: The jstack logs collected on Cloudera Manager 7.11.3 are not in the right format

When viewing the jstack logs in the user cluster, the jstack logs for Ozone and other services on Cloudera Manager 7.11.3 and CDP Private Cloud Base 7.1.9 were not in the right format. This issue is fixed now.

OPSAPS-65377: Cloudera Manager - Host Inspector not finding Psycopg2 on Ubuntu 20 or Redhat 8.x when Psycopg2 version 2.9.3 is installed.

Host Inspector fails with Psycopg2 version error while upgrading to Cloudera Manager 7.13.1.x versions. When you run the Host Inspector, you get an error Not finding Psycopg2, even though it is installed on all hosts. This issue is fixed now.

OPSAPS-70226: Atlas uses the Solr configuration directory available in ATLAS_PROCESS/conf/solr instead of the Cloudera Manager provided directory

The solrconfig.xml and schema.xml files are updated for the existing atlas_configs directory to ensure backward compatibility for existing customers. This included downloading the current Atlas configuration from Solr, applying the necessary updates, and then overwriting the modified configuration back to Solr. This issue is fixed now and Atlas uses the correct configuration directory in /var/run/cloudera-scm-agent/process/151-atlas-ATLAS_SERVER/solrconf.xml. New clusters already use the updated configuration directory.

OPSAPS-74147: Atlas rolling upgrade related to Zero Downtime Upgrade (ZDU) fails from 7.1.7 SP3 to 7.3.1.400

The issue causing ZDU failures during upgrades from Cloudera Runtime 7.1.7 SP3 to 7.3.1.400 has been resolved. Previously, the Atlas rolling upgrade was failing because the RoleState for Atlas was not checked, and the upgradeCommand was not set correctly.

OPSAPS-73174: Autoscaling fails when any of the RM hosts are down

When a master node hosting RM abruptly goes down, Cloudera Manager can now proceed with the NM commission/decommission command flow.

OPSAPS-73780: Existing Iceberg replication policies fail after Cloudera Manager is upgraded

Existing Iceberg replication policies fail when the source and target clusters are upgraded to Cloudera Manager 7.13.1.x which use CDP Private Cloud Base 7.1.9.x version. The replication policies fail because of compatibility issues. You must ensure that the CDP versions and Cloudera Manager versions are compatible before you run replication policies. For example, use Cloudera on premises 7.3.1.x with Cloudera Manager 7.13.1.x. This issue is resolved.

Cloudera Manager 7.13.1 CHF3 (7.13.1.300)

OPSAPS-73225: Cloudera Manager Agent reporting inactive/failed processes in Heartbeat request

As part of introducing Cloudera Manager 7.13.x, some changes were made to the Cloudera Manager logging, which caused the Cloudera Manager Agent to report inactive or stale processes during Heartbeat requests.

As a result, the Cloudera Manager server logs filled rapidly with these notifications, although they had no impact on the service.

In addition, when support for the Observatory feature was added, some additional messages were added to the server logging. However, if the customer did not purchase the Observatory feature, or if telemetry monitoring was not in use, these messages (which appear as "TELEMETRY_ALTUS_ACCOUNT is not configured for Otelcol") filled the server logs and prevented proper follow-up on the server activities.

This issue is fixed now.

OPSAPS-72270: Start ECS command fails on uncordon nodes step

The Start ECS command failed on the uncordon nodes step. The command now ensures that the kube-apiserver has been up and running for at least 60 seconds before proceeding with the uncordon step, and uses the correct target node name rather than the name of the node where the uncordon command is executed.

This issue is fixed now.

OPSAPS-72978: The getUsersFromRanger API parameter truncates the user list after 200 items

The Cloudera Manager API endpoint v58/clusters/[***CLUSTER***]/services/[***SERVICE***]/commands/getUsersFromRanger no longer truncates the list of returned users at 200 items.

Cloudera Manager 7.13.1 CHF2 (7.13.1.200)

OPSAPS-71527: Hive Metrics not loading after installing Cloudera Manager Cumulative hotfix - 7.11.3.9 (CHF6)

After applying the Cloudera Manager Cumulative hotfix - 7.11.3.9 (CHF6) upgrade, multiple Hive metric charts showed no data, specifically for data collected after the Cloudera Manager upgrade timestamp.

This problem appeared when upgrading from a lower Cumulative hotfix (for example, CM 7.11.3 CHF1 (7.11.3.2)) to a higher Cumulative hotfix within the same release line (for example, CM 7.11.3 CHF6 (7.11.3.9)).

This issue is fixed in CM 7.11.3 CHF13 (7.11.3.32) and CM 7.13.1 CHF2 (7.13.1.200).

OPSAPS-72809: Ranger policy script for Knox fails due to double quotation marks

The Ranger policy script for Knox (setupRanger.sh) fails, because the CSD_JAVA_OPTS parameters are enclosed by double quotation marks in the script. The issue is fixed now.

OPSAPS-72795: Do not allow multiple Ozone services in a cluster

It is possible to configure multiple Ozone services in a single cluster which can cause irreversible damage to a running cluster. So, this fix allows you to install only one Ozone service in a cluster.

OPSAPS-72767: Install Oozie ShareLib Cloudera Manager command fails on FIPS and FedRAMP clusters

The Install Oozie ShareLib command using Cloudera Manager fails to execute on FIPS and FedRAMP clusters. This issue is fixed now.

OPSAPS-72323: Cloudera Manager UI is down with bootstrap failure due to ConfigGenExecutor throwing exception

This issue is fixed now.

OPSAPS-71566: The polling logic of RemoteCmdWork goes down if the remote Cloudera Manager goes down

When the remote Cloudera Manager goes down or when there are network failures, RemoteCmdWork stops polling. To ensure that the daemon continues to poll even when there are network failures or when the remote Cloudera Manager goes down, you can set the remote_cmd_network_failure_max_poll_count=[*** ENTER REMOTE EXECUTOR MAX POLL COUNT***] parameter on the Cloudera Manager > Administration > Settings page. Note that the actual timeout is provided by a piecewise constant function (step function) where the breakpoints are: 1 through 11 is 5 seconds, 12 through 17 is 1 minute, 18 through 35 is 2 minutes, 36 through 53 is 5 minutes, 54 through 74 is 8 minutes, 75 through 104 is 15 minutes, and so on. Therefore, when you enter 1, the polling continues for 5 seconds after the Cloudera Manager goes down or after a network failure. Similarly, when you set it to 75, the polling continues for 15 minutes.
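The step function described above can be sketched as a lookup table. This is an illustration only, not Cloudera Manager's implementation: the function name polling_duration_seconds is hypothetical, only the six ranges listed in the text are encoded, and values above 104 are rejected because the text does not list them.

```python
from bisect import bisect_left

# Upper bound of each poll-count range and the duration the text gives for it.
# Only the ranges stated in the release note are encoded here.
_UPPER_BOUNDS = [11, 17, 35, 53, 74, 104]
_DURATIONS_SECONDS = [5, 60, 120, 300, 480, 900]


def polling_duration_seconds(max_poll_count: int) -> int:
    """Return how long polling continues for a given max poll count."""
    if max_poll_count < 1:
        raise ValueError("max poll count must be at least 1")
    # Find the first range whose upper bound is >= the requested count.
    index = bisect_left(_UPPER_BOUNDS, max_poll_count)
    if index == len(_UPPER_BOUNDS):
        raise ValueError("ranges above 104 are not listed in the release note")
    return _DURATIONS_SECONDS[index]


if __name__ == "__main__":
    for count in (1, 12, 75):
        print(count, polling_duration_seconds(count))
```

With these values, a count of 1 maps to 5 seconds and a count of 75 maps to 900 seconds (15 minutes), matching the two examples in the text.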

OPSAPS-67197: Ranger RMS server shows as healthy without service being accessible

Being a Web service, Ranger RMS might not be initialized due to other issues causing RMS to be inaccessible. But Ranger RMS service was still shown as healthy, because Cloudera Manager only monitors Process Identification Number (PID).

This issue is fixed now. Added the health status canary support for Ranger RMS service which connects to RMS after some specific intervals and shows alert on the Cloudera Manager UI if RMS is not reachable.

OPSAPS-71933: Telemetry Publisher is unable to publish Spark event logs to Cloudera Observability when multiple History Servers are set up in the Spark service.

This issue is now resolved by adding the support for multiple Spark History Server deployments in Telemetry Publisher.

OPSAPS-71623: Some Spark jobs are missing from the Workload XM interface. In the Telemetry Publisher logs for these Spark jobs, the error message java.lang.IllegalArgumentException: Wrong FS for Datahub cluster is displayed.

The issue is resolved by addressing Telemetry Publisher failures during the processing of Yarn logs.

Cloudera Manager 7.13.1 CHF1 (7.13.1.100)

OPSAPS-71669: The Continue option is disabled on the Static Service Pools Review page, affecting the functionality of Static Service Pools

The minimum and maximum I/O weight values for Cgroup v2 were incorrectly set to 100 and 1000, respectively, in Cloudera Manager 7.13.1.0. According to official Cgroup v2 documentation, the valid range should be 1 to 10,000. Due to this incorrect configuration range, the Continue option on the Static Service Pools Review page was disabled, preventing users from proceeding with pool configuration.

This issue might occur on clusters running Cloudera Manager 7.13.1.0 with Cgroup v2 resource management when configuring or reviewing Static Service Pools. After upgrading to Cloudera Manager 7.13.1.100 CHF-1, this issue no longer occurs.

This issue is fixed now. The minimum and maximum I/O weight values for Cgroup v2 have been corrected to 1 and 10,000, respectively, in accordance with the official Cgroup v2 specification. This fix ensures that the Continue option on the Static Service Pools Review page is now enabled as expected.
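The effect of the range correction can be shown with a small range check. This is a sketch under stated assumptions: the names OLD_RANGE, CORRECTED_RANGE, and io_weight_in_range are hypothetical and do not come from Cloudera Manager.

```python
from typing import Tuple

# Range incorrectly enforced in Cloudera Manager 7.13.1.0, per the text above.
OLD_RANGE: Tuple[int, int] = (100, 1000)
# Range stated in the Cgroup v2 documentation, per the text above.
CORRECTED_RANGE: Tuple[int, int] = (1, 10_000)


def io_weight_in_range(weight: int, allowed: Tuple[int, int] = CORRECTED_RANGE) -> bool:
    """Return True if the I/O weight lies inside the inclusive range."""
    low, high = allowed
    return low <= weight <= high


if __name__ == "__main__":
    # A weight of 50 is valid for Cgroup v2 but was rejected by the old range.
    print(io_weight_in_range(50, OLD_RANGE))
    print(io_weight_in_range(50))
```

A weight such as 50 is valid for Cgroup v2 but falls outside the old range, which is the kind of value that left the Continue option disabled.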

OPSAPS-72369: Update snapshot default configuration for enabling ordered snapshot deletion

This issue is now resolved by changing the default configuration value on Cloudera Manager.

OPSAPS-72215: ECS CM UI Config for docker cert CANNOT accept the new line - unable to update new registry cert in correct format

Previously, there was no direct way to update the external Docker certificate in the Cloudera Manager UI for ECS because newlines were removed when the field was saved. You can now update the Docker certificate through the Cloudera Manager UI configuration by adding '\n' wherever a newline character is required in the certificate.
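Converting a multi-line certificate into this single-line form can be scripted. The sketch below is a hypothetical helper, not a Cloudera tool. The certificate text is a placeholder rather than a real certificate, and whether the field expects a trailing '\n' is not stated in the text, so none is added.

```python
def escape_certificate_newlines(pem_text: str) -> str:
    """Join the certificate lines with the two-character sequence backslash-n."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    # "\\n" is a literal backslash followed by the letter n, not a line break.
    return "\\n".join(lines)


if __name__ == "__main__":
    # Placeholder content only; substitute the real certificate text.
    placeholder = (
        "-----BEGIN CERTIFICATE-----\n"
        "PLACEHOLDERBASE64DATA\n"
        "-----END CERTIFICATE-----\n"
    )
    print(escape_certificate_newlines(placeholder))
```

The printed value is a single line that can be pasted into the configuration field.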

OPSAPS-72662: UID (User ID) conflicts occur for Kubernetes containers because the containers use UID 1001, which is a common UID in a Unix environment.

This issue is fixed now by using a large UID such as 1000001 to reduce UID conflicts.

Using large UIDs (User IDs) for Kubernetes containers is a recommended security practice because it helps minimize the risk of a container compromising the host system. By assigning a high UID, it reduces the chances of conflicts with existing user accounts on the host, particularly if the container is compromised and attempts to access host files or escalate privileges. In essence, a large UID ensures the container operates with restricted permissions on the host system. Therefore, when creating the CLI pod in Cloudera Manager, the runAsUser value should be set to an integer greater than 1,000,000. To avoid UID conflicts, it is advisable to use a UID such as 1000001.
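The recommendation above can be expressed as a small check on the pod security context. This is a sketch, not the code Cloudera Manager uses to create the CLI pod. The helper name build_security_context is hypothetical, and the dictionary only mirrors the Kubernetes securityContext.runAsUser field.

```python
# The text recommends a runAsUser value greater than 1,000,000.
UID_EXCLUSIVE_LOWER_BOUND = 1_000_000


def build_security_context(run_as_user: int = 1_000_001) -> dict:
    """Return a pod spec fragment after checking the UID recommendation."""
    if run_as_user <= UID_EXCLUSIVE_LOWER_BOUND:
        raise ValueError("runAsUser should be greater than 1,000,000")
    return {"securityContext": {"runAsUser": run_as_user}}


if __name__ == "__main__":
    print(build_security_context())
```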

OPSAPS-72559: Incorrect error messages appear for Hive ACID replication policies

Replication Manager now shows correct error messages for every Hive ACID replication policy run on the Cloudera Manager > Replication Manager > Replication Policies > Actions > Show History page as expected. This issue is fixed now.

OPSAPS-72509: Hive metadata transfer to GCS fails with ClassNotFoundException

Hive external table replication policies from an on-premises cluster to cloud failed during the Transfer Metadata Files step when the target is on Google Cloud and the source Cloudera Manager version is 7.11.3 CHF7, 7.11.3 CHF8, 7.11.3 CHF9, 7.11.3 CHF9.1, 7.11.3 CHF10, or 7.11.3 CHF11. This issue is fixed.

OPSAPS-72558, OPSAPS-72505: Replication Manager chooses incorrect target cluster for Iceberg, Atlas, and Hive ACID replication policies

When a Cloudera Manager instance managed multiple clusters, Replication Manager picked the first cluster in the list as the Destination during the Iceberg, Atlas, and Hive ACID replication policy creation process, and the Destination field was non-editable. You can now edit the replication policy to change the target cluster in these scenarios.

OPSAPS-72468: Subsequent Ozone OBS-to-OBS replication policy runs do not skip replicated files during replication

Replication Manager now skips the replicated files during subsequent Ozone replication policy runs after you add the following key-value pairs in Cloudera Manager > Clusters > Ozone service > Configuration > Ozone Replication Advanced Configuration Snippet (Safety Valve) for core-site.xml:

com.cloudera.enterprise.distcp.ozone-schedules-with-unsafe-equality-check = [***ENTER COMMA-SEPARATED LIST OF OZONE REPLICATION POLICIES’ ID or ENTER all TO APPLY TO ALL OZONE REPLICATION POLICIES***]
The advanced snippet skips the already replicated files when the relative file path, file name, and file size are equal, and ignores the modification times.
Caution: Using this advanced snippet might lead to data loss. For example, if you modify a file on the source or target cluster and the file size remains the same, the advanced snippet ignores the file during the replication run.
com.cloudera.enterprise.distcp.require-source-before-target-modtime-in-unsafe-equality-check = [***ENTER true OR false***]

When you add both key-value pairs, subsequent Ozone replication policy runs skip a file when the matching file on the target has the same relative file path, file name, and file size, and the source file’s modification time is less than or equal to the target file’s modification time.
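The skip rule described above can be sketched as a single comparison function. This is a simplified model, not Replication Manager's code. The FileInfo type and the can_skip function are hypothetical, and the relative path is assumed to include the file name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    relative_path: str      # path relative to the bucket root, including the file name
    size: int               # file size in bytes
    modification_time: int  # modification time as a numeric timestamp


def can_skip(source: FileInfo, target: FileInfo, require_source_before_target: bool) -> bool:
    """Return True if the file would be skipped under the unsafe equality check."""
    if source.relative_path != target.relative_path or source.size != target.size:
        return False
    if require_source_before_target:
        # Second key set to true: the source must not be newer than the target.
        return source.modification_time <= target.modification_time
    # First key only: modification times are ignored.
    return True


if __name__ == "__main__":
    source = FileInfo("dir/a.txt", 10, 200)
    target = FileInfo("dir/a.txt", 10, 100)
    print(can_skip(source, target, False))
    print(can_skip(source, target, True))
```

The example pair has the same path and size, but the source is newer than the target. It is skipped when only the first key is set, which is the data loss risk the caution describes, and it is replicated when the second key is set to true.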

OPSAPS-72214: Cannot create a Ranger replication policy if the source and target cluster names are not the same

You could not create a Ranger replication policy if the source cluster and target cluster names were not the same. This issue is fixed.

OPSAPS-71853: The Replication Policies page does not load the replication policies’ history

When the sourceService is null for a Hive ACID replication policy, the Cloudera Manager UI fails to load the existing replication policies’ history details and the current state of the replication policies on the Replication Policies page. This issue is fixed now.

OPSAPS-72181: Apply Host Template checks for an active command on the service; if the active command takes a long time (like a long-running replication command), the Apply Host Template operation is also delayed.

This issue is fixed now for certain scenarios. When the host template has only gateway roles, the Apply Host Template operation does not check for, or wait on, any active command on the service. If the host template has roles other than gateway roles, the behavior remains the same.

OPSAPS-72249: Oozie database dump fails on JDK17

Oozie database dump and load commands couldn't be executed from Cloudera Manager with JDK 17. This issue is fixed now.

OPSAPS-72276: Cannot edit Ozone replication policy if the MapReduce service is stale

You could not edit an Ozone replication policy in Replication Manager if the MapReduce service did not load completely. This issue is fixed.

OPSAPS-71932: Ranger HDFS plugin resource lookup issue

On a JDK 17 Isilon cluster, users could not create a new policy under cm_hdfs. The connection failed with the following error message:

cannot access class sun.net.util.IPAddressUtil

The issue is fixed now. Added sun.net.util package to Ranger Admin java opts for JDK 17.

OPSAPS-71907: Solr auditing URL changed port

The Solr auditing URL generated for Ranger plugin services in the data hub cluster is correct when both the local ZooKeeper and the data lake ZooKeeper have ssl_enabled enabled. However, if the ssl_enabled parameter is disabled on the local ZooKeeper in data hub, the Solr auditing URL changed the port to use 2181.

The fix fetches the Solr auditing URL from the data context of the data lake on the data hub. This resolves the rare corner case in which Solr auditing used port 2181 when the ZooKeeper ssl_enabled parameter was disabled.

OPSAPS-71666: Replication Manager uses the required property values in the “ozone_replication_core_site_safety_valve” in the source Cloudera Manager during Ozone replication policy run

During an Ozone replication policy run, Replication Manager obtains the required properties and their values from the ozone_replication_core_site_safety_valve. It then adds the new properties and their values, and overrides the values of existing properties, in the core-site.xml file. Replication Manager uses this file during the Ozone replication policy run.

OPSAPS-71659: Ranger replication policy failed because of incorrect source to destination service name mapping

Ranger replication policy failed during the transform step because of incorrect source to destination service name mapping. This issue is fixed now.

OPSAPS-71642: GflagConfigFileGenerator is removing the = sign in the Gflag configuration file when the configuration value passed is empty in the advanced safety valve

If the user added the file_metadata_reload_properties configuration in the advanced safety valve with an = sign and an empty value, the GflagConfigFileGenerator removed the = sign from the Gflag configuration file.

This issue is fixed now.
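The expected behavior can be illustrated with a minimal flag-file renderer. This is a sketch, not the GflagConfigFileGenerator code: the function names are hypothetical, and the --name=value line format is an assumption about the generated file.

```python
from typing import Mapping


def render_gflag_line(name: str, value: str) -> str:
    """Render one flag line, keeping the '=' even when the value is empty."""
    return f"--{name}={value}"


def render_gflag_file(settings: Mapping[str, str]) -> str:
    """Render all settings, one flag per line, in insertion order."""
    return "".join(render_gflag_line(name, value) + "\n" for name, value in settings.items())


if __name__ == "__main__":
    # An empty value still produces an explicit empty assignment.
    print(render_gflag_file({"file_metadata_reload_properties": ""}), end="")
```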

OPSAPS-71592: Replication Manager does not read the default value of “ozone_replication_core_site_safety_valve” during Ozone replication policy run

When the ozone_replication_core_site_safety_valve advanced configuration snippet is set to its default value, Replication Manager does not read its value during the Ozone replication policy run. To mitigate this issue, the default value of ozone_replication_core_site_safety_valve has been set to an empty value. If you have set any key-value pairs for ozone_replication_core_site_safety_valve, then these values are written to core-site.xml during the Ozone replication policy run.

OPSAPS-71424: The 'configuration sanity check' step ignores the replication advanced configuration snippet values during the Ozone replication policy job run

The OBS-to-OBS Ozone replication policy jobs failed when the S3 property values for fs.s3a.endpoint, fs.s3a.secret.key, and fs.s3a.access.key were empty in Ozone Service Advanced Configuration Snippet (Safety Valve) for ozone-conf/ozone-site.xml even when these properties were defined in Ozone Replication Advanced Configuration Snippet (Safety Valve) for core-site.xml. This issue is fixed.

OPSAPS-71256: The “Create Ranger replication policy” action shows 'TypeError' if no peer exists

When you click target Cloudera Manager > Replication Manager > Replication Policies > Create Replication Policy > Ranger replication policy, the TypeError: Cannot read properties of undefined error appears. This issue is fixed now.

OPSAPS-71093: Validation on source for Ranger replication policy fails

The Cloudera Manager page would be logged out automatically when you created a Ranger replication policy. This is because the source cluster did not support the getUsersFromRanger or getPoliciesFromRanger API requests. The issue is fixed now, and the required validation on the source completes successfully as expected.

OPSAPS-70848: Hive external table replication policies succeed when the source cluster uses Dell EMC Isilon storage

During the Hive external table replication policy run, the replication policy failed at the Hive Replication Export step. This issue is fixed now.

OPSAPS-70822: Save the Hive external table replication policy on the ‘Edit Hive External Table Replication Policy’ window

Replication Manager saves the changes as expected when you click Save Policy after you edit a Hive replication policy. To edit a replication policy, you click Actions > Edit Configuration for the replication policy on the Replication Policies page.

OPSAPS-70721: QueueManagementDynamicEditPolicy is not enabled with Auto Queue Deletion enabled

Whenever Auto Queue Deletion is enabled, the QueueManagementDynamicEdit policy is not enabled. This issue is fixed now and when there are no applications running in a queue, then its capacity is set to zero.

OPSAPS-70449: After creating a new Dashboard from the Cloudera Manager UI, the Chart Title field was allowing Javascript as input

In Cloudera Manager UI, while creating a new plot object, a Chart Title field allows Javascript as input. This allows the user to execute a script, which results in an XSS attack. This issue is fixed now.

OPSAPS-69782: Exception appears if the peer Cloudera Manager's API version is higher than the local cluster's API version

HBase replication using HBase replication policies in CDP Public Cloud Replication Manager between two Data Hubs/COD clusters now succeeds as expected when all the following conditions are true:

The destination Data Hub/COD cluster’s Cloudera Manager version is 7.9.0-h7 through 7.9.0-h9 or 7.11.0-h2 through 7.11.0-h4, or 7.12.0.0.
The source Data Hub/COD cluster's Cloudera Manager major version is higher than the destination cluster's Cloudera Manager major version.
The Initial Snapshot option is chosen during the HBase replication policy creation process and/or the source cluster is already participating in another HBase replication setup as a source or destination with a third cluster.

OPSAPS-69622: Cannot view the correct number of files copied for Ozone replication policies

The last run of an Ozone replication policy does not show the correct number of the files copied during the policy run when you load the Cloudera Manager > Replication Manager > Replication Policies page after the Ozone replication policy run completes successfully. This issue is fixed now.

OPSAPS-72143: Atlas replication policies fail if the source and target clusters support FIPS

The Atlas replication policies fail during the Exporting atlas entities from remote atlas service step if the source and target clusters support FIPS. This issue is fixed now.

OPSAPS-67498: The Replication Policies page takes a long time to load

To ensure that the Cloudera Manager > Replication Manager > Replication Policies page loads faster, new query parameters have been added to the internal REST APIs that fetch the policies for the page, which improves pagination. Replication Manager also caches internal API responses to speed up the page load.

OPSAPS-65371: Kudu user was not part of the cm_solr RANGER_AUDITS_COLLECTION policy

The Kudu user was not part of the default policy of cm_solr, which prevented Kudu audit logs from being written to Ranger Admin until the Kudu user was manually added to the policy.

The issue is fixed now. Added Kudu user to default policy for cm_solr - RANGER_AUDITS_COLLECTION, so that Kudu user does not need to be added manually to write audits to Ranger Admin.

Cloudera Manager 7.13.1

OPSAPS-72254: FIPS Failed to upload Spark example jar to HDFS in cluster mode

Fixed an issue with deploying the Spark 3 Client Advanced Configuration Snippet (Safety Valve) for spark3-conf/spark-env.sh.

For more information, see Added a new Cloudera Manager configuration parameter spark_pyspark_executable_path to Livy for Spark 3 in Behavioral Changes In Cloudera Manager 7.13.1.

OPSAPS-71873: UCL | CKP4 | livyfoo0 kms proxy user is not allowed to access HDFS in 7.3.1.0

In the kms-core.xml file, the Livy proxy user is taken from Livy for Spark 3's configuration in Cloudera Runtime 7.3.1 and above.

OPSAPS-70976: The previously hidden real-time monitoring properties are now visible in the Cloudera Manager UI:

The following properties are now visible in the Cloudera Manager UI:

enable_observability_real_time_jobs
enable_observability_metrics_dmp

OPSAPS-69996: HBase snapshot creation in Cloudera Manager does not work as expected

During the HBase snapshot creation process, the snapshot create command sometimes tries to create the same snapshot twice because of an unhandled OptimisticLockException during the database write operation. This resulted in intermittent HBase snapshot creation failures. The issue is fixed now.

OPSAPS-66459: Enable concurrent Hive external table replication policies with the same cloud root

When the HIVE_ALLOW_CONCURRENT_REPLICATION_WITH_SAME_CLOUD_ROOT_PATH feature flag is enabled, Replication Manager can run two or more Hive external table replication policies with the same cloud root path concurrently.

For example, if two Hive external table replication policies have s3a://bucket/hive/data as the cloud root path and the feature flag is enabled, Replication Manager can run these policies concurrently.

By default, this feature flag is disabled. To enable the feature flag, contact your Cloudera account team.

OPSAPS-72153: Invalid signature when trying to create tags in Atlas through Knox

Atlas, SMM UI, and SCHEMA-REGISTRY threw a 500 error in FIPS environments.

This issue is fixed now.

OPSAPS-70859: Ranger metrics APIs were not working on FedRAMP cluster

On FedRAMP HA cloud clusters, Ranger metrics APIs were not working.

This issue is fixed now by introducing new Ranger configurations.

OPSAPS-71436: Telemetry publisher test Altus connection fails

An error occurred while running the test Altus connection action for Telemetry Publisher. This issue is fixed now.

OPSAPS-68252: The Ranger RMS Database Full Sync command is not visible on cloud clusters

The Ranger RMS Database Full Sync command was not visible on any cloud cluster. In addition, the minimum user privilege required to see the Ranger RMS Database Full Sync command on the UI had to be determined.

The issue is fixed now. The command definition on service level in Ranger RMS has been updated after which the command is visible on the UI. The minimum user privilege required to see this command is EnvironmentAdmin.

OPSAPS-69692, OPSAPS-69693: Included filters for Ozone incremental replication in API endpoint

You can use the include filters in the POST /clusters/{clusterName}/services/{serviceName}/replications API to replicate only the filtered part of the Ozone bucket. You can use multiple path regular expressions to limit the data to be replicated for an Ozone bucket. For example, if you include the /path/to/data/.* and .*/data filters in the includeFilter field for the POST endpoint, the Ozone replication policy replicates only the keys in the Ozone bucket that start with /path/to/data/ or end with /data.
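The way the two example filters select keys can be previewed locally with a short sketch. It assumes that a key is included when any filter matches the whole key. The exact matching semantics used by Replication Manager are not stated here, and the function name is_included and the sample keys are hypothetical.

```python
import re
from typing import Iterable

# The two example filters from the text above.
INCLUDE_FILTERS = [r"/path/to/data/.*", r".*/data"]


def is_included(key: str, include_filters: Iterable[str] = INCLUDE_FILTERS) -> bool:
    """Return True if any include filter matches the whole key."""
    return any(re.fullmatch(pattern, key) for pattern in include_filters)


if __name__ == "__main__":
    for key in ("/path/to/data/file1", "/archive/2024/data", "/path/to/other/file"):
        print(key, is_included(key))
```

Under this assumption, the first two sample keys are selected and the third is not.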

OPSAPS-70561: Improved page load performance of the “Bucket Browser” tab.

The Cloudera Manager > Clusters > [***OZONE SERVICE***] > Bucket Browser tab no longer loads all the entries of the bucket. Therefore, the page loads faster when you display the content of a large bucket that contains many keys.

OPSAPS-71090: The spark.*.access.hadoopFileSystems gateway properties are not propagated to Livy.

Added new properties for configuring Spark 2 (spark.yarn.access.hadoopFileSystems) and Spark 3 (spark.kerberos.access.hadoopFileSystems) that propagate to Livy.

OPSAPS-71271: The precopylistingcheck script for Ozone replication policies uses the Ozone replication safety valve value.

The "Run Pre-Filelisting Check" step during Ozone replication uses the content of the ozone_replication_core_site_safety_valve property value to configure the Ozone client for the source and the target Cloudera Manager.

OPSAPS-70983: Hive replication command for Sentry to Ranger replication works as expected

The Sentry to Ranger migration during the Hive replication policy run from CDH 6.3.x or higher to Cloudera on cloud 7.3.0.1 or higher is successful.

OPSAPS-69806: Collection of YARN diagnostic bundle will fail

For any combination of CM 7.11.3 through CM 7.11.3 CHF7 with CDP 7.1.7 through CDP 7.1.8, collection of the YARN diagnostic bundle failed, and no data was transmitted.

Cloudera Manager has now been changed so that the YARN diagnostic bundle is collected successfully.

OPSAPS-70655: The hadoop-metrics2.properties file is not getting generated into the ranger-rms-conf folder

The hadoop-metrics2.properties file was getting created in the process directory conf folder, for example, conf/hadoop-metrics2.properties, whereas the directory structure in Ranger RMS should be {process_directory}/ranger-rms-conf/hadoop-metrics2.properties.

The issue is fixed now. The directory name is changed from conf to ranger-rms-conf, so that the hadoop-metrics2.properties file gets created under the correct directory structure.

OPSAPS-71014: Auto action email content generation failed for some cluster(s) while loading the template file

The issue has been fixed by using a more appropriate template loader class in the freemarker configuration.

OPSAPS-70826: Ranger replication policies fail when target cluster uses Dell EMC Isilon storage and supports JDK17

Ranger replication policies no longer fail if the target cluster is deployed with Dell EMC Isilon storage and also supports JDK17.

OPSAPS-70861: HDFS replication policy creation process fails for Isilon source clusters

When you choose a source Cloudera Base on premises cluster using the Isilon service and a target cloud storage bucket for an HDFS replication policy in Cloudera Base on premises Replication Manager UI, the replication policy creation process fails. This issue is fixed now.

OPSAPS-70708: Cloudera Manager Agent not skipping autofs filesystems during filesystem check

On clusters with a large number of network mounts on each host (for example, more than 100 networked file system mounts), the Cloudera Manager Agent took a long time to start, on the order of 10 to 20 seconds per mount point. This was due to the OS kernel on the cluster host interrogating each network mount on behalf of the Cloudera Manager Agent to gather monitoring information such as file system usage.

This issue is fixed now by adding the ability in the Cloudera Manager Agent's config.ini file to disable filesystem checks.

OPSAPS-68991: Change default SAML response binding to HTTP-POST

The default SAML response binding was HTTP-Artifact rather than HTTP-POST. HTTP-POST is designed for handling responses through the POST method, whereas HTTP-Artifact requires a direct connection between the SP (Cloudera Manager in this case) and the Identity Provider (IDP) and is rarely used. HTTP-POST should therefore be the default choice.

This issue is fixed now by setting up the new Default SAML Binding to HTTP-POST.

OPSAPS-40169: Audits page does not list failed login attempts on applying Allowed = false filter

The Audits page in Cloudera Manager shows failed login attempts when no filter is applied. However, when the Allowed = false filter was applied, it returned 0 results instead of listing those failed login attempts. This issue is fixed now.

OPSAPS-70583: File Descriptor leak from Cloudera Manager 7.11.3 CHF3 version to Cloudera Manager 7.11.3 CHF7

Unable to create NettyTransceiver due to Avro library upgrade which leads to File Descriptor leak. File Descriptor leak occurs in Cloudera Manager when a service tries to talk with Event Server over Avro. This issue is fixed now.

OPSAPS-70962: Creating a cloud restore HDFS replication policy with a peer cluster as destination which is not supported by Replication Manager

During the HDFS replication policy creation process, incorrect Destination clusters and MapReduce services appear which when chosen creates a dummy replication policy to replicate from a cloud account to a remote peer cluster. This scenario is not supported by Replication Manager. This issue is now fixed.

OPSAPS-71108: Use the earlier format of PCR

You can use the latest version of the PCR (Post Copy Reconciliation) script, or you can restore PCR to the earlier format by setting the com.cloudera.enterprise.distcp.post-copy-reconciliation.legacy-output-format.enabled=true key value pair in the Cloudera Manager > Clusters > HDFS service > Configuration > hdfs_replication_hdfs_site_safety_valve property.

OPSAPS-70689: Enhanced performance of DistCp CRC check operation

When a MapReduce job for an HDFS replication policy job fails, or when there are target-side changes during a replication job, Replication Manager initiates the bootstrap replication process. During this process, a cyclic redundancy check (CRC) is performed by default to determine whether a file can be skipped for replication.

By default, the CRC for each file is queried by the mapper (running on the target cluster) from the source cluster's NameNode. The round trip between the source and target cluster for each file consumes network resources and raises the cost of execution. To improve the performance of the CRC check, you can set the following variables to true in the Cloudera Manager > Clusters > HDFS service > Configuration > HDFS_REPLICATION_ENV_SAFETY_VALVE property on the target cluster:

ENABLE_FILESTATUS_EXTENSIONS
ENABLE_FILESTATUS_CRC_EXTENSIONS

By default, these are set to false.

After you set the key-value pairs, the CRC for each file is queried locally from the NameNode on the source cluster and copied over to the target cluster at the end of the replication process. This reduces the cost because the round trip is between two nodes of the same cluster. The CRC checksums are written to the file listing files.
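The two entries can be written out as plain key-value lines. The following minimal sketch assumes the safety valve accepts one KEY=value pair per line; the helper name render_env_safety_valve is hypothetical.

```python
def render_env_safety_valve(flags):
    """Render environment safety valve entries, one KEY=value pair per line.

    Assumption: the safety valve accepts plain KEY=value lines.
    """
    return "\n".join(
        f"{key}={str(value).lower()}" for key, value in flags.items()
    )


# Both variables default to false; set both to true on the target cluster.
crc_flags = {
    "ENABLE_FILESTATUS_EXTENSIONS": True,
    "ENABLE_FILESTATUS_CRC_EXTENSIONS": True,
}

print(render_env_safety_valve(crc_flags))
```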

OPSAPS-70685: Post Copy Reconciliation (PCR) for HDFS replication policies between on-premises clusters

To add the Post Copy Reconciliation (PCR) script to run as a command step during the HDFS replication policy job run, you can enter the SCHEDULES_WITH_ADDITIONAL_DEBUG_STEPS = [***ENTER COMMA-SEPARATED LIST OF NUMERICAL IDS OF THE REPLICATION POLICIES***] key-value pair in the target Cloudera Manager > Clusters > HDFS service > hdfs_replication_env_safety_valve property.

To run the PCR script on the HDFS replication policy, use the /clusters/[***CLUSTER NAME***]/services/[***SERVICE***]/replications/[***SCHEDULE ID***]/postCopyReconciliation API.

For more information about the PCR script, see How to use the post copy reconciliation script for HDFS replication policies.
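The postCopyReconciliation API path can be assembled from its three placeholders as sketched below. The helper name pcr_endpoint and the sample values ("Cluster 1", "hdfs", 42) are hypothetical. The Cloudera Manager host, the API version prefix, and the HTTP method are not stated here, so the sketch builds only the relative path.

```python
from urllib.parse import quote


def pcr_endpoint(cluster_name, service_name, schedule_id):
    """Build the relative postCopyReconciliation path for a replication policy.

    Cluster and service names are percent-encoded so that spaces and
    slashes cannot break the path; the schedule ID must be numeric.
    """
    return (
        f"/clusters/{quote(cluster_name, safe='')}"
        f"/services/{quote(service_name, safe='')}"
        f"/replications/{int(schedule_id)}/postCopyReconciliation"
    )


# Hypothetical sample values, for illustration only.
print(pcr_endpoint("Cluster 1", "hdfs", 42))
```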

OPSAPS-70188: Conflicts field missing in ParcelInfo

Fixed an issue in parcels where the conflicts field in manifest.json would mark a parcel as invalid.

OPSAPS-70248: Optimize Impala Graceful Shutdown Initiation Time

This issue is resolved by streamlining the shutdown initiation process, reducing delays on large clusters.

OPSAPS-70157: Long-term credential-based GCS replication policies continue to work when cluster-wide IDBroker client configurations are deployed

Replication policies that use long-term GCS credentials work as expected even when cluster-wide IDBroker client configurations are configured.

OPSAPS-70422: Change the “Run as username(on source)” field during Hive external table replication policy creation

You can use a user other than hdfs for Hive external table replication policy runs that replicate from an on-premises cluster to a cloud bucket if the USE_PROXY_USER_FOR_CLOUD_TRANSFER=true key-value pair is set for the source Cloudera Manager > Clusters > Hive service > Configuration > Hive Replication Environment Advanced Configuration Snippet (Safety Valve) property. This applies to all external accounts other than the IDBroker external account.

OPSAPS-70460: Allow white space characters in Ozone snapshot-diff parsing

Ozone incremental replication no longer fails if a changed path contains one or more space characters.

OPSAPS-70594: Ozone HttpFS gateway role is not added to Rolling Restart

This issue is now resolved by adding the Ozone HttpFS gateway role to the Rolling Restart.

OPSAPS-68752: Snapshot-diff delta is incorrectly renamed/deleted twice during on-premises to cloud replication

The snapshots created during replication were deleted twice instead of once, which resulted in incorrect snapshot information. This issue is fixed. For more information, see Cloudera Customer Advisory 2023-715: Replication Manager may delete its snapshot information when migrating from on-prem to cloud.

OPSAPS-68112: Atlas diagnostic bundle should contain server log, configurations, and, if possible, heap memories

The diagnostic bundle contains server log, configurations, and heap memories in a GZ file inside the diagnostic .zip package.

OPSAPS-69921: ATLAS_OPTS environment variable is set for FIPS with JDK 11 environments to run the import script in Atlas

_JAVA_OPTIONS are populated with additional parameters as seen in the following:

java_opts = 'export _JAVA_OPTIONS="-Dcom.safelogic.cryptocomply.fips.approved_only=true ' \
'--add-modules=com.safelogic.cryptocomply.fips.core,' \
'bctls --add-exports=java.base/sun.security.provider=com.safelogic.cryptocomply.fips.core ' \
'--add-exports=java.base/sun.security.provider=bctls --module-path=/cdep/extra_jars ' \
'-Dcom.safelogic.cryptocomply.fips.approved_only=true -Djdk.tls.ephemeralDHKeySize=2048 ' \
'-Dorg.bouncycastle.jsse.client.assumeOriginalHostName=true -Djdk.tls.trustNameService=true" '

OPSAPS-71258: Kafka, SRM, and SMM cannot process messages compressed with Zstd or Snappy if /tmp is mounted as noexec

Kafka, Streams Replication Manager, and Streams Messaging Manager can now process messages compressed with Zstd and Snappy if /tmp is mounted with the noexec option.

This fix changes the default Zstd and Snappy temporary directory from /tmp to the following service-specific directories:

Kafka - /var/lib/kafka
Streams Messaging Manager - /var/lib/streams_messaging_manager
Streams Replication Manager - /var/lib/streams_replication_manager

Ensure that each service user has write and execute permission on the directory specific to their service. Otherwise, the service will fail to process compressed messages.

The Kafka service user (default: kafka) must have write and execute permission on /var/lib/kafka.
The Streams Messaging Manager service user (default: streamsmsgmgr) must have write and execute permission on /var/lib/streams_messaging_manager.
The Streams Replication Manager service user (default: streamsrepmgr) must have write and execute permission on /var/lib/streams_replication_manager.
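The permission requirement above can be checked with a short script such as the following sketch. The helper name has_write_and_execute is hypothetical. Note that os.access evaluates permissions for the user running the script, so the check is only meaningful when the script is run as the corresponding service user.

```python
import os

# Default service users and their directories, as listed above.
SERVICE_DIRS = {
    "kafka": "/var/lib/kafka",
    "streamsmsgmgr": "/var/lib/streams_messaging_manager",
    "streamsrepmgr": "/var/lib/streams_replication_manager",
}


def has_write_and_execute(path):
    """Return True if path is a directory the current user can write to and traverse."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


for user, path in SERVICE_DIRS.items():
    status = "ok" if has_write_and_execute(path) else "not writable/executable or missing"
    print(f"{path} (service user {user}): {status}")
```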

OPSAPS-69481: Some Kafka Connect metrics missing from Cloudera Manager due to conflicting definitions

Cloudera Manager now registers the metrics kafka_connect_connector_task_metrics_batch_size_avg and kafka_connect_connector_task_metrics_batch_size_max correctly.

OPSAPS-68708: Schema Registry might fail to start if a load balancer address is specified in Ranger

Schema Registry now always ensures that the address it uses to connect to Ranger ends with a trailing slash (/). As a result, Schema Registry no longer fails to start if Ranger has a load balancer address configured that does not end with a trailing slash.
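The normalization described above amounts to the following minimal sketch. It illustrates the idea only and is not Schema Registry's actual code; the function name ensure_trailing_slash and the sample address are hypothetical.

```python
def ensure_trailing_slash(address):
    """Append a trailing slash if the address does not already end with one."""
    return address if address.endswith("/") else address + "/"


# Hypothetical load balancer address, for illustration only.
print(ensure_trailing_slash("https://ranger-lb.example.com:6182"))
```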

OPSAPS-69978: Cruise Control capacity.py script fails on Python 3

The script querying the capacity information is now fully compatible with Python 3.

OPSAPS-64385: Atlas's client.auth.enabled configuration is not configurable

In customer environments where user certificates are required to authenticate to services, the Apache Atlas web UI constantly prompts for certificates. To address this, the client.auth.enabled parameter is set to true by default. To set it to false, override the setting with a configuration snippet in the safety valve. After the parameter is set to false, no more certificate prompts are displayed.

OPSAPS-71089: Atlas's client.auth.enabled configuration is not configurable

OPSAPS-71677: When you are upgrading from CDP Private Cloud Base 7.1.9 SP1 to Cloudera Base on premises 7.3.1, upgrade-rollback execution fails during HDFS rollback due to missing directory.

This issue is now resolved. The HDFS meta upgrade command now creates the previous directory when it runs, so the rollback no longer fails.

OPSAPS-71390: COD cluster creation is failing on INT and displays the Failed to create HDFS directory /tmp error.

This issue is now resolved. Export options for JDK 17 are added.

OPSAPS-71188: Modify default value of dfs_image_transfer_bandwidthPerSec from 0 to a feasible value to mitigate RPC latency in the namenode.

This issue is now resolved.

OPSAPS-58777: HDFS Directories are created with root as user.

This issue is now resolved by fixing service.sdl.

OPSAPS-71474: In Cloudera Manager UI, the Ozone service Snapshot tab displays label label.goToBucket and it must be changed to Go to bucket.

This issue is now resolved.

OPSAPS-70288: Improvements in master node decommissioning.

This issue is now resolved by making usability and functional improvements to the Ozone master node decommissioning.

OPSAPS-71647: Ozone replication fails for incompatible source and target Cloudera Manager versions during the payload serialization operation

Replication Manager now recognizes and annotates the required fields during the payload serialization operation. For the list of unsupported Cloudera Manager versions that do not have this fix, see Preparing clusters to replicate Ozone data.

OPSAPS-71156: PostCopyReconciliation ignores mismatching modification time for directories

The Post Copy Reconciliation (PCR) script does not check the file length, last modified time, and cyclic redundancy check (CRC) checksums for directories (paths that are directories) on both the source and target clusters.

OPSAPS-70732: Atlas replication policies no longer consider inactive Atlas server instances

Replication Manager considers only the active Atlas server instances during Atlas replication policy runs.

OPSAPS-70924: Configure Iceberg replication policy level JVM options

You can add replication-policy level JVM options for the export, transfer, and sync CLIs for Iceberg replication policies on the Advanced tab in the Create Iceberg Replication Policy wizard.

OPSAPS-70657: KEYTRUSTEE_SERVER & RANGER_KMS_KTS migration to RANGER_KMS from CDP 7.1.x to UCL

The KEYTRUSTEE_SERVER and RANGER_KMS_KTS services are not supported starting from the Cloudera Base on premises 7.3.1 release. Therefore, validation and confirmation messages are added to the Cloudera Manager upgrade wizard to alert you to migrate KEYTRUSTEE_SERVER keys to RANGER_KMS before upgrading to the Cloudera Base on premises 7.3.1 release.

OPSAPS-70656: Remove KEYTRUSTEE_SERVER & RANGER_KMS_KTS from Cloudera Manager for UCL

The Keytrustee components (the KEYTRUSTEE_SERVER and RANGER_KMS_KTS services) are not supported starting from the Cloudera Base on premises 7.3.1 release. These services cannot be installed or managed with Cloudera Manager 7.13.1 using Cloudera Base on premises 7.3.1.

OPSAPS-67480: In CDP 7.1.9, default Ranger policy is added from the cdp-proxy-token topology, so that after a new installation of CDP 7.1.9, the knox-ranger policy includes cdp-proxy-token. However, upgrades do not add cdp-proxy-token to cm_knox policies automatically.

This issue is fixed now.

OPSAPS-70838: Flink user should be added by default in ATLAS_HOOK topic policy in Ranger >> cm_kafka

The "flink" service user is granted publish access on the ATLAS_HOOK topic by default in the Kafka Ranger policy configuration.

OPSAPS-69411: Update AuthzMigrator GBN to point to latest non-expired GBN

You can now export Sentry data only for the given Hive objects (databases, tables, and the respective URLs) by using the authorization.migration.export.migration_objects configuration during export.

OPSAPS-68252: "Ranger RMS Database Full Sync" option was not visible on mow-int cluster setup for hrt_qa user (7.13.0.0)

The fix makes the command visible on cloud clusters when the user has at least the EnvironmentAdmin privilege.

OPSAPS-70148: Ranger audit collection creation is failing on latest SSL enabled UCL cluster due to zookeeper connection issue

Added support for secure ZooKeeper connection for the Ranger Plugin Solr audit connection configuration xasecure.audit.destination.solr.zookeepers.

OPSAPS-52428: Add SSL to ZooKeeper in CDP

Added SSL/TLS encryption support to CDP components. The ZooKeeper SSL (secure) port is now enabled automatically, and components communicate over the encrypted channel if the cluster has AutoTLS enabled.

OPSAPS-72093: FIPS - yarn jobs are failing with No key provider is configured

The yarn.nodemanager.admin environment must contain the FIPS-related Java options, but this configuration treats the comma as a special character in the string. The fix uses single-module additions in the default FIPS options (a separate --add-modules for every module) and adds the FIPS options to the yarn.nodemanager.admin environment.

Previously, yarn.nodemanager.container-localizer.admin.java.opts contained FIPS options only for 7.1.9. This fix also adds the proper configurations in 7.3.1 environments.

With these changes, YARN works properly and can successfully run DistCp to and from encryption zones, as verified on a real cluster.
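The single-module form of --add-modules can be illustrated with the following sketch, which expands one comma-separated option into one option per module. It illustrates the idea only and is not Cloudera Manager's implementation; the function name split_add_modules is hypothetical, and the module names are taken from the FIPS options shown for OPSAPS-69921.

```python
ADD_MODULES = "--add-modules="


def split_add_modules(java_opts):
    """Expand each comma-separated --add-modules option into one option per module.

    All other options are passed through unchanged, so the resulting
    string no longer needs a comma inside any --add-modules option.
    """
    result = []
    for token in java_opts.split():
        if token.startswith(ADD_MODULES):
            modules = token[len(ADD_MODULES):].split(",")
            result.extend(ADD_MODULES + module for module in modules if module)
        else:
            result.append(token)
    return " ".join(result)


opts = (
    "--add-modules=com.safelogic.cryptocomply.fips.core,bctls "
    "-Djdk.tls.ephemeralDHKeySize=2048"
)
print(split_add_modules(opts))
```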

OPSAPS-70113: Fix the ordering of YARN admin ACL config

The YARN Admin ACL configuration in Cloudera Manager shuffled the ordering when it was generated. This issue is now fixed, so that the input ordering is maintained and correctly generated.

DMX-3364: Drop table operation works incorrectly during Iceberg replication

A replicated table was dropped automatically in the target cluster during a subsequent policy run after you dropped the table in the source cluster, and then edited the replication policy to remove the table and added another table to the replication policy. This issue is resolved.