Fixed Issues in Cloudera Manager 7.13.1
Fixed issues in Cloudera Manager 7.13.1.
- OPSAPS-72254: FIPS Failed to upload Spark example jar to HDFS in cluster mode
- Fixed an issue with deploying the Spark 3 Client Advanced Configuration Snippet (Safety Valve) for spark3-conf/spark-env.sh. For more information, see "Added a new Cloudera Manager configuration parameter spark_pyspark_executable_path to Livy for Spark 3" in Behavioral Changes In Cloudera Manager 7.13.1.
- OPSAPS-71873 - UCL | CKP4| livyfoo0 kms proxy user is not allowed to access HDFS in 7.3.1.0
- In the kms-core.xml file, the Livy proxy user is taken from Livy for Spark 3's configuration in Cloudera 7.3.1 and above.
- OPSAPS-70976: The previously hidden real-time monitoring properties are now visible in the Cloudera Manager UI:
- The following properties are now visible in the Cloudera Manager UI: enable_observability_real_time_jobs and enable_observability_metrics_dmp.
- OPSAPS-69996: HBase snapshot creation in Cloudera Manager does not work as expected
- During the HBase snapshot creation process, the snapshot create command sometimes tried to create the same snapshot twice because of an unhandled OptimisticLockException during the database write operation. This resulted in intermittent HBase snapshot creation failures. This issue is fixed now.
- OPSAPS-66459: Enable concurrent Hive external table replication policies with the same cloud root
- When the HIVE_ALLOW_CONCURRENT_REPLICATION_WITH_SAME_CLOUD_ROOT_PATH feature flag is enabled, Replication Manager can run two or more Hive external table replication policies with the same cloud root path concurrently. For example, if two Hive external table replication policies have s3a://bucket/hive/data as the cloud root path and the feature flag is enabled, Replication Manager can run these policies concurrently. By default, this feature flag is disabled. To enable the feature flag, contact your Cloudera account team.
- OPSAPS-70859: Ranger metrics APIs were not working on FedRAMP cluster
- On FedRAMP HA cloud clusters, Ranger metrics APIs were not working. This issue is fixed now by introducing new Ranger configurations.
- OPSAPS-71436: Telemetry publisher test Altus connection fails
- An error occurred while running the test Altus connection action for Telemetry Publisher. This issue is fixed now.
- OPSAPS-68252: The Ranger RMS Database Full Sync command is not visible on cloud clusters
- The Ranger RMS Database Full Sync command was not visible on any cloud cluster. In addition, the minimum user privilege required to see the Ranger RMS Database Full Sync command in the UI needed to be determined.
- OPSAPS-69692, OPSAPS-69693: Included filters for Ozone incremental replication in API endpoint
- You can use the include filters in the POST /clusters/{clusterName}/services/{serviceName}/replications API to replicate only the filtered part of the Ozone bucket. You can use multiple path regular expressions to limit the data to be replicated for an Ozone bucket. For example, if you include the /path/to/data/.* and .*/data filters in the includeFilter field for the POST endpoint, the Ozone replication policy replicates only the keys in the Ozone bucket that match /path/to/data/.* (keys that start with /path/to/data/) or .*/data (keys that end with /data).
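As a rough illustration of how the two example filters select keys, the following sketch applies them to a few made-up key names. It assumes that a filter must match the whole key path (full-match semantics), which the note above does not state explicitly. The key names and the `select_keys` helper are hypothetical and are not part of the Cloudera Manager API.

```python
import re

# Include filters taken from the example above.
INCLUDE_FILTERS = ["/path/to/data/.*", ".*/data"]


def select_keys(keys, include_filters):
    """Return the keys that fully match at least one include filter.

    Assumption: a filter must match the entire key path. The actual
    matching rules used by Replication Manager may differ.
    """
    patterns = [re.compile(f) for f in include_filters]
    return [k for k in keys if any(p.fullmatch(k) for p in patterns)]


# Hypothetical key names, for illustration only.
keys = [
    "/path/to/data/file1",
    "/path/to/other/file2",
    "/archive/2024/data",
    "/archive/2024/logs",
]

selected = select_keys(keys, INCLUDE_FILTERS)
print(selected)
```

Under these assumptions, only the first and third keys are selected: one starts with /path/to/data/ and the other ends with /data.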
- OPSAPS-70561: Improved page load performance of the “Bucket Browser” tab.
- The tab does not load all the entries of the bucket. Therefore, the page loads faster when you try to display the content of a large bucket with several keys in it.
- OPSAPS-71090: The spark.*.access.hadoopFileSystems gateway properties are not propagated to Livy.
- Added new properties for configuring Spark 2 (spark.yarn.access.hadoopFileSystems) and Spark 3 (spark.kerberos.access.hadoopFileSystems) that propagate to Livy.
- OPSAPS-71271: The precopylistingcheck script for Ozone replication policies uses the Ozone replication safety valve value.
- The "Run Pre-Filelisting Check" step during Ozone replication uses the content of the ozone_replication_core_site_safety_valve property value to configure the Ozone client for the source and the target Cloudera Manager.
- OPSAPS-70983: Hive replication command for Sentry to Ranger replication works as expected
- The Sentry to Ranger migration during the Hive replication policy run from CDH 6.3.x or higher to CDP Public Cloud 7.3.0.1 or higher is successful.
- OPSAPS-69806: Collection of YARN diagnostic bundle will fail
- For any combination of CM 7.11.3 through CM 7.11.3 CHF7 with CDP 7.1.7 through CDP 7.1.8, collection of the YARN diagnostic bundle failed and no data was transmitted. Cloudera Manager has been changed so that the YARN diagnostic bundle is now collected successfully.
- OPSAPS-70655: The hadoop-metrics2.properties file is not getting generated into the ranger-rms-conf folder
- The hadoop-metrics2.properties file was getting created in the process directory conf folder, for example, conf/hadoop-metrics2.properties, whereas the directory structure in Ranger RMS should be {process_directory}/ranger-rms-conf/hadoop-metrics2.properties.
- OPSAPS-71014: Auto action email content generation failed for some cluster(s) while loading the template file
- The issue has been fixed by using a more appropriate template loader class in the FreeMarker configuration.
- OPSAPS-70826: Ranger replication policies fail when target cluster uses Dell EMC Isilon storage and supports JDK17
- Ranger replication policies no longer fail if the target cluster is deployed with Dell EMC Isilon storage and also supports JDK17.
- OPSAPS-70861: HDFS replication policy creation process fails for Isilon source clusters
- When you chose a source CDP Private Cloud Base cluster that uses the Isilon service and a target cloud storage bucket for an HDFS replication policy in the CDP Private Cloud Base Replication Manager UI, the replication policy creation process failed. This issue is fixed now.
- OPSAPS-70708: Cloudera Manager Agent not skipping autofs filesystems during filesystem check
- On clusters with a large number of network mounts on each host (for example, more than 100 networked file system mounts), the Cloudera Manager Agent took a long time to start, on the order of 10 to 20 seconds per mount point. This was because the OS kernel on the cluster host interrogated each network mount on behalf of the Cloudera Manager Agent to gather monitoring information such as file system usage. This issue is fixed now by adding the ability to disable filesystem checks in the Cloudera Manager Agent's config.ini file.
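The general idea of skipping automounted and network file systems before collecting usage data can be sketched as follows. This is not the Cloudera Manager Agent's implementation, and it does not show the actual config.ini setting. The `SKIPPED_FS_TYPES` set, the `local_mount_points` helper, and the sample mount table are assumptions made for illustration; the input uses the `/proc/mounts` line layout.

```python
# File system types to skip; this set is an assumption for the sketch.
SKIPPED_FS_TYPES = {"autofs", "nfs", "nfs4", "cifs"}


def local_mount_points(mounts_text, skipped=SKIPPED_FS_TYPES):
    """Return mount points whose file system type is not in `skipped`.

    Each input line follows the /proc/mounts layout:
    device mount_point fs_type options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        if fs_type not in skipped:
            result.append(mount_point)
    return result


# Made-up sample in /proc/mounts format.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/etc/auto.misc /misc autofs rw,relatime 0 0
filer:/export /mnt/share nfs4 rw,relatime 0 0
/dev/sdb1 /data xfs rw,noatime 0 0
"""

mount_points = local_mount_points(SAMPLE)
print(mount_points)
```

Filtering on the file system type first means that a usage query never has to touch the skipped mounts, which is where the per-mount delay described above would otherwise occur.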
- OPSAPS-68991: Change default SAML response binding to HTTP-POST
- The default SAML response binding was HTTP-Artifact rather than HTTP-POST. HTTP-POST is designed for handling responses through the POST method, whereas HTTP-Artifact requires a direct connection between the SP (Cloudera Manager in this case) and the Identity Provider (IDP) and is rarely used, so HTTP-POST is the better default. This issue is fixed now by setting the default SAML binding to HTTP-POST.
- OPSAPS-40169: Audits page does not list failed login attempts on applying Allowed = false filter
- The Audits page in Cloudera Manager showed failed login attempts when no filter was applied. However, when the Allowed = false filter was applied, it returned 0 results instead of listing those failed login attempts. This issue is fixed now.
- OPSAPS-70583: File Descriptor leak from Cloudera Manager 7.11.3 CHF3 version to Cloudera Manager 7.11.3 CHF7
- After an Avro library upgrade, Cloudera Manager was unable to create a NettyTransceiver, which led to a file descriptor leak. The leak occurred in Cloudera Manager when a service tried to talk to Event Server over Avro. This issue is fixed now.
- OPSAPS-70962: Creating a cloud restore HDFS replication policy with a peer cluster as destination which is not supported by Replication Manager
- During the HDFS replication policy creation process, incorrect Destination clusters and MapReduce services appeared which, when chosen, created a dummy replication policy to replicate from a cloud account to a remote peer cluster. This scenario is not supported by Replication Manager. This issue is now fixed.
- OPSAPS-71108: Use the earlier format of PCR
- You can use the latest version of the PCR (Post Copy Reconciliation) script, or you can restore PCR to the earlier format by setting the com.cloudera.enterprise.distcp.post-copy-reconciliation.legacy-output-format.enabled=true key-value pair in the property.
- OPSAPS-70689: Enhanced performance of DistCp CRC check operation
- When a MapReduce job for an HDFS replication policy job fails, or when there are target-side changes during a replication job, Replication Manager initiates the bootstrap replication process. During this process, a cyclic redundancy check (CRC) is performed by default to determine whether a file can be skipped for replication.
- OPSAPS-70685: Post Copy Reconciliation (PCR) for HDFS replication policies between on-premises clusters
- To add the Post Copy Reconciliation (PCR) script to run as a command step during the HDFS replication policy job run, you can enter the SCHEDULES_WITH_ADDITIONAL_DEBUG_STEPS = [***ENTER COMMA-SEPARATED LIST OF NUMERICAL IDS OF THE REPLICATION POLICIES***] key-value pair in the property.
- OPSAPS-70188: Conflicts field missing in ParcelInfo
- Fixed an issue in parcels where the conflicts field in manifest.json would mark a parcel as invalid.
- OPSAPS-70248: Optimize Impala Graceful Shutdown Initiation Time
- This issue is resolved by streamlining the shutdown initiation process, reducing delays on large clusters.
- OPSAPS-70157: Long-term credential-based GCS replication policies continue to work when cluster-wide IDBroker client configurations are deployed
- Replication policies that use long-term GCS credentials work as expected even when cluster-wide IDBroker client configurations are deployed.
- OPSAPS-70422: Change the “Run as username(on source)” field during Hive external table replication policy creation
- You can use a user other than hdfs to run a Hive external table replication policy that replicates from an on-premises cluster to a cloud bucket if the USE_PROXY_USER_FOR_CLOUD_TRANSFER=true key-value pair is set for the property. This applies to all external accounts other than the IDBroker external account.
- OPSAPS-70460: Allow white space characters in Ozone snapshot-diff parsing
- Ozone incremental replication no longer fails if a changed path contains one or more space characters.
- OPSAPS-70594: Ozone HttpFS gateway role is not added to Rolling Restart
- This issue is now resolved by adding the Ozone HttpFS gateway role to the Rolling Restart.
- OPSAPS-68752: Snapshot-diff delta is incorrectly renamed/deleted twice during on-premises to cloud replication
- The snapshots created during replication were deleted twice instead of once, which resulted in incorrect snapshot information. This issue is fixed. For more information, see Cloudera Customer Advisory 2023-715: Replication Manager may delete its snapshot information when migrating from on-prem to cloud.
- OPSAPS-70226: Atlas uses the Solr configuration directory available in ATLAS_PROCESS/conf/solr instead of the Cloudera Manager provided directory
- Atlas uses the configuration in /var/run/cloudera-scm-agent/process/151-atlas-ATLAS_SERVER/solrconf.xml.
- OPSAPS-68112: Atlas diagnostic bundle should contain server log, configurations, and, if possible, heap memories
- The diagnostic bundle contains server log, configurations, and heap memories in a GZ file inside the diagnostic .zip package.
- OPSAPS-69921: ATLAS_OPTS environment variable is set for FIPS with JDK 11 environments to run the import script in Atlas
- _JAVA_OPTIONS are populated with additional parameters as seen in the following:
    java_opts = 'export _JAVA_OPTIONS="-Dcom.safelogic.cryptocomply.fips.approved_only=true ' \
        '--add-modules=com.safelogic.cryptocomply.fips.core,' \
        'bctls --add-exports=java.base/sun.security.provider=com.safelogic.cryptocomply.fips.core ' \
        '--add-exports=java.base/sun.security.provider=bctls --module-path=/cdep/extra_jars ' \
        '-Dcom.safelogic.cryptocomply.fips.approved_only=true -Djdk.tls.ephemeralDHKeySize=2048 ' \
        '-Dorg.bouncycastle.jsse.client.assumeOriginalHostName=true -Djdk.tls.trustNameService=true" '
- OPSAPS-71258: Kafka, SRM, and SMM cannot process messages compressed with Zstd or Snappy if /tmp is mounted as noexec
- The issue is fixed by using JVM flags that point to a different temporary folder for extracting the native library.
- OPSAPS-69481: Some Kafka Connect metrics missing from Cloudera Manager due to conflicting definitions
- Cloudera Manager now registers the metrics kafka_connect_connector_task_metrics_batch_size_avg and kafka_connect_connector_task_metrics_batch_size_max correctly.
- OPSAPS-68708: Schema Registry might fail to start if a load balancer address is specified in Ranger
- Schema Registry now always ensures that the address it uses to connect to Ranger ends with a trailing slash (/). As a result, Schema Registry no longer fails to start if Ranger has a load balancer address configured that does not end with a trailing slash.
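The normalization described above amounts to appending a slash only when one is missing. A minimal sketch, assuming a plain string URL; the `ensure_trailing_slash` helper and the example address are hypothetical and are not taken from the Schema Registry code:

```python
def ensure_trailing_slash(url):
    """Return `url` unchanged if it ends with "/", otherwise append one."""
    return url if url.endswith("/") else url + "/"


# Hypothetical load balancer address, for illustration only.
normalized = ensure_trailing_slash("https://ranger-lb.example.com:6182")
print(normalized)
```

Because the helper leaves an address that already ends with a slash untouched, applying it more than once is harmless.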
- OPSAPS-69978: Cruise Control capacity.py script fails on Python 3
- The script querying the capacity information is now fully compatible with Python 3.
- OPSAPS-64385: Atlas's client.auth.enabled configuration is not configurable
- In customer environments where user certificates are required to authenticate to services, the Apache Atlas web UI constantly prompts for certificates. To solve this, the client.auth.enabled parameter is set to true by default. If you need to set it to false, override the setting from the safety valve with a configuration snippet. Once it is set to false, no more certificate prompts are displayed.
- OPSAPS-71089: Atlas's client.auth.enabled configuration is not configurable
- In customer environments where user certificates are required to authenticate to services, the Apache Atlas web UI constantly prompts for certificates. To solve this, the client.auth.enabled parameter is set to true by default. If you need to set it to false, override the setting from the safety valve with a configuration snippet. Once it is set to false, no more certificate prompts are displayed.
- OPSAPS-71677: When you are upgrading from CDP Private Cloud Base 7.1.9 SP1 to CDP Private Cloud Base 7.3.1, upgrade-rollback execution fails during HDFS rollback due to missing directory.
- This issue is now resolved. The HDFS meta upgrade command now creates the previous directory when it runs, so the rollback no longer fails.
- OPSAPS-71390: COD cluster creation is failing on INT and displays the Failed to create HDFS directory /tmp error.
- This issue is now resolved. Export options for JDK 17 are added.
- OPSAPS-71188: Modify default value of dfs_image_transfer_bandwidthPerSec from 0 to a feasible value to mitigate RPC latency in the namenode.
- This issue is now resolved.
- OPSAPS-58777: HDFS Directories are created with root as user.
- This issue is now resolved by fixing service.sdl.
- OPSAPS-71474: In Cloudera Manager UI, the Ozone service Snapshot tab displays label label.goToBucket and it must be changed to Go to bucket.
- This issue is now resolved.
- OPSAPS-70288: Improvements in master node decommissioning.
- This issue is now resolved by making usability and functional improvements to the Ozone master node decommissioning.
- OPSAPS-71647: Ozone replication fails for incompatible source and target Cloudera Manager versions during the payload serialization operation
- Replication Manager now recognizes and annotates the required fields during the payload serialization operation. For the list of unsupported Cloudera Manager versions that do not have this fix, see Preparing clusters to replicate Ozone data.
- OPSAPS-71156: PostCopyReconciliation ignores mismatching modification time for directories
- The Post Copy Reconciliation (PCR) script does not check the file length, last modified time, and cyclic redundancy check (CRC) checksums for directories (paths that are directories) on both the source and target clusters.
- OPSAPS-70732: Atlas replication policies no longer consider inactive Atlas server instances
- Replication Manager considers only the active Atlas server instances during Atlas replication policy runs.
- OPSAPS-70924: Configure Iceberg replication policy level JVM options
- You can add replication-policy level JVM options for the export, transfer, and sync CLIs for Iceberg replication policies on the Advanced tab in the Create Iceberg Replication Policy wizard.
- OPSAPS-70657: KEYTRUSTEE_SERVER & RANGER_KMS_KTS migration to RANGER_KMS from CDP 7.1.x to UCL
- KEYTRUSTEE_SERVER and RANGER_KMS_KTS services are not supported starting from the CDP 7.3.1 release. Therefore, validation and confirmation messages were added to the CM upgrade wizard to alert the user to migrate KEYTRUSTEE_SERVER keys to RANGER_KMS before upgrading to the CDP 7.3.1 release.
- OPSAPS-70656: Remove KEYTRUSTEE_SERVER & RANGER_KMS_KTS from CM for UCL
- The Keytrustee components (the KEYTRUSTEE_SERVER and RANGER_KMS_KTS services) are not supported starting from the CDP 7.3.1 release. These services cannot be installed or managed with CM 7.13.1 using CDP 7.3.1.
- OPSAPS-67480: In 7.1.9, default Ranger policy is added from the cdp-proxy-token topology, so that after a new installation of CDP-7.1.9, the knox-ranger policy includes cdp-proxy-token. However, upgrades do not add cdp-proxy-token to cm_knox policies automatically.
- This issue is fixed now.
- OPSAPS-70838: Flink user should be add by default in ATLAS_HOOK topic policy in Ranger >> cm_kafka
- The "flink" service user is granted publish access on the ATLAS_HOOK topic by default in the Kafka Ranger policy configuration.
- OPSAPS-69411: Update AuthzMigrator GBN to point to latest non-expired GBN
- Users can now export Sentry data only for given Hive objects (databases and tables and the respective URLs) by using the "authorization.migration.export.migration_objects" configuration during export.
- OPSAPS-68252: "Ranger RMS Database Full Sync" option was not visible on mow-int cluster setup for hrt_qa user (7.13.0.0)
- The fix makes the command visible on cloud clusters when the user has minimum EnvironmentAdmin privilege.
- OPSAPS-70148: Ranger audit collection creation is failing on latest SSL enabled UCL cluster due to zookeeper connection issue
- Added support for secure ZooKeeper connection for the Ranger Plugin Solr audit connection configuration xasecure.audit.destination.solr.zookeepers.
- OPSAPS-52428: Add SSL to ZooKeeper in CDP
- Added SSL/TLS encryption support to CDP components. The ZooKeeper SSL (secure) port now gets automatically enabled and components communicate on the encrypted channel if the cluster has AutoTLS enabled.
- OPSAPS-72093: FIPS - yarn jobs are failing with No key provider is configured
- The yarn.nodemanager.admin environment must contain the FIPS-related Java options, and this configuration is handled such that the comma is a special character in the string. This change uses single module additions in the default FIPS options (a separate --add-modules for every module) and adds the FIPS options to the yarn.nodemanager.admin environment. Previously, yarn.nodemanager.container-localizer.admin.java.opts contained FIPS options only for 7.1.9. This patch fixes that as well and adds the proper configurations in 7.3.1 environments. This was tested on a real cluster, and with the current changes YARN works properly and can successfully run distcp from and to encryption zones.
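The "separate --add-modules for every module" approach can be illustrated with a small rewrite of an option string so that no comma remains in it. This is only a sketch of the idea, not the Cloudera Manager patch. The `split_add_modules` helper is hypothetical, and it assumes that options are separated by whitespace and contain no quoted values with spaces.

```python
ADD_MODULES_PREFIX = "--add-modules="


def split_add_modules(java_opts):
    """Rewrite --add-modules=a,b as --add-modules=a --add-modules=b.

    Assumption: options are whitespace-separated and unquoted.
    All other options are passed through unchanged.
    """
    out = []
    for token in java_opts.split():
        if token.startswith(ADD_MODULES_PREFIX):
            modules = token[len(ADD_MODULES_PREFIX):].split(",")
            # One flag per module, skipping empty entries.
            out.extend(ADD_MODULES_PREFIX + m for m in modules if m)
        else:
            out.append(token)
    return " ".join(out)


opts = (
    "-Dcom.safelogic.cryptocomply.fips.approved_only=true "
    "--add-modules=com.safelogic.cryptocomply.fips.core,bctls"
)
rewritten = split_add_modules(opts)
print(rewritten)
```

With one flag per module, the resulting string can be placed in a comma-delimited setting without the module list being split in the wrong place.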
- OPSAPS-70113: Fix the ordering of YARN admin ACL config
- The YARN Admin ACL configuration in Cloudera Manager shuffled the ordering when it was generated. This issue is now fixed, so that the input ordering is maintained and correctly generated.
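A generic illustration of why input ordering can be lost and how to keep it: collecting entries in an unordered set does not guarantee order, while an insertion-ordered dictionary does. The sketch below is not Cloudera Manager's code. The `build_admin_acl` helper and the user names are hypothetical, and it covers only a comma-separated user list.

```python
def build_admin_acl(users):
    """Join users into a comma-separated list, keeping first-seen order.

    dict.fromkeys removes duplicates while preserving insertion order
    (guaranteed for dicts since Python 3.7), unlike set().
    """
    return ",".join(dict.fromkeys(users))


# Hypothetical user names, for illustration only.
acl = build_admin_acl(["yarn", "hive", "yarn", "admin"])
print(acl)
```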