Fixed Issues for the Cloudera Data Services on premises 1.5.5 SP1
You can review the list of reported issues and their fixes in Cloudera Data Services on premises 1.5.5 SP1. Fixed issues are selected issues that were previously logged through Cloudera Support and are now addressed in the current Cloudera Data Services on premises release. These issues may have been listed as known issues in previous versions of Cloudera Data Services on premises; that is, they were reported by customers or identified by Cloudera Quality Engineering teams.
- OPSX-6589 - Cloudera Embedded Container Service 1.5.4 or 1.5.5, rke2_launch startup script fails to run because root is disallowed to use sudo
- Startup failure in Cloudera Embedded Container Service 1.5.4 or 1.5.5 when sudo is disallowed for root. In 1.5.4 SP2, the rke2_launch startup script was updated to use sudo for setting the default NFSv4 configuration. This change caused Cloudera Embedded Container Service startup failures in environments where sudo is disabled for all users (including root). The rke2_launch script has been updated so that inline sudo commands are no longer required, and startup now proceeds without needing sudo.
- OPSX-6049 - Diagnostics service did not purge logs from /data/env
- Previously, the diagnostics service only purged logs under /data/bundles and /data/dp. Log-purging functionality has been extended to also include the /data/env directory.
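The extended purge coverage in OPSX-6049 can be pictured with a minimal Python sketch. This is not the diagnostics service's code: the function name, the age-based retention rule, and the `max_age_seconds` parameter are assumptions made for illustration. Only the three directory paths come from the note above.

```python
import time
from pathlib import Path

# Directories named in OPSX-6049; /data/env is the newly covered one.
PURGE_ROOTS = ["/data/bundles", "/data/dp", "/data/env"]


def purge_old_files(roots, max_age_seconds, now=None):
    """Delete files older than max_age_seconds under each root.

    Hypothetical helper: the real service's retention policy is not
    described in the release note. Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    removed = []
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # skip roots that do not exist on this host
        # Materialize the listing first so deletion does not disturb iteration.
        for path in list(root_path.rglob("*")):
            if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
                path.unlink()
                removed.append(str(path))
    return removed
```

Calling `purge_old_files(PURGE_ROOTS, 7 * 86400)` would remove files older than seven days from all three locations; the seven-day value is only an example.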
- OPSX-6602 - Break the dependency between unseal vault and restart validation steps
- The restart validation step waits for pods in the cdp namespace to become Ready. However, some pods in this namespace depend on Vault being unsealed first. Because Vault unsealing occurs after this step, the validation process waits until it times out unless Vault is unsealed manually. To prevent this issue, the cdp namespace is now excluded from the restart validation, which removes the manual unseal dependency.
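The effect of excluding the cdp namespace in OPSX-6602 can be sketched in a few lines of Python. The pod dictionaries, the function name, and the `EXCLUDED_NAMESPACES` constant are hypothetical; the sketch only illustrates the filtering idea, not the actual validation code.

```python
from typing import Dict, List, Set

# Namespaces skipped by the restart validation; "cdp" is from the note above.
EXCLUDED_NAMESPACES: Set[str] = {"cdp"}


def pods_blocking_restart(pods: List[Dict], excluded: Set[str] = EXCLUDED_NAMESPACES) -> List[str]:
    """Return "namespace/name" for every not-Ready pod outside excluded namespaces.

    Each pod is a hypothetical dict with "namespace", "name", and "ready" keys.
    """
    return [
        f"{pod['namespace']}/{pod['name']}"
        for pod in pods
        if pod["namespace"] not in excluded and not pod["ready"]
    ]
```

With this filter, a pod in cdp that is waiting on a sealed Vault no longer holds up the validation, while not-Ready pods in other namespaces still do.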
- OPSX-6349 - Upgrade logic missing for nvidia-device-plugin in Cloudera Embedded Container Service
- After upgrading from Cloudera Embedded Container Service 1.5.2 to 1.5.4 (or later), NVIDIA pods continued using legacy images because only install logic (not upgrade logic) was present. The installer now includes a pre-upgrade step that upgrades the nvidia-device-plugin. After this change, upgrades of Cloudera Data Services on premises also update the plugin.
- OPSX-6305 - Removal of SUID bit from /opt/cni/bin/install after Cloudera Embedded Container Service upgrade
- Upgrades from older Calico versions left the /opt/cni/bin/install binary with the SUID bit set, which posed a security risk. The SUID bit is now removed from the /opt/cni/bin/install binary after Cloudera Embedded Container Service upgrades to ensure compliance with security policies.
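To confirm the OPSX-6305 fix on a host, the SUID bit can be inspected directly. The following is a minimal Python sketch rather than the upgrade logic itself; the helper names `has_suid` and `clear_suid` are hypothetical, and only the path /opt/cni/bin/install comes from the note above.

```python
import os
import stat


def has_suid(path: str) -> bool:
    """Return True if the file at path has the set-user-ID bit set."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)


def clear_suid(path: str) -> None:
    """Remove the set-user-ID bit, leaving all other permission bits unchanged."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~stat.S_ISUID)


if __name__ == "__main__":
    target = "/opt/cni/bin/install"  # path named in OPSX-6305
    if os.path.exists(target):
        print(f"{target}: SUID set = {has_suid(target)}")
```

On an upgraded cluster node, `has_suid("/opt/cni/bin/install")` would be expected to return False.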
- OPSX-6245 - Airgap | Multiple pods are in pending state on rolling restart
- When performing consecutive rolling restarts on Cloudera Embedded Container Service clusters, the kube-controller-manager pod sometimes fails to become Ready promptly. This prevents other critical pods (including Vault) from initializing, causing the Vault-unseal step to fail. The installer now implements pre-stop and post-start actions for Cloudera Embedded Container Service restarts in air-gapped environments to ensure pods return to Ready before proceeding.
- OPSX-6241 - rke2-ingress-nginx-controller pod fails after Cloudera Embedded Container Service upgrade
- After upgrading Cloudera Embedded Container Service, the ingress controller pod failed to start reliably. The restart sequence for the ingress component now includes forced termination of ingress pods, followed by scaling down and scaling up of the rke2-ingress-nginx-controller pod to ensure correct startup.
- OPSX-5216 - Unsupported 'internal alias for registry' option
- The internal alias for registry setting in Cloudera Embedded Container Service was enabled, but this configuration is not supported when running Cloudera AI workloads on fresh clusters. Do not set the internal alias for registry flag if you intend to run Cloudera AI workloads.
- OPSX-4763 - Environment state not reset after failed creation
- In EnvironmentServiceAsyncUtils, when creation of the Hive warehouse external directory failed, the isAvailable flag was set to false. As a result, even when subsequent API calls succeeded, the environment remained in a non-AVAILABLE state.
- OPSX-3323 - Custom log redaction does not work for JSON files in the diag bundle
- Redaction rules (configured via the admin console) were not applied to JSON files in the diagnostic bundle, although they worked for .log and .txt files. The diagnostic-bundle logic has been refactored so that JSON files are now included in log redaction. Redaction behavior is otherwise unchanged; only the file coverage is extended.
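The extended file coverage in OPSX-3323 can be illustrated with a small Python sketch. This is not the diagnostic-bundle implementation: the rule in `RULES`, the function names, and the in-place rewrite are assumptions for illustration. Only the three file extensions come from the note above.

```python
import re
from pathlib import Path

# Hypothetical rule: mask the value in password=<value> or "password": "<value>".
RULES = [(re.compile(r'("?password"?\s*[:=]\s*"?)[^",\s}]+'), r"\1[REDACTED]")]

# Extensions covered by redaction; ".json" is the addition described in OPSX-3323.
REDACTED_SUFFIXES = {".log", ".txt", ".json"}


def redact_text(text: str) -> str:
    """Apply every redaction rule to the given text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


def redact_bundle(root: Path) -> int:
    """Redact covered files under root in place; return how many files changed."""
    changed = 0
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in REDACTED_SUFFIXES:
            original = path.read_text(encoding="utf-8", errors="replace")
            redacted = redact_text(original)
            if redacted != original:
                path.write_text(redacted, encoding="utf-8")
                changed += 1
    return changed
```

The same rules run against every covered extension, which matches the note's point that only the set of files changed, not how redaction works.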
- OPSAPS-74598 - Removal of Internal Alias Configuration from Cloudera Embedded Container Service Installation Wizard
- The ECS Internal Alias configuration has been removed from the Cloudera Embedded Container Service installation wizard so that it can no longer be enabled during installation.
- OPSAPS-75358 - Configure the 'worker-shutdown-timeout' value for nginx ingress controller to 15 minutes
- Based on the analysis of DBS workloads, the worker-shutdown-timeout parameter for the nginx ingress controller has been set to 15 minutes to ensure graceful shutdown.
- OPSX-6227 - Environment should be marked as FAILED if one of the async tasks fails
- Previously, if one asynchronous task failed during environment creation, the environment remained in a CREATED state rather than FAILED, causing downstream workflows to proceed incorrectly. The environment service now transitions the environment to FAILED status if any async task fails during creation.
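The state rule described in OPSX-6227 can be sketched in Python as follows. This is an illustration under stated assumptions, not the environment service's code: the `EnvState` enum, the `create_environment` function, and the use of a thread pool are hypothetical. The state names CREATED, AVAILABLE, and FAILED are taken from the notes above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from enum import Enum
from typing import Callable, Iterable


class EnvState(Enum):
    CREATED = "CREATED"
    AVAILABLE = "AVAILABLE"
    FAILED = "FAILED"


def create_environment(tasks: Iterable[Callable[[], None]]) -> EnvState:
    """Run creation tasks concurrently; any task exception yields FAILED."""
    failed = False
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(task) for task in tasks]
        for future in as_completed(futures):
            # exception() returns None when the task finished cleanly.
            if future.exception() is not None:
                failed = True
    return EnvState.FAILED if failed else EnvState.AVAILABLE
```

The key point is that a single failing task is enough to move the environment to FAILED, so downstream workflows never see a partially created environment as usable.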
- OPSX-3953 - Upgrade postgresql image used in cdp-embedded-db
- The embedded DB PostgreSQL component is upgraded from version 10.x to 17.4 as part of the Cloudera Control Plane upgrade. When Cloudera Control Plane 1.5.5 SP1 is installed or Cloudera Control Plane 1.5.x is upgraded to 1.5.5 SP1, cdp-embedded-db participates in the Cloudera Control Plane upgrade. If the prior version is 1.5.x, the data is automatically migrated to the new PostgreSQL 17.4 database.
