Learn about the known issues for Cloudera Data Services on premises,
the impact or changes to the functionality, and the workaround.
- CDPVC-1739 - Roll back workload username generation changes to the
pre-1.5.4-SP1 state in Cloudera Data Services on premises
- The current system for generating workload usernames in CDE, which shares code with
PbC, automatically converts usernames to lowercase. It also removes special
characters and spaces, and adds a prefix if the IdP (Identity Provider) user ID starts with a
number.
This automated transformation creates a significant issue: Ranger, a critical
component for authorization, requires the workload username to exactly match the IdP
user ID. If an IdP user ID from LDAP/SAML contains uppercase or mixed-case
characters, special characters, spaces, or starts with a number, the CDE transformation
will result in a mismatch. This mismatch prevents Ranger from properly recognizing and
authorizing users, leading to functionality issues.
- To resolve this discrepancy and ensure Ranger functions correctly, the affected
workload usernames in the UMS (User Management Service) database must be manually
updated.
There are two methods to achieve this:
- Manually update the usernames in the UMS database. Run the following SQL command
to update the workload username:
update ums_actors set workload_username = '123456' where workload_username = 'u_123456'
- For a more comprehensive update of all affected workload usernames in the ECS UMS
database, follow the provided runbook:
rename_workload_username_runbook.md.
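Before updating the database, it can help to list which IdP user IDs would be altered by the transformation described above. The following is a minimal sketch, not part of the Cloudera tooling. The function name needs_username_fix is hypothetical, and the rule it applies (a leading digit, or any character other than a lowercase ASCII letter or digit) is an assumption derived from the description above, not the actual generation code, so it may flag more IDs than the real transformation changes.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 (true) when an IdP user ID looks like one
# that the workload username generation would alter.
# Assumption: anything other than a lowercase ASCII letter or a digit counts
# as "special"; the real rules may be narrower.
needs_username_fix() {
    id="$1"
    case "$id" in
        "")
            return 1 ;;  # nothing to check
        [0-9]*)
            return 0 ;;  # starts with a number, so a prefix would be added
        *[!abcdefghijklmnopqrstuvwxyz0123456789]*)
            return 0 ;;  # uppercase letter, special character, or space
        *)
            return 1 ;;  # already in the transformed form
    esac
}

# Usage: print only the IDs that may need a manual UMS update.
for candidate in "jsmith" "JSmith" "123456" "j smith"; do
    if needs_username_fix "$candidate"; then
        printf '%s\n' "$candidate"
    fi
done
```

With the sample IDs above, the loop prints JSmith, 123456, and j smith, and skips jsmith.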
- OBS-6044 - Warning alert in the Cloudera Embedded Container Service Health Test status when a cluster is restarted for stability execution
- The following warning alert is shown in the Cloudera Embedded Container Service Health Test status when a cluster is restarted in Cloudera Manager for stability execution:
Prometheus has issues compacting blocks
This issue occurs when the WAL (write-ahead log) is corrupted.
- Run the following command to access the Prometheus container's shell:
kubectl exec -i -t -n <prometheus server namespace> <prometheus server pod name> -c <prometheus server container name> -- sh -c "(bash || ash || sh)"
- Change the current working directory to the WAL directory of Prometheus.
- For Infrastructure Prometheus: The WAL directory location is
/prometheus/wal. For example:
cd /prometheus/wal
- For control plane/environment: The WAL directory location is
/data/wal. For example:
cd /data/wal
- Note the corrupted segment from the Prometheus pod logs. Example logs:
21T09:00:07.036Z caller=db.go:1074 level=error component=tsdb msg="compaction failed" err="WAL truncation in Compact: create checkpoint: read segments: corruption in segment /prometheus/wal/00000026 at 10978: unexpected full record"
- Skip the compression of the corrupted segment by moving the checkpoint.
This requires renaming the checkpoint folder in the WAL directory. For example, if the
corrupted segment is 00000026 and the current checkpoint folder name
is checkpoint.00000020, then rename the checkpoint folder to
checkpoint.00000027:
mv checkpoint.00000020 checkpoint.00000027
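The target folder name in the example above can be derived from the corrupted segment name. The following is a minimal sketch that can be run in the Prometheus container's shell. The function name next_checkpoint_name is hypothetical, and two things are assumptions inferred from the single example above: that the new checkpoint number is the corrupted segment number plus one, and that names are zero-padded to eight digits.

```shell
#!/bin/sh
# Hypothetical helper: given the corrupted segment name from the log,
# print the checkpoint folder name to rename to.
# Assumption: new checkpoint number = corrupted segment number + 1,
# zero-padded to 8 digits as in the example above.
next_checkpoint_name() {
    segment="$1"
    # Strip leading zeros so the shell does not read the number as octal.
    number=$(printf '%s' "$segment" | sed 's/^0*//')
    # An all-zero segment name strips to an empty string; treat it as 0.
    [ -n "$number" ] || number=0
    printf 'checkpoint.%08d\n' "$((number + 1))"
}

next_checkpoint_name 00000026   # prints checkpoint.00000027
```

The printed name can then be used as the second argument of the mv command, for example: mv checkpoint.00000020 "$(next_checkpoint_name 00000026)".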
- OPSX-5810 - Cloudera Control Plane on premises installation fails at the vault
initialization phase due to longhorn-manager pods
- At times, longhorn-manager pods fail to come up and repeatedly log error messages
like:
level=error msg="Failed to save TLS secret for longhorn-system/longhorn-webhook-tls: Operation cannot be fulfilled on secrets \"longhorn-webhook-tls\": the object has been modified; please apply your changes to the latest version and try again"
This causes the Longhorn nodes to remain in a NotReady state,
which prevents volumes from being created or attached.
- The following steps can be taken on an ECS Server node to fix the issue:
- Stop the Longhorn Manager daemonset by executing the following command:
kubectl -n longhorn-system patch daemonset longhorn-manager -p '{"spec": {"template": {"spec": {"nodeSelector": {"non-existing": "true"}}}}}'
- Delete the Longhorn Webhook TLS secret by executing the following command:
kubectl delete secret longhorn-webhook-tls -n longhorn-system
- Start the Longhorn Manager daemonset by executing the following command:
kubectl -n longhorn-system patch daemonset longhorn-manager --type json -p='[{"op": "remove", "path": "/spec/template/spec/nodeSelector/non-existing"}]'
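The three steps above can be wrapped in one function that stops at the first failing command. This is a minimal sketch, not a supported script: the function name reset_longhorn_webhook_tls and the KUBECTL override variable are hypothetical, while the kubectl commands are the ones listed above. Setting KUBECTL=echo prints the commands instead of running them, which is one way to review the sequence first.

```shell
#!/bin/sh
# KUBECTL is a hypothetical override; it defaults to the real kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical wrapper around the three documented steps.
reset_longhorn_webhook_tls() {
    # 1. Stop the Longhorn Manager pods by giving the daemonset a node
    #    selector that matches no node.
    "$KUBECTL" -n longhorn-system patch daemonset longhorn-manager \
        -p '{"spec": {"template": {"spec": {"nodeSelector": {"non-existing": "true"}}}}}' || return 1

    # 2. Delete the Longhorn Webhook TLS secret.
    "$KUBECTL" delete secret longhorn-webhook-tls -n longhorn-system || return 1

    # 3. Remove the node selector so the daemonset pods are scheduled again.
    "$KUBECTL" -n longhorn-system patch daemonset longhorn-manager --type json \
        -p='[{"op": "remove", "path": "/spec/template/spec/nodeSelector/non-existing"}]'
}
```

On an ECS Server node, load the function into the shell and call reset_longhorn_webhook_tls. The sketch does not wait between steps; whether a pause is needed for the pods to terminate is not stated in the steps above.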
- OPSX-5239 - Updating the External Docker Registry Certificate
command fails when existing Pods are restarted
- If a wrong certificate is applied using the path ECS > admin > certificates,
it cannot be corrected afterwards by running the Cloudera Manager
Update External Docker Certificate command.
- Replacing the external docker certificate with an invalid certificate and then running
the Cloudera Manager Update External Docker Certificate command to correct it
is not a supported workflow.
- For example:
1. Install PVC with an external docker registry.
2. Apply the wrong certificate in the ECS configurations and run the
Update External Docker Registry Certificate command.
3. Restart all the Pods in the cdp namespace. The Pods enter the
ImagePullBackOff error state.
4. Apply the correct certificate in the ECS configurations and run the
Update External Docker Registry Certificate command.
Step 4 does not fix the problem: Cloudera Manager does not support
recovering from the wrong certificate in this way.
- None
- OPSX-5403 - Typecasting fails when truststore password is integer
- The truststore_password in the SCM configuration should not be an integer
for Private Cloud installation.
- Update truststore_password in the SCM configuration to a non-integer value.
- OPSX-4684 - Start Cloudera Embedded Container Service command
shows finished successfully even though start docker server failed on one of the
hosts
- The Docker service starts even though one or more Docker
roles failed to start because the corresponding host is unhealthy.
- Make sure the host is healthy, then start the Docker role on that host.
- OPSX-4391 - External docker cert not base64 encoded
- When using Cloudera Data Services on premises on Cloudera Embedded Container Service,
in some rare situations, the CA certificate for the Docker registry in the cdp namespace
is incorrectly encoded, resulting in TLS errors when connecting to the Docker registry.
- Compare the contents of the "cdp-private-installer-docker-cert" secret in the
cdp namespace with the contents of the "cdp-private-installer-docker-cert"
secret in other namespaces, and edit it so that they match. The secrets and their
corresponding namespaces can be identified using the command "kubectl get secret -A | grep
cdp-private-installer-docker-cert". Inspect each secret using the command "kubectl get
secret -n cdp cdp-private-installer-docker-cert -o yaml", replacing "cdp" with the
different namespace names. If necessary, modify the secret in the cdp namespace using
the command "kubectl edit secret -n cdp cdp-private-installer-docker-cert".
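When inspecting the secrets, a quick check of whether a value decodes cleanly can help spot the namespace with the bad encoding. The following is a minimal sketch under stated assumptions: the function name is_base64_pem is hypothetical, the certificate is assumed to be stored as base64-encoded PEM text, and the value is assumed to have been copied by hand from the "kubectl get secret ... -o yaml" output, because the key name inside the secret is not given here. It relies on the base64 utility with the -d option, as provided by GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when its argument is valid base64 that
# decodes to PEM certificate text.
# Assumption: the certificate in the secret is base64-encoded PEM.
is_base64_pem() {
    # Fail if the value is not valid base64 at all.
    decoded=$(printf '%s' "$1" | base64 -d 2>/dev/null) || return 1
    # Fail if the decoded text does not contain a PEM certificate header.
    case "$decoded" in
        *"-----BEGIN CERTIFICATE-----"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_base64_pem "<value copied from the secret>" returns success for a correctly encoded value and failure for a value that is raw PEM text or otherwise not valid base64.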
- OPSX-3323 - Custom Log redaction does not work for JSON files in
diag bundles
- The JSON files within the diag bundle are not redacted.
- No workaround available.
- OPSX-2772 - For Account Administrator user, update roles
functionality should be disabled
- When a user with administrative privileges accesses the User
Management > Update Roles page in the Cloudera Management Console, the user is
presented with options to select various roles. Selecting or deselecting these roles does
not change this user's privileges: an administrative user, by default, has all
privileges, and those privileges cannot be changed.
- No workaround available.