The following are the new known issues in the 1.5.4 service pack SP1 release of CDP
Private Cloud Data Services.
- OBS-6044 - Warning alert in the ECS Health Test status when a cluster is restarted for
stability execution
- The following warning alert is shown in the ECS Health Test status when a cluster is restarted in Cloudera Manager for stability execution:
Prometheus has issues compacting blocks
This issue occurs when the WAL (Write Ahead Log) is corrupted.
- Run the following command to access the Prometheus container's shell:
kubectl exec -i -t -n <prometheus server namespace> <prometheus server pod name> -c <prometheus server container name> -- sh -c "(bash || ash || sh)"
- Change the current working directory to the WAL directory of Prometheus.
- For Infrastructure Prometheus: The WAL directory location is /prometheus/wal. For example: cd /prometheus/wal
- For control plane/environment: The WAL directory location is /data/wal. For example: cd /data/wal
- Note the corrupted segment from Prometheus's pod logs. Example logs:
21T09:00:07.036Z caller=db.go:1074 level=error component=tsdb msg="compaction failed" err="WAL truncation in Compact: create checkpoint: read segments: corruption in segment /prometheus/wal/00000026 at 10978: unexpected full record"
- Skip the compaction of the corrupted segment by moving the checkpoint. This requires renaming the checkpoint folder in the WAL directory. For example, if the corrupted segment is 00000026 and the current checkpoint folder name is checkpoint.00000020, then rename the checkpoint folder to checkpoint.00000027:
mv checkpoint.00000020 checkpoint.00000027
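The segment lookup and the checkpoint name used in the steps above can be sketched in shell as follows. This is a minimal sketch, not part of the documented procedure. The helper names corrupted_segment and next_checkpoint are hypothetical. The sketch assumes that the log line has the same shape as the example log above, and that the new checkpoint number is the corrupted segment number plus one, as in the 00000026 to 00000027 example.

```shell
# Hypothetical helper: read Prometheus log lines on stdin and print the
# number of the first corrupted WAL segment found (for example 00000026).
corrupted_segment() {
  sed -n 's/.*corruption in segment [^ ]*\/\([0-9][0-9]*\) at .*/\1/p' | head -n 1
}

# Hypothetical helper: given a segment number, print the checkpoint folder
# name that is one past it, zero-padded to eight digits.
next_checkpoint() {
  # Strip leading zeros so the shell does not parse the number as octal.
  n=$(printf '%s' "$1" | sed 's/^0*//')
  printf 'checkpoint.%08d\n' $(( ${n:-0} + 1 ))
}
```

With these helpers, the rename from the example could be written as mv checkpoint.00000020 "$(next_checkpoint 00000026)" when run inside the WAL directory.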
- OPSX-5810 - Private Cloud Control Plane installation fails at the vault initialization
phase due to longhorn-manager pods
- At times, longhorn-manager pods fail to come up and repeatedly log error messages like:
level=error msg="Failed to save TLS secret for longhorn-system/longhorn-webhook-tls: Operation cannot be fulfilled on secrets \"longhorn-webhook-tls\": the object has been modified; please apply your changes to the latest version and try again"
This causes the Longhorn nodes to remain in a NotReady state, which prevents volumes from being created or attached.
- The following steps can be taken on an ECS Server node to fix the issue:
- Stop the Longhorn Manager daemonset by executing the following command:
kubectl -n longhorn-system patch daemonset longhorn-manager -p '{"spec": {"template": {"spec": {"nodeSelector": {"non-existing": "true"}}}}}'
- Delete the Longhorn Webhook TLS secret by executing the following
command:
kubectl delete secret longhorn-webhook-tls -n longhorn-system
- Start the Longhorn Manager daemonset by executing the following
command:
kubectl -n longhorn-system patch daemonset longhorn-manager --type json -p='[{"op": "remove", "path": "/spec/template/spec/nodeSelector/non-existing"}]'
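The three commands above can also be chained so that each step runs only if the previous one succeeded. The following is a minimal sketch under that assumption. The function name restart_longhorn_manager is hypothetical, and the kubectl invocations are the ones listed in the steps above, unchanged.

```shell
# Hypothetical wrapper around the three documented commands. The && chain
# stops at the first command that fails.
restart_longhorn_manager() {
  # 1. Stop the daemonset by giving it a node selector that matches no node.
  kubectl -n longhorn-system patch daemonset longhorn-manager \
    -p '{"spec": {"template": {"spec": {"nodeSelector": {"non-existing": "true"}}}}}' &&
  # 2. Delete the Longhorn Webhook TLS secret.
  kubectl delete secret longhorn-webhook-tls -n longhorn-system &&
  # 3. Remove the node selector again to start the daemonset.
  kubectl -n longhorn-system patch daemonset longhorn-manager --type json \
    -p='[{"op": "remove", "path": "/spec/template/spec/nodeSelector/non-existing"}]'
}
```

Chaining with && means the secret is not deleted if the daemonset could not be stopped first, which matches the order of the steps.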
- OPSX-5403 - Typecasting fails when truststore password is integer
- The truststore_password in the SCM
configuration should not be an integer for Private Cloud installation.
- Update truststore_password in the SCM
configuration to a non-integer value.
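A candidate password can be checked before it is set. The following is a minimal sketch, assuming that "integer" here means a value made up only of digits. The function name is_integer_password is hypothetical and is not part of Cloudera Manager.

```shell
# Hypothetical check: succeed (exit status 0) when the value consists only of
# digits, which is the case this known issue warns about.
is_integer_password() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    *)           return 0 ;;   # digits only
  esac
}
```

For example, is_integer_password 123456 succeeds, so that value should be avoided, while is_integer_password changeit fails.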
- OPSX-4684 - Start ECS command shows finished successfully even though start docker
server failed on one of the hosts
- The Docker service starts even though one or more docker roles failed to start because the corresponding host is unhealthy.
- Make sure the host is healthy, then start the docker role on that host.
- OPSX-4391 - External docker cert not base64 encoded
- When using Private Cloud Data Services on ECS, in some rare situations, the CA
certificate for the Docker registry in the cdp namespace is incorrectly encoded,
resulting in TLS errors when connecting to the Docker registry.
- Compare and edit the contents of the "cdp-private-installer-docker-cert" secret in the
cdp namespace so that it matches the contents of the "cdp-private-installer-docker-cert"
secret in other namespaces. The secrets and their corresponding namespaces can be
identified using the command "kubectl get secret -A | grep
cdp-private-installer-docker-cert". Inspect each secret using the command "kubectl get
secret -n cdp cdp-private-installer-docker-cert -o yaml", replacing "cdp" with the
different namespace names. If necessary, modify the secret in the cdp namespace using
the command "kubectl edit secret -n cdp cdp-private-installer-docker-cert".
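The comparison described above can be sketched as a loop that prints a checksum of the secret's data for each namespace, so that a copy that differs from the others stands out. This is a minimal sketch, not a supported tool. The function name compare_docker_cert_secrets is hypothetical, and the sketch assumes that the first column of "kubectl get secret -A" is the namespace and that comparing the whole data field is sufficient.

```shell
# Hypothetical helper: print "<namespace><TAB><checksum>" for every namespace
# that holds the secret. A checksum that differs from the rest marks the copy
# to inspect.
compare_docker_cert_secrets() {
  name=cdp-private-installer-docker-cert
  kubectl get secret -A | grep "$name" | awk '{print $1}' | sort -u |
  while read -r ns; do
    # </dev/null keeps the inner command from consuming the namespace list.
    sum=$(kubectl get secret -n "$ns" "$name" -o jsonpath='{.data}' </dev/null |
          cksum | awk '{print $1}')
    printf '%s\t%s\n' "$ns" "$sum"
  done
}
```

If the checksum for the cdp namespace differs from the others, inspect and edit that secret with the commands given above.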
- OPSX-3323 - Custom Log redaction does not work for JSON files in diag bundles
- The JSON files within the diag bundle will not be redacted.
- No workaround available.
- OPSX-2772 - For Account Administrator user, update roles functionality should be
disabled
- When a user with administrative privileges accesses the User
Management > Update Roles page in the Management Console, the user is presented with
options to select various roles. Selecting or deselecting these roles does not change this
user's privileges -- an administrative user, by default, has all privileges, and those
privileges cannot be changed.
- No workaround available.