Data Services not functioning because the certificate generation fails

This section provides steps to troubleshoot a scenario in which data services are not functioning because certificates are not being generated.

  1. Validate that cert-manager is installed in the cert-manager namespace in ECS by running the following command:
    kubectl get ns | grep cert-manager
  2. Validate that all pods are up and running by running the following command:
    kubectl get pods -n cert-manager
    The following list describes the pods that run in the cert-manager namespace and the function of each component. Check the logs of these pods to identify the cause of any certificate generation failure:
    • cert-manager: These pods are the core controllers for cert-manager. They are responsible for processing certificate resources, communicating with certificate authorities (such as Let's Encrypt or Venafi), and ensuring that certificates are valid and up to date.
      • cdp-release-cert-manager-69ddcb8584-4k77q
      • cdp-release-cert-manager-69ddcb8584-zqh6g
    • cainjector: The cainjector's primary role is to inject CA (Certificate Authority) bundles into webhook configurations and other resources. This ensures that the Kubernetes API server can securely communicate with the cert-manager webhooks for validation and mutation of resources.
      • cdp-release-cert-manager-cainjector-75b7947d85-m5crd
      • cdp-release-cert-manager-cainjector-75b7947d85-rkm26
    • startupapicheck: This is a one-time job that runs when cert-manager is first deployed or upgraded. It verifies that the Kubernetes API is available and that the necessary cert-manager Custom Resource Definitions (CRDs) have been properly installed and are accessible. The Completed status indicates it has run successfully.
      • cdp-release-cert-manager-startupapicheck-5sz69
    • webhook: The webhook pods provide an HTTP server that the Kubernetes API server can query. They have two main functions: validating cert-manager resource configurations to ensure that they are correct, and providing default values for certain fields in those resources. This helps prevent misconfigurations and simplifies the user experience.
      • cdp-release-cert-manager-webhook-6b6b564f77-m8lzz
      • cdp-release-cert-manager-webhook-6b6b564f77-q9mdq
      • cdp-release-cert-manager-webhook-6b6b564f77-srf2p
    • Certificate Revocation Operator: This pod runs Cloudera's Certificate Revocation Operator. It is responsible for monitoring certificate delete requests and communicating with the Venafi Trust Protection Platform to revoke certificates when they are no longer needed.
      • cdp-release-thunderhead-certrevoke-7d64fd4dbb-xtqhp
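
The checks above can be combined into a small helper script. The following is a minimal sketch, not an official Cloudera tool: the function names `unhealthy_pods` and `collect_logs` are hypothetical, and the script assumes that any pod whose STATUS is neither Running nor Completed needs attention and that the last 50 log lines are enough for a first look.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads the output of
#   kubectl get pods -n cert-manager --no-headers
# on stdin (columns: NAME READY STATUS RESTARTS AGE) and prints the name and
# status of every pod whose STATUS is neither Running nor Completed.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Hypothetical helper: prints the last 50 log lines of each unhealthy pod
# in the cert-manager namespace. Requires kubectl access to the cluster.
collect_logs() {
  kubectl get pods -n cert-manager --no-headers | unhealthy_pods |
    while read -r pod status; do
      echo "== ${pod} (${status}) =="
      kubectl logs -n cert-manager "${pod}" --tail=50
    done
}
```

After sourcing the script on a host with kubectl access, run `collect_logs` to print recent logs for the pods that are not healthy. If it prints nothing, all pods report Running or Completed, and you may still need to review the logs of the individual pods listed above.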