Troubleshooting errors that occur when enabling CDF for an environment

Learn how to recognize and correct common errors that occur when you are enabling CDF for an environment.

Your ability to deploy flow definitions depends on DataFlow being enabled and in good health in your target environment. If a DataFlow deployment is unhealthy in an environment, it is typically because enabling or disabling DataFlow for that environment has failed. Review the environment troubleshooting information to understand common errors and their solutions.

Enabling DataFlow for an environment fails

When you enable DataFlow for an existing CDP environment, CDP creates the required infrastructure in your cloud environment and installs core CDP services as well as the DataFlow software in the Kubernetes cluster. The enablement process can be divided into three parts:

  1. Provisioning cloud infrastructure
  2. Installing core CDP software and services
  3. Installing CDF software and services

To understand why the enablement process is failing, the first step is to identify at which stage the enablement process failed.

Identifying where the enablement process failed

Use the information provided in the hover state of an environment and the status messages that have been logged in the environment’s Event History to identify where the enablement process has failed.

  1. In DataFlow, navigate to the environments page and find the environment where the enablement process failed.
  2. Hover over the status icon and take note of the error message.
  3. Click on the environment, select the “Alerts” tab, and review the error and info events that have been logged during enablement.

Enablement fails during infrastructure provisioning or core CDP software installation

If the Event History contains a status message indicating that infrastructure provisioning has started, but no corresponding message confirming that the infrastructure was provisioned successfully, CDP was unable either to create the required infrastructure or to install the core CDP software and services afterwards.

The “infrastructure provisioning failed” error message indicates that there was either an issue with creating AWS infrastructure or setting up core CDP services.

Validating infrastructure creation

CDP uses CloudFormation scripts to create the required infrastructure in your AWS account. To validate whether the requested resources have been created successfully, log in to your AWS account, navigate to CloudFormation, and search for the Kubernetes cluster ID that you extracted from the environment events. The ID looks similar to liftie-q4nlzm5p. Verify that the CloudFormation scripts completed successfully.

If the CloudFormation script did not complete successfully, make sure that the cross account role for your CDP environment has been assigned appropriate permissions.
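The CloudFormation check described above can also be scripted. The following is a minimal sketch, not an official Cloudera tool. It assumes that the CloudFormation stack names contain the Kubernetes cluster ID and that the boto3 library and AWS credentials are available. The function names are hypothetical and exist only for illustration.

```python
# Assumption: the CloudFormation stack name contains the Kubernetes cluster ID
# (for example "liftie-q4nlzm5p"). Confirm this against your own AWS account.

# CloudFormation statuses treated here as a successful end state.
HEALTHY_STATUSES = {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


def find_cluster_stacks(stack_summaries, cluster_id):
    """Return (name, status) pairs for stacks whose name contains cluster_id."""
    return [
        (s["StackName"], s["StackStatus"])
        for s in stack_summaries
        if cluster_id in s["StackName"]
    ]


def unhealthy_stacks(stack_summaries, cluster_id):
    """Return the matching stacks that are not in a healthy end state."""
    return [
        (name, status)
        for name, status in find_cluster_stacks(stack_summaries, cluster_id)
        if status not in HEALTHY_STATUSES
    ]


def list_stack_summaries(region_name):
    """Fetch all stack summaries with boto3 (requires AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("cloudformation", region_name=region_name)
    summaries = []
    for page in client.get_paginator("list_stacks").paginate():
        summaries.extend(page["StackSummaries"])
    return summaries
```

For example, unhealthy_stacks(list_stack_summaries("us-west-2"), "liftie-q4nlzm5p") returns the matching stacks that need attention, where the region is only an example. The list_stacks call also returns recently deleted stacks, so entries with the status DELETE_COMPLETE can appear after an earlier enablement attempt has been cleaned up.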

If the CloudFormation script completed successfully but enabling DataFlow failed before completing the infrastructure setup, this might be an indication that the Kubernetes cluster cannot communicate with the CDP control plane or other public endpoints like container image repositories. Make sure that the VPC and subnets you are using for DataFlow meet the CDP and DataFlow prerequisites.
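One way to get a first impression of outbound connectivity is a simple TCP check, run from a host in the same VPC and subnets that DataFlow uses. The sketch below assumes that you supply the list of hosts and ports yourself, based on the CDP and DataFlow prerequisites. It does not know which endpoints are required, and the function names are hypothetical. A successful TCP connection does not prove that proxies, TLS, or authentication are configured correctly.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_endpoints(endpoints, timeout=5.0):
    """Map each (host, port) pair to whether it is reachable."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host, port in endpoints
    }
```

Pass check_endpoints a list of (host, port) tuples taken from the prerequisites. Any entry that maps to False is a candidate for closer inspection of routing, security groups, or firewall rules.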

Enablement fails during CDF software and service installation

If you see a status message indicating that the required infrastructure has been provisioned successfully but the enablement process still failed, installing and setting up the CDF software and services has failed.

The Infrastructure Provisioned status event indicates that the Kubernetes cluster has been created and core CDP services have been set up successfully.
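The reasoning used in this topic to locate the failed stage can be summarized as a small decision function. This is an illustrative sketch only. The marker strings are placeholders rather than the exact wording of DataFlow events, and the function name is hypothetical. Adjust the markers to match the messages that appear in your Event History.

```python
# Assumption: these marker strings are placeholders, not the exact wording
# of DataFlow events. Replace them with the text from your Event History.
PROVISIONING_STARTED = "infrastructure provisioning started"
INFRASTRUCTURE_PROVISIONED = "infrastructure provisioned"


def failed_stage(event_messages):
    """Guess which part of a failed enablement went wrong.

    event_messages is a list of status message strings copied from the
    Event History of an environment whose enablement has failed.
    """
    messages = [m.lower() for m in event_messages]
    started = any(PROVISIONING_STARTED in m for m in messages)
    provisioned = any(INFRASTRUCTURE_PROVISIONED in m for m in messages)

    if provisioned:
        # Cluster and core CDP services are in place, so a later step failed.
        return "CDF software and service installation"
    if started:
        # Provisioning began but was never confirmed as successful.
        return "infrastructure provisioning or core CDP software installation"
    return "unknown"
```

The first result points to the guidance in this section, and the second result points to the CloudFormation and network checks described earlier.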

To rule out a transient issue, use the Retry Enablement action to start the enablement process again. Retry Enablement terminates all existing resources and provisions new infrastructure.

If retrying does not help and enabling DataFlow still fails after successfully provisioning the infrastructure, copy the error message from the Event History and open a support case with Cloudera.