Troubleshooting errors that occur when enabling DataFlow for an environment fails

Learn how to recognize and correct common errors that occur when you are enabling Cloudera DataFlow (CDF) for an environment.

Your ability to deploy flow definitions depends on DataFlow being enabled and in good health in your target environment. If a DataFlow deployment is unhealthy in an environment, it is typically because enabling or disabling DataFlow for an environment has failed. Review the environment troubleshooting information to understand common errors and their solutions.

When you enable DataFlow for an existing CDP environment, CDP creates the required infrastructure in your cloud environment and installs core CDP services as well as the DataFlow software in the Kubernetes cluster. The enablement process can be divided into three parts:

  1. Provisioning Cloud Infrastructure
  2. Installing core CDP software and services
  3. Installing CDF software and services

To understand why the enablement process is failing, the first step is to identify at which stage the enablement process failed.

Identifying where the enablement process failed

Use the information provided in the hover state of an environment and the status messages that have been logged in the environment’s Event History to identify where the enablement process has failed.

  1. In DataFlow, navigate to the Environments page and find the environment where the enablement process failed.
  2. Hover over the status icon and take note of the error message.
  3. Click the environment, select the Alerts tab, and review the error and info events that have been logged during enablement.

Enablement fails during infrastructure provisioning or core CDP software installation

If you only see a status message in the Event History indicating that the infrastructure provisioning has started but you cannot see a corresponding status message confirming that the infrastructure has been provisioned successfully, CDP was either not able to create the required infrastructure or install the core CDP software and services afterwards.

The infrastructure provisioning failed error message indicates that there was either an issue with creating AWS infrastructure or setting up core CDP services.

Validating infrastructure creation – for AWS users

CDP uses CloudFormation scripts to create the required infrastructure in your AWS account. To validate whether the requested resources have been created successfully, log in to your AWS account, navigate to CloudFormation and search for the Kubernetes cluster ID that you have extracted from the environment events and looks similar to liftie-q4nlzm5p. Verify that the CloudFormation scripts completed successfully.

If the CloudFormation script did not complete successfully, make sure that the cross account role for your CDP environment has been assigned appropriate permissions.

If the CloudFormation script completed successfully but enabling Dataflow failed before completing the infrastructure setup, this might be an indication that the Kubernetes cluster cannot communicate with the CDP control plane or other public endpoints like container image repositories. Make sure that the VPC and subnets you are using for DataFlow meet the CDP and DataFlow prerequisites.

Validating infrastructure creation – for Azure users

CDP uses Azure Resource Manager to orchestrate infrastructure creation in Azure. When you enable DataFlow for a CDP environment, a Cluster ID is generated that can be used to track all associated resources in Azure. Navigate to Environments in CDF, select your Azure environment and copy the Cluster ID, which will look similar to liftie-fq6hq9sc.

First, make sure that CDF’s Azure PostgreSQL database has been created successfully. In the Azure Portal, navigate to your CDP resource group, explore Deployments under Settings, and find the database deployment associated with the previously obtained Cluster ID.

Next, validate that the AKS Kubernetes cluster and associated infrastructure has been created successfully. In the Azure Portal, use the Cluster ID to search for associated resources in your Azure subscription.

You should find a Kubernetes service as well as a new resource group called MC_<Cluster ID>_<AZURE Region>, which contains the virtual machine scale sets, load balancers, IP addresses, and storage disks that have been created for the Kubernetes cluster.

If the resources were not created successfully, make sure that the app-based credential for your CDP environment has been assigned the appropriate permissions. For more information, see Prerequisites for the provisioning credential.

If the infrastructure resources were created successfully, but enabling DataFlow failed before completing the infrastructure setup, it might be an indication that the Kubernetes cluster cannot communicate with the CDP control plane or other public endpoints like container image repositories. Make sure that the virtual network and subnets you are using for DataFlow meet the CDP and DataFlow prerequisites. For more information, see VNet and subnets.

Enablement fails during CDF software and service installation

If you see a status message that indicates that the required Infrastructure has been provisioned successfully but the enablement process still failed, this is an indication that installing and setting up the CDF software and services has failed.

The Infrastructure Provisioned status event indicates that the Kubernetes cluster has been created and core CDP services have been setup successfully.

To ensure that this is not a transient issue, use the Retry Enablement action to start the enablement process again. Retry Enablement terminates all existing resources and provisions new infrastructure.

If retrying does not help and enabling DataFlow still fails after successfully provisioning the infrastructure, copy the error message from the Event History and open a support case with Cloudera.