Enabling DataFlow for an environment

  1. From the CDP Public Cloud home page, click Cloudera DataFlow, then click Environments.
  2. Find the environment you want to enable, and click Enable to launch the Enable Environment window.

    If the Enable button is greyed out, hover over the Not Enabled icon for more details about the problem.

  3. Define the Kubernetes cluster minimum and maximum size.
    These values bound the size of the Kubernetes cluster. Your DataFlow cluster automatically scales between the minimum and maximum cluster size that you specify here.
  4. Optionally, set Remote IP Access to restrict access to the Kubernetes API Server to a specific IP address or range.
    This value is not required. If you leave it blank, the Kubernetes API Server can be accessed from any IP address.
  5. Select a load balancer.
    The load balancer you select determines how you access DataFlow. If you select a private load balancer, your DataFlow UI is not publicly available. If you want your UI to be available on the internet, select a public load balancer.
  6. Click Enable. Enablement may take up to 45 minutes.
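
Before you click Enable, it can help to sanity-check the values chosen in steps 3 to 5. The following is a minimal Python sketch of such a check, not part of DataFlow itself. The `EnablementSettings` class and both functions are hypothetical names introduced here for illustration, and the sketch assumes that Remote IP Access holds a single IP address or CIDR range.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EnablementSettings:
    """Hypothetical container for the values entered in the Enable Environment window."""
    min_cluster_size: int
    max_cluster_size: int
    remote_ip_access: Optional[str] = None  # blank or None means no restriction
    public_load_balancer: bool = False       # False models a private load balancer


def validate(settings: EnablementSettings) -> List[str]:
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    if settings.min_cluster_size > settings.max_cluster_size:
        problems.append("minimum cluster size exceeds maximum cluster size")
    value = (settings.remote_ip_access or "").strip()
    if value:
        try:
            # Accepts a single address ("203.0.113.7") or a range ("203.0.113.0/24").
            ipaddress.ip_network(value, strict=False)
        except ValueError:
            problems.append(f"Remote IP Access is not a valid IP address or range: {value!r}")
    return problems


def api_server_allows(settings: EnablementSettings, client_ip: str) -> bool:
    """Model the rule from step 4: a blank value allows any IP address."""
    value = (settings.remote_ip_access or "").strip()
    if not value:
        return True
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(value, strict=False)


if __name__ == "__main__":
    settings = EnablementSettings(3, 10, remote_ip_access="203.0.113.0/24")
    print(validate(settings))
    print(api_server_allows(settings, "203.0.113.7"))
```

The check only mirrors the rules stated in the steps above. DataFlow may enforce further limits that this sketch does not model.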

Your cluster status changes from Not Enabled to Enabling.

  • Hover over Enabling to display environment enablement event messages.

  • Click the Alerts tab to see environment enablement event messages.

  • Click anywhere in your environment row to see your environment details.
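
If you track enablement from a script rather than the UI, the wait can be expressed as a polling loop with a timeout. The Python sketch below is illustrative only. `get_status` is a hypothetical callable that you would have to supply, `done_status` stands for whatever status your environment shows once enablement finishes (this page does not name it), and the 45-minute default simply reflects the upper bound mentioned in the procedure.

```python
import time


def wait_for_status(get_status, done_status, timeout_s=45 * 60, interval_s=30,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns done_status or timeout_s elapses.

    get_status is a hypothetical, caller-supplied function; this sketch does
    not call any DataFlow API itself. sleep and clock are injectable so the
    loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == done_status:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"status is still {status!r} after {timeout_s} seconds"
            )
        sleep(interval_s)
```

Passing the finished status as a parameter keeps the sketch free of assumptions about the exact status names beyond Not Enabled and Enabling.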

Once you have enabled your DataFlow environment, you are ready to deploy your first flow definition from the catalog.