Adding a Cloudera Data Engineering service

Before you can use the Cloudera Data Engineering service, you must add the service to an environment that you want to use Cloudera Data Engineering on.

Make sure that you have a working environment for which you want to enable the Cloudera Data Engineering service. For more information about environments, see Environments.

In the Cloudera console, click the Data Engineering tile. The Cloudera Data Engineering Home page displays.
Click Administration on the left navigation menu, click at the top to enable Cloudera Data Engineering service for an environment.
If the environment does not have any Cloudera Data Engineering service, the page displays a Enable a Service button that launches the same wizard.
Enter a Name for the Cloudera Data Engineering service you are creating.
In the Environment drop-down list, select or type the name of the environment that you want to enable Cloudera Data Engineering for. The displayed list dynamically updates to show environment names matching your input. When you see the correct environment, click on it to select it.
In Resource Pool drop-down list, select the name of the resource pool that you want to enable Cloudera Data Engineering service for.
note
- If the environment name is not selected in the Environment drop-down list, then the Resource Pool field will be disabled. The Resource Pool drop-down list displays cde pool and any custom pools that are created under the cde pool. If there are no resource pools created in the selected environment, then a generic resource pool named cde is displayed with the quota values of that environment and it will be created when you enable the service.
- The Cloudera Data Engineering UI does not support binary system units such as KiB, MiB, GiB, or TiB for Memory in Resource pool. Instead, you must use decimal system units such as KB, MB, GB, or TB.
For information about configuring resource pool and capacity, see Managing cluster resources using Quota Management (Technical Preview).
In Capacity , use the slider to set the maximum number of CPU cores and the maximum memory in gigabytes that can be used by this Cloudera Data Engineering service.
Optional: GPU (Technical Preview), in Capacity , use the slider to set the maximum number of GPU cores in gigabytes that can be used by this Cloudera Data Engineering service. GPU resources are limited in the cluster and all data services like Cloudera AI and Cloudera Data Engineering could share or dedicatedly set resource quotas for their experience. For information about configuring resource pool and capacity, see Managing cluster resources using Quota Management (Technical Preview).
In Cloudera Data Engineering 1.5.5 SP2 and higher releases, in the Telemetry section, select the Enable Observability Analytics checkbox to share diagnostic information about jobs and queries with Cloudera.
Prerequisite: To use Cloudera Observability SaaS with Cloudera Data Engineering, you must enable Telemetry Publisher and add the Altus Credentials on the Cloudera on premises base cluster. For more information, see Generating an API access key, Enabling the telemetry network communication for Cloudera Observability, and Adding and starting an instance of Telemetry Publisher.
note
You must run Spark jobs from Cloudera Data Engineering for the data to be available in Cloudera Observability.
Optional: Under Additional Configurations, in NFS Storage Class, specify the name of the custom NFS storage class. The storage provisioner must support ReadWriteMany access modes. By default, Cloudera Data Engineering uses CephFS provisioner in the OpenShift Container Platform and Longhorn provisioner in the Cloudera Embedded Container Service. If it does not exist, the Cloudera Data Engineering service initialisation fails.
You can specify the name of the Portworx storage class specified during the Cloudera Data Services installation to use the Portworx storage class. The storage provisioner must support ReadWriteMany access mode. You can obtain the name of the Portworx storage class from your cluster by running the kubectl get sc command. The Cloudera Data Engineering service and virtual clusters will now use the Portworx storage class instead of the default storage class of the platform.
For more information, see Installing in internet environment and Storage Classes.
Default Virtual Cluster selection is enabled by default to create a default virtual cluster after enabling a Cloudera Data Engineering service. This helps you to create your jobs easily, without having to wait to create a Cloudera Data Engineering virtual cluster as mentioned in Creating virtual clusters, making the onboarding smoother. You can turn this toggle off if you do not wish to use a default virtual cluster.
Click Enable.

The Cloudera Data Engineering Home page displays the status of the Cloudera Data Engineering service initialization. You can view logs for the service by clicking on the service vertical ellipsis (three dots) menu, and then clicking View Logs.