Adding a Cloudera Data Engineering service
Before you can use the Cloudera Data Engineering (CDE) service, you must add the service to an environment that you want to use CDE on.
Make sure that you have a working environment for which you want to enable the CDE service. For more information about environments, see Environments.
- In the Cloudera Data Platform (CDP) console, click the Data Engineering tile. The CDE Home page displays.
Click Administration on the left navigation menu, click at the top to enable CDE service for an
If the environment does not have any CDE service, the page displays a Enable a Service button that launches the same wizard.
- Enter a Name for the CDE service you are creating.
- In the Environment text box, select or type the name of the environment that you want to enable CDE for. The displayed list dynamically updates to show environment names matching your input. When you see the correct environment, click on it to select it.
In Resource Pool (Technical Preview), select the name of the
resource pool that you want to enable CDE service for.
For information about configuring resource pool and capacity, see Managing cluster resources using Quota Management (Technical Preview).
- In Capacity , use the slider to set the maximum number of CPU cores and the maximum memory in gigabytes that can be used by this CDE service.
- Optional: GPU (Technical Preview), in Capacity , use the slider to set the maximum number of GPU cores in gigabytes that can be used by this CDE service. GPU resources are limited in the cluster and all data services like CML and CDE could share or dedicatedly set resource quotas for their experience. For information about configuring resource pool and capacity, see Managing cluster resources using Quota Management (Technical Preview).
Under Additional Configurations, in NFS Storage
Class, specify the name of the custom NFS storage class. The storage
provisioner must support ReadWriteMany access modes. By default, CDE
uses CephFS provisioner in the OpenShift Container Platform and
Longhorn provisioner in the Embedded Container Service. If it does
not exist, the CDE service initialisation fails.
You can specify the name of the Portworx storage class specified during the CDP Data Services installation to use the Portworx storage class. The storage provisioner must support ReadWriteMany access mode. You can obtain the name of the Portworx storage class from your cluster by running the kubectl get sc command. The CDE service and virtual clusters will now use the Portworx storage class instead of the default storage class of the platform.
For more information, see Installing in internet environment and Storage Classes.
- Default Virtual Cluster selection is enabled by default to create a default virtual cluster after enabling a CDE service. This helps you to create your jobs easily, without having to wait to create a CDE virtual cluster as mentioned in Creating virtual clusters, making the onboarding smoother. You can turn this toggle off if you do not wish to use a default virtual cluster.
- Click Enable.