Accelerating CDE Jobs and Sessions using GPUs (Technical Preview)

CDE supports accelerating Spark jobs and sessions using GPUs. You can optionally enable GPU acceleration for a Spark job or session using the CDE UI or CLI. Only Spark 3 is supported.

You can use GPUs to get faster execution times and lower infrastructure costs without changing existing CDE application code. When GPU support is enabled, data engineers can use the GPU resources available to the CDE service. You can configure a GPU resource quota per virtual cluster, and Spark jobs and sessions can request GPUs from that quota.

Before you use GPUs to accelerate CDE jobs and sessions, ensure that you have completed the following steps:
  1. Provision nodes with GPUs and meet the software and hardware requirements before installing CDP Private Cloud Data Services.
  2. Set up the GPU nodes.
  3. Test the GPU node setup.
  4. Set the GPU resource quota to allocate GPU resources effectively for CDE. GPU resources are limited in the cluster and are usually shared among all data services.
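
Because GPU resources are limited and shared, it can help to check how many GPUs a planned job would request before you submit it. The following is a minimal sketch, not part of CDE: the helper names (gpu_spark_conf, total_gpus_requested, fits_quota) are hypothetical, and it assumes a fixed number of executors with no dynamic allocation. The property names are the standard Apache Spark 3 resource settings; whether CDE sets them for you when you select GPU acceleration is not covered here.

```python
# Hypothetical helpers for planning a GPU request against a virtual cluster quota.
# Assumption: a static executor count (no dynamic allocation).

def gpu_spark_conf(executors: int, gpus_per_executor: int = 1) -> dict:
    """Build standard Spark 3 properties that request GPUs for executors."""
    if executors < 1 or gpus_per_executor < 1:
        raise ValueError("executors and gpus_per_executor must be >= 1")
    return {
        "spark.executor.instances": str(executors),
        # Standard Spark 3 property: number of GPUs each executor asks for.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
    }


def total_gpus_requested(conf: dict) -> int:
    """Total GPUs the job asks for: executors multiplied by GPUs per executor."""
    executors = int(conf["spark.executor.instances"])
    per_executor = int(conf["spark.executor.resource.gpu.amount"])
    return executors * per_executor


def fits_quota(conf: dict, gpu_quota: int) -> bool:
    """Return True if the request fits within the given GPU quota."""
    return total_gpus_requested(conf) <= gpu_quota


if __name__ == "__main__":
    conf = gpu_spark_conf(executors=4, gpus_per_executor=1)
    print(total_gpus_requested(conf), fits_quota(conf, gpu_quota=8))
```

A check like this only compares one job against the quota. Other jobs, sessions, and data services that share the same GPUs are not accounted for.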