Accelerating Cloudera Data Engineering Jobs and Sessions using GPUs (Technical Preview)

Cloudera Data Engineering supports GPU acceleration for Spark jobs and sessions. You can optionally enable GPU acceleration for a Spark job or session from the Cloudera Data Engineering UI or CLI. Only Spark 3 is supported.

GPUs can deliver faster execution times and lower infrastructure costs without changes to existing Cloudera Data Engineering application code. When GPU support is enabled, data engineers can use the GPU resources available to the Cloudera Data Engineering service. You can configure a GPU resource quota for each virtual cluster; Spark jobs and sessions can then request GPUs from that quota.
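Because the application code stays unchanged, a GPU request is typically expressed as job configuration. The following Python sketch builds the standard Apache Spark 3 resource properties `spark.executor.resource.gpu.amount` and `spark.task.resource.gpu.amount`. The helper names `gpu_spark_conf` and `to_conf_args` are hypothetical, and it is an assumption that you would set these properties yourself; Cloudera Data Engineering may set them for you when you select GPU acceleration in the UI or CLI.

```python
from typing import Dict, List


def gpu_spark_conf(gpus_per_executor: int = 1, tasks_per_gpu: int = 1) -> Dict[str, str]:
    """Build Spark 3 properties that request GPUs (hypothetical helper)."""
    if gpus_per_executor < 1 or tasks_per_gpu < 1:
        raise ValueError("gpus_per_executor and tasks_per_gpu must be >= 1")
    return {
        # Number of GPUs each executor asks the cluster for.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # Share of one GPU each task claims; 0.25 lets four tasks share a GPU.
        "spark.task.resource.gpu.amount": f"{1 / tasks_per_gpu:g}",
    }


def to_conf_args(conf: Dict[str, str]) -> List[str]:
    """Render the properties as spark-submit style '--conf key=value' pairs."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_conf_args(gpu_spark_conf(gpus_per_executor=1, tasks_per_gpu=4))))
```

The property names come from Apache Spark resource scheduling, not from this page, so check the values against your virtual cluster before relying on them.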

Before you use GPUs to accelerate Cloudera Data Engineering jobs and sessions, complete the following prerequisites:

Ensure that you have nodes with GPUs and that you met the software and hardware requirements before installing Cloudera Data Services on premises.
Set up GPU nodes.
Test GPU node setup.
Set the GPU resource quota to allocate GPU resources effectively for Cloudera Data Engineering. GPU resources are limited in the cluster and are usually shared among all data services.
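Because GPUs are a limited, shared resource, it can help to work out how many GPUs a job would claim before you choose a quota. The sketch below shows only the arithmetic. The `GpuRequest` class and `fits_quota` function are hypothetical names, and it is an assumption that a job's demand equals its driver GPUs plus the per-executor GPUs times the executor count. How Cloudera Data Engineering actually enforces the quota is not described here.

```python
from dataclasses import dataclass


@dataclass
class GpuRequest:
    """GPUs one Spark job or session would ask for (hypothetical model)."""
    driver_gpus: int
    gpus_per_executor: int
    executors: int

    def total(self) -> int:
        # Assumed demand: driver GPUs plus GPUs across all executors.
        return self.driver_gpus + self.gpus_per_executor * self.executors


def fits_quota(request: GpuRequest, quota: int, in_use: int = 0) -> bool:
    """Return True if the request fits in the quota that is still free."""
    return request.total() <= quota - in_use


if __name__ == "__main__":
    job = GpuRequest(driver_gpus=0, gpus_per_executor=1, executors=4)
    print(job.total(), fits_quota(job, quota=8, in_use=2))
```

In this example, a job with four single-GPU executors needs four GPUs. It fits a virtual cluster quota of eight with two GPUs already in use, but not a quota of four with one in use.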