Managing Virtual Cluster-level Spark configurations using the API (Technical Preview)

Learn how to manage Virtual Cluster (VC)-level Spark configurations using the API.

Before you begin

For information about managing Virtual Cluster-level Spark configurations using the CLI, see Managing Virtual Cluster-level Spark configurations.

Creating a VC instance with VC-level Spark configurations

On the control plane side, to create a Virtual Cluster (VC) instance with VC-level Spark configurations, run the following command.

curl -H "Cookie: cdp-session-token=${CST}" '[***BASE-URL***]/dex/api/v1/cluster/[***CLUSTER-ID***]/instance' \
-H "Content-Type: application/json" \
-H 'accept: application/json' \
-X POST -d '{
  "name": "vc-spark-configs-cli",
  "config": {
    "sparkConfigs": {
      "spark.executor.instances": "4",
      "is.config.vc": "true",
      "is.config.job": "false"
    },
    "resources": {
      "cpu_requests": "20",
      "mem_requests": "80Gi"
    }
  }
}'

The base URL depends on the region. For example: https://console.us-west-1.cdp.cloudera.com

Payload for creating the Spark configurations for a VC

{
  "name": "vc-spark-configs-cli",
  "config": {
    "sparkConfigs": {
      "spark.executor.instances": "4",
      "is.config.vc": "true",
      "is.config.job": "false"
    },
    "resources": {
      "cpu_requests": "20",
      "mem_requests": "80Gi"
    }
  }
}

A successful request returns a 200 OK response.
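
When the create call runs from a script, the status code can be checked explicitly instead of read from the terminal. The following is a minimal sketch, not part of any Cloudera tooling: the function names is_success and create_vc, the payload.json file, and the BASE_URL and CLUSTER_ID variables are hypothetical. Only the endpoint, the headers, and the 200 status come from the steps above. The sketch relies on the curl -w '%{http_code}' option to print the status code and on -d @file to read the payload from a file.

```shell
#!/bin/sh
# Returns 0 when the given HTTP status code is 200, non-zero otherwise.
is_success() {
  [ "$1" = "200" ]
}

# Hypothetical wrapper: posts the payload file given as $1 and prints
# only the HTTP status code (the response body is discarded).
# Assumes CST, BASE_URL and CLUSTER_ID are set in the environment.
create_vc() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Cookie: cdp-session-token=${CST}" \
    -H "Content-Type: application/json" \
    -H 'accept: application/json' \
    -X POST -d "@$1" \
    "${BASE_URL}/dex/api/v1/cluster/${CLUSTER_ID}/instance"
}

# Usage (not executed here):
#   status=$(create_vc payload.json)
#   if is_success "$status"; then
#     echo "VC created"
#   else
#     echo "Create failed: HTTP $status" >&2
#   fi
```

Keeping the status check in its own function means the same check can be reused for the update call.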

Updating the Spark configurations for a VC instance

On the control plane side, to update the Spark configurations for a VC instance, run the following command.

curl -H "Cookie: cdp-session-token=${CST}" '[***BASE-URL***]/dex/api/v1/cluster/[***CLUSTER-ID***]/instance/[***INSTANCE-ID***]' \
-H "Content-Type: application/json" \
-H 'accept: application/json' \
-X 'PATCH' -d '{
  "config": {
    "sparkConfigs": {
      "spark.executor.instances": "4",
      "is.config.vc": "true",
      "is.config.job": "false"
    }
  }
}'

The base URL depends on the region. For example: https://console.us-west-1.cdp.cloudera.com

Payload for updating the Spark configurations

{
  "config": {
    "sparkConfigs": {
      "spark.executor.instances": "4",
      "is.config.vc": "true",
      "is.config.job": "false"
    }
  }
}

A successful request returns a 200 OK response.
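
If the set of Spark configurations changes between updates, the PATCH payload can be generated rather than edited by hand. The following sketch assumes a hypothetical helper named build_spark_configs_payload that takes key=value arguments and prints a payload with the same shape as the one shown above. It performs no JSON escaping, so it is only suitable for keys and values without double quotes or backslashes.

```shell
#!/bin/sh
# Builds an update payload from key=value arguments.
# Assumption: keys and values contain no double quotes or backslashes,
# because no JSON escaping is performed.
build_spark_configs_payload() {
  entries=""
  for pair in "$@"; do
    key=${pair%%=*}     # text before the first "="
    value=${pair#*=}    # text after the first "="
    # Prepend a comma for every entry except the first one.
    entries="${entries}${entries:+,}\"${key}\":\"${value}\""
  done
  printf '{"config":{"sparkConfigs":{%s}}}\n' "$entries"
}

# Prints the payload used in the update example, on a single line.
build_spark_configs_payload \
  spark.executor.instances=4 is.config.vc=true is.config.job=false
```

The output can be passed to the update command in place of the inline payload, for example with -d "$(build_spark_configs_payload spark.executor.instances=4 is.config.vc=true is.config.job=false)".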

Getting the Spark configuration information for the Job-runs of a VC instance

In the Job-Runs endpoint, the instanceSparkConfigs field contains the Spark configurations that are applied at VC-level to a particular Virtual Cluster instance.

To get the API URL of the VC that you want to access, in the Cloudera Data Engineering UI, navigate to Administration > Virtual Clusters and click the Cluster Details icon of that VC. To copy the URL, click Actions > COPY JOBS API URL.

On the workload side, to get the Spark configuration information for the Job-runs of a VC instance, run the following command.

curl -X GET '[***JOBS-API-URL***]/job-runs/[***JOB-RUN-ID***]' \
-H 'accept: application/json' \
-H "Cookie: cde-csrf-token=${CSRF}" \
-H "Cookie: hadoop-jwt=${JWT}"

The Spark.Conf field contains the compiled list of configurations: the configurations applied at VC-level and at Job-level, added to the default Spark configurations. This configuration information is sent to Livy, the job submitter, with precedence applied. From highest to lowest priority, the precedence order is: Job-Run, Job, and VC-level configurations.
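
The effect of this precedence order can be illustrated with a small merge. The following sketch is only an illustration of the rule, not the way Cloudera Data Engineering or Livy compiles the list. The merge_configs function and all property values are hypothetical. Configurations are fed from lowest to highest priority, and a later value for the same key overrides an earlier one.

```shell
#!/bin/sh
# Merges key=value lines read from stdin. A later line overrides an earlier
# line with the same key. Keys are printed in order of first appearance.
merge_configs() {
  awk '
    {
      eq = index($0, "=")
      if (eq == 0) next                 # skip lines without "="
      key = substr($0, 1, eq - 1)
      if (!(key in value)) order[++n] = key
      value[key] = substr($0, eq + 1)
    }
    END { for (i = 1; i <= n; i++) print order[i] "=" value[order[i]] }
  '
}

# Illustrative values only, listed from lowest to highest priority.
vc_conf='spark.executor.instances=4
spark.executor.memory=4g'
job_conf='spark.executor.memory=8g'
job_run_conf='spark.executor.instances=10'

effective=$(printf '%s\n' "$vc_conf" "$job_conf" "$job_run_conf" | merge_configs)
echo "$effective"
```

In this example, the Job-Run value for spark.executor.instances and the Job value for spark.executor.memory both replace the VC-level values, so the result is spark.executor.instances=10 and spark.executor.memory=8g.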