Managing Virtual Cluster-level Spark configurations (Technical Preview)

Learn how to add, edit, or delete Spark configurations at the Virtual Cluster (VC) level so that the configurations apply, by default, to all jobs in the Virtual Cluster. You can manage the Spark configurations using the CDE UI, the CDP CLI, or the CDE API.

Previously, Spark configurations were added at the job level. To apply a common configuration to all jobs, you had to add it to each job individually.

With this feature, Spark configurations set on the Virtual Cluster (VC) apply to all of its jobs by default.

For information on how to manage Virtual Cluster-level Spark configurations using the API, see Managing Virtual Cluster-level Spark configurations using the API.
You can edit the Spark configurations using the CDE UI or the CDP CLI as follows.
  1. Sign in to the CDE UI and click Administration.
  2. Select the relevant CDE Service.
  3. Select the relevant Virtual Cluster in the CDE Service and click Cluster Details.
  4. Click Configurations > Spark.

    The Spark page appears.

  5. Under Configurations, enter the Spark configurations as key-value pairs.

    For example:

    is.config.vc=true
    is.config.job=false
    
  6. Click Apply Changes.

    Result: The Spark configurations are updated and applied to the VC and to all jobs in the Virtual Cluster. To validate them, run a job and check the Spark configurations on the Configuration tab of that job run.

  • To create a VC with the Spark configurations, run the following command:
    cdp de create-vc \
             --name               [***VIRTUAL-CLUSTER-NAME***] \
             --cluster-id         [***CLUSTER-ID***] \
             --cpu-requests       [***NUMBER-OF-CPUS-REQUIRED***] \
             --memory-requests    [***NUMBER-OF-MEMORY-UNITS-REQUIRED***] \
             --spark-configs      [***KEY1***]=[***VALUE1***],[***KEY2***]=[***VALUE2***]

    For example:

    cdp de create-vc \
             --name              test-vc \
             --cluster-id        cluster-sq4rpnr2 \
             --cpu-requests      20 \
             --memory-requests   20 \
             --spark-configs     spark.executor.instances=2,spark.dynamicAllocation.maxExecutors=15,key1=val1
    
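    When scripting VC creation, the comma-separated key=value string passed to --spark-configs can be assembled programmatically. The following is a minimal sketch; the format_spark_configs helper is hypothetical and not part of the CDP CLI:

    ```python
    def format_spark_configs(configs: dict) -> str:
        """Format a dict of Spark settings into the comma-separated
        key=value string accepted by the --spark-configs option."""
        return ",".join(f"{key}={value}" for key, value in configs.items())

    # Produces: spark.executor.instances=2,spark.dynamicAllocation.maxExecutors=15
    print(format_spark_configs({
        "spark.executor.instances": 2,
        "spark.dynamicAllocation.maxExecutors": 15,
    }))
    ```

    Note that this simple sketch does not escape keys or values that themselves contain commas or equals signs.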
  • To update the Spark configurations of the VC, run the following command:
    cdp de update-vc \
            --cluster-id        [***CLUSTER-ID***] \
            --vc-id             [***VC-ID***] \
            --spark-configs     [***KEY1***]=[***VALUE1***],[***KEY2***]=[***VALUE2***]
    

    For example:

    cdp de update-vc \
            --cluster-id      cluster-sq4rpnr2 \
            --vc-id           dex-app-w2rhk89r \
            --spark-configs   spark.executor.instances=2,spark.dynamicAllocation.maxExecutors=15,key1=val1
    
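    If you automate updates from a script, building the full argv list before invoking the CLI avoids shell-quoting pitfalls. This sketch assumes the cdp binary is on your PATH; the build_update_vc_command helper is hypothetical:

    ```python
    import shlex

    def build_update_vc_command(cluster_id: str, vc_id: str,
                                spark_configs: dict) -> list:
        """Assemble the argv list for `cdp de update-vc`; the result can be
        passed directly to subprocess.run without shell=True."""
        configs = ",".join(f"{k}={v}" for k, v in spark_configs.items())
        return [
            "cdp", "de", "update-vc",
            "--cluster-id", cluster_id,
            "--vc-id", vc_id,
            "--spark-configs", configs,
        ]

    cmd = build_update_vc_command(
        "cluster-sq4rpnr2", "dex-app-w2rhk89r",
        {"spark.executor.instances": "2"},
    )
    # Print the command for inspection before running it.
    print(shlex.join(cmd))
    ```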
  • To discard or delete all the Spark configurations of the VC, run the following command:
    cdp de update-vc \
            --cluster-id        [***CLUSTER-ID***] \
            --vc-id             [***VC-ID***] \
            --discard-spark-configs
    

    For example:

    cdp de update-vc \
            --cluster-id        cluster-sq4rpnr2 \
            --vc-id             dex-app-w2rhk89r \
            --discard-spark-configs
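Because the VC-level values apply "by default", individual jobs can still carry their own settings. Assuming job-level values take precedence over the VC-level defaults (a typical layering; this page does not spell out the precedence), the effective configuration for a run can be modeled as a dict merge:

```python
# VC-level defaults, as set via the UI or --spark-configs.
vc_defaults = {
    "spark.executor.instances": "2",
    "spark.dynamicAllocation.maxExecutors": "15",
}

# Job-level override for a single job.
job_overrides = {"spark.executor.instances": "4"}

# Later keys win in a dict merge, so the job-level value replaces
# the VC default while untouched VC defaults carry through.
effective = {**vc_defaults, **job_overrides}
print(effective["spark.executor.instances"])
```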