Guidelines for Virtual Cluster upkeep
Consider the following upkeep guidelines for the Cloudera Data Engineering (CDE) Spark History Server (SHS).
Lifecycle configuration of CDE Spark event logs
The number of Spark event logs (spark.eventLog.enabled) produced by Spark jobs that run in a CDE Virtual Cluster grows indefinitely with each new CDE Spark run. These event logs are not deleted automatically and are stored on the object store under <CDP env storage location>/dex/<Service ID>/<VC ID>/eventlog/.
The event log location can look like the following examples:
- For Amazon Web Services (AWS): s3a://dex-storage-bucket/datalake/logs/dex/cluster-2xvl4pfp/rdw8q2sh/eventlog/
- For Azure: abfs://<container>@<storage account>.dfs.core.windows.net/dex/cluster-4p54mk8j/22bnm99g/eventlog/
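As a rough illustration of how the location pattern above fits together, the following Python sketch assembles the event log location from its three parts and, for the AWS case, splits it into a bucket name and key prefix of the kind an S3 lifecycle expiration rule filters on. The function names, the rule ID, and the 30-day retention value are assumptions made for this example only; they are not CDE defaults or recommendations.

```python
from urllib.parse import urlparse


def event_log_location(storage_location: str, service_id: str, vc_id: str) -> str:
    """Build <CDP env storage location>/dex/<Service ID>/<VC ID>/eventlog/."""
    return f"{storage_location.rstrip('/')}/dex/{service_id}/{vc_id}/eventlog/"


def s3_expiration_rule(location: str, days: int) -> tuple[str, dict]:
    """Split an s3a:// location into a bucket name and an expiration rule.

    The rule uses the standard S3 lifecycle rule layout (ID, Filter,
    Status, Expiration). Nothing is sent to AWS here.
    """
    parsed = urlparse(location)
    rule = {
        "ID": "expire-cde-spark-event-logs",  # hypothetical rule name
        "Filter": {"Prefix": parsed.path.lstrip("/")},  # key prefix inside the bucket
        "Status": "Enabled",
        "Expiration": {"Days": days},  # retention period is an assumption
    }
    return parsed.netloc, rule


# Values taken from the AWS example above.
location = event_log_location(
    "s3a://dex-storage-bucket/datalake/logs", "cluster-2xvl4pfp", "rdw8q2sh"
)
bucket, rule = s3_expiration_rule(location, days=30)

print(location)
print(bucket, rule["Filter"]["Prefix"])
```

The sketch only builds the rule as a dictionary. Applying it to a bucket, and choosing a retention period that suits your auditing and debugging needs, is left to your own tooling.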