What's New in Apache Spark

Learn about the new features of Spark in Cloudera Runtime 7.2.18.

Spark 3 support in Oozie

Oozie now provides new Spark 3 actions for running Spark 3 applications. For more information, see Spark 3 support in Oozie.

Spark 3.4 support

Spark 3.4 is now supported in Cloudera Runtime 7.2.18.

Spark History Server with High Availability

You can configure a load balancer for Spark History Server (SHS) to ensure high availability, so that users can access the Spark History Server UI without disruption. For details on configuring the load balancer for SHS and on the associated limitations, see Using Spark History Servers with high availability.

Spark cluster template update

The following Spark 2 templates were deleted:
  • Data Engineering: Apache Spark, Apache Hive, Apache Oozie
  • Data Engineering: HA: Apache Spark, Apache Hive, Apache Oozie
  • Real-time Data Mart: Apache Impala, Hue, Apache Kudu, Apache Spark
  • Data Discovery and Exploration
The following Spark 3 template was added:
  • Data Discovery and Exploration for Spark3
The following Spark 3 template was renamed:
  • Data Engineering: Apache Spark3 is now Data Engineering: Apache Spark3, Apache Hive, Apache Oozie
The following Spark 3 templates are equivalent to the deleted Spark 2 templates:
  • Data Engineering: Apache Spark3, Apache Hive, Apache Oozie
  • Data Engineering: HA: Apache Spark3, Apache Hive, Apache Oozie
  • Real-time Data Mart: Apache Impala, Hue, Apache Kudu, Apache Spark3
  • Data Discovery and Exploration for Spark3
For more information, see Data Engineering clusters, Data Mart clusters, and Data Discovery and Exploration clusters.

Support for fault tolerant Spark Atlas hook

The spark.lineage.kafka.fault-tolerant.timeout.ms parameter was added to configure the Spark Atlas Connector so that Spark jobs can run even when Kafka brokers are down. This ensures that your job submissions do not fail. For more information, see Spark connector configuration in Apache Atlas.
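
As an illustration only, the following sketch builds a spark-submit command line that sets this parameter. It assumes the parameter can be passed as an ordinary Spark property with --conf; the 60000 ms value, the etl_job.py script name, and the build_spark_submit_args helper are hypothetical placeholders, not recommended settings. Check the linked configuration topic for where the property should actually be set in your deployment.

```python
import shlex

# Parameter name taken from the release note above.
FAULT_TOLERANT_TIMEOUT_KEY = "spark.lineage.kafka.fault-tolerant.timeout.ms"


def build_spark_submit_args(app_path, timeout_ms):
    """Return a spark-submit argument list that sets the timeout property.

    Hypothetical helper: it only assembles arguments and does not run Spark.
    """
    return [
        "spark-submit",
        "--conf", f"{FAULT_TOLERANT_TIMEOUT_KEY}={int(timeout_ms)}",
        app_path,
    ]


# Placeholder values (assumptions): a 60-second timeout and a script name.
args = build_spark_submit_args("etl_job.py", 60000)
print(shlex.join(args))
```

The helper returns a list rather than a single string so that it can be handed to subprocess.run without shell quoting issues.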

Spark 2 deprecation

Spark 2 was deprecated as of 7.2.17. See Deprecation Notices for Spark 2.