Fixed Issues in Apache Spark

Review the list of Spark issues that are resolved in Cloudera Runtime 7.1.7 SP2.

CDPD-44696: Backported a Spark fix to CDP Private Cloud Base to fix integration with CDSW.
CDPD-47340: Backport: SPARK-33756: BytesToBytesMap's iterator hasNext method should be idempotent.
CDPD-47333: [SPARK-33504][CORE] Sensitive attributes in the application log in the Spark history server should be redacted.
CDPD-46530: Apache Ivy upgraded to 2.5.1 to avoid CVE.
CDPD-46306: [SPARK-39505][UI] Escape log content rendered in UI.
CDPD-45679: Backport: SPARK-32638: WidenSetOperationTypes in subquery attribute missing.
CDPD-45059: Added versionless symlinks for spark-*.jar files.
CDPD-45051: Added versionless symlinks for the kafka-clients.jar file.
CDPD-44393: [SPARK-38034][SQL] Optimize TransposeWindow rule.
CDPD-44019: Improve CollapseProject performance (SPARK-28090).
CDPD-43553: Jersey upgraded to 2.36 to avoid CVE.
CDPD-42862: SPARK-38992 is a security vulnerability (CVE-2022-33891).
Because Spark 2.4 reached end of life with 2.4.8, this issue is not fixed in branch-2.4 upstream. The fix has been backported to Spark 2.4 in CDP. See Fixed Common Vulnerabilities and Exposures 7.1.7 SP2.
CDPD-42599: Migrated log4j1 to reload4j to avoid CVE.
CDPD-40104: With HIVE-24920, Hive tables that are translated to external tables change their location on rename, similar to managed tables. Spark now accounts for this and handles translated-to-external tables the same way as managed tables on rename.
CDPD-38590: Backport: SPARK-27514: Empty window expression results in error in optimizer.
CDPD-47343: DIRECT_READER_V2 must handle delete delta files from DELETE and UPDATE queries.
CDPD-47341: HWC DIRECT_READER_V2 task fails with NPE on reading delete delta files.
CDPD-45244: Upgrade to thrift-0.16.0 for spark-acid.
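
For readers unfamiliar with the term used in CDPD-47340, the following is a minimal, generic sketch of what an idempotent hasNext looks like. It is written in Python for illustration only and is not the Spark implementation; the class name IdempotentIterator and the on_exhausted callback are hypothetical. The idea is that repeated has_next() calls neither consume extra elements nor repeat end-of-iteration cleanup.

```python
from typing import Callable, Iterable, Optional

# Sentinel marking "no element buffered" (distinct from any real value, including None).
_MISSING = object()


class IdempotentIterator:
    """Hypothetical wrapper whose has_next() is safe to call repeatedly."""

    def __init__(self, source: Iterable, on_exhausted: Optional[Callable[[], None]] = None):
        self._source = iter(source)
        self._buffered = _MISSING
        self._exhausted = False
        self._on_exhausted = on_exhausted  # assumed cleanup hook, e.g. releasing resources

    def has_next(self) -> bool:
        if self._exhausted:
            # Already finished: return the same answer without side effects.
            return False
        if self._buffered is _MISSING:
            # Pull at most one element and cache it for the following next() call.
            self._buffered = next(self._source, _MISSING)
            if self._buffered is _MISSING:
                self._exhausted = True
                if self._on_exhausted is not None:
                    self._on_exhausted()  # runs exactly once
                return False
        return True

    def next(self):
        if not self.has_next():
            raise StopIteration
        value, self._buffered = self._buffered, _MISSING
        return value
```

With this structure, calling has_next() any number of times between next() calls returns the same result, and the cleanup hook fires only once when the source is exhausted.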

Apache Patch Information

  • SPARK-33504
  • SPARK-38034
  • SPARK-38992
  • SPARK-27514