Cloudera Runtime Release Notes

Spark

You can review the list of reported issues and their fixes for Spark in 7.3.1.100.

CDPD-76229: Optimize the processing speed of BinaryArithmetic#dataType when processing multi-column data
Restored the performance of some queries in Spark 3.4.1 to match that of other Spark versions (3.3.x and 3.5.x).

Optimized the processing speed of `BinaryArithmetic#dataType` when processing multi-column data.

Apache Jira: SPARK-45071

CDPD-75926: Backport SPARK-44653
Backported SPARK-44653 to fix cache breaking with non-trivial DataFrame unions.
Apache JIRA: SPARK-44653
CDPD-75755: [ENCODER_NOT_FOUND] Not found an encoder of the type T to Spark SQL internal representation when using Parameterized Bean
Fixed an upstream regression that caused an encoder exception for parameterized classes.
Apache JIRA: SPARK-46679
CDPD-75622: Backport upstream fixes for handling nested beans and generic type beans while creating Spark encoders.

Backported upstream fixes from Spark 3.4 to resolve the following issues:

  • Starting with Spark 3.4.x, Encoders.bean raised an exception when the passed class contained a field whose type is a nested bean with type arguments.
  • Starting with Spark 3.4.x, Encoders.bean raised an exception when the provided bean had read-only properties.
  • The bean encoder did not support beans whose superclass has generic type arguments.

Apache Jira: SPARK-44634, SPARK-45081, SPARK-44910
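The three bean shapes listed above can be sketched in plain Java. This is an illustration only: the class names (Box, Order, ReadOnlyBean, Base, Concrete) and the describe helper are hypothetical and do not come from the release notes or from Spark. The sketch uses the JDK's java.beans.Introspector to list each class's properties and does not call Spark itself. With Spark on the classpath, a class such as Order would be passed to Encoders.bean(Order.class).

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class BeanShapes {

    // Case 1: a generic bean used as a nested field with a type argument.
    public static class Box<T> {
        private T value;
        public T getValue() { return value; }
        public void setValue(T value) { this.value = value; }
    }

    public static class Order {
        private Box<String> label = new Box<>();
        public Box<String> getLabel() { return label; }
        public void setLabel(Box<String> label) { this.label = label; }
    }

    // Case 2: a bean with a read-only property (getter without a setter).
    public static class ReadOnlyBean {
        private final long id = 42L;
        public long getId() { return id; }
    }

    // Case 3: a bean whose superclass has a generic type argument.
    public static class Base<T> {
        private T payload;
        public T getPayload() { return payload; }
        public void setPayload(T payload) { this.payload = payload; }
    }

    public static class Concrete extends Base<Integer> {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Hypothetical helper: lists the JavaBean properties of a class and
    // marks each one as read-only or read-write.
    public static List<String> describe(Class<?> beanClass) {
        try {
            // Stop at Object so the synthetic "class" property is excluded.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> out = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                String access = pd.getWriteMethod() == null ? "read-only" : "read-write";
                out.add(pd.getName() + " (" + access + ")");
            }
            return out;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Cannot introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Order: " + describe(Order.class));
        System.out.println("ReadOnlyBean: " + describe(ReadOnlyBean.class));
        System.out.println("Concrete: " + describe(Concrete.class));
    }
}
```

Running main prints the properties of each shape, which can help identify whether an application bean falls into one of the three affected categories.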

CDPD-75353: CHAR and VARCHAR handling in Spark 3 is incompatible with Spark 2
Added a new configuration, spark.cloudera.legacy.charVarcharLegacyPadding, which is set to false by default in Spark 3. Setting it to true together with spark.sql.legacy.charVarcharAsString=true makes CHAR and VARCHAR handling compatible with Spark 2 behavior.
For more information, refer to Migrating Spark applications.
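As a minimal sketch, the two properties named above could be set together in a spark-defaults.conf style fragment. The property names and values are taken from this note. Where the settings belong in a given deployment (cluster-wide defaults, or --conf arguments to spark-submit) is an assumption to verify against the migration documentation.

```
# Enable Spark 2 compatible CHAR/VARCHAR handling (both settings are required)
spark.sql.legacy.charVarcharAsString            true
spark.cloudera.legacy.charVarcharLegacyPadding  true
```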
CDPD-75286: Spark History UI - StreamConstraintsException: String length exceeds the maximum length
Fixed an issue with Jackson to allow JSON strings of unlimited length in Spark event logs.
CDPD-59617: Spark - Upgrade Okio to 1.17.6 due to CVE-2023-3635
Updated Okio from version 1.15.0 to 1.17.6 to address the security vulnerability CVE-2023-3635.
CDPD-74730: Backport SPARK-46239: Hide the Jetty server's version
The Jetty server's version is now hidden.

Apache Jira: SPARK-46239

CDPD-73233: Encoder not found of the type T to Spark SQL internal representation
Fixed an upstream regression that raised an encoder exception (org.apache.spark.SparkUnsupportedOperationException: [ENCODER_NOT_FOUND]) for generic types.
Apache JIRA: SPARK-49789