Fixed Issues in Apache Spark

Review the list of Spark issues that are resolved in Cloudera Runtime 7.1.5.

CDPD-2650: Spark cannot write ZSTD- and LZ4-compressed Parquet to a dynamically partitioned table.

This issue is resolved.
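To check the fix on your own cluster, the scenario can be sketched roughly as follows. This is a minimal PySpark sketch, not the reproduction from the issue itself. The helper names, the table name, and the DataFrame are hypothetical. It assumes a Hive-enabled SparkSession already exists and that the Hive dynamic-partition settings can be set at runtime in your deployment.

```python
# Illustrative sketch only: helper, table, and DataFrame names are hypothetical.

# Codecs named in CDPD-2650.
ISSUE_CODECS = ("zstd", "lz4")


def dynamic_partition_parquet_conf(codec):
    """Return Spark settings for a compressed, dynamically partitioned Parquet write."""
    codec = codec.lower()
    if codec not in ISSUE_CODECS:
        raise ValueError(f"expected one of {ISSUE_CODECS}, got {codec!r}")
    return {
        # Spark SQL setting that selects the Parquet compression codec.
        "spark.sql.parquet.compression.codec": codec,
        # Hive settings that let every partition value come from the data
        # (assumed to be settable at runtime in this environment).
        "hive.exec.dynamic.partition": "true",
        "hive.exec.dynamic.partition.mode": "nonstrict",
    }


def insert_with_codec(spark, df, table, codec):
    """Apply the settings to an existing SparkSession and append df to the table."""
    for key, value in dynamic_partition_parquet_conf(codec).items():
        spark.conf.set(key, value)
    # insertInto matches columns by position, so partition columns must come last in df.
    df.write.insertInto(table)


# Example call, assuming a SparkSession named `spark`, a DataFrame `daily_df`,
# and an existing Parquet table `sales_by_day` partitioned by `dt` (all hypothetical):
#   insert_with_codec(spark, daily_df, "sales_by_day", "zstd")
```

The settings are kept in a plain dictionary so they can be inspected or logged before they are applied to a session.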

CDPD-3783: Unable to create a database in Spark.

This issue is resolved.

CDPD-372: YARN aggregation job is missing YARN metric folders because of timezone issues

In this release, the Spark Atlas Connector produces a spark_application entity for each Spark job. Each data flow produced by the job creates a spark_process entity in Atlas, which tracks the actual input and output data sets for that process.

CDPD-18458: [SPARK-32635] pyspark.sql.functions.lit() returns the wrong result when used with a cached DataFrame.

When the pyspark.sql.functions.lit() function is used with a cached DataFrame, it returns the wrong result. This issue is resolved.

SPARK-32635

Apache patch information

No additional Apache patches.