Fixed Issues in Apache Spark
Review the list of Spark issues that are resolved in Cloudera Runtime 7.1.6.
- CDPD-2650: Spark cannot write ZSTD- and LZ4-compressed Parquet to a dynamically partitioned table.
- This issue is resolved.
- CDPD-3783: Unable to create a database in Spark.
- This issue is resolved.
- CDPD-372: YARN aggregation job is missing YARN metric folders because of time zone issues.
- In this release, the Spark Atlas Connector produces a spark_application entity for each Spark job. Each data flow produced by the job creates a spark_process entity in Atlas, which tracks the actual input and output data sets for that process.
- CDPD-18458: [SPARK-32635] When the pyspark.sql.functions.lit() function is used with a DataFrame cache, it returns the wrong result.
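The relationship between the Atlas entities mentioned under CDPD-372 can be pictured with a small sketch. This is only an illustrative model in plain Python, not the Spark Atlas Connector's code or the Atlas API: the class names, the record_flow helper, and the data set names are all hypothetical. It shows one spark_application entity per job, with one spark_process entity per data flow carrying that flow's actual inputs and outputs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SparkProcess:
    """Hypothetical stand-in for one spark_process entity (one data flow)."""
    inputs: List[str]   # data sets the flow actually read
    outputs: List[str]  # data sets the flow actually wrote


@dataclass
class SparkApplication:
    """Hypothetical stand-in for the single spark_application entity of a job."""
    name: str
    processes: List[SparkProcess] = field(default_factory=list)

    def record_flow(self, inputs, outputs):
        # Each data flow produced by the job adds one spark_process entity.
        process = SparkProcess(list(inputs), list(outputs))
        self.processes.append(process)
        return process


# One job: one application entity, two data flows, two process entities.
# The job and table names below are made up for illustration.
app = SparkApplication("nightly_etl")
app.record_flow(["db.raw_events"], ["db.clean_events"])
app.record_flow(["db.clean_events", "db.users"], ["db.daily_report"])

for p in app.processes:
    print(p.inputs, "->", p.outputs)
```

In this sketch, lineage for a job is read by walking the processes list of its application, which mirrors the idea that each process entity tracks its own input and output data sets.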
Apache patch information
No additional Apache patches.