Known Issues in Apache Spark
Learn about the known issues in Spark, the impact or changes to the functionality, and the workaround.
- CDPD-94393: "RuntimeWarning: Failed to add file" message appears even when Spark successfully loads files
- In both Spark 2 and 3, due to an exception when attempting to add files to the Python path, the "RuntimeWarning: Failed to add file" message appears even when the Python JAR file is successfully loaded.
- CDPD-64788: Inserting Bloom Filters in join operations for Spark 3.4
- When spark.sql.optimizer.runtime.bloomFilter.enabled is enabled in Spark 3.4 (CDP 7.2.18), it considerably improves many queries but may cause regressions in a few others. Because the improvement outweighs the regressions, this behavior is retained in Spark 3.4 in Cloudera Spark versions.
- CDPD-217: The Apache Spark connector is not supported
- The old Apache Spark - Apache HBase Connector (shc) is not supported in CDP releases.
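
For CDPD-94393, the list above does not state a workaround. Because the warning is described as harmless when the file loads, one possible approach for PySpark jobs is to filter it with Python's standard warnings module. The following is a minimal sketch under that assumption: the helper name is hypothetical, and the message pattern is taken from the issue text, so it may need adjusting if your build words the warning differently.

```python
import warnings

# Assumption: this prefix comes from the issue description above, not from
# inspecting a specific Spark build. Adjust it if the wording differs.
FAILED_TO_ADD_FILE_PATTERN = r"Failed to add file"


def silence_failed_to_add_file_warning() -> None:
    """Ignore only RuntimeWarnings whose message starts with the known text.

    Other RuntimeWarnings are left untouched.
    """
    warnings.filterwarnings(
        "ignore",
        message=FAILED_TO_ADD_FILE_PATTERN,
        category=RuntimeWarning,
    )
```

The filter only affects warnings raised after it is installed, so it would need to be called before the Spark session or context is created in the driver script. It hides the message in every case, including a genuine load failure, so it is worth confirming that the files are actually importable first.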

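
For CDPD-64788, a query that regresses with runtime Bloom filters can be compared against a run with the property turned off for that session. The sketch below assumes PySpark and an existing SparkSession; the helper name is hypothetical, and disabling the property is an illustration, not a documented Cloudera recommendation.

```python
# Property name as given in the CDPD-64788 entry above.
BLOOM_FILTER_KEY = "spark.sql.optimizer.runtime.bloomFilter.enabled"


def set_runtime_bloom_filter(spark, enabled: bool) -> None:
    """Set the runtime Bloom filter flag on an existing SparkSession.

    `spark` is expected to expose the usual `spark.conf.set(key, value)`
    runtime configuration interface.
    """
    spark.conf.set(BLOOM_FILTER_KEY, "true" if enabled else "false")


# Example usage (requires a running Spark session):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   set_runtime_bloom_filter(spark, False)  # then re-run the regressed query
```

Setting the value per session keeps the default behavior for other workloads, which matches the note that most queries benefit from the optimization.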