Known Issues in Apache Spark

Learn about the known issues in Spark, their impact on functionality, and the available workarounds.

CDPD-94393: "RuntimeWarning: Failed to add file" message appears even when Spark successfully loads files
In both Spark 2 and Spark 3, an exception raised while adding files to the Python path causes the RuntimeWarning: Failed to add file message to appear even when the Python JAR file is loaded successfully.
Workaround: None. You can safely ignore the message because the file is loaded successfully and the message does not affect job completion.
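Because the message is harmless, it can also be hidden from job output with Python's standard warnings module. The following is a minimal sketch, not a Cloudera-documented workaround. It assumes that the warning text begins with "Failed to add file" and that the category is RuntimeWarning, as the message quoted above suggests. The helper name is hypothetical.

```python
import warnings


def ignore_failed_to_add_file_warning():
    """Hide only the 'Failed to add file' RuntimeWarning (hypothetical helper)."""
    # `message` is a regular expression matched against the START of the
    # warning text. The prefix below is an assumption based on the message
    # quoted in this known issue.
    warnings.filterwarnings(
        "ignore",
        message=r"Failed to add file",
        category=RuntimeWarning,
    )


# Install the filter before the SparkContext or SparkSession is created,
# because the warning is emitted while the Python path is being set up.
ignore_failed_to_add_file_warning()
```

Other RuntimeWarning messages are unaffected, so unrelated problems remain visible.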
CDPD-64788: Inserting Bloom Filters in join operations for Spark 3.4
Enabling spark.sql.optimizer.runtime.bloomFilter.enabled in Spark 3.4 (CDP 7.2.18) considerably improves the performance of many queries but may cause a regression in a few others. Because the improvements outweigh the regressions, this behavior is retained in the Cloudera versions of Spark 3.4.
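If a particular query regresses, one possible approach is to turn the setting off only for that query and restore it afterwards. The sketch below is an illustration under assumptions, not a Cloudera recommendation. It assumes an existing PySpark session object named spark and that the property can be changed at runtime through spark.conf. The context manager name is hypothetical.

```python
from contextlib import contextmanager

BLOOM_FILTER_KEY = "spark.sql.optimizer.runtime.bloomFilter.enabled"


@contextmanager
def bloom_filter_disabled(spark):
    """Temporarily disable runtime Bloom filter insertion (hypothetical helper)."""
    # Remember any explicitly set value; None means "not set in this session".
    previous = spark.conf.get(BLOOM_FILTER_KEY, None)
    spark.conf.set(BLOOM_FILTER_KEY, "false")
    try:
        yield spark
    finally:
        # Restore the earlier state so that other queries keep the default behavior.
        if previous is None:
            spark.conf.unset(BLOOM_FILTER_KEY)
        else:
            spark.conf.set(BLOOM_FILTER_KEY, previous)


# Usage, assuming `spark` is an active SparkSession and `regressing_sql`
# is a query that runs slower with the Bloom filter enabled:
#
#     with bloom_filter_disabled(spark):
#         spark.sql(regressing_sql).collect()
```

Scoping the change to a single query keeps the benefit of the optimization for the rest of the workload.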
CDPD-217: The Apache Spark connector is not supported
The old Apache Spark - Apache HBase Connector (shc) is not supported in CDP releases.
Workaround: Use the new HBase-Spark connector shipped in the CDP release.
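As a rough illustration of reading an HBase table through the HBase-Spark connector from PySpark, consider the sketch below. It assumes that the connector JAR files are already on the Spark classpath and that spark is an active SparkSession. The helper name, the table name, and the column mapping are hypothetical. Check the data source name and options against the connector documentation for your release.

```python
def read_hbase_table(spark, table, columns_mapping):
    """Load an HBase table as a DataFrame (hypothetical helper)."""
    return (
        spark.read.format("org.apache.hadoop.hbase.spark")
        # Name of the HBase table to read.
        .option("hbase.table", table)
        # Maps DataFrame columns to the row key (":key") and to
        # "family:qualifier" cells, with a type for each column.
        .option("hbase.columns.mapping", columns_mapping)
        # Build the HBase configuration from the classpath instead of
        # requiring a pre-created HBaseContext.
        .option("hbase.spark.use.hbasecontext", False)
        .load()
    )


# Usage with a hypothetical "person" table:
#
#     df = read_hbase_table(
#         spark, "person", "name STRING :key, email STRING c:email"
#     )
#     df.show()
```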