Known Issues in Apache Spark
Learn about the known issues in Spark, the impact or changes to the functionality, and the workaround.
- CDPD-22670 and CDPD-23103: Two Spark configurations, "Atlas dependency" and "spark_lineage_enabled", can conflict. The conflict occurs when Atlas dependency is turned off but spark_lineage_enabled is turned on.
- If you run a Spark application with these settings, Spark logs error messages and cannot continue. To recover, correct the configurations, restart the Spark component, and distribute the client configurations.
- CDPD-217: HBase/Spark connectors are not supported
- The Apache HBase Spark Connector (hbase-connectors/spark) and the Apache Spark - Apache HBase Connector (shc) are not supported in the initial CDP release.
- CDPD-3038: Launching pyspark displays several HiveConf warning messages
- When pyspark starts, several Hive configuration warning messages are displayed, similar to the following:
  19/08/09 11:48:04 WARN conf.HiveConf: HiveConf of name hive.vectorized.use.checked.expressions does not exist
  19/08/09 11:48:04 WARN conf.HiveConf: HiveConf of name hive.tez.cartesian-product.enabled does not exist
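
The configuration conflict described under CDPD-22670 and CDPD-23103 comes down to one combination of two on/off settings. The following is a minimal sketch of that rule, not a Cloudera or Spark API: the function name and its two boolean parameters are hypothetical, and the values would have to be read from your own cluster configuration by whatever means you already use.

```python
from itertools import product


def lineage_config_conflict(atlas_dependency_enabled: bool,
                            spark_lineage_enabled: bool) -> bool:
    """Return True for the combination described in CDPD-22670 / CDPD-23103.

    Both parameter names are hypothetical stand-ins for the "Atlas dependency"
    and "spark_lineage_enabled" settings; this helper only encodes the rule
    stated in the known issue and does not talk to a cluster.
    """
    # The known issue applies only when lineage is on while the Atlas
    # dependency is off; every other combination is treated as consistent.
    return spark_lineage_enabled and not atlas_dependency_enabled


if __name__ == "__main__":
    # Print the result for all four combinations of the two settings.
    for atlas, lineage in product([False, True], repeat=2):
        status = "CONFLICT" if lineage_config_conflict(atlas, lineage) else "ok"
        print(f"atlas_dependency={atlas!s:5} spark_lineage_enabled={lineage!s:5} -> {status}")
```

Only the case with Atlas dependency off and spark_lineage_enabled on is reported as a conflict; the sketch makes no claim about how Spark behaves in the other three cases beyond what the known issue states.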