Unsupported Apache Spark Features

The following Apache Spark features are not supported in Cloudera Data Platform.

Spark

Apache Spark experimental features/APIs are not supported unless stated otherwise.
Using the JDBC Datasource API to access Hive or Impala.
Spark Streaming (DStreams) reading from Kafka topics that contain transactions, for example when an idempotent producer is used to publish records.
Spark with Kudu is not supported for ADLS data.
IPython / Jupyter notebooks are not supported. The IPython notebook system was renamed to Jupyter as of IPython 4.0.
Certain Spark Streaming features, such as the mapWithState method, are not supported.
Thrift JDBC/ODBC server (also known as Spark Thrift Server or STS)
Spark SQL CLI
GraphX
SparkR
GraphFrames
Structured Streaming is supported, but the following features of it are not:
- Continuous processing, which is still experimental, is not supported.
- Stream-static joins with HBase have not been tested and therefore are not supported.
Structured Streaming is not supported with Iceberg tables
Spark cost-based optimizer (CBO) is not supported in Spark 2
Python 3.8 and later are not compatible with Spark 2.
For Spark 2.4, Python 3.7.13 and later are not supported. Spark 2.4 was tested with Python up to 3.7.12.
Hudi
Read/write operations on a Hive bucketed table are not supported.

The HBase Connector's Atlas lineage generation is not supported in the Spark-Atlas Connector
The Hive Warehouse Connector's Atlas lineage generation is not supported in the Spark-Atlas Connector
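
The Python version limits listed above for Spark 2.4 can be checked before a job is submitted. The following is a minimal sketch, not a Cloudera-provided tool: the function name `within_tested_range` and the constant `MAX_TESTED_FOR_SPARK_2_4` are hypothetical, and only the upper bound stated above (3.7.12) is checked, because this page does not state a lower bound.

```python
import sys

# Highest Python version that Spark 2.4 was tested with, per the list above.
# The constant name is an illustrative assumption.
MAX_TESTED_FOR_SPARK_2_4 = (3, 7, 12)


def within_tested_range(version=None):
    """Return True if a (major, minor, micro) version is at or below 3.7.12.

    When no version is given, the running interpreter's version is used.
    Only the upper bound is checked; no lower bound is assumed.
    """
    if version is None:
        version = sys.version_info[:3]
    # Tuples compare element by element, so (3, 7, 13) > (3, 7, 12).
    return tuple(version)[:3] <= MAX_TESTED_FOR_SPARK_2_4


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if within_tested_range():
        print("Python " + current + " is within the range tested with Spark 2.4.")
    else:
        print("Python " + current + " is later than 3.7.12 and is not supported with Spark 2.4.")
```

A check like this could run at the start of a PySpark driver script so that a job on an unsupported interpreter stops early with a clear message.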