Unsupported Apache Spark Features
The following Apache Spark features are not supported in Cloudera Data Platform.
- Apache Spark experimental features/APIs are not supported unless stated otherwise.
- Using the JDBC Datasource API to access Hive or Impala is not supported.
- ADLS is not supported for all Spark components. Microsoft Azure Data Lake Store (ADLS) is a cloud-based filesystem that you can access through Spark applications. Spark with Kudu is not currently supported for ADLS data. (Hive on Spark is available for ADLS.)
- IPython / Jupyter notebooks are not supported. This applies to the IPython notebook system, which was renamed to Jupyter as of IPython 4.0.
- Certain Spark Streaming features, such as the mapWithState method, are not supported.
- The Thrift JDBC/ODBC server is not supported.
- The Spark SQL CLI is not supported.
- GraphX is not supported.
- SparkR is not supported.
- GraphFrames is not supported.
Structured Streaming is supported, but the following features of it are not:
- Continuous processing, which is still experimental, is not supported.
- Stream static joins with HBase have not been tested and therefore are not supported.
- The Spark cost-based optimizer (CBO) is not supported.
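
Because the CBO is not supported, it can help to confirm that a job's configuration does not switch it on. The following is a minimal sketch of such a pre-flight check, assuming the job's Spark properties are available as a plain dictionary. The function name find_enabled_cbo_settings and the sample values are hypothetical and are not part of Spark or Cloudera tooling; the sketch also assumes that every CBO switch lives under the spark.sql.cbo. prefix, as spark.sql.cbo.enabled and spark.sql.cbo.joinReorder.enabled do.

```python
# Hypothetical pre-flight check; not part of Spark or Cloudera tooling.
# Assumption: CBO switches are boolean properties under this prefix.
CBO_PREFIX = "spark.sql.cbo."


def find_enabled_cbo_settings(conf):
    """Return the sorted CBO-related property names whose value is true."""
    return sorted(
        key
        for key, value in conf.items()
        # Accept both real booleans and string values such as "true" or " TRUE ".
        if key.startswith(CBO_PREFIX) and str(value).strip().lower() == "true"
    )


if __name__ == "__main__":
    # Sample properties, made up for illustration.
    sample_conf = {
        "spark.app.name": "demo",
        "spark.sql.cbo.enabled": "true",
        "spark.sql.cbo.joinReorder.enabled": "false",
    }
    for name in find_enabled_cbo_settings(sample_conf):
        print(f"Unsupported setting enabled: {name}")
```

An empty result means no CBO property is set to true in the supplied dictionary. It does not inspect settings applied elsewhere, for example at runtime through the session configuration.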