Unsupported Apache Spark Features

The following Apache Spark features are not supported in Cloudera Data Platform.

  • Apache Spark experimental features/APIs are not supported unless stated otherwise.
  • Using the JDBC Datasource API to access Hive or Impala is not supported.
  • ADLS is not supported for all Spark components. Microsoft Azure Data Lake Store (ADLS) is a cloud-based filesystem that you can access through Spark applications. Spark with Kudu is not currently supported for ADLS data. (Hive on Spark is available for ADLS in CDH 5.12 and higher.)
  • IPython / Jupyter notebooks are not supported. This applies to the IPython notebook system, which was renamed to Jupyter as of IPython 4.0.
  • Certain Spark Streaming features are not supported. The mapWithState method is unsupported because it is a nascent, unstable API.
  • The Thrift JDBC/ODBC server is not supported.
  • The Spark SQL CLI is not supported.
  • GraphX is not supported.
  • SparkR is not supported.
  • GraphFrames is not supported.
  • Structured Streaming is supported, but the following Structured Streaming features are not:

    • Continuous processing, which is still experimental, is not supported.
    • Stream static joins with HBase have not been tested and therefore are not supported.
  • The Spark cost-based optimizer (CBO) is not supported.
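
Some of the features listed above leave recognizable traces in application code, so a rough pre-deployment check is possible. The following is a minimal sketch only, not a Cloudera tool: the helper name find_unsupported and the pattern table are hypothetical, and the patterns assume the usual spellings in Spark code (the mapWithState method, Trigger.Continuous or the PySpark continuous= trigger argument, the spark.sql.cbo.enabled setting, and the org.apache.spark.graphx package). A text match is only a hint and does not prove that a feature is actually in use.

```python
import re
from typing import List, Tuple

# Hypothetical pattern table (an assumption for illustration): each entry maps
# a feature from the list above to one common way it shows up in source text.
UNSUPPORTED_PATTERNS = {
    "mapWithState (Spark Streaming)": re.compile(r"\bmapWithState\b"),
    "Continuous processing (Structured Streaming)": re.compile(
        r"\bTrigger\.Continuous\b|\bcontinuous\s*="
    ),
    "Cost-based optimizer (CBO)": re.compile(r"spark\.sql\.cbo\.enabled"),
    "GraphX": re.compile(r"org\.apache\.spark\.graphx"),
}


def find_unsupported(source: str) -> List[Tuple[int, str]]:
    """Return (line number, feature name) pairs for every pattern match."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for feature, pattern in UNSUPPORTED_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, feature))
    return hits


if __name__ == "__main__":
    sample = (
        "val state = pairs.mapWithState(spec)\n"
        'spark.conf.set("spark.sql.cbo.enabled", "true")\n'
        "val total = df.count()\n"
    )
    for lineno, feature in find_unsupported(sample):
        print(f"line {lineno}: {feature}")
```

Features that have no distinctive textual marker, such as untested stream-static joins with HBase, are outside what a scan like this can detect and still need manual review.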