Limitations of the Apache Phoenix-Spark connector
Be aware of the following limitations when using the Apache Phoenix-Spark connector:
- The DataSource API provides only basic support for column and predicate pushdown.
- The DataSource API does not support passing custom Phoenix settings in its configuration. If you need fine-grained configuration, create the DataFrame or RDD directly.
- Aggregate and distinct queries are not pushed down to Phoenix. However, you can perform any operation on the RDDs or DataFrames created after reading data from Phoenix.
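The limitations above can be illustrated with a short sketch of a DataSource API read. This is a hedged example, not official connector documentation: the table name `MY_TABLE`, the column names, and the ZooKeeper quorum `zkhost:2181` are placeholders you would replace for your environment, and the option keys follow the `org.apache.phoenix.spark` data source format.

```scala
import org.apache.spark.sql.SparkSession

object PhoenixReadExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("phoenix-read-example")
      .getOrCreate()

    // Read a Phoenix table through the DataSource API.
    // "MY_TABLE" and "zkhost:2181" are placeholders for your environment.
    val df = spark.read
      .format("org.apache.phoenix.spark")
      .option("table", "MY_TABLE")
      .option("zkUrl", "zkhost:2181")
      .load()

    // Column and predicate pushdown: the projected columns and this
    // filter can be pushed down to Phoenix by the connector.
    val filtered = df.select("ID", "COL1").filter(df("COL1") === "some_value")

    // Aggregates are NOT pushed down to Phoenix; Spark computes them
    // on the DataFrame after the rows have been read.
    filtered.groupBy("COL1").count().show()

    spark.stop()
  }
}
```

Note that because custom Phoenix settings cannot be passed through these options, any configuration beyond the basics shown here requires constructing the DataFrame or RDD directly, as described in the second limitation above.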