Configure Phoenix-Spark connector using Cloudera Manager

Learn how to configure Phoenix-Spark connector using Cloudera Manager.

  1. Complete Step 1 through Step 7(b) in Configure HBase-Spark connector using Cloudera Manager.
  2. Add the required properties to ensure that all required Phoenix and HBase platform dependencies are available on the classpath for the Spark executors and drivers.
    1. Upload the Phoenix-Spark2 connector file:
      hdfs dfs -put /opt/cloudera/parcels/CDH/lib/phoenix_connectors/phoenix5-spark-shaded.jar /path/hbase_jars_spark2
    2. Add the Phoenix-Spark2 connector file that you uploaded in the previous step to the spark.jars parameter:
      Spark2:
      spark.jars=hdfs:///path/hbase_jars_common/hbase-site.xml.jar,hdfs:///path/hbase_jars_spark2/hbase-spark-protocol-shaded.jar,hdfs:///path/hbase_jars_spark2/hbase-spark.jar,hdfs:///path/hbase_jars_spark2/scala-library.jar,/path/hbase_jars_common(other common files)...,hdfs:///path/hbase_jars_spark2/phoenix5-spark-shaded.jar
  3. Enter a Reason for change, and then click Save Changes to commit the changes.
  4. Restart the role and service when Cloudera Manager prompts you to restart.
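Before restarting, you can confirm that the uploaded connector jar is where the spark.jars parameter expects it. A minimal check, using the same placeholder directory as the steps above:

```shell
# List the uploaded Phoenix-Spark2 connector jar in HDFS.
# /path/hbase_jars_spark2 is the placeholder path used in the steps above;
# substitute the directory you actually used.
hdfs dfs -ls /path/hbase_jars_spark2/phoenix5-spark-shaded.jar
```

If the jar is missing or the path differs from the spark.jars value, Spark executors fail at runtime with ClassNotFoundException for the Phoenix classes.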

Build a Spark2 application using the dependencies that you provide when you run your application. If you followed the previous instructions, Cloudera Manager automatically configures the connector for Spark. If you have not, pass the required dependencies on the command line when you launch your Spark2 application, as in the following example:
spark-shell --conf spark.jars=hdfs:///path/hbase_jars_common/hbase-site.xml.jar,hdfs:///path/hbase_jars_spark2/hbase-spark-protocol-shaded.jar,hdfs:///path/hbase_jars_spark2/hbase-spark.jar,hdfs:///path/hbase_jars_spark2/scala-library.jar,hdfs:///path/hbase_jars_common/hbase-shaded-mapreduce-***VERSION NUMBER***.jar,hdfs:///path/hbase_jars_common/opentelemetry-api-***VERSION NUMBER***.jar,hdfs:///path/hbase_jars_common/opentelemetry-context-***VERSION NUMBER***.jar,hdfs:///path/hbase_jars_spark2/phoenix5-spark-shaded.jar
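With the connector jars on the classpath, you can read a Phoenix table from the Spark2 shell. The following is a minimal sketch using the Phoenix-Spark2 data source; the table name TABLE1 and the ZooKeeper quorum zkhost:2181 are placeholders, not values from this document:

```scala
// Read a Phoenix table through the Phoenix-Spark2 connector.
// "TABLE1" and "zkhost:2181" are hypothetical placeholders; replace
// them with your Phoenix table name and ZooKeeper quorum address.
val df = spark.read
  .format("phoenix")
  .option("table", "TABLE1")
  .option("zkUrl", "zkhost:2181")
  .load()

// Inspect the resulting DataFrame.
df.printSchema()
df.show()
```

If the jars listed in spark.jars are not resolvable, the `format("phoenix")` call fails with a ClassNotFoundException, which is a quick way to verify the configuration above took effect.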