Configure the Phoenix-Spark connector using Cloudera Manager

To use the Phoenix-Spark connector, you need the Spark connector JAR file, which is located in the following directory: /opt/cloudera/parcels/CDH/lib/phoenix_connectors

  1. Go to the Spark service.
  2. Click the Configuration tab.
  3. If you are using the HBase service on the same cluster, ensure that the HBase service is set as a dependency of the Spark service.

    Locate the HBase Service property and select the checkbox next to it.

  4. Select Scope > Gateway.
  5. Select Category > Advanced.
  6. Locate the Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-defaults.conf property or search for it by typing its name in the Search box.
  7. Add the following properties to ensure that all required Phoenix and HBase platform dependencies are available on the classpath for the Spark executors and drivers:

    Phoenix-Spark JARs:

    spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/phoenix_connectors/phoenix5-spark-[***VERSION***]-shaded.jar
    spark.driver.extraClassPath=/opt/cloudera/parcels/CDH/lib/phoenix_connectors/phoenix5-spark-[***VERSION***]-shaded.jar
  8. Enter a Reason for change, and then click Save Changes to commit the changes.
  9. Restart the role and service when Cloudera Manager prompts you to restart.
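
The two properties in step 7 differ only in their key, so their shared value can be derived once from the connector directory. The following is a minimal shell sketch, not a Cloudera-provided tool: the function name `phoenix_spark_props` and its optional directory argument are hypothetical, and it assumes that the first file matching phoenix5-spark-*-shaded.jar is the one you want.

```shell
#!/bin/sh
# Hypothetical helper: print the two safety-valve properties for the
# shaded Phoenix-Spark JAR found in a connector directory.
# Usage: phoenix_spark_props [CONNECTOR_DIR]
phoenix_spark_props() {
  dir="${1:-/opt/cloudera/parcels/CDH/lib/phoenix_connectors}"
  jar=""
  # Take the first file that matches the shaded connector naming pattern.
  for candidate in "$dir"/phoenix5-spark-*-shaded.jar; do
    if [ -f "$candidate" ]; then
      jar="$candidate"
      break
    fi
  done
  if [ -z "$jar" ]; then
    echo "no phoenix5-spark shaded JAR found in $dir" >&2
    return 1
  fi
  echo "spark.executor.extraClassPath=$jar"
  echo "spark.driver.extraClassPath=$jar"
}
```

Calling `phoenix_spark_props` without an argument on a gateway host should print the two lines to paste into the safety valve. If it reports that no JAR was found, check the parcel path before editing the configuration.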
  • Before you can use the Phoenix-Spark connector in your Spark applications, add the Cloudera repository at https://repository.cloudera.com/artifactory/public/org/apache/phoenix/phoenix5-spark/ to your Maven settings and declare the following dependency:
    <dependency>
       <groupId>org.apache.phoenix</groupId>
       <artifactId>phoenix5-spark</artifactId>
       <version>[***VERSION EXAMPLE: 6.0.0.7.1.6.0-297***]</version>
       <scope>provided</scope>
    </dependency>
  • To let your IDE resolve the connector classes, add the following dependency to your build:
    <dependency>
        <groupId>org.apache.phoenix</groupId>
        <artifactId>phoenix5-spark</artifactId>
        <version>[***VERSION EXAMPLE: 6.0.0.7.1.6.0-297***]</version>
        <scope>provided</scope>
    </dependency>
  • Build a Spark application with the Phoenix-Spark connector, relying on the dependencies that are bundled in the connector.
  • Build a Spark application whose dependencies you provide at run time. To do so, pass the --jars /opt/cloudera/parcels/CDH/lib/phoenix_connectors/phoenix5-spark-[***VERSION***]-shaded.jar parameter to the spark-submit command.
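
The run-time option above, which passes the connector with --jars, can be sketched as a small shell wrapper. This is an illustration under assumptions: the function name `phoenix_submit_cmd`, the application class `com.example.PhoenixApp`, and the application JAR `my-app.jar` are hypothetical placeholders, and the function only prints the command instead of running it. `--jars` and `--class` are standard spark-submit options.

```shell
#!/bin/sh
# Hypothetical helper: print a spark-submit command that ships the shaded
# Phoenix-Spark connector JAR with the application through --jars.
# Usage: phoenix_submit_cmd CONNECTOR_JAR MAIN_CLASS APP_JAR
phoenix_submit_cmd() {
  if [ "$#" -ne 3 ]; then
    echo "usage: phoenix_submit_cmd CONNECTOR_JAR MAIN_CLASS APP_JAR" >&2
    return 1
  fi
  # --jars places the connector on the driver and executor classpaths.
  printf 'spark-submit --jars %s --class %s %s\n' "$1" "$2" "$3"
}

# Example with placeholder values (the class and application JAR are made up):
# phoenix_submit_cmd \
#   "/opt/cloudera/parcels/CDH/lib/phoenix_connectors/phoenix5-spark-[***VERSION***]-shaded.jar" \
#   com.example.PhoenixApp my-app.jar
```

Printing the command first makes it easy to review the resolved connector path before submitting the job. Replace [***VERSION***] with the version of the JAR present in your parcel directory.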