Using Apache Phoenix to store and access data

Considerations for setting up Spark

Set up Spark based on your requirements. Take the following considerations into account.

  • Configure 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' in the spark-defaults.conf file to include 'phoenix-<version>-client.jar'. This ensures that all required Phoenix and HBase platform dependencies are available on the classpath for the Spark executors and drivers.

  • Add the JARs appropriate for your HDP and Spark versions:

    HDP Version                  Spark Version   JARs to add (order dependent)
    >=2.6.2 (including 3.0.0)    Spark 2         phoenix-<version>-spark2.jar
                                                 phoenix-<version>-client.jar
    >=2.6.2 (including 3.0.0)    Spark 1         phoenix-<version>-spark.jar
                                                 phoenix-<version>-client.jar
    2.6.0-2.6.1                  Spark 2         Unsupported: upgrade to at least HDP-2.6.2
    2.6.0-2.6.1                  Spark 1         phoenix-<version>-spark.jar
                                                 phoenix-<version>-client.jar
    2.5.x                        Spark 1         phoenix-<version>-client-spark.jar
  • To enable code completion and compilation in your IDE, you can add the following provided dependency to your build:

    <dependency>
      <groupId>org.apache.phoenix</groupId>
      <artifactId>phoenix-spark</artifactId>
      <version>${phoenix.version}</version>
      <scope>provided</scope>
    </dependency>
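The classpath settings described above can be sketched as a spark-defaults.conf fragment. The path /usr/hdp/current/phoenix-client/phoenix-client.jar is an assumed install location; substitute the actual path and jar version on your cluster:

```
# Assumed Phoenix client jar location -- adjust for your cluster and version
spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar
spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar
```

If these properties already list other entries, append the Phoenix jar using the platform path separator (':' on Linux) rather than overwriting the existing value.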