Command Line Installation

(Optional) Starting the Spark 2 Thrift Server

To enable and start the Spark 2 Thrift Server:

  1. From SPARK_HOME, start the Spark 2 SQL Thrift Server. Specify the port value of the Thrift Server (the default is 10015). For example:

    su spark

    ./sbin/start-thriftserver.sh --master yarn-client --executor-memory 512m --hiveconf hive.server2.thrift.port=10015

  2. Use this port when you connect via Beeline.
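As a minimal sketch of step 2, a Beeline connection might look like the following. The host name `localhost` and the `SHOW DATABASES` query are assumptions for illustration. On a Kerberos-secured cluster the JDBC URL typically also needs a `principal` parameter, which is omitted here.

```shell
# Assumed values: adjust the host to wherever the Thrift Server runs.
THRIFT_HOST="localhost"
THRIFT_PORT=10015   # must match hive.server2.thrift.port used at startup
JDBC_URL="jdbc:hive2://${THRIFT_HOST}:${THRIFT_PORT}"

if command -v beeline >/dev/null 2>&1; then
  # -u supplies the JDBC URL; -e runs a single query and exits
  beeline -u "${JDBC_URL}" -e "SHOW DATABASES;"
else
  echo "beeline not found on PATH; JDBC URL would be ${JDBC_URL}"
fi
```

The port in the URL is the only value taken from the steps above; everything else depends on your environment.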

Kerberos Considerations

If you are installing the Spark 2 Thrift Server on a Kerberos-secured cluster, the following instructions apply:

  • The Spark 2 Thrift Server must run on the same host as HiveServer2, so that it can access the hiveserver2 keytab.

  • Edit the permissions on /var/run/spark2 and /var/log/spark2 to grant read/write access to the Hive service account.

  • Use the Hive service account to start the thriftserver process.
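The permission and startup steps above could be sketched as follows, run as root on the HiveServer2 host. This assumes the Hive service account is named `hive` and that changing directory ownership is acceptable in your environment; it is only one way to grant read/write access, and ACLs or group permissions may suit your cluster better. The startup flags are copied from the earlier example.

```shell
# Assumption: the Hive service account is named "hive".
HIVE_USER="hive"
THRIFT_PORT=10015

# Give the Hive service account read/write access to the run and log directories.
for dir in /var/run/spark2 /var/log/spark2; do
  if [ -d "$dir" ]; then
    chown -R "$HIVE_USER" "$dir"
    chmod -R u+rwX "$dir"
  else
    echo "skipping $dir (not present on this host)"
  fi
done

# Start the Thrift Server as the Hive service account, from SPARK_HOME.
START_CMD="./sbin/start-thriftserver.sh --master yarn-client --executor-memory 512m --hiveconf hive.server2.thrift.port=${THRIFT_PORT}"
if [ -n "${SPARK_HOME:-}" ] && [ -x "${SPARK_HOME}/sbin/start-thriftserver.sh" ]; then
  su "$HIVE_USER" -c "cd '${SPARK_HOME}' && ${START_CMD}"
else
  echo "SPARK_HOME not set; would run as ${HIVE_USER}: ${START_CMD}"
fi
```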

Note

We recommend that you run the Spark 2 Thrift Server as user hive instead of user spark (this supersedes recommendations in previous releases). This ensures that the Spark 2 Thrift Server can access Hive keytabs, the Hive metastore, and data in HDFS that is stored under user hive.

Important

When the Spark 2 Thrift Server runs queries as user hive, all data accessible to user hive is accessible to the user submitting the query. For a more secure configuration, use a different service account for the Spark 2 Thrift Server. Provide appropriate access to the Hive keytabs and the Hive metastore.

For Spark 2 jobs that are not submitted through the Thrift Server, the user submitting the job must authenticate with Kerberos (using kinit) and must have access to the Hive metastore in secure mode.
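For illustration, a user might obtain a Kerberos ticket before submitting a job roughly as shown below. The principal `analyst@EXAMPLE.COM` and the keytab path are hypothetical placeholders; an interactive `kinit` with a password works equally well.

```shell
# Hypothetical principal and keytab path; replace with your own.
PRINCIPAL="analyst@EXAMPLE.COM"
KEYTAB="/path/to/analyst.keytab"

if command -v kinit >/dev/null 2>&1 && [ -f "$KEYTAB" ]; then
  # Obtain a ticket non-interactively from the keytab, then list it to confirm.
  kinit -kt "$KEYTAB" "$PRINCIPAL"
  klist
  # With a valid ticket, submit the Spark 2 job from this same session.
else
  echo "kinit or keytab not available; would authenticate as ${PRINCIPAL}"
fi
```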