(Optional) Starting the Spark Thrift Server

	Note
	The Spark Thrift Server automatically uses dynamic resource allocation. If you use the Spark Thrift Server, you do not need to set up dynamic resource allocation separately.

To enable and start the Spark Thrift Server:

From SPARK_HOME, start the Spark SQL Thrift Server. Specify the port value of the Thrift Server (the default is 10015). For example:
su spark
./sbin/start-thriftserver.sh --master yarn-client --executor-memory 512m --hiveconf hive.server2.thrift.port=10015
Use this port when you connect via Beeline.
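
As a minimal sketch of the Beeline connection, assuming the Thrift Server runs on the local host, listens on the default port 10015, and accepts connections as user `spark` (the host name, user name, and test query are illustrative and not taken from this guide):

```shell
# Assumed values; adjust for your cluster.
THRIFT_HOST="localhost"   # hypothetical host running the Spark Thrift Server
THRIFT_PORT=10015         # must match hive.server2.thrift.port used at startup
JDBC_URL="jdbc:hive2://${THRIFT_HOST}:${THRIFT_PORT}"

echo "JDBC URL: ${JDBC_URL}"

# Attempt the connection only if beeline is on the PATH.
if command -v beeline >/dev/null 2>&1; then
  beeline -u "${JDBC_URL}" -n spark -e "SHOW DATABASES;" \
    || echo "Could not connect to ${JDBC_URL}"
fi
```

If you started the Thrift Server on a different port, change `THRIFT_PORT` to match.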

Kerberos Considerations

If you are installing the Spark Thrift Server on a Kerberos-secured cluster, the following instructions apply:

The Spark Thrift Server must run on the same host as HiveServer2, so that it can access the HiveServer2 keytab.
Edit the permissions on /var/run/spark and /var/log/spark to grant read/write access to the Hive service account.
Use the Hive service account to start the Spark Thrift Server process.
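
The steps above could be sketched as follows. This is only one possible approach: it assumes the Hive service account is named `hive`, that changing directory ownership is an acceptable way to grant read/write access on your cluster, and that the script is run as root from SPARK_HOME. The `run` helper and the `DRY_RUN` switch are hypothetical conveniences that print each command instead of executing it by default.

```shell
HIVE_USER="hive"          # Hive service account (assumed name)
DRY_RUN="${DRY_RUN:-1}"   # assumption: 1 = only print commands, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Give the Hive service account read/write access to the Spark run and log directories.
for dir in /var/run/spark /var/log/spark; do
  run chown -R "$HIVE_USER" "$dir"
  run chmod -R u+rwX "$dir"
done

# Start the Thrift Server as the Hive service account.
run su "$HIVE_USER" -c "./sbin/start-thriftserver.sh --master yarn-client --hiveconf hive.server2.thrift.port=10015"
```

Review the printed commands first, then rerun with `DRY_RUN=0` once they match your environment.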

	Note
	We recommend that you run the Spark Thrift Server as user `hive` instead of user `spark` (this supersedes recommendations in previous releases). This ensures that the Spark Thrift Server can access Hive keytabs, the Hive metastore, and data in HDFS that is stored under user `hive`.

	Important
	When the Spark Thrift Server runs queries as user `hive`, all data accessible to user `hive` will be accessible to the user submitting the query. For a more secure configuration, use a different service account for the Spark Thrift Server. Provide appropriate access to the Hive keytabs and the Hive metastore.

For Spark jobs that are not submitted through the Thrift Server, the user submitting the job must obtain a Kerberos ticket (via kinit) to access the Hive metastore in secure mode.
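
As a rough illustration of the kinit step, assuming the submitting user authenticates with a keytab (the principal and keytab path below are hypothetical placeholders, not values from this guide):

```shell
# Hypothetical principal and keytab; replace with your own.
PRINCIPAL="alice@EXAMPLE.COM"
KEYTAB="/etc/security/keytabs/alice.keytab"

# Request a ticket only if kinit and the keytab are both available.
if command -v kinit >/dev/null 2>&1 && [ -f "$KEYTAB" ]; then
  kinit -kt "$KEYTAB" "$PRINCIPAL" && klist
  KINIT_ATTEMPTED=1
else
  echo "kinit or keytab not available; skipping ticket request"
  KINIT_ATTEMPTED=0
fi
```

After a ticket has been obtained, submit the Spark job from the same session so that it can use the cached credentials.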
