Running Hive on Spark
This section explains how to run Hive using the Spark execution engine. It assumes that the cluster is managed by Cloudera Manager.
Continue reading:
Configuring Hive on Spark
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
To configure Hive to run on Spark do both of the following steps:
- Configure the Hive client to use the Spark execution engine as described in Hive Execution Engines.
- Identify the Spark service that Hive uses. Cloudera Manager automatically sets this to the configured MapReduce or YARN service and the configured Spark service. See Configuring the Hive Dependency on a Spark Service.
Configuring the Hive Dependency on a Spark Service
By default, if a Spark service is available, the Hive dependency on the Spark service is configured. To change this configuration, do the following:
- Go to the Hive service.
- Click the Configuration tab.
- Search for the Spark On YARN Service. To configure the Spark service, select the Spark service name. To remove the dependency, select none.
- Click Save Changes to commit the changes.
- Go to the Spark service.
- Add a Spark gateway role to the host running HiveServer2.
- Return to the Home page by clicking the Cloudera Manager logo.
- Click to invoke the cluster restart wizard.
- Click Restart Stale Services.
- Click Restart Now.
- Click Finish.
- In the Hive client, configure the Spark execution engine.
Configuring Hive on Spark for Performance
For the configuration automatically applied by Cloudera Manager when the Hive on Spark service is added to a cluster, see Hive on Spark Autoconfiguration.
For information on configuring Hive on Spark for performance, see Tuning Hive on Spark.
Troubleshooting Hive on Spark
Delayed result from the first query after starting a new Hive on Spark session
Exception in HiveServer2 log and HiveServer2 is down
Out-of-memory error
Symptom
In the log you see an out-of-memory error similar to the following:15/03/19 03:43:17 WARN channel.DefaultChannelPipeline: An exception was thrown by a user handler while handling an exception event ([id: 0x9e79a9b1, /10.20.118.103:45603 => /10.20.120.116:39110] EXCEPTION: java.lang.OutOfMemoryError: Java heap space) java.lang.OutOfMemoryError: Java heap space