Hue uses Apache Livy 3 to run Spark SQL queries on the Apache Spark 3 engine.
To enable the Spark 3 engine, specify the Livy server URL in the Hue Advanced
Configuration Snippet using Cloudera Manager, and enable the Spark SQL notebook.
The Livy for Spark 3 and Spark 3 services are installed when you create
a Data Hub cluster with the Data Engineering cluster template.
Log in to Cloudera Manager as an Administrator.
Go to Clusters > HDFS > Configuration and add the following lines in the Cluster-wide
Advanced Configuration Snippet (Safety Valve) for core-site.xml
field:
Go to Clusters > Livy for Spark 3 service > Configuration and add the following configurations:
Add the hue user to the Admin Users (livy.superusers) field.
Go to the HMS Service field and select
Hive.
Click Save Changes.
Go to Clusters > SPARK_ON_YARN > Configuration > Admin Users, add hue to the list of admin users
(spark.history.ui.admin.acls) and click Save
Changes.
Go to Clusters > SPARK > Configuration > Admin Users, add hue to the list of admin users
(spark.history.ui.admin.acls) and click Save
Changes.
Go to Clusters > SPARK 3 > Configuration > Admin Users, add hue to the list of admin users
(spark.history.ui.admin.acls) and click Save
Changes.
Go to Clusters > Hue > Configuration and enter the following lines in the Hue Service
Advanced Configuration Snippet (Safety Valve) for
hue_safety_valve.ini field and click Save
Changes:
[desktop]
app_blacklist=zookeeper, pig # custom list of blocked apps
[spark]
# The port in livy_server_url is the Livy server port, not a Thrift server port.
# If TLS/SSL is enabled, check whether the Livy URL uses https or http and modify the URL accordingly.
livy_server_url=http(s)://[***LIVY-FOR-SPARK3-SERVER-HOST***]:[***LIVY-FOR-SPARK3-SERVER-PORT***]
ssl_cert_ca_verify=false
security_enabled=true
[notebook]
  [[interpreters]]
    [[[sparksql]]]
    name=Spark SQL
    interface=livy
Restart the affected services.
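The livy_server_url value in the snippet above is easy to get wrong, for example by leaving the http(s) template or a [***...***] placeholder in place. The following is a minimal Python sketch of a sanity check for that value, assuming the host:port form shown in the snippet. The function name check_livy_url and the example host and port are hypothetical and are not part of Hue or Cloudera Manager.

```python
from urllib.parse import urlsplit


def check_livy_url(url):
    """Return a list of problems found in a livy_server_url value (empty if none)."""
    # Documentation placeholders look like [***NAME***] and must be replaced.
    if "[***" in url or "***]" in url:
        return ["placeholder not replaced"]
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError if the port is not a valid number
    except ValueError:
        return ["URL could not be parsed"]
    problems = []
    # The template "http(s)" must be resolved to exactly one scheme.
    if parts.scheme not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not parts.hostname:
        problems.append("missing host")
    if port is None:
        problems.append("no explicit port")
    return problems


if __name__ == "__main__":
    # Hypothetical value; substitute the URL of your own Livy for Spark 3 server.
    print(check_livy_url("https://livy.example.com:28998"))
```

An empty list means the value has the expected shape. It does not confirm that the server is reachable or that the chosen scheme matches the TLS/SSL setting of the Livy service.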
You can now select the Spark SQL dialect in the Hue
editor and run Spark queries from Hue.
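If queries fail in Hue, it can help to see what a Spark SQL request to Livy looks like outside of Hue. The sketch below uses only the Python standard library to build two requests against the Livy REST API: one that creates a session of kind sql and one that submits a statement to it. The helper names, host, and port are hypothetical. The sketch also assumes an unsecured endpoint: on a cluster with security_enabled=true, Livy typically expects Kerberos (SPNEGO) authentication, which this code does not handle.

```python
import json
import urllib.request


def session_request(livy_url):
    """Build a POST /sessions request asking Livy for a SQL session."""
    body = json.dumps({"kind": "sql"}).encode("utf-8")
    return urllib.request.Request(
        livy_url.rstrip("/") + "/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def statement_request(livy_url, session_id, sql):
    """Build a POST /sessions/{id}/statements request carrying one SQL statement."""
    body = json.dumps({"code": sql}).encode("utf-8")
    return urllib.request.Request(
        f"{livy_url.rstrip('/')}/sessions/{session_id}/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON reply (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A typical flow would be to call send(session_request(url)), read the id field from the reply, wait until the session is idle, and then call send(statement_request(url, session_id, "SHOW DATABASES")). Building the requests separately from sending them keeps the payloads easy to inspect without touching the cluster.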