Enabling the Spark 3 engine in Hue

Hue uses Apache Livy 3 to run Spark SQL queries on the Apache Spark 3 engine. To enable the Spark 3 engine, use Cloudera Manager to specify the Livy server URL in the Hue Advanced Configuration Snippet and to enable the Spark SQL notebook. The Livy for Spark 3 and Spark 3 services are installed when you create a Data Hub cluster with the Data Engineering cluster template.

  1. Log in to Cloudera Manager as an Administrator.
  2. Go to Clusters > HDFS > Configuration and add the following lines in the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml field:
    <property>
        <name>hadoop.proxyuser.hue.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hue.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.spark.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.spark.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.livy.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.livy.hosts</name>
        <value>*</value>
    </property>

  3. Click Save Changes.
  4. Go to Clusters > Livy for Spark 3 service > Configuration and configure the following settings:
    1. Add the hue user to the Admin Users (livy.superusers) field.
    2. Go to the HMS Service field and select Hive.
    3. Click Save Changes.
  5. Go to Clusters > SPARK_ON_YARN > Configuration > Admin Users, add hue to the list of admin users (spark.history.ui.admin.acls) and click Save Changes.
  6. Go to Clusters > SPARK > Configuration > Admin Users, add hue to the list of admin users (spark.history.ui.admin.acls) and click Save Changes.
  7. Go to Clusters > SPARK 3 > Configuration > Admin Users, add hue to the list of admin users (spark.history.ui.admin.acls) and click Save Changes.
  8. Go to Clusters > Hue > Configuration, enter the following lines in the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini field, and click Save Changes:
    [desktop]
    app_blacklist=zookeeper, pig #custom list of blocked apps
    [spark]
    # The port in livy_server_url is the Livy server port, not a Thrift server port.
    # If TLS/SSL is enabled, check whether the Livy URL uses https or http and modify the URL accordingly.
    livy_server_url=http(s)://[***LIVY-FOR-SPARK3-SERVER-HOST***]:[***LIVY-FOR-SPARK3-SERVER-PORT***]
    ssl_cert_ca_verify=false
    security_enabled=true
    [notebook]
    [[interpreters]]
    [[[sparksql]]]
    name=Spark SQL
    interface=livy
  9. Restart the affected services.
You can now select the Spark SQL dialect in the Hue editor and run Spark queries from Hue.
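
To confirm that the livy_server_url value from step 8 points at a running Livy server, you can query the Livy REST API, which lists sessions at GET /sessions. The following Python sketch is a minimal illustration and is not part of the Hue or Cloudera tooling. The function names (sessions_url, summarize_sessions, check_livy) and the example host are hypothetical. The sketch assumes that the endpoint is reachable without authentication and presents a certificate that the client trusts. On a Kerberos-enabled cluster (security_enabled=true), the request also needs SPNEGO authentication, which this sketch does not handle.

```python
import json
import urllib.request


def sessions_url(livy_server_url: str) -> str:
    """Build the Livy REST endpoint that lists sessions."""
    return livy_server_url.rstrip("/") + "/sessions"


def summarize_sessions(payload: dict) -> str:
    """Turn a GET /sessions response body into a one-line summary."""
    sessions = payload.get("sessions", [])
    # Collect the distinct session states in a stable (sorted) order.
    states = sorted({s.get("state", "unknown") for s in sessions})
    return f"{len(sessions)} session(s); states: {', '.join(states) or 'none'}"


def check_livy(livy_server_url: str, timeout: float = 10.0) -> str:
    """Call GET /sessions on the Livy server and summarize the response.

    Assumption: no authentication and a trusted TLS certificate.
    urllib raises HTTPError (for example 401 on a Kerberized cluster)
    or URLError (host or port unreachable, certificate not trusted).
    """
    with urllib.request.urlopen(sessions_url(livy_server_url), timeout=timeout) as resp:
        return summarize_sessions(json.load(resp))
```

For example, calling check_livy("https://livy.example.com:8998") with your own host and port in place of the hypothetical ones returns a short summary when the server answers. The URL building and response parsing are kept in separate functions so that they can be checked without a live cluster.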