Setting up a Spark data connection

Spark data connections within the same environment as Cloudera AI are discovered automatically, but you can also set one up manually by following this procedure.

1. In the Workbenches UI, select the environment link for the workbench you are using. This takes you to the Environments UI.
2. In Environments, select the Data Lake tab, then the Cloud Storage tab.
3. Select the directory path shown for Hive Metastore External Warehouse, and copy it.
4. In Project Settings > Data Connections, click New Connection.
5. Enter a name for the connection.
6. Select the type: Spark Data Lake.
7. Paste the value you copied in step 3 into Datalake Hive Metastore External Warehouse Directory.
8. Click Create.

The data connection is available to users by default. To change availability, click the Available toggle.
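
Before pasting the warehouse directory into the connection form, it can help to check that the copied value is a complete cloud storage path rather than a truncated fragment. The following is a minimal sketch of such a check, not part of Cloudera AI. The function name, the example bucket, and the list of URI schemes are assumptions for illustration; adjust the schemes to match the cloud provider your environment uses.

```python
from urllib.parse import urlparse

# Assumption: cloud storage URI schemes a warehouse path might use.
# This set is illustrative; edit it for your environment.
ASSUMED_SCHEMES = {"s3a", "abfs", "abfss", "gs"}


def looks_like_warehouse_path(path: str) -> bool:
    """Return True if the value has an assumed cloud scheme,
    a bucket or container name, and a non-empty directory part.

    Hypothetical helper: it only inspects the shape of the string
    and does not contact the storage service.
    """
    parsed = urlparse(path.strip())
    return (
        parsed.scheme in ASSUMED_SCHEMES
        and bool(parsed.netloc)
        and bool(parsed.path.strip("/"))
    )


if __name__ == "__main__":
    # Hypothetical example value, not taken from a real environment.
    copied = "s3a://example-bucket/warehouse/tablespace/external/hive"
    print(looks_like_warehouse_path(copied))
```

A value that fails this check (for example, a path with no scheme or no directory after the bucket name) was probably copied incompletely, so copy it again from the Cloud Storage tab.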