Example: Connect a Spark session to Hive Metastore in a Data Lake

After the Admin sets up the correct permissions, you can access the Data Lake from your project, as this example shows.

Make sure you have access to a Data Lake containing your data.

Create a project in your ML Workspace.
Create a file named spark-defaults.conf, or update the existing file with the property:
- For S3: spark.yarn.access.hadoopFileSystems=s3a://STORAGE LOCATION OF ENV>
- For ADLS: spark.yarn.access.hadoopFileSystems=abfs://STORAGE CONTAINER OF ENV>@STORAGE ACCOUNT OF ENV>
Use the same location you defined in Data Access.
Start a session (Python or Spark) and start a Spark session.

Setting up the project looks like this:

Now you can run Spark SQL commands. For example, you can: