Example: Connect a Spark session to Hive Metastore in a Data Lake
After the Admin sets up the correct permissions, you can access the Data Lake from your project, as this example shows.
Make sure you have access to a Data Lake containing your data.
- Create a project in your ML Workspace.
Create a file named spark-defaults.conf, or update the existing file with the
Use the same location you defined in Data Access.
- For S3: spark.yarn.access.hadoopFileSystems=s3a://STORAGE LOCATION OF ENV>
- For ADLS: spark.yarn.access.hadoopFileSystems=abfs://STORAGE CONTAINER OF ENV>@STORAGE ACCOUNT OF ENV>
- Start a session (Python or Spark) and start a Spark session.
Setting up the project looks like this:
Now you can run Spark SQL commands. For example, you can:
- Create a database foodb.
- List databases and tables.
- Create a table bartable.
- Insert data into the table.
- Query the data from the table.