Creating an Ozone data connection
Cloudera AI supports data connections to Ozone file systems.
You can set up the connection manually using the following example snippet. Connecting to Ozone requires Spark 3.
Set the following parameters:
- The appropriate Data Lake directory location in DATALAKE_DIRECTORY.
- A valid database and table name in the describe formatted SQL command.
from pyspark.sql import SparkSession
# Change to the appropriate Datalake directory location
DATALAKE_DIRECTORY = "s3a://your-aws-demo/"
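spark = (
    SparkSession.builder
    # Add the Ozone file system client JAR to the Spark classpath
    .config("spark.jars", "/opt/ozone-addon/jar/ozone-filesystem-hadoop3.jar")
    .config("spark.yarn.access.hadoopFileSystems", DATALAKE_DIRECTORY)
    .getOrCreate()
)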
spark.sql("show databases").show()
spark.sql("describe formatted <database_name>.<table_name>").show()
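The last statement only works once the <database_name> and <table_name> placeholders are replaced with real names. As a minimal sketch, the query string could be built by a small helper that rejects obviously invalid names before the statement reaches Spark. The helper name describe_formatted_query and the identifier pattern are assumptions made for illustration; they are not part of Cloudera AI or PySpark, and the pattern is deliberately conservative (names that need backtick quoting are rejected).

```python
import re

# Hypothetical helper, not part of Cloudera AI or PySpark.
# Assumption: names consist of letters, digits, and underscores only.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def describe_formatted_query(database: str, table: str) -> str:
    """Build a 'describe formatted' statement for database.table."""
    for label, value in (("database", database), ("table", table)):
        if not _IDENTIFIER.fullmatch(value):
            raise ValueError(f"invalid {label} name: {value!r}")
    return f"describe formatted {database}.{table}"


# Usage with the session created above (the names are placeholders):
# spark.sql(describe_formatted_query("default", "my_table")).show(truncate=False)
```

Passing truncate=False to show() keeps long values, such as the table location, from being cut off in the output.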