List of required configuration parameters for the S3 to Databricks ReadyFlow
When deploying the S3 to Databricks ReadyFlow, you have to provide the following parameters. Use the information you collected in Prerequisites.
Parameter Name | Description |
---|---|
CDP Workload User | Specify the Cloudera machine user or workload user name that you want to use to authenticate to the object stores and to the schema registry. Ensure this user has the appropriate access rights to the object store locations and to the schema registry in Ranger or IDBroker. |
CDP Workload User Password | Specify the password of the Cloudera machine user or workload user you are using to authenticate against the object stores and the schema registry. |
CDPEnvironment | The Cloudera Environment configuration resources. |
Destination S3 Bucket | Specify the name of the destination S3 bucket you want to write to. The full path will be constructed out of s3a://#{Destination S3 Bucket}/#{Destination S3 Path} |
Destination S3 Path | Specify the path within the destination bucket where you want to write to. Make sure that the path starts with "/". The path must end with the destination Databricks Table Id. The full path is constructed out of s3a://#{Destination S3 Bucket}/#{Destination S3 Path} |
Partition Column | Specify the name of the column used to partition your destination Databricks table. This ReadyFlow only supports a single partition column. |
Partition Column Exists | Specify whether the destination Databricks table is partitioned. The default value is YES. |
Schema Name | Specify the schema name to be looked up in the Schema Registry used to parse the source files. |
Schema Name 2 | If your Databricks table is partitioned, specify the name of the modified schema to be looked up in the Schema Registry. This schema should not include the partition column field. |
Schema Registry Hostname | Specify the hostname of the Schema Registry you want to connect to. This must be the direct hostname of the Schema Registry itself, not the Knox Endpoint. |
Source S3 Bucket | Specify the name of the source S3 bucket you want to read from. |
Source S3 Path | Specify the path within the source bucket where you want to read files from. |
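
The constraints in the table above can be checked before deployment with a small script. The following is a minimal sketch, not part of Cloudera DataFlow: the function `check_parameters`, the `table_id` argument, and all example values are hypothetical. It assumes that Partition Column and Schema Name 2 are only needed when Partition Column Exists is YES, and it leaves out CDPEnvironment because the table gives no value format to check.

```python
# Hypothetical pre-deployment check for the parameters listed above.
# Names and example values are illustrative assumptions, not a Cloudera API.

ALWAYS_REQUIRED = [
    "CDP Workload User",
    "CDP Workload User Password",
    "Destination S3 Bucket",
    "Destination S3 Path",
    "Schema Name",
    "Schema Registry Hostname",
    "Source S3 Bucket",
    "Source S3 Path",
]

# Assumption: these two only matter when the destination table is partitioned.
PARTITION_ONLY = ["Partition Column", "Schema Name 2"]


def check_parameters(params, table_id):
    """Return a list of problems; an empty list means none were detected."""
    problems = []

    # The table states that Partition Column Exists defaults to YES.
    partitioned = params.get("Partition Column Exists", "YES").strip().upper() == "YES"
    required = ALWAYS_REQUIRED + (PARTITION_ONLY if partitioned else [])
    for name in required:
        if not params.get(name, "").strip():
            problems.append(f"missing value: {name}")

    path = params.get("Destination S3 Path", "")
    if path and not path.startswith("/"):
        problems.append('Destination S3 Path must start with "/"')
    if path and path.rstrip("/").rsplit("/", 1)[-1] != table_id:
        problems.append("Destination S3 Path must end with the Databricks Table Id")

    # The table asks for the direct hostname, so reject anything URL-like.
    if "/" in params.get("Schema Registry Hostname", ""):
        problems.append("Schema Registry Hostname must be a bare hostname")

    return problems


example = {
    "CDP Workload User": "srv_readyflow",
    "CDP Workload User Password": "change-me",
    "Destination S3 Bucket": "my-destination-bucket",
    "Destination S3 Path": "/landing/sales_table",
    "Partition Column": "region",
    "Partition Column Exists": "YES",
    "Schema Name": "sales",
    "Schema Name 2": "sales_without_region",
    "Schema Registry Hostname": "schema-registry.example.com",
    "Source S3 Bucket": "my-source-bucket",
    "Source S3 Path": "/incoming",
}

print(check_parameters(example, "sales_table"))  # prints []
```

Returning a list of problems rather than raising on the first one lets you correct every parameter in a single pass before starting the deployment wizard.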