HDFS Sink Connector Properties Reference

HDFS Sink Connector Properties Reference.

The following table collects connector properties that are specific for the HDFS Sink Connector. For properties common to all sink connectors, see the upstream Apache Kafka documentation.

Property Name Description Type Default Value Accepted Values Recommended Values
hdfs.uri The Hadoop file system URI to connect to on the destination HDFS cluster. String None
hdfs.output The root directory on the HDFS cluster where all the output files will reside. The sub path has the following pattern: {topic}/{topic}_{partition}_{endoffset}.{file extension} String /tmp Any path on the HDFS file system where the role has read write permission.
hdfs.kerberos.authentication Enables or disables secure access to the HDFS cluster by authenticating with Kerberos. Boolean false true or false
hdfs.kerberos.user.principal The kerberos user principal. String null The host-dependent Kerberos principal assigned to the Kafka Connect role.
hdfs.kerberos.keytab.path The path to the Kerberos keytab file. String null In a Cloudera Manager provisioned environment, it’s recommended to use the Cloudera Manager Config Provider to automatically provision the path.
hdfs.kerberos.namenode.principal The kerberos name node principal. Required when the HDFS cluster has data encryption on. String null
hadoop.conf.path The path to the site specific Hadoop configuration XML files. Required when the HDFS cluster has data encryption on. String null
output.writer The output file writer which determines the type of file to be written to the HDFS cluster. The value of this property should be the FQCN of a class that implements the PartitionWriter interface. String null
  • com.cloudera.dim.kafka.connect.partition.writers.avro.AvroPartitionWriter
  • com.cloudera.dim.kafka.connect.partition.writers.json.JsonPartitionWriter
  • com.cloudera.dim.kafka.connect.hdfs.parquet.ParquetPartitionWriter
  • com.cloudera.dim.kafka.connect.partition.writers.txt.TxtPartitionWriter
com.cloudera.dim.kafka.connect.partition.writers.avro.AvroPartitionWriter
output.avro.passthrough.enabled Configures whether the output writer expects an Avro encoded Kafka Connect data record. Must match the configuration of value.converter.passthrough.enabled. Boolean true true or false True if input and output are both Avro.
value.converter The converter to be used to translate the value field of the source Kafka record into Kafka Connect Data format. String Inherited from Kafka Connect worker properties.
  • org.apache.kafka.connect.json.JsonConverter
  • org.apache.kafka.connect.storage.StringConverter
  • com.cloudera.dim.kafka.connect.converts.AvroConverter
com.cloudera.dim.kafka.connect.converts.AvroConverter
value.converter.schema.registry.url The URL to the Schema Registry server. String null true or false
value.converter.passthrough.enabled Configures whether the AvroConverter translates an Avro record into Kafka Connect Data or transparently passes the Avro encoded bytes as payload. Boolean true true or false True if input and output are both Avro.