Amazon S3 Sink
Learn more about the Amazon S3 Sink connector
The Amazon S3 Sink connector allows users to stream Kafka data into S3 buckets.
Configuration example
A simple configuration example for the Amazon S3 Sink connector.
The following is a simple configuration example for the Amazon S3 Sink connector. Short descriptions of the properties set in this example are also provided. For a full properties reference, see the Amazon S3 Sink properties reference.
{ "aws.s3.bucket": "bring-me-the-bucket", "aws.s3.service_endpoint": "http://myendpoint:9090/", "aws.access_key_id": "EXAMPLEID", "aws.secret_access_key": “EXAMPLEKEY", "connector.class": "com.cloudera.dim.kafka.connect.s3.S3SinkConnector", "tasks.max": 1, "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "com.cloudera.dim.kafka.connect.converts.AvroConverter", "value.converter.passthrough.enabled": true, "value.converter.schema.registry.url": "http://schema-registry:9090/api/v1", "topics": "avro_topic", "output.storage": "com.cloudera.dim.kafka.connect.s3.S3PartitionStorage", "output.writer": "com.cloudera.dim.kafka.connect.partition.writers.avro.AvroPartitionWriter", "output.avro.passthrough.enabled": true }
aws.s3.bucket
- Target S3 bucket name.
aws.s3.service_endpoint
- Target S3 host and port.
aws.access_key_id
- The AWS access key ID used for authentication.
aws.secret_access_key
- The AWS secret access key used for authentication.
connector.class
- Class name of the Amazon S3 Sink connector.
tasks.max
- Maximum number of tasks.
key.converter
- The converter capable of understanding the data format of the key of each record on this topic.
value.converter
- The converter capable of understanding the data format of the value of each record on this topic.
value.converter.passthrough.enabled
- This property controls whether data is converted into the Kafka Connect intermediate data format before it is written to an output file. Because the input and output formats are the same in this example, the property is set to true, that is, data is not converted.
value.converter.schema.registry.url
- The URL to Schema Registry. This is a mandatory property if the topic has records encoded in Avro format.
topics
- List of topics to consume data from.
output.storage
- The S3 storage implementation class.
output.writer
- Determines the output file format. Because in this example the output format is Avro, AvroPartitionWriter is used.
output.avro.passthrough.enabled
- This property has to match the configuration of the value.converter.passthrough.enabled property because both the input and output formats are Avro.
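Once assembled, a configuration like this can be deployed on a running Kafka Connect cluster through the Kafka Connect REST API. The following is a minimal sketch; the Connect REST endpoint (localhost:8083) and the connector name (s3-sink) are example values, substitute the ones used in your deployment.

# Create the connector by POSTing its configuration to the Kafka Connect
# REST API. localhost:8083 and the connector name "s3-sink" are example
# values; replace them with the values of your deployment.
curl -X POST -H "Content-Type: application/json" \
    http://localhost:8083/connectors \
    -d '{
          "name": "s3-sink",
          "config": {
            "aws.s3.bucket": "bring-me-the-bucket",
            "aws.s3.service_endpoint": "http://myendpoint:9090/",
            "aws.access_key_id": "EXAMPLEID",
            "aws.secret_access_key": "EXAMPLEKEY",
            "connector.class": "com.cloudera.dim.kafka.connect.s3.S3SinkConnector",
            "tasks.max": 1,
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "value.converter": "com.cloudera.dim.kafka.connect.converts.AvroConverter",
            "value.converter.passthrough.enabled": true,
            "value.converter.schema.registry.url": "http://schema-registry:9090/api/v1",
            "topics": "avro_topic",
            "output.storage": "com.cloudera.dim.kafka.connect.s3.S3PartitionStorage",
            "output.writer": "com.cloudera.dim.kafka.connect.partition.writers.avro.AvroPartitionWriter",
            "output.avro.passthrough.enabled": true
          }
        }'

# Verify that the connector and its task are running. The name in the path
# must match the name used when the connector was created.
curl http://localhost:8083/connectors/s3-sink/status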
Amazon S3 Sink properties reference
Amazon S3 Sink connector properties reference.
The following table collects the connector properties that are specific to the Amazon S3 Sink connector. For properties common to all sink connectors, see the upstream Apache Kafka documentation.
Property Name | Description | Type | Default Value | Accepted Values | Recommended Value |
---|---|---|---|---|---|
aws.s3.bucket | The target S3 bucket name. | String | none | Any valid S3 bucket name. | |
aws.s3.service_endpoint | The target S3 host and port. | String | none | Any valid S3 endpoint. | |
aws.access_key_id | The AWS access key ID used for authentication. | String | none | Any valid access key ID issued by AWS. | |
aws.secret_access_key | The AWS secret access key used for authentication. | String | none | Any valid secret access key issued by AWS. | |
value.converter | Value conversion class. | String | none | | com.cloudera.dim.kafka.connect.converts.AvroConverter |
value.converter.passthrough.enabled | Configures whether the AvroConverter translates an Avro record into Kafka Connect Data or transparently passes the Avro encoded bytes as payload. | Boolean | true | true, false | True if input and output are both Avro. |
value.converter.schema.registry.url | The URL to the Schema Registry server. | String | none | | |
output.storage | The S3 storage implementation class. | String | none | | com.cloudera.dim.kafka.connect.s3.S3PartitionStorage |
output.writer | The output file writer which determines the type of file to be written. The value of this property should be the FQCN of a class that implements the PartitionWriter interface. | String | none | | com.cloudera.dim.kafka.connect.partition.writers.avro.AvroPartitionWriter |
output.avro.passthrough.enabled | Configures whether the output writer expects an Avro encoded Kafka Connect data record. Must match the configuration of value.converter.passthrough.enabled. | Boolean | none | true, false | True if input and output are both Avro. |
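A candidate configuration can also be checked against this reference before the connector is created, using the validation endpoint of the Kafka Connect REST API. The following is a minimal sketch; the Connect REST endpoint (localhost:8083) and the payload file name (s3-sink-config.json, containing the flat configuration map shown in the configuration example) are example values.

# Validate a configuration without creating the connector. The response
# reports the number of configuration errors and a per-property breakdown.
# The path contains the connector class name; the body must include the
# connector.class property.
curl -X PUT -H "Content-Type: application/json" \
    http://localhost:8083/connector-plugins/S3SinkConnector/config/validate \
    -d @s3-sink-config.json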