Flume Kudu Sink
The Flume Kudu sink reads events from a channel and writes them to a Kudu table. If Kudu is installed on the node where the Flume agent runs, the Flume start script discovers it and adds the Kudu sink to the Flume classpath, so the sink can be used without any additional environment configuration.
To use the Kudu sink, set the sink type in the Flume configuration to: org.apache.kudu.flume.sink.KuduSink
For more information on the Flume Kudu sink, including all configuration parameters, see the Apache Kudu documentation.
Example:
a1.sinks.k1.type = org.apache.kudu.flume.sink.KuduSink
a1.sinks.k1.masterAddresses = kudu.master.address.example.com
a1.sinks.k1.tableName = mytable
a1.sinks.k1.producer = org.apache.kudu.flume.sink.SimpleKuduOperationsProducer
a1.sinks.k1.batchSize = 1000
a1.sinks.k1.kerberosPrincipal = myflumeprincipal
a1.sinks.k1.kerberosKeytab = myflume.keytab
- org.apache.kudu.flume.sink.AvroKuduOperationsProducer
This is an Avro serializer that generates one operation per event by deserializing the event body as an Avro record and mapping its fields to columns in a Kudu table.
Example:
a1.sinks.k1.producer = org.apache.kudu.flume.sink.AvroKuduOperationsProducer
a1.sinks.k1.producer.operation = upsert
a1.sinks.k1.producer.schemaPath = /tmp/myschema.json
For more information, see: Apache Kudu documentation.
- org.apache.kudu.flume.sink.RegexpKuduOperationsProducer
This is an operations producer that generates one or more Kudu Insert or Upsert operations per Flume Event by parsing the event body as text with a custom regular expression. Each Java named capturing group in the regular expression maps to the Kudu column of the same name, and the captured value is coerced to that column's type.
Example:
a1.sinks.k1.producer = org.apache.kudu.flume.sink.RegexpKuduOperationsProducer
a1.sinks.k1.producer.pattern = (?<id>\\d+),(?<value>\\w+)
a1.sinks.k1.producer.operation = upsert
a1.sinks.k1.producer.unmatchedRowPolicy = IGNORE
For more information, see: Apache Kudu documentation.
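To see what the example pattern above extracts, the following minimal sketch applies it with the standard java.util.regex classes. It is not the producer's code: the class RegexRowSketch and the method parseRows are hypothetical names introduced for illustration, and the sketch assumes the pattern is applied repeatedly across the event body, which is consistent with "one or more operations per Flume Event". Type coercion and the actual Kudu write are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; it does not use the Flume or Kudu APIs.
class RegexRowSketch {
    // Same pattern as in the example configuration. In a Java string literal,
    // "\\d" denotes the regex token \d.
    static final Pattern PATTERN = Pattern.compile("(?<id>\\d+),(?<value>\\w+)");

    // Returns one map of column name to captured text per match in the body.
    static List<Map<String, String>> parseRows(String body) {
        List<Map<String, String>> rows = new ArrayList<>();
        Matcher m = PATTERN.matcher(body);
        while (m.find()) {
            Map<String, String> row = new LinkedHashMap<>();
            row.put("id", m.group("id"));       // would target the column named id
            row.put("value", m.group("value")); // would target the column named value
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // An event body with two matching lines yields two candidate rows.
        System.out.println(parseRows("1,foo\n2,bar"));
        // A body with no match yields no rows; in the sink, unmatchedRowPolicy
        // decides how such an event is handled.
        System.out.println(parseRows("no match here"));
    }
}
```

Testing a pattern this way before deploying it can help confirm that the group names line up with the column names of the target table.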
- org.apache.kudu.flume.sink.SimpleKeyedKuduOperationsProducer
This is a simple serializer that generates one Insert or Upsert per Event by writing the event body into a BINARY column. The key is taken from an Event header: the header name must be the key column name, and the header value becomes the key column value. The key column name is configurable, but the column type must be STRING. Multiple key columns are not supported.
Example:
a1.sinks.k1.producer = org.apache.kudu.flume.sink.SimpleKeyedKuduOperationsProducer
a1.sinks.k1.producer.operation = upsert
a1.sinks.k1.producer.keyColumn = id
a1.sinks.k1.producer.payloadColumn = value
For more information, see: Apache Kudu documentation.
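The relationship between the Event header, the key column, and the payload column can be modelled with a few lines of plain Java. This is only an illustrative sketch under the example settings above (keyColumn = id, payloadColumn = value); KeyedEventSketch and describeRow are hypothetical names, no Flume or Kudu classes are used, and rejecting an event that lacks the key header is an assumption of the sketch rather than documented behavior.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-alone model; it does not use the Flume or Kudu APIs.
class KeyedEventSketch {
    static final String KEY_COLUMN = "id";        // mirrors producer.keyColumn
    static final String PAYLOAD_COLUMN = "value"; // mirrors producer.payloadColumn

    // Describes the row that an event with these headers and body would map to.
    static String describeRow(Map<String, String> headers, byte[] body) {
        String key = headers.get(KEY_COLUMN);
        if (key == null) {
            // Assumption for this sketch: an event without the key header is rejected.
            throw new IllegalArgumentException("missing header: " + KEY_COLUMN);
        }
        return KEY_COLUMN + "=" + key + ", " + PAYLOAD_COLUMN + "=" + body.length + " bytes";
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
        // The header named "id" supplies the STRING key; the body is the BINARY payload.
        System.out.println(describeRow(Collections.singletonMap("id", "row-1"), body));
    }
}
```

In practice this means the Flume source or an interceptor upstream of the sink has to set a header whose name equals the configured key column.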
- org.apache.kudu.flume.sink.SimpleKuduOperationsProducer
This is a simple serializer that generates one Insert or Upsert per Event, depending on the configured operation, by writing the event body into a BINARY column. The headers are discarded.
Example:
a1.sinks.k1.producer = org.apache.kudu.flume.sink.SimpleKuduOperationsProducer
a1.sinks.k1.producer.operation = upsert
a1.sinks.k1.producer.payloadColumn = value
For more information, see: Apache Kudu documentation.