Flume Solr UUIDInterceptor Configuration Options
Flume can modify and drop events using Interceptors, which can be attached to any Flume source. The Solr UUIDInterceptor sets a universally unique 128-bit identifier (such as f692639d-483c-1b5f-cd61-183cb1726ae0) on each event.
Cloudera recommends assigning UUIDs to events as early as possible (for example, in the first Flume source of your data flow). This lets you deduplicate events that are replicated or redelivered in a Flume pipeline designed for high availability and high performance. If available, application-level UUIDs are preferable to auto-generated UUIDs because they allow you to later update or delete the document in Solr using that key. If application-level UUIDs are not present, you can use UUIDInterceptor to automatically assign UUIDs to document events.
The UUIDInterceptor supports the following configuration options (type is the only required option):
|Property Name||Default||Description|
|type||(none)||Must be set to the fully qualified class name (FQCN) org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder.|
|headerName||id||The name of the Flume header to use for setting the UUID.|
|preserveExisting||true||Determines whether to preserve existing UUID headers.|
|prefix||""||Specifies a string to prepend to each generated UUID.|
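The options above could be combined in a Flume agent properties file roughly as follows. This is a minimal sketch, not a definitive configuration: the agent name (agent), the source name (avroSrc), the interceptor name (uuidinterceptor), and the prefix value (myhostname) are all hypothetical placeholders chosen for illustration. Replace them with the names used in your own data flow.

```properties
# Attach the interceptor to the first source in the data flow.
# "agent", "avroSrc", and "uuidinterceptor" are hypothetical names.
agent.sources.avroSrc.interceptors = uuidinterceptor

# Required: fully qualified class name of the interceptor builder.
agent.sources.avroSrc.interceptors.uuidinterceptor.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder

# Optional: the values below match the documented defaults, except prefix.
agent.sources.avroSrc.interceptors.uuidinterceptor.headerName = id
agent.sources.avroSrc.interceptors.uuidinterceptor.preserveExisting = true

# Hypothetical prefix; the default is an empty string.
agent.sources.avroSrc.interceptors.uuidinterceptor.prefix = myhostname
```

Leaving preserveExisting set to true means that an event that already carries an id header, such as an application-level UUID, keeps that value.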
For examples, see BlobHandler and BlobDeserializer.