This is the documentation for CDH 5.0.x.

Flume Solr BlobHandler Configuration Options

Flume can accept Flume events over HTTP POST and GET through HTTPSource.

By default HTTPSource splits JSON input into Flume events. As an alternative, Flume Solr BlobHandler is a handler for HTTPSource that returns an event that contains the request parameters as well as the Binary Large Object (BLOB) uploaded with this request. Note that this approach is not suitable for very large objects because it buffers the entire BLOB.

Flume Solr BlobHandler provides the following configuration options in the flume.conf file:

Property Name            Default               Description
handler                  (none)                The FQCN of this class: org.apache.flume.sink.solr.morphline.BlobHandler
handler.maxBlobLength    100000000 (100 MB)    The maximum number of bytes to read and buffer for a given request.
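To make the effect of handler.maxBlobLength concrete, the following is a minimal Python sketch of a bounded read, not the handler's actual code. The function name read_bounded and the chunk size are hypothetical, and the sketch assumes that input beyond the limit is simply not read; this page does not state how BlobHandler treats oversized requests.

```python
import io


def read_bounded(stream, max_blob_length, chunk_size=8192):
    """Buffer at most max_blob_length bytes from a binary stream in memory.

    Assumption for illustration: bytes beyond the limit are left unread.
    """
    buf = bytearray()
    while len(buf) < max_blob_length:
        # Never request more than the remaining allowance.
        chunk = stream.read(min(chunk_size, max_blob_length - len(buf)))
        if not chunk:  # end of stream
            break
        buf.extend(chunk)
    return bytes(buf)


if __name__ == "__main__":
    body = io.BytesIO(b"0123456789")
    print(len(read_bounded(body, 4)))
```

Because the whole buffer lives in memory, the limit also bounds the memory a single request can consume, which is why very large objects are a poor fit for this handler.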

For example, here is a flume.conf section for an HTTPSource with a BlobHandler for the agent named "agent":
agent.sources.httpSrc.type = org.apache.flume.source.http.HTTPSource
agent.sources.httpSrc.port = 5140
agent.sources.httpSrc.handler = org.apache.flume.sink.solr.morphline.BlobHandler
agent.sources.httpSrc.handler.maxBlobLength = 2000000000
agent.sources.httpSrc.interceptors = uuidinterceptor
agent.sources.httpSrc.interceptors.uuidinterceptor.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
agent.sources.httpSrc.interceptors.uuidinterceptor.headerName = id
#agent.sources.httpSrc.interceptors.uuidinterceptor.preserveExisting = false
#agent.sources.httpSrc.interceptors.uuidinterceptor.prefix = myhostname
agent.sources.httpSrc.channels = memoryChannel
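
A client for the source configured above could look roughly like the following Python sketch, which uses only the standard library. It assumes the agent is reachable on localhost at port 5140 as in the example. The helper names build_blob_request and send_blob, the request parameter resourceName, and the Content-Type value are hypothetical choices for illustration, not part of the Flume configuration.

```python
import urllib.parse
import urllib.request


def build_blob_request(blob, params, host="localhost", port=5140):
    """Build an HTTP POST whose body is the raw BLOB.

    Request parameters travel in the query string; the handler is described
    as including request parameters in the resulting event.
    """
    query = urllib.parse.urlencode(params)
    url = f"http://{host}:{port}/"
    if query:
        url += "?" + query
    return urllib.request.Request(
        url,
        data=blob,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


def send_blob(blob, params, **kwargs):
    """Send the BLOB and return the HTTP status code (needs a running agent)."""
    request = build_blob_request(blob, params, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


if __name__ == "__main__":
    # "resourceName" is a made-up parameter name used only for this example.
    req = build_blob_request(b"\x00\x01\x02", {"resourceName": "sample.bin"})
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it possible to inspect the URL and body without a running agent. Payloads should stay below the configured handler.maxBlobLength.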
Page generated September 3, 2015.