ScanHBase

Description:

Scans and fetches rows from an HBase table. This processor can fetch rows by specifying a range of rowkey values (start and/or end), a time range, a filter expression, or any combination of them. The order of records can be controlled with the Reversed order property, and the number of rows retrieved by the processor can be limited.
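Although the scan criteria can be combined, the property descriptions in this document state two constraints: the time range min and max must be set together or both left blank, and the Filter expression property cannot be used together with Columns. The following is a minimal sketch of those two checks. The function `validate_scan_config` and its parameter names are hypothetical and are not part of NiFi; the processor performs its own validation.

```python
from typing import List, Optional


def validate_scan_config(
    time_range_min: Optional[int] = None,
    time_range_max: Optional[int] = None,
    filter_expression: Optional[str] = None,
    columns: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means no conflict was found.

    Hypothetical helper for illustration only, not a NiFi API.
    """
    problems = []
    # Time range: min and max must be provided together or both left blank.
    if (time_range_min is None) != (time_range_max is None):
        problems.append("Time range min and max must both be set or both be blank")
    # Filter expression and Columns are mutually exclusive.
    if filter_expression and columns:
        problems.append("Filter expression cannot be combined with Columns")
    return problems


# A rowkey range plus a complete time range raises no conflict here.
print(validate_scan_config(time_range_min=0, time_range_max=1000))
# Only one end of the time range is set, so one problem is reported.
print(validate_scan_config(time_range_min=0))
```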

Tags:

hbase, scan, fetch, get

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Display Name | API Name | Default Value | Allowable Values | Description
HBase Client Service | scanhbase-client-service | | Controller Service API: HBaseClientService; Implementation: HBase_2_ClientService | Specifies the Controller Service to use for accessing HBase.
Table Name | scanhbase-table-name | | | The name of the HBase table to fetch from. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Authorizations | hbase-fetch-row-authorizations | | | The list of authorizations to pass to the scanner. This will be ignored if cell visibility labels are not in use. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Start rowkey | scanhbase-start-rowkey | | | The rowkey to start the scan from. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
End rowkey | scanhbase-end-rowkey | | | The rowkey to end the scan at. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Time range min | scanhbase-time-range-min | | | Time range min value. The min and max values for the time range must either both be provided or both be left blank. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Time range max | scanhbase-time-range-max | | | Time range max value. The min and max values for the time range must either both be provided or both be left blank. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Limit rows | scanhbase-limit | | | Limits the number of rows retrieved by the scan. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Reversed order | scanhbase-reversed-order | false | true, false | Sets whether this scan is a reversed one. This is false by default, which means a forward (normal) scan.
Max rows per flow file | scanhbase-bulk-size | 0 | | Limits the number of rows in the content of a single flow file. Set to 0 to avoid multiple flow files. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Filter expression | scanhbase-filter-expression | | | An HBase filter expression that will be applied to the scan. This property cannot be used together with the Columns property. Example: "ValueFilter( =, 'binaryprefix:commit' )" Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Columns | scanhbase-columns | | | An optional comma-separated list of "<colFamily>:<colQualifier>" pairs to fetch. To return all columns for a given family, leave off the qualifier, such as "<colFamily1>,<colFamily2>". Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
JSON Format | scanhbase-json-format | full-row | full-row: creates a JSON document with the format {"row":<row-id>, "cells":[{"fam":<col-fam>, "qual":<col-qual>, "val":<value>, "ts":<timestamp>}]}; col-qual-and-val: creates a JSON document with the format {"<col-qual>":"<value>", "<col-qual>":"<value>"} | Specifies how to represent the HBase row as a JSON document.
Encode Character Set | scanhbase-encode-charset | UTF-8 | | The character set used to encode the JSON representation of the row.
Decode Character Set | scanhbase-decode-charset | UTF-8 | | The character set used to decode data from HBase.
Block Cache | block-cache | true | true, false | Enables or disables the block cache for the HBase scan.
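The two JSON Format options can be illustrated with a short sketch. It only mirrors the document shapes given in the property description and assumes that the "qual" field carries the column qualifier. The function names and the sample row are hypothetical, and the exact quoting and typing of values produced by the processor may differ.

```python
import json


def to_full_row(row_id, cells):
    """Build the full-row layout: the row id plus a list of cell objects."""
    return {
        "row": row_id,
        "cells": [
            {"fam": fam, "qual": qual, "val": val, "ts": ts}
            for (fam, qual, val, ts) in cells
        ],
    }


def to_col_qual_and_val(cells):
    """Build the col-qual-and-val layout: each qualifier mapped to its value."""
    return {qual: val for (_fam, qual, val, _ts) in cells}


# Hypothetical sample row: (family, qualifier, value, timestamp) tuples.
cells = [
    ("cf", "name", "alice", 1700000000000),
    ("cf", "city", "Oslo", 1700000000001),
]
print(json.dumps(to_full_row("row-1", cells)))
print(json.dumps(to_col_qual_and_val(cells)))
```

Note that the col-qual-and-val shape drops the column family and timestamp, so in this sketch two cells with the same qualifier in different families would collapse into one key.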

Relationships:

Name | Description
success | All successful fetches are routed to this relationship.
failure | All failed fetches are routed to this relationship.
original | The original input file will be routed to this destination, even if no rows are retrieved based on provided conditions.

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
hbase.table | The name of the HBase table that the row was fetched from
mime.type | Set to application/json when using a Destination of flowfile-content, not set or modified otherwise
hbase.rows.count | Number of rows in the content of the given flow file
scanhbase.results.found | Indicates whether at least one row has been found in the given HBase table with the provided conditions. Could be null (not present) if transferred to FAILURE
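To show how the Max rows per flow file property might relate to the hbase.rows.count attribute, here is a rough sketch that splits scanned rows into batches and attaches the attributes to each batch. The function `batch_rows` is hypothetical and only models the documented behavior (0 means a single flow file); it is not the processor's implementation.

```python
from typing import Dict, Iterator, List, Tuple


def batch_rows(
    rows: List[dict], table: str, max_rows_per_flow_file: int = 0
) -> Iterator[Tuple[List[dict], Dict[str, str]]]:
    """Yield (rows, attributes) pairs, one per hypothetical output flow file."""
    if not rows:
        return  # nothing scanned, so no batches are produced in this sketch
    # 0 means "do not split": all rows land in a single batch.
    size = max_rows_per_flow_file if max_rows_per_flow_file > 0 else len(rows)
    for start in range(0, len(rows), size):
        chunk = rows[start:start + size]
        yield chunk, {
            "hbase.table": table,
            "hbase.rows.count": str(len(chunk)),
        }


# Five hypothetical rows from a table named "events", at most two per batch.
rows = [{"row": f"row-{i}"} for i in range(5)]
for chunk, attributes in batch_rows(rows, "events", max_rows_per_flow_file=2):
    print(len(chunk), attributes)
```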

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.