PutBigQuery

Description:

Writes the contents of a FlowFile to a Google BigQuery table. The processor is record-based, so the schema it uses is driven by the RecordReader. Attributes that do not match the target schema are skipped. Exactly-once delivery semantics are achieved via stream offsets.
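
The role of stream offsets in exactly-once delivery can be illustrated with a small in-memory model. This is only a sketch, assuming a write stream that rejects an append whose offset has already been written. `OffsetStream`, `AlreadyExists`, and `append_once` are hypothetical names used for illustration and are not part of NiFi or the BigQuery client libraries.

```python
class AlreadyExists(Exception):
    """Raised when an append targets an offset that was already written."""


class OffsetStream:
    """Hypothetical in-memory stand-in for a write stream that tracks row offsets."""

    def __init__(self):
        self.rows = []

    def append(self, rows, offset):
        # The only acceptable offset is the current end of the stream.
        if offset < len(self.rows):
            raise AlreadyExists(offset)
        if offset > len(self.rows):
            raise ValueError("offset is beyond the end of the stream")
        self.rows.extend(rows)
        return len(self.rows)  # offset to use for the next append


def append_once(stream, rows, offset):
    """Append rows at the given offset; treat a duplicate append as already done.

    Simplification: assumes a retried request carries exactly the same rows
    and offset as the original request.
    """
    try:
        return stream.append(rows, offset)
    except AlreadyExists:
        return offset + len(rows)


stream = OffsetStream()
batch = [{"id": 1}, {"id": 2}]
next_offset = append_once(stream, batch, offset=0)
# Simulate a retry of the same request after a lost acknowledgement.
retry_offset = append_once(stream, batch, offset=0)
```

Because the retry carries the same offset as the original request, the stream can recognise it as a duplicate, so the rows are stored once even though the request was sent twice.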


Tags:

google, google cloud, bq, bigquery

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Display Name | API Name | Default Value | Allowable Values | Description
GCP Credentials Provider Service | GCP Credentials Provider Service | | Controller Service API: GCPCredentialsService; Implementation: GCPCredentialsControllerService | The Controller Service used to obtain Google Cloud Platform credentials.
Project ID | gcp-project-id | | | Google Cloud Project ID. Supports Expression Language: true (will be evaluated using Environment variables only)
BigQuery API Endpoint | bigquery-api-endpoint | bigquerystorage.googleapis.com:443 | | Can be used to override the default BigQuery endpoint. Default is bigquerystorage.googleapis.com:443. Format must be hostname:port. Supports Expression Language: true (will be evaluated using Environment variables only)
Dataset | bq.dataset | ${bq.dataset} | | BigQuery dataset name (Note - The dataset must exist in GCP). Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Table Name | bq.table.name | ${bq.table.name} | | BigQuery table name. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
Record Reader | bq.record.reader | | Controller Service API: RecordReaderFactory; Implementations: WindowsEventLogReader, JASN1Reader, EBCDICRecordReader, YamlTreeReader, CiscoEmblemSyslogMessageReader, ReaderLookup, AvroReader, SyslogReader, CSVReader, GrokReader, IPFIXReader, ParquetReader, JsonTreeReader, ExcelReader, ScriptedReader, JsonPathReader, XMLReader, Syslog5424Reader, CEFReader | Specifies the Controller Service to use for parsing incoming data.
Transfer Type | bq.transfer.type | STREAM | STREAM (Use streaming record handling strategy); BATCH (Use batching record handling strategy) | Defines the preferred transfer type: streaming or batching.
Append Record Count | bq.append.record.count | 20 | | The number of records to be appended to the write stream at once. Applicable for both batch and stream types.
Number of retries | gcp-retry-count | 6 | | How many retry attempts should be made before routing to the failure relationship.
Skip Invalid Rows | bq.skip.invalid.rows | false | | Sets whether to insert all valid rows of a request, even if invalid rows exist. If not set the entire insert request will fail if it contains an invalid row. Supports Expression Language: true (will be evaluated using flow file attributes and Environment variables)
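
The effect of Append Record Count can be pictured as splitting the incoming records into groups before each append. The following is a minimal sketch of that grouping only, not the processor's actual implementation. `chunk_records` and its `append_record_count` parameter are hypothetical names, with the default of 20 taken from the property's default value.

```python
from itertools import islice


def chunk_records(records, append_record_count=20):
    """Yield successive lists of at most append_record_count records."""
    if append_record_count < 1:
        raise ValueError("append_record_count must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, append_record_count))
        if not chunk:
            return  # no records left
        yield chunk


# 45 records with the default count give two full groups and one partial group.
records = [{"id": i} for i in range(45)]
chunk_sizes = [len(chunk) for chunk in chunk_records(records)]
```

A larger count means fewer append calls per FlowFile at the cost of larger requests. The right value depends on record size and is best found by measurement.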

Relationships:

Name | Description
success | FlowFiles are routed to this relationship after a successful Google BigQuery operation.
failure | FlowFiles are routed to this relationship if the Google BigQuery operation fails.
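
How the Number of retries property interacts with these two relationships can be sketched as a bounded retry loop. This is an illustration under stated assumptions rather than the processor's code. It assumes that the retry count excludes the initial attempt, so a count of 6 allows up to 7 attempts in total. `route_flowfile` and `make_flaky` are hypothetical helpers.

```python
def route_flowfile(operation, retry_count=6):
    """Run operation, retrying on error, and return (relationship, attempts).

    Assumption: retry_count counts retries after the first attempt.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            operation()
            return "success", attempts
        except Exception:
            # A real implementation would distinguish retryable errors
            # from permanent ones; this sketch retries on any error.
            if attempts > retry_count:
                return "failure", attempts


def make_flaky(failures):
    """Build an operation that fails the given number of times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("transient error")

    return operation


flaky_result = route_flowfile(make_flaky(2))     # recovers on the third attempt
broken_result = route_flowfile(make_flaky(100))  # never recovers within the limit
```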

Reads Attributes:

None specified.

Writes Attributes:

Name | Description
bq.records.count | Number of records successfully inserted

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

None specified.